Commit Graph

445 Commits

Author SHA1 Message Date
Serhiy Storchaka 9f69a58623
gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648)
If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
2025-05-12 20:42:23 +03:00
Ageev Maxim 7c3692fe27
gh-130928: Fix error message during bytes formatting for the `'i'` flag (#130967) 2025-03-24 22:07:03 +03:00
Bénédikt Tran a1205ef524
gh-111178: fix UBSan failures for `PyBytesObject` (#131603) 2025-03-24 11:02:09 +01:00
Victor Stinner 20c5f969dd
gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
Victor Stinner 061da44bac
gh-111178: Change Argument Clinic signature for `@classmethod` (#131157)
Use "PyObject*", instead of "PyTypeObject*", for `@classmethod`
functions to fix an undefined behavior.
2025-03-12 17:42:07 +01:00
Daniel Pope e0637cebe5
gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() (#129844)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-03-12 11:40:11 +01:00
Victor Stinner 9d759b63d8
gh-111178: Change Argument Clinic signature for METH_O (#130682)
Use "PyObject*" for METH_O functions to fix an undefined behavior.
2025-03-11 16:33:36 +01:00
Umar Butler 8d8b854824
gh-128016: Improved invalid escape sequence warning message (#128020) 2025-01-15 18:00:54 +01:00
Bénédikt Tran 295776c7f3
gh-111178: fix UBSan failures in `Objects/bytesobject.c` (GH-128237)
* remove redundant casts for `bytesobject`
* fix UBSan failures for `striterobject`
2025-01-10 11:48:06 +01:00
Abhijeet 0706bab1c0
gh-128133: use relaxed atomics for hash of bytes (#128412) 2025-01-03 13:50:56 +05:30
Srinivas Reddy Thatiparthy (తాటిపర్తి శ్రీనివాస్ రెడ్డి) db9bea0386
gh-127740: For odd-length input to bytes.fromhex(...) change the error message to ValueError: fromhex() arg must be of even length (#127756) 2024-12-11 08:35:17 +01:00
Pablo Galindo Salgado 30aeb00d36
gh-126076: Account for relocated objects in tracemalloc (#126077) 2024-11-19 10:35:17 +00:00
Mark Shannon fa40922597
GH-126547: Pre-assign version numbers for a few common classes (GH-126551) 2024-11-08 16:44:44 +00:00
Mark Shannon c9014374c5
GH-125174: Make immortal objects more robust, following design from PEP 683 (GH-125251) 2024-10-10 18:19:08 +01:00
Victor Stinner 6a39e96ab8
gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_BYTES) (#125195)
Replace PyBytes_FromString("") and PyBytes_FromStringAndSize("", 0)
with Py_GetConstant(Py_CONSTANT_EMPTY_BYTES).
2024-10-09 17:12:11 +02:00
Victor Stinner 1d3700f943
gh-111178: Fix function signatures in bytesobject.c (#124806) 2024-10-02 13:35:51 +02:00
Victor Stinner f1a0d96f41
gh-123091: Use _Py_IsImmortalLoose() (#123511)
Use _Py_IsImmortalLoose() in bytesobject.c, typeobject.c
and ceval.c.
2024-09-02 14:25:19 +02:00
Victor Stinner d8e69b2c1b
gh-122854: Add Py_HashBuffer() function (#122855) 2024-08-30 15:42:27 +00:00
Victor Stinner 3d60dfbe17
gh-121645: Add PyBytes_Join() function (#121646)
* Replace _PyBytes_Join() with PyBytes_Join().
* Keep _PyBytes_Join() as an alias to PyBytes_Join().
2024-08-30 12:57:33 +00:00
Victor Stinner fda6bd842a
Replace PyObject_Del with PyObject_Free (#122453)
PyObject_Del() is just a alias to PyObject_Free() kept for backward
compatibility. Use directly PyObject_Free() instead.
2024-08-01 14:12:33 +02:00
Serhiy Storchaka b313cc68d5
gh-117557: Improve error messages when a string, bytes or bytearray of length 1 are expected (GH-117631) 2024-05-28 12:01:37 +03:00
Geoffrey Thomas ef172521a9
Remove almost all unpaired backticks in docstrings (#119231)
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form

    The variable `foo' should do xyz

to

    The variable 'foo' should do xyz

and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).

No functional change is intended here other than in human-readable
docstrings.
2024-05-22 12:35:18 -04:00
Erlend E. Aasland deb921f851
gh-117431: Adapt bytes and bytearray .find() and friends to Argument Clinic (#117502)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:

- count()
- find()
- index()
- rfind()
- rindex()

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
2024-04-12 07:40:55 +00:00
Sam Gross 1a6594f661
gh-117439: Make refleak checking thread-safe without the GIL (#117469)
This keeps track of the per-thread total reference count operations in
PyThreadState in the free-threaded builds. The count is merged into the
interpreter's total when the thread exits.
2024-04-08 12:11:36 -04:00
Erlend E. Aasland 595bb496b0
gh-117431: Adapt bytes and bytearray .startswith() and .endswith() to Argument Clinic (#117495)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used.
2024-04-03 13:11:14 +02:00
Serhiy Storchaka 0c1a42cf9c
gh-87193: Support bytes objects with refcount > 1 in _PyBytes_Resize() (GH-117160)
Create a new bytes object and destroy the old one if it has refcount > 1.
2024-03-25 16:32:11 +01:00
Victor Stinner 578ebc5d5f
gh-108767: Replace ctype.h functions with pyctype.h functions (#108772)
Replace <ctype.h> locale dependent functions with Python "pyctype.h"
locale independent functions:

* Replace isalpha() with Py_ISALPHA().
* Replace isdigit() with Py_ISDIGIT().
* Replace isxdigit() with Py_ISXDIGIT().
* Replace tolower() with Py_TOLOWER().

Leave Modules/_sre/sre.c unchanged, it uses locale dependent
functions on purpose.

Include explicitly <ctype.h> in _decimal.c to get isascii().
2023-09-01 18:36:53 +02:00
Victor Stinner b32d4cad15
gh-108444: Replace _PyLong_AsInt() with PyLong_AsInt() (#108459)
Change generated by the command:

sed -i -e 's!_PyLong_AsInt!PyLong_AsInt!g' \
    $(find -name "*.c" -o -name "*.h")
2023-08-25 01:01:30 +02:00
Victor Stinner c494fb333b
gh-106320: Remove private _PyEval function (#108433)
Move private _PyEval functions to the internal C API
(pycore_ceval.h):

* _PyEval_GetBuiltin()
* _PyEval_GetBuiltinId()
* _PyEval_GetSwitchInterval()
* _PyEval_MakePendingCalls()
* _PyEval_SetProfile()
* _PyEval_SetSwitchInterval()
* _PyEval_SetTrace()

No longer export most of these functions.
2023-08-24 20:25:22 +02:00
Brandt Bucher 05a824f294
GH-84436: Skip refcounting for known immortals (GH-107605) 2023-08-04 16:24:50 -07:00
Dennis Sweeney ab86426a34
gh-105235: Prevent reading outside buffer during mmap.find() (#105252)
* Add a special case for s[-m:] == p in _PyBytes_Find

* Add tests for _PyBytes_Find

* Make sure that start <= end in mmap.find
2023-07-12 22:50:45 -04:00
Inada Naoki d5bd32fb48
gh-104922: remove PY_SSIZE_T_CLEAN (#106315) 2023-07-02 15:07:46 +09:00
Inada Naoki 77ddc9a7b1
fix typos (#106247)
Most typos are in comments, but two typos are in docstring.
2023-06-30 13:00:22 +09:00
Victor Stinner ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00
John Belmonte 69621d1b09
gh-104018: remove unused format "z" handling in string formatfloat() (#104107)
This is a cleanup overlooked in PR #104033.
2023-05-07 10:11:42 +05:30
John Belmonte 3ed8c88290
gh-104018: disallow "z" format specifier in %-format of byte strings (GH-104033)
PEP-0682 specified that %-formatting would not support the "z" specifier,
but it was unintentionally allowed for bytes. This PR makes use of the "z"
flag an error for %-formatting in a bytestring.

Issue: #104018

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-05-01 20:47:14 +01:00
Eric Snow d2e2e53f73
gh-94673: Ensure Builtin Static Types are Readied Properly (gh-103940)
There were cases where we do unnecessary work for builtin static types. This also simplifies some work necessary for a per-interpreter GIL.
2023-04-27 16:19:43 -06:00
Eric Snow 743687434c
gh-102304: Move the Total Refcount to PyInterpreterState (gh-102545)
Moving it valuable with a per-interpreter GIL.  However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter.  This becomes more important as we move the obmalloc state to PyInterpreterState.

https://github.com/python/cpython/issues/102304
2023-03-21 11:46:09 -06:00
Eric Snow cbb0aa71d0
gh-102304: Consolidate Direct Usage of _Py_RefTotal (gh-102514)
This simplifies further changes to _Py_RefTotal (e.g. make it atomic or move it to PyInterpreterState).

https://github.com/python/cpython/issues/102304
2023-03-08 12:03:50 -07:00
Ionite 54dfa14c5a
gh-101765: Fix SystemError / segmentation fault in iter `__reduce__` when internal access of `builtins.__dict__` exhausts the iterator (#101769) 2023-02-24 15:02:04 -08:00
Nikita Sobolev b1a74a182d
gh-101056: Fix memory leak in `formatfloat()` in `bytesobject.c` (#101057) 2023-01-16 16:16:07 +05:30
Serhiy Storchaka a87c46eab3
bpo-15999: Accept arbitrary values for boolean parameters. (#15609)
builtins and extension module functions and methods that expect boolean values for parameters now accept any Python object rather than just a bool or int type. This is more consistent with how native Python code itself behaves.
2022-12-03 11:52:21 -08:00
Victor Stinner 135ec7cefb
gh-99537: Use Py_SETREF() function in C code (#99657)
Fix potential race condition in code patterns:

* Replace "Py_DECREF(var); var = new;" with "Py_SETREF(var, new);"
* Replace "Py_XDECREF(var); var = new;" with "Py_XSETREF(var, new);"
* Replace "Py_CLEAR(var); var = new;" with "Py_XSETREF(var, new);"

Other changes:

* Replace "old = var; var = new; Py_DECREF(var)"
  with "Py_SETREF(var, new);"
* Replace "old = var; var = new; Py_XDECREF(var)"
  with "Py_XSETREF(var, new);"
* And remove the "old" variable.
2022-11-22 13:39:11 +01:00
Victor Stinner c0feb99187
gh-99300: Use Py_NewRef() in Objects/ directory (#99332)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in C files of the Objects/ directory.
2022-11-10 16:27:32 +01:00
Kumar Aditya cb04a09d2d
GH-93207: Remove HAVE_STDARG_PROTOTYPES configure check for stdarg.h (#93215) 2022-05-27 13:30:45 +02:00
Victor Stinner f62ad4f2c4
gh-89653: Use int type for Unicode kind (#92704)
Use the same type that PyUnicode_FromKindAndData() kind parameter
type (public C API): int.
2022-05-13 12:41:05 +02:00
Victor Stinner d716a0dfe2
Use static inline function Py_EnterRecursiveCall() (#91988)
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending if the internal pycore_ceval.h header file
is included or not. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. Similar approach than
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).

* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
  _Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and
  _Py_LeaveRecursiveCall()
2022-05-04 13:30:23 +02:00
Serhiy Storchaka 3483299a24
gh-81548: Deprecate octal escape sequences with value larger than 0o377 (GH-91668) 2022-04-30 13:16:27 +03:00
Inada Naoki 4d2403fd50
gh-91020: Add `PyBytes_Type.tp_alloc` for subclass (GH-91686) 2022-04-20 14:06:29 +09:00
John Belmonte b0b836b20c
bpo-45995: add "z" format specifer to coerce negative 0 to zero (GH-30049)
Add "z" format specifier to coerce negative 0 to zero.

See https://github.com/python/cpython/issues/90153 (originally https://bugs.python.org/issue45995) for discussion.
This covers `str.format()` and f-strings.  Old-style string interpolation is not supported.

Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
2022-04-11 15:34:18 +01:00