cpython

Commit Graph

Author	SHA1	Message	Date
Serhiy Storchaka	ab9893c406	[3.10] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944) (GH-134345) If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will became invalid after destroying that temporary bytes object. So we need other way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal(). _PyBytes_DecodeEscape() does not have such issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal(). (cherry picked from commit `9f69a58623`) (cherry picked from commit `6279eb8c07`) (cherry picked from commit `a75953b347`) (cherry picked from commit `0c33e5baed`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2025-06-02 17:55:48 +02:00
Miss Islington (bot)	1c937e5887	[3.10] gh-99612: Fix PyUnicode_DecodeUTF8Stateful() for ASCII-only data (GH-99613) (GH-107224) (#107230 ) Previously *consumed was not set in this case. (cherry picked from commit `b8b3e6afc0`) (cherry picked from commit `f08e52ccb0`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2023-08-22 20:24:40 +02:00
Miss Islington (bot)	4cc363611c	gh-101765: unicodeobject: use Py_XDECREF correctly (GH-102283) (cherry picked from commit `8d0f09b1be`) Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>	2023-02-26 15:09:43 -08:00
Jelle Zijlstra	6fa6c2a470	[3.10] gh-101765: Fix refcount issues in list and unicode pickling (GH-102265) (#102269 ) (cherry picked from commit `d71edbd1b7`)	2023-02-25 16:38:00 -08:00
Ionite	9f472f81bc	[3.10] gh-101765: Fix SystemError / segmentation fault in iter `__reduce__` when internal access of `builtins.__dict__` exhausts the iterator (GH-101769) (#102229 ) (cherry picked from commit `54dfa14c5a`)	2023-02-24 19:50:53 -08:00
Miss Islington (bot)	d985c8e2e0	bpo-36819: Fix crashes in built-in encoders with weird error handlers (GH-28593) If the error handler returns position less or equal than the starting position of non-encodable characters, most of built-in encoders didn't properly re-size the output buffer. This led to out-of-bounds writes, and segfaults. (cherry picked from commit `18b07d773e`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2022-05-02 02:58:41 -07:00
Miss Islington (bot)	72114c06fd	gh-91421: Use constant value check during runtime (GH-91422) (GH-91492) The left-hand side expression of the if-check can be converted to a constant by the compiler, but the addition on the right-hand side is performed during runtime. Move the addition from the right-hand side to the left-hand side by turning it into a subtraction there. Since the values are known to be large enough to not turn negative, this is a safe operation. Prevents a very unlikely integer overflow on 32 bit systems. Fixes GH-91421. (cherry picked from commit `0859368335`) Co-authored-by: Tobias Stoeckmann <stoeckmann@users.noreply.github.com>	2022-04-13 18:38:37 -07:00
Miss Islington (bot)	69edc30d2b	Fix bad grammar and import docstring for split/rsplit (GH-32381) (GH-32416)	2022-04-08 12:06:19 -05:00
Christian Heimes	55d5c96c57	[3.10] bpo-47182: Fix crash by named unicode characters after interpreter reinitialization (GH-32212) (GH-32216) Co-authored-by: Christian Heimes <christian@python.org>	2022-04-01 10:44:56 +02:00
Victor Stinner	72c260cf0c	[3.10] bpo-46006: Revert "bpo-40521: Per-interpreter interned strings (GH-20085)" (GH-30422) (GH-30425) This reverts commit `ea251806b8`. Keep "assert(interned == NULL);" in _PyUnicode_Fini(), but only for the main interpreter. Keep _PyUnicode_ClearInterned() changes avoiding the creation of a temporary Python list object. Leave the PyInterpreterState structure unchanged to keep the ABI backward compatibility with Python 3.10.0: rename the "interned" member to "unused_interned". (cherry picked from commit `35d6540c90`)	2022-01-06 16:12:28 +01:00
Serhiy Storchaka	4641afef66	[3.10] bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) (GH-28952) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.raw_unicode_escape_decode(). It is True by default to match the former behavior. (cherry picked from commit `39aa98346d`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-10-14 21:23:39 +03:00
Miss Islington (bot)	0bff4ccbfd	[3.10] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28943) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior. (cherry picked from commit `c96d1546b1`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-10-14 20:02:20 +03:00
Christian Clauss	dcfbe4f72d	[3.10] Fix typos in the Objects directory (GH-28766) (GH-28797) (cherry picked from commit `5f401f1040`) Automerge-Triggered-By: GH:JulienPalard	2021-10-07 07:31:33 -07:00
Łukasz Langa	8c1e1da565	[3.10] [codemod] Fix non-matching bracket pairs (GH-28473) (GH-28511) Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com> Co-authored-by: Łukasz Langa <lukasz@langa.pl> (cherry picked from commit `8f943ca257`) Co-authored-by: Mohamad Mansour <66031317+mohamadmansourX@users.noreply.github.com>	2021-09-22 01:33:59 +02:00
Serhiy Storchaka	c43317d41e	[3.10] Add more const modifiers. (GH-26691). (GH-26692) (cherry picked from commit `be8b631b7a`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-06-12 18:44:32 +01:00
Jakub Kulík	9032cf5cb1	bpo-43667: Fix broken Unicode encoding in non-UTF locales on Solaris (GH-25096)	2021-04-30 15:21:42 +02:00
Victor Stinner	442ad74fc2	bpo-43687: Py_Initialize() creates singletons earlier (GH-25147) Reorganize pycore_interp_init() to initialize singletons before the the first PyType_Ready() call. Fix an issue when Python is configured using --without-doc-strings.	2021-04-02 15:28:13 +02:00
Jessica Clarke	dec0757549	bpo-43179: Generalise alignment for optimised string routines (GH-24624) * Remove m68k-specific hack from ascii_decode On m68k, alignments of primitives is more relaxed, with 4-byte and 8-byte types only requiring 2-byte alignment, thus using sizeof(size_t) does not work. Instead, use the portable alternative. Note that this is a minimal fix that only relaxes the assertion and the condition for when to use the optimised version remains overly strict. Such issues will be fixed tree-wide in the next commit. NB: In C11 we could use _Alignof(size_t) instead, but for compatibility we use autoconf. * Optimise string routines for architectures with non-natural alignment C only requires that sizeof(x) is a multiple of alignof(x), not that the two are equal. Thus anywhere where we optimise based on alignment we should be using alignof(x) not sizeof(x). This is more annoying than it would be in C11 where we could just use _Alignof(x) (and alignof(x) in C++11), but since we still require only C99 we must plumb the information all the way from autoconf through the various typedefs and defines.	2021-03-31 12:12:39 +02:00
Victor Stinner	9976834f80	bpo-35883: Py_DecodeLocale() escapes invalid Unicode characters (GH-24843) Python no longer fails at startup with a fatal error if a command line argument contains an invalid Unicode character. The Py_DecodeLocale() function now escapes byte sequences which would be decoded as Unicode characters outside the [U+0000; U+10ffff] range. Use MAX_UNICODE constant in unicodeobject.c.	2021-03-17 21:46:53 +01:00
Brandt Bucher	145bf269df	bpo-42128: Structural Pattern Matching (PEP 634) (GH-22917) Co-authored-by: Guido van Rossum <guido@python.org> Co-authored-by: Talin <viridia@gmail.com> Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2021-02-26 14:51:55 -08:00
Victor Stinner	bcb094b41f	bpo-43268: Pass interp rather than tstate to internal functions (GH-24580) Pass the current interpreter (interp) rather than the current Python thread state (tstate) to internal functions which only use the interpreter. Modified functions: * _PyXXX_Fini() and _PyXXX_ClearFreeList() functions * _PyEval_SignalAsyncExc(), make_pending_calls() * _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str() * should_audit(), set_flags_from_config(), make_flags() * _PyAtExit_Call() * init_stdio_encoding() * etc.	2021-02-19 15:10:45 +01:00
Victor Stinner	101bf69ff1	bpo-43268: _Py_IsMainInterpreter() now expects interp (GH-24577) The _Py_IsMainInterpreter() function now expects interp rather than tstate.	2021-02-19 13:33:31 +01:00
Pablo Galindo	a6d63a20df	Fix compiler warnings regarding loss of data (GH-23983)	2020-12-29 00:28:09 +00:00
Victor Stinner	f4507231e3	bpo-42745: finalize_interp_types() calls _PyType_Fini() (GH-23953) Call _PyType_Fini() in subinterpreters. Fix reference leaks in subinterpreters.	2020-12-26 20:26:08 +01:00
Victor Stinner	ea251806b8	bpo-40521: Per-interpreter interned strings (GH-20085) Make the Unicode dictionary of interned strings compatible with subinterpreters. Remove the INTERN_NAME_STRINGS macro in typeobject.c: names are always now interned (even if EXPERIMENTAL_ISOLATED_SUBINTERPRETERS macro is defined). _PyUnicode_ClearInterned() now uses PyDict_Next() to no longer allocate memory, to ensure that the interned dictionary is cleared.	2020-12-26 02:58:33 +01:00
Victor Stinner	ba3d67c2fb	bpo-39465: Fix _PyUnicode_FromId() for subinterpreters (GH-20058) Make _PyUnicode_FromId() function compatible with subinterpreters. Each interpreter now has an array of identifier objects (interned strings decoded from UTF-8). * Add PyInterpreterState.unicode.identifiers: array of identifiers objects. * Add _PyRuntimeState.unicode_ids used to allocate unique indexes to _Py_Identifier. * Rewrite the _Py_Identifier structure. Microbenchmark on _PyUnicode_FromId(&PyId_a) with _Py_IDENTIFIER(a): [ref] 2.42 ns +- 0.00 ns -> [atomic] 3.39 ns +- 0.00 ns: 1.40x slower This change adds 1 ns per _PyUnicode_FromId() call in average.	2020-12-26 00:41:46 +01:00
Serhiy Storchaka	2ad93821a6	bpo-42431: Fix outdated bytes comments (GH-23458) Also move definitions of internal macros F_LJUST etc to private header.	2020-12-03 12:46:16 +02:00
Victor Stinner	32bd68c839	bpo-42519: Replace PyObject_MALLOC() with PyObject_Malloc() (GH-23587) No longer use deprecated aliases to functions: * Replace PyObject_MALLOC() with PyObject_Malloc() * Replace PyObject_REALLOC() with PyObject_Realloc() * Replace PyObject_FREE() with PyObject_Free() * Replace PyObject_Del() with PyObject_Free() * Replace PyObject_DEL() with PyObject_Free()	2020-12-01 10:37:39 +01:00
Victor Stinner	00d7abd7ef	bpo-42519: Replace PyMem_MALLOC() with PyMem_Malloc() (GH-23586) No longer use deprecated aliases to functions: * Replace PyMem_MALLOC() with PyMem_Malloc() * Replace PyMem_REALLOC() with PyMem_Realloc() * Replace PyMem_FREE() with PyMem_Free() * Replace PyMem_Del() with PyMem_Free() * Replace PyMem_DEL() with PyMem_Free() Modify also the PyMem_DEL() macro to use directly PyMem_Free().	2020-12-01 09:56:42 +01:00
Christian Heimes	07f2adedf0	bpo-40998: Address compiler warnings found by ubsan (GH-20929) Signed-off-by: Christian Heimes <christian@python.org> Automerge-Triggered-By: GH:tiran	2020-11-18 07:38:53 -08:00
Ikko Ashimine	38811d68ca	Fix typo in unicodeobject.c (GH-23180) exeeds -> exceeds Automerge-Triggered-By: GH:Mariatta	2020-11-09 21:57:34 -08:00
Victor Stinner	920cb647ba	bpo-42157: unicodedata avoids references to UCD_Type (GH-22990) * UCD_Check() uses PyModule_Check() * Simplify the internal _PyUnicode_Name_CAPI structure: * Remove size and state members * Remove state and self parameters of getcode() and getname() functions * Remove global_module_state	2020-10-26 19:19:36 +01:00
Victor Stinner	47e1afd2a1	bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713) The private _PyUnicode_Name_CAPI structure of the PyCapsule API unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the structure gets a new state member which must be passed to the getcode() and getname() functions. * Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h * unicodedata module is now built with Py_BUILD_CORE_MODULE. * unicodedata: move hashAPI variable into unicodedata_module_state.	2020-10-26 16:43:47 +01:00
Ma Lin	a0c603cb9d	bpo-38252: Use 8-byte step to detect ASCII sequence in 64bit Windows build (GH-16334)	2020-10-18 17:48:38 +03:00
Max Bernstein	3635388f52	bpo-42065: Fix incorrectly formatted _codecs.charmap_decode error message (GH-19940)	2020-10-17 23:38:21 +03:00
Serhiy Storchaka	e2ec0b27c0	bpo-41974: Remove complex.__float__, complex.__floordiv__, etc (GH-22593) Remove complex special methods __int__, __float__, __floordiv__, __mod__, __divmod__, __rfloordiv__, __rmod__ and __rdivmod__ which always raised a TypeError.	2020-10-09 14:14:37 +03:00
Serhiy Storchaka	9ece9cd65c	bpo-41909: Enable previously disabled recursion checks. (GH-22536) Enable recursion checks which were disabled when get __bases__ of non-type objects in issubclass() and isinstance() and when intern strings. It fixes a stack overflow when getting __bases__ leads to infinite recursion. Originally recursion checks was disabled for PyDict_GetItem() which silences all errors including the one raised in case of detected recursion and can return incorrect result. But now the code uses PyDict_GetItemWithError() and PyDict_SetDefault() instead.	2020-10-05 00:55:57 +03:00
Victor Stinner	583ee5a5b1	bpo-41692: Deprecate PyUnicode_InternImmortal() (GH-22486) The PyUnicode_InternImmortal() function is now deprecated and will be removed in Python 3.12: use PyUnicode_InternInPlace() instead.	2020-10-02 14:49:00 +02:00
Victor Stinner	7f413a5d95	bpo-40521: Fix PyUnicode_InternInPlace() (GH-22376) Fix PyUnicode_InternInPlace() when the INTERNED_STRINGS macro is not defined (when the EXPERIMENTAL_ISOLATED_SUBINTERPRETERS macro is defined).	2020-09-23 14:05:32 +02:00
Victor Stinner	bb083d33f7	bpo-1635741: Port _string module to multi-phase init (GH-22148) Port the _string extension module to the multi-phase initialization API (PEP 489).	2020-09-08 15:33:08 +02:00
Serhiy Storchaka	12f433411b	bpo-41334: Convert constructors of str, bytes and bytearray to Argument Clinic (GH-21535)	2020-07-20 15:53:55 +03:00
Serhiy Storchaka	4c8f09d7ce	bpo-36346: Make using the legacy Unicode C API optional (GH-21437) Add compile time option USE_UNICODE_WCHAR_CACHE. Setting it to 0 makes the interpreter not using the wchar_t cache and the legacy Unicode C API.	2020-07-10 23:26:06 +03:00
Victor Stinner	8182cc2e68	bpo-39573: Use the Py_TYPE() macro (GH-21433) Replace obj->ob_type with Py_TYPE(obj).	2020-07-10 12:40:38 +02:00
Serhiy Storchaka	b3dd5cd4a3	bpo-36346: Undeprecate private function _PyUnicode_AsUnicode(). (GH-21336)	2020-07-05 18:53:45 +03:00
Victor Stinner	3549ca313a	bpo-1635741: Fix unicode_dealloc() for mortal interned string (GH-21270) When unicode_dealloc() is called on a mortal interned string, the string reference counter is now reset at zero.	2020-07-03 16:59:12 +02:00
Victor Stinner	666ecfb095	bpo-1635741: Release Unicode interned strings at exit (GH-21269) * PyUnicode_InternInPlace() now ensures that interned strings are ready. * Add _PyUnicode_ClearInterned(). * Py_Finalize() now releases Unicode interned strings: call _PyUnicode_ClearInterned().	2020-07-02 01:19:57 +02:00
Inada Naoki	038dd0f79d	bpo-36346: Raise DeprecationWarning when creating legacy Unicode (GH-20933)	2020-06-30 15:26:56 +09:00
Serhiy Storchaka	349f76c6aa	bpo-36346: Prepare for removing the legacy Unicode C API (AC only). (GH-21223)	2020-06-30 09:03:15 +03:00
Inada Naoki	b3332660ad	bpo-41123: Remove PyUnicode_AsUnicodeCopy (GH-21209)	2020-06-30 12:23:07 +09:00
Serhiy Storchaka	e67f7db3c3	bpo-37999: Simplify the conversion code for %c, %d, %x, etc. (GH-20437) Since PyLong_AsLong() no longer use __int__, explicit call of PyNumber_Index() before it is no longer needed.	2020-06-29 22:36:41 +03:00

1 2 3 4 5 ...

1565 Commits