cpython

Commit Graph

Author	SHA1	Message	Date
Serhiy Storchaka	8d35fd1b34	[3.9] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944) (#134346) * [3.9] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) (GH-133944) If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will became invalid after destroying that temporary bytes object. So we need other way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal(). _PyBytes_DecodeEscape() does not have such issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal(). (cherry picked from commit `9f69a58623`) (cherry picked from commit `6279eb8c07`) (cherry picked from commit `a75953b347`) (cherry picked from commit `0c33e5baed`) (cherry picked from commit `8b528cacbb`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2025-06-02 17:58:01 +02:00
Serhiy Storchaka	4a79328195	[3.9] gh-99612: Fix PyUnicode_DecodeUTF8Stateful() for ASCII-only data (GH-99613) (GH-107224) (#107231) Previously *consumed was not set in this case. (cherry picked from commit `f08e52ccb0`). (cherry picked from commit `b8b3e6afc0`)	2023-08-22 20:25:15 +02:00
Miss Islington (bot)	206f416bd0	bpo-36819: Fix crashes in built-in encoders with weird error handlers (GH-28593) If the error handler returns position less or equal than the starting position of non-encodable characters, most of built-in encoders didn't properly re-size the output buffer. This led to out-of-bounds writes, and segfaults. (cherry picked from commit `18b07d773e`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2022-05-02 02:59:40 -07:00
Miss Islington (bot)	edf1a77f23	gh-91421: Use constant value check during runtime (GH-91422) (GH-91493) The left-hand side expression of the if-check can be converted to a constant by the compiler, but the addition on the right-hand side is performed during runtime. Move the addition from the right-hand side to the left-hand side by turning it into a subtraction there. Since the values are known to be large enough to not turn negative, this is a safe operation. Prevents a very unlikely integer overflow on 32 bit systems. Fixes GH-91421. (cherry picked from commit `0859368335`) Co-authored-by: Tobias Stoeckmann <stoeckmann@users.noreply.github.com>	2022-04-13 18:38:55 -07:00
Serhiy Storchaka	6848602806	bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) (GH-28953) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.raw_unicode_escape_decode(). It is True by default to match the former behavior. (cherry picked from commit `39aa98346d`)	2021-10-14 21:23:52 +03:00
Serhiy Storchaka	7c722e32bf	[3.9] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28945) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior. (cherry picked from commit `c96d1546b1`) Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-10-14 20:03:29 +03:00
Christian Clauss	960e7b3ba1	[3.9] Fix typos in the Objects directory (GH-28766) (GH-28795) (cherry picked from commit `5f401f1040`) Automerge-Triggered-By: GH:JulienPalard	2021-10-07 07:09:41 -07:00
Łukasz Langa	5482db5800	[3.9] [codemod] Fix non-matching bracket pairs (GH-28473) (GH-28512) Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com> Co-authored-by: Łukasz Langa <lukasz@langa.pl>. (cherry picked from commit `8f943ca257`) Co-authored-by: Mohamad Mansour <66031317+mohamadmansourX@users.noreply.github.com>	2021-09-22 17:32:04 +02:00
Jakub Kulík	d3cc68900d	[3.9] bpo-43667: Fix broken Unicode encoding in non-UTF locales on Solaris (GH-25096) (GH-25847) (cherry picked from commit `9032cf5cb1`) Co-authored-by: Jakub Kulík <Kulikjak@gmail.com>	2021-05-21 16:59:39 +02:00
Miss Islington (bot)	aa967ec4d4	bpo-35883: Py_DecodeLocale() escapes invalid Unicode characters (GH-24843) Python no longer fails at startup with a fatal error if a command line argument contains an invalid Unicode character. The Py_DecodeLocale() function now escapes byte sequences which would be decoded as Unicode characters outside the [U+0000; U+10ffff] range. Use MAX_UNICODE constant in unicodeobject.c. (cherry picked from commit `9976834f80`) Co-authored-by: Victor Stinner <vstinner@python.org>	2021-03-17 14:11:14 -07:00
Serhiy Storchaka	651fc30af7	bpo-43499: Silence compiler warnings about using legacy C API on Windows (GH-24873)	2021-03-16 08:03:37 +02:00
Miss Islington (bot)	994c68f586	bpo-40998: Address compiler warnings found by ubsan (GH-20929) Signed-off-by: Christian Heimes <christian@python.org> Automerge-Triggered-By: GH:tiran (cherry picked from commit `07f2adedf0`) Co-authored-by: Christian Heimes <christian@python.org>	2020-11-18 08:01:48 -08:00
Miss Islington (bot)	2a86ade9e3	Fix typo in unicodeobject.c (GH-23180) exeeds -> exceeds Automerge-Triggered-By: GH:Mariatta (cherry picked from commit `38811d68ca`) Co-authored-by: Ikko Ashimine <eltociear@gmail.com>	2020-11-09 22:18:34 -08:00
Miss Skeleton (bot)	6a2aa4994e	bpo-42065: Fix incorrectly formatted _codecs.charmap_decode error message (GH-19940) (cherry picked from commit `3635388f52`) Co-authored-by: Max Bernstein <tekknolagi@users.noreply.github.com>	2020-10-18 09:00:18 +03:00
Serhiy Storchaka	7aa22ba923	[3.9] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22550) Enable recursion checks which were disabled when get __bases__ of non-type objects in issubclass() and isinstance() and when intern strings. It fixes a stack overflow when getting __bases__ leads to infinite recursion. Originally recursion checks was disabled for PyDict_GetItem() which silences all errors including the one raised in case of detected recursion and can return incorrect result. But now the code uses PyDict_GetItemWithError() and PyDict_SetDefault() instead. (cherry picked from commit `9ece9cd65c`)	2020-10-05 01:27:38 +03:00
Inada Naoki	610a60c601	bpo-36346: Add Py_DEPRECATED to deprecated unicode APIs (GH-20878) Co-authored-by: Kyle Stanley <aeros167@gmail.com> Co-authored-by: Victor Stinner <vstinner@python.org> (cherry picked from commit `2c4928d37e`)	2020-06-18 17:30:53 +09:00
Victor Stinner	9512ad74b0	[3.9] bpo-40514: Remove --with-experimental-isolated-subinterpreters in 3.9 (GH-20228) Remove --with-experimental-isolated-subinterpreters configure option in Python 3.9: the experiment continues in the master branch, but it's no longer needed in 3.9.	2020-05-20 00:27:46 +02:00
Victor Stinner	3d17c045b4	bpo-40521: Add PyInterpreterState.unicode (GH-20081) Move PyInterpreterState.fs_codec into a new PyInterpreterState.unicode structure. Give a name to the fs_codec structure and use this structure in unicodeobject.c.	2020-05-14 01:48:38 +02:00
Victor Stinner	d6fb53fe42	bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078) Remove the _PyUnicode_ClearStaticStrings() function from the C API. Make the function fully private (declare it with "static").	2020-05-14 01:11:54 +02:00
Serhiy Storchaka	5650e76f63	bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053)	2020-05-12 16:18:00 +03:00
Serhiy Storchaka	74ea6b5a75	bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033)	2020-05-12 12:42:04 +03:00
Victor Stinner	607b1027fe	bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933) When Python is built in the experimental isolated subinterpreters mode, disable Unicode singletons and Unicode interned strings since they are shared by all interpreters. Temporary workaround until these caches are made per-interpreter.	2020-05-05 18:50:30 +02:00
sweeneyde	a81849b031	bpo-39939: Add str.removeprefix and str.removesuffix (GH-18939) Added str.removeprefix and str.removesuffix methods and corresponding bytes, bytearray, and collections.UserString methods to remove affixes from a string if present. See PEP 616 for a full description.	2020-04-22 23:05:48 +02:00
Victor Stinner	e5014be049	bpo-40268: Remove a few pycore_pystate.h includes (GH-19510)	2020-04-14 17:52:15 +02:00
Victor Stinner	81a7be3fa2	bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509) Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET() for consistency with _PyThreadState_GET() and to have a shorter name (help to fit into 80 columns). Add also "assert(tstate != NULL);" to the function.	2020-04-14 15:14:01 +02:00
Victor Stinner	da7933ecc3	bpo-40268: Add _PyInterpreterState_GetConfig() (GH-19492) Don't access PyInterpreterState.config member directly anymore, but use new functions: * _PyInterpreterState_GetConfig() * _PyInterpreterState_SetConfig() * _Py_GetConfig()	2020-04-13 03:04:28 +02:00
Serhiy Storchaka	8f87eefe7f	bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. (GH-19472)	2020-04-12 14:58:27 +03:00
Serhiy Storchaka	cd8295ff75	bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345)	2020-04-11 10:48:40 +03:00
Victor Stinner	a15e260b70	bpo-40170: Add _PyIndex_Check() internal function (GH-19426) Add _PyIndex_Check() function to the internal C API: fast inlined verson of PyIndex_Check(). Add Include/internal/pycore_abstract.h header file. Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects and Python subdirectories.	2020-04-08 02:01:56 +02:00
Victor Stinner	d8acf0d9aa	bpo-37388: Don't check encoding/errors during finalization (GH-19409) str.encode() and str.decode() no longer check the encoding and errors in development mode or in debug mode during Python finalization. The codecs machinery can no longer work on very late calls to str.encode() and str.decode(). This change should help to call _PyObject_Dump() to debug during late Python finalization.	2020-04-07 16:07:42 +02:00
Serhiy Storchaka	17b4733f2f	bpo-40130: _PyUnicode_AsKind() should not be exported. (GH-19265) Make it a static function, and pass known attributes (kind, data, length) instead of the PyUnicode object.	2020-04-01 15:41:49 +03:00
Inada Naoki	3a8c56295d	Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985) * Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)" This reverts commit `c7ad974d34`. * Update unicodeobject.h	2020-03-14 15:59:27 +09:00
Inada Naoki	c7ad974d34	bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659) Co-authored-by: Victor Stinner <vstinner@python.org>	2020-03-14 12:43:18 +09:00
Andy Lester	dffe4c0709	bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601)	2020-03-04 14:15:20 +01:00
Inada Naoki	02a4d57263	bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327) Avoid using temporary bytes object.	2020-02-27 13:48:59 +09:00
Andy Lester	933fc53f3f	closes bpo-39684: Combine two if/thens and squash uninit var warning. (GH-18565)	2020-02-20 20:51:47 -08:00
Hai Shi	3d235f5c5c	bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519)	2020-02-17 14:41:15 +01:00
Victor Stinner	45876a90e2	bpo-35081: Move bytes_methods.h to the internal C API (GH-18492) Move the bytes_methods.h header file to the internal C API as pycore_bytes_methods.h: it only contains private symbols (prefixed by "_Py"), except of the PyDoc_STRVAR_shared() macro.	2020-02-12 22:32:34 +01:00
Benjamin Peterson	95905ce0f4	bpo-39605: Remove a cast that causes a warning. (GH-18473)	2020-02-11 19:36:14 -08:00
Andy Lester	e6be9b59a9	closes bpo-39605: Fix some casts to not cast away const. (GH-18453) gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either: Adding the const to the type cast, as in: - return _PyUnicode_FromUCS1((unsigned char*)s, size); + return _PyUnicode_FromUCS1((const unsigned char*)s, size); or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in: - PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno); + PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno); These changes will not change code, but they will make it much easier to check for errors in consts	2020-02-11 18:28:35 -08:00
Petr Viktorin	ffd9753a94	bpo-39245: Switch to public API for Vectorcall (GH-18460) The bulk of this patch was generated automatically with: for name in \ PyObject_Vectorcall \ Py_TPFLAGS_HAVE_VECTORCALL \ PyObject_VectorcallMethod \ PyVectorcall_Function \ PyObject_CallOneArg \ PyObject_CallMethodNoArgs \ PyObject_CallMethodOneArg \ ; do echo $name git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g" done old=_PyObject_FastCallDict new=PyObject_VectorcallDict git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g" and then cleaned up: - Revert changes to in docs & news - Revert changes to backcompat defines in headers - Nudge misaligned comments	2020-02-11 17:46:57 +01:00
Victor Stinner	f3e7ea5b8c	bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397) PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the string is not ready.	2020-02-11 14:29:33 +01:00
Victor Stinner	58ac700fb0	bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392) Replace direct access to PyObject.ob_type with Py_TYPE().	2020-02-07 03:04:21 +01:00
Victor Stinner	c86a11221d	bpo-39573: Add Py_SET_REFCNT() function (GH-18389) Add a Py_SET_REFCNT() function to set the reference counter of an object.	2020-02-07 01:24:29 +01:00
Victor Stinner	bf305cc6f0	Add PyInterpreterState.fs_codec.utf8 (GH-18367) Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize(). Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().	2020-02-05 17:39:57 +01:00
Victor Stinner	49932fec62	bpo-39542: Simplify _Py_NewReference() (GH-18332) * Remove _Py_INC_REFTOTAL and _Py_DEC_REFTOTAL macros: modify directly _Py_RefTotal. * _Py_ForgetReference() is no longer defined if the Py_TRACE_REFS macro is not defined. * Remove _Py_NewReference() implementation from object.c: unify the two implementations in object.h inline function. * Fix Py_TRACE_REFS build: _Py_INC_TPALLOCS() macro has been removed.	2020-02-03 17:55:04 +01:00
Victor Stinner	ec3c99c8a7	bpo-38631: Avoid Py_FatalError() in unicodeobject.c (GH-18281) Replace Py_FatalError() calls with _PyErr_WriteUnraisableMsg(), _PyObject_ASSERT_FAILED_MSG() or Py_UNREACHABLE() in unicode_dealloc() and unicode_release_interned().	2020-01-30 12:18:32 +01:00
Pablo Galindo	016b0280b8	Fix compiler warning in Objects/unicodeobject.c (GH-17440)	2019-12-02 18:09:43 +00:00
Victor Stinner	d68b592dd6	bpo-38896: Remove PyUnicode_ClearFreeList() function (GH-17354) Remove PyUnicode_ClearFreeList() function: the Unicode free list has been removed in Python 3.3.	2019-11-23 02:30:32 +01:00
Victor Stinner	3d4833488a	bpo-38858: Call _PyUnicode_Fini() in Py_EndInterpreter() (GH-17330) Py_EndInterpreter() now clears the filesystem codec.	2019-11-22 12:27:50 +01:00

1520 Commits