cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	fe9f6e829a	gh-133968: Add fast path to PyUnicodeWriter_WriteStr() (#133969) Don't call PyObject_Str() if the input type is str.	2025-05-13 15:31:41 +02:00
Serhiy Storchaka	9f69a58623	gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal(). _PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().	2025-05-12 20:42:23 +03:00
Stan Ulbrych	4fd1095280	gh-133610: Remove PyUnicode_AsDecoded/Encoded functions (#133612)	2025-05-09 17:31:24 +02:00
Petr Viktorin	987e45e632	gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. (GH-133085) Add `_Py_ALIGN_AS` as per C API WG vote: https://github.com/capi-workgroup/decisions/issues/61 This patch only adds it to free-threaded builds; the `#ifdef Py_GIL_DISABLED` can be removed in the future. Use this to revert `PyASCIIObject` memory layout for non-free-threaded builds. The long-term plan is to deprecate the entire struct; until that happens it's better to keep it unchanged, as a courtesy to people that rely on it despite it not being stable ABI.	2025-05-02 18:30:40 +02:00
Lysandros Nikolaou	60202609a2	gh-132661: Implement PEP 750 (#132662) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Co-authored-by: Wingy <git@wingysam.xyz> Co-authored-by: Koudai Aono <koxudaxi@gmail.com> Co-authored-by: Dave Peck <davepeck@gmail.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Paul Everitt <pauleveritt@me.com> Co-authored-by: sobolevn <mail@sobolevn.me>	2025-04-30 11:46:41 +02:00
Donghee Na	75cbb8d89e	gh-132070: Use _PyObject_IsUniquelyReferenced in unicodeobject (gh-133039) --------- Co-authored-by: Kumar Aditya <kumaraditya@python.org> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2025-04-29 09:48:53 +09:00
Stan Ulbrych	f6fb498c97	gh-132798: Schedule removal of `PyUnicode_AsDecoded/Encoded` functions for 3.15 (#132799) Co-authored-by: Victor Stinner <vstinner@python.org>	2025-04-25 15:07:41 +02:00
Jon Crall	fc0ec29889	gh-103997: Automatically dedent the argument to "-c" (#103998) Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com> Co-authored-by: Kirill Podoprigora <80244920+Eclips4@users.noreply.github.com> Co-authored-by: Inada Naoki <songofacandy@gmail.com> Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-04-18 17:39:30 +09:00
Bénédikt Tran	edbf7fb129	gh-111178: remove redundant casts for functions with correct signatures (#131673)	2025-04-01 17:18:11 +02:00
Victor Stinner	22706843e0	gh-131238: Remove many includes from pycore_interp.h (#131472)	2025-03-19 17:46:24 +00:00
Mark Shannon	a1aeec61c4	GH-131238: Core header refactor (GH-131250) * Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h * Removes many cross-header dependencies	2025-03-17 09:19:04 +00:00
Mark Shannon	f30376c650	GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal (GH-131140)	2025-03-12 16:54:10 +00:00
Victor Stinner	ed8675c571	gh-111178: Fix function signatures of unicodeiter (#130684)	2025-03-04 10:33:09 +01:00
Sergey Miryanov	3a7f17c7e2	gh-130790: Remove references about unicode's readiness from comments (#130801)	2025-03-03 19:18:09 +00:00
Sergey B Kirpichev	f39a07be47	gh-87790: support thousands separators for formatting fractional part of floats (#125304) ```pycon >>> f"{123_456.123_456:_._f}" # Whole and fractional '123_456.123_456' >>> f"{123_456.123_456:_f}" # Integer component only '123_456.123456' >>> f"{123_456.123_456:._f}" # Fractional component only '123456.123_456' >>> f"{123_456.123_456:.4_f}" # with precision '123456.1_235' ```	2025-02-25 16:27:07 +01:00
Sam Gross	b9d2ee687c	gh-129701: Fix a data race in `intern_common` in the free threaded build (GH-130089) * gh-129701: Fix a data race in `intern_common` in the free threaded build * Use a mutex to avoid potentially returning a non-immortalized string, because immortalization happens after the insertion into the interned dict. * Use `Py_DECREF()` calls instead of `Py_SET_REFCNT(s, Py_REFCNT(s) - 2)` for thread-safety. This code path isn't performance sensitive, so just use `Py_DECREF()` unconditionally for simplicity.	2025-02-17 14:15:40 +01:00
Stan Ulbrych	3402e133ef	gh-82045: Correct and deduplicate "isprintable" docs; add test. (GH-130118) We had the definition of what makes a character "printable" documented in three places, giving two different definitions. The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that. With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics. Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place. Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs. Also add a thorough test that the implementation agrees with this definition. Author: Greg Price <gnprice@gmail.com> Co-authored-by: Greg Price <gnprice@gmail.com>	2025-02-14 18:16:47 +01:00
Victor Stinner	0373926260	gh-129354: Use PyErr_FormatUnraisable() function (#129511) Replace PyErr_WriteUnraisable() with PyErr_FormatUnraisable().	2025-01-31 13:16:08 +01:00
Victor Stinner	a810cb89f1	gh-89188: Implement PyUnicode_KIND() as a function (#129412) Implement PyUnicode_KIND() and PyUnicode_DATA() as functions, in addition to the macros with the same names. The macros rely on C bit fields which have compiler-specific layout.	2025-01-30 11:27:27 +00:00
Umar Butler	8d8b854824	gh-128016: Improved invalid escape sequence warning message (#128020)	2025-01-15 18:00:54 +01:00
Donghee Na	ae23a012e6	gh-128137: Update PyASCIIObject to handle interned field with the atomic operation (gh-128196)	2025-01-05 18:17:06 +09:00
Alexander Shadchin	46cb6340d7	gh-127903: Fix a crash on debug builds when calling `Objects/unicodeobject::_copy_characters` (#127876)	2025-01-03 18:47:58 +00:00
Sam Gross	8eebe4e6d0	gh-128212: Fix race in `_PyUnicode_CheckConsistency` (GH-128367) There was a data race on the utf8 field between `PyUnicode_SET_UTF8` and `_PyUnicode_CheckConsistency`. Use the `_PyUnicode_UTF8()` accessor, which uses an atomic load internally, to avoid the data race.	2025-01-02 14:02:54 -05:00
Kumar Aditya	3c168f7f79	gh-128013: fix data race in `PyUnicode_AsUTF8AndSize` on free-threading (#128021)	2024-12-19 17:08:32 +05:30
Victor Stinner	f802c8bf87	gh-128013: Convert unicodeobject.c macros to functions (#128061) Convert unicodeobject.c macros to static inline functions. * Add _PyUnicode_SET_UTF8() and _PyUnicode_SET_UTF8_LENGTH() macros. * Add PyUnicode_HASH() and PyUnicode_SET_HASH() macros. * Remove unused _PyUnicode_KIND() and _PyUnicode_GET_LENGTH() macros.	2024-12-18 16:34:31 +01:00
Inada Naoki	5dd775bed0	gh-126024: unicodeobject: optimize find_first_nonascii (GH-127790) Remove 1 branch.	2024-12-13 17:21:46 +01:00
Bénédikt Tran	36c6178d37	gh-126024: fix UBSan failure in `unicodeobject.c:find_first_nonascii` (GH-127566)	2024-12-06 09:31:30 -05:00
Victor Stinner	bf21e2160d	Fix Unicode encode_wstr_utf8() (#127420) Raise RuntimeError instead of RuntimeWarning.	2024-12-02 11:14:47 +01:00
Inada Naoki	7043bbd1ca	gh-127417: fix UTF-8 decoder optimization on AIX (#127433)	2024-11-30 21:52:37 +09:00
Inada Naoki	322b486010	gh-126024: optimize UTF-8 decoder for short non-ASCII string (#126025)	2024-11-29 19:48:02 +09:00
Pablo Galindo Salgado	30aeb00d36	gh-126076: Account for relocated objects in tracemalloc (#126077)	2024-11-19 10:35:17 +00:00
Eric Snow	6f26d496d3	gh-125286: Share the Main Refchain With Legacy Interpreters (gh-125709) They used to be shared, before 3.12. Returning to sharing them resolves a failure on Py_TRACE_REFS builds. Co-authored-by: Petr Viktorin <encukou@gmail.com>	2024-10-23 10:10:06 -06:00
Victor Stinner	1639d934b9	gh-125196: Add a free list to PyUnicodeWriter (#125227)	2024-10-10 12:11:06 +02:00
Victor Stinner	1b2a5485f9	gh-125196: PyUnicodeWriter_Discard(NULL) does nothing (#125222)	2024-10-09 23:32:02 +00:00
Victor Stinner	ee3167b978	gh-125196: Add fast-path for int in PyUnicodeWriter_WriteStr() (#125214) PyUnicodeWriter_WriteStr() and PyUnicodeWriter_WriteRepr() now call directly _PyLong_FormatWriter() if the argument is an int.	2024-10-10 00:01:02 +02:00
Eric Snow	f2cb399470	gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-124865) Fix a crash caused by immortal interned strings being shared between sub-interpreters that use basic single-phase init. In that case, the string can be used by an interpreter that outlives the interpreter that created and interned it. For interpreters that share obmalloc state, also share the interned dict with the main interpreter. This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS failures identified by gh-124785.	2024-10-09 11:32:16 -06:00
Victor Stinner	e0c87c64b1	gh-124502: Remove _PyUnicode_EQ() function (#125114) * Replace unicode_compare_eq() with unicode_eq(). * Use unicode_eq() in setobject.c. * Replace _PyUnicode_EQ() with _PyUnicode_Equal(). * Remove unicode_compare_eq() and _PyUnicode_EQ().	2024-10-09 10:15:17 +02:00
Victor Stinner	a7f0727ca5	gh-124502: Add PyUnicode_Equal() function (#124504)	2024-10-07 21:24:53 +00:00
T. Wouters	7bdfabe2d1	gh-124785: Revert "gh-116510: Fix crash due to shared immortal interned strings (gh-124646)" (gh-124807) Revert "gh-116510: Fix crash due to shared immortal interned strings. (gh-124646)" This reverts commit `98b2ed7e23`.	2024-09-30 16:41:46 -07:00
Neil Schemenauer	98b2ed7e23	gh-116510: Fix crash due to shared immortal interned strings. (gh-124646)	2024-09-26 19:16:51 -07:00
Petr Viktorin	da5855e99a	gh-112301: Use literal format strings in unicode_fromformat_arg (GH-124203)	2024-09-25 19:46:01 +02:00
Victor Stinner	ef9d54703f	gh-107954, PEP 741: Add PyInitConfig C API (#123502) Add Doc/c-api/config.rst documentation.	2024-09-03 12:33:49 +00:00
Victor Stinner	d8e69b2c1b	gh-122854: Add Py_HashBuffer() function (#122855)	2024-08-30 15:42:27 +00:00
Serhiy Storchaka	1a0b828994	gh-122561: Clean up and microoptimize str.translate and charmap codec (GH-122932) * Replace PyLong_AS_LONG() with PyLong_AsLong(). * Call PyLong_AsLong() only once per the replacement code. * Use PyMapping_GetOptionalItem() instead of PyObject_GetItem().	2024-08-28 12:11:13 +03:00
Eddie Elizondo	3203a74129	gh-113190: Reenable non-debug interned string cleanup (GH-113601)	2024-08-15 11:55:09 +00:00
Jelle Zijlstra	53ebb6232a	gh-122888: Fix crash on certain calls to str() (#122889) Fixes #122888	2024-08-12 09:20:09 -07:00
Victor Stinner	fda6bd842a	Replace PyObject_Del with PyObject_Free (#122453) PyObject_Del() is just an alias to PyObject_Free() kept for backward compatibility. Use directly PyObject_Free() instead.	2024-08-01 14:12:33 +02:00
Petr Viktorin	bb09ba6792	gh-122291: Intern latin-1 one-byte strings at startup (GH-122303)	2024-07-27 10:27:06 +02:00
Victor Stinner	bfdbeac355	gh-121849: Fix PyUnicodeWriter_WriteSubstring() crash if len=0 (#121896) Do nothing if start=end.	2024-07-17 10:26:05 +02:00
Petr Viktorin	b4aedb23ae	gh-113993: Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (#121364) * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs * Document immortality in some functions that take `const char *` This is PyUnicode_InternFromString; PyDict_SetItemString, PyObject_SetAttrString; PyObject_DelAttrString; PyUnicode_InternFromString; and the PyModule_Add convenience functions. Always point out a non-immortalizing alternative. Don't immortalize user-provided attr names in _ctypes	2024-07-16 15:36:21 +02:00


1737 Commits
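
Note on gh-82045 (3402e133ef): the commit message says a character is "printable" unless it falls in one of the "Other" or "Separator" Unicode categories, and that a test was added to confirm the implementation agrees. The Python-level docs additionally except the ASCII space. Below is a minimal sketch of such a check, assuming CPython and the standard `unicodedata` module. The helper names `is_printable_by_category` and `find_mismatches` are made up for illustration and are not taken from the commit.

```python
import sys
import unicodedata


def is_printable_by_category(char: str) -> bool:
    """Hypothetical helper: apply the category rule to a single character."""
    category = unicodedata.category(char)
    # "Other" categories start with "C" and "Separator" categories with "Z".
    # The ASCII space (U+0020, category Zs) is the one documented exception.
    return category[0] not in ("C", "Z") or char == " "


def find_mismatches(limit: int = sys.maxunicode + 1) -> list[int]:
    """Return code points below `limit` where str.isprintable() disagrees."""
    return [
        cp for cp in range(limit)
        if chr(cp).isprintable() != is_printable_by_category(chr(cp))
    ]


if __name__ == "__main__":
    # Scanning the full range takes a moment; pass a smaller limit to sample.
    print(len(find_mismatches()), "mismatching code points")
```

An empty result from `find_mismatches()` would indicate that the two definitions agree for the Unicode database bundled with the running interpreter.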