Commit Graph

1728 Commits

Author SHA1 Message Date
Victor Stinner 22706843e0
gh-131238: Remove many includes from pycore_interp.h (#131472) 2025-03-19 17:46:24 +00:00
Mark Shannon a1aeec61c4
GH-131238: Core header refactor (GH-131250)
* Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h

* Removes many cross-header dependencies
2025-03-17 09:19:04 +00:00
Mark Shannon f30376c650
GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal (GH-131140) 2025-03-12 16:54:10 +00:00
Victor Stinner ed8675c571
gh-111178: Fix function signatures of unicodeiter (#130684) 2025-03-04 10:33:09 +01:00
Sergey Miryanov 3a7f17c7e2
gh-130790: Remove references about unicode's readiness from comments (#130801) 2025-03-03 19:18:09 +00:00
Sergey B Kirpichev f39a07be47
gh-87790: support thousands separators for formatting fractional part of floats (#125304)
```pycon
>>> f"{123_456.123_456:_._f}"  # Whole and fractional
'123_456.123_456'
>>> f"{123_456.123_456:_f}"    # Integer component only
'123_456.123456'
>>> f"{123_456.123_456:._f}"   # Fractional component only
'123456.123_456'
>>> f"{123_456.123_456:.4_f}"  # with precision
'123456.1_235'
```
2025-02-25 16:27:07 +01:00
Sam Gross b9d2ee687c
gh-129701: Fix a data race in `intern_common` in the free threaded build (GH-130089)
* gh-129701: Fix a data race in `intern_common` in the free threaded build

* Use a mutex to avoid potentially returning a non-immortalized string,
  because immortalization happens after the insertion into the interned
  dict.

* Use `Py_DECREF()` calls instead of `Py_SET_REFCNT(s, Py_REFCNT(s) - 2)`
  for thread-safety. This code path isn't performance sensistive, so
  just use `Py_DECREF()` unconditionally for simplicity.
2025-02-17 14:15:40 +01:00
Stan Ulbrych 3402e133ef
gh-82045: Correct and deduplicate "isprintable" docs; add test. (GH-130118)
We had the definition of what makes a character "printable" documented in three places, giving two different definitions.
The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that.

With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database.  That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.

Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.

Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.

Also add a thorough test that the implementation agrees with this definition.

Author:    Greg Price <gnprice@gmail.com>

Co-authored-by: Greg Price <gnprice@gmail.com>
2025-02-14 18:16:47 +01:00
Victor Stinner 0373926260
gh-129354: Use PyErr_FormatUnraisable() function (#129511)
Replace PyErr_WriteUnraisable() with PyErr_FormatUnraisable().
2025-01-31 13:16:08 +01:00
Victor Stinner a810cb89f1
gh-89188: Implement PyUnicode_KIND() as a function (#129412)
Implement PyUnicode_KIND() and PyUnicode_DATA() as function, in
addition to the macros with the same names. The macros rely on C bit
fields which have compiler-specific layout.
2025-01-30 11:27:27 +00:00
Umar Butler 8d8b854824
gh-128016: Improved invalid escape sequence warning message (#128020) 2025-01-15 18:00:54 +01:00
Donghee Na ae23a012e6
gh-128137: Update PyASCIIObject to handle interned field with the atomic operation (gh-128196) 2025-01-05 18:17:06 +09:00
Alexander Shadchin 46cb6340d7
gh-127903: Fix a crash on debug builds when calling `Objects/unicodeobject::_copy_characters`` (#127876) 2025-01-03 18:47:58 +00:00
Sam Gross 8eebe4e6d0
gh-128212: Fix race in `_PyUnicode_CheckConsistency` (GH-128367)
There was a data race on the utf8 field between `PyUnicode_SET_UTF8` and
`_PyUnicode_CheckConsistency`. Use the `_PyUnicode_UTF8()` accessor,
which uses an atomic load internally, to avoid the data race.
2025-01-02 14:02:54 -05:00
Kumar Aditya 3c168f7f79
gh-128013: fix data race in `PyUnicode_AsUTF8AndSize` on free-threading (#128021) 2024-12-19 17:08:32 +05:30
Victor Stinner f802c8bf87
gh-128013: Convert unicodeobject.c macros to functions (#128061)
Convert unicodeobject.c macros to static inline functions.

* Add _PyUnicode_SET_UTF8() and _PyUnicode_SET_UTF8_LENGTH() macros.
* Add PyUnicode_HASH() and PyUnicode_SET_HASH() macros.
* Remove unused _PyUnicode_KIND() and _PyUnicode_GET_LENGTH() macros.
2024-12-18 16:34:31 +01:00
Inada Naoki 5dd775bed0
gh-126024: unicodeobject: optimize find_first_nonascii (GH-127790)
Remove 1 branch.
2024-12-13 17:21:46 +01:00
Bénédikt Tran 36c6178d37
gh-126024: fix UBSan failure in `unicodeobject.c:find_first_nonascii` (GH-127566) 2024-12-06 09:31:30 -05:00
Victor Stinner bf21e2160d
Fix Unicode encode_wstr_utf8() (#127420)
Raise RuntimeError instead of RuntimeWarning.
2024-12-02 11:14:47 +01:00
Inada Naoki 7043bbd1ca
gh-127417: fix UTF-8 decoder optimization on AIX (#127433) 2024-11-30 21:52:37 +09:00
Inada Naoki 322b486010
gh-126024: optimize UTF-8 decoder for short non-ASCII string (#126025) 2024-11-29 19:48:02 +09:00
Pablo Galindo Salgado 30aeb00d36
gh-126076: Account for relocated objects in tracemalloc (#126077) 2024-11-19 10:35:17 +00:00
Eric Snow 6f26d496d3
gh-125286: Share the Main Refchain With Legacy Interpreters (gh-125709)
They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
2024-10-23 10:10:06 -06:00
Victor Stinner 1639d934b9
gh-125196: Add a free list to PyUnicodeWriter (#125227) 2024-10-10 12:11:06 +02:00
Victor Stinner 1b2a5485f9
gh-125196: PyUnicodeWriter_Discard(NULL) does nothing (#125222) 2024-10-09 23:32:02 +00:00
Victor Stinner ee3167b978
gh-125196: Add fast-path for int in PyUnicodeWriter_WriteStr() (#125214)
PyUnicodeWriter_WriteStr() and PyUnicodeWriter_WriteRepr() now call
directly _PyLong_FormatWriter() if the argument is an int.
2024-10-10 00:01:02 +02:00
Eric Snow f2cb399470
gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-124865)
Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.

This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785.
2024-10-09 11:32:16 -06:00
Victor Stinner e0c87c64b1
gh-124502: Remove _PyUnicode_EQ() function (#125114)
* Replace unicode_compare_eq() with unicode_eq().
* Use unicode_eq() in setobject.c.
* Replace _PyUnicode_EQ() with _PyUnicode_Equal().
* Remove unicode_compare_eq() and _PyUnicode_EQ().
2024-10-09 10:15:17 +02:00
Victor Stinner a7f0727ca5
gh-124502: Add PyUnicode_Equal() function (#124504) 2024-10-07 21:24:53 +00:00
T. Wouters 7bdfabe2d1
gh-124785: Revert "gh-116510: Fix crash due to shared immortal interned strings (gh-124646)" (gh-124807)
Revert "gh-116510: Fix crash due to shared immortal interned strings. (gh-124646)"

This reverts commit 98b2ed7e23.
2024-09-30 16:41:46 -07:00
Neil Schemenauer 98b2ed7e23
gh-116510: Fix crash due to shared immortal interned strings. (gh-124646) 2024-09-26 19:16:51 -07:00
Petr Viktorin da5855e99a
gh-112301: Use literal format strings in unicode_fromformat_arg (GH-124203) 2024-09-25 19:46:01 +02:00
Victor Stinner ef9d54703f
gh-107954, PEP 741: Add PyInitConfig C API (#123502)
Add Doc/c-api/config.rst documentation.
2024-09-03 12:33:49 +00:00
Victor Stinner d8e69b2c1b
gh-122854: Add Py_HashBuffer() function (#122855) 2024-08-30 15:42:27 +00:00
Serhiy Storchaka 1a0b828994
gh-122561: Clean up and microoptimize str.translate and charmap codec (GH-122932)
* Replace PyLong_AS_LONG() with PyLong_AsLong().
* Call PyLong_AsLong() only once per the replacement code.
* Use PyMapping_GetOptionalItem() instead of PyObject_GetItem().
2024-08-28 12:11:13 +03:00
Eddie Elizondo 3203a74129
gh-113190: Reenable non-debug interned string cleanup (GH-113601) 2024-08-15 11:55:09 +00:00
Jelle Zijlstra 53ebb6232a
gh-122888: Fix crash on certain calls to str() (#122889)
Fixes #122888
2024-08-12 09:20:09 -07:00
Victor Stinner fda6bd842a
Replace PyObject_Del with PyObject_Free (#122453)
PyObject_Del() is just a alias to PyObject_Free() kept for backward
compatibility. Use directly PyObject_Free() instead.
2024-08-01 14:12:33 +02:00
Petr Viktorin bb09ba6792
gh-122291: Intern latin-1 one-byte strings at startup (GH-122303) 2024-07-27 10:27:06 +02:00
Victor Stinner bfdbeac355
gh-121849: Fix PyUnicodeWriter_WriteSubstring() crash if len=0 (#121896)
Do nothing if start=end.
2024-07-17 10:26:05 +02:00
Petr Viktorin b4aedb23ae
gh-113993: Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (#121364)
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

* Document immortality in some functions that take `const char *`

This is PyUnicode_InternFromString;
PyDict_SetItemString, PyObject_SetAttrString;
PyObject_DelAttrString; PyUnicode_InternFromString;
and the PyModule_Add convenience functions.

Always point out a non-immortalizing alternative.

* Don't immortalize user-provided attr names in _ctypes
2024-07-16 15:36:21 +02:00
Petr Viktorin 956270d08d
gh-113993: For string interning, do not rely on (or assert) _Py_IsImmortal (GH-121358)
Older stable ABI extensions are allowed to make immortal objects mortal.
Instead, use `_PyUnicode_STATE` (`interned` and `statically_allocated`).
2024-07-16 15:17:29 +02:00
Bénédikt Tran 6343486eb6
gh-121165: protect macro expansion of `ADJUST_INDICES` with do-while(0) (#121166) 2024-07-02 16:27:51 +05:30
Victor Stinner 12af8ec864
gh-121040: Use __attribute__((fallthrough)) (#121044)
Fix warnings when using -Wimplicit-fallthrough compiler flag.

Annotate explicitly "fall through" switch cases with a new
_Py_FALLTHROUGH macro which uses __attribute__((fallthrough)) if
available. Replace "fall through" comments with _Py_FALLTHROUGH.

Add _Py__has_attribute() macro. No longer define __has_attribute()
macro if it's not defined. Move also _Py__has_builtin() at the top
of pyport.h.

Co-Authored-By: Nikita Sobolev <mail@sobolevn.me>
2024-06-27 09:58:44 +00:00
Xie Yanbo 0153fd0940
Fix typos in comments (#120821) 2024-06-24 19:47:00 +02:00
Steve Dower e731554337
Fixes loop variables to be the same types as their limit (GH-120958) 2024-06-24 17:11:47 +01:00
Victor Stinner 2e157851e3
gh-119182: Add PyUnicodeWriter_WriteUCS4() function (#120849) 2024-06-24 17:40:39 +02:00
Serhiy Storchaka 6eb23b1311
gh-70278: Fix PyUnicode_FromFormat() with precision for %s and %V (GH-120365)
PyUnicode_FromFormat() no longer produces the ending \ufffd
character for truncated C string when use precision with %s and %V.
It now truncates the string before the start of truncated multibyte sequences.
2024-06-24 18:07:07 +03:00
Victor Stinner e213475495
gh-119182: Add checks to PyUnicodeWriter APIs (#120870) 2024-06-22 17:25:55 +02:00
Victor Stinner 879d1f28bb
gh-119182: Use PyUnicodeWriter_WriteWideChar() (#120851)
Use PyUnicodeWriter_WriteWideChar() in PyUnicode_FromFormat()
2024-06-22 08:58:22 +02:00