The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed
reference, has been replaced by using one of the following functions, which
return a strong reference and distinguish a missing attribute from an error:
_PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(),
_PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
* Implement C recursion protection with limit pointers for Linux, MacOS and Windows
* Remove calls to PyOS_CheckStack
* Add stack protection to parser
* Make tests more robust to low stacks
* Improve error messages for stack overflow
This reduces the size of _PyInterpreterFrame by 8 bytes on 64-bit
platforms using the free threading build due to alignment requirements.
This allows for slightly more recursive calls into the interpreter (from
C), but `test_call.test_super_deep` still crashes.
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.
Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.
Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
On Windows, `long` is a signed 32-bit integer so it can't represent
`0xffff_ffff` without overflow. Windows exit codes are unsigned 32-bit
integers, so if a child process exits with `-1`, it will be represented
as `0xffff_ffff`.
Also fix a number of other possible cases where `_Py_HandleSystemExit`
could return with an exception set, leading to a `SystemError` (or
fatal error in debug builds) later on during shutdown.
The free-threaded build partially stores heap type reference counts in
distributed manner in per-thread arrays. This avoids reference count
contention when creating or destroying instances.
Co-authored-by: Ken Jin <kenjin@python.org>
The `PyStructSequence` destructor would crash if it was deallocated after
its type's dictionary was cleared by the GC, because it couldn't compute
the "real size" of the instance. This could occur with relatively
straightforward code in the free-threaded build or with a reference
cycle involving the type in the default build, due to differing orders
in which `tp_clear()` was called.
Account for the non-sequence fields in `tp_basicsize` and use that,
along with `Py_SIZE()`, to compute the "real" size of a
`PyStructSequence` in the dealloc function. This avoids the accesses to
the type's dictionary during dealloc, which were unsafe.
* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
- `_PyUnicode_InternMortal`
- `_PyUnicode_InternImmortal`
- `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly.
You should use `_PyUnicode_InternImmortal` instead:
- Strings should be interned before immortalization, otherwise you're possibly
interning a immortalizing copy.
- `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
`SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
- `_Py_ID`
- `_Py_STR` (including the empty string)
- one-character latin-1 singletons
Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions
Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>
* Add docs for new APIs
* Add soft-deprecation notices
* Add What's New porting entries
* Update comments referencing `PyFrame_LocalsToFast()` to mention the proxy instead
* Other related cleanups found when looking for refs to the deprecated APIs
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
The function returns `True` or `False` depending on whether the GIL is
currently enabled. In the default build, it always returns `True`
because the GIL is always enabled.
Most mutable data is protected by a striped lock that is keyed on the
referenced object's address. The weakref's hash is protected using the
weakref's per-object lock.
Note that this only affects free-threaded builds. Apart from some minor
refactoring, the added code is all either gated by `ifdef`s or is a no-op
(e.g. `Py_BEGIN_CRITICAL_SECTION`).
This adds a stop the world pause to make the two functions thread-safe
when the GIL is disabled in the free-threaded build.
Additionally, the main test thread may call `sys._current_exceptions()` as
soon as `g_raised.set()` is called. The background thread may not yet reach
the `leave_g.wait()` line.
This brings the code under test.support.interpreters, and the corresponding extension modules, in line with recent updates to PEP 734.
(Note: PEP 734 has not been accepted at this time. However, we are using an internal copy of the implementation in the test suite to exercise the existing subinterpreters feature.)
* gh-112529: Remove PyGC_Head from object pre-header in free-threaded build
This avoids allocating space for PyGC_Head in the free-threaded build.
The GC implementation for free-threaded CPython does not use the
PyGC_Head structure.
* The trashcan mechanism uses the `ob_tid` field instead of `_gc_prev`
in the free-threaded build.
* The GDB libpython.py file now determines the offset of the managed
dict field based on whether the running process is a free-threaded
build. Those are identified by the `ob_ref_local` field in PyObject.
* Fixes `_PySys_GetSizeOf()` which incorrectly incorrectly included the
size of `PyGC_Head` in the size of static `PyTypeObject`.
Statistics gathering is now off by default. Use the "-X pystats"
command line option or set the new PYTHONSTATS environment variable
to 1 to turn statistics gathering on at Python startup.
Statistics are no longer dumped at exit if statistics gathering was
off or statistics have been cleared.
Changes:
* Add PYTHONSTATS environment variable.
* sys._stats_dump() now returns False if statistics are not dumped
because they are all equal to zero.
* Add PyConfig._pystats member.
* Add tests on sys functions and on setting PyConfig._pystats to 1.
* Add Include/cpython/pystats.h and Include/internal/pycore_pystats.h
header files.
* Rename '_py_stats' variable to '_Py_stats'.
* Exclude Include/cpython/pystats.h from the Py_LIMITED_API.
* Move pystats.h include from object.h to Python.h.
* Add _Py_StatsOn() and _Py_StatsOff() functions. Remove
'_py_stats_struct' variable from the API: make it static in
specialize.c.
* Document API in Include/pystats.h and Include/cpython/pystats.h.
* Complete pystats documentation in Doc/using/configure.rst.
* Don't write "all zeros" stats: if _stats_off() and _stats_clear()
or _stats_dump() were called.
* _PyEval_Fini() now always call _Py_PrintSpecializationStats() which
does nothing if stats are all zeros.
Co-authored-by: Michael Droettboom <mdboom@gmail.com>
* Add get_recursion_available() and get_recursion_depth() functions
to the test.support module.
* Change infinite_recursion() default max_depth from 75 to 100.
* Fix test_tomllib recursion tests for WASI buildbots: reduce the
recursion limit and compute the maximum nested array/dict depending
on the current available recursion limit.
* test.pythoninfo logs sys.getrecursionlimit().
* Enhance test_sys tests on sys.getrecursionlimit()
and sys.setrecursionlimit().
Move the private _PyErr_WriteUnraisableMsg() functions to the
internal C API (pycore_pyerrors.h).
Move write_unraisable_exc() from _testcapi to _testinternalcapi.
Python built with "configure --with-trace-refs" (tracing references)
is now ABI compatible with Python release build and debug build.
Moreover, it now also supports the Limited API.
Change Py_TRACE_REFS build:
* Remove _PyObject_EXTRA_INIT macro.
* The PyObject structure no longer has two extra members (_ob_prev
and _ob_next).
* Use a hash table (_Py_hashtable_t) to trace references (all
objects): PyInterpreterState.object_state.refchain.
* Py_TRACE_REFS build is now ABI compatible with release build and
debug build.
* Limited C API extensions can now be built with Py_TRACE_REFS:
xxlimited, xxlimited_35, _testclinic_limited.
* No longer rename PyModule_Create2() and PyModule_FromDefAndSpec2()
functions to PyModule_Create2TraceRefs() and
PyModule_FromDefAndSpec2TraceRefs().
* _Py_PrintReferenceAddresses() is now called before
finalize_interp_delete() which deletes the refchain hash table.
* test_tracemalloc find_trace() now also filters by size to ignore
the memory allocated by _PyRefchain_Trace().
Test changes for Py_TRACE_REFS:
* Add test.support.Py_TRACE_REFS constant.
* Add test_sys.test_getobjects() to test sys.getobjects() function.
* test_exceptions skips test_recursion_normalizing_with_no_memory()
and test_memory_error_in_PyErr_PrintEx() if Python is built with
Py_TRACE_REFS.
* test_repl skips test_no_memory().
* test_capi skisp test_set_nomemory().
We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.
Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
Move private _PyMem functions to the internal C API (pycore_pymem.h):
* _PyMem_GetCurrentAllocatorName()
* _PyMem_RawStrdup()
* _PyMem_RawWcsdup()
* _PyMem_Strdup()
No longer export these functions.
Move pymem_getallocatorsname() function from _testcapi to _testinternalcapi,
since the API moved to the internal C API.