Optimize `LOAD_FAST` opcodes into faster versions that load borrowed references onto the operand stack when we can prove that the lifetime of the local outlives the lifetime of the temporary that is loaded onto the stack.
Minor readability fix in PyUnstable_GC_VisitObjects
Replaces `if (visit_generation())` with `if (visit_generation() < 0)`,
since we are checking for the failure case, and it's confusing to have
that be implicitly `true`.
Also fixes a misspelt variable name.
The reference count fields, such as `ob_tid` and `ob_ref_shared`, may be
accessed concurrently in the free threading build by a `_Py_TryXGetRef`
or similar operation. The PyObject header fields will be initialized by
`_PyObject_Init`, so only call `memset()` to zero-initialize the remainder
of the allocation.
* Mark almost all reachable objects before doing collection phase
* Add stats for objects marked
* Visit new frames before each increment
* Update docs
* Clearer calculation of work to do.
* Mark almost all reachable objects before doing collection phase
* Add stats for objects marked
* Visit new frames before each increment
* Remove lazy dict tracking
* Update docs
* Clearer calculation of work to do.
Fix a bug that can cause a crash when sub-interpreters use "basic"
single-phase extension modules. Shared objects could refer to PyGC_Head
nodes that had been freed as part of interpreter shutdown.
Use a `_PyStackRef` and defer the reference to `f_executable` when
possible. This avoids some reference count contention in the common case
of executing the same code object from multiple threads concurrently in
the free-threaded build.
The free-threaded GC now visits interpreter stacks to keep objects
that use deferred reference counting alive.
Interpreter frames are zero initialized in the free-threaded GC so
that the GC doesn't see garbage data. This is a temporary measure
until stack spilling around escaping calls is implemented.
Co-authored-by: Ken Jin <kenjin@python.org>
Use the new public Raw functions:
* _PyTime_PerfCounterUnchecked() with PyTime_PerfCounterRaw()
* _PyTime_TimeUnchecked() with PyTime_TimeRaw()
* _PyTime_MonotonicUnchecked() with PyTime_MonotonicRaw()
Remove internal functions:
* _PyTime_PerfCounterUnchecked()
* _PyTime_TimeUnchecked()
* _PyTime_MonotonicUnchecked()
Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
<pycore_time.h> include is no longer needed to get the PyTime_t type
in internal header files. This type is now provided by <Python.h>
include. Add <pycore_time.h> includes to C files instead.
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
* gh-112529: Implement GC for free-threaded builds
This implements a mark and sweep GC for the free-threaded builds of
CPython. The implementation relies on mimalloc to find GC tracked
objects (i.e., "containers").
* gh-112529: Use GC heaps for GC allocations in free-threaded builds
The free-threaded build's garbage collector implementation will need to
find GC objects by traversing mimalloc heaps. This hooks up the
allocation calls with the correct heaps by using a thread-local
"current_obj_heap" variable.
* Refactor out setting heap based on type
This splits part of Modules/gcmodule.c of into Python/gc.c, which
now contains the core garbage collection implementation. The Python
module remain in the Modules/gcmodule.c file.