Commit Graph

159 Commits

Author SHA1 Message Date
Mark Shannon 9fbd66a93d
GH-133912: Fix `PyObject_GenericSetDict` to handle inline values (GH-134725) 2025-05-28 19:03:41 +01:00
Neil Schemenauer fbbbc10055
gh-127266: avoid data races when updating type slots (gh-133177)
In the free-threaded build, avoid data races caused by updating type
slots or type flags after the type was initially created.  For those
(typically rare) cases, use the stop-the-world mechanism.  Remove the
use of atomics when reading or writing type flags.
2025-05-27 18:27:41 -07:00
Kumar Aditya a380d57873
gh-134043: use stackrefs in vectorcalling methods (#134044)
Adds `_PyObject_GetMethodStackRef` which uses stackrefs and takes advantage of deferred reference counting in free-threading while calling method objects in vectorcall.
2025-05-27 22:28:27 +05:30
Neil Schemenauer eecafc3380
Revert gh-127266: avoid data races when updating type slots (gh-131174) (gh-133129)
This is triggering deadlocks in test_opcache.  See GH-133130 for stack trace.
2025-04-28 23:38:29 -07:00
Neil Schemenauer e414a2d81c
gh-127266: avoid data races when updating type slots (gh-131174)
In the free-threaded build, avoid data races caused by updating type slots
or type flags after the type was initially created.  For those (typically
rare) cases, use the stop-the-world mechanism.  Remove the use of atomics
when reading or writing type flags.  The use of atomics is not sufficient to
avoid races (since flags are sometimes read without a lock and without
atomics) and are no longer required.
2025-04-28 20:28:44 +00:00
Sam Gross da53660f35
gh-131586: Avoid refcount contention in context managers (gh-131851)
This avoid reference count contention in the free threading build
when calling special methods like `__enter__` and `__exit__`.
2025-04-21 15:54:25 -04:00
Sam Gross 67fbfb42bd
gh-131586: Avoid refcount contention in some "special" calls (#131588)
In the free threaded build, the `_PyObject_LookupSpecial()` call can lead to
reference count contention on the returned function object becuase it
doesn't use stackrefs. Refactor some of the callers to use
`_PyObject_MaybeCallSpecialNoArgs`, which uses stackrefs internally.

This fixes the scaling bottleneck in the "lookup_special" microbenchmark
in `ftscalingbench.py`. However, the are still some uses of
`_PyObject_LookupSpecial()` that need to be addressed in future PRs.
2025-03-26 14:38:47 -04:00
Victor Stinner 1a082085ae
gh-131238: Remove pycore_object_deferred.h from pycore_object.h (#131549)
Remove also pycore_function.h from pycore_typeobject.h.
2025-03-21 16:44:10 +00:00
Mark Shannon 684a759c20
GH-127705: Don't call _Py_ForgetReference before _Py_Dealloc (GH-131508) 2025-03-20 15:45:43 +00:00
Victor Stinner 6c776abb90
gh-131238: Cleanup pycore_runtime.h includes (#131486) 2025-03-20 00:47:30 +00:00
Mark Shannon fd545d735d
GH-127705: Move mortal decrefs to internal header and make sure _PyReftracerTrack is called 2025-03-17 17:23:50 +00:00
Mark Shannon a45f25361d
GH-131238: More refactoring of core header files (GH-131351)
Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.
2025-03-17 14:41:05 +00:00
Mark Shannon f30376c650
GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal (GH-131140) 2025-03-12 16:54:10 +00:00
Mark Shannon 2bef8ea8ea
GH-127705: Use `_PyStackRef`s in the default build. (GH-127875) 2025-03-10 14:06:56 +00:00
Sam Gross f963239ff1
gh-130202: Fix bug in `_PyObject_ResurrectEnd` in free threaded build (gh-130281)
This fixes a fairly subtle bug involving finalizers and resurrection in
debug free threaded builds: if `_PyObject_ResurrectEnd` returns `1`
(i.e., the object was resurrected by a finalizer), it's not safe to
access the object because it might still be deallocated. For example:

 * The finalizer may have exposed the object to another thread. That
   thread may hold the last reference and concurrently deallocate it any
   time after `_PyObject_ResurrectEnd()` returns `1`.
 * `_PyObject_ResurrectEnd()` may call `_Py_brc_queue_object()`, which
   may internally deallocate the object immediately if the owning thread
   is dead.

Therefore, it's important not to access the object after it's
resurrected. We only violate this in two cases, and only in debug
builds:

 * We assert that the object is tracked appropriately. This is now moved
   up betewen the finalizer and the `_PyObject_ResurrectEnd()` call.

 * The `--with-trace-refs` builds may need to remember the object if
   it's resurrected. This is now handled by `_PyObject_ResurrectStart()`
   and `_PyObject_ResurrectEnd()`.

Note that `--with-trace-refs` is currently disabled in `--disable-gil`
builds because the refchain hash table isn't thread-safe, but this
refactoring avoids an additional thread-safety issue.
2025-02-25 12:03:28 -05:00
Dino Viehland 28f5e3de57
gh-129984: Mark immortal objects as deferred (#129985)
Mark immortal objects as deferred
2025-02-13 09:01:43 -08:00
Kumar Aditya 94cd2e0dde
gh-129289: fix crash when task finalizer is not called in asyncio (#129840) 2025-02-10 17:03:59 +05:30
Mikhail Efimov 8e20e42cc6
gh-125723: Fix crash with f_locals when generator frame outlive their generator (#126956)
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>
2025-01-22 15:50:01 +03:00
Sam Gross d66c08aa75
gh-128923: Use zero to indicate unassigned unique id (#128925)
In the free threading build, the per thread reference counting uses a
unique id for some objects to index into the local reference count
table. Use 0 instead of -1 to indicate that the id is not assigned. This
avoids bugs where zero-initialized heap type objects look like they have
a unique id assigned.
2025-01-17 16:42:27 +01:00
Victor Stinner 36c5e3bcc2
gh-128679: Redesign tracemalloc locking (#128888)
* Use TABLES_LOCK() to protect 'tracemalloc_config.tracing'.
* Hold TABLES_LOCK() longer while accessing tables.
* tracemalloc_realloc() and tracemalloc_free() no longer
  remove the trace on reentrant call.
* _PyTraceMalloc_Stop() unregisters _PyTraceMalloc_TraceRef().
* _PyTraceMalloc_GetTraces() sets the reentrant flag.
* tracemalloc_clear_traces_unlocked() sets the reentrant flag.
2025-01-15 20:22:44 +00:00
Kumar Aditya 7dc41ad6a7
gh-128002: fix `asyncio.all_tasks` against concurrent deallocations of tasks (#128541) 2025-01-09 21:26:00 +05:30
Ed Nutting ab05beb8ce
gh-127599: Fix _Py_RefcntAdd missing calls to _Py_INCREF_STAT_INC/_Py_INCREF_IMMORTAL_STAT_INC (#127717)
Previously, `_Py_RefcntAdd` hasn't called 
`_Py_INCREF_STAT_INC/_Py_INCREF_IMMORTAL_STAT_INC` which is incorrect. 

Now it has been fixed.
2024-12-15 15:51:03 +02:00
Mark Shannon 487fdbed40
GH-125174: Fix compiler warning (GH-127860)
Fix compiler warning
2024-12-12 11:22:20 +00:00
Mark Shannon bc262de06b
GH-125174: Mark objects as statically allocated. (#127797)
* Set a bit in the unused part of the refcount on 64 bit machines and the free-threaded build.

* Use the top of the refcount range on 32 bit machines
2024-12-11 17:37:38 +00:00
Sam Gross f4f530804b
gh-127582: Make object resurrection thread-safe for free threading. (GH-127612)
Objects may be temporarily "resurrected" in destructors when calling
finalizers or watcher callbacks. We previously undid the resurrection
by decrementing the reference count using `Py_SET_REFCNT`. This was not
thread-safe because other threads might be accessing the object
(modifying its reference count) if it was exposed by the finalizer,
watcher callback, or temporarily accessed by a racy dictionary or list
access.

This adds internal-only thread-safe functions for temporary object
resurrection during destructors.
2024-12-05 16:07:31 -05:00
mpage dabcecfd6d
gh-115999: Enable specialization of `CALL` instructions in free-threaded builds (#127123)
The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below.

A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:

Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.

Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.

Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.

A few other miscellaneous changes were also needed:

Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.

Add missing co_tlbc for _Py_InitCleanup.

Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
2024-12-03 11:20:20 -08:00
Mark Shannon a8dd821d5b
GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Update docs

* Clearer calculation of work to do.
2024-12-02 10:12:17 +00:00
Jelle Zijlstra dcf629213b
gh-119180: Add VALUE_WITH_FAKE_GLOBALS format to annotationlib (#124415) 2024-11-26 15:40:13 +00:00
mpage 09c240f20c
gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607)
Enable specialization of LOAD_GLOBAL in free-threaded builds.

Thread-safety of specialization in free-threaded builds is provided by the following:

A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:

Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
2024-11-21 11:22:21 -08:00
Pablo Galindo Salgado 30aeb00d36
gh-126076: Account for relocated objects in tracemalloc (#126077) 2024-11-19 10:35:17 +00:00
Hugo van Kemenade 899fdb213d
Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983) 2024-11-19 11:25:09 +02:00
Mark Shannon b0fcc2c47a
GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Remove lazy dict tracking

* Update docs

* Clearer calculation of work to do.
2024-11-18 14:31:26 +00:00
Sam Gross 9b0bfba2a2
gh-124218: Use per-thread reference counting for globals and builtins (#125713)
Use per-thread refcounting for the reference from function objects to
the globals and builtins dictionaries.
2024-10-21 12:51:29 -04:00
Pablo Galindo Salgado f8ba9fb2ce
gh-125703: Correctly honour tracemalloc hooks on specialized DECREF paths (#125704) 2024-10-18 17:09:34 +01:00
Sam Gross 3ea488aac4
gh-124218: Use per-thread refcounts for code objects (#125216)
Use per-thread refcounting for the reference from function objects to
their corresponding code object. This can be a source of contention when
frequently creating nested functions. Deferred refcounting alone isn't a
great fit here because these references are on the heap and may be
modified by other libraries.
2024-10-15 15:06:41 -04:00
Mark Shannon c9014374c5
GH-125174: Make immortal objects more robust, following design from PEP 683 (GH-125251) 2024-10-10 18:19:08 +01:00
Sam Gross b482538523
gh-124218: Refactor per-thread reference counting (#124844)
Currently, we only use per-thread reference counting for heap type objects and
the naming reflects that. We will extend it to a few additional types in an
upcoming change to avoid scaling bottlenecks when creating nested functions.

Rename some of the files and functions in preparation for this change.
2024-10-01 17:05:42 +00:00
neonene d9d5b3d2ef
gh-124344: Make `_PyObject_IS_GC()` use underscored `PyType_IS_GC()` (#124349)
move up _PyType_IS_GC and use it
2024-09-23 21:14:15 +02:00
Mark Shannon c87b0e4a46
GH-124284: Add stats for refcount operations on immortal objects (GH-124288) 2024-09-23 19:10:55 +01:00
Victor Stinner ec08aa1fe4
gh-124064: Fix -Wconversion warnings in pycore_{long,object}.h (#124177)
Change also the fix for pycore_gc.h and pycore_stackref.h:
declare constants as uintptr_t, rather than casting constants.
2024-09-17 15:35:40 +00:00
Petr Viktorin 57c471a688
gh-123091: Use more _Py_IsImmortalLoose() (GH-123602)
Switch more _Py_IsImmortal(...) assertions to _Py_IsImmortalLoose(...)

The remaining calls to _Py_IsImmortal are in free-threaded-only code,
initialization of core objects, tests, and guards that fall back to
code that works with mortal objects.
2024-09-02 18:17:48 +02:00
Pieter Eendebak 7e38e6745d
gh-123271: Make builtin zip method safe under free-threading (#123272)
The `zip_next` function uses a common optimization technique for methods
that generate tuples. The iterator maintains an internal reference to
the returned tuple. When the method is called again, it checks if the
internal tuple's reference count is 1. If so, the tuple can be reused.
However, this approach is not safe under the free-threading build:
after checking the reference count, another thread may perform the same
check and also reuse the tuple. This can result in a double decref on
the items of the replaced tuple and a double incref (memory leak) on
the items of the tuple being set.

This adds a function, `_PyObject_IsUniquelyReferenced` that
encapsulates the stricter logic necessary for the free-threaded build:
the internal tuple must be owned by the current thread, have a local
refcount of one, and a shared refcount of zero.
2024-08-27 15:22:43 -04:00
Mark Shannon a4fd7aa4a6
GH-115776: Allow any fixed sized object to have inline values (GH-123192) 2024-08-21 15:52:04 +01:00
Mark Shannon bb1d30336e
GH-118093: Make `CALL_ALLOC_AND_ENTER_INIT` suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Sam Gross 40632b1f1d
gh-122974: Suppress GCC array bound warnings in free-threaded build (#123071)
GCC 11 and newer warn about the access to `unique_id` in non-debug builds
due to inlining the call on static non-heap types.
2024-08-17 16:03:50 -04:00
Sam Gross dc09301067
gh-122417: Implement per-thread heap type refcounts (#122418)
The free-threaded build partially stores heap type reference counts in
distributed manner in per-thread arrays. This avoids reference count
contention when creating or destroying instances.

Co-authored-by: Ken Jin <kenjin@python.org>
2024-08-06 14:36:57 -04:00
Victor Stinner b826e459ca
gh-121528: Fix _PyObject_Init() assertion for stable ABI (#121725)
Add _Py_IsImmortalLoose() function for assertions.
2024-07-17 21:49:37 +02:00
Hood Chatham 3086b86cfd
gh-121700 Emscripten trampolines not quite right since #106219 (GH-121701) 2024-07-14 11:24:09 +02:00
AN Long 294e724964
gh-117657: Fix data races reported by TSAN in some set methods (#120914)
Refactor the fast Unicode hash check into `_PyObject_HashFast` and use relaxed
atomic loads in the free-threaded build.

After this change, the TSAN doesn't report data races for this method.
2024-07-01 15:11:39 -04:00
Ken Jin 22b0de2755
gh-117139: Convert the evaluation stack to stack refs (#118450)
This PR sets up tagged pointers for CPython.

The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.

Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pans out.

This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.

The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.

Please read Include/internal/pycore_stackref.h for more information!

---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
2024-06-27 03:10:43 +08:00