Commit Graph

51 Commits

Author SHA1 Message Date
Neil Schemenauer eecafc3380
Revert gh-127266: avoid data races when updating type slots (gh-131174) (gh-133129)
This is triggering deadlocks in test_opcache.  See GH-133130 for stack trace.
2025-04-28 23:38:29 -07:00
Neil Schemenauer e414a2d81c
gh-127266: avoid data races when updating type slots (gh-131174)
In the free-threaded build, avoid data races caused by updating type slots
or type flags after the type was initially created.  For those (typically
rare) cases, use the stop-the-world mechanism.  Remove the use of atomics
when reading or writing type flags.  The use of atomics is not sufficient to
avoid races (since flags are sometimes read without a lock and without
atomics) and are no longer required.
2025-04-28 20:28:44 +00:00
Kumar Aditya 4b15d105a2
gh-132070: add `PyObject_Realloc` suppression in free-threading (#132468) 2025-04-16 06:10:56 +05:30
Kumar Aditya 46e88540e6
gh-116738: remove multiprocessing suppressions (#131319) 2025-03-18 00:29:18 +05:30
Sam Gross 052cb717f5
gh-124878: Fix race conditions during interpreter finalization (#130649)
The PyThreadState field gains a reference count field to avoid
issues with PyThreadState being a dangling pointer to freed memory.
The refcount starts with a value of two: one reference is owned by the
interpreter's linked list of thread states and one reference is owned by
the OS thread. The reference count is decremented when the thread state
is removed from the interpreter's linked list and before the OS thread
calls `PyThread_hang_thread()`. The thread that decrements it to zero
frees the `PyThreadState` memory.

The `holds_gil` field is moved out of the `_status` bit field, to avoid
a data race where on thread calls `PyThreadState_Clear()`, modifying the
`_status` bit field while the OS thread reads `holds_gil` when
attempting to acquire the GIL.

The `PyThreadState.state` field now has `_Py_THREAD_SHUTTING_DOWN` as a
possible value. This corresponds to the `_PyThreadState_MustExit()`
check. This avoids race conditions in the free threading build when
checking `_PyThreadState_MustExit()`.
2025-03-06 10:38:34 -05:00
Sam Gross cc17307faa
gh-124878: Add temporary TSAN suppression for free_threadstate (gh-130602)
The race condition with `free_threadstate` and daemon threads exists in
both the free threading and default builds. We were missing a
suppression in the default build.
2025-02-28 09:27:51 -05:00
Sam Gross ca22147547
gh-111924: Fix data races when swapping allocators (gh-130287)
CPython current temporarily changes `PYMEM_DOMAIN_RAW` to the default
allocator during initialization and shutdown. The motivation is to
ensure that core runtime structures are allocated and freed using the
same allocator. However, modifying the current allocator changes global
state and is not thread-safe even with the GIL. Other threads may be
allocating or freeing objects use PYMEM_DOMAIN_RAW; they are not
required to hold the GIL to call PyMem_RawMalloc/PyMem_RawFree.

This adds new internal-only functions like `_PyMem_DefaultRawMalloc`
that aren't affected by calls to `PyMem_SetAllocator()`, so they're
appropriate for Python runtime initialization and finalization. Use
these calls in places where we previously swapped to the default raw
allocator.
2025-02-20 11:31:15 -05:00
T. Wouters 388e1ca9f0
gh-115999: Make list and tuple iteration more thread-safe. (#128637)
Make tuple iteration more thread-safe, and actually test concurrent iteration of tuple, range and list. (This is prep work for enabling specialization of FOR_ITER in free-threaded builds.) The basic premise is:

Iterating over a shared iterable (list, tuple or range) should be safe, not involve data races, and behave like iteration normally does.

Using a shared iterator should not crash or involve data races, and should only produce items regular iteration would produce. It is not guaranteed to produce all items, or produce each item only once. (This is not the case for range iteration even after this PR.)

Providing stronger guarantees is possible for some of these iterators, but it's not always straight-forward and can significantly hamper the common case. Since iterators in general aren't shared between threads, and it's simply impossible to concurrently use many iterators (like generators), better to make sharing iterators without explicit synchronization clearly wrong.

Specific issues fixed in order to make the tests pass:

 - List iteration could occasionally fail an assertion when a shared list was shrunk and an item past the new end was retrieved concurrently. There's still some unsafety when deleting/inserting multiple items through for example slice assignment, which uses memmove/memcpy.

 - Tuple iteration could occasionally crash when the iterator's reference to the tuple was cleared on exhaustion. Like with list iteration, in free-threaded builds we can't safely and efficiently clear the iterator's reference to the iterable (doing it safely would mean extra, slow refcount operations), so just keep the iterable reference around.
2025-02-18 16:52:46 -08:00
Sam Gross f151d27159
gh-117657: Enable test_opcache under TSAN (GH-129831)
Fix a few thread-safety bugs to enable test_opcache when run with TSAN:

 * Use relaxed atomics when clearing `ht->_spec_cache.getitem`
   (gh-115999)
 * Add temporary suppression for type slot modifications (gh-127266)
 * Use atomic load when reading `*dictptr`
2025-02-11 16:53:08 -05:00
Sam Gross a191d6f78e
gh-117657: Include all of test_free_threading in TSAN tests (#129749) 2025-02-07 00:37:05 +01:00
Bogdan Romanyuk 365cf5fc23
gh-117657: Fix data race in `new_reference` for free threaded build (gh-129665) 2025-02-06 15:35:37 -05:00
🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) 03d9cdb729
Merge TSAN test matrices in CI (#123278) 2025-01-29 11:16:51 +02:00
Nadeshiko Manju 8a46a2ec50
gh-117657: Fix file descriptor race in test_socket.py (#123697) 2024-09-06 15:00:28 -04:00
Sam Gross 2b163aa9e7
gh-117657: Avoid race in `PAUSE_ADAPTIVE_COUNTER` in free-threaded build (#122190)
The adaptive counter doesn't do anything currently in the free-threaded
build and TSan reports a data race due to concurrent modifications to
the counter.
2024-07-30 13:53:47 -04:00
Sam Gross c557ae97d6
gh-122201: Lock mutex when setting handling_thread to NULL (#122204)
In the free-threaded build, we need to lock pending->mutex when clearing
the handling_thread in order not to race with a concurrent
make_pending_calls in the same interpreter.
2024-07-26 13:06:07 -04:00
Sam Gross 7641743d48
gh-117657: Remove TSAN suppressions for _abc.c (#121508)
The functions look thread-safe and I haven't seen any warnings issued
when running the tests locally.
2024-07-10 17:08:10 -04:00
Sam Gross 3ec719fabf
gh-117657: Fix TSan race in _PyDict_CheckConsistency (#121551)
The only remaining race in dictobject.c was in _PyDict_CheckConsistency
when the dictionary has shared keys.
2024-07-10 14:04:12 -04:00
Sam Gross 9c08f40a61
gh-117657: Fix TSAN races in setobject.c (#121511)
The `used` field must be written using atomic stores because `set_len`
and iterators may access the field concurrently without holding the
per-object lock.
2024-07-09 12:11:43 -04:00
AN Long 294e724964
gh-117657: Fix data races reported by TSAN in some set methods (#120914)
Refactor the fast Unicode hash check into `_PyObject_HashFast` and use relaxed
atomic loads in the free-threaded build.

After this change, the TSAN doesn't report data races for this method.
2024-07-01 15:11:39 -04:00
AN Long 8a5176772c
gh-117657: Use critical section to make _socket.socket.close thread safe (GH-120490) 2024-07-01 16:38:30 +02:00
Daniele Parmeggiani 362cd2680b
gh-117657: Fix `__slots__` thread safety in free-threaded build (#119368)
Fix a race in `PyMember_GetOne` and `PyMember_SetOne` for `Py_T_OBJECT_EX`.
These functions implement `__slots__` accesses for Python objects.
2024-06-17 18:44:54 +00:00
Sam Gross 460cc9e14e
gh-117657: Fix TSan reported data race on ioctl_works (#120175) 2024-06-17 13:23:40 -04:00
AN Long 2bacc2343c
gh-117657: Add TSAN suppression for set_default_allocator_unlocked (#120500)
Add TSAN suppression for set_default_allocator_unlocked
2024-06-15 00:10:18 +08:00
Ken Jin eebae2c460
gh-117657: Make PyType_HasFeature atomic (GH-120210)
Make PyType_HasFeature atomic
2024-06-13 17:29:19 +08:00
Ken Jin e16aed63f6
gh-117657: Make Py_TYPE and Py_SET_TYPE thread safe (GH-120165)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Nadeshiko Manju <me@manjusaka.me>
2024-06-12 20:41:07 +08:00
Sam Gross e21057b999
gh-117657: Fix TSAN race involving import lock (#118523)
This adds a `_PyRecursiveMutex` type based on `PyMutex` and uses that
for the import lock. This fixes some data races in the free-threaded
build and generally simplifies the import lock code.
2024-06-06 13:40:58 -04:00
Sam Gross e69d068ad0
gh-117657: Fix race involving GC and heap initialization (#119923)
The `_PyThreadState_Bind()` function is called before the first
`PyEval_AcquireThread()` so it's not synchronized with the stop the
world GC. We had a race where `gc_visit_heaps()` might visit a thread's
heap while it's being initialized.

Use a simple atomic int to avoid visiting heaps for threads that are not
yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called).

The race was reproducible by running:
`python Lib/test/test_importlib/partial/pool_in_threads.py`.
2024-06-04 09:42:13 -04:00
Sam Gross 47fb4327b5
gh-117657: Fix race involving immortalizing objects (#119927)
The free-threaded build currently immortalizes objects that use deferred
reference counting (see gh-117783). This typically happens once the
first non-main thread is created, but the behavior can be suppressed for
tests, in subinterpreters, or during a compile() call.

This fixes a race condition involving the tracking of whether the
behavior is suppressed.
2024-06-03 20:58:41 +00:00
Sam Gross 41c1cefbae
gh-117657: Avoid `sem_clockwait` in TSAN (#119915)
The `sem_clockwait` function is not currently instrumented, which leads
to false positives.
2024-06-03 13:42:27 -04:00
Donghee Na 0594a27e5f
gh-117657: Fix data races report by TSAN unicode-hash (gh-119907) 2024-06-03 12:22:41 +09:00
Sam Gross f3b89a63cb
gh-117657: Fix TSAN reported race in `_PyEval_IsGILEnabled`. (#119921)
The GIL may be disabled concurrently with this call so we need to use a
relaxed atomic load.
2024-06-02 10:19:02 -04:00
Sam Gross 7dc745d1f5
gh-117657: Add TSAN suppression for `set_discard_entry` (#119908)
Seen in CI occasionally when running `test_weakref`.
2024-06-01 12:15:58 -04:00
Sam Gross 90ec19fd33
gh-117657: Fix TSAN race in QSBR assertion (#119887)
Due to a limitation in TSAN, all reads from `PyThreadState.state` must be
atomic to avoid reported races.
2024-06-01 10:04:38 -04:00
Sam Gross 60593b2052
gh-117657: Fix TSAN race in free-threaded GC (#119883)
Only call `gc_restore_tid()` from stop-the-world contexts.
`worklist_pop()` can be called while other threads are running, so use a
relaxed atomic to modify `ob_tid`.
2024-06-01 10:04:05 -04:00
Arnon Yaari 87939bd579
gh-117657: Fix itertools.count thread safety (#119268)
Fix itertools.count in free-threading mode
2024-05-21 10:16:34 -07:00
mpage b88889e9ff
gh-117657: Log TSAN warnings to separate files and archive them (#118747)
This ensures we don't lose races that occur in subprocesses or
interleave races from workers running in parallel.

Log files are collected and packaged into a zipfile that can be
downloaded from the "Artifacts" section of the workflow run.
2024-05-10 17:54:23 -04:00
Alex Turner 33d20199af
gh-117657: Fix QSBR race condition (#118843)
`_Py_qsbr_unregister` is called when the PyThreadState is already
detached, so the access to `tstate->qsbr` isn't safe without locking the
shared mutex. Grab the `struct _qsbr_shared` from the interpreter
instead.
2024-05-10 10:26:35 -04:00
mpage 22d5185308
gh-117657: Fix data races reported by TSAN on `interp->threads.main` (#118865)
Use relaxed loads/stores when reading/writing to this field.
2024-05-10 09:59:14 -04:00
Brett Simmers 98ff3f65c0
gh-117657: Replace TSAN suppresions with more specific rules (#118722)
Using `race:` filters out warnings if the function appears anywhere in
the stack trace. This can hide a lot of unrelated warnings, especially
for a function like `_PyEval_EvalFrameDefault`, which is somewhere on
the stack more often than not.

Change all free-threaded suppressions to `race_top:`, which only matches
the top frame, and add any new suppressions this exposes.
2024-05-09 17:02:39 -04:00
mpage cb6f75a32c
gh-117657: Fix data races when writing / reading `ob_gc_bits` (#118292)
Use relaxed atomics when reading / writing to the field. There are still a
few places in the GC where we do not use atomics. Those should be safe as
the world is stopped.
2024-05-08 16:03:39 -04:00
mpage 37d0950022
gh-117657: Disable the function/code cache in free-threaded builds (#118301)
This is only used by the specializing interpreter and the tier 2
optimizer, both of which are disabled in free-threaded builds.
2024-05-03 16:21:04 -04:00
Alex Turner 2ba1aed596
gh-117657: TSAN fix race on `gstate->young.count` (#118313) 2024-04-29 20:26:26 +00:00
mpage a5eeb832c2
gh-117657: Fix race data race in `_Py_IsOwnedByCurrentThread()` (#118258) 2024-04-26 10:39:08 -04:00
Dino Viehland 5da0280648
gh-117657: Fixes a few small TSAN issues in dictobject (#118200)
Fixup TSAN errors for dict
2024-04-25 08:53:29 -07:00
mpage cce5ae6082
gh-117657: Add a couple more TSAN suppressions (#118256) 2024-04-25 11:48:16 -04:00
mpage f14e9f9154
gh-117657: Fix data race in `_Py_IsImmortal` (#118261)
The load of `ob_ref_local races with stores. Using a relaxed load is
sufficient; stores to the field are relaxed.
2024-04-25 11:31:57 -04:00
Dino Viehland 1e4a4c4897
gh-117657: use relaxed loads for checking dict keys immortality (#118067)
Use relaxed load to check if dictkeys are immortal
2024-04-19 09:25:08 -07:00
mpage 0d29302155
gh-117657: Quiet erroneous TSAN reports of data races in `_PySeqLock` (#117955)
Quiet erroneous TSAN reports of data races in `_PySeqLock`

TSAN reports a couple of data races between the compare/exchange in
`_PySeqLock_LockWrite` and the non-atomic loads in `_PySeqLock_{Abandon,Unlock}Write`.
This is another instance of TSAN incorrectly modeling failed compare/exchange
as a write instead of a load.
2024-04-17 17:19:28 +00:00
mpage b6c62c79e7
gh-117657: Fix data races in the method cache in free-threaded builds (#117954)
Fix data races in the method cache in free-threaded builds

These are technically data races, but I think they're benign (to
the extent that that is actually possible). We update cache entries
non-atomically but read them atomically from another thread, and there's
nothing that establishes a happens-before relationship between the
reads and writes that I can see.
2024-04-17 09:42:56 -07:00
mpage 47832067da
gh-117657: Add TSAN suppressions for the free-threaded build (#117736)
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
2024-04-15 12:08:25 -04:00