If `Py_IsFinalizing()` is true, non-daemon threads (other than the current one)
are done, and daemon threads are prevented from running, so they
cannot finalize themselves and become done. Joining them (without timeout)
would block forever.
Raise PythonFinalizationError instead of hanging.
Raise even when a timeout is given, for consistency with trying to join your own thread.
See gh-123940 for a use case: calling `join()` from `__del__`. This is
ill-advised, but an exception should at least make it easier to diagnose.
The PyThreadState field gains a reference count field to avoid
issues with PyThreadState being a dangling pointer to freed memory.
The refcount starts with a value of two: one reference is owned by the
interpreter's linked list of thread states and one reference is owned by
the OS thread. The reference count is decremented when the thread state
is removed from the interpreter's linked list and before the OS thread
calls `PyThread_hang_thread()`. The thread that decrements it to zero
frees the `PyThreadState` memory.
The `holds_gil` field is moved out of the `_status` bit field, to avoid
a data race where on thread calls `PyThreadState_Clear()`, modifying the
`_status` bit field while the OS thread reads `holds_gil` when
attempting to acquire the GIL.
The `PyThreadState.state` field now has `_Py_THREAD_SHUTTING_DOWN` as a
possible value. This corresponds to the `_PyThreadState_MustExit()`
check. This avoids race conditions in the free threading build when
checking `_PyThreadState_MustExit()`.
The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed
reference, has been replaced by using one of the following functions, which
return a strong reference and distinguish a missing attribute from an error:
_PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(),
_PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
Implement set_name() with SetThreadDescription() and _get_name() with
GetThreadDescription(). If SetThreadDescription() or
GetThreadDescription() is not available in kernelbase.dll, delete the
method when the _thread module is imported.
Truncate the thread name to 32766 characters.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
This change, along with the LOAD_ATTR specializations, makes the
"thread_local_read" micro benchmark in Tools/ftscalingbench/ftscalingbench.py
scale well to multiple threads.
On Linux, threading.Thread now sets the thread name to the operating
system.
* configure now checks if pthread_getname_np()
and pthread_setname_np() functions are available.
* Add PYTHREAD_NAME_MAXLEN macro.
* Add _thread._NAME_MAXLEN constant for test_threading.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
If Python fails to start newly created thread
due to failure of underlying PyThread_start_new_thread() call,
its state should be removed from interpreter' thread states list
to avoid its double cleanup.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
Avoid detaching thread state when stopping the world. When re-attaching
the thread state, the thread would attempt to resume the top-most
critical section, which might now be held by a thread paused for our
stop-the-world request.
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
Even though it has no internal references to Python objects it still
has a reference to its type by virtue of being a heap type. We need
to provide a traverse function that visits the type, but we do not
need to provide a clear function.
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.
There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
Make `_thread.ThreadHandle` thread-safe in free-threaded builds
We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.
The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.
The internal thread handle `detach()` method has been removed.
Remove references to the old names _PyTime_MIN
and _PyTime_MAX, now that PyTime_MIN and
PyTime_MAX are public.
Replace also _PyTime_MIN with PyTime_MIN.
<pycore_time.h> include is no longer needed to get the PyTime_t type
in internal header files. This type is now provided by <Python.h>
include. Add <pycore_time.h> includes to C files instead.
The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.
The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.
The following functions now raise PythonFinalizationError, instead of
RuntimeError:
* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()
Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
This marks dead ThreadHandles as non-joinable earlier in
`PyOS_AfterFork_Child()` before we execute any Python code. The handles
are stored in a global linked list in `_PyRuntimeState` because `fork()`
affects the entire process.
`threading.Lock` is now the underlying class and is constructable rather than the old
factory function. This allows for type annotations to refer to it which had no non-ugly
way to be expressed prior to this.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>