The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed
reference, has been replaced with one of the following functions, which
return a strong reference and distinguish a missing attribute from an error:
_PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(),
_PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
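For illustration, a minimal sketch of the new pattern with one of these helpers (they are internal APIs, so real callers live inside CPython; the attribute name and fallback error below are only examples):

```c
#include <Python.h>

static int
call_displayhook_sketch(PyObject *value)
{
    PyObject *hook;
    /* Assumed tri-state result: 1 = found (strong reference in `hook`),
       0 = attribute missing, -1 = error with an exception set. */
    int rc = _PySys_GetOptionalAttrString("displayhook", &hook);
    if (rc < 0) {
        return -1;                      /* error, exception already set */
    }
    if (rc == 0) {
        PyErr_SetString(PyExc_RuntimeError, "lost sys.displayhook");
        return -1;                      /* missing is distinct from error */
    }
    PyObject *res = PyObject_CallOneArg(hook, value);
    Py_DECREF(hook);                    /* we own the strong reference */
    if (res == NULL) {
        return -1;
    }
    Py_DECREF(res);
    return 0;
}
```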
(cherry picked from commit 0ef4ffeefd)
gh-127190: Fix local_setattro() error handling (GH-127366)
Don't assume that the 'name' argument is a string; use repr() to format it
instead.
(cherry picked from commit 20657fbdb1)
Co-authored-by: Victor Stinner <vstinner@python.org>
If Python fails to start a newly created thread
because the underlying PyThread_start_new_thread() call fails,
the thread's state should be removed from the interpreter's list of thread states
to avoid double cleanup.
(cherry picked from commit ca3ea9ad05)
Co-authored-by: Radislav Chugunov <52372310+chgnrdv@users.noreply.github.com>
This is a small refactoring of the current design that lets us
avoid manually iterating over threads.
This should also fix gh-118490.
(cherry picked from commit e059aa6b01)
Co-authored-by: mpage <mpage@meta.com>
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
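For illustration, a multi-phase-init extension that is safe without the GIL would declare the new slot roughly like this (the module name and exec function are placeholders):

```c
#include <Python.h>

static int
example_exec(PyObject *module)
{
    /* Set up module state, add objects, etc. */
    return 0;
}

static PyModuleDef_Slot example_slots[] = {
    {Py_mod_exec, example_exec},
#ifdef Py_GIL_DISABLED
    /* Tell the runtime this module runs safely with the GIL disabled,
       so importing it does not re-enable the GIL. */
    {Py_mod_gil, Py_MOD_GIL_NOT_USED},
#endif
    {0, NULL},
};

static struct PyModuleDef example_module = {
    PyModuleDef_HEAD_INIT,
    .m_name = "example",
    .m_size = 0,
    .m_slots = example_slots,
};

PyMODINIT_FUNC
PyInit_example(void)
{
    return PyModuleDef_Init(&example_module);
}
```

The `#ifdef` keeps the source buildable against headers that predate the slot.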
Avoid detaching the thread state when stopping the world. When re-attaching
the thread state, the thread would attempt to resume the top-most
critical section, which might now be held by a thread paused for our
stop-the-world request.
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
Even though it has no internal references to Python objects, it still
has a reference to its type by virtue of being a heap type. We need
to provide a traverse function that visits the type, but we do not
need to provide a clear function.
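A minimal sketch of such a traverse function (the function name is a placeholder):

```c
static int
example_traverse(PyObject *self, visitproc visit, void *arg)
{
    /* A heap-type instance owns a reference to its type, so the GC must
       be told about it even when there are no other object fields. */
    Py_VISIT(Py_TYPE(self));
    return 0;
}
```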
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.
There is still one thread-safety issue where `locked` is checked prior to
releasing the lock; however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
Make `_thread.ThreadHandle` thread-safe in free-threaded builds
We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully, all future operations
complete immediately.
The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.
The internal thread handle `detach()` method has been removed.
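A portable sketch of the resulting join semantics (CPython's actual ThreadHandle uses `_PyOnceFlag` and its own locks rather than pthreads; the names below are illustrative):

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t mu;     /* serializes concurrent operations */
    pthread_t os_thread;
    int joined;             /* has the OS thread been joined yet? */
    int join_result;        /* recorded outcome of the first real join */
} thread_handle_sketch;

static int
handle_join(thread_handle_sketch *h)
{
    pthread_mutex_lock(&h->mu);
    if (!h->joined) {
        /* Only the first caller actually joins the OS thread. */
        h->join_result = pthread_join(h->os_thread, NULL);
        h->joined = 1;
    }
    int result = h->join_result;    /* later callers succeed immediately */
    pthread_mutex_unlock(&h->mu);
    return result;
}
```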
Remove references to the old names _PyTime_MIN
and _PyTime_MAX, now that PyTime_MIN and
PyTime_MAX are public.
Also replace _PyTime_MIN with PyTime_MIN.
The <pycore_time.h> include is no longer needed to get the PyTime_t type
in internal header files; this type is now provided by <Python.h>.
Add <pycore_time.h> includes to the C files that still need it instead.
The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.
The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.
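A sketch of the access pattern in portable C11 atomics (CPython uses its own `_Py_atomic_*` helpers and field types; the names here are illustrative):

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    _Atomic uint64_t rlock_owner;   /* read from any thread: relaxed atomics */
    size_t rlock_count;             /* touched only by the owning thread */
} rlock_sketch;

static int
rlock_is_owned_by(rlock_sketch *r, uint64_t tid)
{
    /* Relaxed ordering is enough: we only need the load to be atomic. */
    return atomic_load_explicit(&r->rlock_owner, memory_order_relaxed) == tid;
}
```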
Add the PythonFinalizationError exception. This exception, derived from
RuntimeError, is raised when an operation is blocked during Python
finalization.
The following functions now raise PythonFinalizationError instead of
RuntimeError:
* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()
Moreover, the _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError instead of an unraisable RuntimeError.
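A sketch of the check-and-raise pattern such an operation uses inside CPython (illustrative only, not the exact code; `_PyInterpreterState_GetFinalizing()` is internal):

```c
static PyObject *
start_something_sketch(void)
{
    PyInterpreterState *interp = PyInterpreterState_Get();
    if (_PyInterpreterState_GetFinalizing(interp) != NULL) {
        /* Blocked by finalization: raise the new, more specific error. */
        PyErr_SetString(PyExc_PythonFinalizationError,
                        "can't start new thread at interpreter shutdown");
        return NULL;
    }
    /* ... otherwise perform the operation ... */
    Py_RETURN_NONE;
}
```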
This marks dead ThreadHandles as non-joinable earlier in
`PyOS_AfterFork_Child()` before we execute any Python code. The handles
are stored in a global linked list in `_PyRuntimeState` because `fork()`
affects the entire process.
`threading.Lock` is now the actual underlying class and can be constructed directly,
rather than being the old factory function. This allows type annotations to refer to it,
which previously had no non-ugly way to be expressed.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
_PyDict_Pop_KnownHash(): remove the default value parameter; the return type
becomes an int.
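A sketch of the new calling convention, assuming a tri-state int result (-1 = error, 0 = key absent, 1 = popped with a strong reference stored through the out parameter); the exact internal signature may differ:

```c
static int
pop_known_hash_sketch(PyObject *dict, PyObject *key, Py_hash_t hash)
{
    PyObject *value;
    int rc = _PyDict_Pop_KnownHash((PyDictObject *)dict, key, hash, &value);
    if (rc < 0) {
        return -1;      /* error: an exception is set */
    }
    if (rc == 0) {
        return 0;       /* key was absent; no default value is substituted */
    }
    Py_DECREF(value);   /* caller owns the popped value */
    return 1;
}
```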
Co-authored-by: Stefan Behnel <stefan_ml@behnel.de>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Joining a thread now ensures the underlying OS thread has exited. This is required for safer fork() in multi-threaded processes.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* gh-106320: Re-add _PyLong_FromByteArray(), _PyLong_AsByteArray() and _PyLong_GCD() to the public header files since they are used by third-party packages and there is no efficient replacement.
See https://github.com/python/cpython/issues/111140
See https://github.com/python/cpython/issues/111139
* gh-111262: Re-add _PyDict_Pop() to have a C-API until a new public one is designed.
There were a few things I did in gh-110565 that need to be fixed. I also forgot to add tests in that PR.
(Note that this PR exposes a refleak introduced by gh-110246. I'll take care of that separately.)
In a few places we switch to another interpreter without knowing if it has a thread state associated with the current thread. For the main interpreter there wasn't much of a problem, but for subinterpreters we were *mostly* okay re-using the tstate created with the interpreter (located via PyInterpreterState_ThreadHead()). There was a good chance that tstate wasn't actually in use by another thread.
However, there are no guarantees of that. Furthermore, re-using an already used tstate is currently fragile. To address this, now we create a new thread state in each of those places and use it.
One consequence of this change is that PyInterpreterState_ThreadHead() may not return NULL (though that won't happen for the main interpreter).
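A rough sketch of that pattern using public APIs (error handling elided; this is not the exact code in those call sites):

```c
static void
run_in_interp_sketch(PyInterpreterState *interp)
{
    /* Create a fresh thread state for the target interpreter instead of
       reusing one that may already be bound to another thread. */
    PyThreadState *tstate = PyThreadState_New(interp);
    PyThreadState *save_tstate = PyThreadState_Swap(tstate);

    /* ... run the work in the target interpreter ... */

    PyThreadState_Clear(tstate);
    PyThreadState_Swap(save_tstate);
    PyThreadState_Delete(tstate);
}
```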
The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish.
We add PyInterpreterState.threads.main, with some internal C-API functions.
Fix a race condition in _thread.start_new_thread(). If a thread is created
during Python finalization, the newly spawned thread now exits
immediately instead of trying to access freed memory and crashing.
thread_run() calls PyEval_AcquireThread(), which checks whether the thread
must exit. The problem was that tstate was dereferenced earlier, in
_PyThreadState_Bind(), which led to a crash most of the time.
Move _PyThreadState_CheckConsistency() from thread_run() to
_PyThreadState_Bind().
thread_run() of _threadmodule.c now calls
_PyThreadState_CheckConsistency() to check if tstate is a dangling
pointer when Python is built in debug mode.
Rename ceval_gil.c is_tstate_valid() to
_PyThreadState_CheckConsistency() to reuse it in _threadmodule.c.
Move the private _PyErr_WriteUnraisableMsg() functions to the
internal C API (pycore_pyerrors.h).
Move write_unraisable_exc() from _testcapi to _testinternalcapi.
Move private functions to the internal C API (pycore_sysmodule.h):
* _PySys_GetAttr()
* _PySys_GetSizeOf()
No longer export most of these functions.
Also fix a typo in Include/cpython/optimizer.h: add a missing space.
Move private functions to the internal C API (pycore_dict.h):
* _PyDictView_Intersect()
* _PyDictView_New()
* _PyDict_ContainsId()
* _PyDict_DelItemId()
* _PyDict_DelItem_KnownHash()
* _PyDict_GetItemIdWithError()
* _PyDict_GetItem_KnownHash()
* _PyDict_HasSplitTable()
* _PyDict_NewPresized()
* _PyDict_Next()
* _PyDict_Pop()
* _PyDict_SetItemId()
* _PyDict_SetItem_KnownHash()
* _PyDict_SizeOf()
No longer export most of these functions.
Also move the _PyDictViewObject structure to the internal C API.
Move the dict_getitem_knownhash() function from _testcapi to the
_testinternalcapi extension, and update test_capi.test_dict for this
change.
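Third-party code that still needs one of these helpers has to opt in to the internal headers explicitly (and, since most of the functions are no longer exported, may not be able to link against them at all); a hypothetical consumer would start like this:

```c
/* Opt in to CPython's internal, unstable API; Include/internal must be on
   the include path, and there is no compatibility guarantee. */
#define Py_BUILD_CORE_MODULE 1
#include <Python.h>
#include "pycore_dict.h"        /* e.g. _PyDict_GetItem_KnownHash() */
```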
Move private _PyEval functions to the internal C API
(pycore_ceval.h):
* _PyEval_GetBuiltin()
* _PyEval_GetBuiltinId()
* _PyEval_GetSwitchInterval()
* _PyEval_MakePendingCalls()
* _PyEval_SetProfile()
* _PyEval_SetSwitchInterval()
* _PyEval_SetTrace()
No longer export most of these functions.
Hold a strong reference to the object rather than using a borrowed reference:
replace PyWeakref_GET_OBJECT() with PyWeakref_GetRef() and
_PyWeakref_GET_REF().
Remove assert(PyWeakref_CheckRef(localweakref)) since it's already
tested by _PyWeakref_GET_REF().
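A minimal sketch of the replacement pattern with PyWeakref_GetRef(), which returns 1 with a strong reference, 0 when the referent is gone, and -1 on error:

```c
static int
use_weakref_sketch(PyObject *weakref)
{
    PyObject *obj;
    int rc = PyWeakref_GetRef(weakref, &obj);
    if (rc < 0) {
        return -1;      /* not a weak reference, or another error */
    }
    if (rc == 0) {
        return 0;       /* referent already collected; obj is NULL */
    }
    /* obj is a strong reference that the caller must release. */
    Py_DECREF(obj);
    return 1;
}
```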
For a while now, pending calls only run in the main thread (in the main interpreter). This PR changes things to allow any thread to run a pending call, unless the pending call was explicitly added for the main thread to run.
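For reference, this is the public entry point for scheduling a pending call; whether a given call is restricted to the main thread is decided by internal flags when it is added, not by anything in this signature:

```c
#include <Python.h>

/* Invoked later by the eval loop in a thread holding the GIL. */
static int
my_pending_callback(void *arg)
{
    return 0;   /* 0 on success, -1 with an exception set on failure */
}

static void
schedule_sketch(void)
{
    /* May be called from any thread, even without the GIL held. */
    Py_AddPendingCall(my_pending_callback, NULL);
}
```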