cpython

Commit Graph

Author	SHA1	Message	Date
Xuanteng Huang	1821f8f10c	gh-131281: fix compile error due to `BINARY_SUBSCR` (GH-131283) * fix compile error due to `BINARY_SUBSCR` * replace stat_inc with `BINARY_OP`	2025-03-15 23:38:46 +08:00
T. Wouters	de2f7da77d	gh-115999: Add free-threaded specialization for FOR_ITER (#128798) Add free-threaded versions of the existing specializations for FOR_ITER (lists, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)	2025-03-12 16:21:46 +01:00
Mark Shannon	2bef8ea8ea	GH-127705: Use `_PyStackRef`s in the default build. (GH-127875)	2025-03-10 14:06:56 +00:00
Sam Gross	a025f27d94	gh-130920: Fix data race in STORE_SUBSCR_LIST_INT (#130923) The write of the item to the list needs to use an atomic operation in the free threading build. Co-authored-by: Tomasz Pytel <tompytel@gmail.com>	2025-03-06 15:59:48 -05:00
mpage	d7bb7c7817	gh-118331: Fix a couple of issues when list allocation fails (#130811) * Fix use after free in list objects: Set the items pointer in the list object to NULL after the items array is freed during list deallocation. Otherwise, we can end up with a list object added to the free list that contains a pointer to an already-freed items array. * Mark `_PyList_FromStackRefStealOnSuccess` as escaping: I think technically it's not escaping, because the only object that can be decrefed if allocation fails is an exact list, which cannot execute arbitrary code when it is destroyed. However, this seems less intrusive than trying to special-case objects in the assert in `_Py_Dealloc` that checks for non-null stackpointers and shouldn't matter for performance.	2025-03-05 10:42:09 -08:00
Mark Shannon	54965f3fb2	GH-130296: Avoid stack transients in four instructions. (GH-130310) * Combine _GUARD_GLOBALS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_MODULE_FROM_KEYS into _LOAD_GLOBAL_MODULE * Combine _GUARD_BUILTINS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_BUILTINS_FROM_KEYS into _LOAD_GLOBAL_BUILTINS * Combine _CHECK_ATTR_MODULE_PUSH_KEYS and _LOAD_ATTR_MODULE_FROM_KEYS into _LOAD_ATTR_MODULE * Remove stack transient in LOAD_ATTR_WITH_HINT	2025-02-28 18:00:38 +00:00
Petr Viktorin	fecf8bc8f2	gh-130595: Fix leak in WITH_EXCEPT_START error case (GH-130626) Co-authored-by: Ken Jin <kenjin@python.org>	2025-02-28 08:58:50 +00:00
Dino Viehland	5c8e8704c3	gh-130595: Keep traceback alive for WITH_EXCEPT_START (#130562)	2025-02-26 10:41:26 -08:00
Mark Shannon	014223649c	GH-130396: Use computed stack limits on Linux (GH-130398) * Implement C recursion protection with limit pointers for Linux, macOS and Windows * Remove calls to PyOS_CheckStack * Add stack protection to parser * Make tests more robust to low stacks * Improve error messages for stack overflow	2025-02-25 09:24:48 +00:00
Petr Viktorin	ef29104f7d	GH-91079: Revert "GH-91079: Implement C stack limits using addresses, not counters. (GH-130007)" for now (GH130413) Unfortunately, the change broke some buildbots. This reverts commit `2498c22fa0`.	2025-02-24 11:16:08 +01:00
Mark Shannon	2498c22fa0	GH-91079: Implement C stack limits using addresses, not counters. (GH-130007) * Implement C recursion protection with limit pointers * Remove calls to PyOS_CheckStack * Add stack protection to parser * Make tests more robust to low stacks * Improve error messages for stack overflow	2025-02-19 11:44:57 +00:00
Mark Shannon	72f56654d0	GH-128682: Account for escapes in `DECREF_INPUTS` (GH-129953) * Handle escapes in DECREF_INPUTS * Mark a few more functions as escaping * Replace DECREF_INPUTS with PyStackRef_CLOSE where possible	2025-02-12 17:44:59 +00:00
Irit Katriel	a1417b211f	gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700)	2025-02-07 22:39:54 +00:00
Brandt Bucher	5fa7e1b7fd	GH-129715: Remove _DYNAMIC_EXIT (GH-129716)	2025-02-07 11:41:17 -08:00
Brandt Bucher	70e387c990	GH-129709: Clean up tier two (GH-129710)	2025-02-07 09:52:49 -08:00
Mark Shannon	96ff4c2486	GH-128682: Mark two more macros as escaping. (GH-129645) Expand out SETLOCAL so that code generator can see the decref. Mark Py_CLEAR as escaping	2025-02-04 14:00:51 +00:00
Mark Shannon	2effea4dab	GH-128682: Spill the stack pointer in labels, as well as instructions (GH-129618)	2025-02-04 12:18:31 +00:00
Mark Shannon	75b628adeb	GH-128563: Generate `opcode = ...` in instructions that need `opcode` (GH-129608) * Remove support for GO_TO_INSTRUCTION	2025-02-03 15:09:21 +00:00
Mark Shannon	808071b994	GH-128682: Make `PyStackRef_CLOSE` escaping. (GH-129404)	2025-02-03 12:41:32 +00:00
Mark Shannon	75b4962157	GH-128914: Remove all but one conditional stack effects (GH-129226) * Remove all 'if (0)' and 'if (1)' conditional stack effects * Use array instead of conditional for BUILD_SLICE args * Refactor LOAD_GLOBAL to use a common conditional uop * Remove conditional stack effects from LOAD_ATTR specializations * Replace conditional stack effects in LOAD_ATTR with a 0 or 1 sized array. * Remove conditional stack effects from CALL_FUNCTION_EX	2025-01-27 16:24:48 +00:00
Irit Katriel	c39ae8922b	gh-128799: Add frame of except* to traceback when wrapping a naked exception (#128971)	2025-01-25 13:00:23 +00:00
Sam Gross	a10f99375e	Revert "GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)" (GH-129202) The commit introduced a ~2.5-3% regression in the free threading build. This reverts commit `ab61d3f430`.	2025-01-23 09:26:25 +00:00
Mark Shannon	470a0a68eb	GH-128682: Change a couple of functions to only steal references on success. (GH-129132) Change PyTuple_FromStackRefSteal and PyList_FromStackRefSteal to only steal on success to avoid escaping	2025-01-22 10:51:37 +00:00
Ken Jin	5809b25909	gh-128563: Move lltrace into the frame struct (GH-129113)	2025-01-21 22:17:15 +08:00
Mark Shannon	f5b6356a11	GH-128563: Add new frame owner type for interpreter entry frames (GH-129078)	2025-01-21 10:15:02 +00:00
Mark Shannon	ab61d3f430	GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)	2025-01-20 17:09:23 +00:00
Xuanteng Huang	b44ff6d0df	GH-126599: Remove the "counter" optimizer/executor (GH-126853)	2025-01-16 15:57:04 -08:00
Irit Katriel	3893a92d95	gh-100239: specialize long tail of binary operations (#128722)	2025-01-16 15:22:13 +00:00
mpage	b5ee0258bf	gh-115999: Specialize `LOAD_ATTR` for instance and class receivers in free-threaded builds (#128164) Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.	2025-01-14 11:56:11 -08:00
Mark Shannon	f49a1df6f3	GH-128682: Convert explicit loops closing arrays into `DECREF_INPUTS`. (GH-128822) * Mark Py_DECREF and Py_XDECREF as escaping * Remove explicit loops for clearing array inputs	2025-01-14 15:08:56 +00:00
Mark Shannon	517dc65ffc	GH-128682: Stronger checking of `PyStackRef_CLOSE` and `DEAD`. (GH-128683)	2025-01-13 12:37:48 +00:00
Mark Shannon	ddd959987c	GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708)	2025-01-13 10:30:28 +00:00
Brandt Bucher	65ae3d5a73	GH-127809: Fix the JIT's understanding of ** (GH-127844)	2025-01-07 17:25:48 -08:00
Mark Shannon	f826beca0c	GH-128375: Better instrument for `FOR_ITER` (GH-128445)	2025-01-06 17:54:47 +00:00
Ken Jin	7ef4907412	gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263)	2024-12-31 12:24:17 +08:00
Mark Shannon	128cc47fbd	GH-127705: Add debug mode for `_PyStackRef`s inspired by HPy debug mode (GH-128121)	2024-12-20 16:52:20 +00:00
Neil Schemenauer	1b15c89a17	gh-115999: Specialize `STORE_ATTR` in free-threaded builds. (gh-127838) * Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and use in place of `_PyDictKeys_StringLookup`. * Change `_PyObject_TryGetInstanceAttribute` to use that function in the case of split keys. * Add `unicodekeys_lookup_split` helper which allows code sharing between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`. * Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and `tp_version_tag` cannot change. * Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case of it changing during the specialization analysis). * Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function. * In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock. * Avoid borrowed references in `_Py_Specialize_StoreAttr()`. * Use `specialize()` and `unspecialize()` helpers. * Add unit tests to ensure specializing happens as expected in FT builds. * Add unit tests to attempt to trigger data races (useful for running under TSAN). * Add `has_split_table` function to `_testinternalcapi`.	2024-12-19 10:21:17 -08:00
Mark Shannon	d2f1d917e8	GH-122548: Implement branch taken and not taken events for sys.monitoring (GH-122564)	2024-12-19 16:59:51 +00:00
Donghee Na	48c70b8f7d	gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737)	2024-12-19 11:08:17 +09:00
mpage	2de048ce79	gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711) We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds: _CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version. _LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.	2024-12-13 10:17:16 -08:00
Pieter Eendebak	5fc6bb2754	gh-126868: Add freelist for compact int objects (GH-126865)	2024-12-13 10:06:26 +00:00
mpage	c84928ed6d	gh-115999: Specialize `CALL_KW` in free-threaded builds (#127713) * Enable specialization of CALL_KW * Fix bug pushing frame in _PY_FRAME_KW: `_PY_FRAME_KW` pushes a pointer to the new frame onto the stack for consumption by the next uop. When pushing the frame fails, we do not want to push the result, `NULL`, to the stack because it is not a valid stackref. This works in the default build because `PyStackRef_NULL` and `NULL` are the same value, so the `PyStackRef_XCLOSE()` in the error handler ignores it. In the free-threaded build the values are not the same; `PyStackRef_XCLOSE()` will attempt to decref a null pointer.	2024-12-11 15:18:22 -08:00
mpage	dabcecfd6d	gh-115999: Enable specialization of `CALL` instructions in free-threaded builds (#127123) The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below. A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe: Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref. Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__. Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes. Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure. A few other miscellaneous changes were also needed: Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it. Add missing co_tlbc for _Py_InitCleanup. Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.	2024-12-03 11:20:20 -08:00
Donghee Na	7c2bd9b226	gh-115999: Use light-weight lock for UNPACK_SEQUENCE_LIST (gh-127514)	2024-12-03 00:14:40 +09:00
Donghee Na	e2713409cf	gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227)	2024-12-02 10:38:17 +09:00
Sam Gross	71ede1142d	gh-115999: Add free-threaded specialization for `STORE_SUBSCR` (#127169) The specialization only depends on the type, so no special thread-safety considerations there. STORE_SUBSCR_LIST_INT needs to lock the list before modifying it. `_PyDict_SetItem_Take2` already internally locks the dictionary using a critical section.	2024-11-26 16:46:06 -05:00
Sam Gross	4759ba6eec	gh-127022: Simplify `PyStackRef_FromPyObjectSteal` (#127024) This gets rid of the immortal check in `PyStackRef_FromPyObjectSteal()`. Overall, this improves performance by about 2% in the free threading build. This also renames `PyStackRef_Is()` to `PyStackRef_IsExactly()` because the macro requires that the tag bits of the arguments match, which is only true in certain special cases.	2024-11-22 12:55:33 -05:00
Kirill Podoprigora	27486c3365	gh-115999: Add free-threaded specialization for `UNPACK_SEQUENCE` (#126600) Add free-threaded specialization for `UNPACK_SEQUENCE` opcode. `UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable. `UNPACK_SEQUENCE_LIST` is not thread safe because of the nature of lists (there is nothing preventing another thread from adding items to or removing them from the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack. Co-authored-by: Matt Page <mpage@meta.com> Co-authored-by: mpage <mpage@cs.stanford.edu>	2024-11-22 19:00:35 +02:00
Donghee Na	78a530a578	gh-115999: Add free-threaded specialization for `TO_BOOL` (gh-126616)	2024-11-22 07:52:16 +09:00
mpage	09c240f20c	gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607) Enable specialization of LOAD_GLOBAL in free-threaded builds. Thread-safety of specialization in free-threaded builds is provided by the following: A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization. Generation of new keys versions is made atomic in free-threaded builds. Existing helpers are used to atomically modify the opcode. Thread-safety of specialized instructions in free-threaded builds is provided by the following: Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards. Dict keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset. The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.	2024-11-21 11:22:21 -08:00


264 Commits