cpython

Commit Graph

Author	SHA1	Message	Date
Mark Shannon	f6f4e8a662	GH-132554: "Virtual" iterators (GH-132555) * FOR_ITER now pushes either the iterator and NULL or leaves the iterable and pushes tagged zero * NEXT_ITER uses the tagged int as the index into the sequence or, if TOS is NULL, iterates as before.	2025-05-27 15:59:45 +01:00
Tomas R.	c492ac7252	GH-131798: Split up and optimize CALL_ISINSTANCE (GH-133339)	2025-05-08 14:26:30 -07:00
Irit Katriel	296cd128bf	Revert "gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396)" (#133498)	2025-05-06 13:12:26 +03:00
Diego Russo	9cc77aaf9d	GH-131798: Split CALL_LEN into several uops (GH-133180)	2025-05-05 14:31:48 -07:00
Irit Katriel	082dbf7788	gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396)	2025-05-05 17:46:56 +01:00
Irit Katriel	5529213d4e	gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626)	2025-05-01 10:28:52 +00:00
Mark Shannon	3831752689	PyStats: Make sure that the `failure_kinds` array is big enough. (#133245)	2025-05-01 10:02:51 +00:00
Mark Shannon	622300bdfa	GH-132554: Add stats for GET_ITER (GH-132592) * Add stats for GET_ITER * Look for common iterable types, not iterator types * Add stats for self iter and fix naming in summary	2025-04-29 09:00:14 +01:00
Kumar Aditya	f3d877a27a	gh-132643: use atomic load for dict in specializer (#132653)	2025-04-18 15:06:27 +05:30
mpage	619edb802e	gh-132336: Mark a few "slow path" functions used by the interpreter loop as noinline (#132337) These are all the slow path and should not be inlined into the interpreter loop. Unfortunately, they end up being inlined with LTO and the current PGO task.	2025-04-10 10:41:15 +02:00
Irit Katriel	8c9ef8f1f8	gh-100239: more stats for BINARY_OP/SUBSCR specialization (#132230)	2025-04-08 08:50:51 +00:00
Irit Katriel	68e72cf3a8	gh-100239: fix bug in comparison (#132093)	2025-04-04 18:09:49 +01:00
Irit Katriel	df59226997	gh-100239: more refined specialisation stats for BINARY_OP/SUBSCR (#132068)	2025-04-04 15:33:31 +01:00
Dino Viehland	2984ff9e51	gh-130373: Avoid locking in _LOAD_ATTR_WITH_HINT (#130372)	2025-03-28 15:16:41 -07:00
Victor Stinner	5c44d7d99c	gh-130931: Add pycore_interpframe.h internal header (#131249) Move _PyInterpreterFrame and associated functions to a new pycore_interpframe.h header.	2025-03-19 18:17:44 +01:00
Ken Jin	b2ed7a6d6a	gh-131281: Add include for pystats builds (#131369)	2025-03-18 00:36:06 +08:00
Mark Shannon	a45f25361d	GH-131238: More refactoring of core header files (GH-131351) Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.	2025-03-17 14:41:05 +00:00
T. Wouters	de2f7da77d	gh-115999: Add free-threaded specialization for FOR_ITER (#128798) Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)	2025-03-12 16:21:46 +01:00
Irit Katriel	a1417b211f	gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700)	2025-02-07 22:39:54 +00:00
Brandt Bucher	5fa7e1b7fd	GH-129715: Remove _DYNAMIC_EXIT (GH-129716)	2025-02-07 11:41:17 -08:00
Diego Russo	567394517a	GH-128842: Collect JIT memory stats (GH-128941)	2025-02-02 15:17:53 -08:00
Yan Yanchii	e6c76b947b	GH-128872: Remove unused argument from _PyCode_Quicken (GH-128873) Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>	2025-02-02 15:09:30 -08:00
Irit Katriel	4815131910	gh-100239: specialize bitwise logical binary ops on ints (#128927)	2025-01-29 09:28:21 +00:00
Sam Gross	a10f99375e	Revert "GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)" (GH-129202) The commit introduced a ~2.5-3% regression in the free threading build. This reverts commit `ab61d3f430`.	2025-01-23 09:26:25 +00:00
Mark Shannon	ab61d3f430	GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)	2025-01-20 17:09:23 +00:00
Kirill Podoprigora	6c52ada551	gh-100239: Handle NaN and zero division in guards for `BINARY_OP_EXTEND` (#128963) Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>	2025-01-19 11:02:49 +00:00
Irit Katriel	3893a92d95	gh-100239: specialize long tail of binary operations (#128722)	2025-01-16 15:22:13 +00:00
mpage	b5ee0258bf	gh-115999: Specialize `LOAD_ATTR` for instance and class receivers in free-threaded builds (#128164) Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.	2025-01-14 11:56:11 -08:00
Mark Shannon	ddd959987c	GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708)	2025-01-13 10:30:28 +00:00
T. Wouters	8f93dd8a8f	gh-115999: Add free-threaded specialization for COMPARE_OP (#126410) Add free-threaded specialization for COMPARE_OP, and tests for COMPARE_OP specialization in general. Co-authored-by: Donghee Na <donghee.na92@gmail.com>	2025-01-07 06:41:01 -08:00
Ken Jin	7ef4907412	gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263)	2024-12-31 12:24:17 +08:00
Yan Yanchii	fe4dd07a84	gh-119786: Mention `InternalDocs/interpreter.md` instead of non-existing `adaptive.md` (#128329) `Python/specialize.c`: Mention `InternalDocs/interpreter.md` instead of non-existing `adaptive.md` Co-authored-by: Peter Bierma <zintensitydev@gmail.com>	2024-12-30 18:38:09 +00:00
Neil Schemenauer	1b15c89a17	gh-115999: Specialize `STORE_ATTR` in free-threaded builds. (gh-127838) * Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and use in place of `_PyDictKeys_StringLookup`. * Change `_PyObject_TryGetInstanceAttribute` to use that function in the case of split keys. * Add `unicodekeys_lookup_split` helper which allows code sharing between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`. * Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and `tp_version_tag` cannot change. * Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case of it changing during the specialize analysis). * Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function. * In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock. * Avoid borrowed references in `_Py_Specialize_StoreAttr()`. * Use `specialize()` and `unspecialize()` helpers. * Add unit tests to ensure specializing happens as expected in FT builds. * Add unit tests to attempt to trigger data races (useful for running under TSAN). * Add `has_split_table` function to `_testinternalcapi`.	2024-12-19 10:21:17 -08:00
Donghee Na	48c70b8f7d	gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737)	2024-12-19 11:08:17 +09:00
mpage	2de048ce79	gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711) We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds: _CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version. _LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.	2024-12-13 10:17:16 -08:00
mpage	c84928ed6d	gh-115999: Specialize `CALL_KW` in free-threaded builds (#127713) * Enable specialization of CALL_KW * Fix bug pushing frame in _PY_FRAME_KW `_PY_FRAME_KW` pushes a pointer to the new frame onto the stack for consumption by the next uop. When pushing the frame fails, we do not want to push the result, `NULL`, to the stack because it is not a valid stackref. This works in the default build because `PyStackRef_NULL` and `NULL` are the same value, so the `PyStackRef_XCLOSE()` in the error handler ignores it. In the free-threaded build the values are not the same; `PyStackRef_XCLOSE()` will attempt to decref a null pointer.	2024-12-11 15:18:22 -08:00
Sam Gross	a353455fca	gh-125610: Fix `STORE_ATTR_INSTANCE_VALUE` specialization check (GH-125612) The `STORE_ATTR_INSTANCE_VALUE` opcode doesn't support objects with non-NULL managed dictionaries, so don't specialize to that op in that case.	2024-12-06 10:48:24 -05:00
mpage	dabcecfd6d	gh-115999: Enable specialization of `CALL` instructions in free-threaded builds (#127123) The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below. A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe: Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref. Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__. Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes. Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure. A few other miscellaneous changes were also needed: Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it. Add missing co_tlbc for _Py_InitCleanup. Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.	2024-12-03 11:20:20 -08:00
Neil Schemenauer	276cd66ccb	gh-115999: Add free-threaded specialization for `SEND` (gh-127426) No additional thread safety changes are required. Note that sending to a generator that is shared between threads is currently not safe in the free-threaded build.	2024-12-03 10:25:12 -08:00
Neil Schemenauer	0cb5222079	gh-115999: Specialize `LOAD_SUPER_ATTR` in free-threaded builds (gh-127128) Use existing helpers to atomically modify the bytecode. Add unit tests to ensure specializing is happening as expected. Add test_specialize.py that can be used with ThreadSanitizer to detect data races. Fix thread safety issue with cell_set_contents().	2024-12-03 09:32:26 -08:00
Michael Droettboom	edefb8678a	gh-127518: Fix pystats build after #127169 (#127526)	2024-12-02 20:17:08 +00:00
Mark Shannon	a8dd821d5b	GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110) * Mark almost all reachable objects before doing collection phase * Add stats for objects marked * Visit new frames before each increment * Update docs * Clearer calculation of work to do.	2024-12-02 10:12:17 +00:00
Donghee Na	e2713409cf	gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227)	2024-12-02 10:38:17 +09:00
Sam Gross	71ede1142d	gh-115999: Add free-threaded specialization for `STORE_SUBSCR` (#127169) The specialization only depends on the type, so no special thread-safety considerations there. STORE_SUBSCR_LIST_INT needs to lock the list before modifying it. `_PyDict_SetItem_Take2` already internally locks the dictionary using a critical section.	2024-11-26 16:46:06 -05:00
mpage	d24a22e9b6	gh-115999: Record success in `specialize` (#127167) This matches the existing behavior where we increment the success stat for the generic opcode each time we successfully specialize an instruction.	2024-11-22 12:07:05 -08:00
Kirill Podoprigora	27486c3365	gh-115999: Add free-threaded specialization for `UNPACK_SEQUENCE` (#126600) Add free-threaded specialization for `UNPACK_SEQUENCE` opcode. `UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable. `UNPACK_SEQUENCE_LIST` is not thread safe because of the nature of lists (there is nothing preventing another thread from adding items to or removing them from the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack. --------- Co-authored-by: Matt Page <mpage@meta.com> Co-authored-by: mpage <mpage@cs.stanford.edu>	2024-11-22 19:00:35 +02:00
Donghee Na	78a530a578	gh-115999: Add free-threaded specialization for ``TO_BOOL`` (gh-126616)	2024-11-22 07:52:16 +09:00
mpage	09c240f20c	gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607) Enable specialization of LOAD_GLOBAL in free-threaded builds. Thread-safety of specialization in free-threaded builds is provided by the following: A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization. Generation of new keys versions is made atomic in free-threaded builds. Existing helpers are used to atomically modify the opcode. Thread-safety of specialized instructions in free-threaded builds is provided by the following: Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards. Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset. The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.	2024-11-21 11:22:21 -08:00
mpage	32428cf9ea	gh-115999: Don't take a reason in unspecialize (#127030) We only want to compute the reason if stats are enabled. Optimizing compilers should optimize this away for us (gcc and clang do), but it's better to be safe than sorry.	2024-11-20 14:54:48 -08:00
Hugo van Kemenade	899fdb213d	Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983)	2024-11-19 11:25:09 +02:00

297 Commits