Commit Graph

113 Commits

Author SHA1 Message Date
Tomas R. c492ac7252
GH-131798: Split up and optimize CALL_ISINSTANCE (GH-133339) 2025-05-08 14:26:30 -07:00
Irit Katriel 296cd128bf
Revert "gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396)" (#133498) 2025-05-06 13:12:26 +03:00
Diego Russo 9cc77aaf9d
GH-131798: Split CALL_LEN into several uops (GH-133180) 2025-05-05 14:31:48 -07:00
Irit Katriel 082dbf7788
gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396) 2025-05-05 17:46:56 +01:00
Mark Shannon ac7d5ba96e
GH-133231: Changes to executor management to support proposed `sys._jit` module (GH-133287)
* Track the current executor, not the previous one, on the thread-state. 

* Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.
2025-05-04 10:05:35 +01:00
Ken Jin ddac7ac59a
gh-132744: Check recursion limit in CALL_PY_GENERAL (GH-132746) 2025-05-02 17:36:29 +01:00
Irit Katriel a4be3bc34f
gh-133258: Fix crash in test_index (GH-133262) 2025-05-01 19:15:53 +02:00
Irit Katriel 5529213d4e
gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626) 2025-05-01 10:28:52 +00:00
Lysandros Nikolaou 60202609a2
gh-132661: Implement PEP 750 (#132662)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Wingy <git@wingysam.xyz>
Co-authored-by: Koudai Aono <koxudaxi@gmail.com>
Co-authored-by: Dave Peck <davepeck@gmail.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Paul Everitt <pauleveritt@me.com>
Co-authored-by: sobolevn <mail@sobolevn.me>
2025-04-30 11:46:41 +02:00
Tomas R. 08e3389e8c
GH-131798: Split up and optimize CALL_TUPLE_1 in the JIT (GH-132851) 2025-04-24 15:55:03 -07:00
Tomas R. 0a387b311e
GH-131798: Split up and optimize CALL_STR_1 in the JIT (GH-132849) 2025-04-24 12:54:46 -07:00
Tomas R. a6a3dbb7db
GH-131798: JIT: Split CALL_TYPE_1 into several uops (GH-132419) 2025-04-22 09:30:38 -07:00
Sam Gross da53660f35
gh-131586: Avoid refcount contention in context managers (gh-131851)
This avoid reference count contention in the free threading build
when calling special methods like `__enter__` and `__exit__`.
2025-04-21 15:54:25 -04:00
Brandt Bucher 40ae88988c
GH-131498: Replace single-element arrays with scalars in bytecodes.c (GH-132615) 2025-04-18 07:16:28 -07:00
Mark Shannon 844596c09f
GH-131498: Cases generator: Allow input and 'peek' variables to be modified (GH-132506) 2025-04-14 12:19:53 +01:00
Brandt Bucher 20926c73b5
GH-131798: Remove JIT guards for dict, frozenset, list, set, and tuple (GH-132289) 2025-04-09 14:32:21 -07:00
Brandt Bucher 3a8cefba0b
GH-131726: Split up _CHECK_VALIDITY_AND_SET_IP (GH-131810) 2025-04-01 16:55:05 -07:00
Brandt Bucher 1a9d4a1fb3
GH-131798: Allow the JIT to remove more int/float/str guards (GH-131800) 2025-04-01 15:10:15 -07:00
mpage 053c285f6b
gh-130704: Strength reduce `LOAD_FAST{_LOAD_FAST}` (#130708)
Optimize `LOAD_FAST` opcodes into faster versions that load borrowed references onto the operand stack when we can prove that the lifetime of the local outlives the lifetime of the temporary that is loaded onto the stack.
2025-04-01 10:18:42 -07:00
Amit Lavon 685fd74f81
GH-131798: Remove type checks for _TO_BOOL_STR (GH-131816) 2025-03-30 16:07:25 -07:00
Sam Gross 3d4ac1a2c2
gh-123358: Use `_PyStackRef` in `LOAD_DEREF` (gh-130064)
Concurrent accesses from multiple threads to the same `cell` object did not
scale well in the free-threaded build. Use `_PyStackRef` and optimistically
avoid locking to improve scaling.

With the locks around cell reads gone, some of the free threading tests were
prone to starvation: the readers were able to run in a tight loop and the
writer threads weren't scheduled frequently enough to make timely progress.
Adjust the tests to avoid this.

Co-authored-by: Donghee Na <donghee.na@python.org>
2025-03-26 12:08:20 -04:00
Mark Shannon 1b8bb1ed0c
GH-131729: Code-gen better liveness analysis (GH-131732)
* Rename 'defined' attribute to 'in_local' to more accurately reflect how it is used

* Make death of variables explicit even for array variables.

* Convert in_memory from boolean to stack offset

* Don't apply liveness analysis to optimizer generated code

* Fix RETURN_VALUE in optimizer
2025-03-26 15:21:35 +00:00
Savannah Ostrowski b92ee14b80
GH-130415: Optimize constant comparison in JIT builds (GH-131489) 2025-03-21 11:23:12 -07:00
T. Wouters de2f7da77d
gh-115999: Add free-threaded specialization for FOR_ITER (#128798)
Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)
2025-03-12 16:21:46 +01:00
Mark Shannon 2bef8ea8ea
GH-127705: Use `_PyStackRef`s in the default build. (GH-127875) 2025-03-10 14:06:56 +00:00
mpage d7bb7c7817
gh-118331: Fix a couple of issues when list allocation fails (#130811)
* Fix use after free in list objects

Set the items pointer in the list object to NULL after the items array
is freed during list deallocation. Otherwise, we can end up with a list
object added to the free list that contains a pointer to an already-freed
items array.

* Mark `_PyList_FromStackRefStealOnSuccess` as escaping

I think technically it's not escaping, because the only object that
can be decrefed if allocation fails is an exact list, which cannot
execute arbitrary code when it is destroyed. However, this seems less
intrusive than trying to special cases objects in the assert in `_Py_Dealloc`
that checks for non-null stackpointers and shouldn't matter for performance.
2025-03-05 10:42:09 -08:00
Mark Shannon 54965f3fb2
GH-130296: Avoid stack transients in four instructions. (GH-130310)
* Combine _GUARD_GLOBALS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_MODULE_FROM_KEYS into _LOAD_GLOBAL_MODULE

* Combine _GUARD_BUILTINS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_BUILTINS_FROM_KEYS into _LOAD_GLOBAL_BUILTINS

* Combine _CHECK_ATTR_MODULE_PUSH_KEYS and _LOAD_ATTR_MODULE_FROM_KEYS into _LOAD_ATTR_MODULE

* Remove stack transient in LOAD_ATTR_WITH_HINT
2025-02-28 18:00:38 +00:00
Mark Shannon 72f56654d0
GH-128682: Account for escapes in `DECREF_INPUTS` (GH-129953)
* Handle escapes in DECREF_INPUTS

* Mark a few more functions as escaping

* Replace DECREF_INPUTS with PyStackRef_CLOSE where possible
2025-02-12 17:44:59 +00:00
Irit Katriel a1417b211f
gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700) 2025-02-07 22:39:54 +00:00
Brandt Bucher 5fa7e1b7fd
GH-129715: Remove _DYNAMIC_EXIT (GH-129716) 2025-02-07 11:41:17 -08:00
Brandt Bucher 70e387c990
GH-129709: Clean up tier two (GH-129710) 2025-02-07 09:52:49 -08:00
Mark Shannon 808071b994
GH-128682: Make `PyStackRef_CLOSE` escaping. (GH-129404) 2025-02-03 12:41:32 +00:00
Irit Katriel 4815131910
gh-100239: specialize bitwise logical binary ops on ints (#128927) 2025-01-29 09:28:21 +00:00
Mark Shannon 75b4962157
GH-128914: Remove all but one conditional stack effects (GH-129226)
* Remove all 'if (0)' and 'if (1)' conditional stack effects

* Use array instead of conditional for BUILD_SLICE args

* Refactor LOAD_GLOBAL to use a common conditional uop

* Remove conditional stack effects from LOAD_ATTR specializations

* Replace conditional stack effects in LOAD_ATTR with a 0 or 1 sized array.

* Remove conditional stack effects from CALL_FUNCTION_EX
2025-01-27 16:24:48 +00:00
Sam Gross a10f99375e
Revert "GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)" (GH-129202)
The commit introduced a ~2.5-3% regression in the free threading build.

This reverts commit ab61d3f430.
2025-01-23 09:26:25 +00:00
Mark Shannon 470a0a68eb
GH-128682: Change a couple of functions to only steal references on success. (GH-129132)
Change PyTuple_FromStackRefSteal and PyList_FromStackRefSteal to only steal on success to avoid escaping
2025-01-22 10:51:37 +00:00
Mark Shannon ab61d3f430
GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918) 2025-01-20 17:09:23 +00:00
Xuanteng Huang b44ff6d0df
GH-126599: Remove the "counter" optimizer/executor (GH-126853) 2025-01-16 15:57:04 -08:00
Irit Katriel 3893a92d95
gh-100239: specialize long tail of binary operations (#128722) 2025-01-16 15:22:13 +00:00
mpage b5ee0258bf
gh-115999: Specialize `LOAD_ATTR` for instance and class receivers in free-threaded builds (#128164)
Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.
2025-01-14 11:56:11 -08:00
Mark Shannon 517dc65ffc
GH-128682: Stronger checking of `PyStackRef_CLOSE` and `DEAD`. (GH-128683) 2025-01-13 12:37:48 +00:00
Mark Shannon 39fc7ef4fe
GH-124483: Mark `Py_DECREF`, etc. as escaping for the JIT (GH-128678) 2025-01-13 11:42:45 +00:00
Mark Shannon ddd959987c
GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708) 2025-01-13 10:30:28 +00:00
Mark Shannon f826beca0c
GH-128375: Better instrument for `FOR_ITER` (GH-128445) 2025-01-06 17:54:47 +00:00
Ken Jin 7ef4907412
gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263) 2024-12-31 12:24:17 +08:00
Neil Schemenauer 1b15c89a17
gh-115999: Specialize `STORE_ATTR` in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Donghee Na 48c70b8f7d
gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
mpage 2de048ce79
gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
Donghee Na 7c2bd9b226
gh-115999: Use light-weight lock for UNPACK_SEQUENCE_LIST (gh-127514) 2024-12-03 00:14:40 +09:00
Donghee Na e2713409cf
gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00