Commit Graph

103 Commits

Author SHA1 Message Date
Mark Shannon 1b8bb1ed0c
GH-131729: Code-gen better liveness analysis (GH-131732)
* Rename 'defined' attribute to 'in_local' to more accurately reflect how it is used

* Make death of variables explicit even for array variables.

* Convert in_memory from boolean to stack offset

* Don't apply liveness analysis to optimizer generated code

* Fix RETURN_VALUE in optimizer
2025-03-26 15:21:35 +00:00
Savannah Ostrowski b92ee14b80
GH-130415: Optimize constant comparison in JIT builds (GH-131489) 2025-03-21 11:23:12 -07:00
Mark Shannon 7ebd71ee14
GH-131498: Remove conditional stack effects (GH-131499)
* Adds some missing #includes
2025-03-20 15:39:38 +00:00
T. Wouters de2f7da77d
gh-115999: Add free-threaded specialization for FOR_ITER (#128798)
Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)
2025-03-12 16:21:46 +01:00
Jamie Phan 10cdd7f91f
GH-130903: typo in optimizer DSL for _GUARD_BOTH_UNICODE (#130904)
Typo introduced in gh-118910.
2025-03-06 12:11:08 +01:00
Amit Lavon 691354ccb0
GH-130415: Narrow str to "" based on boolean tests (GH-130476) 2025-03-04 13:20:17 -08:00
Klaus117 c989e74446
GH-130415: Narrow int to 0 based on boolean tests (GH-130772) 2025-03-04 12:44:09 -08:00
Brandt Bucher 7afa476874
GH-130415: Use boolean guards to narrow types to values in the JIT (GH-130659) 2025-03-02 13:21:34 -08:00
Mark Shannon 54965f3fb2
GH-130296: Avoid stack transients in four instructions. (GH-130310)
* Combine _GUARD_GLOBALS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_MODULE_FROM_KEYS into _LOAD_GLOBAL_MODULE

* Combine _GUARD_BUILTINS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_BUILTINS_FROM_KEYS into _LOAD_GLOBAL_BUILTINS

* Combine _CHECK_ATTR_MODULE_PUSH_KEYS and _LOAD_ATTR_MODULE_FROM_KEYS into _LOAD_ATTR_MODULE

* Remove stack transient in LOAD_ATTR_WITH_HINT
2025-02-28 18:00:38 +00:00
Mark Shannon 72f56654d0
GH-128682: Account for escapes in `DECREF_INPUTS` (GH-129953)
* Handle escapes in DECREF_INPUTS

* Mark a few more functions as escaping

* Replace DECREF_INPUTS with PyStackRef_CLOSE where possible
2025-02-12 17:44:59 +00:00
Irit Katriel a1417b211f
gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700) 2025-02-07 22:39:54 +00:00
Brandt Bucher 5fa7e1b7fd
GH-129715: Remove _DYNAMIC_EXIT (GH-129716) 2025-02-07 11:41:17 -08:00
Brandt Bucher 70e387c990
GH-129709: Clean up tier two (GH-129710) 2025-02-07 09:52:49 -08:00
Mark Shannon 75b628adeb
GH-128563: Generate `opcode = ...` in instructions that need `opcode` (GH-129608)
* Remove support for GO_TO_INSTRUCTION
2025-02-03 15:09:21 +00:00
Mark Shannon 808071b994
GH-128682: Make `PyStackRef_CLOSE` escaping. (GH-129404) 2025-02-03 12:41:32 +00:00
Mark Shannon 75b4962157
GH-128914: Remove all but one conditional stack effects (GH-129226)
* Remove all 'if (0)' and 'if (1)' conditional stack effects

* Use array instead of conditional for BUILD_SLICE args

* Refactor LOAD_GLOBAL to use a common conditional uop

* Remove conditional stack effects from LOAD_ATTR specializations

* Replace conditional stack effects in LOAD_ATTR with a 0 or 1 sized array.

* Remove conditional stack effects from CALL_FUNCTION_EX
2025-01-27 16:24:48 +00:00
Sam Gross a10f99375e
Revert "GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918)" (GH-129202)
The commit introduced a ~2.5-3% regression in the free threading build.

This reverts commit ab61d3f430.
2025-01-23 09:26:25 +00:00
Mark Shannon ab61d3f430
GH-128914: Remove conditional stack effects from `bytecodes.c` and the code generators (GH-128918) 2025-01-20 17:09:23 +00:00
Mark Shannon f0f7b978be
GH-128939: Refactor JIT optimize structs (GH-128940) 2025-01-20 15:49:15 +00:00
Xuanteng Huang b44ff6d0df
GH-126599: Remove the "counter" optimizer/executor (GH-126853) 2025-01-16 15:57:04 -08:00
Irit Katriel 3893a92d95
gh-100239: specialize long tail of binary operations (#128722) 2025-01-16 15:22:13 +00:00
mpage b5ee0258bf
gh-115999: Specialize `LOAD_ATTR` for instance and class receivers in free-threaded builds (#128164)
Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.
2025-01-14 11:56:11 -08:00
Mark Shannon f49a1df6f3
GH-128682: Convert explicit loops closing arrays into `DECREF_INPUTS`. (GH-128822)
* Mark Py_DECREF and Py_XDECREF as escaping

* Remove explicit loops for clearing array inputs
2025-01-14 15:08:56 +00:00
Mark Shannon ddd959987c
GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708) 2025-01-13 10:30:28 +00:00
Brandt Bucher 65ae3d5a73
GH-127809: Fix the JIT's understanding of ** (GH-127844) 2025-01-07 17:25:48 -08:00
Mark Shannon f826beca0c
GH-128375: Better instrument for `FOR_ITER` (GH-128445) 2025-01-06 17:54:47 +00:00
Yan Yanchii 30efede33c
gh-128195: Add `_REPLACE_WITH_TRUE` to the tier2 optimizer (GH-128203)
Add `_REPLACE_WITH_TRUE` to the tier2 optimizer
2024-12-24 05:17:47 +08:00
Neil Schemenauer 1b15c89a17
gh-115999: Specialize `STORE_ATTR` in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Mark Shannon d2f1d917e8
GH-122548: Implement branch taken and not taken events for sys.monitoring (GH-122564) 2024-12-19 16:59:51 +00:00
Donghee Na 48c70b8f7d
gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
mpage 2de048ce79
gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
Ken Jin 6293d00e72
gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2024-11-09 11:35:33 +08:00
mpage 2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Mark Shannon faa3272fb8
GH-125837: Split `LOAD_CONST` into three. (GH-125972)
* Add LOAD_CONST_IMMORTAL opcode

* Add LOAD_SMALL_INT opcode

* Remove RETURN_CONST opcode
2024-10-29 11:15:42 +00:00
Brandt Bucher b5b06349eb
GH-125912: Teach the JIT's optimizer about _BINARY_OP_INPLACE_ADD_UNICODE (GH-125935) 2024-10-28 14:37:16 -07:00
mpage f978fb4f8d
gh-115999: Refactor `LOAD_GLOBAL` specializations to avoid reloading {globals, builtins} keys (gh-124953)
Each of the `LOAD_GLOBAL` specializations is implemented roughly as:

1. Load keys version.
2. Load cached keys version.
3. Deopt if (1) and (2) don't match.
4. Load keys.
5. Load cached index into keys.
6. Load object from (4) at offset from (5).

This is not thread-safe in free-threaded builds; the keys object may be replaced
in between steps (3) and (4).

This change refactors the specializations to avoid reloading the keys object and
instead pass the keys object from guards to be consumed by downstream uops.
2024-10-09 15:18:25 +00:00
Mark Shannon da071fa3e8
GH-119866: Spill the stack around escaping calls. (GH-124392)
* Spill the evaluation around escaping calls in the generated interpreter and JIT. 

* The code generator tracks live, cached values so they can be saved to memory when needed.

* Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.
2024-10-07 14:56:39 +01:00
Ken Jin b84a763dca
gh-120619: Optimize through `_Py_FRAME_GENERAL` (GH-124518)
* Optimize through _Py_FRAME_GENERAL

* refactor
2024-10-03 01:10:51 +08:00
Savannah Ostrowski 65f1237098
GH-123516: Improve JIT memory consumption by invalidating cold executors (GH-124443)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-09-27 00:35:42 +00:00
Ken Jin 8810e286fa
gh-121459: Deferred LOAD_GLOBAL (GH-123128)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>
2024-09-14 00:23:51 +08:00
Mark Shannon 4ed7d1d6ac
GH-123996: Explicitly mark 'self_or_null' as an array of size 1 to ensure that it is kept in memory for calls (GH-124003) 2024-09-12 15:32:45 +01:00
Mark Shannon a4fd7aa4a6
GH-115776: Allow any fixed sized object to have inline values (GH-123192) 2024-08-21 15:52:04 +01:00
Mark Shannon bb1d30336e
GH-118093: Make `CALL_ALLOC_AND_ENTER_INIT` suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Mark Shannon c13e7d98fb
GH-118093: Specialize `CALL_KW` (GH-123006) 2024-08-16 17:11:24 +01:00
Mark Shannon eec7bdaf01
GH-120024: Remove `CHECK_EVAL_BREAKER` macro. (GH-122968)
* Factor some instructions into micro-ops to isolate CHECK_EVAL_BREAKER for escape analysis

* Eliminate CHECK_EVAL_BREAKER macro
2024-08-14 12:04:05 +01:00
Mark Shannon 1795d6ceba
GH-122869: Add missing tier two optimizer cases (GH-122936) 2024-08-12 10:35:52 -07:00
Mark Shannon df13a1821a
GH-118095: Add tier two support for BINARY_SUBSCR_GETITEM (GH-120793) 2024-08-01 16:19:05 -07:00
Mark Shannon a9d56e38a0
GH-122155: Track local variables between pops and pushes in cases generator (GH-122286) 2024-08-01 09:27:26 +01:00
Mark Shannon 1ca99ed240
Manually override bytecode definition in optimizer, to avoid build error (GH-122316) 2024-07-26 18:38:52 +01:00
Brandt Bucher 64857d849f
GH-122294: Burn in the addresses of side exits (GH-122295) 2024-07-26 09:40:15 -07:00