Commit Graph

260 Commits

Author SHA1 Message Date
Pablo Galindo Salgado 42b25ad4d3
gh-91048: Refactor and optimize remote debugging module (#134652)
Completely refactor Modules/_remote_debugging_module.c with improved
code organization, replacing scattered reference counting and error
handling with centralized goto error paths. This cleanup improves
maintainability and reduces code duplication throughout the module while
preserving the same external API.

Implement memory page caching optimization in Python/remote_debug.h to
avoid repeated reads of the same memory regions during debugging
operations. The cache stores previously read memory pages and reuses
them for subsequent reads, significantly reducing system calls and
improving performance.

Add code object caching mechanism with a new code_object_generation
field in the interpreter state that tracks when code object caches need
invalidation. This allows efficient reuse of parsed code object metadata
and eliminates redundant processing of the same code objects across
debugging sessions.

Optimize memory operations by replacing multiple individual structure
copies with single bulk reads for the same data structures. This reduces
the number of memory operations and system calls required to gather
debugging information from the target process.

Update Makefile.pre.in to include Python/remote_debug.h in the headers
list, ensuring that changes to the remote debugging header force proper
recompilation of dependent modules and maintain build consistency across
the codebase.

Also, make the module compatible with the free threading build as an extra :)

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2025-05-25 20:19:29 +00:00
Eric Snow 27128e4fa8
gh-132775: Unrevert "Add _PyCode_VerifyStateless()" (gh-133528)
This reverts commit 3c73cf5 (gh-133497), which itself reverted
the original commit d270bb5 (gh-133221).

We reverted the original change due to failing android tests.
The checks in _PyCode_CheckNoInternalState() were too strict,
so we've relaxed them.
2025-05-08 00:00:33 +00:00
Petr Viktorin 3c73cf51df
gh-132775: Revert "gh-132775: Add _PyCode_VerifyStateless() (gh-133221)" (#133497) 2025-05-06 13:09:41 +03:00
Eric Snow d270bb5792
gh-132775: Add _PyCode_VerifyStateless() (gh-133221)
"Stateless" code is a function or code object which does not rely on external state or internal state.
It may rely on arguments and builtins, but not globals or a closure. I've left a comment in
pycore_code.h that provides more detail.

We also add _PyFunction_VerifyStateless(). The new functions will be used in several later changes
that facilitate "sharing" functions and code objects between interpreters.
2025-05-05 21:48:58 +00:00
Eric Snow 24ebb9ccfd
gh-132775: Unrevert "Add _PyCode_GetVarCounts()" (gh-133265)
This reverts commit 811edcf (gh-133232), which itself reverted the original commit 811edcf (gh-133128).

We reverted the original change due to failing s390 builds (a big-endian architecture).
It ended up that I had not accommodated op caches.
2025-05-05 13:24:29 -06:00
Eric Snow 811edcf9cd
Revert "gh-132775: Add _PyCode_GetVarCounts() (gh-133128)" (gh-133232)
The change broke the s390 builds, so I'm reverting it while I investigate.

This reverts commit 94b4fcd806.
2025-05-01 02:35:20 +00:00
Eric Snow 94b4fcd806
gh-132775: Add _PyCode_GetVarCounts() (gh-133128)
This helper is useful in a variety of ways, including in demonstrating how the different counts relate to one another.

It will be used in a later change to help identify if a function is "stateless", meaning it doesn't have any free vars or globals.

Note that a majority of this change is tests.
2025-04-30 18:19:20 +00:00
Eric Snow 96a7fb93a8
gh-132775: Add _PyCode_ReturnsOnlyNone() (gh-132981)
The function indicates whether or not the function has a return statement.

This is used by a later change related treating some functions like scripts.
2025-04-28 20:12:52 -06:00
Bénédikt Tran a81232c769
gh-132399: fix invalid function signatures on the free-threaded build (#132400) 2025-04-12 07:46:33 +00:00
Victor Stinner 1a082085ae
gh-131238: Remove pycore_object_deferred.h from pycore_object.h (#131549)
Remove also pycore_function.h from pycore_typeobject.h.
2025-03-21 16:44:10 +00:00
Mark Shannon 7ebd71ee14
GH-131498: Remove conditional stack effects (GH-131499)
* Adds some missing #includes
2025-03-20 15:39:38 +00:00
Victor Stinner b69da006a4
gh-131238: Remove includes from pycore_interp.h (#131495)
Remove also now unused includes in C files.
2025-03-20 11:35:23 +00:00
Victor Stinner 20c5f969dd
gh-131238: Remove more includes from pycore_interp.h (#131480) 2025-03-19 23:01:32 +01:00
Victor Stinner a5776639c8
gh-111178: Fix function signatures to fix undefined behavior (#131191) 2025-03-14 09:52:15 +00:00
Sam Gross 12db45211d
gh-130851: Only intern constants of types generated by the compiler (#130901)
The free-threading build interns and immortalizes most constants
generated by the bytecode compiler. However, users can construct their
own code objects with arbitrary constants. We should not intern or
immortalize these objects if they are not of a type that we know how to
handle.

This change fixes a reference leak failure in the recently added
`test_code.test_unusual_constants` test. It also addresses a potential
crash that could occur when attempting to destroy an immortalized
object during interpreter shutdown.
2025-03-07 10:34:53 -05:00
Sam Gross 2905690a91
gh-130851: Don't crash when deduping unusual code constants (#130853)
The bytecode compiler only generates a few different types of constants,
like str, int, tuple, slices, etc. Users can construct code objects with
various unusual constants, including ones that are not hashable or not
even constant.

The free threaded build previously crashed with a fatal error when
confronted with these constants. Instead, treat distinct objects of
otherwise unhandled types as not equal for the purposes of deduplication.
2025-03-05 15:04:49 +01:00
Hugo Beauzée-Luyssen 830f04b505
Postpone <stdbool.h> inclusion after Python.h (#130641)
Remove inclusions prior to Python.h.

<stdbool.h> will cause <features.h> to be included before Python.h can
define some macros to enable some additional features, causing multiple
types not to be defined down the line.
2025-02-28 10:09:27 +01:00
Yan Yanchii e6c76b947b
GH-128872: Remove unused argument from _PyCode_Quicken (GH-128873)
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
2025-02-02 15:09:30 -08:00
Mark Shannon 7239da7559
GH-127953: Make line number lookup O(1) regardless of the size of the code object (GH-128350) 2025-01-21 09:33:23 +00:00
Erlend E. Aasland 537296cdcd
gh-111178: Generate correct signature for most self converters (#128447) 2025-01-20 12:40:18 +01:00
Sam Gross d66c08aa75
gh-128923: Use zero to indicate unassigned unique id (#128925)
In the free threading build, the per thread reference counting uses a
unique id for some objects to index into the local reference count
table. Use 0 instead of -1 to indicate that the id is not assigned. This
avoids bugs where zero-initialized heap type objects look like they have
a unique id assigned.
2025-01-17 16:42:27 +01:00
Bénédikt Tran 4533036e50
gh-111178: fix UBSan failures in `Objects/codeobject.c` (GH-128240) 2025-01-13 14:25:04 +01:00
Mark Shannon d2f1d917e8
GH-122548: Implement branch taken and not taken events for sys.monitoring (GH-122564) 2024-12-19 16:59:51 +00:00
Sam Gross f4f530804b
gh-127582: Make object resurrection thread-safe for free threading. (GH-127612)
Objects may be temporarily "resurrected" in destructors when calling
finalizers or watcher callbacks. We previously undid the resurrection
by decrementing the reference count using `Py_SET_REFCNT`. This was not
thread-safe because other threads might be accessing the object
(modifying its reference count) if it was exposed by the finalizer,
watcher callback, or temporarily accessed by a racy dictionary or list
access.

This adds internal-only thread-safe functions for temporary object
resurrection during destructors.
2024-12-05 16:07:31 -05:00
Eric Snow 9dabace39d
gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077)
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock.  This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock.  Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
2024-11-21 11:08:38 -07:00
Sam Gross 3926842117
gh-127020: Make `PyCode_GetCode` thread-safe for free threading (#127043)
Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
2024-11-21 11:00:50 -05:00
Michael Droettboom a38e82bd8c
gh-126298: Don't deduplicate slice constants based on equality (#126398)
* gh-126298: Don't deduplicated slice constants based on equality

* NULL check for PySlice_New

* Fix refcounting

* Fix refcounting some more

* Fix refcounting

* Make tests more complete

* Fix tests
2024-11-07 16:39:23 +00:00
mpage 2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Sam Gross 332356b880
gh-125900: Clean-up logic around immortalization in free-threading (#125901)
* Remove `@suppress_immortalization` decorator
* Make suppression flag per-thread instead of per-interpreter
* Suppress immortalization in `eval()` to avoid refleaks in three tests
  (test_datetime.test_roundtrip, test_logging.test_config8_ok, and
   test_random.test_after_fork).
* frozenset() is constant, but not a singleton. When run multiple times,
  the test could fail due to constant interning.
2024-10-24 18:09:59 -04:00
Sam Gross 3ea488aac4
gh-124218: Use per-thread refcounts for code objects (#125216)
Use per-thread refcounting for the reference from function objects to
their corresponding code object. This can be a source of contention when
frequently creating nested functions. Deferred refcounting alone isn't a
great fit here because these references are on the heap and may be
modified by other libraries.
2024-10-15 15:06:41 -04:00
Victor Stinner 3ee474f568
gh-111178: Fix function signatures in codeobject.c (#125180) 2024-10-09 15:02:24 +00:00
Michael Droettboom c6127af868
gh-125063: Emit slices as constants in the bytecode compiler (#125064)
* Make slices marshallable

* Emit slices as constants

* Update Python/marshal.c

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>

* Refactor codegen_slice into two functions so it
always has the same net effect

* Fix for free-threaded builds

* Simplify marshal loading of slices

* Only return SUCCESS/ERROR from codegen_slice

---------

Co-authored-by: Mark Shannon <mark@hotpy.org>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
2024-10-08 13:18:39 -04:00
Victor Stinner d8e69b2c1b
gh-122854: Add Py_HashBuffer() function (#122855) 2024-08-30 15:42:27 +00:00
Mark Shannon 7a65439b93
GH-122390: Replace `_Py_GetbaseOpcode` with `_Py_GetBaseCodeUnit` (GH-122942) 2024-08-13 14:22:57 +01:00
Victor Stinner fda6bd842a
Replace PyObject_Del with PyObject_Free (#122453)
PyObject_Del() is just a alias to PyObject_Free() kept for backward
compatibility. Use directly PyObject_Free() instead.
2024-08-01 14:12:33 +02:00
Petr Viktorin cffad5c6ef
gh-121863: Immortalize names in code objects to avoid crash (GH-121903) 2024-07-17 11:31:28 +02:00
Xie Yanbo 0153fd0940
Fix typos in comments (#120821) 2024-06-24 19:47:00 +02:00
Steve Dower e731554337
Fixes loop variables to be the same types as their limit (GH-120958) 2024-06-24 17:11:47 +01:00
Petr Viktorin 6f1d448bc1
gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.

* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`

* Switch uses of `PyUnicode_InternInPlace` to those.

* Disallow using `_Py_SetImmortal` on strings directly.
  You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly
    interning a immortalizing copy.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
    `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
    backports, as they are now part of public API and version-specific ABI.

* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.

* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

* Intern `_Py_STR` singletons at startup.

* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.

* Beef up the tests. Cover internal details (marked with `@cpython_only`).

* Add lots of assertions

Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>
2024-06-21 17:19:31 +02:00
Sam Gross 47fb4327b5
gh-117657: Fix race involving immortalizing objects (#119927)
The free-threaded build currently immortalizes objects that use deferred
reference counting (see gh-117783). This typically happens once the
first non-main thread is created, but the behavior can be suppressed for
tests, in subinterpreters, or during a compile() call.

This fixes a race condition involving the tracking of whether the
behavior is suppressed.
2024-06-03 20:58:41 +00:00
Jelle Zijlstra 98e855fcc1
gh-119180: Add LOAD_COMMON_CONSTANT opcode (#119321)
The PEP 649 implementation will require a way to load NotImplementedError
from the bytecode. @markshannon suggested implementing this by converting
LOAD_ASSERTION_ERROR into a more general mechanism for loading constants.

This PR adds this new opcode. I will work on the rest of the implementation
of the PEP separately.

Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
2024-05-22 00:46:39 +00:00
Victor Stinner f6da790122
gh-111389: Add PyHASH_MULTIPLIER constant (#119214) 2024-05-21 19:51:51 +02:00
Sam Gross 723d4d2fe8
gh-118527: Intern code consts in free-threaded build (#118667)
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
2024-05-06 20:12:39 -04:00
Sam Gross 2ba2c142a6
gh-118527: Intern code name and filename on default build (#118576)
Interned and non-interned strings are treated differently by `marshal`,
so be consistent between the default and free-threaded build.
2024-05-06 17:24:14 -04:00
Sam Gross 37c31bea72
gh-118527: Intern filename, name, and qualname in code objects. (#118558)
This interns the strings for `co_filename`, `co_name`, and `co_qualname`
on codeobjects in the free-threaded build. This partially addresses a
reference counting bottleneck when creating closures concurrently. The
closures take the name and qualified name from the code object.
2024-05-03 18:16:45 -04:00
mpage 37d0950022
gh-117657: Disable the function/code cache in free-threaded builds (#118301)
This is only used by the specializing interpreter and the tier 2
optimizer, both of which are disabled in free-threaded builds.
2024-05-03 16:21:04 -04:00
Mark Shannon f6fab21721
GH-118095: Make invalidating and clearing executors memory safe (GH-118459) 2024-05-01 11:34:50 +01:00
Guido van Rossum 7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Sam Gross 241ed5f2cd
gh-117376: Make code objects use deferred reference counting (#117823)
We want code objects to use deferred reference counting in the
free-threaded build. This requires them to be tracked by the GC, so we
set `Py_TPFLAGS_HAVE_GC` in the free-threaded build, but not the default
build.
2024-04-16 12:42:53 -04:00
Serhiy Storchaka 6e05537676
gh-117764: Add docstrings and signatures for the __replace__ methods (GH-117768) 2024-04-12 08:46:20 +00:00