This change adds variables that had been added since the last time the whitelist was updated. It also cleans up the list a little.
https://bugs.python.org/issue36876
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
This reduces the size of the data segment by **300 KB** of the executable because if the modules are deep-frozen then the marshalled frozen data just wastes space. This was inspired by comment by @gvanrossum in https://github.com/python/cpython/pull/29118#issuecomment-958521863. Note: There is a new option `--deepfreeze-only` in `freeze_modules.py` to change this behavior, it is on be default to save disk space.
```console
# du -s ./python before
27892 ./python
# du -s ./python after
27524 ./python
```
Automerge-Triggered-By: GH:ericsnowcurrently
Disable compiler optimization within test_peg_generator.
This speed up test_peg_generator by always disabling compiler
optimizations by using -O0 or equivalent when the test is building its
own C extensions.
A build not using --with-pydebug in order to speed up test execution
winds up with this test taking a very long time as it would do
repeated compilation of parser C code using the same optimization
flags as CPython was built with.
This speeds the test up 6-8x on gps-raspbian.
Also incorporate's #31017's win32 conditional and flags.
Co-authored-by: Kumar Aditya kumaraditya303
This change is a prerequisite for generating code for other global objects (like strings in gh-30928).
(We borrowed some code from Tools/scripts/deepfreeze.py.)
https://bugs.python.org/issue46541
The build system now uses a :program:`_bootstrap_python` interpreter for
freezing and deepfreezing again. To speed up build process the build tools
:program:`_bootstrap_python` and :program:`_freeze_module` are no longer
build with LTO.
Cross building depends on a build Python interpreter, which must have same
version and bytecode as target host Python.
Instead we use $(PYTHON_FOR_REGEN) .../deepfreeze.py with the
frozen .h file as input, as we did for Windows in bpo-45850.
We also get rid of the code that generates the .h files
when make regen-frozen is run (i.e., .../make_frozen.py),
and the MANIFEST file.
Restore Python 3.8 and 3.9 as Windows host Python again
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Implement changes to build with deep-frozen modules on Windows.
Note that we now require Python 3.10 as the "bootstrap" or "host" Python.
This causes a modest startup speed (around 7%) on Windows.
This gains 10% or more in startup time for `python -c pass` on UNIX-ish systems.
The Makefile.pre.in generating code builds on Eric's work for bpo-45020, but the .c file generator is new.
Windows version TBD.
Currently custom modules (the array set on PyImport_FrozenModules) replace all the frozen stdlib modules. That can be problematic and is unlikely to be what the user wants. This change treats the custom frozen modules as additions instead. They take precedence over all other frozen modules except for those needed to bootstrap the import system. If the "code" field of an entry in the custom array is NULL then that frozen module is treated as disabled, which allows a custom entry to disable a frozen stdlib module.
This change allows us to get rid of is_essential_frozen_module() and simplifies the logic for which frozen modules should be ignored.
https://bugs.python.org/issue45395
The "freeze" tool has been part of the repo for a long time. However, it hasn't had any tests in the test suite to guard against regressions. We add such a test here. This is especially important as there has been a lot of change recently related to frozen modules, with more to come.
Note that as part of the test we build Python out-of-tree and install it in a temp dir.
https://bugs.python.org/issue45629
This is a cross-platform check that the symbols are actually
exported in the ABI, not e.g. hidden in a macro.
Caveat: PyModule_Create2 & PyModule_FromDefAndSpec2 are skipped.
These aren't exported on some of our buildbots. This is a bug
(bpo-44133). This test now makes sure all the others don't regress.
There are two errors that this commit fixes:
* The parser was not correctly computing the offset and the string
source for E_LINECONT errors due to the incorrect usage of strtok().
* The parser was not correctly unwinding the call stack when a tokenizer
exception happened in rules involving optionals ('?', [...]) as we
always make them return valid results by using the comma operator. We
need to check first if we don't have an error before continuing.
* Never change types' cached keys. It could invalidate inline attribute objects.
* Lazily create object dictionaries.
* Update specialization of LOAD/STORE_ATTR.
* Don't update shared keys version for deletion of value.
* Update gdb support to handle instance values.
* Rename SPLIT_KEYS opcodes to INSTANCE_VALUE.
Like #28744 but for the Tools directory.
[skip issue] Opening a related issue is pending python/psf-infra-meta#130
Automerge-Triggered-By: GH:pablogsal
In the list of generated frozen modules at the top of Tools/scripts/freeze_modules.py, you will find that some of the modules have a different name than the module (or .py file) that is actually frozen. Let's call each case an "alias". Aliases do not come into play until we get to the (generated) list of modules in Python/frozen.c. (The tool for freezing modules, Programs/_freeze_module, is only concerned with the source file, not the module it will be used for.)
Knowledge of which frozen modules are aliases (and the identity of the original module) normally isn't important. However, this information is valuable when we go to set __file__ on frozen stdlib modules. This change updates Tools/scripts/freeze_modules.py to map aliases to the original module name (or None if not a stdlib module) in Python/frozen.c. We also add a helper function in Python/import.c to look up a frozen module's alias and add the result of that function to the frozen info returned from find_frozen().
https://bugs.python.org/issue45020
I've added a number of test-only modules. Some of those cases are covered by the recently frozen stdlib modules (and some will be once we add encodings back in). However, I figured we'd play it safe by having a set of modules guaranteed to be there during tests.
https://bugs.python.org/issue45020
Currently we're freezing the __init__.py twice, duplicating the built data unnecessarily With this change we do it once. There is no change in runtime behavior.
https://bugs.python.org/issue45020
* Makefile.pre.in: Add $(srcdir) when needed, remove it when it was
used by mistake.
* freeze_modules.py tool uses ./Programs/_freeze_module if the
executable doesn't exist in the source tree.
The update_file.py tool now preserves the end of line of the updated
file. Fix the "make regen-frozen" command: it no longer changes the
end of line of PCbuild/ files on Unix. Git changes the end of line
depending on the platform.
The main advantage is that the files will no longer show up in diffs and PRs. That means, for a PR, the number of files / lines changed will more clearly reflect the actual change. (This is essentially an un-revert of gh-28375.)
https://bugs.python.org/issue45020
The main advantage is that the files will no longer show up in diffs and PRs. That means, for a PR, the number of files / lines changed will more clearly reflect the actual change.
https://bugs.python.org/issue45020
Here's one more small cleanup that should have been in PR gh-28319. We eliminate stdout side-effects from importing the frozen __hello__ module, and update tests accordingly. We also move the module's source file into Lib/ from Toos/freeze/flag.py.
https://bugs.python.org/issue45019
This will enable us to drop the frozen module header files from the repository.
It does currently cause many source files to be built twice, which just takes more time. For whoever comes to fix this in the future, the files shared between freeze_module and pythoncore should be put into a static library that is consumed by both.
Doing this provides significant performance gains for runtime startup (~15% with all the imported modules frozen). We don't yet freeze all the imported modules because there are a few hiccups in the build systems we need to sort out first. (See bpo-45186 and bpo-45188.)
Note that in PR GH-28320 we added a command-line flag (-X frozen_modules=[on|off]) that allows users to opt out of (or into) using frozen modules. The default is still "off" but we will change it to "on" as soon as we can do it in a way that does not cause contributors pain.
https://bugs.python.org/issue45020
There are a few things I missed in gh-27980. This is a follow-up that will make subsequent PRs cleaner. It includes fixes to tests and tools that reference the frozen modules.
https://bugs.python.org/issue45019
* Constructors of subclasses of some buitin classes (e.g. tuple, list,
frozenset) no longer accept arbitrary keyword arguments.
* Subclass of set can now define a __new__() method with additional
keyword parameters without overriding also __init__().
Frozen modules must be added to several files in order to work properly. Before this change this had to be done manually. Here we add a tool to generate the relevant lines in those files instead. This helps us avoid mistakes and omissions.
https://bugs.python.org/issue45019
Places the locals between the specials and stack. This is the more "natural" layout for a C struct, makes the code simpler and gives a slight speedup (~1%)
* Convert "specials" array to InterpreterFrame struct, adding f_lasti, f_state and other non-debug FrameObject fields to it.
* Refactor, calls pushing the call to the interpreter upward toward _PyEval_Vector.
* Compute f_back when on thread stack, only filling in value when frame object outlives stack invocation.
* Move ownership of InterpreterFrame in generator from frame object to generator object.
* Do not create frame objects for Python calls.
* Do not create frame objects for generators.
* Remove struct _node from the stable ABI list
This struct was removed along with the old parser in Python 3.9 (PEP 617)
* Stable ABI list: Use the public name "PyFrameObject" rather than "_frame"
* Ensure limited API doesn't contain private names
Names prefixed by an underscore are private by definition.
* Add a blurb
* Specialize LOAD_ATTR with LOAD_ATTR_SLOT and LOAD_ATTR_SPLIT_KEYS
* Move dict-common.h to internal/pycore_dict.h
* Add LOAD_ATTR_WITH_HINT specialized opcode.
* Quicken in function if loopy
* Specialize LOAD_ATTR for module attributes.
* Add specialization stats
These were reverted in gh-26530 (commit 17c4edc) due to refleaks.
* 2c1e258 - Compute deref offsets in compiler (gh-25152)
* b2bf2bc - Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)
This change fixes the refleaks.
https://bugs.python.org/issue43693
* Revert "bpo-43693: Compute deref offsets in compiler (gh-25152)"
This reverts commit b2bf2bc1ec.
* Revert "bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)"
This reverts commit 2c1e2583fd.
These two commits are breaking the refleak buildbots.
A number of places in the code base (notably ceval.c and frameobject.c) rely on mapping variable names to indices in the frame "locals plus" array (AKA fast locals), and thus opargs. Currently the compiler indirectly encodes that information on the code object as the tuples co_varnames, co_cellvars, and co_freevars. At runtime the dependent code must calculate the proper mapping from those, which isn't ideal and impacts performance-sensitive sections. This is something we can easily address in the compiler instead.
This change addresses the situation by replacing internal use of co_varnames, etc. with a single combined tuple of names in locals-plus order, along with a minimal array mapping each to its kind (local vs. cell vs. free). These two new PyCodeObject fields, co_fastlocalnames and co_fastllocalkinds, are not exposed to Python code for now, but co_varnames, etc. are still available with the same values as before (though computed lazily).
Aside from the (mild) performance impact, there are a number of other benefits:
* there's now a clear, direct relationship between locals-plus and variables
* code that relies on the locals-plus-to-name mapping is simpler
* marshaled code objects are smaller and serialize/de-serialize faster
Also note that we can take this approach further by expanding the possible values in co_fastlocalkinds to include specific argument types (e.g. positional-only, kwargs). Doing so would allow further speed-ups in _PyEval_MakeFrameVector(), which is where args get unpacked into the locals-plus array. It would also allow us to shrink marshaled code objects even further.
https://bugs.python.org/issue43693
The invalid assignment rules are very delicate since the parser can
easily raise an invalid assignment when a keyword argument is provided.
As they are very deep into the grammar tree, is very difficult to
specify in which contexts these rules can be used and in which don't.
For that, we need to use a different version of the rule that doesn't do
error checking in those situations where we don't want the rule to raise
(keyword arguments and generator expressions).
We also need to check if we are in left-recursive rule, as those can try
to eagerly advance the parser even if the parse will fail at the end of
the expression. Failing to do this allows the parser to start parsing a
call as a tuple and incorrectly identify a keyword argument as an
invalid assignment, before it realizes that it was not a tuple after all.
* Remove 'zombie' frames. We won't need them once we are allocating fixed-size frames.
* Add co_nlocalplus field to code object to avoid recomputing size of locals + frees + cells.
* Move locals, cells and freevars out of frame object into separate memory buffer.
* Use per-threadstate allocated memory chunks for local variables.
* Move globals and builtins from frame object to per-thread stack.
* Move (slow) locals frame object to per-thread stack.
* Move internal frame functions to internal header.
The patch from [bpo-44074]() does not account for a possibly non-English locale and blindly greps for "HEAD branch" in a possibly localized text.
Automerge-Triggered-By: GH:pitrou
- Reformat the C API and ABI Versioning page (and extend/clarify a bit)
- Rewrite the stable ABI docs into a general text on C API Compatibility
- Add a list of Limited API contents, and notes for the individual items.
- Replace `Include/README.rst` with a link to a devguide page with the same info