Commit Graph

279 Commits

Author SHA1 Message Date
Victor Stinner 5fef4ff9ed
gh-111178: Fix function signature in pyexpat.c (#131674)
Move _Py_NO_SANITIZE_UNDEFINED macro from faulthandler.c to pyport.h.
2025-03-24 17:22:45 +00:00
Bénédikt Tran c070abadb2
gh-111178: fix UBSan failures in `Modules/pyexpat.c` (GH-129789)
Fix UBSan failures for `xmlparseobject`, XML handler setters
Suppress unused return values

Rewrite `RC_HANDLER` with better semantics
2025-02-24 11:56:32 +01:00
Victor Stinner 3447f4a56a
gh-129354: Use PyErr_FormatUnraisable() function (#129514)
Replace PyErr_WriteUnraisable() with PyErr_FormatUnraisable().
2025-01-31 14:20:35 +01:00
Sebastian Pipping 8e48a6edc7
gh-126624: Expose error code ``XML_ERROR_NOT_STARTED`` of Expat >=2.6.4 (#126625)
Expose error code ``XML_ERROR_NOT_STARTED`` in `xml.parsers.expat.errors` which was
introduced in Expat 2.6.4.


Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-01-02 14:54:38 +02:00
Bénédikt Tran 7303f06846
gh-126742: Add _PyErr_SetLocaleString, use it for gdbm & dlerror messages (GH-126746)
- Add a helper to set an error from locale-encoded `char*`
- Use the helper for gdbm & dlerror messages

Co-authored-by: Victor Stinner <vstinner@python.org>
2024-12-17 12:12:45 +01:00
Mark Shannon 00257c746c
GH-119462: Enforce invariants of type versioning (GH-120731)
* Remove uses of Py_TPFLAGS_VALID_VERSION_TAG
2024-06-19 17:38:45 +01:00
Geoffrey Thomas ef172521a9
Remove almost all unpaired backticks in docstrings (#119231)
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form

    The variable `foo' should do xyz

to

    The variable 'foo' should do xyz

and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).

No functional change is intended here other than in human-readable
docstrings.
2024-05-22 12:35:18 -04:00
Brett Simmers c2627d6eea
gh-116322: Add Py_mod_gil module slot (#116882)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.

PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.

A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
2024-05-03 11:30:55 -04:00
Sebastian Pipping 6a95676bb5
gh-115398: Expose Expat >=2.6.0 reparse deferral API (CVE-2023-52425) (GH-115623)
Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:

- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`

Based on the "flush" idea from https://github.com/python/cpython/pull/115138#issuecomment-1932444270 .

### Notes

- Please treat as a security fix related to CVE-2023-52425.

Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.
2024-02-29 14:52:50 -08:00
Sam Gross ef3ceab09d
gh-112066: Use `PyDict_SetDefaultRef` in place of `PyDict_SetDefault`. (#112211)
This changes a number of internal usages of `PyDict_SetDefault` to use `PyDict_SetDefaultRef`.

Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2024-02-07 13:43:18 -05:00
Erlend E. Aasland dcd28b5c35
gh-114569: Use PyMem_* APIs for most non-PyObject uses (#114574)
Fix usage in Modules, Objects, and Parser subdirectories.
2024-01-26 10:11:35 +00:00
Kirill Podoprigora cf34b7704b
gh-103092: Make ``pyexpat`` module importable in sub-interpreters (#113555) 2023-12-29 18:43:46 +05:30
Serhiy Storchaka d8908932fc
gh-111789: Use PyDict_GetItemRef() in Modules/pyexpat.c (gh-112079) 2023-11-27 18:51:31 +01:00
Serhiy Storchaka 59ea0f523e
gh-110093: Replace trivial Py_BuildValue() with direct C API call (GH-110094) 2023-10-20 18:08:41 +03:00
Victor Stinner 03c4080c71
gh-108765: Python.h no longer includes <ctype.h> (#108831)
Remove <ctype.h> in C files which don't use it; only sre.c and
_decimal.c still use it.

Remove _PY_PORT_CTYPE_UTF8_ISSUE code from pyport.h:

* Code added by commit b5047fd019
  in 2004 for MacOSX and FreeBSD.
* Test removed by commit 52ddaefb6b
  in 2007, since Python str type now uses locale independent
  functions like Py_ISALPHA() and Py_TOLOWER() and the Unicode
  database.

Modules/_sre/sre.c replaces _PY_PORT_CTYPE_UTF8_ISSUE with new
functions: sre_isalnum(), sre_tolower(), sre_toupper().

Remove unused includes:

* _localemodule.c: remove <stdio.h>.
* getargs.c: remove <float.h>.
* dynload_win.c: remove <direct.h>, it no longer calls _getcwd()
  since commit fb1f68ed7c (in 2001).
2023-09-03 18:54:27 +02:00
Victor Stinner 546cab8444
gh-106320: Remove private _PyTraceback functions (#108453)
Move private functions to the internal C API (pycore_traceback.h):

* _Py_DisplaySourceLine()
* _PyTraceback_Add()
2023-08-24 23:35:47 +00:00
Victor Stinner 1a3faba9f1
gh-106869: Use new PyMemberDef constant names (#106871)
* Remove '#include "structmember.h"'.
* If needed, add <stddef.h> to get offsetof() function.
* Update Parser/asdl_c.py to regenerate Python/Python-ast.c.
* Replace:

  * T_SHORT => Py_T_SHORT
  * T_INT => Py_T_INT
  * T_LONG => Py_T_LONG
  * T_FLOAT => Py_T_FLOAT
  * T_DOUBLE => Py_T_DOUBLE
  * T_STRING => Py_T_STRING
  * T_OBJECT => _Py_T_OBJECT
  * T_CHAR => Py_T_CHAR
  * T_BYTE => Py_T_BYTE
  * T_UBYTE => Py_T_UBYTE
  * T_USHORT => Py_T_USHORT
  * T_UINT => Py_T_UINT
  * T_ULONG => Py_T_ULONG
  * T_STRING_INPLACE => Py_T_STRING_INPLACE
  * T_BOOL => Py_T_BOOL
  * T_OBJECT_EX => Py_T_OBJECT_EX
  * T_LONGLONG => Py_T_LONGLONG
  * T_ULONGLONG => Py_T_ULONGLONG
  * T_PYSSIZET => Py_T_PYSSIZET
  * T_NONE => _Py_T_NONE
  * READONLY => Py_READONLY
  * PY_AUDIT_READ => Py_AUDIT_READ
  * READ_RESTRICTED => Py_AUDIT_READ
  * PY_WRITE_RESTRICTED => _Py_WRITE_RESTRICTED
  * RESTRICTED => (READ_RESTRICTED | _Py_WRITE_RESTRICTED)
2023-07-25 15:28:30 +02:00
Serhiy Storchaka 329e4a1a3f
gh-86493: Modernize modules initialization code (GH-106858)
Use PyModule_Add() or PyModule_AddObjectRef() instead of soft deprecated
PyModule_AddObject().
2023-07-25 14:34:49 +03:00
Victor Stinner 89f9875448
gh-106320: Move private _PyHash API to the internal C API (#107026)
* No longer export most private _PyHash symbols, only export the ones
  which are needed by shared extensions.
* Modules/_xxtestfuzz/fuzzer.c now uses the internal C API.
2023-07-22 13:49:37 +00:00
Serhiy Storchaka a293fa5915
gh-86493: Use PyModule_Add() instead of PyModule_AddObjectRef() (GH-106860) 2023-07-18 23:59:53 +03:00
Serhiy Storchaka be1b968dc1
gh-106521: Remove _PyObject_LookupAttr() function (GH-106642) 2023-07-12 08:57:10 +03:00
Victor Stinner 2e92edbf6d
gh-106320: Remove private _PyImport C API functions (#106383)
* Remove private _PyImport C API functions: move them to the internal
  C API (pycore_import.h).
* No longer export most of these private functions.
* _testcapi avoids private _PyImport_GetModuleAttrString().
2023-07-03 23:02:07 +00:00
Erlend E. Aasland 20a56d8bec
gh-105375: Harden pyexpat initialisation (#105606)
Add proper error handling to add_errors_module() to prevent exceptions
from possibly being overwritten.
2023-06-11 20:18:46 +00:00
Victor Stinner ef300937c2
gh-92536: Remove PyUnicode_READY() calls (#105210)
Since Python 3.12, PyUnicode_READY() does nothing and always
returns 0.
2023-06-02 01:33:17 +02:00
Kumar Aditya 20e994c535
GH-103092: isolate `pyexpat` (#104506) 2023-05-16 20:03:01 +00:00
Eric Snow a9c6e0618f
gh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (gh-104205)
Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules.  We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
2023-05-05 21:11:27 +00:00
Kumar Aditya b652d40f1c
GH-101797: allocate `PyExpat_CAPI` capsule on heap (#101798) 2023-02-11 14:07:39 +05:30
Nikita Sobolev b034fd3e59
gh-100689: Revert "bpo-41798: pyexpat: Allocate the expat_CAPI on the heap memory (GH-24061)" (#100745)
* gh-100689: Revert "bpo-41798: pyexpat: Allocate the expat_CAPI on the heap memory (GH-24061)"

This reverts commit 7c83eaa536.
2023-01-08 18:24:40 +05:30
Serhiy Storchaka a87c46eab3
bpo-15999: Accept arbitrary values for boolean parameters. (#15609)
builtins and extension module functions and methods that expect boolean values for parameters now accept any Python object rather than just a bool or int type. This is more consistent with how native Python code itself behaves.
2022-12-03 11:52:21 -08:00
Victor Stinner 3e2f7135e6
gh-99300: Use Py_NewRef() in Modules/ directory (#99469)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in test C files of the Modules/ directory.
2022-11-14 16:21:23 +01:00
Serhiy Storchaka 3fe7d7c020
gh-99426: Use PyUnicode_FromFormat() and PyErr_Format() instead of sprintf (GH-99427) 2022-11-14 15:25:34 +02:00
jonasdlindner ede6cb2615
Correct some typos in comments (GH-98194)
Automerge-Triggered-By: GH:AlexWaygood
2022-11-06 08:54:44 -08:00
Christian Heimes 4a7f5a55dc
gh-95853: Address wasm build and test issues (GH-95985) 2022-08-15 07:41:10 +02:00
Victor Stinner 27b9894033
gh-93937, C API: Move PyFrame_GetBack() to Python.h (#93938)
Move the follow functions and type from frameobject.h to pyframe.h,
so the standard <Python.h> provide frame getter functions:

* PyFrame_Check()
* PyFrame_GetBack()
* PyFrame_GetBuiltins()
* PyFrame_GetGenerator()
* PyFrame_GetGlobals()
* PyFrame_GetLasti()
* PyFrame_GetLocals()
* PyFrame_Type

Remove #include "frameobject.h" from many C files. It's no longer
needed.
2022-06-19 12:02:33 +02:00
Victor Stinner f62ad4f2c4
gh-89653: Use int type for Unicode kind (#92704)
Use the same type that PyUnicode_FromKindAndData() kind parameter
type (public C API): int.
2022-05-13 12:41:05 +02:00
Victor Stinner 7cdaf87ec5
gh-91731: Replace Py_BUILD_ASSERT() with static_assert() (#91730)
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.

* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
  SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
  the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
  _PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
  signed.
2022-04-20 19:26:40 +02:00
Dong-hee Na 2b86616456
bpo-46541: Remove usage of _Py_IDENTIFIER from pyexpat (GH-31468) 2022-02-21 23:46:52 +09:00
Eric Snow 81c72044a1
bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
Sebastian Pipping e18d81569f
bpo-45321: Add missing error codes to module `xml.parsers.expat.errors` (GH-30188)
The idea is to ensure that module `xml.parsers.expat.errors`
contains all known error codes and messages,
even when CPython is compiled or run with an outdated version of libexpat.

https://bugs.python.org/issue45321
2021-12-31 10:57:00 +01:00
TAGAMI Yukihiro 0742abdc48
bpo-45329: Fix freed memory access in pyexpat.c (GH-28649) 2021-10-02 12:57:13 +03:00
Erlend Egeberg Aasland 00710e6346
bpo-43908: Make heap types converted during 3.10 alpha immutable (GH-26351)
* Make functools types immutable

* Multibyte codec types are now immutable

* pyexpat.xmlparser is now immutable

* array.arrayiterator is now immutable

* _thread types are now immutable

* _csv types are now immutable

* _queue.SimpleQueue is now immutable

* mmap.mmap is now immutable

* unicodedata.UCD is now immutable

* sqlite3 types are now immutable

* _lsprof.Profiler is now immutable

* _overlapped.Overlapped is now immutable

* _operator types are now immutable

* winapi__overlapped.Overlapped is now immutable

* _lzma types are now immutable

* _bz2 types are now immutable

* _dbm.dbm and _gdbm.gdbm are now immutable
2021-06-17 11:06:09 +01:00
Erlend Egeberg Aasland 59af59c2df
bpo-42972: Fully support GC for pyexpat, unicodedata, and dbm/gdbm heap types (GH-26376)
* bpo-42972: pyexpat
* bpo-42972: unicodedata
* bpo-42972: dbm/gdbm
2021-05-27 17:29:00 +10:00
Erlend Egeberg Aasland 9746cda705
bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748)
Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the following types:

* _dbm.dbm
* _gdbm.gdbm
* _multibytecodec.MultibyteCodec
* _sre..SRE_Scanner
* _thread._localdummy
* _thread.lock
* _winapi.Overlapped
* array.arrayiterator
* functools.KeyWrapper
* functools._lru_list_elem
* pyexpat.xmlparser
* re.Match
* re.Pattern
* unicodedata.UCD
* zlib.Compress
* zlib.Decompress
2021-04-30 16:04:57 +02:00
Mohamed Koubaa c8a87addb1
bpo-1635741: Port pyexpat to multi-phase init (PEP 489) (GH-22222) 2021-01-04 15:34:26 +01:00
Hai Shi 7c83eaa536
bpo-41798: pyexpat: Allocate the expat_CAPI on the heap memory (GH-24061) 2021-01-03 16:47:44 +01:00
Mohamed Koubaa 7184218e18
bpo-1635741: Fix PyInit_pyexpat() error handling (GH-22489)
Split PyInit_pyexpat() into sub-functions and fix reference leaks
on error paths.
2020-11-04 18:37:23 +01:00
Serhiy Storchaka b510e101f8
bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986)
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.

And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
2020-10-26 12:47:57 +02:00
Victor Stinner 4a21e57fe5
bpo-40268: Remove unused structmember.h includes (GH-19530)
If only offsetof() is needed: include stddef.h instead.

When structmember.h is used, add a comment explaining that
PyMemberDef is used.
2020-04-15 02:35:41 +02:00
Serhiy Storchaka cd8295ff75
bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345) 2020-04-11 10:48:40 +03:00
Petr Viktorin ffd9753a94
bpo-39245: Switch to public API for Vectorcall (GH-18460)
The bulk of this patch was generated automatically with:

    for name in \
        PyObject_Vectorcall \
        Py_TPFLAGS_HAVE_VECTORCALL \
        PyObject_VectorcallMethod \
        PyVectorcall_Function \
        PyObject_CallOneArg \
        PyObject_CallMethodNoArgs \
        PyObject_CallMethodOneArg \
    ;
    do
        echo $name
        git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
    done

    old=_PyObject_FastCallDict
    new=PyObject_VectorcallDict
    git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"

and then cleaned up:

- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
2020-02-11 17:46:57 +01:00