Commit Graph

53733 Commits

Author SHA1 Message Date
Serhiy Storchaka 97b2ceaaaf
gh-127217: Fix pathname2url() for paths starting with multiple slashes on Posix (GH-127218) 2024-11-24 19:30:29 +02:00
Bénédikt Tran e0ef08f5b4
gh-122356: restore the position of a file-like object after `zipfile.is_zipfile` (#122397) 2024-11-24 11:36:15 -05:00
Savannah Ostrowski 2104bde572
GH-127133: Remove ability to nest argument groups & mutually exclusive groups (#127186) 2024-11-24 15:20:37 +00:00
Sergey B Kirpichev f7bb658124
gh-113841: fix possible undefined division by 0 in _Py_c_pow() (GH-127211)
`x**y == 1/x**-y ` thus changing `/=` to `*=` by negating the exponent.
2024-11-23 23:37:37 -08:00
Stephen Morton a4d4c1ede2
gh-126662: harmonize naming for three namedtuple base classes in urllib.parse (GH-126663)
harmonize naming for three namedtuple base classes in urllib.parse
2024-11-23 18:36:48 -08:00
Barney Gale 4ea71278ca
pathlib tests: move `walk()` tests into their own classes (GH-126651)
Move tests for Path.walk() into a new PathWalkTest class, and apply a similar change in tests for the ABCs. This allows us to properly tear down the walk test hierarchy in tearDown(), rather than leaving it to os_helper.rmtree().
2024-11-23 18:31:00 -08:00
Barney Gale cc813e10ff
GH-125866: Preserve Windows drive letter case in file URIs (#127138)
Stop converting Windows drive letters to uppercase in
`urllib.request.pathname2url()` and `url2pathname()`. This behaviour is
unnecessary and inconsistent with pathlib's file URI implementation.
2024-11-23 10:41:39 +00:00
Radislav Chugunov ca3ea9ad05
gh-109746: Make _thread.start_new_thread delete state of new thread on its startup failure (GH-109761)
If Python fails to start newly created thread
due to failure of underlying PyThread_start_new_thread() call,
its state should be removed from interpreter' thread states list
to avoid its double cleanup.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-11-22 21:20:07 +02:00
Kirill Podoprigora 27486c3365
gh-115999: Add free-threaded specialization for `UNPACK_SEQUENCE` (#126600)
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.


---------

Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
2024-11-22 19:00:35 +02:00
Andrei Bodrov 1848ce61f3
gh-88110: Clear concurrent.futures.thread._threads_queues after fork to avoid joining parent process' threads (GH-126098)
Threads are gone after fork, so clear the queues too. Otherwise the
child process (here created via multiprocessing.Process) crashes on
interpreter exit.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-11-22 18:20:34 +02:00
Jun Komoda 7725c0371a
gh-126384: Add tests to verify the behavior of basic COM methods. (GH-126610) 2024-11-22 16:57:31 +01:00
Victor Stinner 0cb20177d6
gh-109413: Fix libregrtest get_running() (#127153)
Skip threads which are not running.
2024-11-22 16:56:03 +01:00
Serhiy Storchaka 8899e85de1
gh-127001: Fix PATHEXT issues in shutil.which() on Windows (GH-127035)
* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
2024-11-22 17:52:15 +02:00
Serhiy Storchaka 0cb4d6c654
gh-86463: Fix default prog in subparsers if usage is used in the main parser (GH-125891)
The usage parameter of argparse.ArgumentParser no longer
affects the default value of the prog parameter in subparsers.

Previously the full custom usage of the main parser was used as
the prog prefix in subparsers.
2024-11-22 17:29:33 +02:00
Cody Maloney 46f8a7bbdb
gh-127076: Ignore memory mmap in FileIO testing (#127088)
`mmap`, `munmap`, and `mprotect` are used by CPython for memory
management, which may occur in the middle of the FileIO tests. The
system calls can also be used with files, so `strace` includes them
in its `%file` and `%desc` filters.

Filter out the `mmap` system calls related to memory allocation for the
file tests. Currently FileIO doesn't do `mmap` at all, so didn't add
code to track from `mmap` through `munmap` since it wouldn't be used.
For now if an `mmap` on a fd happens, the call will be included (which
may cause test to fail), and at that time support for tracking the
address throug `munmap` could be added.
2024-11-22 15:55:32 +01:00
Tomas R. 0a1944cda8
gh-126700: pygettext: Support more gettext functions (GH-126912)
Support multi-argument gettext functions: ngettext(), pgettext(), dgettext(), etc.
2024-11-22 16:52:16 +02:00
Barney Gale 8c98ed846a
GH-127078: `url2pathname()`: handle extra slash before UNC drive in URL path (#127132)
Decode a file URI like `file://///server/share` as a UNC path like
`\\server\share`. This form of file URI is created by software the simply
prepends `file:///` to any absolute Windows path.
2024-11-22 04:12:50 +00:00
Barney Gale ebf564a1d3
GH-126766: `url2pathname()`: handle 'localhost' authority (#127129)
Discard any 'localhost' authority from the beginning of a `file:` URI. As a
result, file URIs like `//localhost/etc/hosts` are correctly decoded as
`/etc/hosts`.
2024-11-22 03:17:06 +00:00
Barney Gale fd133d4f21
GH-126601: `pathname2url()`: handle NTFS alternate data streams (#126760)
Adjust `pathname2url()` to encode embedded colon characters in Windows
paths, rather than bailing out with an `OSError`.

Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2024-11-22 00:29:05 +00:00
Jacob Bower e8bb053941
gh-126091: Always link generator frames when propagating a thrown-in exception through a yield-from chain (#126092)
Always link generator frames when propagating a thrown-in exception through a yield-from chain.
2024-11-21 17:37:49 -06:00
Donghee Na 78a530a578
gh-115999: Add free-threaded specialization for ``TO_BOOL`` (gh-126616) 2024-11-22 07:52:16 +09:00
mpage 09c240f20c
gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607)
Enable specialization of LOAD_GLOBAL in free-threaded builds.

Thread-safety of specialization in free-threaded builds is provided by the following:

A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:

Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
2024-11-21 11:22:21 -08:00
Dino Viehland bf542f8bb9
gh-124470: Fix crash when reading from object instance dictionary while replacing it (#122489)
Delay free a dictionary when replacing it
2024-11-21 10:41:19 -06:00
Sam Gross 3926842117
gh-127020: Make `PyCode_GetCode` thread-safe for free threading (#127043)
Some fields in PyCodeObject are lazily initialized. Use atomics and
critical sections to make their initializations and accesses thread-safe.
2024-11-21 11:00:50 -05:00
Hugo van Kemenade dc7a2b6522
gh-118761: Improve import time of `mimetypes` (#126979) 2024-11-21 16:55:28 +02:00
Victor Stinner bab4b0462e
gh-124873: Tolerate 100 ms in TimerfdTests on Android (#127101)
On Android, TimerfdTests of test_os now uses 100 ms accuracy instead
of 10 ms.
2024-11-21 15:53:52 +01:00
Nice Zombies 60ec854bc2
gh-126780: Fix `ntpath.normpath()` for drive-relative paths (GH-126801) 2024-11-21 14:43:36 +00:00
Serhiy Storchaka 4803cd0244
gh-126727: Fix locale.nl_langinfo(locale.ERA) (GH-126730)
It now returns multiple era description segments separated by semicolons.
Previously it only returned the first segment on platforms with Glibc.
2024-11-21 13:16:08 +02:00
Serhiy Storchaka eaf2171082
gh-126997: Fix support of non-ASCII strings in pickletools (GH-127062)
* Fix support of STRING and GLOBAL opcodes with non-ASCII arguments.
* dis() now outputs non-ASCII bytes in STRING, BINSTRING and
  SHORT_BINSTRING arguments as escaped (\xXX).
2024-11-21 13:15:12 +02:00
Cody Maloney ff2278e2bf
gh-127076: Disable strace tests under LD_PRELOAD (#127086)
Distribution tooling (ex. sandbox on Gentoo and fakeroot on Debian) uses
LD_PRELOAD to intercept system calls and potentially modify them when
building. These tools can change the set of system calls, so disable
system call testing under these cases.

Co-authored-by: Michał Górny <mgorny@gentoo.org>
2024-11-21 10:33:12 +01:00
Mark Shannon aea0c586d1
GH-127010: Don't lazily track and untrack dicts (GH-127027) 2024-11-20 16:41:20 +00:00
Gregory P. Smith 7191b7662e
gh-97514: Authenticate the forkserver control socket. (GH-99309)
This adds authentication to the forkserver control socket. In the past only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID, which didn't exist when Linux abstract namespace sockets were used (see issue) meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type is a good idea.

This reuses the HMAC based shared key auth already used by `multiprocessing.connection` sockets for other purposes.

Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.

### pyperformance benchmarks

No significant changes. Including `concurrent_imap` which exercises `multiprocessing.Pool.imap` in that suite.

### Microbenchmarks

This does _slightly_ slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more.

Typically a 1-4% slowdown on a very targeted process creation microbenchmark, with a worst case overloaded system slowdown of 20%.  No evidence that these slowdowns appear in practical sense.  See the PR for details.
2024-11-20 08:18:58 -08:00
Brandt Bucher 48c50ff1a2
GH-126892: Reset warmup counters when JIT compiling code (GH-126893) 2024-11-20 08:11:25 -08:00
Serhiy Storchaka addb225f38
gh-126991: Add tests for unpickling bad object state (GH-127031)
This catches a memory leak in loading the BUILD opcode.
2024-11-20 17:31:50 +02:00
Jun Komoda 5b4502560b
gh-126615: `ctypes`: Make `COMError` public (GH-126686)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-11-20 12:53:43 +00:00
Barney Gale c9b399fbdb
GH-85168: Use filesystem encoding when converting to/from `file` URIs (#126852)
Adjust `urllib.request.url2pathname()` and `pathname2url()` to use the
filesystem encoding when quoting and unquoting file URIs, rather than
forcing use of UTF-8.

No changes are needed in the `nturl2path` module because Windows always
uses UTF-8, per PEP 529.
2024-11-19 21:19:30 +00:00
Hugo van Kemenade 2cdfb41d0c Merge remote-tracking branch 'upstream/main' 2024-11-19 22:10:24 +02:00
Eric Snow 1c0a104eca
gh-126914: Store the Preallocated Thread State's Pointer in a PyInterpreterState Field (gh-126989)
This approach eliminates the originally reported race. It also gets rid of the deadlock reported in gh-96071, so we can remove the workaround added then.
2024-11-19 12:59:19 -07:00
sobolevn 824afbf548
gh-109413: Enable mypy's `disallow_any_generics` setting when checking `libregrtest` (#127033) 2024-11-19 22:41:59 +03:00
Beomsoo Kim 8da9920a80
gh-126947: Typechecking for _pydatetime.timedelta.__new__ arguments (#126949)
Co-authored-by: sobolevn <mail@sobolevn.me>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
2024-11-19 22:40:52 +03:00
Malcolm Smith c5c9286804
gh-118201: Simplify conv_confname (#126089) 2024-11-19 10:42:19 -05:00
Hugo van Kemenade add43c3420 Python 3.14.0a2 2024-11-19 16:52:44 +02:00
sobolevn 3932e1db53
gh-126980: Fix `bytearray.__buffer__` crash on `PyBUF_{READ,WRITE}` (#126981)
Co-authored-by: Victor Stinner <vstinner@python.org>
2024-11-19 17:44:53 +03:00
Barney Gale 4d771977b1
GH-84850: Remove `urllib.request.URLopener` and `FancyURLopener` (#125739) 2024-11-19 16:01:49 +02:00
Hugo van Kemenade 899fdb213d
Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983) 2024-11-19 11:25:09 +02:00
Victor Stinner 84f07c3a4c
gh-126594: Fix typeobject.c wrap_buffer() cast (#126754)
Reject flags smaller than INT_MIN.

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
2024-11-19 09:13:20 +01:00
Victor Stinner b3687ad454
gh-126876: Fix socket internal_select() for large timeout (#126968)
If the timeout is larger than INT_MAX, replace it with INT_MAX, in
the poll() code path.

Add an unit test.
2024-11-19 09:08:42 +01:00
Brandt Bucher 4cd10762b0
GH-126795: Increase the JIT threshold from 16 to 4096 (GH-126816) 2024-11-18 11:11:23 -08:00
Hugo van Kemenade 933f21c3c9
gh-85957: Add missing MIME types for images with RFCs (#126966) 2024-11-18 20:13:20 +02:00
Serhiy Storchaka f7ef0203d4
gh-123803: Support arbitrary code page encodings on Windows (GH-123804)
If the cpXXX encoding is not directly implemented in Python, fall back
to use the Windows-specific API codecs.code_page_encode() and
codecs.code_page_decode().
2024-11-18 17:45:25 +00:00