cpython

Commit Graph

Author	SHA1	Message	Date
Barney Gale	01d91500ca	GH-128520: Make `pathlib._abc.WritablePath` a sibling of `ReadablePath` (#129014 ) In the private pathlib ABCs, support write-only virtual filesystems by making `WritablePath` inherit directly from `JoinablePath`, rather than subclassing `ReadablePath`. There are two complications: - `ReadablePath.open()` applies to both reading and writing - `ReadablePath.copy` is secretly an object that supports the read side of copying, whereas `WritablePath.copy` is a different kind of object supporting the write side We untangle these as follows: - A new `pathlib._abc.magic_open()` function replaces the `open()` method, which is dropped from the ABCs but remains in `pathlib.Path`. The function works like `io.open()`, but additionally accepts objects with `__open_rb__()` or `__open_wb__()` methods as appropriate for the mode. These new dunders are made abstract methods of `ReadablePath` and `WritablePath` respectively. If the pathlib ABCs are made public, we could consider blessing an "openable" protocol and supporting it in `io.open()`, removing the need for `pathlib._abc.magic_open()`. - `ReadablePath.copy` becomes a true method, whereas `WritablePath.copy` is deleted. A new `ReadablePath._copy_reader` property provides a `CopyReader` object, and similarly `WritablePath._copy_writer` is a `CopyWriter` object. Once GH-125413 is resolved, we'll be able to move the `CopyReader` functionality into `ReadablePath.info` and eliminate `ReadablePath._copy_reader`.	2025-01-21 18:35:37 +00:00
Barney Gale	22a442181d	GH-128520: Divide pathlib ABCs into three classes (#128523 ) In the private pathlib ABCs, rename `PurePathBase` to `JoinablePath`, and split `PathBase` into `ReadablePath` and `WritablePath`. This improves the API fit for read-only virtual filesystems. The split of `PathBase` entails a similar split of `CopyWorker` (implements copying) and the test cases in `test_pathlib_abc`. In a later patch, we'll make `WritablePath` inherit directly from `JoinablePath` rather than `ReadablePath`. For a couple of reasons, this isn't quite possible yet.	2025-01-11 19:27:47 +00:00
Barney Gale	fd94c6a803	pathlib tests: create `walk()` test hierarchy without using class under test (#128338 ) In the tests for `pathlib.Path.walk()`, avoid using the path class under test (`self.cls`) in test setup. Instead we use `os` functions in `test_pathlib`, and direct manipulation of `DummyPath` internal data in `test_pathlib_abc`.	2025-01-04 15:45:24 +00:00
Barney Gale	95352dcb93	GH-127381: pathlib ABCs: remove `PathBase.move()` and `move_into()` (#128337 ) These methods combine `_delete()` and `copy()`, but `_delete()` isn't part of the public interface, and it's unlikely to be added until the pathlib ABCs are made official, or perhaps even later.	2025-01-04 12:53:51 +00:00
Barney Gale	ef63cca494	GH-127381: pathlib ABCs: remove uncommon `PurePathBase` methods (#127853 ) Remove `PurePathBase.relative_to()` and `is_relative_to()` because they don't account for other being an entirely different kind of path, and they can't use `__eq__()` because it's not on the `PurePathBase` interface. Remove `PurePathBase.drive`, `root`, `is_absolute()` and `as_posix()`. These are all too specific to local filesystems.	2024-12-29 22:07:12 +00:00
Barney Gale	c78729f2df	GH-127381: pathlib ABCs: remove `PathBase.stat()` (#128334 ) Remove the `PathBase.stat()` method. Its use of the `os.stat_result` API, with its 10 mandatory fields and low-level types, makes it an awkward fit for virtual filesystems. We'll look to add a `PathBase.info` attribute later - see GH-125413.	2024-12-29 21:42:07 +00:00
Barney Gale	d61542b5ff	pathlib tests: create test hierarchy without using class under test (#128200 ) In the pathlib tests, avoid using the path class under test (`self.cls`) in test setup. Instead we use `os` functions in `test_pathlib`, and direct manipulation of `DummyPath` internal data in `test_pathlib_abc`.	2024-12-23 17:22:15 +00:00
Barney Gale	a959ea1b0a	GH-127807: pathlib ABCs: remove `PurePathBase._raw_paths` (#127883 ) Remove the `PurePathBase` initializer, and make `with_segments()` and `__str__()` abstract. This allows us to drop the `_raw_paths` attribute, and also the `Parser.join()` protocol method.	2024-12-22 01:17:59 +00:00
Barney Gale	7146f18946	GH-127807: pathlib ABCs: remove `PathBase._unsupported_msg()` (#127855 ) This method helped us customise the `UnsupportedOperation` message depending on the type. But we're aiming to make `PathBase` a proper ABC soon, so `NotImplementedError` is the right exception to raise there.	2024-12-12 17:39:24 +00:00
Barney Gale	292afd1d51	GH-127381: pathlib ABCs: remove remaining uncommon `PathBase` methods (#127714 ) Remove the following methods from `pathlib._abc.PathBase`: - `expanduser()` - `hardlink_to()` - `touch()` - `chmod()` - `lchmod()` - `owner()` - `group()` - `from_uri()` - `as_uri()` These operations aren't regularly supported in virtual filesystems, so they don't win a place in the `PathBase` interface. (Some of them probably don't deserve a place in `Path` :P.) They're quasi-abstract (except `lchmod()`), and they're not called by other `PathBase` methods.	2024-12-12 06:49:34 +00:00
Barney Gale	12b4f1a5a1	GH-127381: pathlib ABCs: remove `PathBase.samefile()` and rarer `is_*()` (#127709 ) Remove `PathBase.samefile()`, which is fairly specific to the local FS, and relies on `stat()`, which we're aiming to remove from `PathBase`. Also remove `PathBase.is_mount()`, `is_junction()`, `is_block_device()`, `is_char_device()`, `is_fifo()` and `is_socket()`. These rely on POSIX file type numbers that we're aiming to remove from the `PathBase` API.	2024-12-11 00:09:55 +00:00
Barney Gale	5c89adf385	GH-127456: pathlib ABCs: add protocol for path parser (#127494 ) Change the default value of `PurePathBase.parser` from `ParserBase()` to `posixpath`. As a result, user subclasses of `PurePathBase` and `PathBase` use POSIX path syntax by default, which is very often desirable. Move `pathlib._abc.ParserBase` to `pathlib._types.Parser`, and convert it to a runtime-checkable protocol. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2024-12-09 18:31:22 +00:00
Barney Gale	7f8ec52302	GH-127381: pathlib ABCs: remove `PathBase.unlink()` and `rmdir()` (#127736 ) Virtual filesystems don't always make a distinction between deleting files and empty directories, and sometimes support deleting non-empty directories in a single operation. Here we remove `PathBase.unlink()` and `rmdir()`, leaving `_delete()` as the sole deletion method, now made abstract. I hope to drop the underscore prefix later on.	2024-12-08 18:45:09 +00:00
Barney Gale	31c9f3ced2	GH-127381: pathlib ABCs: remove `PathBase.resolve()` and `absolute()` (#127707 ) Remove our implementation of POSIX path resolution in `PathBase.resolve()`. This functionality is rather fragile and isn't necessary in most cases. It depends on `PathBase.stat()`, which we're looking to remove. Also remove `PathBase.absolute()`. Many legitimate virtual filesystems lack the notion of a 'current directory', so it's wrong to include in the basic interface.	2024-12-06 21:39:45 +00:00
Barney Gale	5b6635f772	GH-127381: pathlib ABCs: remove `PathBase.rename()` and `replace()` (#127658 ) These methods are obviated by `PathBase.move()`, which can move directories and supports any `PathBase` object as a target.	2024-12-06 18:10:00 +00:00
Barney Gale	8b3cccf3f9	GH-125413: Revert addition of `pathlib.Path.scandir()` method (#127377 ) Remove documentation for `pathlib.Path.scandir()`, and rename the method to `_scandir()`. In the private pathlib ABCs, make `iterdir()` abstract and call it from `_scandir()`. It's not worthwhile to add this method at the moment - see discussion: https://discuss.python.org/t/ergonomics-of-new-pathlib-path-scandir/71721 Co-authored-by: Steve Dower <steve.dower@microsoft.com>	2024-12-05 21:39:43 +00:00
Hood Chatham	43634fc1fc	gh-127146: Emscripten: Skip segfaults in test suite (#127151 ) Added skips for tests known to cause problems when running on Emscripten. These mostly relate to the limited stack depth on Emscripten.	2024-12-05 08:26:25 +08:00
Barney Gale	328187cc4f	GH-127381: pathlib ABCs: remove `PathBase.cwd()` and `home()` (#127427 ) These classmethods presume that the user has retained the original `__init__()` signature, which may not be the case. Also, many virtual filesystems don't provide current or home directories.	2024-11-30 18:39:39 +00:00
Barney Gale	38264a060a	GH-127381: pathlib ABCs: remove `PathBase.lstat()` (#127382 ) Remove the `PathBase.lstat()` method, which is a trivial variation of `stat()`. No user-facing changes because the pathlib ABCs are still private.	2024-11-29 21:03:39 +00:00
Barney Gale	4ea71278ca	pathlib tests: move `walk()` tests into their own classes (GH-126651) Move tests for Path.walk() into a new PathWalkTest class, and apply a similar change in tests for the ABCs. This allows us to properly tear down the walk test hierarchy in tearDown(), rather than leaving it to os_helper.rmtree().	2024-11-23 18:31:00 -08:00
Barney Gale	266328552e	pathlib ABCs: tighten up `resolve()` and `absolute()` (#126611 ) In `PathBase.resolve()`, raise `UnsupportedOperation` if a non-POSIX path parser is used (our implementation uses `posixpath._realpath()`, which produces incorrect results for non-POSIX path flavours.) Also tweak code to call `self.absolute()` upfront rather than supplying an emulated `getcwd()` function. Adjust `PathBase.absolute()` to work somewhat like `resolve()`. If a POSIX path parser is used, we treat the root directory as the current directory. This is the simplest useful behaviour for concrete path types without a current directory cursor.	2024-11-09 18:47:49 +00:00
Barney Gale	0f47a3199c	pathlib ABCs: support initializing paths with no arguments (#126608 ) In the past I've equivocated about whether to require at least one argument in the `PurePathBase` (and `PathBase`) initializer, and what the default should be if we make it optional. I now have a local use case that has persuaded me to make it optional and default to the empty string (a `zipp.Path`-like class that treats relative and absolute paths similarly.) Happily this brings the base class more in line with `PurePath` and `Path`.	2024-11-09 18:21:53 +00:00
Barney Gale	5e9168492f	pathlib ABCs: defer path joining (#126409 ) Defer joining of path segments in the private `PurePathBase` ABC. The new behaviour matches how the public `PurePath` class handles path segments. This removes a hard-to-grok difference between the ABCs and the main classes. It also slightly reduces the size of `PurePath` objects by eliminating a `_raw_path` slot.	2024-11-05 21:19:36 +00:00
Barney Gale	37651cfbce	GH-125413: pathlib ABCs: use `scandir()` to speed up `walk()` (#126262 ) Use the new `PathBase.scandir()` method in `PathBase.walk()`, which greatly reduces the number of `PathBase.stat()` calls needed when walking. There are no user-facing changes, because the pathlib ABCs are still private and `Path.walk()` doesn't use the implementation in its superclass.	2024-11-01 18:52:00 +00:00
Barney Gale	68a51e0178	GH-125413: pathlib ABCs: use `scandir()` to speed up `glob()` (#126261 ) Use the new `PathBase.scandir()` method in `PathBase.glob()`, which greatly reduces the number of `PathBase.stat()` calls needed when globbing. There are no user-facing changes, because the pathlib ABCs are still private and `Path.glob()` doesn't use the implementation in its superclass.	2024-11-01 17:48:58 +00:00
Barney Gale	260843df1b	GH-125413: Add `pathlib.Path.scandir()` method (#126060 ) Add `pathlib.Path.scandir()` as a trivial wrapper of `os.scandir()`. This will be used to implement several `PathBase` methods more efficiently, including methods that provide `Path.copy()`.	2024-11-01 01:19:01 +00:00
Barney Gale	7bd6ebf696	GH-73991: Prune `pathlib.Path.copy()` and `copy_into()` arguments (#123337 ) Remove ignore and on_error arguments from `pathlib.Path.copy[_into]()`, because these arguments are under-designed. Specifically: - ignore is appropriated from `shutil.copytree()`, but it's not clear how it should apply when the user copies a non-directory. We've changed the callback signature from the `shutil` version, but I'm not confident the new signature is as good as it can be. - on_error is a generalisation of `shutil.copytree()`'s error handling, which is to accumulate exceptions and raise a single `shutil.Error` at the end. It's not obvious which solution is better. Additionally, these arguments may be challenging to implement in future user subclasses of `PathBase`, which might utilise a native recursive copying method.	2024-08-26 17:05:34 +01:00
Barney Gale	033d537cd4	GH-73991: Make `pathlib.Path.delete()` private. (#123315 ) Per feedback from Paul Moore on GH-123158, it's better to defer making `Path.delete()` public than ship it with under-designed error handling capabilities. We leave a remnant `_delete()` method, which is used by `move()`. Any functionality not needed by `move()` is deleted.	2024-08-26 16:26:34 +01:00
Barney Gale	c68a93c582	GH-73991: Add `pathlib.Path.copy_into()` and `move_into()` (#123314 ) These two methods accept an existing directory path, onto which we join the source path's base name to form the final target path. A possible alternative implementation is to check for directories in `copy()` and `move()` and adjust the target path, which is done in several `shutil` functions. This behaviour is helpful in a shell context, but less so in a stored program that explicitly specifies destinations. For example, a user that calls `Path('foo.py').copy('bar.py')` might not imagine that `bar.py/foo.py` would be created, but under the alternative implementation this will happen if `bar.py` is an existing directory.	2024-08-26 14:14:23 +01:00
Barney Gale	625d0705b9	GH-73991: Add `pathlib.Path.move()` (#122073 ) Add a `Path.move()` method that moves a file or directory tree, and returns a new `Path` instance pointing to the target. This method is similar to `shutil.move()`, except that it doesn't accept a copy_function argument, and it doesn't check whether the destination is an existing directory.	2024-08-25 16:51:51 +01:00
Barney Gale	d7ae4dc5c1	GH-73991: Disallow copying directory into itself via `pathlib.Path.copy()` (#122924 )	2024-08-23 20:03:11 +01:00
Cody Maloney	35d8ac7cd7	GH-120754: Disable buffering in Path.read_bytes (#122111 ) `Path.read_bytes()` is used to read a whole file. buffering / BufferedIO is focused around making small, possibly interleaved, read/write efficient which doesn't add value in this case. On my Mac, running the benchmark: ```python import pyperf from pathlib import Path def read_all(all_paths): for p in all_paths: p.read_bytes() def read_file(path_obj): path_obj.read_bytes() all_rst = list(Path("Doc").glob("**/*.rst")) all_py = list(Path(".").glob("**/*.py")) assert all_rst, "Should have found rst files" assert all_py, "Should have found python source files" runner = pyperf.Runner() runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst")) runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst")) ``` before: ```python ..................... read_file_small: Mean +- std dev: 6.80 us +- 0.07 us ..................... read_file_large: Mean +- std dev: 10.8 us +- 0.2 us ``` after: ```python ..................... read_file_small: Mean +- std dev: 5.67 us +- 0.05 us ..................... read_file_large: Mean +- std dev: 9.77 us +- 0.52 us ```	2024-08-16 13:52:41 -07:00
Barney Gale	a6644d4464	GH-73991: Rework `pathlib.Path.copytree()` into `copy()` (#122369 ) Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.) Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `delete()` methods (which will also accept any type of file.) Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>	2024-08-11 22:43:18 +01:00
Barney Gale	98dba73010	GH-73991: Rework `pathlib.Path.rmtree()` into `delete()` (#122368 ) Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `copy()` methods (which will also accept any type of file.)	2024-08-07 01:34:44 +01:00
Barney Gale	094375b9b7	GH-73991: Add `pathlib.Path.rmtree()` (#119060 ) Add a `Path.rmtree()` method that removes an entire directory tree, like `shutil.rmtree()`. The signature of the optional on_error argument matches the `Path.walk()` argument of the same name, but differs from the onexc and onerror arguments to `shutil.rmtree()`. Consistency within pathlib is probably more important. In the private pathlib ABCs, we add an implementation based on `walk()`. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2024-07-20 20:14:13 +00:00
Barney Gale	f09d184821	GH-73991: Support copying directory symlinks on older Windows (#120807 ) Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and fall back to the `PathBase.copy()` implementation.	2024-07-03 04:30:29 +01:00
Barney Gale	35e998f560	GH-73991: Add `pathlib.Path.copytree()` (#120718 ) Add `pathlib.Path.copytree()` method, which recursively copies one directory to another. This differs from `shutil.copytree()` in the following respects: 1. Our method has a follow_symlinks argument, whereas shutil's has a symlinks argument with an inverted meaning. 2. Our method lacks something like a copy_function argument. It always uses `Path.copy()` to copy files. 3. Our method lacks something like an ignore_dangling_symlinks argument. Instead, users can filter out dangling symlinks with ignore, or ignore exceptions with on_error. 4. Our ignore argument is a callable that accepts a single path object, whereas shutil's accepts a path and a list of child filenames. 5. We add an on_error argument, which is a callable that accepts an `OSError` instance. (`Path.walk()` also accepts such a callable). Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>	2024-06-23 22:01:12 +01:00
Barney Gale	20d5b84f57	GH-73991: Add follow_symlinks argument to `pathlib.Path.copy()` (#120519 ) Add support for not following symlinks in `pathlib.Path.copy()`. On Windows we add the `COPY_FILE_COPY_SYMLINK` flag if following symlinks is disabled. If the source is a symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs. No news as `copy()` was only just added.	2024-06-19 00:59:54 +00:00
Barney Gale	9f741e55c1	GH-73991: pathlib ABC tests: add `DummyPath.unlink()` and `rmdir()` (#120715 ) In preparation for the addition of `PathBase.rmtree()`, implement `DummyPath.unlink()` and `rmdir()`, and move corresponding tests into `test_pathlib_abc` so they're run against `DummyPath`.	2024-06-18 22:13:45 +00:00
Barney Gale	7c38097add	GH-73991: Add `pathlib.Path.copy()` (#119058 ) Add a `Path.copy()` method that copies the content of one file to another. This method is similar to `shutil.copyfile()` but differs in the following ways: - Uses `fcntl.FICLONE` where available (see GH-81338) - Uses `os.copy_file_range` where available (see GH-81340) - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`. The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata. Incorporates code from GH-81338 and GH-93152. Co-authored-by: Eryk Sun <eryksun@gmail.com>	2024-06-14 17:15:49 +01:00
Barney Gale	e418fc3a6e	GH-82805: Fix handling of single-dot file extensions in pathlib (#118952 ) pathlib now treats "`.`" as a valid file extension (suffix). This brings it in line with `os.path.splitext()`. In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method that splits a path into a `(root, ext)` pair, like `os.path.splitext()`. This method is called by `PurePathBase.stem`, `suffix`, etc. In a future version of pathlib, we might make these base classes public, and so users will be able to define their own `splitext()` method to control file extension splitting. In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes` properties that don't use `splitext()`, which avoids computing the path base name twice.	2024-05-25 21:01:36 +01:00
Barney Gale	3c28510b98	GH-119113: Raise `TypeError` from `pathlib.PurePath.with_suffix(None)` (#119124 ) Restore behaviour from 3.12 when `path.with_suffix(None)` is called.	2024-05-19 17:04:56 +01:00
Barney Gale	a74f117dab	GH-115060: Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831 ) Since `6258844c`, paths that might not exist can be fed into pathlib's globbing implementation, which will call `os.scandir()` / `os.lstat()` only when strictly necessary. This allows us to drop an initial `self.is_dir()` call, which saves a `stat()`. Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>	2024-04-14 00:08:03 +01:00
Barney Gale	0eb52f5f26	GH-115060: Speed up `pathlib.Path.glob()` by not scanning literal parts (#117732 ) Don't bother calling `os.scandir()` to scan for literal pattern segments, like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call through to the next selector with `exists=False`, which signals that the path might not exist. Subsequent selectors will call `os.scandir()` or `os.lstat()` to filter out missing paths as needed.	2024-04-12 22:19:21 +01:00
Barney Gale	6150bb2412	GH-77609: Add recurse_symlinks argument to `pathlib.Path.glob()` (#117311 ) Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows: `follow_symlinks=False` has no equivalent (N/A), `follow_symlinks=None` corresponds to `recurse_symlinks=False`, and `follow_symlinks=True` corresponds to `recurse_symlinks=True`. We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells. This makes the API easier to grok by eliminating `None` as an option. No news blurb as `follow_symlinks` was new in 3.13.	2024-04-05 18:51:54 +00:00
Barney Gale	752e18389e	GH-114575: Rename `PurePath.pathmod` to `PurePath.parser` (#116513 ) And rename the private base class from `PathModuleBase` to `ParserBase`.	2024-03-31 19:14:48 +01:00
Barney Gale	0634201f53	GH-116377: Stop raising `ValueError` from `glob.translate()`. (#116378 ) Stop raising `ValueError` from `glob.translate()` when a `**` sub-string appears in a non-recursive pattern segment. This matches `glob.glob()` behaviour.	2024-03-17 17:09:35 +00:00
Barney Gale	1dce0073da	pathlib ABCs: follow all symlinks in `PathBase.glob()` (#116293 ) Switch the default value of follow_symlinks from `None` to `True` in `pathlib._abc.PathBase.glob()` and `rglob()`. This speeds up recursive globbing. No change to the public pathlib classes.	2024-03-04 02:26:33 +00:00
Barney Gale	e3dedeae7a	GH-114610: Fix `pathlib.PurePath.with_stem('')` handling of file extensions (#114612 ) Raise `ValueError` if `with_stem('')` is called on a path with a file extension. Paths may only have an empty stem if they also have an empty suffix.	2024-02-24 19:37:03 +00:00
Barney Gale	1b1f8398d0	GH-106747: Make pathlib ABC globbing more consistent with `glob.glob()` (#115056 ) When expanding `**` wildcards, ensure we add a trailing slash to the topmost directory path. This matches `glob.glob()` behaviour: >>> glob.glob('dirA/**', recursive=True) ['dirA/', 'dirA/dirB', 'dirA/dirB/dirC'] This does not affect `pathlib.Path.glob()`, because trailing slashes aren't supported in pathlib proper.	2024-02-06 02:48:18 +00:00


76 Commits
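
The mode-based dispatch to `__open_rb__()` and `__open_wb__()` described in the GH-128520 entry (01d91500ca) can be sketched roughly as below. This is not the real `pathlib._abc.magic_open()`: the function `open_via_protocol`, the classes `MemoryFile` and `ReadOnlyBlob`, and the `buffering` parameter on the dunders are hypothetical stand-ins, and only the two binary modes are handled.

```python
import io


def open_via_protocol(obj, mode="rb", buffering=-1):
    """Hypothetical stand-in for a magic_open()-style helper (binary modes only)."""
    if mode == "rb":
        opener = getattr(type(obj), "__open_rb__", None)
    elif mode == "wb":
        opener = getattr(type(obj), "__open_wb__", None)
    else:
        raise ValueError(f"unsupported mode in this sketch: {mode!r}")
    if opener is None:
        raise TypeError(f"{type(obj).__name__} cannot be opened with mode {mode!r}")
    return opener(obj, buffering)


class ReadOnlyBlob:
    """Hypothetical read-only object: implements only the read-side dunder."""

    def __init__(self, data=b""):
        self.data = data

    def __open_rb__(self, buffering=-1):
        return io.BytesIO(self.data)


class MemoryFile(ReadOnlyBlob):
    """Hypothetical readable and writable in-memory object."""

    def __open_wb__(self, buffering=-1):
        owner = self

        class _Writer(io.BytesIO):
            def close(self):
                # Persist the written bytes on the owner when the stream closes.
                if not self.closed:
                    owner.data = self.getvalue()
                super().close()

        return _Writer()


f = MemoryFile()
with open_via_protocol(f, "wb") as fh:
    fh.write(b"hello")
with open_via_protocol(f, "rb") as fh:
    round_trip = fh.read()
print(round_trip)
```

Because each side is a separate dunder, an object can support reading without writing (or the reverse), which mirrors the sibling relationship between `ReadablePath` and `WritablePath` that the commit introduces.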