cpython

Commit Graph

Author	SHA1	Message	Date
Barney Gale	f3192dac66	GH-90812: Add test for `urlopen()` of file URI for UNC path (#132489)	2025-04-15 19:16:34 +01:00
Serhiy Storchaka	f98b9b4cbb	gh-71339: Use new assertion methods in the urllib tests (GH-129056)	2025-04-14 09:24:41 +03:00
Barney Gale	ccad61e35d	GH-125866: Support complete "file:" URLs in urllib (#132378) Add optional add_scheme argument to `urllib.request.pathname2url()`; when set to true, a complete URL is returned. Likewise add optional require_scheme argument to `url2pathname()`; when set to true, a complete URL is accepted. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-04-14 01:49:02 +01:00
Barney Gale	66cdb2bd8a	GH-123599: `url2pathname()`: handle authority section in file URL (#126844) In `urllib.request.url2pathname()`, if the authority resolves to the current host, discard it. If an authority is present but resolves somewhere else, then on Windows we return a UNC path (as before), and on other platforms we raise `URLError`. Affects `pathlib.Path.from_uri()` in the same way. Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2025-04-10 19:58:04 +00:00
Barney Gale	d783d7b51d	GH-126367: `url2pathname()`: handle NTFS alternate data streams (#131428) Adjust `url2pathname()` to decode embedded colon characters in Windows URIs, rather than bailing out with an `OSError`.	2025-03-18 23:37:12 +00:00
Serhiy Storchaka	5ace71713b	gh-128734: Fix ResourceWarning in urllib tests (GH-128735)	2025-01-12 12:53:17 +02:00
Barney Gale	79b7cab50a	GH-127090: Fix `urllib.response.addinfourl.url` value for opened `file:` URIs (#127091) The canonical `file:` URL (as generated by `pathname2url()`) is now used as the `url` attribute of the returned `addinfourl` object. The `addinfourl.url` attribute reflects the resolved URL for both `file:` and `http[s]:` URLs now.	2024-12-07 17:58:42 +00:00
Barney Gale	5bb059fe60	GH-127236: `pathname2url()`: generate RFC 1738 URL for absolute POSIX path (#127194) When handed an absolute Windows path such as `C:\foo` or `//server/share`, the `urllib.request.pathname2url()` function returns a URL with an authority section, such as `///C:/foo` or `//server/share` (or before GH-126205, `////server/share`). Only the `file:` prefix is omitted. But when handed an absolute POSIX path such as `/etc/hosts`, or a Windows path of the same form (rooted but lacking a drive), the function returns a URL without an authority section, such as `/etc/hosts`. This patch corrects the discrepancy by adding a `//` prefix before drive-less, rooted paths when generating URLs.	2024-11-25 19:59:20 +00:00
Serhiy Storchaka	97b2ceaaaf	gh-127217: Fix pathname2url() for paths starting with multiple slashes on Posix (GH-127218)	2024-11-24 19:30:29 +02:00
Barney Gale	cc813e10ff	GH-125866: Preserve Windows drive letter case in file URIs (#127138) Stop converting Windows drive letters to uppercase in `urllib.request.pathname2url()` and `url2pathname()`. This behaviour is unnecessary and inconsistent with pathlib's file URI implementation.	2024-11-23 10:41:39 +00:00
Barney Gale	8c98ed846a	GH-127078: `url2pathname()`: handle extra slash before UNC drive in URL path (#127132) Decode a file URI like `file://///server/share` as a UNC path like `\\server\share`. This form of file URI is created by software that simply prepends `file:///` to any absolute Windows path.	2024-11-22 04:12:50 +00:00
Barney Gale	ebf564a1d3	GH-126766: `url2pathname()`: handle 'localhost' authority (#127129) Discard any 'localhost' authority from the beginning of a `file:` URI. As a result, file URIs like `//localhost/etc/hosts` are correctly decoded as `/etc/hosts`.	2024-11-22 03:17:06 +00:00
Barney Gale	fd133d4f21	GH-126601: `pathname2url()`: handle NTFS alternate data streams (#126760) Adjust `pathname2url()` to encode embedded colon characters in Windows paths, rather than bailing out with an `OSError`. Co-authored-by: Steve Dower <steve.dower@microsoft.com>	2024-11-22 00:29:05 +00:00
Barney Gale	c9b399fbdb	GH-85168: Use filesystem encoding when converting to/from `file` URIs (#126852) Adjust `urllib.request.url2pathname()` and `pathname2url()` to use the filesystem encoding when quoting and unquoting file URIs, rather than forcing use of UTF-8. No changes are needed in the `nturl2path` module because Windows always uses UTF-8, per PEP 529.	2024-11-19 21:19:30 +00:00
Barney Gale	4d771977b1	GH-84850: Remove `urllib.request.URLopener` and `FancyURLopener` (#125739)	2024-11-19 16:01:49 +02:00
Barney Gale	cae9d9d20f	GH-126766: `url2pathname()`: handle empty authority section. (#126767) Discard two leading slashes from the beginning of a `file:` URI if they introduce an empty authority section. As a result, file URIs like `///etc/hosts` are correctly parsed as `/etc/hosts`.	2024-11-14 20:22:14 +00:00
Barney Gale	bf224bd7ce	GH-120423: `pathname2url()`: handle forward slashes in Windows paths (#126593) Adjust `urllib.request.pathname2url()` so that forward slashes in Windows paths are handled identically to backward slashes.	2024-11-12 19:52:30 +00:00
Barney Gale	54c63a32d0	GH-126212: Fix removal of slashes in file URIs on Windows (#126214) Adjust `urllib.request.pathname2url()` and `url2pathname()` so that they don't remove slashes from Windows DOS drive paths and URLs. There was no basis for this behaviour, and it conflicts with how UNC and POSIX paths are handled.	2024-11-08 16:47:51 +00:00
Barney Gale	951cb2c369	GH-126205: Fix conversion of UNC paths to file URIs (#126208) File URIs for Windows UNC paths should begin with two slashes, not four.	2024-10-30 22:56:58 +00:00
Barney Gale	6742f14dfd	GH-125866: Improve tests for `pathname2url()` and `url2pathname()` (#125993) Merge `URL2PathNameTests` and `PathName2URLTests` test cases (which test only the Windows-specific implementations from `nturl2path`) into the main `Pathname_Tests` test case for these functions. Copy/port some test cases for `pathlib.Path.as_uri()` and `from_uri()`.	2024-10-29 20:44:57 +00:00
Victor Stinner	2587b9f64e	gh-105382: Remove urllib.request cafile parameter (#105384) Remove cafile, capath and cadefault parameters of the urllib.request.urlopen() function, deprecated in Python 3.6.	2023-06-06 21:17:45 +00:00
Gregory P. Smith	2e279e85fe	gh-88500: Reduce memory use of `urllib.unquote` (#96763) `urllib.unquote_to_bytes` and `urllib.unquote` could both potentially generate `O(len(string))` intermediate `bytes` or `str` objects while computing the unquoted final result depending on the input provided. As Python objects are relatively large, this could consume a lot of ram. This switches the implementation to using an expanding `bytearray` and a generator internally instead of precomputed `split()` style operations. Microbenchmarks with some antagonistic inputs like `mess = "\u0141%%%20a%fe"*1000` show this is 10-20% slower for unquote and unquote_to_bytes and no different for typical inputs that are short or lack much unicode or % escaping. But the functions are already quite fast anyways so not a big deal. The slowdown scales consistently linear with input size as expected. Memory usage observed manually using `/usr/bin/time -v` on `python -m timeit` runs of larger inputs. Unittesting memory consumption is difficult and does not seem worthwhile. Observed memory usage is ~1/2 for `unquote()` and <1/3 for `unquote_to_bytes()` using `python -m timeit -s 'from urllib.parse import unquote, unquote_to_bytes; v="\u0141%01\u0161%20"*500_000' 'unquote_to_bytes(v)'` as a test.	2022-12-10 16:17:39 -08:00
Christian Heimes	760ec8940a	gh-90473: WASI: skip gethostname tests (GH-93092) - WASI's ``gethostname()`` is a stub that always fails with OSError ``ENOTSUP`` - skip mailcap ``test`` if subprocess is not available - WASI process_time clock does not work.	2022-05-23 10:39:57 +02:00
Serhiy Storchaka	086c6b1b0f	bpo-45046: Support context managers in unittest (GH-28045) Add methods enterContext() and enterClassContext() in TestCase. Add method enterAsyncContext() in IsolatedAsyncioTestCase. Add function enterModuleContext().	2022-05-08 17:49:09 +03:00
Steve Dower	3513d55a61	bpo-43607: Fix urllib handling of Windows paths with \\?\ prefix (GH-25539)	2021-04-23 18:02:47 +01:00
Hai Shi	3ddc634cd5	bpo-40275: Use new test.support helper submodules in tests (GH-21219)	2020-06-30 15:46:06 +02:00
Serhiy Storchaka	700cfa8c90	bpo-41069: Make TESTFN and the CWD for tests containing non-ascii characters. (GH-21035)	2020-06-25 17:56:31 +03:00
Ashwin Ramaswami	9165addc22	bpo-38576: Disallow control characters in hostnames in http.client (GH-18995) Add host validation for control characters for more CVE-2019-18348 protection.	2020-03-14 11:56:06 -07:00
Serhiy Storchaka	6a265f0d0c	bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619) Ignore leading dots and no longer ignore a trailing newline.	2020-01-05 14:14:31 +02:00
Victor Stinner	ae7aa42774	Remove code commented for more than 10 years (GH-16965) test_urllib commented since 2007: commit `d9880d07fc` Author: Facundo Batista <facundobatista@gmail.com> Date: Fri May 25 04:20:22 2007 +0000 Commenting out the tests until find out who can test them in one of the problematic environments. pynche code commented since 1998 and 2001: commit `ef30092207` Author: Barry Warsaw <barry@python.org> Date: Tue Dec 15 01:04:38 1998 +0000 Added most of the mechanism to change the strips from color variations to color constants (i.e. red constant, green constant, blue constant). But I haven't hooked this up yet because the UI gets more crowded and the arrows don't reflect the correct values. Added "Go to Black" and "Go to White" buttons. commit `741eae0b31` Author: Barry Warsaw <barry@python.org> Date: Wed Apr 18 03:51:55 2001 +0000 StripWidget.__init__(), update_yourself(): Removed some unused local variables reported by PyChecker. __togglegentype(): PyChecker accurately reported that the variable __gentypevar was unused -- actually this whole method is currently unused so comment it out.	2019-10-28 22:35:31 +01:00
Stein Karlsen	aad2ee0156	bpo-32498: urllib.parse.unquote also accepts bytes (GH-7768)	2019-10-14 13:36:29 +03:00
Ashwin Ramaswami	ff2e182865	bpo-12707: deprecate info(), geturl(), getcode() methods in favor of headers, url, and status properties for HTTPResponse and addinfourl (GH-11447) Co-Authored-By: epicfaace <aramaswamis@gmail.com>	2019-09-13 12:40:07 +01:00
Victor Stinner	7cb9204ee1	bpo-37421: urllib.request tests call urlcleanup() (GH-14529) urllib.request tests now call urlcleanup() to remove temporary files created by urlretrieve() tests and to clear the _opener global variable set by urlopen() and functions calling indirectly urlopen(). regrtest now checks if urllib.request._url_tempfiles and urllib.request._opener are changed by tests.	2019-07-02 14:50:19 +02:00
Victor Stinner	eb976e47e2	bpo-36918: Fix "Exception ignored in" in test_urllib (GH-13996) Mock the HTTPConnection.close() method in a few unit tests to avoid logging "Exception ignored in: ..." messages.	2019-06-12 04:07:38 +02:00
Victor Stinner	0c2b6a3943	bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) CVE-2019-9948: Avoid file reading as disallowing the unnecessary URL scheme in URLopener().open() and URLopener().retrieve() of urllib.request. Co-Authored-By: SH <push0ebp@gmail.com>	2019-05-22 22:15:01 +02:00
Berker Peksag	2725cb01d7	bpo-36948: Fix test_urlopener_retrieve_file on Windows (GH-13476)	2019-05-22 02:00:35 +03:00
Xtreak	c661b30f89	bpo-36948: Fix NameError in urllib.request.URLopener.retrieve (GH-13389)	2019-05-19 16:40:05 +03:00
Gregory P. Smith	b7378d7728	bpo-30458: Use InvalidURL instead of ValueError. (GH-13044) Use http.client.InvalidURL instead of ValueError as the new error case's exception.	2019-05-01 16:39:21 -04:00
Xtreak	2fc936ed24	bpo-30458: Disable https related urllib tests on a build without ssl (GH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. https://bugs.python.org/issue30458	2019-05-01 04:59:48 -07:00
Gregory P. Smith	c4e671eec2	bpo-30458: Disallow control chars in http URLs. (GH-12755) Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected.	2019-04-30 19:12:21 -07:00
Stéphane Wirtel	a40681dd5d	bpo-36019: Use pythontest.net instead of example.com in network tests (GH-11941)	2019-02-22 14:45:36 +01:00
Senthil Kumaran	efbd4ea65d	Minor spell fix and formatting fixes in urllib tests. (#959)	2017-04-01 23:47:35 -07:00
Ratnadeep Debnath	21024f0662	bpo-16285: Update urllib quoting to RFC 3986 (#173) * bpo-16285: Update urllib quoting to RFC 3986 urllib.parse.quote is now based on RFC 3986, and hence includes `'~'` in the set of characters that is not escaped by default. Patch by Christian Theune and Ratnadeep Debnath.	2017-02-25 19:00:28 +10:00
Xiang Zhang	c44d58a77a	Issue #29142: Merge 3.5.	2017-01-09 11:50:02 +08:00
Xiang Zhang	959ff7f1c6	Issue #29142: Fix suffixes in no_proxy handling in urllib. In urllib.request, suffixes in no_proxy environment variable with leading dots could match related hostnames again (e.g. .b.c matches a.b.c). Patch by Milan Oberkirch.	2017-01-09 11:47:55 +08:00
Christian Heimes	d04863771b	Issue #28022: Deprecate ssl-related arguments in favor of SSLContext. The deprecation includes manual creation of SSLSocket and certfile/keyfile (or similar) in ftplib, httplib, imaplib, smtplib, poplib and urllib. ssl.wrap_socket() is not marked as deprecated yet.	2016-09-10 23:23:33 +02:00
Martin Panter	0be894b2f6	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-09-07 12:03:06 +00:00
R David Murray	44b548dda8	#27364: fix "incorrect" uses of escape character in the stdlib. And most of the tools. Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and Martin Panter.	2016-09-08 13:59:53 -04:00
Raymond Hettinger	15f44ab043	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Senthil Kumaran	17742f2d45	[merge from 3.4] - Prevent HTTPoxy attack (CVE-2016-1000110) Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which indicates that the script is in CGI mode. Issue #27568 Reported and patch contributed by Rémi Rampin.	2016-07-30 23:39:06 -07:00
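
A few of the behaviours described in the commit messages above can be checked with a short sketch. It assumes a reasonably recent Python 3 (3.9 or later for the bytes case) and a POSIX-style path; the helper name `roundtrip` and the sample inputs are illustrative only and do not come from the commits. Because the exact string returned by `pathname2url()` changed between releases (see GH-127236), the sketch checks only that the conversion round-trips and does not assert any particular URL form.

```python
from urllib.parse import quote, unquote
from urllib.request import pathname2url, url2pathname


def roundtrip(path):
    """Convert a local path to a URL path and back (hypothetical helper)."""
    return url2pathname(pathname2url(path))


# bpo-16285: '~' is treated as unreserved and is not escaped by default.
tilde = quote("~user/notes.txt")

# bpo-32498: unquote() accepts bytes as well as str.
decoded = unquote(b"a%20b")

# The intermediate URL differs across Python versions, so only the
# round trip of an absolute POSIX path is examined here.
restored = roundtrip("/etc/hosts")

print(tilde, decoded, restored)
```

On Windows the restored path uses backslashes, so any comparison there should normalise separators first.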


207 Commits