cpython

Commit Graph

Author	SHA1	Message	Date
Serhiy Storchaka	f98b9b4cbb	gh-71339: Use new assertion methods in the urllib tests (GH-129056)	2025-04-14 09:24:41 +03:00
Seth Michael Larson	d89a5f6a6e	gh-105704: Disallow square brackets (`[` and `]`) in domain names for parsed URLs (#129418) * gh-105704: Disallow square brackets (`[` and `]`) in domain names for parsed URLs * Use Sphinx references Co-authored-by: Peter Bierma <zintensitydev@gmail.com> * Add mismatched bracket test cases, fix news format * Add more test coverage for ports --------- Co-authored-by: Peter Bierma <zintensitydev@gmail.com>	2025-01-31 09:41:34 -08:00
Serhiy Storchaka	7577307ebd	gh-116897: Deprecate generic false values in urllib.parse.parse_qsl() (GH-116903) Accepting objects with false values (like 0 and []) except empty strings and byte-like objects and None in urllib.parse functions parse_qsl() and parse_qs() is now deprecated.	2024-11-12 21:10:29 +02:00
Serhiy Storchaka	dbb6e22cb1	gh-125926: Fix urllib.parse.urljoin() for base URI with undefined authority (GH-125989) Although this goes beyond the application of RFC 3986, urljoin() should support relative base URIs for backward compatibility.	2024-11-07 09:09:59 +02:00
Serhiy Storchaka	fc897fcc01	gh-76960: Fix urljoin() and urldefrag() for URIs with empty components (GH-123273) * urljoin() with relative reference "?" sets empty query and removes fragment. * Preserve empty components (authority, params, query, fragment) in urljoin(). * Preserve empty components (authority, params, query) in urldefrag(). Also refactor the code and get rid of double _coerce_args() and _coerce_result() calls in urljoin(), urldefrag(), urlparse() and urlunparse().	2024-08-31 12:42:08 +03:00
Serhiy Storchaka	90c892efea	gh-85110: Preserve relative path in URL without netloc in urllib.parse.urlunsplit() (GH-123179)	2024-08-21 10:17:38 +03:00
Nikita Sobolev	84c3191954	gh-118827: Remove `Quoter` from `urllib.parse` (#118828) Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>	2024-06-03 10:50:29 +03:00
Serhiy Storchaka	331d385af9	Add yet few cases for urlparse/urlunparse roundtrip tests (GH-119031) Add yet few cases for urlparse/urlunparse tests	2024-05-14 16:59:21 +03:00
Serhiy Storchaka	e237b25a4f	gh-67693: Fix urlunparse() and urlunsplit() for URIs with path starting with multiple slashes and no authority (GH-113563)	2024-05-14 12:24:37 +03:00
Serhiy Storchaka	1069a462f6	gh-116764: Fix regressions in urllib.parse.parse_qsl() (GH-116801) * Restore support of None and other false values. * Raise TypeError for non-zero integers and non-empty sequences. The regressions were introduced in gh-74668 (`bdba8ef42b`).	2024-03-16 12:36:05 +02:00
Serhiy Storchaka	bdba8ef42b	gh-74668: Fix support of bytes in urllib.parse.parse_qsl() (GH-115771) urllib.parse functions parse_qs() and parse_qsl() now support bytes arguments containing raw and percent-encoded non-ASCII data.	2024-03-05 17:49:50 +02:00
Illia Volochii	2f630e1ce1	gh-102153: Start stripping C0 control and space chars in `urlsplit` (#102508) `urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit #25595. This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329). --------- Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>	2023-05-17 01:49:20 -07:00
JohnJamesUtley	29f348e232	gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (#103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- Co-authored-by: Gregory P. Smith <greg@krypto.org>	2023-05-10 00:18:35 +00:00
Gregory P. Smith	82f789be3b	gh-104139: Add itms-services to uses_netloc urllib.parse. (#104312) Teach unsplit to retain the `"//"` when assembling `itms-services://?action=generate-bugs` style [Apple Platform Deployment](https://support.apple.com/en-gb/guide/deployment/depce7cefc4d/web) URLs.	2023-05-09 07:04:50 -07:00
Ben Kallus	439b9cfaf4	gh-99418: Make urllib.parse.urlparse enforce that a scheme must begin with an alphabetical ASCII character. (#99421) Prevent urllib.parse.urlparse from accepting schemes that don't begin with an alphabetical ASCII character. RFC 3986 defines a scheme like this: `scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )` RFC 2234 defines an ALPHA like this: `ALPHA = %x41-5A / %x61-7A` The WHATWG URL spec defines a scheme like this: `"A URL-scheme string must be one ASCII alpha, followed by zero or more of ASCII alphanumeric, U+002B (+), U+002D (-), and U+002E (.)."`	2022-11-13 10:25:55 -08:00
Ben Kallus	6f15ca8c7a	gh-96035: Make urllib.parse.urlparse reject non-numeric ports (#98273) Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>	2022-10-20 14:00:56 -07:00
Gregory P. Smith	e61ca22431	gh-95865: Further reduce quote_from_bytes memory consumption (#96860) on large input values. Based on Dennis Sweeney's chunking idea.	2022-09-19 16:06:25 -07:00
Jacob Walls	c0f2fcf9bb	Speed up test_urlsplit_normalization (GH-26688)	2021-07-22 10:45:53 +03:00
Gregory P. Smith	d597fdc5fd	bpo-44002: Switch to lru_cache in urllib.parse. (GH-25798) Switch to lru_cache in urllib.parse. urllib.parse now uses functools.lru_cache for its internal URL splitting and quoting caches instead of rolling its own like it's the 90s. The undocumented internal Quoted class API is now deprecated as it had no reason to be public and no existing OSS users were found. The clear_cache() API remains undocumented but gets an explicit test as it is used in a few projects' (twisted, gevent) tests as well as our own regrtest.	2021-05-11 17:01:44 -07:00
Senthil Kumaran	985ac01637	bpo-43882 Remove the newline, and tab early. From query and fragments. (GH-25921)	2021-05-05 15:50:05 -07:00
Senthil Kumaran	76cd81d603	bpo-43882 - urllib.parse should sanitize urls containing ASCII newline and tabs. (GH-25595) * issue43882 - urllib.parse should sanitize urls containing ASCII newline and tabs. Co-authored-by: Gregory P. Smith <greg@krypto.org> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2021-04-29 10:16:50 -07:00
Ken Jin	b38601d496	bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) (#24818) * coerce bytes separator to string * Add news * Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst	2021-04-11 06:26:09 -07:00
Adam Goldschmidt	fcbe0cb04d	bpo-42967: only use '&' as a query string separator (#24297) bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only use "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument separator with default value "&" is added to specify the separator. Co-authored-by: Éric Araujo <merwok@netwok.org> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> Co-authored-by: Éric Araujo <merwok@netwok.org>	2021-02-14 14:41:57 -08:00
Tim Graham	5a88d50ff0	bpo-27657: Fix urlparse() with numeric paths (#661) * bpo-27657: Fix urlparse() with numeric paths Revert parsing decision from bpo-754016 in favor of the documented consensus in bpo-16932 of how to treat strings without a // to designate the netloc. * bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.	2019-10-18 06:07:20 -07:00
Steve Dower	8d0ef0b5ed	bpo-36742: Corrects fix to handle decomposition in usernames (#13812)	2019-06-04 17:55:29 +02:00
Rémi Lapeyre	674ee12600	bpo-35397: Remove deprecation and document urllib.parse.unwrap (GH-11481)	2019-05-27 09:43:45 -04:00
Steve Dower	d537ab0ff9	bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017)	2019-04-30 12:03:02 +00:00
Steve Dower	16e6f7dee7	bpo-36216: Add check for characters in netloc that normalize to separators (GH-12201)	2019-03-07 08:02:26 -08:00
Srinivas Thatiparthy (శ్రీనివాస్ తాటిపర్తి)	90d0cfb222	bpo-35202: Remove unused imports in tests. (GH-10561)	2018-11-16 17:32:58 +02:00
matthewbelisle-wf	209144831b	bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660) Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.	2018-10-19 03:52:59 -07:00
Cheryl Sabella	867b825830	bpo-27485: Change urlparse tests to use private methods. (GH-7070)	2018-06-03 17:31:32 +03:00
Cheryl Sabella	0250de4819	bpo-27485: Rename and deprecate undocumented functions in urllib.parse (GH-2205)	2018-04-25 16:51:54 -07:00
Matt Eaton	2cb4661707	bpo-33034: Improve exception message when cast fails for {Parse,Split}Result.port (GH-6078)	2018-03-20 09:41:37 +03:00
Коренберг Марк	fbd605151f	bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867)	2017-12-21 14:16:17 +02:00
postmasters	90e01e50ef	urllib: Simplify splithost by calling into urlparse. (#1849) The current regex based splitting produces a wrong result. For example:: http://abc#@def Web browsers parse that URL as ``http://abc/#@def``, that is, the host is ``abc``, the path is ``/``, and the fragment is ``#@def``.	2017-06-20 15:02:44 +02:00
Senthil Kumaran	257b980b31	correct parse_qs and parse_qsl test case descriptions. (#968) * correct parse_qs and parse_qsl test case descriptions.	2017-04-04 21:19:43 -07:00
Berker Peksag	f8479eeb34	Issue #25895: Merge from 3.5	2016-09-16 14:45:15 +03:00
Berker Peksag	f676748a05	Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin Patch by Gergely Imreh and Markus Holtermann.	2016-09-16 14:43:58 +03:00
Senthil Kumaran	4d4ac5bd02	merge 3.5 issue26775 - Improve test coverage for urllib.parse Patch contributed by Luiz Poleto.	2016-04-16 07:34:24 -07:00
Senthil Kumaran	e38415e776	issue26775 - Improve test coverage for urllib.parse Patch contributed by Luiz Poleto.	2016-04-16 07:33:15 -07:00
Robert Collins	dfa95c9a8f	Issue #20059: urllib.parse raises ValueError on all invalid ports. Patch by Martin Panter.	2015-08-10 09:53:30 +12:00
Berker Peksag	a7c781724f	Issue #23684: Clarify the return value of the scheme attribute of ParseResult and SplitResult objects. Patch by Martin Panter.	2015-06-25 23:39:26 +03:00
Berker Peksag	89584c97e4	Issue #23684: Clarify the return value of the scheme attribute of ParseResult and SplitResult objects. Patch by Martin Panter.	2015-06-25 23:38:48 +03:00
R David Murray	c17686f071	Issue #13866: add quote_via argument to urlencode. Patch by samwyse, completed by Arnon Yaari, and reviewed by Martin Panter.	2015-05-17 20:44:50 -04:00
Berker Peksag	20416f7994	Issue #23703: Fix a regression in urljoin() introduced in 901e4e52b20a. Patch by Demian Brecht.	2015-04-16 02:31:14 +03:00
Serhiy Storchaka	1515450440	Issue #23411: Added DefragResult, ParseResult, SplitResult, DefragResultBytes, ParseResultBytes, and SplitResultBytes to urllib.parse.__all__. Patch by Martin Panter.	2015-04-07 19:09:01 +03:00
Serhiy Storchaka	5e0fd95e3b	Added more tests for urllib.parse utility functions. These functions are not documented but used in third-party code.	2015-03-02 16:33:08 +02:00
Serhiy Storchaka	9270be7662	Added more tests for urllib.parse utility functions. These functions are not documented but used in third-party code.	2015-03-02 16:32:29 +02:00
Senthil Kumaran	a66e3885fb	Issue #22278: Fix urljoin problem with relative urls, a regression observed after changes to issue22118 were submitted. Patch contributed by Demian Brecht and reviewed by Antoine Pitrou.	2014-09-22 15:49:16 +08:00
Antoine Pitrou	55ac5b3f7b	Issue #22118: Switch urllib.parse to use RFC 3986 semantics for the resolution of relative URLs, rather than RFCs 1808 and 2396. Patch by Demian Brecht.	2014-08-21 19:16:17 -04:00
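
A few of the behaviours named in the commit messages above can be checked with a short script. This is an illustrative sketch, not code from the repository: the URLs and query strings are made-up inputs, and the expected values assume a Python 3 release recent enough to include the referenced changes.

```python
from urllib.parse import parse_qsl, quote, urlencode, urljoin, urlsplit

# bpo-42967: only "&" separates fields by default; ";" must be requested
# explicitly through the separator argument.
default_pairs = parse_qsl("a=1;b=2&c=3")
semicolon_pairs = parse_qsl("a=1;b=2", separator=";")
print(default_pairs)    # [('a', '1;b=2'), ('c', '3')]
print(semicolon_pairs)  # [('a', '1'), ('b', '2')]

# Issue #13866: quote_via selects the quoting function used by urlencode().
plus_encoded = urlencode({"q": "a b"})
percent_encoded = urlencode({"q": "a b"}, quote_via=quote)
print(plus_encoded)     # q=a+b
print(percent_encoded)  # q=a%20b

# Issue #22118: relative references are resolved with RFC 3986 semantics
# (this input pair is one of the examples in RFC 3986, section 5.4.1).
resolved = urljoin("http://a/b/c/d;p?q", "../g")
print(resolved)         # http://a/b/g

# Issue #25895: urljoin() also resolves against WebSocket base URLs.
ws_joined = urljoin("ws://example.com/chat/room", "lobby")
print(ws_joined)        # ws://example.com/chat/lobby

# bpo-43882: ASCII tab and newline characters are removed before splitting.
cleaned = urlsplit("http://example.com/pa\tth?x=1\n")
print(cleaned.path, cleaned.query)  # /path x=1
```

Older interpreters that predate a given commit may produce different results for the corresponding lines.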

114 Commits