Commit Graph

191 Commits

Author SHA1 Message Date
Serhiy Storchaka 22bbb0c4c7
[3.10] gh-98740: Fix validation of conditional expressions in RE (GH-98764) (GH-99046)
In very rare circumstances the JUMP opcode could be confused with the
argument of the opcode in the "then" part which doesn't end with the
JUMP opcode. This led to incorrect detection of the final JUMP opcode
and incorrect calculation of the size of the subexpression.

NOTE: Changed return value of functions _validate_inner() and
_validate_charset() in Modules/_sre/sre.c.  Now they return 0 on success,
-1 on failure, and 1 if the last op is JUMP (which usually is a failure).
Previously they returned 1 on success and 0 on failure.
(cherry picked from commit e9ac890c02)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2022-11-03 12:18:50 +02:00
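For context, a minimal sketch of the conditional-expression syntax `(?(id)yes|no)` that the validator handles. The pattern is only an illustration and is not claimed to trigger the bug fixed by this commit.

```python
import re

# Group 1 optionally captures "<". The conditional (?(1)>|$) requires a
# closing ">" only when group 1 took part in the match; otherwise it
# requires the end of the string.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

candidates = ["<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"]
results = {text: bool(pattern.fullmatch(text)) for text in candidates}

for text, matched in results.items():
    print(f"{text!r}: {matched}")
```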
Miss Islington (bot) 24908f1f20
Add re.VERBOSE flag documentation example (GH-97678)
The current re.VERBOSE documentation example leaves space for ambiguous
interpretation. One may read that spaces within the `(?:` token are
spaces inside the non-capturing group (such as `(?: )`). This patch
removes the ambiguity by including examples after the statement.
(cherry picked from commit 0ceafa7fa4)

Co-authored-by: Athos Ribeiro <athoscribeiro@gmail.com>
2022-10-04 18:37:01 -07:00
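The ambiguity described above can be sketched as follows. This is an illustrative example, not the text added by the commit: whitespace between tokens is ignored under re.VERBOSE, but whitespace inside the `(?:` token itself is not allowed.

```python
import re

# Whitespace and comments between tokens are ignored in verbose mode.
verbose = re.compile(r"""
    (?: \d+ )   # integer part, inside a non-capturing group
    \.          # literal dot
    \d*         # optional fractional digits
""", re.VERBOSE)

compact = re.compile(r"(?:\d+)\.\d*")

# Splitting the "(?:" token with a space is rejected by the parser.
try:
    re.compile(r"(? : \d+ )", re.VERBOSE)
    split_token_ok = True
except re.error:
    split_token_ok = False

print(bool(verbose.fullmatch("12.5")), bool(compact.fullmatch("12.5")), split_token_ok)
```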
Miss Islington (bot) 619a67cc06
gh-73137: Added sub-subsection headers for flags in re (GH-93000)
Fixes GH-73137
(cherry picked from commit b7a6610bc8)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
2022-05-22 19:06:54 -07:00
Miss Islington (bot) aff69c34b0
Fix the "Finding all Adverbs" example (GH-21420) (#28839)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit dbd62e74da)

Co-authored-by: Rim Chatti <chattiriim@gmail.com>
2021-10-10 14:43:38 -07:00
Miss Islington (bot) 519bcc698c
bpo-44940: Clarify the documentation of re.findall() (GH-27849)
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Vedran Čačić <vedgar+github@gmail.com>
(cherry picked from commit 64f9e7b19d)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2021-08-22 00:45:02 -07:00
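As a reminder of the behaviour this documentation change covers, the shape of the re.findall() result depends on the number of capturing groups in the pattern. The sample string is an assumption chosen for illustration.

```python
import re

text = "width=20, height=10"

# No groups: a list of whole matches.
whole = re.findall(r"\w+=\d+", text)

# One group: a list of that group's contents.
names = re.findall(r"(\w+)=\d+", text)

# Several groups: a list of tuples, one item per group.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(whole)  # ['width=20', 'height=10']
print(names)  # ['width', 'height']
print(pairs)  # [('width', '20'), ('height', '10')]
```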
Miss Islington (bot) f7f1c26423
Update URLs in comments and metadata to use HTTPS (GH-27458) (GH-27478)
(cherry picked from commit be42c06bb0)

Co-authored-by: Noah Kantrowitz <noah@coderanger.net>
2021-07-30 16:25:28 +02:00
Raymond Hettinger bf1a81258c
Minor modernization and readability improvement to the tokenizer example (GH-19558) 2020-04-16 19:54:13 -07:00
Ricardo Bánffy 15ae75d660 bpo-38294: Add list of no-longer-escaped chars to re.escape documentation. (GH-16442)
Prior to 3.7, re.escape escaped many characters that don't have
special meaning in Python, but that used to require escaping in other
tools and languages. This commit aims to make it clear which characters
were, but are no longer, escaped.
2019-10-07 23:54:35 +03:00
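A short sketch of the behaviour being documented, assuming Python 3.7 or later: only characters that are special in a regular expression are escaped, so `:` and `/` pass through unchanged.

```python
import re

url = "https://www.python.org"
escaped = re.escape(url)

# On Python 3.7+ only the dots are escaped.
print(escaped)  # https://www\.python\.org

# The escaped text always matches the original string literally.
assert re.fullmatch(escaped, url)
```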
Julien Palard 1fae844451 Doc: Fix missing negation. (GH-14640)
Reported by Hug Capella on docs@.

Automerge-Triggered-By: @matrixise
2019-09-11 08:55:22 -07:00
Robert DiPietro fb6c1f8d3b Fix typo in re.escape documentation (GH-14722) 2019-07-13 16:35:04 +08:00
mollison 5ebfa840a1 bpo-36645: Fix ambiguous formatting in re.sub() documentation (GH-12879) 2019-04-22 01:14:45 +03:00
Serhiy Storchaka a180b007d9
bpo-28450: Fix and improve the documentation for unknown escapes in RE. (GH-11920) 2019-02-25 17:58:30 +02:00
animalize 4a7f44a2ed bpo-34294: re module, fix wrong capturing groups in rare cases. (GH-11546)
Capturing groups need to be reset between two SRE(match) calls in loops; this fixes wrong capturing groups in rare cases.

Also add a missing index in re.rst.
2019-02-18 15:26:37 +02:00
Pablo Galindo e8239b8e81
Add information about DeprecationWarning for invalid escaped characters in the re module (GH-5255) 2019-01-20 18:57:56 +00:00
Raymond Hettinger b83942c755 Cleanup and improve the regex tokenizer example. (GH-10426)
1) Convert weird field name "typ" to the more standard "type".
2) For the NUMBER type, convert the value to an int() or float().
3) Simplify ``group(kind)`` to the shorter and faster ``group()`` call.
4) Simplify logic to a single if-elif chain to make this easier to extend.
5) Reorder the tests to match the order the tokens are specified.
   This isn't necessary for correctness but does make the example
   easier to follow.
6) Move the "column" calculation before the if-elif chain so that
   users have the option of using this value in error messages.
2018-11-09 01:19:33 -08:00
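The six points above can be sketched as a small tokenizer. This is an illustrative reduction, not the documentation's actual example: the token set, the `Token` fields, and the `tokenize` name are assumptions made for this sketch.

```python
import re
from typing import Iterator, NamedTuple, Union


class Token(NamedTuple):
    type: str                      # "type" rather than "typ"
    value: Union[str, int, float]
    line: int
    column: int


# Hypothetical token set, listed in the order the tests below follow.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d*)?"),
    ("ASSIGN", r"="),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/]"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t]+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(code: str) -> Iterator[Token]:
    line_num = 1
    line_start = 0
    for mo in TOKEN_RE.finditer(code):
        kind = mo.lastgroup
        value = mo.group()                 # group() instead of group(kind)
        column = mo.start() - line_start   # computed before the chain
        if kind == "NUMBER":
            value = float(value) if "." in value else int(value)
        elif kind == "NEWLINE":
            line_start = mo.end()
            line_num += 1
            continue
        elif kind == "SKIP":
            continue
        elif kind == "MISMATCH":
            raise RuntimeError(f"{value!r} unexpected on line {line_num}, column {column}")
        yield Token(kind, value, line_num, column)


tokens = list(tokenize("x = 3 + 4.5\n"))
for token in tokens:
    print(token)
```

Computing `column` before the chain means the MISMATCH branch can report it in its error message.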
Serhiy Storchaka 913876d824
bpo-35054: Add yet more index entries for symbols. (GH-10121) 2018-10-28 13:41:26 +02:00
Serhiy Storchaka ddb961d2ab
bpo-35054: Add more index entries for symbols. (GH-10064) 2018-10-26 09:00:49 +03:00
Stéphane Wirtel 859c068e52 bpo-34962: make doctest in Doc/ now passes, and is enforced in CI (GH-9806) 2018-10-12 09:51:05 +02:00
Andrés Delfino 7dfbd49671 Correct grammar mistake in re.rst. (GH-9745) 2018-10-06 22:48:30 +03:00
Andrés Delfino 5092439c2c bpo-33892: Doc: Use gender neutral words (GH-7770) 2018-06-18 13:34:30 +09:00
Stéphane Wirtel 19177fbd5d bpo-33503: Fix the broken pypi link in the source and the documentation (GH-6814) 2018-05-15 14:58:35 -04:00
Berker Peksag a0a42d22d8
Fix a reference to the MRE book in re docs (GH-1113)
Reported by Maksym Nikulyak on docs.p.o.
2018-03-23 16:46:52 +03:00
Serhiy Storchaka a445feb729
bpo-30688: Support \N{name} escapes in re patterns. (GH-5588)
Co-authored-by: Jonathan Eunice <jonathan.eunice@gmail.com>
2018-02-10 00:08:17 +02:00
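A minimal sketch of the feature, assuming a Python version that includes this change (3.8 or later): a named Unicode character can be written inside a raw-string pattern and is resolved by the regular expression parser. The sample text is made up.

```python
import re

# The raw string passes \N{...} through to the re parser unchanged.
pattern = re.compile(r"\N{GREEK SMALL LETTER ALPHA}+")
match = pattern.search("angle \u03b1\u03b1 = 30")

print(match.group() if match else None)
```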
Cheryl Sabella 66771422d0 bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265)
Modify RE examples in documentation to use raw strings to prevent DeprecationWarning.
Add text to REGEX HOWTO to highlight the deprecation.  Approved by Serhiy Storchaka.
2018-02-02 16:16:27 -05:00
Serhiy Storchaka fbb490fd2f
bpo-32308: Replace empty matches adjacent to a previous non-empty match in re.sub(). (#4846) 2018-01-04 11:06:13 +02:00
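The effect of this change can be illustrated as follows, assuming Python 3.7 or later: the empty match that follows the non-empty match of `x` is now replaced as well.

```python
import re

# 'x*' matches empty strings before 'a' and 'b', the non-empty 'x', an
# empty string right after it, and one at the end of the string.
result = re.sub("x*", "-", "abxd")
print(result)  # -a-b--d-
```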
Serhiy Storchaka 70d56fb525
bpo-25054, bpo-1647489: Added support for splitting on zero-width patterns. (#4471)
Also fixed searching patterns that could match an empty string.
2017-12-04 14:29:05 +02:00
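A short sketch of splitting on patterns that match an empty string, assuming Python 3.7 or later. The input strings are illustrative.

```python
import re

# Split at every word boundary.
parts = re.split(r"\b", "Words, words, words.")
print(parts)  # ['', 'Words', ', ', 'words', ', ', 'words', '.']

# Split after each comma while keeping the comma, using a lookbehind.
fields = re.split(r"(?<=,)", "a,b,c")
print(fields)  # ['a,', 'b,', 'c']
```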
Serhiy Storchaka c615be5166
Use raw strings in the re module examples. (#4616) 2017-11-28 22:51:38 +02:00
Serhiy Storchaka 05cb728d68
bpo-30349: Raise FutureWarning for nested sets and set operations (#1553)
in regular expressions.
2017-11-16 12:38:26 +02:00
Serhiy Storchaka b0b44b4b33
bpo-15606: Improve the re.VERBOSE documentation. (#4366) 2017-11-14 17:21:26 +02:00
Serhiy Storchaka 3557b05c5a bpo-31690: Allow the inline flags "a", "L", and "u" to be used as group flags for RE. (#3885) 2017-10-24 23:31:42 +03:00
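A minimal sketch of a group-scoped flag, assuming Python 3.7 or later: `(?a:...)` applies ASCII-only matching to one part of the pattern while the rest stays in the default Unicode mode. The test string is an assumption.

```python
import re

word = "caf\u00e9"  # the last character is a non-ASCII letter

# Default Unicode matching: \w accepts the accented letter.
unicode_word = re.fullmatch(r"\w+", word)

# ASCII-only matching inside the group: the accented letter is rejected.
ascii_word = re.fullmatch(r"(?a:\w+)", word)

# The flag is limited to the group; the trailing \w is still Unicode-aware.
mixed = re.fullmatch(r"(?a:\w+)\w", word)

print(bool(unicode_word), bool(ascii_word), bool(mixed))
```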
Serhiy Storchaka cd195e2a7a bpo-31714: Improved regular expression documentation. (#3907) 2017-10-14 11:14:26 +03:00
Serhiy Storchaka 0b5e61ddca bpo-30397: Add re.Pattern and re.Match. (#1646) 2017-10-04 20:09:49 +03:00
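With these names exposed, compiled patterns and match objects can be used in isinstance() checks and type annotations. A small sketch; the `first_number` helper is hypothetical.

```python
import re
from typing import Optional

NUMBER: re.Pattern = re.compile(r"\d+")


def first_number(text: str) -> Optional[re.Match]:
    """Return the match for the first run of digits, or None."""
    return NUMBER.search(text)


found = first_number("order 66 shipped")
print(isinstance(NUMBER, re.Pattern), isinstance(found, re.Match))
```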
Henk-Jaap Wagenaar ed94a8b285 bpo-26656: Improve re.compile documentation (GH-3211)
- Link to the regular expressions object documentation
- Clarify that it can be used with more than the two methods currently stated.
2017-08-27 22:41:20 -07:00
Serhiy Storchaka 12d6b5d156 bpo-30398: Add a docstring for re.error. (#1647)
Also document that some attributes may be None.
2017-05-27 16:12:48 +03:00
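A brief sketch of inspecting the attributes of re.error. The invalid pattern is an arbitrary example, and, as noted above, some attributes may be None depending on how the error was raised.

```python
import re

details = None
try:
    re.compile(r"(unclosed")
except re.error as exc:
    # msg and pattern describe the failure; pos, lineno and colno locate it
    # when that information is available.
    details = {
        "msg": exc.msg,
        "pattern": exc.pattern,
        "pos": exc.pos,
        "lineno": exc.lineno,
        "colno": exc.colno,
    }

print(details)
```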
Brian Ward c9d6dbc290 Added effect of re.ASCII and reworded slightly (#1782) 2017-05-24 00:03:38 -07:00
Serhiy Storchaka 898ff03e1e bpo-30215: Make re.compile() locale agnostic. (#1361)
Compiled regular expression objects with the re.LOCALE flag no longer
depend on the locale at compile time.  Only the locale at matching
time affects the result of matching.
2017-05-05 08:53:40 +03:00
Serhiy Storchaka fdbd01151d bpo-10076: Compiled regular expression and match objects now are copyable. (#1000) 2017-04-16 10:16:03 +03:00
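A minimal sketch, assuming Python 3.7 or later: copy.copy() and copy.deepcopy() accept compiled patterns and match objects instead of raising TypeError.

```python
import copy
import re

pattern = re.compile(r"(\d+)-(\d+)")
match = pattern.match("10-20")

pattern_copy = copy.deepcopy(pattern)
match_copy = copy.copy(match)

print(pattern_copy.pattern, match_copy.groups())
```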
Serhiy Storchaka 5908300e4b bpo-29995: re.escape() now escapes only special characters. (#1007) 2017-04-13 21:06:43 +03:00
Serhiy Storchaka 8fc7bc2b76 bpo-30021: Add examples for re.escape(). (#1048)
And fix the parameter name.
2017-04-13 19:17:36 +03:00
Marco Buttu ed6795e46f bpo-22594: Add a link to the regex module in re documentation (GH-241) 2017-02-26 07:26:23 -08:00
Raymond Hettinger 0fa47469a9 merge 2017-02-06 07:15:57 -08:00
Raymond Hettinger d0b9158666 Substitute a more readable f-string 2017-02-06 07:15:31 -08:00
Martin Panter 186b204997 Fix typos in comment and documentation 2016-12-10 05:32:55 +00:00
Serhiy Storchaka ff3dbe9141 Merge documentation for issue #27030 from 3.6. 2016-12-06 19:25:19 +02:00
Serhiy Storchaka 53c53ea4c5 Issue #27030: Unknown escapes in re.sub() replacement template are allowed
again.  But they still are deprecated and will be disabled in 3.7.
2016-12-06 19:15:29 +02:00
Ethan Furman c88c80b716 closes issue28082: doc update and NEWS entry 2016-11-21 08:29:31 -08:00
Martin Panter 479eb760f4 Issue #27800: Merge RE repetition doc from 3.5 into 3.6 2016-10-15 01:39:01 +00:00
Martin Panter 684340ede5 Issue #27800: Document limitation and workaround for multiple RE repetitions 2016-10-15 01:18:16 +00:00
Eric V. Smith 605bdae078 Issue 24454: Improve the usability of the re match object named group API 2016-09-11 08:55:43 -04:00
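One usability improvement in this area is subscript access on match objects, available since Python 3.6. A short sketch with a made-up pattern and input:

```python
import re

match = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Malcolm Reynolds")

# match["name"] is equivalent to match.group("name"); integer indexes work too.
first = match["first"]
last = match["last"]
whole = match[0]

print(first, last, whole)
```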
Serhiy Storchaka bd48d27944 Issue #22493: Inline flags now should be used only at the start of the
regular expression.  A deprecation warning is emitted if they are used in the
middle of the regular expression.
2016-09-11 12:50:02 +03:00