Christian Heimes
ef1327e3b6
bpo-40280: Skip more tests on Emscripten (GH-31947)
...
- lchmod, lchown are not fully implemented
- skip umask tests
- cannot fstat unlinked or renamed files yet
- ignore musl libc issues that affect Emscripten
2022-03-17 12:09:57 +01:00
Erlend Egeberg Aasland
fbff5387c3
bpo-43988: Use check disallow instantiation helper (GH-26392)
2021-05-27 08:43:52 +02:00
Zackery Spytz
6cc8ac9499
bpo-40736: Improve the error message for re.search() TypeError (GH-23312)
...
Include the invalid type in the error message.
2021-05-21 22:02:42 +01:00
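A minimal sketch of how the bpo-40736 change can be observed. The helper name `search_error_message` is hypothetical, and the exact wording of the message depends on the Python version, so the sketch only captures it rather than asserting its text.

```python
import re

def search_error_message(bad_input):
    """Return the TypeError message re.search() raises, or None if it succeeds."""
    try:
        re.search("x", bad_input)
    except TypeError as exc:
        return str(exc)
    return None

# On versions that include this change, the message names the offending type.
print(search_error_message(5))
```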
Erlend Egeberg Aasland
9746cda705
bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748)
...
Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the following types:
* _dbm.dbm
* _gdbm.gdbm
* _multibytecodec.MultibyteCodec
* _sre.SRE_Scanner
* _thread._localdummy
* _thread.lock
* _winapi.Overlapped
* array.arrayiterator
* functools.KeyWrapper
* functools._lru_list_elem
* pyexpat.xmlparser
* re.Match
* re.Pattern
* unicodedata.UCD
* zlib.Compress
* zlib.Decompress
2021-04-30 16:04:57 +02:00
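The effect of bpo-43916 on the re types can be sketched as follows. The helper `can_instantiate` is a hypothetical name introduced for illustration; the types are obtained from real objects rather than by name.

```python
import re

pattern_type = type(re.compile("a"))
match_type = type(re.match("a", "a"))

def can_instantiate(tp):
    """Return True if calling the type with no arguments succeeds."""
    try:
        tp()
    except TypeError:
        return False
    return True

# Pattern and match objects can only be created through the re API.
print(can_instantiate(pattern_type), can_instantiate(match_type))
```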
Erlend Egeberg Aasland
5daf70b22e
bpo-43908: Make re types immutable (GH-25697)
...
Co-authored-by: Victor Stinner <vstinner@python.org>
2021-04-29 08:47:11 +02:00
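A small sketch of what "immutable" means for the re types in bpo-43908, assuming a Python version that includes the change. The helper `try_set_attribute` and the attribute name are hypothetical.

```python
import re

def try_set_attribute(tp):
    """Return True if a new attribute can be set on the type object."""
    try:
        tp.custom_attribute = 1
    except TypeError:
        return False
    # Only reached for mutable types: undo the change again.
    del tp.custom_attribute
    return True

print(try_set_attribute(re.Pattern), try_set_attribute(re.Match))
```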
Ethan Furman
7aaeb2a3d6
bpo-38250: [Enum] single-bit flags are canonical (GH-24215)
...
Flag members are now divided into one-bit versus multi-bit, with multi-bit being treated as aliases. Iterating over a flag only returns the contained single-bit flags.
Iterating, repr(), and str() show members in definition order.
When constructing combined-member flags, any extra integer values are either discarded (CONFORM), turned into ints (EJECT) or treated as errors (STRICT). Flag classes can specify which of those three behaviors is desired:
>>> class Test(Flag, boundary=CONFORM):
...     ONE = 1
...     TWO = 2
...
>>> Test(5)
<Test.ONE: 1>
Besides the three above behaviors, there is also KEEP, which should not be used unless necessary -- for example, _convert_ specifies KEEP as there are flag sets in the stdlib that are incomplete and/or inconsistent (e.g. ssl.Options). KEEP will, as the name suggests, keep all bits; however, iterating over a flag with extra bits will only return the canonical flags contained, not the extra bits.
Iteration is now in member definition order. If member definition order
matches increasing value order, then a more efficient method of flag
decomposition is used; otherwise, sort() is called on the results of
that method to get definition order.
``re`` module:
repr() has been modified to support as closely as possible its previous
output; the big difference is that inverted flags cannot be output as
before because the inversion operation now always returns the comparable
positive result; i.e.
re.A|re.I|re.M|re.S is ~(re.L|re.U|re.X|re.T|re.DEBUG)
in both of the above terms, the ``value`` is 282.
re's tests have been updated to reflect the modifications to repr().
2021-01-25 14:26:19 -08:00
Erlend Egeberg Aasland
a6109ef68d
bpo-1635741: Convert _sre types to heap types and establish module state (PEP 384) (GH-23393)
2020-11-20 21:36:23 +09:00
Victor Stinner
57572b103e
bpo-40443: Remove unused imports in tests (GH-19805)
2020-04-30 01:48:37 +02:00
Serhiy Storchaka
14a0e16c88
bpo-36548: Improve the repr of re flags. (GH-12715)
2019-05-31 10:39:47 +03:00
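As a rough illustration of the improved repr from bpo-36548, assuming a Python version that includes it: single flags and combinations are shown with their `re.` names.

```python
import re

single = repr(re.IGNORECASE)
combined = repr(re.IGNORECASE | re.DOTALL)

print(single)
print(combined)
```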
Max Bernstein
ccb7ca728e
bpo-36929: Modify io/re tests to allow for missing mod name ( #13392 )
...
* bpo-36929: Modify io/re tests to allow for missing mod name
For a vanishingly small number of internal types, CPython sets the
tp_name slot to mod_name.type_name, either in the PyTypeObject or the
PyType_Spec. There are a few minor places where this surfaces:
* Custom repr functions for those types (some of which ignore the
tp_name in favor of using a string literal, such as _io.TextIOWrapper)
* Pickling error messages
The test suite only tests the former. This commit modifies the test
suite to allow Python implementations to omit the module prefix.
https://bugs.python.org/issue36929
2019-05-21 10:09:21 -07:00
Victor Stinner
ab71f8b793
bpo-29571: Fix test_re.test_locale_flag() (GH-12099)
...
Use locale.getpreferredencoding() rather than locale.getlocale() to
get the locale encoding. With some locales, locale.getlocale()
returns the wrong encoding.
For example, on Fedora 29, locale.getlocale() returns ISO-8859-1
encoding for the "en_IN" locale, whereas
locale.getpreferredencoding() reports the correct encoding: UTF-8.
2019-03-01 00:08:03 +01:00
animalize
4a7f44a2ed
bpo-34294: re module, fix wrong capturing groups in rare cases. (GH-11546)
...
Capturing groups need to be reset between two SRE(match) calls in loops; this fixes wrong capturing groups in rare cases.
Also add a missing index in re.rst.
2019-02-18 15:26:37 +02:00
Serhiy Storchaka
a445feb729
bpo-30688: Support \N{name} escapes in re patterns. (GH-5588)
...
Co-authored-by: Jonathan Eunice <jonathan.eunice@gmail.com>
2018-02-10 00:08:17 +02:00
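A short sketch of the `\N{name}` escape from bpo-30688 in a pattern; the sample strings are illustrative only.

```python
import re

# The escape names a character by its Unicode name inside the pattern itself.
alpha_run = re.compile(r"\N{GREEK SMALL LETTER ALPHA}+")

matched = alpha_run.fullmatch("\u03b1\u03b1\u03b1")
not_matched = alpha_run.fullmatch("aaa")
```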
Serhiy Storchaka
fbb490fd2f
bpo-32308: Replace empty matches adjacent to a previous non-empty match in re.sub(). ( #4846 )
2018-01-04 11:06:13 +02:00
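The behaviour from bpo-32308 is easiest to see with a pattern that can match the empty string. This sketch uses the example from the re documentation.

```python
import re

# 'x*' matches the run of "x" and also the empty string right after it,
# so the position before "d" is replaced as well.
result = re.sub("x*", "-", "abxd")
print(result)
```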
Serhiy Storchaka
b748e3b258
Fix improper use of re.escape() in tests. ( #4814 )
2017-12-12 19:21:50 +02:00
Serhiy Storchaka
70d56fb525
bpo-25054, bpo-1647489: Added support of splitting on zerowidth patterns. ( #4471 )
...
Also fixed searching patterns that could match an empty string.
2017-12-04 14:29:05 +02:00
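A sketch of splitting on zero-width patterns as enabled by bpo-25054. The first call is taken from the re documentation; the camel-case input is an illustrative assumption.

```python
import re

# Split at word boundaries, which are always empty matches.
by_boundary = re.split(r"\b", "Words, words, words.")

# Split before every uppercase letter using a zero-width lookahead.
by_capital = re.split(r"(?=[A-Z])", "splitCamelCase")
```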
Serhiy Storchaka
05cb728d68
bpo-30349: Raise FutureWarning for nested sets and set operations ( #1553 )
...
in regular expressions.
2017-11-16 12:38:26 +02:00
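One way to observe the FutureWarning from bpo-30349 is to record warnings while compiling. The helper `compile_warnings` is hypothetical; it clears the compile cache first because the warning is only produced when the pattern is actually parsed.

```python
import re
import warnings

def compile_warnings(pattern):
    """Compile the pattern and return the categories of any warnings raised."""
    re.purge()  # drop cached patterns so the parser runs again
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        re.compile(pattern)
    return [w.category for w in caught]

# An unescaped "[" inside a set looks like the start of a nested set.
nested = compile_warnings("[[a]")
escaped = compile_warnings(r"[\[a]")
```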
Serhiy Storchaka
3557b05c5a
bpo-31690: Allow the inline flags "a", "L", and "u" to be used as group flags for RE. ( #3885 )
2017-10-24 23:31:42 +03:00
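A minimal sketch of the "a" flag used as a group flag, as allowed by bpo-31690. The sample text is an assumption for illustration.

```python
import re

# Inside the group, \w is restricted to ASCII; outside it stays Unicode-aware.
ascii_only = re.fullmatch(r"(?a:\w+)", "caf\u00e9")
unicode_aware = re.fullmatch(r"\w+", "caf\u00e9")
```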
Serhiy Storchaka
0b5e61ddca
bpo-30397: Add re.Pattern and re.Match. ( #1646 )
2017-10-04 20:09:49 +03:00
Serhiy Storchaka
5075416b8f
bpo-30978: str.format_map() now passes key lookup exceptions through. ( #2790 )
...
Previously any exception was replaced with a KeyError exception.
2017-08-03 11:45:23 +03:00
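The bpo-30978 change can be sketched with a mapping whose lookup raises its own exception type. `ConfigError`, `StrictLookup` and `render` are hypothetical names used only for this illustration.

```python
class ConfigError(Exception):
    """Hypothetical application error raised for unknown keys."""

class StrictLookup(dict):
    def __missing__(self, key):
        raise ConfigError(f"no value configured for {key!r}")

def render(template, mapping):
    """Return the formatted string, or the exception type that was raised."""
    try:
        return template.format_map(mapping)
    except Exception as exc:
        return type(exc)

ok = render("{name}", StrictLookup(name="re"))
failed = render("{missing}", StrictLookup())
```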
Roy Williams
171b9a354e
bpo-30605: Fix compiling binary regexs with BytesWarnings enabled. ( #2016 )
...
Running our unit tests with `-bb` enabled triggered this failure.
2017-06-10 08:01:16 +03:00
Serhiy Storchaka
c7ac7280c3
bpo-30375: Correct the stacklevel of regex compiling warnings. ( #1595 )
...
Warnings emitted when compiling a regular expression now always point
to the line in the user code. Previously they could point into the internals
of the re module if emitted from inside groups or conditionals.
2017-05-16 15:16:15 +03:00
Serhiy Storchaka
4ab6abfca4
bpo-30299: Display the bytecode when compiling a regex in debug mode. ( #1491 )
...
`re.compile(..., re.DEBUG)` now displays the compiled bytecode in
human readable form.
2017-05-14 09:05:13 +03:00
Serhiy Storchaka
821a9d146b
bpo-30340: Enhanced regular expressions optimization. ( #1542 )
...
This increased the performance of matching some patterns up to 25 times.
2017-05-14 08:32:33 +03:00
Serhiy Storchaka
305ccbe27e
bpo-30298: Weaken the condition of deprecation warnings for inline modifiers. ( #1490 )
...
Several consecutive inline modifiers are now allowed at the start of the
pattern (e.g. '(?i)(?s)...'). In verbose mode, whitespace and comments
are now allowed before and between inline modifiers (e.g.
'(?x) (?i) (?s)...').
2017-05-10 06:05:20 +03:00
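The two forms accepted after bpo-30298 can be sketched as follows; the patterns and the input string are illustrative assumptions.

```python
import re

text = "A\nB"

# Several inline modifiers in a row at the start of the pattern.
stacked = re.fullmatch("(?i)(?s)a.b", text)

# In verbose mode, whitespace may separate the modifiers.
spaced = re.fullmatch("(?x) (?i) (?s) a . b", text)
```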
Serhiy Storchaka
6d336a0279
bpo-30285: Optimize case-insensitive matching and searching ( #1482 )
...
of regular expressions.
2017-05-09 23:37:14 +03:00
Serhiy Storchaka
7186cc29be
bpo-30277: Replace _sre.getlower() with _sre.ascii_tolower() and _sre.unicode_tolower(). ( #1468 )
2017-05-05 10:42:46 +03:00
Serhiy Storchaka
898ff03e1e
bpo-30215: Make re.compile() locale agnostic. ( #1361 )
...
Compiled regular expression objects with the re.LOCALE flag no longer
depend on the locale at compile time. Only the locale at matching
time affects the result of matching.
2017-05-05 08:53:40 +03:00
Serhiy Storchaka
fdbd01151d
bpo-10076: Compiled regular expression and match objects now are copyable. ( #1000 )
2017-04-16 10:16:03 +03:00
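A brief sketch of copying compiled patterns and match objects after bpo-10076. Both are treated as atomic, so copying returns the same object.

```python
import copy
import re

pattern = re.compile(r"\d+")
match = pattern.match("42")

pattern_copy = copy.copy(pattern)
pattern_deepcopy = copy.deepcopy(pattern)
match_copy = copy.copy(match)
```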
Serhiy Storchaka
5908300e4b
bpo-29995: re.escape() now escapes only special characters. ( #1007 )
2017-04-13 21:06:43 +03:00
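The narrower escaping from bpo-29995 can be sketched with characters that have no special meaning in a pattern; the sample strings are assumptions for illustration.

```python
import re

# "/" and ":" carry no special meaning, so they are left untouched.
plain = re.escape("a/b:c")

# "+" is a quantifier and is still escaped; "=" is not.
arithmetic = re.escape("1+1=2")
```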
Victor Stinner
d6debb24e0
bpo-29919: Remove unused imports found by pyflakes ( #137 )
...
Make also minor PEP8 coding style fixes on modified imports.
2017-03-27 16:05:26 +02:00
Benjamin Peterson
21a74312f2
Revert "bpo-29571: Use correct locale encoding in test_re ( #149 )" ( #554 )
...
This reverts commit ace5c0fdd9.
2017-03-07 22:48:09 -08:00
Benjamin Peterson
1e68716fd5
Revert "make the locale_flag fallback code work again ( #375 )" ( #387 )
...
This reverts commit 43f5df5bfa.
2017-03-01 21:53:00 -08:00
Benjamin Peterson
43f5df5bfa
make the locale_flag fallback code work again ( #375 )
2017-02-28 23:59:12 -08:00
Nick Coghlan
ace5c0fdd9
bpo-29571: Use correct locale encoding in test_re ( #149 )
...
``locale.getlocale(locale.LC_CTYPE)`` and
``locale.getpreferredencoding(False)`` may give different answers
in some cases (such as the ``en_IN`` locale).
``re.LOCALE`` uses the latter, so update the test case to match.
2017-02-18 15:01:22 +05:30
Serhiy Storchaka
ef5176769d
Issue #29444 : Fixed out-of-bounds buffer access in the group() method of
...
the match object. Based on patch by WGH.
2017-02-04 22:57:44 +02:00
Serhiy Storchaka
86e42376c2
Issue #29444 : Fixed out-of-bounds buffer access in the group() method of
...
the match object. Based on patch by WGH.
2017-02-04 22:55:40 +02:00
Serhiy Storchaka
7e10dbbd45
Issue #29444 : Fixed out-of-bounds buffer access in the group() method of
...
the match object. Based on patch by WGH.
2017-02-04 22:53:57 +02:00
Serhiy Storchaka
70d28a184c
Remove unused imports.
2016-12-16 20:00:15 +02:00
Victor Stinner
726a57d45f
Issue #28765 : _sre.compile() now checks the type of groupindex and indexgroup
...
groupindex must be a dictionary and indexgroup must be a tuple.
Previously, indexgroup was a list. Use a tuple to reduce the memory usage.
2016-11-22 23:04:39 +01:00
Serhiy Storchaka
53c53ea4c5
Issue #27030 : Unknown escapes in re.sub() replacement template are allowed
...
again. But they still are deprecated and will be disabled in 3.7.
2016-12-06 19:15:29 +02:00
Victor Stinner
bcf4dccfa7
Issue #28727 : Optimize pattern_richcompare() for a==a
...
A pattern is equal to itself.
2016-11-22 15:30:38 +01:00
Victor Stinner
b44fb128ae
Implement rich comparison for _sre.SRE_Pattern
...
Issue #28727 : Regular expression patterns, _sre.SRE_Pattern objects created by
re.compile(), become comparable (only x==y and x!=y operators). This change
should fix the issue #18383 : don't duplicate warning filters when the warnings
module is reloaded (something usually only done in unit tests).
2016-11-21 16:35:08 +01:00
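Pattern equality from issue #28727 can be sketched as below. The helper `compile_fresh` is hypothetical; it purges the compile cache so that two calls return distinct objects rather than the same cached one.

```python
import re

def compile_fresh(pattern, flags=0):
    """Compile without reusing an object from the re module cache."""
    re.purge()
    return re.compile(pattern, flags)

first = compile_fresh("a+")
second = compile_fresh("a+")
other_flags = compile_fresh("a+", re.IGNORECASE)
```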
Victor Stinner
8bf43e6d0b
Issue #28082 : Add basic unit tests on re enums
2016-11-14 12:38:43 +01:00
Serhiy Storchaka
662cef66d7
Issue #25953 : re.sub() now raises an error for invalid numerical group
...
reference in replacement template even if the pattern is not found in
the string. Error message for invalid group reference now includes the
group index and the position of the reference.
Based on patch by SilentGhost.
2016-10-23 12:11:19 +03:00
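A sketch of the stricter check from issue #25953: the replacement template is validated even when the pattern never matches. The helper `sub_fails` is a hypothetical name.

```python
import re

def sub_fails(pattern, repl, string):
    """Return True if re.sub() raises re.error for these arguments."""
    try:
        re.sub(pattern, repl, string)
    except re.error:
        return True
    return False

# The pattern "x" has no groups and does not occur in "abc" either.
invalid_reference = sub_fails("x", r"\2", "abc")
valid_reference = sub_fails("(x)", r"\1", "abc")
```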
Serhiy Storchaka
0eb60a7cb9
Issue #11957 : Restored re tests for passing count and maxsplit as positional
...
arguments.
2016-09-25 20:39:04 +03:00
Serhiy Storchaka
b02f8fc3af
Issue #11957 : Restored re tests for passing count and maxsplit as positional
...
arguments.
2016-09-25 20:36:23 +03:00
Serhiy Storchaka
abf275af58
Issue #22493 : Warning message emitted by using inline flags in the middle of
...
regular expression now contains a (truncated) regex pattern.
Patch by Tim Graham.
2016-09-17 01:29:58 +03:00
Eric V. Smith
605bdae078
Issue 24454: Improve the usability of the re match object named group API
2016-09-11 08:55:43 -04:00
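The usability improvement from issue 24454 is subscript access on match objects. A minimal sketch, with an illustrative pattern and input:

```python
import re

match = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2016-09")

year = match["year"]   # same as match.group("year")
month = match[2]       # numeric indices work too
whole = match[0]       # index 0 is the entire match
```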
Serhiy Storchaka
bd48d27944
Issue #22493 : Inline flags now should be used only at the start of the
...
regular expression. A deprecation warning is emitted if they are used in the
middle of the regular expression.
2016-09-11 12:50:02 +03:00
Serhiy Storchaka
cc66a6528d
Backported tests for issue #28070 .
2016-09-11 01:39:51 +03:00
Serhiy Storchaka
d65cd091e9
Issue #28070 : Fixed parsing inline verbose flag in regular expressions.
2016-09-11 01:39:01 +03:00
Serhiy Storchaka
be9a4e5c85
Issue #433028 : Added support of modifier spans in regular expressions.
2016-09-10 00:57:55 +03:00
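Modifier spans from issue #433028 limit a flag to one group. A small sketch with illustrative inputs:

```python
import re

# Only "a" is case-insensitive; "b" must match exactly.
scoped_on = re.compile(r"(?i:a)b")

# The whole pattern ignores case, except for the span around "b".
scoped_off = re.compile(r"(?i)a(?-i:b)")
```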
R David Murray
44b548dda8
#27364 : fix "incorrect" uses of escape character in the stdlib.
...
And most of the tools.
Patch by Emanuel Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Serhiy Storchaka
977b3ac1c1
Issue #27177 : Match objects in the re module now support index-like objects
...
as group indices. Based on patches by Jeroen Demeyer and Xiang Zhang.
2016-06-18 16:48:07 +03:00
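Index-like group indices from issue #27177 can be sketched with any object that defines `__index__`. The class `GroupNumber` is a hypothetical stand-in for such a type.

```python
import re

class GroupNumber:
    """Hypothetical integer-like object exposing __index__."""
    def __init__(self, number):
        self.number = number

    def __index__(self):
        return self.number

match = re.match(r"(a)(b)", "ab")
second = match.group(GroupNumber(2))
```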
Serhiy Storchaka
9bd85b83f6
Issue #27030 : Unknown escapes consisting of ``'\'`` and ASCII letter in
...
regular expressions now are errors.
2016-06-11 19:15:00 +03:00
Serhiy Storchaka
485407ce1e
Issue #24580 : Symbolic group references to open group in re patterns now are
...
explicitly forbidden as well as numeric group references.
2015-07-18 23:27:00 +03:00
Serhiy Storchaka
07360df481
Issue #14260 : The groupindex attribute of regular expression pattern object
...
is now a non-modifiable mapping.
2015-03-30 01:01:48 +03:00
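A sketch of the read-only `groupindex` from issue #14260. The helper `can_modify` is a hypothetical name; the group name is illustrative.

```python
import re

pattern = re.compile(r"(?P<word>\w+)")

def can_modify(mapping):
    """Return True if item assignment on the mapping succeeds."""
    try:
        mapping["extra"] = 99
    except TypeError:
        return False
    return True

snapshot = dict(pattern.groupindex)   # copying to a dict is still possible
writable = can_modify(pattern.groupindex)
```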
Serhiy Storchaka
632a77e6a3
Issue #22364 : Improved some re error messages using regex for hints.
2015-03-25 21:03:47 +02:00
Serhiy Storchaka
a54aae0683
Issue #23622 : Unknown escapes in regular expressions that consist of ``'\'``
...
and ASCII letter now raise a deprecation warning and will be forbidden in
Python 3.6.
2015-03-24 22:58:14 +02:00
Serhiy Storchaka
4eea62fd2e
Issues #814253 , #9179 : Group references and conditional group references now
...
work in lookbehind assertions in regular expressions.
2015-02-21 10:07:35 +02:00
Serhiy Storchaka
83e802796c
Issue #22818 : Splitting on a pattern that could match an empty string now
...
raises a warning. Patterns that can only match empty strings are now
rejected.
2015-02-03 11:04:19 +02:00
Serhiy Storchaka
22a309a434
Issue #21032 : Deprecated the use of re.LOCALE flag with str patterns or
...
re.ASCII. It never worked.
2014-12-01 11:50:07 +02:00
Serhiy Storchaka
fb028336f9
Issue #22838 : All test_re tests now work with unittest test discovery.
2014-12-01 11:08:27 +02:00
Serhiy Storchaka
9cba989502
Issue #22838 : All test_re tests now work with unittest test discovery.
2014-12-01 11:06:45 +02:00
Benjamin Peterson
16e802f4ae
merge 3.4 ( #9179 )
2014-11-30 11:51:16 -05:00
Benjamin Peterson
66323415c7
backout 9fcf4008b626 ( #9179 ) for further consideration
2014-11-30 11:49:00 -05:00
Serhiy Storchaka
ab14088141
Minor code clean up and improvements in the re module.
2014-11-11 21:13:28 +02:00
Serhiy Storchaka
b99c132bd9
Fixed AttributeError when the regular expression starts with an illegal escape.
2014-11-10 14:38:16 +02:00
Serhiy Storchaka
ad446d57a9
Issue #22578 : Added attributes to the re.error class.
2014-11-10 13:49:00 +02:00
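The attributes added to `re.error` in issue #22578 can be inspected as in this sketch. The helper `describe_error` is hypothetical, and the broken pattern is an illustrative assumption.

```python
import re

def describe_error(pattern):
    """Return the re.error attributes for a pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return {
            "msg": exc.msg,
            "pattern": exc.pattern,
            "pos": exc.pos,
            "lineno": exc.lineno,
            "colno": exc.colno,
        }
    return None

info = describe_error("a(b")  # the group opened at index 1 is never closed
```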
Serhiy Storchaka
5619ab926b
Issue #12728 : Different Unicode characters having the same uppercase but
...
different lowercase are now matched in case-insensitive regular expressions.
2014-11-10 12:43:14 +02:00
Serhiy Storchaka
0c938f6d24
Issue #12728 : Different Unicode characters having the same uppercase but
...
different lowercase are now matched in case-insensitive regular expressions.
2014-11-10 12:37:16 +02:00
Serhiy Storchaka
c7f7d3897e
Issue #22434 : Constants in sre_constants are now named constants (enum-like).
2014-11-09 20:48:36 +02:00
Serhiy Storchaka
6276b32799
Issues #814253 , #9179 : Group references and conditional group references now
...
work in lookbehind assertions in regular expressions.
2014-11-07 21:45:17 +02:00
Serhiy Storchaka
84df7fe6a2
Issues #814253 , #9179 : Group references and conditional group references now
...
work in lookbehind assertions in regular expressions.
2014-11-07 21:43:57 +02:00
Serhiy Storchaka
4b8f8949b4
Issue #17381 : Fixed handling of case-insensitive ranges in regular expressions.
...
Added new opcode RANGE_IGNORE.
2014-10-31 12:36:56 +02:00
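The range handling fixed in issue #17381 can be sketched with a range whose endpoints differ in case. "_" lies between "9" and "a" in code point order, but not between "9" and "A".

```python
import re

inside = re.match(r"[9-a]", "_", re.IGNORECASE)
outside = re.match(r"[9-A]", "_", re.IGNORECASE)
```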
Serhiy Storchaka
7cc0a1f7cb
Issue #22410 : Module level functions in the re module now cache compiled
...
locale-dependent regular expressions taking into account the locale.
2014-10-31 00:56:45 +02:00
Serhiy Storchaka
4659cc0756
Issue #22410 : Module level functions in the re module now cache compiled
...
locale-dependent regular expressions taking into account the locale.
2014-10-31 00:53:49 +02:00
Victor Stinner
55e614a2a8
Issue #11957 : Explicit parameter name when calling re.split() and re.sub()
2014-10-29 16:58:59 +01:00
Serhiy Storchaka
7438e4b56f
Issue 1519638: Now unmatched groups are replaced with empty strings in re.sub()
...
and re.subn().
2014-10-10 11:06:31 +03:00
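A short sketch of the issue 1519638 behaviour: a reference to a group that did not take part in the match expands to an empty string. The pattern and template are illustrative.

```python
import re

# For each match only one of the two groups participates.
result = re.sub(r"(a)|(b)", r"[\1\2]", "ab")
print(result)
```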
Serhiy Storchaka
9baa5b2de2
Issue #22437 : Number of capturing groups in regular expression is no longer
...
limited to 100.
2014-09-29 22:49:23 +03:00
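With the limit from issue #22437 lifted, a pattern with more than 100 groups compiles. A minimal sketch, using 150 groups as an arbitrary illustrative count:

```python
import re

group_count = 150
pattern = re.compile("(a)" * group_count)
match = pattern.fullmatch("a" * group_count)
```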
Serhiy Storchaka
c563caf3a2
Issue #22362 : Forbid ambiguous octal escapes outside the range 0-0o377 in
...
regular expressions.
2014-09-23 23:22:41 +03:00
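The octal escape limit from issue #22362 can be sketched as follows. The helper `compiles` is a hypothetical name.

```python
import re

def compiles(pattern):
    """Return True if the pattern compiles without re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

in_range = compiles(r"\377")      # 0o377 == 255, the largest allowed value
out_of_range = compiles(r"\400")  # 0o400 == 256 is rejected
```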
Serhiy Storchaka
cd9032d45b
Fixed bytes literals in tests.
2014-09-23 23:04:21 +03:00
Serhiy Storchaka
44dae8bde3
Issue #22423 : Fixed debugging output of the GROUPREF_EXISTS opcode in the re
...
module.
2014-09-21 22:47:55 +03:00
Serhiy Storchaka
b1847e7541
Issue #17381 : Fixed handling of case-insensitive ranges in regular expressions.
2014-10-31 12:37:50 +02:00
Serhiy Storchaka
b85a97600a
Restored re pickling test.
2014-09-15 11:33:19 +03:00
Serhiy Storchaka
d9cf65f00e
Use more appropriate asserts in re tests.
2014-09-14 16:20:20 +03:00
Serhiy Storchaka
a25875cfd0
Fixed re tests incorrectly ported from 2.x to 3.x.
2014-09-14 15:56:27 +03:00
Serhiy Storchaka
429b59ec69
Issue #20998 : Fixed re.fullmatch() of repeated single character pattern
...
with ignore case. Original patch by Matthew Barnett.
2014-05-14 21:48:17 +03:00
Serhiy Storchaka
a537eb45fd
Issue #20283 : RE pattern methods now accept the string keyword parameters
...
as documented. The pattern and source keyword parameters are left as
deprecated aliases.
2014-03-06 11:36:15 +02:00
Serhiy Storchaka
ccdf352370
Issue #20283 : RE pattern methods now accept the string keyword parameters
...
as documented. The pattern and source keyword parameters are left as
deprecated aliases.
2014-03-06 11:28:32 +02:00
Antoine Pitrou
c49672f25e
Issue #20426 : When passing the re.DEBUG flag, re.compile() displays the debug output every time it is called, regardless of the compilation cache.
2014-02-03 21:01:35 +01:00
Antoine Pitrou
d2cc743ca4
Issue #20426 : When passing the re.DEBUG flag, re.compile() displays the debug output every time it is called, regardless of the compilation cache.
2014-02-03 20:59:59 +01:00
Serhiy Storchaka
32eddc1bbc
Issue #16203 : Add re.fullmatch() function and regex.fullmatch() method,
...
which anchor the pattern at both ends of the string to match.
Original patch by Matthew Barnett.
2013-11-23 23:20:30 +02:00
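The difference between `re.match()` and the `re.fullmatch()` function added in issue #16203 can be sketched with an input that has trailing text; the strings are illustrative.

```python
import re

prefix_only = re.match(r"\d+", "123abc")       # anchored at the start only
whole_string = re.fullmatch(r"\d+", "123abc")  # must consume the entire string
exact = re.compile(r"\d+").fullmatch("123")    # also available as a pattern method
```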
Serhiy Storchaka
5c24d0e504
Issue #13592 : Improved the repr for regular expression pattern objects.
...
Based on patch by Hugo Lopes Tavares.
2013-11-23 22:42:43 +02:00
Serhiy Storchaka
9eabac68a3
Issue #18685 : Restore re performance to pre-PEP 393 levels.
2013-10-26 10:45:48 +03:00
Antoine Pitrou
79aa68dfc1
Issue #19387 : explain and test the sre overlap table
2013-10-25 21:36:10 +02:00
Serhiy Storchaka
8b150ecfc9
Issue #19327 : Fixed handling of regular expressions with too large a charset.
2013-10-24 22:04:37 +03:00
Serhiy Storchaka
be80fc9a84
Issue #19327 : Fixed handling of regular expressions with too large a charset.
2013-10-24 22:02:58 +03:00
Serhiy Storchaka
36af10c1f7
Issue #17087 : Improved the repr for regular expression match objects.
2013-10-20 13:13:31 +03:00