re.error is now raised instead of OverflowError or RuntimeError when the
width of a look-behind pattern is too large.
The limit is increased to 2**32-1 (was 2**31-1).
(cherry picked from commit e2b3d831fd)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
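For callers, the look-behind change above means a single `except re.error` clause should be enough to handle a rejected look-behind. The sketch below uses a hypothetical helper, `try_compile`, which is not part of the `re` module. It uses a variable-width look-behind as a stand-in for an oversized one, because the exact width threshold depends on the Python version and platform.

```python
import re


def try_compile(pattern):
    """Return (compiled, None) on success or (None, message) on re.error.

    Hypothetical helper for illustration only.
    """
    try:
        return re.compile(pattern), None
    except re.error as exc:
        return None, str(exc)


# A fixed-width look-behind compiles normally.
ok, ok_err = try_compile(r"(?<=abc)d")

# A variable-width look-behind is rejected with re.error. It is used here
# only as a stand-in for a look-behind whose width is too large.
bad, bad_err = try_compile(r"(?<=a+)b")
```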
Previously, TypeError would be overwritten by OverflowError
if the 'code' parameter contained non-ints.
(cherry picked from commit 344d3a222a)
Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
Counting for signal checking now continues in a new match from the point where
it ended in the previous match instead of restarting from 0.
(cherry picked from commit 8ac2085b80)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
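The effect of the signal-counting change can be sketched with a toy model. Everything here is an assumption made for illustration: the names `MatcherState`, `run_match` and `count_checks`, and the check interval of 4096 steps. The point is only that many short matches never reach the interval if the counter restarts each time.

```python
CHECK_INTERVAL = 4096  # assumed interval; the real value is an implementation detail


class MatcherState:
    """Toy stand-in for state that is shared across successive matches."""

    def __init__(self):
        self.sigcount = 0  # steps counted so far
        self.checks = 0    # how many times a signal check would have run


def run_match(state, steps, reset_counter):
    """Simulate one match that performs `steps` matching steps."""
    if reset_counter:
        state.sigcount = 0  # old behaviour: every match starts counting from 0
    for _ in range(steps):
        state.sigcount += 1
        if state.sigcount % CHECK_INTERVAL == 0:
            state.checks += 1  # stand-in for an actual signal check


def count_checks(matches, steps, reset_counter):
    """Run several matches on one state and report the number of checks."""
    state = MatcherState()
    for _ in range(matches):
        run_match(state, steps, reset_counter)
    return state.checks


# 100 short matches of 100 steps each (10000 steps in total).
old_checks = count_checks(100, 100, reset_counter=True)
new_checks = count_checks(100, 100, reset_counter=False)
```

In this model the restarting counter never triggers a check, while the continuing counter triggers one every 4096 steps regardless of how the steps are split across matches.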
Restore the global Input Stream pointer after trying to match a sub-pattern.
Co-authored-by: Ma Lin <animalize@users.noreply.github.com>
(cherry picked from commit abd9cc52d9)
Co-authored-by: SKO <41810398+uyw4687@users.noreply.github.com>
In very rare circumstances the JUMP opcode could be confused with the
argument of the opcode in the "then" part which doesn't end with the
JUMP opcode. This led to incorrect detection of the final JUMP opcode
and incorrect calculation of the size of the subexpression.
NOTE: Changed return value of functions _validate_inner() and
_validate_charset() in Modules/_sre/sre.c. Now they return 0 on success,
-1 on failure, and 1 if the last op is JUMP (which usually is a failure).
Previously they returned 1 on success and 0 on failure.
(cherry picked from commit e9ac890c02)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
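The three-way return convention described in the note above can be modelled in a few lines. This is a toy Python sketch, not the C code in Modules/_sre/sre.c: the opcode names, `KNOWN_OPS`, and the two caller functions are hypothetical and only show how a caller might tell "success", "failure" and "last op is JUMP" apart.

```python
OK, FAILURE, ENDS_WITH_JUMP = 0, -1, 1

KNOWN_OPS = {"LITERAL", "ANY", "JUMP"}  # toy opcode set, not the real one


def validate_inner(ops):
    """Return 0 on success, -1 on failure, 1 if the last op is JUMP."""
    for op in ops:
        if op not in KNOWN_OPS:
            return FAILURE
    if ops and ops[-1] == "JUMP":
        return ENDS_WITH_JUMP
    return OK


def validate_top_level(ops):
    """Ordinary caller: anything other than 0 counts as a failure."""
    return validate_inner(ops) == OK


def then_part_ends_with_jump(ops):
    """Caller that needs to know whether a "then" part ends with JUMP."""
    rc = validate_inner(ops)
    if rc == FAILURE:
        raise ValueError("invalid code")
    return rc == ENDS_WITH_JUMP
```

With the previous convention (1 on success, 0 on failure) there was no separate value left to report a trailing JUMP.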
Revert "bpo-23689: re module, fix memory leak when a match is terminated by a signal or memory allocation failure (GH-32283)"
This reverts commit 6e3eee5c11.
Manual fixups to increase the MAGIC number and to handle conflicts with
a couple of changes that landed after that.
Thanks for reviews by Ma Lin and Serhiy Storchaka.
(cherry picked from commit 4beee0c7b0)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Revert "bpo-47211: Remove function re.template() and flag re.TEMPLATE (GH-32300)"
This reverts commit b09184bf05.
(cherry picked from commit 16a7e4a0b7)
Co-authored-by: Miro Hrončok <miro@hroncok.cz>
The CALL opcode was initially added to support atomic groups, but that
support was never fully implemented, and CALL was left only
in the compiler, not in the interpreter or parser.
ATOMIC_GROUP is now used to support atomic groups.
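As a brief illustration of what an atomic group does, the sketch below compares an ordinary greedy repeat with the same repeat wrapped in `(?>...)`. It assumes Python 3.11 or later, where the `re` module accepts that syntax; on older versions the hypothetical helper `atomic_group_demo` returns None instead of compiling the pattern.

```python
import re
import sys


def atomic_group_demo():
    """Return (greedy_matches, atomic_matches), or None on Python < 3.11."""
    if sys.version_info < (3, 11):
        return None  # atomic group syntax is assumed to be unavailable
    # The greedy repeat can give back one "a" so that the final "a" matches.
    greedy = re.fullmatch(r"a+a", "aaa")
    # The atomic group keeps everything it consumed, so the final "a" fails.
    atomic = re.fullmatch(r"(?>a+)a", "aaa")
    return greedy is not None, atomic is not None


result = atomic_group_demo()
```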
Limit the maximum capturing group to 2**30-1 on 64-bit platforms
(it was 2**31-1). No change on 32-bit platforms (2**28-1).
This reduces the size of SRE(match_context), which in turn increases the
achievable depth of backtracking:
- On 32-bit platforms: 36 bytes, no change. (msvc2022)
- On 64-bit platforms: 72 bytes -> 56 bytes. (msvc2022/gcc9.4)
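The figures above can be checked with a little arithmetic. The 1 MiB budget below is a hypothetical number chosen for illustration, and the model counts only the match context itself, so the real gain in backtracking depth may differ.

```python
# Capturing group limits quoted above.
OLD_MAX_GROUP_64 = 2**31 - 1
NEW_MAX_GROUP_64 = 2**30 - 1
MAX_GROUP_32 = 2**28 - 1

# Sizes of the match context on 64-bit platforms, in bytes.
OLD_CONTEXT_SIZE_64 = 72
NEW_CONTEXT_SIZE_64 = 56

# Hypothetical memory budget for saved contexts (assumption, not from the source).
BUDGET = 1024 * 1024

old_depth = BUDGET // OLD_CONTEXT_SIZE_64
new_depth = BUDGET // NEW_CONTEXT_SIZE_64
depth_ratio = OLD_CONTEXT_SIZE_64 / NEW_CONTEXT_SIZE_64  # 9/7, roughly 1.29
```

Under this simplified model the same budget holds about 29% more contexts after the change.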
* Move the code for generating Modules/_sre/sre_constants.h from
Lib/re/_constants.py into a separate script
Tools/scripts/generate_sre_constants.py.
* Add target `regen-sre` in the makefile.
* Make target `regen-all` depend on `regen-sre`.