905 Commits

Author SHA1 Message Date
Miss Islington (bot) a657bff349
bpo-46762: Fix an assert failure in f-strings where > or < is the last character if the f-string is missing a trailing right brace. (GH-31365)
(cherry picked from commit ffd9f8ff84)

Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>
2022-02-16 03:18:16 -08:00
Miss Islington (bot) c314e3e829
bpo-46503: Prevent an assert from firing when parsing some invalid \N sequences in f-strings. (GH-30865) (30867)
* bpo-46503: Prevent an assert from firing.  Also fix one nearby tiny PEP-7 nit.

* Added blurb.
(cherry picked from commit 0daf72194b)

Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>
2022-01-24 22:08:42 -05:00
Pablo Galindo Salgado e5cf31d3c2
[3.9] bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) (#30215)
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
(cherry picked from commit e9898bf153)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-12-20 17:18:13 +00:00
Victor Stinner 93a540d74c
bpo-45866: pegen strips directory of "generated from" header (GH-29777) (GH-29792) (GH-29797)
"make regen-all" now produces the same output when run from a
directory other than the source tree: when building Python out of the
source tree.

(cherry picked from commit 253b7a0a9f)
(cherry picked from commit b6defde2af)
2021-11-26 17:23:41 +01:00
Miss Islington (bot) 00ee14e814
[3.9] bpo-45820: Fix a segfault when the parser fails without reading any input (GH-29580) (GH-29584)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-11-18 01:24:43 +01:00
Pablo Galindo Salgado 0ef308a289
bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582) (GH-29585)
(cherry picked from commit da20d7401d)
2021-11-18 00:18:16 +01:00
Pablo Galindo Salgado 142fcb40b6
bpo-45738: Fix computation of error location for invalid continuation characters in the parser (GH-29550) (GH-29552)
(cherry picked from commit 25835c518a)
2021-11-14 01:47:27 +00:00
Łukasz Langa 88f4ec88e2
[3.9] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) (#29071)
There are two errors that this commit fixes:

* The parser was not correctly computing the offset and the string
  source for E_LINECONT errors due to the incorrect usage of strtok().
* The parser was not correctly unwinding the call stack when a tokenizer
  exception happened in rules involving optionals ('?', [...]) as we
  always make them return valid results by using the comma operator. We
  need to check first that we don't have an error before continuing.
(cherry picked from commit a106343f63)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>

NOTE: unlike the cherry-picked original, this commit points at a crazy location
due to a bug in the tokenizer that required a big refactor in 3.10 to fix.
We are leaving as-is for 3.9.
2021-10-20 18:51:13 +02:00
Serhiy Storchaka 7c722e32bf
[3.9] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28945)
They now support splitting escape sequences between input chunks.

Add a third parameter, "final", to codecs.unicode_escape_decode().
It is True by default to match the former behavior.
(cherry picked from commit c96d1546b1)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2021-10-14 20:03:29 +03:00
Łukasz Langa 4e4d35d332
[3.9] bpo-44947: Refine the syntax error for trailing commas in import statements (GH-27814) (GH-27817)
(cherry picked from commit b2f68b1900)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-08-18 23:03:59 +02:00
Pablo Galindo Salgado 4b86c9c514
[3.9] bpo-44885: Correct the ast locations of f-strings with format specs and repeated expressions (GH-27729) (GH-27744)
(cherry picked from commit 8e832fb2a2)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
2021-08-12 18:46:35 +01:00
Łukasz Langa 168879e366
[3.9] Update URLs in comments and metadata to use HTTPS (GH-27458) (GH-27480)
(cherry picked from commit be42c06bb0)

Co-authored-by: Noah Kantrowitz <noah@coderanger.net>
2021-07-30 16:34:04 +02:00
Pablo Galindo 0d0a9eaa82
[3.9] bpo-44409: Fix error location in tokenizer errors that happen during initialization (GH-26712). (GH-26723)
(cherry picked from commit 507ed6fa1d)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-06-14 18:07:51 +01:00
Lysandros Nikolaou 3ce35bfbbe
[3.9] bpo-44385: Remove unused grammar rules (GH-26655) (GH-26659)
(cherry picked from commit e7b4644607)
2021-06-10 15:52:49 -07:00
Batuhan Taskaya de58b319af
[3.9] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) (GH-26522)
When compiling an AST object that contains direct or indirect reference
cycles, the conversion phase recursed without bound and eventually
segfaulted. This patch adds recursion guards so that such user input
raises a RecursionError instead of crashing the interpreter.
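
A small sketch of the kind of input this concerns, assuming an interpreter that includes the fix. The node choice and the `<cyclic-ast>` filename are illustrative only:

```python
import ast

# Build a unary "not" node, then point its operand back at itself so the
# tree contains a direct reference cycle. The Constant is only a placeholder.
node = ast.UnaryOp(op=ast.Not(), operand=ast.Constant(value=0),
                   lineno=0, col_offset=0)
node.operand = node

try:
    compile(ast.Expression(body=node), "<cyclic-ast>", "eval")
except RecursionError as exc:
    # With the recursion guards in place, the cycle is reported as an error.
    outcome = f"RecursionError: {exc}"
else:
    outcome = "compiled without error"

print(outcome)
```
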
(cherry picked from commit f3491242e4)

Co-authored-by: Batuhan Taskaya <batuhan@python.org>
2021-06-03 22:22:34 +01:00
Pablo Galindo d4a9264ab8
[3.9] bpo-44168: Fix error message in the parser for keyword arguments for invalid expressions (GH-26210) (GH-26250)
(cherry picked from commit 33c0c90dea)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-05-19 19:26:59 +01:00
Erlend Egeberg Aasland 76d270ec2b
[3.9] bpo-43779: Fix possible refleak involving _PyArena_AddPyObject (GH-25289). (GH-25294)
* [3.9] Fix possible refleak involving _PyArena_AddPyObject (GH-25289).
(cherry picked from commit c0e11a3ceb)

Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>

* Update Parser/pegen/pegen.c

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-04-09 18:46:32 +01:00
Miss Islington (bot) 994a519915
bpo-43555: Report the column offset for invalid line continuation character (GH-24939) (#24975)
(cherry picked from commit 96eeff5162)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-03-22 19:07:05 +00:00
Pablo Galindo bfc413ce4f
[3.9] bpo-42806: Fix ast locations of f-strings inside parentheses (GH-24067) (GH-24069)
(cherry picked from commit bd2728b1e8)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-01-03 01:32:43 +00:00
Lysandros Nikolaou 9a608ac17c
[3.9] bpo-40631: Disallow single parenthesized star target (GH-24027) (GH-24068)
(cherry picked from commit 2ea320dddd)

Automerge-Triggered-By: GH:pablogsal
2021-01-02 16:59:39 -08:00
Pablo Galindo 87c87b5bd6
[3.9] bpo-42381: Allow walrus in set literals and set comprehensions (GH-23332) (GH-23333)
Currently walruses are not allowed in set literals and set comprehensions:

>>> {y := 4, 4**2, 3**3}
  File "<stdin>", line 1
    {y := 4, 4**2, 3**3}
       ^
SyntaxError: invalid syntax

but they should be allowed as well per PEP 572.
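
For comparison, a short sketch of what is expected to work once the change is in place, assuming a fixed interpreter. The variable names are illustrative:

```python
# Set literal whose first element is an assignment expression (PEP 572).
numbers = {y := 4, 4**2, 3**3}

# Set comprehension whose element is an assignment expression.
doubled = {last := n * 2 for n in range(3)}

print(numbers, y, doubled, last)
```
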
(cherry picked from commit b0aba1fcdc)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-11-18 23:44:30 +00:00
Miss Islington (bot) 994c68f586
bpo-40998: Address compiler warnings found by ubsan (GH-20929)
Signed-off-by: Christian Heimes <christian@python.org>

Automerge-Triggered-By: GH:tiran
(cherry picked from commit 07f2adedf0)

Co-authored-by: Christian Heimes <christian@python.org>
2020-11-18 08:01:48 -08:00
Lysandros Nikolaou 2b800ef809
bpo-42374: Allow unparenthesized walrus in genexps (GH-23319) (GH-23329)
This fixes a regression that was introduced by the new parser.
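
A minimal illustration of the syntax in question, assuming an interpreter with the fix applied. The names are made up for the example:

```python
values = [1, 2, 3]

# The generator expression is the sole call argument, and its element is a
# walrus expression without extra parentheses around it.
squares = list(sq := v * v for v in values)

print(squares, sq)
```
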

(cherry picked from commit cb3e5ed071)
2020-11-17 01:38:58 +02:00
Lysandros Nikolaou cfcb952e30
[3.9] bpo-42218: Correctly handle errors in left-recursive rules (GH-23065) (GH-23066)
Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
(cherry picked from commit 02cdfc93f8)

Automerge-Triggered-By: GH:lysnikolaou
2020-10-31 12:06:03 -07:00
Pablo Galindo ddcd57e3ea
[3.9] bpo-42214: Fix check for NOTEQUAL token in the PEG parser for the barry_as_flufl rule (GH-23048) (GH-23051)
(cherry picked from commit 06f8c3328d)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-10-31 00:40:42 +00:00
Lysandros Nikolaou 24a7c298d4
[3.9] bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111) (GH-23011)
* Implement running the parser a second time for the error messages

The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't, the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
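
The two-run strategy can be sketched with a toy parser. This is not the CPython implementation: `parse_assignment`, `run_parser` and the `call_invalid_rules` flag are hypothetical names used only to show the control flow.

```python
def parse_assignment(tokens, call_invalid_rules):
    """Return an AST-like tuple, or None if the tokens do not match."""
    if len(tokens) == 3 and tokens[1] == "=":
        target, _, value = tokens
        if target.isidentifier():
            return ("assign", target, value)
        if call_invalid_rules:
            # Stand-in for an invalid_* alternative: only tried on run two.
            raise SyntaxError(f"cannot assign to {target!r}")
    return None


def run_parser(tokens):
    # First run: invalid rules disabled, only detects success or failure.
    tree = parse_assignment(tokens, call_invalid_rules=False)
    if tree is not None:
        return tree
    # Second run: invalid rules enabled to produce a customized message.
    parse_assignment(tokens, call_invalid_rules=True)
    # No invalid rule matched, so fall back to the generic error.
    raise SyntaxError("invalid syntax")


print(run_parser(["x", "=", "1"]))
```

Valid input never pays for the error-reporting alternatives, since they are only consulted after the first run has already failed.
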

(cherry picked from commit bca7014032)
2020-10-28 02:14:15 +02:00
Lysandros Nikolaou c4b58cea47
[3.9] bpo-41659: Disallow curly brace directly after primary (GH-22996) (#23006)
(cherry picked from commit 15acc4eaba)
2020-10-28 00:38:42 +02:00
Miss Skeleton (bot) 0b290dd217
bpo-42150: Avoid buffer overflow in the new parser (GH-22978)
(cherry picked from commit e68c67805e)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-10-25 16:24:56 -07:00
Batuhan Taskaya 42157b9eaa
[3.9] bpo-41979: Accept star-unpacking on with-item targets (GH-22611) (GH-22612)
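
A short sketch of the syntax this refers to, assuming an interpreter with the change applied. `nullcontext` is used only as a convenient context manager for the illustration:

```python
from contextlib import nullcontext

# nullcontext returns its argument from __enter__, so the list is unpacked
# into the starred tuple target of the with-item.
with nullcontext([1, 2, 3]) as (first, *rest):
    pass

print(first, rest)
```
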
Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>

Automerge-Triggered-By: @pablogsal
2020-10-09 03:31:07 -07:00
Pablo Galindo 55e0836849
[3.9] bpo-41631: _ast module uses again a global state (GH-21961) (GH-22258)
Partially revert commit ac46eb4ad6662cf6d771b20d8963658b2186c48c:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always returns a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
(cherry picked from commit e5fbe0cbd4)

Co-authored-by: Victor Stinner <vstinner@python.org>
2020-09-15 20:32:56 +02:00
Pablo Galindo be17295280
[3.9] bpo-41697: Correctly handle KeywordOrStarred when parsing arguments in the parser (GH-22077) (GH-22079)
(cherry picked from commit 315a61f7a9)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-09-03 16:35:17 +01:00
Pablo Galindo 8de34cdb95
[3.9] bpo-41690: Use a loop to collect args in the parser instead of recursion (GH-22053) (GH-22067)
This program can segfault the parser by stack overflow:

```
import ast

code = "f(" + ",".join(['a' for _ in range(100000)]) + ")"
print("Ready!")
ast.parse(code)
```

The reason is that the rule for arguments uses simple recursion when collecting args:

args[expr_ty]:
    [...]
    | a=named_expression b=[',' c=args { c }] {
        [...] }
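
The difference between the two collection strategies can be sketched in plain Python. This is a toy model, not the generated parser code: `collect_recursive` and `collect_loop` are hypothetical helpers, and Python's own recursion limit stands in for the C stack.

```python
def collect_recursive(tokens, pos=0):
    """Right-recursive collection: one Python frame per argument."""
    if pos >= len(tokens):
        return []
    return [tokens[pos]] + collect_recursive(tokens, pos + 1)


def collect_loop(tokens):
    """Iterative collection: constant stack depth regardless of length."""
    args = []
    for token in tokens:
        args.append(token)
    return args


tokens = ["a"] * 100_000

try:
    collect_recursive(tokens)
    recursive_ok = True
except RecursionError:
    # The recursive version runs out of stack depth on long argument lists.
    recursive_ok = False

loop_result = collect_loop(tokens)
print(recursive_ok, len(loop_result))
```
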
(cherry picked from commit 4a97b1517a)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-09-02 21:30:51 +01:00
Victor Stinner d2bea2636d
[3.9] bpo-41194: Convert _ast extension to PEP 489 (GH-21807)
* bpo-41194: Convert _ast extension to PEP 489 (GH-21293)

Convert the _ast extension module to PEP 489 "Multiphase
initialization". Replace the global _ast state with a module state.

(cherry picked from commit b1cc6ba73a)

* bpo-41204: Fix compiler warning in ast_type_init() (GH-21307)

(cherry picked from commit 1f76453173)
2020-08-10 15:55:54 +02:00
Miss Islington (bot) b6724be804
bpo-38156: Fix compiler warning in PyOS_StdioReadline() (GH-21721)
incr cannot be larger than INT_MAX: downcast to int explicitly.
(cherry picked from commit bde48fd811)

Co-authored-by: Victor Stinner <vstinner@python.org>
2020-08-03 17:56:54 -07:00
Miss Islington (bot) 22216107f2
closes bpo-38156: Always handle interrupts in PyOS_StdioReadline. (GH-21569)
This consolidates the handling of my_fgets return values, so that interrupts are always handled, even if they come after EOF.

I believe PyOS_StdioReadline is still buggy in that I/O errors will not result in a proper Python exception being set. However, that is a separate issue.
(cherry picked from commit a74eea238f)

Co-authored-by: Benjamin Peterson <benjamin@python.org>
2020-07-28 18:16:19 -07:00
Pablo Galindo bc2c0e9a57
[3.9] Validate the AST produced by the parser in debug mode (GH-21643) (GH-21646)
This will improve the debug experience if something fails in the produced AST. Previously, errors in the produced AST could surface much later, for example in the garbage collector or the compiler, making them much more difficult to debug.
(cherry picked from commit 1332226b32)

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-07-28 00:12:31 +01:00
Miss Islington (bot) 9d8b8c3ed2
Fix trivial typo in the PEG string parser (GH-21508)
(cherry picked from commit 0275e0452a)

Co-authored-by: Eric V. Smith <ericvsmith@users.noreply.github.com>
2020-07-16 09:30:19 -07:00
Miss Islington (bot) 961703cdc8
Fix possibly-uninitialized warning in string_parser.c. (GH-21503)
GCC says
```
../cpython/Parser/string_parser.c: In function ‘fstring_find_expr’:
../cpython/Parser/string_parser.c:404:93: warning: ‘cols’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  404 |     p2->starting_col_offset = p->tok->first_lineno == p->tok->lineno ? t->col_offset + cols : cols;
      |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
../cpython/Parser/string_parser.c:384:16: note: ‘cols’ was declared here
  384 |     int lines, cols;
      |                ^~~~
../cpython/Parser/string_parser.c:403:45: warning: ‘lines’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  403 |     p2->starting_lineno = t->lineno + lines - 1;
      |                           ~~~~~~~~~~~~~~~~~~^~~
../cpython/Parser/string_parser.c:384:9: note: ‘lines’ was declared here
  384 |     int lines, cols;
      |         ^~~~~
```

and, indeed, if `PyBytes_AsString` somehow fails, lines & cols will not be initialized.
(cherry picked from commit 2ad7e9c011)

Co-authored-by: Benjamin Peterson <benjamin@python.org>
2020-07-16 06:25:31 -07:00
Miss Islington (bot) edeaf61b68
bpo-41215: Make assertion in the new parser more strict (GH-21364)
(cherry picked from commit 782f44b8fb)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-07-06 16:35:10 -07:00
Pablo Galindo 54f115dd53
[3.9] bpo-41215: Don't use NULL by default in the PEG parser keyword list (GH-21355) (GH-21356)
(cherry picked from commit 39e76c0fb0)

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>

Automerge-Triggered-By: @lysnikolaou
2020-07-06 12:29:59 -07:00
Victor Stinner f8599279b6
[3.9] bpo-41194: The _ast module cannot be loaded more than once (GH-21290) (GH-21292)
* bpo-41194: Pass module state in Python-ast.c (GH-21284)

Rework asdl_c.py to pass the module state to functions in
Python-ast.c, instead of using astmodulestate_global.

Handle also PyState_AddModule() failure in init_types().

(cherry picked from commit 74419f0c64)

* bpo-41194: The _ast module cannot be loaded more than once (GH-21290)

Fix a crash in the _ast module: it can no longer be loaded more than
once. It now uses a global state rather than a module state.

* Move _ast module state: use a global state instead.
* Set _astmodule.m_size to -1, so the extension cannot be loaded more
  than once.

(cherry picked from commit 91e1bc18bd)
2020-07-03 16:57:19 +02:00
Guido van Rossum 2a1ee1d970
[3.9] bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21022)
2020-06-27 17:34:30 -07:00
Pablo Galindo dab533d0ee
[3.9] bpo-41076: Pre-feed the parser with the f-string expression location (GH-21054) (GH-21190)
This commit changes the parsing of f-string expressions with the new parser. The parser gets pre-fed with the location of the expression itself (not the f-string, which was what we were doing before). This allows us to completely skip the shifting of the AST nodes after the parsing is completed.
(cherry picked from commit 1f0f4abb11)
2020-06-28 01:15:28 +01:00
Pablo Galindo 102ca529ef
[3.9] bpo-40769: Allow extra surrounding parentheses for invalid annotated assignment rule (GH-20387) (GH-21186)
(cherry picked from commit c8f29ad986)
2020-06-28 00:40:41 +01:00
Miss Islington (bot) cb0dc52d37
bpo-41084: Adjust message when an f-string expression causes a SyntaxError (GH-21084)
Prefix the error message with `fstring: `, when parsing an f-string expression throws a `SyntaxError`.
(cherry picked from commit 2e0a920e9e)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-06-27 12:43:49 -07:00
Lysandros Nikolaou 5193d0a665
[3.9] bpo-41132: Use pymalloc allocator in the f-string parser (GH-21173) (GH-21183)
(cherry picked from commit 6dcbc2422d)

Automerge-Triggered-By: @pablogsal
2020-06-27 11:35:18 -07:00
Lysandros Nikolaou d01a3e76ee
[3.9] bpo-41119: Output correct error message for list/tuple followed by colon (GH-21160) (GH-21172)
(cherry picked from commit 4b85e60601)
2020-06-27 00:14:12 +01:00
Lysandros Nikolaou 71bb921829
[3.9] bpo-41060: Avoid SEGFAULT when calling GET_INVALID_TARGET in the grammar (GH-21020) (GH-21024)
`GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not
caught will cause a SEGFAULT. Therefore, this commit introduces a new
inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always
checks for `GET_INVALID_TARGET` returning NULL and can be used in
the grammar, replacing the long C ternary operation used till now.

(cherry picked from commit 6c4e0bd974)

Automerge-Triggered-By: @pablogsal
2020-06-20 19:47:22 -07:00
Miss Islington (bot) c9f83c173b
bpo-40958: Avoid 'possible loss of data' warning on Windows (GH-20970)
(cherry picked from commit 861efc6e8f)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-06-20 10:35:03 -07:00
Lysandros Nikolaou a5442b26f4
[3.9] bpo-40334: Produce better error messages on invalid targets (GH-20106) (GH-20973)
* bpo-40334: Produce better error messages on invalid targets (GH-20106)

The following error messages get produced:
- `cannot delete ...` for invalid `del` targets
- `... is an illegal 'for' target` for invalid targets in for
  statements
- `... is an illegal 'with' target` for invalid targets in
  with statements

Additionally, a few `cut`s were added in various places before the
invocation of the `invalid_*` rule, in order to speed things
up.
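
A small way to observe these errors, offered as a sketch: the snippets below are illustrative, and the exact message text depends on the interpreter version, so it is printed rather than assumed.

```python
# Each snippet uses a function call as a del/for/with target.
snippets = [
    "del f()",
    "for f() in items: pass",
    "with ctx as f(): pass",
]

messages = {}
for source in snippets:
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        messages[source] = exc.msg
    else:
        messages[source] = None  # compiled; no error was raised

for source, message in messages.items():
    print(f"{source!r}: {message}")
```
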

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
(cherry picked from commit 01ece63d42)
2020-06-19 01:03:58 +01:00