Commit Graph

623 Commits

Author SHA1 Message Date
Miss Islington (bot) a5eaa14584
[3.11] gh-95782: Fix io.BufferedReader.tell() etc. being able to return offsets < 0 (GH-99709) (GH-115600)
lseek() always returns 0 for character pseudo-devices like
`/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it
always returns -1, to which CPython reacts by raising appropriate
exceptions). They are thus technically seekable despite not having seek
semantics.

When calling read() on e.g. an instance of `io.BufferedReader` that
wraps such a file, `BufferedReader` reads ahead, filling its buffer,
creating a discrepancy between the number of bytes read and the internal
`tell()` always returning 0, which previously resulted in e.g.
`BufferedReader.tell()` or `BufferedReader.seek()` being able to return
positions < 0 even though these are supposed to be always >= 0.

Invariably keep the return value non-negative by returning
max(former_return_value, 0) instead, and add some corresponding tests.
(cherry picked from commit 26800cf25a)

Co-authored-by: 6t8k <58048945+6t8k@users.noreply.github.com>
2024-02-17 14:55:43 +02:00
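The clamping described in this commit can be illustrated with a small Python sketch. This is not the CPython implementation (the actual change lives in the C buffered I/O code); `clamped_tell`, `raw_pos` and `read_ahead` are hypothetical names chosen only for illustration.

```python
def clamped_tell(raw_pos: int, read_ahead: int) -> int:
    """Return a logical stream position that is never negative.

    raw_pos    -- position reported by the raw stream (always 0 for a
                  pseudo-device such as /dev/urandom, per the commit message)
    read_ahead -- bytes already buffered but not yet handed to the caller
    """
    former_return_value = raw_pos - read_ahead
    # The fix: keep the result non-negative.
    return max(former_return_value, 0)


# A regular file: the raw position is ahead of the logical one.
regular_file = clamped_tell(raw_pos=8192, read_ahead=8000)

# A pseudo-device: the raw position stays at 0 while the buffer fills up.
pseudo_device = clamped_tell(raw_pos=0, read_ahead=8000)
```

Without the `max()` call, the second case would yield a negative position.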
Miss Islington (bot) 20c6535693
[3.11] gh-115059: Flush the underlying write buffer in io.BufferedRandom.read1() (GH-115163) (GH-115206)
(cherry picked from commit 846fd721d5)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-02-09 11:01:47 +00:00
Miss Islington (bot) 4db8d3be49
[3.11] gh-80109: Fix io.TextIOWrapper dropping the internal buffer during write() (GH-22535) (GH-113809)
io.TextIOWrapper was dropping the internal decoding buffer
during read() and write() calls.
(cherry picked from commit 73c9326563)

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2024-01-08 10:47:50 +00:00
Miss Islington (bot) e2421a36f0
[3.11] gh-111942: Fix crashes in TextIOWrapper.reconfigure() (GH-111976) (GH-112059)
* Fix crash when encoding is not string or None.
* Fix crash when both line_buffering and write_through raise exception
  when converted to int.
* Add a number of tests for constructor and reconfigure() method
  with invalid arguments.

(cherry picked from commit ee06fffd38)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-11-15 14:20:18 +00:00
Miss Islington (bot) bb92fdabc7
[3.11] gh-111174: Fix crash in getbuffer() called repeatedly for empty BytesIO (GH-111210) (GH-111315)
(cherry picked from commit 9da98c0d9a)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-10-25 11:18:30 +00:00
Miss Islington (bot) 69bcaf7e0e
gh-110913: Fix WindowsConsoleIO chunking of UTF-8 text (GH-111007)
(cherry picked from commit 11312eae6e)

Co-authored-by: Tamás Hegedűs <sorgloomer@users.noreply.github.com>
2023-10-20 12:19:04 +00:00
Erlend E. Aasland d79216d48f
[3.11] gh-107801: Improve the accuracy of io.IOBase.seek docs (#108268) (#108656)
(cherry picked from commit 8178a88bd8)

- Add param docstrings
- Link to os.SEEK_* constants
- Mention the return value in the initial paragraph

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2023-08-29 19:57:49 +00:00
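As a brief illustration of the points this documentation change covers (the `os.SEEK_*` constants and the return value of `seek()`), here is a minimal sketch using an in-memory `io.BytesIO` stream. The variable names are illustrative only.

```python
import io
import os

buf = io.BytesIO(b"0123456789")

# seek() returns the new absolute position in each case.
pos_from_start = buf.seek(3, os.SEEK_SET)    # offset relative to the start
pos_from_current = buf.seek(2, os.SEEK_CUR)  # offset relative to the current position
pos_from_end = buf.seek(-1, os.SEEK_END)     # offset relative to the end

last_byte = buf.read()
```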
Erlend E. Aasland bd951cd95b
[3.11] gh-107801: Document io.TextIOWrapper.tell (#108265) (#108548)
(cherry picked from commit 38afa4af9b)
2023-08-27 21:22:43 +00:00
Serhiy Storchaka b9fc536399
[3.11] gh-107913: Fix possible losses of OSError error codes (GH-107930) (GH-108524)
Functions like PyErr_SetFromErrno() and SetFromWindowsErr() should be
called immediately after using the C API which sets errno or the Windows
error code.
(cherry picked from commit 2b15536fa9)
2023-08-27 12:18:58 +00:00
Hugo van Kemenade d678ee7719
[3.11] Trim trailing whitespace and test on CI (GH-104275) (#108215) 2023-08-22 12:57:31 +03:00
Miss Islington (bot) dd0a1f9da2
[3.11] gh-102507 Remove invisible pagebreak characters (GH-102531) (#108266)
(cherry picked from commit b097925858)

Co-authored-by: JosephSBoyle <48555120+JosephSBoyle@users.noreply.github.com>
Co-authored-by: AlexWaygood <alex.waygood@gmail.com>
2023-08-22 08:49:35 +00:00
Erlend E. Aasland cc42182c97
[3.11] gh-107801: Improve the accuracy of io.TextIOWrapper.seek docs (#107933) (#108264)
(cherry picked from commit 7f87ebbc3f)

Clearly document the supported seek() operations:

- Rewind to the start of the stream
- Restore a previous stream position (given by tell())
- Fast-forward to the end of the stream
2023-08-22 08:19:56 +00:00
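The three supported operations listed above can be sketched as follows. This is a minimal example over an in-memory stream; the sample text and variable names are assumptions made for illustration.

```python
import io
import os

raw = io.BytesIO("héllo\nwörld\n".encode("utf-8"))
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

first = text.readline()
cookie = text.tell()      # opaque value; only meaningful when passed back to seek()
second = text.readline()

# 1. Rewind to the start of the stream.
text.seek(0)
reread_first = text.readline()

# 2. Restore a previous stream position given by tell().
text.seek(cookie)
reread_second = text.readline()

# 3. Fast-forward to the end of the stream.
text.seek(0, os.SEEK_END)
tail = text.read()
```

The value returned by `tell()` on a text stream is treated here as an opaque cookie rather than a character count, since multi-byte characters make the two differ.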
Miss Islington (bot) 3ef9f6b508
gh-82052: Don't send partial UTF-8 sequences to the Windows API (GH-101103)
(cherry picked from commit f34176b77f)

Co-authored-by: Paul Moore <p.f.moore@gmail.com>
2023-01-17 11:52:50 -08:00
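The idea of never cutting a write in the middle of a multi-byte UTF-8 sequence can be sketched in Python. This is only an illustration under the assumption that the input is valid UTF-8; the actual fix is in the C implementation of WindowsConsoleIO, and `truncate_utf8` is a hypothetical helper name.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Return at most `limit` bytes of `data`, cut on a character boundary."""
    if len(data) <= limit:
        return data
    end = limit
    # data[end] is the first byte that would be left out. If it is a
    # continuation byte (0b10xxxxxx), the cut falls inside a sequence,
    # so step back until the cut lands just before a lead or ASCII byte.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


euro = "ab€".encode("utf-8")       # b"ab\xe2\x82\xac"
chunk = truncate_utf8(euro, 4)     # a plain slice would split the 3-byte "€"
```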
Kumar Aditya 6f658dd60d
[3.11] bpo-31718: Fix io.IncrementalNewlineDecoder SystemErrors and segfaults (GH-18640) (#99841)
Co-authored-by: Oren Milman <orenmn@gmail.com>
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>

(cherry picked from commit 53eef27133)
2022-11-28 16:47:33 +05:30
Miss Islington (bot) c06f74f1d6
bpo-38031: Fix a possible assertion failure in _io.FileIO() (GH-GH-5688)
(cherry picked from commit d386115039)

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2022-11-25 05:20:00 -08:00
Miss Islington (bot) 29c3dc050a
gh-83004: Clean up refleak in _io initialisation (GH-98840)
(cherry picked from commit 1208037246)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-11-06 06:03:52 -08:00
Miss Islington (bot) 84d58ad17b
GH-90699: fix ref counting of static immortal strings (gh-94850)
(cherry picked from commit 1834133e66)

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
2022-07-19 23:56:47 -07:00
Miss Islington (bot) a914fa979e
GH-94857: fix test_io refleak (GH-94858)
(cherry picked from commit 631160c262)

Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
2022-07-18 07:17:55 -07:00
Miss Islington (bot) 6c18bd5da0
Fix typo in _io.TextIOWrapper Clinic input (GH-94037) (GH-94116)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit ca308c13da)

Co-authored-by: fikotta <81991278+fikotta@users.noreply.github.com>
2022-06-22 14:29:09 +02:00
Miss Islington (bot) 81686e701c
gh-84461: Silence some compiler warnings on WASM (GH-93978)
(cherry picked from commit 774ef28814)

Co-authored-by: Christian Heimes <christian@python.org>
2022-06-20 05:08:14 -07:00
Victor Stinner b270b82f11
gh-91320: Argument Clinic uses _PyCFunction_CAST() (#32210)
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).
2022-05-03 20:25:41 +02:00
Inada Naoki 0729b31a8b
gh-91952: Make TextIOWrapper.reconfigure() support "locale" encoding (GH-91982) 2022-05-01 10:44:14 +09:00
Kumar Aditya ab0d35d70d
bpo-46712: share more global strings in deepfreeze (gh-32152)
(for gh-90868)
2022-04-19 11:41:36 -06:00
Inada Naoki 6fdb62b1fa
gh-91526: io: Remove device encoding support from TextIOWrapper (GH-91529)
`TextIOWrapper.__init__()` called `os.device_encoding(file.fileno())` if fileno is 0-2 and encoding=None.
But it very rarely works, and it was never documented behavior.
2022-04-19 11:44:36 +09:00
Inada Naoki 13b17e2a0a
gh-91156: Fix `encoding="locale"` in UTF-8 mode (GH-70056) 2022-04-14 16:00:35 +09:00
Inada Naoki 6773203487
bpo-47000: Add `locale.getencoding()` (GH-32068) 2022-04-09 09:54:54 +09:00
Inada Naoki 4216dce04b
bpo-47000: Make `io.text_encoding()` respect UTF-8 mode (GH-32003)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2022-04-04 11:46:57 +09:00
slateny cedd2473a9
bpo-25415: Remove confusing sentence from IOBase docstrings (PR-31631) 2022-03-04 12:35:52 -05:00
Eric Snow 1f455361ec
bpo-46765: Replace Locally Cached Strings with Statically Initialized Objects (gh-31366)
https://bugs.python.org/issue46765
2022-02-22 17:23:51 -07:00
Eric Snow 81c72044a1
bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI is using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
Victor Stinner 9c8e490b8f
bpo-46417: Clear _io module static objects at exit (GH-30807)
Add _PyIO_Fini() function, called by finalize_interp_clear(). It
clears static objects used by the _io extension module.
2022-01-22 23:22:20 +01:00
Benjamin Peterson 19a6c41e56
Remove unused variables. (GH-29231) 2021-10-26 16:22:34 -07:00
Victor Stinner 97308dfcdc
bpo-45434: Move _Py_BEGIN_SUPPRESS_IPH to pycore_fileutils.h (GH-28922) 2021-10-13 15:03:35 +02:00
Victor Stinner d943d19172
bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895)
* Move _PyObject_CallNoArgs() to pycore_call.h (internal C API).
* _ssl, _sqlite and _testcapi extensions now call the public
  PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs().
* _lsprof extension is now built with Py_BUILD_CORE_MODULE macro
  defined to get access to internal _PyObject_CallNoArgs().
2021-10-12 08:38:19 +02:00
Victor Stinner ce3489cfdb
bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891)
Fix typo in the private _PyObject_CallNoArg() function name: rename
it to _PyObject_CallNoArgs() to be consistent with the public
function PyObject_CallNoArgs().
2021-10-12 00:42:23 +02:00
AngstyDuck a450398933
bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457) 2021-10-01 21:11:08 +01:00
Serhiy Storchaka 92bf8691fb
bpo-43413: Fix handling keyword arguments in subclasses of some builtin classes (GH-26456)
* Constructors of subclasses of some builtin classes (e.g. tuple, list,
  frozenset) no longer accept arbitrary keyword arguments.
* Subclass of set can now define a __new__() method with additional
  keyword parameters without also overriding __init__().
2021-09-12 13:27:50 +03:00
Victor Stinner 7974c30b9f
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
* Rename _Py_NO_INLINE macro to Py_NO_INLINE: make it public and
  document it.
* Sort macros in the C API documentation.
2021-09-03 16:44:02 +02:00
Victor Stinner 19ba2122ac
bpo-37330: open() no longer accepts 'U' in file mode (GH-28118)
open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag has been
deprecated since Python 3.3.
2021-09-02 12:58:00 +02:00
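Code that relied on the "U" flag can typically rely on the default `newline=None` behavior of text mode instead, which already translates line endings on read. A minimal sketch, with an assumed temporary file name and sample data:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed_newlines.txt")
    with open(path, "wb") as f:
        f.write(b"a\r\nb\rc\n")

    # Default text mode (newline=None): "\r\n" and "\r" are translated to "\n".
    with open(path, "r", encoding="ascii") as f:
        translated = f.read()

    # newline="" still recognizes all line endings but leaves them untranslated.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.read()
```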
Segev Finer 5e437fb872
bpo-30555: Fix WindowsConsoleIO failing in the presence of fd redirection (GH-1927)
This works by not caching the handle and instead getting the handle from
the file descriptor each time, so that if the actual handle changes by
fd redirection closing/opening the console handle beneath our feet, we
will keep working correctly.
2021-04-23 23:00:27 +01:00
Inada Naoki bec8c787ec
bpo-43510: Fix emitting EncodingWarning from _io module. (GH-25146)
I forgot to check the PyErr_WarnEx() return value, but it will fail when -Werror is used.
2021-04-02 17:38:59 +09:00
Inada Naoki cfa176685a
Revert "bpo-43510: PEP 597: Accept `encoding="locale"` in binary mode (GH-25103)" (#25108)
This reverts commit ff3c9739bd.
2021-03-31 18:49:41 +09:00
Inada Naoki ff3c9739bd
bpo-43510: PEP 597: Accept `encoding="locale"` in binary mode (GH-25103)
It makes `encoding="locale"` usable everywhere `encoding=None` is
allowed.
2021-03-31 14:26:08 +09:00
Inada Naoki 4827483f47
bpo-43510: Implement PEP 597 opt-in EncodingWarning. (GH-19481)
See [PEP 597](https://www.python.org/dev/peps/pep-0597/).

* Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
* Add EncodingWarning
* Add io.text_encoding()
* open() and TextIOWrapper() emit EncodingWarning when encoding is omitted and warn_default_encoding is enabled.
* _pyio.TextIOWrapper() uses UTF-8 as the fallback default encoding when importing the locale module fails (this happens while building Python).
* bz2, configparser, gzip, lzma, pathlib, tempfile modules use io.text_encoding().
* What's new entry
2021-03-29 12:28:14 +09:00
Inada Naoki 01806d5beb
bpo-43260: io: Prevent large data from remaining in the textio buffer. (GH-24592)
When very large data remains in TextIOWrapper, flush() may fail forever.

So prevent data larger than chunk_size from remaining in the TextIOWrapper
internal buffer.

Co-Authored-By: Eryk Sun
2021-02-22 08:29:30 +09:00
Victor Stinner 82458b6cdb
bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Victor Stinner 710e826307
bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Also reorganize
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
Victor Stinner 37834136d0
bpo-42161: Modules/ uses _PyLong_GetZero() and _PyLong_GetOne() (GH-22998)
Use _PyLong_GetZero() and _PyLong_GetOne() in Modules/ directory.

_cursesmodule.c and zoneinfo.c are now built with
Py_BUILD_CORE_MODULE macro defined.
2020-10-27 17:12:53 +01:00
Victor Stinner 97d15ae1d8
bpo-40170: Use inline _PyType_HasFeature() function (GH-22375)
Use _PyType_HasFeature() in the _io module and in structseq
implementation. Replace PyType_HasFeature() opaque function call with
_PyType_HasFeature() inlined function.
2020-09-23 14:08:38 +02:00
Serhiy Storchaka 4c8f09d7ce
bpo-36346: Make using the legacy Unicode C API optional (GH-21437)
Add compile-time option USE_UNICODE_WCHAR_CACHE. Setting it to 0
makes the interpreter not use the wchar_t cache and the legacy Unicode C API.
2020-07-10 23:26:06 +03:00