Clearly document the supported seek() operations:
- Rewind to the start of the stream
- Restore a previous stream position (given by tell())
- Fast-forward to the end of the stream
Remove the following private functions of the C API:
* _PyCodecInfo_GetIncrementalDecoder()
* _PyCodecInfo_GetIncrementalEncoder()
* _PyCodec_DecodeText()
* _PyCodec_EncodeText()
* _PyCodec_Forget()
* _PyCodec_Lookup()
* _PyCodec_LookupTextEncoding()
Move these functions to a new pycore_codecs.h internal header file.
These functions are no longer exported.
Remove the following private functions from the public C API:
* _Py_CheckFunctionResult()
* _PyObject_CallMethod()
* _PyObject_CallMethodId()
* _PyObject_CallMethodIdNoArgs()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_CallMethodIdOneArg()
* _PyObject_MakeTpCall()
* _PyObject_VectorcallMethodId()
* _PyStack_AsDict()
Move these functions to the internal C API (pycore_call.h).
No longer export the following functions:
* _PyObject_Call()
* _PyObject_CallMethod()
* _PyObject_CallMethodId()
* _PyObject_CallMethodIdObjArgs()
* _PyObject_Call_Prepend()
* _PyObject_FastCallDictTstate()
* _PyStack_AsDict()
The following functions are still exported for stdlib shared
extensions:
* _Py_CheckFunctionResult()
* _PyObject_MakeTpCall()
Mark the following internal functions as extern:
* _PyStack_UnpackDict()
* _PyStack_UnpackDict_Free()
* _PyStack_UnpackDict_FreeNoDecRef()
When preparing the _io extension module for isolation, many methods were
adapted to Argument Clinic. Some of these used the '*args: object'
signature, which is incorrect. These are now corrected to an exact
signature, and marked unused, since they are stub methods.
builtins and extension module functions and methods that expect boolean values for parameters now accept any Python object rather than just a bool or int type. This is more consistent with how native Python code itself behaves.
Fix potential race condition in code patterns:
* Replace "Py_DECREF(var); var = new;" with "Py_SETREF(var, new);"
* Replace "Py_XDECREF(var); var = new;" with "Py_XSETREF(var, new);"
* Replace "Py_CLEAR(var); var = new;" with "Py_XSETREF(var, new);"
Other changes:
* Replace "old = var; var = new; Py_DECREF(var)"
with "Py_SETREF(var, new);"
* Replace "old = var; var = new; Py_XDECREF(var)"
with "Py_XSETREF(var, new);"
* And remove the "old" variable.
`TextIOWrapper.__init__()` called `os.device_encoding(file.fileno())` if fileno is 0-2 and encoding=None.
But it is very rarely works, and never documented behavior.
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
See [PEP 597](https://www.python.org/dev/peps/pep-0597/).
* Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
* Add EncodingWarning
* Add io.text_encoding()
* open(), TextIOWrapper() emits EncodingWarning when encoding is omitted and warn_default_encoding is enabled.
* _pyio.TextIOWrapper() uses UTF-8 as fallback default encoding used when failed to import locale module. (used during building Python)
* bz2, configparser, gzip, lzma, pathlib, tempfile modules use io.text_encoding().
* What's new entry
When very large data remains in TextIOWrapper, flush() may fail forever.
So prevent that data larger than chunk_size is remained in TextIOWrapper internal
buffer.
Co-Authored-By: Eryk Sun
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
share code between _Py_GetLocaleEncodingObject()
and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
from the current locale encoding with surrogateescape,
rather than using UTF-8.
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.
Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
Use _PyLong_GetZero() and _PyLong_GetOne() in Modules/ directory.
_cursesmodule.c and zoneinfo.c are now built with
Py_BUILD_CORE_MODULE macro defined.
Move PyInterpreterState.fs_codec into a new
PyInterpreterState.unicode structure.
Give a name to the fs_codec structure and use this structure in
unicodeobject.c.
Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET()
for consistency with _PyThreadState_GET() and to have a shorter name
(help to fit into 80 columns).
Add also "assert(tstate != NULL);" to the function.
Don't access PyInterpreterState.config member directly anymore, but
use new functions:
* _PyInterpreterState_GetConfig()
* _PyInterpreterState_SetConfig()
* _Py_GetConfig()
gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either:
Adding the const to the type cast, as in:
- return _PyUnicode_FromUCS1((unsigned char*)s, size);
+ return _PyUnicode_FromUCS1((const unsigned char*)s, size);
or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in:
- PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno);
+ PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno);
These changes will not change code, but they will make it much easier to check for errors in consts
The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments