* Clarify sys.getdefaultencoding() documentation
* Add missing documentation for PyUnicode_GetDefaultEncoding,
the C equivalent of sys.getdefaultencoding
(cherry picked from commit 9f25c1f012)
Co-authored-by: RUANG (James Roy) <longjinyii@outlook.com>
gh-46236: Document PyUnicode_RSplit, PyUnicode_Partition and PyUnicode_RPartition (GH-130191)
(cherry picked from commit 0f5b82169e)
Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
We had the definition of what makes a character "printable" documented in three places, giving two different definitions.
The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that.
With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.
Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.
Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.
Also add a thorough test that the implementation agrees with this definition.
Author: Greg Price <gnprice@gmail.com>
Co-authored-by: Greg Price <gnprice@gmail.com>
(cherry picked from commit 3402e133ef)
Docs C API: Clarify what happens when null bytes are passed to `PyUnicode_AsUTF8` (GH-127458)
(cherry picked from commit e792f4bc2e)
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Stan U. <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
GH-95079: document error behaviour for some unicode C APIs (GH-95080)
(cherry picked from commit b79a21ea42)
Co-authored-by: Max Bachmann <kontakt@maxbachmann.de>
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
* Document immortality in some functions that take `const char *`
This is PyUnicode_InternFromString;
PyDict_SetItemString, PyObject_SetAttrString;
PyObject_DelAttrString; PyUnicode_InternFromString;
and the PyModule_Add convenience functions.
Always point out a non-immortalizing alternative.
* Don't immortalize user-provided attr names in _ctypes
(cherry picked from commit b4aedb23ae)
* Revert "gh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (#111585)"
This reverts commit d9b606b3d0.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in getargs.c (#111620)"
This reverts commit cde1071b2a.
* Revert "gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)"
This reverts commit d731579bfb.
* Revert "gh-111089: Add PyUnicode_AsUTF8() to the limited C API (#111121)"
This reverts commit d8f32be5b6.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in sqlite3 (#111122)"
This reverts commit 37e4e20eaa.
* PyUnicode_AsUTF8() now raises an exception if the string contains
embedded null characters.
* Update related C API tests (test_capi.test_unicode).
* type_new_set_doc() uses PyUnicode_AsUTF8AndSize() to silently
truncate doc containing null bytes.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
PEP 683 (immortal objects) revealed some ways in which the Python documentation has been unnecessarily coupled to the implementation details of reference counts. In the end users should focus on reference ownership, including taking references and releasing them, rather than on how many reference counts an object has.
This change updates the documentation to reflect that perspective. It also updates the docs relative to immortal objects in a handful of places.
Declare the following functions as macros, since they are actually
macros. It avoids a warning on "TYPE" or "macro" argument.
* PyMem_New()
* PyMem_Resize()
* PyModule_AddIntMacro()
* PyModule_AddStringMacro()
* PyObject_GC_New()
* PyObject_GC_NewVar()
* PyObject_New()
* PyObject_NewVar()
Add C standard C types to nitpick_ignore in Doc/conf.py:
* int64_t
* uint64_t
* uintptr_t
No longer ignore non existing "__int" type in nitpick_ignore.
Update Doc/tools/.nitignore
_PyUnicode_ToLowercase(), _PyUnicode_ToUppercase(),
_PyUnicode_ToTitlecase() are no longer deprecated in the
documentation. It's no longer needed since they now use Py_UCS4 type,
rather than the deprecated Py_UNICODE type.
Deprecate the old Py_UNICODE and PY_UNICODE_TYPE types in the C API:
use wchar_t instead.
Replace Py_UNICODE with wchar_t in multiple C files.
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
* Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
* Support for length modifiers j (intmax_t) and t (ptrdiff_t).
* Length modifiers are now applied to all integer conversions.
* Support for wchar_t C strings (%ls and %lV).
* Support for variable width and precision (*).
* Support for flag - (left alignment).
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
Python now always use the ``%zu`` and ``%zd`` printf formats to
format a size_t or Py_ssize_t number. Building Python 3.12 requires a
C11 compiler, so these printf formats are now always supported.
* PyObject_Print() and _PyObject_Dump() now use the printf %zd format
to display an object reference count.
* Update PY_FORMAT_SIZE_T comment.
* Remove outdated notes about the %zd format in PyBytes_FromFormat()
and PyUnicode_FromFormat() documentations.
* configure no longer checks for the %zd format and no longer defines
PY_FORMAT_SIZE_T macro in pyconfig.h.
* pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is
no longer supported. Python 3.12 now requires macOS 10.6 (Snow
Leopard) or newer.
Update documentation of PyUnicode_DecodeFSDefault(),
PyUnicode_DecodeFSDefaultAndSize() and PyUnicode_EncodeFSDefault():
they now use the filesystem encoding and error handler of PyConfig,
Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors
variables are no longer used.
Use _Py_CAST() and _Py_STATIC_CAST() in macros wrapping static inline
functions of unicodeobject.h.
Change also the kind type from unsigned int to int: same parameter
type than PyUnicode_FromKindAndData().
The limited API version 3.11 no longer casts arguments to expected
types.
* C API docs: move PyErr_SetImportErrorSubclass docs
It was in the section about warnings, but it makes more sense to
put it with PyErr_SetImportError.
* C API docs: document closeit argument to PyRun_AnyFileExFlags
It was already documented for PyRun_SimpleFileExFlags.
* textual fixes to unicode docs
* Move paragraph about tp_dealloc into tp_dealloc section
* __aiter__ returns an async iterator, not an awaitable