gh-102471, PEP 757: Add PyLong import and export API (#121339)

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Victor Stinner 2024-12-13 14:24:48 +01:00 committed by GitHub
parent d05a4e6a0d
commit 6446408d42
9 changed files with 576 additions and 0 deletions


@ -653,3 +653,177 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
.. versionadded:: 3.12
Export API
^^^^^^^^^^
.. versionadded:: next
.. c:struct:: PyLongLayout
Layout of an array of "digits" ("limbs" in the GMP terminology), used to
represent the absolute value of arbitrary-precision integers.
Use :c:func:`PyLong_GetNativeLayout` to get the native layout of Python
:class:`int` objects, used internally for integers with "big enough"
absolute value.
See also :data:`sys.int_info` which exposes similar information in Python.
.. c:member:: uint8_t bits_per_digit
Bits per digit. For example, a 15 bit digit means that bits 0-14 contain
meaningful information.
.. c:member:: uint8_t digit_size
Digit size in bytes. For example, a 15 bit digit will require at least 2
bytes.
.. c:member:: int8_t digits_order
Digits order:
- ``1`` for most significant digit first
- ``-1`` for least significant digit first
.. c:member:: int8_t digit_endianness
Digit endianness:
- ``1`` for most significant byte first (big endian)
- ``-1`` for least significant byte first (little endian)
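The four layout members can be approximated from Python for illustration. This is a minimal sketch, not the C API: the helper name `native_layout_model` is hypothetical, and the value `-1` for `digits_order` is taken from what this commit's implementation and test report for CPython.

```python
import sys


def native_layout_model():
    """Pure-Python approximation of the PyLongLayout members (hypothetical helper)."""
    info = sys.int_info
    return {
        "bits_per_digit": info.bits_per_digit,
        "digit_size": info.sizeof_digit,
        # CPython stores the least significant digit first (per this commit).
        "digits_order": -1,
        # Bytes inside one digit follow the byte order of the machine.
        "digit_endianness": -1 if sys.byteorder == "little" else 1,
    }


layout = native_layout_model()
# Every meaningful bit of a digit must fit in the bytes reserved for it.
assert layout["bits_per_digit"] <= 8 * layout["digit_size"]
print(layout)
```

C code should call `PyLong_GetNativeLayout` rather than hard-code any of these values.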
.. c:function:: const PyLongLayout* PyLong_GetNativeLayout(void)
Get the native layout of Python :class:`int` objects.
See the :c:struct:`PyLongLayout` structure.
The function must not be called before Python initialization nor after
Python finalization. The returned layout is valid until Python is
finalized. The layout is the same for all Python sub-interpreters
in a process, and so it can be cached.
.. c:struct:: PyLongExport
Export of a Python :class:`int` object.
There are two cases:
* If :c:member:`digits` is ``NULL``, only use the :c:member:`value` member.
* If :c:member:`digits` is not ``NULL``, use :c:member:`negative`,
:c:member:`ndigits` and :c:member:`digits` members.
.. c:member:: int64_t value
The native integer value of the exported :class:`int` object.
Only valid if :c:member:`digits` is ``NULL``.
.. c:member:: uint8_t negative
``1`` if the number is negative, ``0`` otherwise.
Only valid if :c:member:`digits` is not ``NULL``.
.. c:member:: Py_ssize_t ndigits
Number of digits in :c:member:`digits` array.
Only valid if :c:member:`digits` is not ``NULL``.
.. c:member:: const void *digits
Read-only array of unsigned digits. Can be ``NULL``.
.. c:function:: int PyLong_Export(PyObject *obj, PyLongExport *export_long)
Export a Python :class:`int` object.
*export_long* must point to a :c:struct:`PyLongExport` structure allocated
by the caller. It must not be ``NULL``.
On success, fill in *\*export_long* and return ``0``.
On error, set an exception and return ``-1``.
:c:func:`PyLong_FreeExport` must be called when the export is no longer
needed.
.. impl-detail::
This function always succeeds if *obj* is a Python :class:`int` object
or a subclass.
.. c:function:: void PyLong_FreeExport(PyLongExport *export_long)
Release the export *export_long* created by :c:func:`PyLong_Export`.
.. impl-detail::
Calling :c:func:`PyLong_FreeExport` is optional if *export_long->digits*
is ``NULL``.
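The two export cases can be modelled in pure Python. This sketch assumes the behaviour of the implementation in this commit (values that fit into `int64_t` are returned through `value`, everything else as a sign plus an array of digits); `export_model` is a hypothetical name, and the documented contract only promises that `digits` distinguishes the two cases.

```python
import sys

BITS = sys.int_info.bits_per_digit
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def export_model(n):
    """Mimic the two PyLongExport cases for a Python int (hypothetical helper)."""
    if INT64_MIN <= n <= INT64_MAX:
        # Case 1: digits would be NULL, only `value` is meaningful.
        return ("value", n)
    # Case 2: sign flag plus digits of the absolute value,
    # least significant digit first.
    magnitude = abs(n)
    mask = (1 << BITS) - 1
    digits = []
    while magnitude:
        digits.append(magnitude & mask)
        magnitude >>= BITS
    return ("digits", int(n < 0), digits)


print(export_model(123))
print(export_model(-(2**100)))
```

A C caller would branch on `export_long.digits == NULL` in the same way, and call `PyLong_FreeExport` once it is done with the data.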
PyLongWriter API
^^^^^^^^^^^^^^^^
The :c:type:`PyLongWriter` API can be used to import an integer.
.. versionadded:: next
.. c:struct:: PyLongWriter
A Python :class:`int` writer instance.
The instance must be destroyed by :c:func:`PyLongWriter_Finish` or
:c:func:`PyLongWriter_Discard`.
.. c:function:: PyLongWriter* PyLongWriter_Create(int negative, Py_ssize_t ndigits, void **digits)
Create a :c:type:`PyLongWriter`.
On success, allocate *\*digits* and return a writer.
On error, set an exception and return ``NULL``.
*negative* is ``1`` if the number is negative, or ``0`` otherwise.
*ndigits* is the number of digits in the *digits* array. It must be
greater than 0.
*digits* must not be ``NULL``.
After a successful call to this function, the caller should fill in the
array of digits *digits* and then call :c:func:`PyLongWriter_Finish` to get
a Python :class:`int`.
The layout of *digits* is described by :c:func:`PyLong_GetNativeLayout`.
Digits must be in the range [``0``; ``(1 << bits_per_digit) - 1``]
(where the :c:struct:`~PyLongLayout.bits_per_digit` is the number of bits
per digit).
Any unused most significant digits must be set to ``0``.
Alternatively, call :c:func:`PyLongWriter_Discard` to destroy the writer
instance without creating an :class:`~int` object.
.. c:function:: PyObject* PyLongWriter_Finish(PyLongWriter *writer)
Finish a :c:type:`PyLongWriter` created by :c:func:`PyLongWriter_Create`.
On success, return a Python :class:`int` object.
On error, set an exception and return ``NULL``.
The function takes care of normalizing the digits and converts the object
to a compact integer if needed.
The writer instance and the *digits* array are invalid after the call.
.. c:function:: void PyLongWriter_Discard(PyLongWriter *writer)
Discard a :c:type:`PyLongWriter` created by :c:func:`PyLongWriter_Create`.
*writer* must not be ``NULL``.
The writer instance and the *digits* array are invalid after the call.
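The import direction can be modelled the same way: digits in the native order (least significant first) are combined into one integer, and zero most significant digits drop out naturally. This is a sketch only. `import_model` is a hypothetical name, and the range check is an assumption added for illustration: with the C API, keeping each digit in range is the caller's responsibility.

```python
import sys

BITS = sys.int_info.bits_per_digit


def import_model(negative, digits):
    """Combine native-layout digits into a Python int (hypothetical helper)."""
    if not digits:
        # Mirrors the "ndigits must be greater than 0" requirement.
        raise ValueError("ndigits must be positive")
    value = 0
    for index, digit in enumerate(digits):
        if not 0 <= digit < (1 << BITS):
            raise ValueError("digit out of range")
        # Digit number `index` carries weight 2 ** (index * BITS).
        value |= digit << (index * BITS)
    return -value if negative else value


base = 1 << BITS
print(import_model(0, [123, 0, 0]))
print(import_model(1, [1, 2]) == -(base * 2 + 1))
```

In C, the equivalent steps are `PyLongWriter_Create`, filling the returned *digits* array, then `PyLongWriter_Finish`.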


@ -1299,6 +1299,13 @@ PyLong_GetSign:int:::
PyLong_GetSign:PyObject*:v:0:
PyLong_GetSign:int*:sign::
PyLong_Export:int:::
PyLong_Export:PyObject*:obj:0:
PyLong_Export:PyLongExport*:export_long::
PyLongWriter_Finish:PyObject*::+1:
PyLongWriter_Finish:PyLongWriter*:writer::
PyMapping_Check:int:::
PyMapping_Check:PyObject*:o:0:


@ -1018,6 +1018,17 @@ New features
(Contributed by Victor Stinner in :gh:`107954`.)
* Add a new import and export API for Python :class:`int` objects (:pep:`757`):
* :c:func:`PyLong_GetNativeLayout`;
* :c:func:`PyLong_Export`;
* :c:func:`PyLong_FreeExport`;
* :c:func:`PyLongWriter_Create`;
* :c:func:`PyLongWriter_Finish`;
* :c:func:`PyLongWriter_Discard`.
(Contributed by Victor Stinner in :gh:`102471`.)
* Add :c:func:`PyType_GetBaseByToken` and :c:data:`Py_tp_token` slot for easier
superclass identification, which attempts to resolve the `type checking issue
<https://peps.python.org/pep-0630/#type-checking>`__ mentioned in :pep:`630`


@ -139,6 +139,44 @@ _PyLong_CompactValue(const PyLongObject *op)
#define PyUnstable_Long_CompactValue _PyLong_CompactValue
/* --- Import/Export API -------------------------------------------------- */
typedef struct PyLongLayout {
uint8_t bits_per_digit;
uint8_t digit_size;
int8_t digits_order;
int8_t digit_endianness;
} PyLongLayout;
PyAPI_FUNC(const PyLongLayout*) PyLong_GetNativeLayout(void);
typedef struct PyLongExport {
int64_t value;
uint8_t negative;
Py_ssize_t ndigits;
const void *digits;
// Member used internally; it must not be used for any other purpose.
Py_uintptr_t _reserved;
} PyLongExport;
PyAPI_FUNC(int) PyLong_Export(
PyObject *obj,
PyLongExport *export_long);
PyAPI_FUNC(void) PyLong_FreeExport(
PyLongExport *export_long);
/* --- PyLongWriter API --------------------------------------------------- */
typedef struct PyLongWriter PyLongWriter;
PyAPI_FUNC(PyLongWriter*) PyLongWriter_Create(
int negative,
Py_ssize_t ndigits,
void **digits);
PyAPI_FUNC(PyObject*) PyLongWriter_Finish(PyLongWriter *writer);
PyAPI_FUNC(void) PyLongWriter_Discard(PyLongWriter *writer);
#ifdef __cplusplus
}
#endif


@ -10,6 +10,7 @@
NULL = None
class IntSubclass(int):
pass
@ -714,5 +715,95 @@ def test_long_asuint64(self):
self.check_long_asint(as_uint64, 0, UINT64_MAX,
negative_value_error=ValueError)
def test_long_layout(self):
# Test PyLong_GetNativeLayout()
int_info = sys.int_info
layout = _testcapi.get_pylong_layout()
expected = {
'bits_per_digit': int_info.bits_per_digit,
'digit_size': int_info.sizeof_digit,
'digits_order': -1,
'digit_endianness': -1 if sys.byteorder == 'little' else 1,
}
self.assertEqual(layout, expected)
def test_long_export(self):
# Test PyLong_Export()
layout = _testcapi.get_pylong_layout()
base = 2 ** layout['bits_per_digit']
pylong_export = _testcapi.pylong_export
# value fits into int64_t
self.assertEqual(pylong_export(0), 0)
self.assertEqual(pylong_export(123), 123)
self.assertEqual(pylong_export(-123), -123)
self.assertEqual(pylong_export(IntSubclass(123)), 123)
# use an array, doesn't fit into int64_t
self.assertEqual(pylong_export(base**10 * 2 + 1),
(0, [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]))
self.assertEqual(pylong_export(-(base**10 * 2 + 1)),
(1, [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]))
self.assertEqual(pylong_export(IntSubclass(base**10 * 2 + 1)),
(0, [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]))
self.assertRaises(TypeError, pylong_export, 1.0)
self.assertRaises(TypeError, pylong_export, 0+1j)
self.assertRaises(TypeError, pylong_export, "abc")
def test_longwriter_create(self):
# Test PyLongWriter_Create()
layout = _testcapi.get_pylong_layout()
base = 2 ** layout['bits_per_digit']
pylongwriter_create = _testcapi.pylongwriter_create
self.assertRaises(ValueError, pylongwriter_create, 0, [])
self.assertRaises(ValueError, pylongwriter_create, -123, [])
self.assertEqual(pylongwriter_create(0, [0]), 0)
self.assertEqual(pylongwriter_create(0, [123]), 123)
self.assertEqual(pylongwriter_create(1, [123]), -123)
self.assertEqual(pylongwriter_create(1, [1, 2]),
-(base * 2 + 1))
self.assertEqual(pylongwriter_create(0, [1, 2, 3]),
base**2 * 3 + base * 2 + 1)
max_digit = base - 1
self.assertEqual(pylongwriter_create(0, [max_digit, max_digit, max_digit]),
base**2 * max_digit + base * max_digit + max_digit)
# normalize
self.assertEqual(pylongwriter_create(0, [123, 0, 0]), 123)
# test singletons + normalize
for num in (-2, 0, 1, 5, 42, 100):
self.assertIs(pylongwriter_create(bool(num < 0), [abs(num), 0]),
num)
def to_digits(num):
digits = []
while True:
num, digit = divmod(num, base)
digits.append(digit)
if not num:
break
return digits
# round trip: Python int -> export -> Python int
pylong_export = _testcapi.pylong_export
numbers = [*range(0, 10), 12345, 0xdeadbeef, 2**100, 2**100-1]
numbers.extend(-num for num in list(numbers))
for num in numbers:
with self.subTest(num=num):
data = pylong_export(num)
if isinstance(data, tuple):
negative, digits = data
else:
value = data
negative = int(value < 0)
digits = to_digits(abs(value))
self.assertEqual(pylongwriter_create(negative, digits), num,
(negative, digits))
if __name__ == "__main__":
unittest.main()


@ -0,0 +1,10 @@
Add a new import and export API for Python :class:`int` objects (:pep:`757`):
* :c:func:`PyLong_GetNativeLayout`;
* :c:func:`PyLong_Export`;
* :c:func:`PyLong_FreeExport`;
* :c:func:`PyLongWriter_Create`;
* :c:func:`PyLongWriter_Finish`;
* :c:func:`PyLongWriter_Discard`.
Patch by Victor Stinner.


@ -141,6 +141,127 @@ pylong_aspid(PyObject *module, PyObject *arg)
}
static PyObject *
layout_to_dict(const PyLongLayout *layout)
{
return Py_BuildValue("{sisisisi}",
"bits_per_digit", (int)layout->bits_per_digit,
"digit_size", (int)layout->digit_size,
"digits_order", (int)layout->digits_order,
"digit_endianness", (int)layout->digit_endianness);
}
static PyObject *
pylong_export(PyObject *module, PyObject *obj)
{
PyLongExport export_long;
if (PyLong_Export(obj, &export_long) < 0) {
return NULL;
}
if (export_long.digits == NULL) {
assert(export_long.negative == 0);
assert(export_long.ndigits == 0);
assert(export_long.digits == NULL);
PyObject *res = PyLong_FromInt64(export_long.value);
PyLong_FreeExport(&export_long);
return res;
}
assert(PyLong_GetNativeLayout()->digit_size == sizeof(digit));
const digit *export_long_digits = export_long.digits;
PyObject *digits = PyList_New(0);
if (digits == NULL) {
goto error;
}
for (Py_ssize_t i = 0; i < export_long.ndigits; i++) {
PyObject *item = PyLong_FromUnsignedLong(export_long_digits[i]);
if (item == NULL) {
goto error;
}
if (PyList_Append(digits, item) < 0) {
Py_DECREF(item);
goto error;
}
Py_DECREF(item);
}
assert(export_long.value == 0);
PyObject *res = Py_BuildValue("(iN)", export_long.negative, digits);
PyLong_FreeExport(&export_long);
assert(export_long._reserved == 0);
return res;
error:
Py_XDECREF(digits);
PyLong_FreeExport(&export_long);
return NULL;
}
static PyObject *
pylongwriter_create(PyObject *module, PyObject *args)
{
int negative;
PyObject *list;
// TODO(vstinner): write test for negative ndigits and digits==NULL
if (!PyArg_ParseTuple(args, "iO!", &negative, &PyList_Type, &list)) {
return NULL;
}
Py_ssize_t ndigits = PyList_GET_SIZE(list);
digit *digits = PyMem_Malloc((size_t)ndigits * sizeof(digit));
if (digits == NULL) {
return PyErr_NoMemory();
}
for (Py_ssize_t i = 0; i < ndigits; i++) {
PyObject *item = PyList_GET_ITEM(list, i);
long num = PyLong_AsLong(item);
if (num == -1 && PyErr_Occurred()) {
goto error;
}
if (num < 0 || num >= PyLong_BASE) {
PyErr_SetString(PyExc_ValueError, "digit doesn't fit into digit");
goto error;
}
digits[i] = (digit)num;
}
void *writer_digits;
PyLongWriter *writer = PyLongWriter_Create(negative, ndigits,
&writer_digits);
if (writer == NULL) {
goto error;
}
assert(PyLong_GetNativeLayout()->digit_size == sizeof(digit));
memcpy(writer_digits, digits, (size_t)ndigits * sizeof(digit));
PyObject *res = PyLongWriter_Finish(writer);
PyMem_Free(digits);
return res;
error:
PyMem_Free(digits);
return NULL;
}
static PyObject *
get_pylong_layout(PyObject *module, PyObject *Py_UNUSED(args))
{
const PyLongLayout *layout = PyLong_GetNativeLayout();
return layout_to_dict(layout);
}
static PyMethodDef test_methods[] = {
_TESTCAPI_CALL_LONG_COMPACT_API_METHODDEF
{"pylong_fromunicodeobject", pylong_fromunicodeobject, METH_VARARGS},
@ -148,6 +269,9 @@ static PyMethodDef test_methods[] = {
{"pylong_fromnativebytes", pylong_fromnativebytes, METH_VARARGS},
{"pylong_getsign", pylong_getsign, METH_O},
{"pylong_aspid", pylong_aspid, METH_O},
{"pylong_export", pylong_export, METH_O},
{"pylongwriter_create", pylongwriter_create, METH_VARARGS},
{"get_pylong_layout", get_pylong_layout, METH_NOARGS},
{"pylong_ispositive", pylong_ispositive, METH_O},
{"pylong_isnegative", pylong_isnegative, METH_O},
{"pylong_iszero", pylong_iszero, METH_O},


@ -6750,6 +6750,7 @@ PyUnstable_Long_CompactValue(const PyLongObject* op) {
return _PyLong_CompactValue((PyLongObject*)op);
}
PyObject* PyLong_FromInt32(int32_t value)
{ return PyLong_FromNativeBytes(&value, sizeof(value), -1); }
@ -6815,3 +6816,122 @@ int PyLong_AsUInt64(PyObject *obj, uint64_t *value)
{
LONG_TO_UINT(obj, value, "C uint64_t");
}
static const PyLongLayout PyLong_LAYOUT = {
.bits_per_digit = PyLong_SHIFT,
.digits_order = -1, // least significant first
.digit_endianness = PY_LITTLE_ENDIAN ? -1 : 1,
.digit_size = sizeof(digit),
};
const PyLongLayout*
PyLong_GetNativeLayout(void)
{
return &PyLong_LAYOUT;
}
int
PyLong_Export(PyObject *obj, PyLongExport *export_long)
{
if (!PyLong_Check(obj)) {
memset(export_long, 0, sizeof(*export_long));
PyErr_Format(PyExc_TypeError, "expect int, got %T", obj);
return -1;
}
// Fast-path: try to convert to an int64_t
int overflow;
#if SIZEOF_LONG == 8
long value = PyLong_AsLongAndOverflow(obj, &overflow);
#else
// Windows has 32-bit long, so use 64-bit long long instead
long long value = PyLong_AsLongLongAndOverflow(obj, &overflow);
#endif
Py_BUILD_ASSERT(sizeof(value) == sizeof(int64_t));
// the function cannot fail since obj is a PyLongObject
assert(!(value == -1 && PyErr_Occurred()));
if (!overflow) {
export_long->value = value;
export_long->negative = 0;
export_long->ndigits = 0;
export_long->digits = NULL;
export_long->_reserved = 0;
}
else {
PyLongObject *self = (PyLongObject*)obj;
export_long->value = 0;
export_long->negative = _PyLong_IsNegative(self);
export_long->ndigits = _PyLong_DigitCount(self);
if (export_long->ndigits == 0) {
export_long->ndigits = 1;
}
export_long->digits = self->long_value.ob_digit;
export_long->_reserved = (Py_uintptr_t)Py_NewRef(obj);
}
return 0;
}
void
PyLong_FreeExport(PyLongExport *export_long)
{
PyObject *obj = (PyObject*)export_long->_reserved;
if (obj) {
export_long->_reserved = 0;
Py_DECREF(obj);
}
}
/* --- PyLongWriter API --------------------------------------------------- */
PyLongWriter*
PyLongWriter_Create(int negative, Py_ssize_t ndigits, void **digits)
{
if (ndigits <= 0) {
PyErr_SetString(PyExc_ValueError, "ndigits must be positive");
goto error;
}
assert(digits != NULL);
PyLongObject *obj = _PyLong_New(ndigits);
if (obj == NULL) {
goto error;
}
if (negative) {
_PyLong_FlipSign(obj);
}
*digits = obj->long_value.ob_digit;
return (PyLongWriter*)obj;
error:
*digits = NULL;
return NULL;
}
void
PyLongWriter_Discard(PyLongWriter *writer)
{
PyLongObject *obj = (PyLongObject *)writer;
assert(Py_REFCNT(obj) == 1);
Py_DECREF(obj);
}
PyObject*
PyLongWriter_Finish(PyLongWriter *writer)
{
PyLongObject *obj = (PyLongObject *)writer;
assert(Py_REFCNT(obj) == 1);
// Normalize and get singleton if possible
obj = maybe_small_long(long_normalize(obj));
return (PyObject*)obj;
}


@ -319,6 +319,7 @@ Objects/exceptions.c - static_exceptions -
Objects/genobject.c - ASYNC_GEN_IGNORED_EXIT_MSG -
Objects/genobject.c - NON_INIT_CORO_MSG -
Objects/longobject.c - _PyLong_DigitValue -
Objects/longobject.c - PyLong_LAYOUT -
Objects/object.c - _Py_SwappedOp -
Objects/object.c - _Py_abstract_hack -
Objects/object.c - last_final_reftotal -
