gh-127937: convert decimal module to use import/export API for ints (PEP 757) #127925

skirpichev · 2024-12-13T16:17:34Z

For export (int instance → Decimal)

Benchmark	ref	patch
Decimal(1<<7)	672 ns	708 ns: 1.05x slower
Decimal(1<<38)	764 ns	713 ns: 1.07x faster
Decimal(1<<300)	1.88 us	1.94 us: 1.03x slower
Decimal(1<<3000)	90.1 us	90.2 us: 1.00x slower
Geometric mean	(ref)	1.00x slower

For import (Decimal instance → int)

Benchmark	ref	patch
int(Decimal(1<<7))	609 ns	517 ns: 1.18x faster
int(Decimal(1<<38))	712 ns	502 ns: 1.42x faster
int(Decimal(1<<300))	2.04 us	1.97 us: 1.04x faster
int(Decimal(1<<3000))	116 us	115 us: 1.00x faster
Geometric mean	(ref)	1.15x faster
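
As a sanity check on the summary rows, the geometric means can be recomputed from the rounded timings in the two tables. This is only a sketch: pyperf works from unrounded means, so the helper name `geometric_mean_ratio` and the hand-copied numbers below are assumptions for illustration, not part of the benchmark scripts.

```python
import math

# Mean timings copied from the two tables above, as (ref, patch) pairs.
# Units cancel in each ratio, so ns and us rows can be mixed.
export_rows = [(672, 708), (764, 713), (1.88, 1.94), (90.1, 90.2)]
import_rows = [(609, 517), (712, 502), (2.04, 1.97), (116, 115)]


def geometric_mean_ratio(rows):
    """Geometric mean of patch/ref; a value below 1.0 means the patch is faster."""
    ratios = [patch / ref for ref, patch in rows]
    return math.prod(ratios) ** (1 / len(ratios))


export_gm = geometric_mean_ratio(export_rows)
import_gm = geometric_mean_ratio(import_rows)
print(f"export: patch takes {export_gm:.2f}x the ref time")
print(f"import: patch is {1 / import_gm:.2f}x faster")
```

With the rounded inputs this agrees with the reported summary rows (about 1.00x for export and about 1.15x faster for import).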

>>> sys.int_info[:2]
(30, 4)

Benchmark code

# export_bench.py
import pyperf
from decimal import Decimal as D

runner = pyperf.Runner()
i1, i2, i3, i4 = 1 << 7, 1 << 38, 1 << 300, 1 << 3000
runner.bench_func('Decimal(1<<7)', D, i1)
runner.bench_func('Decimal(1<<38)', D, i2)
runner.bench_func('Decimal(1<<300)', D, i3)
runner.bench_func('Decimal(1<<3000)', D, i4)

# import_bench.py
import pyperf
from decimal import Decimal as D

runner = pyperf.Runner()
d1, d2, d3, d4 = D(1 << 7), D(1 << 38), D(1 << 300), D(1 << 3000)
runner.bench_func('int(Decimal(1<<7))', int, d1)
runner.bench_func('int(Decimal(1<<38))', int, d2)
runner.bench_func('int(Decimal(1<<300))', int, d3)
runner.bench_func('int(Decimal(1<<3000))', int, d4)

Issue: Remove private _PyLong_FromDigits() function #127937

picnixz · 2024-12-13T17:31:54Z

hide _PyLong_FromDigits()? it's not used outside of the longobject.c anymore

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

news

Not needed I think, unless you want to indicate the performance gain (it's always nice to know that something is faster). I did report the improvements of fnmatch.translate, so I think you can report those improvements as well.

skirpichev · 2024-12-14T00:47:15Z

Modules/_decimal/_decimal.c

+    n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd;
+    PyLongWriter *writer = PyLongWriter_Create(mpd_isnegative(x), n,
+                                               (void**)&ob_digit);
+    /* mpd_sizeinbase can overestimate size by 1 digit, set it to zero. */


BTW, this looks like a bug in mpdecimal. Cf. GNU GMP: the mpz_sizeinbase docs say, "If base is a power of 2, the result is always exact".
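
To illustrate why an overestimate matters here, a minimal Python sketch of the same ceiling division follows. It assumes 30-bit digits, as reported by sys.int_info above; `digit_count` and `to_digits` are hypothetical helpers, not functions from _decimal.c or libmpdec.

```python
BITS_PER_DIGIT = 30  # assumption: matches sys.int_info[:2] == (30, 4)


def digit_count(nbits, bpd=BITS_PER_DIGIT):
    """Ceiling division, mirroring n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd."""
    return (nbits + bpd - 1) // bpd


def to_digits(value, n, bpd=BITS_PER_DIGIT):
    """Split a non-negative int into n little-endian digits of bpd bits each."""
    mask = (1 << bpd) - 1
    return [(value >> (bpd * i)) & mask for i in range(n)]


value = (1 << 30) - 1                          # needs exactly 30 bits: one digit
exact = digit_count(value.bit_length())
padded = digit_count(value.bit_length() + 1)   # size estimate one bit too large

print(exact, to_digits(value, exact))
print(padded, to_digits(value, padded))        # extra most significant digit is 0
```

A size estimate that is too large by a single bit is enough to request one more digit than needed, and that extra most significant digit then holds zero.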

skirpichev · 2024-12-14T01:05:31Z

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

I've updated the PR description with my research. So far, I've found just one use case.

At least, I think we should deprecate this (not soft-deprecate). This apparently affects few projects, and there is now a public alternative. @picnixz, what do you think?

picnixz · 2024-12-14T01:30:40Z

At least, I think we should deprecate this (not soft-deprecate)

I would be fine with deprecating it, saying which alternative to use, so that we can simply remove it in some later versions. I think Victor was the one who removed and restored it so we should ask him as well.

picnixz · 2024-12-14T01:31:31Z

should dec_from_long() be modified here? (To use the PyLong_Export API.) I would prefer to do this in a separate PR.

If you prefer doing it in a follow-up PR because you fear it would be too hard to review, then that's better. If the change is minimal, we can do it in this one (I didn't check the code to change).

skirpichev · 2024-12-14T02:03:21Z

If the change is minimal, we can do it in this one

You can estimate the changes by looking at the gmpy2 PR (referenced in the PEP): aleaxit/gmpy#495. In principle, I don't think this will complicate the review too much. On the other hand, the changes look logically independent. I would rather include the deprecation here.

picnixz · 2024-12-14T02:11:14Z

Let's change dec_from_long in another PR since the changes are independent (sorry it's 3 AM here and I don't have much energy).
For deprecating _PyLong_FromDigits, maybe it's better to make a separate PR so that we have a dedicated NEWS entry and re-use the issue that actually removed the private API (and not the issue that reverted the removal). WDYT? (we would also be able to change PyLong_Copy accordingly)

serhiy-storchaka · 2024-12-19T09:12:21Z

The precondition is still in the docs. It says MUST.

I've added asserts to ensure that no reallocation occurs.

I meant that there should be comments and asserts in libmpdec. Testing that no reallocation occurred after the call is too late: the program may already have crashed. And you cannot test for resizing if it occurs in-place, yet the memory management structures may already be broken and cause a crash later.

For now we cannot use this code. If the libmpdec developers give satisfying guarantees, we could.

skirpichev · 2024-12-19T10:01:04Z

The precondition is still in the docs. It says MUST.

The docs also say that no memory management (read: resize) happens in our scenario. Do we agree on this?

And you cannot test for resizing if it occurs in-place

I added the asserts mostly as documentation, to show that the qexport functions are used in a special way. I've now added a comment too.

If the libmpdec developers give satisfying guarantees

I think they already did this in the docs.

A resize occurs, e.g., here:

cpython/Modules/_decimal/libmpdec/mpdecimal.c

Line 8230 in 48c70b8

if (n >= wlen) {

That condition can be true only if wlen was underestimated. It can't happen if wlen was obtained by a call to mpd_sizeinbase, just as the docs say.

vstinner

LGTM. But I would prefer a second review, from @serhiy-storchaka or @picnixz for example :-)

vstinner · 2024-12-20T08:13:32Z

Would you mind sharing your benchmark code?

skirpichev · 2024-12-20T08:44:19Z

But I would prefer a second review, from @serhiy-storchaka

Currently Serhiy is clearly -1 on this. He thinks that this could be unsafe, because PyLongObject and libmpdec use different memory management functions. See e.g. #127925 (comment). My point was that the mpd_qexport*() functions should do no memory management at all with the given arguments (len coming from mpd_sizeinbase). Do you agree with this, or is the documentation not clear to you on this aspect?

If neither you nor @picnixz agrees on the above point, I should probably go to the mpdecimal mailing list for clarification.

Would you mind sharing your benchmark code?

Ah, this was in the "details" of the PR description :-)

vstinner · 2024-12-20T09:33:21Z

Currently Serhiy is clearly -1 on this. He thinks that this could be unsafe, because PyLongObject and libmpdec use different memory management functions. See e.g. #127925 (comment). My point was that the mpd_qexport*() functions should do no memory management at all with the given arguments (len coming from mpd_sizeinbase). Do you agree with this, or is the documentation not clear to you on this aspect?

I don't think that the current implementation passes a pointer to the start of a memory block allocated by libmpdec. I know the PEP 757 internals, and this change does basically exactly the same as the current code. It passes a pointer to PyLongObject.ob_digit. I'm fine with that.

vstinner · 2024-12-20T09:49:46Z

I ran the benchmark with CPU isolation, Python built with gcc -O3.

Benchmark	ref	change
export Decimal(1<<3000)	60.0 us	52.5 us: 1.14x faster
import int(Decimal(1<<7))	124 ns	91.0 ns: 1.37x faster
import int(Decimal(1<<38))	127 ns	90.9 ns: 1.39x faster
import int(Decimal(1<<300))	663 ns	733 ns: 1.10x slower
import int(Decimal(1<<3000))	53.4 us	61.3 us: 1.15x slower
Geometric mean	(ref)	1.07x faster

Benchmark hidden because not significant (3): export Decimal(1<<7), export Decimal(1<<38), export Decimal(1<<300)

Performance for integers up to 64-bit: neutral or up to 1.4x faster
Performance for large integers: export is faster (1.14x), import is slower (between 1.10x and 1.15x)

I didn't use PGO+LTO, maybe results are just pure noise. But it sounds unlikely that it's pure noise when the difference is at least 10% (1.10x).

skirpichev · 2024-12-20T09:57:01Z

and this change does basically exactly the same as the current code

No! The code in the main branch passes ob_digit, which is set to NULL. In that case the mpd_qexport_*() functions allocate memory and set this pointer. Then, in _PyLong_FromDigits(), this array is memcpy'ed to the digits of the PyLongObject instance and we call mpd_free(ob_digit).

The new code passes a non-NULL pointer to mpd_qexport_*(); it points to memory pre-allocated by PyLongWriter_Create(). It's a different case. From the mpdecimal docs:

size_t mpd_qexport_u32(uint32_t *rdata, size_t rlen, uint32_t rbase, const mpd_t *src, uint32_t *status);
...
If rdata is non-NULL, it MUST be allocated by one of libmpdec’s allocation functions and rlen MUST be correct. If necessary, the function will resize rdata. Resizing is slow and should not occur if rlen has been obtained by a call to mpd_sizeinbase. In case of an error the caller must free rdata.

So, from my understanding, the docs tell us that no memory management (resizing) happens in the mpd_qexport_*() call. Hence, the precondition "MUST be allocated by one of libmpdec’s allocation functions" is irrelevant in our case.

vstinner · 2024-12-20T10:20:07Z

Oh ok, thanks for the explanation.

skirpichev · 2024-12-20T10:42:52Z

Performance for large integers: export is faster (1.14x), import is slower (between 1.10x and 1.15x)

Hmm, this looks strange to me. I did my tests on a somewhat noisy system (just with "hands off keyboard"), but the difference from your results looks huge here.

picnixz

On my side, here are the benchmarks with a release build (no PGO). I would like to mention that my machine is quite powerful and that sys.int_info[:2] == (30, 4) for me as well.

Specs

OS: openSUSE Leap 15.5 x86_64
Host: ROG Strix G814JZ_G814JZ 1.0
Kernel: 5.14.21-150500.55.83-default
CPU: 13th Gen Intel i9-13980HX (32) @ 5.400GHz
GPU: NVIDIA GeForce RTX 4080 Max-Q / Mobile
GPU: Intel Raptor Lake-S UHD Graphics
Memory: 11782MiB / 31698MiB

Export (`int` to `Decimal`)

+-----------------+------------+-----------------------+
| Benchmark       | export-ref | export-pep            |
+=================+============+=======================+
| Decimal(1<<38)  | 74.8 ns    | 72.0 ns: 1.04x faster |
+-----------------+------------+-----------------------+
| Decimal(1<<300) | 153 ns     | 161 ns: 1.06x slower  |
+-----------------+------------+-----------------------+
| Geometric mean  | (ref)      | 1.00x slower          |
+-----------------+------------+-----------------------+

Benchmark hidden because not significant (2): Decimal(1<<7), Decimal(1<<3000)

Import (`Decimal` to `int`)

+-----------------------+------------+-----------------------+
| Benchmark             | import-ref | import-pep            |
+=======================+============+=======================+
| int(Decimal(1<<7))    | 61.8 ns    | 51.6 ns: 1.20x faster |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<38))   | 74.4 ns    | 52.5 ns: 1.42x faster |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<300))  | 138 ns     | 134 ns: 1.03x faster  |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<3000)) | 7.26 us    | 7.30 us: 1.01x slower |
+-----------------------+------------+-----------------------+
| Geometric mean        | (ref)      | 1.15x faster          |
+-----------------------+------------+-----------------------+

picnixz · 2024-12-22T13:27:35Z

Ah I missed your question about mpd. Wait a bit until I've written my answer sorry.

picnixz · 2024-12-22T13:41:11Z

My point was that mpd_qexport*() functions should do no memory management at all with given arguments (len coming from mpd_sizeinbase)

How I understand the docs you quoted:

If rdata is non-NULL, it MUST be allocated by one of libmpdec’s allocation functions and rlen MUST be correct

is that we should use mpd_mallocfunc or mpd_alloc. However, since mpd_mallocfunc is malloc by default, it doesn't change anything. Still, I would like confirmation from the libmpdec maintainers that mpd_qexport would not do anything if we correctly allocate the memory that is being used (namely, that mpd_qexport just uses the memory as is, and neither frees it afterwards nor steals the memory itself).

What I want to know is whether we are allowed to use another allocation function which allocates the correct amount of memory. Namely, that mpd_qexport is essentially equivalent to something like this:

def mpd_qexport(res, n, ...):
    if is_null(res):
        res, n = allocate(...)
    elif should_resize(res, n):
        n = do_resize_and_compute_new_n(res, n)
    export_to_res(res, n)

In our case, I expect that we're bypassing all checks. But I'd like to know if the export_to_res subroutine has assumptions on whether the destination has been allocated using a mpd function or if allocating using malloc is compatible.
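
The control flow sketched above can be turned into a small runnable Python model. It is only a toy: a Python list stands in for the digit array, `export_digits` is a hypothetical name, and nothing here reflects the actual libmpdec implementation.

```python
def export_digits(value, buf=None, bpd=30):
    """Toy model: write little-endian bpd-bit digits of value into buf.

    Returns (buf, resized). `resized` is True only when the caller's buffer
    was too small and had to grow, the case the discussion wants to rule out.
    """
    needed = max(1, -(-value.bit_length() // bpd))  # ceiling division
    resized = False
    if buf is None:
        buf = [0] * needed                          # library-side allocation
    elif len(buf) < needed:
        buf.extend([0] * (needed - len(buf)))       # library-side resize
        resized = True
    mask = (1 << bpd) - 1
    for i in range(len(buf)):
        buf[i] = (value >> (bpd * i)) & mask        # plain writes into the array
    return buf, resized


value = 1 << 300
prealloc = [0] * 11                # 301 bits at 30 bits per digit, rounded up
_, resized_ok = export_digits(value, prealloc)
_, resized_small = export_digits(value, [0] * 5)
print(resized_ok, resized_small)
```

In this model a correctly sized caller-owned buffer only ever receives element writes; the allocation and resize branches are reached only for a missing or undersized buffer.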

skirpichev · 2024-12-22T14:09:18Z

However, since mpd_mallocfunc is malloc by default

It's overridden in the decimal module to use PyMem_Malloc(). But _PyLong_New() uses PyObject_Malloc().

In our case, I expect that we're bypassing all checks.

Do you agree that this follows unambiguously from the documentation? I.e. there should be no memory allocation and no resize.

I'd like to know if the export_to_res subroutine has assumptions on whether the destination has been allocated using a mpd function

But then this function just exports data to a contiguous array. Why do you think it might be important how this array was allocated?

PS: Your benchmarks look closer to mine than to Victor's.

picnixz · 2024-12-22T14:15:25Z

Do you agree that this follows unambiguously from the documentation? I.e. there should be no memory allocation and no resize.

That's how I understand it.

Why do you think it might be important how this array was allocated?

Well... considering they said "MUST", maybe there are some internals that I'm not aware of.

PS: Your benchmarks look closer to mine than to Victor's.

Yes, sorry I forgot to say it. I also had "hands off" benchmarks rather than with CPU isolation so maybe there is something happening.

skirpichev · 2024-12-22T14:36:09Z

Well... considering they said "MUST", maybe there are some internals that I'm not aware of.

I can't imagine any sane interpretation of the mpdecimal internals where that might be essential.

OK, I'll have to post some stupid questions to the mpdecimal mailing list :-( Unless Serhiy changes his mind, this is a no-go for now.

I also had "hands off" benchmarks rather than with CPU isolation so maybe there is something happening.

Maybe it's just a typo;)

skirpichev · 2024-12-24T00:14:32Z

Here is the reply from @skrah on the mpdecimal-discuss mailing list:

mpd_sizeinbase() uses log10() from math.h for performance reasons. If log10() is IEEE compliant, the result should be sufficiently large. Resizing is for guarding against broken log10() implementations.

The current code in _decimal.c sets the libmpdec allocation functions to PyMem_Malloc() etc. So if longobject uses PyMem_Free() it is safe even when resizing occurs.

If the new API allows for allocators other than PyMem_Malloc(), it will rely on the IEEE compliance of log10().

Stefan Krah
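
The reply above can be illustrated with a rough Python sketch of a log10()-based size estimate. The formula in `estimated_bits` is an assumption chosen for illustration and is not taken from the libmpdec sources; it only shows that such an estimate errs on the large side when log10() behaves.

```python
import math


def estimated_bits(value):
    """Upper estimate of the bit length from the decimal digit count.

    Hypothetical formula for illustration; it is not copied from libmpdec.
    """
    ndigits = len(str(value))                  # decimal digits of value > 0
    return int(ndigits / math.log10(2)) + 1    # float arithmetic via log10()


for v in (1 << 7, 1 << 38, 1 << 300, 1 << 3000, 10**9):
    est, exact = estimated_bits(v), v.bit_length()
    print(f"{len(str(v))} decimal digits: estimate {est} bits, exact {exact} bits")
```

For these inputs the estimate is never below the exact bit length and exceeds it by only a few bits, which is consistent with a buffer sized from it being large enough without a resize.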

gh-102471: convert decimal module to use PyLongWriter API (PEP 757)

80f1a04

bedevere-app bot mentioned this pull request Dec 13, 2024

The C-API for Python to C integer conversion is, to be frank, a mess. #102471

Open

skirpichev requested review from vstinner and picnixz December 13, 2024 16:42

picnixz reviewed Dec 13, 2024

skirpichev and others added 2 commits December 14, 2024 03:40

+ news

c13b7d2

Apply suggestions from code review

589f926

Co-authored-by: Bénédikt Tran

skirpichev commented Dec 14, 2024

skirpichev marked this pull request as ready for review December 14, 2024 01:05

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev marked this pull request as draft December 14, 2024 05:07

bedevere-app bot removed the awaiting review label Dec 14, 2024

skirpichev added 2 commits December 14, 2024 08:42

Merge branch 'master' into long_export-decimal

f27adef

+ adapt dec_from_long() to use PEP 757

6669b89

skirpichev changed the title ~~gh-102471: convert decimal module to use PyLongWriter API (PEP 757)~~ gh-102471: convert decimal module to use import/export API for ints (PEP 757) Dec 14, 2024

skirpichev requested a review from picnixz December 14, 2024 06:53

skirpichev marked this pull request as ready for review December 14, 2024 07:10

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev mentioned this pull request Dec 14, 2024

gh-127937: deprecate _PyLong_FromDigits() function #127939

Draft

Merge branch 'master' into long_export-decimal

05ec274

vstinner reviewed Dec 16, 2024

skirpichev added 2 commits December 16, 2024 10:56

Don't use PyLong_GetNativeLayout()

6e46bc1

Address review:

7f0061f

* cleanup: forgotten PyLongWriter_Discard, pylong variable * clarify news

+ comment

90bafc1

skirpichev commented Dec 19, 2024

Apply suggestions from code review

c117956

Co-authored-by: Victor Stinner

vstinner reviewed Dec 19, 2024

address review:

7b97855

* move comment up * move assert down * remove redundant assert & restore nonzero assert

vstinner approved these changes Dec 20, 2024

bedevere-app bot added awaiting merge and removed awaiting review labels Dec 20, 2024

skirpichev requested a review from picnixz December 22, 2024 12:52

picnixz reviewed Dec 22, 2024

Update Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst

7e0e9a9

Co-authored-by: Bénédikt Tran

skirpichev added the DO-NOT-MERGE label Dec 22, 2024

skirpichev removed the DO-NOT-MERGE label Dec 24, 2024

skirpichev requested a review from serhiy-storchaka December 24, 2024 00:43
