-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-127937: convert decimal module to use import/export API for ints (PEP 757) #127925
base: main
Are you sure you want to change the base?
Conversation
Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).
Not needed I think, unless you want to indicate the performance gain (it's always nice to know that something is faster). I did report the improvements of |
Co-authored-by: Bénédikt Tran <[email protected]>
Modules/_decimal/_decimal.c
Outdated
n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd; | ||
PyLongWriter *writer = PyLongWriter_Create(mpd_isnegative(x), n, | ||
(void**)&ob_digit); | ||
/* mpd_sizeinbase can overestimate size by 1 digit, set it to zero. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, this looks as a bug in the mpdecimal. C.f. the GNU GMP, the mpz_sizeinbase docs says: "If base is a power of 2, the result is always exact".
I've updated the pr descriptions with my research. So far, I've found just one use case. At least, I think we should deprecate (not soft) this. This apparently affects not so much projects and there is now a public alternative. @picnixz, what do you think? |
I would be fine with deprecating it, saying which alternative to use, so that we can simply remove it in some later versions. I think Victor was the one who removed and restored it so we should ask him as well. |
If you prefer doing it in a follow-up PR because you fear it would be too hard to review, then it's better. If the change is minimal, we can do it this one (I didn't check the code to change) |
You can estimate them looking on the gmpy2 pr (referenced in the PEP): aleaxit/gmpy#495 In principle, I don't think that this will complicate review to much. On another hand, changes looks logically independent. I would rather include here deprecation. |
|
This comment was marked as outdated.
This comment was marked as outdated.
Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst
Outdated
Show resolved
Hide resolved
* cleanup: forgotten PyLongWriter_Discard, pylong variable * clarify news
The precondition is still in the docs. It says MUST.
I meant that there should be comments and asserts in the libmpdec. Testing that no reallocation occurs after the call is too late -- the program can already be crashed. And you cannot test for resizing if it occurs in-place, but the memory management structure can already be broken, and crash later. For now we cannot use this code. If the libmpdec developers give satisfying guarantees, we could. |
Docs also says that no memory management (read: resize) happens in our scenario. Do we agree on this?
Added asserts rather for documentation, to show that qexport functions are used in a special way. Now comment added too.
I think they already did this in docs. Resize occur e.g. here: cpython/Modules/_decimal/libmpdec/mpdecimal.c Line 8230 in 48c70b8
That condition can be true only if wlen was underestimated. It can't happen if wlen was obtained by a call to mpd_sizeinbase, just as docs says.
|
Co-authored-by: Victor Stinner <[email protected]>
* move comment up * move assert down * remove redundant assert & restore nonzero assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But I would prefer a second review, from @serhiy-storchaka or @picnixz for example :-)
Would you mind to share your benchmark code? |
Currently Serhiy clearly -1 on this. He think that we could be unsafe, because PyLongObject's and libmpdec use different memory management functions. See e.g. this. My point was that mpd_qexport*() functions should do no memory management at all with given arguments (len coming from mpd_sizeinbase). Do you agree with this or documentation is not clear for you on this aspect? If neither you or @picnixz agree on above point - probably I should go to mpdecimal mailing list for a clarification.
Ah, this was in "details" of the pr description:-) |
I don't think that the current implementation pass a pointer to the start of a memory block allocated by libmpdec. I know PEP 757 internals, and this change does basically exactly the same than the current code. It pass a pointer to |
I ran the benchmark with CPU isolation, Python built with
Benchmark hidden because not significant (3): export Decimal(1<<7), export Decimal(1<<38), export Decimal(1<<300)
I didn't use PGO+LTO, maybe results are just pure noise. But it sounds unlikely that it's pure noise when the difference is at least 10% (1.10x). |
No! Code in the main branch pass ob_digit, which set to New code pass non-
So, from my understanding, docs says us that no memory management (resizing) happens in |
Oh ok, thanks for the explanation. |
Hmm, this looks strange for me. I did tests on a somewhat noisy system (just with "hands off keyboard"), but the difference with your case looks huge here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my side, here are the benchmarks with a release build (no PGO). I would like to mention that my machine is quite powerful and that sys.int_info[:2] == (30, 4)
for me as well.
Specs
OS: openSUSE Leap 15.5 x86_64
Host: ROG Strix G814JZ_G814JZ 1.0
Kernel: 5.14.21-150500.55.83-default
CPU: 13th Gen Intel i9-13980HX (32) @ 5.400GHz
GPU: NVIDIA GeForce RTX 4080 Max-Q / Mobile
GPU: Intel Raptor Lake-S UHD Graphics
Memory: 11782MiB / 31698MiB
Export (int
to Decimal
)
+-----------------+------------+-----------------------+
| Benchmark | export-ref | export-pep |
+=================+============+=======================+
| Decimal(1<<38) | 74.8 ns | 72.0 ns: 1.04x faster |
+-----------------+------------+-----------------------+
| Decimal(1<<300) | 153 ns | 161 ns: 1.06x slower |
+-----------------+------------+-----------------------+
| Geometric mean | (ref) | 1.00x slower |
+-----------------+------------+-----------------------+
Benchmark hidden because not significant (2): Decimal(1<<7), Decimal(1<<3000)
Import (Decimal
to int
)
+-----------------------+------------+-----------------------+
| Benchmark | import-ref | import-pep |
+=======================+============+=======================+
| int(Decimal(1<<7)) | 61.8 ns | 51.6 ns: 1.20x faster |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<38)) | 74.4 ns | 52.5 ns: 1.42x faster |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<300)) | 138 ns | 134 ns: 1.03x faster |
+-----------------------+------------+-----------------------+
| int(Decimal(1<<3000)) | 7.26 us | 7.30 us: 1.01x slower |
+-----------------------+------------+-----------------------+
| Geometric mean | (ref) | 1.15x faster |
+-----------------------+------------+-----------------------+
Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst
Outdated
Show resolved
Hide resolved
Ah I missed your question about mpd. Wait a bit until I've written my answer sorry. |
How I understand the docs you quoted:
is that we should use What I want to know is that we are allowed to use another allocation function which allocates the correct amount of memory. Namely that def mpd_qexport(res, n, ...):
if is_null(res):
res, n = allocate(...)
else:
if should_resize(res, n):
n = do_resize_and_compute_new_n(res, n)
export_to_res(res, n) In our case, I expect that we're bypassing all checks. But I'd like to know if the |
It's overridden in the decimal module to ise PyMem_Malloc(). But _PyLong_New() uses PyObject_Malloc().
Do you agree that follows unambiguously from the documentation? I.e. there should be no memory allocation and no resize.
But then this function just export data to a contiguous array. Why do you think it might be important how this array was allocated? PS: Your benchmarks looks more close to mine than to Victor's. |
…aov.rst Co-authored-by: Bénédikt Tran <[email protected]>
That's how I understand it.
Well... considering they said "MUST", maybe there are some internals that I'm not aware of.
Yes, sorry I forgot to say it. I also had "hands off" benchmarks rather than with CPU isolation so maybe there is something happening. |
I can't imagine some sane interpretation of mpdecimal internals where that might be essential. Ok, I'll have to post in mpdecimal mail lists some stupid questions :-( Unless Serhiy changed his mind, this is no-go for now.
Maybe it's just a typo;) |
Here is reply from @skrah in the mpdecimal-discuss mail list:
|
For export (int instance → Decimal)
For import (Decimal instance → int)
>>> sys.int_info[:2] (30, 4)
benchmarks code