gh-129695: Optimize _PyFloat_FromDouble_ConsumeInputs()
#129697
+43
−18
Add optimization for `_PyFloat_FromDouble_ConsumeInputs()` for the free-threaded build. Implement free-threaded versions of `_Py_DECREF_SPECIALIZED` and `_Py_DECREF_NO_DEALLOC` as well.

pyperformance results vs merge base
Based on a microbenchmark I wrote, this seems to give a small speedup for binary operations on floats, if there are temporary results (refcnt == 1) that can be re-used. The `_Py_DECREF_SPECIALIZED` and `_Py_DECREF_NO_DEALLOC` changes seem to help as well, but only a little (hard to measure accurately).

The longobject.c file also uses `_Py_DECREF_SPECIALIZED`, and I guess that might be the reason the pidigits benchmark got faster. I'm not sure why the pyflate benchmark got slower. Perhaps it is due to `_Py_DECREF_SPECIALIZED` taking the slow path too often.

_PyFloat_FromDouble_ConsumeInputs
#129695
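
For readers unfamiliar with the idea of re-using temporaries with refcnt == 1, here is a minimal single-threaded sketch in plain C. It is not the CPython implementation: `ToyFloat`, `toy_new`, `toy_decref`, `toy_from_double_consume_inputs`, and `toy_demo_allocs` are hypothetical names invented for illustration, and the free-threaded build needs a different uniqueness check than the plain `refcnt == 1` comparison assumed here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a float object; NOT CPython's PyFloatObject. */
typedef struct {
    long refcnt;
    double value;
} ToyFloat;

static int toy_allocs = 0;  /* counts allocations so the saving is visible */

static ToyFloat *toy_new(double v) {
    ToyFloat *f = malloc(sizeof *f);
    if (f == NULL) {
        abort();
    }
    f->refcnt = 1;
    f->value = v;
    toy_allocs++;
    return f;
}

static void toy_decref(ToyFloat *f) {
    if (--f->refcnt == 0) {
        free(f);
    }
}

/* Consumes one reference to each input and returns a new reference that
 * holds `value`. If an input is uniquely referenced (a temporary), its
 * storage is overwritten and returned instead of allocating. */
static ToyFloat *toy_from_double_consume_inputs(ToyFloat *left,
                                                ToyFloat *right,
                                                double value) {
    if (left->refcnt == 1) {
        left->value = value;   /* re-use the left temporary */
        toy_decref(right);
        return left;
    }
    if (right->refcnt == 1) {
        right->value = value;  /* re-use the right temporary */
        toy_decref(left);
        return right;
    }
    ToyFloat *res = toy_new(value);  /* both inputs are shared: allocate */
    toy_decref(left);
    toy_decref(right);
    return res;
}

/* Evaluates (a + b) * c and returns the number of allocations made,
 * or -1 if the intermediate sum was not re-used for the product. */
static int toy_demo_allocs(void) {
    toy_allocs = 0;
    ToyFloat *a = toy_new(1.5), *b = toy_new(2.0), *c = toy_new(4.0);

    /* Model an evaluation stack holding its own references to a and b. */
    a->refcnt++;
    b->refcnt++;
    ToyFloat *tmp = toy_from_double_consume_inputs(a, b, a->value + b->value);

    c->refcnt++;
    ToyFloat *res = toy_from_double_consume_inputs(tmp, c, tmp->value * c->value);

    int reused = (res == tmp) && (res->value == 14.0);
    int allocs = toy_allocs;

    toy_decref(res);
    toy_decref(a);
    toy_decref(b);
    toy_decref(c);
    return reused ? allocs : -1;
}
```

In this toy model, `a + b` has to allocate because both inputs are still referenced by their variables, while the following multiplication overwrites the temporary sum in place, so the whole expression costs one allocation beyond the three operands.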