Skip to content

Conversation

@randombit
Copy link
Contributor

In various elliptic curve point multiplication algorithms, an accumulator element is created (initialized as the identity element) and then several points are added to it.

However the first point addition is always completely avoidable; it's equvalent to just an assignment.

Restructure the multiplication algorithms such that the accumulator is instead initialized with the value created in the first loop of the multiplication algorithm.

In various elliptic curve point multiplication algorithms, an
accumulator element is created (initialized as the identity element)
and then several points are added to it.

However the first point addition is always completely avoidable; it's
equvalent to just an assignment.

Restructure the multiplication algorithms such that the accumulator is
instead initialized with the value created in the first loop of the
multiplication algorithm.
@randombit randombit requested a review from a team as a code owner December 4, 2025 18:26
@github-actions github-actions bot added the perf label Dec 4, 2025
@randombit
Copy link
Contributor Author

This is a pretty minor optimization overall; at best it saves a single point addition which even for G2 costs just ~ 2 microseconds. But it's a pretty easy change and, imo across all of the scalar multiplications ever performed by the system kind of adds up.

@randombit randombit changed the title perf(crypto): CRP-2965 Avoid adding into the identity element perf(crypto): CRP-2965 Avoid adding into the identity element in BLS12-381 multiplication Dec 4, 2025
@randombit randombit changed the title perf(crypto): CRP-2965 Avoid adding into the identity element in BLS12-381 multiplication perf(crypto): CRP-2965 Avoid adding into the identity element during BLS12-381 multiplication Dec 4, 2025
passed to ct_select - this results in the identity element being
returned since the index is out of range.
*/
accum += Self::ct_select(&tbl, (w as usize).wrapping_sub(1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we have three implementations of ct_select. One of them does the wrapping_sub(1) internally, the other two don't, so we have to do the wrapping_sub(1) here. I guess this kind of makes sense, because in the former case the table is defined elsewhere, while in the latter the table is part of the same function so we know the table is shifted.

However, I wonder if we should aim to have a consistent implementation across all types, if possible. The actual concern I have is related to the use of macros, which makes it a bit hard (for me) to track what types we can instantiate them on. For example, could I have declare_windowed_scalar_mul_ops_for on an affine type? IIUC this would then use the ct_select with internal wrapping_sub, which may not result in the correct behaviour?

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that is kind of funky (and the macro proliferation admittedly does not make it easier!), I'll see if I can clean that up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants