Add a utility for performing randomized rotations of vectors. #36949
Merged
This models the effect of multiplying a vector with a (pseudo-)random rotation matrix (or its inverse/transpose), but does so in O(n lg n) instead of O(n^2). Both forward and inverse (transposed) rotations are supported.
We use the fast Walsh-Hadamard transform (FWHT) as the workhorse algorithm to implement rotations. The FWHT is a deterministic transform and not a "true" matrix multiplication, but we can get results as if we had used a random matrix multiplication if we pseudo-randomly flip the signs of the input vector prior to rotation.
Fast Hadamard transforms are only defined for vectors whose length is a power of two, so for vectors where this property does not hold we use multiple overlapping sub-transforms to achieve a statistically similar result. Note that this requires more computations (up to 2x) than if the vector lengths were powers of two.
The partially overlapping transforms are inspired by the RaBitQ rotation algorithm, but we have a different way of "bridging the gap" across the partial overlaps; see the comment for the `Rotator` class for details.
The quality of the rotator algorithm has been estimated by sampling the _expected_ Gaussian vs _actual_ post-rotation coordinate distributions for several scenarios (sparse vectors, shifted mean normal distribution), and then computing the Kullback-Leibler divergence between the two distributions. The intuition and assumption is that a divergence close to zero means that the rotation has the desired properties for our quantization purposes.
havardpe
approved these changes
May 20, 2026
@havardpe and/or @arnej27959 please review.
This aims to implement the "exercise left for the reader"/"draw the rest of the owl" $y \gets \Pi x$ and $x \gets \Pi^\top y$ parts found in most quantization papers (where $\Pi$ is a random rotation matrix, $\Pi^\top$ its inverse/transpose, and $x$, $y$ are the pre- and post-rotation vectors, respectively).
We use the fast Walsh-Hadamard transform (FWHT) as the workhorse algorithm to implement rotations. The FWHT is a deterministic transform and not a "true" matrix multiplication, but we can get results as if we had used a random matrix multiplication if we pseudo-randomly flip the signs of the input vector prior to rotation.
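As a rough illustration of the sign-flip-then-FWHT idea (not the code from this PR), here is a minimal Python sketch for power-of-two lengths. The names `fwht`, `random_signs`, `rotate` and `rotate_inverse` are hypothetical, and seeding Python's `random.Random` is only an assumed stand-in for whatever pseudo-random source the real implementation uses.

```python
import math
import random


def fwht(values):
    """Return the orthonormal fast Walsh-Hadamard transform of `values`.

    The length must be a power of two. Runs in O(n log n).
    """
    n = len(values)
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly pass: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # orthonormal scaling: the transform is its own inverse
    return [v * scale for v in out]


def random_signs(n, seed):
    """Hypothetical helper: a reproducible +1/-1 vector derived from `seed`."""
    rng = random.Random(seed)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n)]


def rotate(x, seed):
    """Forward rotation y = H D x, with D the diagonal sign matrix."""
    signs = random_signs(len(x), seed)
    return fwht([s * v for s, v in zip(signs, x)])


def rotate_inverse(y, seed):
    """Inverse rotation x = D H y (H and D are both symmetric and orthogonal)."""
    signs = random_signs(len(y), seed)
    return [s * v for s, v in zip(signs, fwht(y))]
```

With this sketch, rotating the one-hot vector `[1, 0, 0, 0, 0, 0, 0, 0]` spreads its energy evenly: every output coordinate has magnitude 1/sqrt(8), and `rotate_inverse` with the same seed recovers the input up to floating-point error.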
Fast Hadamard transforms are only defined for vectors whose length is a power of two, so for vectors where this property does not hold we use multiple overlapping sub-transforms to achieve a statistically similar result. Note that this requires more computations (up to 2x) than if the vector lengths were powers of two.
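To make the overlap idea concrete, here is a generic Python sketch that covers a length-n vector with two windows of size p, the largest power of two not exceeding n: one anchored at the head and one at the tail. This is an assumption-laden illustration only. It is neither the strategy used by the `Rotator` class nor the exact RaBitQ scheme, it omits the sign flips, and the names `fwht_inplace` and `overlapped_transform` are hypothetical.

```python
import math


def fwht_inplace(buf, start, length):
    """Orthonormal FWHT of buf[start:start+length], in place.

    `length` must be a power of two.
    """
    h = 1
    while h < length:
        for block in range(start, start + length, 2 * h):
            for i in range(block, block + h):
                a, b = buf[i], buf[i + h]
                buf[i], buf[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(length)
    for i in range(start, start + length):
        buf[i] *= scale


def overlapped_transform(x, inverse=False):
    """Apply a head window and a tail window of power-of-two size p <= n.

    Each window is orthonormal and its own inverse, so the inverse of the
    whole operation applies the same windows in reverse order.
    """
    n = len(x)
    if n == 0:
        raise ValueError("empty vector")
    p = 1 << (n.bit_length() - 1)  # largest power of two <= n
    out = list(x)
    if p == n:
        fwht_inplace(out, 0, n)  # already a power of two: one transform suffices
        return out
    offsets = [0, n - p]  # head window first, then tail window
    if inverse:
        offsets.reverse()
    for off in offsets:
        fwht_inplace(out, off, p)
    return out
```

Because p > n/2, the two windows always overlap, and the cost is two transforms of size p, which is where a factor of up to 2x comes from in this sketch. A single head-then-tail pass does not mix everything, though: coordinates covered only by the head window never see inputs covered only by the tail window, which is why some additional bridging across the overlap is needed in practice.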
The partially overlapping transforms are inspired by the RaBitQ rotation algorithm, but we use a different strategy for "bridging the gap" (well, technically bridging the overlap) across the sub-transforms; see the comment for the `Rotator` class for details.
The quality of the rotator algorithm has been estimated by sampling the expected Gaussian vs actual post-rotation coordinate distributions for several scenarios (sparse vectors, shifted mean normal distribution), and then computing the Kullback-Leibler divergence between the two distributions. The intuition and assumption™ is that a divergence close to zero means that the rotation has the desired properties for our quantization purposes.
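The kind of check described above can be sketched as a histogram-based KL estimate in Python. This is not the evaluation code from this PR. It assumes the expected distribution is a zero-mean Gaussian with a known standard deviation `sigma` (for a unit-norm input of dimension n, 1/sqrt(n) is the usual choice), and the function names, bin count and clipping range are illustrative choices.

```python
import math
import random


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean normal distribution with standard deviation `sigma`."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def kl_to_gaussian(samples, sigma, bins=32, span=4.0):
    """Estimate KL(empirical || N(0, sigma^2)) from a histogram of `samples`.

    `bins` equal-width bins cover [-span*sigma, +span*sigma]; two extra
    bins catch everything that falls in the tails.
    """
    lo, hi = -span * sigma, span * sigma
    width = (hi - lo) / bins
    counts = [0] * (bins + 2)
    for s in samples:
        if s < lo:
            counts[0] += 1
        elif s >= hi:
            counts[-1] += 1
        else:
            counts[1 + min(bins - 1, int((s - lo) / width))] += 1
    # Gaussian probability mass of each bin, including both tail bins.
    edges = [lo + i * width for i in range(bins + 1)]
    cdf = [0.0] + [gaussian_cdf(e, sigma) for e in edges] + [1.0]
    total = len(samples)
    kl = 0.0
    for i, count in enumerate(counts):
        if count == 0:
            continue  # 0 * log(0 / q) contributes nothing
        p = count / total
        q = cdf[i + 1] - cdf[i]
        kl += p * math.log(p / q)
    return kl
```

Fed with the coordinates of a well-rotated vector, this estimate should be close to zero, while an unrotated sparse vector (almost all mass in one bin) gives a large value. The estimate has a small positive bias that shrinks as the number of samples grows, so thresholds need to account for the sample size.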
This implementation is not yet optimized for speed, and we may change things around based on further experiments. But it should be a functionally complete starting point.