RyanCodrai/turbovec: Trending on GitHub

Turbovec: Revolutionizing Vector Indexing with Near-Optimal Distortion Rate

Turbovec is a Rust vector index with Python bindings. It is built on Google Research's TurboQuant algorithm, a data-oblivious quantizer whose distortion comes within a small constant factor of the Shannon lower bound, with no codebook training and no separate train phase. This article covers how turbovec compresses vectors, how it searches them, and how the project is built.

Compressing High-Dimensional Vectors

Turbovec's core idea is to compress high-dimensional vectors while keeping their inner products accurate enough for search. Once its length is factored out, each vector is a direction on a high-dimensional hypersphere, and turbovec compresses these directions using a simple insight: after applying a random rotation, every coordinate follows a known distribution, regardless of the input data.
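To make the rotation insight concrete, here is a minimal sketch in plain Python. It is not turbovec's code: the helper names random_orthogonal and rotate are illustrative, the dimension is kept small, and the rotation is drawn by Gram-Schmidt on Gaussian rows, which is one standard construction that turbovec may or may not use. It rotates a deliberately spiky vector by many independent random rotations and inspects a single output coordinate. Turbovec itself applies one shared rotation; the repetition here only exists to expose the distribution.

```python
import math
import random

def random_orthogonal(d, rng):
    """Draw a d x d orthogonal matrix via Gram-Schmidt on Gaussian rows."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for r in rows:  # remove the components along earlier rows
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip the (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows

def rotate(matrix, v):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

rng = random.Random(0)
d = 16
spiky = [1.0] + [0.0] * (d - 1)  # all mass in one coordinate

# Same input, many independent rotations: record coordinate 3 each time.
samples = []
for _ in range(300):
    rotated = rotate(random_orthogonal(d, rng), spiky)
    samples.append(rotated[3])

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
rotated_norm_sq = sum(x * x for x in rotated)  # rotations preserve length

print(f"mean={mean:+.4f}  var={var:.4f}  1/d={1 / d:.4f}")
```

If the claim holds, the printed variance should land near 1/d even though the input had all of its mass in a single coordinate, and the rotated vector should still have unit length.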

The Turbovec Process

The turbovec process involves six key steps:

1. Normalization: Strip the length (norm) from each vector and store it as a single float. Every vector is now a unit direction on the hypersphere.
2. Random Rotation: Multiply all vectors by the same random orthogonal matrix. After rotation, each coordinate follows a Beta distribution that converges to Gaussian N(0, 1/d) in high dimensions, and the coordinates become nearly independent of one another.
3. Per-coordinate Calibration (TQ+): The Beta distribution from step 2 is an asymptotic idealization: at finite dimensions, individual coordinates drift from the canonical shape (most noticeably at low bit widths and with word-vector-style embeddings). TQ+ fits two scalars per coordinate, a shift and a scale, during the first add, mapping each coordinate's empirical 5% and 95% quantiles onto those of the canonical Beta marginal.
4. Lloyd-Max Scalar Quantization: Since the distribution is known, the optimal way to bucket each coordinate can be precomputed. 2-bit quantization uses 4 buckets; 4-bit uses 16. The Lloyd-Max algorithm finds the bucket boundaries and centroids that minimize mean squared error.
5. Bit-Packing: Each coordinate is now a small integer (0-3 for 2-bit, 0-15 for 4-bit). Pack these tightly into bytes. A 1536-dim vector goes from 6,144 bytes (FP32, 4 bytes per coordinate) to 384 bytes (2 bits per coordinate), a 16x compression.
6. Length-Renormalized Scoring: Scalar quantization systematically underestimates inner products because the reconstructed unit direction is a little shorter than the original. At encode time, turbovec computes one scalar per vector, the inner product ⟨u, x̂⟩ of the rotated unit vector u with its own centroid reconstruction x̂, and stores ||v|| / ⟨u, x̂⟩ alongside each compressed vector.
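The encode path described above can be sketched in plain Python. This is an illustration under stated assumptions, not turbovec's implementation: the function names (lloyd_max, encode, pack2, unpack2, estimate) are hypothetical, the Lloyd-Max tables are fitted on Gaussian samples rather than the exact Beta marginal, the byte layout (four 2-bit codes per byte, low bits first) is an assumed one, and rotation and TQ+ calibration are skipped by drawing the input as a random direction so that it stands in for an already-rotated vector.

```python
import bisect
import math
import random

def lloyd_max(samples, levels, iters=30):
    """1-D Lloyd-Max: alternate midpoint boundaries and bucket-mean centroids."""
    samples = sorted(samples)
    n = len(samples)
    # Start from evenly spaced sample quantiles.
    cents = [samples[(2 * k + 1) * n // (2 * levels)] for k in range(levels)]
    for _ in range(iters):
        bounds = [(a + b) / 2 for a, b in zip(cents, cents[1:])]
        sums, counts = [0.0] * levels, [0] * levels
        for s in samples:
            k = bisect.bisect_left(bounds, s)  # bucket index of s
            sums[k] += s
            counts[k] += 1
        cents = [sums[k] / counts[k] if counts[k] else cents[k]
                 for k in range(levels)]
    bounds = [(a + b) / 2 for a, b in zip(cents, cents[1:])]
    return bounds, cents

def encode(v, bounds, cents):
    """Return (2-bit codes, per-vector scale ||v|| / <u, x_hat>)."""
    norm = math.sqrt(sum(x * x for x in v))
    u = [x / norm for x in v]                      # unit direction
    codes = [bisect.bisect_left(bounds, x) for x in u]
    x_hat = [cents[c] for c in codes]              # centroid reconstruction
    shrink = sum(a * b for a, b in zip(u, x_hat))  # <u, x_hat>
    return codes, norm / shrink

def pack2(codes):
    """Pack 2-bit codes, four per byte, lowest bits first (assumed layout)."""
    out = bytearray((len(codes) + 3) // 4)
    for i, c in enumerate(codes):
        out[i // 4] |= c << (2 * (i % 4))
    return bytes(out)

def unpack2(packed, n):
    """Inverse of pack2 for the first n codes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 3 for i in range(n)]

rng = random.Random(0)
d = 1536
sigma = 1 / math.sqrt(d)  # Gaussian approximation N(0, 1/d)
bounds, cents = lloyd_max([rng.gauss(0, sigma) for _ in range(20000)], levels=4)

v = [3.0 * rng.gauss(0, 1) for _ in range(d)]  # stand-in for a rotated vector
codes, scale = encode(v, bounds, cents)
packed = pack2(codes)
print(len(packed), "bytes packed vs", 4 * d, "bytes in FP32")

def estimate(query):
    """Renormalized inner-product estimate against the packed vector."""
    x_hat = [cents[c] for c in unpack2(packed, d)]
    return scale * sum(a * b for a, b in zip(x_hat, query))

norm_sq = sum(x * x for x in v)
print("self score:", estimate(v), " true ||v||^2:", norm_sq)
```

One property worth noting about the stored scalar: scoring a vector against itself gives (||v|| / ⟨u, x̂⟩) · ⟨x̂, v⟩ = ||v||², so the renormalization removes the shrinkage exactly in the direction of the original vector. For other queries it is a correction on average, not an exact one.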

Search and Retrieval

Instead of decompressing every database vector, turbovec rotates the query once into the same domain and scores directly against the codebook values. The scoring kernel uses SIMD intrinsics (NEON on ARM, AVX-512BW on modern x86 with an AVX2 fallback) with nibble-split lookup tables for maximum throughput.
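The idea of scoring against codebook values through lookup tables can be shown with a scalar sketch. Everything here is an assumption made for illustration: the centroid values are placeholders, the names build_luts, to_nibbles, and score are hypothetical, and fusing two 2-bit codes into one 4-bit nibble that indexes a 16-entry table is one plausible reading of "nibble-split", modeled on the FAISS fast-scan approach. A real kernel would hold quantized tables in SIMD registers and use byte-shuffle instructions instead of Python list indexing.

```python
import random

# Placeholder 2-bit centroids (illustrative values, not turbovec's tables).
CENTROIDS = [-1.51, -0.45, 0.45, 1.51]

def build_luts(query):
    """One 16-entry table per coordinate pair, indexed by a 4-bit nibble."""
    luts = []
    for j in range(0, len(query), 2):
        q0, q1 = query[j], query[j + 1]
        luts.append([q0 * CENTROIDS[nib & 3] + q1 * CENTROIDS[nib >> 2]
                     for nib in range(16)])
    return luts

def to_nibbles(codes):
    """Fuse consecutive 2-bit codes into nibbles (first code in the low bits)."""
    return [codes[j] | (codes[j + 1] << 2) for j in range(0, len(codes), 2)]

def score(luts, nibbles):
    """Sum one table lookup per nibble; no vector is ever decompressed."""
    return sum(lut[n] for lut, n in zip(luts, nibbles))

rng = random.Random(1)
d = 64
query = [rng.gauss(0, 1) for _ in range(d)]  # assumed already rotated
database = [[rng.randrange(4) for _ in range(d)] for _ in range(5)]

luts = build_luts(query)  # built once per query, reused for every vector
scores = [score(luts, to_nibbles(codes)) for codes in database]

# Reference: dequantize each vector, then take the dot product.
reference = [sum(q * CENTROIDS[c] for q, c in zip(query, codes))
             for codes in database]

for s, r in zip(scores, reference):
    print(f"lut={s:+.6f}  dequantized={r:+.6f}")
```

The table-based score and the dequantize-then-dot score agree up to floating-point rounding, while the table path does d/2 lookups per database vector instead of d multiplications. The per-vector scale factor from encoding would be applied to each score afterwards.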

Building and Running Benchmarks

Turbovec is written in Rust and exposed to Python through bindings. The source is available on GitHub at RyanCodrai/turbovec, and the repository includes a benchmark suite for measuring its performance.

Conclusion

Turbovec combines a near-optimal distortion rate with SIMD-accelerated scoring and needs no training phase. It compresses high-dimensional vectors by up to 16x at 2 bits per coordinate while correcting the bias that quantization introduces into inner products, which makes it a candidate for embedding-heavy workloads ranging from natural language processing to computer vision.

References

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate (ICLR 2026)
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search (SIGMOD 2024)
FAISS: Fast accumulation of PQ and AQ codes