Metadata-Version: 2.4
Name: muvera
Version: 0.2.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Dist: numpy>=1.20.0
Requires-Dist: cffi>=1.0.0
Requires-Dist: pytest>=7.0.0 ; extra == 'dev'
Requires-Dist: black>=22.0.0 ; extra == 'dev'
Requires-Dist: mypy>=1.0.0 ; extra == 'dev'
Provides-Extra: dev
License-File: LICENSE
Summary: An unofficial Rust implementation of MuVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
Keywords: muvera,multi-vector,embedding,fde,retrieval
Home-Page: https://github.com/NewBornRustacean/muvera-rs
Author: NewBornRustacean <huiseomkim@gmail.com>
Author-email: NewBornRustacean <huiseomkim@gmail.com>
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/NewBornRustacean/muvera-rs
Project-URL: Repository, https://github.com/NewBornRustacean/muvera-rs
Project-URL: Documentation, https://docs.rs/muvera-rs
Project-URL: Issues, https://github.com/NewBornRustacean/muvera-rs/issues

# muvera-rs

An unofficial Rust implementation of **MuVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings**.

[![Crates.io](https://img.shields.io/crates/v/muvera-rs)](https://crates.io/crates/muvera-rs)
[![Documentation](https://docs.rs/muvera-rs/badge.svg)](https://docs.rs/muvera-rs)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

## Disclaimer

- This is an unofficial implementation and is not affiliated with Google Research or the original authors.
- Some parts of this README were generated by AI.

## Overview

MuVERA is a breakthrough algorithm that enables efficient multi-vector similarity search by reducing it to single-vector search. This implementation provides the core Fixed Dimensional Encoding (FDE) functionality that transforms variable-length token embeddings into fixed-dimensional vectors while preserving similarity relationships.

### The Algorithm

Multi-vector models (like ColBERT) produce multiple embeddings per query/document token, achieving superior retrieval performance compared to single-vector models. However, multi-vector retrieval is computationally expensive due to the complex Chamfer similarity scoring.

MuVERA solves this by:

1. **Fixed Dimensional Encoding (FDE)**: Transforms sets of token embeddings into fixed-dimensional vectors using randomized hyperplanes and hashing
2. **Asymmetric Encoding**: Uses sum aggregation for queries and average aggregation for documents
3. **Single-Vector MIPS**: Enables the use of highly-optimized Maximum Inner Product Search algorithms
4. **Theoretical Guarantees**: Provides ε-approximation guarantees for Chamfer similarity

The key insight is that the dot product of FDEs approximates the true multi-vector Chamfer similarity, allowing retrieval systems to leverage decades of optimization in single-vector search while maintaining multi-vector quality.

## Features

- ✅ **Core FDE Implementation**: Complete Fixed Dimensional Encoding with configurable buckets
- ✅ **Batch Processing**: Efficient parallel encoding for large datasets
- ✅ **Type Safety**: Generic implementation supporting `f32` and `f64`
- ✅ **Memory Efficient**: Uses `ndarray` for optimized vector operations
- ✅ **Deterministic**: Reproducible results with seed-based randomization
- ✅ **Comprehensive Testing**: Extensive test suite covering edge cases

## Installation

### Rust

Add to your `Cargo.toml`:

```toml
[dependencies]
muvera-rs = "0.1.0"
```

### Python

Install from PyPI:

```bash
pip install muvera
```

Or install from source:

```bash
pip install maturin
git clone https://github.com/NewBornRustacean/muvera-rs.git
cd muvera-rs
maturin develop --features python-bindings
```

## Testing the Python Bindings

After building with maturin, you can test the Python bindings interactively:

```python
import numpy as np
import muvera

embeddings = np.random.randn(32, 768).astype(np.float32)
result = muvera.encode_fde(embeddings, "mean")
print(result)
```

Or save the above as a script and run with `python my_test.py`.

## Usage

### Python Example

```python
import numpy as np
import muvera

# Create token embeddings (num_tokens, embedding_dim)
embeddings = np.random.randn(32, 768).astype(np.float32)

# Encode with mean aggregation
buckets = 3
result = muvera.encode_fde(embeddings, buckets, "mean")
print(f"FDE result shape: {result.shape}")

# Encode with max aggregation
result_max = muvera.encode_fde(embeddings, buckets, "max")
print(f"FDE max result shape: {result_max.shape}")
```

### Rust Example

```rust
use muvera_rs::encoder::fde_encoder::{FDEEncoder, FDEEncoding};
use muvera_rs::types::Aggregation;
use ndarray::{Array2, ArrayView2};

fn main() {
    // Create encoder with 1024 buckets for 768-dimensional embeddings
    let encoder = FDEEncoder::new(128, 768, 42);
    
    // Example token embeddings (num_tokens, embedding_dim)
    let tokens = Array2::from_shape_vec((32, 768), vec![0.1; 32 * 768]).unwrap();
    
    // Encode query (sum aggregation)
    let query_fde = encoder.encode_query(tokens.view());
    
    // Encode document (average aggregation)
    let doc_fde = encoder.encode_doc(tokens.view());
}
```

### Batch Processing

```rust
use muvera_rs::encoder::fde_encoder::{FDEEncoder, FDEEncoding};
use ndarray::{Array3, ArrayView3};

fn encode_batch() {
    let encoder = FDEEncoder::new(128, 768, 42);
    
    // Batch of token embeddings (batch_size, num_tokens, embedding_dim)
    let batch_tokens = Array3::from_shape_vec((100, 32, 768), vec![0.1; 100 * 32 * 768]).unwrap();
    
    // Encode entire batch
    let batch_fdes = encoder.encode_query_batch(batch_tokens.view());
    
    println!("Encoded batch shape: {:?}", batch_fdes.shape());
}
```

### Custom Aggregation

```rust
use muvera_rs::encoder::fde_encoder::{FDEEncoder, FDEEncoding};
use muvera_rs::types::Aggregation;

fn custom_encoding() {
    let encoder = FDEEncoder::new(128, 768, 42);
    let tokens = Array2::from_shape_vec((32, 768), vec![0.1; 32 * 768]).unwrap();
    
    // Use custom aggregation mode
    let fde_sum = encoder.encode(tokens.view(), Aggregation::Sum);
    let fde_avg = encoder.encode(tokens.view(), Aggregation::Avg);
}
```

## API Reference

### `FDEEncoder<T>`

The main encoder struct that implements the Fixed Dimensional Encoding algorithm.

#### Constructor
```rust
pub fn new(buckets: usize, dim: usize, seed: u64) -> Self
```

- `buckets`: Number of hash buckets (hyperplanes) 
- `dim`: Dimensionality of input token embeddings
- `seed`: Random seed for reproducible hyperplane initialization

#### Methods

- `encode(tokens, mode)`: Encode single multi-vector
- `batch_encode(tokens, mode)`: Encode batch of multi-vectors (parallel)
- `encode_query(tokens)`: Encode query with sum aggregation
- `encode_doc(tokens)`: Encode document with average aggregation
- `encode_query_batch(tokens)`: Batch encode queries
- `encode_doc_batch(tokens)`: Batch encode documents

### `Aggregation`

Enum defining aggregation modes:
- `Sum`: Sum all tokens in each bucket
- `Avg`: Average all tokens in each bucket

## Performance Considerations

- **Buckets**: More buckets = better approximation but higher memory usage
- **Batch Size**: Use batch encoding for multiple vectors to leverage parallelism
- **Memory**: FDE vectors are `buckets * dim` in size
- **Precision**: `f32` is typically sufficient and faster than `f64`

## References

### Original Paper
- **MuVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings**  
  Laxman Dhulipala, Jason Lee, Majid Hadian, Rajesh Jayaram, Vahab Mirrokni  
  [arXiv:2405.19504](https://arxiv.org/pdf/2405.19504)

### Google Research Blog
- **MuVERA: Making Multi-Vector Retrieval as Fast as Single-Vector Search**  
  [Google Research Blog](https://research.google/blog/muvera-making-multi-vector-retrieval-as-fast-as-single-vector-search/)

### Original Implementation
- **Google Graph Mining Repository**  
  [github.com/google/graph-mining/tree/main/sketching/point_cloud](https://github.com/google/graph-mining/tree/main/sketching/point_cloud)

## Future Work

### Planned Features

1. **Benchmark Suite**
   - Integration with BEIR datasets (MS MARCO, etc.)
   - Memory usage and compression analysis

2. **Advanced Features**
   - **Support BLAS**
   - **Product Quantization (PQ)**: 32x compression with minimal quality loss
   - **Final Projections**: Dimensionality reduction techniques
   

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.


