Metadata-Version: 2.4
Name: langchain_endee
Version: 0.1.0b2
Summary: High Speed Vector Database for Faster and Efficient ANN Searches with LangChain
Home-page: https://endee.io
Author: Endee Labs
Author-email: support@endee.io
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: langchain>=0.3.25
Requires-Dist: langchain-core>=0.3.59
Requires-Dist: endee>=0.1.13
Requires-Dist: numpy
Requires-Dist: fastembed>=0.3.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Endee LangChain Integration

This package provides an integration between [Endee](https://endee.io) (a high-performance vector database) and [LangChain](https://www.langchain.com/), allowing you to use Endee as a vector store backend for LangChain applications.

## Features

- **🔍 Hybrid Search**: Combines dense (semantic) + sparse (keyword) embeddings for superior retrieval accuracy
  - **SPLADE (default)**: Neural sparse model for highest accuracy
  - **BM25**: Classical sparse model for speed
- **Multiple Distance Metrics**: Support for cosine, L2, and inner product distance metrics
- **Configurable Precision**: Choose between different quantization levels using the `Precision` enum for optimal performance/accuracy trade-offs
- **Metadata Filtering**: Filter search results based on metadata using the query operators `$eq`, `$in`, and `$range`
- **Automatic Text Truncation**: Smart text handling based on embedding model type
- **High Performance**: Optimized for speed and efficiency with the HNSW algorithm
- **Batch Operations**: Efficient batch processing for large-scale vector operations
- **Production Ready**: Comprehensive test suite (26 tests, 100% passing), examples, and documentation

## Installation

```bash
pip install langchain_endee
```

This installs the `langchain_endee` package and its dependencies (`endee`, `langchain`, `langchain-core`, `numpy`, and `fastembed`).

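### For Hybrid Search (Recommended)

`fastembed` (used for sparse embeddings) is declared as a dependency of this package, so the command above normally installs it. To install or upgrade it on its own:

```bash
pip install fastembed  # For sparse embeddings
```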

## 📚 Documentation

- **[Complete Guide](VECTORSTORE_COMPLETE_GUIDE.md)** - Everything about EndeeVectorStore (1,700+ lines)
- **[Hybrid Search Guide](HYBRID_SEARCH_README.md)** - Dense vs Sparse vs Hybrid explained
- **[Examples](examples/README.md)** - Complete RAG implementation & examples
- **[Test Suite](tests/test_complete_rag_pipeline.py)** - 26 comprehensive tests
- **[Summary](SUMMARY.md)** - Project overview and quick reference

## Quick Start

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

# Initialize embedding model
embedding_model = OpenAIEmbeddings()

# Initialize the vector store
vector_store = EndeeVectorStore(
    embedding=embedding_model,
    api_token="your-api-token",  # Optional for local deployment
    index_name="my_langchain_vectors",
    dimension=1536,
    space_type="cosine",
    precision=Precision.INT8D  # Use Precision enum
)

# Add documents
texts = [
    "Endee is a high-performance vector database",
    "LangChain is a framework for developing applications powered by language models",
    "Vector databases store vector embeddings and enable fast similarity search"
]

metadatas = [
    {"source": "product", "category": "database"},
    {"source": "github", "category": "framework"},
    {"source": "textbook", "category": "education"}
]

# Add texts to the vector store
ids = vector_store.add_texts(texts=texts, metadatas=metadatas)

# Search similar documents
results = vector_store.similarity_search("How do vector databases work?", k=2)

# Process results
for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
    print()
```

## 🔥 Hybrid Search (Recommended for Production)

Hybrid search combines **dense embeddings** (semantic search) with **sparse embeddings** (keyword search) for superior retrieval accuracy. This is now the **recommended approach** for production RAG applications.

### Why Hybrid Search?

| Search Type | Strengths | Weaknesses | Use Case |
|-------------|-----------|------------|----------|
| **Dense Only** | Semantic understanding, synonyms | May miss exact keywords | Conceptual queries |
| **Sparse Only** | Exact keyword matching | No semantic understanding | Keyword-based search |
| **Hybrid** ⭐ | Best of both worlds | Slightly slower (acceptable) | **Production RAG** |

### Quick Start with Hybrid Search

```python
from langchain_endee import EndeeVectorStore, FastEmbedSparse, RetrievalMode
from langchain_huggingface import HuggingFaceEmbeddings

# Dense embeddings (semantic)
dense_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Sparse embeddings (keyword) - SPLADE is now the default!
sparse_embeddings = FastEmbedSparse()  # Uses prithivida/Splade_PP_en_v1 by default

# Create hybrid vector store
vector_store = EndeeVectorStore(
    embedding=dense_embeddings,
    sparse_embedding=sparse_embeddings,
    retrieval_mode=RetrievalMode.HYBRID,
    index_name="hybrid_index",
    dimension=384,
    api_token=None  # Local deployment
)

# Use normally - hybrid search is automatic!
texts = ["Python is a programming language", "Machine learning uses neural networks"]
vector_store.add_texts(texts)

results = vector_store.similarity_search("programming with Python", k=2)
# Returns results using both semantic AND keyword matching!
```

### Sparse Embedding Models

**Default: SPLADE (Highest Accuracy) ⭐**
```python
from langchain_endee import FastEmbedSparse

# SPLADE - Neural sparse model (default)
sparse = FastEmbedSparse()  # prithivida/Splade_PP_en_v1
# or explicitly:
sparse = FastEmbedSparse(model_name="prithivida/Splade_PP_en_v1", batch_size=128)
```

**Alternative: BM25 (Faster)**
```python
# BM25 - Classical sparse model (faster, slightly less accurate)
sparse = FastEmbedSparse(model_name="Qdrant/bm25", batch_size=256)
```

### When to Use What?

| Scenario | Recommendation |
|----------|---------------|
| **Production RAG** | Hybrid with SPLADE (default) ⭐ |
| **Speed Critical** | Hybrid with BM25 or Dense-only |
| **Maximum Accuracy** | Hybrid with SPLADE + FLOAT16 precision |
| **Research/Prototyping** | Dense-only (simpler setup) |
| **Large Scale (>1M docs)** | Hybrid with BM25 + INT8D precision |

**Learn more:** See [Hybrid Search Guide](HYBRID_SEARCH_README.md) for detailed comparisons and best practices.

## Understanding Precision Levels

Endee supports several precision (quantization) levels that let you trade off memory usage, search speed, and accuracy. Use the `Precision` enum from the `endee` package for type safety:

```python
from endee import Precision

# Available precision levels
Precision.BINARY2   # 1-bit binary quantization
Precision.INT8D     # 8-bit integer quantization (default)
Precision.INT16D    # 16-bit integer quantization  
Precision.FLOAT16   # 16-bit floating point
Precision.FLOAT32   # 32-bit floating point
```

| Precision | Quantization | Data Type | Memory per Vector | Search Speed | Best For |
|-----------|--------------|-----------|-------------------|--------------|----------|
| `Precision.BINARY2` | 1-bit | Binary | Smallest (~96.9% less) | Fastest | Extreme compression, large-scale deployments |
| `Precision.INT8D` | 8-bit | INT8 | Small (~75% less) | Very Fast | **Default** - great for most use cases |
| `Precision.INT16D` | 16-bit | INT16 | Medium (~50% less) | Fast | Balanced integer precision |
| `Precision.FLOAT16` | 16-bit | FP16 | Medium (~50% less) | Fast | Balanced float precision |
| `Precision.FLOAT32` | 32-bit | FP32 | Largest (baseline) | Slower | Maximum accuracy requirements |

**Memory Usage Example:** For a 1536-dimensional vector:
- `Precision.BINARY2`: ~0.2 KB per vector (extreme compression)
- `Precision.INT8D`: ~1.5 KB per vector (default)
- `Precision.INT16D` / `Precision.FLOAT16`: ~3 KB per vector
- `Precision.FLOAT32`: ~6 KB per vector
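
The per-vector figures above follow from `dimension × bits per dimension / 8`. The sketch below reproduces that arithmetic in plain Python. It counts raw vector storage only; HNSW graph links, metadata, and any server-side overhead are not modeled, and the helper names (`vector_bytes`, `index_mib`) are hypothetical, not part of the `endee` API.

```python
# Bits per dimension for each precision level, taken from the table above.
BITS_PER_DIMENSION = {
    "BINARY2": 1,
    "INT8D": 8,
    "INT16D": 16,
    "FLOAT16": 16,
    "FLOAT32": 32,
}


def vector_bytes(dimension: int, precision: str) -> float:
    """Raw storage for one vector in bytes (no index overhead)."""
    return dimension * BITS_PER_DIMENSION[precision] / 8


def index_mib(num_vectors: int, dimension: int, precision: str) -> float:
    """Raw vector storage for a whole index in MiB (hypothetical helper)."""
    return num_vectors * vector_bytes(dimension, precision) / (1024 * 1024)


# Print the per-vector size for a 1536-dimensional embedding.
for name in BITS_PER_DIMENSION:
    print(f"{name:8s} {vector_bytes(1536, name) / 1024:6.2f} KB per vector")
```

Multiplying by the expected number of vectors (for example `index_mib(1_000_000, 1536, "INT8D")`) gives a rough lower bound for sizing a deployment.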

### Example: Choosing Precision Level

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

# Default precision - balanced performance (recommended for most cases)
default_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="default_index",
    dimension=1536,
    precision=Precision.INT8D  # Default - 8-bit integer quantization
)

# High accuracy with 16-bit precision
high_accuracy_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="high_accuracy_index",
    dimension=1536,
    precision=Precision.FLOAT16  # 16-bit floating point
)

# Maximum accuracy with full 32-bit precision
max_accuracy_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="max_accuracy_index",
    dimension=1536,
    precision=Precision.FLOAT32  # 32-bit floating point
)

# Extreme compression for very large datasets
compressed_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="compressed_index",
    dimension=1536,
    precision=Precision.BINARY2  # 1-bit binary quantization
)
```

## Local Deployment

Endee can be run locally without requiring an API token. If you have a local Endee server running on `http://127.0.0.1:8080`, you can initialize the vector store without an API token:

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings

# Initialize without API token for local deployment
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token=None,  # No token needed for local deployment
    index_name="local_index",
    dimension=1536
)
```

## Creating Vector Stores

### From Texts

Create a vector store directly from a list of texts:

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

texts = ["foo", "bar", "baz"]
metadatas = [{"key": "val1"}, {"key": "val2"}, {"key": "val3"}]

vector_store = EndeeVectorStore.from_texts(
    texts=texts,
    embedding=OpenAIEmbeddings(),
    metadatas=metadatas,
    api_token="your-api-token",
    index_name="my-index",
    dimension=1536,
    space_type="cosine",
    precision=Precision.INT8D
)
```

### From Documents

Create a vector store from LangChain documents:

```python
from langchain_core.documents import Document
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

documents = [
    Document(
        page_content="Endee is a high-performance vector database",
        metadata={"source": "product", "category": "database"}
    ),
    Document(
        page_content="LangChain is a framework for developing applications",
        metadata={"source": "github", "category": "framework"}
    )
]

vector_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="doc-index",
    dimension=1536,
    precision=Precision.INT8D
)
```

### From Existing Index

Connect to an existing Endee index:

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings

vector_store = EndeeVectorStore.from_existing_index(
    index_name="existing-index",
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token"
)
```

## Filtering Search Results

You can filter search results based on metadata using flexible query operators:

```python
# Search with a filter
query = "Tell me about Endee"
filter_dict = [{"category": {"$eq": "database"}}]
 
filtered_results = vector_store.similarity_search(
    query=query,
    k=3,
    filter=filter_dict
)

print(f"Query: '{query}' with filter: {filter_dict}")
print(f"\nFound {len(filtered_results)} filtered results:")
for i, doc in enumerate(filtered_results):
    print(f"\nResult {i+1}:")
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
```

### Supported Filter Operators

- **`$eq`**: Matches records with metadata values equal to a specified value  
  ```python
  {"category": {"$eq": "database"}}
  ```

- **`$in`**: Matches records with metadata values that are in a specified array  
  ```python
  {"category": {"$in": ["database", "framework"]}}
  ```

- **`$range`**: Matches numeric metadata fields within a given range [min, max]  
  ```python
  {"score": {"$range": [70, 95]}}
  ```

### Multiple Filters (AND Logic)

Multiple filter conditions are combined with logical AND:

```python
# Both conditions must be true
filter_dict = [
    {"category": {"$eq": "database"}},
    {"difficulty": {"$in": ["intermediate", "advanced"]}}
]

results = vector_store.similarity_search(
    query="vector databases",
    k=5,
    filter=filter_dict
)
```
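
To make the filter semantics concrete, here is a small client-side sketch of how a filter list could be evaluated against one metadata dictionary. It is an illustration only, not Endee's server implementation. Treating `$range` bounds as inclusive and treating a missing field as a non-match are assumptions, and the function names are hypothetical.

```python
from typing import Any


def matches_condition(metadata: dict[str, Any], condition: dict[str, dict[str, Any]]) -> bool:
    """Check one {"field": {"$op": value}} condition against a metadata dict."""
    for field, spec in condition.items():
        if field not in metadata:
            return False  # assumption: a missing field never matches
        actual = metadata[field]
        for op, expected in spec.items():
            if op == "$eq":
                ok = actual == expected
            elif op == "$in":
                ok = actual in expected
            elif op == "$range":
                low, high = expected
                ok = low <= actual <= high  # assumption: both bounds inclusive
            else:
                raise ValueError(f"Unsupported operator: {op}")
            if not ok:
                return False
    return True


def matches_filter(metadata: dict[str, Any], filter_list: list[dict[str, dict[str, Any]]]) -> bool:
    """All conditions in the list must hold (logical AND)."""
    return all(matches_condition(metadata, cond) for cond in filter_list)


doc_meta = {"category": "database", "difficulty": "advanced", "score": 88}
print(matches_filter(doc_meta, [
    {"category": {"$eq": "database"}},
    {"difficulty": {"$in": ["intermediate", "advanced"]}},
    {"score": {"$range": [70, 95]}},
]))
```

A helper like this can be handy in unit tests to predict which documents a filter should return before querying the index.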

## Advanced Search Operations

### Similarity Search with Scores

Get similarity scores along with documents:

```python
results = vector_store.similarity_search_with_score(
    query="machine learning",
    k=3
)

for doc, score in results:
    print(f"Score: {score:.4f}")
    print(f"Content: {doc.page_content}")
    print()
```

### Search by Vector

Search using a pre-computed embedding vector:

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
query_vector = embeddings.embed_query("What is a vector database?")

results = vector_store.similarity_search_by_vector(
    embedding=query_vector,
    k=5
)
```

### Search by Vector with Scores

```python
results = vector_store.similarity_search_by_vector_with_score(
    embedding=query_vector,
    k=5,
    filter=[{"category": {"$eq": "database"}}]
)

for doc, score in results:
    print(f"Score: {score:.4f} - {doc.page_content}")
```

### Custom Search Parameters

Adjust the `ef` parameter for search quality:

```python
# Higher ef = better recall but slower search
results = vector_store.similarity_search(
    query="vector search",
    k=10,
    ef=256  # Default is 128, max is 1024
)
```

### Filter Tuning

When using filtered queries, two optional parameters let you tune the trade-off between search speed and recall:

#### `prefilter_cardinality_threshold`

Controls when the search strategy switches from **HNSW filtered search** (fast, graph-based) to **brute-force prefiltering** (exhaustive scan on the matched subset).

- Default: `None` (server uses `10_000`)
- Range: `1_000` – `1_000_000`
- Raise the threshold → prefiltering kicks in more often
- Lower the threshold → favors HNSW graph search

```python
# Only prefilter when filter matches ≤5,000 vectors
results = vector_store.similarity_search(
    query="rare topic",
    k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5_000,
)
```
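
The decision described above can be sketched as a single comparison. This is a simplified model of the documented behavior (prefilter when the filter matches at most `threshold` vectors), not the server's actual code; `choose_strategy` is a hypothetical name.

```python
def choose_strategy(matched_count: int, threshold: int = 10_000) -> str:
    """Return which search path the documented rule would pick.

    matched_count: number of vectors that satisfy the metadata filter.
    threshold: prefilter_cardinality_threshold (documented range 1,000 to 1,000,000).
    """
    if not 1_000 <= threshold <= 1_000_000:
        raise ValueError("threshold must be between 1_000 and 1_000_000")
    # Small match sets are scanned exhaustively; larger ones use the HNSW graph.
    return "prefilter" if matched_count <= threshold else "hnsw"


print(choose_strategy(4_000, threshold=5_000))   # small match set
print(choose_strategy(50_000, threshold=5_000))  # large match set
```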

#### `filter_boost_percentage`

Expands the internal HNSW candidate pool by this percentage when a filter is active, compensating for candidates discarded by the filter.

- Default: `None` (server uses `0` — no boost)
- Range: `0` – `100`
- `20` → fetch 20% more candidates before applying the filter
- `100` → double the candidate pool

```python
# Fetch 30% more candidates to compensate for aggressive filtering
results = vector_store.similarity_search(
    query="public content",
    k=10,
    filter=[{"visibility": {"$eq": "public"}}],
    filter_boost_percentage=30,
)
```
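
As a rough worked example of the percentages listed above, the sketch below computes the enlarged candidate pool. The base pool size used here (128) and the choice to round up are assumptions for illustration; the source does not state how the server derives or rounds the pool size, and `boosted_pool_size` is a hypothetical helper.

```python
def boosted_pool_size(base_candidates: int, filter_boost_percentage: int = 0) -> int:
    """Candidate pool size after applying the boost, rounded up (assumption)."""
    if not 0 <= filter_boost_percentage <= 100:
        raise ValueError("filter_boost_percentage must be between 0 and 100")
    # Integer ceiling of base * (100 + pct) / 100, avoiding float rounding.
    return (base_candidates * (100 + filter_boost_percentage) + 99) // 100


for pct in (0, 20, 30, 100):
    print(f"boost {pct:3d}% -> {boosted_pool_size(128, pct)} candidates")
```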

#### Using Both Together

```python
results = vector_store.similarity_search(
    query="rare public item",
    k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5_000,  # brute-force for small match sets
    filter_boost_percentage=25,             # boost candidates for HNSW search
)
```

> **Tip:** Start with defaults. If filtered queries return fewer results than expected, increase `filter_boost_percentage`. If filtered queries are slow on selective filters, lower `prefilter_cardinality_threshold`.

These parameters are available on all four search methods:
- `similarity_search()`
- `similarity_search_with_score()`
- `similarity_search_by_vector()`
- `similarity_search_by_vector_with_score()`

## Using with LangChain

Endee can be used anywhere a LangChain vector store is needed:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_endee import EndeeVectorStore
from endee import Precision

# Initialize your vector store
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="rag-index",
    dimension=1536,
    precision=Precision.INT8D
)

# Create a retriever
retriever = vector_store.as_retriever(
    search_kwargs={"k": 3}
)

# Create the RAG chain
model = ChatOpenAI()
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based on the provided context:
    
    Context: {context}
    Question: {question}
    """
)

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

# Use the chain
response = rag_chain.invoke("What is Endee?")
print(response)
```
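
In the chain above, the retriever's output (a list of documents) is passed to the prompt as-is, so the `{context}` slot receives the list's string representation. A common LangChain pattern is to join the page contents first. The helper below is a minimal sketch of that step; it relies only on each document having a `page_content` attribute, and the `SimpleNamespace` objects are stand-ins used so the snippet runs without any external packages.

```python
from types import SimpleNamespace


def format_docs(docs) -> str:
    """Join the text of retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


# Stand-in objects with the same attribute a LangChain Document exposes.
sample_docs = [
    SimpleNamespace(page_content="first passage"),
    SimpleNamespace(page_content="second passage"),
]
print(format_docs(sample_docs))
```

To use it in the chain, replace the first step with `{"context": retriever | format_docs, "question": RunnablePassthrough()}`; LangChain wraps the plain function in a runnable when it is piped.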

### Retriever with Filters

```python
# Create retriever with metadata filters
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 5,
        "filter": [{"category": {"$eq": "database"}}]
    }
)

results = retriever.invoke("vector databases")
```

## Document Management

### Adding Documents

```python
from langchain_core.documents import Document

documents = [
    Document(page_content="text 1", metadata={"source": "doc1"}),
    Document(page_content="text 2", metadata={"source": "doc2"})
]

# Add documents and get their IDs
ids = vector_store.add_documents(documents)
```

### Deleting Documents

Delete by IDs:

```python
# Delete specific documents by ID
vector_store.delete(ids=["id1", "id2", "id3"])
```

Delete by filter:

```python
# Delete all documents matching a filter
vector_store.delete(filter=[{"status": {"$eq": "expired"}}])
```

### Retrieving Documents by ID

```python
# Get specific documents by their IDs
docs = vector_store.get_by_ids(["id1", "id2"])

for doc in docs:
    print(doc.page_content)
    print(doc.metadata)
```

## Automatic Text Truncation

The vector store automatically detects your embedding model type and truncates text to fit within token limits:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_endee import EndeeVectorStore

# Auto-detects OpenAI embeddings (8191 token limit)
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="auto-truncate",
    dimension=1536
)

# Or set custom limit
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="custom-truncate",
    dimension=1536,
    max_text_length=1000  # Custom token limit
)
```

**Supported embedding models:**
- OpenAI: 8191 tokens
- Cohere: 512 tokens  
- HuggingFace: 512 tokens
- Default: 512 tokens
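
The limits above amount to a lookup with an explicit override. The sketch below mirrors that logic under stated assumptions: the function names are hypothetical, and whitespace-separated words stand in for real tokens, which is only a crude approximation of how an embedding model's tokenizer counts.

```python
from typing import Optional

# Token limits per embedding model type, as listed above.
TOKEN_LIMITS = {"openai": 8191, "cohere": 512, "huggingface": 512, "default": 512}


def resolve_max_text_length(
    embedding_model_type: Optional[str] = None,
    max_text_length: Optional[int] = None,
) -> int:
    """An explicit max_text_length wins; otherwise fall back to the table."""
    if max_text_length is not None:
        return max_text_length
    return TOKEN_LIMITS.get(embedding_model_type or "default", TOKEN_LIMITS["default"])


def truncate_words(text: str, limit: int) -> str:
    """Keep at most `limit` whitespace-separated words (stand-in for tokens)."""
    words = text.split()
    if len(words) <= limit:
        return text
    return " ".join(words[:limit])


limit = resolve_max_text_length("cohere")
print(limit, len(truncate_words("word " * 2000, limit).split()))
```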

## Configuration Options

### EndeeVectorStore Constructor Parameters

- **`embedding`** (required): LangChain embedding function
- **`api_token`**: Endee API token (optional for local deployment)
- **`index_name`** (required): Name of the Endee index
- **`dimension`**: Vector dimension (required when creating a new index)
- **`space_type`**: Distance metric - `"cosine"` (default), `"l2"`, or `"ip"`
- **`precision`**: Precision level using `Precision` enum - `Precision.INT8D` (default), `Precision.BINARY2`, `Precision.INT16D`, `Precision.FLOAT16`, or `Precision.FLOAT32`
- **`M`**: HNSW graph connectivity parameter (default: 16)
- **`ef_con`**: HNSW construction parameter (default: 128)
- **`max_text_length`**: Maximum text length in tokens (auto-detected if not provided)
- **`embedding_model_type`**: Type of embedding model - `"openai"`, `"cohere"`, `"huggingface"`, or `"default"` (auto-detected if not provided)
- **`force_recreate`**: Delete and recreate index if it exists (default: False)
- **`validate_index_config`**: Validate index configuration on initialization (default: True)
- **`content_payload_key`**: Key for storing text content (default: "text")
- **`metadata_payload_key`**: Key for storing metadata (default: "metadata")
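
For reference, the three `space_type` options correspond to the standard formulas below, written in plain Python. This is a textbook sketch only: whether Endee reports cosine as a similarity or a distance, or returns squared L2, is not stated here, so treat the exact score values returned by the server as implementation-defined.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """"ip": sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """"l2": Euclidean distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """"cosine": inner product of the vectors after length normalization."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm


u, v = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(inner_product(u, v), l2_distance(u, v), cosine_similarity(u, v))
```

For embeddings that are already unit-normalized, cosine similarity and inner product give the same ranking.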

### Example with All Options

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="full-config-index",
    dimension=1536,
    space_type="cosine",
    precision=Precision.INT8D,
    M=16,
    ef_con=128,
    max_text_length=8191,
    embedding_model_type="openai",
    force_recreate=False,
    validate_index_config=True,
    content_payload_key="text",
    metadata_payload_key="metadata"
)
```

## Performance Tips

### 1. Choose the Right Precision

- **`Precision.INT8D`**: Default - excellent balance of speed, memory, and accuracy for most use cases
- **`Precision.FLOAT16` / `Precision.INT16D`**: Better accuracy with moderate memory increase
- **`Precision.FLOAT32`**: Maximum accuracy but highest memory usage
- **`Precision.BINARY2`**: Extreme compression for very large datasets where lower accuracy is acceptable

### 2. Batch Operations

Use larger batch sizes for better performance when adding many documents:

```python
# Add texts in batches
ids = vector_store.add_texts(
    texts=large_text_list,
    metadatas=metadata_list,
    batch_size=1000,           # Endee batch size (max 1000)
    embedding_chunk_size=100   # Embedding generation batch size
)
```
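
If you prefer to control batching yourself (for example, to log progress or retry a failed batch), a small generator is enough. The sketch below is plain Python and does not call Endee; `batched` is a hypothetical helper, and the upper bound of 1000 comes from the comment in the example above.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 1000) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"document {i}" for i in range(2500)]
for batch_number, batch in enumerate(batched(texts), start=1):
    # In real use, call vector_store.add_texts(texts=list(batch)) here.
    print(f"batch {batch_number}: {len(batch)} texts")
```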

### 3. Use Metadata Filtering

Pre-filter your search space using metadata to improve both speed and relevance:

```python
results = vector_store.similarity_search(
    query="your query",
    k=10,
    filter=[{"category": {"$eq": "relevant_category"}}]
)
```

### 4. Tune Search Parameters

Adjust the `ef` parameter based on your accuracy/speed requirements:

```python
# Faster but potentially lower recall
results = vector_store.similarity_search(query="test", k=10, ef=64)

# Slower but potentially higher recall
results = vector_store.similarity_search(query="test", k=10, ef=256)
```

Use `prefilter_cardinality_threshold` and `filter_boost_percentage` to tune filtered queries:

```python
# Improve recall when filters are aggressive
results = vector_store.similarity_search(
    query="test",
    k=10,
    filter=[{"category": {"$eq": "rare"}}],
    prefilter_cardinality_threshold=5_000,  # range: 1,000–1,000,000
    filter_boost_percentage=25,             # range: 0–100
)
```

### 5. Index Management

Use `force_recreate` when you need a clean slate:

```python
# Recreate index with new configuration
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="my-index",
    dimension=1536,
    force_recreate=True  # Delete existing index and create new one
)
```

## API Reference

### Constructor and Class Methods

- **`__init__(...)`**: Initialize with an existing Endee index or with the parameters needed to create a new one
- **`from_texts(...)`**: Create a vector store from a list of texts
- **`from_documents(...)`**: Create a vector store from LangChain documents
- **`from_existing_index(...)`**: Connect to an existing Endee index

### Instance Methods

- **`add_texts(...)`**: Add text documents with optional metadata
- **`add_documents(...)`**: Add LangChain Document objects
- **`similarity_search(...)`**: Search for similar documents
- **`similarity_search_with_score(...)`**: Search and return similarity scores
- **`similarity_search_by_vector(...)`**: Search using an embedding vector
- **`similarity_search_by_vector_with_score(...)`**: Search by vector with scores
- **`delete(...)`**: Delete documents by ID or filter
- **`get_by_ids(...)`**: Retrieve documents by their IDs
- **`as_retriever(...)`**: Create a LangChain retriever from the vector store

### Properties

- **`embeddings`**: Get the embeddings instance being used
- **`client`**: Get the Endee client instance
- **`index`**: Get the Endee index instance

## Examples

### Example 1: RAG System with Filters

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from endee import Precision

# Create vector store
vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="docs",
    dimension=1536,
    precision=Precision.INT8D
)

# Create retriever with category filter
retriever = vector_store.as_retriever(
    search_kwargs={
        "k": 5,
        "filter": [{"category": {"$eq": "technical"}}]
    }
)

# Build RAG chain
prompt = ChatPromptTemplate.from_template(
    "Answer based on context:\n\nContext: {context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI()
    | StrOutputParser()
)

# Use the chain
answer = chain.invoke("How does vector search work?")
print(answer)
```

### Example 2: Document Management

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from endee import Precision

vector_store = EndeeVectorStore(
    embedding=OpenAIEmbeddings(),
    api_token="your-api-token",
    index_name="documents",
    dimension=1536,
    precision=Precision.INT8D
)

# Add documents
docs = [
    Document(page_content="AI is transforming industries", metadata={"category": "ai"}),
    Document(page_content="Python is a popular programming language", metadata={"category": "programming"})
]

ids = vector_store.add_documents(docs)
print(f"Added documents with IDs: {ids}")

# Search with filter
results = vector_store.similarity_search(
    "programming languages",
    k=5,
    filter=[{"category": {"$eq": "programming"}}]
)

# Delete by filter
vector_store.delete(filter=[{"category": {"$eq": "outdated"}}])
```

### Example 3: Multiple Precision Levels

```python
from langchain_endee import EndeeVectorStore
from langchain_openai import OpenAIEmbeddings
from endee import Precision

embeddings = OpenAIEmbeddings()

# Fast, memory-efficient index
fast_store = EndeeVectorStore(
    embedding=embeddings,
    api_token="your-api-token",
    index_name="fast-index",
    dimension=1536,
    precision=Precision.INT8D
)

# High accuracy index
accurate_store = EndeeVectorStore(
    embedding=embeddings,
    api_token="your-api-token",
    index_name="accurate-index",
    dimension=1536,
    precision=Precision.FLOAT32
)

# Extreme compression index
compressed_store = EndeeVectorStore(
    embedding=embeddings,
    api_token="your-api-token",
    index_name="compressed-index",
    dimension=1536,
    precision=Precision.BINARY2
)
```

## Updated Features

**Precision Parameter**: `precision` now takes a `Precision` enum member instead of a string:

```python
from endee import Precision

# NEW (Precision enum)
precision=Precision.INT8D     # ✅ Recommended
precision=Precision.FLOAT16   # ✅ Recommended
precision=Precision.INT16D    # ✅ Recommended
precision=Precision.FLOAT32   # ✅ Recommended
precision=Precision.BINARY2   # ✅ Recommended
```

**Method Name**: `from_params()` has been removed; use the regular constructor instead:

```python
# OLD
vector_store = EndeeVectorStore.from_params(...)  # ❌ Method removed

# NEW  
vector_store = EndeeVectorStore(...)  # ✅ Use constructor
```

## Troubleshooting

### Common Issues

**1. "Index not found" error**
- Ensure the index name is correct
- Check that the index exists using `client.list_indexes()`
- If creating a new index, ensure the `dimension` parameter is provided

**2. Dimension mismatch error**
- Verify that the `dimension` parameter matches your embedding model's output
- Common dimensions: OpenAI (1536), Cohere (1024), sentence-transformers (384, 768)

**3. Local deployment not working**
- Ensure the Endee server is running on `http://127.0.0.1:8080`
- Check server health endpoint
- Set `api_token=None` explicitly for local deployment

**4. Text truncation warnings**
- Text is automatically truncated to fit embedding model limits
- Adjust `max_text_length` parameter if needed
- Consider chunking very long documents before adding them
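
For the chunking suggestion in item 4, a character-based splitter with overlap is often a reasonable starting point. The sketch below is a minimal, dependency-free version; the default sizes are arbitrary assumptions, characters are only a proxy for tokens, and `chunk_text` is a hypothetical helper rather than part of this package.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


long_text = "Endee stores vectors. " * 200
pieces = chunk_text(long_text, chunk_size=500, overlap=50)
print(len(pieces), len(pieces[0]))
```

Each chunk can then be passed to `add_texts` as its own entry, ideally with metadata that records which source document it came from.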

## License

MIT License
