Metadata-Version: 2.4
Name: crewai-endee
Version: 0.1.1
Summary: CrewAI tools with Endee vector database support
Home-page: https://endee.io/
Author: Endee Labs
Author-email: support@endee.io
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: endee>=0.1.12
Requires-Dist: crewai>=1.5.0
Requires-Dist: crewai-tools>=1.5.0
Requires-Dist: fastembed>=0.3.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Endee CrewAI Integration

**High-Performance Vector Database Integration for CrewAI Agents**

`Endee CrewAI` provides a seamless integration between [Endee](https://endee.io/)—a high-performance vector database—and [CrewAI](https://crewai.com). It enables scalable Short-Term and Entity Memory, allowing agents to store, search, and reuse knowledge with semantic and hybrid retrieval.

---

## Table of Contents

* [Features](#features)
* [Installation](#installation)
* [Environment Configuration](#environment-configuration)
* [Core Concepts](#core-concepts)
  * [Precision Levels](#precision-levels)
  * [Space Types (Distance Metrics)](#space-types-distance-metrics)
* [Initialization](#initialization)
  * [Standard Initialization](#standard-initialization)
  * [Hybrid Initialization](#hybrid-initialization)
* [Core Operations](#core-operations)
  * [Saving Documents](#saving-documents)
  * [Searching with Filters](#searching-with-filters)
* [CrewAI Integration Workflow](#crewai-integration-workflow)
  * [Memory Setup](#memory-setup)
  * [Agent Creation](#agent-creation)
  * [Task Creation](#task-creation)
  * [Crew Execution](#crew-execution)
* [Retrieval Testing Helpers](#retrieval-testing-helpers)
* [API Reference](#api-reference)

---

## Features

This section highlights the core capabilities of `EndeeCrewAI` and how it enhances CrewAI agents with scalable, persistent memory.

* **Vector-Based Memory**
  Enables agents to store and retrieve information using embeddings rather than raw text, allowing semantic understanding instead of keyword matching.

* **High-Performance Search**
  Uses optimized Approximate Nearest Neighbor (ANN) algorithms to retrieve relevant memories with low latency.

* **Metadata Filtering**
  Supports structured metadata (e.g., tags, entities, timestamps) that can be used to filter semantic search results.

* **Provider Agnostic**
  Works with any embedding model supported by CrewAI, allowing flexibility across providers and models.

* **Configurable Precision**
  Allows fine-grained control over memory usage versus retrieval accuracy.

* **Hybrid Search Capabilities**
  Combines dense embeddings (semantic match) with sparse embeddings (keyword match) to improve retrieval accuracy.

---

## Installation

This section explains how to install `EndeeCrewAI` and its required dependencies.

Install the package via pip:

```bash
pip install crewai-endee
```

### Dependencies

`EndeeCrewAI` relies on CrewAI and an embedding provider of your choice. Install only the providers you plan to use.

**Example: Google Gemini**

```bash
pip install crewai crewai-tools google-genai
```

**Example: OpenAI**

```bash
pip install crewai crewai-tools openai
```

**Sparse Encoding (required for Hybrid Search)**

If you intend to use Hybrid Search, make sure `fastembed` is installed:

```bash
pip install fastembed
```

---

## Environment Configuration

Environment variables are used to securely manage API credentials and avoid hardcoding secrets in your codebase.

Create a `.env` file with the following variables:

```env
# Optional for local/open-source usage 
ENDEE_API_TOKEN=your_endee_api_token

# Embedding Provider Keys (Add only the one you use)
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
COHERE_API_KEY=your_cohere_api_key
```
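
As a minimal sketch, the keys above can be read from the process environment at startup instead of being written into source. The helper name `build_embedder_config` and the provider-to-variable mapping are assumptions for illustration; only the `provider`/`config` dictionary shape comes from the examples in this README. Loading the `.env` file itself is left to your tooling (for example, a dotenv loader or your shell).

```python
import os

# Hypothetical mapping from provider name to the variable names in the .env above.
PROVIDER_ENV_KEYS = {
    "google": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "cohere": "COHERE_API_KEY",
}


def build_embedder_config(provider, model_name, env=None):
    """Build an embedder config dict, reading the API key from the environment."""
    env = os.environ if env is None else env
    try:
        env_key = PROVIDER_ENV_KEYS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
    api_key = env.get(env_key)
    if not api_key:
        raise RuntimeError(f"{env_key} is not set")
    return {
        "provider": provider,
        "config": {"model_name": model_name, "api_key": api_key},
    }


# The Endee token is optional for local/open-source usage, so None is acceptable.
endee_api_token = os.environ.get("ENDEE_API_TOKEN")
```

Failing early when a key is missing tends to give a clearer error than a rejected request later in the workflow.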

---

## Core Concepts

This section explains the foundational configuration concepts that affect performance, accuracy, and storage behavior.

### Precision Levels

Precision controls how vectors are stored internally. Lower precision reduces memory usage and increases speed, while higher precision improves accuracy.

| Precision | Quantization | Data Type | Memory Usage | Accuracy  | Use Case                                              |
| --------- | ------------ | --------- | ------------ | --------- | ----------------------------------------------------- |
| `float32` | 32-bit       | FP32      | Highest      | Maximum   | When accuracy is absolutely critical                  |
| `float16` | 16-bit       | FP16      | ~50% less    | Very good | Good accuracy with half precision                     |
| `int16d`  | 16-bit       | INT16     | ~50% less    | Very good | Integer quantization with good accuracy               |
| `int8d`   | 8-bit        | INT8      | ~75% less    | Good      | **Default**; great for most use cases                 |
| `binary`  | 1-bit        | Binary    | ~96.9% less  | Lower     | Extreme compression for large-scale similarity search |
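
The memory figures in the table follow from the number of bits stored per dimension. The sketch below reproduces that arithmetic for raw vector data only; it ignores index overhead, and the function names and the example dimension of 1536 are assumptions for illustration.

```python
# Bits per dimension, taken from the Quantization column above.
BITS_PER_DIM = {"float32": 32, "float16": 16, "int16d": 16, "int8d": 8, "binary": 1}


def vector_bytes(dim, precision):
    """Approximate raw storage for one vector, ignoring index overhead."""
    bits = dim * BITS_PER_DIM[precision]
    return (bits + 7) // 8  # round up to whole bytes


def savings_vs_float32(precision):
    """Fractional reduction relative to float32 storage."""
    return 1 - BITS_PER_DIM[precision] / BITS_PER_DIM["float32"]


dim = 1536  # example dimension; substitute your embedding model's output size
for name in BITS_PER_DIM:
    size = vector_bytes(dim, name)
    print(f"{name:8s} {size:6d} bytes  {savings_vs_float32(name):.1%} less")
```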

---

### Space Types (Distance Metrics)

The space type defines how similarity between vectors is measured during search operations.

| Space Type | Description            | Best For                     |
| ---------- | ---------------------- | ---------------------------- |
| `cosine`   | Angle-based similarity | Text, RAG, NLP (**default**) |
| `l2`       | Euclidean distance     | Image similarity, clustering |
| `ip`       | Inner product          | Recommendation systems       |
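
For reference, the three metrics can be written out in plain Python. These are the textbook definitions, shown only to make the differences concrete; how Endee computes or reports scores internally is not specified here.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Inner product of the normalized vectors; ignores vector length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
print(cosine_similarity(a, b), l2_distance(a, b), inner_product(a, b))
```

The two vectors point in the same direction, so cosine similarity treats them as identical, while `l2` and `ip` both reflect the difference in length.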

---

## Initialization

Initialization configures how your vector index is created, including embedding models, precision, and security settings.

### Standard Initialization

This approach is recommended for development environments or non-sensitive data.

```python
import time

from crewai_endee import EndeeVectorStore

# Embedder configuration (for example OpenAI "text-embedding-3-small",
# Google "gemini-embedding-001", or Cohere "small")
embedder_config = {
    "provider": "cohere",
    "config": {
        "model_name": "small",
        "api_key": "YOUR_COHERE_API_KEY"
    }
}

# Create Endee store
memory_store = EndeeVectorStore(
    type="standard_agent_index",
    embedder_config=embedder_config,
    space_type="cosine",
    precision="int8d",
    api_token="YOUR_ENDEE_API_TOKEN",
)

# Reset index if needed
memory_store.reset()
time.sleep(2) # Wait for reset
```

### Hybrid Initialization

This enables both dense and sparse (SPLADE) indexing. You must provide `sparse_dim`.

```python
import time

from crewai_endee import EndeeVectorStore

# Embedder configuration (for example OpenAI "text-embedding-3-small",
# Google "gemini-embedding-001", or Cohere "small")
embedder_config = {
    "provider": "openai",
    "config": {
        "model_name": "text-embedding-3-small",
        "api_key": "YOUR_OPENAI_API_KEY"
    }
}

# Create Endee store
memory_store = EndeeVectorStore(
    type="hybrid_agent_index",
    embedder_config=embedder_config,
    space_type="cosine",
    precision="int8d",
    api_token="YOUR_ENDEE_API_TOKEN",
    # Trigger Hybrid Mode:
    sparse_model_name="splade_pp",
    sparse_dim=30522,
)

# Reset index if needed
memory_store.reset()
time.sleep(2) # Wait for reset
```
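
The value `30522` matches the vocabulary size of the BERT tokenizer that SPLADE models are typically built on, so each sparse vector has one potential slot per vocabulary token, nearly all of them zero. The sketch below is a conceptual illustration, not Endee's storage format: it represents a sparse vector as an `{index: weight}` dictionary and scores two vectors by their dot product. The function names, token indices, and weights are made up.

```python
SPARSE_DIM = 30522  # same value as sparse_dim in the example above


def validate_sparse(vec, dim=SPARSE_DIM):
    """Check that every index fits within the sparse dimension."""
    for index in vec:
        if not 0 <= index < dim:
            raise ValueError(f"index {index} outside [0, {dim})")
    return vec


def sparse_dot(a, b):
    """Dot product over the indices the two sparse vectors share."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b[index] for index, weight in a.items() if index in b)


# Made-up token indices and weights, for illustration only.
query_vec = validate_sparse({2017: 1.5, 4730: 0.5})
doc_vec = validate_sparse({2017: 2.0, 9999: 1.0})
print(sparse_dot(query_vec, doc_vec))
```

Only index `2017` appears in both vectors, so it alone contributes to the score. This is why sparse scoring behaves like weighted keyword matching.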

---

## Core Operations

This section covers the most common operations performed on the vector store.

### Saving Documents

When saving documents, text is automatically embedded and stored along with structured metadata for filtering and retrieval.

> When Hybrid Mode is enabled, `save()` automatically stores both dense embeddings and SPLADE sparse representations — no extra parameters required.

```python
documents = [
    ("Python is dynamically typed.", {"language": "Python", "typing": "dynamic"}),
    ("Go is statically typed.", {"language": "Go", "typing": "static"})
]

for text, meta in documents:
    memory_store.save(text, meta)
```

---

### Searching with Filters

Search combines semantic similarity with metadata-based filtering, allowing precise and context-aware retrieval.

> Hybrid Mode works transparently — `search()` automatically combines dense semantic similarity with sparse keyword scoring when enabled during initialization.

```python
results = memory_store.search(
    query="programming language typing",
    limit=5,
    filter=[{"typing": {"$eq": "dynamic"}}]
)

for result in results:
    print(result["context"], result["score"])
```
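
To make the filter syntax concrete, the sketch below applies a list of `$eq` conditions to metadata dictionaries in plain Python. It only illustrates the shape of the filter used above, under the assumption that conditions in the list are combined with AND; it is not Endee's implementation, and operators other than `$eq` are out of scope here. The function name `matches` is hypothetical.

```python
def matches(metadata, conditions):
    """Return True if metadata satisfies every {field: {"$eq": value}} condition."""
    for condition in conditions:
        for field, clause in condition.items():
            if "$eq" not in clause:
                raise ValueError(f"unsupported operator in {clause!r}")
            if metadata.get(field) != clause["$eq"]:
                return False
    return True


records = [
    ("Python is dynamically typed.", {"language": "Python", "typing": "dynamic"}),
    ("Go is statically typed.", {"language": "Go", "typing": "static"}),
]
conditions = [{"typing": {"$eq": "dynamic"}}]

# Keep only the records whose metadata passes the filter.
kept = [text for text, meta in records if matches(meta, conditions)]
print(kept)
```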

---

## CrewAI Integration Workflow

This section demonstrates how Endee-backed memory integrates directly into a CrewAI workflow.

### Memory Setup

Here, the Endee vector store is connected to CrewAI’s memory abstractions.

```python
from crewai.memory import ShortTermMemory, EntityMemory

# Create CrewAI memory objects
short_term = ShortTermMemory(storage=memory_store)
entity_memory = EntityMemory(storage=memory_store)
```

---

### Agent Creation

The agent is configured with an LLM. Memory is not passed to the agent directly; the vector-backed memory is attached when the crew is created (see [Crew Execution](#crew-execution)).

```python
from crewai import Agent, LLM

# Define LLM (Use any valid model)
llm = LLM(model="gemini/gemini-2.5-flash-lite", api_key="YOUR_GOOGLE_API_KEY")

# Define an agent
agent = Agent(
    role="Research Assistant",
    goal="Answer questions based on stored knowledge.",
    backstory="You have access to a high-performance vector database.",
    llm=llm,
    verbose=True
)
```

---

### Task Creation

Tasks define what the agent should accomplish and what kind of output is expected.

```python
from crewai import Task

# Define a task
task = Task(
    description="Explain the difference between Python and Go based on your memory.",
    expected_output="A detailed comparison of typing disciplines.",
    agent=agent
)
```

---

### Crew Execution

The crew orchestrates agents and tasks, enabling memory sharing and controlled execution flow.

```python
from crewai import Crew, Process

# Run Crew
crew = Crew(
    agents=[agent],
    tasks=[task],
    process=Process.sequential,
    memory=True,
    short_term_memory=short_term,
    entity_memory=entity_memory,
    verbose=True
)

result = crew.kickoff()
print(result)
```

---

## Retrieval Testing Helpers

You can define custom helpers to debug dense or hybrid search behaviour independently of CrewAI agents, for example:

* Checking vector entries in Endee
* Testing knowledge retrieval
* Validating entity memory

Example:

```python
def test_vector_retrieval(memory_store):
    queries = ["Python language features", "Go concurrency"]
    for q in queries:
        results = memory_store.search(query=q, limit=1)
        for r in results:
            print(r["context"])
```
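
A helper like this can also be turned into a pass/fail check. The sketch below assumes only what the examples in this README show: a `search(query=..., limit=...)` method returning dictionaries with a `context` key. `FakeStore` and `find_retrieval_misses` are hypothetical names, and the stub stands in for a real store so the check can be exercised without network access.

```python
class FakeStore:
    """Stand-in store that ranks documents by shared lowercase words."""

    def __init__(self, texts):
        self.texts = list(texts)

    def search(self, query, limit=5):
        words = set(query.lower().split())
        scored = [(len(words & set(t.lower().split())), t) for t in self.texts]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{"context": t, "score": s} for s, t in scored[:limit] if s > 0]


def find_retrieval_misses(store, expectations, limit=1):
    """Return the queries whose top results lack the expected substring."""
    misses = []
    for query, expected in expectations.items():
        results = store.search(query=query, limit=limit)
        if not any(expected in r["context"] for r in results):
            misses.append(query)
    return misses


store = FakeStore(["Python is dynamically typed.", "Go is statically typed."])
misses = find_retrieval_misses(store, {"python typing": "Python", "go typing": "Go"})
print(misses)
```

Passing a real `memory_store` in place of `FakeStore` should work as long as it exposes the same `search` signature and result keys.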

---

## API Reference

This section documents all configuration options for the `EndeeVectorStore` constructor.

| Parameter         | Type   | Required | Default  | Description                      |
| ----------------- | ------ | -------- | -------- | -------------------------------- |
| `type`            | `str`  | Yes      | —        | Unique index name                |
| `api_token`       | `str`  | No       | —        | Endee API token                  |
| `embedder_config` | `dict` | Yes      | —        | Embedding provider configuration |
| `space_type`      | `str`  | No       | `cosine` | Distance metric                  |
| `precision`       | `str`  | No       | `int8d` | Quantization level               |
| `allow_reset`     | `bool` | No       | `True`   | Allow index reset                |
| `text_key`        | `str`  | No       | `value`  | Metadata key for stored text     |
| `sparse_dim`       | `int`  | No       | —        | Enables Hybrid Index when provided |
| `sparse_model_name`| `str`  | No       | `splade_pp` | Sparse encoder   |
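
As a rough sketch, the constraints in this table and in the Core Concepts tables can be checked before constructing a store. `check_store_kwargs` is a hypothetical helper, not part of the package, and the allowed values are simply copied from this README.

```python
ALLOWED_SPACE_TYPES = {"cosine", "l2", "ip"}
ALLOWED_PRECISIONS = {"float32", "float16", "int16d", "int8d", "binary"}


def check_store_kwargs(**kwargs):
    """Validate keyword arguments against the tables in this README."""
    for required in ("type", "embedder_config"):
        if required not in kwargs:
            raise ValueError(f"missing required parameter: {required}")
    space_type = kwargs.get("space_type", "cosine")
    if space_type not in ALLOWED_SPACE_TYPES:
        raise ValueError(f"unknown space_type: {space_type!r}")
    precision = kwargs.get("precision", "int8d")
    if precision not in ALLOWED_PRECISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    sparse_dim = kwargs.get("sparse_dim")
    if sparse_dim is not None and sparse_dim <= 0:
        raise ValueError("sparse_dim must be a positive integer")
    return kwargs
```

The validated dictionary could then be unpacked into the constructor, for example `EndeeVectorStore(**check_store_kwargs(...))`.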

---
