Metadata-Version: 2.4
Name: ai-rag
Version: 0.1.0
Summary: Local knowledge base CLI with vector + keyword hybrid search and MCP support
Author: songyunfeng
License: MIT
Keywords: chromadb,docx,knowledge-base,mcp,rag,vector-search
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Requires-Dist: chromadb>=0.4.0
Requires-Dist: click>=8.1.0
Requires-Dist: mcp[cli]>=1.0.0
Requires-Dist: numpy<2.0,>=1.24.0
Requires-Dist: python-docx>=1.1.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: sentence-transformers>=2.2.0
Requires-Dist: transformers<5.0,>=4.40.0
Requires-Dist: watchdog>=4.0.0
Description-Content-Type: text/markdown

# ai-rag

A local knowledge base CLI tool with **hybrid vector + keyword search** and **MCP server** support.

It automatically indexes `.docx` files from a watch directory, stores their embeddings in ChromaDB, and exposes retrieval through both a CLI and an [MCP](https://modelcontextprotocol.io) server for LLM clients such as Claude Desktop and Cursor.

## Features

- 📄 **Auto-indexing** — watches a directory and indexes `.docx` files automatically
- 🔍 **Hybrid search** — combines semantic vector search with exact keyword matching (RRF fusion)
- 🤖 **MCP server** — expose your knowledge base as tools callable by Claude / Cursor / any MCP client
- ⚙️ **Configurable** — chunk size, overlap, embedding model, watch directory via YAML config

## Installation

```bash
pip install ai-rag
```

> **Requirements**: Python 3.9+, macOS / Linux  
> The first run downloads the embedding model (roughly 470 MB for the default `paraphrase-multilingual-MiniLM-L12-v2`).

## Quick Start

```bash
# 1. Index .docx files in the watch directory (default: ~/Downloads)
ai-rag sync

# 2. Search
ai-rag search "your query"

# 3. Check status
ai-rag status
```

## Search Modes

```bash
# Hybrid (default, recommended)
ai-rag search "direct broadcast architecture" --mode hybrid

# Semantic vector search
ai-rag search "broadcast architecture" --mode vector

# Exact keyword match
ai-rag search "SPU" --mode keyword

# Control number of results
ai-rag search "query" -n 10
```

## File Watcher

```bash
# Foreground — auto-re-index when files change
ai-rag watch

# Background daemon
ai-rag watch --daemon
ai-rag stop
```

## MCP Server (for Claude Desktop / Cursor)

Start the MCP server (stdio transport):

```bash
ai-rag-mcp
```

### Claude Desktop config

Edit `~/Library/Application Support/Claude/claude_desktop_config.json` (the macOS location):

```json
{
  "mcpServers": {
    "ai-rag": {
      "command": "ai-rag-mcp"
    }
  }
}
```

### Cursor config

Edit `~/.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "ai-rag": {
      "command": "ai-rag-mcp"
    }
  }
}
```

After restarting the client, you can ask: *"Search my knowledge base for broadcast backend architecture"* and the LLM will call the retrieval tools automatically.
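
If a client config file already lists other MCP servers, pasting the snippet over it would drop them. The following is a minimal sketch, not part of `ai-rag` itself, of merging the entry into an existing config instead. The helper names `add_mcp_server` and `update_config_file` are hypothetical; only the `mcpServers` layout and the `ai-rag-mcp` command come from the snippets above.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str = "ai-rag", command: str = "ai-rag-mcp") -> dict:
    """Return a copy of `config` with one entry added under "mcpServers".

    Hypothetical helper: existing servers and unrelated keys are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read a client config (if present), merge the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_mcp_server(existing), indent=2) + "\n", encoding="utf-8")
```

Calling `update_config_file(Path.home() / ".cursor" / "mcp.json")` would apply it to the Cursor config. Back up the file first, since this sketch does not handle malformed JSON.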

### Available MCP Tools

| Tool | Description |
|------|-------------|
| `search` | Hybrid vector + keyword search (recommended) |
| `vector_search` | Semantic similarity search only |
| `keyword_search` | Exact keyword / phrase match |
| `get_status` | Index statistics |
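
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message to show roughly what a client sends on your behalf. The argument name `query` is an assumption, since this README does not document each tool's input schema, and `build_tool_call` is a hypothetical helper. The server's `tools/list` response describes the actual schema.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is an assumed argument name, used here for illustration only.
request = build_tool_call("search", {"query": "broadcast backend architecture"})
print(json.dumps(request, indent=2))
```

In practice the MCP client also performs an `initialize` handshake before any tool call, so messages like this are rarely written by hand.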

## Configuration

Default config file: `~/.config/ai-rag/config.yaml`

```yaml
watch_dir: ~/Downloads        # directory to watch and index
file_patterns:
  - "*.docx"
chunk_size: 500               # characters per chunk
chunk_overlap: 100            # characters shared by consecutive chunks
model_name: paraphrase-multilingual-MiniLM-L12-v2
```
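
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-size character chunker using the default values. It illustrates the idea only. The function name `chunk_text` is hypothetical, and the package's own `TextChunker` may split text differently (for example on paragraph boundaries).

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so consecutive chunks share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, a 1,200-character document yields three chunks starting at offsets 0, 400, and 800. A larger overlap keeps more context across chunk boundaries at the cost of more stored embeddings.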

## Architecture

```
docx files
    │
    ▼
DocxParser → TextChunker (500 chars, 100 overlap)
    │
    ▼
sentence-transformers (paraphrase-multilingual-MiniLM-L12-v2)
    │
    ▼
ChromaDB (local persistent vector store)
    │
    ├── vector_search  (cosine similarity)
    ├── keyword_search (substring match via where_document)
    └── hybrid_search  (Reciprocal Rank Fusion, vector×0.7 + keyword×0.3)
```
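
The hybrid step can be illustrated with a weighted Reciprocal Rank Fusion sketch: each result list contributes `weight / (k + rank)` to a document's score, and documents are returned in order of total score. This is an illustration under assumptions, not the package's code. The function name `rrf_fuse` is hypothetical, and the smoothing constant `k = 60` is a commonly used default that this README does not specify. Only the 0.7 and 0.3 weights come from the diagram above.

```python
from collections import defaultdict


def rrf_fuse(vector_ids, keyword_ids, vector_weight=0.7, keyword_weight=0.3, k=60):
    """Fuse two ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    Returns (doc_id, score) pairs sorted by descending score.
    `k` is an assumed smoothing constant, not taken from the README.
    """
    scores = defaultdict(float)
    for weight, ranked in ((vector_weight, vector_ids), (keyword_weight, keyword_ids)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += weight / (k + rank)
    # Sort by score (highest first); break ties by ID for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["a", "b", "c"]   # best semantic match first
keyword_hits = ["c", "a", "d"]  # best keyword match first
fused = rrf_fuse(vector_hits, keyword_hits)
print([doc_id for doc_id, _ in fused])
```

Because RRF uses ranks and ignores raw scores, cosine similarities and keyword matches can be combined without normalizing them to a common scale.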

## License

MIT
