Metadata-Version: 2.4
Name: haiku.rag
Version: 0.13.0
Summary: Agentic Retrieval Augmented Generation (RAG) with LanceDB
Author-email: Yiorgis Gozadinos <ggozadinos@gmail.com>
License: MIT
License-File: LICENSE
Keywords: RAG,lancedb,mcp,ml,vector-database
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows :: Windows 10
Classifier: Operating System :: Microsoft :: Windows :: Windows 11
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: docling>=2.56.1
Requires-Dist: fastmcp>=2.12.4
Requires-Dist: httpx>=0.28.1
Requires-Dist: lancedb>=0.25.2
Requires-Dist: pydantic-ai>=1.0.18
Requires-Dist: pydantic-graph>=1.0.18
Requires-Dist: pydantic>=2.12.2
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: pyyaml>=6.0.1
Requires-Dist: rich>=14.2.0
Requires-Dist: tiktoken>=0.12.0
Requires-Dist: typer>=0.19.2
Requires-Dist: watchfiles>=1.1.0
Provides-Extra: a2a
Requires-Dist: fasta2a>=0.1.0; extra == 'a2a'
Provides-Extra: mxbai
Requires-Dist: mxbai-rerank>=0.1.6; extra == 'mxbai'
Provides-Extra: voyageai
Requires-Dist: voyageai>=0.3.5; extra == 'voyageai'
Description-Content-Type: text/markdown

# Haiku RAG

Retrieval-Augmented Generation (RAG) library built on LanceDB.

`haiku.rag` is a Retrieval-Augmented Generation (RAG) library built to work with LanceDB as a local vector database. It uses LanceDB for storing embeddings and performs semantic (vector) search as well as full-text search combined through native hybrid search with Reciprocal Rank Fusion. Both open-source (Ollama) as well as commercial (OpenAI, VoyageAI) embedding providers are supported.

> **Note**: Configuration now uses YAML files instead of environment variables. If you're upgrading from an older version, run `haiku-rag init-config --from-env` to migrate your `.env` file to `haiku.rag.yaml`. See [Configuration](https://ggozad.github.io/haiku.rag/configuration/) for details.

## Features

- **Local LanceDB**: No external servers required, supports also LanceDB cloud storage, S3, Google Cloud & Azure
- **Multiple embedding providers**: Ollama, VoyageAI, OpenAI, vLLM
- **Multiple QA providers**: Any provider/model supported by Pydantic AI
- **Research graph (multi‑agent)**: Plan → Search → Evaluate → Synthesize with agentic AI
- **Native hybrid search**: Vector + full-text search with native LanceDB RRF reranking
- **Reranking**: Default search result reranking with MixedBread AI, Cohere, or vLLM
- **Question answering**: Built-in QA agents on your documents
- **File monitoring**: Auto-index files when run as server
- **40+ file formats**: PDF, DOCX, HTML, Markdown, code files, URLs
- **MCP server**: Expose as tools for AI assistants
- **A2A agent**: Conversational agent with context and multi-turn dialogue
- **CLI & Python API**: Use from command line or Python

## Quick Start

```bash
# Install
# Python 3.12 or newer required
uv pip install haiku.rag

# Add documents
haiku-rag add "Your content here"
haiku-rag add "Your content here" --meta author=alice --meta topic=notes
haiku-rag add-src document.pdf --meta source=manual

# Search
haiku-rag search "query"

# Ask questions
haiku-rag ask "Who is the author of haiku.rag?"

# Ask questions with citations
haiku-rag ask "Who is the author of haiku.rag?" --cite

# Deep QA (multi-agent question decomposition)
haiku-rag ask "Who is the author of haiku.rag?" --deep --cite

# Deep QA with verbose output
haiku-rag ask "Who is the author of haiku.rag?" --deep --verbose

# Multi‑agent research (iterative plan/search/evaluate)
haiku-rag research \
  "What are the main drivers and trends of global temperature anomalies since 1990?" \
  --max-iterations 2 \
  --confidence-threshold 0.8 \
  --max-concurrency 3 \
  --verbose

# Rebuild database (re-chunk and re-embed all documents)
haiku-rag rebuild

# Start server with file monitoring
haiku-rag serve --monitor
```

To customize settings, create a `haiku.rag.yaml` config file (see [Configuration](https://ggozad.github.io/haiku.rag/configuration/)).

## Python Usage

```python
from haiku.rag.client import HaikuRAG
from haiku.rag.research import (
    PlanNode,
    ResearchContext,
    ResearchDeps,
    ResearchState,
    build_research_graph,
    stream_research_graph,
)

async with HaikuRAG("database.lancedb") as client:
    # Add document
    doc = await client.create_document("Your content")

    # Search (reranking enabled by default)
    results = await client.search("query")
    for chunk, score in results:
        print(f"{score:.3f}: {chunk.content}")

    # Ask questions
    answer = await client.ask("Who is the author of haiku.rag?")
    print(answer)

    # Ask questions with citations
    answer = await client.ask("Who is the author of haiku.rag?", cite=True)
    print(answer)

    # Multi‑agent research pipeline (Plan → Search → Evaluate → Synthesize)
    graph = build_research_graph()
    question = (
        "What are the main drivers and trends of global temperature "
        "anomalies since 1990?"
    )
    state = ResearchState(
        context=ResearchContext(original_question=question),
        max_iterations=2,
        confidence_threshold=0.8,
        max_concurrency=2,
    )
    deps = ResearchDeps(client=client)

    # Blocking run (final result only)
    result = await graph.run(
        PlanNode(provider="openai", model="gpt-4o-mini"),
        state=state,
        deps=deps,
    )
    print(result.output.title)

    # Streaming progress (log/report/error events)
    async for event in stream_research_graph(
        graph,
        PlanNode(provider="openai", model="gpt-4o-mini"),
        state,
        deps,
    ):
        if event.type == "log":
            iteration = event.state.iterations if event.state else state.iterations
            print(f"[{iteration}] {event.message}")
        elif event.type == "report":
            print("\nResearch complete!\n")
            print(event.report.title)
            print(event.report.executive_summary)
```

## MCP Server

Use with AI assistants like Claude Desktop:

```bash
haiku-rag serve --stdio
```

Provides tools for document management and search directly in your AI assistant.

## A2A Agent

Run as a conversational agent with the Agent-to-Agent protocol:

```bash
# Start the A2A server
haiku-rag serve --a2a

# Connect with the interactive client (in another terminal)
haiku-rag a2aclient
```

The A2A agent provides:

- Multi-turn dialogue with context
- Intelligent multi-search for complex questions
- Source citations with titles and URIs
- Full document retrieval on request

## Examples

See the [examples directory](examples/) for working examples:

- **[Interactive Research Assistant](examples/ag-ui-research/)** - Full-stack research assistant with Pydantic AI and AG-UI featuring human-in-the-loop approval and real-time state synchronization
- **[Docker Setup](examples/docker/)** - Complete Docker deployment with file monitoring, MCP server, and A2A agent
- **[A2A Security](examples/a2a-security/)** - Authentication examples (API key, OAuth2, GitHub)

## Documentation

Full documentation at: https://ggozad.github.io/haiku.rag/

- [Installation](https://ggozad.github.io/haiku.rag/installation/) - Provider setup
- [Configuration](https://ggozad.github.io/haiku.rag/configuration/) - YAML configuration
- [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
- [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
- [Agents](https://ggozad.github.io/haiku.rag/agents/) - QA agent and multi-agent research
- [MCP Server](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
- [A2A Agent](https://ggozad.github.io/haiku.rag/a2a/) - Agent-to-Agent protocol support
- [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance Benchmarks
