Metadata-Version: 2.4
Name: fitz-ai
Version: 0.4.5
Summary: A modular, production-ready knowledge engine platform with clean architecture and multi-paradigm support (RAG, CLaRa).
Author: Yan Fitzner
License: MIT
Project-URL: Homepage, https://github.com/yafitzdev/fitz-ai
Project-URL: Repository, https://github.com/yafitzdev/fitz-ai
Project-URL: Documentation, https://github.com/yafitzdev/fitz-ai#readme
Project-URL: Issues, https://github.com/yafitzdev/fitz-ai/issues
Project-URL: Changelog, https://github.com/yafitzdev/fitz-ai/blob/main/CHANGELOG.md
Project-URL: Source Code, https://github.com/yafitzdev/fitz-ai
Keywords: rag,retrieval,augmented,generation,llm,ai,knowledge-base,vector-database,semantic-search,document-processing,embeddings,nlp,clara,compression,machine-learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Database
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: httpx>=0.24
Requires-Dist: typing-extensions>=4.7
Requires-Dist: typer>=0.9
Requires-Dist: jinja2>=3.1
Requires-Dist: faiss-cpu>=1.7.0
Requires-Dist: pdfplumber>=0.10.0
Requires-Dist: python-docx
Provides-Extra: local
Requires-Dist: ollama>=0.1.0; extra == "local"
Provides-Extra: map
Requires-Dist: umap-learn>=0.5.0; extra == "map"
Requires-Dist: scikit-learn>=1.0; extra == "map"
Provides-Extra: remote
Requires-Dist: qdrant-client>=1.7; extra == "remote"
Provides-Extra: clara
Requires-Dist: torch>=2.0; extra == "clara"
Requires-Dist: transformers>=4.35; extra == "clara"
Requires-Dist: accelerate>=0.24; extra == "clara"
Requires-Dist: bitsandbytes>=0.41; extra == "clara"
Requires-Dist: peft>=0.10; extra == "clara"
Provides-Extra: api
Requires-Dist: fastapi>=0.100.0; extra == "api"
Requires-Dist: uvicorn[standard]>=0.23.0; extra == "api"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: all
Requires-Dist: ollama>=0.1.0; extra == "all"
Requires-Dist: qdrant-client>=1.7; extra == "all"
Requires-Dist: umap-learn>=0.5.0; extra == "all"
Requires-Dist: scikit-learn>=1.0; extra == "all"
Requires-Dist: fastapi>=0.100.0; extra == "all"
Requires-Dist: uvicorn[standard]>=0.23.0; extra == "all"
Requires-Dist: torch>=2.0; extra == "all"
Requires-Dist: transformers>=4.35; extra == "all"
Requires-Dist: accelerate>=0.24; extra == "all"
Requires-Dist: bitsandbytes>=0.41; extra == "all"
Requires-Dist: peft>=0.10; extra == "all"
Dynamic: license-file

# fitz-ai

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://badge.fury.io/py/fitz-ai.svg)](https://pypi.org/project/fitz-ai/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Version](https://img.shields.io/badge/version-0.4.5-green.svg)](CHANGELOG.md)
[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen)](https://github.com/yafitzdev/fitz-ai)

---

**Honest RAG in 5 minutes. No infrastructure. No boilerplate.**

```bash
pip install fitz-ai

fitz quickstart ./docs "What is our refund policy?"
```

That's it. Your documents are now searchable with AI.


![fitz-ai quickstart demo](https://raw.githubusercontent.com/yafitzdev/fitz-ai/main/docs/assets/quickstart_demo.gif)

<br>

<details>

<summary><strong>Python SDK</strong></summary>

<br>

```python
import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is our refund policy?")
```

</details>

<br>

<details>

<summary><strong>REST API</strong></summary>

<br>

```bash
pip install fitz-ai[api]

fitz serve  # http://localhost:8000/docs for interactive API
```

</details>

---

### About 🧑‍🌾

  Solo project by Yan Fitzner ([LinkedIn](https://www.linkedin.com/in/yan-fitzner/), [GitHub](https://github.com/yafitzdev)).

  - ~55k lines of Python
  - 700+ tests, 100% coverage
  - Zero LangChain/LlamaIndex dependencies — built from scratch

![fitz-ai honest_rag](https://raw.githubusercontent.com/yafitzdev/fitz-ai/main/docs/assets/honest_rag.jpg)

---

<details>

<summary><strong>📦 What is RAG?</strong></summary>

<br>

RAG (retrieval-augmented generation) is how ChatGPT's "file search," Notion AI, and enterprise knowledge tools actually work under the hood.
Instead of sending all your documents to an AI, RAG:

1. [X] **Indexes your documents once** — Splits them into chunks, converts to vectors, stores in a database
2. [X] **Retrieves only what's relevant** — When you ask a question, finds the 5-10 most relevant chunks
3. [X] **Sends just those chunks to the LLM** — The AI answers based on focused, relevant context

Traditional approach:
```
  [All 10,000 documents] → LLM → Answer
  ❌ Impossible (too large)
  ❌ Expensive (if possible)
  ❌ Unfocused
```
RAG approach:
```
  Question → [Search index] → [5 relevant chunks] → LLM → Answer
  ✅ Works at any scale
  ✅ Costs pennies per query
  ✅ Focused context = better answers
```
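
The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not how Fitz is implemented: the bag-of-words `embed` function stands in for a real embedding model, and every function name here is made up for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1: embed every chunk once and keep the vectors."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Step 2: rank chunks by similarity to the question and keep the best few."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Step 3: send only the retrieved chunks to the LLM, not the whole corpus."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


chunks = [
    "Refund requests are accepted within 30 days of purchase.",
    "Engineers deploy on Tuesdays and Thursdays.",
    "New hires receive a laptop on their first day.",
]
index = build_index(chunks)
top = retrieve(index, "What is our refund policy?", top_k=1)
prompt = build_prompt("What is our refund policy?", top)
```

Only the refund chunk reaches the prompt; the deployment and onboarding chunks never cost a token.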

</details>

---

<details>

<summary><strong>📦 Why Can't I Just Send My Documents to ChatGPT Directly?</strong></summary>

<br>

You can—but you'll hit walls fast.

**Context window limits 🚨** 
> GPT-4 Turbo accepts ~128k tokens. That's roughly 300 pages. Your company wiki, codebase, or document archive is likely 10x-100x larger. You physically cannot paste it all.

**Cost explosion 💥**
> Even if you could fit everything, you'd pay for every token on every query. Sending 100k tokens costs ~\$1-3 per question. Ask 50 questions a day? That's \$50-150 daily, for one user.
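
The arithmetic behind those figures is easy to check. The sketch below assumes input prices of 10 and 30 USD per million tokens, which are simply back-solved from the per-question range quoted above, not current price quotes. The retrieval budget of five 500-token chunks is likewise an assumption.

```python
def daily_cost(tokens_per_query: int, usd_per_million_tokens: float, queries_per_day: int) -> float:
    """Input-token cost per day for one user, in USD."""
    return tokens_per_query / 1_000_000 * usd_per_million_tokens * queries_per_day


# Assumed prices: 10 and 30 USD per million input tokens (back-solved from the
# per-question range in the text, not taken from any provider's price list).
paste_low = daily_cost(100_000, 10.0, 50)    # paste everything, cheap end
paste_high = daily_cost(100_000, 30.0, 50)   # paste everything, expensive end

# Assumed retrieval budget: 5 chunks of roughly 500 tokens each per question.
rag_high = daily_cost(5 * 500, 30.0, 50)
```

Under these assumptions, pasting everything costs about 50 to 150 USD a day, while sending only retrieved chunks stays under 4 USD even at the expensive end.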

**No selective retrieval ❌**
> When you paste documents, the model reads everything equally. It can't focus on what's relevant. Ask about refund policies and it's also processing your hiring guidelines, engineering specs, and meeting notes—wasting context and degrading answers.

**No persistence 💢**
> Every conversation starts fresh. You re-upload, re-paste, re-explain. There's no knowledge base that accumulates and improves.

</details>

---

### Why Fitz?

**Super fast setup 🐆**
> Point at a folder. Ask a question. Get an answer with sources. Everything else is handled by Fitz.

**Honest answers ✅**
> Most RAG tools confidently answer even when the answer isn't in your documents. Ask "What was our Q4 revenue?" when your docs only cover Q1-Q3, and typical RAG hallucinates a number. Fitz says: *"I cannot find Q4 revenue figures in the provided documents."*

**Swap engines, keep everything else ⚙️**
> RAG is evolving fast—GraphRAG, HyDE, ColBERT, whatever's next. Fitz lets you switch engines in one line. Your ingested data stays. Your queries stay. No migration, no re-ingestion, no new API to learn. Frameworks lock you in; Fitz lets you move.

**Analytical queries that actually work 📊**
> Standard RAG fails on questions like "What are the trends?"—it retrieves random chunks instead of insights. Fitz's hierarchical RAG generates multi-level summaries during ingestion. Ask for trends, get aggregated analysis. Ask for specifics, get detail chunks. No special syntax required.

**Other Features at a Glance 🃏**
>
>1. [x] **Local execution possible.** FAISS and Ollama support, no API keys required to start.
>2. [x] **Plugin-based architecture.** Swap LLMs, vector databases, rerankers, and retrieval pipelines via YAML config.
>3. [x] **Multiple engines.** Supports FitzRAG, GraphRAG and CLaRa out of the box—swap engines in one line.
>4. [x] **Incremental ingestion.** Only reprocesses changed files, even with new chunking settings.
>5. [x] **Full provenance.** Every answer traces back to the exact chunk and document.
>6. [x] **Data privacy.** No telemetry, no cloud, no external calls except to the LLM provider you configure.

<br>

Any questions left? Try fitz on itself:

```bash
fitz quickstart ./fitz_ai "How does the chunking pipeline work?"
```

The codebase speaks for itself.

---

<details>

<summary><strong>📦 Fitz vs LangChain vs LlamaIndex</strong></summary>

<br>

#### Fitz opts for a deliberately narrower approach.
>
>LangChain and LlamaIndex are powerful **LLM application frameworks** designed to help developers build complex, end-to-end AI systems. 
>Fitz provides a **minimal, replaceable RAG engine** with strong epistemic guarantees — without locking users into a framework, ecosystem, or long-term architectural commitment.
>
>Fitz is not a competitor in scope.  
>It is an infrastructure primitive.

<br>

#### Core philosophical differences ⚖️
>
>| Dimension | Fitz | LangChain | LlamaIndex |
>|--------|------|-----------|------------|
>| Primary role | **RAG engine** | LLM application framework | LLM data framework |
>| User commitment | **No framework lock-in** | High | High |
>| Engine coupling | **Swappable in one line** | Deep | Deep |
>| Design goal | Correctness & honesty | Flexibility | Data integration |
>| Long-term risk | Low | Migration-heavy | Migration-heavy |

<br>

#### Epistemic behavior (truth over fluency) 🎯
>
>| Aspect | Fitz | LangChain / LlamaIndex |
>|-----|------|------------------------|
>| “I don’t know” | **First-class behavior** | Not guaranteed |
>| Hallucination handling | Designed-in | Usually prompt-level |
>| Confidence signaling | Explicit | Implicit |
>
>Fitz treats uncertainty as a **feature**, not a failure.  
>If the system cannot support an answer with retrieved evidence, it says so.

<br>

#### Transparency & provenance 🔎
>
>| Capability | Fitz | LangChain / LlamaIndex |
>|---------|------|------------------------|
>| Source attribution | **Mandatory** | Optional |
>| Retrieval trace | **Explicit & structured** | Often opaque |
>| Debuggability | Built-in | Tool-dependent |
>
>Every answer in Fitz is fully auditable down to the retrieval step.

<br>

#### Scope & complexity 🪐
>
>| Aspect | Fitz | LangChain / LlamaIndex |
>|-----|------|------------------------|
>| Chains / agents | ❎ | ✔ |
>| Prompt graphs | ❎ | ✔ |
>| UI abstractions | ❎ | Often |
>| Cognitive overhead | **Very low** | High |
>
>Fitz intentionally does less — so it can be trusted more.

<br>

#### Use Fitz if you want:
>
>- A replaceable RAG engine, not a framework marriage
>- Strong epistemic guarantees (“I don’t know” is valid output)
>- Full provenance for every answer
>- A transparent, extensible plugin architecture
>- A future-proof ingestion pipeline that survives engine changes

</details>

---

<details>

<summary><strong>📦 Fitz RAG vs GraphRAG</strong></summary>

<br>

> **"RAG is dead"** posts flood the AI scene. The argument: traditional RAG can't handle relationships or trends. GraphRAG is the new hotness.
>
> **Traditional RAG has two problems—it can't see the forest for the trees, and it lies about what it sees.**
>
> Fitz RAG solves both: **hierarchical summaries** for the big picture, **epistemic guardrails** for honesty. And now it also extracts entities and relationships—without the graph construction overhead.

<br>

#### The real problem with RAG isn't retrieval—it's confidence

>Most RAG failures aren't "couldn't find the relationship." They're:
>- Hallucinated answers presented confidently
>- Conflicting sources silently collapsed into one answer
>- Causality invented from correlation
>
>**GraphRAG doesn't solve any of these.** It just finds relationships better.
>
>Fitz RAG solves them with **epistemic guardrails**:
>
>| Problem | GraphRAG | Fitz RAG |
>|---------|----------|----------|
>| Sources disagree | Picks one silently | **DISPUTED mode** — surfaces the conflict |
>| No evidence for claim | Answers anyway | **ABSTAIN mode** — refuses to guess |
>| Correlation ≠ causation | Invents "why" | **Blocks causal hallucination** |
>| Uncertain answer | Sounds confident | **QUALIFIED mode** — notes limitations |

<br>

#### What Fitz RAG now shares with GraphRAG

>Fitz RAG has closed the gap on key GraphRAG features—without the complexity:
>
>| Capability | GraphRAG | Fitz RAG |
>|------------|----------|----------|
>| **Entity extraction** | LLM extracts entities | LLM extracts entities (classes, functions, APIs, people, orgs) |
>| **Entity relationships** | Full knowledge graph | Co-occurrence links (entities in same chunk are linked) |
>| **Semantic clustering** | Leiden community detection | K-means clustering by embedding similarity |
>| **Trend analysis** | Community summaries | Hierarchical summaries (L0→L1→L2) |
>
>The difference: Fitz extracts entities and links them **without building a graph**. Co-occurrence linking captures 80% of useful relationships at 10% of the complexity.

<br>

#### When GraphRAG still wins

>GraphRAG excels at **multi-hop relationship traversal**:
>
>| Use Case | Why GraphRAG |
>|----------|--------------|
>| "Who founded the company that acquired Z?" | Multi-hop graph traversal |
>| Complex relationship chains | Explicit edge following |
>| Visual knowledge exploration | Graph visualization |
>
>If you need to traverse 3+ hop relationships or visualize entity networks, GraphRAG is the right tool.

<br>

#### When Fitz RAG wins

>Fitz RAG excels at **trusted answers, entities, and analytical queries**:
>
>| Use Case | Why Fitz RAG |
>|----------|--------------|
>| Q&A where trust matters | Epistemic guardrails |
>| "What entities are in this doc?" | **Entity extraction** with type filtering |
>| "What concepts co-occur?" | **Entity linking** (co-occurrence) |
>| "What are the trends?" | **Hierarchical summaries** (L0→L1→L2) |
>| "Summarize this corpus" | **Corpus-level summaries** auto-generated |
>| Conflicting sources | Conflict detection |
>| Compliance/legal queries | Admits uncertainty |
>| Fast, cheap retrieval | No graph construction |
>| Incremental updates | Just add new chunks |
>
>**Fitz RAG extracts entities, links them, clusters them, and summarizes them**—all during ingestion:
>- **Entities**: Classes, functions, APIs, people, organizations, concepts
>- **Links**: Co-occurrence relationships stored in chunk metadata
>- **Clusters**: Semantic grouping via K-means on embeddings
>- **Summaries**: L0 chunks → L1 group summaries → L2 corpus summary
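
Co-occurrence linking is simple enough to sketch directly. The snippet below is an illustration of the idea only: `link_entities` and the chunk ids are hypothetical, and Fitz's own storage format for links is not shown here.

```python
from collections import defaultdict
from itertools import combinations


def link_entities(chunk_entities: dict[str, list[str]]) -> dict[str, set[str]]:
    """Link every pair of entities that appear together in at least one chunk."""
    links: dict[str, set[str]] = defaultdict(set)
    for entities in chunk_entities.values():
        # Deduplicate and sort so each unordered pair is visited once per chunk.
        for a, b in combinations(sorted(set(entities)), 2):
            links[a].add(b)
            links[b].add(a)
    return dict(links)


# Hypothetical extraction output: chunk id -> entities found in that chunk.
chunk_entities = {
    "billing.py#0": ["Invoice", "PaymentGateway"],
    "billing.py#1": ["Invoice", "Customer"],
    "auth.py#0": ["Customer", "Session"],
}
links = link_entities(chunk_entities)
```

`Invoice` ends up linked to `PaymentGateway` and `Customer`, but not to `Session`. Reaching `Session` from `Invoice` takes two hops through `Customer`, which is exactly the kind of traversal where a full graph engine is the better fit.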

<br>

#### The cost difference

>| Aspect | GraphRAG | Fitz RAG |
>|--------|----------|----------|
>| Ingest cost | **High** — LLM extracts entities + builds graph | **Medium** — LLM extracts entities (optional) |
>| Ingest speed | Slow — graph construction | Fast — no graph building |
>| Query latency | Higher — graph traversal | Lower — vector search |
>| Error propagation | Bad extraction = bad graph | Entities are metadata, not structure |
>| Schema dependency | Must define entity types | Flexible type list |
>| Incremental updates | Rebuild graph sections | Just add chunks + entities |

<br>

#### The bottom line

>| Capability | GraphRAG | Fitz RAG |
>|------------|----------|----------|
>| Entity extraction | ✅ LLM-based | ✅ LLM-based |
>| Entity relationships | Full knowledge graph | Co-occurrence links |
>| Semantic clustering | Leiden algorithm | K-means on embeddings |
>| Trend analysis | Community summaries | **Hierarchical summaries** |
>| Corpus overview | Global search | **L2 corpus summary** |
>| Epistemic safety | ❌ None | ✅ **Guardrails built-in** |
>
>**GraphRAG wins on multi-hop traversal. Fitz RAG wins on trusted answers + entities + trends.**
>
>For most enterprise use cases—support, compliance, internal knowledge, trend analysis—Fitz RAG now delivers 90% of GraphRAG's capabilities at a fraction of the cost, plus epistemic guarantees GraphRAG simply doesn't have.
>
>Need the full graph? Fitz gives you both engines. Same data. Same API. Choose per query.

</details>

---

<details>

<summary><strong>📦 Features</strong></summary>

<br>

#### Hierarchical RAG 📊

>Standard RAG struggles with analytical queries like "What are the trends?" because it retrieves random chunks instead of aggregated insights. Hierarchical RAG solves this.
>
>**The problem ☔️**
>```
>Q: "What are the trends in my comments?"
>Standard RAG: Returns random individual comments (not useful)
>```
>
>**The solution ☀️**
>
>For documents, Fitz auto-enables hierarchy when an LLM is available. It groups by file and generates multi-level summaries:
>- **Level 0**: Original chunks (unchanged)
>- **Level 1**: Group summaries (one per source file)
>- **Level 2**: Corpus summary (aggregates all groups)
>
>**Example: YouTube comment analysis**
>```
>Ingested: 500 comments across 10 videos
>
>Level 0: "This tutorial helped me understand async/await finally!"
>Level 1: "Tutorial Video #3: 47 comments, mostly positive. Users praise
>         clarity of examples. Common request: more on error handling."
>Level 2: "Across 10 videos (500 comments): 78% positive sentiment.
>         Top themes: code clarity, pacing, example quality.
>         Recurring requests: longer videos, more advanced topics."
>```
>
>Now analytical queries retrieve summaries, while specific queries still retrieve details:
>```
>Q: "What are the overall trends in my comments?"
>→ Returns Level 2 corpus summary + Level 1 video summaries
>```
>```
>Q: "What did people say about my async tutorial?"
>→ Returns Level 0 individual comments from that video
>```
>
>No special query syntax. No retrieval config changes. Summaries match analytical queries naturally via vector similarity.
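
As a rough sketch of the three levels, the function below groups chunks by source file, summarizes each group, then summarizes the summaries. The record layout and the `summarize` callback are assumptions made for the example; a stub replaces the LLM call so it runs offline.

```python
from collections import defaultdict
from typing import Callable


def build_hierarchy(chunks: list[dict], summarize: Callable[[list[str]], str]) -> list[dict]:
    """Return the original chunks plus one summary per file and one for the corpus."""
    by_file: dict[str, list[str]] = defaultdict(list)
    for chunk in chunks:
        by_file[chunk["source"]].append(chunk["text"])

    level0 = [{**chunk, "level": 0} for chunk in chunks]  # originals, unchanged
    level1 = [
        {"source": source, "text": summarize(texts), "level": 1}
        for source, texts in by_file.items()
    ]
    level2 = [{"source": "corpus", "text": summarize([s["text"] for s in level1]), "level": 2}]
    return level0 + level1 + level2


def fake_summarize(texts: list[str]) -> str:
    """Stand-in for an LLM call so the sketch runs offline."""
    return f"summary of {len(texts)} item(s)"


chunks = [
    {"source": "video_1.txt", "text": "Great pacing."},
    {"source": "video_1.txt", "text": "More error handling please."},
    {"source": "video_2.txt", "text": "Examples were clear."},
]
records = build_hierarchy(chunks, fake_summarize)
levels = [r["level"] for r in records]
```

All records would then be embedded into the same index, so a broad question can land on a Level 1 or Level 2 record while a narrow one still matches a Level 0 chunk.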

<br>

#### Actually admits when it doesn't know 📚

> When documents don't contain the answer, fitz says so:
>
> ```
> Q: "What was our Q4 revenue?"
> A: "I cannot find Q4 revenue figures in the provided documents.
>     The available financial data covers Q1-Q3 only."
>
>    Mode: ABSTAIN
>```
>
>Three constraint plugins run automatically:
>1. [X] **📕 ConflictAwareConstraint**: Detects contradictions across sources
>2. [X] **📗 InsufficientEvidenceConstraint**: Blocks answers without evidence
>3. [X] **📘 CausalAttributionConstraint**: Prevents hallucinated cause-effect claims
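
To make the idea concrete, here is a deliberately simplified sketch of how retrieved evidence could map to an answer mode. It is not Fitz's constraint code: the `Evidence` fields, the thresholds, and the string comparison used to spot disagreement are all assumptions, and the causal-attribution check is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source_id: str
    claim: str    # normalized statement taken from the chunk (hypothetical field)
    score: float  # retrieval similarity score


def choose_mode(evidence: list[Evidence], min_score: float = 0.5, strong_score: float = 0.8) -> str:
    """Pick an answer mode from retrieved evidence; thresholds are illustrative."""
    usable = [e for e in evidence if e.score >= min_score]
    if not usable:
        return "ABSTAIN"      # nothing supports an answer
    if len({e.claim for e in usable}) > 1:
        return "DISPUTED"     # sources make different claims
    if max(e.score for e in usable) < strong_score:
        return "QUALIFIED"    # some support, but weak
    return "CONFIDENT"


mode = choose_mode([
    Evidence("policies/refund.md", "returns within 30 days", 0.92),
    Evidence("faq/payments.md", "returns within 14 days", 0.87),
])
```

Here the two sources disagree on the return window, so the sketch reports `DISPUTED` instead of silently picking one.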

<br>

#### Swappable RAG Engines 🔄

>Your data stays. Your queries stay. Only the engine changes.
>
>```
>        ┌─────────────────────────────────────┐
>        │           Your Query                │
>        │   "What are the payment terms?"     │
>        └──────────────────┬──────────────────┘
>                           │
>                           ▼
>        ┌─────────────────────────────────────┐
>        │       engine="..."                  │
>        │  ┌─────────┐ ┌───────┐ ┌─────────┐  │
>        │  │ fitz    │ │ clara │ │ graph   │  │
>        │  │  _rag   │ │       │ │  _rag   │  │
>        │  └────┬────┘ └───┬───┘ └────┬────┘  │
>        │       └──────────┼──────────┘       │
>        └──────────────────┼──────────────────┘
>                           │
>                           ▼
>        ┌─────────────────────────────────────┐
>        │       Your Ingested Knowledge       │
>        │      (unchanged across engines)     │
>        └─────────────────────────────────────┘
>```
>
>```python
>from fitz_ai import run
>
>answer = run("What are the payment terms?", engine="fitz_rag")
>answer = run("What are the payment terms?", engine="clara")
>answer = run("What are the payment terms?", engine="graph_rag")  # future
>```
>
>No migration. No re-ingestion. No new API to learn.

<br>

#### Full Provenance 🗂️

>Every answer traces back to its source:
>
>```
>Answer: The refund policy allows returns within 30 days...
>
>Sources:
>  [1] policies/refund.md [chunk 3] (score: 0.92)
>  [2] faq/payments.md [chunk 1] (score: 0.87)
>```

<br>

#### Incremental Ingestion ⚡

>Fitz tracks file hashes and only re-ingests what changed:
>
>```
>$ fitz ingest ./src
>
>Scanning... 847 files
>  → 12 new files
>  → 3 modified files
>  → 832 unchanged (skipped)
>
>Ingesting 15 files...
>```
>
>Re-running ingestion on a large codebase takes seconds, not minutes. Changed your chunking config? Fitz detects that too and re-processes affected files.
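
Hash-based change detection can be sketched as follows. This is a minimal illustration under stated assumptions, not Fitz's actual bookkeeping: the manifest is a plain dict, and mixing the chunking config into each file hash is one possible way to make a config change count as a modification.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path, config: dict) -> str:
    """Hash file bytes together with the chunking config, so either change is detected."""
    h = hashlib.sha256()
    h.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    h.update(path.read_bytes())
    return h.hexdigest()


def plan_ingest(root: Path, manifest: dict[str, str], config: dict) -> dict[str, list[str]]:
    """Classify files under root as new, modified, or unchanged against a stored manifest."""
    plan: dict[str, list[str]] = {"new": [], "modified": [], "unchanged": []}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = str(path.relative_to(root))
        digest = file_digest(path, config)
        if key not in manifest:
            plan["new"].append(key)
        elif manifest[key] != digest:
            plan["modified"].append(key)
        else:
            plan["unchanged"].append(key)
        manifest[key] = digest  # record the state for the next run
    return plan


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("hello")
    manifest: dict[str, str] = {}
    first = plan_ingest(root, manifest, {"chunk_size": 512})   # file is new
    second = plan_ingest(root, manifest, {"chunk_size": 512})  # nothing changed
    third = plan_ingest(root, manifest, {"chunk_size": 256})   # config changed
```

Only the files listed under `new` and `modified` would need to be chunked and embedded again; a real tool would also persist the manifest between runs.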

<br>

#### Smart Chunking 🧠

>Format-aware chunking that preserves structure:
>
>| Format | Strategy |
>|--------|----------|
>| **Python** | AST-aware: keeps classes, functions, imports intact. Large classes split by method. |
>| **Markdown** | Header-aware: splits on `#` headers, preserves code blocks and lists. Extracts YAML frontmatter as metadata. |
>| **PDF** | Section-aware: detects numbered headings (1.1, 2.3.1), roman numerals, and keywords (Abstract, Conclusion). |
>
>No more retrieving half a function or a code block split mid-syntax.
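
For Python sources, the standard-library `ast` module already provides what this kind of chunking needs. The sketch below is an assumption-laden simplification: it emits one chunk per top-level function or class and does not attempt the import handling or per-method splitting described in the table.

```python
import ast


def chunk_python(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                # get_source_segment returns the exact text spanned by the node
                "text": ast.get_source_segment(source, node),
            })
    return chunks


sample = '''
import os

def greet(name):
    """Say hello."""
    return f"hello {name}"

class Repo:
    def path(self):
        return os.getcwd()
'''
chunks = chunk_python(sample)
names = [c["name"] for c in chunks]
```

Because the boundaries come from the parser and not from a character count, a chunk never starts or ends in the middle of a definition.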

</details>

---

<details>

<summary><strong>📦 Quick Start</strong></summary>

<br>

#### CLI
>
>```bash
>pip install fitz-ai
>
>fitz quickstart ./docs "Your question here"
>```
>
>Fitz auto-detects your LLM provider:
>1. **Ollama running?** → Uses it automatically (fully local)
>2. **`COHERE_API_KEY` or `OPENAI_API_KEY` set?** → Uses it automatically
>3. **First time?** → Guides you through free Cohere signup (2 minutes)
>
>After the first run, it's completely zero-friction.

<br>

#### Python SDK
>
>```python
>import fitz_ai
>
>fitz_ai.ingest("./docs")
>answer = fitz_ai.query("Your question here")
>
>print(answer.text)
>for source in answer.provenance:
>    print(f"  - {source.source_id}: {source.excerpt[:50]}...")
>```
>
>The SDK provides:
>- Module-level functions matching the CLI (`ingest`, `query`)
>- Auto-config creation (no setup required)
>- Full provenance tracking
>- Same honest RAG as the CLI
>
>For advanced use (multiple collections), use the `fitz` class directly:
>```python
>from fitz_ai import fitz
>
>physics = fitz(collection="physics")
>physics.ingest("./physics_papers")
>answer = physics.query("Explain entanglement")
>```

<br>

#### Fully Local (Ollama)
>
>```bash
>pip install fitz-ai[local]
>
>ollama pull llama3.2
>ollama pull nomic-embed-text
>
>fitz quickstart ./docs "Your question here"
>```
>
>Fitz auto-detects Ollama when it is running. No API keys are needed, and no data leaves your machine.

</details>

---

<details>

<summary><strong>📦 Real-World Usage</strong></summary>

<br>

Fitz is a foundation. It handles document ingestion and grounded retrieval—you build whatever sits on top: chatbots, dashboards, alerts, or automation.

<br>

<strong>Chatbot Backend 🤖</strong>

> Connect fitz to Slack, Discord, Teams, or your own UI. One function call returns an answer with sources—no hallucinations, full provenance. You handle the conversation flow; fitz handles the knowledge.
>
> *Example:* A SaaS company plugs fitz into their support bot. Tier-1 questions like "How do I reset my password?" get instant answers. Their support team focuses on edge cases while fitz deflects 60% of incoming tickets.

<br>

<strong>Internal Knowledge Base 📖</strong>

> Point fitz at your company's wiki, policies, and runbooks. Employees ask natural language questions instead of hunting through folders or pinging colleagues on Slack.
>
> *Example:* A 200-person startup ingests their Notion workspace and compliance docs. New hires find answers to "How do I request PTO?" on day one—no more waiting for someone in HR to respond.

<br>

<strong>Continuous Intelligence & Alerting (Watchdog) 🐶</strong>

> Pair fitz with cron, Airflow, or Lambda. Ingest data on a schedule, run queries automatically, trigger alerts when conditions match. Fitz provides the retrieval primitive; you wire the automation.
>
> *Example:* A security team ingests SIEM logs nightly. Every morning, a scheduled job asks "Were there failed logins from unusual locations?" If fitz finds evidence, an alert fires to the on-call channel before anyone checks email.

<br>

<strong>Web Knowledge Base 🌎</strong>

> Scrape the web with Scrapy, BeautifulSoup, or Playwright. Save to disk, ingest with fitz. The web becomes a queryable knowledge base.
>
> *Example:* A football analytics hobbyist scrapes Premier League match reports. After ingesting, they ask "How did Arsenal perform against top 6 teams?" or "What tactics did Liverpool use in away games?"—insights that would take hours to compile manually.

<br>

<strong>Codebase Search 🐍</strong>

> Fitz includes built-in AST-aware chunking for codebases. Functions, classes, and modules become individual searchable units with docstrings and imports preserved. Ask questions in natural language; get answers pointing to specific code.
>
> *Example:* A team inherits a legacy Django monolith—200k lines, sparse docs. They ingest the codebase and ask "Where is user authentication handled?" or "What API endpoints modify the billing table?" New developers onboard in days instead of weeks.

</details>

---

<details>

<summary><strong>📦 Architecture</strong></summary>

<br>

```
┌───────────────────────────────────────────────────────────────┐
│                         fitz-ai                               │
├───────────────────────────────────────────────────────────────┤
│  User Interfaces                                              │
│  CLI: quickstart | init | ingest | query | chat | serve       │
│  SDK: fitz_ai.fitz() → ingest() → query()                     │
│  API: /query | /chat | /ingest | /collections | /health       │
├───────────────────────────────────────────────────────────────┤
│  Engines                                                      │
│  ┌───────────┐  ┌───────────┐  ┌────────────┐                 │
│  │  FitzRAG  │  │   CLaRa   │  │  GraphRAG  │  (pluggable)    │
│  └───────────┘  └───────────┘  └────────────┘                 │
├───────────────────────────────────────────────────────────────┤
│  Plugin System (all YAML-defined)                             │
│  ┌────────┐ ┌───────────┐ ┌────────┐ ┌──────────┐             │
│  │  Chat  │ │ Embedding │ │ Rerank │ │ VectorDB │             │
│  └────────┘ └───────────┘ └────────┘ └──────────┘             │
│  openai, cohere, anthropic, ollama, azure...                  │
├───────────────────────────────────────────────────────────────┤
│  Retrieval Pipelines (YAML-composed)                          │
│  dense.yaml | dense_rerank.yaml | custom...                   │
├───────────────────────────────────────────────────────────────┤
│  Enrichment (opt-in)                                          │
│  entities | entity links | semantic clusters | hierarchical   │
├───────────────────────────────────────────────────────────────┤
│  Constraints (epistemic safety)                               │
│  ConflictAware | InsufficientEvidence | CausalAttribution     │
└───────────────────────────────────────────────────────────────┘
```

</details>

---

<details>

<summary><strong>📦 CLI Reference</strong></summary>

<br>

```bash
fitz quickstart [PATH] [QUESTION]    # Zero-config RAG (start here)
fitz init                            # Interactive setup wizard
fitz ingest                          # Interactive ingestion
fitz query                           # Single question with sources
fitz chat                            # Multi-turn conversation with your knowledge base
fitz collections                     # List and delete knowledge collections
fitz serve                           # Start REST API server
fitz config                          # View/edit configuration
fitz doctor                          # System diagnostics
```

</details>

---

<details>

<summary><strong>📦 Python SDK Reference</strong></summary>

<br>

**Simple usage (module-level, matches CLI):**
```python
import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is the refund policy?")
print(answer.text)
```

<br>

**Advanced usage (multiple collections):**
```python
from fitz_ai import fitz

# Create separate instances for different collections
physics = fitz(collection="physics")
physics.ingest("./physics_papers")

legal = fitz(collection="legal")
legal.ingest("./contracts")

# Query each collection
physics_answer = physics.query("Explain entanglement")
legal_answer = legal.query("What are the payment terms?")
```

<br>

**Working with answers:**
```python
import fitz_ai

answer = fitz_ai.query("What is the refund policy?")

print(answer.text)
print(answer.mode)  # CONFIDENT, QUALIFIED, DISPUTED, or ABSTAIN

for source in answer.provenance:
    print(f"Source: {source.source_id}")
    print(f"Excerpt: {source.excerpt}")
```

</details>

---

<details>

<summary><strong>📦 REST API Reference</strong></summary>

<br>

**Start the server:**
```bash
pip install fitz-ai[api]

fitz serve                    # localhost:8000
fitz serve -p 3000            # custom port
fitz serve --host 0.0.0.0     # all interfaces
```

**Interactive docs:** Visit `http://localhost:8000/docs` for Swagger UI.

<br>

**Endpoints:**

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/query` | Query knowledge base |
| POST | `/chat` | Multi-turn chat (stateless) |
| POST | `/ingest` | Ingest documents from path |
| GET | `/collections` | List all collections |
| GET | `/collections/{name}` | Get collection stats |
| DELETE | `/collections/{name}` | Delete a collection |
| GET | `/health` | Health check |

<br>

**Example requests:**

```bash
# Query
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the refund policy?", "collection": "default"}'

# Ingest
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"source": "./docs", "collection": "mydata"}'

# Chat (stateless - client manages history)
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What about returns?",
    "history": [
      {"role": "user", "content": "What is the refund policy?"},
      {"role": "assistant", "content": "The refund policy allows..."}
    ],
    "collection": "default"
  }'
```
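
Because `/chat` is stateless, the client has to carry the conversation history itself. The sketch below shows one way to do that from Python using only the standard library. The request fields mirror the curl example above; the helper names are hypothetical, and the response body is treated as opaque JSON because its shape is not documented here.

```python
import json
import urllib.request


def build_chat_payload(message: str, history: list[dict], collection: str = "default") -> dict:
    """Build the /chat request body using the field names from the curl example."""
    return {"message": message, "history": list(history), "collection": collection}


def record_turn(history: list[dict], user_message: str, assistant_reply: str) -> list[dict]:
    """Return a new history with one completed user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": assistant_reply},
    ]


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (requires a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


history: list[dict] = []
history = record_turn(history, "What is the refund policy?", "The refund policy allows...")
payload = build_chat_payload("What about returns?", history)
# With `fitz serve` running: reply = post_json("http://localhost:8000/chat", payload)
```

After each reply, call `record_turn` again so the next request carries the full exchange.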

</details>

---

<details>

<summary><strong>📦 Beyond RAG</strong></summary>

<br>

> **RAG is a method. Knowledge access is a strategy.**

Fitz is not a RAG framework. It's a knowledge platform that *currently* uses RAG as its primary engine.

```python
from fitz_ai import run

# Fitz RAG - fast, reliable vector search
answer = run("What are the payment terms?", engine="fitz_rag")

# CLaRa - compressed RAG, 16x smaller context
answer = run("What are the payment terms?", engine="clara")

# GraphRAG - knowledge graph with entity extraction and community summaries
answer = run("What are the payment terms?", engine="graphrag")
```

The engine is an implementation detail. Your ingested knowledge, your queries, your workflow—all stay the same. When a better retrieval paradigm emerges, swap one line, not your entire codebase.

</details>

---

<details>

<summary><strong>📦 Philosophy</strong></summary>

<br>

**Principles:**
- **Explicit over clever**: No magic. Read the config, know what happens.
- **Answers over architecture**: Optimize for time-to-insight, not flexibility.
- **Honest over helpful**: Better to say "I don't know" than hallucinate.
- **Files over frameworks**: YAML plugins over class hierarchies.

</details>

---

### License

MIT

---

### Links

- [GitHub](https://github.com/yafitzdev/fitz-ai)
- [PyPI](https://pypi.org/project/fitz-ai/)
- [Changelog](CHANGELOG.md)
- [CLI Documentation](docs/CLI.md)
