Metadata-Version: 2.4
Name: fabra_ai
Version: 2.0.5
Summary: Context infrastructure for AI applications
Project-URL: Homepage, https://github.com/davidahmann/fabra
Project-URL: Repository, https://github.com/davidahmann/fabra
Project-URL: Documentation, https://github.com/davidahmann/fabra#readme
Author-email: Fabra Contributors <dahmann@lumyn.cc>
License: Apache-2.0
License-File: LICENSE
Keywords: data-engineering,feature-store,machine-learning,mlops
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Requires-Dist: anthropic>=0.18.0
Requires-Dist: apscheduler>=3.10.0
Requires-Dist: asyncpg>=0.29.0
Requires-Dist: cohere>=4.27.0
Requires-Dist: duckdb>=0.10.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: greenlet>=3.0.0
Requires-Dist: openai>=2.9.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pgvector>=0.4.2
Requires-Dist: prometheus-client>=0.17.0
Requires-Dist: psycopg2-binary>=2.9.11
Requires-Dist: pybreaker>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: redis>=5.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: sqlalchemy[asyncio]>=2.0.0
Requires-Dist: sqlglot>=18.0.0
Requires-Dist: structlog>=23.1.0
Requires-Dist: tiktoken>=0.12.0
Requires-Dist: typer>=0.9.0
Requires-Dist: uuid6>=2024.1.12
Requires-Dist: uvicorn>=0.20.0
Provides-Extra: dev
Requires-Dist: bandit>=1.7.0; extra == 'dev'
Requires-Dist: fakeredis>=2.0.0; extra == 'dev'
Requires-Dist: httpx>=0.24.0; extra == 'dev'
Requires-Dist: mkdocs-material>=9.5.0; extra == 'dev'
Requires-Dist: mkdocs>=1.5.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pandas-stubs>=2.0.0; extra == 'dev'
Requires-Dist: pre-commit>=3.5.0; extra == 'dev'
Requires-Dist: prometheus-client>=0.17.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: testcontainers>=3.7.1; extra == 'dev'
Requires-Dist: types-redis; extra == 'dev'
Requires-Dist: types-sqlalchemy; extra == 'dev'
Provides-Extra: ui
Requires-Dist: streamlit; extra == 'ui'
Requires-Dist: watchdog; extra == 'ui'
Description-Content-Type: text/markdown

<div align="center">
  <h1>Fabra</h1>
  <p><b>Context Infrastructure for AI Applications</b></p>

  <p>
    <a href="https://pypi.org/project/fabra-ai/"><img src="https://img.shields.io/pypi/v/fabra-ai?color=blue&label=pypi" alt="PyPI version" /></a>
    <a href="https://github.com/davidahmann/fabra/blob/main/LICENSE"><img src="https://img.shields.io/github/license/davidahmann/fabra?color=green" alt="License" /></a>
    <img src="https://img.shields.io/badge/python-3.9+-blue.svg" alt="Python Version" />
  </p>

  <p>
    <a href="https://fabraoss.vercel.app"><b>Try in Browser</b></a> ·
    <a href="https://davidahmann.github.io/fabra/docs/quickstart"><b>Quickstart</b></a> ·
    <a href="https://davidahmann.github.io/fabra/docs/"><b>Docs</b></a>
  </p>
</div>

---

**Fabra** is the system of record for what your AI knows. We ingest, index, track freshness, and serve context data — not just query it.

This "write path ownership" enables:
- **Replay any AI decision** — What exactly did the model know?
- **Full lineage tracking** — Which features, documents, and retrievers were used?
- **Freshness guarantees** — Was the data stale when the decision was made?

```bash
pip install "fabra-ai[ui]"
```

---

## Choose Your Path

<table>
<tr>
<td width="50%" valign="top">

### ML Engineers
**"Feast needs Kubernetes. I just need features."**

```python
from fabra import FeatureStore, entity, feature

store = FeatureStore()

@entity(store)
class User:
    user_id: str

@feature(entity=User, refresh="hourly")
def purchase_count(user_id: str) -> int:
    # `db` is a placeholder for your own database client
    return db.query("SELECT COUNT(*) FROM purchases WHERE user_id = ?", user_id)
```

```bash
fabra serve features.py
curl "localhost:8000/features/purchase_count?user_id=123"
```

No Kubernetes. No Spark. No YAML. Just Python.

**[Feature Store Without K8s →](https://davidahmann.github.io/fabra/docs/feature-store-without-kubernetes)** · **[Feast vs Fabra →](https://davidahmann.github.io/fabra/docs/feast-alternative)**

</td>
<td width="50%" valign="top">

### AI Engineers
**"Someone asked what the AI knew. I couldn't tell them."**

```python
from fabra import FeatureStore, context, ContextItem
from fabra.retrieval import retriever

store = FeatureStore()

@retriever(index="docs", top_k=5)
async def search_docs(query: str):
    pass  # Auto-wired to pgvector

@context(store, max_tokens=4000)
async def build_prompt(user_id: str, query: str):
    docs = await search_docs(query)
    return [ContextItem(content=str(docs), priority=0)]

# Run inside an async function (or a notebook with top-level await)
ctx = await build_prompt("user_123", "query")
print(ctx.id)       # Replay this exact context anytime
print(ctx.lineage)  # What data was used?
```

**[Context Traceability →](https://davidahmann.github.io/fabra/docs/rag-audit-trail)** · **[Compliance Guide →](https://davidahmann.github.io/fabra/docs/compliance-guide)**

</td>
</tr>
</table>

---

## Why Engineers Choose Fabra

### 1. We Own the Write Path

LangChain, Pinecone, and similar tools are **read-only wrappers**: they query your data but don't manage it. When compliance asks "what did the AI know?", those tools cannot answer.

Fabra ingests, indexes, and serves context data. Every decision traces back through the data that informed it.

```python
# Replay any historical context by its ID (call from async code)
ctx = await store.get_context_at("01912345-6789-7abc-def0-123456789abc")
print(ctx.content)   # Exact prompt from that moment
print(ctx.lineage)   # Complete data provenance
```
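
Fabra lists UUIDv7 context IDs among its capabilities, and a UUIDv7 carries a millisecond Unix timestamp in its first 48 bits (RFC 9562). As a minimal sketch using only the standard library, the creation time can be read straight from an ID like the one above. `uuid7_timestamp` is a hypothetical helper name, not part of Fabra's documented API.

```python
import uuid
from datetime import datetime, timezone


def uuid7_timestamp(context_id: str) -> datetime:
    """Return the creation time embedded in a UUIDv7 string (hypothetical helper)."""
    value = uuid.UUID(context_id).int
    version = (value >> 76) & 0xF  # version nibble sits right after the 48-bit timestamp
    if version != 7:
        raise ValueError(f"expected a UUIDv7, got version {version}")
    unix_ms = value >> 80  # top 48 bits: milliseconds since the Unix epoch
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


created_at = uuid7_timestamp("01912345-6789-7abc-def0-123456789abc")
print(created_at.isoformat())
```

Because the timestamp leads the ID, sorting context IDs lexicographically also sorts them by creation time.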

### 2. Local-First, Production-Ready

Same code runs everywhere. DuckDB locally, Postgres + Redis in production.

```bash
# Development (zero setup)
fabra serve features.py

# Production (just add env vars)
FABRA_ENV=production \
FABRA_POSTGRES_URL=postgresql+asyncpg://... \
FABRA_REDIS_URL=redis://... \
fabra serve features.py
```

No Docker for local dev. No Kubernetes for production. Deploy to Fly.io, Railway, Cloud Run, or any container platform with one command.
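
To make the environment switch concrete, here is a rough sketch of how backend selection driven by these variables could look. It is an illustration only, not Fabra's actual implementation: `select_backends` and the dictionary keys are hypothetical, and treating a missing `FABRA_ENV` as development is an assumption.

```python
import os
from typing import Mapping


def select_backends(env: Mapping[str, str] = os.environ) -> dict:
    """Pick storage backends from environment variables (illustrative only)."""
    # Assumption: an unset FABRA_ENV means local development.
    if env.get("FABRA_ENV", "development") == "production":
        # Indexing with [] raises KeyError early if a production URL is missing.
        return {
            "offline_store": env["FABRA_POSTGRES_URL"],
            "online_store": env["FABRA_REDIS_URL"],
        }
    return {"offline_store": "duckdb (embedded)", "online_store": "in-memory cache"}


print(select_backends({}))
print(
    select_backends(
        {
            "FABRA_ENV": "production",
            "FABRA_POSTGRES_URL": "postgresql+asyncpg://...",
            "FABRA_REDIS_URL": "redis://...",
        }
    )
)
```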

### 3. Point-in-Time Correctness

Training ML models? We use `ASOF JOIN` (DuckDB) and `LATERAL JOIN` (Postgres) so that each training row sees only the feature values that existed at its event timestamp. No future data leaks into your training set.
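
The semantics of a point-in-time join can be shown without a database. The sketch below is a plain-Python stand-in for what an as-of join computes, not Fabra's implementation; `asof_lookup` and the sample data are made up for illustration.

```python
from bisect import bisect_right
from datetime import datetime


def asof_lookup(history, ts):
    """Return the latest value recorded at or before `ts`, or None if there is none.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, ts)  # number of records with timestamp <= ts
    return history[i - 1][1] if i else None


# Hypothetical feature history: purchase_count as it changed over time.
feature_history = [
    (datetime(2024, 1, 1), 3),
    (datetime(2024, 1, 10), 5),
    (datetime(2024, 1, 20), 9),
]

# Hypothetical label events; each row may only see values known at its timestamp.
label_events = [datetime(2023, 12, 31), datetime(2024, 1, 10), datetime(2024, 1, 15)]

training_rows = [(ts, asof_lookup(feature_history, ts)) for ts in label_events]
for ts, value in training_rows:
    print(ts.date(), value)
```

The event on 2024-01-15 gets the value 5 rather than 9, because 9 was not recorded until five days later.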

### 4. Token Budget Management

No more "context too long" errors. Priority-based truncation keeps your prompts under budget.

```python
@context(store, max_tokens=4000)
async def build_prompt(user_id: str, query: str):
    return [
        ContextItem(content=system_prompt, priority=0, required=True),
        ContextItem(content=docs, priority=1),
        ContextItem(content=history, priority=2),  # Dropped first if over budget
    ]
```
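
For intuition, here is a simplified sketch of priority-based dropping. It is not Fabra's implementation: `Item`, `count_tokens`, and `fit_to_budget` are hypothetical names, and counting whitespace-separated words is a stand-in for a real tokenizer such as `tiktoken`. Priority numbers follow the convention above, where 0 is the most important.

```python
from dataclasses import dataclass


@dataclass
class Item:
    content: str
    priority: int
    required: bool = False


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count approximates the token count.
    return len(text.split())


def fit_to_budget(items, max_tokens):
    """Drop optional items, least important first, until the total fits."""
    kept = list(items)
    total = sum(count_tokens(i.content) for i in kept)
    # Highest priority number = least important = dropped first.
    droppable = sorted(
        (i for i in kept if not i.required), key=lambda i: i.priority, reverse=True
    )
    for item in droppable:
        if total <= max_tokens:
            break
        kept = [i for i in kept if i is not item]
        total -= count_tokens(item.content)
    if total > max_tokens:
        raise ValueError("required items alone exceed the token budget")
    return kept


items = [
    Item("You are a helpful assistant.", priority=0, required=True),  # 5 words
    Item("doc " * 6, priority=1),   # 6 words
    Item("msg " * 10, priority=2),  # 10 words
]
kept = fit_to_budget(items, max_tokens=12)
print([i.priority for i in kept])
```

With a budget of 12, the 21-word total is over the limit, so the priority-2 history item is dropped and the remaining 11 words fit.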

---

## Key Capabilities

### For ML Engineers

| Capability | Description |
|:-----------|:------------|
| **Python Decorators** | `@feature` instead of 500 lines of YAML |
| **DuckDB + Postgres** | Local dev with embedded DB, production with Postgres |
| **Point-in-Time Joins** | ASOF/LATERAL joins for training data correctness |
| **Hybrid Features** | Mix Python logic and SQL in the same pipeline |
| **One-Command Deploy** | `fabra deploy fly\|cloudrun\|railway\|render` |

### For AI Engineers

| Capability | Description |
|:-----------|:------------|
| **Context Accountability** | UUIDv7 IDs, full lineage, replay any decision |
| **Vector Search** | Built-in pgvector with automatic chunking |
| **Token Budgets** | `max_tokens` with priority-based truncation |
| **Freshness SLAs** | Fail-safe when data is stale |
| **Export** | `fabra context export` for debugging and compliance |

### Production Features

- **Observability:** Prometheus metrics, OpenTelemetry tracing
- **Reliability:** Circuit breakers, fallback chains, `fabra doctor`
- **Security:** Self-hosted, so your data never leaves your infrastructure

---

## Architecture

```
Development                         Production
┌─────────────────────┐            ┌─────────────────────────┐
│  Your Python Code   │            │   Your Python Code      │
│ (@feature, @context)│            │   (@feature, @context)  │
└──────────┬──────────┘            └───────────┬─────────────┘
           │                                   │
           ▼                                   ▼
┌─────────────────────┐            ┌─────────────────────────┐
│  DuckDB (embedded)  │            │  Postgres + pgvector    │
│  In-Memory Cache    │            │  Redis                  │
└─────────────────────┘            └─────────────────────────┘

Same code. Same decorators. Different backends.
FABRA_ENV=development → FABRA_ENV=production
```

---

## Comparison

### vs Feast (Feature Store)

| | Feast | Fabra |
|:---|:---|:---|
| Setup | Kubernetes + Spark | `pip install` |
| Configuration | YAML | Python decorators |
| Time to production | Weeks | 30 seconds |
| RAG support | None | Built-in Context Store |
| Traceability | None | Full lineage |

**Use Feast when:** You have a platform team and existing K8s/Spark infrastructure.

### vs LangChain (RAG)

| | LangChain | Fabra |
|:---|:---|:---|
| Type | Framework (orchestration) | Infrastructure (storage + serving) |
| Traceability | None | Full lineage + replay |
| Token budgets | DIY | Built-in |
| Data ownership | Read-only wrapper | Write path owner |

**Use LangChain when:** You need agent orchestration and have no compliance or audit requirements.

---

## Get Started

```bash
pip install "fabra-ai[ui]"

# ML Engineers: Serve features
fabra serve features.py

# AI Engineers: Index documents and serve context
fabra serve chatbot.py
```

<p align="center">
  <a href="https://fabraoss.vercel.app"><b>Try in Browser</b></a> ·
  <a href="https://davidahmann.github.io/fabra/docs/quickstart"><b>Quickstart Guide</b></a> ·
  <a href="https://davidahmann.github.io/fabra/docs/"><b>Full Documentation</b></a>
</p>

---

## Roadmap

- [x] **v1.0:** Core Feature Store (DuckDB, Postgres, Redis)
- [x] **v1.2:** Context Store (pgvector, retrievers, token budgets)
- [x] **v1.3:** UI, Magic Retrievers, One-Command Deploy
- [x] **v1.4:** Context Accountability (lineage, replay, traceability)
- [x] **v1.5:** Freshness SLAs (data freshness guarantees)
- [ ] **v1.6:** Drift detection, RBAC, multi-region

---

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) to get started.

<div align="center">
  <p><b>Fabra</b> · Apache 2.0 · 2025</p>
</div>
