Metadata-Version: 2.4
Name: relay-llm
Version: 0.3.0
Summary: Multi-provider LLM batch prediction library
License: MIT
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: aiofiles>=23.0
Requires-Dist: aiosqlite>=0.20
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.0
Requires-Dist: rich>=13.0
Requires-Dist: tomli>=2.0; python_version < '3.11'
Requires-Dist: typer>=0.12
Requires-Dist: zstandard>=0.22
Provides-Extra: all
Requires-Dist: relay[anthropic,dashboard,google,hf,openai,parquet,redis,xai]; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.43; extra == 'anthropic'
Provides-Extra: dashboard
Requires-Dist: fastapi>=0.115; extra == 'dashboard'
Requires-Dist: jinja2>=3.1; extra == 'dashboard'
Requires-Dist: textual>=0.80; extra == 'dashboard'
Requires-Dist: uvicorn[standard]>=0.30; extra == 'dashboard'
Provides-Extra: dev
Requires-Dist: hypothesis>=6.0; extra == 'dev'
Requires-Dist: mypy>=1.12; extra == 'dev'
Requires-Dist: pre-commit>=4.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest-vcr>=1.0; extra == 'dev'
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: ruff>=0.7; extra == 'dev'
Provides-Extra: google
Requires-Dist: google-cloud-aiplatform>=1.70; extra == 'google'
Requires-Dist: google-generativeai>=0.8; extra == 'google'
Provides-Extra: hf
Requires-Dist: datasets>=3.0; extra == 'hf'
Provides-Extra: openai
Requires-Dist: openai>=1.50; extra == 'openai'
Requires-Dist: tiktoken>=0.7; extra == 'openai'
Provides-Extra: otel
Requires-Dist: opentelemetry-api>=1.28; extra == 'otel'
Requires-Dist: opentelemetry-exporter-otlp>=1.28; extra == 'otel'
Requires-Dist: opentelemetry-sdk>=1.28; extra == 'otel'
Provides-Extra: parquet
Requires-Dist: pyarrow>=17.0; extra == 'parquet'
Provides-Extra: redis
Requires-Dist: redis[hiredis]>=5.0; extra == 'redis'
Provides-Extra: xai
Requires-Dist: httpx>=0.27; extra == 'xai'
Description-Content-Type: text/markdown

# relay

A unified, provider-agnostic Python library for submitting, managing, and downloading results from large-scale LLM batch prediction jobs across Anthropic, OpenAI, Google, and XAI.

## Installation

```bash
# Minimal
pip install relay

# All providers + full feature set
pip install relay[all]

# Individual providers
pip install relay[anthropic]
pip install relay[openai]
pip install relay[google]
pip install relay[xai]
```

## Quickstart

```python
import asyncio
from relay import BatchClient, BatchRequest, BatchConfig, Message

requests = [
    BatchRequest(id=f"req-{i}", messages=[Message(role="user", content=f"Summarize: {text}")])
    for i, text in enumerate(my_texts)
]

config = BatchConfig(provider="anthropic", model="claude-opus-4-5")

async def main():
    async with BatchClient() as client:
        job = await client.submit(requests, config)
        results = await client.wait_and_download(job.id)
    return results

results = asyncio.run(main())
```

## Features

- **Provider-agnostic API** — switch providers with a single config change
- **Resilient job state** — jobs survive process crashes; state is persisted to SQLite
- **Smart caching** — identical `(prompt, model, params)` tuples are never sent twice
- **Cost estimation** — token count and cost estimate before every submission
- **Budget controls** — configurable warn, confirm, and hard-limit thresholds
- **Fan-out** — send the same request to multiple providers simultaneously
- **Multiple export formats** — JSONL, CSV, Parquet, HuggingFace Dataset
- **Terminal dashboard** — live TUI for monitoring jobs, costs, and cache
- **Web dashboard** — lightweight FastAPI UI for remote monitoring
- **Observability** — Prometheus, OpenTelemetry, and JSONL metrics exporters

## Provider Compatibility

| Provider  | Model family      | Batch API | Streaming | Fan-out |
|-----------|-------------------|-----------|-----------|---------|
| Anthropic | Claude 3/4        | Yes       | No        | Yes     |
| OpenAI    | GPT-4o, o-series  | Yes       | No        | Yes     |
| Google    | Gemini 1.5/2.0    | Yes       | No        | Yes     |
| XAI       | Grok              | Yes       | No        | Yes     |

## CLI

```bash
# Submit a batch job
relay submit requests.jsonl --provider anthropic --model claude-opus-4-5

# Submit and wait for results
relay run requests.jsonl --provider openai --model gpt-4o --output results.jsonl

# Dry-run cost estimate
relay estimate requests.jsonl --provider anthropic --model claude-opus-4-5

# List and monitor jobs
relay jobs list
relay jobs status <job-id>

# Export results
relay export <job-id> --format parquet --output results/

# Cache management
relay cache stats
relay cache vacuum

# Spending report
relay costs today
```

## Documentation

See the [spec.md](spec.md) for the full technical specification and API reference.
