Metadata-Version: 2.4
Name: costrouter
Version: 0.1.0
Summary: Production-ready LLM routing library — multi-model routing across N cost tiers
Project-URL: Homepage, https://github.com/nikos118/costrouter
Project-URL: Repository, https://github.com/nikos118/costrouter
Project-URL: Issues, https://github.com/nikos118/costrouter/issues
Author: nikos118
License-Expression: MIT
License-File: LICENSE
Keywords: anthropic,cost,litellm,llm,openai,routing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: litellm>=1.0.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-asyncio; extra == 'dev'
Description-Content-Type: text/markdown

# CostRouter

Stop paying GPT-4o prices for "What's 2+2?"

CostRouter automatically sends simple questions to cheap models and hard questions to expensive ones. Use it as a drop-in replacement for your LLM calls; savings of up to 85% are possible, depending on how many of your prompts are simple.

## Install

```bash
pip install costrouter
```

## Quick Start

```python
import os
from costrouter import CostRouter

os.environ["OPENAI_API_KEY"] = "sk-..."

# List your models from cheapest to most expensive
router = CostRouter(["gpt-4o-mini", "gpt-4o"])

# Use it like normal
result = router.completion(messages=[{"role": "user", "content": "Hi!"}])
print(result.content)  # "Hello! How can I help you?"
print(result.model)    # "gpt-4o-mini" (cheap model — it was just a greeting)
print(result.tier)     # "simple"
```

That's it. Simple questions go to `gpt-4o-mini`, hard ones go to `gpt-4o`.

## How It Works

Every request gets classified into a complexity tier before being routed:

| Prompt | Tier | Goes to |
|--------|------|---------|
| "Hi there!" | simple | gpt-4o-mini |
| "Summarize this article" | moderate | gpt-4o-mini |
| "Write a Python web scraper" | complex | gpt-4o |
| "Prove the Riemann hypothesis" | expert | gpt-4o |

The classifier is a cheap LLM call (~$0.0001) that runs before your actual request. Obvious cases (greetings, short questions, code keywords) are caught by free pattern matching and skip the classifier entirely.
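
The two-stage flow described above can be sketched roughly as follows. This is an illustration only, not CostRouter's actual implementation: the regex patterns, the confidence values, and the `llm_classify` callback are all assumptions made for the example.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical patterns; CostRouter's real heuristics may differ.
GREETING = re.compile(r"^\s*(hi|hello|hey|thanks|thank you)\b", re.IGNORECASE)
CODE_HINT = re.compile(r"\b(python|function|class|sql|regex|scraper)\b", re.IGNORECASE)


def heuristic_classify(text: str) -> Optional[Tuple[str, float]]:
    """Return (tier, confidence) for obvious prompts, or None if unsure."""
    if GREETING.match(text) and len(text) < 40:
        return "simple", 0.95
    if CODE_HINT.search(text):
        return "complex", 0.8
    return None


def cascade_classify(
    text: str,
    llm_classify: Callable[[str], str],
    confidence_threshold: float = 0.7,
) -> Tuple[str, str]:
    """Try the free heuristic first; ask the LLM classifier only if unsure.

    Returns (tier, source), where source is "heuristic" or "llm".
    """
    guess = heuristic_classify(text)
    if guess is not None and guess[1] >= confidence_threshold:
        return guess[0], "heuristic"
    return llm_classify(text), "llm"
```

Here `llm_classify` stands in for the cheap classifier call, so only prompts the heuristic cannot place with enough confidence incur the screening cost.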

## More Models

Pass between one and four models, ordered from cheapest to most expensive. They map to tiers by position:

```python
# 2 models: simple + complex
router = CostRouter(["gpt-4o-mini", "gpt-4o"])

# 3 models: simple + moderate + complex
router = CostRouter(["gpt-4o-mini", "gpt-4o", "claude-opus-4-6"])

# 4 models: simple + moderate + complex + expert
router = CostRouter(["gpt-4o-mini", "gpt-4o", "claude-sonnet-4-5-20250929", "claude-opus-4-6"])
```
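
One way to picture the positional mapping is the sketch below. The tier assignments per list length follow the comments in the examples above; the rule for unassigned tiers (reuse the nearest assigned tier below) is an assumption inferred from the routing table in "How It Works", and `map_models_to_tiers` is a hypothetical helper, not part of the library.

```python
from typing import Dict, List

TIERS = ["simple", "moderate", "complex", "expert"]

# Tiers assigned by position, following the comments in the examples above.
POSITIONS = {
    1: ["simple"],
    2: ["simple", "complex"],
    3: ["simple", "moderate", "complex"],
    4: ["simple", "moderate", "complex", "expert"],
}


def map_models_to_tiers(models: List[str]) -> Dict[str, str]:
    """Map every tier to a model, given 1 to 4 models ordered cheapest first."""
    if len(models) not in POSITIONS:
        raise ValueError("expected between 1 and 4 models")
    assigned = dict(zip(POSITIONS[len(models)], models))
    mapping = {}
    current = models[0]
    for tier in TIERS:
        # Assumption: an unassigned tier reuses the nearest assigned tier below it.
        current = assigned.get(tier, current)
        mapping[tier] = current
    return mapping
```

With two models this reproduces the table shown earlier: `simple` and `moderate` resolve to the first model, `complex` and `expert` to the second.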

## Mix Providers

Use OpenAI for cheap calls and Anthropic for hard ones (or any of the [100+ providers litellm supports](https://docs.litellm.ai/docs/providers)):

```python
from costrouter import CostRouter, CostTier, ModelConfig

router = CostRouter([
    ModelConfig(name="gpt-4o-mini", tier=CostTier.SIMPLE, api_key="sk-..."),
    ModelConfig(name="claude-opus-4-6", tier=CostTier.COMPLEX, api_key="sk-ant-..."),
])
```

## Track Your Savings

```python
# After some requests...
print(router.analytics.summary())
```

```
CostRouter Analytics
====================
Total requests: 150
Total cost: $0.45
Cost if always using most expensive: $3.20
Total saved: $2.75 (86%)

Routing breakdown:
  simple      45.0% (68 requests)  -> gpt-4o-mini
  moderate    30.0% (45 requests)  -> gpt-4o-mini
  complex     20.0% (30 requests)  -> gpt-4o
  expert       5.0% (7 requests)   -> gpt-4o
```
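
The "Total saved" line in the sample output is the baseline cost minus the actual cost, expressed in dollars and as a share of the baseline. A minimal sketch of that arithmetic, using a hypothetical `savings` helper rather than anything from the library:

```python
def savings(actual_cost: float, baseline_cost: float) -> tuple[float, float]:
    """Return (dollars saved, percent saved) relative to the baseline."""
    saved = baseline_cost - actual_cost
    percent = 100 * saved / baseline_cost if baseline_cost else 0.0
    return saved, percent


# Figures taken from the sample summary above.
saved, percent = savings(actual_cost=0.45, baseline_cost=3.20)
print(f"Total saved: ${saved:.2f} ({percent:.0f}%)")  # Total saved: $2.75 (86%)
```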

```python
# Export for analysis
router.analytics.export_csv("routing_log.csv")

# Programmatic access
data = router.analytics.to_dict()
```

## Async

```python
# Call this from inside an async function (or a REPL/notebook with top-level await)
result = await router.acompletion(messages=[{"role": "user", "content": "Hi"}])
print(result.content)
```
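
If you have several prompts, the async method can be combined with `asyncio.gather` to send them concurrently. The sketch below assumes `router.acompletion` can safely be awaited for multiple requests at once, which this README does not state; `ask_all` is a hypothetical helper.

```python
import asyncio


async def ask_all(router, prompts):
    """Send several prompts through the router concurrently."""
    tasks = [
        router.acompletion(messages=[{"role": "user", "content": prompt}])
        for prompt in prompts
    ]
    # gather preserves input order, so results[i] answers prompts[i].
    return await asyncio.gather(*tasks)
```

From synchronous code, this could be run with `asyncio.run(ask_all(router, ["Hi", "Write a Python web scraper"]))`.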

## Pass-Through Parameters

Extra keyword arguments that you would normally pass to OpenAI or litellm, such as `temperature` or `max_tokens`, work here too:

```python
result = router.completion(
    messages=[{"role": "user", "content": "Write a haiku"}],
    temperature=0.9,
    max_tokens=200,
)
```

## Custom Routing Logic

Subclass `BaseRouter` to define your own rules:

```python
from costrouter import BaseRouter, CostRouter, CostTier

class MyRouter(BaseRouter):
    def classify(self, messages):
        text = messages[-1]["content"]
        if "urgent" in text.lower():
            return CostTier.EXPERT, 1.0, 0.0
        return CostTier.SIMPLE, 0.8, 0.0

    async def aclassify(self, messages):
        return self.classify(messages)

router = CostRouter(
    models=["gpt-4o-mini", "gpt-4o"],
    strategy=MyRouter(),
)
```

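`classify` returns a tuple `(tier, confidence, screener_cost)`. In the example above, `screener_cost` is `0.0` because the rules run locally without an LLM call.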

## Response Object

```python
result = router.completion(messages=[...])

result.content     # Response text (shortcut)
result.model       # Which model handled it
result.tier        # Classified complexity
result.cost        # Cost of the LLM call
result.total_cost  # Screening + LLM call
result.saved       # $ saved vs most expensive model
result.response    # Full litellm response object
```

## Advanced Options

```python
router = CostRouter(
    models=["gpt-4o-mini", "gpt-4o"],
    screener="gpt-4o-mini",       # Model used to classify (default: cheapest)
    strategy="cascade",            # "cascade" (default), "heuristic" (free), or "llm"
    confidence_threshold=0.7,      # How confident the heuristic must be to skip the LLM classifier
    fallback=True,                 # If a model fails, try the next tier up
)
```

**Strategies:**

- **`cascade`** (default) — Free pattern matching first. If unsure, asks a cheap LLM. Best balance of cost and accuracy.
- **`heuristic`** — Pattern matching only. Free and instant, but less accurate on ambiguous prompts.
- **`llm`** — Always uses a cheap LLM to classify. Most accurate, costs ~$0.0001 per request.
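
The `fallback=True` behaviour (try the next tier up when a model fails) might look something like the following. This is a sketch under assumptions, not the library's code: `complete_with_fallback` and the `call_model` callback are hypothetical, and a real router would likely catch narrower exception types.

```python
from typing import Callable, List, Optional, Tuple


def complete_with_fallback(
    models: List[str],
    start_index: int,
    call_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Try models[start_index], then each more expensive model in turn.

    `models` is ordered cheapest first. Returns (model_name, response).
    """
    last_error: Optional[Exception] = None
    for name in models[start_index:]:
        try:
            return name, call_model(name)
        except Exception as exc:  # a real router would catch narrower errors
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

Escalating upward only means a failure never downgrades a request to a weaker model than the one the classifier chose.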

## Requirements

- Python >= 3.10
- `litellm` + `pydantic` (no PyTorch, no Transformers)
