Metadata-Version: 2.4
Name: robot-resources-router
Version: 2.0.0
Summary: Intelligent LLM routing proxy — cost optimization via local proxy
Project-URL: Homepage, https://github.com/robot-resources/robot-resources
Project-URL: Documentation, https://github.com/robot-resources/robot-resources#readme
Project-URL: Repository, https://github.com/robot-resources/robot-resources
Author: Robot Resources Team
License: MIT
Keywords: ai,cost-optimization,llm,proxy,router,routing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: click>=8.1.0
Requires-Dist: fastapi>=0.109.0
Requires-Dist: httpx>=0.26.0
Requires-Dist: pydantic-settings>=2.1.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: structlog>=24.1.0
Requires-Dist: tiktoken>=0.7.0
Requires-Dist: uvicorn[standard]>=0.27.0
Provides-Extra: dev
Requires-Dist: mypy>=1.8.0; extra == 'dev'
Requires-Dist: pre-commit>=3.6.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# Robot Resources

> Intelligent LLM cost optimization via local proxy.

Robot Resources automatically routes each LLM request to the cheapest model capable of handling it, targeting **60-90% cost savings** without a loss in quality.

## Quick Start

```bash
# 1. Install
pip install robot-resources-router

# 2. Set API keys
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."

# 3. Start proxy
rr-router start
# Proxy running on http://localhost:3838
```

That's it. Point your agent to `http://localhost:3838` and use `model: "auto"`.

## Why Robot Resources?

| Without RR | With RR |
|------------|---------|
| Every message uses the same expensive model | Each message routed to the optimal model |
| "hello" costs the same as "refactor codebase" | Simple tasks use cheap/free models |
| Manual model selection | Automatic task detection |
| No cost visibility | Full routing transparency |

### Example Savings

```
Turn 1: "hello"                    → gemini-1.5-flash-8b          $0.0000
Turn 2: "what's 2+2?"              → gemini-1.5-flash-8b          $0.0000
Turn 3: "refactor this React code" → gpt-4o-mini                  $0.0002
Turn 4: "thanks, looks good"       → gemini-1.5-flash-8b          $0.0000
─────────────────────────────────────────────────────────────────────────
Total with RR:     $0.0002
Without RR (gpt-4o): $0.0075
Savings:           97%
```
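The savings figure above follows from simple arithmetic on the per-turn costs. The sketch below reproduces it; the function name `savings_percent` is illustrative and is not part of the package's documented API.

```python
def savings_percent(routed_cost: float, baseline_cost: float) -> float:
    """Return the percentage saved relative to the baseline cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (1 - routed_cost / baseline_cost) * 100


# Per-turn costs from the example above, in USD.
turn_costs = [0.0000, 0.0000, 0.0002, 0.0000]
total_with_rr = sum(turn_costs)
baseline = 0.0075  # same conversation sent entirely to gpt-4o

print(f"Savings: {savings_percent(total_with_rr, baseline):.0f}%")
```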

## How It Works

```
Your Agent
    │
    │  POST /v1/chat/completions
    │  model: "auto"
    ▼
┌─────────────────────────────────────┐
│   Robot Resources (localhost:3838)  │
│                                     │
│   1. Detect task type               │
│      → coding, reasoning, analysis, │
│        simple_qa, creative, general │
│                                     │
│   2. Filter capable models          │
│      → capability >= 0.70 threshold │
│                                     │
│   3. Select cheapest                │
│      → lowest cost_per_1k_input     │
│                                     │
│   4. Forward to provider            │
│      → Anthropic, OpenAI, Google    │
└─────────────────────────────────────┘
    │
    ▼
Real LLM Provider (using your API keys)
```
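Steps 2 and 3 of the diagram (filter by capability, then pick the cheapest) can be sketched in a few lines. This is a minimal illustration under assumptions, not the package's actual selector: the `ModelEntry` class, the `select_model` function, and the model names, prices, and scores are all hypothetical. Only the 0.70 threshold and the `cost_per_1k_input` field name come from the diagram.

```python
from dataclasses import dataclass

CAPABILITY_THRESHOLD = 0.70  # threshold shown in the diagram above


@dataclass(frozen=True)
class ModelEntry:
    name: str
    cost_per_1k_input: float        # USD per 1k input tokens
    capabilities: dict[str, float]  # task type -> capability score (0..1)


# Hypothetical entries for illustration only; real values would come
# from the model capabilities database.
MODELS = [
    ModelEntry("model-large", 0.0025, {"coding": 0.95, "simple_qa": 0.98}),
    ModelEntry("model-small", 0.0002, {"coding": 0.75, "simple_qa": 0.90}),
    ModelEntry("model-tiny", 0.00005, {"coding": 0.40, "simple_qa": 0.80}),
]


def select_model(task_type: str, models: list[ModelEntry] = MODELS) -> ModelEntry:
    """Keep models that clear the threshold for the task, then pick the cheapest."""
    capable = [
        m for m in models
        if m.capabilities.get(task_type, 0.0) >= CAPABILITY_THRESHOLD
    ]
    if not capable:
        raise LookupError(f"no model meets the threshold for {task_type!r}")
    return min(capable, key=lambda m: m.cost_per_1k_input)


print(select_model("coding").name)
print(select_model("simple_qa").name)
```

With these sample entries, a coding task skips the cheapest model because its coding score is below the threshold, while a simple question can use it.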

## Installation

### From PyPI

```bash
pip install robot-resources-router
```

### From Source

```bash
git clone https://github.com/robot-resources/robot-resources.git
cd robot-resources
pip install -e .
```

### Requirements

- Python 3.10+
- API keys for at least one provider (Anthropic, OpenAI, or Google)

## Configuration

### Environment Variables

```bash
# Required: At least one provider
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="..."

# Optional: Server settings
export RR_PORT=3838              # Default: 3838
export RR_HOST=127.0.0.1         # Default: 127.0.0.1
```
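To make the "at least one provider" rule and the defaults concrete, here is a stdlib-only sketch of how these variables could be read. It is an assumption about behavior, not the package's actual settings loader; `load_settings` and `PROVIDER_KEYS` are hypothetical names.

```python
import os
from collections.abc import Mapping

# Environment variable names as documented above.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect host, port, and the providers that have an API key set."""
    providers = [name for name, var in PROVIDER_KEYS.items() if env.get(var)]
    if not providers:
        raise RuntimeError("set at least one provider API key")
    return {
        "host": env.get("RR_HOST", "127.0.0.1"),
        "port": int(env.get("RR_PORT", "3838")),
        "providers": providers,
    }


# Example with an explicit mapping instead of the real environment.
print(load_settings({"OPENAI_API_KEY": "sk-example", "RR_PORT": "8080"}))
```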

### OpenClaw Integration

Add to your OpenClaw config:

```json
{
  "models": {
    "providers": {
      "robot-resources": {
        "baseUrl": "http://localhost:3838",
        "api": "openai-completions"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "robot-resources/auto"
      }
    }
  }
}
```

### Claude Desktop / Other Agents

Point your agent's API base URL to `http://localhost:3838` and use model `auto`.

## Usage

### Automatic Routing (Recommended)

Use `model: "auto"` to let RR choose the optimal model:

```bash
curl http://localhost:3838/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

### Explicit Model

Bypass routing by specifying a model directly:

```bash
curl http://localhost:3838/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
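The same two requests can be made from Python. The sketch below uses only the standard library and assumes the proxy is running on the default address; `build_chat_request` and `chat` are illustrative helper names, not part of the package.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    model: str = "auto",
    base_url: str = "http://localhost:3838",
) -> urllib.request.Request:
    """Build a POST request equivalent to the curl examples above."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, model: str = "auto") -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_chat_request(prompt, model)) as resp:
        return json.load(resp)
```

Calling `chat("Hello!")` lets RR choose the model, while `chat("Hello!", model="gpt-4o")` bypasses routing. Both calls require the proxy to be running.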

## API Reference

### Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/v1/chat/completions` | POST | Chat completions (main endpoint) |
| `/v1/models` | GET | List available models |
| `/health` | GET | Health check |

### Request Format

Standard OpenAI chat completions format:

```json
{
  "model": "auto",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
```

### Response Format

Standard OpenAI format plus `routing_info`:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gemini-2.0-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  },
  "routing_info": {
    "selected_model": "gemini-2.0-flash",
    "original_model": "auto",
    "provider": "google",
    "task_type": "simple_qa",
    "capability_score": 0.92,
    "savings_percent": 96.0,
    "baseline_model": "gpt-4o",
    "reasoning": "Selected gemini-2.0-flash as cheapest capable model..."
  }
}
```
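A client can read the `routing_info` block to log which model handled each request. The helper below is a small sketch with a hypothetical name (`summarize_routing`). It assumes the field names shown in the response above, and it tolerates a missing `routing_info`, since this README does not say whether the block is present on every response.

```python
def summarize_routing(response: dict) -> str:
    """Return a one-line summary of the routing decision in a response."""
    info = response.get("routing_info")
    if info is None:
        # Assumption: some responses may not carry routing_info.
        return f"{response.get('model', 'unknown')} (no routing_info)"
    return (
        f"{info['original_model']} -> {info['selected_model']} "
        f"via {info['provider']} [{info['task_type']}], "
        f"saved {info['savings_percent']:.0f}% vs {info['baseline_model']}"
    )


# Trimmed copy of the example response above.
example = {
    "model": "gemini-2.0-flash",
    "routing_info": {
        "selected_model": "gemini-2.0-flash",
        "original_model": "auto",
        "provider": "google",
        "task_type": "simple_qa",
        "savings_percent": 96.0,
        "baseline_model": "gpt-4o",
    },
}
print(summarize_routing(example))
```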

## Task Types

RR automatically detects 6 task types:

| Task Type | Detection Keywords | Typical Models |
|-----------|-------------------|----------------|
| `coding` | function, code, debug, python, api | claude-sonnet-4, gpt-4o-mini |
| `reasoning` | explain why, prove, step by step | o3-mini, o1-mini |
| `analysis` | compare, pros and cons, evaluate | gpt-4o-mini, gemini-1.5-pro |
| `simple_qa` | what is, who invented, capital of | gemini-2.0-flash, claude-3-haiku |
| `creative` | write a story, compose, brainstorm | claude-sonnet-4, gpt-4o |
| `general` | (fallback) | cheapest available |
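
The keyword table above suggests a simple matching scheme. The sketch below shows one way it could work, under stated assumptions: the real detector may weight, order, or extend keywords differently, and `detect_task_type` is a hypothetical name. Matching uses word boundaries so that, for example, `api` does not match inside "capital".

```python
import re

# Keywords copied from the table above; "general" is the fallback.
TASK_KEYWORDS = {
    "coding": ["function", "code", "debug", "python", "api"],
    "reasoning": ["explain why", "prove", "step by step"],
    "analysis": ["compare", "pros and cons", "evaluate"],
    "simple_qa": ["what is", "who invented", "capital of"],
    "creative": ["write a story", "compose", "brainstorm"],
}


def detect_task_type(message: str) -> str:
    """Return the task type with the most keyword hits, or 'general'."""
    text = message.lower()
    best_type, best_hits = "general", 0
    for task_type, keywords in TASK_KEYWORDS.items():
        hits = sum(
            1 for kw in keywords
            if re.search(rf"\b{re.escape(kw)}\b", text)
        )
        # Strict '>' means the earlier task type wins a tie.
        if hits > best_hits:
            best_type, best_hits = task_type, hits
    return best_type


print(detect_task_type("Debug this Python function"))
print(detect_task_type("thanks, looks good"))
```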

## Supported Models

14 models across 3 providers:

| Provider | Models |
|----------|--------|
| **OpenAI** | gpt-4o, gpt-4o-mini, o1, o1-mini, o3-mini |
| **Anthropic** | claude-opus-4, claude-sonnet-4, claude-3-5-sonnet, claude-3-5-haiku, claude-3-haiku |
| **Google** | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b |

## CLI Commands

```bash
# Start the proxy server
rr-router start

# Start on custom port
rr-router start --port 8080

# Check version
rr-router --version

# Get help
rr-router --help
```

## Development

### Setup

```bash
git clone https://github.com/robot-resources/robot-resources.git
cd robot-resources
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
```

### Run Tests

```bash
pytest                          # All tests
pytest --cov=robot_resources    # With coverage
pytest -v                       # Verbose
```

### Project Structure

```
src/robot_resources/
├── cli/                    # CLI entry point
├── proxy/
│   ├── server.py          # FastAPI app
│   ├── models.py          # Pydantic models
│   ├── handlers/          # API endpoints
│   └── providers/         # LLM provider clients
├── routing/
│   ├── task_detection.py  # Task type classification
│   ├── selector.py        # Model selection logic
│   ├── router.py          # Routing pipeline
│   └── models_db.json     # Model capabilities database
└── mcp/                   # (Future) MCP server for stats
```

## Troubleshooting

### Port already in use

```bash
# Check what's using port 3838
lsof -i :3838

# Use a different port
rr-router start --port 3839
```

### API key not found

```bash
# Verify keys are set
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# Set them
export ANTHROPIC_API_KEY="sk-ant-..."
```

### Model not found

An explicit model name must match one of the models returned by `GET /v1/models`. Alternatively, use `model: "auto"` and let RR choose the model.

## Roadmap

- [x] **Phase 1:** Local proxy with task detection routing
- [ ] **Phase 2:** Outcome-based routing (learning from success/failure)
- [ ] **Phase 3:** MCP server for stats and configuration

## License

MIT

## Contributing

Contributions welcome! Please read the contributing guidelines first.
