Metadata-Version: 2.4
Name: epochler
Version: 0.1.0
Summary: Autonomous ML research agent
License-Expression: LicenseRef-BSL-1.1
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.115
Requires-Dist: uvicorn[standard]>=0.32
Requires-Dist: websockets>=14.0
Requires-Dist: pydantic>=2.0
Requires-Dist: pydantic-settings>=2.6
Requires-Dist: python-dotenv>=1.0
Requires-Dist: python-jose[cryptography]>=3.3
Requires-Dist: passlib[bcrypt]>=1.7
Requires-Dist: anthropic>=0.40
Requires-Dist: google-genai>=1.0
Requires-Dist: aiofiles>=24.0
Requires-Dist: httpx>=0.27
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# Epochler

[![License: BSL 1.1](https://img.shields.io/badge/License-BSL_1.1-blue.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/Python-3.11+-3776AB.svg)](https://python.org)
[![Node 20+](https://img.shields.io/badge/Node-20+-339933.svg)](https://nodejs.org)
![Status: Beta](https://img.shields.io/badge/Status-Beta-orange)

Epochler is an autonomous ML research agent. Define your ML task through conversation, approve decisions progressively, then let the agent iterate on its own until it hits your target metric.

> **Early beta (v0.1.0).** The core workflow is functional, but expect rough edges, breaking changes between releases, and incomplete documentation.

## Why Epochler?

Training ML models involves a repetitive cycle: choose a dataset, pick metrics, design preprocessing, select a model family, write training code, run it, analyze results, tweak, repeat. Epochler automates this loop. You describe what you want in plain language, approve structured decisions through a UI, and the agent handles code generation, execution, and iteration, while you watch in real time.

## Quick Start

### Option A: Docker (recommended)

```bash
git clone https://github.com/elilat/epochler.git
cd epochler
cp backend/.env.example .env
# Edit .env with your API keys and a secure password/secret
docker compose up --build
```

Open http://localhost:8000 and sign in with your configured password.

### Option B: pip install from source

```bash
git clone https://github.com/elilat/epochler.git
cd epochler

# Backend
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install .
cp .env.example .env
# Edit .env with your API keys and a secure password/secret

# Frontend
cd ../frontend
npm install
npm run build

# Run
cd ..
epochler
```

Open http://localhost:8000.

### Option C: Development mode

```bash
# Backend (run from the root of a cloned repo, as in Option B)
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env
# Edit .env with your API keys and a secure password/secret

# Frontend
cd ../frontend
npm install

# Start both (from repo root)
cd ..
./dev.sh
```

Open http://localhost:5173 (frontend dev server with hot reload).

## How It Works

Epochler splits every ML project into two distinct phases:

```mermaid
flowchart TD
    subgraph locking ["Phase 1: Progressive Locking (you approve)"]
        Chat[Chat with agent] --> D1[Dataset decision]
        D1 --> D2[Preprocessing decision]
        D2 --> D3[Evaluation decision]
        D3 --> D4[Model decision]
        D4 --> D5[Target threshold]
        D5 --> Freeze[Contract freezes]
    end

    subgraph execution ["Phase 2: Autonomous Execution (agent iterates)"]
        Freeze --> Preprocess[Preprocessing]
        Preprocess --> Feature[Featurization]
        Feature --> Train[Training]
        Train --> Eval[Evaluation]
        Eval --> Analyze[Analysis]
        Analyze -->|"target not met"| Train
        Analyze -->|"target met or budget hit"| PostRun[Post-run recommendation]
    end

    PostRun --> Review[You review results]
    Review -->|continue| Train
    Review -->|finalize| Done[Done]
```

**Phase 1 - Progressive Locking.** You describe your ML task in natural language. The agent proposes structured decisions (dataset, preprocessing, evaluation metrics, model family, target threshold). You approve or reject each one through decision cards in the UI. Nothing executes until every decision is locked and the contract freezes.
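
As a rough illustration of the locking idea described above, the sketch below models a contract whose decisions are proposed, then approved or rejected, and which only counts as frozen once all five are locked. The decision names come from this README; the `Contract` class, its methods, and the string values are hypothetical and are not Epochler's actual API.

```python
from dataclasses import dataclass, field

# Decision names follow the README; everything else here is hypothetical.
DECISIONS = ("dataset", "preprocessing", "evaluation", "model", "target")


@dataclass
class Contract:
    """Toy contract: each decision is proposed, then locked on approval."""

    proposed: dict[str, str] = field(default_factory=dict)
    locked: dict[str, str] = field(default_factory=dict)

    def propose(self, name: str, value: str) -> None:
        if name not in DECISIONS:
            raise KeyError(f"unknown decision: {name}")
        if name in self.locked:
            raise ValueError(f"{name} is already locked")
        self.proposed[name] = value

    def approve(self, name: str) -> None:
        # Approval moves the pending proposal into the locked set.
        self.locked[name] = self.proposed.pop(name)

    def reject(self, name: str) -> None:
        # Rejection discards the proposal so a new one can be made.
        self.proposed.pop(name)

    @property
    def frozen(self) -> bool:
        # The contract freezes only once every decision is locked.
        return all(name in self.locked for name in DECISIONS)


contract = Contract()
contract.propose("dataset", "example-dataset")
contract.approve("dataset")
print(contract.frozen)  # one of five decisions locked, so not frozen yet
```

Keeping proposals and locked values in separate maps makes it hard to execute against a decision that was suggested but never approved.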

**Phase 2 - Autonomous Execution.** Once the contract is frozen, the agent runs the full experiment loop: preprocessing, featurization, training, evaluation, and analysis. It iterates autonomously, writing new training code each round, until it meets your target metric or exhausts its budget. Preprocessing and featurization outputs are cached and reused across iterations. Every experiment is logged with its hypothesis, code, results, and verdict.
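
The control flow of this phase can be sketched in a few lines. This is a simplified model, not Epochler's runner: it assumes a higher metric is better and that the budget is an iteration count, and `run_until_target`, `prepare`, `train_and_eval`, and the history fields are all hypothetical names.

```python
from typing import Any, Callable


def run_until_target(
    prepare: Callable[[], Any],
    train_and_eval: Callable[[Any, int], float],
    target: float,
    max_iterations: int,
) -> tuple[list[dict], str]:
    """Iterate until the metric reaches the target or the budget runs out."""
    # Preprocessing and featurization run once; the result is reused each round.
    features = prepare()
    history: list[dict] = []
    for iteration in range(1, max_iterations + 1):
        metric = train_and_eval(features, iteration)
        target_met = metric >= target
        # Every round is logged, whether or not it met the target.
        history.append(
            {"iteration": iteration, "metric": metric, "target_met": target_met}
        )
        if target_met:
            return history, "target met"
    return history, "budget hit"
```

Both exit conditions return the full history, which mirrors the diagram: either outcome leads to the same post-run review step.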

You can watch live activity, training output, and experiment history in the dashboard at any point.

## Architecture

- **Frontend**: React 19 + Vite 7 + TailwindCSS 4 + shadcn/ui
- **Backend**: Python FastAPI with WebSocket
- **Agents**: Claude (planning + coding) via Anthropic API, Gemini (web search) via Google AI API
- **Storage**: File-based JSON (conversations, contracts, experiments, activity logs)

## Tabs

- **Chat** - Conversation with the planner agent
- **Experiments** - Table of all experiment runs with expandable reasoning (hypothesis, verdict, lessons, diffs)
- **Contract** - Current state of the experiment contract (dataset, eval, model, preprocessing)
- **Activity** - Real-time feed of what the agent is doing
- **Console** - Live training stdout/stderr output

## Project Structure

```
epochler/
  frontend/          # React + Vite + TailwindCSS + shadcn/ui
  backend/           # Python FastAPI
    epochler/
      agents/        # Planner, coder, search agents
      orchestrator/  # State machine, message routing
      contract/      # Pydantic models, progressive locking
      runner/        # Experiment loop, hardware detection, subprocess execution
      storage/       # File-based JSON persistence
    data/            # Runtime data (gitignored)
  docs/              # Design documents
  scripts/           # Build and release scripts
  dev.sh             # Development startup script
```

## Configuration

Configuration is set through environment variables or a `.env` file. For a source install the file lives in the backend directory; the Docker quick start copies it to the repository root. See [backend/.env.example](backend/.env.example) for all available options.

Key settings:

| Variable | Description |
|----------|-------------|
| `ANTHROPIC_API_KEY` | Anthropic API key (required) |
| `GEMINI_API_KEY` | Google AI API key (required) |
| `EPOCHLER_PASSWORD` | Login password (required, must change from default) |
| `EPOCHLER_SECRET_KEY` | JWT signing secret (required, must change from default) |
| `CORS_ORIGINS` | Allowed CORS origins (comma-separated) |
| `WORKSPACE_ROOT` | Root directory for experiment workspaces |
| `ANTHROPIC_MODEL` | Override the default planner/coder model |
| `LOG_LEVEL` | Log level: `DEBUG`, `INFO`, `WARNING` (default: `INFO`) |

## Known Limitations

- **Single-user**: Epochler is designed for individual use. There is no multi-user auth or role system.
- **Sandbox model**: Training scripts run in subprocess isolation with environment-variable scrubbing, but this is not OS-level sandboxing. See [SECURITY.md](SECURITY.md).
- **Preview model names**: The default LLM model names in config may reference preview versions that could be renamed or deprecated by providers. Override them via `.env` if needed.
- **File-based storage**: All data is stored as JSON files on disk. There is no database. This is simple and portable but not designed for high concurrency.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

Epochler is licensed under the [Business Source License 1.1](LICENSE). You may use it for any non-commercial purpose. The license converts to Apache 2.0 on 2028-03-11.
