Metadata-Version: 2.4
Name: htc-env
Version: 0.1.2
Summary: A Python framework for building RL environments with objective and subjective evaluation
Requires-Python: >=3.9
Requires-Dist: datasets>=2.0.0
Requires-Dist: openai>=1.0.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: playwright>=1.40.0
Requires-Dist: python-dotenv>=1.0.0
Description-Content-Type: text/markdown

# Hyperbolic

A Python framework for building RL environments that generate training-grade trajectories across any domain -- browser tasks, text tasks, coding tasks, and beyond.

## Installation

```bash
pip install htc-env
```

For browser environments, run the one-time setup after install:

```bash
htc setup
```

## Quick Start

```python
import hyperbolic as htc

dataset = htc.Dataset.from_list([
    {"question": "What is 2+2?", "answer": "4"},
    {"question": "What is 3*5?", "answer": "15"},
])

def correct_answer(completion, answer):
    response = completion[-1]["content"]
    return 1.0 if answer in response else 0.0

rubric = htc.Rubric(funcs=[correct_answer])
env = htc.SingleTurnEnv(dataset=dataset, rubric=rubric, system_prompt="Answer concisely.")
agent = htc.OpenAIAgent(model="gpt-4o-mini")
trajectories = env.run(agent)

for t in trajectories:
    print(f"Task: {t.task['question']}, Reward: {t.scalar_reward:.1f}")
```

## Browser Environment

```python
import hyperbolic as htc

dataset = htc.Dataset.from_list([{
    "task": "Find the headphones product and add it to the cart.",
    "start_url": "http://localhost:5001",
    "reset_url": "http://localhost:5001/api/reset",
}])

rubric = htc.Rubric()
rubric.add(htc.URLMatch(pattern=r"/cart"), weight=0.5, dimension="reached_cart")

env = htc.BrowserEnv(
    dataset=dataset,
    rubric=rubric,
    max_turns=10,
    browser_config=htc.BrowserConfig(headless=False),
)

agent = htc.OpenAIAgent(model="gpt-4o", tool_choice="required")
trajectories = env.run(agent)
```

## CLI

```bash
htc run examples/simple_math.py -m gpt-4o-mini
htc run examples/browser_task.py -m gpt-4o
htc --version
```

## Environment Types

- **TextEnv** -- single-turn or multi-turn text interactions with optional tool calling
- **BrowserEnv** -- browser automation via Playwright with screenshot observations

## Key Features

- Multi-dimensional evaluation (accuracy, tone, helpfulness as separate scoring axes)
- Judge panels with disagreement tracking
- Pluggable model interface (OpenAI, Anthropic, custom)
- Deterministic browser environment resets
- Standardized trajectory output format

## API Keys

Create a `.env` file in your project root:

```
OPENAI_API_KEY=sk-...
```

Keys are resolved in order: explicit parameter > `.env` file > system env var.
