Metadata-Version: 2.4
Name: traintrack-ai
Version: 0.1.5
Summary: TrainTrack Client: PyTorch training-time evaluation
Author: TrainTrack Team
License: MIT
Project-URL: Homepage, https://github.com/traintrack/traintrack
Keywords: pytorch,llm,evaluation,training,ml
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Requires-Dist: requests>=2.28.0
Requires-Dist: pydantic>=2.0.0

# TrainTrack

**Training-time evaluation and win-rate tracking for LLMs.**

TrainTrack helps you monitor model behavior during training by running automated LLM-as-a-Judge evaluations on every checkpoint.

## Features

- 🚀 **Real-time Metrics**: Get immediate feedback on conciseness, helpfulness, and reasoning quality.
- 📊 **Win-rate Tracking**: Automatically track win rates against an anchor checkpoint (the baseline) or the previous step.
- 📚 **Built-in Benchmarks**: Integrated support for GPQA, MMLU-Pro, IFEval, and TruthfulQA.
- 🛠️ **Seamless Integration**: Works with standard PyTorch loops and the Hugging Face Trainer.
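
To make the win-rate idea concrete, here is a minimal sketch of how a win rate against an anchor could be computed from pairwise judge verdicts. It is not TrainTrack's implementation: the `win_rate` function, the `"win"`/`"loss"`/`"tie"` labels, and the convention of counting a tie as half a win are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical helper, not part of the TrainTrack API.
def win_rate(verdicts, tie_weight=0.5):
    """Fraction of comparisons the current checkpoint wins against the anchor.

    verdicts: iterable of "win", "loss", or "tie", from the current
    checkpoint's point of view. Ties count as `tie_weight` of a win
    (assumed convention).
    """
    counts = Counter(verdicts)
    unknown = set(counts) - {"win", "loss", "tie"}
    if unknown:
        raise ValueError(f"unexpected verdict labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one verdict is required")
    return (counts["win"] + tie_weight * counts["tie"]) / total

# Example: two wins, one loss, one tie over four prompts.
verdicts = ["win", "loss", "win", "tie"]
print(f"win rate vs. anchor: {win_rate(verdicts):.3f}")  # win rate vs. anchor: 0.625
```

A value above 0.5 means the judge preferred the current checkpoint more often than the anchor on the evaluated prompts.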

## Quick Installation

```bash
pip install traintrack-ai
```

## Minimal Example

```python
from traintrack import TrainTrackHook

# 1. Initialize the hook.
# `model`, `tokenizer`, and `train_dataloader` are assumed to be defined already.
hook = TrainTrackHook(
    model=model,
    tokenizer=tokenizer,
    run_name="my-first-run",
    datasets=["reasoning", "helpfulness"]
)

# 2. Capture a baseline (optional)
hook.capture_anchor()

# 3. Add to your training loop
for step, batch in enumerate(train_dataloader):
    # ... training logic ...

    hook.step(step)
```
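
The loop above calls `hook.step(step)` on every iteration. This README does not say whether `step` throttles evaluation internally, so if judge calls turn out to be expensive, one option is to gate the call yourself. The sketch below assumes only that the hook exposes `step(step)`. `RecordingHook` is a hypothetical stand-in so the example runs without a model, and `maybe_evaluate` and `every_n_steps` are illustrative names, not part of the TrainTrack API.

```python
class RecordingHook:
    """Hypothetical stand-in for a training hook; records the steps it receives."""

    def __init__(self):
        self.steps = []

    def step(self, step):
        self.steps.append(step)


def maybe_evaluate(hook, step, every_n_steps=100):
    """Call hook.step only when step is a multiple of every_n_steps.

    Returns True if the hook was called, False otherwise.
    """
    if every_n_steps <= 0:
        raise ValueError("every_n_steps must be a positive integer")
    if step % every_n_steps == 0:
        hook.step(step)
        return True
    return False


hook = RecordingHook()
for step in range(250):
    # ... training logic ...
    maybe_evaluate(hook, step, every_n_steps=100)

print(hook.steps)  # [0, 100, 200]
```

If the real hook already rate-limits its own evaluations, this wrapper is unnecessary.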

## Documentation

For full documentation and advanced configuration (custom metrics, rubrics, and category-based evaluation), visit:
[github.com/traintrack/traintrack](https://github.com/traintrack/traintrack)
