Metadata-Version: 2.3
Name: hhack
Version: 0.1.0
Summary: Benchmark how content formatting affects LLM response rates
Author: Abdul Hadi
Author-email: Abdul Hadi <abdulhadih48@gmail.com>
Requires-Dist: matplotlib>=3.10.5
Requires-Dist: requests>=2.32.5
Requires-Dist: torch>=1.9.0 ; extra == 'full-local'
Requires-Dist: transformers>=4.20.0 ; extra == 'full-local'
Requires-Dist: accelerate>=0.12.0 ; extra == 'full-local'
Requires-Dist: ollama>=0.1.0 ; extra == 'full-local'
Requires-Dist: kernels>=0.1.0 ; extra == 'full-local'
Requires-Dist: torch>=1.9.0 ; extra == 'local'
Requires-Dist: transformers>=4.20.0 ; extra == 'local'
Requires-Dist: kernels>=0.1.0 ; extra == 'local'
Requires-Dist: ollama>=0.1.0 ; extra == 'ollama'
Requires-Python: >=3.13
Provides-Extra: full-local
Provides-Extra: local
Provides-Extra: ollama
Description-Content-Type: text/markdown

# Harmony LLM Benchmark

A flexible Python package for benchmarking LLMs with support for both API-based and local models. Test how different content formatting affects LLM response rates.

## Features

- **Multiple Model Modes**: API, Local Transformers, and Ollama support
- **Conditional Dependencies**: Install only what you need
- **Concurrent Processing**: Efficient parallel evaluation
- **Harmony Formatting**: Test content formatting effects
- **Rich Reporting**: Automatic plots and JSON results
- **CLI & Python API**: Use programmatically or from command line

## Installation

### API Mode (Lightweight - ~50MB)
```bash
uv add hhack
```

### With Local Transformers Support (~2GB+)
```bash
uv add 'hhack[local]'
```

### With Ollama Support
```bash
uv add 'hhack[ollama]'
```

### Full Local Support (All Options)
```bash
uv add 'hhack[full-local]'
```

## Quick Start

### Command Line Usage

```bash
# API mode with OpenRouter
hhack dataset.json --mode api --api-key sk-xxx

# Local Transformers
hhack dataset.json --mode local_transformers --local-model microsoft/DialoGPT-medium

# Local Ollama
hhack dataset.json --mode local_ollama --ollama-model llama2
```

### Python API Usage

```python
from harmony_benchmark import HarmonyBenchmark, BenchmarkConfig, ModelMode

# Sample dataset
dataset = [
    {
        "original": "What is machine learning?",
        "transformed_analysis": "Machine learning is a subset of AI..."
    }
]

# Configure for API mode
config = BenchmarkConfig(
    mode=ModelMode.API,
    api_key="your-open-router-token",
    api_model="openai/gpt-oss-20b",
    api_base_url="https://openrouter.ai/api/v1/chat/completions",
    max_workers=5,
    show_plots=True
)

# Run benchmark
benchmark = HarmonyBenchmark(config)
benchmark.load_dataset(dataset)
results = benchmark.run_benchmark()
```

## Dataset Format

Your dataset should be a JSON array of objects, each with an `original` and a `transformed_analysis` field:

```json
[
  {
    "original": "Original question or prompt",
    "transformed_analysis": "Processed/analyzed version of the content"
  }
]
```
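
A malformed entry is easier to catch before a benchmark run than during one. The sketch below checks a dataset against the format above. `validate_dataset` and `load_dataset_file` are hypothetical helpers written for this example, not part of the package, and the requirement that both fields be strings is an assumption based on the sample shown.

```python
import json

# Field names taken from the dataset format documented above.
REQUIRED_KEYS = ("original", "transformed_analysis")


def validate_dataset(records):
    """Return the records unchanged if every entry matches the documented format."""
    if not isinstance(records, list):
        raise ValueError("dataset must be a JSON array")
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"entry {index} is not an object")
        for key in REQUIRED_KEYS:
            # Assumption: both fields hold plain strings, as in the sample.
            if not isinstance(record.get(key), str):
                raise ValueError(f"entry {index} needs a string value for '{key}'")
    return records


def load_dataset_file(path):
    """Read a JSON file and validate it before handing it to the benchmark."""
    with open(path, encoding="utf-8") as handle:
        return validate_dataset(json.load(handle))
```

The validated list can then be passed to `benchmark.load_dataset(...)` as in the Quick Start.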

## Configuration Options

### API Mode
```python
config = BenchmarkConfig(
    mode=ModelMode.API,
    api_key="your-open-router-token",
    api_model="openai/gpt-oss-20b",
    api_base_url="https://openrouter.ai/api/v1/chat/completions",
    max_workers=5,
    show_plots=True
)
```

### Local Transformers Mode
```python
config = BenchmarkConfig(
    mode=ModelMode.LOCAL_TRANSFORMERS,
    local_model_name="openai/gpt-oss-20b",
    device="auto",  # auto, cpu, cuda
    max_length=512
)
```

### Local Ollama Mode
```python
config = BenchmarkConfig(
    mode=ModelMode.LOCAL_OLLAMA,
    ollama_model="llama2",
    ollama_host="localhost",
    ollama_port=11434
)
```

## Environment Variables

- `OPENROUTER_API_KEY`: OpenRouter API key
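
This README does not say how the package reads the variable, so one option is to read it yourself and pass the value to `BenchmarkConfig(api_key=...)`. A minimal sketch, assuming an explicitly passed key should take precedence over the environment. `resolve_api_key` is a hypothetical helper, not part of the package.

```python
import os


def resolve_api_key(explicit_key=None, env=os.environ):
    """Prefer an explicitly passed key, else fall back to OPENROUTER_API_KEY."""
    # Assumption: an explicit key wins over the environment variable.
    key = explicit_key or env.get("OPENROUTER_API_KEY")
    if not key:
        raise RuntimeError(
            "no API key: pass one explicitly or set OPENROUTER_API_KEY"
        )
    return key
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment.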

## CLI Reference

```bash
hhack [OPTIONS] DATASET_FILE

Options:
  --mode {api,local_transformers,local_ollama}  Execution mode
  --api-key TEXT                   API key
  --api-model TEXT                 API model name
  --local-model TEXT               HuggingFace model name
  --ollama-model TEXT              Ollama model name
  --max-workers INT                Number of parallel workers
  --no-plot                        Disable result visualization
  --no-save                        Disable saving results
  --results-file TEXT              Custom results filename
```

## What It Tests

The benchmark evaluates LLM response rates across four content types:

1. **Original Content**: Raw input as-is
2. **Original + Harmony**: Raw input with Harmony formatting
3. **Transformed Content**: Processed/analyzed input
4. **Transformed + Harmony**: Processed input with Harmony formatting

Results show how content formatting affects model responsiveness.
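
To make the comparison concrete, the sketch below computes one response rate per content type. It assumes a response rate is the fraction of prompts that received a response, which this README does not define precisely. The condition names and the `response_rates` helper are hypothetical, and the sample outcomes are made up for illustration, not real benchmark results.

```python
# Hypothetical keys for the four content types listed above.
CONDITIONS = (
    "original",
    "original_harmony",
    "transformed",
    "transformed_harmony",
)


def response_rates(outcomes):
    """Map each condition to the fraction of prompts that received a response.

    `outcomes` maps a condition name to a list of booleans, one per prompt,
    where True means the model produced a response.
    """
    rates = {}
    for condition in CONDITIONS:
        flags = outcomes.get(condition, [])
        # An empty or missing condition is reported as 0.0 rather than failing.
        rates[condition] = sum(flags) / len(flags) if flags else 0.0
    return rates


# Illustrative data only: four prompts per condition.
sample = {
    "original": [True, False, False, False],
    "original_harmony": [True, True, False, False],
    "transformed": [True, True, True, False],
    "transformed_harmony": [True, True, True, True],
}
rates = response_rates(sample)
```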

## Results Output

The benchmark generates:
- **Console Summary**: Response rates and statistics
- **Visual Plot**: Bar chart comparing response rates
- **JSON Results**: Detailed results saved to file

## Performance Considerations

### API Mode
- Lightweight installation
- Network dependent
- Rate limited by API
- Cost per request

### Local Transformers
- Large installation (~2GB+)
- GPU recommended
- No network required
- One-time download cost

### Local Ollama
- Medium installation
- Requires Ollama server
- No network required for inference
- Easy model management

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request

## License

MIT License - see LICENSE file for details.

## Support

- Documentation: [GitHub README](https://github.com/APerson101/hhack)
- Issues: [GitHub Issues](https://github.com/APerson101/hhack/issues)
- Discussions: [GitHub Discussions](https://github.com/APerson101/hhack/discussions)