Metadata-Version: 2.4
Name: hipr
Version: 0.1.5
Summary: Automatic Pydantic config generation from function signatures with hyperparameters
Author-email: "Sencer S." <nospam@gmail.com>
License: MIT
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0.0
Requires-Dist: annotated-types>=0.6.0
Requires-Dist: loguru>=0.7.0
Provides-Extra: dev
Requires-Dist: basedpyright>=1.31.7; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: numpy>=2.3.5; extra == "dev"
Requires-Dist: pandas>=2.3.3; extra == "dev"
Requires-Dist: pandas-stubs>=2.0.0; extra == "dev"
Dynamic: license-file

# hipr

**(pronounced "hyper")**

![CI](https://github.com/sencer/hipr/actions/workflows/ci.yml/badge.svg)
[![codecov](https://codecov.io/gh/sencer/hipr/branch/main/graph/badge.svg)](https://app.codecov.io/github/sencer/hipr)

**Automatic Pydantic config generation from function signatures with hyperparameters.**

`hipr` is a lightweight Python library that automatically generates type-safe Pydantic configuration classes from your function and class signatures. Just annotate your parameters with `Hyper[T]`, and get automatic validation, serialization, and a clean separation between hyperparameters and runtime arguments.

## Features

- 🚀 **Zero boilerplate** - Automatically generate config classes from signatures
- ✅ **Type-safe** - Full Pydantic validation with constraints
- 🛡️ **Robust Validation** - Detects invalid or conflicting constraints at definition time
- 🎯 **Clean separation** - Separate hyperparameters from runtime data
- 🔧 **Flexible** - Works with functions, methods, classes, and dataclasses
- 🎨 **Constraint syntax** - Inline constraints: `Hyper[int, Ge[2], Le[100]]`
- 🪆 **Nested configs** - Compose configurations hierarchically
- 📦 **Serializable** - JSON-compatible config serialization

## Installation

```bash
pip install hipr
```

Or with uv:
```bash
uv add hipr
```

After installation, the `hipr-generate-stubs` command will be available in your PATH.

## Quick Start

Suppose you have a typical ML pipeline with nested components:

```python
class Optimizer:
    def __init__(self, learning_rate: float = 0.01, momentum: float = 0.9):
        self.learning_rate = learning_rate
        self.momentum = momentum

class Model:
    def __init__(self, hidden_size: int = 128, dropout: float = 0.1,
                 optimizer: Optimizer | None = None):
        self.hidden_size = hidden_size
        self.dropout = dropout
        self.optimizer = optimizer or Optimizer()

    def train(self, data: list[float]) -> dict:
        # Training logic here...
        return {"loss": 0.5}

# Using it
model = Model(
    hidden_size=256,
    dropout=0.2,
    optimizer=Optimizer(learning_rate=0.001, momentum=0.95)
)
result = model.train(data=[1.0, 2.0, 3.0])
```

**To optimize hyperparameters and run experiments**, add the `@configurable` decorator and mark tunable parameters with `Hyper[T]`:

```python
from hipr import configurable, Hyper, Gt, Ge, Le, Lt, DEFAULT

@configurable
class Optimizer:
    def __init__(
        self,
        learning_rate: Hyper[float, Gt[0.0], Le[1.0]] = 0.01,
        momentum: Hyper[float, Ge[0.0], Le[1.0]] = 0.9,
    ):
        self.learning_rate = learning_rate
        self.momentum = momentum

@configurable
class Model:
    def __init__(
        self,
        data: list[float],  # Runtime data, not a hyperparameter
        hidden_size: Hyper[int, Ge[1]] = 128,
        dropout: Hyper[float, Ge[0.0], Lt[1.0]] = 0.1,
        optimizer_config: Hyper[Optimizer.Config] = DEFAULT,
    ):
        self.hidden_size = hidden_size
        self.dropout = dropout
        self.optimizer = optimizer_config.make()()

    def train(self) -> dict:
        # Training logic here...
        return {"loss": 0.5}

# Now you can build serializable configs and run experiments:
from pydantic import ValidationError

# Create a configuration
config = Model.Config(
    hidden_size=256,
    dropout=0.2,
    optimizer_config=Optimizer.Config(learning_rate=0.001, momentum=0.95),
)

# Serialize to JSON for experiment tracking
config_json = config.model_dump_json()
# '{"hidden_size":256,"dropout":0.2,"optimizer_config":{"learning_rate":0.001,"momentum":0.95}}'

# Load from JSON
loaded_config = Model.Config.model_validate_json(config_json)

# Build and run
model_fn = loaded_config.make()
model = model_fn(data=[1.0, 2.0, 3.0])
result = model.train()

# Validation is automatic
try:
    bad_config = Model.Config(dropout=1.5)  # Error: dropout must be < 1.0
except ValidationError:
    print("Invalid configuration!")
```

## Core Concepts

### The `@configurable` Decorator

The `@configurable` decorator works on:
- Functions
- Methods (including class methods)
- Regular classes
- Dataclasses

It generates a `.Config` class that:
- Captures all `Hyper[T]` parameters
- Provides Pydantic validation
- Has a `.make()` method that returns a configured callable/constructor
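
To make these mechanics concrete, here is a minimal standard-library sketch of the same idea: parameters carrying a marker are collected into a small config class, and `make()` binds them to the original function. This is not hipr's implementation (hipr builds Pydantic models and validates constraints); the names `HYPER`, `toy_configurable`, and `scale` are hypothetical and exist only for illustration.

```python
import functools
import inspect
from dataclasses import asdict, field, make_dataclass
from typing import Annotated, get_args, get_origin, get_type_hints

HYPER = "hyper"  # hypothetical marker standing in for hipr's Hyper[T]


def toy_configurable(fn):
    """Attach a toy .Config class built from the marked parameters."""
    hints = get_type_hints(fn, include_extras=True)
    fields = []
    for name, param in inspect.signature(fn).parameters.items():
        hint = hints.get(name)
        if get_origin(hint) is Annotated and HYPER in get_args(hint)[1:]:
            # Marked parameter: becomes a config field with its default.
            fields.append((name, get_args(hint)[0], field(default=param.default)))

    config_cls = make_dataclass(f"{fn.__name__}_Config", fields, frozen=True)
    # make() returns the function with the hyperparameters pre-bound.
    config_cls.make = lambda self: functools.partial(fn, **asdict(self))
    fn.Config = config_cls
    return fn


@toy_configurable
def scale(data: list[float], factor: Annotated[float, HYPER] = 2.0) -> list[float]:
    return [x * factor for x in data]


config = scale.Config(factor=3.0)
scaled = config.make()([1.0, 2.0])  # runtime data is supplied at call time
```

The unmarked `data` parameter never enters the config, which is the separation between hyperparameters and runtime arguments described above.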

### The `Hyper[T]` Annotation

Mark parameters as hyperparameters using `Hyper[T]`:

```python
# Simple type
period: Hyper[int] = 14

# With constraints
period: Hyper[int, Ge[2], Le[100]] = 14
alpha: Hyper[float, Ge[0.0], Le[1.0]] = 0.5
name: Hyper[str, Pattern[r"^[A-Z]"]] = "Default"
```

**Available constraints:**
- `Ge[n]` - Greater than or equal
- `Gt[n]` - Greater than
- `Le[n]` - Less than or equal
- `Lt[n]` - Less than
- `MinLen[n]` - Minimum length (strings, lists)
- `MaxLen[n]` - Maximum length (strings, lists)
- `MultipleOf[n]` - Must be a multiple of
- `Pattern[r"..."]` - Regex pattern match

> **Note:** These constraint markers wrap [`annotated-types`](https://github.com/annotated-types/annotated-types)
> to provide bracket syntax `Ge[2]` instead of parentheses `Ge(2)`, making them valid type expressions
> that work in annotations. This enables clean inline constraint syntax while maintaining compatibility
> with Pydantic's validation system.
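
As a rough illustration of how bracket syntax can map onto an ordinary constraint object, the sketch below uses `__class_getitem__`. It is an assumption about the general technique, not a copy of hipr's code; `ToyGe` and `LowerBound` are hypothetical names, with `LowerBound` standing in for a constraint such as `annotated_types.Ge`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowerBound:
    """Plain constraint object (stand-in for a real constraint type)."""

    ge: float


class ToyGe:
    """Hypothetical marker: ToyGe[2] builds LowerBound(ge=2)."""

    def __class_getitem__(cls, bound: float) -> LowerBound:
        return LowerBound(ge=bound)


constraint = ToyGe[2]     # subscript syntax, usable inside an annotation
same = LowerBound(ge=2)   # equivalent call syntax
```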

**Using Literal types for enums:**

Instead of Pattern constraints, you can use `Literal` for a fixed set of choices:

```python
from typing import Literal

@configurable
def process(
    mode: Hyper[Literal["fast", "slow", "medium"]] = "fast",
    level: Hyper[Literal[1, 2, 3]] = 1,
) -> str:
    return f"{mode} mode, level {level}"

# Pydantic validates that only these exact values are allowed
config = process.Config(mode="slow", level=2)  # ✓ Valid
# process.Config(mode="invalid")  # ✗ ValidationError
```

**Using Python Enums:**

For more structured enumerations, use Python's `Enum`:

```python
from enum import Enum

class Mode(str, Enum):
    FAST = "fast"
    SLOW = "slow"
    MEDIUM = "medium"

@configurable
def process(mode: Hyper[Mode] = Mode.FAST) -> str:
    return f"Processing in {mode.value} mode"

# Use enum directly
config = process.Config(mode=Mode.SLOW)
# Or use string value (Pydantic coerces)
config = process.Config(mode="slow")
```

## Examples

### Functions

```python
from hipr import configurable, Hyper, Ge, Le

@configurable
def exponential_smoothing(
    data: list[float],
    alpha: Hyper[float, Ge[0.0], Le[1.0]] = 0.3,
    adjust: Hyper[bool] = True,
) -> list[float]:
    """Apply exponential smoothing."""
    result = [data[0]]
    for value in data[1:]:
        smoothed = alpha * value + (1 - alpha) * result[-1]
        result.append(smoothed)
    return result

# Use it
data = [10.0, 12.0, 11.0, 13.0, 12.5]
smoothed = exponential_smoothing(data, alpha=0.5)

# Or with config
config = exponential_smoothing.Config(alpha=0.7, adjust=False)
fn = config.make()
smoothed = fn(data)
```

### Classes

```python
from hipr import configurable, Hyper, Ge, Gt, Le

# Regular class
@configurable
class Optimizer:
    def __init__(
        self,
        learning_rate: Hyper[float, Gt[0.0], Le[1.0]] = 0.01,
        momentum: Hyper[float, Ge[0.0], Le[1.0]] = 0.9,
    ):
        self.learning_rate = learning_rate
        self.momentum = momentum

    def step(self, loss: float) -> float:
        return loss * self.learning_rate

# Direct instantiation
opt = Optimizer(learning_rate=0.001)

# Using Config
config = Optimizer.Config(learning_rate=0.1, momentum=0.95)
opt = config.make()()  # .make() returns constructor, call it to instantiate
```

### Dataclasses

```python
from dataclasses import dataclass
from hipr import configurable, Hyper, Ge, Le, Lt

@configurable
@dataclass
class ModelConfig:
    hidden_size: Hyper[int, Ge[1]] = 128
    num_layers: Hyper[int, Ge[1], Le[100]] = 3
    dropout: Hyper[float, Ge[0.0], Lt[1.0]] = 0.1

# Direct usage
model = ModelConfig(hidden_size=256, num_layers=6)

# Via Config
config = ModelConfig.Config(hidden_size=512, num_layers=12)
model = config.make()()
```

### Methods

```python
from hipr import configurable, Hyper, Gt

class Analyzer:
    def __init__(self, base_threshold: float = 1.0):
        self.base_threshold = base_threshold

    @configurable
    def detect_outliers(
        self,
        data: list[float],
        threshold: Hyper[float, Gt[0.0]] = 3.0,
    ) -> list[int]:
        """Detect outliers using threshold."""
        mean = sum(data) / len(data)
        std = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5
        cutoff = threshold * std * self.base_threshold
        return [i for i, x in enumerate(data) if abs(x - mean) > cutoff]

analyzer = Analyzer()

# Direct call
outliers = analyzer.detect_outliers([1, 2, 3, 100, 4, 5], threshold=2.0)

# Using Config
config = analyzer.detect_outliers.Config(threshold=2.5)
fn = config.make()
outliers = fn(analyzer, [1, 2, 3, 100, 4, 5])
```

### Nested Configurations

You can compose configurations hierarchically using `DEFAULT`:

```python
import pandas as pd
from hipr import configurable, Hyper, Gt, DEFAULT

@configurable
def base_transform(
    data: pd.Series,
    multiplier: Hyper[float, Gt[0.0]] = 2.0,
) -> float:
    return data.sum() * multiplier

@configurable
def pipeline(
    data: pd.Series,
    transform_config: Hyper[base_transform.Config] = DEFAULT,
    offset: Hyper[float] = 10.0,
) -> float:
    """A pipeline that uses another configurable."""
    transformer = transform_config.make()
    result = transformer(data=data)
    return result + offset

# Use with defaults
data = pd.Series([1.0, 2.0, 3.0])
result = pipeline(data)

# Customize nested config
config = pipeline.Config(
    transform_config=base_transform.Config(multiplier=5.0),
    offset=20.0,
)
fn = config.make()
result = fn(data)
```

### Configuration Serialization

Configs are Pydantic models, so they serialize naturally:

```python
from hipr import configurable, Hyper, Ge, Gt

@configurable
def train_model(
    data: list[float],
    epochs: Hyper[int, Ge[1]] = 100,
    lr: Hyper[float, Gt[0.0]] = 0.001,
) -> dict:
    return {"trained": True, "epochs": epochs}

# Create config
config = train_model.Config(epochs=200, lr=0.01)

# Serialize to dict
config_dict = config.model_dump()
# {"epochs": 200, "lr": 0.01}

# Serialize to JSON
config_json = config.model_dump_json()
# '{"epochs":200,"lr":0.01}'

# Deserialize from dict
config2 = train_model.Config(**config_dict)

# Deserialize from JSON (Pydantic parses and validates in one step)
config3 = train_model.Config.model_validate_json(config_json)
```

## Type Checking

The `@configurable` decorator dynamically generates `.Config` classes at runtime. For the best type checking experience, use the included stub generator.

### Automatic Stub Generation (Recommended)

After installing `hipr`, use the included CLI tool to generate `.pyi` stub files:

```bash
# Generate stubs for your package (scans src/ by default)
hipr-generate-stubs

# Generate stubs for a specific directory
hipr-generate-stubs my_package/

# See all options
hipr-generate-stubs --help
```

**Integrate with your workflow:**

```toml
# pyproject.toml
[tool.poe.tasks]
generate-stubs = "hipr-generate-stubs src/"

# Now you can run: poe generate-stubs
```

You can also run it from pre-commit hooks, CI/CD, or development scripts.

This creates `.pyi` files with complete type information:

```python
# your_module.py
@configurable
def moving_average(
    data: list[float],
    period: Hyper[int] = 14,
) -> float:
    return sum(data[-period:]) / len(data[-period:])

# After running hipr-generate-stubs, creates: your_module.pyi
class MovingAverageConfig(MakeableModel[float]):
    period: int
    def __init__(self, *, period: int = 14) -> None: ...

class _MovingAverageConfigurable:
    Config: type[MovingAverageConfig]
    def __call__(self, data: list[float], *, period: int = 14) -> float: ...

moving_average: _MovingAverageConfigurable
```

With stubs generated, type checkers can resolve `.Config` and its fields:

```python
# ✓ No type errors, full autocomplete
config = moving_average.Config(period=20)
result = moving_average(data=[1, 2, 3], period=5)
```

### Without Stubs

If you don't use stub generation, type checkers will complain about dynamically created attributes:

```python
# Type checker warning without stubs
config = moving_average.Config(period=5)  # type: ignore[call-arg]
print(config.period)  # type: ignore[attr-defined]
```

**Recommendation:** Run `hipr-generate-stubs` as part of your development workflow for the best experience.

## Advanced Usage

### Multiple Constraint Types

Combine multiple constraints:

```python
from hipr import configurable, Hyper, Ge, Le, MultipleOf

@configurable
def process(
    data: list[float],
    window: Hyper[int, Ge[1], Le[1000], MultipleOf[2]] = 10,  # Even number between 2 and 1000
) -> float:
    return sum(data[-window:]) / window
```

### Validation Errors

Pydantic validation happens automatically:

```python
from pydantic import ValidationError

try:
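    # Assumes `period` is declared as Hyper[int, Ge[2]]; the unconstrained
    # Hyper[int] version shown earlier would accept 0.
    config = moving_average.Config(period=0)  # Fails: period must be >= 2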
except ValidationError as e:
    print(e)
```

### Safety & Validation

`hipr` includes robust checks to prevent invalid configurations before they cause runtime errors.

**Constraint Conflicts:**
Contradictory constraints are caught at definition time (or stub generation time):

```python
# Raises ValueError: lower bound (10) is greater than upper bound (5)
def bad_func(x: Hyper[int, Ge[10], Le[5]] = 10): ...

# Raises ValueError: min_length (10) is greater than max_length (5)
def bad_str(s: Hyper[str, MinLen[10], MaxLen[5]] = "default"): ...
```
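
The kind of check involved can be sketched in a few lines. The function below is a hypothetical illustration of detecting contradictory numeric bounds, not hipr's actual code; `check_bounds` and its error message are assumptions, and it treats values as real numbers (it does not catch integer-only gaps such as `Gt[1], Lt[2]`).

```python
def check_bounds(ge=None, gt=None, le=None, lt=None) -> None:
    """Raise ValueError if no value could satisfy the numeric bounds."""
    lower = max((b for b in (ge, gt) if b is not None), default=None)
    upper = min((b for b in (le, lt) if b is not None), default=None)
    if lower is None or upper is None:
        return  # one side is unbounded, so the bounds cannot conflict
    # With a strict bound on either side, equal bounds leave no valid value.
    strict = (gt is not None and gt == lower) or (lt is not None and lt == upper)
    if lower > upper or (strict and lower == upper):
        raise ValueError(f"lower bound ({lower}) conflicts with upper bound ({upper})")


check_bounds(ge=0.0, lt=1.0)  # consistent, returns silently

try:
    check_bounds(ge=10, le=5)
    conflict = None
except ValueError as exc:
    conflict = str(exc)
```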

**Invalid Patterns:**
Regex patterns are validated immediately:

```python
# Raises InvalidPatternError: bad regex pattern
def bad_regex(s: Hyper[str, Pattern[r"["]] = "default"): ...
```
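
Eager pattern validation of this sort usually amounts to compiling the regex up front. The sketch below shows that technique with the standard `re` module; `check_pattern` is a hypothetical helper, and it raises a plain `ValueError` rather than hipr's `InvalidPatternError`.

```python
import re


def check_pattern(pattern: str) -> str:
    """Return the pattern unchanged, or raise ValueError if it does not compile."""
    try:
        re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex {pattern!r}: {exc}") from exc
    return pattern


ok = check_pattern(r"^[A-Z]")  # compiles, so it is returned as-is

try:
    check_pattern(r"[")  # unterminated character set
    failed = False
except ValueError:
    failed = True
```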

**Reserved Names:**
The parameter name `model_config` is reserved by Pydantic. `hipr` will raise a `ValueError` if you try to use it as a hyperparameter name.

**Circular Dependency Prevention:**
When using nested configurations with `DEFAULT`, `hipr` automatically detects and prevents circular dependencies that would cause infinite recursion during instantiation:

```python
@configurable
class ComponentA:
    # If ComponentB also depends on ComponentA, this creates a cycle
    b_config: Hyper[ComponentB.Config] = DEFAULT

# Raises ValueError: Circular dependency detected: ComponentA -> ComponentB -> ComponentA
```

### Mixed Configurables

Mix functions, classes, and dataclasses in nested configs:

```python
from hipr import configurable, Hyper, DEFAULT

@configurable
class Scaler:
    def __init__(self, scale: Hyper[float] = 1.0):
        self.scale = scale

@configurable
def transform(
    data: list[float],
    scaler_config: Hyper[Scaler.Config] = DEFAULT,
) -> list[float]:
    scaler = scaler_config.make()()
    return [x * scaler.scale for x in data]
```

### Thread Safety

The `@configurable` decorator is thread-safe and can be used in concurrent environments:

```python
import concurrent.futures
from hipr import configurable, Hyper

@configurable
def process_data(
    data: list[float],
    multiplier: Hyper[float] = 2.0,
) -> float:
    return sum(data) * multiplier

# Create one config per multiplier (here, in the main thread)
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    configs = [
        process_data.Config(multiplier=i)
        for i in range(10)
    ]

    # Use configs concurrently
    futures = [
        executor.submit(config.make(), [1.0, 2.0, 3.0])
        for config in configs
    ]

    results = [f.result() for f in futures]
```

### Validation Utilities

`hipr` provides a utility to validate configuration dictionaries without instantiating the config class:

```python
from hipr import validate_config

# Assumes `period` is declared as Hyper[int, Ge[2]]
data = {"period": 0}  # Invalid: period must be >= 2
is_valid, errors = validate_config(moving_average.Config, data)

if not is_valid:
    print("Validation errors:", errors)
    # ['Value error, Input should be greater than or equal to 2 [type=greater_than_equal, input_value=0, input_type=int]']
```

## Performance

`hipr` is designed to be lightweight. The overhead of using `@configurable` is minimal:

- **Config creation**: <100µs (dominated by Pydantic validation)
- **make() overhead**: <50µs (closure creation)
- **Direct function call**: Zero overhead (same speed as raw function)
- **Made function call**: Minimal overhead (<1µs) compared to raw function

For detailed benchmarks and reproduction scripts, see the [benchmarks](./benchmarks) directory.
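
If you want a quick local measurement before reaching for the benchmark scripts, a `timeit` comparison along these lines is one option. This sketch uses `functools.partial` as a stand-in for the callable returned by `config.make()`, so it only illustrates the measurement method; it does not reproduce the figures above, and results vary by machine.

```python
import functools
import timeit


def raw(data, multiplier=2.0):
    return sum(data) * multiplier


# Stand-in for config.make(): the hyperparameter is bound ahead of time.
bound = functools.partial(raw, multiplier=3.0)

data = [1.0, 2.0, 3.0]
n = 100_000

# Average time per call, in microseconds.
raw_us = timeit.timeit(lambda: raw(data, multiplier=3.0), number=n) / n * 1e6
bound_us = timeit.timeit(lambda: bound(data), number=n) / n * 1e6

print(f"raw call:   {raw_us:.3f} µs")
print(f"bound call: {bound_us:.3f} µs")
```

To measure hipr itself, swap `bound` for the callable produced by your own config's `.make()`.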

## Why hipr?

**Problem:** When building ML pipelines, scientific computing tools, or any configurable system, you often need to:
- Separate hyperparameters from runtime data
- Validate parameter ranges
- Serialize/deserialize configurations
- Compose configurations hierarchically

**Traditional approach:** Write lots of boilerplate Pydantic models, dataclasses, or config classes.

**With hipr:** Just annotate your function/class parameters with `Hyper[T]`, and get all of this for free.

## Comparison with other libraries

*This comparison reflects the author's design philosophy. Each tool excels in different contexts depending on your needs.*

| Feature | hipr | gin-config | hydra | tyro |
| :--- | :--- | :--- | :--- | :--- |
| **Core Philosophy** | Config from code: Function signature defines schema | Dependency injection: Global binding system | Hierarchical composition: YAML-first configuration | CLI from types: Type hints define interface |
| **Error Detection** | **Development + Runtime**: Invalid constraints caught during stub generation, decoration time, and runtime | Runtime only (when function executes) | Runtime (when config is composed) | CLI parse time |
| **Type Checking** | **Full support**: `.pyi` stubs enable complete IDE and type checker integration | None (string-based bindings) | Partial (improved with Structured Configs) | Full support (native dataclasses) |
| **Validation** | Pydantic validation at instantiation with automatic constraint checking | At function execution time | Schema-based validation (optional, with Structured Configs) | argparse + dataclass validation at startup |
| **State Management** | Stateless: Explicit `Config` objects, no globals | Global registry with singleton pattern | Composed state via OmegaConf | Stateless: CLI args parsed to config |
| **IDE Support** | Excellent: Pure Python with full autocomplete/refactoring | Limited: `.gin` files lack IDE integration | Good: YAML editing varies; Structured Configs provide autocomplete | Excellent: Native Python dataclasses |
| **Boilerplate** | Minimal: Just `@configurable` decorator | Minimal: `@gin.configurable` decorator | Moderate: YAML files + dataclass schemas + composition logic | Minimal: `tyro.cli()` call |
| **Serialization** | Native Pydantic: `model_dump()` / `model_dump_json()` | Custom format: Operative config logging | Strong: Built-in YAML save/load for job configs | Strong: YAML/JSON with dataclass support |
| **Nested Configs** | Native: Configs compose hierarchically with type safety | Supported via scoping | Excellent: Core feature with config groups | Supported via nested dataclasses |
| **Best For** | Type-safe Python APIs, ML experiments, library development | Google-style codebases, research experiments with DI patterns | Complex applications, multi-run experiments, config composition | Command-line tools, simple scripts, research utilities |

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License - see LICENSE file for details.

## Credits

Built with:
- [Pydantic](https://github.com/pydantic/pydantic) - Data validation using Python type annotations
- [annotated-types](https://github.com/annotated-types/annotated-types) - Reusable constraint types

---

**hipr** - Because configuration should be effortless.
