Metadata-Version: 2.4
Name: operant
Version: 0.3.2
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: numpy>=2.0
Requires-Dist: gymnasium>=1.0 ; extra == 'dev'
Requires-Dist: pytest>=8.0 ; extra == 'dev'
Provides-Extra: dev
License-File: LICENSE
Summary: High-performance SIMD-optimized RL environments in Rust
Keywords: reinforcement-learning,rl,simd,gymnasium,environments
Author-email: goshea <goshea@me.com>
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Repository, https://github.com/galenoshea/operant

# Operant

[![PyPI version](https://badge.fury.io/py/operant.svg)](https://badge.fury.io/py/operant)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

High-performance SIMD-optimized Gymnasium-compatible reinforcement learning environments in Rust with Python bindings.

**~600x faster than Gymnasium** for vectorized environments.

## What is This?

Operant provides **native Rust implementations** of Gymnasium environments with:
- **SIMD vectorization**: Process 8 environments simultaneously per instruction (AVX2)
- **Struct-of-Arrays layout**: Cache-friendly memory access patterns
- **Zero-copy numpy**: Direct array access without Python overhead
- **Gymnasium compatibility**: Drop-in replacement for standard Gym environments

Unlike [PufferLib](https://github.com/PufferAI/PufferLib) which wraps existing Gymnasium environments for vectorization, Operant implements environments **natively in Rust** for maximum performance.

## Supported Environments

| Environment | State Dim | Action Space | Physics | Reward |
|-------------|-----------|--------------|---------|---------|
| CartPole | 4 | Discrete(2) | Inverted pendulum balance | +1 per step alive |
| MountainCar | 2 | Discrete(3) | Sparse reward climbing | -1 per step |
| Pendulum | 3 | Continuous(1) | Swing-up control | Cost minimization |

All environments provide Gymnasium-compatible `observation_space` and `action_space` properties for easy integration with RL frameworks.

## Performance

```
CartPole Benchmark (4096 envs)
============================================================
Operant...     97.54M steps/sec
Gymnasium...    0.16M steps/sec

Speedup: ~600x faster than Gymnasium
```

## Requirements

- Python 3.10+

## Installation

### From PyPI (Recommended)

```bash
pip install operant
```

### From Source (Development)

Requires Rust nightly and Poetry:

```bash
# 1. Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# 2. Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

# 3. Setup project
poetry install
poetry run maturin develop --release
```

## Usage

### CartPole (Discrete Actions)

```python
import numpy as np
from operant.envs import CartPoleVecEnv

# Create 4096 parallel environments
num_envs = 4096
env = CartPoleVecEnv(num_envs)
obs = env.reset(seed=42)  # Shape: (4096, 4)

for step in range(10000):
    actions = np.random.randint(0, 2, size=num_envs, dtype=np.int32)
    obs, rewards, terminals, truncations = env.step(actions)
```

### MountainCar (Discrete Actions)

```python
from operant.envs import MountainCarVecEnv

num_envs = 4096
env = MountainCarVecEnv(num_envs)
obs = env.reset(seed=42)  # Shape: (4096, 2)

for step in range(10000):
    actions = np.random.randint(0, 3, size=num_envs, dtype=np.int32)
    obs, rewards, terminals, truncations = env.step(actions)
```

### Pendulum (Continuous Actions)

```python
from operant.envs import PendulumVecEnv

num_envs = 4096
env = PendulumVecEnv(num_envs)
obs = env.reset(seed=42)  # Shape: (4096, 3) - [cos(θ), sin(θ), θ_dot]

for step in range(10000):
    actions = np.random.uniform(-2.0, 2.0, size=num_envs).astype(np.float32)
    obs, rewards, terminals, truncations = env.step(actions)
```

### Logging and Metrics

```python
from operant.utils import Logger

# Context manager automatically handles cleanup
with Logger(csv_path="training.csv") as logger:
    for step in range(1000):
        # ... training loop ...
        logger.log(steps=num_envs, reward=mean_reward, length=mean_length)
```

## Migration from v0.1.x

**Old imports (deprecated)**:
```python
from operant import PyCartPoleVecEnv, Logger
```

**New imports (recommended)**:
```python
from operant.envs import CartPoleVecEnv
from operant.utils import Logger
```

The old import style will continue to work until v0.4.0, but will emit deprecation warnings.

## Benchmarks

### Quick Benchmark

Compare Operant at 4096 environments:

```bash
poetry run python benches/cartpole_benchmark.py
```

### Full Benchmark

Test across multiple environment counts (1, 16, 256, 1024, 4096):

```bash
poetry run python benches/cartpole_benchmark.py --all
```

## Architecture

Operant uses a Struct-of-Arrays (SoA) memory layout with SIMD vectorization:

- **f32x8 SIMD**: Processes 8 environments simultaneously per instruction
- **SoA Layout**: Cache-friendly memory access patterns
- **Zero-copy**: Direct numpy array access without Python overhead
- **Rust + PyO3**: Native performance with Python ergonomics

## Development

```bash
# Run tests
poetry run pytest

# Build in debug mode (faster compilation)
poetry run maturin develop

# Build in release mode (faster runtime)
poetry run maturin develop --release
```

## License

MIT

