Metadata-Version: 2.4
Name: rl-practicum-I054
Version: 0.1
Summary: A simple package providing common Reinforcement Learning utility functions
Author: RL Practicum
Author-email: 
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.11.1
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# RL Utils

A small Python package of common Reinforcement Learning utility functions.

## Installation

```bash
pip install rl-practicum-I054
```

## Features

This package provides essential utility functions for implementing Reinforcement Learning algorithms:

- **Epsilon-Greedy Action Selection**: Classic exploration-exploitation strategy
- **Q-Learning Update**: Implementation of the Q-learning algorithm update rule
- **Softmax Action Selection**: Boltzmann exploration strategy
- **Q-Table Initialization**: Helper function to initialize Q-tables

## Usage

### Epsilon-Greedy Action Selection

```python
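from rl_utils import epsilon_greedy_action

# Estimated value of each of the 4 actions in the current state
q_values = [0.5, 0.3, 0.8, 0.2]

# epsilon=0.1: explore with probability 0.1, otherwise exploit
action = epsilon_greedy_action(q_values, epsilon=0.1, num_actions=4)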
```
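
For readers who want to see the idea behind the call above, here is a minimal sketch of epsilon-greedy selection using only the standard library. It is not the package's implementation: the function name `epsilon_greedy_sketch` and the `rng` parameter are hypothetical, and ties are assumed to resolve to the lowest action index.

```python
import random


def epsilon_greedy_sketch(q_values, epsilon, rng=random):
    """Hypothetical stand-in for illustration, not the rl_utils code."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value
    # (the first such index if several are tied).
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.5, 0.3, 0.8, 0.2]
greedy_action = epsilon_greedy_sketch(q_values, epsilon=0.0)   # always exploits
random_action = epsilon_greedy_sketch(q_values, epsilon=1.0)   # always explores
```

With `epsilon=0.0` the sketch always returns the index of the largest Q-value, and with `epsilon=1.0` it always samples uniformly, which makes the two extremes easy to check by hand.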

### Q-Learning Update

```python
from rl_utils import q_learning_update, initialize_q_table

# Initialize a Q-table for 10 states and 4 actions
q_table = initialize_q_table(num_states=10, num_actions=4)

# Apply one Q-learning update for the transition
# (state=0, action=1) -> reward=10.0, next_state=2
q_table, new_q = q_learning_update(
    q_table=q_table,
    state=0,
    action=1,
    reward=10.0,
    next_state=2,
    alpha=0.1,   # learning rate
    gamma=0.9,   # discount factor
    num_actions=4
)
```
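
The standard Q-learning rule is `Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))`. The sketch below works through that rule on a plain list-of-lists table so the arithmetic is visible. It is an illustration only: `q_update_sketch` is a hypothetical name, and a zero-initialized table is an assumption here, since the README does not say what `initialize_q_table` returns.

```python
def q_update_sketch(q_table, state, action, reward, next_state, alpha, gamma):
    """Hypothetical stand-in for illustration, not the rl_utils code."""
    # Value of the best action available in the next state.
    best_next = max(q_table[next_state])
    # Temporal-difference target and error for this transition.
    td_target = reward + gamma * best_next
    td_error = td_target - q_table[state][action]
    # Move the current estimate a fraction alpha toward the target (in place).
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


# Assumption: a 10 x 4 table filled with zeros.
sketch_table = [[0.0] * 4 for _ in range(10)]
sketch_new_q = q_update_sketch(
    sketch_table, state=0, action=1, reward=10.0,
    next_state=2, alpha=0.1, gamma=0.9,
)
```

Under the zero-initialization assumption, the target is `10.0 + 0.9 * 0.0 = 10.0`, so the updated entry is `0.0 + 0.1 * 10.0 = 1.0`; every other entry stays at zero.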

### Softmax Action Selection

```python
from rl_utils import softmax_action_selection

q_values = [0.5, 0.3, 0.8, 0.2]
action = softmax_action_selection(q_values, temperature=1.0)
```
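
Boltzmann (softmax) exploration samples each action with probability proportional to `exp(Q / temperature)`. The following is a minimal standard-library sketch of that idea, not the package's implementation: the names `softmax_probabilities` and `softmax_action_sketch` and the `rng` parameter are hypothetical.

```python
import math
import random


def softmax_probabilities(q_values, temperature=1.0):
    """Hypothetical helper: turn Q-values into action probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in q_values]
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action_sketch(q_values, temperature=1.0, rng=random):
    """Hypothetical stand-in for illustration, not the rl_utils code."""
    probs = softmax_probabilities(q_values, temperature)
    # Draw one action index according to the softmax probabilities.
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


q_values = [0.5, 0.3, 0.8, 0.2]
probs = softmax_probabilities(q_values, temperature=1.0)
sampled_action = softmax_action_sketch(q_values, temperature=1.0)
```

A lower temperature concentrates probability on the highest-valued action, while a higher temperature moves the distribution toward uniform.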

## Requirements

- Python >= 3.6
- numpy >= 1.11.1

## License

MIT License

## Author

Created for the RL Practicum course.

