Metadata-Version: 2.0
Name: pytorch-es
Version: 0.1.0
Summary: Evolutionary Strategies using PyTorch
Home-page: https://github.com/staturecrane/PyTorch-ES
Author: Richard Herbert
Author-email: richard.alan.herbert@gmail.com
License: MIT
Keywords: machine learning,ai,evolutionary strategies,reinforcement learning,pytorch
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Internet :: WWW/HTTP
Requires-Dist: evostra (==1.0.1)
Requires-Dist: gym (==0.10.4)
Requires-Dist: gym[box2d] (==0.10.4)
Requires-Dist: gym[atari] (==0.10.4)
Requires-Dist: keras (==2.1.5)
Requires-Dist: numpy (==1.14.2)
Requires-Dist: Pillow (==5.0.0)
Requires-Dist: scikit-image (==0.13.1)
Requires-Dist: tensorflow (==1.6.0)

# Evolutionary Strategies in PyTorch

![](https://media.giphy.com/media/30pEMgYfiPliU87swt/giphy.gif)

A set of tools based on [evostra](https://github.com/alirezamika/evostra) for using [OpenAI's evolutionary strategies](https://blog.openai.com/evolution-strategies/) in PyTorch. Keras implementations using evostra will be provided with some examples.

TABLE OF CONTENTS
=================

- [Installation](#installation)
- [Usage](#usage)
- [Run](#run)
- [Examples](#examples)

## Installation

Your system needs all the prerequisites for the minimal installation of OpenAI gym. These will differ by operating system, so please refer to the [gym repository](https://github.com/openai/gym) for detailed instructions for your build. You also need to install the PyTorch distribution of your [choice](http://pytorch.org/). You can trigger CUDA ops by passing in ```-c``` or ```--cuda``` to the training examples.

Following that, create a conda or virtualenv environment and run:

```shell
pip install -r requirements.txt
```

## Usage

You will find the strategy classes (one as of now) within ```evolutionary_strategies/strategies```. These classes are designed to be used with PyTorch models and take two required arguments: a list of PyTorch Variables that correspond to the model's parameter layers, and a function that takes a set of weights and returns a reward. The remaining arguments are optional hyperparameters. For example:

```python
import copy
from functools import partial

from evolution.strategies import EvolutionModule


def get_reward(weights, model):
    """
    Run the model with the candidate weights and return a reward.

    `weights` comes first so that it remains the positional argument
    once `model` is bound with partial.
    """
    cloned_model = copy.deepcopy(model)
    for i, param in enumerate(cloned_model.parameters()):
        try:
            param.data = weights[i]
        except Exception:
            # fall back to the underlying tensor if weights[i] is wrapped
            param.data = weights[i].data

    # run environment and return reward as an integer or float
    return 100


cuda = False  # set to True to run CUDA ops
model = generate_pytorch_model()  # placeholder: build your own model here
# EvolutionModule runs the population in a ThreadPool, so
# if you need to inject other arguments, you can do that
# using the partial tool
partial_func = partial(get_reward, model=model)
mother_parameters = list(model.parameters())

es = EvolutionModule(
    mother_parameters, partial_func, population_size=100,
    sigma=0.1, learning_rate=0.001,
    reward_goal=200, consecutive_goal_stopping=20,
    threadcount=10, cuda=cuda, render_test=True
)
# the iteration count here is illustrative
es.run(1000, print_step=10)
```
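
The `partial` pattern can be tried on its own, without PyTorch or gym. The sketch below assumes the reward function is called with a single positional argument (the candidate weights) from a thread pool. The `scale` argument and the toy reward are hypothetical stand-ins, and this is not the internals of `EvolutionModule`.

```python
from functools import partial
from multiprocessing.pool import ThreadPool


def get_reward(weights, scale):
    # Stand-in reward: a real function would load `weights` into a model
    # and run an episode. `scale` is a hypothetical extra argument.
    return scale * sum(weights)


# Bind the extra argument by keyword so the one positional slot
# stays free for the candidate weights.
reward_fn = partial(get_reward, scale=2.0)

# Stand-in population: each candidate is a flat list of numbers.
population = [[0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]

# Evaluate every candidate in parallel; map preserves input order.
with ThreadPool(2) as pool:
    rewards = pool.map(reward_fn, population)

print(rewards)
```

Binding a leading positional parameter by keyword instead would make each call fail with a "multiple values for argument" error. This is why the weights parameter goes first in the reward function's signature.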

* EvolutionModule
    - init
        - parameters (list of PyTorch Variables)
        - reward_function => float (runs episode and returns a reward)
        - population_size=50
        - sigma=0.1
        - learning_rate=0.001
        - decay=1.0
        - sigma_decay=1.0
        - threadcount=4
        - render_test=False
        - cuda=False
        - reward_goal=None
        - consecutive_goal_stopping=None (stops after n tests consecutively return rewards equal-to or greater-than goal)
        - save_path=None (path to save weights at test times)
    - run
        - iterations
        - print_step=10 (frequency with which to run test and save weights)
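
To give a feel for what these hyperparameters control, here is a minimal NumPy sketch of an OpenAI-style evolution strategies loop. It is an illustration under assumptions, not the `EvolutionModule` implementation. The function `evolve`, the flat parameter vector, the reward standardization, and the toy quadratic reward are all hypothetical choices made for the example.

```python
import numpy as np


def evolve(reward_fn, theta, population_size=50, sigma=0.1,
           learning_rate=0.001, decay=1.0, sigma_decay=1.0,
           reward_goal=None, consecutive_goal_stopping=None,
           iterations=100, seed=0):
    """Toy evolution-strategies loop over a flat NumPy parameter vector."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    streak = 0  # consecutive tests that met reward_goal
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noise = rng.standard_normal((population_size, theta.size))
        rewards = np.array([reward_fn(theta + sigma * n) for n in noise])
        std = rewards.std()
        if std > 0:
            # Standardize rewards so the step does not depend on reward scale.
            advantages = (rewards - rewards.mean()) / std
            # Move theta toward perturbations that scored above average.
            step = learning_rate / (population_size * sigma)
            theta += step * (noise.T @ advantages)
        # Anneal the step size and noise scale (1.0 means no decay).
        learning_rate *= decay
        sigma *= sigma_decay
        # Early stopping: n consecutive tests at or above the goal.
        if reward_goal is not None and consecutive_goal_stopping is not None:
            streak = streak + 1 if reward_fn(theta) >= reward_goal else 0
            if streak >= consecutive_goal_stopping:
                break
    return theta


# Toy problem: the reward peaks (at zero) when the weights equal `target`.
target = np.array([0.5, -0.3, 0.1])


def toy_reward(weights):
    return -float(np.sum((weights - target) ** 2))


start = np.zeros(3)
result = evolve(toy_reward, start, population_size=50, sigma=0.1,
                learning_rate=0.005, iterations=200)
print(toy_reward(start), toy_reward(result))
```

In this sketch a larger `sigma` explores more widely but gives a noisier update, while `decay` and `sigma_decay` below 1.0 shrink the learning rate and the noise scale on every iteration.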

## Run

You can run the examples in the following manner:

```shell
PYTHONPATH=. python evolutionary_strategies/examples/cartpole/train_pytorch.py --weights_path cartpole_weights.p
```
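
For orientation, a training script could expose the `--weights_path` and `--cuda` flags roughly as follows. This is a sketch that assumes the examples use `argparse`. The help strings and the choice to make `--weights_path` required are guesses, not taken from the example scripts.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags mentioned in this README.
    parser = argparse.ArgumentParser(
        description="Train an agent with evolutionary strategies"
    )
    parser.add_argument(
        "--weights_path", type=str, required=True,
        help="where to save the trained weights",
    )
    parser.add_argument(
        "-c", "--cuda", action="store_true",
        help="run CUDA ops on the GPU",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--weights_path", "cartpole_weights.p", "--cuda"]
)
print(args.weights_path, args.cuda)
```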

## Examples

### Lunar Lander

Solved in ~1200 iterations: population=100, sigma=0.01, learning_rate=0.001.

![](https://media.giphy.com/media/30pEMgYfiPliU87swt/giphy.gif)

### Cartpole

Solved in 200 iterations: population=10, sigma=0.1, learning_rate=0.001.

![](https://media.giphy.com/media/5h9xfw3BXvztG4HVBi/giphy.gif)



