Metadata-Version: 2.1
Name: azalea
Version: 0.1.0
Summary: Hex board game AI with self-play learning based on the AlphaZero algorithm
Home-page: https://github.com/jseppanen/azalea
Author: Jarno Seppänen
Author-email: azalea@meit.si
License: BSD 3-Clause License
Description: 
        # Azalea
        
        > playing to learn to play
        
        Azalea is a reinterpretation of the [AlphaZero game AI](https://en.wikipedia.org/wiki/AlphaZero)
        learning algorithm for the [Hex board game](https://en.wikipedia.org/wiki/Hex_(board_game)).
        
        ## Features
        
        * Straightforward reimplementation of the AlphaZero algorithm except
          for MCTS parallelization (see below)
        * Pre-trained model for the Hex board game
        * Fast MCTS implementation through Numba JIT acceleration
        * Fast Hex move generation through Numba
        * Parallelized self-play to saturate an Nvidia V100 GPU during training
        * AI policy evaluation through a round-robin tournament, also parallelized
        * Tested on Ubuntu 16.04
        * Requires Python 3.6 and PyTorch 0.4
        
        ## Differences to published AlphaZero
        
        * Single GPU implementation only - tested on Nvidia V100, with 8 CPUs
          for move generation and MCTS, and 1 GPU for the policy network.
        * Only the Hex game is implemented, though the code supports adding
          more games. Two components are needed for a new game: a move
          generator and a policy network, with the board input and move
          output adjusted to the new game.
        * MCTS simulations are not run in parallel threads, but instead,
          self-play games are played in parallel processes. This is to avoid
          the need for a multi-threaded MCTS implementation while still
          maintaining fast training speed and saturating the GPU.
        * MCTS simulations and board evaluations are batched according to the
          `search_batch_size` config parameter. "Virtual loss" is used
          as in AlphaZero, to increase search diversity.
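        
        As a rough illustration of the batching with virtual loss described
        in the last item, the sketch below gathers a batch of leaves with
        PUCT selection and applies a virtual loss along each selected path,
        so that later selections in the same batch tend to reach different
        leaves. It is not Azalea's Numba implementation: `Node`,
        `select_leaf`, `collect_batch`, `backup` and the constants `C_PUCT`
        and `VIRTUAL_LOSS` are hypothetical names and values, and
        `batch_size` only stands in for the `search_batch_size` setting.
        
        ```python
        import math
        from dataclasses import dataclass, field
        
        C_PUCT = 1.5        # exploration constant (arbitrary value for this sketch)
        VIRTUAL_LOSS = 1.0  # temporary penalty per pending evaluation (assumed)
        
        
        @dataclass
        class Node:
            """Search-tree node; the field names are illustrative only."""
            prior: float            # P(s, a) from the policy network
            visits: int = 0         # N(s, a), including pending evaluations
            value_sum: float = 0.0  # W(s, a), seen by the player who chose this node
            children: dict = field(default_factory=dict)
        
            def q(self):
                return self.value_sum / self.visits if self.visits else 0.0
        
        
        def select_leaf(root):
            """Descend by PUCT and apply a virtual loss to each node on the path."""
            node, path = root, []
            while node.children:
                total = sum(child.visits for child in node.children.values())
        
                def puct(child):
                    explore = C_PUCT * child.prior * math.sqrt(total + 1) / (1 + child.visits)
                    return child.q() + explore
        
                node = max(node.children.values(), key=puct)
                node.visits += 1                # count the pending visit
                node.value_sum -= VIRTUAL_LOSS  # make the path look worse for now
                path.append(node)
            return node, path
        
        
        def collect_batch(root, batch_size):
            """Pick `batch_size` leaves before any network evaluation returns."""
            return [select_leaf(root) for _ in range(batch_size)]
        
        
        def backup(path, value):
            """Undo the virtual loss and add the real evaluation.
        
            `value` is from the viewpoint of the player to move at the leaf,
            so its sign flips at every step up the tree.
            """
            for node in reversed(path):
                value = -value
                node.value_sum += VIRTUAL_LOSS + value
        
        
        root = Node(prior=1.0, children={"a": Node(prior=0.6), "b": Node(prior=0.4)})
        batch = collect_batch(root, batch_size=2)
        ```
        
        With the virtual loss in place, the two selections land on different
        children. Without it, both would pick `a`, because no statistics
        would change between selections until the network results are
        backed up.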
        
        ## Installation
        
        Clone the repository and install dependencies with Conda:
        
            git clone https://github.com/jseppanen/azalea.git
            cd azalea
            conda env create -n azalea
            source activate azalea
        
        The default `environment.yml` installs GPU packages, but you can choose
        `environment-cpu.yml` instead (for example with
        `conda env create -f environment-cpu.yml -n azalea`) for testing on a
        laptop.
        
        ## Playing against pretrained model
        
            python play.py models/hex11-20180712-3362.policy.pth
        
        This will load the model and start playing, asking for your move. The
        columns are labeled a–k and rows 1–11. The first player, playing `X`'s,
        is trying to draw a vertical connected path through the board, while the
        second player, with `O`'s, is drawing a horizontal path.
        
        ```
        O O O O X . . . . . . 
         . . . . . . . . . . . 
          . . . . . . . . . . . 
           . . . . X . . . . . . 
            . . . . . X . . . . . 
             . . . . . . . . . . . 
              . . . . X . . . . . . 
               . . . . . . . . . . . 
                . . . X . . . . . . . 
          x      . . . . . . . . . . . 
         o\\      . . . . . . . . . . . 
        last move: e1
        Your move? 
        ```
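        
        The move notation and the winning condition described above can be
        sketched in a few lines of Python. This is only an illustration, not
        the project's Numba move generator: `parse_move`, `has_won` and the
        list-of-lists board are hypothetical, and the neighbour offsets are
        inferred from the slant of the printed board.
        
        ```python
        from collections import deque
        
        SIZE = 11
        # Neighbour offsets (row, col) for a board printed as above, where each
        # row is shifted right relative to the row before it.
        NEIGHBOURS = [(0, -1), (0, 1), (-1, 0), (-1, 1), (1, -1), (1, 0)]
        
        
        def parse_move(text):
            """Convert a move such as 'e1' into zero-based (row, col)."""
            col = ord(text[0].lower()) - ord("a")
            row = int(text[1:]) - 1
            if not (0 <= row < SIZE and 0 <= col < SIZE):
                raise ValueError(f"move out of range: {text!r}")
            return row, col
        
        
        def has_won(board, player):
            """Breadth-first search for a connected chain of `player` stones.
        
            'X' must link the top row to the bottom row, 'O' the left column
            to the right column.
            """
            n = len(board)
            if player == "X":
                starts = [(0, c) for c in range(n)]
            else:
                starts = [(r, 0) for r in range(n)]
            queue = deque(p for p in starts if board[p[0]][p[1]] == player)
            seen = set(queue)
            while queue:
                r, c = queue.popleft()
                # Goal edge: last row for 'X', last column for 'O'.
                if (r if player == "X" else c) == n - 1:
                    return True
                for dr, dc in NEIGHBOURS:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < n and 0 <= nc < n
                            and (nr, nc) not in seen
                            and board[nr][nc] == player):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            return False
        
        
        board = [["."] * SIZE for _ in range(SIZE)]
        for r in range(SIZE):
            board[r][4] = "X"  # a straight chain down column e
        print(parse_move("e1"), has_won(board, "X"), has_won(board, "O"))
        ```
        
        A breadth-first search is used here for brevity; a union-find
        structure updated after each move would be a common alternative when
        the check runs on every move.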
        
        ## Model training
        
            python train.py --config config/hex11_train_config.yml --rundir runs/train
        
        ## Model comparison
        
            python compare.py --config config/hex11_eval_config.yml --rundir runs/compare <model1> <model2> [model3] ...
        
        ## Model selection
        
            python tune.py
        
        ## References
        
        * [Mastering the Game of Go without Human Knowledge](https://deepmind.com/documents/119/agz_unformatted_nature.pdf)
        * [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm](https://arxiv.org/abs/1712.01815)
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
