Metadata-Version: 2.1
Name: digitallab
Version: 1.3.0.2
Summary: digitallab is a python package for conducting large-scale computational experiments.
Home-page: https://gitlab.com/Dnis/dlab
Author: Dennis Kreber
Author-email: dnis.kk@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: tqdm
Requires-Dist: pymongo
Requires-Dist: sacred
Requires-Dist: seaborn
Requires-Dist: matplotlib
Requires-Dist: tables

# Digital Lab (digitallab)
## Introduction
dlab is a Python package for conducting large-scale computational experiments. The underlying framework
is based on the module [sacred](https://sacred.readthedocs.io/en/stable/). dlab extends sacred by
allowing batches of experiments, repetitions of experiments with different seeds, and parallel execution
of experiments. Furthermore, it provides tools to evaluate the experiments via plots or tables.

## Dependencies
### Python packages:
* numpy
* pymongo
* tqdm
* sacred
* pandas
* seaborn
* matplotlib
* pytables (published on PyPI as `tables`)

### Other:
- MongoDB server

## Installation
### Via pip
Run

    pip install --user digitallab

### From source
Clone the project to your hard drive and run the command

    python3 setup.py install --user

in the project folder.

## Example
### Conducting experiments
Assume we want to compare the run time and quality of three methods (`fast`, `slow`, `special`).
`fast` and `slow` take the same arguments, while `special` requires an extra parameter.
We have the following functions:

Generating input data:

    def generate_input_data_from_seed(size, seed):
        # do something...
        return input_data
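
As a minimal sketch of what such a generator could look like, the following version draws `size` random floats from a generator seeded with `seed`. The uniform distribution and the list return type are assumptions made for illustration. The only point is that the same `(size, seed)` pair always yields the same data.

```python
import random


def generate_input_data_from_seed(size, seed):
    # Use a private generator so the global random state stays untouched.
    rng = random.Random(seed)
    # Hypothetical input: `size` floats drawn uniformly from [0, 1).
    return [rng.random() for _ in range(size)]
```

Calling the function twice with the same arguments returns identical lists, which is what makes repeated experiments comparable.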

The `fast` and `slow` method:

    def fast(input_data, time_limit):
        # do something and set the variables quality and run_time...
        ret = dict()
        ret["quality"] = quality
        ret["run_time"] = run_time
        return ret

    def slow(input_data, time_limit):
        # do something and set the variables quality and run_time...
        ret = dict()
        ret["quality"] = quality
        ret["run_time"] = run_time
        return ret
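
To make the expected return value concrete, here is a toy stand-in for `fast`. It is only a sketch: the "quality" (the largest value seen) is a hypothetical measure, and we assume that `time_limit` is given in seconds.

```python
import time


def fast(input_data, time_limit):
    start = time.perf_counter()
    best = float("-inf")
    for value in input_data:
        # Stop early once the time budget (assumed to be in seconds) is used up.
        if time.perf_counter() - start > time_limit:
            break
        best = max(best, value)
    run_time = time.perf_counter() - start
    # Return the results as a dictionary, as in the placeholders above.
    return {"quality": best, "run_time": run_time}
```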

The `special` method:

    def special(input_data, special_parameter, time_limit):
        # do something and set the variables quality and run_time...
        ret = dict()
        ret["quality"] = quality
        ret["run_time"] = run_time
        return ret

Now we can run the experiments. First, we have to set up the lab:

    from dlab.lab import Lab


    # URL to MongoDB
    mongodb_url = "localhost"

    # MongoDB database name, here: experiments
    mongodb_db = "experiments"

    # create the lab
    lab = Lab("example", mongodb_url, mongodb_db)

Then we define two dictionaries that describe our experiments. `dlab` passes every
possible combination of the listed parameter values to our experiment function. Every
parameter combination is submitted as often as specified by the field `number_of_repetitions`,
each time with a different seed. A field `seed` holding that seed is added to each config.
The seeds are reproducible: if the results are deleted and the experiments are repeated,
the same seeds are used again.

Mandatory keys in the settings are `experiment`, `sub_experiment`, `version`, and
`number_of_repetitions`.

    settings_fast_slow_experiments = {
            "experiment": "example",
            "sub_experiment": "fast_slow",
            "version": "1",
            "number_of_repetitions": 10,
            "method": ["fast", "slow"],
            "time_limit": 60,
            "size": [10, 100]
            }

    settings_special_experiments = {
            "experiment": "example",
            "sub_experiment": "special",
            "version": "1",
            "number_of_repetitions": 10,
            "method": "special",
            "time_limit": 60,
            "size": [10, 100],
            "special_parameter": [100, 1000]
            }
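
To illustrate what "every possible combination" means for such a dictionary, the following sketch expands the settings with `itertools.product`. It does not show how `dlab` works internally: the helper `expand_settings` and the `repetition` field are hypothetical, and we assume that list values are treated as choices while scalar values are fixed.

```python
import itertools

MANDATORY_KEYS = ("experiment", "sub_experiment", "version", "number_of_repetitions")


def expand_settings(settings):
    """Return one config dict per parameter combination and repetition."""
    fixed = {key: settings[key] for key in MANDATORY_KEYS}
    # Treat every other value as a list of choices; scalars become one-element lists.
    grid = {
        key: value if isinstance(value, list) else [value]
        for key, value in settings.items()
        if key not in MANDATORY_KEYS
    }
    configs = []
    for combination in itertools.product(*grid.values()):
        for repetition in range(settings["number_of_repetitions"]):
            config = dict(fixed)
            config.update(zip(grid.keys(), combination))
            # Hypothetical bookkeeping field; dlab adds a `seed` field instead.
            config["repetition"] = repetition
            configs.append(config)
    return configs


settings_fast_slow_experiments = {
    "experiment": "example",
    "sub_experiment": "fast_slow",
    "version": "1",
    "number_of_repetitions": 10,
    "method": ["fast", "slow"],
    "time_limit": 60,
    "size": [10, 100],
}

# 2 methods x 1 time limit x 2 sizes x 10 repetitions
print(len(expand_settings(settings_fast_slow_experiments)))  # prints 40
```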

Finally, we define our experiment function and run the experiments:

    @lab.experiment
    def main(_config):
        # generate input data
        input_data = generate_input_data_from_seed(_config["size"], _config["seed"])

        if _config["method"] == "fast":
            # Run fast method
            return fast(input_data, _config["time_limit"])
        if _config["method"] == "slow":
            # Run slow method
            return slow(input_data, _config["time_limit"])
        if _config["method"] == "special":
            # Run special method
            return special(input_data, _config["special_parameter"], _config["time_limit"])

    lab.run_experiments(main, settings_fast_slow_experiments, number_of_parallel_runs=4)
    lab.run_experiments(main, settings_special_experiments, number_of_parallel_runs=4)
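
As the number of methods grows, the chain of `if` statements can be replaced by a lookup table. The sketch below is one possible layout, not part of `dlab`: `METHODS` and `run_method` are hypothetical names, and the three methods are replaced by stand-ins so that the snippet runs without a database. An unknown method name raises an error instead of silently returning `None`.

```python
def fast(input_data, time_limit):
    # Stand-in for the real method.
    return {"quality": 1.0, "run_time": 0.1}


def slow(input_data, time_limit):
    # Stand-in for the real method.
    return {"quality": 2.0, "run_time": 1.0}


def special(input_data, special_parameter, time_limit):
    # Stand-in for the real method.
    return {"quality": 3.0, "run_time": 0.5}


# Map each method name to a callable that picks its arguments from the config.
METHODS = {
    "fast": lambda data, cfg: fast(data, cfg["time_limit"]),
    "slow": lambda data, cfg: slow(data, cfg["time_limit"]),
    "special": lambda data, cfg: special(
        data, cfg["special_parameter"], cfg["time_limit"]
    ),
}


def run_method(input_data, config):
    try:
        method = METHODS[config["method"]]
    except KeyError:
        raise ValueError(f"unknown method: {config['method']!r}") from None
    return method(input_data, config)
```

Inside the experiment function, the body after generating the input data would then reduce to `return run_method(input_data, _config)`.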

### Evaluating experiments
To be done...



## ToDos
The project is a work in progress and there are still some tasks to be done:
* Documentation
* Examples
* Support for additional databases (SQL, TinyDB, local storage)

