Metadata-Version: 2.1
Name: crowdplay-datasets
Version: 0.1.0
Summary: A collection of crowdsourced human demonstration datasets for offline learning
Home-page: https://github.com/mgerstgrasser/crowdplay
Author: Matthias Gerstgrasser
Author-email: matthias@seas.harvard.edu
License: MIT
Project-URL: Documentation, https://mgerstgrasser.github.io/crowdplay/
Project-URL: Paper, https://openreview.net/pdf?id=qyTBxTztIpQ
Project-URL: Bug Reports, https://github.com/mgerstgrasser/crowdplay/issues
Project-URL: Source, https://github.com/mgerstgrasser/crowdplay
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.7, <4
Description-Content-Type: text/markdown
Requires-Dist: SQLAlchemy (>=1.3.20)
Requires-Dist: gym (>=0.23.1)
Requires-Dist: opencv-python (>=4.4.0.46)

# The CrowdPlay Atari Dataset

The CrowdPlay Atari dataset is a collection of over 300 hours of human Atari 2600 demonstrations. In addition to vanilla gameplay, it includes substantial amounts of multiagent data, including human-human and human-AI data, as well as multimodal behavioral data, where participants were asked to follow a specific behavior in the game.

## Installing the CrowdPlay Atari Dataset

### 1. Installing the Python Package

Run `pip install crowdplay_datasets` to install the dataset package.

### 2. Downloading and Extracting the Dataset

Then, run `python -m crowdplay_datasets.install --dataset=crowdplay_atari-v0` to download and extract the actual dataset. The dataset is about 15 GB in size, but installation temporarily requires about 30 GB of disk space.
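Because installation temporarily needs roughly twice the final dataset size, it can help to check free disk space first. The following is a minimal sketch using only the Python standard library. The helper name `has_room_for_install` is hypothetical and not part of `crowdplay_datasets`, and it assumes decimal gigabytes and that the dataset will be written to the filesystem holding the path you pass in.

```python
import shutil

# Assumption: "GB" in the README means decimal gigabytes (10**9 bytes).
GB = 10**9


def has_room_for_install(path=".", required_gb=30):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free.

    Hypothetical helper for illustration; not part of crowdplay_datasets.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * GB


if __name__ == "__main__":
    # Check the current directory against the ~30 GB temporary requirement.
    if has_room_for_install(".", required_gb=30):
        print("Enough free space for installation.")
    else:
        print("Less than 30 GB free; installation may fail.")
```

Pass the directory where the dataset will actually be stored instead of `"."` if it lives on a different filesystem.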

### 3. Optional: Re-pack into Gzip

Trajectories are compressed using bzip, which is space-efficient but slow to decompress. If you will load trajectories many times, you can re-pack the dataset into gzip using the `scripts/convert_to_gzip.sh` script; note that this requires around 250 GB of disk space. A second script unpacks the dataset entirely, but this requires around 10 TB of space and is not noticeably faster than gzip. Run either script from inside the dataset directory (its location is shown during installation).
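To illustrate what re-packing means for a single file, here is a hedged sketch using Python's standard `bz2` and `gzip` modules. It is not the bundled `scripts/convert_to_gzip.sh` script, which remains the intended route. The function name `repack_bz2_to_gzip` is hypothetical, and the sketch assumes each trajectory is a standalone bzip2-compressed file, which the dataset layout may not match exactly.

```python
import bz2
import gzip
import shutil
from pathlib import Path


def repack_bz2_to_gzip(src, dst=None):
    """Stream-decompress a bzip2 file and write the same bytes as gzip.

    Hypothetical helper for illustration; not part of crowdplay_datasets.
    If `dst` is omitted, the final suffix of `src` is replaced by ".gz".
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix(".gz")
    # copyfileobj streams in chunks, so the whole file never sits in memory.
    with bz2.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The trade-off is the one described above: gzip output is larger on disk but faster to decompress on repeated loads.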

## More Information

For more information see [https://mgerstgrasser.github.io/crowdplay/](https://mgerstgrasser.github.io/crowdplay/).


