Metadata-Version: 2.1
Name: pyvww
Version: 0.1.1
Summary: Python API to work with the Visual Wake Words Dataset.
Home-page: https://github.com/Mxbonn/visualwakewords
Author: Maxim Bonnaerens
Author-email: maxim@bonnaerens.be
License: Apache 2.0
Platform: UNKNOWN
Description-Content-Type: text/markdown
Requires-Dist: pycocotools
Requires-Dist: torchvision

# Visual Wake Words Dataset
Python library to work with the [Visual Wake Words Dataset](https://arxiv.org/abs/1906.05721), 
comparable to [pycococools](https://github.com/cocodataset/cocoapi) for the COCO dataset.

`pyvww.utils.VisualWakeWords` inherits from `pycocotools.coco.COCO` and can be used in an similar fashion.

`pyvww.pytorch.VisualWakeWordsClassification` is a pytorch `Dataset` which can be used like any 
image classification dataset.

 ---
 ### Installation
 The code is implemented in Python 3.7 and can be installed with pip:
 ```bash
 pip install pyvww
 ```

 ### Usage
 The Visual Wake Words Dataset is derived from the publicly available [COCO](cocodataset.org/#/home) dataset.
 To download the COCO dataset use the script `download_coco.sh`
 ```bash
bash scripts/download_mscoco.sh path-to-COCO-dataset year
```
Where `year` is an optional argument that can be either 2014 (default) or 2017.

The Visual Wake Words Dataset evaluates the accuracy on the [minival image ids](https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/mscoco_minival_ids.txt),
and for training uses the remaining 115k images of the COCO training/validation dataset.

To create COCO annotation files that converts the 2014 or 2017 split to the minival split use:
`scripts/create_coco_train_minival_split.py`
```bash
TRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_train2014.json"
VAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_val2014.json"
DIR="path-to-mscoco-dataset/annotations/"
python scripts/create_coco_train_minival_split.py \
  --train_annotations_file="${TRAIN_ANNOTATIONS_FILE}" \
  --val_annotations_file="${VAL_ANNOTATIONS_FILE}" \
--output_dir="${DIR}"
```
(2014 can be replaced by 2017 if you downloaded the 2017 dataset)

The process of creating the Visual Wake Words dataset from COCO dataset is as follows.
Each image is assigned a label 1 or 0. 
The label 1 is assigned as long as it has at least one bounding box corresponding 
to the object of interest (e.g. person) with the box area greater than a certain threshold 
(e.g. 0.5% of the image area).

To generate the new annotations, use the script `scripts/create_visualwakewords_annotations.py`.
```bash
MAXITRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_maxitrain.json"
MINIVAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_minival.json"
VWW_OUTPUT_DIR="new-path-to-visualwakewords-dataset/annotations/"
python scripts/create_visualwakewords_annotations.py \
  --train_annotations_file="${MAXITRAIN_ANNOTATIONS_FILE}" \
  --val_annotations_file="${MINIVAL_ANNOTATIONS_FILE}" \
  --output_dir="${VWW_OUTPUT_DIR}" \
  --threshold=0.005 \
  --foreground_class='person'
```

The generated annotations follow the [COCO Data format](http://cocodataset.org/#format-data).
```
{
  "info" : info, 
  "images" : [image], 
  "annotations" : [annotation], 
  "licenses" : [license],
}

info{
  "year" : int, 
  "version" : str, 
  "description" : str, 
  "url" : str, 
}

image{
  "id" : int, 
  "width" : int, 
  "height" : int, 
  "file_name" : str, 
  "license" : int, 
  "flickr_url" : str, 
  "coco_url" : str, 
  "date_captured" : datetime,
}

license{
  "id" : int, 
  "name" : str, 
  "url" : str,
}

annotation{
  "id" : int, 
  "image_id" : int, 
  "category_id" : int, 
  "area" : float, 
  "bbox" : [x,y,width,height], 
  "iscrowd" : 0 or 1,
}
```

### Pytorch Dataset

The `pyvww.pytorch.VisualWakeWordsClassification` can be used in pytorch like any other pytorch image classification
dataset such as MNIST or ImageNet.

```python
import torch
import pyvww

train_dataset = pyvww.pytorch.VisualWakeWordsClassification(root="path-to-mscoco-dataset/all", 
                    annFile=".../visualwakewords/annotations/instances_train.json")
```




