Metadata-Version: 2.3
Name: vocalocator
Version: 1.0.0
Summary: Tool for sound-source localization of vocal calls
Project-URL: Homepage, https://github.com/neurostatslab/vocalocator
Project-URL: Issues, https://github.com/neurostatslab/vocalocator/issues
Author: NeuroStats Lab
License-Expression: MIT
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Requires-Dist: h5py>=3.9.0
Requires-Dist: matplotlib>=3.8.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pyjson5>=1.6.0
Requires-Dist: scipy>=1.11.0
Requires-Dist: torch>=2.0.0
Requires-Dist: torchaudio>=2.0.0
Requires-Dist: tqdm>=4.66.0
Description-Content-Type: text/markdown

# Vocal Call Locator
==============================

## Project Organization
------------

    ├── LICENSE
    ├── README.md                               <- The top-level README for developers using this project.
    │
    ├── trained_models                          <- Trained and serialized models, model predictions, or model summaries
    │
    ├── notebooks                               <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                                              and a short `-` delimited description, e.g.
    │                                              `1.0-initial-data-exploration`.
    │
    ├── requirements.txt                        <- The requirements file for reproducing the analysis environment
    │
    ├── vocalocator                                <- Scripts to create new models and train them.


## Installation
1. Clone this repository: `git clone https://github.com/neurostatslab/vocalocator.git`
2. Install prerequisites: `cd vocalocator & pipenv install`
3. Install with pip: `pip install .`

## Usage
1. Create a dataset. This should be an HDF5 file with the following datasets:

| Dataset group/name | Shape             | Data type | Description                                                                                                                                    |
|--------------------|-------------------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------|
| /audio     | (*, n_channels) | float     | All sound events concatenated along axis 0                                                                                                     |
| /length_idx           | (n + 1,)          | int       | Index into audio dataset. Sound event `i` should span the half open interval [`length_idx[i]`, `length_idx[i+1]`) and the first element should be 0. |
| /locations         | (n, 2)            | float     | Locations associated with each sound event. Only required for training.                                                                        |
1. Optionally, partition the dataset into a validation set and training set with the names `train_set.h5` and `val_set.h5`. Alternatively, the complete dataset can be passed in as a single .h5 file and an 80/20 train/val split will be automatically generated and saved. A directory containing training and validation indices by the names `train_set.npy` and `val_set.npy` can also be passed in through the `--indices` argument.
2. Create a config. This is a JSON file consisting of a single object whose properties correspond to the hyperparameters of the model and optimization algorithm.
3. Train a model:
   1. With SLURM: `sbatch run_gpu.sh /path/to/directory/containing/trainset/ /path/to/config.json`. Note that the first argument is expected to be a directory and the second argument is expected to be a file
   2. Without SLURM: `python -m vocalocator --data /path/to/directory/containing/trainset/ --config /path/to/config.json --save-path /path/to/model/weight/directory/ --indices /optional/path/to/index/directory`
4. Optionally, run with pretrained weights by pointing to the config.json within the model weight directory of the pretrained model.
5. Perform inference:
   1. With SLURM: `sbatch run_eval.sh /path/to/hdf5/dataset.h5 /path/to/model_dir/trained_models/config_name/#####/config.json /optional/output/path.h5`. 
   2. Without SLURM: `python -m vocalocator.assess --inference --data /path/to/hdf5/dataset.h5 --config /path/to/model_dir/trained_models/config_name/#####/config.json -o /optional/output/path.h5 --index /optional/index/path.npy`.
   3. Note that here, the data argument is a file rather than a directory and the config should point toward a config.json in a trained model's directory, just as in the finetuning step. If no output path is provided, predictions will be stored in the same directory as the input dataset.
   4. Output predictions will be stored in a dataset labeled "predictions" at the root of the HDF5 hierarchy

## Public datasets
See our [dataset website](https://users.flatironinstitute.org/~atanelus/) to learn more about and download our public datasets.