Metadata-Version: 2.4
Name: auditory_models
Version: 1.0.1
Summary: Computation of auditory models
Author-email: Max Zimmermann <zimmermannmax16@gmail.com>
License-Expression: GPL-3.0-or-later
Project-URL: Repository, https://gitlab.com/zimmermannmax16/auditory_models.git
Keywords: audio,quality,speech,perception,intelligibility,model,auditory,stoi,gpsm,snr
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: Microsoft :: Windows :: Windows 10
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: COPYING
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: soundfile
Dynamic: license-file

# auditory_models

## Description

This repository provides several packages that compute auditory models, including:
- Short Term Objective Intelligibility ([STOI](https://ieeexplore.ieee.org/abstract/document/5713237))
- Generalized Power Spectrum Model for audio quality ([GPSMq](https://ieeexplore.ieee.org/abstract/document/8708700))
- Binaural Auditory-Model-based Quality prediction ([BAM-Q](https://d1wqtxts1xzle7.cloudfront.net/118608373/jaes.2017.003720241002-1-yep3my-libre.pdf?1727912710=&response-content-disposition=inline%3B+filename%3DAssessment_and_Prediction_of_Binaural_As.pdf&Expires=1774029700&Signature=a-MjugzXPu3anOdmG6T55CNOh9lohScOizDtZOCO97m~PRu3k0Em3J~FuNz42gdBPN6WW4-wlkToIBRIJGbVm3RID4flgHls7dOANo~l0lOTh41yiRFscWcNtnEQTIAPq8DIw6046Ln1t72l~h9q3rgM7psceD~AIy62YcSuwClI3qhapFkKW3hMDhxBTkIbwYgX~LQS8g31ptTf4K-0ti-F5xk6Kbo1gP9dbddqCFASCbVjK~FcrfNTNtVVojZ98D-rlckxa6ZucHuDlaSyLnhcg4fwVXPikS1~OI6H9Xdoe-bzsAXFIX-8t8jZoht2GiHkNSDLdgiXq4dGgwD73A__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA))

The repository is currently located [here](https://gitlab.com/zimmermannmax16/auditory_models.git).

## Prerequisites
Install Python `>=3.9`

## Installation
`pip install auditory_models`

## Usage
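```python
from auditory_models import GPSMq
import soundfile as sf

# Load the reference and the degraded signal together with their sample rates.
reference, fs_ref = sf.read("reference.wav")
degraded, fs_dgr = sf.read("degraded.wav")
if fs_ref != fs_dgr:
    raise ValueError("Sample rates must be equal!")

# Instantiate the model with its default settings.
gpsmq = GPSMq()

# Run the model; see the model's documentation for the format of the result.
result = gpsmq.process(reference, degraded, fs_ref)
print(result)
```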
The same pattern applies to every other auditory model. Each model class accepts only keyword arguments, so it can be
instantiated with its default settings by passing no arguments. The `process()` method is standardized across all
models and takes `reference`, `degraded`, and `sample_rate`. Please refer to each model's documentation to learn more
about its inputs and outputs.

Currently available models:
- `BAMQ`
- `GPSMq`
- `STOI`
- Base class for type hints: `AuditoryModel`

Each model class provides a `frontend()` and a `backend()` method. The `process()` method calls them in succession, so
`frontend()` takes the same inputs as `process()`, and `backend()` returns the same outputs as `process()`. Since there
is no universal definition of a front end and a back end, this package uses the following convention:
- `frontend()` is everything that extracts features from the audio data
- `backend()` is the statistical analysis of the features that results in the final value
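
The split described above can be illustrated with a toy class. This is only a sketch: `ToyModel` is hypothetical and is not one of the package's models, it uses plain Python lists in place of NumPy arrays, and it assumes that `backend()` takes the output of `frontend()` as a single argument, which the real classes may handle differently.

```python
import math


class ToyModel:
    """Illustrative only: NOT one of the package's models."""

    def __init__(self, *, frame_length=4):
        # Keyword-only argument with a default, so ToyModel() works as is.
        self.frame_length = frame_length

    def _frame_energies(self, signal):
        """Sum of squares over consecutive, non-overlapping frames."""
        n = self.frame_length
        return [
            sum(s * s for s in signal[i:i + n])
            for i in range(0, len(signal) - n + 1, n)
        ]

    def frontend(self, reference, degraded, sample_rate):
        """Feature extraction: per-frame energy of the reference and of the error."""
        # sample_rate is unused in this toy; it is kept to match process().
        error = [d - r for r, d in zip(reference, degraded)]
        return self._frame_energies(reference), self._frame_energies(error)

    def backend(self, features):
        """Statistical analysis: mean per-frame signal-to-error ratio in dB."""
        ref_energy, err_energy = features
        eps = 1e-12  # avoids division by zero and log of zero
        ratios = [
            10.0 * math.log10((r + eps) / (e + eps))
            for r, e in zip(ref_energy, err_energy)
        ]
        return sum(ratios) / len(ratios)

    def process(self, reference, degraded, sample_rate):
        """Run the front end and the back end in succession."""
        return self.backend(self.frontend(reference, degraded, sample_rate))


model = ToyModel()
reference = [1.0] * 8
degraded = [1.1] * 8
score = model.process(reference, degraded, 16000)
print(round(score, 3))  # about 20 dB for this input
```

Calling `frontend()` and `backend()` separately is useful when the extracted features should be inspected or reused before the final value is computed.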

## Additional Info

- `STOI` differs from its original implementation in that the center-frequencies of the third-octave bands are fixed to 
    the frequencies defined by IEC 61260-1:2014.
- There are some helpful implementations in `./auditory_models/helpers/`, such as a Gammatone filterbank and a
    Matlab-like resample class.
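
To make the `STOI` note above more concrete, the sketch below compares two ways of placing third-octave center frequencies. IEC 61260-1:2014 defines exact mid-band frequencies on a base-10 grid relative to 1 kHz, whereas the original STOI implementation spaces its 15 bands by a factor of 2^(1/3) starting at 150 Hz. The function names and the band index range are assumptions chosen for illustration; they are not taken from this package, and whether the package uses the exact or the nominal IEC values is not stated here.

```python
def iec_third_octave_center(x: int) -> float:
    """Exact IEC 61260-1 mid-band frequency in Hz for band index x (x = 0 is 1 kHz).

    The standard uses the octave ratio G = 10 ** (3 / 10), so a third-octave
    step is G ** (1 / 3) = 10 ** (1 / 10).
    """
    return 1000.0 * 10.0 ** (x / 10.0)


def base2_third_octave_center(k: int, lowest: float = 150.0) -> float:
    """Base-2 spacing as in the original STOI: lowest * 2 ** (k / 3)."""
    return lowest * 2.0 ** (k / 3.0)


# Assumed index range: 15 bands, x = -8 (nominally 160 Hz) up to x = 6 (nominally 4 kHz).
for k, x in enumerate(range(-8, 7)):
    print(f"band {k:2d}: base-2 {base2_third_octave_center(k):8.1f} Hz, "
          f"IEC {iec_third_octave_center(x):8.1f} Hz")
```

The two grids differ by a few percent per band, so scores from this package may deviate slightly from those of implementations that keep the original band placement.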

## Support
If you encounter any issues, please feel free to contact me at
<a href="mailto:zimmermannmax16@gmail.com">zimmermannmax16@gmail.com</a>.

## Contributing
Any contribution is welcome. 

## Authors and acknowledgment
Author: Max Zimmermann\
Reviewer: Thomas Biberger\
Credits to: 
- The developers of the original Matlab implementations
    - STOI: Cees Taal 
    - GPSMq and BAM-Q: Thomas Biberger and Jan-Hendrik Fleßner
- Manuel Pariente for the original Python implementation of STOI

## License
This project is licensed under the GNU General Public License v3 or later (GPL-3.0-or-later). For further information, see the file `COPYING`.
