Metadata-Version: 2.1
Name: snac
Version: 0.1.0
Summary: Multi-Scale Neural Audio Codec
Home-page: https://github.com/hubertsiuzdak/snac
Author: Hubert Siuzdak
Author-email: hubert.siuzdak@gmail.com
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch
Requires-Dist: numpy
Requires-Dist: einops
Requires-Dist: huggingface-hub

# [WIP] SNAC 🍿

Multi-**S**cale **N**eural **A**udio **C**odec (SNAC) compresses 44.1 kHz audio into discrete codes at a low bitrate.

It encodes audio into hierarchical tokens similarly to SoundStream, EnCodec, and DAC (see the image
on the left). However, SNAC introduces a simple change where coarse tokens are sampled less frequently,
covering a broader time span (see the image on the right).

This not only saves bitrate but, more importantly, may be very useful for language-modeling approaches to
audio generation. For example, with coarse tokens at ~10 Hz and a context window of 2048 tokens, you can effectively
model the consistent structure of an audio track for ~3 minutes.
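
The arithmetic behind the ~3 minute estimate, and the bitrate saving, can be sketched in a few lines of Python. This is only an illustration and is not part of the `snac` package: the helper names `context_span_seconds` and `bitrate_bps`, the per-level token rates, and the codebook size of 1024 are all assumptions chosen for the example, not SNAC's actual configuration.

```python
import math


def context_span_seconds(context_tokens: int, token_rate_hz: float) -> float:
    """Seconds of audio covered by `context_tokens` tokens emitted at `token_rate_hz`."""
    return context_tokens / token_rate_hz


def bitrate_bps(token_rates_hz, codebook_size: int) -> float:
    """Total bits per second for a stack of token streams sharing one codebook size."""
    bits_per_token = math.log2(codebook_size)
    return sum(rate * bits_per_token for rate in token_rates_hz)


# Figures from the text: 2048 coarse tokens at ~10 Hz.
span = context_span_seconds(2048, 10.0)
print(f"{span:.1f} s = {span / 60:.1f} min")  # 204.8 s = 3.4 min

# Hypothetical rates (NOT SNAC's real configuration), 1024-entry codebooks.
multi_scale = bitrate_bps([10, 20, 40, 80], 1024)  # coarse levels sampled less often
single_scale = bitrate_bps([80, 80, 80, 80], 1024)  # every level at the finest rate
print(multi_scale, single_scale)  # 1500.0 3200.0
```

With these assumed numbers, sampling the coarse levels less often roughly halves the bitrate, while the coarsest stream alone spans about 3.4 minutes in a 2048-token context.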

![snac.png](img%2Fsnac.png)
