Metadata-Version: 2.4
Name: pocket-tts-mlx
Version: 0.2.0
Summary: MLX backend for pocket-tts with Apple Silicon optimization
Author: jishnuvenugopal
License: MIT
Project-URL: Homepage, https://github.com/jishnuvenugopal/pocket-tts-mlx
Project-URL: Repository, https://github.com/jishnuvenugopal/pocket-tts-mlx
Project-URL: Issues, https://github.com/jishnuvenugopal/pocket-tts-mlx/issues
Keywords: tts,text-to-speech,mlx,apple-silicon,voice-cloning
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mlx>=0.20.0
Requires-Dist: numpy
Requires-Dist: safetensors
Requires-Dist: sentencepiece>=0.2.1
Requires-Dist: pydantic>=2
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.20.0
Requires-Dist: huggingface_hub>=0.10
Requires-Dist: scipy>=1.5.0
Requires-Dist: soundfile>=0.12.0
Requires-Dist: typing-extensions
Provides-Extra: dev
Requires-Dist: torch>=2.5.0; extra == "dev"
Requires-Dist: pocket-tts>=1.0.3; extra == "dev"
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.0; extra == "dev"
Dynamic: license-file

# pocket-tts-mlx

MLX backend for [pocket-tts](https://github.com/kyutai-labs/pocket-tts) optimized for Apple Silicon.

The runtime does not depend on PyTorch. Torch is needed only for the optional parity tests and is installed with the `dev` extra.

**Installation**

PyPI install (available after release):

```bash
pip install pocket-tts-mlx
```

Local development:

```bash
pip install -e .
```

Model weights are downloaded from Hugging Face on first run. To use the voice-cloning
weights, accept the model terms for `kyutai/pocket-tts` on Hugging Face and authenticate:

```bash
hf auth login
```
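
Before loading the model, it can help to confirm that a token is actually configured. The sketch below is not part of pocket-tts-mlx: `has_hf_token` is a hypothetical helper, and it only checks the common defaults (the `HF_TOKEN` environment variable and a `token` file under `HF_HOME`, assumed to default to `~/.cache/huggingface`). Custom setups may store the token elsewhere.

```python
import os
from pathlib import Path


def has_hf_token() -> bool:
    """Report whether a Hugging Face token appears to be configured locally."""
    # A non-empty HF_TOKEN environment variable counts as configured.
    if os.environ.get("HF_TOKEN", "").strip():
        return True

    # Otherwise look for the token file under HF_HOME
    # (assumed default location: ~/.cache/huggingface).
    hf_home = Path(os.environ.get("HF_HOME", "~/.cache/huggingface")).expanduser()
    token_file = hf_home / "token"
    return token_file.is_file() and bool(token_file.read_text().strip())


if not has_hf_token():
    print("No Hugging Face token found; run `hf auth login` to enable voice cloning.")
```

A token being present does not guarantee access: the model terms still have to be accepted for the account that owns the token.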

**Quickstart**

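```python
from pocket_tts_mlx import TTSModel

# Load the model; weights are downloaded from Hugging Face on first run.
model = TTSModel.load_model()
# Build the conditioning state for the predefined voice "marius".
state = model.get_state_for_audio_prompt("marius")
# Synthesize speech for the text, generating at most 200 tokens.
audio = model.generate_audio(state, "Hello from MLX!", max_tokens=200)
```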
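
The quickstart stops at the in-memory `audio` result. One way to write it to disk is sketched below, using only the standard library. `write_wav` is a hypothetical helper, not part of the package, and it rests on two assumptions: that the audio is a mono sequence of floats in [-1.0, 1.0], and that the sample rate is 24000 Hz. Check both against the model before relying on it.

```python
import array
import sys
import wave


def write_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    The default sample_rate of 24000 Hz is an assumption, not a documented value.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```

If `audio` is an array type, it would first need to be flattened into Python floats, for example with `numpy.asarray(audio).reshape(-1).tolist()`, before calling `write_wav("output.wav", samples)`.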

**CLI**

```bash
pocket-tts-mlx "Hello, world!" --voice marius --output output.wav
```
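
To synthesize several lines in one go, the CLI can be driven from a small script. This is a sketch that uses only the arguments shown above (the text, `--voice`, and `--output`). The helper names and the `line_000.wav` naming scheme are illustrative choices, not something the package provides.

```python
import subprocess


def build_command(text, voice, output_path):
    """Assemble the CLI invocation shown above as an argument list."""
    return ["pocket-tts-mlx", text, "--voice", voice, "--output", output_path]


def synthesize_all(lines, voice="marius"):
    """Run the CLI once per line, writing line_000.wav, line_001.wav, ..."""
    outputs = []
    for index, text in enumerate(lines):
        output_path = f"line_{index:03d}.wav"
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(build_command(text, voice, output_path), check=True)
        outputs.append(output_path)
    return outputs
```

Each call starts a new process, so the model is presumably loaded again for every line. For larger batches, the Python API from the quickstart is likely the better fit.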

**Voices**

Predefined voices:

- alba
- marius
- javert
- jean
- fantine
- cosette
- eponine
- azelma
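
When the voice name comes from user input, it can be checked against this list before it reaches the model. The sketch below hard-codes the names listed above. `resolve_voice` is a hypothetical helper, and lowercasing the input is a convenience assumption rather than documented library behavior.

```python
PREDEFINED_VOICES = (
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
)


def resolve_voice(name):
    """Return the normalized voice name, or raise ValueError if it is not predefined."""
    normalized = name.strip().lower()
    if normalized not in PREDEFINED_VOICES:
        choices = ", ".join(PREDEFINED_VOICES)
        raise ValueError(f"Unknown voice {name!r}; expected one of: {choices}")
    return normalized
```

The returned name can then be passed to `model.get_state_for_audio_prompt(...)` or to the CLI's `--voice` option.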

**Requirements**

- Python 3.10+
- Apple Silicon Mac (M1/M2/M3/M4)
- MLX
- Internet access for initial model downloads
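
These requirements can be checked up front with the standard library. The following is a minimal sketch. `check_environment` is a hypothetical helper, and it assumes Apple Silicon is reported as `Darwin` with machine `arm64`, which will not hold for a Python interpreter running under Rosetta.

```python
import importlib.util
import platform
import sys


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required.")
    if platform.system() != "Darwin" or platform.machine() != "arm64":
        problems.append("An Apple Silicon Mac with a native arm64 Python is required.")
    if importlib.util.find_spec("mlx") is None:
        problems.append("MLX is not installed.")
    return problems


for problem in check_environment():
    print(f"warning: {problem}")
```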

**Notes**

- Voice cloning requires Hugging Face access to `kyutai/pocket-tts`.
- If the voice-cloning weights are unavailable, the weights without voice cloning are used automatically.
