Metadata-Version: 2.1
Name: jotts
Version: 1.0.5
Summary: JoTTS is a German text-to-speech engine.
Home-page: https://github.com/padmalcom/jotts
Author: Jonas Freiknecht
Author-email: j.freiknecht@googlemail.com
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: appdirs (==1.4.4)
Requires-Dist: audioread (==3.0.0)
Requires-Dist: cffi (==1.15.0)
Requires-Dist: charset-normalizer (==2.1.1)
Requires-Dist: colorama (==0.4.6)
Requires-Dist: contourpy (==1.0.6)
Requires-Dist: cycler (==0.11.0)
Requires-Dist: decorator (==5.1.1)
Requires-Dist: fonttools (==4.38.0)
Requires-Dist: idna (==3.4)
Requires-Dist: inflect (==6.0.2)
Requires-Dist: joblib (==1.2.0)
Requires-Dist: kiwisolver (==1.4.4)
Requires-Dist: librosa (==0.9.2)
Requires-Dist: llvmlite (==0.39.1)
Requires-Dist: loguru (==0.6.0)
Requires-Dist: matplotlib (==3.6.2)
Requires-Dist: numba (==0.56.4)
Requires-Dist: numpy (>=1.23.4)
Requires-Dist: packaging (==21.3)
Requires-Dist: Pillow (==9.3.0)
Requires-Dist: pooch (==1.6.0)
Requires-Dist: pycparser (==2.21)
Requires-Dist: pydantic (==1.10.2)
Requires-Dist: pyparsing (==3.0.9)
Requires-Dist: python-dateutil (==2.8.2)
Requires-Dist: requests (==2.28.1)
Requires-Dist: resampy (==0.4.2)
Requires-Dist: scikit-learn (==1.1.3)
Requires-Dist: scipy (>=1.9.3)
Requires-Dist: six (==1.16.0)
Requires-Dist: sounddevice (==0.4.5)
Requires-Dist: soundfile (==0.11.0)
Requires-Dist: threadpoolctl (==3.1.0)
Requires-Dist: torch (>=1.10.0)
Requires-Dist: tqdm (>=4.64.1)
Requires-Dist: typing-extensions (==4.4.0)
Requires-Dist: Unidecode (==1.3.6)
Requires-Dist: urllib3 (==1.26.12)

# jotts
JoTTS is a German text-to-speech engine that uses Tacotron as the synthesizer and either Griffin-Lim or WaveRNN
as the vocoder. The synthesizer model has been trained on my voice using Tacotron 1. Using Griffin-Lim as the vocoder
makes audio generation much faster, whereas using a trained vocoder returns better results in most cases.

<a href="https://www.buymeacoffee.com/padmalcom" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" height="41" width="174"></a>


## API
- First create an instance of *JoTTS*.

- (optional) List all available models using *list_models()*. You can also look them up in the browser: https://github.com/padmalcom/Real-Time-Voice-Cloning-German/releases

- Load a model of your choice using *load_models()*. It takes *force_model_download* as an optional parameter,
which is useful if the last download of the synthesizer failed and the model cannot be used. The parameter
*model_name* is validated against all available models on the release page.

- Call *speak* with a *text* parameter that contains the text to speak out loud. The second parameter, *wait_for_end*,
can be set to True to wait until speaking is done, e.g. to prevent your application from closing too early. If you want
to use a trained vocoder, set *use_wavernn_vocoder* to True.

- Use *textToWav* to create a wav file instead of speaking the text. *out_path* specifies where the wav file is
written. Set *use_wavernn_vocoder* to True to use a trained vocoder.

## Example usage

```python
from jotts import JoTTS

if __name__ == "__main__":
    tts = JoTTS()
    # Optional: list the models that are available for download.
    tts.list_models()
    # Load the model by name; the download is only forced if requested.
    tts.load_models(force_model_download=False, model_name="jonas_v0.1")
    # Speak the text out loud, first with the trained vocoder, then with Griffin-Lim.
    tts.speak("Das ist ein Test mit meiner Stimme.", wait_for_end=True, use_wavernn_vocoder=True)
    tts.speak("Das ist ein Test mit meiner Stimme.", wait_for_end=True, use_wavernn_vocoder=False)
    # Write the same text to wav files instead of playing it.
    tts.textToWav(text="Das ist ein Test mit meiner Stimme.", out_path="vocoder_out.wav", use_wavernn_vocoder=True)
    tts.textToWav(text="Das ist ein Test mit meiner Stimme.", out_path="griffin_lim_out.wav", use_wavernn_vocoder=False)
```
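
To see the speed difference between the two vocoders on your own machine, a small timing helper can be handy. The following is only a sketch: the function name *time_vocoders* and the *out_prefix* parameter are made up for illustration and are not part of JoTTS. It assumes nothing about *tts* except that it offers *textToWav* with the parameters shown in the example above.

```python
import time


def time_vocoders(tts, text, out_prefix="timing"):
    """Time textToWav once per vocoder setting and return the seconds taken.

    `tts` is assumed to be a JoTTS instance with a model already loaded.
    This helper is hypothetical and not part of the JoTTS API.
    """
    timings = {}
    for name, use_wavernn in (("griffin_lim", False), ("wavernn", True)):
        start = time.perf_counter()
        tts.textToWav(
            text=text,
            out_path=f"{out_prefix}_{name}.wav",
            use_wavernn_vocoder=use_wavernn,
        )
        timings[name] = time.perf_counter() - start
    return timings
```

With a loaded model, `time_vocoders(tts, "Das ist ein Test mit meiner Stimme.")` returns a dictionary with one entry per vocoder. A single run is only a rough indication, since the first call may include one-time setup costs.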

## Todo
- Add an option to change the default audio device used to speak the text
- Add threading or multiprocessing to allow speaking without blocking
- Add a parameter to avoid online communication when running JoTTS on edge devices
- Add a feature to quickly finetune a model with an arbitrary voice
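
Until non-blocking speech is built in, a caller can try running *speak* in a worker thread. This is a minimal sketch, not a JoTTS feature: *speak_in_background* is a hypothetical helper, and whether JoTTS and the audio backend behave well when called from a second thread has not been verified here.

```python
import threading


def speak_in_background(tts, text, use_wavernn_vocoder=False):
    """Start tts.speak in a worker thread and return the thread.

    `tts` is assumed to be a JoTTS instance with a model already loaded.
    Thread safety of JoTTS itself is an assumption, not a documented fact.
    """
    worker = threading.Thread(
        target=tts.speak,
        args=(text,),
        kwargs={"wait_for_end": True, "use_wavernn_vocoder": use_wavernn_vocoder},
    )
    worker.start()
    return worker
```

The caller keeps the returned thread and calls `join()` on it before the program exits, so that the application does not close while the text is still being spoken.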

## Training a model for your own voice
Training a synthesizer model is easy - if you know how to do it. I created a course on Udemy to show you how it is done.
Don't buy the course at full price, there is a discount every month :-)

https://www.udemy.com/course/voice-cloning/

If you have neither the background nor the resources, or if you are just lazy or too rich, contact me for contract work.
Cloning a voice normally needs ~15 minutes of clean audio from the voice you want to clone.

## Disclaimer
I hope that my (and any other person's) voice will be used only for legal and ethical purposes. Please do not get into mischief with it.


