Metadata-Version: 2.1
Name: kfa
Version: 0.0.4
Summary: Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
Home-page: https://github.com/seanghay/kfa
Author: Seanghay Yath
Author-email: seanghay.dev@gmail.com
License: Apache License 2.0
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Requires-Python: >3.5
Description-Content-Type: text/markdown
Requires-Dist: onnxruntime
Requires-Dist: sosap ==0.0.1
Requires-Dist: numpy ==1.24.4
Requires-Dist: khmercut ==0.0.2
Requires-Dist: librosa ==0.10.1
Requires-Dist: bistring ==0.5.0
Requires-Dist: requests ==2.31.0
Requires-Dist: appdirs

## KFA

A fast Khmer Forced Aligner powered by **Wav2Vec2CTC** and **Phonetisaurus**.

- [ ] Built-in Speech Enhancement
- [x] Word-level Alignment

```shell
pip install kfa
```

#### CLI

```shell
kfa -a audio.wav -t text.txt -o alignments.jsonl
```
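
The `.jsonl` extension suggests that the output is in JSON Lines format, with one JSON object per line. A minimal sketch for reading such a file back in Python, assuming that format holds; `read_alignments` is a hypothetical helper, not part of `kfa`, and since the fields of each record are not documented here, it returns the parsed objects as-is:

```python
import json
from pathlib import Path


def read_alignments(path):
    """Parse a JSON Lines file into a list of records (hypothetical helper)."""
    records = []
    with Path(path).open(encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records
```

Calling `read_alignments("alignments.jsonl")` after the command above should then give one Python object per line of output, if the assumption about the format is correct.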

#### Python

```python
from kfa import align
import librosa

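# Read the transcript; Khmer text should be decoded as UTF-8.
with open("text.txt", encoding="utf-8") as infile:
    text = infile.read()

# Load the audio as 16 kHz mono.
y, sr = librosa.load("audio.wav", sr=16000, mono=True)

for alignment in align(y, sr, text):
    print(alignment)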
```

#### References

- [MMS: Scaling Speech Technology to 1000+ languages](https://github.com/facebookresearch/fairseq/tree/main/examples/mms)
- [CTC Forced Alignment API Tutorial](https://pytorch.org/audio/main/tutorials/ctc_forced_alignment_api_tutorial.html)
- [Phonetisaurus](https://github.com/AdolfVonKleist/Phonetisaurus)
- [Fine-Tune Wav2Vec2 for English ASR with 🤗 Transformers](https://huggingface.co/blog/fine-tune-wav2vec2-english)
- [Thai Wav2vec2 model to ONNX model](https://pythainlp.github.io/tutorials/notebooks/thai_wav2vec2_onnx.html)


#### License

`Apache-2.0`
