Metadata-Version: 2.1
Name: ukr-itn
Version: 0.1.1
Summary: WFST for Ukrainian Inverse Text Normalization (ITN) based on NVIDIA NeMo and Pynini
Author-email: Vasyl Spachynskyi <vspachyn@gmail.com>
Project-URL: Homepage, https://github.com/lociko/ukraine_itn_wfst
Project-URL: Bug Tracker, https://github.com/lociko/ukraine_itn_wfst/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Natural Language :: Ukrainian
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE

# WFST for Ukrainian ITN

A simple weighted finite-state transducer (WFST) for Ukrainian inverse text normalization (ITN), based on NVIDIA NeMo and Pynini.

## Usage

```python
from ukr.wfst import graph, apply_fst_text

apply_fst_text("це трапилося дві тисячі девятнадцятого числа", graph)  # це трапилося 2019 числа
apply_fst_text("мінус пять цілих одна десята відсотка", graph)  # -5.1 %
apply_fst_text("двадцять дві тисячі сто один", graph)  # 22101
```

## How it works

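There are two kinds of FST: taggers and verbalizers. A tagger converts spoken-form text into a tagged, structured representation, and a verbalizer renders that representation in written form.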

This is a tagger:

```python
from ukr.wfst import tMeasureFst, apply_fst_text

apply_fst_text("мінус пять цілих одна десята відсотка", tMeasureFst)
```

will return `"measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }"`

And this is a verbalizer:

```python
from ukr.wfst import vMeasureFst, apply_fst_text

apply_fst_text('measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }', vMeasureFst)
```

will return `-5.1 %`
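
To make the verbalizer step more concrete, here is a minimal pure-Python sketch of the same transformation. It is only an illustration and not how this package works internally: the real verbalizer is an FST built with Pynini, while the hypothetical `toy_verbalize_measure` helper below uses a regular expression and assumes the flat `key: "value"` layout shown in the tagger output above.

```python
import re


def toy_verbalize_measure(tagged: str) -> str:
    """Hypothetical stand-in for a measure verbalizer (regex-based, not an FST)."""
    # Collect every `key: "value"` pair; the grouping braces are ignored.
    fields = dict(re.findall(r'(\w+): "([^"]*)"', tagged))

    sign = "-" if fields.get("negative") == "true" else ""
    number = fields.get("integer_part", "")
    if "fractional_part" in fields:
        number += "." + fields["fractional_part"]
    units = fields.get("units", "")

    # Join the number and its unit with a single space, as in the output above.
    return f"{sign}{number} {units}".strip()


tagged = 'measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }'
print(toy_verbalize_measure(tagged))  # -5.1 %
```

The sketch covers only the measure case shown here. It does not validate its input, and it does not reproduce the weighting or ambiguity handling that a WFST provides.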
