Metadata-Version: 2.4
Name: bewer
Version: 0.1.0a7
Summary: Evaluation and analysis framework for automatic speech recognition in Python.
Project-URL: Homepage, https://github.com/corticph/bewer
Project-URL: Repository, https://github.com/corticph/bewer
Author-email: Lasse Borgholt <lb@corti.ai>
License: MIT
License-File: LICENSE
Keywords: ASR,NLP,WER,evaluation,speech-recognition
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: <3.15,>=3.10
Requires-Dist: error-align>=0.1.0b8
Requires-Dist: fuzzywuzzy>=0.18.0
Requires-Dist: hydra-core>=1.3.0
Requires-Dist: jinja2<4.0.0,>=3.1.6
Requires-Dist: levenshtein>=0.20.0
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pyahocorasick<3.0.0,>=2.3.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: rapidfuzz>=3.0.0
Requires-Dist: regex>=2024.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typeguard>=4.0.0
Requires-Dist: unidecode>=1.3.0
Description-Content-Type: text/markdown


# BeWER

*Beyond Word Error Rate → BeWER (/ˈbiːvər/) 🦫*

<p align="left">
  <img src="https://img.shields.io/badge/python-%203.10%20|%203.11%20|%203.12%20|%203.13%20|%203.14-green" alt="Python Versions">
  <img src="https://codecov.io/gh/corticph/bewer/graph/badge.svg?token=4QBH8TD4T4" alt="Coverage" style="margin-left:5px;">
  <img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License" style="margin-left:5px;">
</p>

**⚠️ Important:** This project is not production-ready and is still in early development. Breaking changes may occur, and backwards compatibility between alpha versions is not guaranteed.

**BeWER is an evaluation and analysis framework for automatic speech recognition in Python.** It defines a transparent YAML-based approach for configuring evaluation pipelines and makes it easy to inspect and analyze individual examples through a web-based interface. The built-in preprocessing pipeline and metrics collection are designed to cover all conventional use cases and then some, while still being fully extensible.




__Contents__ | [Installation](#installation) | [Quickstart](#quickstart) |


<a name="installation"></a>

## Installation

```bash
pip install bewer
```

## Quickstart

**Create a Dataset**

```python
from bewer import Dataset

dataset = Dataset()
```

**Add data**

From a file:
```python
dataset.load_csv(
    "data.csv",
    ref_col="reference",
    hyp_col="hypothesis",
)
```
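To try the call above without real data, a small `data.csv` can be written with Python's standard `csv` module. This is a minimal sketch: the transcripts are made-up illustration data, and it assumes `load_csv` reads a header row containing the column names passed as `ref_col` and `hyp_col`.

```python
import csv

# Made-up example transcripts; replace with your own reference/hypothesis pairs.
rows = [
    {"reference": "the quick brown fox", "hypothesis": "the quick brown box"},
    {"reference": "hello world", "hypothesis": "hello world"},
]

# Write a header row followed by one row per utterance.
with open("data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["reference", "hypothesis"])
    writer.writeheader()
    writer.writerows(rows)
```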

Or manually:
```python
# `iterator` stands for any iterable of (reference, hypothesis) pairs, e.g. zip(refs, hyps).
for ref, hyp in iterator:
    dataset.add(ref=ref, hyp=hyp)
```

**List available metrics**

```python
dataset.metrics.list_metrics()
```

**Compute metrics lazily**

```python
print(f"WER: {dataset.metrics.wer().value:.2%}")
```
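For orientation, the conventional definition of word error rate is the word-level edit distance between reference and hypothesis divided by the number of reference words. The plain-Python sketch below illustrates that definition for a single pair. It is not BeWER's implementation: `word_error_rate` is a hypothetical helper, it splits on whitespace only, and BeWER may apply its own preprocessing and aggregate over the whole dataset.

```python
def word_error_rate(ref: str, hyp: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref_words, hyp_words = ref.split(), hyp.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# One substitution out of four reference words.
print(f"WER: {word_error_rate('the quick brown fox', 'the quick brown box'):.2%}")
# prints "WER: 25.00%"
```

Because the denominator is the reference length, the value can exceed 100% when the hypothesis contains many inserted words.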
