Metadata-Version: 2.3
Name: evalica
Version: 0.1.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Typing :: Typed
Requires-Dist: numpy >=1.16.0
Requires-Dist: pandas
Requires-Dist: hypothesis[numpy] ; extra == 'dev'
Requires-Dist: mypy ; extra == 'dev'
Requires-Dist: nbqa ; extra == 'dev'
Requires-Dist: notebook ; extra == 'dev'
Requires-Dist: pandas-stubs ; extra == 'dev'
Requires-Dist: plotly ; extra == 'dev'
Requires-Dist: pytest ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: ruff ; extra == 'dev'
Requires-Dist: twine ; extra == 'dev'
Requires-Dist: mkdocs ; extra == 'docs'
Requires-Dist: mkdocstrings-python ; extra == 'docs'
Provides-Extra: dev
Provides-Extra: docs
License-File: LICENSE
Summary: Evalica, your favourite evaluation toolkit.
Keywords: Bradley-Terry,Elo,PageRank,eigenvector,evaluation,leaderboard,pairwise comparisons,ranking,rating,statistics
License: Apache-2.0
Requires-Python: ~=3.8
Description-Content-Type: text/markdown
Project-URL: Homepage, https://github.com/dustalov/evalica
Project-URL: Documentation, https://dustalov.github.io/evalica/
Project-URL: Download, https://pypi.org/project/evalica/#files

# Evalica, your favourite evaluation toolkit

[![Evalica](Evalica.svg)](https://github.com/dustalov/evalica)

[![Tests][github_tests_badge]][github_tests_link]
[![PyPI Version][pypi_badge]][pypi_link]
[![Anaconda.org][conda_badge]][conda_link]
[![Codecov][codecov_badge]][codecov_link]

[github_tests_badge]: https://github.com/dustalov/evalica/actions/workflows/test.yml/badge.svg?branch=master
[github_tests_link]: https://github.com/dustalov/evalica/actions/workflows/test.yml
[pypi_badge]: https://badge.fury.io/py/evalica.svg
[pypi_link]: https://pypi.python.org/pypi/evalica
[codecov_badge]: https://codecov.io/gh/dustalov/evalica/branch/master/graph/badge.svg
[codecov_link]: https://codecov.io/gh/dustalov/evalica
[conda_badge]: https://anaconda.org/conda-forge/evalica/badges/version.svg
[conda_link]: https://anaconda.org/conda-forge/evalica

**Evalica** is a Python library that transforms pairwise comparisons into ranked lists of items. It offers convenient, high-performance Rust implementations of the corresponding methods via [PyO3](https://pyo3.rs/) and additionally provides naïve Python implementations for most of them. Evalica is fully compatible with [NumPy](https://numpy.org/) arrays and [pandas](https://pandas.pydata.org/) data frames.

- [Tutorial](https://dustalov.github.io/evalica/) (and [Tutorial.ipynb](Tutorial.ipynb))
- [Chatbot-Arena.ipynb](Chatbot-Arena.ipynb) [![Open in Colab][colab_badge]][colab_link] [![Binder][binder_badge]][binder_link]
- [Pair2Rank](https://huggingface.co/spaces/dustalov/pair2rank)

[colab_badge]: https://colab.research.google.com/assets/colab-badge.svg
[colab_link]: https://colab.research.google.com/github/dustalov/evalica/blob/master/Chatbot-Arena.ipynb
[binder_badge]: https://mybinder.org/badge_logo.svg
[binder_link]: https://mybinder.org/v2/gh/dustalov/evalica/HEAD?labpath=Chatbot-Arena.ipynb

The logo was created using [Recraft](https://www.recraft.ai/).

## Installation

- [pip](https://pip.pypa.io/): `pip install evalica`
- [Anaconda](https://docs.conda.io/en/latest/): `conda install conda-forge::evalica`

## Usage

Imagine that we would like to rank different meals and have the following dataset of three comparisons produced by food experts.

| **Item X**| **Item Y** | **Winner** |
|:---:|:---:|:---:|
| `pizza` | `burger` | `x` |
| `burger` | `sushi` | `y` |
| `pizza` | `sushi` | `tie` |

Given this hypothetical example, Evalica takes these three columns and computes item scores according to the chosen model. Note that the first argument is the column `Item X`, the second argument is the column `Item Y`, and the third argument corresponds to the column `Winner`. In the code, the outcomes `x`, `y`, and `tie` are written as `Winner.X`, `Winner.Y`, and `Winner.Draw`.

```python
>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64
```

As a result, we obtain [Elo scores](https://en.wikipedia.org/wiki/Elo_rating_system) for our items. In this example, `pizza` was the most favoured item, `sushi` was the runner-up, and `burger` was the least preferred item.

| **Item**| **Score** |
|---|---:|
| `pizza` | 1014.97 |
| `burger` | 970.65 |
| `sushi` | 1014.38 |
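
To make the numbers above less opaque, here is a minimal pure-Python sketch of sequential Elo updates. It is not Evalica's implementation: the function name `elo_sketch` and the parameter values (initial rating 1000, K-factor 30, base 10, scale 400) are assumptions, chosen because they reproduce the scores shown above for this example.

```python
def elo_sketch(xs, ys, outcomes, initial=1000.0, k=30.0, base=10.0, scale=400.0):
    """Apply Elo updates one comparison at a time.

    `outcomes` holds 'x', 'y', or 'tie' for each pair, as in the table above.
    All parameter defaults are assumptions, not Evalica's documented defaults.
    """
    points = {"x": 1.0, "y": 0.0, "tie": 0.5}  # points earned by item X
    scores = {}

    for x, y, outcome in zip(xs, ys, outcomes):
        rating_x = scores.setdefault(x, initial)
        rating_y = scores.setdefault(y, initial)

        # Expected result for X under the logistic Elo model.
        expected_x = 1.0 / (1.0 + base ** ((rating_y - rating_x) / scale))
        actual_x = points[outcome]

        # X gains what Y loses, in proportion to the surprise.
        scores[x] = rating_x + k * (actual_x - expected_x)
        scores[y] = rating_y - k * (actual_x - expected_x)

    return scores


scores = elo_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    ["x", "y", "tie"],
)
for item, score in scores.items():
    print(f"{item}: {score:.2f}")
```

Because the updates are sequential, the order of the comparisons matters: shuffling the three rows would generally yield slightly different scores.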

## Command-Line Interface

Evalica also provides a simple command-line interface, allowing the use of these methods in shell scripts and for prototyping.

```
$ evalica -i food.csv bradley-terry
item,score,rank
Tacos,0.43428827947351706,1
Sushi,0.19060105855071743,2
Burger,0.14797720376982199,3
Pasta,0.12815347866987045,4
Pizza,0.0989799795360731,5
```

See the [food.csv](food.csv) file for an example of the input.

## Crowd-Kit

Users of the [Crowd-Kit](https://github.com/Toloka/crowd-kit) library can easily switch to Evalica by replacing the item references in their `label` column with the corresponding `Winner` values, which results in faster and cleaner code.

```python
>>> import pandas as pd
>>> from crowdkit.aggregation import BradleyTerry
>>> df = pd.DataFrame(
...     [
...         ['item1', 'item2', 'item1'],
...         ['item3', 'item2', 'item2']
...     ],
...     columns=['left', 'right', 'label']
... )
>>> agg_bt = BradleyTerry(n_iter=100).fit_predict(df)
```

Evalica is not bound to specific column names, which avoids the potentially expensive step of building a data frame, while remaining fully compatible with NumPy and pandas.

```python
>>> import pandas as pd
>>> from evalica import bradley_terry, Winner
>>> df = pd.DataFrame(
...     [
...         ['item1', 'item2', Winner.X],
...         ['item3', 'item2', Winner.Y]
...     ],
...     columns=['left', 'right', 'label']
... )
>>> scores = bradley_terry(df['left'], df['right'], df['label'], limit=100)
```
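
The label replacement described in this section can be sketched as a small helper. The function name `label_to_winner` is hypothetical, and the `Winner` enum below is a local stand-in (with placeholder values) so that the snippet runs without Evalica installed; in real code, `from evalica import Winner` would be used instead.

```python
from enum import Enum


class Winner(Enum):
    """Local stand-in for `evalica.Winner`; the values are placeholders."""
    X = "x"
    Y = "y"
    Draw = "draw"


def label_to_winner(left, right, label):
    """Map a Crowd-Kit-style label (the name of the winning item) to a Winner."""
    if label == left:
        return Winner.X
    if label == right:
        return Winner.Y
    raise ValueError(f"label {label!r} matches neither {left!r} nor {right!r}")


# Rows in the Crowd-Kit layout: (left, right, label).
rows = [
    ("item1", "item2", "item1"),
    ("item3", "item2", "item2"),
]
winners = [label_to_winner(*row) for row in rows]
print(winners)
```

The resulting list can be passed as the third argument alongside the `left` and `right` values, without renaming any columns.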

## Implemented Methods

| **Method** | **In Python** | **In Rust** |
|---|:---:|:---:|
| Counting | &#x2705; | &#x2705; |
| Average Win Rate | &#x2705; | &#x2705; |
| [Bradley&ndash;Terry] | &#x2705; | &#x2705; |
| [Elo] | &#x2705; | &#x2705; |
| [Eigenvalue] | &#x2705; | &#x2705; |
| [PageRank] | &#x2705; | &#x2705; |
| [Newman] | &#x2705; | &#x2705; |

<!-- Present: &#x2705; / Absent: &#x274C; -->

[Bradley&ndash;Terry]: https://doi.org/10.2307/2334029
[Elo]: https://isbnsearch.org/isbn/9780923891275
[Eigenvalue]: https://doi.org/10.1086/228631
[PageRank]: https://doi.org/10.1016/S0169-7552(98)00110-X
[Newman]: https://jmlr.org/papers/v24/22-1086.html
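
As an illustration of one of the listed methods, below is a minimal pure-Python sketch of the classic iterative (minorization-maximization) procedure for Bradley&ndash;Terry scores. It is not Evalica's implementation: the function name `bradley_terry_sketch`, the fixed iteration count, and the treatment of a tie as half a win for each side are assumptions made for this example, and each comparison is assumed to involve two distinct items.

```python
def bradley_terry_sketch(xs, ys, outcomes, iterations=100):
    """Estimate Bradley-Terry scores that sum to one.

    `outcomes` holds 'x', 'y', or 'tie'; a tie counts as half a win for each
    side (an assumption of this sketch).
    """
    points = {"x": 1.0, "y": 0.0, "tie": 0.5}
    items = sorted(set(xs) | set(ys))
    wins = {item: 0.0 for item in items}
    games = {}  # unordered pair of items -> number of comparisons

    for x, y, outcome in zip(xs, ys, outcomes):
        wins[x] += points[outcome]
        wins[y] += 1.0 - points[outcome]
        pair = frozenset((x, y))
        games[pair] = games.get(pair, 0) + 1

    # Start from a uniform distribution over the items.
    scores = {item: 1.0 / len(items) for item in items}

    for _ in range(iterations):
        updated = {}
        for item in items:
            denominator = 0.0
            for pair, count in games.items():
                if item in pair:
                    (other,) = pair - {item}
                    denominator += count / (scores[item] + scores[other])
            updated[item] = wins[item] / denominator if denominator > 0 else 0.0

        # Rescale so that the scores sum to one.
        total = sum(updated.values())
        scores = {item: value / total for item, value in updated.items()}

    return scores


bt_scores = bradley_terry_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    ["x", "y", "tie"],
)
print(bt_scores)
```

In this tiny dataset `burger` never wins, so its estimated score collapses to zero; real datasets with more comparisons usually avoid this degenerate case.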

## Copyright

Copyright (c) 2024 [Dmitry Ustalov](https://github.com/dustalov). See [LICENSE](LICENSE) for details.

