Metadata-Version: 2.1
Name: trust_eval
Version: 0.1.0
Summary: Metric to measure RAG responses with inline citations
Home-page: https://github.com/shanghongsim/trust-eval
License: CC BY-NC 4.0
Keywords: RAG,evaluation,metrics,citation
Author: Shang Hong Sim
Author-email: simshanghong@gmail.com
Requires-Python: >=3.10,<3.12
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: colorlog (>=6.9.0,<7.0.0)
Requires-Dist: fuzzywuzzy (>=0.18.0,<0.19.0)
Requires-Dist: nltk (>=3.9.1,<4.0.0)
Requires-Dist: peft (>=0.14.0,<0.15.0)
Requires-Dist: py3nvml (>=0.2.7,<0.3.0)
Requires-Dist: python-levenshtein (>=0.26.1,<0.27.0)
Requires-Dist: scipy (>=1.14.1,<2.0.0)
Requires-Dist: vllm (>=0.6.6.post1,<0.7.0)
Project-URL: Repository, https://github.com/shanghongsim/trust-eval
Description-Content-Type: text/markdown

# Trust Eval

Trust Eval is a holistic metric for evaluating the trustworthiness of LLM responses with inline citations in retrieval-augmented generation (RAG).
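To illustrate what "inline citations" means here, the sketch below pulls citation markers out of a response. It assumes citations appear as bracketed integers such as `[1]` or `[2][3]`. That format, and the helper names `extract_citations` and `strip_citations`, are hypothetical and not part of the Trust Eval API.

```python
import re

# Assumed citation format: bracketed integers such as [1] or [2][3].
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def extract_citations(text: str) -> list[int]:
    """Return the cited document indices in order of appearance."""
    return [int(index) for index in CITATION_PATTERN.findall(text)]


def strip_citations(text: str) -> str:
    """Remove citation markers and the space they leave before punctuation."""
    without_markers = CITATION_PATTERN.sub("", text)
    return re.sub(r"\s+([.,;!?])", r"\1", without_markers).strip()


response = "Paris is the capital of France [1]. It hosts the Louvre [2][3]."
print(extract_citations(response))  # [1, 2, 3]
print(strip_citations(response))    # Paris is the capital of France. It hosts the Louvre.
```

Separating the cited indices from the answer text in this way lets each claim be checked against the documents it cites.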

## Project Structure

```text
trust-eval/
├── trust-eval/
│   ├── __init__.py
│   ├── config.py
│   ├── llm.py
│   ├── response_generator.py
│   ├── evaluator.py
│   ├── metrics.py
│   └── utils.py
├── tests/
│   ├── __init__.py
│   ├── test_response_generator.py
│   └── test_evaluator.py
├── README.md
├── poetry.lock
└── pyproject.toml
```

## Installation

```bash
conda create -n trust-eval python=3.10.13
conda activate trust-eval
poetry install
```

Then download the NLTK `punkt_tab` tokenizer data from a Python session:

```python
import nltk
nltk.download('punkt_tab')
```

## Example usage

```python
from config import EvaluationConfig, ResponseGeneratorConfig
from evaluator import Evaluator
from logging_config import logger
from response_generator import ResponseGenerator

# Generate responses
generator_config = ResponseGeneratorConfig.from_yaml(yaml_path="generator_config.yaml")
logger.info(generator_config)
generator = ResponseGenerator(generator_config)
generator.generate_responses()
generator.save_responses()

# Evaluate responses
evaluation_config = EvaluationConfig.from_yaml(yaml_path="eval_config.yaml")
logger.info(evaluation_config)
evaluator = Evaluator(evaluation_config)
evaluator.compute_metrics()
evaluator.save_results()
```


