Metadata-Version: 2.4
Name: eval-ai-library
Version: 0.5.1
Summary: Comprehensive AI Model Evaluation Framework with support for multiple LLM providers
Author-email: Aleksandr Meshkov <alekslynx90@gmail.com>
License: Apache License 2.0
Project-URL: Homepage, https://github.com/meshkovQA/Eval-ai-library
Project-URL: Documentation, https://github.com/meshkovQA/Eval-ai-library#readme
Project-URL: Repository, https://github.com/meshkovQA/Eval-ai-library
Project-URL: Bug Tracker, https://github.com/meshkovQA/Eval-ai-library/issues
Keywords: ai,evaluation,llm,rag,metrics,testing,quality-assurance
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: openai>=1.0.0
Requires-Dist: anthropic>=0.18.0
Requires-Dist: google-genai>=0.2.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: langchain>=0.1.0
Requires-Dist: langchain-community>=0.0.10
Requires-Dist: langchain-core>=0.1.0
Requires-Dist: langchain-text-splitters>=0.2.0
Requires-Dist: pypdf>=3.0.0
Requires-Dist: python-docx>=0.8.11
Requires-Dist: openpyxl>=3.1.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: pytesseract>=0.3.10
Requires-Dist: python-pptx>=0.6.21
Requires-Dist: PyMuPDF>=1.23.0
Requires-Dist: mammoth>=1.6.0
Requires-Dist: PyYAML>=6.0.0
Requires-Dist: html2text>=2020.1.16
Requires-Dist: markdown>=3.4.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: striprtf>=0.0.26
Requires-Dist: transformers>=4.30.0
Requires-Dist: torch>=2.0.0
Requires-Dist: presidio-analyzer>=2.2.0
Requires-Dist: spacy<3.8.0,>=3.0.0
Requires-Dist: flask>=3.0.0
Requires-Dist: aiohttp>=3.8.0
Provides-Extra: lite
Requires-Dist: pydantic>=2.0.0; extra == "lite"
Requires-Dist: aiohttp>=3.8.0; extra == "lite"
Provides-Extra: llm
Requires-Dist: openai>=1.0.0; extra == "llm"
Requires-Dist: anthropic>=0.18.0; extra == "llm"
Requires-Dist: google-genai>=0.2.0; extra == "llm"
Requires-Dist: pydantic>=2.0.0; extra == "llm"
Requires-Dist: numpy>=1.24.0; extra == "llm"
Requires-Dist: flask>=3.0.0; extra == "llm"
Requires-Dist: aiohttp>=3.8.0; extra == "llm"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Provides-Extra: vectors
Requires-Dist: sentence-transformers>=2.0.0; extra == "vectors"
Provides-Extra: deterministic
Requires-Dist: jsonschema>=4.0.0; extra == "deterministic"
Requires-Dist: langdetect>=1.0.9; extra == "deterministic"
Provides-Extra: docs
Requires-Dist: sphinx>=6.0.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.2.0; extra == "docs"
Dynamic: license-file

# Eval AI Library

[![Python Version](https://img.shields.io/badge/python-3.9%2B-blue)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![PyPI](https://img.shields.io/pypi/v/eval-ai-library)](https://pypi.org/project/eval-ai-library/)

> Based on [firstlinesoftware/eval-ai-library](https://github.com/firstlinesoftware/eval-ai-library). This is an independently maintained version with additional features and PyPI distribution.

Comprehensive AI model evaluation framework for RAG systems and AI agents. Supports 35+ evaluation metrics, 12 LLM providers, built-in test data generation from documents, and an interactive web dashboard for visualization and analysis. Implements advanced techniques including G-Eval probability-weighted scoring and Temperature-Controlled Verdict Aggregation via Generalized Power Mean.
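The two techniques named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation: the function names `probability_weighted_score` and `power_mean` are hypothetical, and how the library maps its temperature setting to the exponent `p` is not specified here. The first function follows the G-Eval idea of weighting each candidate score by the probability the model assigns to it; the second computes a generalized power mean over per-verdict scores.

```python
import math
from typing import Mapping, Sequence


def probability_weighted_score(score_probs: Mapping[int, float]) -> float:
    """Expected score: sum of score * probability, after normalizing the probabilities."""
    total = sum(score_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(score * prob for score, prob in score_probs.items()) / total


def power_mean(values: Sequence[float], p: float) -> float:
    """Generalized power mean of non-negative values with exponent p."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    if math.isinf(p):
        # Limits: p -> +inf gives the maximum, p -> -inf gives the minimum.
        return max(values) if p > 0 else min(values)
    if p <= 0 and any(v == 0 for v in values):
        # A zero entry drives the geometric mean and all negative-p means to zero.
        return 0.0
    if p == 0:
        # Limit p -> 0 is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical token probabilities for scores 1..3 from a judge model.
    print(probability_weighted_score({1: 0.1, 2: 0.2, 3: 0.7}))

    # Hypothetical per-verdict scores; lower p is stricter, higher p is more lenient.
    verdicts = [1.0, 0.9, 0.3]
    for p in (-10.0, 1.0, 10.0):
        print(p, power_mean(verdicts, p))
```

Because the power mean is non-decreasing in `p`, a single exponent moves the aggregate smoothly between the weakest verdict and the strongest one, with the arithmetic mean at `p = 1`.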

## Installation

```bash
pip install eval-ai-library
```

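Full version with document parsing and OCR support. Note that the package metadata for 0.5.1 declares no `full` extra (only `lite`, `llm`, `dev`, `vectors`, `deterministic`, and `docs`), and the default install already pulls in the parsing and OCR dependencies such as `pypdf`, `PyMuPDF`, and `pytesseract`, so this command is expected to install the same set as the default:

```bash
pip install "eval-ai-library[full]"
```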

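Lite version (core evaluation only). In the 0.5.1 metadata the `lite` extra lists only `pydantic` and `aiohttp`, both of which are already base requirements; extras add to the base set rather than replace it, so this does not install fewer packages than the default:

```bash
pip install "eval-ai-library[lite]"
```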
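To confirm what was actually installed, the standard library's `importlib.metadata` can report the version and the extras a distribution declares. A small sketch; the helper name `describe_distribution` is hypothetical and not part of this library.

```python
from importlib import metadata
from typing import Optional


def describe_distribution(name: str) -> Optional[dict]:
    """Return version and declared extras for an installed distribution, or None if absent."""
    try:
        dist_meta = metadata.metadata(name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "version": dist_meta["Version"],
        # get_all returns None when the field is missing, so fall back to an empty list.
        "extras": sorted(dist_meta.get_all("Provides-Extra") or []),
    }


if __name__ == "__main__":
    print(describe_distribution("eval-ai-library"))
```

A result of `None` means the distribution is not installed in the current environment.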

## Quick Start

```python
from eval_lib import EvalAI

evaluator = EvalAI(model="gpt-4o")

result = evaluator.evaluate(
    input="What is Python?",
    actual_output="Python is a programming language.",
    expected_output="Python is a high-level programming language.",
    metrics=["answer_relevancy", "faithfulness"]
)

print(result.score)
```

## Documentation

Full documentation is available at [library.eval-ai.com](https://library.eval-ai.com).

## License

This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.

## Citation

If you use this library in your research, please cite:

```bibtex
@software{eval_ai_library,
  author = {Meshkov, Aleksandr},
  title = {Eval AI Library: Comprehensive AI Model Evaluation Framework},
  year = {2025},
  url = {https://github.com/meshkovQA/Eval-ai-library.git}
}
```

### References

This library implements techniques from:

```bibtex
@inproceedings{liu2023geval,
  title = {G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment},
  author = {Liu, Yang and Iter, Dan and Xu, Yichong and Wang, Shuohang and Xu, Ruochen and Zhu, Chenguang},
  booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
  year = {2023}
}
```

## Support

- Issues: [GitHub Issues](https://github.com/meshkovQA/Eval-ai-library/issues)
- Documentation: [library.eval-ai.com](https://library.eval-ai.com)
