Metadata-Version: 2.1
Name: trulens-eval
Version: 0.9.0
Summary: Library with langchain instrumentation to evaluate LLM based applications.
Home-page: https://www.trulens.org
Author: Truera Inc
Author-email: all@truera.com
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: cohere (>=4.4.1)
Requires-Dist: datasets (>=2.12.0)
Requires-Dist: python-dotenv (>=1.0.0)
Requires-Dist: kaggle (>=1.5.13)
Requires-Dist: langchain (>=0.0.230)
Requires-Dist: llama-index (>=0.7.16)
Requires-Dist: merkle-json (>=1.0.0)
Requires-Dist: millify (>=0.1.1)
Requires-Dist: openai (>=0.27.6)
Requires-Dist: pinecone-client (>=2.2.1)
Requires-Dist: pydantic (>=1.10.7)
Requires-Dist: requests (>=2.30.0)
Requires-Dist: slack-bolt (>=1.18.0)
Requires-Dist: slack-sdk (>=3.21.3)
Requires-Dist: streamlit (>=1.13.0)
Requires-Dist: streamlit-aggrid (>=0.3.4.post3)
Requires-Dist: streamlit-extras (>=0.2.7)
Requires-Dist: streamlit-javascript (>=0.1.5)
Requires-Dist: transformers (>=4.10.0)
Requires-Dist: typing-inspect (==0.8.0)
Requires-Dist: typing-extensions (==4.5.0)
Requires-Dist: frozendict (>=2.3.8)
Requires-Dist: munch (>=3.0.0)
Requires-Dist: ipywidgets (>=8.0.6)
Requires-Dist: numpy (>=1.23.5)

# Welcome to TruLens-Eval!

![TruLens](https://www.trulens.org/Assets/image/Neural_Network_Explainability.png)

Evaluate and track your LLM experiments with TruLens. As you work on your models and prompts, TruLens-Eval supports the iterative development of a wide range of LLM applications by wrapping your application to log key metadata across the entire chain (or off chain, if your project does not use chains) on your local machine.

Using feedback functions, you can objectively evaluate the quality of the responses provided by an LLM to your requests. This is done with minimal latency, since the evaluation runs in a sequential call for your application, and the results are logged to your local machine. Finally, we provide an easy-to-use Streamlit dashboard that runs locally on your machine to help you better understand your LLM’s performance.

## Value Propositions

TruLens-Eval has two key value propositions:

1. Evaluation:
    * TruLens supports the evaluation of inputs, outputs and internals of your LLM application using any model (including LLMs).
    * A number of feedback functions for evaluation are implemented out-of-the-box such as groundedness, relevance and toxicity. The framework is also easily extensible for custom evaluation requirements.
2. Tracking:
    * TruLens contains instrumentation for any LLM application including question answering, retrieval-augmented generation, agent-based applications and more. This instrumentation allows for the tracking of a wide variety of usage metrics and metadata. Read more in the [instrumentation overview](basic_instrumentation.ipynb).
    * TruLens' instrumentation can be applied to any LLM application without being tied down to a given framework. Additionally, deep integrations with [LangChain]() and [Llama-Index]() allow the capture of internal metadata and text.
    * Anything that is tracked by the instrumentation can be evaluated!
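
As a rough illustration of the two ideas above, the sketch below wraps a plain callable, records each prompt and response, and scores the pair with a feedback function. Every name in it (`TrackedApp`, `Record`, `word_overlap`, `echo_app`) is hypothetical and chosen for this example; it is not the `trulens_eval` API, and the word-overlap score is only a toy stand-in for the built-in feedback functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: this is NOT the trulens_eval API.
# A feedback function maps (prompt, response) to a score in [0, 1].
FeedbackFn = Callable[[str, str], float]


@dataclass
class Record:
    """One tracked call: the input, the output, and its feedback scores."""
    prompt: str
    response: str
    scores: Dict[str, float] = field(default_factory=dict)


@dataclass
class TrackedApp:
    """Wraps any callable app, logging each call and evaluating it."""
    app: Callable[[str], str]
    feedbacks: Dict[str, FeedbackFn]
    records: List[Record] = field(default_factory=list)

    def __call__(self, prompt: str) -> str:
        response = self.app(prompt)
        record = Record(prompt, response)
        # Evaluation runs sequentially, right after the app call.
        for name, feedback in self.feedbacks.items():
            record.scores[name] = feedback(prompt, response)
        self.records.append(record)
        return response


def word_overlap(prompt: str, response: str) -> float:
    """Toy relevance proxy: fraction of prompt words found in the response."""
    prompt_words = set(prompt.lower().split())
    if not prompt_words:
        return 0.0
    response_words = set(response.lower().split())
    return len(prompt_words & response_words) / len(prompt_words)


def echo_app(prompt: str) -> str:
    """Stand-in for an LLM application."""
    return f"You asked: {prompt}"


tracked = TrackedApp(app=echo_app, feedbacks={"word_overlap": word_overlap})
tracked("what is trulens")
print(tracked.records[0].scores)
```

Because the wrapper only needs a callable, the same pattern applies whether or not the application is built on a framework.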

The process for building your evaluated and tracked LLM application with TruLens is below 👇

![Architecture Diagram](https://www.trulens.org/Assets/image/TruLens_Architecture.png)

## Installation and Setup

Install the trulens-eval pip package from PyPI.

```bash
pip install trulens-eval
```
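
To confirm that the install succeeded, one option is to query the installed distribution's version from Python. The helper below is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); `installed_version` is a hypothetical name introduced here, not part of TruLens.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("trulens-eval"))
```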

## Setting Keys

Each of the quickstarts requires [OpenAI](https://platform.openai.com/account/api-keys) and [Hugging Face](https://huggingface.co/settings/tokens) keys. You can add keys by setting the environment variables:

```python
import os
os.environ["OPENAI_API_KEY"] = "..."
os.environ["HUGGINGFACE_API_KEY"] = "..."
```
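
Before running a quickstart, it can help to check that both variables are actually set. The following is a small sketch under the assumption that the two variable names above are the ones required; `missing_keys` and `REQUIRED_KEYS` are hypothetical names for this example.

```python
import os
from typing import List, Mapping, Sequence

# Variable names taken from the snippet above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACE_API_KEY")


def missing_keys(
    required: Sequence[str] = REQUIRED_KEYS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_keys()
if missing:
    print("Missing keys:", ", ".join(missing))
```

Passing `env` explicitly makes the check easy to exercise without touching the real environment.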

## Quick Usage

TruLens supports evaluation and tracking for any LLM app framework. Choose a framework below to get started:

**LangChain**

[langchain_quickstart.ipynb](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/quickstart.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/colab/quickstarts/langchain_quickstart_colab.ipynb)

[langchain_quickstart.py](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/quickstart.py)

**Llama-Index**

[llama_index_quickstart.ipynb](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/frameworks/llama_index/llama_index_quickstart.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/colab/quickstarts/llama_index_quickstart_colab.ipynb)

[llama_index_quickstart.py](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/llama_index_quickstart.py)

**No Framework**

[no_framework_quickstart.ipynb](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/no_framework_quickstart.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/colab/quickstarts/no_framework_quickstart_colab.ipynb)

[no_framework_quickstart.py](https://github.com/truera/trulens/blob/releases/rc-trulens-eval-0.9.0/trulens_eval/examples/no_framework_quickstart.py)

### 💡 Contributing

Interested in contributing? See our [contribution guide](https://github.com/truera/trulens/tree/main/trulens_eval/CONTRIBUTING.md) for more details.
