Metadata-Version: 2.1
Name: spot-llm
Version: 0.3.0
Summary: 
Author: Edouard Yvinec
Author-email: edouardyvinec@hotmail.fr
Requires-Python: >=3.8,<3.11
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: accelerate (>=0.30.1,<0.31.0)
Requires-Dist: loadingpy (>=0.1.4,<0.2.0)
Requires-Dist: torch (>=2.3.0,<3.0.0)
Requires-Dist: transformers (>=4.41.1,<5.0.0)
Description-Content-Type: text/markdown

# SPOT

SPOT detects whether a text was written by an LLM or by a human.

## Use this repo

```bash
conda create -n spot python=3.8 --yes
conda activate spot

poetry install
```

## Installation

```bash
poetry add spot-llm
```

## Usage

```python
from spot import MistralSpotter

s = MistralSpotter("YourHuggingfaceToken")

s.is_human("""The wide acceptance of large language models (LLMs) has unlocked new applications
and social risks. Popular countermeasures aim at detecting misinformation,
usually involve domain specific models trained to recognize the relevance of any
information. Instead of evaluating the validity of the information, we propose to
investigate LLM generated text from the perspective of trust. In this study, we
define trust as the ability to know if an input text was generated by a LLM or a
human. To do so, we design SPOT, an efficient method, that classifies the source
of any, standalone, text input based on originality score. This score is derived from
the prediction of a given LLM to detect other LLMs. We empirically demonstrate
the robustness of the method to the architecture, training data, evaluation data, task
and compression of modern LLMs.""")  # => True
```

```python
from spot import Opt125Spotter

s = Opt125Spotter()

s.is_human("""The wide acceptance of large language models (LLMs) has unlocked new applications
and social risks. Popular countermeasures aim at detecting misinformation,
usually involve domain specific models trained to recognize the relevance of any
information. Instead of evaluating the validity of the information, we propose to
investigate LLM generated text from the perspective of trust. In this study, we
define trust as the ability to know if an input text was generated by a LLM or a
human. To do so, we design SPOT, an efficient method, that classifies the source
of any, standalone, text input based on originality score. This score is derived from
the prediction of a given LLM to detect other LLMs. We empirically demonstrate
the robustness of the method to the architecture, training data, evaluation data, task
and compression of modern LLMs.""")  # => True
```
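
Both spotters above expose the same `is_human` call, so a small wrapper can label several texts at once. The sketch below is illustrative only: `label_texts` and `LengthStubSpotter` are hypothetical names that are not part of `spot`, and it assumes `is_human` returns a boolean, as the examples suggest. The stub stands in for a real spotter so the snippet runs without downloading a model.

```python
from typing import Dict, Iterable, Protocol


class Spotter(Protocol):
    """Any object exposing the `is_human` method used in the examples above."""

    def is_human(self, text: str) -> bool: ...


def label_texts(spotter: Spotter, texts: Iterable[str]) -> Dict[str, str]:
    """Map each text to "human" or "llm" (hypothetical helper, not part of spot)."""
    return {text: "human" if spotter.is_human(text) else "llm" for text in texts}


class LengthStubSpotter:
    """Demonstration stand-in only: treats texts longer than 20 characters as human."""

    def is_human(self, text: str) -> bool:
        return len(text) > 20


if __name__ == "__main__":
    samples = ["short text", "a noticeably longer piece of text"]
    print(label_texts(LengthStubSpotter(), samples))
```

In real use, `LengthStubSpotter()` would presumably be replaced by `Opt125Spotter()` or `MistralSpotter(...)`.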

