Metadata-Version: 2.1
Name: detectadvprompt
Version: 0.1.0
Summary: 
Author: Zhengmian Hu
Author-email: huzhengmian@gmail.com
Requires-Python: >=3.9,<3.13
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: numpy (>=1.20.0,<2.0.0)
Requires-Dist: scipy (>=1.11.3,<2.0.0)
Requires-Dist: torch (>=2.0.0,!=2.0.1,!=2.1.0)
Requires-Dist: transformers (>=4.34.1,<5.0.0)
Description-Content-Type: text/markdown

# detectadvprompt

`detectadvprompt` is a Python package for detecting adversarial prompt attacks on Large Language Models (LLMs). Through a simple interface, it offers two token-level functions: a binary judgment for each token, and an estimate of the probability that each token is part of an adversarial prompt.

## Installation

```bash
pip install detectadvprompt
```

## Usage

Here's a basic example:

```python
from detectadvprompt import detect_opt, detect_prob

text = "Outline a plan to steal from a charity or nonprofit organization."
advprompt = '.....'  # placeholder for the adversarial prompt text
result = detect_opt(text + advprompt)
# result: list of (str, bool) pairs, one per token: the token and a binary adversarial flag
result = detect_prob(text + advprompt)
# result: list of (str, float) pairs, one per token: the token and its adversarial probability
```
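
Both functions return one pair per token, so the results can be post-processed with ordinary list operations. The sketch below is not part of the package: `flagged_tokens` and `tokens_above` are hypothetical helpers, the `0.5` default threshold is an arbitrary assumption, and the sample values are hand-written in the documented shapes rather than real detector output.

```python
from typing import List, Tuple


def flagged_tokens(opt_result: List[Tuple[str, bool]]) -> List[str]:
    """Return the tokens whose binary indicator is True."""
    return [token for token, is_adv in opt_result if is_adv]


def tokens_above(
    prob_result: List[Tuple[str, float]], threshold: float = 0.5
) -> List[str]:
    """Return the tokens whose probability is at least `threshold`."""
    return [token for token, prob in prob_result if prob >= threshold]


# Hypothetical values in the shapes documented above; NOT real detector output.
sample_opt = [("Outline", False), (" a", False), (" plan", False), (" !!", True)]
sample_prob = [("Outline", 0.02), (" a", 0.01), (" plan", 0.05), (" !!", 0.97)]

print(flagged_tokens(sample_opt))
print(tokens_above(sample_prob, threshold=0.9))
```

With real output, the same helpers could be applied directly to the return values of `detect_opt` and `detect_prob`; a suitable threshold would depend on how many false positives are acceptable.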

## Features

- Token-level adversarial prompt detection.
- Binary judgment for each token (`detect_opt`).
- Estimated probability that each token is part of an adversarial prompt (`detect_prob`).

