Metadata-Version: 2.1
Name: sigllm
Version: 0.0.2
Summary: Signals plus LLMs
Home-page: https://github.com/sintel-dev/sigllm
Author: MIT Data To AI Lab
Author-email: dailabmit@gmail.com
License: MIT license
Keywords: sigllm sigllm sigllm
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8,<3.12
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.rst
Requires-Dist: numpy<2,>=1.17.5
Requires-Dist: openai
Requires-Dist: pandas<2,>=1
Requires-Dist: orion-ml<0.8,>=0.5
Requires-Dist: scikit-learn<1.2,>=0.22
Requires-Dist: tiktoken
Requires-Dist: transformers
Requires-Dist: torch>=1.4
Requires-Dist: accelerate
Requires-Dist: sentencepiece
Provides-Extra: test
Requires-Dist: pytest>=3.4.2; extra == "test"
Requires-Dist: pytest-cov>=2.6.0; extra == "test"
Requires-Dist: rundoc<0.5,>=0.4.3; extra == "test"
Provides-Extra: dev
Requires-Dist: bumpversion>=0.5.3; extra == "dev"
Requires-Dist: pip>=9.0.1; extra == "dev"
Requires-Dist: watchdog>=0.8.3; extra == "dev"
Requires-Dist: docutils<0.18,>=0.12; extra == "dev"
Requires-Dist: m2r2<0.3,>=0.2.5; extra == "dev"
Requires-Dist: nbsphinx<0.7,>=0.5.0; extra == "dev"
Requires-Dist: Sphinx<3.3,>=3; extra == "dev"
Requires-Dist: pydata-sphinx-theme<0.5; extra == "dev"
Requires-Dist: markupsafe<2.1.0; extra == "dev"
Requires-Dist: ipython<9,>=6.5; extra == "dev"
Requires-Dist: Jinja2<3,>=2; extra == "dev"
Requires-Dist: alabaster<=0.7.12; extra == "dev"
Requires-Dist: sphinxcontrib-applehelp<1.0.8; extra == "dev"
Requires-Dist: sphinxcontrib-devhelp<1.0.6; extra == "dev"
Requires-Dist: sphinxcontrib-htmlhelp<2.0.5; extra == "dev"
Requires-Dist: sphinxcontrib-serializinghtml<1.1.10; extra == "dev"
Requires-Dist: sphinxcontrib-qthelp<1.0.7; extra == "dev"
Requires-Dist: flake8<4,>=3.7.7; extra == "dev"
Requires-Dist: isort<5,>=4.3.4; extra == "dev"
Requires-Dist: autoflake<2,>=1.2; extra == "dev"
Requires-Dist: autopep8<2,>=1.4.3; extra == "dev"
Requires-Dist: importlib-metadata<5; extra == "dev"
Requires-Dist: twine<4,>=1.10.0; extra == "dev"
Requires-Dist: wheel>=0.30.0; extra == "dev"
Requires-Dist: coverage<6,>=4.5.1; extra == "dev"
Requires-Dist: tox<4,>=2.9.1; extra == "dev"
Requires-Dist: invoke; extra == "dev"
Requires-Dist: pytest>=3.4.2; extra == "dev"
Requires-Dist: pytest-cov>=2.6.0; extra == "dev"
Requires-Dist: rundoc<0.5,>=0.4.3; extra == "dev"

<p align="left">
<img width=15% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="DAI-Lab" />
<i>An open source project from Data to AI Lab at MIT.</i>
</p>

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![Python](https://img.shields.io/badge/Python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)](https://badge.fury.io/py/sigllm) 
[![PyPi Shield](https://img.shields.io/pypi/v/sigllm.svg)](https://pypi.python.org/pypi/sigllm)
[![Run Tests](https://github.com/sintel-dev/sigllm/actions/workflows/tests.yml/badge.svg)](https://github.com/sintel-dev/sigllm/actions/workflows/tests.yml)
[![Downloads](https://pepy.tech/badge/sigllm)](https://pepy.tech/project/sigllm)


# SigLLM

Using Large Language Models (LLMs) for time series anomaly detection.

<!-- - Documentation: https://sintel-dev.github.io/sigllm -->
- Homepage: https://github.com/sintel-dev/sigllm

# Overview

SigLLM is an extension of the Orion library, built to detect anomalies in time series data using LLMs.
We provide two types of pipelines for anomaly detection:
* **Prompter**: directly prompting LLMs to find anomalies in time series.
* **Detector**: using LLMs to forecast time series and finding anomalies by comparing the real and forecasted signals.

For more details on our pipelines, please read our [paper](https://arxiv.org/pdf/2405.14755).
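
To make the **Detector** idea concrete, here is a minimal sketch of flagging anomalies from the gap between a real signal and its forecast. It is not SigLLM's implementation: the `flag_anomalies` helper, the `k` parameter, and the mean-plus-`k`-standard-deviations threshold are assumptions chosen for illustration, and the forecast is hard-coded rather than produced by an LLM.

```python3
import numpy as np


def flag_anomalies(real, forecast, k=2.0):
    """Hypothetical helper: mark points whose forecast error is unusually large."""
    real = np.asarray(real, dtype=float)
    forecast = np.asarray(forecast, dtype=float)

    # Point-wise absolute error between the observed and forecasted signals.
    errors = np.abs(real - forecast)

    # Assumed thresholding rule: mean error plus k standard deviations.
    threshold = errors.mean() + k * errors.std()
    return errors > threshold


real = np.array([10.0, 11.0, 10.5, 30.0, 10.2, 10.8, 11.1, 10.4, 10.9, 10.6])
forecast = np.array([10.1, 10.9, 10.6, 10.7, 10.3, 10.7, 11.0, 10.5, 10.8, 10.7])

mask = flag_anomalies(real, forecast)
print(np.flatnonzero(mask))  # indices of the flagged points
```

Only the fourth point (index 3), where the real value jumps far from the forecast, exceeds the threshold in this toy example.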

# Quickstart

## Install with pip

The easiest and recommended way to install **SigLLM** is using [pip](https://pip.pypa.io/en/stable/):

```bash
pip install sigllm
```
This will pull and install the latest stable release from [PyPI](https://pypi.org/).


In the following example we show how to use one of the **SigLLM Pipelines**.

# Detect anomalies using a SigLLM pipeline

We will load the demo data located in `tutorials/data.csv` for this example:

```python3
import pandas as pd

data = pd.read_csv('data.csv')
data.head()
```

This should show a signal with `timestamp` and `value` columns:
```
     timestamp      value
0   1222840800   6.357008
1   1222862400  12.763547
2   1222884000  18.204697
3   1222905600  21.972602
4   1222927200  23.986643
5   1222948800  24.906765
```
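
If your own data stores time as date strings, the sketch below shows one way to reshape it into the same two-column layout. It assumes the `timestamp` column holds Unix epoch seconds (which is what the demo values look like); the `raw` frame and its `date` and `reading` column names are made up for illustration.

```python3
import pandas as pd

# Hypothetical input: date strings plus a measurement column.
raw = pd.DataFrame({
    'date': ['2008-10-01 06:00:00', '2008-10-01 12:00:00'],
    'reading': [6.357008, 12.763547],
})

# Convert datetimes to whole seconds elapsed since the Unix epoch.
epoch = pd.Timestamp('1970-01-01')
seconds = (pd.to_datetime(raw['date']) - epoch) // pd.Timedelta(seconds=1)

data = pd.DataFrame({
    'timestamp': seconds,
    'value': raw['reading'],
})
print(data)
```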

In this example we use the `gpt_detector` pipeline and set some hyperparameters. In this case, we set the thresholding strategy to dynamic. The hyperparameters are optional and can be removed.

In addition, the `SigLLM` object takes a `decimal` argument that determines how many digits after the decimal point of each float value to include. Here, we don't want to keep any decimal digits, so we set it to zero.

```python3
from sigllm import SigLLM

hyperparameters = {
    "orion.primitives.timeseries_anomalies.find_anomalies#1": {
        "fixed_threshold": False
    }
}

sigllm = SigLLM(
    pipeline='gpt_detector',
    decimal=0,
    hyperparameters=hyperparameters
)
```
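
To build intuition for the `decimal` argument, the sketch below scales floats by `10 ** decimal` and converts them to integers. The `to_integers` helper is hypothetical and only illustrates the idea of keeping a fixed number of decimal digits; the library's actual conversion may differ (for example, it could truncate instead of round, or shift the values first).

```python3
import numpy as np


def to_integers(values, decimal=0):
    """Hypothetical helper: keep `decimal` digits after the point, as integers."""
    values = np.asarray(values, dtype=float)

    # Scaling by 10 ** decimal moves the kept digits left of the decimal point;
    # rounding here is an assumption, not necessarily what SigLLM does.
    return np.round(values * 10 ** decimal).astype(int)


values = [6.357008, 12.763547, 18.204697]

print(to_integers(values, decimal=0))  # no decimal digits kept
print(to_integers(values, decimal=2))  # two decimal digits kept
```

Fewer digits give the model a shorter numeric representation of the signal, at the cost of precision.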

Now that we have initialized the pipeline, we are ready to use it to detect anomalies:

```python3
anomalies = sigllm.detect(data)
```
> :warning: Depending on the length of your time series, this might take time to run.

The output of the previous command will be a `pandas.DataFrame` containing a table of detected anomalies:

```
        start         end  severity
0  1225864800  1227139200  0.625879
```
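
To read the detected interval as calendar dates, you can convert the `start` and `end` columns with pandas. This sketch assumes the timestamps are Unix epoch seconds, and it recreates the table above by hand so that it runs on its own; the `readable` name and the `duration` column are additions for illustration.

```python3
import pandas as pd

# Recreate the example output shown above.
anomalies = pd.DataFrame({
    'start': [1225864800],
    'end': [1227139200],
    'severity': [0.625879],
})

# Assumption: timestamps are seconds since the Unix epoch.
readable = anomalies.assign(
    start=pd.to_datetime(anomalies['start'], unit='s'),
    end=pd.to_datetime(anomalies['end'], unit='s'),
)

# Length of each anomalous interval.
readable['duration'] = readable['end'] - readable['start']
print(readable)
```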

# Resources

Additional resources that might be of interest:
* Learn about [Orion](https://github.com/sintel-dev/Orion).
* Read our [paper](https://arxiv.org/pdf/2405.14755).


# Citation

If you use **SigLLM** for your research, please consider citing the following paper:

Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni. [Can Large Language Models be Anomaly Detectors for Time Series?](https://arxiv.org/pdf/2405.14755).

```
@inproceedings{alnegheimish2024sigllm,
  title={Can Large Language Models be Anomaly Detectors for Time Series?},
  author={Alnegheimish, Sarah and Nguyen, Linh and Berti-Equille, Laure and Veeramachaneni, Kalyan},
  booktitle={2024 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA)},
  organization={IEEE},
  year={2024}
}
```

# History

## 0.0.2 - 2024-10-24

New Prompter pipeline.

* Test README with GPT – [Issue #20](https://github.com/sintel-dev/sigllm/issues/20) by @sarahmish
* Mistral-prompter – [Issue #19](https://github.com/sintel-dev/sigllm/issues/19) by @Linh-nk


## 0.0.1 - 2024-09-25

First sigllm release to PyPI: https://pypi.org/project/sigllm/

* Add README – [Issue #17](https://github.com/sintel-dev/sigllm/issues/17) by @sarahmish
* Create a SigLLM API – [Issue #13](https://github.com/sintel-dev/sigllm/issues/13) by @sarahmish
* Add a Quick Example – [Issue #12](https://github.com/sintel-dev/sigllm/issues/12) by @sarahmish
* Forecasting Pipeline – [Issue #11](https://github.com/sintel-dev/sigllm/issues/11) by @sarahmish
* Refactor Transformation Primitives – [Issue #7](https://github.com/sintel-dev/sigllm/issues/7) by @sarahmish
* Forecasting Module – [Issue #2](https://github.com/sintel-dev/sigllm/issues/2) by @sarahmish

