|
3 | 3 | <i>An open source project from Data to AI Lab at MIT.</i>
|
4 | 4 | </p>
|
5 | 5 |
|
6 |
| -<!-- Uncomment these lines after releasing the package to PyPI for version and downloads badges --> |
7 |
| -<!--[](https://pypi.python.org/pypi/sigllm)--> |
8 |
| -<!--[](https://pepy.tech/project/sigllm)--> |
9 |
| -[](https://github.com/sintel-dev/sigllm/actions) |
| 6 | +[](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha) |
| 7 | +[](https://badge.fury.io/py/sigllm) |
| 8 | +[](https://pypi.python.org/pypi/sigllm) |
| 9 | +[](https://github.com/sintel-dev/sigllm/actions/workflows/tests.yml) |
| 10 | +[](https://pepy.tech/project/sigllm) |
10 | 11 |
|
11 | 12 |
|
| 13 | +# SigLLM |
12 | 14 |
|
13 |
| -# sigllm |
| 15 | +Using Large Language Models (LLMs) for time series anomaly detection. |
14 | 16 |
|
15 |
| -Signals plus LLMs |
16 |
| - |
17 |
| -- Documentation: https://sintel-dev.github.io/sigllm |
| 17 | +<!-- - Documentation: https://sintel-dev.github.io/sigllm --> |
18 | 18 | - Homepage: https://github.com/sintel-dev/sigllm
|
19 | 19 |
|
20 | 20 | # Overview
|
21 | 21 |
|
22 |
| -TODO: Provide a short overview of the project here. |
23 |
| - |
24 |
| -# Install |
| 22 | +SigLLM is an extension of the Orion library, built to detect anomalies in time series data using LLMs. |
| 23 | +We provide two types of pipelines for anomaly detection: |
| 24 | +* **Prompter**: directly prompting LLMs to find anomalies in time series. |
| 25 | +* **Detector**: using LLMs to forecast time series and finding anomalies by comparing the real and forecasted signals. |
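
The Detector idea described above (forecast, then compare the forecast against the real signal) can be illustrated with a minimal sketch. This is not SigLLM's implementation: the function name `flag_anomalies`, the absolute-error score, and the mean-plus-`k`-standard-deviations threshold are all assumptions chosen for illustration, and the forecast here is a hard-coded array rather than LLM output.

```python
import numpy as np


def flag_anomalies(actual, forecast, k=2.0):
    """Return indices where the forecast error is unusually large.

    Hypothetical helper: a point is flagged when its absolute error
    exceeds mean(errors) + k * std(errors).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = np.abs(actual - forecast)            # point-wise residuals
    threshold = errors.mean() + k * errors.std()  # simple global threshold
    return np.flatnonzero(errors > threshold)


# Toy signal with one obvious spike at index 3.
actual = np.array([10.0, 11.0, 10.5, 30.0, 10.2, 10.8, 11.1, 10.4, 10.9, 10.6])
# Stand-in for a model forecast (assumed, not produced by an LLM here).
forecast = np.full(10, 10.5)

flagged = flag_anomalies(actual, forecast, k=2.0)
print(flagged)
```

A single global threshold is the simplest choice. The quickstart in this README configures dynamic thresholding through Orion's `find_anomalies` primitive instead, which this sketch does not attempt to reproduce.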
25 | 26 |
|
26 |
| -## Requirements |
| 27 | +For more details on our pipelines, please read our [paper](https://arxiv.org/pdf/2405.14755). |
27 | 28 |
|
28 |
| -**sigllm** has been developed and tested on [Python 3.8, 3.9, 3.10 and 3.11](https://www.python.org/downloads/) |
| 29 | +# Quickstart |
29 | 30 |
|
30 |
| -Also, although it is not strictly required, the usage of a [virtualenv](https://virtualenv.pypa.io/en/latest/) |
31 |
| -is highly recommended in order to avoid interfering with other software installed in the system |
32 |
| -in which **sigllm** is run. |
| 31 | +## Install with pip |
33 | 32 |
|
34 |
| -These are the minimum commands needed to create a virtualenv using python3.8 for **sigllm**: |
| 33 | +The easiest and recommended way to install **SigLLM** is using [pip](https://pip.pypa.io/en/stable/): |
35 | 34 |
|
36 | 35 | ```bash
|
37 |
| -pip install virtualenv |
38 |
| -virtualenv -p $(which python3.6) sigllm-venv |
| 36 | +pip install sigllm |
39 | 37 | ```
|
| 38 | +This will pull and install the latest stable release from [PyPI](https://pypi.org/). |
40 | 39 |
|
41 |
| -Afterwards, you have to execute this command to activate the virtualenv: |
42 | 40 |
|
43 |
| -```bash |
44 |
| -source sigllm-venv/bin/activate |
45 |
| -``` |
| 41 | +In the following example we show how to use one of the **SigLLM Pipelines**. |
46 | 42 |
|
47 |
| -Remember to execute it every time you start a new console to work on **sigllm**! |
| 43 | +# Detect anomalies using a SigLLM pipeline |
48 | 44 |
|
49 |
| -<!-- Uncomment this section after releasing the package to PyPI for installation instructions |
50 |
| -## Install from PyPI |
| 45 | +We will load the demo data located in `tutorials/data.csv` for this example: |
51 | 46 |
|
52 |
| -After creating the virtualenv and activating it, we recommend using |
53 |
| -[pip](https://pip.pypa.io/en/stable/) in order to install **sigllm**: |
| 47 | +```python3 |
| 48 | +import pandas as pd |
54 | 49 |
|
55 |
| -```bash |
56 |
| -pip install sigllm |
| 50 | +data = pd.read_csv('data.csv') |
| 51 | +data.head() |
57 | 52 | ```
|
58 | 53 |
|
59 |
| -This will pull and install the latest stable release from [PyPI](https://pypi.org/). |
60 |
| ---> |
| 54 | +which should show a signal with `timestamp` and `value` columns: |
| 55 | +``` |
| 56 | + timestamp value |
| 57 | +0 1222840800 6.357008 |
| 58 | +1 1222862400 12.763547 |
| 59 | +2 1222884000 18.204697 |
| 60 | +3 1222905600 21.972602 |
| 61 | +4 1222927200 23.986643 |
| 62 | +5 1222948800 24.906765 |
| 63 | +``` |
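
As a quick sanity check on data of this shape, the sketch below rebuilds the first three rows shown above and inspects the sampling interval. It assumes the `timestamp` column holds Unix epoch seconds, which the README does not state explicitly; the `datetime` column and the `step` variable are names introduced here for illustration only.

```python
import pandas as pd

# First three rows of the demo signal, copied from the output above.
data = pd.DataFrame({
    "timestamp": [1222840800, 1222862400, 1222884000],
    "value": [6.357008, 12.763547, 18.204697],
})

# Assumption: timestamps are seconds since the Unix epoch.
data["datetime"] = pd.to_datetime(data["timestamp"], unit="s")

# Distinct gaps between consecutive samples, in seconds.
step = data["timestamp"].diff().dropna().unique()

print(data)
print(step)
```

Under that assumption, the rows are spaced 21600 seconds (six hours) apart.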
61 | 64 |
|
62 |
| -## Install from source |
| 65 | +In this example we use the `gpt_detector` pipeline and set some hyperparameters. In this case, we set the thresholding strategy to dynamic. The hyperparameters are optional and can be removed. |
63 | 66 |
|
64 |
| -With your virtualenv activated, you can clone the repository and install it from |
65 |
| -source by running `make install` on the `stable` branch: |
| 67 | +In addition, the `SigLLM` object takes a `decimal` argument that determines how many decimal digits of each float value to keep. Here, we don't want to keep any decimal digits, so we set it to zero. |
66 | 68 |
|
67 |
| -```bash |
68 |
| -git clone [email protected]:sintel-dev/sigllm.git |
69 |
| -cd sigllm |
70 |
| -git checkout stable |
71 |
| -make install |
| 69 | +```python3 |
| 70 | +from sigllm import SigLLM |
| 71 | + |
| 72 | +hyperparameters = { |
| 73 | + "orion.primitives.timeseries_anomalies.find_anomalies#1": { |
| 74 | + "fixed_threshold": False |
| 75 | + } |
| 76 | +} |
| 77 | + |
| 78 | +sigllm = SigLLM( |
| 79 | + pipeline='gpt_detector', |
| 80 | + decimal=0, |
| 81 | + hyperparameters=hyperparameters |
| 82 | +) |
72 | 83 | ```
|
73 | 84 |
|
74 |
| -## Install for Development |
| 85 | +Now that we have initialized the pipeline, we are ready to use it to detect anomalies: |
75 | 86 |
|
76 |
| -If you want to contribute to the project, a few more steps are required to make the project ready |
77 |
| -for development. |
| 87 | +```python3 |
| 88 | +anomalies = sigllm.detect(data) |
| 89 | +``` |
| 90 | +> :warning: Depending on the length of your time series, this might take time to run. |
78 | 91 |
|
79 |
| -Please head to the [Contributing Guide](https://sintel-dev.github.io/sigllm/contributing.html#get-started) |
80 |
| -for more details about this process. |
| 92 | +The output of the previous command will be a ``pandas.DataFrame`` containing a table of detected anomalies: |
81 | 93 |
|
82 |
| -# Quickstart |
| 94 | +``` |
| 95 | + start end severity |
| 96 | +0 1225864800 1227139200 0.625879 |
| 97 | +``` |
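
The `start` and `end` values in this table are easier to read as dates. The sketch below rebuilds the single row shown above and converts it, assuming the values are Unix epoch seconds (not stated in the README); the `readable` frame and its `duration` column are hypothetical names used only for this illustration.

```python
import pandas as pd

# The detected anomaly, copied from the output above.
anomalies = pd.DataFrame({
    "start": [1225864800],
    "end": [1227139200],
    "severity": [0.625879],
})

# Assumption: start/end are seconds since the Unix epoch.
readable = anomalies.assign(
    start=pd.to_datetime(anomalies["start"], unit="s"),
    end=pd.to_datetime(anomalies["end"], unit="s"),
)

# Length of each anomalous interval as a Timedelta.
readable["duration"] = readable["end"] - readable["start"]

print(readable)
```

Under that assumption, the interval runs from 2008-11-05 06:00 to 2008-11-20 00:00, a span of 14 days and 18 hours.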
| 98 | + |
| 99 | +# Resources |
| 100 | + |
| 101 | +Additional resources that might be of interest: |
| 102 | +* Learn about [Orion](https://github.com/sintel-dev/Orion). |
| 103 | +* Read our [paper](https://arxiv.org/pdf/2405.14755). |
83 | 104 |
|
84 |
| -In this short tutorial we will guide you through a series of steps that will help you |
85 |
| -getting started with **sigllm**. |
86 | 105 |
|
87 |
| -TODO: Create a step by step guide here. |
| 106 | +# Citation |
88 | 107 |
|
89 |
| -# What's next? |
| 108 | +If you use **SigLLM** for your research, please consider citing the following paper: |
90 | 109 |
|
91 |
| -For more details about **sigllm** and all its possibilities |
92 |
| -and features, please check the [documentation site]( |
93 |
| -https://sintel-dev.github.io/sigllm/). |
| 110 | +Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni. [Can Large Language Models be Anomaly Detectors for Time Series?](https://arxiv.org/pdf/2405.14755). |
| 111 | + |
| 112 | +``` |
| 113 | +@inproceedings{alnegheimish2024sigllm, |
| 114 | + title={Can Large Language Models be Anomaly Detectors for Time Series?}, |
| 115 | + author={Alnegheimish, Sarah and Nguyen, Linh and Berti-Equille, Laure and Veeramachaneni, Kalyan}, |
| 116 | +  booktitle={2024 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA)}, |
| 117 | + organization={IEEE}, |
| 118 | + year={2024} |
| 119 | +} |
| 120 | +``` |