Commit b4d00d6

make release-tag: Merge branch 'master' into stable
2 parents 4a14a45 + ddaacee commit b4d00d6

31 files changed

+3458
-497
lines changed

.github/workflows/readme.yml

+30
@@ -0,0 +1,30 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+
+name: README
+
+on:
+  push:
+    branches: [ master ]
+
+jobs:
+  readme:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        python-version: [3.8]
+        os: [ubuntu-latest]
+    steps:
+    - uses: actions/checkout@v1
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v2
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install package and dependencies
+      run: |
+        python -m pip install --upgrade pip
+        python -m pip install invoke rundoc .
+    - name: invoke readme
+      env:
+        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+      run: invoke readme

.github/workflows/tests.yml

-20
@@ -48,26 +48,6 @@ jobs:
     # run: make docs
 
 
-  readme:
-    runs-on: ${{ matrix.os }}
-    strategy:
-      matrix:
-        python-version: ['3.8', '3.9', '3.10', '3.11']
-        os: [ubuntu-20.04, macos-13]
-    steps:
-    - uses: actions/checkout@v1
-    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
-      with:
-        python-version: ${{ matrix.python-version }}
-    - name: Install package and dependencies
-      run: |
-        python -m pip install --upgrade pip
-        python -m pip install invoke rundoc .
-    - name: invoke readme
-      run: invoke readme
-
-
   unit:
     runs-on: ${{ matrix.os }}
     strategy:

HISTORY.md

+8
@@ -1,5 +1,13 @@
 # History
 
+## 0.0.2 - 2024-10-24
+
+New Prompter pipeline.
+
+* Test README with GPT – [Issue #20](https://github.com/sintel-dev/sigllm/issues/20) by @sarahmish
+* Mistral-prompter – [Issue #19](https://github.com/sintel-dev/sigllm/issues/19) by @Linh-nk
+
+
 ## 0.0.1 - 2024-09-25
 
 First sigllm release to PyPI: https://pypi.org/project/sigllm/

README.md

+80-53
@@ -3,91 +3,118 @@
 <i>An open source project from Data to AI Lab at MIT.</i>
 </p>
 
-<!-- Uncomment these lines after releasing the package to PyPI for version and downloads badges -->
-<!--[![PyPI Shield](https://img.shields.io/pypi/v/sigllm.svg)](https://pypi.python.org/pypi/sigllm)-->
-<!--[![Downloads](https://pepy.tech/badge/sigllm)](https://pepy.tech/project/sigllm)-->
-[![Github Actions Shield](https://img.shields.io/github/workflow/status/sintel-dev/sigllm/Run%20Tests)](https://github.com/sintel-dev/sigllm/actions)
+[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
+[![Python](https://img.shields.io/badge/Python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)](https://badge.fury.io/py/sigllm)
+[![PyPi Shield](https://img.shields.io/pypi/v/sigllm.svg)](https://pypi.python.org/pypi/sigllm)
+[![Run Tests](https://github.com/sintel-dev/sigllm/actions/workflows/tests.yml/badge.svg)](https://github.com/sintel-dev/sigllm/actions/workflows/tests.yml)
+[![Downloads](https://pepy.tech/badge/sigllm)](https://pepy.tech/project/sigllm)
 
 
+# SigLLM
 
-# sigllm
+Using Large Language Models (LLMs) for time series anomaly detection.
 
-Signals plus LLMs
-
-- Documentation: https://sintel-dev.github.io/sigllm
+<!-- - Documentation: https://sintel-dev.github.io/sigllm -->
 - Homepage: https://github.com/sintel-dev/sigllm
 
 # Overview
 
-TODO: Provide a short overview of the project here.
-
-# Install
+SigLLM is an extension of the Orion library, built to detect anomalies in time series data using LLMs.
+We provide two types of pipelines for anomaly detection:
+* **Prompter**: directly prompting LLMs to find anomalies in time series.
+* **Detector**: using LLMs to forecast time series and finding anomalies by comparing the real and forecasted signals.
 
-## Requirements
+For more details on our pipelines, please read our [paper](https://arxiv.org/pdf/2405.14755).
 
-**sigllm** has been developed and tested on [Python 3.8, 3.9, 3.10 and 3.11](https://www.python.org/downloads/)
+# Quickstart
 
-Also, although it is not strictly required, the usage of a [virtualenv](https://virtualenv.pypa.io/en/latest/)
-is highly recommended in order to avoid interfering with other software installed in the system
-in which **sigllm** is run.
+## Install with pip
 
-These are the minimum commands needed to create a virtualenv using python3.8 for **sigllm**:
+The easiest and recommended way to install **SigLLM** is using [pip](https://pip.pypa.io/en/stable/):
 
 ```bash
-pip install virtualenv
-virtualenv -p $(which python3.6) sigllm-venv
+pip install sigllm
 ```
+This will pull and install the latest stable release from [PyPi](https://pypi.org/).
 
-Afterwards, you have to execute this command to activate the virtualenv:
 
-```bash
-source sigllm-venv/bin/activate
-```
+In the following example we show how to use one of the **SigLLM Pipelines**.
 
-Remember to execute it every time you start a new console to work on **sigllm**!
+# Detect anomalies using a SigLLM pipeline
 
-<!-- Uncomment this section after releasing the package to PyPI for installation instructions
-## Install from PyPI
+We will load demo data located in `tutorials/data.csv` for this example:
 
-After creating the virtualenv and activating it, we recommend using
-[pip](https://pip.pypa.io/en/stable/) in order to install **sigllm**:
+```python3
+import pandas as pd
 
-```bash
-pip install sigllm
+data = pd.read_csv('data.csv')
+data.head()
 ```
 
-This will pull and install the latest stable release from [PyPI](https://pypi.org/).
--->
+which should show a signal with `timestamp` and `value`.
+```
+    timestamp      value
+0  1222840800   6.357008
+1  1222862400  12.763547
+2  1222884000  18.204697
+3  1222905600  21.972602
+4  1222927200  23.986643
+5  1222948800  24.906765
+```
 
-## Install from source
+In this example we use the `gpt_detector` pipeline and set some hyperparameters. In this case, we set the thresholding strategy to dynamic. The hyperparameters are optional and can be removed.
 
-With your virtualenv activated, you can clone the repository and install it from
-source by running `make install` on the `stable` branch:
+In addition, the `SigLLM` object takes in a `decimal` argument to determine how many digits from the float value to include. Here, we don't want to keep any decimal values, so we set it to zero.
 
-```bash
-git clone git@github.com:sintel-dev/sigllm.git
-cd sigllm
-git checkout stable
-make install
+```python3
+from sigllm import SigLLM
+
+hyperparameters = {
+    "orion.primitives.timeseries_anomalies.find_anomalies#1": {
+        "fixed_threshold": False
+    }
+}
+
+sigllm = SigLLM(
+    pipeline='gpt_detector',
+    decimal=0,
+    hyperparameters=hyperparameters
+)
 ```
 
-## Install for Development
+Now that we have initialized the pipeline, we are ready to use it to detect anomalies:
 
-If you want to contribute to the project, a few more steps are required to make the project ready
-for development.
+```python3
+anomalies = sigllm.detect(data)
+```
+> :warning: Depending on the length of your timeseries, this might take time to run.
 
-Please head to the [Contributing Guide](https://sintel-dev.github.io/sigllm/contributing.html#get-started)
-for more details about this process.
+The output of the previous command will be a ``pandas.DataFrame`` containing a table of detected anomalies:
 
-# Quickstart
+```
+        start         end  severity
+0  1225864800  1227139200  0.625879
+```
+
+# Resources
+
+Additional resources that might be of interest:
+* Learn about [Orion](https://github.com/sintel-dev/Orion).
+* Read our [paper](https://arxiv.org/pdf/2405.14755).
 
-In this short tutorial we will guide you through a series of steps that will help you
-getting started with **sigllm**.
 
-TODO: Create a step by step guide here.
+# Citation
 
-# What's next?
+If you use **SigLLM** for your research, please consider citing the following paper:
 
-For more details about **sigllm** and all its possibilities
-and features, please check the [documentation site](
-https://sintel-dev.github.io/sigllm/).
+Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni. [Can Large Language Models be Anomaly Detectors for Time Series?](https://arxiv.org/pdf/2405.14755).
+
+```
+@inproceedings{alnegheimish2024sigllm,
+  title={Can Large Language Models be Anomaly Detectors for Time Series?},
+  author={Alnegheimish, Sarah and Nguyen, Linh and Berti-Equille, Laure and Veeramachaneni, Kalyan},
+  booktitle={2024 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA)},
+  organization={IEEE},
+  year={2024}
+}
+```

setup.cfg

+1-1
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.0.1
+current_version = 0.0.2.dev1
 commit = True
 tag = True
 parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\.(?P<release>[a-z]+)(?P<candidate>\d+))?
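
To see how the `parse` pattern splits the new `0.0.2.dev1` version string, here is a minimal sketch using Python's standard `re` module. The helper name `parse_version` is illustrative only and is not part of bumpversion or this repository. Using `fullmatch` is also a choice made for this sketch.

```python
import re

# Pattern copied from the `parse` setting in setup.cfg.
PARSE = re.compile(
    r'(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)'
    r'(\.(?P<release>[a-z]+)(?P<candidate>\d+))?'
)


def parse_version(version):
    """Return the named parts of a version string, or None if it does not match.

    Hypothetical helper for illustration; not a bumpversion API.
    """
    match = PARSE.fullmatch(version)
    return match.groupdict() if match else None


dev = parse_version('0.0.2.dev1')   # release and candidate are populated
stable = parse_version('0.0.1')     # optional suffix absent, so both are None
```

The optional trailing group is what lets the same pattern accept both the development string and a plain `major.minor.patch` release.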

setup.py

+2-2
@@ -112,11 +112,11 @@
     keywords='sigllm sigllm sigllm',
     name='sigllm',
     packages=find_packages(include=['sigllm', 'sigllm.*']),
-    python_requires='>=3.8',
+    python_requires='>=3.8,<3.12',
     setup_requires=setup_requires,
     test_suite='tests',
     tests_require=tests_require,
     url='https://github.com/sintel-dev/sigllm',
-    version='0.0.1',
+    version='0.0.2.dev1',
     zip_safe=False,
 )

sigllm/__init__.py

+1-1
@@ -4,7 +4,7 @@
 
 __author__ = 'MIT Data To AI Lab'
 __email__ = '[email protected]'
-__version__ = '0.0.1'
+__version__ = '0.0.2.dev1'
 
 import os
 

sigllm/core.py

+4-2
@@ -45,6 +45,9 @@ class SigLLM(Orion):
     DEFAULT_PIPELINE = 'mistral_detector'
 
     def _augment_hyperparameters(self, primitive, key, value):
+        if not value:
+            return
+
         if self._hyperparameters is None:
             self._hyperparameters = {
                 primitive: {}
@@ -53,8 +56,7 @@ def _augment_hyperparameters(self, primitive, key, value):
         if primitive not in self._hyperparameters:
             self._hyperparameters[primitive] = {}
 
-        if value:
-            self._hyperparameters[primitive][key] = value
+        self._hyperparameters[primitive][key] = value
 
     def __init__(self, pipeline: Union[str, dict, MLPipeline] = None, interval: int = None,
                  decimal: int = None, window_size: int = None, hyperparameters: dict = None):
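
The effect of moving the falsy check to the top of `_augment_hyperparameters` can be illustrated with a small standalone sketch. The class `HyperparameterStore` and the primitive names below are hypothetical stand-ins that mirror only the method body from this hunk; they are not the real `SigLLM` class, and how `__init__` calls the method is not shown in the diff.

```python
class HyperparameterStore:
    """Hypothetical stand-in that mirrors only the method changed in this hunk."""

    def __init__(self, hyperparameters=None):
        self._hyperparameters = hyperparameters

    def _augment_hyperparameters(self, primitive, key, value):
        # Early return as in the new code: falsy values leave the store untouched,
        # so no empty per-primitive dict is created as a side effect.
        if not value:
            return

        if self._hyperparameters is None:
            self._hyperparameters = {primitive: {}}

        if primitive not in self._hyperparameters:
            self._hyperparameters[primitive] = {}

        self._hyperparameters[primitive][key] = value


store = HyperparameterStore()

# An unset argument no longer creates an empty entry for the primitive.
store._augment_hyperparameters('primitive_a#1', 'interval', None)
after_none = store._hyperparameters

# A real value is stored as before.
store._augment_hyperparameters('primitive_a#1', 'interval', 21600)

# The check is `not value`, so 0 is skipped just like None.
store._augment_hyperparameters('primitive_b#1', 'decimal', 0)
result = store._hyperparameters
```

In this sketch the store stays `None` after the first call, whereas the previous ordering would have left `{'primitive_a#1': {}}` behind. Note that any falsy value, including `0`, takes the early return.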
@@ -0,0 +1,65 @@
+{
+    "primitives": [
+        "mlstars.custom.timeseries_preprocessing.time_segments_aggregate",
+        "sklearn.impute.SimpleImputer",
+        "sigllm.primitives.transformation.Float2Scalar",
+        "sigllm.primitives.prompting.timeseries_preprocessing.rolling_window_sequences",
+        "sigllm.primitives.transformation.format_as_string",
+        "sigllm.primitives.prompting.gpt.GPT",
+        "sigllm.primitives.transformation.format_as_integer",
+        "sigllm.primitives.prompting.anomalies.val2idx",
+        "sigllm.primitives.prompting.anomalies.find_anomalies_in_windows",
+        "sigllm.primitives.prompting.anomalies.merge_anomalous_sequences",
+        "sigllm.primitives.prompting.anomalies.format_anomalies"
+    ],
+    "init_params": {
+        "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
+            "time_column": "timestamp",
+            "interval": 21600,
+            "method": "mean"
+        },
+        "sigllm.primitives.transformation.Float2Scalar#1": {
+            "decimal": 2,
+            "rescale": true
+        },
+        "sigllm.primitives.prompting.timeseries_preprocessing.rolling_window_sequences#1": {
+            "window_size": 200,
+            "step_size": 40
+        },
+        "sigllm.primitives.transformation.format_as_string#1": {
+            "space": true
+        },
+        "sigllm.primitives.prompting.gpt.GPT#1": {
+            "name": "gpt-3.5-turbo",
+            "samples": 10
+        },
+        "sigllm.primitives.prompting.anomalies.find_anomalies_in_windows#1": {
+            "alpha": 0.4
+        },
+        "sigllm.primitives.prompting.anomalies.merge_anomalous_sequences#1": {
+            "beta": 0.5
+        }
+    },
+    "input_names": {
+        "sigllm.primitives.prompting.gpt.GPT#1": {
+            "X": "X_str"
+        },
+        "sigllm.primitives.transformation.format_as_integer#1":{
+            "X": "y_hat"
+        }
+    },
+    "output_names": {
+        "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
+            "index": "timestamp"
+        },
+        "sigllm.primitives.transformation.format_as_string#1": {
+            "X": "X_str"
+        },
+        "sigllm.primitives.prompting.gpt.GPT#1": {
+            "y": "y_hat"
+        },
+        "sigllm.primitives.transformation.format_as_integer#1":{
+            "X": "y"
+        }
+    }
+}
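
The `rolling_window_sequences` settings in this pipeline (`window_size` of 200, `step_size` of 40) imply heavily overlapping windows. The following is a generic sliding-window sketch of that idea, not the sigllm primitive itself; the function name `rolling_windows` and its return shape are assumptions made for illustration.

```python
def rolling_windows(values, window_size=200, step_size=40):
    """Split a sequence into overlapping fixed-size windows.

    Hypothetical helper: returns the windows and the start index of each one.
    Defaults are taken from the pipeline JSON above.
    """
    if len(values) < window_size:
        raise ValueError('sequence is shorter than one window')

    # Start a new window every `step_size` points, keeping only full windows.
    starts = list(range(0, len(values) - window_size + 1, step_size))
    windows = [values[start:start + window_size] for start in starts]
    return windows, starts


signal = list(range(1000))             # stand-in for an aggregated signal
windows, starts = rolling_windows(signal)

# Consecutive windows share window_size - step_size points.
overlap = len(set(windows[0]) & set(windows[1]))
```

With these defaults a 1000-point signal yields 21 windows, and each window shares 160 points with its neighbor, so most points are seen several times by the model.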
