
Commit e811cee

feat: BROS-85: Add timeseries segmentation backend example (#782)
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
1 parent 7d81714 commit e811cee

19 files changed: +3852 -3 lines

.rules/new_models_best_practice.mdc

Lines changed: 1217 additions & 0 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ Check the **Required parameters** column to see if you need to set any additiona
  | [sklearn_text_classifier](/label_studio_ml/examples/sklearn_text_classifier) | Text classification with [scikit-learn](https://scikit-learn.org/stable/) |||| None | Arbitrary |
  | [spacy](/label_studio_ml/examples/spacy) | NER by [SpaCy](https://spacy.io/) |||| None | Set [(see documentation)](https://spacy.io/usage/linguistic-features) |
  | [tesseract](/label_studio_ml/examples/tesseract) | Interactive OCR. [Details](https://github.com/tesseract-ocr/tesseract) |||| None | Set (characters) |
+ | [timeseries_segmenter](/label_studio_ml/examples/timeseries_segmenter) | Time series segmentation using a small LSTM network |||| None | Set |
  | [watsonX](/label_studio_ml/examples/watsonx) | LLM inference with [WatsonX](https://www.ibm.com/products/watsonx-ai) and integration with [WatsonX.data](watsonx.data) |||| None | Arbitrary |
  | [yolo](/label_studio_ml/examples/yolo) | All YOLO tasks are supported: [YOLO](https://docs.ultralytics.com/tasks/) |||| None | Arbitrary |

label_studio_ml/examples/gliner/docker-compose.yml

Lines changed: 2 additions & 0 deletions
@@ -32,6 +32,8 @@ services:
       # Determine the actual IP using 'ifconfig' (Linux/Mac) or 'ipconfig' (Windows).
       - LABEL_STUDIO_URL=http://host.docker.internal:8080
       - LABEL_STUDIO_API_KEY=
+    extra_hosts:
+      - "host.docker.internal:host-gateway" # for macos and unix
     ports:
       - "9090:9090"
     volumes:

label_studio_ml/examples/llm_interactive/docker-compose.yml

Lines changed: 2 additions & 0 deletions
@@ -45,6 +45,8 @@ services:
       # specify these parameters if you want to use basic auth for the model server
       - BASIC_AUTH_USER=
       - BASIC_AUTH_PASS=
+    extra_hosts:
+      - "host.docker.internal:host-gateway" # for macos and unix
     ports:
       - 9090:9090
     volumes:

label_studio_ml/examples/tesseract/docker-compose.yml

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,8 @@ services:
       - AWS_ACCESS_KEY_ID=your-MINIO_ROOT_USER
       - AWS_SECRET_ACCESS_KEY=your-MINIO_ROOT_PASSWORD
       - AWS_ENDPOINT=http://host.docker.internal:9000
+    extra_hosts:
+      - "host.docker.internal:host-gateway" # for macos and unix

  minio:
    container_name: minio
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
# syntax=docker/dockerfile:1
ARG PYTHON_VERSION=3.11

FROM python:${PYTHON_VERSION}-slim AS python-base
ARG TEST_ENV

WORKDIR /app

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PORT=${PORT:-9090} \
    PIP_CACHE_DIR=/.cache \
    WORKERS=1 \
    THREADS=8

# Update the base OS
RUN --mount=type=cache,target="/var/cache/apt",sharing=locked \
    --mount=type=cache,target="/var/lib/apt/lists",sharing=locked \
    set -eux; \
    apt-get update; \
    apt-get upgrade -y; \
    apt install --no-install-recommends -y \
        git; \
    apt-get autoremove -y

# install base requirements
COPY requirements-base.txt .
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    pip install -r requirements-base.txt

# install custom requirements
COPY requirements.txt .
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    pip install -r requirements.txt

# install test requirements if needed
COPY requirements-test.txt .
# build only when TEST_ENV="true"
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    if [ "$TEST_ENV" = "true" ]; then \
        pip install -r requirements-test.txt; \
    fi

COPY . .

EXPOSE 9090

CMD gunicorn --preload --bind :$PORT --workers $WORKERS --threads $THREADS --timeout 0 _wsgi:app
Lines changed: 308 additions & 0 deletions
@@ -0,0 +1,308 @@
# Time Series Segmenter for Label Studio

https://github.com/user-attachments/assets/9f6a7ebb-bf3e-42d5-bde9-087719494f2d

This example demonstrates a minimal ML backend that performs time series segmentation.
It trains a small LSTM neural network on labeled CSV data and predicts segments
for new tasks. The backend expects the labeling configuration to use
`<TimeSeries>` and `<TimeSeriesLabels>` tags.

## Before you begin

1. Install the [Label Studio ML backend](https://github.com/HumanSignal/label-studio-ml-backend?tab=readme-ov-file#quickstart).
2. Set `LABEL_STUDIO_HOST` and `LABEL_STUDIO_API_KEY` in `docker-compose.yml`
   so the backend can download labeled tasks for training.

## Quick start

```bash
# build and run
docker-compose up --build
```

A small example CSV is available in `tests/time_series.csv`.

Connect the model from the **Model** page in your project settings. The default
URL is `http://localhost:9090`.

## Labeling configuration

Use a configuration similar to the following:

```xml
<View>
  <TimeSeriesLabels name="label" toName="ts">
    <Label value="Run"/>
    <Label value="Walk"/>
  </TimeSeriesLabels>
  <TimeSeries name="ts" valueType="url" value="$csv_url" timeColumn="time">
    <Channel column="sensorone" />
    <Channel column="sensortwo" />
  </TimeSeries>
</View>
```

The backend reads the time column and channels to build feature vectors. Each
CSV referenced by `csv_url` must contain the time column and the channel
columns.
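For reference, a CSV compatible with the configuration above could be produced like this. The column names (`time`, `sensorone`, `sensortwo`) come from the config; the file name and sensor values are invented for illustration.

```python
# Illustrative sketch: generate a tiny CSV with the columns the config above expects.
# Only the column names are taken from the labeling configuration; values are synthetic.
import numpy as np
import pandas as pd

n = 200
df = pd.DataFrame({
    "time": np.arange(n),                      # matches timeColumn="time"
    "sensorone": np.sin(np.arange(n) / 10.0),  # matches <Channel column="sensorone"/>
    "sensortwo": np.cos(np.arange(n) / 7.0),   # matches <Channel column="sensortwo"/>
})
df.to_csv("example_time_series.csv", index=False)
print(df.head())
```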

## Annotation Types

The backend supports two types of time series segmentation annotations:

**Range Annotations**
- **Use case**: Events that have duration (e.g., "Running from 10s to 30s")
- **Created by**: Dragging across the time series to select a time range
- **Behavior**: `start` ≠ `end`, `instant` = `false`

**Instant Annotations**
- **Use case**: Point events or moments in time (e.g., "Fall detected at 15s")
- **Created by**: Double-clicking on a specific point in the time series
- **Behavior**: `start` = `end`, `instant` = `true`

**Note**: Instant labels often create highly imbalanced datasets since they represent brief moments within long time series. The model's **balanced learning approach** is specifically designed to handle this challenge effectively.
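For orientation, the two annotation types differ only in `start`/`end` and the `instant` flag. The dictionaries below sketch the general shape of Label Studio time series results; treat the exact field names as an assumption rather than the precise payload.

```python
# Hedged sketch of the two result shapes; field names are an assumption.
range_result = {
    "from_name": "label",
    "to_name": "ts",
    "type": "timeserieslabels",
    "value": {"start": 10.0, "end": 30.0, "instant": False,
              "timeserieslabels": ["Run"]},     # duration event: start != end
}

instant_result = {
    "from_name": "label",
    "to_name": "ts",
    "type": "timeserieslabels",
    "value": {"start": 15.0, "end": 15.0, "instant": True,
              "timeserieslabels": ["Walk"]},    # point event: start == end
}
```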

## Training

Training starts automatically when annotations are created or updated. The model uses a PyTorch-based LSTM neural network with proper temporal modeling and **balanced learning** to handle imbalanced time series data effectively.

### Training Process

The model follows these steps during training:

1. **Data Collection**: Fetches all labeled tasks from your Label Studio project
2. **Sample Generation**: Converts labeled time ranges into training samples:
   - **Background Class**: Unlabeled time periods are treated as "background" (class 0)
   - **Event Classes**: Your labeled segments (e.g., "Run", "Walk") become classes 1, 2, etc.
   - **Ground Truth Priority**: If multiple annotations exist for a task, ground truth annotations take precedence
3. **Balanced Model Training**: Fits a multi-layer LSTM network with:
   - **Class-weighted loss function** to handle imbalanced data (important for instant labels)
   - **Balanced accuracy monitoring** instead of regular accuracy
   - **Per-class F1 score tracking** to ensure all classes learn properly
   - Configurable sequence windows (default: 50 timesteps)
   - Dropout regularization for better generalization
   - Background class support for realistic time series modeling
4. **Model Persistence**: Saves trained model artifacts to `MODEL_DIR`
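A minimal sketch of step 2, under stated assumptions (the helper name and segment dict shape are invented for illustration): labeled ranges are rasterized into one class id per timestep, with unlabeled timesteps left as background (class 0).

```python
# Illustrative only: convert labeled time ranges into per-timestep class targets.
# Class 0 is background; event labels map to 1, 2, ... in a fixed order.
import numpy as np

def ranges_to_targets(times, segments, label2idx):
    """times: 1-D array of timestamps from the task CSV.
    segments: list of dicts like {"start": 10.0, "end": 30.0, "label": "Run"}.
    label2idx: e.g. {"Run": 1, "Walk": 2}; 0 is reserved for background."""
    targets = np.zeros(len(times), dtype=np.int64)       # background everywhere
    for seg in segments:
        mask = (times >= seg["start"]) & (times <= seg["end"])
        targets[mask] = label2idx[seg["label"]]          # overwrite with event class
    return targets

times = np.arange(100, dtype=float)
segments = [{"start": 10, "end": 30, "label": "Run"},
            {"start": 55, "end": 55, "label": "Walk"}]   # instant: start == end
y = ranges_to_targets(times, segments, {"Run": 1, "Walk": 2})
print(np.bincount(y))   # per-class counts, typically dominated by background
```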

### Training Configuration

You can customize training behavior with these environment variables:

**Basic Configuration:**
- `START_TRAINING_EACH_N_UPDATES`: How often to retrain (default: 1, trains on every annotation)
- `TRAIN_EPOCHS`: Number of training epochs (default: 1000)
- `SEQUENCE_SIZE`: Sliding window size for temporal context (default: 50)
- `HIDDEN_SIZE`: LSTM hidden layer size (default: 64)

**Balanced Learning (for Imbalanced Data):**
- `BALANCED_ACCURACY_THRESHOLD`: Stop training when balanced accuracy exceeds this (default: 0.85)
- `MIN_CLASS_F1_THRESHOLD`: Stop training when minimum per-class F1 exceeds this (default: 0.70)
- `USE_CLASS_WEIGHTS`: Enable class-weighted loss function (default: true)

The balanced learning approach is **especially important when using instant labels** (created by double-clicking on the time series), as these often create highly imbalanced datasets where background periods vastly outnumber event instances.
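For illustration only, the backend would presumably read these variables with their documented defaults along these lines; the parsing code itself is an assumption, only the variable names and defaults come from the list above.

```python
# Sketch: read the training-related environment variables with their documented defaults.
import os

START_TRAINING_EACH_N_UPDATES = int(os.getenv("START_TRAINING_EACH_N_UPDATES", "1"))
TRAIN_EPOCHS = int(os.getenv("TRAIN_EPOCHS", "1000"))
SEQUENCE_SIZE = int(os.getenv("SEQUENCE_SIZE", "50"))
HIDDEN_SIZE = int(os.getenv("HIDDEN_SIZE", "64"))
BALANCED_ACCURACY_THRESHOLD = float(os.getenv("BALANCED_ACCURACY_THRESHOLD", "0.85"))
MIN_CLASS_F1_THRESHOLD = float(os.getenv("MIN_CLASS_F1_THRESHOLD", "0.70"))
USE_CLASS_WEIGHTS = os.getenv("USE_CLASS_WEIGHTS", "true").lower() == "true"
```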

### Handling Imbalanced Data

Time series data is often highly imbalanced, especially when using instant labels:

**The Problem:**
- Background periods typically constitute 90%+ of the data
- Event instances (Run, Walk, etc.) are rare and brief
- Standard training approaches achieve high accuracy by simply predicting "background" everywhere
- Models fail to learn actual event patterns

**Our Solution:**
```
Class Weights: Automatically calculated inverse frequency weights
├── Background (Class 0): Low weight (e.g., 0.1x)
├── Run (Class 1): High weight (e.g., 5.0x)
└── Walk (Class 2): High weight (e.g., 4.0x)

Early Stopping: Dual criteria prevent premature stopping
├── Balanced Accuracy ≥ 85% (macro-averaged across classes)
└── Minimum Class F1 ≥ 70% (worst-performing class must be decent)

Metrics: Focus on per-class performance
├── Balanced Accuracy: Equal weight to each class
├── Macro F1: Average F1 across all classes
└── Per-class F1: Individual class performance tracking
```

This ensures the model learns to detect actual events rather than just predicting background.
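A minimal sketch of this idea, not the backend's exact implementation: inverse-frequency class weights feed a weighted cross-entropy loss, and the dual stopping criteria watch balanced accuracy plus the worst per-class F1 (PyTorch and scikit-learn assumed available).

```python
# Illustrative sketch of balanced learning on imbalanced per-timestep labels.
import numpy as np
import torch
from sklearn.metrics import balanced_accuracy_score, f1_score

y = np.array([0] * 900 + [1] * 60 + [2] * 40)        # heavily imbalanced targets

# Inverse-frequency class weights: rare classes get larger weights.
counts = np.bincount(y, minlength=3).astype(float)
weights = counts.sum() / (len(counts) * counts)
criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))

# Dual early-stopping criteria evaluated on validation predictions.
y_pred = y.copy()                                     # stand-in for model output
bal_acc = balanced_accuracy_score(y, y_pred)
per_class_f1 = f1_score(y, y_pred, average=None)
if bal_acc >= 0.85 and per_class_f1.min() >= 0.70:
    print("stop training: balanced accuracy and worst-class F1 both above threshold")
```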

### Ground Truth Handling

When multiple annotations exist for the same task, the model prioritizes ground truth annotations:
- Non-ground truth annotations are processed first
- Ground truth annotations override previous labels and stop processing for that task
- This ensures the highest quality labels are used for training
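That override logic could look roughly like the following sketch; the `ground_truth` flag mirrors Label Studio's annotation field, while the `segments` key and function name are invented for the example.

```python
# Illustrative only: process a task's annotations so that a ground-truth
# annotation overrides everything seen before it and ends processing.
def collect_segments(annotations):
    segments = []
    # False sorts before True, so non-ground-truth annotations come first.
    for ann in sorted(annotations, key=lambda a: a.get("ground_truth", False)):
        if ann.get("ground_truth", False):
            return ann["segments"]          # ground truth wins outright
        segments.extend(ann["segments"])    # otherwise accumulate labels
    return segments
```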

## Prediction

The model processes new time series data by applying the trained LSTM classifier with sliding window temporal context. Only meaningful event segments are returned to Label Studio, filtering out background periods automatically.

### Prediction Process

For each task, the model performs these steps:

1. **Model Loading**: Loads the trained PyTorch model from disk
2. **Data Processing**: Reads the task CSV and creates feature vectors from sensor channels
3. **Temporal Prediction**: Applies the LSTM with sliding windows for temporal context:
   - Uses overlapping windows with 50% overlap for smoother predictions
   - Averages predictions across overlapping windows
   - Maintains temporal dependencies between timesteps
4. **Segment Extraction**: Groups consecutive predictions into meaningful segments:
   - **Background Filtering**: Automatically filters out background (unlabeled) periods
   - **Event Segmentation**: Only returns segments with actual event labels
   - **Instant Detection**: Automatically sets `instant=true` for point events (start = end, single-sample events that you can label with a double click) and `instant=false` for ranges
   - **Score Calculation**: Averages prediction confidence per segment
5. **Result Formatting**: Returns segments in Label Studio JSON format with proper `instant` field values
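The windowing and grouping described above could look roughly like the following sketch. It assumes the model maps a `(batch, timesteps, features)` tensor to per-timestep class logits; function and variable names are illustrative, not the backend's actual API.

```python
# Illustrative sketch: overlapping-window LSTM inference, then grouping
# consecutive non-background predictions into segments.
import numpy as np
import torch

def predict_probs(model, features, window=50, num_classes=3):
    """Average per-timestep class probabilities over windows with 50% overlap."""
    n = len(features)
    probs = np.zeros((n, num_classes))
    hits = np.zeros(n)
    step = max(window // 2, 1)
    model.eval()
    with torch.no_grad():
        for start in range(0, n, step):
            end = min(start + window, n)
            x = torch.tensor(features[start:end], dtype=torch.float32).unsqueeze(0)
            out = torch.softmax(model(x), dim=-1).squeeze(0).numpy()  # (end-start, C)
            probs[start:end] += out
            hits[start:end] += 1
            if end == n:
                break
    return probs / hits[:, None]

def extract_segments(times, probs, idx2label):
    """Group consecutive identical predictions, dropping background (class 0)."""
    pred = probs.argmax(axis=1)
    segments, start = [], 0
    for i in range(1, len(pred) + 1):
        if i == len(pred) or pred[i] != pred[start]:
            cls = int(pred[start])
            if cls != 0:  # skip background
                segments.append({
                    "start": float(times[start]),
                    "end": float(times[i - 1]),
                    "instant": start == i - 1,                # single-sample event
                    "label": idx2label[cls],
                    "score": float(probs[start:i, cls].mean()),
                })
            start = i
    return segments
```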

### Prediction Quality

The model provides several quality indicators:

- **Per-segment Confidence**: Average prediction probability for each returned segment
- **Temporal Consistency**: Sliding window approach reduces prediction noise
- **Background Suppression**: Only returns segments where the model is confident about specific events

This approach ensures that predictions focus on actual events rather than forcing labels on every timestep.

## Project-Specific Models

The backend automatically handles multiple Label Studio projects by maintaining separate trained models for each project. This ensures proper isolation and prevents cross-project interference.

### How Project Isolation Works

**Model Storage:**
- Each project gets its own model file: `model_project_{project_id}.pt`
- Example: Project 47 → `model_project_47.pt`, Project 123 → `model_project_123.pt`
- Default fallback for backward compatibility: `model_project_0.pt`

**Model Training:**
- Training events automatically identify the source project ID
- Models are trained and saved with project-specific names
- Each project's model only learns from that project's annotations

**Model Prediction:**
- The backend automatically detects which project's model to use
- Project ID is extracted from task context or prediction request
- Falls back to default model (project_id=0) if no project information is available
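For illustration, resolving the per-project file could look like this sketch; only the `model_project_{project_id}.pt` pattern and the project-0 fallback come from the description above.

```python
# Sketch only: resolve the per-project model path, falling back to project 0.
import os

def model_path(model_dir: str, project_id) -> str:
    pid = int(project_id) if project_id is not None else 0   # default/fallback project
    return os.path.join(model_dir, f"model_project_{pid}.pt")

print(model_path("/app/models", 47))    # /app/models/model_project_47.pt
print(model_path("/app/models", None))  # /app/models/model_project_0.pt
```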

### Multi-Tenant Benefits

This architecture provides several key advantages:

**Data Isolation:**
- Project A's sensitive medical data never trains Project B's financial model
- Each project can have completely different labeling configurations
- Models can't accidentally predict wrong label types from other projects

**Performance Independence:**
- Training on one project doesn't affect prediction quality for other projects
- Each project's model optimizes specifically for that project's data characteristics
- Poor annotations in one project won't degrade other projects' models

**Scalability:**
- The backend can serve multiple Label Studio projects simultaneously
- Memory management keeps frequently used models cached
- Inactive project models are loaded on demand

### Configuration

No additional configuration is required; project isolation works automatically. The backend determines project context from:

1. **Training**: Project ID from annotation webhook events
2. **Prediction**: Project ID from task context or request metadata
3. **Fallback**: Uses the default project_id=0 for backward compatibility

This seamless multi-tenant support makes the backend suitable for enterprise Label Studio deployments where multiple teams or clients need isolated ML models.

## How it works

### Training Pipeline

```mermaid
flowchart TD
    A[Annotation Event] --> B{Training Trigger?}
    B -- no --> C[Skip Training]
    B -- yes --> D[Fetch Labeled Tasks]
    D --> E[Process Annotations]
    E --> F{Ground Truth?}
    F -- yes --> G[Priority Processing]
    F -- no --> H[Standard Processing]
    G --> I[Generate Samples]
    H --> I
    I --> J[Background + Event Labels]
    J --> K[PyTorch LSTM Training]
    K --> L[Model Validation]
    L --> M[Save Model]
    M --> N[Cache in Memory]
```

### Prediction Pipeline

```mermaid
flowchart TD
    T[Prediction Request] --> U[Load PyTorch Model]
    U --> V[Read Task CSV]
    V --> W[Extract Features]
    W --> X[Sliding Window LSTM]
    X --> Y[Overlap Averaging]
    Y --> Z[Filter Background]
    Z --> AA[Group Event Segments]
    AA --> BB[Calculate Confidence]
    BB --> CC[Return Segments]
```

### Key Technical Features

- **PyTorch-based LSTM**: Modern deep learning framework with better performance and flexibility
- **Temporal Modeling**: Sliding windows capture time dependencies (default 50 timesteps)
- **Background Class**: Realistic modeling where unlabeled periods are explicit background
- **Balanced Learning**: Class-weighted loss function and balanced metrics for imbalanced data
- **Instant Label Support**: Proper handling of point events (`instant=true`) vs. duration events (`instant=false`)
- **Smart Early Stopping**: Dual criteria (balanced accuracy + minimum per-class F1) prevent premature stopping
- **Ground Truth Priority**: Ensures the highest quality annotations are used for training
- **Overlap Averaging**: Smoother predictions through overlapping window consensus
- **Project-Specific Models**: Each Label Studio project gets its own trained model for proper multi-tenant isolation
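To make the described architecture concrete, here is a minimal PyTorch sketch of a per-timestep LSTM classifier with the defaults mentioned above (hidden size 64, dropout, background plus event classes). It is an illustration, not the backend's actual model class.

```python
# Illustrative sketch of a per-timestep LSTM segmenter (background + event classes).
import torch
import torch.nn as nn

class TimeSeriesSegmenter(nn.Module):
    def __init__(self, num_features, num_classes, hidden_size=64, num_layers=2, dropout=0.2):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden_size, num_layers=num_layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                      # x: (batch, timesteps, features)
        out, _ = self.lstm(x)                  # (batch, timesteps, hidden)
        return self.head(out)                  # per-timestep logits: (batch, timesteps, classes)

model = TimeSeriesSegmenter(num_features=2, num_classes=3)   # 2 channels; background + Run/Walk
logits = model(torch.randn(1, 50, 2))                        # one window of 50 timesteps
print(logits.shape)                                          # torch.Size([1, 50, 3])
```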

## Customize

Edit `docker-compose.yml` to set environment variables for your specific use case:

### Basic Configuration
```yaml
environment:
  - LABEL_STUDIO_HOST=http://localhost:8080
  - LABEL_STUDIO_API_KEY=your_api_key_here
  - MODEL_DIR=/app/models
  - START_TRAINING_EACH_N_UPDATES=1
  - TRAIN_EPOCHS=1000
  - SEQUENCE_SIZE=50
  - HIDDEN_SIZE=64
```

### Balanced Learning (Recommended for Instant Labels)
```yaml
environment:
  # ... basic config above ...
  - BALANCED_ACCURACY_THRESHOLD=0.85
  - MIN_CLASS_F1_THRESHOLD=0.70
  - USE_CLASS_WEIGHTS=true
```

### Common Scenarios

**For instant labels (point events):**
- Keep balanced learning enabled (`USE_CLASS_WEIGHTS=true`)
- Consider lower thresholds (`MIN_CLASS_F1_THRESHOLD=0.60`) for very rare events
- Increase epochs (`TRAIN_EPOCHS=2000`) for better minority class learning

**For range annotations with balanced data:**
- Disable class weights (`USE_CLASS_WEIGHTS=false`) if classes are roughly equal
- Use standard accuracy thresholds

**For short time series:**
- Reduce sequence size (`SEQUENCE_SIZE=20`) for sequences shorter than 50 timesteps
- Reduce hidden size (`HIDDEN_SIZE=32`) to prevent overfitting
