
Commit 255dedb (1 parent: f4209bb)
Author: clams-bot

adding metadata of swt-detection.v7.0

File tree: 5 files changed (+435 / -59 lines changed)

+180
@@ -0,0 +1,180 @@
---
layout: posts
classes: wide
title: "Scenes-with-text Detection (v7.0)"
date: 2024-10-29T02:34:29+00:00
---
## About this version

- Submitter: [keighrim](https://github.com/keighrim)
- Submission Time: 2024-10-29T02:34:29+00:00
- Prebuilt Container Image: [ghcr.io/clamsproject/app-swt-detection:v7.0](https://github.com/clamsproject/app-swt-detection/pkgs/container/app-swt-detection/v7.0)
- Release Notes

> This version re-implements the stitcher based on `simple-timepoints-stitcher`
> - the app can now run in stitch-only mode (via the `useClassifier` and `useStitcher` parameters)
> - the simple-timepoints-stitcher app will retire
> - prefixed all parameters with their corresponding modes (e.g., `sampleRate` > `tpSampleRate`, `minTPScore` > `tfMinTPScore`)
> - changes to parameters
>   - `minTFCount` (frame count-based) became `tfMinTFDuration` (time-based)
>   - `map` became `tfLabelMap` to clarify what "map" the param sets
>   - `tfDynamicSceneLabels` is added to configure dynamic scene types that need multiple representative images/timepoints (defaults to [`credit`, `credits`])
> - changes to app behavior
>   - the new stitcher implementation is not exactly the same as the old one, and users should expect more "break-ups" in the middle of long time frames
>   - for dynamic scene types, the gap between representative time points is now twice the `tfMinTFDuration` value
>   - image classification is now done in batches (currently fixed to size 2000) to reduce memory usage; this adds some time overhead to the image extraction process

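The new mode switches and prefixed parameter names can be exercised over the app's HTTP interface. Below is a minimal sketch (Python with `requests`) of a stitch-only run against an assumed local deployment of the prebuilt container; the host, port, and input file name are illustrative assumptions, not part of this release.

```python
import requests

# Assumed local deployment of the prebuilt image, e.g.:
#   docker run --rm -p 5000:5000 ghcr.io/clamsproject/app-swt-detection:v7.0
APP_URL = "http://localhost:5000"

# Stitch-only mode (new in v7.0): classifier off, stitcher on.
# The input MMIF must already contain TimePoint annotations,
# otherwise the app reports an error.
params = {
    "useClassifier": "false",
    "useStitcher": "true",
    "tfMinTFDuration": "2000",  # renamed from the frame-based `minTFCount`
    "tfMinTPScore": "0.01",     # renamed from `minTPScore`
}

with open("timepoints.mmif") as f:  # assumed input file with TimePoints
    resp = requests.post(APP_URL, params=params, data=f.read())
resp.raise_for_status()
print(resp.text[:300])  # MMIF JSON with a new TimeFrame view
```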
## About this app (See raw [metadata.json](metadata.json))

**Detects scenes with text, like slates, chyrons and credits. This app can run in three modes, depending on `useClassifier`, `useStitcher` parameters. When `useClassifier=True`, it runs in the "TimePoint mode" and generates TimePoint annotations. When `useStitcher=True`, it runs in the "TimeFrame mode" and generates TimeFrame annotations based on existing TimePoint annotations -- if no TimePoint is found, it produces an error. By default, it runs in the 'both' mode and first generates TimePoint annotations and then TimeFrame annotations on them.**

- App ID: [http://apps.clams.ai/swt-detection/v7.0](http://apps.clams.ai/swt-detection/v7.0)
- App License: Apache 2.0
- Source Repository: [https://github.com/clamsproject/app-swt-detection](https://github.com/clamsproject/app-swt-detection) ([source tree of the submitted version](https://github.com/clamsproject/app-swt-detection/tree/v7.0))

#### Inputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

- [http://mmif.clams.ai/vocabulary/VideoDocument/v1](http://mmif.clams.ai/vocabulary/VideoDocument/v1) (required)
  (of any properties)

#### Configurable Parameters
(**Note**: _Multivalued_ means the parameter can have one or more values.)

- `useClassifier`: optional, defaults to `true`

    - Type: boolean
    - Multivalued: False
    - Choices: `false`, **_`true`_**

    > Use the image classifier model to generate TimePoint annotations

- `tpModelName`: optional, defaults to `convnext_lg`

    - Type: string
    - Multivalued: False
    - Choices: **_`convnext_lg`_**, `convnext_tiny`

    > model name to use for classification, only applies when `useClassifier=true`

- `tpUsePosModel`: optional, defaults to `true`

    - Type: boolean
    - Multivalued: False
    - Choices: `false`, **_`true`_**

    > Use the model trained with positional features, only applies when `useClassifier=true`

- `tpStartAt`: optional, defaults to `0`

    - Type: integer
    - Multivalued: False

    > Number of milliseconds into the video to start processing, only applies when `useClassifier=true`

- `tpStopAt`: optional, defaults to `9223372036854775807`

    - Type: integer
    - Multivalued: False

    > Number of milliseconds into the video to stop processing, only applies when `useClassifier=true`

- `tpSampleRate`: optional, defaults to `1000`

    - Type: integer
    - Multivalued: False

    > Milliseconds between sampled frames, only applies when `useClassifier=true`

- `useStitcher`: optional, defaults to `true`

    - Type: boolean
    - Multivalued: False
    - Choices: `false`, **_`true`_**

    > Use the stitcher after classifying the TimePoints

- `tfMinTPScore`: optional, defaults to `0.01`

    - Type: number
    - Multivalued: False

    > Minimum score for a TimePoint to be included in a TimeFrame, only applies when `useStitcher=true`

- `tfMinTFScore`: optional, defaults to `0.5`

    - Type: number
    - Multivalued: False

    > Minimum score for a TimeFrame, only applies when `useStitcher=true`

- `tfMinTFDuration`: optional, defaults to `2000`

    - Type: integer
    - Multivalued: False

    > Minimum duration of a TimeFrame in milliseconds, only applies when `useStitcher=true`

- `tfAllowOverlap`: optional, defaults to `true`

    - Type: boolean
    - Multivalued: False
    - Choices: `false`, **_`true`_**

    > Allow overlapping time frames, only applies when `useStitcher=true`

- `tfDynamicSceneLabels`: optional, defaults to `['credit', 'credits']`

    - Type: string
    - Multivalued: True

    > Labels that are considered dynamic scenes. For dynamic scenes, TimeFrame annotations contain multiple representative points to follow any changes in the scene. Only applies when `useStitcher=true`

- `tfLabelMap`: optional, defaults to `['B:bars', 'S:slate', 'I:chyron', 'N:chyron', 'Y:chyron', 'C:credits', 'R:credits', 'W:other_opening', 'L:other_opening', 'O:other_opening', 'M:other_opening', 'E:other_text', 'K:other_text', 'G:other_text', 'T:other_text', 'F:other_text']`

    - Type: map
    - Multivalued: True

    > Mapping of a label in the input annotations to a new label. Must be formatted as IN_LABEL:OUT_LABEL (with a colon). To pass multiple mappings, use this parameter multiple times. By default, all the input labels are passed as is, including any negative labels (with default value being no remapping at all). However, when at least one label is remapped, all the other "unset" labels are discarded as a negative label. Only applies when `useStitcher=true`

- `pretty`: optional, defaults to `false`

    - Type: boolean
    - Multivalued: False
    - Choices: **_`false`_**, `true`

    > The JSON body of the HTTP response will be re-formatted with 2-space indentation

- `runningTime`: optional, defaults to `false`

    - Type: boolean
    - Multivalued: False
    - Choices: **_`false`_**, `true`

    > The running time of the app will be recorded in the view metadata

- `hwFetch`: optional, defaults to `false`

    - Type: boolean
    - Multivalued: False
    - Choices: **_`false`_**, `true`

    > The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata

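The multivalued parameters above (`tfLabelMap`, `tfDynamicSceneLabels`) are passed by repeating the parameter once per value. A rough sketch, again assuming a local deployment at the same URL as the earlier example:

```python
import requests

APP_URL = "http://localhost:5000"  # assumed local deployment

# Passing a list as a value makes `requests` repeat the query parameter,
# e.g. ?tfLabelMap=S:slate&tfLabelMap=B:bars
# Per the `tfLabelMap` description, remapping only these labels means all
# other labels are treated as negative.
params = {
    "tfLabelMap": ["S:slate", "B:bars"],
    "tfDynamicSceneLabels": ["credit", "credits"],
}

with open("input.mmif") as f:  # assumed input file
    resp = requests.post(APP_URL, params=params, data=f.read())
print(resp.status_code)
```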
#### Outputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

(**Note**: Not all output annotations are always generated.)

- [http://mmif.clams.ai/vocabulary/TimeFrame/v5](http://mmif.clams.ai/vocabulary/TimeFrame/v5)
    - _timeUnit_ = "milliseconds"

- [http://mmif.clams.ai/vocabulary/TimePoint/v4](http://mmif.clams.ai/vocabulary/TimePoint/v4)
    - _timeUnit_ = "milliseconds"
    - _labelset_ = a list of ["B", "S", "W", "L", "O", "M", "I", "N", "E", "P", "Y", "K", "G", "T", "F", "C", "R"]
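Since MMIF is plain JSON, the TimeFrame results can be inspected without extra tooling; the sketch below walks the views of an assumed output file and prints the stitched frames. The exact property set may vary, so the lookups are deliberately defensive.

```python
import json

TIMEFRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v5"

with open("output.mmif") as f:  # assumed output of a run of the app
    mmif = json.load(f)

for view in mmif.get("views", []):
    for ann in view.get("annotations", []):
        if ann.get("@type") == TIMEFRAME:
            props = ann.get("properties", {})
            # start/end are in milliseconds, per the output spec above
            print(props.get("label"), props.get("start"), props.get("end"))
```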
+186
@@ -0,0 +1,186 @@
{
  "name": "Scenes-with-text Detection",
  "description": "Detects scenes with text, like slates, chyrons and credits. This app can run in three modes, depending on `useClassifier`, `useStitcher` parameters. When `useClassifier=True`, it runs in the \"TimePoint mode\" and generates TimePoint annotations. When `useStitcher=True`, it runs in the \"TimeFrame mode\" and generates TimeFrame annotations based on existing TimePoint annotations -- if no TimePoint is found, it produces an error. By default, it runs in the 'both' mode and first generates TimePoint annotations and then TimeFrame annotations on them.",
  "app_version": "v7.0",
  "mmif_version": "1.0.5",
  "app_license": "Apache 2.0",
  "identifier": "http://apps.clams.ai/swt-detection/v7.0",
  "url": "https://github.com/clamsproject/app-swt-detection",
  "input": [
    {
      "@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
      "required": true
    }
  ],
  "output": [
    {
      "@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v5",
      "properties": {
        "timeUnit": "milliseconds"
      }
    },
    {
      "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v4",
      "properties": {
        "timeUnit": "milliseconds",
        "labelset": ["B", "S", "W", "L", "O", "M", "I", "N", "E", "P", "Y", "K", "G", "T", "F", "C", "R"]
      }
    }
  ],
"parameters": [
49+
{
50+
"name": "useClassifier",
51+
"description": "Use the image classifier model to generate TimePoint annotations",
52+
"type": "boolean",
53+
"default": true,
54+
"multivalued": false
55+
},
56+
{
57+
"name": "tpModelName",
58+
"description": "model name to use for classification, only applies when `useClassifier=true`",
59+
"type": "string",
60+
"choices": [
61+
"convnext_lg",
62+
"convnext_tiny"
63+
],
64+
"default": "convnext_lg",
65+
"multivalued": false
66+
},
67+
{
68+
"name": "tpUsePosModel",
69+
"description": "Use the model trained with positional features, only applies when `useClassifier=true`",
70+
"type": "boolean",
71+
"default": true,
72+
"multivalued": false
73+
},
74+
{
75+
"name": "tpStartAt",
76+
"description": "Number of milliseconds into the video to start processing, only applies when `useClassifier=true`",
77+
"type": "integer",
78+
"default": 0,
79+
"multivalued": false
80+
},
81+
{
82+
"name": "tpStopAt",
83+
"description": "Number of milliseconds into the video to stop processing, only applies when `useClassifier=true`",
84+
"type": "integer",
85+
"default": 9223372036854775807,
86+
"multivalued": false
87+
},
88+
{
89+
"name": "tpSampleRate",
90+
"description": "Milliseconds between sampled frames, only applies when `useClassifier=true`",
91+
"type": "integer",
92+
"default": 1000,
93+
"multivalued": false
94+
},
95+
{
96+
"name": "useStitcher",
97+
"description": "Use the stitcher after classifying the TimePoints",
98+
"type": "boolean",
99+
"default": true,
100+
"multivalued": false
101+
},
102+
{
103+
"name": "tfMinTPScore",
104+
"description": "Minimum score for a TimePoint to be included in a TimeFrame, only applies when `useStitcher=true`",
105+
"type": "number",
106+
"default": 0.01,
107+
"multivalued": false
108+
},
109+
{
110+
"name": "tfMinTFScore",
111+
"description": "Minimum score for a TimeFrame, only applies when `useStitcher=true`",
112+
"type": "number",
113+
"default": 0.5,
114+
"multivalued": false
115+
},
116+
{
117+
"name": "tfMinTFDuration",
118+
"description": "Minimum duration of a TimeFrame in milliseconds, only applies when `useStitcher=true`",
119+
"type": "integer",
120+
"default": 2000,
121+
"multivalued": false
122+
},
123+
{
124+
"name": "tfAllowOverlap",
125+
"description": "Allow overlapping time frames, only applies when `useStitcher=true`",
126+
"type": "boolean",
127+
"default": true,
128+
"multivalued": false
129+
},
130+
{
131+
"name": "tfDynamicSceneLabels",
132+
"description": "Labels that are considered dynamic scenes. For dynamic scenes, TimeFrame annotations contains multiple representative points to follow any changes in the scene. Only applies when `useStitcher=true`",
133+
"type": "string",
134+
"default": [
135+
"credit",
136+
"credits"
137+
],
138+
"multivalued": true
139+
},
140+
{
141+
"name": "tfLabelMap",
142+
"description": "Mapping of a label in the input annotations to a new label. Must be formatted as IN_LABEL:OUT_LABEL (with a colon). To pass multiple mappings, use this parameter multiple times. By default, all the input labels are passed as is, including any negative labels (with default value being no remapping at all). However, when at least one label is remapped, all the other \"unset\" labels are discarded as a negative label. Only applies when `useStitcher=true`",
143+
"type": "map",
144+
"default": [
145+
"B:bars",
146+
"S:slate",
147+
"I:chyron",
148+
"N:chyron",
149+
"Y:chyron",
150+
"C:credits",
151+
"R:credits",
152+
"W:other_opening",
153+
"L:other_opening",
154+
"O:other_opening",
155+
"M:other_opening",
156+
"E:other_text",
157+
"K:other_text",
158+
"G:other_text",
159+
"T:other_text",
160+
"F:other_text"
161+
],
162+
"multivalued": true
163+
},
164+
{
165+
"name": "pretty",
166+
"description": "The JSON body of the HTTP response will be re-formatted with 2-space indentation",
167+
"type": "boolean",
168+
"default": false,
169+
"multivalued": false
170+
},
171+
{
172+
"name": "runningTime",
173+
"description": "The running time of the app will be recorded in the view metadata",
174+
"type": "boolean",
175+
"default": false,
176+
"multivalued": false
177+
},
178+
{
179+
"name": "hwFetch",
180+
"description": "The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata",
181+
"type": "boolean",
182+
"default": false,
183+
"multivalued": false
184+
}
185+
]
186+
}
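The `tfLabelMap` default above is a list of `IN_LABEL:OUT_LABEL` strings. As a small illustration of that format (not the app's internal code), the list collapses into a lookup table like so:

```python
# Default value of `tfLabelMap` as listed in metadata.json
raw_map = [
    "B:bars", "S:slate", "I:chyron", "N:chyron", "Y:chyron",
    "C:credits", "R:credits", "W:other_opening", "L:other_opening",
    "O:other_opening", "M:other_opening", "E:other_text", "K:other_text",
    "G:other_text", "T:other_text", "F:other_text",
]

# Each entry is IN_LABEL:OUT_LABEL, split on the first colon.
label_map = dict(entry.split(":", 1) for entry in raw_map)

assert label_map["S"] == "slate"
assert label_map["C"] == label_map["R"] == "credits"
print(label_map)
```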
@@ -0,0 +1,6 @@
{
  "time": "2024-10-29T02:34:29+00:00",
  "submitter": "keighrim",
  "image": "ghcr.io/clamsproject/app-swt-detection:v7.0",
  "releasenotes": "This version re-implements stitcher based on `simple-timepoints-stitcher`\n\n- app now can run stitch-only mode (`useClassifier` and `useStitcher`)\n- simple-timepoints-stitcher app will retire\n- prefixed all parameters with their corresponding modes (e.g., `sampleRate` > `tpSampleRate`, `minTPScore` > `tfMinTPScore`\n- changes to parameters\n - `minTFCount` (frame count-based) became `tfMinTFDuration` (time-based)\n - `map` became `tfLabelMap` to clarify what \"map\" the param sets\n - `tfDynamicSceneLabels` is added to configure dynamic scene types that need multiple representative images/timepoints (defaults to [`credit`, `credits`])\n- changes to app behavior\n - new stitcher implementation is not exactly the same as the old, and users should expect more \"break-ups\" in the middle of long time frames\n - for dynamic scene types, the gap between representative time points is now twice the `tfMinTFDuration` value\n - image classification is now done in batches (currently fixed to size 2000) to reduce memory usage. This will add some time overhead to image extraction process\n\n"
}
