---
layout: posts
classes: wide
title: "Scenes-with-text Detection (v7.0)"
date: 2024-10-29T02:34:29+00:00
---

## About this version

- Submitter: [keighrim](https://github.com/keighrim)
- Submission Time: 2024-10-29T02:34:29+00:00
- Prebuilt Container Image: [ghcr.io/clamsproject/app-swt-detection:v7.0](https://github.com/clamsproject/app-swt-detection/pkgs/container/app-swt-detection/v7.0)
- Release Notes

  > This version re-implements the stitcher based on `simple-timepoints-stitcher`.
  > - the app can now run in stitch-only mode (by toggling the `useClassifier` and `useStitcher` parameters)
  > - the `simple-timepoints-stitcher` app will be retired
  > - all parameters are now prefixed with their corresponding mode (e.g., `sampleRate` → `tpSampleRate`, `minTPScore` → `tfMinTPScore`); a migration sketch follows these notes
  > - changes to parameters
  >   - `minTFCount` (frame count-based) became `tfMinTFDuration` (time-based)
  >   - `map` became `tfLabelMap` to clarify what "map" the param sets
  >   - `tfDynamicSceneLabels` was added to configure dynamic scene types that need multiple representative images/timepoints (defaults to [`credit`, `credits`])
  > - changes to app behavior
  >   - the new stitcher implementation is not exactly the same as the old one, and users should expect more "break-ups" in the middle of long time frames
  >   - for dynamic scene types, the gap between representative time points is now twice the `tfMinTFDuration` value
  >   - image classification is now done in batches (currently fixed at size 2000) to reduce memory usage; this adds some time overhead to the image extraction process

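For pipelines built against an older version, the renames above are mechanical enough to script. Below is a minimal sketch of such a translation for a v6.x-style parameter dict; the helper name `upgrade_params` is hypothetical, some pairs (marked in comments) are assumed from the "prefix all parameters" rule rather than listed explicitly, and `minTFCount` is deliberately not auto-renamed because its unit changed.

```python
# Hypothetical helper translating v6.x SWT parameter names to their
# v7.0 equivalents. Rename pairs come from the release notes and the
# v7.0 parameter list; minTFCount is excluded because its unit changed
# (frame count -> milliseconds), so a blind rename would be wrong.
RENAMES = {
    "sampleRate": "tpSampleRate",
    "startAt": "tpStartAt",    # assumption: follows the same prefix rule
    "stopAt": "tpStopAt",      # assumption: follows the same prefix rule
    "minTPScore": "tfMinTPScore",
    "minTFScore": "tfMinTFScore",
    "map": "tfLabelMap",
}

def upgrade_params(old_params: dict) -> dict:
    """Return a copy of old_params with v6.x keys renamed for v7.0."""
    upgraded = {}
    for key, value in old_params.items():
        if key == "minTFCount":
            raise ValueError("minTFCount was frame-based; convert it to "
                             "tfMinTFDuration (milliseconds) by hand")
        upgraded[RENAMES.get(key, key)] = value
    return upgraded

print(upgrade_params({"sampleRate": 500, "minTFScore": 0.5}))
# -> {'tpSampleRate': 500, 'tfMinTFScore': 0.5}
```
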
## About this app (See raw [metadata.json](metadata.json))

**Detects scenes with text, like slates, chyrons, and credits. This app can run in three modes, depending on the `useClassifier` and `useStitcher` parameters. When `useClassifier=True`, it runs in "TimePoint mode" and generates TimePoint annotations. When `useStitcher=True`, it runs in "TimeFrame mode" and generates TimeFrame annotations based on existing TimePoint annotations -- if no TimePoint is found, it produces an error. By default, it runs in "both" mode, first generating TimePoint annotations and then TimeFrame annotations on top of them.**

- App ID: [http://apps.clams.ai/swt-detection/v7.0](http://apps.clams.ai/swt-detection/v7.0)
- App License: Apache 2.0
- Source Repository: [https://github.com/clamsproject/app-swt-detection](https://github.com/clamsproject/app-swt-detection) ([source tree of the submitted version](https://github.com/clamsproject/app-swt-detection/tree/v7.0))

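Like other CLAMS apps, this app is deployed as an HTTP service that consumes and produces MMIF, with parameters passed as query strings. The following is a rough sketch of how the mode selection above maps onto a request; it assumes the prebuilt container is already running and published on port 5000 (e.g., `docker run --rm -p 5000:5000 ghcr.io/clamsproject/app-swt-detection:v7.0`), and the port and file names are illustrative.

```python
# Minimal sketch: POST a MMIF payload to a locally running SWT instance,
# selecting the mode via the useClassifier/useStitcher query parameters.
# Assumes the container listens on localhost:5000 and that input.mmif
# exists (for stitch-only mode it must already contain TimePoints).
import requests

with open("input.mmif") as f:   # illustrative input file
    mmif_in = f.read()

# "TimeFrame mode" (stitch-only); omit both params to run in "both" mode
resp = requests.post(
    "http://localhost:5000",
    params={"useClassifier": "false", "useStitcher": "true", "pretty": "true"},
    data=mmif_in,
)
resp.raise_for_status()
with open("output.mmif", "w") as f:
    f.write(resp.text)
```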

#### Inputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

- [http://mmif.clams.ai/vocabulary/VideoDocument/v1](http://mmif.clams.ai/vocabulary/VideoDocument/v1) (required)
(with any properties)

#### Configurable Parameters
(**Note**: _Multivalued_ means the parameter can have one or more values.)

- `useClassifier`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, **_`true`_**

  > Use the image classifier model to generate TimePoint annotations

- `tpModelName`: optional, defaults to `convnext_lg`
  - Type: string
  - Multivalued: False
  - Choices: **_`convnext_lg`_**, `convnext_tiny`

  > Model name to use for classification, only applies when `useClassifier=true`

- `tpUsePosModel`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, **_`true`_**

  > Use the model trained with positional features, only applies when `useClassifier=true`

- `tpStartAt`: optional, defaults to `0`
  - Type: integer
  - Multivalued: False

  > Number of milliseconds into the video to start processing, only applies when `useClassifier=true`

- `tpStopAt`: optional, defaults to `9223372036854775807`
  - Type: integer
  - Multivalued: False

  > Number of milliseconds into the video to stop processing, only applies when `useClassifier=true`

- `tpSampleRate`: optional, defaults to `1000`
  - Type: integer
  - Multivalued: False

  > Milliseconds between sampled frames, only applies when `useClassifier=true`

- `useStitcher`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, **_`true`_**

  > Use the stitcher after classifying the TimePoints

- `tfMinTPScore`: optional, defaults to `0.01`
  - Type: number
  - Multivalued: False

  > Minimum score for a TimePoint to be included in a TimeFrame, only applies when `useStitcher=true`

- `tfMinTFScore`: optional, defaults to `0.5`
  - Type: number
  - Multivalued: False

  > Minimum score for a TimeFrame, only applies when `useStitcher=true`

- `tfMinTFDuration`: optional, defaults to `2000`
  - Type: integer
  - Multivalued: False

  > Minimum duration of a TimeFrame in milliseconds, only applies when `useStitcher=true`

- `tfAllowOverlap`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, **_`true`_**

  > Allow overlapping time frames, only applies when `useStitcher=true`

- `tfDynamicSceneLabels`: optional, defaults to `['credit', 'credits']`
  - Type: string
  - Multivalued: True

  > Labels that are considered dynamic scenes. For dynamic scenes, TimeFrame annotations contain multiple representative points to follow any changes in the scene. Only applies when `useStitcher=true`

- `tfLabelMap`: optional, defaults to `['B:bars', 'S:slate', 'I:chyron', 'N:chyron', 'Y:chyron', 'C:credits', 'R:credits', 'W:other_opening', 'L:other_opening', 'O:other_opening', 'M:other_opening', 'E:other_text', 'K:other_text', 'G:other_text', 'T:other_text', 'F:other_text']`
  - Type: map
  - Multivalued: True

  > Mapping from a label in the input annotations to a new label. Must be formatted as IN_LABEL:OUT_LABEL (with a colon). To pass multiple mappings, use this parameter multiple times (see the request sketch after this list). By default, all input labels are passed through as-is, including any negative labels (the default is no remapping at all). However, once at least one label is remapped, every other "unset" label is discarded as a negative label. Only applies when `useStitcher=true`

- `pretty`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The JSON body of the HTTP response will be re-formatted with 2-space indentation

- `runningTime`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The running time of the app will be recorded in the view metadata

- `hwFetch`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata

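Because `tfLabelMap` is multivalued, each mapping is passed by repeating the query key. A short sketch using Python's `requests`, which expands a list value into repeated keys; the URL, file name, and label choices are illustrative:

```python
# Repeated query keys carry multivalued parameters; requests expands a
# list value into ...&tfLabelMap=S:slate&tfLabelMap=I:chyron&... form.
import requests

params = {
    "useStitcher": "true",
    # remap slates and chyrons only; per the tfLabelMap description,
    # all other labels are then discarded as negative
    "tfLabelMap": ["S:slate", "I:chyron", "N:chyron", "Y:chyron"],
    "tfMinTFDuration": 3000,
}
with open("input.mmif") as f:   # illustrative input file
    resp = requests.post("http://localhost:5000", params=params, data=f.read())
print(resp.status_code)
```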

#### Outputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

(**Note**: Not all output annotations are always generated.)

- [http://mmif.clams.ai/vocabulary/TimeFrame/v5](http://mmif.clams.ai/vocabulary/TimeFrame/v5)
  - _timeUnit_ = "milliseconds"

- [http://mmif.clams.ai/vocabulary/TimePoint/v4](http://mmif.clams.ai/vocabulary/TimePoint/v4)
  - _timeUnit_ = "milliseconds"
  - _labelset_ = a list of ["B", "S", "W", "L", "O", "M", "I", "N", "E", "P", "Y", "K", "G", "T", "F", "C", "R"]

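To consume these outputs programmatically, the `mmif-python` SDK can iterate over the annotation types listed above. A rough sketch, assuming an `output.mmif` produced by a run with the stitcher enabled; treat the exact accessor names as approximate rather than a definitive API reference.

```python
# Sketch: summarize TimeFrame annotations from a saved output MMIF
# (pip install mmif-python); the file name is illustrative.
from mmif import Mmif, AnnotationTypes

with open("output.mmif") as f:
    mmif_obj = Mmif(f.read())

for view in mmif_obj.get_views_contain(AnnotationTypes.TimeFrame):
    for tf in view.get_annotations(AnnotationTypes.TimeFrame):
        # each TimeFrame carries a scene label (per tfLabelMap) and,
        # for dynamic scenes, several representative TimePoints
        print(tf.get_property("label"), tf.get_property("representatives"))
```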