
Commit b631243

Author: clams-bot

adding metadata of whisper-wrapper.v9

1 parent 1a6314b · commit b631243

File tree

5 files changed: +250 −39 lines changed
@@ -0,0 +1,105 @@
---
layout: posts
classes: wide
title: "Whisper Wrapper (v9)"
date: 2024-08-16T15:05:09+00:00
---

## About this version

- Submitter: [keighrim](https://github.com/keighrim)
- Submission Time: 2024-08-16T15:05:09+00:00
- Prebuilt Container Image: [ghcr.io/clamsproject/app-whisper-wrapper:v9](https://github.com/clamsproject/app-whisper-wrapper/pkgs/container/app-whisper-wrapper/v9)
- Release Notes

  > fixed inability to handle concurrent requests

## About this app (See raw [metadata.json](metadata.json))

**A CLAMS wrapper for Whisper-based ASR software originally developed by OpenAI.**

- App ID: [http://apps.clams.ai/whisper-wrapper/v9](http://apps.clams.ai/whisper-wrapper/v9)
- App License: Apache 2.0
- Source Repository: [https://github.com/clamsproject/app-whisper-wrapper](https://github.com/clamsproject/app-whisper-wrapper) ([source tree of the submitted version](https://github.com/clamsproject/app-whisper-wrapper/tree/v9))
- Analyzer Version: 20231117
- Analyzer License: MIT

#### Inputs

(**Note**: "*" as a property value means that the property is required but can be any value.)

One of the following is required: [
- [http://mmif.clams.ai/vocabulary/AudioDocument/v1](http://mmif.clams.ai/vocabulary/AudioDocument/v1) (required) (of any properties)
- [http://mmif.clams.ai/vocabulary/VideoDocument/v1](http://mmif.clams.ai/vocabulary/VideoDocument/v1) (required) (of any properties)
]
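
For orientation, a minimal input MMIF carrying a single AudioDocument can be assembled with nothing but the Python standard library. This is only an illustrative sketch: the envelope layout follows the MMIF 1.0.5 specification, and the document id, MIME type, and file path below are placeholder assumptions, not values prescribed by this app.

```python
import json

# A bare-bones MMIF envelope with one AudioDocument (sketch only; consult the
# MMIF 1.0.5 specification for the authoritative field definitions).
input_mmif = {
    "metadata": {"mmif": "http://mmif.clams.ai/1.0.5"},
    "documents": [
        {
            "@type": "http://mmif.clams.ai/vocabulary/AudioDocument/v1",
            "properties": {
                "id": "d1",                               # placeholder id
                "mime": "audio/mpeg",                     # adjust to the actual media
                "location": "file:///data/interview.mp3"  # placeholder path
            }
        }
    ],
    "views": []
}

with open("input.mmif", "w", encoding="utf-8") as f:
    json.dump(input_mmif, f, indent=2)
```
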
#### Configurable Parameters

(**Note**: _Multivalued_ means the parameter can have one or more values.)

- `modelSize`: optional, defaults to `tiny`

  - Type: string
  - Multivalued: False
  - Choices: **_`tiny`_**, `t`, `base`, `b`, `small`, `s`, `medium`, `m`, `large`, `l`, `large-v2`, `l2`, `large-v3`, `l3`

  > The size of the model to use. When `modelLang=en` is given, for non-`large` models, English-only models will be used instead of multilingual models for speed and accuracy. (For `large` models, English-only models are not available.) Sizes can also be given as aliases: tiny=t, base=b, small=s, medium=m, large=l, large-v2=l2, large-v3=l3.

- `modelLang`: required

  - Type: string
  - Multivalued: False

  > Language of the model to use. Accepts two- or three-letter ISO 639 language codes; however, Whisper only supports a subset of languages, and an error will be raised if the language is not supported. For the full list of supported languages, see https://github.com/openai/whisper/blob/20231117/whisper/tokenizer.py . In addition to the language code, a two-letter region code can be appended, e.g. "en-US" for US English. Note that the region code is kept only for compatibility and recording purposes; Whisper neither detects regional dialects nor uses the given one for transcription. When no language code is given, Whisper runs in language-detection mode and uses the first few seconds of the audio to detect the language.

- `pretty`: optional, defaults to `false`

  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The JSON body of the HTTP response will be re-formatted with 2-space indentation

- `runningTime`: optional, defaults to `false`

  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The running time of the app will be recorded in the view metadata

- `hwFetch`: optional, defaults to `false`

  - Type: boolean
  - Multivalued: False
  - Choices: **_`false`_**, `true`

  > The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata
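
In practice, CLAMS apps run as HTTP services, and runtime parameters like the ones above are normally passed as URL query parameters on a POST request whose body is the input MMIF. A minimal sketch, assuming the prebuilt v9 container is already running and listening on localhost port 5000 (host, port, and file names here are assumptions for illustration, not part of this metadata):

```python
import requests

# Read a MMIF file that already contains an AudioDocument or VideoDocument
# (see the Inputs section above).
with open("input.mmif", encoding="utf-8") as f:
    mmif_payload = f.read()

# Unspecified parameters fall back to the defaults listed above
# (modelSize=tiny, pretty=false, runningTime=false, hwFetch=false).
params = {
    "modelSize": "small",  # or an alias such as "s"
    "modelLang": "en",     # ISO 639 code; omit to let Whisper detect the language
    "pretty": "true",      # indent the returned MMIF JSON
}

resp = requests.post("http://localhost:5000/", params=params, data=mmif_payload)
resp.raise_for_status()

# The response body is a MMIF document with a new view holding the transcript.
with open("output.mmif", "w", encoding="utf-8") as f:
    f.write(resp.text)
```
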
#### Outputs

(**Note**: "*" as a property value means that the property is required but can be any value.)

(**Note**: Not all output annotations are always generated.)

- [http://mmif.clams.ai/vocabulary/TextDocument/v1](http://mmif.clams.ai/vocabulary/TextDocument/v1) (of any properties)
- [http://mmif.clams.ai/vocabulary/TimeFrame/v5](http://mmif.clams.ai/vocabulary/TimeFrame/v5)
  - _timeUnit_ = "milliseconds"
- [http://mmif.clams.ai/vocabulary/Alignment/v1](http://mmif.clams.ai/vocabulary/Alignment/v1) (of any properties)
- [http://vocab.lappsgrid.org/Token](http://vocab.lappsgrid.org/Token) (of any properties)
- [http://vocab.lappsgrid.org/Sentence](http://vocab.lappsgrid.org/Sentence) (of any properties)
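
To give a sense of how these outputs are consumed, here is a minimal sketch that walks the returned MMIF with the standard library and prints the transcript text and time frames. The field layout (`views`, `annotations`, `@type`, `properties`) follows the standard MMIF serialization; treat this as an illustration rather than part of the app metadata.

```python
import json

TEXT_DOC = "http://mmif.clams.ai/vocabulary/TextDocument/v1"
TIME_FRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v5"

with open("output.mmif", encoding="utf-8") as f:
    mmif = json.load(f)

for view in mmif.get("views", []):
    for ann in view.get("annotations", []):
        props = ann.get("properties", {})
        if ann.get("@type") == TEXT_DOC:
            # Transcript text produced by Whisper (stored as a LAPPS-style text value)
            print(props.get("text", {}).get("@value", ""))
        elif ann.get("@type") == TIME_FRAME:
            # start/end are in milliseconds, per the timeUnit property above
            print(props.get("start"), "-", props.get("end"))
```
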
@@ -0,0 +1,96 @@
{
  "name": "Whisper Wrapper",
  "description": "A CLAMS wrapper for Whisper-based ASR software originally developed by OpenAI.",
  "app_version": "v9",
  "mmif_version": "1.0.5",
  "analyzer_version": "20231117",
  "app_license": "Apache 2.0",
  "analyzer_license": "MIT",
  "identifier": "http://apps.clams.ai/whisper-wrapper/v9",
  "url": "https://github.com/clamsproject/app-whisper-wrapper",
  "input": [
    [
      {
        "@type": "http://mmif.clams.ai/vocabulary/AudioDocument/v1",
        "required": true
      },
      {
        "@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
        "required": true
      }
    ]
  ],
  "output": [
    {
      "@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1"
    },
    {
      "@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v5",
      "properties": {
        "timeUnit": "milliseconds"
      }
    },
    {
      "@type": "http://mmif.clams.ai/vocabulary/Alignment/v1"
    },
    {
      "@type": "http://vocab.lappsgrid.org/Token"
    },
    {
      "@type": "http://vocab.lappsgrid.org/Sentence"
    }
  ],
  "parameters": [
    {
      "name": "modelSize",
      "description": "The size of the model to use. When `modelLang=en` is given, for non-`large` models, English-only models will be used instead of multilingual models for speed and accuracy. (For `large` models, English-only models are not available.) Sizes can also be given as aliases: tiny=t, base=b, small=s, medium=m, large=l, large-v2=l2, large-v3=l3.",
      "type": "string",
      "choices": [
        "tiny",
        "t",
        "base",
        "b",
        "small",
        "s",
        "medium",
        "m",
        "large",
        "l",
        "large-v2",
        "l2",
        "large-v3",
        "l3"
      ],
      "default": "tiny",
      "multivalued": false
    },
    {
      "name": "modelLang",
      "description": "Language of the model to use. Accepts two- or three-letter ISO 639 language codes; however, Whisper only supports a subset of languages, and an error will be raised if the language is not supported. For the full list of supported languages, see https://github.com/openai/whisper/blob/20231117/whisper/tokenizer.py . In addition to the language code, a two-letter region code can be appended, e.g. \"en-US\" for US English. Note that the region code is kept only for compatibility and recording purposes; Whisper neither detects regional dialects nor uses the given one for transcription. When no language code is given, Whisper runs in language-detection mode and uses the first few seconds of the audio to detect the language.",
      "type": "string",
      "default": "",
      "multivalued": false
    },
    {
      "name": "pretty",
      "description": "The JSON body of the HTTP response will be re-formatted with 2-space indentation",
      "type": "boolean",
      "default": false,
      "multivalued": false
    },
    {
      "name": "runningTime",
      "description": "The running time of the app will be recorded in the view metadata",
      "type": "boolean",
      "default": false,
      "multivalued": false
    },
    {
      "name": "hwFetch",
      "description": "The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata",
      "type": "boolean",
      "default": false,
      "multivalued": false
    }
  ]
}
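
For scripting against this record, the raw metadata.json shown above can be consumed directly. A small sketch; the local file name is an assumption:

```python
import json

with open("metadata.json", encoding="utf-8") as f:
    meta = json.load(f)

print(meta["identifier"], "-", meta["name"])
# List each runtime parameter with its type and default, as documented above.
for param in meta["parameters"]:
    print(f"  {param['name']}: {param['type']}, default={param.get('default')!r}")
```
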
@@ -0,0 +1,6 @@
{
  "time": "2024-08-16T15:05:09+00:00",
  "submitter": "keighrim",
  "image": "ghcr.io/clamsproject/app-whisper-wrapper:v9",
  "releasenotes": "fixed inability to handle concurrent requests\n\n"
}

docs/_data/app-index.json

+42-38
@@ -1,4 +1,46 @@
 {
+  "http://apps.clams.ai/whisper-wrapper": {
+    "description": "A CLAMS wrapper for Whisper-based ASR software originally developed by OpenAI.",
+    "latest_update": "2024-08-16T15:05:09+00:00",
+    "versions": [
+      [
+        "v9",
+        "keighrim"
+      ],
+      [
+        "v8",
+        "keighrim"
+      ],
+      [
+        "v7",
+        "keighrim"
+      ],
+      [
+        "v6",
+        "keighrim"
+      ],
+      [
+        "v5",
+        "keighrim"
+      ],
+      [
+        "v4",
+        "keighrim"
+      ],
+      [
+        "v3",
+        "keighrim"
+      ],
+      [
+        "v2",
+        "keighrim"
+      ],
+      [
+        "v1",
+        "keighrim"
+      ]
+    ]
+  },
   "http://apps.clams.ai/distil-whisper-wrapper": {
     "description": "The wrapper of Distil-Whisper, available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.",
     "latest_update": "2024-08-08T15:48:34+00:00",
@@ -109,44 +151,6 @@
       ]
     ]
   },
-  "http://apps.clams.ai/whisper-wrapper": {
-    "description": "A CLAMS wrapper for Whisper-based ASR software originally developed by OpenAI.",
-    "latest_update": "2024-07-22T21:53:49+00:00",
-    "versions": [
-      [
-        "v8",
-        "keighrim"
-      ],
-      [
-        "v7",
-        "keighrim"
-      ],
-      [
-        "v6",
-        "keighrim"
-      ],
-      [
-        "v5",
-        "keighrim"
-      ],
-      [
-        "v4",
-        "keighrim"
-      ],
-      [
-        "v3",
-        "keighrim"
-      ],
-      [
-        "v2",
-        "keighrim"
-      ],
-      [
-        "v1",
-        "keighrim"
-      ]
-    ]
-  },
   "http://apps.clams.ai/tfidf-keywordextractor": {
     "description": "extract keywords of a text document according to TF-IDF values. IDF values and all features come from related pickle files in the current directory. App can either take a simple text document or take a MMIF file generated from the text slicer app.",
     "latest_update": "2024-07-19T14:07:21+00:00",

docs/_data/apps.json

+1-1
Large diffs are not rendered by default.
