App Submitted - distil-whisper-wrapper.v1.2 #177

Merged
merged 1 commit into from
Aug 8, 2024
98 changes: 98 additions & 0 deletions docs/_apps/distil-whisper-wrapper/v1.2/index.md
@@ -0,0 +1,98 @@
---
layout: posts
classes: wide
title: "Distil Whisper Wrapper (v1.2)"
date: 2024-08-08T15:48:34+00:00
---
## About this version

- Submitter: [BenLambright](https://github.com/BenLambright)
- Submission Time: 2024-08-08T15:48:34+00:00
- Prebuilt Container Image: [ghcr.io/clamsproject/app-distil-whisper-wrapper:v1.2](https://github.com/clamsproject/app-distil-whisper-wrapper/pkgs/container/app-distil-whisper-wrapper/v1.2)
- Release Notes

> reverting back to HF pipeline using chunking transcription

## About this app (See raw [metadata.json](metadata.json))

**The wrapper of Distil-Whisper, available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.**

- App ID: [http://apps.clams.ai/distil-whisper-wrapper/v1.2](http://apps.clams.ai/distil-whisper-wrapper/v1.2)
- App License: Apache 2.0
- Source Repository: [https://github.com/clamsproject/app-distil-whisper-wrapper](https://github.com/clamsproject/app-distil-whisper-wrapper) ([source tree of the submitted version](https://github.com/clamsproject/app-distil-whisper-wrapper/tree/v1.2))
- Analyzer Version: 1.0
- Analyzer License: MIT


#### Inputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

One of the following is required: [
- [http://mmif.clams.ai/vocabulary/AudioDocument/v1](http://mmif.clams.ai/vocabulary/AudioDocument/v1) (required)
(of any properties)

- [http://mmif.clams.ai/vocabulary/VideoDocument/v1](http://mmif.clams.ai/vocabulary/VideoDocument/v1) (required)
(of any properties)



]


#### Configurable Parameters
(**Note**: _Multivalued_ means the parameter can have one or more values.)

- `modelSize`: optional, defaults to `distil-small.en`

- Type: string
- Multivalued: False
- Choices: `distil-large-v3`, `distil-large-v2`, `distil-medium.en`, **_`distil-small.en`_**, `small`, `s`, `medium`, `m`, `large-v2`, `l2`, `large-v3`, `l3`


> The size of the model to use. Four models are available: distil-large-v3, distil-large-v2, distil-medium.en, and distil-small.en. You can also pass an abbreviation as the parameter value: 'small' or 's' for distil-small.en; 'medium' or 'm' for distil-medium.en; 'large-v2' or 'l2' for distil-large-v2; 'large-v3' or 'l3' for distil-large-v3. The default model is distil-small.en.
- `pretty`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The JSON body of the HTTP response will be re-formatted with 2-space indentation
- `runningTime`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The running time of the app will be recorded in the view metadata
- `hwFetch`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata
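
The abbreviation table in the `modelSize` description can be sketched as a lookup. This is an illustration only: the app's own resolution code is not part of this submission, `resolve_model` and `build_query` are hypothetical helper names, and passing parameters as a URL query string is an assumption about how the app is called. The default is taken from the parameter's declared `defaults to` value.

```python
from urllib.parse import urlencode

# Abbreviation table copied from the `modelSize` description.
MODEL_ALIASES = {
    "small": "distil-small.en", "s": "distil-small.en",
    "medium": "distil-medium.en", "m": "distil-medium.en",
    "large-v2": "distil-large-v2", "l2": "distil-large-v2",
    "large-v3": "distil-large-v3", "l3": "distil-large-v3",
}
FULL_NAMES = set(MODEL_ALIASES.values())
DEFAULT_MODEL = "distil-small.en"  # declared default of `modelSize`


def resolve_model(model_size=None):
    """Return the full model name for a `modelSize` value (hypothetical helper)."""
    if model_size is None:
        return DEFAULT_MODEL
    if model_size in FULL_NAMES:
        return model_size
    if model_size in MODEL_ALIASES:
        return MODEL_ALIASES[model_size]
    raise ValueError(f"unsupported modelSize: {model_size!r}")


def build_query(**params):
    """Encode parameters as a query string; booleans become 'true'/'false'."""
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    return urlencode(encoded)


if __name__ == "__main__":
    print(resolve_model("l3"))
    print(build_query(modelSize="l3", pretty=True))
```

Keeping the alias table separate from the list of full names makes it easy to check that every abbreviation in `Choices` maps to one of the four models.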


#### Outputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

(**Note**: Not all output annotations are always generated.)

- [http://mmif.clams.ai/vocabulary/TextDocument/v1](http://mmif.clams.ai/vocabulary/TextDocument/v1)
- _@lang_ = "en"

> Fully serialized text content of the recognized text in the input audio/video.
- [http://mmif.clams.ai/vocabulary/TimeFrame/v5](http://mmif.clams.ai/vocabulary/TimeFrame/v5)
- _timeUnit_ = "milliseconds"

- [http://mmif.clams.ai/vocabulary/Alignment/v1](http://mmif.clams.ai/vocabulary/Alignment/v1)
(of any properties)

> Alignments between 1) `TimeFrame` <-> `SENTENCE`, 2) `audio/video document` <-> `TextDocument`
- [http://vocab.lappsgrid.org/Sentence](http://vocab.lappsgrid.org/Sentence)
(of any properties)

> The smallest recognized unit of distil-whisper. Normally a complete sentence.
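
To show how the output types listed above relate to each other, here is a minimal sketch that pairs each `TimeFrame` with the `Sentence` it is aligned to. It works on plain dictionaries rather than any SDK. The sample view, its annotation ids, and the helper names are hypothetical, and the property names `source`, `target`, `start`, and `end` are assumed from the MMIF vocabulary rather than taken from this submission.

```python
def short_type(at_type):
    """Reduce an annotation type URI to its bare name, e.g. 'TimeFrame'."""
    parts = at_type.rstrip("/").split("/")
    last = parts[-1]
    # CLAMS vocabulary URIs end in a version segment such as 'v5'; LAPPS URIs do not.
    if len(last) > 1 and last[0] == "v" and last[1:].isdigit():
        return parts[-2]
    return last


def timed_sentences(view):
    """Return (start, end, sentence_id) for every TimeFrame <-> Sentence alignment."""
    by_id = {ann["properties"]["id"]: ann for ann in view["annotations"]}
    pairs = []
    for ann in view["annotations"]:
        if short_type(ann["@type"]) != "Alignment":
            continue
        source = by_id.get(ann["properties"]["source"])
        target = by_id.get(ann["properties"]["target"])
        if source is None or target is None:
            continue  # e.g. the document <-> TextDocument alignment
        if short_type(source["@type"]) == "TimeFrame" and short_type(target["@type"]) == "Sentence":
            props = source["properties"]
            pairs.append((props["start"], props["end"], target["properties"]["id"]))
    return pairs


# Hypothetical, hand-written view; real output will differ in ids and content.
SAMPLE_VIEW = {
    "annotations": [
        {"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
         "properties": {"id": "td_1", "text": {"@value": "hello world"}}},
        {"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
         "properties": {"id": "al_1", "source": "d1", "target": "td_1"}},
        {"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v5",
         "properties": {"id": "tf_1", "start": 0, "end": 1500}},
        {"@type": "http://vocab.lappsgrid.org/Sentence",
         "properties": {"id": "s_1"}},
        {"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
         "properties": {"id": "al_2", "source": "tf_1", "target": "s_1"}},
    ]
}

if __name__ == "__main__":
    for start, end, sentence_id in timed_sentences(SAMPLE_VIEW):
        print(f"{sentence_id}: {start}-{end} ms")
```

The times are in milliseconds because the `TimeFrame` output declares `timeUnit` = "milliseconds".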
90 changes: 90 additions & 0 deletions docs/_apps/distil-whisper-wrapper/v1.2/metadata.json
@@ -0,0 +1,90 @@
{
"name": "Distil Whisper Wrapper",
"description": "The wrapper of Distil-Whisper, available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.",
"app_version": "v1.2",
"mmif_version": "1.0.5",
"analyzer_version": "1.0",
"app_license": "Apache 2.0",
"analyzer_license": "MIT",
"identifier": "http://apps.clams.ai/distil-whisper-wrapper/v1.2",
"url": "https://github.com/clamsproject/app-distil-whisper-wrapper",
"input": [
[
{
"@type": "http://mmif.clams.ai/vocabulary/AudioDocument/v1",
"required": true
},
{
"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
"required": true
}
]
],
"output": [
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"description": "Fully serialized text content of the recognized text in the input audio/video.",
"properties": {
"@lang": "en"
}
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v5",
"properties": {
"timeUnit": "milliseconds"
}
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"description": "Alignments between 1) `TimeFrame` <-> `SENTENCE`, 2) `audio/video document` <-> `TextDocument`"
},
{
"@type": "http://vocab.lappsgrid.org/Sentence",
"description": "The smallest recognized unit of distil-whisper. Normally a complete sentence."
}
],
"parameters": [
{
"name": "modelSize",
"description": "The size of the model to use. Four models are available: distil-large-v3, distil-large-v2, distil-medium.en, and distil-small.en. You can also pass an abbreviation as the parameter value: 'small' or 's' for distil-small.en; 'medium' or 'm' for distil-medium.en; 'large-v2' or 'l2' for distil-large-v2; 'large-v3' or 'l3' for distil-large-v3. The default model is distil-small.en.",
"type": "string",
"choices": [
"distil-large-v3",
"distil-large-v2",
"distil-medium.en",
"distil-small.en",
"small",
"s",
"medium",
"m",
"large-v2",
"l2",
"large-v3",
"l3"
],
"default": "distil-small.en",
"multivalued": false
},
{
"name": "pretty",
"description": "The JSON body of the HTTP response will be re-formatted with 2-space indentation",
"type": "boolean",
"default": false,
"multivalued": false
},
{
"name": "runningTime",
"description": "The running time of the app will be recorded in the view metadata",
"type": "boolean",
"default": false,
"multivalued": false
},
{
"name": "hwFetch",
"description": "The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata",
"type": "boolean",
"default": false,
"multivalued": false
}
]
}
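
The nested list under `"input"` is rendered in `index.md` as "One of the following is required". A minimal sketch of that reading follows, assuming an inner list means "at least one alternative must be present" and that types are compared as exact strings (a simplification that ignores any version compatibility rules). `satisfies_input` is a hypothetical helper, not part of the app.

```python
# The `input` value from metadata.json above.
INPUT_SPEC = [
    [
        {"@type": "http://mmif.clams.ai/vocabulary/AudioDocument/v1", "required": True},
        {"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1", "required": True},
    ]
]


def satisfies_input(input_spec, document_types):
    """Check a collection of document type URIs against an `input` spec.

    Assumption: a nested list is a one-of group; a bare object is checked
    on its own and only matters when it is marked as required.
    """
    present = set(document_types)
    for entry in input_spec:
        if isinstance(entry, list):
            if not any(alt["@type"] in present for alt in entry):
                return False
        elif entry.get("required", False) and entry["@type"] not in present:
            return False
    return True


if __name__ == "__main__":
    audio_only = ["http://mmif.clams.ai/vocabulary/AudioDocument/v1"]
    text_only = ["http://mmif.clams.ai/vocabulary/TextDocument/v1"]
    print(satisfies_input(INPUT_SPEC, audio_only))
    print(satisfies_input(INPUT_SPEC, text_only))
```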
6 changes: 6 additions & 0 deletions docs/_apps/distil-whisper-wrapper/v1.2/submission.json
@@ -0,0 +1,6 @@
{
"time": "2024-08-08T15:48:34+00:00",
"submitter": "BenLambright",
"image": "ghcr.io/clamsproject/app-distil-whisper-wrapper:v1.2",
"releasenotes": "reverting back to HF pipeline using chunking transcription\n\n"
}
32 changes: 18 additions & 14 deletions docs/_data/app-index.json
@@ -1,4 +1,22 @@
{
"http://apps.clams.ai/distil-whisper-wrapper": {
"description": "The wrapper of Distil-Whisper, available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.",
"latest_update": "2024-08-08T15:48:34+00:00",
"versions": [
[
"v1.2",
"BenLambright"
],
[
"v1.1",
"keighrim"
],
[
"v1.0",
"1192119703jzx"
]
]
},
"http://apps.clams.ai/simple-timepoints-stitcher": {
"description": "Stitches a sequence of `TimePoint` annotations into a sequence of `TimeFrame` annotations, performing simple smoothing of short peaks of positive labels.",
"latest_update": "2024-08-06T12:25:05+00:00",
@@ -129,20 +147,6 @@
]
]
},
"http://apps.clams.ai/distil-whisper-wrapper": {
"description": "The wrapper of Distil-Whisper, available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.",
"latest_update": "2024-07-22T21:52:47+00:00",
"versions": [
[
"v1.1",
"keighrim"
],
[
"v1.0",
"1192119703jzx"
]
]
},
"http://apps.clams.ai/tfidf-keywordextractor": {
"description": "extract keywords of a text document according to TF-IDF values. IDF values and all features come from related pickle files in the current directory.App can either take a simple text document or take a MMIF file generated from the text slicer app.",
"latest_update": "2024-07-19T14:07:21+00:00",
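
In this diff the `distil-whisper-wrapper` entry moves from the middle of `app-index.json` to the top once its `latest_update` becomes the newest, which suggests the index is kept in newest-first order. A minimal sketch of such an ordering, assuming that is the rule (the script that actually generates the index is not shown in this pull request, and `sort_index` is a hypothetical name):

```python
from datetime import datetime


def sort_index(index):
    """Return a copy of the index ordered by `latest_update`, newest first."""
    return dict(
        sorted(
            index.items(),
            key=lambda item: datetime.fromisoformat(item[1]["latest_update"]),
            reverse=True,
        )
    )


# Timestamps copied from the diff above; other fields are omitted for brevity.
SAMPLE_INDEX = {
    "http://apps.clams.ai/tfidf-keywordextractor": {"latest_update": "2024-07-19T14:07:21+00:00"},
    "http://apps.clams.ai/distil-whisper-wrapper": {"latest_update": "2024-08-08T15:48:34+00:00"},
    "http://apps.clams.ai/simple-timepoints-stitcher": {"latest_update": "2024-08-06T12:25:05+00:00"},
}

if __name__ == "__main__":
    for app_id in sort_index(SAMPLE_INDEX):
        print(app_id)
```

Parsing the timestamps instead of comparing strings keeps the ordering correct even if entries ever carry different UTC offsets.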
2 changes: 1 addition & 1 deletion docs/_data/apps.json

Large diffs are not rendered by default.
