
Commit 03e770d

Authored by gkatuka, Gloria Katuka, and goergenj
Support for pretranscribed audio (#23)
* Process pretranscribed audio files
* Preprocessing pretranscribed audio
* moved transcript_processor to extension and created a new notebook conversational_field_extraction.
* Adding CU-Demo-Assets to private branch
* Adding descriptions and readme, minor functional adjustments and bug fixes for CU results processing
* Removing demo assets accidentally added
* Adding support for pretranscribed audio preprocessing

---------

Co-authored-by: Gloria Katuka <[email protected]>
Co-authored-by: Jan Görgen <[email protected]>
1 parent e6c0dbf commit 03e770d

12 files changed, +10020 -5 lines changed

README.md

+1
@@ -14,6 +14,7 @@ Azure AI Content Understanding is a new Generative AI-based [Azure AI service](h
 | --- | --- |
 | [content_extraction.ipynb](notebooks/content_extraction.ipynb) | In this sample we will show content understanding API can help you get semantic information from your file. For example OCR with table in document, audio transcription, and face analysis in video. |
 | [field_extraction.ipynb](notebooks/field_extraction.ipynb) | In this sample we will show how to create an analyzer to extract fields in your file. For example invoice amount in the document, how many people in an image, names mentioned in an audio, or summary of a video. You can customize the fields by creating your own analyzer template. |
+| [conversational_field_extraction.ipynb](notebooks/conversational_field_extraction.ipynb) | This sample shows you how to evaluate conversational audio data that has previously been transcribed with Content Understanding or Azure AI Speech in an efficient way to optimize processing quality. This also allows you to re-analyze data in a cost-efficient way. This sample is based on the [field_extraction.ipynb](notebooks/field_extraction.ipynb) sample. |
 | [analyzer_training.ipynb](notebooks/analyzer_training.ipynb) | If you want to further boost the performance for field extraction, we can do training when you provide few labeled samples to the API. Note: This feature is available to document scenario now. |
 | [management.ipynb](notebooks/management.ipynb) | This sample will demo how to create a minimal analyzer, list all the analyzers in your resource, and delete the analyzer you don't need. |

@@ -0,0 +1,80 @@
+{
+  "description": "Sample call recording analytics",
+  "scenario": "text",
+  "config": {
+    "returnDetails": true
+  },
+  "fieldSchema": {
+    "fields": {
+      "Summary": {
+        "type": "string",
+        "method": "generate",
+        "description": "A one-paragraph summary"
+      },
+      "Topics": {
+        "type": "array",
+        "method": "generate",
+        "description": "Top 5 topics mentioned",
+        "items": {
+          "type": "string"
+        }
+      },
+      "Companies": {
+        "type": "array",
+        "method": "generate",
+        "description": "List of companies mentioned",
+        "items": {
+          "type": "string"
+        }
+      },
+      "People": {
+        "type": "array",
+        "method": "generate",
+        "description": "List of people mentioned",
+        "items": {
+          "type": "object",
+          "properties": {
+            "Name": {
+              "type": "string",
+              "description": "Person's name"
+            },
+            "Role": {
+              "type": "string",
+              "description": "Person's title/role"
+            }
+          }
+        }
+      },
+      "Sentiment": {
+        "type": "string",
+        "method": "classify",
+        "description": "Overall sentiment",
+        "enum": [
+          "Positive",
+          "Neutral",
+          "Negative"
+        ]
+      },
+      "Categories": {
+        "type": "array",
+        "method": "classify",
+        "description": "List of relevant categories",
+        "items": {
+          "type": "string",
+          "enum": [
+            "Agriculture",
+            "Business",
+            "Finance",
+            "Health",
+            "Insurance",
+            "Mining",
+            "Pharmaceutical",
+            "Retail",
+            "Technology",
+            "Transportation"
+          ]
+        }
+      }
+    }
+  }
+}
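
The template above mixes two field methods: `generate` for free-text fields such as `Summary`, and `classify` for fields restricted to an `enum`. As a rough illustration of how to inspect such a template before submitting it, the sketch below parses an abbreviated copy with Python's standard `json` module and groups the field names by method. The helper names `fields_by_method` and `allowed_labels` are hypothetical and not part of this commit or of any Content Understanding client. The sketch only reads the JSON structure shown in the diff and makes no service calls.

```python
import json
from collections import defaultdict

# Abbreviated copy of the template from the diff (three of the six fields).
TEMPLATE = """
{
  "description": "Sample call recording analytics",
  "scenario": "text",
  "config": {"returnDetails": true},
  "fieldSchema": {
    "fields": {
      "Summary": {"type": "string", "method": "generate",
                  "description": "A one-paragraph summary"},
      "Topics": {"type": "array", "method": "generate",
                 "description": "Top 5 topics mentioned",
                 "items": {"type": "string"}},
      "Sentiment": {"type": "string", "method": "classify",
                    "description": "Overall sentiment",
                    "enum": ["Positive", "Neutral", "Negative"]}
    }
  }
}
"""


def fields_by_method(template: dict) -> dict:
    """Group field names by their "method" value (hypothetical helper)."""
    grouped = defaultdict(list)
    for name, spec in template["fieldSchema"]["fields"].items():
        grouped[spec.get("method", "unspecified")].append(name)
    return dict(grouped)


def allowed_labels(spec: dict) -> list:
    """Return a field's enum labels; array fields keep them under "items"."""
    if spec.get("type") == "array":
        return spec.get("items", {}).get("enum", [])
    return spec.get("enum", [])


template = json.loads(TEMPLATE)
grouped = fields_by_method(template)
print(grouped)
print(allowed_labels(template["fieldSchema"]["fields"]["Sentiment"]))
```

Run against the full template, the same helpers would also list `Companies` and `People` under `generate` and `Categories` under `classify`, since `allowed_labels` looks inside `items` for array fields.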
