Creating a diarization broadcast corpus
judyfong edited this page Jun 17, 2020 · 10 revisions
- Gecko
- rttm files
- corresponding videos of episode
Label speaker turns which last at least 60 ms. (CHANGED)
Each speaker gets their own speaker number per recording/episode.
Unknown speakers get labelled Unknown 01 etc.
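The 60 ms rule can be checked mechanically before turning a file in. The following is a minimal sketch, assuming the rttm files use the usual whitespace-separated `SPEAKER` lines (recording id in field 2, start time in field 4, duration in field 5, speaker label in field 8). The function names `parse_rttm` and `short_turns` and the sample recording id `ep1` are hypothetical and only serve the illustration.

```python
MIN_TURN_SECONDS = 0.060  # the 60 ms threshold from the guideline above


def parse_rttm(text):
    """Return (recording_id, start, duration, speaker) for each SPEAKER line."""
    turns = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a SPEAKER record.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        turns.append((fields[1], float(fields[3]), float(fields[4]), fields[7]))
    return turns


def short_turns(turns, min_seconds=MIN_TURN_SECONDS):
    """Return the turns whose duration is below the minimum length."""
    return [turn for turn in turns if turn[2] < min_seconds]


# Hypothetical sample: the middle turn lasts only 40 ms.
sample = """\
SPEAKER ep1 1 0.00 2.50 <NA> <NA> 1 <NA> <NA>
SPEAKER ep1 1 2.50 0.04 <NA> <NA> 2 <NA> <NA>
SPEAKER ep1 1 2.54 1.20 <NA> <NA> 1 <NA> <NA>
"""

for recording_id, start, duration, speaker in short_turns(parse_rttm(sample)):
    print(f"{recording_id}: speaker {speaker} at {start:.2f}s is shorter than 60 ms")
```

Any turn this reports is a candidate for merging into a neighbouring segment or deleting during the review step in Gecko.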
There are at least two ways to create the csv file.
- Follow Aríel's video called My Movie.mp4. In it he uses VSCode with the json2csv extension and does some formatting.
- Add all the speakers to one segment in Gecko, copy the list over to create the initial list for the csv file, then remove them from the segment again.
- Generate the proposed rttm files for 28 episodes that week.
- Labelling - Gecko
- Open Gecko. If you use the Gecko version linked here, you can save partially corrected files and reload them into the editor to edit later.
- Upload the video file & rttm file
- Adjust the segment start and end times to match speaker turns.
- Add missing speaker turns.
- Correct speaker labels/numbers. Add new ones if necessary.
- Write down the full speaker names which correspond to each speaker number. These go in a csv file.
- Label music, foreign language, or noise. They're available as default labels.
- Segments which are only silence can be deleted.
- Review the segments in case you missed anything or added tiny segments.
- Export as json, srt, and rttm.
- Turn in the csv, json, srt, and corrected rttm files to the relevant folders. Then get new rttm and video files.
- Repeat for a new episode.
- Judy reports the new DER with that week's data. When it is under 10%, this project is done.
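To make the 10% stopping criterion concrete, here is a simplified sketch of a diarization error rate (DER) calculation. It is not the scoring used for this project, which the page does not describe. It assumes only one speaker talks at a time, that the proposed labels already use the same speaker numbers as the corrected ones, and that no tolerance collar is applied around turn boundaries. Standard scoring tools handle overlap, collars, and speaker mapping, so their numbers will differ. The names `label_frames` and `simple_der` are hypothetical.

```python
def label_frames(turns, n_frames, step):
    """Assign a speaker label (or None for non-speech) to each frame."""
    frames = [None] * n_frames
    for start, duration, speaker in turns:
        first = round(start / step)
        last = round((start + duration) / step)
        for i in range(first, min(last, n_frames)):
            frames[i] = speaker
    return frames


def simple_der(reference, hypothesis, step=0.01):
    """Frame-based DER: (missed + false alarm + confusion) / reference speech."""
    end = max(start + duration for start, duration, _ in reference + hypothesis)
    n_frames = round(end / step)
    ref = label_frames(reference, n_frames, step)
    hyp = label_frames(hypothesis, n_frames, step)

    missed = sum(1 for r, h in zip(ref, hyp) if r is not None and h is None)
    false_alarm = sum(1 for r, h in zip(ref, hyp) if r is None and h is not None)
    confusion = sum(
        1 for r, h in zip(ref, hyp) if r is not None and h is not None and r != h
    )
    total = sum(1 for r in ref if r is not None)
    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


# Hypothetical turns as (start seconds, duration seconds, speaker number).
corrected = [(0.0, 5.0, "1"), (5.0, 5.0, "2")]
proposed = [(0.0, 5.5, "1"), (5.5, 4.5, "2")]

der = simple_der(corrected, proposed)
print(f"DER: {der:.1%}, under 10%: {der < 0.10}")
```

In this example the proposed boundary is half a second late, so 0.5 s of the 10 s of reference speech is attributed to the wrong speaker.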
format
<recording/episode id>, <speaker_number in rttm file>, <speaker name>
example
Fréttirkl1900-5022010T0,1, Bogi Ágústsson
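Rows in this format can be read back and sanity-checked with Python's standard `csv` module. The sketch below assumes three comma-separated fields per row with optional spaces after the commas. The function name `read_speaker_rows` and the sample rows are hypothetical.

```python
import csv
import io


def read_speaker_rows(csv_text):
    """Return {(recording_id, speaker_number): speaker_name} from csv text."""
    mapping = {}
    # skipinitialspace drops the space that may follow each comma.
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row}")
        recording_id, speaker_number, name = (field.strip() for field in row)
        key = (recording_id, speaker_number)
        if key in mapping:
            raise ValueError(f"speaker number listed twice: {key}")
        mapping[key] = name
    return mapping


# Hypothetical rows; real rows use the recording ids and names from the corpus.
sample_rows = "episode-01, 1, Anna Example\nepisode-01,2, Unknown 01\n"
speakers = read_speaker_rows(sample_rows)
print(speakers[("episode-01", "1")])
```

Keying on the pair of recording id and speaker number reflects the rule that speaker numbers are assigned per recording, so the same number can refer to different people in different episodes. A name that itself contains a comma would need to be quoted in the csv file.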