Commit f73c2f9

unified naming scheme for import scripts
1 parent e7f27b4 commit f73c2f9

7 files changed, 8 additions and 8 deletions

README.md

Lines changed: 6 additions & 6 deletions
@@ -463,7 +463,7 @@ The following list contains speech corpora supported by this script collection.
 - [German Speechdata Package Version 2 (German, 148 hours)](http://www.repository.voxforge1.org/downloads/de/german-speechdata-package-v2.tar.gz):
     + Unpack the archive such that the directories `dev`, `test`, and `train` are
       direct subdirectories of `<~/.speechrc:speech_arc>/gspv2`.
-    + Then run run the script `./gspv2_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `./import_gspv2.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/gspv2`.
 
 - [Noise](http://goofy.zamia.org/zamia-speech/corpora/noise.tar.xz):
@@ -474,35 +474,35 @@ The following list contains speech corpora supported by this script collection.
     + Download the set of 360 hours "clean" speech tarball
     + Unpack the archive such that the directory `LibriSpeech` is a direct
       subdirectory of `<~/.speechrc:speech_arc>`.
-    + Then run run the script `./librispeech_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `./import_librispeech.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/librispeech`.
 
 - [The LJ Speech Dataset (English, 24 hours)](https://keithito.com/LJ-Speech-Dataset/):
     + Download the tarball
     + Unpack the archive such that the directory `LJSpeech-1.1` is a direct
       subdirectory of `<~/.speechrc:speech_arc>`.
-    + Then run run the script `ljspeech_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `import_ljspeech.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/lindajohnson-11`.
 
 - [Mozilla Common Voice German (German, 140 hours)](https://voice.mozilla.org/en/datasets):
     + Download `de.tar.gz`
     + Unpack the archive such that the directory `cv_de` is a direct
       subdirectory of `<~/.speechrc:speech_arc>`.
-    + Then run run the script `./mozde_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `./import_mozde.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/cv_de`.
 
 - [Mozilla Common Voice V1 (English, 252 hours)](https://voice.mozilla.org/en/data):
     + Download `cv_corpus_v1.tar.gz`
     + Unpack the archive such that the directory `cv_corpus_v1` is a direct
       subdirectory of `<~/.speechrc:speech_arc>`.
-    + Then run run the script `./mozcv1_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `./import_mozcv1.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/cv_corpus_v1`.
 
 - [Munich Artificial Intelligence Laboratories GmbH (M-AILABS) Speech Dataset (English, 147 hours, German, 237 hours)](http://www.m-ailabs.bayern/en/):
     + Download `de_DE.tgz`, `en_UK.tgz`, `en_US.tgz` ([Mirror](https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/))
     + Create a subdirectory `m_ailabs` in `<~/.speechrc:speech_arc>`
     + Unpack the downloaded tarbals inside the `m_ailabs` subdirectory
-    + Then run run the script `./mailabs_to_vf.py` to convert the corpus to the VoxForge
+    + Then run run the script `./import_mailabs.py` to convert the corpus to the VoxForge
       format. The resulting corpus will be written to `<~/.speechrc:speech_corpora>/m_ailabs_en` and `<~/.speechrc:speech_corpora>/m_ailabs_de`.
 
 - [VoxForge (English, 75 hours)](http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/):
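
Each import script in the README hunk above expects a particular directory layout under `speech_arc` before it is run. As a minimal sketch (not part of this commit), the layout could be checked up front. The directory names come from the README text; the `EXPECTED_DIRS` table and the `missing_dirs` helper are hypothetical, and `speech_arc` is passed in as a plain path rather than read from `~/.speechrc`, whose format is not shown here.

```python
import os

# Directories each corpus expects under speech_arc, as described in the README.
# The dictionary keys are illustrative labels, not names used by the scripts.
EXPECTED_DIRS = {
    'gspv2': ['gspv2/dev', 'gspv2/test', 'gspv2/train'],
    'librispeech': ['LibriSpeech'],
    'ljspeech': ['LJSpeech-1.1'],
    'mozde': ['cv_de'],
    'mozcv1': ['cv_corpus_v1'],
    'mailabs': ['m_ailabs'],
}


def missing_dirs(speech_arc, corpus):
    """Return the expected directories for `corpus` that are absent under speech_arc."""
    return [d for d in EXPECTED_DIRS[corpus]
            if not os.path.isdir(os.path.join(speech_arc, d))]
```

An empty result means the archive was unpacked where the README says it should be; anything else lists what is still missing.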
File renamed without changes.
File renamed without changes.
File renamed without changes.
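
The renames in this commit follow one pattern: `<corpus>_to_vf.py` becomes `import_<corpus>.py`. For anyone updating wrapper scripts or notes that still call the old names, the mapping can be sketched as below. The `new_script_name` helper is hypothetical and only illustrates the naming scheme; it is not code from the repository.

```python
import os
import re

# Old naming scheme: <corpus>_to_vf.py
OLD_NAME = re.compile(r'^(?P<corpus>\w+)_to_vf\.py$')


def new_script_name(old_path):
    """Map an old-style script path to the new import_<corpus>.py name.

    Paths that do not match the old scheme are returned unchanged.
    """
    head, base = os.path.split(old_path)
    match = OLD_NAME.match(base)
    if match is None:
        return old_path
    return os.path.join(head, 'import_%s.py' % match.group('corpus'))


for old in ['./gspv2_to_vf.py', './librispeech_to_vf.py', 'ljspeech_to_vf.py',
            './mozde_to_vf.py', './mozcv1_to_vf.py', './mailabs_to_vf.py']:
    print('%s -> %s' % (old, new_script_name(old)))
```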

mailabs_to_vf.py renamed to import_mailabs.py

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 # -*- coding: utf-8 -*-
 
 #
-# Copyright 2018 Guenter Bartsch
+# Copyright 2018, 2019 Guenter Bartsch
 #
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU Lesser General Public License as published by
@@ -35,7 +35,7 @@
 from optparse import OptionParser
 from nltools import misc
 
-PROC_TITLE = 'mailabs_to_vf'
+PROC_TITLE = 'import_mailabs'
 DEFAULT_NUM_CPUS = 12
 
 GENDERS = set(['male', 'female'])
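
The hunk above updates `PROC_TITLE` by hand so that it matches the new file name. One possible alternative, shown only as a sketch and not what this commit does, is to derive the title from the script path so a future rename cannot leave it stale. The `proc_title_from_path` helper is hypothetical.

```python
import os


def proc_title_from_path(script_path):
    """Return the script's base name without its directory or extension."""
    return os.path.splitext(os.path.basename(script_path))[0]


# In a real script the argument would typically be __file__;
# a literal path is used here so the sketch runs anywhere.
PROC_TITLE = proc_title_from_path('./import_mailabs.py')
```

The trade-off is that a hard-coded constant is easier to grep for, while a derived title stays in sync with the file name automatically.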
File renamed without changes.
File renamed without changes.
