Commit fb238d8

update vctk voc1, test=tts (#1294)
1 parent 9c1e098 commit fb238d8

6 files changed, 24 insertions(+), 24 deletions(-)

examples/vctk/tts3/README.md

Lines changed: 6 additions & 6 deletions
@@ -95,16 +95,16 @@ optional arguments:
 ### Synthesizing
 We use [parallel wavegan](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/voc1) as the neural vocoder.
 
-Download pretrained parallel wavegan model from [pwg_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.5.zip)and unzip it.
+Download pretrained parallel wavegan model from [pwg_vctk_ckpt_0.1.1.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.1.1.zip) and unzip it.
 ```bash
-unzip pwg_vctk_ckpt_0.5.zip
+unzip pwg_vctk_ckpt_0.1.1.zip
 ```
 Parallel WaveGAN checkpoint contains files listed below.
 ```text
-pwg_vctk_ckpt_0.5
-├── pwg_default.yaml # default config used to train parallel wavegan
-├── pwg_snapshot_iter_1000000.pdz # generator parameters of parallel wavegan
-└── pwg_stats.npy # statistics used to normalize spectrogram when training parallel wavegan
+pwg_vctk_ckpt_0.1.1
+├── default.yaml # default config used to train parallel wavegan
+├── snapshot_iter_1500000.pdz # generator parameters of parallel wavegan
+└── feats_stats.npy # statistics used to normalize spectrogram when training parallel wavegan
 ```
 `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
 ```bash

examples/vctk/tts3/local/synthesize.sh

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@ python3 ${BIN_DIR}/../synthesize.py \
 --am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
 --am_stat=dump/train/speech_stats.npy \
 --voc=pwgan_vctk \
---voc_config=pwg_vctk_ckpt_0.5/pwg_default.yaml \
---voc_ckpt=pwg_vctk_ckpt_0.5/pwg_snapshot_iter_1000000.pdz \
---voc_stat=pwg_vctk_ckpt_0.5/pwg_stats.npy \
+--voc_config=pwg_vctk_ckpt_0.1.1/default.yaml \
+--voc_ckpt=pwg_vctk_ckpt_0.1.1/snapshot_iter_1500000.pdz \
+--voc_stat=pwg_vctk_ckpt_0.1.1/feats_stats.npy \
 --test_metadata=dump/test/norm/metadata.jsonl \
 --output_dir=${train_output_path}/test \
 --phones_dict=dump/phone_id_map.txt \

examples/vctk/tts3/local/synthesize_e2e.sh

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@ python3 ${BIN_DIR}/../synthesize_e2e.py \
 --am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
 --am_stat=dump/train/speech_stats.npy \
 --voc=pwgan_vctk \
---voc_config=pwg_vctk_ckpt_0.5/pwg_default.yaml \
---voc_ckpt=pwg_vctk_ckpt_0.5/pwg_snapshot_iter_1000000.pdz \
---voc_stat=pwg_vctk_ckpt_0.5/pwg_stats.npy \
+--voc_config=pwg_vctk_ckpt_0.1.1/default.yaml \
+--voc_ckpt=pwg_vctk_ckpt_0.1.1/snapshot_iter_1500000.pdz \
+--voc_stat=pwg_vctk_ckpt_0.1.1/feats_stats.npy \
 --lang=en \
 --text=${BIN_DIR}/../sentences_en.txt \
 --output_dir=${train_output_path}/test_e2e \

examples/vctk/voc1/README.md

Lines changed: 5 additions & 5 deletions
@@ -132,15 +132,15 @@ optional arguments:
 5. `--ngpu` is the number of gpus to use, if ngpu == 0, use cpu.
 
 ## Pretrained Model
-Pretrained models can be downloaded here [pwg_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.5.zip).
+Pretrained models can be downloaded here [pwg_vctk_ckpt_0.1.1.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.1.1.zip).
 
 Parallel WaveGAN checkpoint contains files listed below.
 
 ```text
-pwg_vctk_ckpt_0.5
-├── pwg_default.yaml # default config used to train parallel wavegan
-├── pwg_snapshot_iter_1000000.pdz # generator parameters of parallel wavegan
-└── pwg_stats.npy # statistics used to normalize spectrogram when training parallel wavegan
+pwg_vctk_ckpt_0.1.1
+├── default.yaml # default config used to train parallel wavegan
+├── snapshot_iter_1500000.pdz # generator parameters of parallel wavegan
+└── feats_stats.npy # statistics used to normalize spectrogram when training parallel wavegan
 ```
 ## Acknowledgement
 We adapted some code from https://github.com/kan-bayashi/ParallelWaveGAN.

examples/vctk/voc1/conf/default.yaml

Lines changed: 2 additions & 2 deletions
@@ -70,7 +70,7 @@ lambda_adv: 4.0 # Loss balancing coefficient.
 ###########################################################
 # DATA LOADER SETTING #
 ###########################################################
-batch_size: 8 # Batch size.
+batch_size: 6 # Batch size.
 batch_max_steps: 24000 # Length of each audio in batch. Make sure dividable by n_shift.
 num_workers: 2 # Number of workers in DataLoader.

@@ -100,7 +100,7 @@ discriminator_grad_norm: 1 # Discriminator's gradient norm.
 # INTERVAL SETTING #
 ###########################################################
 discriminator_train_start_steps: 100000 # Number of steps to start to train discriminator.
-train_max_steps: 1000000 # Number of training steps.
+train_max_steps: 1500000 # Number of training steps.
 save_interval_steps: 5000 # Interval steps to save checkpoint.
 eval_interval_steps: 1000 # Interval steps to evaluate the network.
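
The two config hunks above lower `batch_size` from 8 to 6 and raise `train_max_steps` from 1000000 to 1500000. The following is a minimal Python sketch of what that arithmetic implies for one training process. It is an illustration only, not part of the commit: the helper names are hypothetical, multi-GPU scaling is ignored, and `n_shift=300` is an assumed hop size that does not appear in this diff.

```python
# Values copied from the two hunks above; everything else is illustrative.
OLD = {"batch_size": 8, "train_max_steps": 1_000_000}
NEW = {"batch_size": 6, "train_max_steps": 1_500_000}
BATCH_MAX_STEPS = 24000      # audio samples per clip in a batch (from the config)
SAVE_INTERVAL_STEPS = 5000   # from the config


def clips_seen(cfg):
    """Clips consumed by a single training process over the full run."""
    return cfg["batch_size"] * cfg["train_max_steps"]


def save_intervals(cfg, save_interval=SAVE_INTERVAL_STEPS):
    """Number of complete save intervals that fit into the run."""
    return cfg["train_max_steps"] // save_interval


def is_divisible_by_hop(batch_max_steps, n_shift):
    """The config comment asks that batch_max_steps be divisible by n_shift."""
    return batch_max_steps % n_shift == 0


if __name__ == "__main__":
    for name, cfg in (("old", OLD), ("new", NEW)):
        print(name, clips_seen(cfg), save_intervals(cfg))
    # n_shift=300 is an assumption; check the real value in the config file.
    print(is_divisible_by_hop(BATCH_MAX_STEPS, n_shift=300))
```

Under these assumptions the new settings process more clips per process overall despite the smaller batch, which is consistent with the checkpoint being renamed to `snapshot_iter_1500000.pdz`.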

paddlespeech/cli/tts/infer.py

Lines changed: 5 additions & 5 deletions
@@ -156,15 +156,15 @@
 },
 "pwgan_vctk-en": {
 'url':
-'https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.5.zip',
+'https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_vctk_ckpt_0.1.1.zip',
 'md5':
-'322ca688aec9b127cec2788b65aa3d52',
+'b3da1defcde3e578be71eb284cb89f2c',
 'config':
-'pwg_default.yaml',
+'default.yaml',
 'ckpt':
-'pwg_snapshot_iter_1000000.pdz',
+'snapshot_iter_1500000.pdz',
 'speech_stats':
-'pwg_stats.npy',
+'feats_stats.npy',
 },
 # mb_melgan
 "mb_melgan_csmsc-zh": {

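The `pwgan_vctk-en` entry in `paddlespeech/cli/tts/infer.py` pairs the new archive URL with a new `md5` value. The sketch below shows one way to check a manually downloaded archive against that value using only the standard library. It is not PaddleSpeech's own download code: `md5_of_file` and `archive_matches` are hypothetical helper names, and the local file name is assumed to match the archive name in the URL.

```python
import hashlib
import os

# md5 value taken from the updated 'pwgan_vctk-en' entry in the diff.
EXPECTED_MD5 = "b3da1defcde3e578be71eb284cb89f2c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected_md5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = "pwg_vctk_ckpt_0.1.1.zip"
    if os.path.exists(archive):
        print("md5 ok" if archive_matches(archive, EXPECTED_MD5) else "md5 mismatch")
    else:
        print(f"{archive} not found")
```

A mismatch usually means a truncated download or a stale copy of the older `pwg_vctk_ckpt_0.5.zip` archive.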