|
207 | 207 | "cell_type": "markdown",
|
208 | 208 | "metadata": {},
|
209 | 209 | "source": [
|
210 |
| - "There are 2 models in OpenVoice: first one is responsible for speach generation `BaseSpeakerTTS` and the second one `ToneColorConverter` imposes arbitrary voice tone to the original speech. To convert to OpenVino IR format first we need to get acceptable `torch.nn.Module` object. Both ToneColorConverter, BaseSpeakerTTS instead of using `self.forward` as the main entry point use custom `infer` and `convert_voice` methods respectively, therefore need to wrap them with a custom class that is inherited from torch.nn.Module. \n", |
| 210 | + "There are 2 models in OpenVoice: first one is responsible for speech generation `BaseSpeakerTTS` and the second one `ToneColorConverter` imposes arbitrary voice tone to the original speech. To convert to OpenVino IR format first we need to get acceptable `torch.nn.Module` object. Both ToneColorConverter, BaseSpeakerTTS instead of using `self.forward` as the main entry point use custom `infer` and `convert_voice` methods respectively, therefore need to wrap them with a custom class that is inherited from torch.nn.Module. \n", |
211 | 211 | "\n",
|
212 | 212 | "<!---\n",
|
213 | 213 | "# One more reason to make a wrapper is also that these functions use float arguments while only torch.Tensor and tuple of torch.Tensors are acceptable \n",
|
|
217 | 217 | },
|
218 | 218 | {
|
219 | 219 | "cell_type": "code",
|
220 |
| - "execution_count": 8, |
| 220 | + "execution_count": null, |
221 | 221 | "metadata": {},
|
222 | 222 | "outputs": [],
|
223 | 223 | "source": [
|
|
237 | 237 | " \n",
|
238 | 238 | "class OVOpenVoiceTTS(OVOpenVoiceBase):\n",
|
239 | 239 | " \"\"\"\n",
|
240 |
| - " Constructor of this class accepts BaseSpeakerTTS object for speach generation and wraps it's 'infer' method with forward.\n", |
| 240 | + " Constructor of this class accepts BaseSpeakerTTS object for speech generation and wraps it's 'infer' method with forward.\n", |
241 | 241 | " \"\"\"\n",
|
242 | 242 | " def get_example_input(self):\n",
|
243 | 243 | " stn_tst = self.voice_model.get_text('this is original text', self.voice_model.hps, False)\n",
|
|
366 | 366 | },
|
367 | 367 | {
|
368 | 368 | "cell_type": "code",
|
369 |
| - "execution_count": 11, |
| 369 | + "execution_count": null, |
370 | 370 | "metadata": {},
|
371 |
| - "outputs": [ |
372 |
| - { |
373 |
| - "data": { |
374 |
| - "application/vnd.jupyter.widget-view+json": { |
375 |
| - "model_id": "e3dc3666c26c432bac345c670fd42c3a", |
376 |
| - "version_major": 2, |
377 |
| - "version_minor": 0 |
378 |
| - }, |
379 |
| - "text/plain": [ |
380 |
| - "Dropdown(description='reference voice from which tone color will be copied', options=('demo_speaker0.mp3', 'de…" |
381 |
| - ] |
382 |
| - }, |
383 |
| - "execution_count": 11, |
384 |
| - "metadata": {}, |
385 |
| - "output_type": "execute_result" |
386 |
| - } |
387 |
| - ], |
| 371 | + "outputs": [], |
388 | 372 | "source": [
|
389 | 373 | "REFERENCE_VOICES_PATH = f'{repo_dir}/resources/'\n",
|
390 | 374 | "reference_speakers = [\n",
|
391 |
| - " *[path for path in os.listdir(REFERENCE_VOICES_PATH) if os.path.splitext(path)[-1] == '.mp3'],\n", |
| 375 | + " *[path for path in os.listdir(REFERENCE_VOICES_PATH) if os.path.splitext(path)[-1] == '.mp3'],\n", |
392 | 376 | " 'record_manually',\n",
|
393 | 377 | " 'load_manually',\n",
|
394 | 378 | "]\n",
|
|
609 | 593 | "outputs": [],
|
610 | 594 | "source": [
|
611 | 595 | "if voice_source.value == 'choose_manually':\n",
|
612 |
| - " upload_orig_voice = widgets.FileUpload(accept=allowed_audio_types, multiple=False, \n", |
613 |
| - " description='audo whose tone will be replaced')\n", |
| 596 | + " upload_orig_voice = widgets.FileUpload(accept=allowed_audio_types, multiple=False, description='audo whose tone will be replaced')\n", |
614 | 597 | " display(upload_orig_voice)"
|
615 | 598 | ]
|
616 | 599 | },
|
|
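The markdown cell in this diff explains the core trick behind the conversion: `BaseSpeakerTTS` and `ToneColorConverter` expose `infer` and `convert_voice` rather than `forward`, so each is wrapped in a `torch.nn.Module` subclass whose `forward` delegates to the custom method. Below is a minimal sketch of that pattern; `DummyTTS`, `ExportWrapper`, and the fixed `speed` argument are illustrative stand-ins, not the notebook's actual classes or signatures.

    import torch


    class DummyTTS(torch.nn.Module):
        """Hypothetical stand-in for a model whose entry point is `infer`, not `forward`."""

        def __init__(self):
            super().__init__()
            self.proj = torch.nn.Linear(16, 16)

        def infer(self, x, speed=1.0):
            # the real BaseSpeakerTTS.infer accepts text/speaker tensors plus float arguments
            return self.proj(x) * speed


    class ExportWrapper(torch.nn.Module):
        """Redirects forward() to the wrapped model's custom entry point so the graph
        can be traced by torch-based converters (e.g. ov.convert_model)."""

        def __init__(self, voice_model):
            super().__init__()
            self.voice_model = voice_model

        def forward(self, x):
            # non-tensor arguments are fixed at export time
            return self.voice_model.infer(x, speed=1.0)


    wrapped = ExportWrapper(DummyTTS())
    example_input = torch.randn(1, 16)
    # a module wrapped this way can be traced, or handed to ov.convert_model with an example input
    traced = torch.jit.trace(wrapped, example_input)
    print(traced(example_input).shape)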