An Effective Style Token Weight Control Technique for End-to-End Emotional Speech Synthesis #7

Open

supikiti opened this issue Sep 21, 2020 · 0 comments

Labels

2020 Emotion icassp tacotron tts

Link

https://ieeexplore.ieee.org/document/8778667

What is it?

Proposes high-quality emotional speech synthesis using Global Style Tokens (GST).

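What makes it better than prior work?

When extracting a latent representation that expresses a specific emotion of a speaker, it is difficult to select which speech data to learn it from.
Each utterance is represented as a weighted sum of the per-emotion embeddings output by a pretrained GST model.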
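
The weighted-sum idea above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `weighted_style_embedding`, the 4-dimensional toy vectors, and the weight values are all assumptions made for illustration. In practice the per-emotion embeddings would come from a trained GST model.

```python
from typing import Dict, List


def weighted_style_embedding(
    emotion_embeddings: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return sum_k w_k * e_k over the emotions named in `weights`.

    `emotion_embeddings` maps an emotion label to its embedding vector
    (hypothetical values here; a real system would take them from a
    pretrained GST model).
    """
    dim = len(next(iter(emotion_embeddings.values())))
    mixed = [0.0] * dim
    for emotion, w in weights.items():
        emb = emotion_embeddings[emotion]
        if len(emb) != dim:
            raise ValueError(f"embedding for {emotion!r} has wrong size")
        for i, value in enumerate(emb):
            mixed[i] += w * value
    return mixed


# Toy per-emotion embeddings (assumed values, for illustration only).
emotion_embeddings = {
    "neutral": [0.0, 0.0, 1.0, 0.0],
    "happy": [1.0, 0.0, 0.0, 0.0],
    "sad": [0.0, 1.0, 0.0, 0.0],
}

# Mostly happy with a little neutral mixed in.
style = weighted_style_embedding(
    emotion_embeddings, {"happy": 0.75, "neutral": 0.25}
)
print(style)
```

Changing the weights moves the resulting style vector between emotions, so no single reference utterance has to be chosen.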

What is the key to the technique and method?

Same as above.

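How was its effectiveness verified?

Ran a subjective evaluation against a model that selects a reference utterance as the target, and confirmed improved performance.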

Any discussion?

Papers to read next

Style Token

supikiti self-assigned this
