Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset

isjwdu/DFADD

DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset

Paper, Dataset, Demo, Homepage

SLT 2024

Updates

[11/2024] We provide a checkpoint trained with the unofficial PFlow-TTS implementation.

[09/2024] We release all DFADD datasets on Hugging Face.

Key Features:

  1. DFADD is the first dataset that includes spoofed speech generated specifically with diffusion- and flow-matching-based TTS models.

  2. Compared to anti-spoofing models trained on ASVspoof data, models trained on DFADD achieve lower Equal Error Rates (EERs) when confronted with spoofed speech generated by these same diffusion and flow-matching methods.
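
For readers unfamiliar with the metric: the EER is the error rate at the decision threshold where the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The sketch below is only an illustration, not the evaluation code used in the paper. It assumes that a higher score means "more likely bona fide", and the function name `compute_eer` is hypothetical.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the Equal Error Rate by sweeping every observed score as a threshold.

    Assumption: a higher score means the detector considers the utterance more
    likely to be bona fide. An utterance is accepted when score >= threshold.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False acceptance rate: spoofed utterances that pass the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bona fide utterances that fall below it.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # With finitely many scores FAR and FRR may never be exactly equal,
            # so report their average at the closest operating point.
            best_gap = gap
            eer = (far + frr) / 2
    return eer
```

This brute-force sweep is quadratic in the number of scores, which is fine for illustration; a sorted, cumulative-count implementation would be preferable for large evaluation sets.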

Dataset Download

  1. Hugging Face dataset

     from datasets import load_dataset
     DFADD = load_dataset('isjwdu/DFADD')

  2. ZIP files
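
After loading from Hugging Face, it can help to check which splits were downloaded and how large they are. The following is a minimal sketch: the split names are not listed in this README, so the code discovers them rather than assuming them. The helper names `load_dfadd` and `summarize_splits` are hypothetical, and the `datasets` package is assumed to be installed.

```python
def summarize_splits(dataset_dict):
    """Return {split_name: number_of_examples} for a mapping of splits.

    Works with the DatasetDict returned by `load_dataset`, or with any
    mapping whose values support len().
    """
    return {name: len(split) for name, split in dataset_dict.items()}


def load_dfadd(repo_id="isjwdu/DFADD"):
    """Download (or reuse the local cache of) the DFADD dataset."""
    # Imported lazily so that summarize_splits can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id)
```

For example, `summarize_splits(load_dfadd())` returns a dictionary that maps each available split to its example count.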

Checkpoint Download

For those interested in PFlow-TTS, we provide a checkpoint trained for 1100 epochs on V100-32G.

Download PFlow Checkpoint

or

wget https://huggingface.co/datasets/isjwdu/DFADD/resolve/main/pflowtts_checkpoint_epoch%3D1099.ckpt
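
The same file can also be fetched from Python. The checkpoint's file name contains an `=` character, which appears percent-encoded as `%3D` in the URL above. The sketch below rebuilds that URL from its parts and downloads it using only the standard library; the helper names `checkpoint_url` and `download_checkpoint` are hypothetical, and the local destination file name is an assumption.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

REPO_ID = "isjwdu/DFADD"
CHECKPOINT_NAME = "pflowtts_checkpoint_epoch=1099.ckpt"


def checkpoint_url(repo_id=REPO_ID, filename=CHECKPOINT_NAME, revision="main"):
    """Build the direct-download URL for a file in a Hugging Face dataset repo."""
    # safe="" makes quote() percent-encode "=" as "%3D".
    encoded_name = quote(filename, safe="")
    return f"https://huggingface.co/datasets/{repo_id}/resolve/{revision}/{encoded_name}"


def download_checkpoint(destination=CHECKPOINT_NAME):
    """Download the checkpoint to `destination` and return the local path."""
    local_path, _headers = urlretrieve(checkpoint_url(), destination)
    return local_path
```

Calling `download_checkpoint()` saves the file in the current working directory under its unencoded name.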

To generate speech for a VCTK speaker, follow the PFlow-TTS inference instructions, replacing the vocoder with the HiFi-GAN VCTK_V1 model pre-trained on VCTK.

Acknowledgement

DFADD builds on several official and unofficial open-source implementations and datasets:

VCTK dataset, licensed under CC-BY-4.0.

LJ Speech dataset, released into the public domain.

HiFi-GAN Vocoder (Official), https://github.com/jik876/hifi-gan.

PFlow-TTS (Unofficial), https://github.com/p0p4k/pflowtts_pytorch.

NaturalSpeech2 (Unofficial), https://github.com/CODEJIN/NaturalSpeech2.

Grad-TTS (Official), https://github.com/huawei-noah/Speech-Backbones/tree/main/Grad-TTS.

StyleTTS2 (Official), https://github.com/yl4579/StyleTTS2.

Matcha-TTS (Official), https://github.com/shivammehta25/Matcha-TTS.

Citation

Please consider citing our paper if this work helps your research. Thank you!

@inproceedings{du2024dfadd,
  title={DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset},
  author={Du, Jiawei and Lin, I-Ming and Chiu, I-Hsiang and Chen, Xuanjun and Wu, Haibin and Ren, Wenze and Tsao, Yu and Lee, Hung-yi and Jang, Jyh-Shing Roger},
  booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
  pages={921--928},
  year={2024},
  organization={IEEE}
}
