FedML supports a comprehensive set of research-oriented FL datasets and models, both synthetic and public, including four representative FL datasets used in top-tier publications:
- EMNIST: The EMNIST dataset extends the MNIST dataset with uppercase and lowercase English characters.
- CIFAR-100: The CIFAR-100 dataset consists of 100 image classes, each containing 600 images.
- Shakespeare: The Shakespeare dataset is built from the collected works of William Shakespeare.
- Stack Overflow: The Stack Overflow dataset, originally hosted by Kaggle, consists of questions and answers from the Stack Overflow website. It is used for two tasks: tag prediction via logistic regression, and next word prediction.

The datasets available in FedML include:

- MNIST
- cifar10
- cifar100
- fed_cifar100
- fed_emnist
- cinic10
- ImageNet
- Landmarks
- shakespeare
- fed_shakespeare
- stackoverflow
- lending_club_loan
- NUS_WIDE
- UCI
- Synthetic
- edge_case_examples (tailored for the paper "Attack of the Tails: Yes, You Really Can Backdoor Federated Learning")

For a comprehensive list of datasets and models, see the following APIs: `fedml.data.load(args)` (https://github.com/FedML-AI/FedML/tree/master/python/fedml/data) and `fedml.model.create(args)` (https://github.com/FedML-AI/FedML/tree/master/python/fedml/model).
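
As a rough sketch of how these two entry points are usually combined, the snippet below wraps them in a small helper. It assumes the `fedml` package is installed and that `args` exposes `dataset` and `model` fields. The `(dataset, output_dim)` return shape of `fedml.data.load` and the second argument passed to `fedml.model.create` follow common FedML examples and are assumptions that may not hold for your installed version. `make_args` and `load_dataset_and_model` are hypothetical helper names, and the `"mnist"` and `"lr"` identifiers are illustrative.

```python
from types import SimpleNamespace


def make_args(dataset: str, model: str, **extra) -> SimpleNamespace:
    """Build a bare args object (hypothetical helper, not part of FedML).

    It only illustrates the two fields that select the dataset and the model.
    A real run needs the full configuration that fedml.init() produces.
    """
    return SimpleNamespace(dataset=dataset, model=model, **extra)


def load_dataset_and_model(args):
    """Load the dataset and model named by args.dataset and args.model."""
    # Imported lazily so this file can be imported without FedML installed.
    import fedml

    # Assumption: load() returns the dataset bundle plus the output dimension.
    dataset, output_dim = fedml.data.load(args)
    # Assumption: create() accepts the output dimension as a second argument.
    model = fedml.model.create(args, output_dim)
    return dataset, model


if __name__ == "__main__":
    # Illustrative identifiers; check the linked directories for valid names.
    args = make_args("mnist", "lr")
    print(args.dataset, args.model)
```

In practice, `args` would normally come from `fedml.init()` rather than being built by hand, so that fields such as the data cache directory and partitioning settings are populated from the configuration file.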
Their usage in different algorithms is as follows:
- Computer Vision: Federated EMNIST + CNN (2 conv layers)
- Computer Vision: CIFAR100 + ResNet18 (Group Normalization)
- Natural Language Processing: shakespeare + RNN (bi-LSTM)
- Natural Language Processing: stackoverflow (NWP) + RNN (bi-LSTM)
- Computer Vision: CIFAR10, CIFAR100, CINIC10 + ResNet
- Computer Vision: CIFAR10, CIFAR100, CINIC10 + MobileNet
- Computer Vision (linear model): MNIST + Logistic Regression
- Computer Vision (linear model): Synthetic + Logistic Regression
- Vertical federated learning (VFL): lending_club_loan + VFL
- Vertical federated learning (VFL): NUS_WIDE + VFL
- cross-silo CV: CIFAR10, CIFAR100, CINIC10 + ResNet
- cross-silo CV: CIFAR10, CIFAR100, CINIC10 + MobileNet
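
The pairings listed above can be kept in a small lookup table, for example to sanity-check an experiment configuration before launching it. This is a minimal sketch: the keys and values are the labels as written in the list, not FedML configuration identifiers, and `models_for` is a hypothetical helper.

```python
from typing import Dict, List

# Dataset -> model pairings transcribed from the list above.
# The strings are descriptive labels, not FedML config identifiers.
_PAIRINGS: Dict[str, List[str]] = {
    "Federated EMNIST": ["CNN (2 conv layers)"],
    "CIFAR100": ["ResNet18 (Group Normalization)", "ResNet", "MobileNet"],
    "CIFAR10": ["ResNet", "MobileNet"],
    "CINIC10": ["ResNet", "MobileNet"],
    "shakespeare": ["RNN (bi-LSTM)"],
    "stackoverflow (NWP)": ["RNN (bi-LSTM)"],
    "MNIST": ["Logistic Regression"],
    "Synthetic": ["Logistic Regression"],
    "lending_club_loan": ["VFL"],
    "NUS_WIDE": ["VFL"],
}

# Case-insensitive index so "cifar10" and "CIFAR10" resolve to the same entry.
_BY_LOWER = {name.lower(): models for name, models in _PAIRINGS.items()}


def models_for(dataset: str) -> List[str]:
    """Return the models listed for a dataset, ignoring case and outer spaces."""
    key = dataset.strip().lower()
    if key not in _BY_LOWER:
        raise KeyError(f"no pairing listed for dataset {dataset!r}")
    # Return a copy so callers cannot mutate the table.
    return list(_BY_LOWER[key])


if __name__ == "__main__":
    print(models_for("cifar10"))
```

A plain dictionary is enough here because the table is small and static; a larger catalogue would be better read from the directories linked above.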