Async simulation and FedBuff aggregation #197

ewenw · 2023-02-01T14:30:12Z

Implements an async simulation mode with FedBuff aggregation. Client arrivals are driven by device traces, so no constant arrival parameter is needed. Scheduling follows the min-heap method described here.
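For readers unfamiliar with the approach, here is a minimal sketch of min-heap scheduling over a virtual clock. It is not the code in this PR: the function name `simulate_async`, the `durations` list (standing in for per-client completion times taken from a device trace), and the `buffer_size` parameter are all assumptions made for illustration.

```python
import heapq
from typing import List, Tuple


def simulate_async(
    durations: List[float],   # hypothetical per-client training time from a trace
    max_concurrency: int,     # clients allowed to train at the same time
    buffer_size: int,         # updates collected before one aggregation
) -> List[Tuple[float, List[int]]]:
    """Return (virtual_time, client_ids) for every aggregation that fires."""
    heap: List[Tuple[float, int]] = []  # entries are (finish_time, client_id)
    next_client = 0
    clock = 0.0

    # Start as many clients as the concurrency limit allows.
    while next_client < len(durations) and len(heap) < max_concurrency:
        heapq.heappush(heap, (clock + durations[next_client], next_client))
        next_client += 1

    buffer: List[int] = []
    aggregations: List[Tuple[float, List[int]]] = []
    while heap:
        # The earliest finisher defines the new virtual time.
        clock, client_id = heapq.heappop(heap)
        buffer.append(client_id)

        # A freed slot is immediately given to the next client, if any.
        if next_client < len(durations):
            heapq.heappush(heap, (clock + durations[next_client], next_client))
            next_client += 1

        # Aggregate once the buffer is full; no waiting for stragglers.
        if len(buffer) == buffer_size:
            aggregations.append((clock, buffer))
            buffer = []

    # Updates left in a partially filled buffer are dropped in this sketch.
    return aggregations


print(simulate_async([5.0, 1.0, 3.0, 2.0], max_concurrency=2, buffer_size=2))
```

Because the heap always yields the next completion time, the clock jumps from event to event and no fixed arrival interval has to be configured.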

Benchmark comparison for Femnist (sync vs. async with FedBuff):

| Metric | Sync | Async |
| --- | --- | --- |
| Round | 100 | 180 |
| Virtual clock | 27,618s | 18,913s |
| Top_5 eval accuracy | 0.914 | 0.92 |

Params: 5 clients per round, model = resnet18, max_concurrency=10

The results are consistent with the hypothesis that async scheduling completes more rounds within the same amount of virtual clock time and tolerates stragglers better. They also show the cost of aggregating stale updates from previous rounds: more rounds are needed before convergence.
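To make the staleness effect concrete, here is a hedged sketch of a buffered aggregation step. It is illustrative only and not taken from this PR: the names `staleness_weight` and `aggregate_buffer`, the flat-list model representation, and the `1 / sqrt(1 + staleness)` down-weighting are assumptions chosen for the example.

```python
import math
from typing import List, Tuple


def staleness_weight(staleness: int) -> float:
    """Assumed down-weighting: older updates count for less."""
    return 1.0 / math.sqrt(1.0 + staleness)


def aggregate_buffer(
    model: List[float],
    buffer: List[Tuple[List[float], int]],  # (delta, model version the client started from)
    current_version: int,
    server_lr: float = 1.0,
) -> List[float]:
    """Apply the staleness-weighted average of buffered deltas to the model."""
    total = [0.0] * len(model)
    for delta, start_version in buffer:
        weight = staleness_weight(current_version - start_version)
        for i, value in enumerate(delta):
            total[i] += weight * value
    # Average over the buffer size, then take one server step.
    return [m + server_lr * t / len(buffer) for m, t in zip(model, total)]


# One fresh update (staleness 0) and one that is 3 versions old.
updated = aggregate_buffer(
    model=[0.0, 0.0],
    buffer=[([1.0, 2.0], 3), ([2.0, 4.0], 0)],
    current_version=3,
)
print(updated)
```

Under this weighting, an update that started three versions ago contributes half as much as a fresh one, which is one way stale contributions can slow per-round progress.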

Checks

I've included any doc changes needed for https://fedscale.readthedocs.io/en/latest/
I've made sure the following tests are passing.
Testing Configurations
- Dry Run (20 training rounds & 1 evaluation round)
- Cifar 10 (20 training rounds & 1 evaluation round)
- Femnist (20 training rounds & 1 evaluation round)

fanlai0990

Thank you, Ewen. It looks good to me. @AmberLJC @IKACE Can you please run some tests? Thanks.

IKACE · 2023-02-09T22:10:31Z

Thank you so much for your contribution, Ewen! Just one small thing:

The model_zoo config in benchmark/configs/fedbuff_femnist/conf.yml seems to cause an input/output mismatch error in training (see below). I can verify that commenting it out, as benchmark/configs/femnist/conf.yml does, solves the issue.

ewenw · 2023-02-10T01:09:46Z

Thanks for catching this. I just commented it out.

IKACE · 2023-02-10T01:19:28Z

Great! It looks perfect to me now. Thanks again!!

ewenw added 5 commits January 30, 2023 16:04

- Async FedBuff simulation support (876b03a)
- Add overrides (1539103)
- Address comments (549e426)
- Style (8bc8940)
- Test fix (325e586)

ewenw marked this pull request as ready for review February 1, 2023 14:57

fanlai0990 requested review from AmberLJC and fanlai0990 February 3, 2023 14:39

fanlai0990 reviewed Feb 6, 2023

- Comment out model zoo (294d759)

fanlai0990 merged commit 34e07aa into SymbioticLab:master Feb 10, 2023
