
Optimizations to be done on fed_benchmark #154

Open
jeandut opened this issue Jun 15, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@jeandut
Collaborator

jeandut commented Jun 15, 2022

Hi,

Thanks for starting the refactoring. Here are a few comments to push it even further. I will not mark anything as compulsory given the time constraints, but it would be great to record whatever is not done as an issue for future work.

  1. Optimization: in fed_benchmark.py, as much as possible, initialize dataloaders only after checking whether the experiment actually needs to be run.
  2. Cut main in fed_benchmark.py into sub-functions to make it even more readable (it is > 370 lines long at the moment...). I give more actionable suggestions below.
  3. Split main into sub-functions pooled_training, local_training, and strategy_training. That will make main much simpler (but +1 for single_centric_training, which is much better than before). A sketch follows this list.
  4. The logic that tests whether the experiments are finished should be hidden in sub-functions for readability:
    replace all the len(index_of_interest) < (NUM_CLIENTS + 1) tests that check whether an experiment is finished with a more explicit function, e.g. if check_exp_finished(df, hyperparameters, sname, num_updates) (see the second sketch after this list);
    likewise, move all the statements dealing with edge cases that should not happen into sub-functions.
  5. Lots of variables are given very short names, like training_dls for training_dataloaders, s for strategy, m for model... Better to use longer, more descriptive names.
  6. The logic for filling the hyperparameters parameter could also be delegated to sub-functions.
  7. conf.py should be renamed to something more explicit, like configuration_utils.py.
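
For point 3, a minimal sketch of the shape main could take. pooled_training, local_training and strategy_training are the names proposed above; everything else (the config dict, the sub-function bodies) is illustrative, not the actual fed_benchmark API:

import pandas as pd

def pooled_training(df, config):
    # ... train the pooled baseline and append its metrics to df ...
    return df

def local_training(df, config):
    # ... train one model per client and append their metrics to df ...
    return df

def strategy_training(df, config):
    # ... run the federated strategies and append their metrics to df ...
    return df

def main(config):
    # Each block of the current ~370-line main becomes a single call.
    df = pd.read_csv(config["results_file_path"])
    df = pooled_training(df, config)
    df = local_training(df, config)
    df = strategy_training(df, config)
    df.to_csv(config["results_file_path"], index=False)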
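
For point 4, a minimal sketch of check_exp_finished. The column names ("Method", "num_updates", one column per hyperparameter) and the NUM_CLIENTS + 1 convention (one row per client plus one global evaluation) are assumptions based on the existing test:

import pandas as pd

NUM_CLIENTS = 6  # assumed to be defined once in fed_benchmark.py

def check_exp_finished(df, hyperparameters, sname, num_updates):
    # Keep only the rows matching the strategy name, the number of updates
    # and every hyperparameter value, then check that all expected rows
    # (one per client plus one global) are already present.
    mask = (df["Method"] == sname) & (df["num_updates"] == num_updates)
    for key, value in hyperparameters.items():
        mask &= df[key] == value
    return mask.sum() >= (NUM_CLIENTS + 1)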
@jeandut added the enhancement label on Jun 15, 2022
@jeandut changed the title from "Optimization to be done on fed_benchmark" to "Optimizations to be done on fed_benchmark" on Jun 15, 2022
@jeandut
Collaborator Author

jeandut commented Jun 29, 2022

For the first point, we need a quick way to tell, before initializing the dataloaders, whether any experiments need to be run at all, i.e. whether all results for the specified parameter set are already present in the dataframe. One should implement a function:
check_new_xps_parameters_given(), which takes the do_baseline dict, the strategy arguments and the seed, and uses the helper function already in benchmark utils to check that. If nothing new is found it calls sys.exit() with a helpful message.
If and only if at least one experiment is missing do we initialize the dataloaders, so no time is wasted otherwise.
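
A minimal sketch, reusing the check_exp_finished sketched in the first comment (the issue refers to an existing helper in benchmark utils whose exact name and signature are not specified here; the argument names below are assumptions):

import sys

def check_new_xps_parameters_given(df, do_baseline, strategy_args, seed):
    # Look every requested experiment up in the results dataframe.
    for sname, requested in do_baseline.items():
        if not requested:
            continue
        hyperparameters = dict(strategy_args, seed=seed)  # assumed column names
        num_updates = hyperparameters.pop("num_updates")
        if not check_exp_finished(df, hyperparameters, sname, num_updates):
            return  # at least one experiment is missing: initialize dataloaders
    sys.exit("All requested experiments are already in the results file, nothing to run.")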
This would be very helpful when performing a grid search over different strategies in the same loop, i.e. being able to use scripts similar to the one below:

import os
from itertools import product

# FedOpt paper grid
lrs = [1e-3, 10**(-2.5), 10**(-2), 10**(-1.5), 10**(-1), 10**(-0.5)]
slrs = [1e-3, 10**(-2.5), 10**(-2), 10**(-1.5), 10**(-1), 10**(-0.5), 1, 10**(0.5), 10]
mus = [0.001, 0.01, 0.1, 1.]

for lr, slr, mu in product(lrs, slrs, mus):
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAvg --learning_rate {lr} -rfp fedavg.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedProx --learning_rate {lr} --mu {mu} -rfp fedprox.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAdam --server_learning_rate {slr} --learning_rate {lr} -rfp fedadam.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAdagrad --server_learning_rate {slr} --learning_rate {lr} -rfp fedadagrad.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedYogi --server_learning_rate {slr} --learning_rate {lr} -rfp fedyogi.csv --debug")

and not waste any time initializing dataloaders multiple times for FedAvg when mu changes but the learning rate does not.
