
Optimizations to be done on fed_benchmark #154

Open
jeandut opened this issue Jun 15, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@jeandut
Collaborator

jeandut commented Jun 15, 2022

Hi,

Thanks for starting the refactoring. Here are a few comments to push it even further. I will not mark anything as compulsory given the time constraints, but it would be great to record whatever is not done as an issue for future work.

  1. Optimization: in fed_benchmark.py, as much as possible, initialize dataloaders only after checking whether the experiment actually needs to be run.
  2. Cut main in fed_benchmark.py into sub-functions to make it even more readable (it is > 370 lines long at the moment...). I give more actionable suggestions below.
  3. Split main into sub-functions pooled_training, local_training, and strategy_training. That will make main much simpler (but +1 for single_centric_training, which is much better than before). A sketch follows this list.
  4. The logic that tests whether the experiments are finished should be hidden in sub-functions for readability:
    replace all the len(index_of_interest) < (NUM_CLIENTS + 1) tests that check whether an experiment is finished with a more explicit function, e.g. if check_exp_finished(df, hyperparameters, sname, num_updates) (see the second sketch after this list);
    likewise, move all the statements dealing with edge cases that should not happen into sub-functions.
  5. Lots of variables are given very short names, like training_dls for training_dataloaders, s for strategy, m for model... Better to use longer, more descriptive names.
  6. The logic for filling the hyperparameters parameter could also be delegated to sub-functions.
  7. conf.py should be renamed to something more explicit, like configuration_utils.py.
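
For point 3, a minimal sketch of the shape main could take. pooled_training, local_training and strategy_training are the names proposed above; everything else (the config dict, the sub-function bodies) is illustrative, not the actual fed_benchmark API:

import pandas as pd

def pooled_training(df, config):
    # ... train the pooled baseline and append its metrics to df ...
    return df

def local_training(df, config):
    # ... train one model per client and append their metrics to df ...
    return df

def strategy_training(df, config):
    # ... run the federated strategies and append their metrics to df ...
    return df

def main(config):
    # Each block of the current ~370-line main becomes a single call.
    df = pd.read_csv(config["results_file_path"])
    df = pooled_training(df, config)
    df = local_training(df, config)
    df = strategy_training(df, config)
    df.to_csv(config["results_file_path"], index=False)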
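
For point 4, a minimal sketch of check_exp_finished. The column names ("Method", "num_updates", one column per hyperparameter) and the NUM_CLIENTS + 1 convention (one row per client plus one global evaluation) are assumptions based on the existing test:

import pandas as pd

NUM_CLIENTS = 6  # assumed to be defined once in fed_benchmark.py

def check_exp_finished(df, hyperparameters, sname, num_updates):
    # Keep only the rows matching the strategy name, the number of updates
    # and every hyperparameter value, then check that all expected rows
    # (one per client plus one global) are already present.
    mask = (df["Method"] == sname) & (df["num_updates"] == num_updates)
    for key, value in hyperparameters.items():
        mask &= df[key] == value
    return mask.sum() >= (NUM_CLIENTS + 1)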
@jeandut added the enhancement label on Jun 15, 2022
@jeandut changed the title from "Optimization to be done on fed_benchmark" to "Optimizations to be done on fed_benchmark" on Jun 15, 2022
@jeandut
Collaborator Author

jeandut commented Jun 29, 2022

For the first point, we need a quick way to tell, before initializing the dataloaders, whether any experiments need to be run at all, i.e. whether all results for the specified parameter set are already present in the dataframe. One should implement a function:
check_new_xps_parameters_given(), which takes the do_baseline dict, the strategy arguments and the seed, and uses the helper function already in benchmark utils to check that. If nothing new is found it calls sys.exit() with a helpful message.
If and only if at least one experiment is missing do we initialize the dataloaders, so no time is wasted otherwise.
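
A minimal sketch, reusing the check_exp_finished sketched in the first comment (the issue refers to an existing helper in benchmark utils whose exact name and signature are not specified here; the argument names below are assumptions):

import sys

def check_new_xps_parameters_given(df, do_baseline, strategy_args, seed):
    # Look every requested experiment up in the results dataframe.
    for sname, requested in do_baseline.items():
        if not requested:
            continue
        hyperparameters = dict(strategy_args, seed=seed)  # assumed column names
        num_updates = hyperparameters.pop("num_updates")
        if not check_exp_finished(df, hyperparameters, sname, num_updates):
            return  # at least one experiment is missing: initialize dataloaders
    sys.exit("All requested experiments are already in the results file, nothing to run.")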
This would be very helpful when performing a grid search over different strategies in the same loop, i.e. being able to use scripts similar to the one below:

import os
from itertools import product

# FedOpt paper grid
lrs = [1e-3, 10**(-2.5), 10**(-2), 10**(-1.5), 10**(-1), 10**(-0.5)]
slrs = [1e-3, 10**(-2.5), 10**(-2), 10**(-1.5), 10**(-1), 10**(-0.5), 1, 10**(0.5), 10]
mus = [0.001, 0.01, 0.1, 1.]

for lr, slr, mu in product(lrs, slrs, mus):
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAvg --learning_rate {lr} -rfp fedavg.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedProx --learning_rate {lr} --mu {mu} -rfp fedprox.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAdam --server_learning_rate {slr} --learning_rate {lr} -rfp fedadam.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedAdagrad --server_learning_rate {slr} --learning_rate {lr} -rfp fedadagrad.csv --debug")
    os.system(f"python fed_benchmark.py -cfp ../config_lidc_idri.json --strategy FedYogi --server_learning_rate {slr} --learning_rate {lr} -rfp fedyogi.csv --debug")

and not waste any time initializing dataloaders multiple times for FedAvg when mu changes but the learning rate does not.
