diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..0f87cff
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,77 @@
+## Contributing to HpBandSter
+
+You are interested in developing a new feature or have found a bug?
+Awesome, feel welcome and read this guideline in order to find out how to best report your ideas so that we can include
+them as quickly as possible.
+
+### Report security issues
+
+You must never report security related issues, vulnerabilities or bugs including sensitive information to the bug tracker,
+ or elsewhere in public. Instead sensitive bugs must be sent by email to one of the maintainers.
+
+### New Features
+
+We are always happy to read about your ideas on how to improve HpBandSter.
+If you find yourself wishing for a feature that doesn't exist in HpBandster,
+you are probably not alone. There are bound to be others out there with similar needs.
+Open an issue on our [issues list on GitHub](https://github.com/automl/HpBandSter/issues),
+ and describe
+- the feature you would like to see
+- why you need it and
+- how it should work.
+
+If you already know how to implement, we love pull requests.
+Please see the [Pull request](#pull-requests) section, to read further details on pull requests.
+
+
+### Report Bugs
+
+Report issues at
+
+Before you report a bug, please make sure that:
+
+1. Your bug hasn't already been reported in our [issue tracker](https://github.com/automl/HpBandSter/issues).
+2. You are using the latest HpBandSter version.
+
+If you found a bug, please provide us the following information:
+
+- Your operating system name and version
+- Any information about your setup that could be helpful to resolve the bug
+- A simple example that reproduces that issue would be amazing. But if you can't provide an example,
+just note your observations as much detail as you can.
+- Feel free, to add a screenshot showing the issue, if it helps.
+
+If the issue needs an urgent fix, please mark it with the label "urgent".
+Then either fix it or mark as "help wanted".
+
+### Work on own features
+
+To work on own features, first you need to create a fork of the original repository.
+A good tutorial on how to do this is in the Github Guide: [Fork a repo](https://help.github.com/articles/fork-a-repo/).
+
+You could install the forked repository via:
+
+
+git clone git@github.com:automl/HpBandSter.git
+cd HpBandSter
+python3 setup.py develop --user 
+
+
+### Pull requests
+
+If you have not worked with pull requests, you can learn how from this *free* series [How to Contribute to an Open Source Project on GitHub](https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github).
+Or read more in the official github documentation
+
+You know how to fix a bug or implement your own feature, follow this small guide:
+
+- Check the issue tracker if someone has already reported the same idea or found the same bug.
+ (Note: If you only want to make some smaller changes, opening a new issue is less important, as the changes can be
+ discussed in the pull request.)
+- Create a pull request, after you have implemented your changes in your fork and make a pull request.
+ Using a separate branch for the fix is recommend.
+- Pull request should include tests.
+- We are using the Google Style Python Docstrings. Take a look at this [example](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).
+- The code should follow the PEP8 coding convention.
+- We try to react as fast as possible to your pull request, but if you haven't received a feedback from us after some days
+ feel free to leave a comment on the pull request.
+
\ No newline at end of file
diff --git a/README.md b/README.md
index d8b18ad..c44f58b 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,19 @@
-# HpBandSter [![Build Status](https://travis-ci.org/automl/HpBandSter.svg?branch=master)](https://travis-ci.org/automl/HpBandSter)
+# HpBandSter [![Build Status](https://travis-ci.org/automl/HpBandSter.svg?branch=master)](https://travis-ci.org/automl/HpBandSter) [![codecov](https://codecov.io/gh/automl/HpBandSter/branch/master/graph/badge.svg)](https://codecov.io/gh/automl/HpBandSter)
 a distributed Hyperband implementation on Steroids
 
+## News: Not Maintained Anymore!
+
+Please note that we don't maintain this repository anymore. We also cannot ensure that we can reply to issues in the issue tracker or look into PRs.
+
+We offer two successor packages which showed in our [HPOBench paper](https://arxiv.org/abs/2109.06716) superior performance:
+
+1. [SMAC3](https://github.com/automl/SMAC3): is a versatile HPO package with different HPO strategies. It also implements the main idea of BOHB, but uses a RF (or GP) as a predictive model instead of a KDE.
+2. [DEHB](https://github.com/automl/dehb): is a HPO package using a combination of differential evolution and hyperband.
+
+In particular, SMAC3 has an active group of developers working on it and maintaining it. So, we strongly recommend using one of these two packages instead of HPBandSter.
+
+## Overview
+
 This python 3 package is a framework for distributed hyperparameter optimization.
 It started out as a simple implementation of [Hyperband (Li et al. 2017)](http://jmlr.org/papers/v18/16-558.html),
 and contains an implementation of [BOHB (Falkner et al. 2018)](http://proceedings.mlr.press/v80/falkner18a.html)
diff --git a/codecov.yml b/codecov.yml
new file mode 100644
index 0000000..b9c0233
--- /dev/null
+++ b/codecov.yml
@@ -0,0 +1,32 @@
+codecov:
+  notify:
+    require_ci_to_pass: yes
+
+coverage:
+  precision: 2
+  round: down
+  range: "70...100"
+
+  status:
+    project: yes
+    patch: yes
+    changes: no
+
+parsers:
+  gcov:
+    branch_detection:
+      conditional: yes
+      loop: yes
+      method: no
+      macro: no
+
+comment:
+  layout: "header, diff"
+  behavior: default
+  require_changes: no
+
+ignore:
+  - "hpbandster/examples/.*"
+  - "hpbandster/workers/hpolibbenchmark.py"
+  - "hpbandster/visualization.py"
+  - "hpbandster/optimizers/lerning_curve_models/base.py"
diff --git a/hpbandster/examples/example_5_pytorch_worker.py b/hpbandster/examples/example_5_pytorch_worker.py
index 72902ab..3109ab4 100644
--- a/hpbandster/examples/example_5_pytorch_worker.py
+++ b/hpbandster/examples/example_5_pytorch_worker.py
@@ -268,7 +268,7 @@ def number_of_parameters(self):
 
 
 if __name__ == "__main__":
-    worker = KerasWorker(run_id='0')
+    worker = PyTorchWorker(run_id='0')
     cs = worker.get_configspace()
 
     config = cs.sample_configuration().get_dictionary()
diff --git a/hpbandster/optimizers/config_generators/h2bo.py b/hpbandster/optimizers/config_generators/h2bo.py
index 72fc6e9..da1587a 100644
--- a/hpbandster/optimizers/config_generators/h2bo.py
+++ b/hpbandster/optimizers/config_generators/h2bo.py
@@ -227,8 +227,9 @@ def new_result(self, job, update_model=True):
 
         # skip model building:
         # a) if not enough points are available
-        if np.sum(np.isfinite(self.losses[budget])) < min_num_points:
-            self.logger.debug("Only %i successful run(s) for budget %f available, need more than %s -> can't build model!"%(np.sum(np.isfinite(self.losses[budget])), budget, min_num_points))
+        tmp = np.array([np.mean(r) for r in self.losses[budget]])
+        if np.sum(np.isfinite(tmp)) < min_num_points:
+            self.logger.debug("Only %i successful run(s) for budget %f available, need more than %s -> can't build model!"%(np.sum(np.isfinite(tmp)), budget, min_num_points))
             return
 
         # b) during warnm starting when we feed previous results in and only update once
@@ -266,15 +267,6 @@ def new_result(self, job, update_model=True):
 
         if self.bw_estimator in ['mlcv'] and n_good < 3:
             self.kde_models[budget]['good'].bandwidths[:] = self.kde_models[budget]['bad'].bandwidths
-
-        print('='*50)
-        print(self.kde_models[budget]['good'].bandwidths)
-        #print('best:\n',self.kde_models[budget]['good'].data[0])
-        print(self.kde_models[budget]['good'].data.mean(axis=0))
-        print(self.kde_models[budget]['good'].data.std(axis=0))
-        print((train_losses[idx])[:n_good])
-
-        print(self.kde_models[budget]['bad'].bandwidths)
 
         # update probs for the categorical parameters for later sampling
         self.logger.debug('done building a new model for budget %f based on %i/%i split\nBest loss for this budget:%f\n\n\n\n\n'%(budget, n_good, n_bad, np.min(train_losses)))
diff --git a/hpbandster/optimizers/iterations/successiveresampling.py b/hpbandster/optimizers/iterations/successiveresampling.py
index d023a72..7753749 100644
--- a/hpbandster/optimizers/iterations/successiveresampling.py
+++ b/hpbandster/optimizers/iterations/successiveresampling.py
@@ -21,8 +21,8 @@ def __init__(self, *args, resampling_rate = 0.5, min_samples_advance = 1, **kwar
                     stage regardless of the fraction.
         """
 
-        self.resampling_rate = resampling_rate
-        self.min_samples_advance = min_samples_advance
+        self.resampling_rate = resampling_rate
+        self.min_samples_advance = min_samples_advance
 
 
     def _advance_to_next_stage(self, config_ids, losses):
diff --git a/hpbandster/optimizers/kde/kernels.py b/hpbandster/optimizers/kde/kernels.py
index 1d39090..83bbaeb 100644
--- a/hpbandster/optimizers/kde/kernels.py
+++ b/hpbandster/optimizers/kde/kernels.py
@@ -169,7 +169,7 @@ def sample(self, sample_indices=None, num_samples=1):
         """
         if sample_indices is None:
             sample_indices = np.random.choice(self.data.shape[0], size=num_samples)
-        samples = self.data[sample_indices]
+        samples = self.data[sample_indices]
 
         possible_steps = np.arange(-self.num_values+1,self.num_values)
         idx = (np.abs(possible_steps) < 1e-2)
diff --git a/setup.py b/setup.py
index 12b214d..c224f72 100644
--- a/setup.py
+++ b/setup.py
@@ -2,7 +2,7 @@
 
 setup(
     name='hpbandster',
-    version='0.7.3',
+    version='0.7.4',
     description='HyPerBAND on STERoids, a distributed Hyperband implementation with lots of room for improvement',
     author='Stefan Falkner',
     author_email='sfalkner@cs.uni-freiburg.de',
@@ -16,4 +16,5 @@
         'docu': ['sphinx', 'sphinx_rtd_theme', 'sphinx_gallery'],
     },
     keywords=['distributed', 'optimization', 'multifidelity'],
+    test_suite="tests"
 )
diff --git a/tests/__init__.py b/tests/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/test_result.py b/tests/test_result.py
new file mode 100644
index 0000000..147b87d
--- /dev/null
+++ b/tests/test_result.py
@@ -0,0 +1,80 @@
+import unittest
+from hpbandster.core.result import Run, extract_HBS_learning_curves, \
+    json_result_logger, logged_results_to_HBS_result
+from hpbandster.core.base_iteration import Datum
+
+import ConfigSpace as CS
+
+import tempfile
+import sys
+import os.path
+
+
+class TestResult(unittest.TestCase):
+
+    def test_init(self):
+        run_obj = Run(config_id=1, budget=2, loss=[3, 1], info={'loss': [3, 1]},
+                      time_stamps={'submitted': 0, 'started': 10}, error_logs=None)
+
+        self.assertEqual(run_obj.config_id, 1)
+        self.assertEqual(run_obj.budget, 2)
+        self.assertListEqual(run_obj.loss, [3, 1])
+        self.assertListEqual(run_obj.info['loss'], [3, 1])
+        self.assertDictEqual(run_obj.time_stamps, {'submitted': 0, 'started': 10})
+
+class TestExtraction(unittest.TestCase):
+    def test_extract_HBS_learning_curves(self):
+        run_1 = Run('1', 10, 1, {}, {}, None)
+        run_2 = Run('2', 6, 3, {}, {}, None)
+        # the function should filter out invalid runs --> runs with no loss value
+        run_3 = Run('3', 3, None, {}, {}, None)
+        run_4 = Run('4', 1, 7, {}, {}, None)
+
+        self.assertListEqual(extract_HBS_learning_curves([run_1, run_2, run_3, run_4]),
+                             [[(1, 7), (6, 3), (10, 1)]])
+
+class TestJsonResultLogger(unittest.TestCase):
+    def test_write_new_config(self):
+
+        cs = CS.ConfigurationSpace()
+        cs.add_hyperparameter(CS.CategoricalHyperparameter('test', [1]))
+
+        with tempfile.TemporaryDirectory() as temp_dir:
+            logger = json_result_logger(temp_dir)
+
+            logger.new_config('1', cs.sample_configuration().get_dictionary(), {'test': 'test'})
+
+            self.assertTrue(os.path.exists(temp_dir))
+            self.assertTrue(os.path.exists(os.path.join(temp_dir, 'configs.json')))
+            self.assertTrue(os.path.exists(os.path.join(temp_dir, 'results.json')))
+            self.assertEqual(logger.config_ids, set('1'))
+
+            with open(os.path.join(temp_dir, 'configs.json')) as fh:
+                data = fh.read()
+                data = data.rstrip()
+                self.assertEqual(data, r'["1", {"test": 1}, {"test": "test"}]')
+
+"""
+class TestResultObject(unittest.TestCase):
+    def setUp(self):
+
+        self.temp_dir = tempfile.TemporaryDirectory()
+        with open(os.path.join(self.temp_dir.name, 'configs.json'), 'w') as f:
+            f.write('[[0, 0, 0], {"act_f": "Tanh"}, {"model_based_pick": false}]\n')
+            f.write('[[0, 0, 1], {"act_f": "ReLU"}, {"model_based_pick": false}]')
+
+        with open(os.path.join(self.temp_dir.name, 'results.json'), 'w') as f:
+            f.write('[[0, 0, 0], 5, {"submitted": 15, "started": 16, "finished": 17},'
+                    ' {"loss": 7, "info": {"loss": 7}}, null]\n')
+            f.write('[[0, 0, 1], 10, {"submitted": 17, "started": 18, "finished": 19},'
+                    ' {"loss": 9, "info": {"loss": 9}}, null]')
+
+    def tearDown(self):
+        os.remove(os.path.join(self.temp_dir.name, 'configs.json'))
+        os.remove(os.path.join(self.temp_dir.name, 'results.json'))
+        #os.rmdir(self.temp_dir.name)
+
+    def test_logged_results_to_HBS_result(self):
+        # result, config = logged_results_to_HBS_result(self.temp_dir.name)
+        # print(result, config)
+"""
diff --git a/tests/test_utils.py b/tests/test_utils.py
new file mode 100644
index 0000000..3220cb7
--- /dev/null
+++ b/tests/test_utils.py
@@ -0,0 +1,36 @@
+import unittest
+import logging
+
+logging.basicConfig(level=logging.WARNING)
+
+import ConfigSpace as CS
+from hpbandster.core.worker import Worker
+import hpbandster.core.nameserver as hpn
+import hpbandster.utils as utils
+
+rapid_development = True
+rapid_development = False
+
+
+class TestUtils(unittest.TestCase):
+
+    def test_local_nameserver_1(self):
+        host, port = utils.start_local_nameserver(host=None, nic_name=None)
+        self.assertEqual(host, 'localhost')
+
+        ns = hpn.NameServer('0', host=host)
+        ns_host, ns_port = ns.start()
+        self.assertEqual(ns.host, 'localhost')
+        ns.shutdown()
+
+    def test_local_nameserver_2(self):
+        host, port = utils.start_local_nameserver(host=None, nic_name='lo')
+        self.assertEqual(host, '127.0.0.1')
+
+        ns = hpn.NameServer('0', host=host)
+        ns_host, ns_port = ns.start()
+        self.assertEqual(ns.host, '127.0.0.1')
+        ns.shutdown()
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/tests/test_worker.py b/tests/test_worker.py
index 720dadf..910c818 100644
--- a/tests/test_worker.py
+++ b/tests/test_worker.py
@@ -4,142 +4,169 @@
 import tempfile
 import logging
 
-logging.basicConfig(level=logging.WARNING)
+logging.basicConfig(level=logging.WARNING)
 
 import ConfigSpace as CS
 from hpbandster.core.worker import Worker
 import hpbandster.core.nameserver as hpn
+
 from hpbandster.optimizers.hyperband import HyperBand
-rapid_development=True
-rapid_development=False
+from hpbandster.optimizers.bohb import BOHB
+from hpbandster.optimizers.h2bo import H2BO
+# from hpbandster.optimizers.lcnet import LCNet
+from hpbandster.optimizers.randomsearch import RandomSearch
+rapid_development = True
+rapid_development = False
 
 
 
 class TestWorker(Worker):
-
-    def __init__(self, sleep_duration=0, *args, **kwargs):
-        super().__init__(*args, **kwargs)
-        self.sleep_duration = sleep_duration
-
-    def compute(self, *args, **kwargs):
-        time.sleep(self.sleep_duration)
-        return({'loss': 0, 'info': {}})
+
+    def __init__(self, sleep_duration=0, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.sleep_duration = sleep_duration
+
+    def compute(self, *args, **kwargs):
+        time.sleep(self.sleep_duration)
+        return ({'loss': 0, 'info': {}})
 
 
 
 class TestWorkers(unittest.TestCase):
-    def setUp(self):
-        self.configspace = CS.ConfigurationSpace(42)
-        self.configspace.add_hyperparameters([CS.UniformFloatHyperparameter('cont1', lower=0, upper=1)])
-
-        self.run_id = 'hpbandsterUnittestWorker'
-
-    def tearDown(self):
-        self.configspace = None
-
-    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
-    def test_NoNameserverForeground(self):
-        w = TestWorker(run_id='test')
-        self.assertRaises(RuntimeError,w.run, background=False)
-
-    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
-    def test_NoNameserverBackground(self):
-        w = TestWorker(run_id='test')
-        w.run(background=True)
-        w.thread.join()
-        self.assertFalse(w.thread.is_alive())
-
-    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
-    def test_NoNameserverCredentials(self):
-        w = TestWorker(run_id='test')
-
-        with tempfile.TemporaryDirectory() as working_directory:
-            self.assertRaises(RuntimeError,w.load_nameserver_credentials,working_directory, num_tries=1)
-
-    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
-    def test_Timeout(self):
-
-
-        class dummy_callback(object):
-            def register_result(self, *args, **kwargs):
-                pass
-
-        host = hpn.nic_name_to_host('lo')
-
-        w = TestWorker(run_id=self.run_id, sleep_duration=0, timeout=1, host=host)
-
-        dc = dummy_callback()
-
-        with tempfile.TemporaryDirectory() as working_directory:
-            # start up nameserver
-            ns = hpn.NameServer(self.run_id, working_directory=working_directory, host=host)
-            ns_host, ns_port = ns.start()
-
-            # connect worker to it
-            w.load_nameserver_credentials(working_directory)
-            w.run(background=True)
-
-            # start a computation with a dummy callback and dummy id
-            w.start_computation(dc, '0')
-
-            # at this point the worker must still be alive
-            self.assertTrue(w.thread.is_alive())
-
-            # as the timeout is only 1, after 2 seconds, the worker thread should be dead
-            time.sleep(2)
-            self.assertFalse(w.thread.is_alive())
-
-            # shutdown the nameserver before the temporary directory is gone
-            ns.shutdown()
-
-    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
-    def test_Timeout(self):
-
-        host = hpn.nic_name_to_host('lo')
-
-        with tempfile.TemporaryDirectory() as working_directory:
-
-            # start up nameserver
-            ns = hpn.NameServer(self.run_id, working_directory=working_directory, host=host)
-            ns_host, ns_port = ns.start()
-
-
-
-            # create workers and connect them to the nameserver
-            workers = []
-            for i in range(3):
-                w = TestWorker(run_id=self.run_id, sleep_duration=2, timeout=1, host=host, id=i)
-                w.load_nameserver_credentials(working_directory)
-                w.run(background=True)
-                workers.append(w)
-
-
-            # at this point all workers must still be alive
-            alive = [w.thread.is_alive() for w in workers]
-            self.assertTrue(all(alive))
-
-            opt = HyperBand(run_id=self.run_id,
-                            configspace=self.configspace,
-                            nameserver=ns_host,
-                            nameserver_port=ns_port,
-                            min_budget=1, max_budget=3, eta=3, ping_interval=1)
-            opt.run(1, min_n_workers=3)
-
-            # only one worker should be alive when the run is done
-            alive = [w.thread.is_alive() for w in workers]
-            self.assertEqual(1, sum(alive))
-
-            opt.shutdown()
-            time.sleep(2)
-
-            # at this point all workers should have finished
-            alive = [w.thread.is_alive() for w in workers]
-            self.assertFalse(any(alive))
-
-            # shutdown the nameserver before the temporary directory is gone
-            ns.shutdown()
+    def setUp(self):
+        self.configspace = CS.ConfigurationSpace(42)
+        self.configspace.add_hyperparameters([CS.UniformFloatHyperparameter('cont1', lower=0, upper=1)])
+
+        self.run_id = 'hpbandsterUnittestWorker'
+
+    def tearDown(self):
+        self.configspace = None
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_NoNameserverForeground(self):
+        w = TestWorker(run_id='test')
+        self.assertRaises(RuntimeError, w.run, background=False)
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_NoNameserverBackground(self):
+        w = TestWorker(run_id='test')
+        w.run(background=True)
+        w.thread.join()
+        self.assertFalse(w.thread.is_alive())
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_NoNameserverCredentials(self):
+        w = TestWorker(run_id='test')
+
+        with tempfile.TemporaryDirectory() as working_directory:
+            self.assertRaises(RuntimeError, w.load_nameserver_credentials, working_directory, num_tries=1)
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_Timeout(self):
+        class dummy_callback(object):
+            def register_result(self, *args, **kwargs):
+                pass
+
+        host = hpn.nic_name_to_host('lo')
+
+        w = TestWorker(run_id=self.run_id, sleep_duration=0, timeout=1, host=host)
+
+        dc = dummy_callback()
+
+        with tempfile.TemporaryDirectory() as working_directory:
+            # start up nameserver
+            ns = hpn.NameServer(self.run_id, working_directory=working_directory, host=host)
+            ns_host, ns_port = ns.start()
+
+            # connect worker to it
+            w.load_nameserver_credentials(working_directory)
+            w.run(background=True)
+
+            # start a computation with a dummy callback and dummy id
+            w.start_computation(dc, '0')
+
+            # at this point the worker must still be alive
+            self.assertTrue(w.thread.is_alive())
+
+            # as the timeout is only 1, after 2 seconds, the worker thread should be dead
+            time.sleep(2)
+            self.assertFalse(w.thread.is_alive())
+
+            # shutdown the nameserver before the temporary directory is gone
+            ns.shutdown()
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_Timeout(self):
+        host = hpn.nic_name_to_host('lo')
+
+        with tempfile.TemporaryDirectory() as working_directory:
+            # start up nameserver
+            ns = hpn.NameServer(self.run_id, working_directory=working_directory, host=host)
+            ns_host, ns_port = ns.start()
+
+            # create workers and connect them to the nameserver
+            workers = []
+            for i in range(3):
+                w = TestWorker(run_id=self.run_id, sleep_duration=2, timeout=1, host=host, id=i)
+                w.load_nameserver_credentials(working_directory)
+                w.run(background=True)
+                workers.append(w)
+
+            # at this point all workers must still be alive
+            alive = [w.thread.is_alive() for w in workers]
+            self.assertTrue(all(alive))
+
+            opt = HyperBand(run_id=self.run_id,
+                            configspace=self.configspace,
+                            nameserver=ns_host,
+                            nameserver_port=ns_port,
+                            min_budget=1, max_budget=3, eta=3, ping_interval=1)
+            opt.run(1, min_n_workers=3)
+
+            # only one worker should be alive when the run is done
+            alive = [w.thread.is_alive() for w in workers]
+            self.assertEqual(1, sum(alive))
+
+            opt.shutdown()
+            time.sleep(2)
+
+            # at this point all workers should have finished
+            alive = [w.thread.is_alive() for w in workers]
+            self.assertFalse(any(alive))
+
+            # shutdown the nameserver before the temporary directory is gone
+            ns.shutdown()
+
+    @unittest.skipIf(rapid_development, "test skipped to accelerate developing new tests")
+    def test_optimizers(self):
+        optimizers = [BOHB, H2BO, RandomSearch]
+
+        for optimizer in optimizers:
+            host = hpn.nic_name_to_host('lo')
+
+            with tempfile.TemporaryDirectory() as working_directory:
+                # start up nameserver
+                ns = hpn.NameServer(self.run_id, working_directory=working_directory, host=host)
+                ns_host, ns_port = ns.start()
+
+                # create workers and connect them to the nameserver
+                w = TestWorker(run_id=self.run_id, sleep_duration=2, timeout=1, host=host, id=1)
+                w.load_nameserver_credentials(working_directory)
+                w.run(background=True)
+
+                opt = optimizer(run_id=self.run_id,
+                                configspace=self.configspace,
+                                nameserver=ns_host,
+                                nameserver_port=ns_port,
+                                min_budget=1, max_budget=3, eta=3, ping_interval=1)
+                opt.run(1, min_n_workers=1)
+
+                opt.shutdown()
+                time.sleep(2)
+                # shutdown the nameserver before the temporary directory is gone
+                ns.shutdown()
 
 
 
 if __name__ == '__main__':
-    unittest.main()
+    unittest.main()