v0.5.0: Big refactor, `GlobalPowerLimitOptimizer`
What's New
Callback-based architecture
zeus.callback.Callback
is the new backbone for Zeus componentsGlobalPowerLimitOptimizer
is the shiny new way to online-profile and optimize the power limit of DNN training.EarlyStopController
monitors and manages all sorts of conditions to determine whether training should stop.
Extensive testing
tests/
is richer than ever. With deep component tests with exhaustive parametrization, there are now around 1500 test cases.- Especially,
zeus.util.testing.ReplayZeusMonitor
exposes the same public API asZeusMonitor
but replays the measurement window logs produced byZeusMonitor
, instead of doing actual measurement. With this, Zeus can now be tested without any actual GPUs.