In this repository, we build on the efficiency and robustness advantages of hierarchical learning with HAC-E-SAK, an extension of the Hierarchical Actor-Critic (HAC) framework. We introduce a synchronized, knowledge-based exploration paradigm that guides environment discovery in tasks with continuous state and action spaces. While HAC relies on a strictly defined hierarchical organization to learn rapidly by training multilevel subtask transition functions in parallel, it does not extend this principle to the exploration phase of training, a gap that our approach addresses. Further, HAC's exploration strategy consists of simple
We extend the hierarchical organization that leading methods use for subtask learning to the parallel purpose of structured exploration, which allows explicit synchronization between levels. We evaluate our method experimentally in sparse-reward, complex-action scenarios, where it shows further improved sample efficiency and consistently robust performance. In these experiments, the combination of our synchronization structure with our adversarial knowledge-based exploration learning strategy outperforms all other presented procedures, supporting HAC-E-SAK as a robust method for hierarchical learning.