Yuanlin Duan, Wensen Mao and He Zhu
Code for "Learning World Models for Unconstrained Goal Navigation" (NeurIPS 2024), a method that improves the quality of the world model in model-based reinforcement learning (MBRL).
If you find our paper or code useful, please cite us:
@article{duan2024learning,
  title={Learning World Models for Unconstrained Goal Navigation},
  author={Duan, Yuanlin and Mao, Wensen and Zhu, He},
  journal={arXiv preprint arXiv:2411.02446},
  year={2024}
}
The richness of the environment states and dynamics captured by the replay buffer sets the upper limit on what the world model can learn about the real world, and it strongly influences how well the policy can ultimately be trained.
MUN learns a world model from state transitions between any pair of states in the replay buffer, whether tracing backward along recorded trajectories or transitioning across separate trajectories.
Previous replay buffers: one-way transitions (forward along recorded trajectories only).
Our method (MUN)'s replay buffer: two-way transitions (in both directions, and across separate trajectories).
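A minimal sketch of this idea, assuming the buffer is stored as a list of trajectories (the function and variable names below are illustrative, not the actual MUN implementation):

```python
# Sketch only: sample (start_state, goal_state) pairs from a replay buffer
# in both directions along a trajectory and across separate trajectories.
import random

def sample_state_pair(trajectories):
    """trajectories: non-empty list of non-empty lists of states."""
    traj = random.choice(trajectories)
    i = random.randrange(len(traj))
    mode = random.choice(["forward", "backward", "cross"])
    if mode == "forward":
        # later state on the same trajectory (the only case a one-way buffer covers)
        j = random.randrange(i, len(traj))
        return traj[i], traj[j]
    if mode == "backward":
        # earlier state on the same trajectory (tracing back along the trajectory)
        j = random.randrange(0, i + 1)
        return traj[i], traj[j]
    # any state from a separate trajectory (cross-trajectory transition)
    other = random.choice(trajectories)
    return traj[i], random.choice(other)
```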
Along trajectories evolving toward the agent's goal, there often exist certain critical states, which we term key subgoal states.
We observed that key subgoal states typically coincide with significant differences between consecutive actions, so we designed the DAD algorithm to find key subgoal states.
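As a rough illustration of this action-difference idea (the distance metric and threshold below are assumptions for the sketch, not the exact DAD procedure from the paper):

```python
# Sketch only: flag states where consecutive actions change sharply
# as candidate key subgoal states.
import numpy as np

def candidate_key_subgoals(states, actions, threshold=1.0):
    """states: array of shape (T, state_dim); actions: array of shape (T, action_dim)."""
    diffs = np.linalg.norm(np.diff(actions, axis=0), axis=1)  # ||a_{t+1} - a_t||
    key_idx = [t + 1 for t, d in enumerate(diffs) if d > threshold]
    return [states[t] for t in key_idx]
```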
Some key subgoals found by DAD:
MUN trains better policies than other baselines across different tasks:
Success rates of MUN transitioning between different key subgoal pairs:
MUN/
|- Config/ # config files for each environment
|- dreamerv2_APS/ # MUN implementation
|- dreamerv2_APS/gc_main.py # main running file
Install all dependencies with pip:
pip install -r library.txt
Then install the repo as an editable package:
pip install -e .
We evaluate MUN on six environments: Ant Maze, Walker, 3-block Stacking, Block Rotation, Pen Rotation, and Fetch Slide.
MuJoCo installation: MuJoCo 2.0 is required.
Ant Maze, 3-Block Stack environments:
The mrl codebase contains the Ant Maze and 3-block Stacking environments.
git clone https://github.com/hueds/mrl.git
Before testing these two environments, make sure that the mrl path is set in the PYTHONPATH.
# if you want to run environments in the mrl codebase (Ant Maze, 3-block Stacking)
export PYTHONPATH=<path to your mrl folder>
Walker environment:
Clone the lexa-benchmark and dm_control repos.
git clone https://github.com/hueds/dm_control
git clone https://github.com/hueds/lexa-benchmark.git
Set up dm_control as a local Python module:
cd dm_control
pip install .
Set LD_PRELOAD to the libGLEW path, and set the MUJOCO_GL and MUJOCO_RENDERER variables.
# if you want to run environments in the lexa-benchmark codebase
MUJOCO_GL=egl MUJOCO_RENDERER=egl LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/x86_64-linux-gnu/libGL.so PYTHONPATH=<path to your lexa-benchmark folder like "/home/edward/lexa-benchmark">
Training scripts (replace RotatePen with the environment name from the config file, and set your own log directory):
python dreamerv2_APS/gc_main.py --configs RotatePen --logdir "your logdir path"
Use TensorBoard to check the results:
tensorboard --logdir ~/logdir/your_logdir_name
MUN builds on many prior works, and we thank the authors for their contributions.