Questions about evaluating Octo on LIBERO tasks #40

MasterXiong · 2024-10-17T04:32:11Z

Hi,

Thanks for your great work! I'm trying to evaluate Octo, a vision-language-action model, on LIBERO, and I have several specific questions that I hope you can help with so that I evaluate it correctly.

1. What is the action space used in LIBERO? Is it a delta in end-effector position and orientation? What representation is used for orientation: axis-angle, or roll, pitch, yaw? How is gripper open/close represented: 1 for open and -1 for closed, or some other numerical form?
2. I got the error "ValueError: executing action in terminated episode" when evaluating Octo in the LIBERO env. Is it because the env does not set done to True when it reaches the maximum episode length?
3. I notice that the benchmark provides 50 initial states for each task. Are these initial states different from the initial states in the task demos? And how can I generate new initial states? Is it enough to reset the env with a different seed?

Thanks a lot for your help!


zhuyifengzju · 2024-10-20T15:08:34Z

Thanks for your interest in the work. Answers to your questions:

1. It's the OSC + gripper action space: 6-dim end-effector task-space commands + 1-dim gripper action. Orientation uses axis-angle. This convention strictly follows the robosuite repo. For the gripper, your understanding is basically correct, but I think the values are 0/1 if I remember correctly (I don't have time to run the code at the moment, so please check the dataset to see its range).
2. I need more details, e.g. if you can point us to the specific line that throws this error. We do assume a maximum episode length, which is configurable.
3. Yes, they are generated differently from the ones in the demos. And your understanding of how to generate new states is correct.
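To make the layout described in the first answer concrete, here is a minimal sketch in plain Python. The helper names (make_action, split_axis_angle, gripper_range) are hypothetical and not part of LIBERO or robosuite. It only illustrates the 3 + 3 + 1 ordering and the axis-angle convention (a 3-vector whose direction is the rotation axis and whose norm is the angle). It does not settle the gripper value range, which, as suggested above, is best read off the dataset.

```python
import math


def make_action(delta_pos, delta_rot_axis_angle, gripper):
    """Pack a 7-dim action: 3 position deltas, 3 axis-angle rotation deltas, 1 gripper value."""
    if len(delta_pos) != 3 or len(delta_rot_axis_angle) != 3:
        raise ValueError("expected 3 position and 3 rotation components")
    return (
        [float(x) for x in delta_pos]
        + [float(x) for x in delta_rot_axis_angle]
        + [float(gripper)]
    )


def split_axis_angle(rot_vec):
    """Split an axis-angle 3-vector into (unit axis, angle in radians)."""
    angle = math.sqrt(sum(v * v for v in rot_vec))
    if angle == 0.0:
        return (0.0, 0.0, 0.0), 0.0  # no rotation, axis is undefined
    return tuple(v / angle for v in rot_vec), angle


def gripper_range(actions):
    """Return (min, max) of the last action dimension over a collection of actions."""
    last = [a[-1] for a in actions]
    return min(last), max(last)
```

Running gripper_range over the actions stored in the demonstration data is one way to check whether the gripper command is 0/1 or -1/1 before mapping Octo's output onto it.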

MasterXiong · 2024-10-23T06:02:44Z

Thanks for your detailed reply! It is quite helpful. About the second question: I got the error when trying to stop an episode by checking only the done signal. Here is a minimal example that reproduces the error:

import os
from libero.libero import get_libero_path
from libero.libero import benchmark
from libero.libero.envs import OffScreenRenderEnv

benchmark_dict = benchmark.get_benchmark_dict()
task_suite_name = "libero_90" # can also choose libero_spatial, libero_object, etc.
task_suite = benchmark_dict[task_suite_name]()

# retrieve a specific task
task_id = 0
task = task_suite.get_task(task_id)
task_name = task.name
task_description = task.language
task_bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)

# step over the environment
env_args = {
    "bddl_file_name": task_bddl_file,
    "camera_heights": 128,
    "camera_widths": 128
}
env = OffScreenRenderEnv(**env_args)
env.seed(0)
env.reset()
init_states = task_suite.get_task_init_states(task_id) # for benchmarking purpose, we fix a set of initial states
init_state_id = 0
init_state = env.set_init_state(init_states[init_state_id])

dummy_action = [0.] * 7
done = False
while not done:
    obs, reward, done, info = env.step(dummy_action)
env.close()

It seems that done will not be set to True when reaching the maximal episode length, which gives the following error:

Traceback (most recent call last):
  File "/user/octo/test.py", line 73, in <module>
  File "/LIBERO/libero/libero/envs/env_wrapper.py", line 88, in step
    return self.env.step(action)
  File "/LIBERO/libero/libero/envs/bddl_base_domain.py", line 806, in step
    obs, reward, done, info = super().step(action)
  File "/opt/conda/lib/python3.10/site-packages/robosuite/environments/base.py", line 379, in step
    raise ValueError("executing action in terminated episode")
ValueError: executing action in terminated episode

Is there any way to solve this issue? Thanks!

zhuyifengzju · 2024-10-23T06:48:59Z

I see. I would need to look into this in more detail. In general, it is a good idea to set the horizon explicitly, and you can double-check whether the while loop reaches the maximum number of steps and exits correctly.

IcarusWizard · 2024-10-24T08:51:50Z

I have faced the same problem, and my solution is to check env.env.done as an additional signal.
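The two suggestions in this thread (cap the number of steps explicitly, and also check the inner env.env.done flag) can be combined in one rollout loop. The sketch below is an assumption-laden illustration, not LIBERO code: run_episode and StubEnv are hypothetical names, and StubEnv only mimics the behaviour reported above (the returned done stays False, and stepping after the horizon raises the ValueError) so that the loop can be run without a simulator.

```python
class _InnerStub:
    """Hypothetical stand-in for the wrapped env: tracks timestep, horizon and done."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.timestep = 0
        self.done = False


class StubEnv:
    """Hypothetical stand-in for the outer env, mimicking the behaviour reported in this thread."""

    def __init__(self, horizon):
        self.env = _InnerStub(horizon)

    def step(self, action):
        if self.env.done:
            raise ValueError("executing action in terminated episode")
        self.env.timestep += 1
        self.env.done = self.env.timestep >= self.env.horizon
        # The returned done never becomes True here, as observed in the report.
        return {}, 0.0, False, {}


def run_episode(env, policy, max_steps):
    """Step until the returned done, the inner done flag, or max_steps stops the loop."""
    obs, done, steps = None, False, 0
    while steps < max_steps:
        obs, reward, done, info = env.step(policy(obs))
        steps += 1
        if done:
            break
        # Additional signal: the wrapped env's own done flag, if it exists.
        if getattr(getattr(env, "env", None), "done", False):
            break
    return steps, done
```

With the real environment, env would be the OffScreenRenderEnv instance from the example above and policy would wrap the Octo model. Choosing max_steps no larger than the configured horizon keeps the loop from stepping past the end of the episode even if neither done signal fires.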
