Questions about evaluating Octo on LIBERO tasks #40

Open
MasterXiong opened this issue Oct 17, 2024 · 4 comments

Comments

@MasterXiong

Hi,

Thanks for your great work! I'm trying to evaluate Octo, a vision-language-action model, on LIBERO, and I have several specific questions to make sure I'm evaluating it correctly.

  1. What is the action space used in LIBERO? Is it a delta in end-effector position and orientation? What representation is used for orientation: axis-angle, or roll, pitch, yaw? And how is the gripper open/close state represented: 1 for open and -1 for close, or some other numerical form?
  2. I got the error ValueError: executing action in terminated episode when evaluating Octo in the LIBERO env. Is it because the env doesn't set done to True when it reaches the maximum episode length?
  3. I notice that the benchmark provides 50 initial states for each task. Are these initial states different from the initial states in the task demos? And how can I generate new initial states? Is it enough to reset the env with a different seed?

Thanks a lot for your help!

@zhuyifengzju
Contributor

Thanks for your interest in the work. Answers to your questions:

  1. It's the OSC + gripper action space: 6-dim end-effector task-space commands plus a 1-dim gripper action. Orientation uses axis-angle, a convention that strictly follows the robosuite repo. For the gripper, your understanding is basically correct, but the values should be 0/1 if I remember correctly (I don't have time to run the code at the moment, so you can check the dataset to see its range).
  2. I need more details, such as the specific line that throws this error. We do assume a maximum episode length, which is configurable.
  3. Yes, they are generated differently from the ones in the demos. And your understanding of how to generate new states is correct.
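
One way to follow the suggestion in point 1 and check the gripper range is to compute the per-dimension minimum and maximum over a set of recorded actions. The sketch below is only an illustration: the helper name `per_dim_range` is hypothetical, the numbers are made-up placeholders rather than LIBERO data, and loading real actions from the demo files is left out.

```python
def per_dim_range(actions):
    """Return [(min, max), ...], one pair per action dimension."""
    columns = list(zip(*actions))  # transpose: one tuple per dimension
    return [(min(col), max(col)) for col in columns]


# Placeholder 7-dim actions (6 OSC dims + 1 gripper dim).
# These values are invented for illustration; replace them with the
# actions recorded in the demonstrations you want to inspect.
demo_actions = [
    [0.1, 0.0, -0.2, 0.0, 0.0, 0.05, 0.0],
    [0.3, -0.1, 0.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.1, -0.05, 0.0, 0.0, 1.0],
]

ranges = per_dim_range(demo_actions)
gripper_min, gripper_max = ranges[-1]  # last dimension is the gripper
print("gripper range:", gripper_min, gripper_max)
```

Running the same helper on real demo actions would show whether the gripper dimension spans 0/1, -1/1, or something else, which is the value range a policy's output needs to be mapped to.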

@MasterXiong
Author

Thanks for your detailed reply! It is quite helpful. About the second question: I got the error when trying to end an episode by checking only the done signal. Here is a minimal example that reproduces it:

```python
import os
from libero.libero import get_libero_path
from libero.libero import benchmark
from libero.libero.envs import OffScreenRenderEnv

benchmark_dict = benchmark.get_benchmark_dict()
task_suite_name = "libero_90" # can also choose libero_spatial, libero_object, etc.
task_suite = benchmark_dict[task_suite_name]()

# retrieve a specific task
task_id = 0
task = task_suite.get_task(task_id)
task_name = task.name
task_description = task.language
task_bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)

# step over the environment
env_args = {
    "bddl_file_name": task_bddl_file,
    "camera_heights": 128,
    "camera_widths": 128
}
env = OffScreenRenderEnv(**env_args)
env.seed(0)
env.reset()
init_states = task_suite.get_task_init_states(task_id) # for benchmarking purpose, we fix a set of initial states
init_state_id = 0
init_state = env.set_init_state(init_states[init_state_id])

dummy_action = [0.] * 7
done = False
while not done:
    obs, reward, done, info = env.step(dummy_action)
env.close()
```

It seems that done is not set to True when the maximum episode length is reached, which gives the following error:

```
Traceback (most recent call last):
  File "/user/octo/test.py", line 73, in <module>
  File "/LIBERO/libero/libero/envs/env_wrapper.py", line 88, in step
    return self.env.step(action)
  File "/LIBERO/libero/libero/envs/bddl_base_domain.py", line 806, in step
    obs, reward, done, info = super().step(action)
  File "/opt/conda/lib/python3.10/site-packages/robosuite/environments/base.py", line 379, in step
    raise ValueError("executing action in terminated episode")
ValueError: executing action in terminated episode
```

Is there any way to solve this issue? Thanks!

@zhuyifengzju
Contributor

I see. I would need to look into this in more detail. In general, it is a good idea to set the horizon explicitly, and you can double-check whether the while loop reaches the maximum number of steps and exits.

@IcarusWizard

I have faced the same problem, and my solution is to check env.env.done as an additional signal.
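
A minimal sketch of that workaround, combined with an explicit step cap, might look like the following. It assumes, as in the comment above, that the wrapper exposes the underlying environment as `env.env` with a boolean `done` attribute. `run_episode`, `max_steps`, and `StubEnv` are hypothetical names introduced here; `StubEnv` is only a stand-in that mimics the reported behavior so the loop can run without LIBERO installed.

```python
def run_episode(env, action, max_steps=600):
    """Step until `done` is returned, the inner env reports done,
    or `max_steps` is reached. Returns (steps_taken, last_done)."""
    steps = 0
    done = False
    while not done and steps < max_steps:
        obs, reward, done, info = env.step(action)
        steps += 1
        # Assumption: the wrapped env is reachable as env.env and sets
        # `done` once the episode has terminated (e.g. horizon reached).
        inner = getattr(env, "env", None)
        if getattr(inner, "done", False):
            break
    return steps, bool(done)


class _StubInner:
    """Tracks a step counter and a `done` flag, like the inner env."""
    def __init__(self, horizon):
        self.horizon = horizon
        self.timestep = 0
        self.done = False


class StubEnv:
    """Stand-in for illustration only; NOT the LIBERO API."""
    def __init__(self, horizon):
        self.env = _StubInner(horizon)

    def step(self, action):
        if self.env.done:
            raise ValueError("executing action in terminated episode")
        self.env.timestep += 1
        self.env.done = self.env.timestep >= self.env.horizon
        # The returned `done` stays False here, as in the report above.
        return {}, 0.0, False, {}


dummy_action = [0.0] * 7
steps, last_done = run_episode(StubEnv(horizon=5), dummy_action, max_steps=100)
print(steps, last_done)
```

With a real environment, `StubEnv(horizon=5)` would be replaced by the `OffScreenRenderEnv` instance from the reproduction script. The `max_steps` cap is an extra safeguard in case neither signal fires.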
