PullDrawer-v1 Env with reward function and motion planner #774

Open
wants to merge 69 commits into
base: main
f482c6d
first draft of drawer env
Viswesh-N Nov 22, 2024
40d1433
fixed spawn and dimensions, need to debug joint
Viswesh-N Nov 22, 2024
ed5da11
drawer construction bug fix
Viswesh-N Nov 23, 2024
40f1176
drawer construction bug fix
Viswesh-N Nov 23, 2024
80c894a
drawer construction bug fix
Viswesh-N Nov 23, 2024
8ca873e
drawer construction bug fix
Viswesh-N Nov 23, 2024
f71d295
mostly fixed drawer, need to verify dense reward
Viswesh-N Nov 23, 2024
92b9db9
Merge branch 'haosulab:main' into main
Viswesh-N Nov 24, 2024
0560810
main axis motion fix
Viswesh-N Nov 24, 2024
cada3c2
reward function experiment
Viswesh-N Nov 25, 2024
2f9fe20
test run
Viswesh-N Nov 25, 2024
aba57a5
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Nov 25, 2024
2dc2a36
test run
Viswesh-N Nov 25, 2024
7ac0869
test run
Viswesh-N Nov 25, 2024
5003d52
fixed handle bugs
Viswesh-N Nov 25, 2024
12d44fb
Update pull_drawer.py
Viswesh-N Nov 26, 2024
86a1f56
added draft planner, needs env fix
Viswesh-N Nov 26, 2024
df8c69a
Update pull_drawer.py
Viswesh-N Nov 27, 2024
31bfe1e
Update pull_drawer.py
Viswesh-N Nov 27, 2024
646e2d6
Update pull_drawer.py
Viswesh-N Nov 27, 2024
831af6f
Update pull_drawer.py
Viswesh-N Nov 27, 2024
0fd6082
Update pull_drawer.py
Viswesh-N Nov 27, 2024
b621c05
modified offset for grasp
Viswesh-N Nov 27, 2024
0774d19
modified offset for grasp
Viswesh-N Nov 27, 2024
2ee90fb
Merge branch 'haosulab:main' into main
Viswesh-N Nov 28, 2024
d521c79
test sphere for goal
Viswesh-N Nov 28, 2024
51b06ce
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Nov 28, 2024
3899bb0
Merge branch 'haosulab:main' into main
Viswesh-N Nov 28, 2024
8ec7401
fixed handle pose handling issue
Viswesh-N Nov 30, 2024
825d773
Merge branch 'main' of github.com:Viswesh-N/ManiSkill
Viswesh-N Nov 30, 2024
3c518a7
tests for run
Viswesh-N Nov 30, 2024
d5a39c7
Update pull_drawer.py
Viswesh-N Dec 2, 2024
3a231af
Update pull_drawer.py
Viswesh-N Dec 2, 2024
4952aef
expt new reward
Viswesh-N Dec 2, 2024
959eb5b
modified reward expt
Viswesh-N Dec 2, 2024
792704e
Update pull_drawer.py
Viswesh-N Dec 2, 2024
d769d7e
Update pull_drawer.py
Viswesh-N Dec 2, 2024
98c5f9f
Update pull_drawer.py
Viswesh-N Dec 2, 2024
7af12ab
Update pull_drawer.py
Viswesh-N Dec 2, 2024
4865a9c
Update pull_drawer.py
Viswesh-N Dec 2, 2024
f7b9096
Update pull_drawer.py
Viswesh-N Dec 3, 2024
d01e766
reward forcing expt
Viswesh-N Dec 4, 2024
c6e2802
Merge branch 'haosulab:main' into main
Viswesh-N Dec 4, 2024
7814612
Update pull_drawer.py
Viswesh-N Dec 4, 2024
e7b9e25
planner test
Viswesh-N Dec 4, 2024
5909993
planner test
Viswesh-N Dec 4, 2024
8299c48
Update pull_drawer.py
Viswesh-N Dec 4, 2024
ffd0148
Update pull_drawer.py
Viswesh-N Dec 4, 2024
810977d
Update pull_drawer.py
Viswesh-N Dec 4, 2024
e37e74f
draft planner
Viswesh-N Dec 5, 2024
0644386
Updated env with spawn randomness
Viswesh-N Dec 5, 2024
440f65b
Update pull_drawer.py
Viswesh-N Dec 5, 2024
70ffa41
revert to last working reward version
Viswesh-N Dec 5, 2024
3fe43ab
orientation expt
Viswesh-N Dec 5, 2024
c9044dd
Update pull_drawer.py
Viswesh-N Dec 8, 2024
a471559
Update pull_drawer.py
Viswesh-N Dec 8, 2024
b6335f0
Update pull_drawer.py
Viswesh-N Dec 13, 2024
7cbb255
final working planner and dense reward
Viswesh-N Dec 13, 2024
e56b17a
Update pull_drawer.py
Viswesh-N Dec 13, 2024
f865816
fully working planner with seeding
Viswesh-N Dec 18, 2024
23e85be
Merge branch 'haosulab:main' into main
Viswesh-N Dec 20, 2024
6fd0cc8
improved dense reward
Dec 20, 2024
d8a19cd
Merge branch 'haosulab:main' into main
Viswesh-N Dec 23, 2024
b711f0b
update motion planning sh with new env
Jan 3, 2025
bf62882
added sample ppo script
Jan 3, 2025
a1a7a2c
added docs for new task
Jan 5, 2025
714a541
final cleanup, working dense reward
Viswesh-N Jan 6, 2025
e716253
increase randomness
Viswesh-N Jan 6, 2025
5828ac1
Merge branch 'haosulab:main' into main
Viswesh-N Jan 7, 2025
8 changes: 8 additions & 0 deletions docs/source/tasks/table_top_gripper/index.md
@@ -113,6 +113,14 @@ Table of all tasks/environments in this category. Task column is the environment
<td><p>100</p></td>
</tr>
<tr class="row-odd">
<td><p><a href="#pulldrawer-v1">PullDrawer-v1</a></p></td>
<td><div style='display:flex;gap:4px;align-items:center'><img style='min-width:min(50%, 100px);max-width:100px;height:auto' src='../../_static/env_thumbnails/PullDrawer-v1_rt_thumb_first.png' alt='PullDrawer-v1'> <img style='min-width:min(50%, 100px);max-width:100px;height:auto' src='../../_static/env_thumbnails/PullDrawer-v1_rt_thumb_last.png' alt='PullDrawer-v1'></div></td>
<td><p>✅</p></td>
<td><p>✅</p></td>
<td><p>❌</p></td>
<td><p>200</p></td>
</tr>
<tr class="row-odd">
<td><p><a href="#pushcube-v1">PushCube-v1</a></p></td>
<td><div style='display:flex;gap:4px;align-items:center'><img style='min-width:min(50%, 100px);max-width:100px;height:auto' src='../../_static/env_thumbnails/PushCube-v1_rt_thumb_first.png' alt='PushCube-v1'> <img style='min-width:min(50%, 100px);max-width:100px;height:auto' src='../../_static/env_thumbnails/PushCube-v1_rt_thumb_last.png' alt='PushCube-v1'></div></td>
<td><p>✅</p></td>
3 changes: 3 additions & 0 deletions examples/baselines/ppo/examples.sh
@@ -15,6 +15,9 @@ python ppo.py --env_id="PushT-v1" \
python ppo.py --env_id="PickSingleYCB-v1" \
--num_envs=1024 --update_epochs=8 --num_minibatches=32 \
--total_timesteps=25_000_000
python ppo.py --env_id="PullDrawer-v1" \
--num_envs=1024 --update_epochs=8 --num_minibatches=32 \
--total_timesteps=100_000_000
python ppo.py --env_id="PegInsertionSide-v1" \
--num_envs=1024 --update_epochs=8 --num_minibatches=32 \
--total_timesteps=250_000_000 --num-steps=100 --num-eval-steps=100
Binary file added figures/environment_demos/PullDrawerv1_rt.mp4
Binary file not shown.
3 changes: 2 additions & 1 deletion mani_skill/envs/tasks/tabletop/__init__.py
@@ -15,4 +15,5 @@
from .place_sphere import PlaceSphereEnv
from .roll_ball import RollBallEnv
from .push_t import PushTEnv
from .pull_cube_tool import PullCubeToolEnv
from .pull_cube_tool import PullCubeToolEnv
from .pull_drawer import PullDrawerEnv
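The new import matters because the environment is registered by a class decorator in pull_drawer.py, and a decorator only runs when its module is imported. The sketch below is a toy stand-in that shows the general pattern; `toy_register_env` and `TOY_REGISTRY` are hypothetical names and this is not ManiSkill's actual registration code.

```python
# Toy registry illustrating decorator-based registration (hypothetical names).
TOY_REGISTRY = {}


def toy_register_env(uid, max_episode_steps=None):
    """Return a class decorator that records the class under `uid`."""

    def decorator(cls):
        TOY_REGISTRY[uid] = {"cls": cls, "max_episode_steps": max_episode_steps}
        return cls  # the class itself is returned unchanged

    return decorator


# Executing this class statement (which happens at import time in a real
# module) is what adds the entry to the registry.
@toy_register_env("PullDrawer-v1", max_episode_steps=200)
class PullDrawerEnv:
    pass


entry = TOY_REGISTRY["PullDrawer-v1"]
print(entry["cls"].__name__, entry["max_episode_steps"])
```

Without the import in the package's `__init__.py`, the module body would never execute and a lookup by the ID "PullDrawer-v1" would find nothing.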
361 changes: 361 additions & 0 deletions mani_skill/envs/tasks/tabletop/pull_drawer.py
@@ -0,0 +1,361 @@
from typing import Dict, Union, Any
import numpy as np
import sapien
import torch
from mani_skill.agents.robots import Fetch, Panda
from mani_skill.envs.sapien_env import BaseEnv
from mani_skill.utils import sapien_utils
from mani_skill.utils.registration import register_env
from mani_skill.utils.scene_builder.table import TableSceneBuilder
from mani_skill.utils.structs.types import SimConfig, GPUMemoryConfig
from mani_skill.sensors.camera import CameraConfig
from mani_skill.utils.structs import Pose
from mani_skill.utils.building import actors

@register_env("PullDrawer-v1", max_episode_steps=200)
class PullDrawerEnv(BaseEnv):
Review comment (Member):

We recently updated how we document tasks/environments; the documentation is now almost entirely auto-generated. Can you follow the documentation style of, e.g., PushCube? Also, can you save an example video of this task rendered with ray tracing and put it in the folder with the others?
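A rough sketch of what a docstring in that style might look like for this task. The section headings and the `_sample_video_link` attribute name are assumptions based on how PushCube appears to be documented, and should be checked against the current PushCube source. The numbers are derived from the constants in this PR (`k = 0.03`, offsets 0.17 and 0.15, `target_pos = -0.8 * 0.8 * 0.225`). `BaseEnv` here is a stand-in so the snippet runs on its own.

```python
class BaseEnv:
    """Stand-in for mani_skill.envs.sapien_env.BaseEnv (sketch only)."""


class PullDrawerEnv(BaseEnv):
    """
    **Task Description:**
    Pull the drawer of a small tabletop cabinet open until its joint position
    is close to a fixed target.

    **Randomizations:**
    - the cabinet's x position is sampled uniformly from [0.17, 0.20]
    - the cabinet's y position is sampled uniformly from [0.15, 0.18]

    **Success Conditions:**
    - the drawer joint position is within 0.03 of the target position (-0.144)
    """

    # Assumed attribute name; the path is the video file added in this PR.
    _sample_video_link = "figures/environment_demos/PullDrawerv1_rt.mp4"


print(PullDrawerEnv.__doc__)
```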

    SUPPORTED_REWARD_MODES = ("sparse", "dense", "normalized_dense", "none")
    SUPPORTED_ROBOTS = ["panda", "fetch"]
    agent: Union[Panda, Fetch]

    def __init__(
        self,
        *args,
        robot_uids="panda",
        robot_init_qpos_noise=0.02,
        **kwargs
    ):
        self.robot_init_qpos_noise = robot_init_qpos_noise

        # Outer cabinet dimensions
        self.outer_width = 0.225
        self.outer_depth = 0.3
        self.outer_height = 0.225
        self.wall_thickness = 0.03

        # Inner drawer dimensions
        self.inner_width = self.outer_width - 2 * self.wall_thickness
        self.inner_depth = self.outer_depth - 2.1 * self.wall_thickness
        self.inner_height = self.outer_height - 2.1 * self.wall_thickness

        # Handle dimensions
        self.handle_width = 0.18  # Width of handle bar
        self.handle_height = 0.06  # Height of handle from drawer face
        self.handle_thickness = 0.015  # Thickness of handle material
        self.handle_offset = 0.11  # Offset from drawer side

        # Movement parameters
        self.max_pull_distance = self.outer_width * 0.8  # Can pull out 80% of width
        self.target_pos = -self.max_pull_distance * 0.8
        self.k = 0.03

        super().__init__(
            *args,
            robot_uids=robot_uids,
            **kwargs
        )

    @property
    def _default_sim_config(self):
        return SimConfig(
            gpu_memory_config=GPUMemoryConfig(
                found_lost_pairs_capacity=2**25,
                max_rigid_patch_count=2**18
            )
        )

    @property
    def _default_sensor_configs(self):
        pose = sapien_utils.look_at(eye=[0.3, 0, 0.5], target=[-0.1, 0, 0.1])
        return [
            CameraConfig(
                "base_camera",
                pose=pose,
                width=128,
                height=128,
                fov=np.pi / 2,
                near=0.01,
                far=100,
            )
        ]

    @property
    def _default_human_render_camera_configs(self):
        pose = sapien_utils.look_at([-0.8, 0.7, 0.7], [0.0, 0.0, 0.0])
        return [
            CameraConfig(
                "render_camera",
                pose=pose,
                width=512,
                height=512,
                fov=1,
                near=0.01,
                far=100,
            )
        ]

    def _load_scene(self, options: dict):
        self.scene_builder = TableSceneBuilder(
            self, robot_init_qpos_noise=self.robot_init_qpos_noise
        )
        self.scene_builder.build()

        builder = self.scene.create_articulation_builder()

        # Create outer cabinet
        base = builder.create_link_builder()
        base.set_name('cabinet')

        # Bottom base
        base.add_box_collision(
            sapien.Pose([0, 0, -self.outer_height/2]),
            half_size=[self.outer_width/2, self.outer_depth/2, self.wall_thickness/2]
        )
        base.add_box_visual(
            sapien.Pose([0, 0, -self.outer_height/2]),
            half_size=[self.outer_width/2, self.outer_depth/2, self.wall_thickness/2],
        )

        # Top wall
        base.add_box_collision(
            sapien.Pose([0, 0, self.outer_height/2]),
            half_size=[self.outer_width/2, self.outer_depth/2, self.wall_thickness/2]
        )
        base.add_box_visual(
            sapien.Pose([0, 0, self.outer_height/2]),
            half_size=[self.outer_width/2, self.outer_depth/2, self.wall_thickness/2],
        )

        # Side wall (-y)
        base.add_box_collision(
            sapien.Pose([0, -self.outer_depth/2, 0]),
            half_size=[self.outer_width/2, self.wall_thickness/2, self.outer_height/2]
        )
        base.add_box_visual(
            sapien.Pose([0, -self.outer_depth/2, 0]),
            half_size=[self.outer_width/2, self.wall_thickness/2, self.outer_height/2],
        )

        # Side wall (+y)
        base.add_box_collision(
            sapien.Pose([0, self.outer_depth/2, 0]),
            half_size=[self.outer_width/2, self.wall_thickness/2, self.outer_height/2]
        )
        base.add_box_visual(
            sapien.Pose([0, self.outer_depth/2, 0]),
            half_size=[self.outer_width/2, self.wall_thickness/2, self.outer_height/2],
        )

        # Back wall (+x); the drawer slides out along -x
        base.add_box_collision(
            sapien.Pose([self.outer_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.outer_depth/2, self.outer_height/2]
        )
        base.add_box_visual(
            sapien.Pose([self.outer_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.outer_depth/2, self.outer_height/2],
        )

        # Create sliding drawer
        drawer = builder.create_link_builder(parent=base)
        drawer.set_name('drawer')

        # Drawer bottom
        drawer.add_box_collision(
            sapien.Pose([0, 0, -self.inner_height/2]),
            half_size=[self.inner_width/2, self.inner_depth/2, self.wall_thickness/2]
        )
        drawer.add_box_visual(
            sapien.Pose([0, 0, -self.inner_height/2]),
            half_size=[self.inner_width/2, self.inner_depth/2, self.wall_thickness/2],
        )

        # Drawer right
        drawer.add_box_collision(
            sapien.Pose([0, -self.inner_depth/2, 0]),
            half_size=[self.inner_width/2, self.wall_thickness/2, self.inner_height/2]
        )
        drawer.add_box_visual(
            sapien.Pose([0, -self.inner_depth/2, 0]),
            half_size=[self.inner_width/2, self.wall_thickness/2, self.inner_height/2],
        )

        # Drawer left
        drawer.add_box_collision(
            sapien.Pose([0, self.inner_depth/2, 0]),
            half_size=[self.inner_width/2, self.wall_thickness/2, self.inner_height/2]
        )
        drawer.add_box_visual(
            sapien.Pose([0, self.inner_depth/2, 0]),
            half_size=[self.inner_width/2, self.wall_thickness/2, self.inner_height/2],
        )

        # Drawer back
        drawer.add_box_collision(
            sapien.Pose([self.inner_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.inner_depth/2, self.inner_height/2]
        )
        drawer.add_box_visual(
            sapien.Pose([self.inner_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.inner_depth/2, self.inner_height/2],
        )

        # Drawer front
        drawer.add_box_collision(
            sapien.Pose([-self.inner_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.inner_depth/2, self.inner_height/2]
        )
        drawer.add_box_visual(
            sapien.Pose([-self.inner_width/2, 0, 0]),
            half_size=[self.wall_thickness/2, self.inner_depth/2, self.inner_height/2],
        )

        # Handle material
        mat = sapien.render.RenderMaterial()
        mat.set_base_color([1, 0, 0, 1])
        mat.metallic = 1.0
        mat.roughness = 0.0
        mat.specular = 1.0

        # Main handle bar
        drawer.add_box_collision(
            sapien.Pose([-self.inner_width/2 - self.handle_offset, 0, 0]),
            half_size=[self.handle_thickness/2, self.handle_width/2, self.handle_thickness/2]
        )
        drawer.add_box_visual(
            sapien.Pose([-self.inner_width/2 - self.handle_offset, 0, 0]),
            half_size=[self.handle_thickness/2, self.handle_width/2, self.handle_thickness/2],
            material=mat
        )

        # Handle supports
        for y_sign in [-1, 1]:
            support_y = y_sign * (self.handle_width/2 - self.handle_thickness/2)
            drawer.add_box_collision(
                sapien.Pose([-self.inner_width/2 - self.handle_offset/2, support_y, 0]),
                half_size=[self.handle_offset/2, self.handle_thickness/2, self.handle_thickness/2]
            )
            drawer.add_box_visual(
                sapien.Pose([-self.inner_width/2 - self.handle_offset/2, support_y, 0]),
                half_size=[self.handle_offset/2, self.handle_thickness/2, self.handle_thickness/2],
                material=mat
            )

        # Configure drawer joint
        drawer.set_joint_properties(
            type="prismatic",
            limits=(-self.max_pull_distance, 0),
            pose_in_parent=sapien.Pose(),
            pose_in_child=sapien.Pose(),
            friction=0.4,
            damping=10
        )

        builder.set_scene_idxs(scene_idxs=range(self.num_envs))
        # builder.set_initial_pose(sapien.Pose(p=[0.17, 0.15, 0.12]))

        self.drawer = builder.build(fix_root_link=True, name="drawer_articulation")
        self.drawer_link = self.drawer.get_links()[1]

    def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
        with torch.device(self.device):
            b = len(env_idx)
            self.scene_builder.initialize(env_idx)

            drawer_xyz = torch.zeros((b, 3), device=self.device)
            drawer_xyz[..., 0] = torch.rand((b,), device=self.device) * self.k + 0.17
            drawer_xyz[..., 1] = torch.rand((b,), device=self.device) * self.k + 0.15
            drawer_xyz[..., 2] = self.outer_height / 2 + 0.005

            init_pos = Pose.create_from_pq(p=drawer_xyz)

            self.drawer.set_pose(init_pos)

            closed_qpos = torch.zeros((b, 1), device=self.device)
            self.drawer.set_qpos(closed_qpos)

    def _get_obs_extra(self, info: Dict):
        obs = dict(
            tcp_pose=self.agent.tcp.pose.raw_pose,
        )

        if self._obs_mode in ["state", "state_dict"]:
            obs.update(
                drawer_pose=self.drawer.pose.raw_pose,
                drawer_qpos=self.drawer.get_qpos(),
            )
        return obs

    def evaluate(self):
        drawer_qpos = self.drawer.get_qpos()
        pos_dist = torch.abs(self.target_pos - drawer_qpos)
        drawer_pulled = pos_dist.squeeze(-1) < 0.03

        progress = 1 - torch.tanh(5.0 * pos_dist)

        return {
            "success": drawer_pulled,
            "success_once": drawer_pulled,
            "success_at_end": drawer_pulled,
            "drawer_progress": progress.mean(),
            "drawer_distance": pos_dist.mean(),
            "reward": self.compute_normalized_dense_reward(
                None, None, {"success": drawer_pulled}
            ),
        }

    def compute_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):
        # Batch size extraction
        batch_size = self.drawer.get_qpos().shape[0]
        device = self.device

        self.scene._gpu_apply_all()
        self.scene.px.gpu_update_articulation_kinematics()
        self.scene._gpu_fetch_all()

        # Get TCP pose and drawer link pose
        tcp_pose = self.agent.tcp.pose.raw_pose
        tcp_pos = tcp_pose[..., :3]
        drawer_link_pose = self.drawer.links_map['drawer'].pose.raw_pose
        drawer_pose = self.drawer.pose.raw_pose

        # NOTE: handle_pose is computed from the articulation root pose, so the
        # approach target does not move as the drawer slides; drawer_link_pose
        # is only used here to pick the tensor device.
        handle_offset = torch.tensor([-self.inner_width/2 - self.handle_offset, 0, 0], device=drawer_link_pose.device)
        handle_pose = drawer_pose[:, :3] + handle_offset

        # 1. Orientation reward (continuous feedback)
        tcp_pose_q = self.agent.tcp.pose.q  # Current quaternion
        desired_q = torch.tensor([0.5, 0.5, 0.5, 0.5], device=device).expand(batch_size, 4)

        # Quaternion similarity: absolute dot product between the quaternions,
        # because q and -q represent the same rotation
        quat_dot = torch.abs(torch.sum(tcp_pose_q * desired_q, dim=-1))
        # Clamp so numerical error cannot push the value outside acos's domain
        quat_dot = torch.clamp(quat_dot, -1.0, 1.0)
        # Convert to angle (in radians)
        angle_dist = 2.0 * torch.acos(quat_dot)
        # Map to (0, 4] so that smaller angles give higher rewards
        orientation_reward = 4.0 * (1.0 - torch.tanh(2.0 * angle_dist))

        # 2. Approach reward
        reach_dist = torch.norm(tcp_pos - handle_pose, dim=-1)
        approach_reward = 4.0 * (1 - torch.tanh(5.0 * reach_dist))

        # 3. Progress reward
        drawer_qpos = self.drawer.get_qpos()
        pos_dist = torch.abs(self.target_pos - drawer_qpos)
        pulling_reward = 4.0 * (1 - torch.tanh(5.0 * pos_dist)).squeeze(-1)

        # 4. Success reward
        success_mask = info.get("success", torch.zeros_like(pulling_reward, dtype=torch.bool))
        completion_reward = 4.0 * success_mask

        return orientation_reward + approach_reward + pulling_reward + completion_reward

    def compute_normalized_dense_reward(
        self, obs: Any, action: torch.Tensor, info: Dict
    ):
        max_reward = 16.0  # Maximum possible reward: four terms, each capped at 4.0
        dense_reward = self.compute_dense_reward(obs=obs, action=action, info=info)
        return dense_reward / max_reward
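
To make the reward arithmetic above easier to check by hand, here is a minimal scalar sketch of the four terms in plain Python. It mirrors the constants in this file (cabinet width 0.225, 80% pull limit, 80% target, 0.03 success tolerance) but is not the environment code: it handles a single sample instead of a batch, and the function names are hypothetical.

```python
import math

OUTER_WIDTH = 0.225
MAX_PULL = OUTER_WIDTH * 0.8       # joint limit magnitude
TARGET_POS = -MAX_PULL * 0.8       # target joint position (drawer opens along -x)
SUCCESS_TOL = 0.03
MAX_REWARD = 16.0                  # four terms, each at most 4.0
DESIRED_Q = (0.5, 0.5, 0.5, 0.5)


def quat_angle(q, q_des=DESIRED_Q):
    """Angle in radians between two unit quaternions; q and -q are equivalent."""
    dot = abs(sum(a * b for a, b in zip(q, q_des)))
    return 2.0 * math.acos(min(dot, 1.0))


def dense_reward(tcp_q, reach_dist, drawer_qpos):
    """Scalar version of the four reward terms; returns (reward, success)."""
    orientation = 4.0 * (1.0 - math.tanh(2.0 * quat_angle(tcp_q)))
    approach = 4.0 * (1.0 - math.tanh(5.0 * reach_dist))
    pos_dist = abs(TARGET_POS - drawer_qpos)
    pulling = 4.0 * (1.0 - math.tanh(5.0 * pos_dist))
    success = pos_dist < SUCCESS_TOL
    completion = 4.0 if success else 0.0
    return orientation + approach + pulling + completion, success


# Ideal state: aligned gripper, touching the handle, drawer exactly at target.
best, ok = dense_reward(DESIRED_Q, 0.0, TARGET_POS)
print(best / MAX_REWARD, ok)

# Closed drawer with the gripper 0.3 away: no success bonus, partial terms only.
start, ok_start = dense_reward(DESIRED_Q, 0.3, 0.0)
print(start / MAX_REWARD, ok_start)
```

Every term is bounded by 4.0 and reaches it only at zero error, which is where the normalization constant of 16.0 comes from.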