Adding RoboCasa to LeRobot #2725
Conversation
…tasets to LeRobot format

- Introduced a new script for converting RoboCasa HDF5 datasets to LeRobot format.
- Added a new environment class for RoboCasa (similar to LIBERO).
- Added a new function `get_max_dims` to determine maximum dimensions for keys with variable dimensions across HDF5 datasets.
- Changed the camera segmentation type from "segmentation_level" to "instance".

The end-to-end pipeline now runs for diffusion policy training.

TODO:
- Fix `max_episode_steps` in RoboCasa
- Validate with the RoboCasa and RoboSuite master branches
- Validate vector env output
- Train and evaluate a diffusion policy

Refs huggingface#2380

- Set `episode_length` in `RoboCasaEnvConfig` from the dataset's max trajectory length instead of a fixed value
- Fix `dummy_action_inputs` in `RoboCasaEnv` to avoid reuse across different robots/controllers
- Ensure compatibility with the default branches of robosuite and robocasa
- [BUG] Known issue: vectorized environments (batch size > 1) have rendering issues

- Refactored type hints in `port_robocasa.py` to use built-in generic types.
- Improved code formatting and consistency across various functions in `port_robocasa.py`.
- Updated comments for clarity and corrected minor typos in the code.
Pull request overview
This PR integrates the RoboCasa simulation environment into LeRobot, enabling users to train and evaluate policies on kitchen manipulation tasks. The integration includes environment wrappers, dataset conversion utilities, and example training scripts.
Key changes:
- Added RoboCasa environment wrapper with Gymnasium API compatibility
- Created dataset porting script to convert RoboCasa HDF5 datasets to LeRobot format
- Integrated RoboCasa into the environment factory and configuration system
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| src/lerobot/envs/robocasa_env.py | New environment wrapper implementing the RoboCasa gym interface, with support for multiple observation types and camera configurations |
| src/lerobot/envs/factory.py | Added a factory method for creating RoboCasa environments, integrated into the existing environment creation pipeline |
| src/lerobot/envs/configs.py | Added a RoboCasaEnvConfig dataclass with task-specific settings and feature mappings |
| examples/training/train_policy_casa.py | Example training script demonstrating diffusion policy training on RoboCasa datasets |
| examples/port_datasets/port_robocasa.py | Comprehensive dataset conversion script with automatic property discovery and CLIP embedding generation |
| pyproject.toml | Added "PnP" to the spelling ignore list for task names like "PnPCabToCounter" |
Context:

```python
dummy_action = get_robocasa_dummy_action(self._env)
for _ in range(self.num_steps_wait):
    raw_obs, _, _, _ = self._env.step(dummy_action)
```

**Copilot AI** (Dec 26, 2025): The variable name `dummy_action` should be changed to match the function rename. Consider renaming it to `zero_action` or `noop_action` for consistency and clarity.

Suggested change:

```python
zero_action = get_robocasa_dummy_action(self._env)
for _ in range(self.num_steps_wait):
    raw_obs, _, _, _ = self._env.step(zero_action)
```
Context:

```python
##################################################### CHANGE HERE #####################################################
# remove unnecessary columns; it will throttle the whole process
want = {"observation.state", "action", "timestamp", "index", "episode_index", "task_index", "task"}
have = set(dataset.hf_dataset.column_names)
drop = [c for c in have if c not in want]
dataset.hf_dataset = dataset.hf_dataset.remove_columns(drop)
```

**Copilot AI** (Dec 26, 2025): The "CHANGE HERE" marker and the column-removal logic below it look like development/debugging code that was left in. This should either be properly documented (explaining why the columns need to be removed), moved to a helper function, or removed if it is not necessary.
Context:

```python
# TODO: estimate the variable length keys directly from the dataset instead of hardcoding them
keys_with_variable_length = ["states", "obs/object-state", "obs/objects-joint-state", "obs/object"]
```

**Copilot AI** (Dec 26, 2025): The list of variable-length keys should be discovered from the dataset rather than hardcoded. The TODO comment acknowledges this, but hardcoded keys create a maintenance burden and may cause issues with datasets that have a different structure. Consider implementing automatic discovery, or at least validating that these keys exist in the dataset.
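A sketch of the automatic discovery the comment suggests: given per-episode array shapes (which could be collected by walking each demo group of the HDF5 file with h5py), flag any key whose trailing feature dimension varies across episodes. The function name and the shape-mapping input are assumptions for illustration, not code from the PR:

```python
def find_variable_length_keys(episode_shapes):
    """Return keys whose trailing (feature) dimension differs across episodes.

    `episode_shapes` maps episode id -> {key: array shape}, e.g. as gathered
    while iterating over the demo groups of an HDF5 dataset.
    """
    dims_seen = {}
    for shapes in episode_shapes.values():
        for key, shape in shapes.items():
            # Track every distinct feature dimension observed for this key.
            dims_seen.setdefault(key, set()).add(shape[-1])
    return sorted(key for key, dims in dims_seen.items() if len(dims) > 1)


shapes = {
    "demo_0": {"states": (100, 45), "actions": (100, 7)},
    "demo_1": {"states": (80, 52), "actions": (80, 7)},
}
print(find_variable_length_keys(shapes))  # ['states']
```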
Context:

```python
import shutil
import tempfile
```

**Copilot AI** (Dec 26, 2025): The nested imports inside the function are unusual. `shutil` and `tempfile` should be moved to the top of the file with the other imports, following Python best practices for import organization.
Context:

```python
    gym_kwargs=gym_kwargs,
)
vec_env = env_cls(fns)
print(f"Built vec env | n_envs={n_envs}")
```

**Copilot AI** (Dec 26, 2025): This should use proper logging instead of `print`. This is especially important for library code, where users may want to control log verbosity. Consider using the `logging` module with an appropriate log level (e.g. `logging.info`).
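A minimal sketch of the suggested fix, assuming a module-level logger; `build_vec_env` and its arguments are hypothetical names used only for illustration:

```python
import logging

logger = logging.getLogger(__name__)


def build_vec_env(fns, env_cls):
    """Wrap a list of env factory callables with a vectorization class.

    Using logger.info instead of print lets library users control
    verbosity through standard logging configuration.
    """
    vec_env = env_cls(fns)
    logger.info("Built vec env | n_envs=%d", len(fns))
    return vec_env


# Using a plain list as a stand-in vectorizer for demonstration:
env = build_vec_env([lambda: "env0", lambda: "env1"], env_cls=list)
print(len(env))  # 2
```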
Context:

```python
    camera_name: str | Sequence[str] = "",
    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
) -> RoboCasaEnv | Any:
    """
    Create vectorized RoboCasa environments from an HDF5 dataset.

    Args:
        task_name: Name of the task
        n_envs: Number of environments to create
        gym_kwargs: Additional arguments to pass to RoboCasaEnv
        camera_name: Camera name(s) to use for observations, overrides gym_kwargs['camera_name'] if provided
        env_cls: Callable that wraps a list of environment factory callables (for vectorization)

    Returns:
        If env_cls is provided, returns vectorized environment. Otherwise returns a single RoboCasaEnv.
    """
    if not isinstance(n_envs, int) or n_envs <= 0:
        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")

    gym_kwargs = dict(gym_kwargs or {})
    gym_kwargs_camera_name = gym_kwargs.pop("camera_name", None)
    camera_name = camera_name if camera_name != "" else gym_kwargs_camera_name
```

**Copilot AI** (Dec 26, 2025): The condition `camera_name != ""` checks for an empty string, but the default value in the function signature is also an empty string. This creates ambiguous behavior when the user explicitly passes an empty string versus not passing the argument at all. Consider using `None` as the default value instead, and checking `camera_name is not None`.
Context:

```python
from lerobot.datasets.video_utils import encode_video_frames
from lerobot.utils.constants import ACTION, HF_LEROBOT_HOME, OBS_IMAGES, OBS_STATE

HF_TOKEN = os.environ["HF_TOKEN"]
```

**Copilot AI** (Dec 26, 2025): The environment variable HF_TOKEN is accessed without checking whether it exists. If it is not set, this raises a KeyError. Consider using `os.environ.get("HF_TOKEN")` with a fallback, or adding a clear error message explaining that HF_TOKEN must be set.

Suggested change:

```python
HF_TOKEN = os.environ.get("HF_TOKEN")
if HF_TOKEN is None:
    raise RuntimeError(
        "Environment variable HF_TOKEN must be set to use the Hugging Face Hub. "
        "Please set HF_TOKEN to a valid Hugging Face access token."
    )
```
Context:

```python
    f = h5py.File(dataset_path, "r")
    env_args = json.loads(f["data"].attrs["env_args"]) if "data" in f else json.loads(f.attrs["env_args"])
    if isinstance(env_args, str):
        env_args = json.loads(env_args)  # double loads leads to dict type
    f.close()
    return env_args


def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
    """Extract environment metadata from dataset."""
    dataset_path = os.path.expanduser(dataset_path)
    f = h5py.File(dataset_path, "r")
    data = f.get("data", f)
    keys = list(data.keys())
    env_meta = data[keys[episode_index]].attrs["ep_meta"]
    env_meta = json.loads(env_meta)
    assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
    f.close()
```

**Copilot AI** (Dec 26, 2025): The HDF5 file is opened but not closed in the error path: if an exception occurs, the file handle leaks. Use a context manager or a try/finally block to ensure proper cleanup.

Suggested change:

```python
    with h5py.File(dataset_path, "r") as f:
        if "data" in f:
            raw_env_args = f["data"].attrs["env_args"]
        else:
            raw_env_args = f.attrs["env_args"]
        env_args = json.loads(raw_env_args)
        if isinstance(env_args, str):
            env_args = json.loads(env_args)  # double loads leads to dict type
    return env_args


def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
    """Extract environment metadata from dataset."""
    dataset_path = os.path.expanduser(dataset_path)
    with h5py.File(dataset_path, "r") as f:
        data = f.get("data", f)
        keys = list(data.keys())
        env_meta = data[keys[episode_index]].attrs["ep_meta"]
        env_meta = json.loads(env_meta)
        assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
```
Context:

```python
delta_timestamps = {
    "observation.images.robot0_agentview_center": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "observation.images.robot0_eye_in_hand": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "action": [i / dataset_metadata.fps for i in cfg.action_delta_indices],
}
```

**Copilot AI** (Dec 26, 2025): This assignment to `delta_timestamps` is unnecessary, as it is redefined before the value is used. Suggested change: remove the redundant assignment.
Context:

```python
        )
    )
    # Other groups - we'll handle them separately
    pass
```

**Copilot AI** (Dec 26, 2025): Unnecessary `pass` statement. Suggested change: remove it.
Title
Added RoboCasa environment.
Type / Scope
Summary / Motivation
Related issues
What changed
- examples/port_datasets/port_robocasa.py: Ports the RoboCasa dataset to LeRobot format
- examples/training/train_policy_casa.py: Example training script with the RoboCasa dataset
- src/lerobot/envs/configs.py: Adds an extra RoboCasa config
- src/lerobot/envs/factory.py: Adds the intermediate condition to call the RoboCasa env
- src/lerobot/envs/robocasa_env.py: RoboCasa env wrapper
- Known issue: batch_size > 1 leads to some rendering issues

How was this tested
To run the code conversion, download the dataset from RoboCasa with images (https://robocasa.ai/docs/use_cases/downloading_datasets.html). Then, port the dataset to LeRobot format using the following:
For training a diffusion policy with the generated dataset
For running evaluations,
Run a quick example or CLI (if applicable):
How to run locally (reviewer)
Checklist (required before merge)

- `pre-commit run -a`
- `pytest`

Reviewer notes
Known Issues
Evaluation with the vectorized env (batch_size > 1) causes rendering issues.