Conversation

@ShahRutav

Title

Added RoboCasa environment.

Type / Scope

  • Type: Feature
  • Scope: src/lerobot/envs, examples (dataset porting and training)

Summary / Motivation

This PR integrates the RoboCasa simulation environment into LeRobot so that policies can be trained and evaluated on its kitchen manipulation tasks. It adds an environment wrapper, a dataset conversion script, and an example training script.

Related issues

  • huggingface#2380

What changed

  • examples/port_datasets/port_robocasa.py: ports the RoboCasa dataset to LeRobot format
  • examples/training/train_policy_casa.py: example training script using the RoboCasa dataset
  • src/lerobot/envs/configs.py: adds a RoboCasa env config
  • src/lerobot/envs/factory.py: adds the branch that instantiates the RoboCasa env
  • src/lerobot/envs/robocasa_env.py: RoboCasa env wrapper
  • Known limitation (see also Known Issues below): the vectorized env with batch_size > 1 has rendering issues

How was this tested

  • Manual checks / dataset runs performed.

To run the dataset conversion, first download a RoboCasa dataset with images (https://robocasa.ai/docs/use_cases/downloading_datasets.html), then port it to LeRobot format:

python -m robocasa.scripts.download_datasets --ds_types human_im
python examples/port_datasets/port_robocasa.py --dataset_path /path/to/dataset.hdf5 --repo_name your_hf_username/robocasa_dataset
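
For orientation, here is a rough sketch of the kind of conversion loop the porting script performs. This is not the script itself: the HDF5 key names, fps, task string, and the LeRobotDataset calls are assumptions, and the real port_robocasa.py additionally handles camera frames, variable-length keys, and metadata.

import h5py
import numpy as np
from lerobot.datasets.lerobot_dataset import LeRobotDataset

hdf5_path = "/path/to/dataset.hdf5"

with h5py.File(hdf5_path, "r") as f:
    episodes = list(f["data"].keys())
    # Derive feature shapes from the first episode; the key names below are assumptions.
    first = f["data"][episodes[0]]
    state_dim = first["obs/robot0_proprio-state"].shape[-1]
    action_dim = first["actions"].shape[-1]

features = {
    "observation.state": {"dtype": "float32", "shape": (state_dim,), "names": None},
    "action": {"dtype": "float32", "shape": (action_dim,), "names": None},
}

# fps is an assumption here; the real script reads it from the environment metadata.
dataset = LeRobotDataset.create(repo_id="your_hf_username/robocasa_dataset", fps=20, features=features)

with h5py.File(hdf5_path, "r") as f:
    for ep_name in f["data"]:
        ep = f["data"][ep_name]
        states = np.asarray(ep["obs/robot0_proprio-state"], dtype=np.float32)
        actions = np.asarray(ep["actions"], dtype=np.float32)
        for state, action in zip(states, actions, strict=False):
            # Recent LeRobot versions expect the task string inside the frame dict (assumption).
            dataset.add_frame({"observation.state": state, "action": action, "task": "TurnOnStove"})
        dataset.save_episode()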

To train a diffusion policy on the generated dataset:

python examples/training/train_policy_casa.py
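
For context, a rough sketch of the shape of such a training script, following the pattern of the existing LeRobot diffusion-policy training example. The import paths, config fields, and camera keys are assumptions; the actual examples/training/train_policy_casa.py may differ.

import torch

from lerobot.configs.types import FeatureType
from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
from lerobot.datasets.utils import dataset_to_policy_features
from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
repo_id = "your_hf_username/robocasa_dataset"

# Split the dataset features into policy inputs and outputs.
metadata = LeRobotDatasetMetadata(repo_id)
features = dataset_to_policy_features(metadata.features)
output_features = {key: ft for key, ft in features.items() if ft.type is FeatureType.ACTION}
input_features = {key: ft for key, ft in features.items() if key not in output_features}
cfg = DiffusionConfig(input_features=input_features, output_features=output_features)

# Sample observation/action windows at the offsets the policy expects.
delta_timestamps = {
    "observation.images.robot0_agentview_center": [i / metadata.fps for i in cfg.observation_delta_indices],
    "observation.images.robot0_eye_in_hand": [i / metadata.fps for i in cfg.observation_delta_indices],
    "observation.state": [i / metadata.fps for i in cfg.observation_delta_indices],
    "action": [i / metadata.fps for i in cfg.action_delta_indices],
}
dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)

policy = DiffusionPolicy(cfg, dataset_stats=metadata.stats).to(device)
policy.train()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4, drop_last=True)

for step, batch in enumerate(loader):
    batch = {k: (v.to(device) if isinstance(v, torch.Tensor) else v) for k, v in batch.items()}
    loss, _ = policy.forward(batch)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step % 100 == 0:
        print(f"step={step} loss={loss.item():.3f}")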

To run evaluations:

lerobot-eval \
    --policy.path=outputs/train/example_robocasa_diffusion \
    --env.type=robocasa \
    --env.task=TurnOnStove \
    --eval.batch_size=1 \
    --eval.n_episodes=20 \
    --policy.use_amp=false \
    --policy.device=cuda

How to run locally (reviewer)

  • Run the relevant tests (pytest) and the dataset porting, training, and evaluation commands listed under "How was this tested" above.

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest)
  • Documentation updated
  • CI is green

Reviewer notes

  • Anything the reviewer should focus on (performance, edge-cases, specific files) or general notes.
  • Anyone in the community is free to review the PR.

Known Issues

Evaluation with the vectorized env (batch_size > 1) causes rendering issues.

…tasets to LeRobot format

* Introduced a new script for converting RoboCasa HDF5 datasets to LeRobot format.
* Added a new environment class for RoboCasa (similar to LIBERO).
* Added a new function `get_max_dims` to determine maximum dimensions for keys with variable dimension across HDF5 datasets.
* Changed camera segmentation type from "segmentation_level" to "instance".
The end-to-end pipeline now runs for diffusion policy training.

TODO:
- Fix `max_episode_steps` in RoboCasa
- Validate with RoboCasa and RoboSuite master branch
- Validate vector env output
- Train and evaluate diffusion policy

Refs huggingface#2380
- Set `episode_length` in `RoboCasaEnvConfig` from dataset max trajectory length instead of a fixed value
- Fix `dummy_action_inputs` in `RoboCasaEnv` to avoid reuse across different robots/controllers
- Ensure compatibility with default branches of robosuite and robocasa

- [BUG] Known issue: vectorized environments (batch size > 1) have rendering issues
- Refactored type hints in `port_robocasa.py` to use built-in generic types.
- Improved code formatting and consistency across various functions in `port_robocasa.py`.
- Updated comments for clarity and corrected minor typos in the code.
Copilot AI review requested due to automatic review settings December 26, 2025 23:36

Copilot AI (Contributor) left a comment

Pull request overview

This PR integrates the RoboCasa simulation environment into LeRobot, enabling users to train and evaluate policies on kitchen manipulation tasks. The integration includes environment wrappers, dataset conversion utilities, and example training scripts.

Key changes:

  • Added RoboCasa environment wrapper with Gymnasium API compatibility
  • Created dataset porting script to convert RoboCasa HDF5 datasets to LeRobot format
  • Integrated RoboCasa into the environment factory and configuration system

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 19 comments.

Summary per file:

  • src/lerobot/envs/robocasa_env.py: new environment wrapper implementing the RoboCasa gym interface with support for multiple observation types and camera configurations
  • src/lerobot/envs/factory.py: added a factory method for creating RoboCasa environments, integrated into the existing environment creation pipeline
  • src/lerobot/envs/configs.py: added the RoboCasaEnvConfig dataclass with task-specific settings and feature mappings
  • examples/training/train_policy_casa.py: example training script demonstrating diffusion policy training on RoboCasa datasets
  • examples/port_datasets/port_robocasa.py: comprehensive dataset conversion script with automatic property discovery and CLIP embedding generation
  • pyproject.toml: added "PnP" to the spelling ignore list for task names like "PnPCabToCounter"


Comment on lines +388 to +390
dummy_action = get_robocasa_dummy_action(self._env)
for _ in range(self.num_steps_wait):
    raw_obs, _, _, _ = self._env.step(dummy_action)

Copilot AI Dec 26, 2025

The variable name 'dummy_action' should be changed to match the function rename. Consider renaming to 'zero_action' or 'noop_action' for consistency and clarity.

Suggested change
-    dummy_action = get_robocasa_dummy_action(self._env)
-    for _ in range(self.num_steps_wait):
-        raw_obs, _, _, _ = self._env.step(dummy_action)
+    zero_action = get_robocasa_dummy_action(self._env)
+    for _ in range(self.num_steps_wait):
+        raw_obs, _, _, _ = self._env.step(zero_action)

Comment on lines +96 to +101
##################################################### CHANGE HERE #####################################################
# remove unnecessary columns; it will throttle the whole process
want = {"observation.state", "action", "timestamp", "index", "episode_index", "task_index", "task"}
have = set(dataset.hf_dataset.column_names)
drop = [c for c in have if c not in want]
dataset.hf_dataset = dataset.hf_dataset.remove_columns(drop)

Copilot AI Dec 26, 2025

This comment marker and the code manipulation below it seem like development/debugging code that was left in. The comment "CHANGE HERE" and the column removal logic that follows appear to be a temporary workaround rather than production code. This should either be properly documented explaining why columns need to be removed, moved to a helper function, or removed if it's not necessary.
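
If the column pruning is kept, one option along the lines suggested above is a small documented helper (the names here are hypothetical):

def keep_only_columns(dataset, keep):
    """Drop hf_dataset columns not in `keep`, e.g. raw image columns that are already encoded as video."""
    drop = [col for col in dataset.hf_dataset.column_names if col not in keep]
    dataset.hf_dataset = dataset.hf_dataset.remove_columns(drop)
    return dataset


dataset = keep_only_columns(
    dataset, {"observation.state", "action", "timestamp", "index", "episode_index", "task_index", "task"}
)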

Comment on lines +130 to +131
# TODO: estimate the variable lengths keys directly from the dataset instead of hardcoding them
keys_with_variable_length = ["states", "obs/object-state", "obs/objects-joint-state", "obs/object"]

Copilot AI Dec 26, 2025

The hardcoded list of keys with variable length should be discovered from the dataset rather than hardcoded. The TODO comment acknowledges this, but having hardcoded keys creates a maintenance burden and may cause issues with datasets that have different structures. Consider implementing automatic discovery or at least validating that these keys exist in the dataset.
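
As a rough illustration of the automatic discovery suggested above, assuming the RoboCasa HDF5 layout with episodes under a top-level "data" group:

import h5py


def find_variable_length_keys(dataset_path):
    """Return HDF5 keys whose per-step feature dimension differs across episodes."""
    dims = {}
    with h5py.File(dataset_path, "r") as f:
        data = f.get("data", f)
        for ep_name in data:
            ep = data[ep_name]

            def visit(name, obj):
                # Only consider time-major arrays (T x dim); record the trailing dimension per key.
                if isinstance(obj, h5py.Dataset) and obj.ndim >= 2:
                    dims.setdefault(name, set()).add(obj.shape[-1])

            ep.visititems(visit)
    return sorted(key for key, sizes in dims.items() if len(sizes) > 1)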

Comment on lines +463 to +464
import shutil
import tempfile

Copilot AI Dec 26, 2025

The nested import inside the function is unusual. The imports for 'shutil' and 'tempfile' should be moved to the top of the file with other imports, following Python best practices for import organization.

    gym_kwargs=gym_kwargs,
)
vec_env = env_cls(fns)
print(f"Built vec env | n_envs={n_envs}")

Copilot AI Dec 26, 2025

The print statement should use proper logging instead of print. This is especially important for library code where users may want to control log verbosity. Consider using the logging module with an appropriate log level (e.g., logging.info).
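
A minimal sketch of the logging-based alternative (the helper name is hypothetical):

import logging

logger = logging.getLogger(__name__)


def log_vec_env_built(n_envs: int) -> None:
    # Lets callers control verbosity through the standard logging configuration instead of stdout.
    logger.info("Built vec env | n_envs=%d", n_envs)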

Comment on lines +467 to +488
    camera_name: str | Sequence[str] = "",
    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
) -> RoboCasaEnv | Any:
    """
    Create vectorized RoboCasa environments from an HDF5 dataset.

    Args:
        task_name: Name of the task
        n_envs: Number of environments to create
        gym_kwargs: Additional arguments to pass to RoboCasaEnv
        camera_name: Camera name(s) to use for observations, overrides gym_kwargs['camera_name'] if provided
        env_cls: Callable that wraps a list of environment factory callables (for vectorization)

    Returns:
        If env_cls is provided, returns vectorized environment. Otherwise returns a single RoboCasaEnv.
    """
    if not isinstance(n_envs, int) or n_envs <= 0:
        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")

    gym_kwargs = dict(gym_kwargs or {})
    gym_kwargs_camera_name = gym_kwargs.pop("camera_name", None)
    camera_name = camera_name if camera_name != "" else gym_kwargs_camera_name

Copilot AI Dec 26, 2025

The condition 'if camera_name != ""' is checking for an empty string, but the default value in the function signature is an empty string. This creates ambiguous behavior when the user explicitly passes an empty string versus not passing the argument at all. Consider using None as the default value instead of an empty string, and check 'if camera_name is not None'.
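
A small sketch of the None-default pattern described above (the helper name is hypothetical):

from collections.abc import Sequence
from typing import Any


def resolve_camera_name(
    camera_name: str | Sequence[str] | None,
    gym_kwargs: dict[str, Any] | None,
) -> str | Sequence[str] | None:
    """Prefer an explicitly passed camera_name; otherwise fall back to gym_kwargs['camera_name'] if present."""
    fallback = (gym_kwargs or {}).get("camera_name")
    return camera_name if camera_name is not None else fallback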

from lerobot.datasets.video_utils import encode_video_frames
from lerobot.utils.constants import ACTION, HF_LEROBOT_HOME, OBS_IMAGES, OBS_STATE

HF_TOKEN = os.environ["HF_TOKEN"]

Copilot AI Dec 26, 2025

The environment variable HF_TOKEN is accessed without checking if it exists first. If the environment variable is not set, this will raise a KeyError. Consider using os.environ.get("HF_TOKEN") with a fallback or adding a clear error message explaining that HF_TOKEN must be set.

Suggested change
- HF_TOKEN = os.environ["HF_TOKEN"]
+ HF_TOKEN = os.environ.get("HF_TOKEN")
+ if HF_TOKEN is None:
+     raise RuntimeError(
+         "Environment variable HF_TOKEN must be set to use the Hugging Face Hub. "
+         "Please set HF_TOKEN to a valid Hugging Face access token."
+     )

Comment on lines +91 to +108
    f = h5py.File(dataset_path, "r")
    env_args = json.loads(f["data"].attrs["env_args"]) if "data" in f else json.loads(f.attrs["env_args"])
    if isinstance(env_args, str):
        env_args = json.loads(env_args)  # double leads to dict type
    f.close()
    return env_args


def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
    """Extract environment metadata from dataset."""
    dataset_path = os.path.expanduser(dataset_path)
    f = h5py.File(dataset_path, "r")
    data = f.get("data", f)
    keys = list(data.keys())
    env_meta = data[keys[episode_index]].attrs["ep_meta"]
    env_meta = json.loads(env_meta)
    assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
    f.close()

Copilot AI Dec 26, 2025

The HDF5 file is opened but not closed in the error path. If an exception occurs during execution, the file handle will leak. Use a context manager or try-finally block to ensure proper cleanup.

Suggested change
-     f = h5py.File(dataset_path, "r")
-     env_args = json.loads(f["data"].attrs["env_args"]) if "data" in f else json.loads(f.attrs["env_args"])
-     if isinstance(env_args, str):
-         env_args = json.loads(env_args)  # double leads to dict type
-     f.close()
-     return env_args
- def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
-     """Extract environment metadata from dataset."""
-     dataset_path = os.path.expanduser(dataset_path)
-     f = h5py.File(dataset_path, "r")
-     data = f.get("data", f)
-     keys = list(data.keys())
-     env_meta = data[keys[episode_index]].attrs["ep_meta"]
-     env_meta = json.loads(env_meta)
-     assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
-     f.close()
+     with h5py.File(dataset_path, "r") as f:
+         if "data" in f:
+             raw_env_args = f["data"].attrs["env_args"]
+         else:
+             raw_env_args = f.attrs["env_args"]
+         env_args = json.loads(raw_env_args)
+         if isinstance(env_args, str):
+             env_args = json.loads(env_args)  # double leads to dict type
+     return env_args
+ def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
+     """Extract environment metadata from dataset."""
+     dataset_path = os.path.expanduser(dataset_path)
+     with h5py.File(dataset_path, "r") as f:
+         data = f.get("data", f)
+         keys = list(data.keys())
+         env_meta = data[keys[episode_index]].attrs["ep_meta"]
+         env_meta = json.loads(env_meta)
+         assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"

Comment on lines +69 to +78
delta_timestamps = {
    "observation.images.robot0_agentview_center": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "observation.images.robot0_eye_in_hand": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "action": [i / dataset_metadata.fps for i in cfg.action_delta_indices],
}


Copilot AI Dec 26, 2025

This assignment to 'delta_timestamps' is unnecessary as it is redefined before this value is used.

Suggested change
- delta_timestamps = {
-     "observation.images.robot0_agentview_center": [
-         i / dataset_metadata.fps for i in cfg.observation_delta_indices
-     ],
-     "observation.images.robot0_eye_in_hand": [
-         i / dataset_metadata.fps for i in cfg.observation_delta_indices
-     ],
-     "action": [i / dataset_metadata.fps for i in cfg.action_delta_indices],
- }

)
)
# Other groups - we'll handle them separately
pass

Copilot AI Dec 26, 2025

Unnecessary 'pass' statement.

Suggested change
- pass

@ShahRutav marked this pull request as draft December 26, 2025 23:56

Labels

  • evaluation: For issues or PRs related to environment evaluation, and benchmarks.
  • examples: Issues related to the examples
