Conversation

@ShahRutav

Title

Added RoboCasa environment.

Type / Scope

  • Type: Feature
  • Scope: src/lerobot/envs, examples (dataset porting and training)

Summary / Motivation

This PR integrates the RoboCasa simulation environment into LeRobot so that policies can be trained and evaluated on its kitchen manipulation tasks. It adds an environment wrapper, a dataset conversion script, and an example training script.

Related issues

  • huggingface#2380

What changed

  • examples/port_datasets/port_robocasa.py: ports the RoboCasa dataset to LeRobot format
  • examples/training/train_policy_casa.py: example training script using the RoboCasa dataset
  • src/lerobot/envs/configs.py: adds a RoboCasa env config
  • src/lerobot/envs/factory.py: adds the branch that instantiates the RoboCasa env
  • src/lerobot/envs/robocasa_env.py: RoboCasa env wrapper
  • Known limitation (see also Known Issues below): the vectorized env with batch_size > 1 has rendering issues

How was this tested

  • Manual checks / dataset runs performed.

To run the dataset conversion, first download a RoboCasa dataset with images (https://robocasa.ai/docs/use_cases/downloading_datasets.html), then port it to LeRobot format:

python -m robocasa.scripts.download_datasets --ds_types human_im
python examples/port_datasets/port_robocasa.py --dataset_path /path/to/dataset.hdf5 --repo_name your_hf_username/robocasa_dataset
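
For orientation, here is a rough sketch of the kind of conversion loop the porting script performs. This is not the script itself: the HDF5 key names, fps, task string, and the LeRobotDataset calls are assumptions, and the real port_robocasa.py additionally handles camera frames, variable-length keys, and metadata.

import h5py
import numpy as np
from lerobot.datasets.lerobot_dataset import LeRobotDataset

hdf5_path = "/path/to/dataset.hdf5"

with h5py.File(hdf5_path, "r") as f:
    episodes = list(f["data"].keys())
    # Derive feature shapes from the first episode; the key names below are assumptions.
    first = f["data"][episodes[0]]
    state_dim = first["obs/robot0_proprio-state"].shape[-1]
    action_dim = first["actions"].shape[-1]

features = {
    "observation.state": {"dtype": "float32", "shape": (state_dim,), "names": None},
    "action": {"dtype": "float32", "shape": (action_dim,), "names": None},
}

# fps is an assumption here; the real script reads it from the environment metadata.
dataset = LeRobotDataset.create(repo_id="your_hf_username/robocasa_dataset", fps=20, features=features)

with h5py.File(hdf5_path, "r") as f:
    for ep_name in f["data"]:
        ep = f["data"][ep_name]
        states = np.asarray(ep["obs/robot0_proprio-state"], dtype=np.float32)
        actions = np.asarray(ep["actions"], dtype=np.float32)
        for state, action in zip(states, actions, strict=False):
            # Recent LeRobot versions expect the task string inside the frame dict (assumption).
            dataset.add_frame({"observation.state": state, "action": action, "task": "TurnOnStove"})
        dataset.save_episode()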

To train a diffusion policy on the generated dataset:

python examples/training/train_policy_casa.py
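
For context, a rough sketch of the shape of such a training script, following the pattern of the existing LeRobot diffusion-policy training example. The import paths, config fields, and camera keys are assumptions; the actual examples/training/train_policy_casa.py may differ.

import torch

from lerobot.configs.types import FeatureType
from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata
from lerobot.datasets.utils import dataset_to_policy_features
from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
repo_id = "your_hf_username/robocasa_dataset"

# Split the dataset features into policy inputs and outputs.
metadata = LeRobotDatasetMetadata(repo_id)
features = dataset_to_policy_features(metadata.features)
output_features = {key: ft for key, ft in features.items() if ft.type is FeatureType.ACTION}
input_features = {key: ft for key, ft in features.items() if key not in output_features}
cfg = DiffusionConfig(input_features=input_features, output_features=output_features)

# Sample observation/action windows at the offsets the policy expects.
delta_timestamps = {
    "observation.images.robot0_agentview_center": [i / metadata.fps for i in cfg.observation_delta_indices],
    "observation.images.robot0_eye_in_hand": [i / metadata.fps for i in cfg.observation_delta_indices],
    "observation.state": [i / metadata.fps for i in cfg.observation_delta_indices],
    "action": [i / metadata.fps for i in cfg.action_delta_indices],
}
dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)

policy = DiffusionPolicy(cfg, dataset_stats=metadata.stats).to(device)
policy.train()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4, drop_last=True)

for step, batch in enumerate(loader):
    batch = {k: (v.to(device) if isinstance(v, torch.Tensor) else v) for k, v in batch.items()}
    loss, _ = policy.forward(batch)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step % 100 == 0:
        print(f"step={step} loss={loss.item():.3f}")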

To run evaluations:

lerobot-eval \
    --policy.path=outputs/train/example_robocasa_diffusion \
    --env.type=robocasa \
    --env.task=TurnOnStove \
    --eval.batch_size=1 \
    --eval.n_episodes=20 \
    --policy.use_amp=false \
    --policy.device=cuda

How to run locally (reviewer)

  • Run the relevant tests (pytest) and the dataset porting, training, and evaluation commands listed under "How was this tested" above.

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest)
  • Documentation updated
  • CI is green

Reviewer notes

  • Anything the reviewer should focus on (performance, edge-cases, specific files) or general notes.
  • Anyone in the community is free to review the PR.

Known Issues

Evaluation with the vectorized env (batch_size > 1) causes rendering issues.

…tasets to LeRobot format

* Introduced a new script for converting RoboCasa HDF5 datasets to LeRobot format.
* Added a new environment class for RoboCasa (similar to LIBERO).
* Added a new function `get_max_dims` to determine maximum dimensions for keys with variable dimension across HDF5 datasets.
* Changed camera segmentation type from "segmentation_level" to "instance".
The end-to-end pipeline now runs for diffusion policy training.

TODO:
- Fix `max_episode_steps` in RoboCasa
- Validate with RoboCasa and RoboSuite master branch
- Validate vector env output
- Train and evaluate diffusion policy

Refs huggingface#2380
- Set `episode_length` in `RoboCasaEnvConfig` from dataset max trajectory length instead of a fixed value
- Fix `dummy_action_inputs` in `RoboCasaEnv` to avoid reuse across different robots/controllers
- Ensure compatibility with default branches of robosuite and robocasa

- [BUG] Known issue: vectorized environments (batch size > 1) have rendering issues
- Refactored type hints in `port_robocasa.py` to use built-in generic types.
- Improved code formatting and consistency across various functions in `port_robocasa.py`.
- Updated comments for clarity and corrected minor typos in the code.
Copilot AI review requested due to automatic review settings December 26, 2025 23:36

Copilot AI (Contributor) left a comment

Pull request overview

This PR integrates the RoboCasa simulation environment into LeRobot, enabling users to train and evaluate policies on kitchen manipulation tasks. The integration includes environment wrappers, dataset conversion utilities, and example training scripts.

Key changes:

  • Added RoboCasa environment wrapper with Gymnasium API compatibility
  • Created dataset porting script to convert RoboCasa HDF5 datasets to LeRobot format
  • Integrated RoboCasa into the environment factory and configuration system

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 19 comments.

Summary per file:

  • src/lerobot/envs/robocasa_env.py: new environment wrapper implementing the RoboCasa gym interface with support for multiple observation types and camera configurations
  • src/lerobot/envs/factory.py: added a factory method for creating RoboCasa environments, integrated into the existing environment creation pipeline
  • src/lerobot/envs/configs.py: added the RoboCasaEnvConfig dataclass with task-specific settings and feature mappings
  • examples/training/train_policy_casa.py: example training script demonstrating diffusion policy training on RoboCasa datasets
  • examples/port_datasets/port_robocasa.py: comprehensive dataset conversion script with automatic property discovery and CLIP embedding generation
  • pyproject.toml: added "PnP" to the spelling ignore list for task names like "PnPCabToCounter"


Comment on lines +388 to +390
dummy_action = get_robocasa_dummy_action(self._env)
for _ in range(self.num_steps_wait):
    raw_obs, _, _, _ = self._env.step(dummy_action)

Copilot AI Dec 26, 2025

The variable name 'dummy_action' should be changed to match the function rename. Consider renaming to 'zero_action' or 'noop_action' for consistency and clarity.

Suggested change
-    dummy_action = get_robocasa_dummy_action(self._env)
-    for _ in range(self.num_steps_wait):
-        raw_obs, _, _, _ = self._env.step(dummy_action)
+    zero_action = get_robocasa_dummy_action(self._env)
+    for _ in range(self.num_steps_wait):
+        raw_obs, _, _, _ = self._env.step(zero_action)

Comment on lines +96 to +101
##################################################### CHANGE HERE #####################################################
# remove unnecessary columns; it will throttle the whole process
want = {"observation.state", "action", "timestamp", "index", "episode_index", "task_index", "task"}
have = set(dataset.hf_dataset.column_names)
drop = [c for c in have if c not in want]
dataset.hf_dataset = dataset.hf_dataset.remove_columns(drop)

Copilot AI Dec 26, 2025

This comment marker and the code manipulation below it seem like development/debugging code that was left in. The comment "CHANGE HERE" and the column removal logic that follows appear to be a temporary workaround rather than production code. This should either be properly documented explaining why columns need to be removed, moved to a helper function, or removed if it's not necessary.
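
If the column pruning is kept, one option along the lines suggested above is a small documented helper (the names here are hypothetical):

def keep_only_columns(dataset, keep):
    """Drop hf_dataset columns not in `keep`, e.g. raw image columns that are already encoded as video."""
    drop = [col for col in dataset.hf_dataset.column_names if col not in keep]
    dataset.hf_dataset = dataset.hf_dataset.remove_columns(drop)
    return dataset


dataset = keep_only_columns(
    dataset, {"observation.state", "action", "timestamp", "index", "episode_index", "task_index", "task"}
)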

Comment on lines +130 to +131
# TODO: estimate the variable lengths keys directly from the dataset instead of hardcoding them
keys_with_variable_length = ["states", "obs/object-state", "obs/objects-joint-state", "obs/object"]

Copilot AI Dec 26, 2025

The hardcoded list of keys with variable length should be discovered from the dataset rather than hardcoded. The TODO comment acknowledges this, but having hardcoded keys creates a maintenance burden and may cause issues with datasets that have different structures. Consider implementing automatic discovery or at least validating that these keys exist in the dataset.
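
As a rough illustration of the automatic discovery suggested above, assuming the RoboCasa HDF5 layout with episodes under a top-level "data" group:

import h5py


def find_variable_length_keys(dataset_path):
    """Return HDF5 keys whose per-step feature dimension differs across episodes."""
    dims = {}
    with h5py.File(dataset_path, "r") as f:
        data = f.get("data", f)
        for ep_name in data:
            ep = data[ep_name]

            def visit(name, obj):
                # Only consider time-major arrays (T x dim); record the trailing dimension per key.
                if isinstance(obj, h5py.Dataset) and obj.ndim >= 2:
                    dims.setdefault(name, set()).add(obj.shape[-1])

            ep.visititems(visit)
    return sorted(key for key, sizes in dims.items() if len(sizes) > 1)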

Comment on lines +463 to +464
import shutil
import tempfile

Copilot AI Dec 26, 2025

The nested import inside the function is unusual. The imports for 'shutil' and 'tempfile' should be moved to the top of the file with other imports, following Python best practices for import organization.

    gym_kwargs=gym_kwargs,
)
vec_env = env_cls(fns)
print(f"Built vec env | n_envs={n_envs}")

Copilot AI Dec 26, 2025

The print statement should use proper logging instead of print. This is especially important for library code where users may want to control log verbosity. Consider using the logging module with an appropriate log level (e.g., logging.info).
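
A minimal sketch of the logging-based alternative (the helper name is hypothetical):

import logging

logger = logging.getLogger(__name__)


def log_vec_env_built(n_envs: int) -> None:
    # Lets callers control verbosity through the standard logging configuration instead of stdout.
    logger.info("Built vec env | n_envs=%d", n_envs)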

Comment on lines +467 to +488
    camera_name: str | Sequence[str] = "",
    env_cls: Callable[[Sequence[Callable[[], Any]]], Any] | None = None,
) -> RoboCasaEnv | Any:
    """
    Create vectorized RoboCasa environments from an HDF5 dataset.

    Args:
        task_name: Name of the task
        n_envs: Number of environments to create
        gym_kwargs: Additional arguments to pass to RoboCasaEnv
        camera_name: Camera name(s) to use for observations, overrides gym_kwargs['camera_name'] if provided
        env_cls: Callable that wraps a list of environment factory callables (for vectorization)

    Returns:
        If env_cls is provided, returns vectorized environment. Otherwise returns a single RoboCasaEnv.
    """
    if not isinstance(n_envs, int) or n_envs <= 0:
        raise ValueError(f"n_envs must be a positive int; got {n_envs}.")

    gym_kwargs = dict(gym_kwargs or {})
    gym_kwargs_camera_name = gym_kwargs.pop("camera_name", None)
    camera_name = camera_name if camera_name != "" else gym_kwargs_camera_name

Copilot AI Dec 26, 2025

The condition 'if camera_name != ""' is checking for an empty string, but the default value in the function signature is an empty string. This creates ambiguous behavior when the user explicitly passes an empty string versus not passing the argument at all. Consider using None as the default value instead of an empty string, and check 'if camera_name is not None'.
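
A small sketch of the None-default pattern described above (the helper name is hypothetical):

from collections.abc import Sequence
from typing import Any


def resolve_camera_name(
    camera_name: str | Sequence[str] | None,
    gym_kwargs: dict[str, Any] | None,
) -> str | Sequence[str] | None:
    """Prefer an explicitly passed camera_name; otherwise fall back to gym_kwargs['camera_name'] if present."""
    fallback = (gym_kwargs or {}).get("camera_name")
    return camera_name if camera_name is not None else fallback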

from lerobot.datasets.video_utils import encode_video_frames
from lerobot.utils.constants import ACTION, HF_LEROBOT_HOME, OBS_IMAGES, OBS_STATE

HF_TOKEN = os.environ["HF_TOKEN"]

Copilot AI Dec 26, 2025

The environment variable HF_TOKEN is accessed without checking if it exists first. If the environment variable is not set, this will raise a KeyError. Consider using os.environ.get("HF_TOKEN") with a fallback or adding a clear error message explaining that HF_TOKEN must be set.

Suggested change
- HF_TOKEN = os.environ["HF_TOKEN"]
+ HF_TOKEN = os.environ.get("HF_TOKEN")
+ if HF_TOKEN is None:
+     raise RuntimeError(
+         "Environment variable HF_TOKEN must be set to use the Hugging Face Hub. "
+         "Please set HF_TOKEN to a valid Hugging Face access token."
+     )

Comment on lines +91 to +108
    f = h5py.File(dataset_path, "r")
    env_args = json.loads(f["data"].attrs["env_args"]) if "data" in f else json.loads(f.attrs["env_args"])
    if isinstance(env_args, str):
        env_args = json.loads(env_args)  # double leads to dict type
    f.close()
    return env_args


def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
    """Extract environment metadata from dataset."""
    dataset_path = os.path.expanduser(dataset_path)
    f = h5py.File(dataset_path, "r")
    data = f.get("data", f)
    keys = list(data.keys())
    env_meta = data[keys[episode_index]].attrs["ep_meta"]
    env_meta = json.loads(env_meta)
    assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
    f.close()

Copilot AI Dec 26, 2025

The HDF5 file is opened but not closed in the error path. If an exception occurs during execution, the file handle will leak. Use a context manager or try-finally block to ensure proper cleanup.

Suggested change
-     f = h5py.File(dataset_path, "r")
-     env_args = json.loads(f["data"].attrs["env_args"]) if "data" in f else json.loads(f.attrs["env_args"])
-     if isinstance(env_args, str):
-         env_args = json.loads(env_args)  # double leads to dict type
-     f.close()
-     return env_args
- def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
-     """Extract environment metadata from dataset."""
-     dataset_path = os.path.expanduser(dataset_path)
-     f = h5py.File(dataset_path, "r")
-     data = f.get("data", f)
-     keys = list(data.keys())
-     env_meta = data[keys[episode_index]].attrs["ep_meta"]
-     env_meta = json.loads(env_meta)
-     assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"
-     f.close()
+     with h5py.File(dataset_path, "r") as f:
+         if "data" in f:
+             raw_env_args = f["data"].attrs["env_args"]
+         else:
+             raw_env_args = f.attrs["env_args"]
+         env_args = json.loads(raw_env_args)
+         if isinstance(env_args, str):
+             env_args = json.loads(env_args)  # double leads to dict type
+     return env_args
+ def _parse_env_meta_from_hdf5(dataset_path: str, episode_index: int = 0) -> dict[str, Any]:
+     """Extract environment metadata from dataset."""
+     dataset_path = os.path.expanduser(dataset_path)
+     with h5py.File(dataset_path, "r") as f:
+         data = f.get("data", f)
+         keys = list(data.keys())
+         env_meta = data[keys[episode_index]].attrs["ep_meta"]
+         env_meta = json.loads(env_meta)
+         assert isinstance(env_meta, dict), f"Expected dict type but got {type(env_meta)}"

Comment on lines +69 to +78
delta_timestamps = {
    "observation.images.robot0_agentview_center": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "observation.images.robot0_eye_in_hand": [
        i / dataset_metadata.fps for i in cfg.observation_delta_indices
    ],
    "action": [i / dataset_metadata.fps for i in cfg.action_delta_indices],
}


Copilot AI Dec 26, 2025

This assignment to 'delta_timestamps' is unnecessary as it is redefined before this value is used.

Suggested change
- delta_timestamps = {
-     "observation.images.robot0_agentview_center": [
-         i / dataset_metadata.fps for i in cfg.observation_delta_indices
-     ],
-     "observation.images.robot0_eye_in_hand": [
-         i / dataset_metadata.fps for i in cfg.observation_delta_indices
-     ],
-     "action": [i / dataset_metadata.fps for i in cfg.action_delta_indices],
- }

)
)
# Other groups - we'll handle them separately
pass

Copilot AI Dec 26, 2025

Unnecessary 'pass' statement.

Suggested change
- pass

@ShahRutav marked this pull request as draft December 26, 2025 23:56

Labels

  • evaluation: For issues or PRs related to environment evaluation, and benchmarks.
  • examples: Issues related to the examples
