feat(datasets): expose video codec option for dataset recording #2771
Type / Scope
Summary / Motivation
LeRobot hardcodes `libsvtav1` for video encoding during dataset recording. While AV1 offers excellent compression, it is CPU-heavy and can starve camera capture threads during recording. This PR exposes the existing `vcodec` parameter (which `encode_video_frames` already supports) through the `LeRobotDataset` API and recording CLI, allowing users to choose faster codecs like `h264` or `hevc` when needed.

The implementation threads the codec option through both sequential and parallel encoding paths without changing any defaults: existing workflows continue to use `libsvtav1` unless explicitly overridden.

Related issues
- … `LeRobotDataset` or monkeypatch to use different codecs during recording

What changed
src/lerobot/datasets/lerobot_dataset.py:
- `VALID_VIDEO_CODECS` constant (`h264`, `hevc`, `libsvtav1`)
- `_encode_video_worker()` accepts and forwards the `vcodec` parameter
- `vcodec` parameter on `LeRobotDataset.__init__()` and `LeRobotDataset.create()`, with validation
- … `self.vcodec`

src/lerobot/scripts/lerobot_record.py:
- `vcodec` field on `DatasetRecordConfig`
- `vcodec` passed to both `LeRobotDataset()` (resume) and `LeRobotDataset.create()` (new dataset)
- `--dataset.vcodec` option

tests/datasets/test_datasets.py:

No breaking changes: the default remains `libsvtav1`, and all existing code continues to work unchanged.

How was this tested
- `test_encode_video_worker_forwards_vcodec`: verifies `vcodec` is forwarded to `encode_video_frames`
- `test_encode_video_worker_default_vcodec`: verifies the default is `libsvtav1`
- `test_lerobot_dataset_vcodec_validation`: verifies invalid codecs raise `ValueError`
- `test_valid_video_codecs_constant`: verifies the constant contains the expected codecs

How to run locally (reviewer)
Run the new tests:
    pytest -q tests/datasets/test_datasets.py -k "vcodec"

Test recording with a different codec (requires robot hardware or mock):

    lerobot-record \
      --robot.type=so100_follower \
      --dataset.repo_id=test/vcodec_test \
      --dataset.single_task="Test task" \
      --dataset.vcodec=h264

Checklist (required before merge)
- … (`pre-commit run -a`)
- … (`pytest`)

Reviewer notes
- … `ProcessPoolExecutor` submissions) correctly pickles and forwards the `vcodec` argument
- `encode_video_frames` validates codecs
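
For reviewers checking the forwarding point above, here is a minimal, self-contained sketch of the pattern, not the code in this PR. `check_vcodec`, `encode_worker`, and `encode_episodes` are hypothetical names introduced for illustration; only the codec list and the `libsvtav1` default come from the description. The worker is a stand-in that returns its arguments instead of encoding anything, and a thread pool is used so the snippet runs anywhere; a process pool would receive the same arguments after pickling them.

```python
import pickle
from concurrent.futures import Executor, ThreadPoolExecutor

# Codec names as listed in the PR description.
VALID_VIDEO_CODECS = ("h264", "hevc", "libsvtav1")


def check_vcodec(vcodec: str) -> str:
    """Hypothetical helper: reject unknown codecs early with a ValueError."""
    if vcodec not in VALID_VIDEO_CODECS:
        raise ValueError(
            f"Invalid vcodec {vcodec!r}; expected one of {VALID_VIDEO_CODECS}"
        )
    return vcodec


def encode_worker(episode_index: int, vcodec: str = "libsvtav1") -> tuple[int, str]:
    """Stand-in worker: reports which episode it would encode and with which codec."""
    return episode_index, vcodec


def encode_episodes(
    episodes: list[int], vcodec: str, executor: Executor
) -> list[tuple[int, str]]:
    """Validate once, then pass vcodec explicitly with every submission."""
    vcodec = check_vcodec(vcodec)
    for ep in episodes:
        # A process pool pickles submitted arguments, so confirm they round-trip.
        assert pickle.loads(pickle.dumps((ep, vcodec))) == (ep, vcodec)
    futures = [executor.submit(encode_worker, ep, vcodec) for ep in episodes]
    return [f.result() for f in futures]


with ThreadPoolExecutor(max_workers=2) as pool:
    results = encode_episodes([0, 1, 2], "h264", pool)

print(results)  # [(0, 'h264'), (1, 'h264'), (2, 'h264')]
```

Passing `vcodec` as an explicit positional argument, rather than relying on the worker's default, is what keeps an overridden codec from silently falling back to `libsvtav1` in the parallel path.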