[AAAI 2025 Oral] FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation
This is the official implementation of our AAAI 2025 paper: FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation. [Paper](https://arxiv.org/abs/2412.04987)
- 2025.1.18 Our paper has been selected for oral presentation at AAAI 2025.
- 2024.12.17 The final version of our paper is now available on arXiv.
- 2024.12.10 Our paper has been accepted by AAAI 2025.
Robots can acquire complex manipulation skills by learning policies from expert demonstrations, which is often known as vision-based imitation learning. Generating policies with diffusion and flow matching models has proven effective, particularly in robotic manipulation tasks. However, these recursion-based approaches are inefficient at inference time, since they must iterate from the noise distribution to the policy distribution, posing a challenging trade-off between efficiency and quality. This motivates us to propose FlowPolicy, a novel framework for fast policy generation based on consistency flow matching and 3D vision. Our approach refines the flow dynamics by normalizing the self-consistency of the velocity field, enabling the model to derive task-execution policies in a single inference step. Specifically, FlowPolicy conditions on the observed 3D point cloud, and consistency flow matching directly defines straight-line flows from different time states to the same action space while simultaneously constraining their velocity values. In other words, we approximate the trajectories from noise to robot actions by normalizing the self-consistency of the velocity field within the action space, which improves inference efficiency. We validate the effectiveness of FlowPolicy on Adroit and Metaworld, demonstrating a 7x increase in inference speed while maintaining competitive average success rates compared to state-of-the-art methods.
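For intuition, the sketch below illustrates a consistency flow matching training objective of the kind described above, in PyTorch. It is a simplified illustration rather than the repository's implementation: the names `VelocityNet`, `consistency_fm_loss`, `delta_t`, and `alpha` are placeholders, and practical implementations typically add a time-segmented objective and an EMA target network.

```python
# Minimal, simplified sketch of a consistency flow matching training step (PyTorch).
# NOT the repository's implementation; module and argument names are illustrative.
import torch
import torch.nn as nn


class VelocityNet(nn.Module):
    """Predicts the velocity field v(x_t, t | cond) over the action space."""
    def __init__(self, action_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(action_dim + cond_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, x_t, t, cond):
        return self.net(torch.cat([x_t, cond, t], dim=-1))


def consistency_fm_loss(velocity_net, actions, cond, delta_t=0.01, alpha=1.0):
    """One loss evaluation for a simplified consistency flow matching objective.

    x_t lies on the straight line between Gaussian noise (t=0) and expert
    actions (t=1). The predicted endpoint f(x_t, t) = x_t + (1 - t) * v(x_t, t)
    and the velocity itself are forced to agree at two nearby times, which is
    what later allows single-step inference.
    """
    noise = torch.randn_like(actions)
    t = torch.rand(actions.shape[0], 1, device=actions.device) * (1.0 - delta_t)
    t2 = t + delta_t

    x_t = (1.0 - t) * noise + t * actions      # straight-line interpolation
    x_t2 = (1.0 - t2) * noise + t2 * actions

    v1 = velocity_net(x_t, t, cond)
    with torch.no_grad():                      # stop-gradient target branch
        v2 = velocity_net(x_t2, t2, cond)

    f1 = x_t + (1.0 - t) * v1                  # predicted endpoints (actions)
    f2 = x_t2 + (1.0 - t2) * v2

    endpoint_loss = (f1 - f2).pow(2).mean()    # same endpoint from different times
    velocity_loss = (v1 - v2).pow(2).mean()    # consistent velocity along the line
    return endpoint_loss + alpha * velocity_loss


# Example usage with random tensors (7-D actions, 128-D conditioning features):
# net = VelocityNet(action_dim=7, cond_dim=128)
# loss = consistency_fm_loss(net, torch.randn(32, 7), torch.randn(32, 128))
# loss.backward()
```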
The above is a visualization of FlowPolicy: by defining a straight-line flow, the data can flow as fast as possible from the noise distribution to the action distribution (Adroit: open the door). The following shows the details of FlowPolicy. Given a certain number of expert demonstrations, they are first converted into 3D point clouds. The 3D point clouds and the robot state are then fed into two encoders to obtain a compact 3D visual representation and a robot state embedding, respectively. Finally, a straight-line flow is learned by conditional consistency flow matching to generate high-quality robot actions and perform the corresponding tasks (Metaworld: Assembly) at real-time inference speed.

See install.md for installation instructions.
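As a companion to the training sketch above, the following hypothetical snippet shows why a straight-line flow enables single-step inference: conditioned on the encoded point cloud and robot state, one Euler step of size 1 carries a noise sample directly to an action. The encoder and network names are placeholders, reusing the velocity-field interface assumed in the previous sketch rather than the repository's actual modules.

```python
# Sketch of single-step action generation under the assumptions above (PyTorch).
import torch


@torch.no_grad()
def predict_action(point_cloud, robot_state, point_cloud_encoder, state_encoder,
                   velocity_net, action_dim):
    """Generate a robot action with a single flow step.

    Because the learned flow is (approximately) a straight line from noise (t=0)
    to actions (t=1), one Euler step of size 1 maps noise directly to an action:
        a = z + 1 * v(z, t=0 | condition)
    """
    batch = point_cloud.shape[0]

    # Compact 3D visual representation and robot state embedding (see text above).
    visual_feat = point_cloud_encoder(point_cloud)        # e.g. [B, D_v]
    state_feat = state_encoder(robot_state)               # e.g. [B, D_s]
    cond = torch.cat([visual_feat, state_feat], dim=-1)   # conditioning vector

    # Sample noise in the action space and take one Euler step along the flow.
    z = torch.randn(batch, action_dim, device=point_cloud.device)
    t0 = torch.zeros(batch, 1, device=point_cloud.device)
    action = z + velocity_net(z, t0, cond)                # single-step generation
    return action
```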
You could generate demonstrations by yourself using our provided expert policies. Generated demonstrations are under `$YOUR_REPO_PATH/FlowPolicy/data/`.

Scripts for generating demonstrations, training, and evaluation are all provided in the `scripts/` folder. The results are logged by wandb, so you need to `wandb login` first to see the results and videos.
- Generate demonstrations with `gen_demonstration_adroit.sh` and `gen_demonstration_metaworld.sh`. See the scripts for details. For example:
  ```bash
  bash scripts/gen_demonstration_adroit.sh hammer
  ```
  This will generate demonstrations for the `hammer` task in the Adroit environment. The data will be saved in the `FlowPolicy/data/` folder automatically.
- Train and evaluate a policy with behavior cloning. For example:
  ```bash
  bash scripts/train_policy.sh flowpolicy adroit_hammer 0129 0 0
  ```
  This will train a FlowPolicy policy on the `hammer` task in the Adroit environment using the point cloud modality.
- Evaluate a saved policy or use it for inference. For example:
  ```bash
  bash scripts/eval_policy.sh flowpolicy adroit_hammer 0129 0 0
  ```
  This will evaluate the saved FlowPolicy policy you just trained. Note: the evaluation script is only provided for deployment/inference; for benchmarking, please use the results logged in wandb during training. An end-to-end example combining all three steps is sketched after this list.
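For convenience, the three steps above can be chained for a single task. The snippet below is only a sketch: the meanings of the positional arguments (experiment tag, seed, GPU id) are inferred from the examples above, so please consult the scripts in `scripts/` for the authoritative interface.

```bash
# End-to-end sketch for one Adroit task. Argument meanings (experiment tag,
# seed, GPU id) are inferred from the examples above -- check the scripts in
# scripts/ for the authoritative interface.
wandb login                                                      # needed once for logging

bash scripts/gen_demonstration_adroit.sh hammer                  # 1. generate expert demonstrations
bash scripts/train_policy.sh flowpolicy adroit_hammer 0129 0 0   # 2. train (tag 0129, seed 0, GPU 0)
bash scripts/eval_policy.sh flowpolicy adroit_hammer 0129 0 0    # 3. inference with the saved policy
```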
This repository is released under the MIT license.
Our code is built upon 3D Diffusion Policy, Consistency_FM, VRL3, and Metaworld. We would like to thank the authors for their excellent works.
If you find this repository helpful, please consider citing:
```bibtex
@article{zhang2024flowpolicy,
  title={FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation},
  author={Qinglun Zhang and Zhen Liu and Haoqiang Fan and Guanghui Liu and Bing Zeng and Shuaicheng Liu},
  year={2024},
  eprint={2412.04987},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2412.04987}
}
```
If you have any questions, feel free to contact Qinglun Zhang at [email protected].