Conversation
So I've worked on the full collision mesh and examples, and I have successfully trained Joystick, Handstand, Footstand, and Getup. The policies need some reward tuning, but training works. Let me know if I need to do anything else.
|
Note: the actuator order in the MJX model for Go2 does not follow the Unitree leg ordering, which is FR/FL/RR/RL; in the MJX model it's FL/FR/RL/RR. Just a note, since simply forwarding the actions in the default order to LowCmd mixes up the joints. Should this be fixed in the MJX model, or is it an implementation detail left to the driver?
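[Editor's note] The leg-order mismatch above can be handled with a fixed index remap in the driver. The sketch below is illustrative only: the constant names and the hip/thigh/calf 3-joints-per-leg layout are assumptions about the Go2 convention, not code from this PR.

```python
# Remap a 12-dim action vector from the MJX actuator order (FL/FR/RL/RR)
# to the Unitree LowCmd leg order (FR/FL/RR/RL), 3 joints per leg.
MJX_LEGS = ["FL", "FR", "RL", "RR"]
UNITREE_LEGS = ["FR", "FL", "RR", "RL"]

# For each Unitree slot, find the source leg's index in the MJX order,
# then expand it to the three per-joint indices of that leg.
MJX_TO_UNITREE = [
    MJX_LEGS.index(leg) * 3 + j for leg in UNITREE_LEGS for j in range(3)
]

def remap_actions(mjx_actions):
    """Reorder MJX-ordered joint targets into Unitree LowCmd joint order."""
    return [mjx_actions[i] for i in MJX_TO_UNITREE]
```

Applying this once in the driver keeps the policy and the MJX model untouched, which matches the "implementation detail left to the driver" option.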
|
Hello! Have you successfully trained Go2Getup and transferred it sim-to-real to a real robot? I found that a single training run like this does not work: python train_jax_ppo.py --env_name=Go2Getup
|
Yes, I trained the Joystick policy and transferred it to a real Go2 successfully. The Getup and Handstand tasks are copied straight from Go1, but from quick tests they did result in successful policies in sim; I didn't transfer these to the real Go2.
|
For Getup there's an issue about it for Go1, so you might have a look: #65
That's the key point! It's mentioned in #65 that 50M timesteps are not enough for training Go1Getup. But should I train 750M timesteps at once, or train 50M timesteps at a time and repeatedly load checkpoints? More importantly, the paper mentions:
How should these two tricks be added to the training process?
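[Editor's note] The "one long run vs. repeated checkpoint loading" question amounts to warm-starting each training segment from the previous segment's parameters. The sketch below is a hypothetical shape for that loop: `train_fn` and its `restore_params` keyword are placeholders loosely modeled on a PPO `train` call, not the actual `train_jax_ppo.py` flags.

```python
# Sketch: train in chunks instead of one long 750M-step run, passing the
# previous chunk's params back in so each segment resumes where the last
# one stopped. All names here are illustrative placeholders.
def train_in_chunks(train_fn, env, chunk_steps=50_000_000, num_chunks=15):
    params = None  # first chunk starts from scratch
    for _ in range(num_chunks):
        # Each call warm-starts from the previous chunk's parameters.
        make_policy, params, metrics = train_fn(
            environment=env,
            num_timesteps=chunk_steps,
            restore_params=params,
        )
    return make_policy, params
```

If the trainer's learning-rate or entropy schedules are tied to `num_timesteps`, chunked resumption resets those schedules each chunk, so a single long run is not strictly equivalent.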
|
That's a bit off-topic for this PR; I'd suggest you ask directly in the issue itself, since it's pretty much the same problem and the Go1/Go2 architectures are very similar.
|
Hi @aatb-ch thanks for the PR! I'll try to get to this after the CoRL supplemental deadline (probs end of week).
Thank you! I have reproduced it after 750M timesteps of training.
|
@kevinzakka Super, yeah, no stress; just let me know once you've got time whether I need to change anything.
|
Hi, https://github.com/DerSimi/unitree_go2_sim2real. But note: when the Go2 is in low-level state mode, which is necessary for low-level control, the "sportmodestate" is not published by the robot. This means the linear velocity used as an observation here is not available. In my code, you can see that I circumvented this by setting the 'linvel' in the observation to the current command. It's a wonder it worked at all.
|
Even more: try zeroing out the linvel, gyro, and gravity, and it still works! But yeah, you don't get any state estimation from the internal sportmodestate, so you have to estimate it some other way. I did the same as you initially, but realized it's the same data as the command already passed as an observation anyway, so it doesn't really matter to pass it again in place of linvel.
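[Editor's note] The workaround both commenters describe is to write the joystick command into the observation slot where the measured base linear velocity would normally go. A minimal sketch, assuming a flat observation layout of linvel, gyro, gravity, joint positions, joint velocities, and last action (field names and ordering are illustrative, not the exact layout used in this PR):

```python
# Build the policy observation on the real Go2 in low-level mode, where
# sportmodestate (and thus measured linvel) is not published. Per the
# workaround above, the commanded velocity fills the linvel slot.
def build_observation(command, gyro, gravity, joint_pos, joint_vel, last_action):
    # command: (vx, vy, wz) joystick command; reused in place of measured linvel
    linvel_substitute = list(command)
    return (
        linvel_substitute    # stands in for the missing base linear velocity
        + list(gyro)         # angular velocity from the IMU
        + list(gravity)      # projected gravity vector
        + list(joint_pos)    # 12 joint positions
        + list(joint_vel)    # 12 joint velocities
        + list(last_action)  # previous 12-dim action
    )
```

Since the command is typically also part of the observation elsewhere, this substitution just duplicates information the policy already sees, which is consistent with the report that zeroing linvel works too.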
|
@DerSimi May I know if you still have the sim-to-real code available?
|
Hi @kevinzakka, was there anything missing for merging?
|
@aatb-ch Do you think you could share the code you used to train and then run inference from the policy, please? I've been trying to do precisely the same thing, modifying the Go1 files for the Go2 (before knowing this PR actually existed), and I've arrived at a good result in the recording part (see https://drive.google.com/file/d/1aaLDFLu4hyKNuKUTYwuNz3HxN5GG7BK5/view?usp=sharing). The problem is that, for a reason I don't understand, when I run it separately using MuJoCo's passive viewer instead of just recording a video, the policy performs pretty badly. Right now this is the code I'm using to try to replicate the behavior. Could you tell me if you see any error (or difference) from what you used, please? This is how the robot is behaving right now: https://drive.google.com/file/d/1J84Kv9RD5Pw7A9nEjK52qyiqF20p-2wE/view?usp=sharing Thanks in advance!
Can I ask what values of Kp and Kd you used for the joystick policy in the sim-to-real transfer, please? For some reason my policy, when I use the same Kp and Kd
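[Editor's note] For context on the Kp/Kd question: policy joint targets are typically tracked on the Unitree low-level interface with a per-joint PD law. The sketch below shows that law; the default gains are illustrative assumptions (values in the ballpark of common Go1/Go2 sim configs), not the gains used in this PR.

```python
# Per-joint PD law used to track policy joint targets:
#   tau_i = kp * (q_des_i - q_i) - kd * dq_i
# kp/kd defaults here are assumed example values, not this PR's gains.
def pd_torque(q_des, q, dq, kp=35.0, kd=0.5):
    """Compute per-joint torques from desired positions, positions, velocities."""
    return [kp * (qd_i - q_i) - kd * dq_i for qd_i, q_i, dq_i in zip(q_des, q, dq)]
```

A sim-to-real mismatch often comes from gains that differ between the training model's actuator parameters and the values sent in LowCmd, so the two should be checked against each other.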
This PR adds Unitree Go2 support, based on the existing Go1 support. It uses the Menagerie Go2 MJX model, adjusted accordingly to add the correct sensors, collisions, etc.
TODO: adjust the full-collision MJX model; not 100% sure, but it seems some things are missing. I have to go through the Go1 mesh and compare, then test Getup/Handstand before adding those tasks.