Hi,
I’m currently exploring curriculum learning strategies with mjlab, and I was wondering whether there is an existing way to access the mean reward during training.
At the moment, I’m advancing the curriculum based solely on the number of training steps. This works to some extent, but I would prefer to drive it with a performance-based signal such as the mean reward.
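
To make the question concrete, here is roughly the kind of performance-based rule I have in mind. It is a library-agnostic sketch: `RewardCurriculum` and its parameters are hypothetical names, not part of mjlab, and it assumes the per-episode reward can somehow be fed in from the training loop, which is exactly the part I don't know how to do.

```python
from collections import deque


class RewardCurriculum:
    """Hypothetical helper (not an mjlab API): raise the difficulty level
    once the rolling mean of episode rewards reaches a threshold."""

    def __init__(self, threshold: float, window: int = 100, max_level: int = 5):
        self.threshold = threshold
        self.max_level = max_level
        self.level = 0
        # Keep only the most recent `window` episode rewards.
        self._rewards = deque(maxlen=window)

    def update(self, episode_reward: float) -> int:
        """Record one episode reward and return the current level."""
        self._rewards.append(episode_reward)
        window_full = len(self._rewards) == self._rewards.maxlen
        if window_full and self.level < self.max_level:
            mean_reward = sum(self._rewards) / len(self._rewards)
            if mean_reward >= self.threshold:
                self.level += 1
                # Start measuring again at the new difficulty.
                self._rewards.clear()
        return self.level
```

The training loop would call `update(...)` after each episode and use the returned level to pick the task difficulty, instead of looking at the step counter.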
Thanks in advance!