Performance of MiroRL-14B-SingleAgent-Preview-v0.1 #1

First, great work!

I would like to understand how good the reported result of "40.29% (Avg@8) on the GAIA-text-103 subset" is. You mentioned that this score was obtained by fine-tuning your MiroRL-14B-SingleAgent-Preview-v0.1 model with GRPO, but you didn't mention the original model's performance.

In addition, I found that MiroThinker-14B-SFT-v0.1 scored 44.4 on the same benchmark. I assume that MiroRL-14B-SingleAgent-Preview-v0.1 should achieve a score lower than 40.29. Could you explain the differences between MiroRL-14B-SingleAgent-Preview-v0.1 and MiroThinker-14B-SFT-v0.1?

Thanks
