Hi Multiverse Team,
Thank you for your remarkable work! Could you share more details on how you evaluated Multiverse-32B on AIME24, AIME25, and MATH500? For example, the prompts used, the generation configuration, and the maximum number of new tokens.