Closed
Description
The mmlu_pro score for Qwen2.5-32B-Instruct is very low (8.01).
When I changed generate_prompt() in the MMLUProTaskHandler class to build the prompt the same way as generate_prompt() in the MMLUTaskHandler class, the score increased to 57.80.
skythought/evals/tasks/mmlu/mmlu_handler.py
class MMLUProTaskHandler(MMLUTaskHandler):
    def generate_prompt(self, prompt):
        multiple_choice_string = self.get_multiple_choice_answers(prompt)  # ADDED
        prompt = prompt["question"] + "\n" + multiple_choice_string  # ADDED
        return self.task_config.templating_parameters["template"].format(prompt=prompt)
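
To illustrate what the two added lines do, here is a minimal standalone sketch, not the actual SkyThought code. The template text, the format_choices() helper (a stand-in for get_multiple_choice_answers(), whose real output format may differ), and the sample record are all hypothetical. It only shows the question and lettered options being joined before they are substituted into the template.

```python
from string import ascii_uppercase

# Hypothetical template; the real one comes from the task config's
# templating_parameters["template"].
TEMPLATE = "Answer the following multiple-choice question.\n{prompt}"


def format_choices(options):
    """Hypothetical stand-in for get_multiple_choice_answers().

    Labels each option with a letter, one option per line.
    """
    return "\n".join(
        f"{letter}) {text}" for letter, text in zip(ascii_uppercase, options)
    )


def build_prompt(record):
    """Join the question and its lettered options, then fill the template."""
    body = record["question"] + "\n" + format_choices(record["options"])
    return TEMPLATE.format(prompt=body)


# Made-up record for illustration only, not taken from the dataset.
record = {
    "question": "Which planet is closest to the Sun?",
    "options": ["Venus", "Mercury", "Mars"],
}

if __name__ == "__main__":
    print(build_prompt(record))
```

With this change the model sees the answer options together with the question, rather than the question text alone.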