[FEAT] Add MMAR/ MMAU-PRO for reasoning task support by nhhoang96 · Pull Request #16 · ServiceNow/AU-Harness

nhhoang96 · 2025-09-21T17:32:16Z

📌 Description

Adds support for deep reasoning tasks, covering the MMAR and MMAU-PRO datasets.
Both datasets reuse the existing llm_judge_binary metric for multiple-choice question (MCQ) evaluation.
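
As a rough illustration of what a binary judge metric yields on MCQ data, the sketch below averages per-question 0/1 verdicts into an accuracy. The function name `binary_judge_accuracy` and the verdict list are hypothetical; this is not the harness's llm_judge_binary implementation, which this PR description does not show.

```python
def binary_judge_accuracy(verdicts):
    """Average 0/1 judge verdicts into an accuracy; return None for an empty list."""
    if not verdicts:
        return None
    if any(v not in (0, 1) for v in verdicts):
        raise ValueError("verdicts must be 0 or 1")
    return sum(verdicts) / len(verdicts)


# Hypothetical verdicts for four multiple-choice questions (1 = judged correct).
print(binary_judge_accuracy([1, 0, 1, 1]))  # prints 0.75
```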

🔗 Related Issue(s)

🛠️ Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality including new tasks)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactor / Code cleanup
Maintenance / Chore / Task
Other (please describe):

✅ How Has This Been Tested?

Unit tests
Integration tests
Manual testing

Test Results / Screenshots (if applicable):

Sample run_config:

task_metric: 
  - ['mmar', 'llm_judge_binary']
  - ['mmau-pro', 'llm_judge_binary']
aggregate:
  - ['llm_judge_binary', ['mmau-pro']]
  - ['llm_judge_binary', ['mmar']]
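
One way to sanity-check a config like this before a run is to confirm that every task listed under aggregate is also evaluated with the same metric under task_metric. The sketch below mirrors the sample as plain Python data; the entry shapes ([task, metric] and [metric, [tasks]]) are inferred from the sample above, and `find_unmatched_aggregates` is a hypothetical helper, not part of AU-Harness.

```python
# Sketch only: the sample run_config written as plain Python data.
# Entry shapes are inferred from the sample, not from AU-Harness's own validation.
run_config = {
    "task_metric": [
        ["mmar", "llm_judge_binary"],
        ["mmau-pro", "llm_judge_binary"],
    ],
    "aggregate": [
        ["llm_judge_binary", ["mmau-pro"]],
        ["llm_judge_binary", ["mmar"]],
    ],
}


def find_unmatched_aggregates(config):
    """Return (metric, task) pairs listed under aggregate but absent from task_metric."""
    evaluated = {(task, metric) for task, metric in config["task_metric"]}
    unmatched = []
    for metric, tasks in config["aggregate"]:
        for task in tasks:
            if (task, metric) not in evaluated:
                unmatched.append((metric, task))
    return unmatched


print(find_unmatched_aggregates(run_config))  # prints []
```

An empty list means each aggregated task has a matching task_metric entry; any returned pair points at a likely typo in a task or metric name.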

Sample Experimental Log:
2025-09-20_23-40-59_544502_default.log

📸 Screenshots / Demos

N/A

📋 Checklist

Code follows project style guidelines
Tests have been added/updated (if applicable)
Documentation has been updated (if applicable)
Linked relevant issue(s)
Self-reviewed my code

🙌 Additional Notes

N/A

nhhoang96 added 5 commits September 21, 2025 16:52

Adding MMAR and MMAU-Pro preprocessing

de7a943

Revert unrelated changes to MMAR

97d4b13

Adding documentation for the added tasks

b5d2579

Updating comment documentation

4e15e11

Updating documentation clarification

25f33ba

nhhoang96 requested review from akshaykalkunte, aman-servicenow, jash-mehta-3300 and oluwanifemibamgbose September 21, 2025 17:37

nhhoang96 self-assigned this Sep 21, 2025

nhhoang96 added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels Sep 21, 2025

nhhoang96 and others added 3 commits September 29, 2025 17:51

Rectify the sub-task names and add dependency requirement package for dataset loading.

bd537de

Merge branch 'main' into feat/add_mmar

79c94a7

Adding updated support for MMAR sub-modalities

a9c06ac

nhhoang96 wants to merge 8 commits into main from feat/add_mmar
