EvolvingLMMs-Lab / lmms-eval Public

Fork 547
Star 3.9k

Pull requests: EvolvingLMMs-Lab/lmms-eval

10 Open 753 Closed

Pull requests list

fix: handle internvl_hf video-only inputs and enable frame sampling

#1279 opened Mar 28, 2026 by akawincent

fix: preserve HME100k prediction case in OCRBench scoring

#1278 opened Mar 27, 2026 by akawincent

Fix the incompatibility issue caused by top_p=0 when using vllm to inference (#1265)

#1277 opened Mar 27, 2026 by akawincent

feat: add MMBench static evaluation mode (no OpenAI API needed)

#1276 opened Mar 26, 2026 by Luodian

3 tasks

feat: add process_results_use_image and video metadata dict support in task API

#1275 opened Mar 26, 2026 by Luodian

3 tasks

fix: improve evaluation logic across 10+ existing benchmarks

#1274 opened Mar 26, 2026 by Luodian

3 tasks

feat: add COVER and WM-aBench video understanding benchmarks

#1273 opened Mar 26, 2026 by Luodian

4 tasks

feat: add physics reasoning benchmarks (PhysBench, ContPhy, PhysGame, PhysicsRW, PhysReason)

#1272 opened Mar 26, 2026 by Luodian

4 tasks

feat: add VBench video generation evaluation benchmark

#1271 opened Mar 26, 2026 by Luodian

3 tasks

feat: add MiniMax as LLM judge provider

#1263 opened Mar 22, 2026 by octo-patch

3 tasks done
