feat(omnidocbench): add normalized Levenshtein distance metric by MaxwellJryao · Pull Request #1246 · EvolvingLMMs-Lab/lmms-eval

MaxwellJryao · 2026-03-10T02:34:38Z

Add omnidocbench_nld_score metric computed as (1 - NLD) * 100, following the Kimi K2.5 technical report scoring method. The existing exact_match metric is preserved alongside the new one.

Summary

Add omnidocbench_nld_score metric: (1 - normalized_levenshtein_distance) * 100, using the Levenshtein library
When multiple reference answers exist, take the best (max) score across all answers
Register the new metric in omnidocbench.yaml with aggregation: mean
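
The scoring rule in the summary can be sketched as follows. This is a minimal illustration, not the PR's code: the PR uses the Levenshtein library, whereas this sketch computes the edit distance in pure Python so it runs without extra packages. The function names (levenshtein_distance, nld_score, best_nld_score) are hypothetical, and normalizing by the longer string's length is an assumption, since the description does not state the normalizer.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via a two-row DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def nld_score(prediction: str, reference: str) -> float:
    """(1 - NLD) * 100. Assumption: NLD = distance / max(len(pred), len(ref))."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 100.0  # two empty strings are treated as a perfect match
    nld = levenshtein_distance(prediction, reference) / longest
    return (1.0 - nld) * 100.0


def best_nld_score(prediction: str, references: list[str]) -> float:
    """Take the best (max) score across all reference answers.

    Raises ValueError if references is empty.
    """
    return max(nld_score(prediction, ref) for ref in references)
```

For example, best_nld_score("abcd", ["abcf", "zzzz"]) scores the prediction against each reference and keeps the higher value, which comes from "abcf" (one substitution out of four characters).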

In scope

lmms_eval/tasks/omnidocbench/utils.py: add _normalized_levenshtein_score() helper, update omnidocbench_process_results to return both omnidocbench_exact_match and omnidocbench_nld_score
lmms_eval/tasks/omnidocbench/omnidocbench.yaml: register omnidocbench_nld_score in metric_list
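
To illustrate how the two per-sample metrics relate to the mean aggregation mentioned above, here is a small sketch. It is a stand-in for the framework's aggregation step, not lmms-eval code: aggregate_mean is a hypothetical helper, and the per-sample values (including the scale used for exact match) are made up.

```python
from statistics import mean

# Metric keys named in the PR description.
METRICS = ("omnidocbench_exact_match", "omnidocbench_nld_score")


def aggregate_mean(per_sample_results: list[dict[str, float]]) -> dict[str, float]:
    """Average each metric independently over all samples."""
    return {
        name: mean(result[name] for result in per_sample_results)
        for name in METRICS
    }


# Made-up per-sample outputs, shaped like a dict with both metric keys.
samples = [
    {"omnidocbench_exact_match": 1.0, "omnidocbench_nld_score": 100.0},
    {"omnidocbench_exact_match": 0.0, "omnidocbench_nld_score": 75.0},
]
report = aggregate_mean(samples)
```

Because each key is averaged on its own, adding omnidocbench_nld_score leaves the reported omnidocbench_exact_match value unchanged.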

Out of scope

No changes to other tasks (charxiv, ocrbench, ocrbench_v2, etc.)
No changes to the existing omnidocbench_exact_match scoring logic
No new dependencies added (Levenshtein is already declared in pyproject.toml under [project.optional-dependencies].metrics)

Validation

python -m lmms_eval --model vllm --model_args model=Qwen/Qwen3-VL-8B-Instruct,tensor_parallel_size=2,data_parallel_size=4 --tasks omnidocbench --limit 4 --batch_size 4 | sample size: N=4 | key metrics: omnidocbench_exact_match, omnidocbench_nld_score both reported | result: pass

Risk / Compatibility

Non-breaking: existing omnidocbench_exact_match metric is unchanged; the new metric is purely additive
Results from prior runs remain valid and comparable

Type of Change


feat(omnidocbench): add normalized Levenshtein distance metric

b0130f7

Add omnidocbench_nld_score metric computed as (1 - NLD) * 100, following the Kimi K2.5 technical report scoring method. The existing exact_match metric is preserved alongside the new one.

MaxwellJryao force-pushed the feat/omnidocbench-nld-metric branch from f2b55c3 to b0130f7 Compare March 10, 2026 02:41

Luodian approved these changes Mar 10, 2026


Luodian merged commit 4650095 into EvolvingLMMs-Lab:main Mar 10, 2026
2 checks passed
