fix: cross-class filelock deadlock in datasets loading by Luodian · Pull Request #1247 · EvolvingLMMs-Lab/lmms-eval

Luodian · 2026-03-10T08:15:29Z

Summary

- Patch FileLockMeta.__call__ with a cross-class singleton cache so that filelock.FileLock and datasets.utils._filelock.FileLock return the same instance for the same lock path.
- Fix RuntimeError: Deadlock during datasets.load_dataset() in distributed eval. In filelock 3.25.0 the global _registry detects deadlocks across all classes, but is_singleton caches instances per class, so two different classes targeting the same .lock file trigger a false deadlock.
- Add a regression test covering both the same-class and cross-class singleton scenarios.
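
The cross-class caching idea can be illustrated with a toy metaclass. This is a minimal sketch, not the patch itself: it does not import or modify filelock, and CrossClassSingletonMeta, ToyLock, and ToyDatasetsLock are hypothetical stand-ins for FileLockMeta and the two FileLock classes. The point it shows is that the cache is keyed by the lock path alone rather than by (class, path), so a second class asking for the same path gets the existing object back instead of a new one.

```python
import os
import threading
import weakref


class CrossClassSingletonMeta(type):
    """Toy metaclass: one instance per lock path, shared by every class that uses it."""

    # Keyed by normalized path only, NOT by (class, path). Weak values let
    # unused lock objects be garbage collected.
    _instances = weakref.WeakValueDictionary()
    _guard = threading.Lock()

    def __call__(cls, lock_file, *args, **kwargs):
        key = os.path.abspath(os.fspath(lock_file))
        with CrossClassSingletonMeta._guard:
            instance = CrossClassSingletonMeta._instances.get(key)
            if instance is None:
                # First request for this path: construct normally and remember it.
                instance = super().__call__(lock_file, *args, **kwargs)
                CrossClassSingletonMeta._instances[key] = instance
            return instance


class ToyLock(metaclass=CrossClassSingletonMeta):
    """Hypothetical stand-in for the base lock class."""

    def __init__(self, lock_file):
        self.lock_file = os.fspath(lock_file)


class ToyDatasetsLock(ToyLock):
    """Hypothetical stand-in for a lock class defined in another package."""


shared = ToyLock("demo.lock")
same = ToyDatasetsLock("demo.lock")
print(shared is same)  # True: both class names resolve to one object
```

One consequence of keying by path alone is that the returned object has the type of whichever class requested the path first, so this approach only suits classes whose instances are interchangeable for locking purposes.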

Test plan

- Cross-class singleton verified inside container (filelock 3.25.0 + datasets 4.x)
- 31-task loading passes in single-process container test
- Job 2451 running on 4-node dp32 FluidStack: passed the task loading phase (previously deadlocked at this stage)

Merge dummy_video_reader into a single dummy model that serves both use cases:
- Default mode: instant no-op responses for dataset hydration and task smoke tests
- Video-bench mode (read_bytes/decode_num_frames > 0): full IO/decode latency tracking

The old name dummy_video_reader is kept as a MODEL_ALIASES alias for backward compatibility.

… inputs

SGLang's Engine runs its own Qwen3-VL processor internally. When lmms-eval pre-tokenized inputs with the HF processor and passed the expanded input_ids to SGLang, pad tokens were expanded twice, causing IndexError on image inputs and potential failures on video inputs.

- Image path: pass prompt text directly to Engine.generate() instead of pre-tokenized input_ids, letting SGLang handle tokenization end-to-end
- Video path: pass prompt text + video_data to Engine.generate() using SGLang's native video support instead of pre-tokenizing and swapping video tokens to image tokens
- Fix tools check: use truthy check instead of 'is not None' so empty list from disabled MCP does not trigger tool-handling code paths
- Fix tools param: pass tools=None instead of tools=[] to apply_chat_template to avoid unexpected preprocessing
- Lazy-import MCP deps: avoid ImportError at module load when mcp package is not installed
- Broaden optional metric imports: catch Exception instead of ImportError so numpy/spacy binary incompatibilities do not crash metric aggregation for unrelated tasks
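
The tools and optional-import items above follow two small, general patterns, sketched here under stated assumptions. The helper names normalize_tools, should_handle_tools, and optional_import are hypothetical and do not come from lmms-eval; the sketch only shows why a truthy check treats an empty list differently from 'is not None', and how catching Exception around an import tolerates failures other than ImportError.

```python
import importlib


def should_handle_tools(tools):
    """Truthy check: None and [] both mean 'no tools', unlike 'tools is not None'."""
    return bool(tools)


def normalize_tools(tools):
    """Return None for a missing or empty tool list, otherwise the list unchanged."""
    return tools if tools else None


def optional_import(module_name):
    """Import lazily at call time; return None if the import fails for any reason.

    Catching Exception (not just ImportError) also covers failures raised while
    the module initializes, such as binary incompatibility errors.
    """
    try:
        return importlib.import_module(module_name)
    except Exception:
        return None


# An empty list passes an 'is not None' test but fails the truthy test.
empty = []
print(empty is not None, should_handle_tools(empty))
```

Catching Exception this broadly can hide real bugs inside the optional module, so a caller would usually want to log the failure before falling back.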

Luodian · 2026-03-15T09:05:57Z

Closing in favor of #1253, which is a strict superset of this PR (same 7 commits + additional SGLang refactor, distributed eval fixes, and cache simplification). All changes from this PR are included there.

Luodian and others added 7 commits March 9, 2026 01:27

fix: land layered cache support on main worktree

d38a731

fix: stabilize dataset loading and mmmu pro prompts

5359b3a

fix: add eval batch watchdog heartbeats

a056a4a

feat: promote sealed cache segments during eval

8ac3b4b

style: auto-fix lint (black + isort)

3052409

Luodian mentioned this pull request Mar 15, 2026

feat: SGLang refactor, distributed eval fixes, and cache simplification #1253

Merged

4 tasks

Luodian closed this Mar 15, 2026

Luodian deleted the brianli/dev branch March 15, 2026 10:18

Luodian wants to merge 7 commits into main from brianli/dev
