fix: make default Slurm CPU headers portable by SchrodingersCattt · Pull Request #596 · deepmodeling/dpdispatcher

SchrodingersCattt · 2026-05-12T06:37:32Z

Avoid login-shell and GPU directives for CPU Slurm jobs so generated scripts work on clusters with stricter compute-node environments.

Summary by CodeRabbit

Bug Fixes
- Removed login-shell (-l) from script shebang and stopped embedding the parsable flag in headers for more predictable submissions.
- GPU request logic now omits GPU directives for CPU jobs and respects explicit empty overrides when provided.
Documentation
- Clarified guidance about login-shell effects and that GPU directives are omitted by default for CPU jobs.
Tests
- Improved Slurm header tests for clearer, more targeted coverage.

coderabbitai · 2026-05-12T06:40:53Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9e778470-05ee-4152-971a-e6824e998ba3

📥 Commits

Reviewing files that changed from the base of the PR and between 252b175 and 593e67f.

📒 Files selected for processing (5)

doc/examples/expanse.md
dpdispatcher/machines/slurm.py
examples/resources/expanse_cpu.json
examples/resources/template.slurm
tests/test_slurm_script_generation.py

📝 Walkthrough

Walkthrough

The PR removes the login-shell invocation (-l) and embedded --parsable directive from Slurm script headers, allowing sbatch --parsable to be supplied at submission time. GPU directive generation now uses an explicit custom_gpu_line when provided, otherwise defaults to --gres=gpu:<count> for positive GPU requests or an empty directive for zero GPUs. Tests are refactored with a helper to simplify validation.
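The directive selection described in this walkthrough can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `dpdispatcher/machines/slurm.py`: the function name `select_gpu_directive` and its standalone signature are hypothetical, and the `#SBATCH` prefix on the default line is assumed. Only `custom_gpu_line`, `gpu_per_node`, and the `--gres=gpu:<count>` form come from the summary above.

```python
from typing import Optional


def select_gpu_directive(
    gpu_per_node: int, custom_gpu_line: Optional[str] = None
) -> str:
    """Return the GPU line for a Slurm header (hypothetical helper, not dpdispatcher API)."""
    # An explicit override wins whenever it is not None, even if it is the
    # empty string, so a site can suppress the directive on purpose.
    if custom_gpu_line is not None:
        return custom_gpu_line
    # Default for GPU jobs: request the configured number of GPUs per node.
    if gpu_per_node > 0:
        return f"#SBATCH --gres=gpu:{gpu_per_node}"
    # Default for CPU jobs: emit no GPU directive at all.
    return ""
```

Testing `is not None` rather than truthiness is what lets an explicit empty override be respected instead of falling through to the default.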

Changes

Slurm Header and GPU Directive Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Slurm header template and GPU directive generation: `dpdispatcher/machines/slurm.py`, `examples/resources/template.slurm`, `examples/resources/expanse_cpu.json`, `doc/examples/expanse.md`, `doc/context.md` | Shebang changed from `#!/bin/bash -l` to `#!/bin/bash` and `#SBATCH --parsable` removed from templates. `Slurm.gen_script_header` now has a `-> str` annotation and selects GPU directive: use `custom_gpu_line` only when it is non-`None`; otherwise emit `--gres=gpu:<gpu_per_node>` when GPU count > 0 or no GPU directive when count == 0. Documentation and example resource JSON updated to match the new default behavior. |
| Test infrastructure and header validation: `tests/test_slurm_script_generation.py` | Added `_make_header()` helper to generate script headers from a mutated machine JSON and `Resources` without constructing full `Task`/`Submission` objects. Tests updated to expect `#!/bin/bash` (no `-l`), no embedded `#SBATCH --parsable`, and to validate GPU `--gres` presence/absence for CPU/GPU cases. |

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels: size:L, lgtm

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: make default Slurm CPU headers portable' directly and clearly summarizes the main change: modifying Slurm CPU script headers to be portable by removing login-shell and GPU directives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

dpdispatcher/machines/slurm.py (1)
38-38: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add return type hint to gen_script_header.

The method should include a return type annotation -> str to comply with the project's type-hint requirements. As per coding guidelines, "Always add type hints - Include proper type annotations in all Python code for better maintainability" for files matching dpdispatcher/**/*.py.
📝 Proposed fix
-    def gen_script_header(self, job):
+    def gen_script_header(self, job) -> str:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpdispatcher/machines/slurm.py` at line 38, Add a return type annotation to
the method signature of gen_script_header so it declares it returns a string
(i.e., change def gen_script_header(self, job) to include -> str). Locate the
gen_script_header method in the Slurm-related class (method name:
gen_script_header) and update its signature to include the return type; no other
behavioral changes are required. Ensure the change compiles with the existing
codebase and update any stubs/tests if they assert signature compatibility.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_slurm_script_generation.py`:
- Around line 25-37: Add type annotations to the test helper _make_header:
change its signature to accept resource_updates: Optional[Dict[str, Any]] = None
and remove_resource_keys: Optional[List[str]] = None and declare the return type
as str (the header returned by Machine.gen_script_header). Also ensure the
necessary typing imports (Optional, Dict, Any, List) are present at the top of
the test file; leave the function body unchanged and keep using Machine,
Resources and SimpleNamespace as before.

---

Outside diff comments:
In `@dpdispatcher/machines/slurm.py`:
- Line 38: Add a return type annotation to the method signature of
gen_script_header so it declares it returns a string (i.e., change def
gen_script_header(self, job) to include -> str). Locate the gen_script_header
method in the Slurm-related class (method name: gen_script_header) and update
its signature to include the return type; no other behavioral changes are
required. Ensure the change compiles with the existing codebase and update any
stubs/tests if they assert signature compatibility.
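The annotations requested for the test helper can be sketched as below. This is an assumption-laden illustration, not the helper in `tests/test_slurm_script_generation.py`: the name `make_header_sketch`, the `BASE_RESOURCES` values, and the key=value renderer are hypothetical stand-ins, since the real helper builds `Machine` and `Resources` objects and returns the result of `gen_script_header`. Only the parameter names and the `Optional[...]`/`str` annotations follow the review comment.

```python
import copy
from typing import Any, Dict, List, Optional

# Stand-in base configuration; the real test loads a machine JSON instead.
BASE_RESOURCES: Dict[str, Any] = {
    "number_node": 1,
    "cpu_per_node": 4,
    "gpu_per_node": 0,
}


def make_header_sketch(
    resource_updates: Optional[Dict[str, Any]] = None,
    remove_resource_keys: Optional[List[str]] = None,
) -> str:
    """Mutate a copy of the base resources and render it as text (hypothetical)."""
    resources = copy.deepcopy(BASE_RESOURCES)
    # Apply overrides first, then drop any keys the test wants absent.
    resources.update(resource_updates or {})
    for key in remove_resource_keys or []:
        resources.pop(key, None)
    # Stand-in renderer: one sorted key=value line per remaining resource.
    return "\n".join(f"{key}={value}" for key, value in sorted(resources.items()))
```

Using `None` defaults instead of mutable `{}` or `[]` defaults keeps calls independent of one another.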

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7df7e554-64d2-465d-9941-587b3a7b0c45

📥 Commits

Reviewing files that changed from the base of the PR and between b23161c and d2e9d18.

📒 Files selected for processing (2)

dpdispatcher/machines/slurm.py
tests/test_slurm_script_generation.py

codecov · 2026-05-12T08:21:58Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.40%. Comparing base (b23161c) to head (593e67f).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   48.33%   48.40%   +0.07%     
==========================================
  Files          40       40              
  Lines        3958     3960       +2     
==========================================
+ Hits         1913     1917       +4     
+ Misses       2045     2043       -2
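The percentages in the diff follow from hits divided by lines, and hits plus misses should equal the line count in each column. A minimal sketch of that arithmetic, using only the numbers from the table above (the function name `coverage_percent` is illustrative and not part of Codecov):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


# Base (master) and head (#596) figures copied from the coverage diff.
base_hits, base_misses, base_lines = 1913, 2045, 3958
head_hits, head_misses, head_lines = 1917, 2043, 3960

# Sanity check: every line is either a hit or a miss.
base_consistent = base_hits + base_misses == base_lines
head_consistent = head_hits + head_misses == head_lines

base = coverage_percent(base_hits, base_lines)
head = coverage_percent(head_hits, head_lines)
delta = head - base

print(f"base={base:.4f}% head={head:.4f}% delta={delta:+.4f}%")
```

The computed values agree with the reported 48.33%, 48.40% and +0.07% to within the last displayed digit.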

Copilot

Pull request overview

This PR improves the portability of generated Slurm job scripts by removing login-shell usage in the default Slurm header and ensuring GPU directives are only emitted when explicitly requested or configured, which helps CPU-only jobs run on stricter clusters.

Changes:

- Removed `-l` from the Slurm shebang and dropped `#SBATCH --parsable` from the default header template.
- Updated Slurm GPU header generation to omit GPU requests when `gpu_per_node == 0`, unless an explicit `custom_gpu_line` is provided.
- Refactored Slurm script-header tests to focus on `gen_script_header` directly and added CPU/GPU-specific assertions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `dpdispatcher/machines/slurm.py` | Makes the default Slurm header more portable and fixes GPU directive emission logic for CPU jobs. |
| `tests/test_slurm_script_generation.py` | Simplifies header tests and adds coverage for CPU headers omitting GPU directives and for default GPU `--gres` behavior. |

njzjz-bot

Thanks for the portability fix. The code direction and the added Slurm header tests look reasonable, and CI is green, but I think this PR still needs documentation/example updates before merging.

Blocking documentation gaps:

- `doc/context.md` still says DPDispatcher submission scripts use `bash -l` and therefore execute login-shell startup files. This PR changes the default Slurm header to `#!/bin/bash`, so that statement is no longer universally true for Slurm users and directly contradicts the new behavior.
- `doc/examples/expanse.md` still says Expanse needs `custom_gpu_line` because the default would emit `--gres=gpu:0`. With this PR, CPU Slurm jobs now omit the GPU directive by default, so the example text should be updated.
- `examples/resources/expanse_cpu.json` still sets `kwargs.custom_gpu_line` to `#SBATCH --gpus=0`. If the purpose of this PR is to make CPU Slurm headers portable by default, this example should demonstrate the new default behavior by removing that override, unless Expanse specifically requires `--gpus=0`, in which case the docs should explain that distinction.
- `examples/resources/template.slurm` still includes `#!/bin/bash -l` and `#SBATCH --parsable`. Since this is the documented custom-header template example, please either update it to match the new recommended default or explicitly state that custom templates are user-controlled and may still opt into login shells / embedded `--parsable`.
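The `custom_gpu_line` override mentioned for the Expanse example can be pictured with a small resources dictionary. This is a sketch under assumptions: it does not reproduce the contents of `examples/resources/expanse_cpu.json`, and the helper `drop_gpu_override` is hypothetical. Only the `gpu_per_node` field, the `kwargs.custom_gpu_line` key, and the `#SBATCH --gpus=0` value come from the review above.

```python
import copy
from typing import Any, Dict

# Illustrative CPU resources carrying the site-specific override.
cpu_resources_with_override: Dict[str, Any] = {
    "gpu_per_node": 0,
    "kwargs": {"custom_gpu_line": "#SBATCH --gpus=0"},
}


def drop_gpu_override(resources: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy without kwargs.custom_gpu_line (hypothetical helper)."""
    updated = copy.deepcopy(resources)
    # Without the override, a CPU job relies on the default behavior,
    # which omits the GPU directive when gpu_per_node is 0.
    updated.get("kwargs", {}).pop("custom_gpu_line", None)
    return updated


cpu_resources_default = drop_gpu_override(cpu_resources_with_override)
```

Keeping the override remains a valid choice where a site requires an explicit directive such as `--gpus=0`.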

The tests cover the Python behavior well enough for this small change (tests/test_slurm_script_generation.py passes locally; selected CLI/submission tests also pass in an editable venv), but the user-facing docs are now stale.

— OpenClaw 2026.4.22 (model: gpt-5.5)

njzjz · 2026-05-12T12:40:25Z

This PR introduces an inconsistency among different types of Machines. Shell, Slurm, PBS, LSF, and Bohrium all used bash -l, but this PR only changed Slurm, leaving others unchanged.

Avoid invalid Slurm directives for CPU jobs while preserving the login-shell shebang used by other machine backends.

SchrodingersCattt · 2026-05-12T12:55:26Z

> This PR introduces an inconsistency among different types of Machines. Shell, Slurm, PBS, LSF, and Bohrium all used bash -l, but this PR only changed Slurm, leaving others unchanged.

I've restored the line involving bash -l and perhaps I'll open a new PR to handle the issue of -l.

njzjz-bot

Thanks for the updates. The documentation/examples now match the final scope of this PR:

- `#SBATCH --parsable` is removed from the default Slurm header and custom template example, while `sbatch --parsable` is still used at submission time.
- CPU Slurm jobs now omit the default GPU directive when `gpu_per_node == 0`.
- The Expanse example no longer relies on `custom_gpu_line` for the default CPU case, and documents when an explicit site-specific directive such as `--gpus=0` is still appropriate.
- Keeping `#!/bin/bash -l` in this PR is acceptable because it avoids changing login-shell behavior for only Slurm while the other Machine backends still use `bash -l`; handling login-shell consistency across all backends should be a separate PR.
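Passing `--parsable` on the `sbatch` command line makes Slurm print only the job ID, optionally followed by a semicolon and the cluster name. A minimal sketch of reading that output is shown below. The function name `parse_parsable_job_id` is hypothetical, and this thread does not show how DPDispatcher itself parses the output.

```python
def parse_parsable_job_id(stdout: str) -> str:
    """Extract the job ID from `sbatch --parsable` output (illustrative helper)."""
    # Output is "<job_id>" or "<job_id>;<cluster_name>", usually with a
    # trailing newline, so strip whitespace and keep the first field.
    return stdout.strip().split(";", 1)[0]


# Sample strings standing in for captured sbatch output.
single_cluster_id = parse_parsable_job_id("12345\n")
multi_cluster_id = parse_parsable_job_id("12345;expanse\n")
```

Supplying the flag at submission time keeps the generated script header free of a directive that only affects what `sbatch` prints.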

Local validation:

- `git diff --check origin/master...HEAD`
- `pytest tests/test_slurm_script_generation.py tests/test_argcheck.py tests/test_submit.py tests/test_run_submission.py -q` → 20 passed, 14 skipped
- `ruff check dpdispatcher/machines/slurm.py tests/test_slurm_script_generation.py`

— OpenClaw 2026.4.22 (model: gpt-5.5)

dosubot Bot added the size:XS and bug labels May 12, 2026

coderabbitai Bot reviewed May 12, 2026

Comment thread tests/test_slurm_script_generation.py Outdated

SchrodingersCattt force-pushed the fix/slurm-cpu-header branch from d2e9d18 to ddc4482 Compare May 12, 2026 06:46

dosubot Bot added the size:S label and removed the size:XS label May 12, 2026

SchrodingersCattt changed the title ~~fix: make slurm CPU headers portable~~ fix: make default Slurm CPU headers portable May 12, 2026

njzjz requested a review from Copilot May 12, 2026 08:39

Copilot started reviewing on behalf of njzjz May 12, 2026 08:40

Copilot AI reviewed May 12, 2026

njzjz-bot suggested changes May 12, 2026

SchrodingersCattt force-pushed the fix/slurm-cpu-header branch from 252b175 to 593e67f Compare May 12, 2026 12:48

fix: make default Slurm CPU directives portable

593e67f

Avoid invalid Slurm directives for CPU jobs while preserving the login-shell shebang used by other machine backends.

njzjz approved these changes May 12, 2026

dosubot Bot added the lgtm label May 12, 2026

njzjz-bot approved these changes May 12, 2026

njzjz merged commit 581a0bd into deepmodeling:master May 13, 2026
29 checks passed
