fix: clean up ray ec2 test temp dirs to prevent /tmp fill #6180

Open
jinyan-li1 wants to merge 1 commit into main from fix/ray-ec2-tmp-cleanup

jinyan-li1 commented Jun 3, 2026

Summary

The ray autorelease pipeline fails because /tmp fills to 100% on the GitHub Actions runner. Each run leaks three NLP test temp dirs holding extracted model tarballs (1-2GB each).

Root cause: `make_container_fixture` in `test/ray/ec2/common.py` calls `tempfile.mkdtemp(prefix=f"ray-ec2-{model_name}-")` to create a directory for each extracted model tarball but never removes that directory.

Changes

  • `test/ray/ec2/common.py`: add `shutil.rmtree(model_dir, ignore_errors=True)` in fixture teardown
  • `pr-ray-ec2-{cpu,gpu}.yml`: add pre-test cleanup (clears any prior leaks) and post-test cleanup with `if: always()` (handles mid-run crashes where pytest teardown didn't run)
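
For reviewers, here is a minimal sketch of the two cleanup layers described above. It is not the code in this PR. `model_dir_for` and `sweep_leaked_dirs` are hypothetical names, the `base_dir` parameter exists only to keep the example testable, and a `contextmanager` stands in for the pytest fixture (a `yield` fixture has the same setup/teardown shape). The workflow steps themselves are presumably shell; the sweep is shown in Python only to illustrate the idea of clearing by prefix.

```python
import glob
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def model_dir_for(model_name, base_dir=None):
    """Hypothetical stand-in for the temp-dir handling in make_container_fixture."""
    # Same mkdtemp prefix as in the PR description; base_dir is an assumption
    # added for the example (None means the system temp dir).
    model_dir = tempfile.mkdtemp(prefix=f"ray-ec2-{model_name}-", dir=base_dir)
    try:
        yield model_dir  # the test body runs here
    finally:
        # Teardown: runs whether the test passes or raises.
        shutil.rmtree(model_dir, ignore_errors=True)


def sweep_leaked_dirs(base_dir=None, prefix="ray-ec2-"):
    """Delete leftover prefixed dirs from earlier runs; return the paths targeted."""
    base_dir = base_dir or tempfile.gettempdir()
    targeted = []
    for path in glob.glob(os.path.join(base_dir, prefix + "*")):
        if os.path.isdir(path):
            shutil.rmtree(path, ignore_errors=True)
            targeted.append(path)
    return targeted
```

The teardown handles the normal case. The prefix sweep covers runs where the process died before teardown could execute, which is why the workflow change runs cleanup both before the tests and afterwards under `if: always()`.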

Test plan

  • CI passes on this PR
  • Verify next ray autorelease run completes without /tmp fill

🤖 Generated with Claude Code

NLP model tarballs (1-2GB each) leaked from make_container_fixture.
Add shutil.rmtree in fixture teardown + pre/post workflow cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>