@tristan-f-r
Collaborator

@tristan-f-r tristan-f-r commented Jul 9, 2025

This change means that output files will not be reused whenever SPRAS is updated, furthering the immutability goal necessary to get OSDF integration working for SPRAS benchmarking. (Here, 'updated' means that either the git commit hash or the SPRAS release version has changed.)

This adds the unique spras_revision to every single parameter combination (before hashing) and to the dataset label, to provide OSDF support on the level of deterministic, non-seeded algorithms when datasets are immutable.
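As a rough illustration of the idea (not the actual SPRAS implementation), the sketch below folds a revision string into a parameter combination before hashing. The function name `hash_params`, the `_spras_rev` key, the use of SHA-256, and the 7-character truncation are all assumptions made for this example.

```python
import hashlib
import json


def hash_params(params: dict, spras_revision: str, length: int = 7) -> str:
    """Hash one parameter combination together with the SPRAS revision."""
    # Hypothetical key name: the revision travels with the real parameters,
    # so any change to the revision changes the resulting hash.
    payload = {**params, "_spras_rev": spras_revision}
    # sort_keys keeps the serialization independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:length]


params = {"k": 10, "alpha": 0.5}
hash_rev_a = hash_params(params, "rev-a")  # "rev-a"/"rev-b" are placeholder revisions
hash_rev_b = hash_params(params, "rev-b")
print(hash_rev_a, hash_rev_b)
```

With the revision included, the same parameters produce different hashes under different revisions, so output from an older SPRAS revision is never mistaken for current output.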

This has the added benefit of allowing SPRAS users to simply upgrade their SPRAS version without needing to clear output, which complements #380. The refactored test also partially covers #165 and #45. (The test refactor is also where the majority of the code comes from: the actual feature patch here is a 50-line change.)

See #321 implemented by #335 for handling nondeterministic algorithms / seeded algorithms.


To make this change, a significant test refactor in test/analysis was needed to remove hardcoded paths (which contained the hashes that this PR now changes on every commit). It turns out that whenever we make any change to the hash, this test breaks (the patch here fixes this)! That's why so many other PRs depend on this one.
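To illustrate why hardcoded paths break, here is a minimal sketch of a test helper that derives the expected output directory from the hash instead of embedding it. The helper names, the `_spras_rev` key, and the `{dataset}-{algorithm}-params-{hash}` directory layout are assumptions for this example, not code from the PR.

```python
import hashlib
import json
from pathlib import Path


def combo_hash(params: dict, revision: str, length: int = 7) -> str:
    """Hypothetical stand-in for the parameter hash used in directory names."""
    payload = {**params, "_spras_rev": revision}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:length]


def expected_output_dir(root, dataset: str, algorithm: str,
                        params: dict, revision: str) -> Path:
    """Build the expected directory at test time rather than hardcoding it."""
    name = f"{dataset}-{algorithm}-params-{combo_hash(params, revision)}"
    return Path(root) / name


# Placeholder dataset, algorithm, and revision values for illustration only.
out_a = expected_output_dir("output", "data0", "pathlinker", {"k": 10}, "rev-a")
out_b = expected_output_dir("output", "data0", "pathlinker", {"k": 10}, "rev-b")
print(out_a)
print(out_b)
```

A test that computes its paths this way keeps passing when the revision changes, whereas a path containing a literal hash would need to be updated on every commit.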

This adds the unique spras_revision to every single parameter combination (before hashing) and to the dataset label, to provide OSDF support on the level of deterministic algorithms.
@tristan-f-r tristan-f-r marked this pull request as ready for review July 9, 2025 20:51
@tristan-f-r tristan-f-r added enhancement New feature or request needed for benchmarking Priority PRs needed for the benchmarking paper labels Jul 9, 2025
@tristan-f-r tristan-f-r changed the title feat: spras_revision feat: SPRAS revision Jul 9, 2025
@tristan-f-r

This comment was marked as outdated.

@tristan-f-r tristan-f-r marked this pull request as draft July 9, 2025 21:37
@tristan-f-r tristan-f-r marked this pull request as ready for review July 10, 2025 19:34
@tristan-f-r tristan-f-r changed the title feat: SPRAS revision feat!: SPRAS revision Jul 10, 2025
@tristan-f-r

This comment was marked as outdated.

@tristan-f-r tristan-f-r added the P-high This is a blocker for many PRs/issues/features label Jul 24, 2025
@read-the-docs-community

read-the-docs-community bot commented Aug 4, 2025

Documentation build overview

📚 spras | 🛠️ Build #30075056 | 📁 Comparing a169505 against latest (c3b02cd)


🔍 Preview build

Show files changed (4 files in total): 📝 4 modified | ➕ 0 added | ➖ 0 deleted
| File | Status |
| --- | --- |
| genindex.html | 📝 modified |
| fordevs/spras.analysis.html | 📝 modified |
| fordevs/spras.config.html | 📝 modified |
| fordevs/spras.html | 📝 modified |

@tristan-f-r tristan-f-r mentioned this pull request Sep 6, 2025
3 tasks
@tristan-f-r tristan-f-r added the awaiting-author Author of the PR needs to fix something from a review / etc. label Sep 25, 2025
@tristan-f-r tristan-f-r removed the awaiting-author Author of the PR needs to fix something from a review / etc. label Sep 25, 2025
@tristan-f-r tristan-f-r mentioned this pull request Oct 7, 2025
1 task
@tristan-f-r tristan-f-r added the tuning Workflow-spanning algorithm tuning label Oct 8, 2025
@tristan-f-r tristan-f-r requested a review from ntalluri October 8, 2025 05:44
@github-actions github-actions bot added the merge-conflict This PR has merge conflicts. label Oct 24, 2025
@github-actions github-actions bot removed the merge-conflict This PR has merge conflicts. label Oct 24, 2025
@tristan-f-r tristan-f-r added P-medium medium priority; this is needed for some external service or another PR and removed tuning Workflow-spanning algorithm tuning P-high This is a blocker for many PRs/issues/features labels Nov 1, 2025
Collaborator

@agitter agitter left a comment

So far I have only gone through this PR, not the linked PRs in the initial comment. Can you please help keep the scope self-contained by summarizing any changes from other PRs that are relevant context to keep in mind while reviewing this one?

I could use more guidance before going through the specific files to see if the implementation matches the design. Otherwise I am trying to build the design in my head from the implementation.

Why do we have a new config file and why did the existing config files change so much? What is an example of the new and old output directory structure we should expect? What is the purpose of the ignored run directory? Why do we delete so many test-related files?

@tristan-f-r
Collaborator Author

The move from config.yaml to example.yaml was nearly arbitrary (it's been renamed since it does actually differ from the top-level config/config.yaml).

The original purpose of test/analysis/input was to archive the output structure of SPRAS. This structure has not fundamentally changed, but we can no longer easily store it inside the repository: with this PR, the hashes change whenever the SPRAS commit revision changes (as explained in the original issue description). Since we did not test against this entire structure, but only against the Cytoscape and summary files, we could safely remove it.

Now, these tests do Snakemake runs instead (also as a side-goal to encourage integration testing #165).
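For context, a test that drives a Snakemake run could look roughly like the sketch below. The helper names and the config path are hypothetical and not taken from this PR; `--cores` and `--configfile` are standard Snakemake command-line options.

```python
import subprocess


def snakemake_command(configfile: str, cores: int = 1) -> list[str]:
    """Build the Snakemake invocation for one workflow run."""
    return ["snakemake", "--cores", str(cores), "--configfile", str(configfile)]


def run_workflow(configfile: str, cores: int = 1) -> None:
    """Run the workflow; raises CalledProcessError if Snakemake fails."""
    subprocess.run(snakemake_command(configfile, cores), check=True)


# Hypothetical config path, shown only to illustrate the resulting command.
print(snakemake_command("example.yaml"))
```

A test would call `run_workflow` once and then assert on the files it produced, instead of comparing against a directory tree checked into the repository.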


Labels

enhancement New feature or request needed for benchmarking Priority PRs needed for the benchmarking paper P-medium medium priority; this is needed for some external service or another PR

2 participants