Add MemoryDataset entries to free_outputs #3475
Signed-off-by: Sajid Alam <[email protected]>
Is this what we intended? I expected the change to be minimal. It should return datasets that satisfy the following conditions:
The assumption in 2. is faulty because a user can define
Cc @merelcht
In #1900, it is written that
This can cause huge memory consumption. It will most likely fail too, because the Runner releases intermediate MemoryDataset entries, so we cannot return every MemoryDataSet in the catalog.
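To illustrate the point about released intermediates, here is a toy model of a runner that drops a dataset once its last consumer has loaded it. It is only a sketch under stated assumptions: the node list, the dataset names, and the `run` function are all hypothetical and do not reproduce Kedro's actual Runner code.

```python
from collections import Counter

# Hypothetical three-node pipeline: each node is (name, inputs, outputs).
nodes = [
    ("clean", ["raw"], ["cleaned"]),
    ("train", ["cleaned"], ["model"]),
    ("score", ["model", "cleaned"], ["metrics"]),
]


def run(nodes, initial):
    """Toy runner: keeps data in a dict and drops a dataset after its last consumer."""
    store = dict(initial)
    # Number of nodes that still need to load each dataset.
    pending_loads = Counter(name for _, inputs, _ in nodes for name in inputs)
    for _, inputs, outputs in nodes:
        for name in outputs:
            store[name] = f"<data for {name}>"
        for name in inputs:
            pending_loads[name] -= 1
            if pending_loads[name] == 0:
                # Released: this dataset can no longer be returned to the caller.
                store.pop(name, None)
    return store


remaining = run(nodes, {"raw": "<raw data>"})
print(sorted(remaining))  # ['metrics']
```

In this toy run only `metrics` survives, because `cleaned` and `model` are consumed downstream and then released. That is why returning every in-memory dataset in the catalog would not work in general.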
Signed-off-by: Sajid Alam <[email protected]>
I agree with @noklam: the initial approach included intermediate
Based on that, I've updated the implementation to do the following:
@noklam comment from DMs:
Signed-off-by: Sajid Alam <[email protected]>
approved with a minor comment. Nice work!
- # in the catalog.
- free_outputs = pipeline.outputs() - set(registered_ds)
+ # in the catalog and include MemoryDataset.
+ free_outputs = pipeline.outputs() - (set(registered_ds) - memory_datasets)
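The effect of the changed expression can be sketched with plain Python sets. Everything below is an assumption made for illustration: the toy dataset classes, the dict-based catalog, and the dataset names are hypothetical stand-ins, not Kedro's real objects. Only the two set expressions mirror the snippet above.

```python
class ToyMemoryDataset:
    """Hypothetical stand-in for an in-memory dataset (not Kedro's class)."""


class ToyFileDataset:
    """Hypothetical stand-in for a dataset persisted to disk."""


# Hypothetical catalog: dataset name -> dataset object.
catalog = {
    "raw_data": ToyFileDataset(),
    "model": ToyFileDataset(),
    "metrics": ToyMemoryDataset(),
}

# Hypothetical terminal outputs of the pipeline; "report" is not in the catalog.
pipeline_outputs = {"model", "metrics", "report"}

registered_ds = set(catalog)
memory_datasets = {
    name for name, ds in catalog.items() if isinstance(ds, ToyMemoryDataset)
}

# Old behaviour: only outputs that are absent from the catalog.
old_free_outputs = pipeline_outputs - registered_ds
# New behaviour: registered in-memory outputs are no longer excluded.
new_free_outputs = pipeline_outputs - (registered_ds - memory_datasets)

print(sorted(old_free_outputs))  # ['report']
print(sorted(new_free_outputs))  # ['metrics', 'report']
```

Note that `model` is excluded in both cases because it is registered and persisted, and that the result is still bounded by the pipeline's outputs, so in-memory datasets that are not pipeline outputs are never returned.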
nit: The definition of free_outputs has always been confusing to me. I think what is returned here is "in_memory_dataset", as we are trying to return the dataset as long as there are no I/O penalties.
Feel free to come up with other names.
Totally agree with this approach. Great work @SajidAlamQB 👍
…aSet Signed-off-by: Sajid Alam <[email protected]>
Description
Context: #1900
The free_outputs output from session isn't very clear, so we'll change it to return all free outputs and, additionally, any MemoryDataSets that are defined in the catalog.
Development notes
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.
If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md file