Skip to content

Document various trait invariants / requirements #516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 6, 2025

Conversation

bkirwi
Copy link
Contributor

@bkirwi bkirwi commented Mar 17, 2023

Some of these I have picked up in conversation; others come from documenting what the code seems to assume. Happy to be corrected if I got something wrong!

@@ -28,14 +31,21 @@ pub trait PathSummary<T> : Clone+'static+Eq+PartialOrder+Debug+Default {
/// in computation, uses this method and will drop messages with timestamps that when advanced
/// result in `None`. Ideally, all other timestamp manipulation should behave similarly.
///
/// This summary's partial order is expected to be compatible with the partial order of [T],
/// in the sense that if `s1.less_equal(s2)`, then `s1.results_in(&t)` is less than or equal to
/// `s2.results_in(&t)`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems clear that we do assume roughly that the smallest summary advances the timestamp the least, but I'm not confident I have this description exactly right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct. Though I'd put it on PartialOrder instead, as I don't think it is specific to Timestamp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this comment is deeper than I thought. It is not a re-statement of the PartialOrder requirement, but that the application of the path summary respects the partial order. I'm sure there is a name for this, but submodularity is not right but the closest I could think of (where nudging one input in one direction (the summary, "greater") results in a nudge to the output in the same direction. Perhaps "monotonic wrt to"? Anyhow, I retract this proposed edit!

/// `s2.results_in(&t)`.
///
/// Note that `Self::default()` is expected to behave as an "empty" or "noop" summary, such that
/// `Self::default().results_in(&t) == Some(t)`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cycle checking seems to assume that non-default summaries always advance the timestamp. Is that a requirement?

Copy link
Member

@frankmcsherry frankmcsherry May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, ah .. I think they do not make this assumption. I believe they did at one point, but now have a different assumption. Worth talking through. Roughly, the Pointstamp summary has the potential to move timestamps backwards, for a good-ish reason, and the cycle detection isn't bright enough to understand what that means. I think the is_acyclic method is doc'd correctly ("Tests whether the graph a cycle of default path summaries"), but the certainty the test provides is not as strong as you might imagine.

NB: This is in response to your comment, about assumptions on non-default summaries, not the edit itself which is fine..

At time of writing, I'm not sure about the exact relationship here, but
there definitely ought to be one!
Copy link
Member

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems totally positive, though it did make me think about some ways it could probably be even better (things still unclear, or things that could have better homes). I'm happy to merge as is and tidy, or if you are keen to let you take a swing. Either works! Thanks very much!

/// `s2.results_in(&t)`.
///
/// Note that `Self::default()` is expected to behave as an "empty" or "noop" summary, such that
/// `Self::default().results_in(&t) == Some(t)`.
Copy link
Member

@frankmcsherry frankmcsherry May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, ah .. I think they do not make this assumption. I believe they did at one point, but now have a different assumption. Worth talking through. Roughly, the Pointstamp summary has the potential to move timestamps backwards, for a good-ish reason, and the cycle detection isn't bright enough to understand what that means. I think the is_acyclic method is doc'd correctly ("Tests whether the graph a cycle of default path summaries"), but the certainty the test provides is not as strong as you might imagine.

NB: This is in response to your comment, about assumptions on non-default summaries, not the edit itself which is fine..

Comment on lines +12 to +14
///
/// By implementing this trait, you promise that the type's [PartialOrder] implementation
/// is compatible with [Ord], such that if `a.less_equal(b)` then `a <= b`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but would it be better to say that by implementing the PartialOrder trait you have this requirement? It's a bit awkward because PartialOrder does not require Eq, or Ord, but various uses of it do assume the implied relationship (though not with PartialOrd). I'm thinking a beefed up PartialOrder doc makes more sense than this comment here (and you have that as part of the PR, but we might want to add even more to it, e.g. wrt Ord).

pub trait Timestamp: Clone+Eq+PartialOrder+Debug+Send+Any+Data+Hash+Ord {
/// A type summarizing action on a timestamp along a dataflow path.
type Summary : PathSummary<Self> + 'static;
/// A minimum value suitable as a default.
/// A unique minimum value in our partial order, suitable as a default.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, I can't imagine what "suitable as a default" means. I know I wrote it, but .. reminder that we should probably rip it out.

@@ -28,14 +31,21 @@ pub trait PathSummary<T> : Clone+'static+Eq+PartialOrder+Debug+Default {
/// in computation, uses this method and will drop messages with timestamps that when advanced
/// result in `None`. Ideally, all other timestamp manipulation should behave similarly.
///
/// This summary's partial order is expected to be compatible with the partial order of [T],
/// in the sense that if `s1.less_equal(s2)`, then `s1.results_in(&t)` is less than or equal to
/// `s2.results_in(&t)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct. Though I'd put it on PartialOrder instead, as I don't think it is specific to Timestamp.

@frankmcsherry
Copy link
Member

Merging, and will follow up with proposed edits in a new PR! (( and any merge skew I missed ))

@frankmcsherry frankmcsherry merged commit 72f34aa into TimelyDataflow:master May 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants