This seems intended to do something different from `first_time != second_time`, but it effectively behaves the same. The `subsec_nanos` method just takes the duration modulo 1 second and expresses the value in nanoseconds. In nearly all cases of unequal timestamps that arise from real-world processes, the difference between the timestamps will give a nonzero value of `subsec_nanos()`, except possibly on a system where `subsec_nanos()` always returns `0` due to 1 second precision. As evaluated here, `d.subsec_nanos() != 0` is in practice `true` under almost exactly the circumstances that `first_time != second_time` was `true`.

(There are some differences, but they are much rarer than the failures we have observed. If the clock were adjusted such that the timestamps cannot be subtracted to give a nonnegative value, or if the difference is nonzero but the time between the two file creation/write operations was curiously long and also an exact multiple of 1 second, then this would evaluate to `false`. It seems unlikely that those scenarios ever happen when running this on CI. If they do, they would happen far less often than the failures we had before as reported in #1896. Thinking roughly about it: these would be much rarer than network problems causing checkout failure, which I believe could arise due to less extreme timing oddities and which we observe on rare occasion, whereas the failures are much more common than that.)

Although the current failures happen on systems where this is wrongly `true`, it's worth noting that the tests usually pass on those systems, which makes me wonder if the `journey` test can be adjusted to make sure a sufficient delay occurs and/or is recorded even on systems with millisecond precision. Furthermore, the code comment here notes that it could be wrongly `false` even with high precision, on a sufficiently fast filesystem. If so, then if I understand the `journey` test correctly, it could also produce writes so close together that they have the same timestamp. So if the `journey` test can be adjusted to work reliably regardless of filesystem precision (so long as we do actually have working modification times, that is), then maybe that would benefit all systems.
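To make the comparison above concrete, here is a minimal sketch. The function `subsec_check` is a hypothetical stand-in for the check under review; the actual code in the PR may be shaped differently, and the `map_or(false, ...)` handling of a failed subtraction is an assumption. The timestamps are made up for illustration.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for the check under review (an assumption, not the PR's code):
/// true only if the difference is nonnegative and has a nonzero sub-second part.
fn subsec_check(first: SystemTime, second: SystemTime) -> bool {
    second
        .duration_since(first)
        .map_or(false, |d| d.subsec_nanos() != 0)
}

fn main() {
    // Arbitrary whole-second base timestamp, chosen only for illustration.
    let base = UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let cases = [
        ("equal", base, base),
        ("1 ms apart", base, base + Duration::from_millis(1)),
        ("exactly 2 s apart", base, base + Duration::from_secs(2)),
        ("clock went backwards", base + Duration::from_millis(1), base),
    ];
    for (label, first, second) in cases {
        println!(
            "{label}: `!=` gives {}, subsec check gives {}",
            first != second,
            subsec_check(first, second)
        );
    }
}
```

The two checks agree on the first two cases and differ only on the last two, which are the rare scenarios described in the parenthetical above.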
Thanks for the analysis! It looks like I am on the 'worst' platform to actually get this right, and I had also wrongly assumed that subsecond precision always means nanosecond support.
Then it looks like the only way to stabilise this is to see if there is subsecond precision or not, assume millisecond precision if there is, and use that to choose a sleep duration that will make the writes work.
Beyond that, the test also shows the limits of this snapshot system, which is likely to miss changes if they happen in quick succession.
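A rough sketch of that idea, assuming the precision is guessed from a single observed modification time. The function name `delay_for` and both delay values are hypothetical and chosen only for illustration; they are not taken from the test.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical helper: guess the timestamp precision from one observed mtime
/// and pick a delay between writes that should be long enough to be recorded.
fn delay_for(mtime: SystemTime) -> Duration {
    // Sub-second part of the timestamp; treated as 0 if it predates the epoch.
    let subsec = mtime
        .duration_since(UNIX_EPOCH)
        .map_or(0, |d| d.subsec_nanos());
    if subsec == 0 {
        // Looks like whole-second precision (or a timestamp that happens to
        // sit on a second boundary): wait a bit more than one second.
        Duration::from_millis(1100)
    } else {
        // Sub-second precision is present: assume it is only millisecond precision.
        Duration::from_millis(2)
    }
}

fn main() {
    let coarse = UNIX_EPOCH + Duration::from_secs(5);
    let fine = UNIX_EPOCH + Duration::from_millis(5_250);
    println!("coarse mtime -> sleep {:?}", delay_for(coarse));
    println!("fine mtime   -> sleep {:?}", delay_for(fine));
}
```

A timestamp that lands exactly on a second boundary by chance is treated as coarse here, which only makes the delay longer than needed. In practice the input would come from something like `std::fs::metadata(path)?.modified()?`.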
I have another idea. I think my idea may well be worse than what you have just described, but I'm just finishing up testing it. So I'll open a PR for it so you can take a look.
Edit: I've opened #1899.