improve detection of nanosecond support in gix-fs (#1896) #1897

Merged
merged 1 commit into from
Mar 19, 2025
5 changes: 4 additions & 1 deletion gix-fs/tests/fs/snapshot.rs
@@ -50,5 +50,8 @@ fn has_nanosecond_times(root: &Path) -> std::io::Result<bool> {
     std::fs::write(&test_file, "b")?;
     let second_time = test_file.metadata()?.modified()?;
 
-    Ok(first_time != second_time)
+    Ok(second_time.duration_since(first_time).is_ok_and(|d|
+        // This can be falsely false if a filesystem would be ridiculously fast,
+        // which means a test won't run even though it could. But that's OK, and unlikely.
+        d.subsec_nanos() != 0))
Comment on lines -53 to +56
Member

@EliahKagan EliahKagan Mar 19, 2025

This seems intended to do something different from first_time != second_time, but it effectively behaves the same. The subsec_nanos method just takes the duration modulo 1 second and expresses the value in nanoseconds. In nearly all cases of unequal timestamps that arise from real-world processes, a difference between timestamps will give a nonzero value of subsec_nanos(), except possibly on a system where subsec_nanos() always returns 0 due to 1 second precision. As evaluated here, d.subsec_nanos() != 0 is in practice true under almost exactly the circumstances that first_time != second_time was true.

(There are some differences, but they are much rarer than the failures we have observed. If the clock were adjusted such that the timestamps cannot be subtracted to give a nonnegative value, or if the difference is nonzero but the time between the two file creation/write operations was curiously long and also an exact multiple of 1 second, then this would evaluate to false. It seems unlikely that those scenarios ever happen when running this on CI. If they do, they would happen far less often than the failures we had before as reported in #1896. Thinking roughly about it: these would be much rarer than network problems causing checkout failure, which I believe could arise due to less extreme timing oddities and which we observe on rare occasion, but the failures are instead much more common than that.)
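
To make the comparison above concrete, here is a minimal sketch that applies both checks to synthetic timestamps instead of real file metadata. The names `old_check` and `new_check` are hypothetical wrappers around the two expressions from the diff, and the base timestamp is arbitrary; nothing here is taken from gix-fs itself.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// The expression removed by the diff, wrapped in a hypothetical helper.
fn old_check(first: SystemTime, second: SystemTime) -> bool {
    first != second
}

/// The expression added by the diff, wrapped in a hypothetical helper.
fn new_check(first: SystemTime, second: SystemTime) -> bool {
    second
        .duration_since(first)
        .is_ok_and(|d| d.subsec_nanos() != 0)
}

fn main() {
    // Arbitrary whole-second base time, chosen only for illustration.
    let base = UNIX_EPOCH + Duration::from_secs(1_700_000_000);

    // Identical timestamps: both checks report false.
    assert!(!old_check(base, base));
    assert!(!new_check(base, base));

    // Timestamps 3 ms apart, as a millisecond-precision filesystem could
    // produce: both checks report true, so the new check does not tell
    // millisecond precision apart from nanosecond precision.
    let three_ms_later = base + Duration::from_millis(3);
    assert!(old_check(base, three_ms_later));
    assert!(new_check(base, three_ms_later));

    // Exactly 2 s apart: the checks disagree (rare edge case).
    let two_s_later = base + Duration::from_secs(2);
    assert!(old_check(base, two_s_later));
    assert!(!new_check(base, two_s_later));

    // Second timestamp earlier than the first (clock adjusted):
    // `duration_since` returns Err, so the new check reports false.
    assert!(old_check(three_ms_later, base));
    assert!(!new_check(three_ms_later, base));
}
```

The two checks only disagree in the last two cases, which matches the edge cases listed in the parenthetical above.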

Although the current failures happen on systems where this is wrongly true, it's worth noting that the tests usually pass on those systems, which makes me wonder if the journey test can be adjusted to make sure a sufficient delay occurs and/or is recorded even on systems with millisecond precision. Furthermore, the code comment here notes that it could be wrongly false even with high precision, on a sufficiently fast filesystem. If so, then if I understand the journey test correctly, it could also produce writes so close together that they have the same timestamp. So if the journey test can be adjusted to work reliably regardless of filesystem precision (so long as we do actually have working modification times, that is), then maybe that would benefit all systems.

Member Author

Thanks for the analysis! It looks like I am on the 'worst' platform to actually get this right, along with my misperception of subsecond-precision always meaning nanosecond support.

Then it looks like the only way to stabilise this is to see if there is subsecond precision or not, assume millisecond precision if there is, and use that to choose a sleep-duration that will make the writes work.

Beyond that, the test also shows the limits of this snapshot system which is likely to not pick up changes if they happen in short succession.
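
One possible shape for the stabilisation idea described above (detect subsecond precision, assume millisecond precision if present, and derive a sleep duration) is sketched below. The helpers `sleep_for_mtime` and `probe_sleep`, the probe file name, and the 5 ms and 1100 ms delays are all assumptions made for illustration, not code from this PR or from gix-fs. A single whole-second timestamp can also occur by coincidence on a high-precision filesystem, in which case this sketch merely sleeps longer than necessary.

```rust
use std::path::Path;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical helper: choose how long to wait between two writes so that
/// their modification times are likely to differ.
fn sleep_for_mtime(mtime: SystemTime) -> Duration {
    let has_subsecond = mtime
        .duration_since(UNIX_EPOCH)
        .is_ok_and(|d| d.subsec_nanos() != 0);
    if has_subsecond {
        // Assume nothing finer than millisecond precision and wait a few
        // milliseconds (illustrative value).
        Duration::from_millis(5)
    } else {
        // Whole-second timestamp: assume 1 s granularity and wait slightly
        // longer than one second (illustrative value).
        Duration::from_millis(1100)
    }
}

/// Hypothetical helper: write a probe file under `root`, read its
/// modification time, and derive the delay from it.
fn probe_sleep(root: &Path) -> std::io::Result<Duration> {
    let probe = root.join("mtime-probe");
    std::fs::write(&probe, "a")?;
    let mtime = probe.metadata()?.modified()?;
    std::fs::remove_file(&probe)?;
    Ok(sleep_for_mtime(mtime))
}

fn main() -> std::io::Result<()> {
    let delay = probe_sleep(&std::env::temp_dir())?;
    println!("sleeping {delay:?} between writes");
    std::thread::sleep(delay);
    Ok(())
}
```

Filesystems with coarser granularity than one second would need a longer fallback delay than the one assumed here.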

Member

@EliahKagan EliahKagan Mar 19, 2025

I have another idea. I think my idea may well be worse than what you have just described, but I'm just finishing up testing it. So I'll open a PR for it so you can take a look.

Edit: I've opened #1899.

}