How well are `arrow` builds synchronized across test processes?
#1839
Replies: 1 comment 6 replies
-
Even though it's unclear how it can happen, I think it's clear that it must be My guess is that If this keeps happening or happens often enough to be annoying, I suppose the test-suite could use its own lock to prevent multiple concurrent On top of that, or alternatively, I could imagine that it should be quite straightforward to write a test-program that ramps up the parallelism to reliably trigger this issue. Then it could be posted on the CC @weihanglo in case they have thoughts.
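To make the "own lock" idea a bit more concrete, here is a minimal sketch of a cross-process lock built only on the standard library. It relies on `create_new(true)` creating the lock file atomically, so only one process at a time can hold it. The names `BuildLock` and `with_build_lock`, the lock-file location, and the polling interval are all hypothetical and not part of the test suite. A stale lock file left behind by a killed process is not handled here.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical guard: the lock is held for as long as this value is alive.
pub struct BuildLock {
    path: PathBuf,
}

impl BuildLock {
    /// Tries to create `path` atomically, polling until it succeeds
    /// or until `timeout` has elapsed.
    pub fn acquire(path: &Path, timeout: Duration) -> io::Result<Self> {
        let start = Instant::now();
        loop {
            // `create_new` fails with `AlreadyExists` if another process
            // (or thread) currently holds the lock file.
            match OpenOptions::new().write(true).create_new(true).open(path) {
                Ok(_) => return Ok(BuildLock { path: path.to_path_buf() }),
                Err(e) if e.kind() == ErrorKind::AlreadyExists => {
                    if start.elapsed() >= timeout {
                        return Err(io::Error::new(
                            ErrorKind::TimedOut,
                            "timed out waiting for build lock",
                        ));
                    }
                    sleep(Duration::from_millis(20));
                }
                Err(e) => return Err(e),
            }
        }
    }
}

impl Drop for BuildLock {
    fn drop(&mut self) {
        // Release the lock; an error here is ignored in this sketch.
        let _ = fs::remove_file(&self.path);
    }
}

/// Runs `build` while holding the lock at `lock_path`.
pub fn with_build_lock<T>(
    lock_path: &Path,
    timeout: Duration,
    build: impl FnOnce() -> T,
) -> io::Result<T> {
    let _guard = BuildLock::acquire(lock_path, timeout)?;
    Ok(build())
}
```

The closure passed to `with_build_lock` would be whatever currently spawns the build command, so that only one test process at a time runs it.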
-
I got a failure in `overwriting_files_and_lone_directories_works` on CI when fast-forwarding the main branch of my fork. It occurred in the `test-fast` `macos-latest` job. The failure appears intermittent and rare because it went away when the job was rerun, and because it seems like it arises from a strange race condition.

I believe this is completely unrelated to #1373, even though the failing test case is the same. There, the problem was in the symlink probe. Here, the problem is in calling the `arrow.rs` example filter, which does not exist when called. The messages about the build blocking are normal, and often happen, at least when the test suite is run locally and I believe also on CI. But it looks like the multiple concurrent runs of `cargo build -p=gix-filter --example arrow` somehow result in the `arrow` executable temporarily being absent when called. When an attempt is made to run it, it's not there:

I'm not clear on how that would happen, though. I'm not sure if this is a bug in the test suite, or a bug in `cargo` or `rustc`, or some other condition.

`overwriting_files_and_lone_directories_works` is one of three tests that call `setup_filter_pipeline`, which calls `driver_exe`, which accesses `DRIVER`, which is a `once_cell::sync::Lazy` instance that runs `cargo build -p=gix-filter --example arrow` at most once per process.

Although this may help avoid unnecessary runs of that build command in some circumstances, it neither prevents nor typically decreases the likelihood of running that build command two or more times concurrently. This is because gitoxide primarily uses the nextest runner, which uses multiple test processes to run separate tests in parallel. But it seems to me that this shouldn't be a problem, because `cargo` does its own synchronization. This can be seen in:

So I don't understand why `arrow` is not there when an attempt is made to run it as a filter.

This does not seem to be due to problems with the directory from which the command is run, nor to any wrong assumptions about where the executable should be created. If it were, then this test would always or at least regularly fail, instead of hardly ever failing. In addition, I've verified locally (though on GNU/Linux, not macOS) that the `arrow` executable is in that location when built from the root of the workspace as well as from `gix-worktree-state/tests`.
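To illustrate the "at most once per process" point, here is a small sketch that uses `std::sync::OnceLock` as a stand-in for `once_cell::sync::Lazy`. The `simulated_build` function, the `BUILD_RUNS` counter, and the placeholder path are assumptions made for illustration; the real `DRIVER` runs the build command instead. Within one process, any number of threads trigger the initializer only once. Each separate test process has its own copy of the static, though, so this guarantee says nothing about builds started by other processes.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Counts how many times the simulated build ran in this process.
static BUILD_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the `DRIVER` static; initialized lazily on first access.
static DRIVER: OnceLock<PathBuf> = OnceLock::new();

/// Hypothetical replacement for spawning the real build command:
/// it only records the call and returns a placeholder path.
fn simulated_build() -> PathBuf {
    BUILD_RUNS.fetch_add(1, Ordering::SeqCst);
    PathBuf::from("placeholder/arrow")
}

/// Stand-in for `driver_exe`: the first caller runs the initializer,
/// every later caller in the same process gets the cached value.
fn driver_exe() -> &'static PathBuf {
    DRIVER.get_or_init(simulated_build)
}

/// Accesses the driver from `n` threads and returns how many times
/// the simulated build has run in this process so far.
fn access_from_threads(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(|| {
                driver_exe();
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    BUILD_RUNS.load(Ordering::SeqCst)
}
```

Running several copies of a program like this at once would run the initializer once in each copy, which mirrors how separate nextest processes can each start their own build.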