Skip to content

Cargo should say which lock it waits on when run with --verbose #15094

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
WaffleLapkin opened this issue Jan 24, 2025 · 8 comments · May be fixed by #15486
Open

Cargo should say which lock it waits on when run with --verbose #15094

WaffleLapkin opened this issue Jan 24, 2025 · 8 comments · May be fixed by #15486
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage.

Comments

@WaffleLapkin
Copy link
Member

WaffleLapkin commented Jan 24, 2025

Problem

Due to an unfortunate series of events (rust-analyzer OOM'ing after using >44GiB of RAM) I had cargo's package cache permanently locked. When I'd try to build something all I would see is this:

    Blocking waiting for file lock on package cache

(I had no cargo/r-a running at that point)

This is only semi-helpful -- it shows you what's the problem, but not how to fix it.

Proposed Solution

It would be nice if cargo would print the path to the lockfile. In this case that would be something like ~/.cargo/.package-cache and ~/.cargo/.package-cache-mutate (deleting those fixed the problem for me). It's probably better to only print those with --verbose, since usually you are actually waiting for another cargo process.

Notes

Upd: this can be even worse for other locks/in combination with other build systems.

Just not I had to figure out that the blocking/broken lock is actually .../rust-q/build/rust-analyzer/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/.cargo-lock.

@WaffleLapkin WaffleLapkin added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels Jan 24, 2025
@ChrisDenton
Copy link
Member

Shouldn't the lock be automatically released when the process dies and the OS cleans up the fd?

@WaffleLapkin
Copy link
Member Author

That would be a reasonable assumption, however it seems like this time it did not happen. I don't really have any knowledge on why it might have not been cleaned up. But it did happen with both cargo and bootstrap at the same time, so I assume something went wrong.

There is also this stack overflow question: https://stackoverflow.com/questions/47565203/cargo-build-hangs-with-blocking-waiting-for-file-lock-on-the-registry-index-a, where some people recommend deleting the file or say that it helped them. So I'd assume there are cases where the OS does not do the cleanup...

@Gordon01 Gordon01 linked a pull request May 3, 2025 that will close this issue
@epage
Copy link
Contributor

epage commented May 5, 2025

@Gordon01 provided an implementation for this in #15486.

@weihanglo commented:

Unsure if the discoverability of this behind --verobse. Personally I am fine with just printing the lock path unconditionally,

@epage
Copy link
Contributor

epage commented May 5, 2025

For me, this is too low level of a detail to be shown by default.

I do wonder what should be our bar for -v vs -vv vs CARGO_LOG.

@Gordon01
Copy link
Contributor

Gordon01 commented May 8, 2025

Unsure if the discoverability of this behind --verobse

I agree. Without prior knowledge its hard to discover.

Personally I am fine with just printing the lock path unconditionally

Yes, but in 99,9% it's a garbage cluttering the output which is definitely a bad UX.

For me, this is too low level of a detail to be shown by default.

Agree.

The only good idea I have is to print path after some delay (say 5 seconds).

@weihanglo
Copy link
Member

The only good idea I have is to print path after some delay (say 5 seconds).

I would love to see something like this, with some more details like who is holding the lock.

@Gordon01
Copy link
Contributor

with some more details like who is holding the lock.

I've discovered that lock files are currently empty. I propose saving a process ID inside, possibly with an acquisition timestamp.

A PID alone would be sufficient to easily determine if a process is still running. While waiting for a lock, I could periodically print a status like {running, stopped}. Alternatively, we could simply print the raw PID if we prefer a minimal approach. Since this code would rarely execute, simple makes sense.

Another option would be to also save a timestamp. In this case, the file should be structured as JSON (or TOML), for example:

{ "pid": 12345, "timestamp": "2025-05-13T22:00:00Z" }

The status message could then display:

Blocking waiting for file lock on package cache held by PID 12345, running for 45s (/home/user/.cargo/.package-cache)

My plan is to write metadata in cargo::util::flock::Filesystem methods like open_rw_exclusive_create().

Currently, cargo::util::flock::Filesystem::open() returns (PathBuf, File), but since it's only used for lockfiles, would it make more sense to return a FileLock instead? This would allow adding a method to write PID metadata. Here's how open_rw_exclusive_create() might look:

pub fn open_rw_exclusive_create<P>(
    &self,
    path: P,
    gctx: &GlobalContext,
    msg: &str,
) -> CargoResult<FileLock>
where
    P: AsRef<Path>,
{
    let mut opts = OpenOptions::new();
    opts.read(true).write(true).create(true);
    let mut lock = self.open(path.as_ref(), &opts, true)?;
    acquire(
        gctx,
        msg,
        lock.path(),
        &|| try_lock_exclusive(lock.file()),
        &|| lock_exclusive(lock.file()),
    )?;
    // Write PID/timestamp after acquiring the lock
    let _ = lock.write_metadata();
    Ok(lock)
}

Open Questions

  • What would be a sensible delay before showing lock wait messages?
  • Should we store just the PID in plaintext, or use a structured format with additional fields like timestamp?
  • Should we handle stale locks?

@weihanglo
Copy link
Member

Attaching some extra data sounds good. I know that PostgreSQL does that in the postmaster.pid file. We can start small from pid, and expand to more information if needed. Note that this should be advisory only.

  • What would be a sensible delay before showing lock wait messages?

It varies. Also depends on how focused people can be these day 😆. I would love to try like 15s to start with, but I don't really now.

  • Should we store just the PID in plaintext, or use a structured format with additional fields like timestamp?

I would recommend JSON so we have room to expand it in the future. Also easy to parse without needing to look at any bespoke format.

  • Should we handle stale locks?

We could clean up the content before unlock, though as it is advisory only it is fine as well if we don't want to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants