kvserver: introduce blocking vfs.FS; add some tests #98852

Closed
wants to merge 4 commits

Conversation

tbg
Member

@tbg tbg commented Mar 17, 2023

edit: with the third commit in place, the test in 2) now passes.


This introduces a `vfs.FS` implementation that can be toggled to block
on any write activity.
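
For illustration, here is a minimal sketch of what such a toggleable blocking FS could look like, assuming the `github.com/cockroachdb/pebble/vfs` package. The `BlockingFS` name, the `Wrap`/`Stall`/`Resume` helpers, and the set of wrapped methods are illustrative, not this PR's actual implementation:

```go
// Package blockingfs sketches a vfs.FS wrapper whose write paths can be
// toggled to block, simulating a stalled disk. Illustrative only.
package blockingfs

import (
	"sync"

	"github.com/cockroachdb/pebble/vfs"
)

// BlockingFS wraps a vfs.FS so that write activity can be toggled to block.
// Method signatures assume the vfs.FS interface around the time of this PR.
type BlockingFS struct {
	vfs.FS // methods not overridden below pass through to the wrapped FS

	mu      sync.Mutex
	stalled chan struct{} // non-nil while writes should block
}

// Wrap returns a BlockingFS around fs (e.g. vfs.Default or vfs.NewMem()).
func Wrap(fs vfs.FS) *BlockingFS { return &BlockingFS{FS: fs} }

// Stall makes subsequent write activity block until Resume is called.
func (b *BlockingFS) Stall() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.stalled == nil {
		b.stalled = make(chan struct{})
	}
}

// Resume unblocks all writers currently parked in maybeBlock.
func (b *BlockingFS) Resume() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.stalled != nil {
		close(b.stalled)
		b.stalled = nil
	}
}

func (b *BlockingFS) maybeBlock() {
	b.mu.Lock()
	ch := b.stalled
	b.mu.Unlock()
	if ch != nil {
		<-ch // parked until Resume closes the channel
	}
}

// Create blocks while stalled and wraps the returned file so that its
// Write and Sync calls also respect the stall.
func (b *BlockingFS) Create(name string) (vfs.File, error) {
	b.maybeBlock()
	f, err := b.FS.Create(name)
	if err != nil {
		return nil, err
	}
	return blockingFile{File: f, fs: b}, nil
}

type blockingFile struct {
	vfs.File
	fs *BlockingFS
}

func (f blockingFile) Write(p []byte) (int, error) {
	f.fs.maybeBlock()
	return f.File.Write(p)
}

func (f blockingFile) Sync() error {
	f.fs.maybeBlock()
	return f.File.Sync()
}
```

A real version would presumably also wrap the other write paths (`ReuseForWrite`, `OpenDir` for directory syncs, `Remove`, `Rename`, etc.), since a stalled disk affects those too.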

Add two tests:

1. check that even though a durable commit is inflight, we can still
   write non-durably. I.e. at the pebble level there is no immediate
   reason why we couldn't keep *applying* the raft log even though log
   *appends* are stuck (see the sketch after this list).
2. end-to-end test where we set up a three node cluster, send unique
   writes and stall the leaseholder's disk, hoping for the writes to
   keep going through. This fails, as is expected, because raft does
   not even put up entries for application if they haven't been durably
   appended to the leader's log (which they can't be due to the stall
   we're injecting).
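
A rough sketch of the shape of test 1), assuming pebble's public API (`pebble.Open`, `pebble.Sync`/`pebble.NoSync`) and the illustrative `BlockingFS` above; the test name and structure here are illustrative, not the PR's actual test:

```go
package blockingfs

import (
	"testing"

	"github.com/cockroachdb/pebble"
	"github.com/cockroachdb/pebble/vfs"
)

// TestNonDurableWritesDuringStall: while a durable (synced) commit is stuck
// on the stalled FS, a non-durable commit should still go through.
func TestNonDurableWritesDuringStall(t *testing.T) {
	fs := Wrap(vfs.NewMem())
	db, err := pebble.Open("", &pebble.Options{FS: fs})
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	fs.Stall()
	defer fs.Resume()

	// A durable commit blocks: its WAL write/sync can't reach "disk".
	syncDone := make(chan error, 1)
	go func() {
		syncDone <- db.Set([]byte("k-durable"), []byte("v"), pebble.Sync)
	}()

	// A non-durable commit only needs to land in the in-memory WAL buffer
	// and memtable, so (this is the property test 1 asserts) it should
	// complete despite the stall.
	if err := db.Set([]byte("k-nondurable"), []byte("v"), pebble.NoSync); err != nil {
		t.Fatal(err)
	}

	// The durable commit should still be parked behind the stalled FS.
	select {
	case err := <-syncDone:
		t.Fatalf("durable commit unexpectedly completed during stall: %v", err)
	default:
	}

	fs.Resume()
	if err := <-syncDone; err != nil {
		t.Fatal(err)
	}
}
```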

If we can get 2) to pass, we have reason to believe that we can
address #88699 (though we'd need to verify with "real" disk stalls
and higher write volumes/stall durations, and would certainly need to
configure pebble to allow much more data to accumulate in memory;
sketched below).
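
A hedged illustration of that kind of tuning, using pebble `Options` fields that do exist (`MemTableSize`, `MemTableStopWritesThreshold`, `L0CompactionThreshold`, `L0StopWritesThreshold`); the helper name and the values are placeholders, not recommendations:

```go
package blockingfs

import (
	"github.com/cockroachdb/pebble"
	"github.com/cockroachdb/pebble/vfs"
)

// stallTolerantOptions returns pebble options tuned (illustratively) to let
// more unflushed data accumulate in memory while the disk is stalled.
func stallTolerantOptions(fs vfs.FS) *pebble.Options {
	return &pebble.Options{
		FS: fs, // e.g. the blocking FS from above, or vfs.Default

		// Larger and more numerous memtables let writes pile up in memory
		// while the WAL/flush path is stuck.
		MemTableSize:                128 << 20, // 128 MiB per memtable
		MemTableStopWritesThreshold: 8,         // allow several queued memtables

		// Relax L0 backpressure so the flush backlog doesn't stop writes
		// right away once flushes resume.
		L0CompactionThreshold: 4,
		L0StopWritesThreshold: 1000,
	}
}
```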

The additional benefit test 2) provides is that it prints who is
blocked. At the time of writing, it's encouraging - we see just
the LogWriter blocked on a call to `Write` (via `flushPending`).

The next step would be to hack in the ability to apply entries
that aren't durable on the leader and to see what we get.

Touches #88699.

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@tbg
Member Author

tbg commented Mar 17, 2023

cc @jbowens @nvanbenschoten @erikgrinaker since we talked about #88699.

(Internal) Slack thread

@tbg
Member Author

tbg commented Mar 17, 2023

Going to try to hook this up with etcd-io/raft#37

tbg added 2 commits March 18, 2023 20:37
Surprised me, but TestWriteWithStuckLeaseholder now passes.

Epic: none
Release note: None
@tbg
Member Author

tbg commented Mar 28, 2023

From the above Slack thread:

I got raft patched to also put up entries for application that are committed via a non-local quorum, and magically things work... until the ~257th write, then it stalls.

Jackson then had some responses on where the stalls would be.

At this point I'm going to disengage from the prototype; we can pick this back up if we're interested in making progress in earnest. There seems to be reason to believe that with some pebble work we could mask short disk stalls or latency degradations without necessarily exposing them to users. What's unexplored is how "real" disk stalls affect disk reads: we would need to serve all pebble reads from the cache during evaluation and replication, or requests get stuck anyway.

@tbg
Member Author

tbg commented Mar 28, 2023

Closing: the work is tracked in #88699.

@jbowens any interest in putting the blocking `vfs.FS` anywhere? I'm not sure how disk stalls are being tested right now; I expected something similar to exist for that use case, but it doesn't seem to.

@tbg tbg closed this Mar 28, 2023
@jbowens
Collaborator

jbowens commented Mar 28, 2023

Yeah, maybe we should start a pebble/vfs/vfstest package that can be consumed within both Pebble and Cockroach. Our current disk stall unit tests use a completely mocked fs: vfs/disk_health_test.go

IIRC, there are also a few tests that use errorfs and a custom InjectorFunc to induce blocking. We could also move errorfs out of the internal package so that it can be used by cockroach.
