Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libsql-server: use LRU cache to store active namespaces #744

Merged
merged 16 commits into from
Dec 20, 2023
Merged

Conversation

psarna
Copy link
Contributor

@psarna psarna commented Dec 1, 2023

This is a port of Marin's libsql/sqld#689

This PR replaces the namespace store dumb hashmap with an LRU cache to bound the maximum number of namespaces loaded into memory while unbounding the number of namespaces allocated on a sqld instance. Additionally, this enables fully concurrent access to namespaces by removing the global lock on the namespace store.

@psarna psarna requested a review from MarinPostma December 1, 2023 12:34
@psarna
Copy link
Contributor Author

psarna commented Dec 1, 2023

@MarinPostma unique opportunity for you: to review a patch that is authored under your name in git, but actually half of it is hacked by me to match the new code 😅 not tested yet beyond if it compiles

})
})
// TODO(marin): configurable capacity
.max_capacity(25)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarinPostma does it imply we cap our active databases at 25? Then we need to definitely make it configurable, but also bump it to at least a 100, even for those small 256MiB machines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And https://docs.rs/moka/latest/moka/future/struct.CacheBuilder.html#method.time_to_idle time_to_idle sounds very relevant for us too, accessing the db bumps the counter, but it gets automatically evicted after some time (and be generous with that time, like, minutes)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it was still very POC and I hit other issues even before I started tuning the eviction logic

Copy link
Contributor

@MarinPostma MarinPostma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a bunch of nits for now, I need to dig deeper into this, and test it as well. Especially since the locking is not quite as straightforward as it used to be

Comment on lines 402 to 403
// FIXME: when destroying, we are waiting for all the tasks associated with the
// allocation to finnish, which create a lot of contention on the lock. Need to use a
// allocation to finish, which create a lot of contention on the lock. Need to use a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not really relevant anymore

Comment on lines 538 to 539
// FIXME: find how to deal with Arc<Error>
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an Error variant for Arc now

})
})
// TODO(marin): configurable capacity
.max_capacity(25)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it was still very POC and I hit other issues even before I started tuning the eviction logic

@psarna psarna force-pushed the marinsport branch 5 times, most recently from cf5e74f to d5dd4af Compare December 5, 2023 17:08
@psarna
Copy link
Contributor Author

psarna commented Dec 5, 2023

@MarinPostma interesting observation: with moka's LFU cache for admission and LRU for eviction, we have an anomaly due to heartbeat. Existing namespaces are touched with heartbeat, so when a new namespace is inserted, it's most likely to get evicted, because it was not accessed even once yet. We can probably elide that by artificially bumping newly inserted namespaces, I'll experiment some more with it. Detected when inspecting how the cache behaves with --max-active-namespaces 2

@psarna
Copy link
Contributor Author

psarna commented Dec 6, 2023

update: not really an anomaly, expected behavior! After creating the namespace with the admin API, it makes perfect sense to evict it if no requests were sent to it yet. And once any request is sent, the database will be brought back to the active list, and it will not be evicted as long as it was recently used frequently enough

Copy link
Contributor

@MarinPostma MarinPostma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be more confident merging that if we had some more bottomless tests that showed the interaction with that feature

let namespace = namespace.clone();
async move {
if namespace != NamespaceName::default()
&& !self.inner.make_namespace.exists(&namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm we may need to check from bottomless if we need a restore, in case the namespace exists but is not on disk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter? Isn't "create" only about recreating the local directory structure? I think the code we have in our current master branch doesn't care about bottomless either

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in master does care about bottomless, the allow_lazy creation is passed down to new_primary, which makes the decision to create or not. This happens here: https://github.com/tursodatabase/libsql/blob/main/libsql-server/src/namespace/mod.rs#L930.

As you see, we try to go as far as possible with the creation. If the db doesn't exist AND we didn't recover for bottomless AND we're not allowed to lazy create, then it's an error, we abort and clean after ourselves

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, that makes the last couple of commits counterproductive, I tried to push the check as early as possible to simplify the logic. On the other hand, let my try a different direction: what is the bad effect of us creating a namespace and not returning an error, if it in fact resides in bottomless? From there I see 3 directions to go:

  1. Roll back to the previous logic with trying to create and then bailing out, and just make sure the new caching code still works as expected
  2. If it's actually harmless, leave as is
  3. If harmless, maybe just expand the exists() implementation to also check if there's a bottomless laying around somewhere, if enabled? Should be a single http call to s3, and creating a namespace is considered rare

@MarinPostma
Copy link
Contributor

I have run some tests on that branch. I can allocate 10k namespaces, but then memory seems to get stuck:
image
I think there are resources that are not being deallocated when we release a namespace
we need to know what they are

@psarna
Copy link
Contributor Author

psarna commented Dec 6, 2023

@MarinPostma there's also the matter of how quickly you allocated. With our setup, eviction is async, so a peak would kill us. We can wait for async evictions with run_pending_tasks().await

@psarna
Copy link
Contributor Author

psarna commented Dec 6, 2023

@MarinPostma we can also get an estimated cache size with https://docs.rs/moka/latest/moka/future/struct.Cache.html#method.entry_count, and if it's beyond some hard limit, wait for pending tasks

@psarna psarna force-pushed the marinsport branch 3 times, most recently from 35a116c to 7f2d22e Compare December 15, 2023 09:21
@psarna psarna marked this pull request as draft December 15, 2023 09:22
@psarna
Copy link
Contributor Author

psarna commented Dec 15, 2023

Downgrading to draft, it needs a very heavy rewrite after MetaStore got merged

@psarna psarna force-pushed the marinsport branch 2 times, most recently from a72b58e to 78cc9da Compare December 15, 2023 11:27
Marin Postma and others added 7 commits December 19, 2023 09:55
This is a port of Marin's libsql/sqld#689

This PR replaces the namespace store dumb hashmap with an LRU cache to bound
the maximum number of namespaces loaded into memory while unbounding the number
of namespaces allocated on a sqld instance.  Additionally, this enables fully
concurrent access to namespaces by removing the global lock on the namespace
store.

Co-authored-by: Piotr Sarna <[email protected]>
... specifically, half-configurable, since we could also configure
the time-to-idle period. On the other hand, that's another burden
for the user to decide, and 5 minutes sound generous enough.
Originally the test was performed in the f() function,
but that changed after we allowed default namespace creation.
TODO: snapshotting on eviction will kill I/O if we get n+1 active
namespaces, given a cache of n entries... We need a more robust
mechanism for snapshot tracking to avoid:
 1. snapshotting the same namespace more than once, if it was
    evicted, then recreated, and then evicted again
 2. snapshotting a namespace if the previous snapshot happened
    not long ago (esp. if there's no new data)
Now that we have cache, it's better to consult the MakeNamespace
implementation to check if the namespace has already been created.
Now that we abstracted out exists() for MakeNamespace,
we no longer need to pass a boolean that allows or disallows
auto-creation.
With MakeNamespace::exists(), the logic for checking if a namespace
already exists is shifted to a separate call.
Even if the namespace doesn't exist locally, it might have a functional
backup.
@psarna psarna marked this pull request as ready for review December 19, 2023 09:18
@psarna
Copy link
Contributor Author

psarna commented Dec 19, 2023

Ok, rebased and retested. This code assumes that MetaStore is the source of truth for checking if a namespace exists. In particular, it assumes that when we restore from bottomless, MetaStore should get restored first, before we try to instantiate any namespace. That's not the case yet as of #616, but it's expected to be added as a follow-up

@psarna psarna added this pull request to the merge queue Dec 20, 2023
Merged via the queue into main with commit 8845d45 Dec 20, 2023
@psarna psarna deleted the marinsport branch December 20, 2023 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants