
Refactor cache to support storing resources in memory #52210

Draft · wants to merge 19 commits into base: master
Conversation

rosstimothy
Contributor

@rosstimothy rosstimothy commented Feb 16, 2025

While the current cache storage is also in memory, it leverages the memory backend, which requires converting resources to and from JSON. JSON marshaling is suboptimal and is often a source of CPU and memory consumption. The biggest problem with JSON marshaling, though, is that it requires calling CheckAndSetDefaults. When the validations enforced in CheckAndSetDefaults become stricter, caches can be left unable to become healthy when there are mixed versions within a cluster, since each version may have a slightly different view of what should be allowed.

As a means to solve both problems, this changes the cache to store resources received from the upstream directly in memory, without any conversion to JSON. There are two gotchas with this approach: caching is no longer "free", and cloning of resources must be done diligently to avoid races. Historically, the cache relied on the storage implementation in services/local to manage persisting resources in the cache's backend.Memory. However, that will no longer be the case, as the cache needs to support storage directly in memory. To reduce the burden this may pose on developers, the new collection machinery is simpler, and helpers will be added on top of sortcache.SortCache to make a second storage implementation trivial.

The changes here are twofold: mark the existing collections machinery as legacy, and introduce new machinery that operates entirely in memory. This includes an initial implementation of caching for static tokens, users, and cert authorities that leverages the new collection machinery. Additionally, all of the new resource-specific code was moved to lib/cache/static_tokens.go, lib/cache/users.go, and lib/cache/cert_authority.go. This follows the blueprint some of the newer cached resources use, making it easier to identify where the code for a specific resource lives and reducing the size of the cache.go and collections.go files.

Once this lands I plan to apply the same pattern to the other cached resources until all of the legacy collections are gone.

@rosstimothy rosstimothy added the no-changelog Indicates that a PR does not require a changelog entry label Feb 16, 2025
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch 4 times, most recently from 3be09f7 to 9baa177 on February 18, 2025 18:04
This is a mechanical operation to rename the existing collections
to legacy collections. These collections will slowly be converted
over to a new variant that holds resources in memory instead of
relying on the memory backend and storage implementation in
lib/services/local.
Introduces an alternative to the legacyCollections and
genericCollection that relies on caching resources in memory. The
new collection[T] takes inspiration from the legacy collection types
but aims both to simplify them and to break the reliance on the local
storage services.

The two collection mechanisms can and will coexist until all types
are converted to the new mechanism. A single implementation of
the new collection exists for StaticTokens.

Additionally, during the conversion process some refactoring will
be done to try to better organize and reduce the size of cache.go.
In this PR all of the code specific to static tokens was moved
out of cache.go and legacy_collections.go and into static_tokens.go.
This will allow separating concerns and making it much easier to
identify where specific content lives.
Comment on lines +127 to +142
// Always perform the delete if this is not a singleton, otherwise
// only perform the delete if the singleton wasn't found
// or the resource kind isn't cached in the current generation.
if !c.singleton || deleteSingleton || !cacheOK {
if err := c.store.clear(); err != nil {
if !trace.IsNotFound(err) {
return trace.Wrap(err)
}
}
}
// If this is a singleton and we performed a deletion, return here
// because we only want to update or delete a singleton, not both.
// Also don't continue if the resource kind isn't cached in the current generation.
if c.singleton && deleteSingleton || !cacheOK {
return nil
}
Contributor
Looking over this and the old collection logic that it was derived from, I'm not sure we should care about whether or not the underlying type is a singleton. As far as I can see, if we just always clear and always apply any resources in the slice, we should achieve the exact same end effect with much less control flow.

Comment on lines +96 to +98
for idx, transform := range s.indexes {
s.cache.Delete(idx, transform(t))
}
Contributor
nit: SortCache deletes the item across all indexes if it is deleted on any index, so only one call to Delete is required (though maybe just leaving it as-is is easier than deciding which index to use lol).
