
Refactor cache to support storing resources in memory #52210

Draft · wants to merge 19 commits into base: master
Conversation

rosstimothy
Contributor

@rosstimothy rosstimothy commented Feb 16, 2025

While the current cache storage is also in memory, it leverages the memory backend, which requires converting resources to and from JSON. JSON marshaling is suboptimal and is often a source of CPU and memory consumption. The biggest problem with JSON marshaling, though, is that it requires calling CheckAndSetDefaults. When the validations enforced in CheckAndSetDefaults become stricter, caches can be left unable to become healthy when there are mixed versions within a cluster, since each version may have a slightly different view of what should be allowed.

As a means to solve both problems, this changes the cache to store resources received from the upstream directly in memory, without any conversion to JSON. There are two gotchas with this approach: caching is no longer "free", and cloning of resources must be done diligently to avoid races. Historically, the cache relied on the storage implementation in services/local to manage persisting resources in the cache's backend.Memory. However, that will no longer be the case, as the cache needs to support storage directly in memory. To reduce the burden this may pose on developers, the new collection machinery is simpler, and helpers will be added on top of sortcache.SortCache to make a second storage implementation trivial.

The changes here are twofold: mark the existing collections machinery as legacy, and introduce new machinery that operates entirely in memory. This includes an initial implementation of caching for static tokens, users, and cert authorities that leverages the new collection machinery. Additionally, all of the new resource-specific code was moved to lib/cache/static_tokens.go, lib/cache/users.go, and lib/cache/cert_authority.go. This follows the blueprint some of the newer cached resources use, making it easier to identify where the code for a specific resource lives and reducing the size of the cache.go and collections.go files.

Once this lands I plan to apply the same pattern to the other cached resources until all of the legacy collections are gone.

@rosstimothy rosstimothy added the no-changelog Indicates that a PR does not require a changelog entry label Feb 16, 2025
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch 4 times, most recently from 3be09f7 to 9baa177 on February 18, 2025 18:04
This is a mechanical operation to rename the existing collections
to legacy collections. These collections will slowly be converted
over to a new variant that holds resources in memory instead of
relying on the memory backend and storage implementation in
lib/services/local.
Introduces an alternative to the legacyCollections and
genericCollection that relies on caching resources in memory. The
new collection[T] takes inspiration from the legacy collection types
but aims both to simplify them and to break the reliance on the local
storage services.

The two collection mechanisms can and will coexist until all types
are converted to the new mechanism. A single implementation of
the new collection exists for StaticTokens.

Additionally, during the conversion process some refactoring will
be done to try to better organize and reduce the size of cache.go.
In this PR all of the code specific to static tokens was moved
out of cache.go and legacy_collections.go and into static_tokens.go.
This will allow separating concerns and making it much easier to
identify where specific content lives.
Comment on lines +127 to +142
// Always perform the delete if this is not a singleton, otherwise
// only perform the delete if the singleton wasn't found
// or the resource kind isn't cached in the current generation.
if !c.singleton || deleteSingleton || !cacheOK {
if err := c.store.clear(); err != nil {
if !trace.IsNotFound(err) {
return trace.Wrap(err)
}
}
}
// If this is a singleton and we performed a deletion, return here
// because we only want to update or delete a singleton, not both.
// Also don't continue if the resource kind isn't cached in the current generation.
if c.singleton && deleteSingleton || !cacheOK {
return nil
}
Contributor
Looking over this and the old collection logic that it was derived from, I'm not sure we should care about whether or not the underlying type is a singleton. As far as I can see, if we just always clear and always apply any resources in the slice, we should achieve the exact same end effect with much less control flow.

Comment on lines +96 to +98
for idx, transform := range s.indexes {
s.cache.Delete(idx, transform(t))
}
Contributor
nit: SortCache deletes the item across all indexes if it is deleted on any index, so only one call to Delete is required (though maybe just leaving it as-is is easier than deciding which index to use lol).
