feat: cross-replica, OFREP bulk evaluation caching#1858

Open
toddbaert wants to merge 4 commits into main from feat/ofrep-evaluation-caching

Conversation


@toddbaert toddbaert commented Jan 30, 2026

This PR implements an OFREP bulk-evaluation caching mechanism compliant with https://openfeature.dev/docs/reference/other-technologies/ofrep/openapi/ that avoids re-evaluation entirely. The cache is invalidated based on the flag configuration, not on the context of the request (i.e., it tells the client to keep its current cache if no configuration has changed).

This is an alternative implementation to #1854. The main difference here is that evaluation is avoided entirely if the ETag matches. This implementation:

  • sends a per-selector ETag for caching
  • ensures that ETags are consistent across replicas
  • skips evaluation entirely if the ETag matches
  • invalidates/updates ETags on sync changes (file, http, etc.)

One challenge, and part of the complexity here, is that I wanted the invalidation of ETags to be selector-aware, so that clients interested in one set of flags don't have their ETag invalidated by a sync update that doesn't impact them.

# the ETag is bound to the selector; a different selector will have a different ETag
curl -s -i -X POST http://localhost:8016/ofrep/v1/evaluate/flags \
  -H "Content-Type: application/json" \
  -H "flagd-selector: flagSetId=myFlagSet" \
  -d '{"context": {}}'

⚠️ This implementation has one implication for OFREP web provider(s). Specifically, the provider must invalidate its stored ETag when its context changes, to force re-evaluation. Otherwise, if flags have not changed, it will always get an empty 304 response even if its context has changed. See this PR: open-feature/js-sdk-contrib#1460. We should consider this before merging... cc @lukas-reining

@toddbaert toddbaert requested review from a team as code owners January 30, 2026 17:10
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 30, 2026

netlify bot commented Jan 30, 2026

Deploy Preview for polite-licorice-3db33c ready!

🔨 Latest commit: 95aec74
🔍 Latest deploy log: https://app.netlify.com/projects/polite-licorice-3db33c/deploys/697d15ed974a410008bc6da8
😎 Deploy Preview: https://deploy-preview-1858--polite-licorice-3db33c.netlify.app

@gemini-code-assist

Summary of Changes

Hello @toddbaert, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance enhancement for flagd's OFREP bulk evaluation endpoint by implementing a sophisticated caching mechanism. It leverages HTTP ETags, generated per flag selector, to allow clients to avoid re-evaluation when the underlying flag configuration has not changed. The system ensures ETag consistency across distributed instances and intelligently invalidates cached entries when relevant flag configurations are updated, optimizing network traffic and server load.

Highlights

  • OFREP Bulk Evaluation Caching: Implements a caching mechanism for OFREP bulk flag evaluations, entirely avoiding re-evaluation if the ETag matches.
  • Per-Selector ETags: Sends per-selector ETags for caching, ensuring consistency across replicas.
  • Watch-Based Invalidation: ETags are invalidated and updated based on flag configuration changes detected via a new 'WatchSelector' mechanism in the store.
  • Configurable Cache Capacity: Introduces a new 'ofrep-cache-capacity' flag to control the maximum number of selectors cached.
  • Documentation Update: Adds documentation for clients on how to use 'If-None-Match' headers and the implications of ETag usage.

@toddbaert toddbaert force-pushed the feat/ofrep-evaluation-caching branch from a5c7c8d to c1fefab Compare January 30, 2026 17:12

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This PR introduces an ETag-based caching mechanism for OFREP bulk evaluations, which is a great feature for performance. The implementation is well-structured, with a new SelectorVersionTracker to manage ETags per selector. The caching is aware of flag configuration changes through a watch mechanism on the store. The changes are accompanied by good documentation and comprehensive tests.

I've found a potential race condition in the cache eviction logic that could lead to the cache growing beyond its configured capacity. I've also included a couple of suggestions to improve logging. Overall, this is a solid contribution.

return flags, queryMeta, nil
}

// watchSelector returns a channel that will be closed when the flags matching the given selector are modified.
Member Author


Here, we "hook in" to memdb's listener for the selector, and use it to invalidate the cache.

}

// SelectorVersionTracker tracks content hashes for selectors to enable ETag-based caching.
type SelectorVersionTracker struct {
Member Author


This object handles the caching, and hooks into the store for invalidation whenever a selector it's tracking is updated (the same way we fire messages to listening providers over gRPC when selectors change).

@toddbaert toddbaert force-pushed the feat/ofrep-evaluation-caching branch from ea65da0 to 28ee86e Compare January 30, 2026 18:00
@toddbaert toddbaert force-pushed the feat/ofrep-evaluation-caching branch 2 times, most recently from 0c3d99e to 7414115 Compare January 30, 2026 20:13
toddbaert and others added 4 commits January 30, 2026 15:34
- sends per-selector ETag for caching
- ensures that ETags will be consistent across replicas
- skips evaluation entirely if ETag matches

Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
@toddbaert toddbaert force-pushed the feat/ofrep-evaluation-caching branch from 7414115 to 95aec74 Compare January 30, 2026 20:34
Member

@erka erka left a comment


This approach feels a bit hacky. While it may be fine for internal company use, it could cause issues in an open-source context. Users may run into problems, particularly if they rely on bulk evaluation with their custom clients (someone really likes to generate their own) that don’t match flagd’s/OpenFeature assumptions. It could also make migration between OFREP providers and flagd unnecessarily painful, and custom logic in the web JS provider may cause issues for other OFREP providers.

I think the evaluation context needs to be hashed as part of the ETag to make this implementation safe. It should still be faster than the other PR.


toddbaert commented Feb 2, 2026

This approach feels a bit hacky. While it may be fine for internal company use, it could cause issues in an open-source context. Users may run into problems, particularly if they rely on bulk evaluation with their custom clients (someone really likes to generate their own) that don’t match flagd’s/OpenFeature assumptions. It could also make migration between OFREP providers and flagd unnecessarily painful, and custom logic in the web JS provider may cause issues for other OFREP providers.

I think the evaluation context needs to be hashed as part of the ETag to make this implementation safe. It should still be faster than the other PR.

I suppose that if we hash the context and add that as a factor in the ETag, we don't need to worry about this at all - basically we will be able to enforce that opinion in flagd itself, instead of pushing it into the OFREP provider generally. I think that's a good point, @erka.


tangenti commented Feb 5, 2026

Could you share more context of the motivation here?

  1. This proposes server-side caching, which I assume is because the clients require lower latency when it comes to cache invalidation? Would client-side caching with a TTL work for the use case?

  2. As Roman mentioned, we should hash the context into ETag as well.

  3. With point 2, you will need a maximum size for the cache entries.

  4. I wonder why the caching and invalidation are done per selector? With per-selector caching, you will need a separate implementation for single-flag caching, I assume?


toddbaert commented Feb 5, 2026

Could you share more context of the motivation here?

  1. This proposes a server-side caching, which I assume because the clients require a lower latency when it comes to cache invalidation? Would client-side caching with TTL work for the use case?
  2. As Roman mentioned, we should hash the context into ETag as well.
  3. With the point 2, you will need to have a maximum size for the cache entries.
  4. I wonder why the caching and invalidation are done per selector? With a per-selector caching, you will need a separate implementation for single flag caching I assume?

  1. There's already implicit client-side caching in the static context paradigm. Flags are generally evaluated in bulk and cached. This specifically addresses the polling for changes outlined in OFREP. The polling is meant to detect changes in the ruleset, which require re-evaluation. This implementation avoids that re-evaluation when we can.

  2. Agreed, in the condition we don't want to do this, which would accomplish the same thing, but as @erka pointed out it's a lot more opinionated.

  3. Agreed.

  4. The selector corresponds to the set of flags a client is interested in. If the set they are interested in has not changed, and their context has not changed, there's no reason to re-evaluate. OFREP doesn't concern itself much with single flag caching; the bulk evaluation is intended for client-side use, and that's the only endpoint that has caching implied, per spec.

@tangenti


Thanks for the explanation. Could you help me understand why single-flag caching is not an interest of OFREP? We're exploring adopting OFREP soon.
