[cache-processor] WIP: SetPaths draft #47353
One other idea I had was to stop registering the processors in the This has the advantage of getting rid of calls to |
Proposal: Lazy Initialization of the Cache Processor's File Store
The Problem
The basic problem is that processors often use `paths.Resolve` to find directories like "data" or "logs". This function uses a global variable for the base path, which is fine when a Beat runs as a standalone process.
But when a Beat is embedded as a receiver (e.g., `fbreceiver` in the OTel Collector), this global causes problems. Each receiver needs its own isolated state directory, and a single global path prevents this.
The `cache` processor currently tries to set up its file-based store in its `New` function, which is too early. It only has access to the global path, not the receiver-specific path that gets configured later.
The Solution
My solution is to initialize the cache's file store lazily.
Instead of creating the store in `cache.New`, I've added a `SetPaths(*paths.Path)` method to the processor. This method creates the file store and is wrapped in a `sync.Once` to make sure it only runs once. The processor's internal store object stays `nil` until `SetPaths` is called during pipeline construction.
How it Works
The path info gets passed down when a client connects to the pipeline. Here's the flow:
1. `x-pack/filebeat/fbreceiver`: `createReceiver` instantiates the processors (including `cache` with a `nil` store) and calls `instance.NewBeatForReceiver`.
2. `x-pack/libbeat/cmd/instance`: `NewBeatForReceiver` creates the `paths.Path` object from the receiver's specific configuration.
3. `libbeat/publisher/pipeline`: This `paths.Path` object is passed into the pipeline. When a client connects, the path is added to the `beat.ProcessingConfig`.
4. `libbeat/publisher/processing`: The processing builder gets this config and calls `group.SetPaths`, which passes the path down to each processor.
5. `libbeat/processors/cache`: `SetPaths` is finally called on the cache processor instance, and the `sync.Once` guard ensures the file store is created with the correct path.
Diagram
graph TD
    subgraph "libbeat/processors/cache (init)"
        A["init()"]
    end
    subgraph "libbeat/processors"
        B["processors.RegisterPlugin"]
        C{"registry"}
    end
    A --> B;
    B -- "Save factory" --> C;
    subgraph "x-pack/filebeat/fbreceiver"
        D["createReceiver"]
    end
    subgraph "libbeat/processors"
        E["processors.New(config)"]
        C -. "Lookup 'cache'" .-> E;
    end
    D --> E;
    D --> I;
    E --> G;
    subgraph "libbeat/processors/cache"
        G["cache.New()"] -- store=nil --> H{"cache"};
    end
    subgraph "x-pack/libbeat/cmd/instance"
        I["instance.NewBeatForReceiver"];
        I --> J{"paths.Path object"};
    end
    subgraph "libbeat/publisher/pipeline"
        J --> K["pipeline.New"];
        K --> L["ConnectWith"];
    end
    subgraph "libbeat/publisher/processing"
        L -- "Config w/ paths" --> N["builder.Create"];
        N --> O["group.SetPaths"];
    end
    subgraph "libbeat/processors/cache"
        O --> P["cache.SetPaths"];
        P --> Q["sync.Once"];
        Q -- "initialize store" --> H;
    end
Pros and Cons of This Approach
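For context on the trade-offs, the `group.SetPaths` step in the flow presumably relies on a runtime check for an optional interface. The sketch below shows one way such a dispatch could look. The `Processor` and `pathSetter` interfaces, the `group` struct, and both example processors are assumptions made for illustration and are not taken from the PR's code.

```go
package main

import "fmt"

// Path is a simplified stand-in for paths.Path.
type Path struct{ Data string }

// Processor is a pared-down stand-in for the libbeat processor interface.
type Processor interface {
	String() string
}

// pathSetter is the optional interface (hypothetical name). Only processors
// that need per-receiver paths implement it.
type pathSetter interface {
	SetPaths(*Path) error
}

// group is a hypothetical container of processors.
type group struct{ procs []Processor }

// SetPaths forwards the paths to every processor that opts in and silently
// skips the rest. This runtime type assertion is the implicit part.
func (g *group) SetPaths(p *Path) error {
	for _, proc := range g.procs {
		ps, ok := proc.(pathSetter)
		if !ok {
			continue
		}
		if err := ps.SetPaths(p); err != nil {
			return fmt.Errorf("setting paths on %s: %w", proc.String(), err)
		}
	}
	return nil
}

// cacheProc opts in by implementing SetPaths.
type cacheProc struct{ dataDir string }

func (c *cacheProc) String() string { return "cache" }
func (c *cacheProc) SetPaths(p *Path) error {
	c.dataDir = p.Data
	return nil
}

// dropProc does not implement SetPaths and is left untouched.
type dropProc struct{}

func (dropProc) String() string { return "drop_event" }

func main() {
	c := &cacheProc{}
	g := &group{procs: []Processor{dropProc{}, c}}
	if err := g.SetPaths(&Path{Data: "/tmp/receiver-a"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.dataDir) // prints "/tmp/receiver-a"
}
```

Nothing in the processor's type signature tells a caller whether paths will be delivered; that is only discoverable by reading the implementation.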
libbeat.setPaths interface feels a bit like magic, since the behavior changes at runtime depending on whether a processor implements it.
Alternatives Considered
Option 1: Add a `paths` argument to all processor constructors
The `paths` argument is not needed in many processors, so adding a rarely used option to the function signature is verbose.
Option 2: Refactor `processors` to introduce a "V2" interface
Proposed commit message