Add Azure Blob and ADLS support#1279
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ddl-ebrown The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
- Azure supports two different types of blob storage: * Blob * ADLS Gen2 - The setup and configuration of these alternatives varies, as do the Azure SDKs used to access them. Provide support for both of these options to sidekick Signed-off-by: ddl-ebrown <ethan.brown@dominodatalab.com>
4da5fd0 to
6b9a0d0
Compare
|
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
|
/remove-lifecycle stale |
leogr
left a comment
There was a problem hiding this comment.
Hi @ddl-ebrown
Thanks for this PR and sorry for the long delay.
I reviewed this PR and found some issues (see my suggestions below).
Moreover, every output has its defaults set via v.SetDefault(...) and its MinimumPriority validated via checkPriority() in config.go.
This is missing for Blob/ADLS configs.
You should add something like the following near the other Azure defaults:
v.SetDefault("Azure.Blob.Account", "")
v.SetDefault("Azure.Blob.Prefix", "falco")
v.SetDefault("Azure.Blob.Container", "")
v.SetDefault("Azure.Blob.MinimumPriority", "")
v.SetDefault("Azure.ADLS.Account", "")
v.SetDefault("Azure.ADLS.Prefix", "falco")
v.SetDefault("Azure.ADLS.Container", "")
v.SetDefault("Azure.ADLS.MinimumPriority", "")
And validation near the other checkPriority calls:
c.Azure.Blob.MinimumPriority = checkPriority(c.Azure.Blob.MinimumPriority)
c.Azure.ADLS.MinimumPriority = checkPriority(c.Azure.ADLS.MinimumPriority)
| if err != nil { | ||
| c.setBlobErrorMetrics() | ||
| utils.Log(utils.ErrorLvl, c.OutputType+" Blob", | ||
| fmt.Sprintf("Error with acccount %s: %v", serviceURL, err)) |
There was a problem hiding this comment.
| fmt.Sprintf("Error with acccount %s: %v", serviceURL, err)) | |
| fmt.Sprintf("Error with account %s: %v", serviceURL, err)) |
| attribute.String("status", Error)).Inc() | ||
| } | ||
|
|
||
| func (c *Client) UploadBlob(falcopayload types.FalcoPayload) { |
There was a problem hiding this comment.
Every output function in the codebase starts by incrementing the Total counter (e.g., EventHubPost at line 37 does c.Stats.AzureEventHub.Add(Total, 1)). Please add at the top of the function: c.Stats.AzureBlob.Add(Total, 1)
Same for UploadADLS: c.Stats.AzureADLS.Add(Total, 1)
| prefix = c.Config.Azure.Blob.Prefix | ||
| } | ||
|
|
||
| serviceURL := fmt.Sprintf("https://%s.dfs.core.windows.net/", c.Config.Azure.Blob.Account) |
There was a problem hiding this comment.
Shouldn't it be https://%s.blob.core.windows.net/ ?
|
|
||
| // upload the file to the specified container with the specified blob name | ||
| // TODO: should any part of the response be validated here? aws s3 client performs a few checks | ||
| _, err = client.UploadStream(context.TODO(), c.Config.Azure.Blob.Container, key, bytes.NewReader(f), nil) |
There was a problem hiding this comment.
Without a timeout, a hanging upload will block the goroutine indefinitely. Please add a timeout to this and UploadADLS as well
| c.PromStats.Outputs.With(map[string]string{"destination": "azdatalake", "status": Error}).Inc() | ||
| c.OTLPMetrics.Outputs.With(attribute.String("destination", "azdatalake"), | ||
| attribute.String("status", Error)).Inc() | ||
| utils.Log(utils.ErrorLvl, c.OutputType+" S3", err.Error()) |
There was a problem hiding this comment.
| utils.Log(utils.ErrorLvl, c.OutputType+" S3", err.Error()) | |
| utils.Log(utils.ErrorLvl, c.OutputType+" Blob", err.Error()) |
| } | ||
|
|
||
| go c.CountMetric("outputs", 1, []string{"output:azblob", "status:ok"}) | ||
| c.PromStats.Outputs.With(map[string]string{"destination": "azblob", "status": "ok"}).Inc() |
There was a problem hiding this comment.
| c.PromStats.Outputs.With(map[string]string{"destination": "azblob", "status": "ok"}).Inc() | |
| c.PromStats.Outputs.With(map[string]string{"destination": "azblob", "status": OK}).Inc() |
| if err != nil { | ||
| c.setADLSErrorMetrics() | ||
| utils.Log(utils.ErrorLvl, c.OutputType+" ADLS Gen2 Blob", | ||
| fmt.Sprintf("Error with acccount %s: %v", serviceURL, err)) |
There was a problem hiding this comment.
| fmt.Sprintf("Error with acccount %s: %v", serviceURL, err)) | |
| fmt.Sprintf("Error with account %s: %v", serviceURL, err)) |
| } | ||
|
|
||
| go c.CountMetric("outputs", 1, []string{"output:azdatalake", "status:ok"}) | ||
| c.PromStats.Outputs.With(map[string]string{"destination": "azdatalake", "status": "ok"}).Inc() |
There was a problem hiding this comment.
| c.PromStats.Outputs.With(map[string]string{"destination": "azdatalake", "status": "ok"}).Inc() | |
| c.PromStats.Outputs.With(map[string]string{"destination": "azdatalake", "status": OK}).Inc() |
| go azureClient.EventHubPost(falcopayload) | ||
| } | ||
|
|
||
| if config.Azure.Blob.Container != "" && (falcopayload.Priority >= types.Priority(config.Azure.Blob.MinimumPriority) || falcopayload.Rule == testRule) { |
There was a problem hiding this comment.
The azureClient is only created in main.go when config.Azure.EventHub.Name != "".
If a user configures only Blob or only ADLS (without EventHub), azureClient will be nil, and these lines will panic with a nil pointer dereference.
The initialization in main.go needs to be updated so azureClient is also created when config.Azure.Blob.Container != "" or config.Azure.ADLS.Container != "". See how awsClient is created when any AWS sub-service is configured.
Also, EnabledOutputs needs to be appended for each enabled sub-service (e.g., "AzureBlob", "AzureADLS")
| AzureBlob *expvar.Map | ||
| AzureADLS *expvar.Map |
There was a problem hiding this comment.
AzureBlob and AzureADLS should be initialized in getInitStats() in stats.go
First take on adding Azure Blob and ADLS Gen 2 write support as backends for sidekick. We're testing this internally right now to make sure everything works as expected, since there are no tests in this repo that could properly vet things.
What type of PR is this?
/kind feature
/area outputs
What this PR does / why we need it:
Azure supports two different types of blob storage:
The setup and configuration of these alternatives varies, as do the Azure SDKs used to access them.
Provide support for both of these options to sidekick
Which issue(s) this PR fixes:
Fixes # 1209
Special notes for your reviewer: