[DSEC] [SDS] [DO NOT MERGE] Data Security MVP#51830
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
|
@codex review |
|
|
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e1b7a0ed1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| rtloader_root=None, | ||
| python_home_3=None, | ||
| exclude_rtloader=False, | ||
| include_sds=False, |
There was a problem hiding this comment.
Propagate --include-sds into agent build tags
When omnibus and CI call dda inv agent.build --include-sds, this new option is accepted but never passed to compute_build_tags_for_flavor below, so the agent binary is still built without the sds tag and uses scanner_nosds.go while packaging the shared library. This makes the SDS-enabled package silently run the no-op scanner; please thread include_sds=include_sds into the build tag computation.
Useful? React with 👍 / 👎.
| s.Lock() | ||
| defer s.Unlock() | ||
|
|
||
| log.Infof("Scanning event: %s", string(event)) |
There was a problem hiding this comment.
Avoid logging unredacted scan inputs
For any integration that calls datadog_agent.scan or the new scan-and-submit path with a payload containing credentials/PII, this INFO log writes the original bytes before SDS has a chance to redact them. Since agent logs are commonly collected and shipped, this defeats the purpose of the scanner; log only non-sensitive metadata or remove this line.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Yes this is a POC just to show for now that events get scanner
This PR will not be merged as it is for now
| //sender.EventPlatformEvent(processed, C.GoString(eventType)) | ||
| // Submit the scanned payload as a regular Datadog event, keeping the JSON | ||
| // format in the event text, rather than forwarding it to the event platform. | ||
| sender.Event(metricsevent.Event{ |
There was a problem hiding this comment.
Submit scanned payloads to event platform
In the aggregator.scan_and_submit_event_platform_event path, this sends the scanned JSON as a regular Datadog event with hard-coded host/tags instead of forwarding it through EventPlatformEvent. Integrations using the new API for event-platform payloads will no longer hit the event-platform pipeline or preserve the intended payload semantics; the processed bytes should be submitted with the original event platform sender.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Yes comment is false but the code above will be updated (we are exploring for now different options which are intended only for poke)
Gitlab CI Configuration Changes
|
| Removed | Modified | Added | Renamed |
|---|---|---|---|
| 0 | 8 | 0 | 0 |
ℹ️ Diff available in the job log.
Go Package Import DifferencesBaseline: 2d4cf88
|
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c6b2a5995
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| rtloader_root=None, | ||
| python_home_3=None, | ||
| exclude_rtloader=False, | ||
| include_sds=False, |
There was a problem hiding this comment.
Pass the SDS flag into agent build tags
When dda inv agent.build --include-sds is used by the updated CI and omnibus paths, this new option is never forwarded to compute_build_tags_for_flavor in build(), so the agent is still compiled without the sds tag and uses the no-op implementation from scanner_nosds.go. That means shipped Linux/macOS packages can include libdd_sds while the binary never links or runs the real scanner; pass include_sds=include_sds when computing the agent build tags.
Useful? React with 👍 / 👎.
| s.Lock() | ||
| defer s.Unlock() | ||
|
|
||
| log.Infof("Scanning event: %s", string(event)) |
There was a problem hiding this comment.
For any check calling datadog_agent.scan or the new scan-and-submit path, this logs the full pre-redaction event at INFO level before SDS can remove secrets. Since the payload is expected to contain sensitive data, default agent logs will retain exactly the values SDS is meant to protect; remove this log or make it non-payload metadata only.
Useful? React with 👍 / 👎.
| //sender.EventPlatformEvent(processed, C.GoString(eventType)) | ||
| // Submit the scanned payload as a regular Datadog event, keeping the JSON | ||
| // format in the event text, rather than forwarding it to the event platform. | ||
| sender.Event(metricsevent.Event{ |
There was a problem hiding this comment.
Preserve event-platform submission after scanning
In the new scan_and_submit_event_platform_event flow, the scanned payload is no longer sent through sender.EventPlatformEvent; it is converted into a regular Datadog event with hard-coded host/tags instead. Integrations that call this new method expecting the same event-platform ingestion semantics as submit_event_platform_event will have their payloads routed to the wrong pipeline, so submit processed via EventPlatformEvent after scanning.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b595a6b596
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| s.Lock() | ||
| defer s.Unlock() | ||
|
|
||
| log.Infof("Scanning event: %s", string(event)) |
There was a problem hiding this comment.
When SDS is enabled, every call to Scan logs the full pre-redaction event at INFO before applying any rules. For the Python datadog_agent.scan path and the new event-platform scanning path, this is precisely the data expected to contain secrets/PII, so enabling SDS can leak sensitive payloads into Agent logs even when redaction succeeds.
Useful? React with 👍 / 👎.
| //sender.EventPlatformEvent(processed, C.GoString(eventType)) | ||
| // Submit the scanned payload as a regular Datadog event, keeping the JSON | ||
| // format in the event text, rather than forwarding it to the event platform. | ||
| sender.Event(metricsevent.Event{ |
There was a problem hiding this comment.
Route scanned payloads through EventPlatformEvent
This new rtloader API is documented and named as scan_and_submit_event_platform_event, but after scanning it submits a classic Datadog event instead of calling sender.EventPlatformEvent(processed, C.GoString(eventType)). Any integration using this path for event-platform payloads will stop sending to the event-platform forwarder and instead emit a regular event with hard-coded host/tags, so the backend intake and payload semantics are wrong.
Useful? React with 👍 / 👎.
| rtloader_root=None, | ||
| python_home_3=None, | ||
| exclude_rtloader=False, | ||
| include_sds=False, |
There was a problem hiding this comment.
Propagate include_sds into agent build tags
The new agent.build --include-sds option is never passed into compute_build_tags_for_flavor in this task, so building the agent with the option still omits the sds build tag and compiles the no-op scanner_nosds.go implementation. In the dev image path this can copy libdd_sds.so into the image while the agent binary cannot use it, making the advertised SDS-enabled build silently ineffective.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 26855fb9e2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| } | ||
|
|
||
| rawEvent := C.GoBytes(unsafe.Pointer(rawEventPtr), rawEventSize) | ||
| log.Infof("AIMENE: eventplatform: received: %s", string(rawEvent)) |
There was a problem hiding this comment.
Stop logging unredacted SDS payloads
When this new callback is used for sensitive-data scanning, the raw payload is written to the Agent log before SDS has a chance to redact it. That exposes exactly the secrets this path is meant to protect, and the later processed log can still contain sensitive data when no rule matches or scanning falls back to the original event.
Useful? React with 👍 / 👎.
| rtloader_root=None, | ||
| python_home_3=None, | ||
| exclude_rtloader=False, | ||
| include_sds=False, |
There was a problem hiding this comment.
Pass include_sds into the agent build tags
The new --include-sds option is accepted by agent.build, and CI/omnibus now pass it, but this function never forwards it to compute_build_tags_for_flavor when constructing build_tags below. As a result, those builds still omit the sds build tag and compile the no-op scanner_nosds.go path, so packaged agents will ship the SDS library but the new scan APIs won't actually redact anything.
Useful? React with 👍 / 👎.
| //sender.EventPlatformEvent(processed, C.GoString(eventType)) | ||
| // Submit the scanned payload as a regular Datadog event, keeping the JSON | ||
| // format in the event text, rather than forwarding it to the event platform. | ||
| sender.Event(metricsevent.Event{ |
There was a problem hiding this comment.
Submit scanned payloads through Event Platform
For callers of aggregator.scan_and_submit_event_platform_event, this replaces the event-platform submission with a regular Datadog event, so payloads meant for Event Platform ingestion are routed to the Events product with hard-coded metadata instead of the caller's event-platform stream. This changes the contract described by the rtloader method and prevents downstream Event Platform consumers from receiving the scanned payload.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eddd8399a5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| s.Lock() | ||
| defer s.Unlock() | ||
|
|
||
| log.Infof("Scanning event: %s", string(event)) |
There was a problem hiding this comment.
Avoid logging raw SDS payloads
When SDS scans payloads that contain credentials or PII, this INFO log writes the unredacted event to the Agent log before any redaction happens. That defeats the purpose of routing the data through Sensitive Data Scanner and can leak exactly the values SDS is meant to protect whenever datadog_agent.scan or the scan-and-submit path is used.
Useful? React with 👍 / 👎.
| //sender.EventPlatformEvent(processed, C.GoString(eventType)) | ||
| // Submit the scanned payload as a regular Datadog event, keeping the JSON | ||
| // format in the event text, rather than forwarding it to the event platform. | ||
| sender.Event(metricsevent.Event{ |
There was a problem hiding this comment.
Submit scanned payloads through the event-platform path
For callers using the new scan_and_submit_event_platform_event rtloader method, the existing unscanned path submits bytes via sender.EventPlatformEvent, and the new C/API docs also say this method submits an event-platform event; however this code sends a generic Datadog event with hard-coded host/tags instead. That changes routing and payload semantics for scanned events, so downstream event-platform consumers will not receive the data they requested.
Useful? React with 👍 / 👎.
What does this PR do?
Motivation
Describe how you validated your changes
Additional Notes