title | summary | toc | docs_area |
---|---|---|---|
Log SQL Activity to Datadog |
Examples of common logging use cases and possible CockroachDB logging sink configurations. |
true |
manage |
{% include_cached new-in.html version="v23.1.4" %} This tutorial describes how to configure logging of [sampled_query
events]({% link {{ page.version.version }}/eventlog.md %}#sampled_query) to Datadog for finer granularity and long-term retention of SQL activity. The sampled_query
events contain common SQL event and execution details for sessions, transactions, and statements.
CockroachDB supports a built-in integration with Datadog which sends query events as logs via the Datadog HTTP API. This integration is the recommended path to achieve high throughput data ingestion, which will in turn provide more query events for greater workload observability.
{{site.data.alerts.callout_info}} {% include feature-phases/preview.md %} {{site.data.alerts.end}}
- In Datadog, navigate to Organization Settings > API keys.
- Follow the steps in the Datadog documentation on how to add an API key.
- Copy the newly created API key to be used in Step 2.
Configure an [HTTP network collector]({% link {{ page.version.version }}/configure-logs.md %}#output-to-http-network-collectors) by creating or modifying the [logs.yaml
file]({% link {{ page.version.version }}/configure-logs.md %}#yaml-payload).
{{site.data.alerts.callout_danger}}
Given the volume of sampled_query
events, do not write sampled_query
events to disk, or [file-groups
]({% link {{ page.version.version }}/configure-logs.md %}#output-to-files). Writing a high volume of sampled_query
events to a file group will unnecessarily consume cluster resources and impact workload performance.
To disable the creation of a telemetry file and avoid writing sampled_query
events and other [telemetry events]({% link {{ page.version.version }}/eventlog.md %}#telemetry-events) to disk, change the telemetry file-groups
setting from the [default of channels: [TELEMETRY]
]({% link {{ page.version.version }}/configure-logs.md %}#default-logging-configuration) to channels: []
.
{{site.data.alerts.end}}
In this logs.yaml
example:
- To send
sampled_query
events directly to Datadog without writing events to disk, override telemetry default configuration by settingfile-groups: telemetry: channels:
to[]
. - To connect to Datadog, replace
{DATADOG API KEY}
with the value you copied in Step 1. - To control the ingestion and potential drop rate for
sampled_query
events, configure the followingbuffering
values depending on your workload:
max-staleness
: The maximum time a log message will wait in the buffer before a flush is triggered. Set to0
to disable flushing based on elapsed time. Default:5s
flush-trigger-size
: The number of bytes that will trigger the buffer to flush. Set to0
to disable flushing based on accumulated size. Default:1MiB
. In this example, override to2.5MiB
.max-buffer-size
: The maximum size of the buffer: new log messages received when the buffer is full cause older messages to be dropped. Default:50MiB
{% include_cached copy-clipboard.html %}
sinks:
http-servers:
datadog:
channels: [TELEMETRY]
address: https://http-intake.logs.datadoghq.com/api/v2/logs
format: json
method: POST
compression: gzip
headers: {DD-API-KEY: "{DATADOG API KEY}"} # replace with actual API key
buffering:
format: json-array
max-staleness: 5s
flush-trigger-size: 2.5MiB # override default value
max-buffer-size: 50MiB
file-groups: # override default configuration
telemetry: # do not write telemetry events to disk
channels: [] # set to empty square brackets
Pass the [logs.yaml
file]({% link {{ page.version.version }}/configure-logs.md %}#yaml-payload) to the cockroach
process with either --log-config-file
or --log
flag.
Enable the [sql.telemetry.query_sampling.enabled
cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-telemetry-query-sampling-enabled) so that executed queries will emit an event on the telemetry [logging channel]({% link {{ page.version.version }}/logging-overview.md %}#logging-channels):
{% include_cached copy-clipboard.html %}
SET CLUSTER SETTING sql.telemetry.query_sampling.enabled = true;
Configure the following [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to a value that is dependent on the level of granularity you require and how much performance impact from frequent logging you can tolerate:
sql.telemetry.query_sampling.max_event_frequency
(default8
) is the max event frequency (events per second) at which we sample executed queries for telemetry. In practice, this means that we only sample an executed query if 1/max_event_frequency
seconds have elapsed since the last executed query was sampled. Sampling impacts the volume of query events emitted which can have downstream impact to workload performance and third-party processing costs. Slowly increase this sampling threshold and monitor potential impact.
{{site.data.alerts.callout_info}}
The sql.telemetry.query_sampling.max_event_frequency
cluster setting and the buffering
options in the logs.yaml
control how many events are emitted to Datadog and that can be potentially dropped. Adjust this setting and these options according to your workload, depending on the size of events and the queries per second (QPS) observed through monitoring.
{{site.data.alerts.end}}
- Navigate to Datadog > Logs.
- Filter by OTHERS > channel: TELEMETRY to see the logs for the query events that are emitted. For example:
- [Notable Event Types]({% link {{ page.version.version }}/eventlog.md %})
- [Configure logs]({% link {{ page.version.version }}/configure-logs.md %})
- [Cluster Settings]({% link {{ page.version.version }}/cluster-settings.md %})