GraphQL: Report multiple query errors #4177

Open
wants to merge 1 commit into
base: master

marcotc
Member

@marcotc marcotc commented Nov 30, 2024

Captures GraphQL error information as span events:
Screenshot 2025-01-17 at 3 59 23 PM

This is necessary because each query can have multiple errors (see the GraphQL spec for the "errors" field), which cannot be reported using span tags (span tags only support one error per span).
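
For illustration, a minimal sketch in plain Ruby of why one set of tags is not enough, assuming a response hash shaped as the GraphQL spec describes. `ErrorEvent` and `events_for` are hypothetical stand-ins rather than the dd-trace-rb API, the `error.message` tag key is used only as an example, and the error messages are made up; only the event name `dd.graphql.query.error` comes from this PR.

```ruby
# Hypothetical stand-in for a span event; not the real Datadog::Tracing::SpanEvent.
ErrorEvent = Struct.new(:name, :attributes)

# Build one event per entry in the response's "errors" array.
def events_for(response)
  (response['errors'] || []).map do |err|
    ErrorEvent.new(
      'dd.graphql.query.error',
      { 'message' => err['message'], 'path' => err['path'] }
    )
  end
end

# Made-up response with two independent errors.
response = {
  'data' => { 'user' => nil, 'posts' => nil },
  'errors' => [
    { 'message' => 'User not found', 'path' => ['user'] },
    { 'message' => 'Posts unavailable', 'path' => ['posts'] },
  ],
}

# With tags, each error overwrites the previous one under the same key.
tags = {}
response['errors'].each { |err| tags['error.message'] = err['message'] }

# With events, every error is kept.
events = events_for(response)
puts "tags keep #{tags.size} message, events keep #{events.size}"
```

Tags behave like a hash keyed by name, so the second error replaces the first, while the list of events grows by one entry per error.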

Change log entry
GraphQL query errors are now reported as Span Events. This includes support for multiple errors, if present.

How to test the change?
All changes have unit tests and system-tests: DataDog/system-tests#3840

@github-actions github-actions bot added integrations Involves tracing integrations tracing labels Nov 30, 2024
@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Nov 30, 2024

Datadog Report

Branch report: graphql-error-event
Commit report: a5b863e
Test service: dd-trace-rb

✅ 0 Failed, 22066 Passed, 1476 Skipped, 5m 30.76s Total Time

@codecov-commenter

codecov-commenter commented Nov 30, 2024

Codecov Report

Attention: Patch coverage is 98.30508% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.72%. Comparing base (8dba6cb) to head (cf6efdc).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...cing/contrib/graphql/support/application_schema.rb | 66.66% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4177   +/-   ##
=======================================
  Coverage   97.72%   97.72%           
=======================================
  Files        1368     1368           
  Lines       82997    83046   +49     
  Branches     4219     4222    +3     
=======================================
+ Hits        81105    81157   +52     
+ Misses       1892     1889    -3     


@pr-commenter

pr-commenter bot commented Nov 30, 2024

Benchmarks

Benchmark execution time: 2025-01-28 00:07:46

Comparing candidate commit cf6efdc in PR branch graphql-error-event with baseline commit 8dba6cb in branch master.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 29 metrics, 2 unstable metrics.

scenario:profiler - Allocations (profiling disabled)

  • 🟥 throughput [-448576.194op/s; -439911.897op/s] or [-8.752%; -8.583%]

scenario:profiler - Allocations (profiling enabled)

  • 🟥 throughput [-444491.310op/s; -435607.794op/s] or [-8.750%; -8.575%]

@marcotc marcotc force-pushed the graphql-error-event branch from af565a0 to 10f8fe3 Compare January 17, 2025 23:32
@github-actions github-actions bot added the core Involves Datadog core libraries label Jan 17, 2025
@marcotc marcotc force-pushed the graphql-error-event branch from 10f8fe3 to 523aa51 Compare January 17, 2025 23:54

github-actions bot commented Jan 17, 2025

👋 Hey @marcotc, please fill "Change log entry" section in the pull request description.

If changes need to be present in CHANGELOG.md you can state it this way

**Change log entry**

Yes. A brief summary to be placed into the CHANGELOG.md

(possible answers Yes/Yep/Yeah)

Or you can opt out like that

**Change log entry**

None.

(possible answers No/Nope/None)

Visited at: 2025-01-18 00:02:08 UTC

@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Jan 23, 2025

Datadog Report

Branch report: graphql-error-event
Commit report: cf6efdc
Test service: dd-trace-rb

✅ 0 Failed, 22119 Passed, 1476 Skipped, 5m 48.31s Total Time
⌛ 13 Performance Regressions

⌛ Performance Regressions vs Default Branch (13)

This report shows up to 5 performance regressions.

  • Rails integration tests for an application with a basic route GET request with an event-triggering request in IP behaves like normal with tracing disable is expected to have 0 items - rspec 2.97s (+2.42s, +443%) - Details
  • Rails integration tests for an application with a basic route GET request with an event-triggering request in route parameter behaves like a trace with AppSec api security tags with api security enabled is expected not to be empty - rspec 3.02s (+2.44s, +422%) - Details
  • Rails integration tests for an application with a basic route GET request with an event-triggering request in route parameter is expected to be ok - rspec 3.11s (+2.54s, +445%) - Details
  • Rails integration tests for an application with a basic route GET request with an event-triggering request in query string behaves like a trace with AppSec events is expected to be a kind of String - rspec 3.35s (+2.77s, +477%) - Details
  • Rails integration tests for an application with a basic route GET request with an event-triggering request in route parameter behaves like a trace with AppSec events is expected to be a kind of String - rspec 3.03s (+2.46s, +432%) - Details

@marcotc marcotc force-pushed the graphql-error-event branch from 3b8bf6c to cf6efdc Compare January 27, 2025 23:43
@marcotc marcotc marked this pull request as ready for review January 28, 2025 22:29
@marcotc marcotc requested review from a team as code owners January 28, 2025 22:29
Contributor

@brett0000FF brett0000FF left a comment

Approving docs change. Thanks!

Member

@ivoanjo ivoanjo left a comment

👍 LGTM! Looks pretty slick!

Member

I was curious what this did exactly and there are excellent docs to explain it! https://github.com/DataDog/dd-trace-rb/blob/502c4da38bffc97097df024e97047277346fd06a/docs/ForcingSystemTests.md :D

Comment on lines +883 to 884
| `with_unified_tracer` | `DD_TRACE_GRAPHQL_WITH_UNIFIED_TRACER` | `Bool` | (Recommended) Enable to instrument with `UnifiedTrace` tracer for `graphql` >= v2.2, **enabling support for Endpoints list** in the Service Catalog. `with_deprecated_tracer` has priority over this. Default is `false`, using `GraphQL::Tracing::DataDogTrace` instead | `false` |
| `with_deprecated_tracer` | | `Bool` | Enable to instrument with deprecated `GraphQL::Tracing::DataDogTracing`. This has priority over `with_unified_tracer`. Default is `false`, using `GraphQL::Tracing::DataDogTrace` instead | `false` |
Member

Minor: I'm assuming we haven't flipped the default for backwards-compatibility. But maybe we could be a bit clearer in here that that's the reason? E.g. something like "Due to backwards compatibility, this is not the default, but we strongly suggest enabling this if possible" or something like that?

(Or we could say, "this will be the default for dd-trace-rb 3.x"...)

Comment on lines +168 to +170
if (before_callable = before || before_block)
  before_callable.call(span)
end
Member

Worth perhaps raising if there's a before && before_block, just to make sure we don't accidentally introduce bugs in the future?
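
A minimal sketch of what such a guard could look like. `run_before` is a hypothetical helper written for illustration, not the method in this PR; only the `before` / `before_block` names and the inner `if` come from the diff above.

```ruby
# Hypothetical helper: accepts the callback either as `before:` or as a block.
def run_before(span, before: nil, &before_block)
  # Guard suggested in the review: reject ambiguous calls up front.
  if before && before_block
    raise ArgumentError, 'pass either `before:` or a block, not both'
  end

  if (before_callable = before || before_block)
    before_callable.call(span)
  end
  span
end

# An array stands in for a span so the calls leave a visible trace.
span = []
run_before(span, before: ->(s) { s << :from_keyword })
run_before(span) { |s| s << :from_block }
```

Without the guard, a call that passes both would silently ignore the block, since `before || before_block` picks the keyword argument first.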

span.span_events << Datadog::Tracing::SpanEvent.new(
  Ext::EVENT_QUERY_ERROR,
  attributes: {
    message: err['message'],
Member

Minor: Can this be e.message?

Comment on lines +209 to +230
        locations: serialize_error_locations(err['locations']),
        path: err['path'],
      }
    )
  end
end

# Serialize error's `locations` array as an array of Strings, given
# Span Events do not support hashes nested inside arrays.
#
# Here's an example in which `locations`:
# [
#   {"line" => 3, "column" => 10},
#   {"line" => 7, "column" => 8},
# ]
# is serialized as:
# ["3:10", "7:8"]
def serialize_error_locations(locations)
  locations.map do |location|
    "#{location['line']}:#{location['column']}"
  end
end
Member

Are these locations a GraphQL thing? They're not backtrace locations, right?
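
For context, the GraphQL spec defines `locations` on an error entry as a list of `line` / `column` positions in the query document, not positions in a Ruby backtrace. A minimal standalone sketch: the method body is copied from the diff above, and the error hash is made-up sample data.

```ruby
# Same logic as the method under review, reproduced so it runs standalone.
def serialize_error_locations(locations)
  locations.map do |location|
    "#{location['line']}:#{location['column']}"
  end
end

# An error entry shaped as the GraphQL spec describes (values are made up).
# Each location points at a line and column of the query text.
error = {
  'message' => 'GraphQL error',
  'locations' => [{ 'line' => 3, 'column' => 10 }, { 'line' => 7, 'column' => 8 }],
  'path' => ['graphqlError'],
}

puts serialize_error_locations(error['locations']).inspect
# prints ["3:10", "7:8"]
```

The sketch assumes `locations` is always present; an error entry without it would need a `nil` check before calling `map`.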

Comment on lines +12 to +13
ENV_WITH_UNIFIED_TRACER: string
EVENT_QUERY_ERROR: String
Member

Minor: I like it when we inline the actual strings here because we usually use the constants everywhere and it seems slightly weird to me if there isn't a test that fails if we rename the constants. Having the values here is equivalent to having a test, and means we need to change them in two places and thus that seems like a deliberate, rather than an accidental thing :)

Suggested change
- ENV_WITH_UNIFIED_TRACER: string
- EVENT_QUERY_ERROR: String
+ ENV_WITH_UNIFIED_TRACER: "DD_TRACE_GRAPHQL_WITH_UNIFIED_TRACER"
+ EVENT_QUERY_ERROR: "dd.graphql.query.error"
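
The same idea expressed as a plain Ruby check, as a rough sketch: the top-level `Ext` module and the `EXPECTED_VALUES` table are hypothetical stand-ins for the real constants module and for the values pinned in the signature file. The two string values are the ones from the suggestion above.

```ruby
# Hypothetical stand-in for the integration's constants module.
module Ext
  ENV_WITH_UNIFIED_TRACER = 'DD_TRACE_GRAPHQL_WITH_UNIFIED_TRACER'
  EVENT_QUERY_ERROR = 'dd.graphql.query.error'
end

# Second, independent copy of the values, playing the role of the
# literal types in the signature file.
EXPECTED_VALUES = {
  ENV_WITH_UNIFIED_TRACER: 'DD_TRACE_GRAPHQL_WITH_UNIFIED_TRACER',
  EVENT_QUERY_ERROR: 'dd.graphql.query.error',
}.freeze

# Any constant whose value drifted from the pinned copy shows up here.
mismatches = EXPECTED_VALUES.reject { |name, value| Ext.const_get(name) == value }
puts "mismatched constants: #{mismatches.keys.inspect}"
```

Changing a value in only one of the two places makes the check fail, which is the "deliberate rather than accidental" property described above.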

Comment on lines +155 to +186
describe 'query with a GraphQL error' do
  subject(:result) { schema.execute(query: 'query Error{ graphqlError }', variables: { var: 1 }) }

  let(:graphql_execute) { spans.find { |s| s.name == 'graphql.execute' } }

  it 'creates query span for error' do
    expect(result.to_h['errors'][0]['message']).to eq('GraphQL error')
    expect(result.to_h['data']).to eq('graphqlError' => nil)

    expect(graphql_execute.resource).to eq('Error')
    expect(graphql_execute.service).to eq(service)
    expect(graphql_execute.type).to eq('graphql')

    expect(graphql_execute.get_tag('graphql.source')).to eq('query Error{ graphqlError }')

    expect(graphql_execute.get_tag('graphql.operation.type')).to eq('query')
    expect(graphql_execute.get_tag('graphql.operation.name')).to eq('Error')

    expect(graphql_execute.events).to contain_exactly(
      a_span_event_with(
        name: 'dd.graphql.query.error',
        attributes: {
          'message' => 'GraphQL error',
          'type' => 'GraphQL::ExecutionError',
          'stacktrace' => include(__FILE__),
          'locations' => ['1:14'],
          'path' => ['graphqlError'],
        }
      )
    )
  end
end
Member

Maybe worth adding a test for multiple errors? That seems like a big part of the feature.

Comment on lines +42 to +44
class Error
  def to_h: -> Hash[String, untyped]
end
Member

I'm curious -- why do we need to turn the object into a hash first? Is it harder to get the values we want without the additional intermediate hash?

    span.set_tag("graphql.variables.#{key}", value)
  end
end
trace(
Contributor

Nitpick: Can we extract those callable lambda and procs?

I assume they are static, and it looked weird to me that we are creating new ones in memory on every invocation.
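
A rough sketch of the difference, assuming the callables capture no per-call state. `trace`, `execute_inline`, `execute_extracted`, and `SET_KIND` are hypothetical names for illustration, not the code in this PR.

```ruby
# Hypothetical tracer helper: calls the hook with the span.
def trace(span, before:)
  before.call(span)
  span
end

# Per-call allocation: a new lambda object is created on every invocation.
def execute_inline(span)
  hook = ->(s) { s[:kind] = 'graphql' }
  trace(span, before: hook)
  hook
end

# Extracted: the lambda captures no local state, so one shared,
# frozen instance can be reused by every call.
SET_KIND = ->(s) { s[:kind] = 'graphql' }.freeze

def execute_extracted(span)
  trace(span, before: SET_KIND)
  SET_KIND
end

inline_reused = execute_inline({}).equal?(execute_inline({}))
extracted_reused = execute_extracted({}).equal?(execute_extracted({}))
puts "inline reused: #{inline_reused}, extracted reused: #{extracted_reused}"
```

This only applies when the assumption holds: a lambda that closes over arguments of the enclosing method, such as the current query, cannot be hoisted into a constant as-is.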

Labels
core Involves Datadog core libraries integrations Involves tracing integrations tracing
5 participants