feat: add distributed tracing for webhook handling and PipelineRun timing #2605

Open
ci-operator wants to merge 2 commits into tektoncd:main from ci-operator:distributed-tracing

Conversation

@ci-operator

📝 Description of the Change

Add OpenTelemetry distributed tracing to Pipelines-as-Code. When tracing is enabled via the pipelines-as-code-config-observability ConfigMap, PaC emits trace spans for webhook event processing and PipelineRun lifecycle timing.

Controller: Emits a PipelinesAsCode:ProcessEvent span covering the full webhook event lifecycle — from SCM event receipt through PipelineRun creation. Propagates trace context onto created PipelineRuns via the tekton.dev/pipelinerunSpanContext annotation, enabling end-to-end traces when Tekton Pipelines also has tracing enabled.
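As a rough illustration of the propagation step, the sketch below stores a JSON-encoded carrier under the annotation key named above. It assumes the carrier holds a W3C Trace Context `traceparent` entry, which is what OpenTelemetry's TraceContext propagator writes. The helper names `buildTraceparent` and `annotateWithSpanContext` and the sample IDs are hypothetical, and the real change presumably uses the OpenTelemetry propagator rather than formatting the header by hand.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spanContextAnnotation is the annotation key named in the PR description.
const spanContextAnnotation = "tekton.dev/pipelinerunSpanContext"

// buildTraceparent formats a W3C Trace Context "traceparent" value:
// version "00", a 32-hex trace ID, a 16-hex parent span ID, and 2-hex flags.
// Hypothetical helper; an OpenTelemetry propagator would normally do this.
func buildTraceparent(traceID, spanID string, sampled bool) string {
	flags := "00"
	if sampled {
		flags = "01"
	}
	return fmt.Sprintf("00-%s-%s-%s", traceID, spanID, flags)
}

// annotateWithSpanContext JSON-encodes the carrier and stores it under the
// span context annotation. The annotations map must be non-nil.
func annotateWithSpanContext(annotations, carrier map[string]string) error {
	b, err := json.Marshal(carrier)
	if err != nil {
		return fmt.Errorf("marshal span context carrier: %w", err)
	}
	annotations[spanContextAnnotation] = string(b)
	return nil
}

func main() {
	// Sample IDs for illustration only.
	carrier := map[string]string{
		"traceparent": buildTraceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true),
	}
	annotations := map[string]string{}
	if err := annotateWithSpanContext(annotations, carrier); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(annotations[spanContextAnnotation])
}
```

A downstream reconciler that understands the annotation can decode the JSON back into a carrier and continue the same trace.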

Watcher: Emits waitDuration (creation → start) and executeDuration (start → completion) timing spans for completed PipelineRuns, using resource timestamps for accurate wall-clock timing.

Tracing is configured through the existing observability ConfigMap with three new keys: tracing-protocol, tracing-endpoint, and tracing-sampling-rate. The controller's Knative observability configurator is pointed at the correct PaC-specific ConfigMap (pipelines-as-code-config-observability).
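The following sketch shows one way the three keys could be read from the ConfigMap's data. Only the key names come from this PR. The validation rules are assumptions: an empty `tracing-protocol` is treated as "tracing disabled", an endpoint is required otherwise, and the sampling rate is expected to be a fraction between 0 and 1. The `tracingConfig` type, `parseTracingConfig` function, and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// tracingConfig mirrors the three new ConfigMap keys. Hypothetical type.
type tracingConfig struct {
	Protocol     string
	Endpoint     string
	SamplingRate float64
}

// parseTracingConfig reads the tracing keys from ConfigMap data.
// Assumed rules (not confirmed by the PR): empty protocol disables tracing,
// and the sampling rate must lie in [0, 1].
func parseTracingConfig(data map[string]string) (tracingConfig, error) {
	cfg := tracingConfig{
		Protocol: data["tracing-protocol"],
		Endpoint: data["tracing-endpoint"],
	}
	if cfg.Protocol == "" {
		return cfg, nil // tracing disabled under the assumption above
	}
	if cfg.Endpoint == "" {
		return cfg, fmt.Errorf("tracing-endpoint must be set when tracing-protocol is %q", cfg.Protocol)
	}
	if raw := data["tracing-sampling-rate"]; raw != "" {
		rate, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			return cfg, fmt.Errorf("invalid tracing-sampling-rate %q: %w", raw, err)
		}
		// Written this way so that NaN is rejected as well.
		if !(rate >= 0 && rate <= 1) {
			return cfg, fmt.Errorf("tracing-sampling-rate %v is outside [0, 1]", rate)
		}
		cfg.SamplingRate = rate
	}
	return cfg, nil
}

func main() {
	// Sample values for illustration only.
	cfg, err := parseTracingConfig(map[string]string{
		"tracing-protocol":      "grpc",
		"tracing-endpoint":      "http://otel-collector.example:4317",
		"tracing-sampling-rate": "0.25",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```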

🔗 Linked GitHub Issue

https://issues.redhat.com/browse/SRVKP-8544

🧪 Testing Strategy

  • Unit tests
  • Integration tests
  • End-to-end tests
  • Manual testing
  • Not Applicable

Manually tested end-to-end with an OpenTelemetry collector and Tempo backend. Verified traces appear with correct span names, attributes, and parent-child relationships across PaC and Tekton Pipelines reconciler spans.

🤖 AI Assistance

  • I have not used any AI assistance for this PR.
  • I have used AI assistance for this PR.

AI (Claude) was used for code generation, debugging, and documentation. All code has been reviewed, tested, and deployed. Co-authored-by trailers are on each commit.

✅ Submitter Checklist

  • 📝 My commit messages are clear, informative, and follow the project's commit message guide.
  • ✨ I have ensured my commit message prefix (e.g., fix:, feat:) matches the "Type of Change" I selected above.
  • ♽ I have run make test locally. make lint requires golangci-lint which is not installed locally; CI will validate.
  • 📖 I have added or updated documentation for any user-facing changes.
  • 🧪 I have added sufficient unit tests for my code changes.
  • 🎁 I have added end-to-end tests where feasible.
  • 🔎 I have addressed any CI test flakiness or provided a clear reason to bypass it.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of Pipelines-as-Code by integrating OpenTelemetry distributed tracing. This allows operators and developers to gain deeper insights into the performance and flow of webhook event processing and the various stages of PipelineRun execution. By propagating trace context, it facilitates a unified view of operations across PaC and Tekton Pipelines, streamlining debugging and performance analysis.

Highlights

  • Distributed Tracing Implementation: Introduced OpenTelemetry distributed tracing to Pipelines-as-Code, enabling visibility into webhook event processing and PipelineRun lifecycle timing.
  • Webhook Event Tracing: The controller now emits a 'PipelinesAsCode:ProcessEvent' span, covering the full webhook event lifecycle from receipt to PipelineRun creation, with relevant VCS attributes.
  • PipelineRun Timing Spans: The watcher emits 'waitDuration' (creation to start) and 'executeDuration' (start to completion) spans for completed PipelineRuns, using accurate resource timestamps.
  • Trace Context Propagation: Trace context is propagated to created PipelineRuns via the 'tekton.dev/pipelinerunSpanContext' annotation, allowing for end-to-end traces when Tekton Pipelines also has tracing enabled.
  • Configurable Observability: Tracing is configured through the existing 'pipelines-as-code-config-observability' ConfigMap using new keys: 'tracing-protocol', 'tracing-endpoint', and 'tracing-sampling-rate'.
  • New Documentation: Added comprehensive documentation on how to enable and configure distributed tracing for Pipelines-as-Code.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces OpenTelemetry distributed tracing to Pipelines-as-Code. Key changes include integrating tracing into the event handling flow, propagating trace context to PipelineRuns via annotations, and emitting timing spans for PipelineRun lifecycle events. The observability configuration has been updated to include new tracing options, but the removal of existing metrics-protocol and metrics-endpoint configurations requires clarification and documentation. Additionally, an improvement opportunity was identified to ensure consistent tracing data by always setting VCS repository and revision attributes, even when empty.

I am having trouble creating individual review comments. Click here to see my feedback.

config/305-config-observability.yaml (24-25)

high

The metrics-protocol and metrics-endpoint configurations are being removed from the data section. This change was not explicitly mentioned in the pull request description, which focuses on adding tracing. If these metrics were actively used, their removal could be a breaking change or an unintended side effect. Please clarify if this removal is intentional and, if so, document it in the PR description or release notes.

pkg/adapter/adapter.go (212-217)

medium

For better consistency in tracing data, consider always setting the VCSRepositoryKey and VCSRevisionKey attributes, even if l.event.URL or l.event.SHA are empty. This ensures the attribute key is always present in the span, which can simplify querying and analysis in tracing backends. You could set them to an empty string or a placeholder like "unknown" if the values are not available, instead of omitting the attribute entirely.

if l.event.URL != "" {
	span.SetAttributes(tracing.VCSRepositoryKey.String(l.event.URL))
} else {
	span.SetAttributes(tracing.VCSRepositoryKey.String(""))
}
if l.event.SHA != "" {
	span.SetAttributes(tracing.VCSRevisionKey.String(l.event.SHA))
} else {
	span.SetAttributes(tracing.VCSRevisionKey.String(""))
}

@ci-operator ci-operator force-pushed the distributed-tracing branch from 393b9d3 to 3870a7f on March 25, 2026 15:04
@chmouel
Member

chmouel commented Mar 25, 2026

/ok-to-test

@chmouel
Member

chmouel commented Mar 27, 2026

@zakisk can you have a look pls

…ming

Emit a PipelinesAsCode:ProcessEvent span covering the full webhook
event lifecycle. Emit waitDuration and executeDuration timing spans
for completed PipelineRuns. Propagate trace context onto created
PipelineRuns via the tekton.dev/pipelinerunSpanContext annotation.

Configure the Knative observability framework to read tracing config
from the pipelines-as-code-config-observability ConfigMap. Add tracing
configuration guide and config examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ci-operator ci-operator force-pushed the distributed-tracing branch from cf3108c to 501e750 on March 27, 2026 19:14
@zakisk
Member

zakisk commented Mar 30, 2026

@ci-operator for the E2E run, we're working on a permission workaround in PR #2611

@chmouel
Member

chmouel commented Mar 30, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request implements distributed tracing for Pipelines-as-Code using OpenTelemetry. It adds logic to extract trace context from incoming webhook headers, propagate it to PipelineRuns via a new annotation, and emit timing spans for event processing and PipelineRun execution. The PR also includes configuration updates and new documentation. Feedback points out that an error during JSON marshalling of the span context is currently ignored and should be logged to assist with debugging.

Comment on lines +73 to +78
if jsonBytes, err := json.Marshal(carrier); err == nil {
	if existing := pipelineRun.GetAnnotations()[keys.SpanContextAnnotation]; existing != "" {
		logging.FromContext(ctx).Warnf("overwriting pre-existing %s annotation on PipelineRun template; honoring initiating event trace context", keys.SpanContextAnnotation)
	}
	annotations[keys.SpanContextAnnotation] = string(jsonBytes)
}

medium

The error from json.Marshal(carrier) is silently ignored. If marshalling fails for some reason, trace context propagation will silently fail. It would be better to log this error to aid in debugging potential tracing issues.

		if jsonBytes, err := json.Marshal(carrier); err != nil {
			logging.FromContext(ctx).Errorf("failed to marshal span context carrier: %v", err)
		} else {
			if existing := pipelineRun.GetAnnotations()[keys.SpanContextAnnotation]; existing != "" {
				logging.FromContext(ctx).Warnf("overwriting pre-existing %s annotation on PipelineRun template; honoring initiating event trace context", keys.SpanContextAnnotation)
			}
			annotations[keys.SpanContextAnnotation] = string(jsonBytes)
		}

@chmouel
Member

chmouel commented Mar 30, 2026

this conflicts with the recently merged 0faad24

@chmouel
Member

chmouel commented Mar 30, 2026

/ok-to-test


4 participants