feat!: telemetry metrics updates as per semantic convention by AjmeraParth132 · Pull Request #2566 · googleapis/genai-toolbox

AjmeraParth132 · 2026-02-26T01:16:19Z

Description

This PR updates MCP telemetry to align with OTel semantic conventions by removing all existing transport-specific and HTTP-API metrics and introducing a unified, future-proof set. It adds:

- mcp.server.operation.duration to capture per-method request rate, latency, and errors across all MCP transports;
- mcp.server.session.duration to track session lifecycles for stdio and SSE;
- toolbox.server.mcp.active_sessions for real-time capacity, saturation, and leak detection;
- toolbox.tool.operation.duration to isolate backend tool execution time from MCP/protocol overhead, enabling clearer attribution of performance bottlenecks between the MCP layer and tool runtime.

Together, these changes simplify the metrics surface while significantly improving observability and debuggability with minimal runtime overhead.

PR Checklist

- Make sure you reviewed CONTRIBUTING.md
- Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
- Ensure the tests and linter pass
- Code coverage does not decrease (if any source code was changed)
- Appropriate docs were updated (if necessary)
- Make sure to add ! if this involves a breaking change

🛠️ Fixes #2222

gemini-code-assist · 2026-02-26T01:16:41Z

Summary of Changes

Hello @AjmeraParth132, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the telemetry system to align with OpenTelemetry semantic conventions, enhancing observability and debuggability. By replacing a fragmented set of metrics with a unified, future-proof collection, it provides clearer insights into the performance and behavior of MCP operations, sessions, and backend tool executions. This change simplifies metric collection while offering more granular data for identifying bottlenecks and understanding system health.

Highlights

New Metrics Introduced: Introduced four new OpenTelemetry semantic convention-aligned metrics: mcp.server.operation.duration for per-method request tracking, mcp.server.session.duration for session lifecycles, toolbox.server.mcp.active_sessions for real-time capacity, and toolbox.tool.operation.duration to isolate tool execution time.
Old Metrics Removed: Removed all existing transport-specific and HTTP-API metrics (toolbox.server.toolset.get.count, toolbox.server.tool.get.count, toolbox.server.tool.invoke.count, toolbox.server.mcp.sse.count, toolbox.server.mcp.post.count) to simplify the metrics surface.
Metric Attribute Propagation: Implemented a mechanism to propagate gen_ai and network-related attributes through the context for richer metric data, allowing for more detailed analysis of operations and sessions.
Session and Operation Duration Tracking: Added logic to track the duration of MCP sessions (stdio and SSE) and individual MCP JSON-RPC operations, including error types, providing comprehensive performance insights.

Changelog

internal/server/api.go
- Removed metric import.
- Removed old ToolsetGet, ToolGet, and ToolInvoke metric recording from API handlers.
internal/server/mcp.go
- Added sessionStart timestamp and sessionErr variable to track session duration and errors.
- Introduced mcp.server.session.duration metric recording for stdio and SSE sessions, including error types.
- Implemented toolbox.server.mcp.active_sessions metric to track active stdio and SSE sessions.
- Added operationStart timestamp and metricErrorType variable for processMcpMessage.
- Introduced mcp.server.operation.duration metric recording for MCP JSON-RPC operations, including gen_ai and network attributes.
- Integrated GenAIMetricAttrs into the context for attribute propagation to method handlers.
- Updated error handling to set metricErrorType for mcp.server.operation.duration.
internal/server/mcp/v20241105/method.go
- Imported time and metric packages.
- Populated GenAIMetricAttrs in the context for toolsCallHandler and promptsGetHandler.
- Added toolbox.tool.execution.duration metric recording for tool invocations, including gen_ai and network attributes.
internal/server/mcp/v20250326/method.go
- Imported time and metric packages.
- Populated GenAIMetricAttrs in the context for toolsCallHandler and promptsGetHandler.
- Added toolbox.tool.execution.duration metric recording for tool invocations, including gen_ai and network attributes.
internal/server/mcp/v20250618/method.go
- Imported time and metric packages.
- Populated GenAIMetricAttrs in the context for toolsCallHandler and promptsGetHandler.
- Added toolbox.tool.execution.duration metric recording for tool invocations, including gen_ai and network attributes.
internal/server/mcp/v20251125/method.go
- Imported time and metric packages.
- Populated GenAIMetricAttrs in the context for toolsCallHandler and promptsGetHandler.
- Added toolbox.tool.execution.duration metric recording for tool invocations, including gen_ai and network attributes.
internal/telemetry/instrumentation.go
- Removed old metric constants (toolsetGetCountName, toolGetCountName, toolInvokeCountName, mcpSseCountName, mcpPostCountName).
- Defined new OTel semantic convention metric constants (mcpOperationDurationName, mcpSessionDurationName, mcpActiveSessionsName, toolExecutionDurationName).
- Updated Instrumentation struct to use new metric types (Float64Histogram, Int64UpDownCounter).
- Replaced creation of old Int64Counter metrics with new Float64Histogram and Int64UpDownCounter metrics.
internal/util/util.go
- Added GenAIMetricAttrs struct to hold generic AI and network metric attributes.
- Implemented WithGenAIMetricAttrs to add GenAIMetricAttrs to the context.
- Implemented GenAIMetricAttrsFromContext to retrieve GenAIMetricAttrs from the context.

Activity

No human activity (comments, reviews) was detected on this pull request yet.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request refactors the telemetry system to align with OpenTelemetry semantic conventions, removing outdated metrics and introducing a new, more descriptive set for MCP operations and sessions. However, a critical security concern has been identified: several new telemetry metrics, specifically labels like mcp.method.name, toolset.name, gen_ai.tool.name, and gen_ai.prompt.name, are populated directly from untrusted user input without prior validation. This poses a significant risk of metrics cardinality explosion, potentially leading to a Denial of Service by exhausting memory. Additionally, there are suggestions to improve code clarity and robustness regarding how metric attributes are handled to prevent potential slice aliasing issues.

gemini-code-assist · 2026-02-26T01:19:39Z

internal/server/mcp.go

+	defer func() {
+		operationDuration := time.Since(operationStart).Seconds()
+		durationAttrs := []attribute.KeyValue{
+			attribute.String("mcp.method.name", baseMessage.Method),


The mcp.method.name label is populated directly from the baseMessage.Method field of the JSON-RPC request body without any validation. Since this field is user-controlled, an attacker can send a large number of requests with unique, arbitrary method names, leading to a metrics cardinality explosion. This can exhaust the memory of the metrics collection system or the application itself, causing a Denial of Service (DoS). It is recommended to validate the method name against an allow-list of known methods before using it as a metric label.
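One way to apply the allow-list recommended above is to map every unrecognized method to a single fallback label before it reaches the metric. The sketch below is an assumption-laden illustration: `knownMethods`, `methodLabel`, and the `"_OTHER"` fallback (borrowed from the OTel HTTP conventions for unknown request methods) are hypothetical, and the method set is only a guess at what the server supports.

```go
package main

import "fmt"

// knownMethods is an assumed allow-list; the real server may support a different set.
var knownMethods = map[string]struct{}{
	"initialize":   {},
	"ping":         {},
	"tools/list":   {},
	"tools/call":   {},
	"prompts/list": {},
	"prompts/get":  {},
}

// otherMethod is the single label shared by all unrecognized methods.
const otherMethod = "_OTHER"

// methodLabel returns a label value drawn from a bounded set,
// no matter what the client sent in the JSON-RPC method field.
func methodLabel(method string) string {
	if _, ok := knownMethods[method]; ok {
		return method
	}
	return otherMethod
}

func main() {
	fmt.Println(methodLabel("tools/call"))      // prints "tools/call"
	fmt.Println(methodLabel("x-attacker-1234")) // prints "_OTHER"
}
```

With this guard the label can take at most `len(knownMethods) + 1` distinct values.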

gemini-code-assist · 2026-02-26T01:19:39Z

internal/server/mcp.go

+			attribute.String("mcp.method.name", baseMessage.Method),
+			attribute.String("network.transport", networkTransport),
+			attribute.String("network.protocol.name", networkProtocolName),
+			attribute.String("toolset.name", toolsetName),


The toolset.name label is populated from the toolsetName URL parameter, which is untrusted. An attacker can send many requests with unique toolset names in the URL (e.g., /mcp/{toolsetName}), causing a label cardinality explosion in the metrics system. This can lead to a Denial of Service. Consider validating the toolset name against the list of configured toolsets before recording it in metrics.

gemini-code-assist · 2026-02-26T01:19:39Z

internal/server/mcp/v20241105/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}


The gen_ai.tool.name attribute is populated from the toolName variable before it is validated against the available tools. This attribute is later used as a label in the McpOperationDuration metric (recorded in internal/server/mcp.go). An attacker can send requests with arbitrary tool names, leading to a metrics cardinality explosion and potential Denial of Service. The assignment should be moved after the tool existence check on line 122.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "execute_tool"
-		genAIAttrs.ToolName = toolName
-	}
+	tool, ok := resourceMgr.GetTool(toolName)
+	if !ok {
+		err = fmt.Errorf("invalid tool name: tool with name %q does not exist", toolName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp/v20241105/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}


The gen_ai.prompt.name attribute is populated from the promptName variable before it is validated. This can lead to a metrics cardinality explosion if used as a label in metrics recorded for every request. Move the assignment after the prompt existence check on line 369.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "get_prompt"
-		genAIAttrs.PromptName = promptName
-	}
+	prompt, ok := resourceMgr.GetPrompt(promptName)
+	if !ok {
+		err := fmt.Errorf("prompt with name %q does not exist", promptName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp/v20250326/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}


The gen_ai.tool.name attribute is populated from the untrusted toolName variable before validation, which can lead to a metrics cardinality explosion. Move the assignment after the tool existence check on line 123.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "execute_tool"
-		genAIAttrs.ToolName = toolName
-	}
+	tool, ok := resourceMgr.GetTool(toolName)
+	if !ok {
+		err = fmt.Errorf("invalid tool name: tool with name %q does not exist", toolName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp/v20250618/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}


The gen_ai.prompt.name attribute is populated from the untrusted promptName variable before validation. Move the assignment after the prompt existence check on line 363.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "get_prompt"
-		genAIAttrs.PromptName = promptName
-	}
+	prompt, ok := resourceMgr.GetPrompt(promptName)
+	if !ok {
+		err := fmt.Errorf("prompt with name %q does not exist", promptName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp/v20251125/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}


The gen_ai.tool.name attribute is populated from the untrusted toolName variable before validation, which can lead to a metrics cardinality explosion. Move the assignment after the tool existence check on line 116.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "execute_tool"
-		genAIAttrs.ToolName = toolName
-	}
+	tool, ok := resourceMgr.GetTool(toolName)
+	if !ok {
+		err = fmt.Errorf("invalid tool name: tool with name %q does not exist", toolName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "execute_tool"
+		genAIAttrs.ToolName = toolName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp/v20251125/method.go

+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}


The gen_ai.prompt.name attribute is populated from the untrusted promptName variable before validation. Move the assignment after the prompt existence check on line 363.

Suggested change

-	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
-		genAIAttrs.OperationName = "get_prompt"
-		genAIAttrs.PromptName = promptName
-	}
+	prompt, ok := resourceMgr.GetPrompt(promptName)
+	if !ok {
+		err := fmt.Errorf("prompt with name %q does not exist", promptName)
+		return jsonrpc.NewError(id, jsonrpc.INVALID_PARAMS, err.Error(), nil), err
+	}
+	// Populate gen_ai attributes for operation duration metric
+	if genAIAttrs := util.GenAIMetricAttrsFromContext(ctx); genAIAttrs != nil {
+		genAIAttrs.OperationName = "get_prompt"
+		genAIAttrs.PromptName = promptName
+	}

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp.go

+		durationAttrs := fullAttrs
+		if sessionErr != nil && sessionErr != io.EOF {
+			durationAttrs = append(durationAttrs, attribute.String("error.type", sessionErr.Error()))
+		}


Assigning fullAttrs directly to durationAttrs and then conditionally appending to it can be confusing due to slice aliasing: both variables share one backing array, so an append may write into spare capacity that fullAttrs also references. While it doesn't cause a bug in the current code, creating an explicit copy makes the intent clearer and prevents potential bugs if the code is refactored in the future.

Suggested change

-		durationAttrs := fullAttrs
-		if sessionErr != nil && sessionErr != io.EOF {
-			durationAttrs = append(durationAttrs, attribute.String("error.type", sessionErr.Error()))
-		}
+		durationAttrs := make([]attribute.KeyValue, len(fullAttrs))
+		copy(durationAttrs, fullAttrs)
+		if sessionErr != nil && sessionErr != io.EOF {
+			durationAttrs = append(durationAttrs, attribute.String("error.type", sessionErr.Error()))
+		}
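The aliasing hazard behind this suggestion is easiest to see with plain strings. The sketch below uses only the standard library; `appendAliased` and `appendCopied` are hypothetical helpers, and the base slice is deliberately created with spare capacity so that `append` reuses its backing array.

```go
package main

import "fmt"

// appendAliased appends directly; if base has spare capacity,
// the result shares base's backing array.
func appendAliased(base []string, extra string) []string {
	return append(base, extra)
}

// appendCopied copies base first, so the result never shares storage with it.
func appendCopied(base []string, extra string) []string {
	out := make([]string, len(base), len(base)+1)
	copy(out, base)
	return append(out, extra)
}

func main() {
	// len 2, cap 4: two free slots in the backing array.
	base := make([]string, 2, 4)
	base[0], base[1] = "network.transport", "toolset.name"

	a := appendAliased(base, "error.type=A")
	b := appendAliased(base, "error.type=B")
	fmt.Println(a[2], b[2]) // prints "error.type=B error.type=B"

	c := appendCopied(base, "error.type=C")
	d := appendCopied(base, "error.type=D")
	fmt.Println(c[2], d[2]) // prints "error.type=C error.type=D"
}
```

The second aliased append overwrites the element the first one wrote, which is the kind of bug an explicit copy rules out.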

gemini-code-assist · 2026-02-26T01:19:40Z

internal/server/mcp.go

+		durationAttrs := sessionAttrs
 		if err != nil {
 			span.SetStatus(codes.Error, err.Error())
+			durationAttrs = append(durationAttrs, attribute.String("error.type", err.Error()))
 		}


Similar to another comment, the direct assignment durationAttrs := sessionAttrs followed by a conditional append can be risky due to slice aliasing. sessionAttrs is used for decrementing the active sessions counter just before this. To improve code safety and clarity, it's better to work with an explicit copy.

Suggested change

-		durationAttrs := sessionAttrs
-		if err != nil {
-			span.SetStatus(codes.Error, err.Error())
-			durationAttrs = append(durationAttrs, attribute.String("error.type", err.Error()))
-		}
+		durationAttrs := make([]attribute.KeyValue, len(sessionAttrs))
+		copy(durationAttrs, sessionAttrs)
+		if err != nil {
+			span.SetStatus(codes.Error, err.Error())
+			durationAttrs = append(durationAttrs, attribute.String("error.type", err.Error()))
+		}

feat: telemetry updates as per semantic convention

60ddb61

AjmeraParth132 requested a review from a team as a code owner February 26, 2026 01:16

blunderbuss-gcf bot assigned duwenxin99 Feb 26, 2026

AjmeraParth132 mentioned this pull request Feb 26, 2026

Enhance Telemetry for Toolbox Servers #2222

Open

1 task

gemini-code-assist bot reviewed Feb 26, 2026

AjmeraParth132 wants to merge 1 commit into googleapis:main from AjmeraParth132:feat-metrics-update