sql, test: introduce a failure threshold for some categories of test failure #163963

@eric-alton

Summary

During manual test triage we often encounter a transient test failure, such as a timeout, that in isolation may not signal a real problem. The usual action is to close the issue if a search turns up no other occurrences of the same failure. When the same failure surfaces a second or third time, that is a strong signal that it is worth investigating further.

Details

Can we automate this thresholding, so that failures in defined categories require human input only once they recur often enough? This could replicate today's manual process: an AI agent detects whether this is the first occurrence of an issue and, if so, closes it with an explanation. If it detects a repeat of a past issue, it posts an update to the ticket listing the issue history.

We may want the flexibility to turn this behaviour off for in-development releases that are approaching GA, since in those cases we have less runway to wait and see whether an issue repeats.
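To make the proposal concrete, here is a rough sketch of the decision the agent would make for each new failure. Everything in it is an assumption for illustration only: the `Failure` record, the `Action` names, the `"timeout"` category, the threshold value of 2, and the `threshold_enabled` toggle are hypothetical and do not correspond to any existing tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CLOSE_WITH_EXPLANATION = "close_with_explanation"  # below threshold
    POST_HISTORY = "post_history"                      # threshold reached
    ESCALATE = "escalate"                              # normal human triage


@dataclass(frozen=True)
class Failure:
    test_name: str
    category: str  # hypothetical category label, e.g. "timeout"
    issue_id: int


# Hypothetical thresholds: number of occurrences (including the current one)
# needed before a failure in that category requires human input.
THRESHOLDS = {"timeout": 2}


def triage(failure, history, thresholds=THRESHOLDS, threshold_enabled=True):
    """Return (action, prior_issue_ids) for a newly filed failure.

    history is the list of previously seen Failure records.
    threshold_enabled=False models the toggle for releases approaching GA.
    """
    threshold = thresholds.get(failure.category)
    if not threshold_enabled or threshold is None:
        # Toggle is off, or the category is not covered: humans triage it.
        return Action.ESCALATE, []

    # Earlier occurrences of the same test failing in the same category.
    prior = [
        f.issue_id
        for f in history
        if f.test_name == failure.test_name and f.category == failure.category
    ]
    if len(prior) + 1 < threshold:
        return Action.CLOSE_WITH_EXPLANATION, prior
    return Action.POST_HISTORY, prior
```

With a threshold of 2, the first timeout of a given test is closed with an explanation, and the second produces an update listing the earlier issue. Matching on test name plus category is a simplification; the real agent would presumably need a fuzzier notion of "the same problem".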

Epic: CRDB-60540

Jira issue: CRDB-60551

Metadata

Assignees

No one assigned

Labels

A-testing: Testing tools and infrastructure
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-foundations: SQL Foundations Team (formerly SQL Schema + SQL Sessions)
