Skip to content

Commit 7380350

Browse files
mayurkrclaude
andauthored
fix(github): don't add commits to an already-open PR (#95)
* fix(github): short-circuit CreatePullRequest when an open PR covers the head branch The GitHub engine's CreatePullRequest had a silent-commit footgun that produced the "311 commits on one PR" incident observed on zelyo-operator-fix/demo-app-payment-service PR #12. Flow before the fix: 1. getRef → base SHA 2. createRef returns 422 "branch exists" — engine swallows and continues 3. createOrUpdateFile for each file — COMMITS LAND ON THE EXISTING PR's BRANCH (this is the regression) 4. openPR returns 422 "a pull request already exists" — error bubbles up 5. Controller catches the error but the file commit already shipped 6. Next reconcile: same branch, another commit pile This was only partially mitigated by the controller-layer existingBranches dedup shipped in #91 and #94. That mitigation misses when ListOpenPRs fails, when pagination hits the safety cap, or when any other caller of CreatePullRequest skips the upstream check. The engine itself needed to be the last line of defense. Fix: after createRef returns 422, query GitHub for any open PR whose head branch is exactly pr.HeadBranch. If one exists, log and return it — no commit, no duplicate openPR. If no open PR exists (legitimate case when a user manually closed a PR but left the branch), proceed with commit + openPR as before. If the dedup lookup ITSELF fails, fail closed — remediation defers to the next cycle rather than risk polluting an open PR with duplicate commits. New tests: - SkipsWhenOpenPRExists: createRef=422 + open PR exists → asserts PUT /contents and POST /pulls are never called, existing PR is returned - BranchExistsWithoutOpenPR: createRef=422 + empty PR list → asserts commit + openPR both run (the legitimate orphan-branch case) - BranchExistsLookupFails: createRef=422 + PR list returns 500 → asserts error return, no commit, no openPR (fail-closed) Uses GitHub's head=owner:ref query param to scope the lookup server-side (only one open PR per head branch is possible, so no pagination needed), with in-memory ref equality as a defense-in-depth filter against mocks or proxies that ignore the query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * address PR review: typed APIError, owner:branch dedup match, shared PR struct Three reviewer asks on #95: 1. Codex P2 — findOpenPRByHeadBranch was filtering on head.ref alone, so a fork PR using the same branch name could false-match the dedup check when a proxy/mock ignores the server-side head= query. Switch to matching on head.label ("owner:ref"), which GitHub always sets to the canonical identifier. 2. Gemini medium — string-matching "422" in the error message was brittle. doRequest now returns a typed *APIError with StatusCode and Body; the createRef 422 check uses errors.As for a direct integer comparison against http.StatusUnprocessableEntity. 3. Gemini medium — the PR-response anonymous struct was duplicated between findOpenPRByHeadBranch and ListOpenPRs. Extract githubPullResponse at package scope and reuse. Tests: - Existing SkipsWhenOpenPRExists mock now sets head.label so the engine's new equality check matches. - New DoesNotMatchForkPRWithSameRef: server (simulating a proxy that ignored head= query) returns a fork PR with head.label = "forkuser:<branch>" distinct from our owner. Asserts the engine proceeds with its own commit + openPR instead of false-matching. - New APIErrorTypedUnwrap: pins the typed-error contract. Covers both the direct-%w-wrap path callers use and the live doRequest path, so a future refactor that swallows the type breaks here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 21e3eac commit 7380350

2 files changed

Lines changed: 526 additions & 15 deletions

File tree

internal/github/engine.go

Lines changed: 124 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@ import (
2020
"bytes"
2121
"context"
2222
"encoding/json"
23+
"errors"
2324
"fmt"
2425
"io"
2526
"net/http"
@@ -32,6 +33,41 @@ import (
3233
"github.com/zelyo-ai/zelyo-operator/internal/gitops"
3334
)
3435

36+
// APIError is the typed error doRequest returns on any non-2xx response
37+
// from GitHub. Callers use errors.As to branch on HTTP status codes —
38+
// e.g. createRef's 422-on-branch-exists — rather than string-matching the
39+
// formatted message, which was the brittle pre-refactor pattern.
40+
type APIError struct {
41+
StatusCode int
42+
Body string // Truncated response body for diagnostics.
43+
}
44+
45+
func (e *APIError) Error() string {
46+
return fmt.Sprintf("GitHub API error %d: %s", e.StatusCode, e.Body)
47+
}
48+
49+
// githubPullResponse is the subset of GitHub's PR payload we decode. Shared
50+
// by CreatePullRequest, findOpenPRByHeadBranch, ListOpenPRs, and openPR so
51+
// the decoding shape is defined once.
52+
//
53+
// Head.Label is the canonical "owner:ref" identifier. Matching on Label is
54+
// how we tell a same-repo PR (head.label = repo_owner:branch) apart from a
55+
// fork PR that happens to use the same branch name (head.label =
56+
// fork_owner:branch). Filtering only on Head.Ref would let a fork PR
57+
// false-match the dedup check.
58+
type githubPullResponse struct {
59+
Number int `json:"number"`
60+
HTMLURL string `json:"html_url"`
61+
Head struct {
62+
Ref string `json:"ref"`
63+
Label string `json:"label"`
64+
} `json:"head"`
65+
CreatedAt time.Time `json:"created_at"`
66+
Labels []struct {
67+
Name string `json:"name"`
68+
} `json:"labels"`
69+
}
70+
3571
// GitHubEngine implements gitops.Engine using the GitHub REST API.
3672
// It supports GitHub App authentication and all CRUD operations needed
3773
// for the Zelyo Operator auto-remediation workflow.
@@ -69,12 +105,46 @@ func (e *GitHubEngine) CreatePullRequest(ctx context.Context, pr *gitops.PullReq
69105
}
70106

71107
// Step 2: Create the head branch from the base.
108+
//
109+
// GitHub returns 422 when the branch already exists. Historically the
110+
// engine swallowed that error and proceeded to Step 3 (commit files)
111+
// and Step 4 (openPR). If a PR was already open against this branch,
112+
// that flow produced the 311-commits-on-one-PR footgun: openPR would
113+
// fail with 422 "already exists", but the file commits from Step 3 had
114+
// already landed on the open PR's branch. Every subsequent reconcile
115+
// piled another commit on the same PR.
116+
//
117+
// Fix: when the branch already exists, check whether it already backs
118+
// an open PR BEFORE committing. If yes, short-circuit and return the
119+
// existing PR — no new commits, no duplicate opens. If the branch
120+
// exists without an open PR (legitimate when a user deletes a PR
121+
// without deleting the branch), proceed with commit + openPR.
122+
//
123+
// If the dedup lookup itself fails, abort rather than risk silently
124+
// appending commits to an open PR. Remediation for this branch will
125+
// retry next reconcile — a temporary deferral is cheaper than
126+
// permanent commit pollution on an already-open PR.
72127
if err := e.createRef(ctx, pr.RepoOwner, pr.RepoName, "refs/heads/"+pr.HeadBranch, baseSHA); err != nil {
73-
// Branch might already exist from a previous attempt — continue.
74-
if !strings.Contains(err.Error(), "422") {
128+
// 422 Unprocessable Entity → branch already exists. Any other
129+
// status is a hard failure we can't recover from. errors.As on
130+
// APIError replaces the earlier string-match on "422" — the
131+
// message format is no longer part of the contract.
132+
var apiErr *APIError
133+
if !errors.As(err, &apiErr) || apiErr.StatusCode != http.StatusUnprocessableEntity {
75134
return nil, fmt.Errorf("creating head branch: %w", err)
76135
}
77-
e.log.Info("Branch already exists, continuing", "branch", pr.HeadBranch)
136+
existing, lookupErr := e.findOpenPRByHeadBranch(ctx, pr.RepoOwner, pr.RepoName, pr.HeadBranch)
137+
if lookupErr != nil {
138+
return nil, fmt.Errorf("head branch %q already exists but could not verify absence of open PR: %w",
139+
pr.HeadBranch, lookupErr)
140+
}
141+
if existing != nil {
142+
e.log.Info("Skipping commit: an open PR already covers this head branch",
143+
"number", existing.Number, "branch", pr.HeadBranch, "url", existing.URL)
144+
return existing, nil
145+
}
146+
e.log.Info("Branch already exists without an open PR — proceeding",
147+
"branch", pr.HeadBranch)
78148
}
79149

80150
// Step 3: Commit each file to the head branch.
@@ -147,6 +217,52 @@ func (e *GitHubEngine) GetFile(ctx context.Context, owner, repo, path, ref strin
147217
return decoded[:n], nil
148218
}
149219

220+
// findOpenPRByHeadBranch queries GitHub for any open PR whose head branch
221+
// is exactly `branch` in the same repo (no cross-fork lookup). Returns
222+
// (nil, nil) when no matching PR exists, the PR when one does, or an error
223+
// when the API call failed.
224+
//
225+
// GitHub rejects a second open PR against an already-in-use head branch,
226+
// so the result set is size 0 or 1 — no pagination. The `head=owner:ref`
227+
// query narrows server-side; we additionally filter in-memory on the full
228+
// "owner:ref" label so a mock or proxy that ignores the query parameter
229+
// cannot return a spurious match from a fork PR that happens to use the
230+
// same branch name (head.ref alone would false-match — a real concern for
231+
// any heavily-forked repo).
232+
//
233+
// Used by CreatePullRequest to short-circuit when the target head branch
234+
// already backs an open PR; see the comment there for the 311-commits
235+
// footgun this guards against.
236+
func (e *GitHubEngine) findOpenPRByHeadBranch(ctx context.Context, owner, repo, branch string) (*gitops.PullRequestResult, error) {
237+
wantLabel := owner + ":" + branch
238+
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls?state=open&head=%s",
239+
e.baseURL, owner, repo, url.QueryEscape(wantLabel))
240+
241+
body, err := e.doRequest(ctx, http.MethodGet, reqURL, nil)
242+
if err != nil {
243+
return nil, fmt.Errorf("looking up open PR for branch %q: %w", branch, err)
244+
}
245+
246+
var prs []githubPullResponse
247+
if err := json.Unmarshal(body, &prs); err != nil {
248+
return nil, fmt.Errorf("decoding PR lookup response: %w", err)
249+
}
250+
for _, pr := range prs {
251+
// Require an exact owner:ref match. GitHub always populates
252+
// head.label as "owner_login:ref_name"; a fork PR with the same
253+
// ref name has a different label and is correctly rejected here.
254+
if pr.Head.Label == wantLabel {
255+
return &gitops.PullRequestResult{
256+
Number: pr.Number,
257+
URL: pr.HTMLURL,
258+
Branch: pr.Head.Ref,
259+
CreatedAt: pr.CreatedAt,
260+
}, nil
261+
}
262+
}
263+
return nil, nil
264+
}
265+
150266
// ListOpenPRs implements gitops.Engine.ListOpenPRs by paginating over
151267
// GitHub's list-PRs endpoint. A single page is 100 PRs (the endpoint's max);
152268
// we keep requesting successive pages until a page comes back short or we
@@ -165,17 +281,7 @@ func (e *GitHubEngine) ListOpenPRs(ctx context.Context, owner, repo string) ([]g
165281
return nil, fmt.Errorf("listing open PRs (page %d): %w", page, err)
166282
}
167283

168-
var prs []struct {
169-
Number int `json:"number"`
170-
HTMLURL string `json:"html_url"`
171-
Head struct {
172-
Ref string `json:"ref"`
173-
} `json:"head"`
174-
CreatedAt time.Time `json:"created_at"`
175-
Labels []struct {
176-
Name string `json:"name"`
177-
} `json:"labels"`
178-
}
284+
var prs []githubPullResponse
179285
if err := json.Unmarshal(body, &prs); err != nil {
180286
return nil, fmt.Errorf("decoding PRs response (page %d): %w", page, err)
181287
}
@@ -382,7 +488,10 @@ func (e *GitHubEngine) doRequest(ctx context.Context, method, reqURL string, pay
382488
}
383489

384490
if resp.StatusCode >= 400 {
385-
return nil, fmt.Errorf("GitHub API error %d: %s", resp.StatusCode, truncate(string(body), 200))
491+
return nil, &APIError{
492+
StatusCode: resp.StatusCode,
493+
Body: truncate(string(body), 200),
494+
}
386495
}
387496

388497
return body, nil

0 commit comments

Comments
 (0)