Race condition allows PR merge with no terraform applied (or stale/mixed-commit plans to be applied) #6529

Description

@henriklundstrom

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

The atlantis apply command acts directly on whatever .tfplan files happen to exist on disk at the time. It does not check that atlantis plan ran successfully, or that the plans pertain to the current head commit, or that all expected projects are covered. This leads to three failure scenarios:

Scenario A: apply during an in-flight plan. No .tfplan files exist yet because plan is still running. Apply finds nothing and posts atlantis/apply = success with 0/0 projects, so the PR can be merged without any terraform having been applied. A 0/0 apply can be a valid outcome when there really are no plans to apply, but it is not safe to assume so while a plan is still running.

Scenario B: concurrent push leaves stale plans. A new commit's autoplan calls deletePlans, then its per-project plans fail to acquire the per-project WorkingDirLocker lock, which the old commit's in-flight doPlan still holds. The old commit's plans eventually finish and write .tfplan files for the wrong commit. A subsequent apply executes them.

Scenario C: concurrent push produces mixed-commit plans. This is the same mechanism as B, but with projects that take different amounts of time to plan. Fast projects finish before the new autoplan starts (locks released, .tfplan files deleted and re-planned for the new commit), while slow projects are still locked by the old commit. Apply then executes a mix of .tfplan files from two different commits.

Reproduction Steps

Scenario A (easiest to reproduce):

  1. Have a repository with terraform projects that take ~30+ seconds to plan
  2. Push a commit that triggers autoplan
  3. While terraform plan is still running, comment atlantis apply
  4. Observe that apply returns "Ran Apply for 0 projects" and sets atlantis/apply to success
  5. PR can now be merged without any terraform having been applied

Scenarios B and C (require tighter timing):

  1. Push commit OLD_SHA that triggers autoplan; per-project plans start running in parallel
  2. Before all plans complete, push commit NEW_SHA to the same branch
  3. NEW_SHA's autoplan calls deletePlans (deletes any .tfplan files written so far), then starts per-project plans — each project's doPlan tries to acquire the per-project WorkingDirLocker, which fails immediately (no retry) for any project where OLD_SHA's plan is still running
  4. OLD_SHA's remaining plans eventually finish and write .tfplan files for the old commit
  5. End state depends on per-project timing:
    • Scenario B: All of OLD_SHA's projects were still running → none were deleted, all finish and write stale .tfplan files. Apply executes plans from the wrong commit.
    • Scenario C: Some of OLD_SHA's projects had finished (deleted and re-planned for NEW_SHA), others hadn't (still write old .tfplan). Apply executes a mix of plans from two different commits.
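
The end states in step 5 can be replayed with a small deterministic model. This is only a sketch of the event ordering described above, not Atlantis code: simulatePush and the project names are hypothetical, and the model assumes that a project whose old plan had already finished is re-planned successfully for NEW_SHA.

```go
package main

import "fmt"

// simulatePush replays the steps above for one push of NEW_SHA while an
// OLD_SHA autoplan is in flight. oldFinished says, per project, whether
// OLD_SHA's plan had already finished when NEW_SHA arrived. The result maps
// each project to the commit whose .tfplan file is on disk at the end.
// Hypothetical model, not taken from the Atlantis code base.
func simulatePush(oldFinished map[string]bool) map[string]string {
	disk := map[string]string{}

	// Plans that finished before the push have written OLD_SHA plan files.
	for project, done := range oldFinished {
		if done {
			disk[project] = "OLD_SHA"
		}
	}

	// NEW_SHA's autoplan deletes every plan file written so far.
	for project := range disk {
		delete(disk, project)
	}

	// NEW_SHA re-plans only the projects whose lock is free; the others
	// fail immediately on the lock held by OLD_SHA's running plan.
	for project, done := range oldFinished {
		if done {
			disk[project] = "NEW_SHA"
		}
	}

	// OLD_SHA's remaining plans finish and write plan files for the old commit.
	for project, done := range oldFinished {
		if !done {
			disk[project] = "OLD_SHA"
		}
	}
	return disk
}

func main() {
	// Scenario B: every old plan was still running.
	fmt.Println(simulatePush(map[string]bool{"test": false, "prod": false}))
	// prints: map[prod:OLD_SHA test:OLD_SHA]

	// Scenario C: "test" had finished, "prod" was still running.
	fmt.Println(simulatePush(map[string]bool{"test": true, "prod": false}))
	// prints: map[prod:OLD_SHA test:NEW_SHA]
}
```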

Expected: Apply should not post success until a plan for the current head commit has completed. A 0/0 apply is valid when plan has finished and found nothing to plan — but not while a plan is still in progress (A). Apply should also verify that .tfplan files belong to the current head commit, not silently apply stale or mixed-commit plans (B, C).

Actual: Apply posts success — either without running terraform apply at all (A), or by applying plans from a previous commit (B) or a mix of commits (C).
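
A minimal sketch of the expected behavior, assuming apply had access to some record of the latest plan run for the pull request. PlanState, checkApplyAllowed, and both error values are hypothetical names introduced for illustration; I am not claiming Atlantis tracks this state today.

```go
package main

import (
	"errors"
	"fmt"
)

// PlanState is a hypothetical record of the most recent plan run for a pull
// request. It only illustrates the expected behavior.
type PlanState struct {
	InProgress bool   // a plan is still running for this pull request
	HeadSHA    string // commit the last completed plan ran against
}

var (
	ErrPlanInProgress = errors.New("plan is still in progress")
	ErrStalePlan      = errors.New("plan does not match the current head commit")
)

// checkApplyAllowed runs before any project discovery, so it also covers the
// 0/0 case: it rejects apply while a plan is running (Scenario A) or when the
// last completed plan belongs to a different commit (Scenarios B and C).
func checkApplyAllowed(state PlanState, headSHA string) error {
	if state.InProgress {
		return ErrPlanInProgress
	}
	if state.HeadSHA != headSHA {
		return ErrStalePlan
	}
	return nil
}

func main() {
	fmt.Println(checkApplyAllowed(PlanState{InProgress: true, HeadSHA: "NEW_SHA"}, "NEW_SHA"))
	fmt.Println(checkApplyAllowed(PlanState{HeadSHA: "OLD_SHA"}, "NEW_SHA"))
	fmt.Println(checkApplyAllowed(PlanState{HeadSHA: "NEW_SHA"}, "NEW_SHA"))
}
```

The important property is that the check is per pull request rather than per project, so it still runs when zero projects are discovered.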

Logs

Logs below are from a Scenario A reproduction.

Scenario A logs
13:49:57.107 [INFO ]: Running autoplan...
13:49:59.399 [INFO ]: 2 projects are to be planned based on their when_modified config
13:49:59.489 [INFO ]: Running plans in parallel
13:50:00.080 [INFO ]: Acquired lock with id 'repo/infrastructure/terraform/test'
13:50:00.101 [INFO ]: Acquired lock with id 'repo/infrastructure/terraform/prod'
13:50:12.220 [DEBUG]: starting 'sh -c "terraform plan ..."' in '.../test/infrastructure/terraform'
13:50:12.261 [DEBUG]: starting 'sh -c "terraform plan ..."' in '.../prod/infrastructure/terraform'
13:50:41.963 [INFO ]: Handling 'apply' comment
13:50:42.399 [INFO ]: Running comment command 'apply' for user 'username'.
13:50:43.472 [INFO ]: Updating GitHub Check status for 'atlantis/apply' to 'pending'
13:50:48.597 [INFO ]: Updating GitHub Check status for 'atlantis/apply' to 'success'
13:50:50.571 [DEBUG]: Ignoring non-command comment: 'Ran Apply for 0 projects:'
13:50:51.765 [INFO ]: successfully ran 'sh -c' 'terraform plan ...' in '.../test/infrastructure/terraform'
13:50:57.128 [INFO ]: successfully ran 'sh -c' 'terraform plan ...' in '.../prod/infrastructure/terraform'
13:50:59.830 [INFO ]: Handling GitHub Pull Request 'closed' event

Timeline:

Time          Event
13:50:12      terraform plan starts for two projects
13:50:41      User comments atlantis apply
13:50:48      Apply posts success with 0/0 projects
13:50:51–57   Plans complete (.tfplan files written, too late)
13:50:59      PR merged, no terraform applied

Environment details

  • Atlantis version: v0.43.0 (code-path analysis confirmed against main at d6a7343a)
  • Deployment: Kubernetes (GKE)
  • VCS: GitHub

Additional Context

Why apply_requirements does not prevent this

Scenario A: apply_requirements and the WorkingDirLocker check run once per discovered project. With zero discovered projects (no .tfplan files exist yet), none of them run, and the 0/0 success is posted unconditionally. Combining atlantis/plan as a required check with apply_requirements: [mergeable] does not help, because the mergeable check, like every per-project check, is never evaluated when there are zero projects.

Scenarios B and C: Making atlantis/plan a required check + apply_requirements: [mergeable] does block these scenarios — when stale .tfplan files exist, the new commit's plan has necessarily failed (lock collision), so atlantis/plan is set to failed on the new HEAD SHA, the PR becomes non-mergeable, and the per-project mergeable check rejects apply. (Commit statuses are per-SHA, so the old commit's plan setting success on the old SHA does not interfere.) However, there is no recommendation in the Atlantis docs to make atlantis/plan a required check, and the documented configuration focuses on atlantis/apply as a required check with --gh-allow-mergeable-bypass-apply. Without plan as a required check, the mergeable requirement does not protect against B/C, and apply does not independently verify that .tfplan files match the current head commit.

Where it happens in the code

All line references are against commit d6a7343a on main.

The apply path builds its project list in buildAllProjectCommandsByPlan by scanning for .tfplan files via PendingPlanFinder.Find. When no files are found, the per-project WorkingDirLocker checks that would detect a concurrent plan never execute (empty loop). The apply runner then posts success with 0/0 projects.
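
The shape of the problem can be reduced to a loop whose safety checks live inside the loop body. The following is a simplified sketch, not the actual Atlantis code: runApply, errLocked, and the check callback are hypothetical stand-ins for the apply runner and the per-project checks.

```go
package main

import (
	"errors"
	"fmt"
)

// errLocked stands in for the error a per-project lock check would return
// while a plan is running. Hypothetical name.
var errLocked = errors.New("working dir is locked by a running plan")

// runApply mimics the structure described above: every safety check lives
// inside the per-project loop, so an empty project list skips all of them.
// It returns the posted status and the number of projects that were checked.
func runApply(projects []string, check func(project string) error) (string, int) {
	checked := 0
	for _, p := range projects {
		checked++
		if err := check(p); err != nil {
			return "failure", checked
		}
	}
	return "success", checked
}

func main() {
	// A check that would always detect the concurrent plan, if it ever ran.
	alwaysLocked := func(string) error { return errLocked }

	status, n := runApply(nil, alwaysLocked)
	fmt.Println(status, n) // prints: success 0

	status, n = runApply([]string{"test"}, alwaysLocked)
	fmt.Println(status, n) // prints: failure 1
}
```

Even a check that always fails cannot prevent the success status when no .tfplan files have been found, which matches the 0/0 result in the logs above.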

.tfplan filenames encode workspace and project name but not the commit SHA, so apply cannot detect stale or mixed-commit plans (Scenarios B and C).
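
One possible way to make stale plans detectable, sketched under assumptions: plan writes a small sidecar file next to each .tfplan recording the commit it was planned for, and apply compares it to the current head. The sidecar name (.meta.json), planMeta, writePlanMeta, verifyPlanMeta, and roundTrip are all hypothetical; nothing here reflects an existing Atlantis file format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// planMeta is a hypothetical sidecar record stored next to a .tfplan file.
type planMeta struct {
	HeadSHA string `json:"head_sha"`
}

// metaPath derives the (assumed) sidecar location from the plan file path.
func metaPath(planPath string) string { return planPath + ".meta.json" }

// writePlanMeta would be called by plan after the .tfplan file is written.
func writePlanMeta(planPath, headSHA string) error {
	data, err := json.Marshal(planMeta{HeadSHA: headSHA})
	if err != nil {
		return err
	}
	return os.WriteFile(metaPath(planPath), data, 0o600)
}

// verifyPlanMeta would be called by apply before executing a plan file.
func verifyPlanMeta(planPath, headSHA string) error {
	data, err := os.ReadFile(metaPath(planPath))
	if err != nil {
		return fmt.Errorf("reading plan metadata: %w", err)
	}
	var m planMeta
	if err := json.Unmarshal(data, &m); err != nil {
		return fmt.Errorf("parsing plan metadata: %w", err)
	}
	if m.HeadSHA != headSHA {
		return fmt.Errorf("plan was created for %s but head is %s", m.HeadSHA, headSHA)
	}
	return nil
}

// roundTrip records plannedSHA in a temporary directory and verifies it
// against headSHA, returning the verification result.
func roundTrip(plannedSHA, headSHA string) error {
	dir, err := os.MkdirTemp("", "plans")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	plan := filepath.Join(dir, "default.tfplan")
	if err := writePlanMeta(plan, plannedSHA); err != nil {
		return err
	}
	return verifyPlanMeta(plan, headSHA)
}

func main() {
	fmt.Println(roundTrip("NEW_SHA", "NEW_SHA")) // matching commit
	fmt.Println(roundTrip("OLD_SHA", "NEW_SHA")) // stale plan is rejected
}
```

Encoding the SHA in the filename itself would work as well; the sidecar is used here only because it leaves the existing naming scheme untouched.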

For Scenarios B and C, the contributing factor on the plan side is that per-project WorkingDirLocker.TryLock in doPlan is non-blocking with no retry — when a new commit's per-project plan collides with an in-flight plan for the same project, it fails immediately with no recovery path.
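
To illustrate the difference between a single non-blocking attempt and a bounded retry, here is a sketch that uses Go's standard sync.Mutex.TryLock (Go 1.18+) as a stand-in for the per-project lock. WorkingDirLocker is not a sync.Mutex, and tryLockWithRetry and simulateCollision are hypothetical helpers, so this shows the pattern only, not a patch.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tryLockWithRetry polls a non-blocking TryLock until it succeeds or the
// attempts run out, sleeping between attempts.
func tryLockWithRetry(mu *sync.Mutex, attempts int, wait time.Duration) bool {
	for i := 0; i < attempts; i++ {
		if mu.TryLock() {
			return true
		}
		if i < attempts-1 {
			time.Sleep(wait)
		}
	}
	return false
}

// simulateCollision holds the lock for holdMs milliseconds, as an in-flight
// plan for the old commit would, and reports the outcome of a single TryLock
// versus a bounded retry for the new commit's plan.
func simulateCollision(holdMs, attempts, waitMs int) (single, retried bool) {
	var mu sync.Mutex
	mu.Lock() // the old commit's plan holds the project lock
	go func() {
		time.Sleep(time.Duration(holdMs) * time.Millisecond)
		mu.Unlock() // the old plan finishes
	}()

	single = mu.TryLock() // one attempt, no retry: fails while the lock is held
	retried = tryLockWithRetry(&mu, attempts, time.Duration(waitMs)*time.Millisecond)
	return single, retried
}

func main() {
	single, retried := simulateCollision(50, 50, 10)
	fmt.Println("single attempt acquired lock:", single)
	fmt.Println("bounded retry acquired lock: ", retried)
}
```

A retry alone does not remove the need for apply to verify which commit a plan belongs to, but it would give the new commit's plan a recovery path instead of failing on first contact.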
