Description
For the last week or so we have not been able to run GitHub Actions workflow pipelines on our large hosted arm64 runners without hitting this issue at some random point in their processing.
The failures are random, but they outnumber successful executions. A job that succeeds on one attempt will not necessarily succeed when rerun; in line with the overall failure rate, a rerun is likely to fail 80%+ of the time.
There has been no change to the underlying tooling or dependencies, and the same operations that fail on the GitHub large hosted arm64 runners run consistently and repeatedly in alternate ARM64 environments without any issue.
The last successful pipelines ran on the 29th of January on the ubuntu-24.04 arm64 2.321.0 build. Since then, multiple pipelines using the same pattern and these runners no longer work.
Platforms affected
Azure DevOps
GitHub Actions - Standard Runners
GitHub Actions - Larger Runners
Runner images affected
Ubuntu 20.04
Ubuntu 22.04
Ubuntu 24.04
macOS 13
macOS 13 Arm64
macOS 14
macOS 14 Arm64
macOS 15
macOS 15 Arm64
Windows Server 2019
Windows Server 2022
Windows Server 2025
Image version and build link
2.322.0
Is it regression?
2.321.0
Expected behavior
Pipeline runs successfully with no errors.
Actual behavior
The pipeline crashes around 80%+ of the time: one of the matrix jobs hits a signal: segmentation fault (core dumped) at some point in its workflow steps.
These pipelines use a Terragrunt action that parses and manages TG/TF stacks in a monorepo; however, we see the same impact on any repo that uses Terragrunt on the large hosted arm64 runners.
Repro steps
Build a pipeline that uses autero1/action-terragrunt@v3 (with terragrunt-version: "v0.50.14"), runs on a large hosted arm64-backed runner (we use the ubuntu-24.04 arm image), and performs some Terragrunt operation (e.g. init, validate or plan).
Invoke the pipeline and await its completion.
If it doesn't fail the first time, re-run it a few times to confirm.
For us, roughly 1 in 5 attempts progresses, or gets slightly further before failing again.
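For anyone trying to reproduce this, the steps above might translate into a workflow along the following lines. This is only a minimal sketch, not the reporter's actual pipeline: the runner label, the matrix stack paths, and the Terraform setup step are all assumptions introduced for illustration. Only the action reference and the terragrunt-version input come from the report.

```yaml
# Minimal repro sketch. Runner label and stack paths are hypothetical.
name: terragrunt-arm64-repro

on:
  workflow_dispatch:

jobs:
  terragrunt:
    # Hypothetical label: substitute the name of your own larger arm64 runner.
    runs-on: my-large-arm64-ubuntu-24.04
    strategy:
      # Let every matrix job finish so the failure rate is visible.
      fail-fast: false
      matrix:
        # Hypothetical stack paths; the original monorepo layout is not shown.
        stack: [stacks/a, stacks/b]
    steps:
      - uses: actions/checkout@v4

      # Assumption: the report does not say how Terraform was installed.
      - uses: hashicorp/setup-terraform@v3

      - name: Install Terragrunt
        uses: autero1/action-terragrunt@v3
        with:
          terragrunt-version: "v0.50.14"

      - name: Run Terragrunt operations
        working-directory: ${{ matrix.stack }}
        run: |
          terragrunt init
          terragrunt validate
```

Triggering this manually and re-running the failed jobs a few times should be enough to see whether the segmentation fault appears intermittently on a given runner image.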
julienbonastre
changed the title
Large Hosted arm64 Runners - Segmentation Fault
Large Hosted arm64 Runners - Segmentation Fault since v2.322.0 Ubuntu-24.04
Feb 5, 2025
Hi @julienbonastre, thank you for bringing this issue to our attention. We are looking into it and will update you after investigating.