Actions: aws/aws-k8s-tester

CI

455 workflow runs

fix(nvidia/unit): avoid imds calls, expect h100 persistence
CI #550: Pull request #587 opened by mselim00
February 20, 2025 03:53 5h 22m 27s mselim00:fix-nvidia-unit
fix(nvidia): use ld preload file instead of LD_PRELOAD
CI #549: Pull request #586 synchronize by ndbaker1
February 20, 2025 01:53 5h 24m 56s ndbaker1:ld
fix(nvidia): use ld preload file instead of LD_PRELOAD
CI #548: Pull request #586 opened by ndbaker1
February 20, 2025 01:50 5h 21m 36s ndbaker1:ld
fix(nvidia/nccl): wait on plugins ready before collecting node info
CI #547: Pull request #585 opened by mselim00
February 19, 2025 22:45 5h 17m 13s mselim00:fix-nccl
fix: use rendered manifest for single-node training teardown
CI #546: Pull request #584 opened by ndbaker1
February 18, 2025 22:28 5h 30m 24s ndbaker1:fix-manifest
build(nvidia): pin cuda-samples version
CI #543: Pull request #583 opened by ndbaker1
February 18, 2025 04:32 5h 31m 51s ndbaker1:cuda-samples
feat: add --cluster-creation-timeout flag
CI #542: Pull request #582 opened by cartermckinnon
February 14, 2025 05:51 11m 21s cluster-timeouts
fix: nil pointer when cluster creation fails
CI #541: Pull request #581 opened by cartermckinnon
February 13, 2025 21:49 14m 53s cluster-nil-ptr