fix(ci): resolve failures from telemetry regression, unhandled rejections, and flaky tests by aseemxs · Pull Request #8680 · aws/aws-toolkit-vscode

aseemxs · 2026-03-20T01:59:30Z

Problem

Multiple CI checks are failing across macOS, Windows, and Linux (CodeBuild). All 5849+ unit tests pass; the failures come from test assertion regressions, unhandled promise rejections detected by the run_and_report script, and pre-existing flaky tests.

Root Causes & Fixes

1. Lambda URI Handler telemetry — 3 test failures (all platforms)

Introduced by: #8598 (fix(lambda): add confirmation prompt before initiating console login)

What broke: #8598 changed handleLambdaUriError from throwing a cancelled ToolkitError to calling telemetry.record() + return. This broke telemetry because run() overwrites the span result with Succeeded when the function completes normally — the record() call gets clobbered.

Fix: Revert the cancellation path to throw ToolkitError.chain(e, 'User cancelled operation', { cancelled: true }). This lets run() correctly record Cancelled as the result. Tests updated to assert the throw instead of checking telemetry directly.

Files: uriHandlers.ts (functional), uriHandlers.test.ts (test)
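
The clobbering described above can be reproduced with a minimal sketch. `runSpan`, `Span`, and `cancelledError` are hypothetical stand-ins, not the toolkit's telemetry API or `ToolkitError`. The only behavior assumed from the description is that the wrapper sets the span result after the callback finishes.

```typescript
// Hypothetical stand-ins for illustration only; not the toolkit's real API.
type Result = 'Succeeded' | 'Failed' | 'Cancelled'

interface Span {
    result?: Result
    record(data: { result: Result }): void
}

interface CancellableError extends Error {
    cancelled?: boolean
}

// Builds an error flagged as a user cancellation (stand-in for a cancelled ToolkitError).
function cancelledError(message: string): CancellableError {
    const err: CancellableError = new Error(message)
    err.cancelled = true
    return err
}

// Runs `fn` inside a span and derives the final result from how `fn` finishes.
function runSpan(fn: (span: Span) => void): Result | undefined {
    const span: Span = {
        record(data) {
            this.result = data.result
        },
    }
    try {
        fn(span)
        // Normal completion: whatever `fn` recorded is overwritten here.
        span.result = 'Succeeded'
    } catch (e) {
        const cancelled = e instanceof Error && (e as CancellableError).cancelled === true
        span.result = cancelled ? 'Cancelled' : 'Failed'
    }
    return span.result
}

// Record + return: the recorded value is clobbered by the wrapper.
const recorded = runSpan((span) => {
    span.record({ result: 'Cancelled' })
})

// Throw: the wrapper derives the result from the error.
const thrown = runSpan(() => {
    throw cancelledError('User cancelled operation')
})

console.log(recorded, thrown) // prints "Succeeded Cancelled"
```

Under these assumptions, throwing is the only path on which the wrapper sees the cancellation, which is why the tests assert on the throw.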

2. AppBuilder Walkthrough test — 1 test failure (macOS minimum)

Introduced by: Interaction between the walkthrough test (#8236) and background security scan errors from startSecurityScan.ts. The test's onDidShowMessage handler asserted on every message, but an unrelated security scan error message fired during the test.

Fix: Filter the onDidShowMessage handler to only assert on the expected overwrite prompt message, ignoring unrelated ones. Added explicit assertion that the prompt was shown.

Files: walkthrough.test.ts (test only)
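
A minimal sketch of the filtering idea follows. The `ShownMessage` shape, the `/overwrite/i` pattern, and the `'Yes'` label are assumptions made for illustration; the real test window API and prompt text are not shown in this description.

```typescript
// Hypothetical shape of a message surfaced by the test window.
interface ShownMessage {
    text: string
    selectItem(label: string): void
}

// Creates a handler that reacts only to messages matching `expected`
// and remembers whether the expected prompt was ever shown.
function createPromptHandler(expected: RegExp) {
    const state = { promptShown: false }
    const handler = (msg: ShownMessage): void => {
        if (!expected.test(msg.text)) {
            return // unrelated message, e.g. a background scan error
        }
        state.promptShown = true
        msg.selectItem('Yes')
    }
    return { handler, state }
}

const picked: string[] = []
const { handler, state } = createPromptHandler(/overwrite/i)

// An unrelated error fires first and is ignored.
handler({ text: 'Security scan failed', selectItem: (label) => picked.push(label) })
// The expected prompt is answered.
handler({ text: 'Overwrite existing files?', selectItem: (label) => picked.push(label) })
```

The explicit `promptShown` flag matters: once unrelated messages are ignored, a test that never sees the prompt would otherwise pass silently.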

3. Unhandled promise rejections — Linux CodeBuild failures (all 3 variants)

Introduced by:

- SageMaker server throw when env var missing: #8589 (feat(sagemaker): Merge SageMaker SSH Kiro integration)
- CloudFormation StackActionCodeLensProvider calling sendRequest on stopped client: #8275 (feat(cloudformation): Merge CloudFormation LSP integration)
- CloudFormation deactivate() client.stop() rejecting: same PR, #8275

What broke: The run_and_report script in CodeBuild fails the build if it detects "rejected promise not handled" in stdout. All 5849 tests pass, but these 5 unhandled rejections triggered the check.

Fix:

- SageMaker: console.error + return instead of throw for missing env var, which avoids the unhandled rejection while keeping the extension host alive
- CloudFormation CodeLens: isRunning() guard + try/catch before sendRequest, errors logged at debug level
- CloudFormation deactivate: .catch() on client.stop()

Files: server.ts, stackActionCodeLensProvider.ts, extension.ts (all functional — adds resilience)
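
The two CloudFormation guards can be sketched as below. `ClientLike` is a minimal stand-in interface declared here so the example is self-contained. The request name `'hypothetical/stackActions'`, the function names, and the in-memory `debugLog` are assumptions, not the toolkit's code.

```typescript
// Minimal stand-in for the parts of a language client used here (assumption).
interface ClientLike {
    isRunning(): boolean
    sendRequest(method: string): Promise<unknown>
    stop(): Promise<void>
}

// Stand-in for a debug-level logger.
const debugLog: string[] = []

// Guarded request: skip when the client is stopped, and never let a rejection escape.
async function requestStackActions(client: ClientLike): Promise<unknown[]> {
    if (!client.isRunning()) {
        return []
    }
    try {
        const response = await client.sendRequest('hypothetical/stackActions')
        return Array.isArray(response) ? response : []
    } catch (e) {
        debugLog.push(`stack action request failed: ${String(e)}`)
        return []
    }
}

// Deactivation: attach a handler so a rejecting stop() is not left unhandled.
function safeStop(client: ClientLike): Promise<void> {
    return client.stop().catch((e) => {
        debugLog.push(`client stop failed: ${String(e)}`)
    })
}
```

Both paths resolve instead of rejecting, so nothing reaches the "rejected promise not handled" check, while the logged entries keep the failures visible.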

4. CloudFormation LSP client lifecycle rejections — Linux CodeBuild minimum

Introduced by: #8275 (feat(cloudformation): Merge CloudFormation LSP integration)

What broke: The vscode-languageclient LanguageClient creates unhandled promise rejections internally when the LSP server process fails to start during CI. These rejections (connection got disposed, Client is not running) happen inside the library's handleConnectionClosed/doInitialize and cannot be caught from our code.

Fix: Skip the CloudFormation LSP client activation when AWS_TOOLKIT_AUTOMATION === 'unit'. No unit tests depend on the LSP client — the CloudFormation unit tests cover template parsing, not the language server. E2E tests are unaffected (they use a different automation value).

Files: extension.ts (functional — conditional activation)
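
A minimal sketch of the conditional activation follows. The environment is passed in as a plain object so the example stays self-contained; in the extension it would presumably come from `process.env`. The function names and the `startClient` callback are hypothetical.

```typescript
type Env = Record<string, string | undefined>

// True unless the run is marked as a unit-test run.
function shouldActivateLspClient(env: Env): boolean {
    return env['AWS_TOOLKIT_AUTOMATION'] !== 'unit'
}

// Starts the client only when allowed; reports whether it was started.
function activateCloudFormation(env: Env, startClient: () => void): boolean {
    if (!shouldActivateLspClient(env)) {
        return false
    }
    startClient()
    return true
}
```

Checking for the exact value `'unit'`, rather than whether the variable is set at all, is what keeps E2E runs with a different automation value on the normal activation path.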

5. Pre-existing flaky test fixes

These are not caused by any recent PR but fail intermittently across CI:

SharedCredentialsProvider (handleInvalidConsoleCredentials — does not prompt reload for non-session errors):

- Cause: the previous test's messages leak through getTestWindow().shownMessages because the array isn't cleared between tests
- Fix: snapshot the count before the test and assert no new messages were added
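
The snapshot approach can be sketched as follows. `shownMessages` here is a plain array standing in for the shared test window state, and `handleError` is a hypothetical function under test.

```typescript
// Shared state that already contains a message leaked from an earlier test.
const shownMessages: string[] = ['left over from a previous test']

// Hypothetical function under test: prompts only for session errors.
function handleError(isSessionError: boolean, show: (message: string) => void): void {
    if (isSessionError) {
        show('Reload required')
    }
}

// Snapshot the count, run the code, then look only at what was added.
const countBefore = shownMessages.length
handleError(false, (message) => shownMessages.push(message))
const newMessages = shownMessages.slice(countBefore)

console.log(newMessages.length) // prints 0
```

An absolute check such as `shownMessages.length === 0` would fail here because of the leaked entry, even though the code under test behaved correctly.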

ToolkitLogger (logs to a file — logs warn):

- Cause: Windows file I/O race; the log file exists but the write hasn't flushed yet, so a single read misses the content
- Fix: retry loop that keeps reading until the expected content appears (within the existing 10s timeout)
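
A minimal sketch of such a retry loop follows. The reader is injected as a function so the example does not depend on the file system. `waitForContent` and its parameters are hypothetical names, and the 50 ms polling interval is an arbitrary choice.

```typescript
// Polls `read` until its output contains `expected` or the deadline passes.
async function waitForContent(
    read: () => Promise<string>,
    expected: string,
    timeoutMs: number,
    intervalMs: number = 50
): Promise<string> {
    const deadline = Date.now() + timeoutMs
    for (;;) {
        const text = await read()
        if (text.indexOf(expected) !== -1) {
            return text
        }
        if (Date.now() >= deadline) {
            throw new Error(`content "${expected}" did not appear within ${timeoutMs} ms`)
        }
        // Give the writer time to flush before reading again.
        await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
    }
}
```

In a test, `read` would wrap the log file read, and `timeoutMs` would stay below the test's own timeout so that a genuine failure reports the missing content rather than a generic timeout.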

editorContext + recommendationHandler (30s timeout on macOS insiders):

- Cause: slow VS Code startup plus async telemetry/mock setup exceeds the default 30s Mocha timeout
- Fix: bump to 60s for these two tests

Files: sharedCredentialsProvider.test.ts, toolkitLogger.test.ts, editorContext.test.ts, recommendationHandler.test.ts (all test only)

Functional vs Test-Only Changes

| File | Type | Change |
| --- | --- | --- |
| `uriHandlers.ts` | Functional | Reverts cancellation to throw for correct telemetry |
| `server.ts` | Functional | console.error + return instead of throw for missing env var |
| `stackActionCodeLensProvider.ts` | Functional | isRunning() guard + debug logging on catch |
| `extension.ts` | Functional | .catch() on deactivate, skip LSP in unit tests |
| `uriHandlers.test.ts` | Test only | Updated assertions |
| `walkthrough.test.ts` | Test only | Filtered onDidShowMessage handler |
| `sharedCredentialsProvider.test.ts` | Test only | Snapshot message count instead of absolute check |
| `toolkitLogger.test.ts` | Test only | Retry loop for file content read |
| `editorContext.test.ts` | Test only | Timeout bump to 60s |
| `recommendationHandler.test.ts` | Test only | Timeout bump to 60s |

Testing

- All GitHub Actions checks pass (macOS/Windows/Linux × stable/minimum/insiders × toolkit/amazonq)
- All 3 Linux CodeBuild variants pass (stable/minimum/insiders)
- CloudFormation LSP E2E tests pass (all 3 platforms)

amazon-inspector-ohio · 2026-03-20T01:59:34Z

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

amazon-inspector-ohio · 2026-03-20T02:00:36Z

✅ I finished the code review, and didn't find any security or code quality issues.

The vscode-languageclient LanguageClient creates unhandled promise rejections internally when the LSP server process fails to start during CI. These rejections (connection disposed, client not running) occur inside the library's handleConnectionClosed/doInitialize and cannot be caught from our code. Skip the LSP client entirely during unit tests since no tests depend on it. Reverts the process.on('unhandledRejection') approach which was too broad and wouldn't suppress VS Code's own rejection tracking.

… timeout tests
- SharedCredentialsProvider: check no *new* messages instead of absolute count (previous test's messages leak via getTestWindow)
- ToolkitLogger: retry reading log file content instead of single read (Windows file I/O flush race condition)
- editorContext/recommendationHandler: bump timeout to 60s for slow macOS insiders CI environment

github-actions · 2026-03-20T04:25:45Z

This pull request implements a feat or fix, so it must include a changelog entry (unless the fix is for an unreleased feature). Review the changelog guidelines.
- Note: beta or "experiment" features that have active users should announce fixes in the changelog.
- If this is not a feature or fix, use an appropriate type from the title guidelines. For example, telemetry-only changes should use the telemetry type.

ci: trigger CI check

7623371

aseemxs added 5 commits March 19, 2026 19:11

fix(lambda): revert cancellation handling to throw for correct telemetry

d62517a

fix(test): ignore unrelated messages in walkthrough test onDidShowMes…

c4202f9

…sage handler

fix: guard unhandled promise rejections in SageMaker server and Cloud…

58d7931

…Formation LSP client

fix: suppress CloudFormation LSP client unhandled rejections during l…

f3ed47f

…ifecycle

aseemxs closed this Mar 20, 2026

aseemxs reopened this Mar 20, 2026

fix: only skip CloudFormation LSP client for unit tests, not E2E

0052a3a

aseemxs changed the title ~~ci: dummy PR to check CI status~~ fix: resolve CI failures from Lambda telemetry regression, unhandled promise rejections, and flaky test assertions Mar 20, 2026

fix: address review feedback — log caught errors, exit on missing env…

7a68805

… var
- CodeLens: log caught errors at debug level instead of silent swallow
- SageMaker server: process.exit(1) instead of return when env var missing; the server can't function without it and shouldn't linger

aseemxs closed this Mar 20, 2026

aseemxs reopened this Mar 20, 2026

aseemxs closed this Mar 20, 2026

aseemxs reopened this Mar 20, 2026

aseemxs changed the title ~~fix: resolve CI failures from Lambda telemetry regression, unhandled promise rejections, and flaky test assertions~~ fix: resolve CI failures from telemetry regression, unhandled rejections, and flaky tests Mar 20, 2026

aseemxs closed this Mar 20, 2026

aseemxs reopened this Mar 20, 2026

aseemxs changed the title ~~fix: resolve CI failures from telemetry regression, unhandled rejections, and flaky tests~~ fix(ci): resolve failures from telemetry regression, unhandled rejections, and flaky tests Mar 20, 2026

aseemxs closed this Mar 20, 2026

aseemxs reopened this Mar 20, 2026

fix: revert process.exit(1) — it kills the extension host during tests

0e04cee

The SageMaker detached server runs inside the VS Code extension host process. process.exit(1) terminates the entire test runner. Use console.error + return instead.
