Emit launch latency #1696

Merged
tylerwowen merged 12 commits into master from touyang/lauch-latency
Aug 16, 2024
@tylerwowen tylerwowen commented Aug 14, 2024

Changes

TAS

Previously launch latency was emitted by a Rodimus worker. This PR implements the same measure in the TAS.

Launch latency is defined as the duration from host launch to the completion of the first deployment. It is measured for all environments deployed on new hosts. The Launch grace period configuration is the user-set threshold for this latency, and it is used by the AgentJanitor and the Rodimus health check.
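
For reviewers who want the definition in concrete terms, here is a minimal Python sketch of the measurement and its comparison against the grace period. It is not the TAS code from this PR: the function names, the `launch_grace_period_sec` parameter, and the timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def launch_latency(launched_at: datetime, first_deploy_completed_at: datetime) -> timedelta:
    """Duration from host launch to completion of the first deployment."""
    if first_deploy_completed_at < launched_at:
        raise ValueError("first deployment cannot complete before the host launches")
    return first_deploy_completed_at - launched_at


def exceeds_grace_period(latency: timedelta, launch_grace_period_sec: int) -> bool:
    """True when the latency is above the user-set launch grace period (hypothetical unit: seconds)."""
    return latency.total_seconds() > launch_grace_period_sec


# Hypothetical timestamps: the host launched at 12:00:00 and its first deploy finished at 12:07:30.
launched = datetime(2024, 8, 14, 12, 0, 0, tzinfo=timezone.utc)
completed = datetime(2024, 8, 14, 12, 7, 30, tzinfo=timezone.utc)
latency = launch_latency(launched, completed)
print(latency.total_seconds(), exceeds_grace_period(latency, 600))
```

With a grace period of 600 seconds this host is within the threshold; with 300 seconds it would be over it.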

I also found that all first deploy metrics were reported as success, so I updated the condition for when the first_deploy flag should be turned off.
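
One way to picture the intended behavior, as a hedged Python sketch rather than the actual change: the flag stays on until the first deployment reaches a terminal state, and the counter is tagged with that outcome. The state names, `HostDeployState`, and `record_deploy_report` are hypothetical and only illustrate the idea behind the test plan below.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical terminal states; the real service may use different names.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


@dataclass
class HostDeployState:
    first_deploy: bool = True  # on until the first deployment finishes


def record_deploy_report(state: HostDeployState, status: str, counters: Counter) -> None:
    """Count the first deploy once, tagged with its outcome, then turn the flag off."""
    if not state.first_deploy or status not in TERMINAL_STATES:
        return  # either not the first deploy, or it is still in progress
    success = status == "SUCCEEDED"
    counters[("first_deploy", success)] += 1
    state.first_deploy = False


counters = Counter()
host = HostDeployState()
record_deploy_report(host, "RUNNING", counters)    # ignored: not terminal yet
record_deploy_report(host, "FAILED", counters)     # counted with success=False
record_deploy_report(host, "SUCCEEDED", counters)  # ignored: no longer the first deploy
```

In this sketch a failed first deploy is counted with success=False, and a later successful deploy on the same host does not overwrite it.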

UI

While updating the group status page to include the launch latency, I remembered that the original plan was to create a dashboard. So I removed the links added in #1694 and created a new Teletraan user dashboard that includes these changes.

I also fixed the launch failure rate graph and updated some metrics calculations.

Test plan

TAS

  1. Deploy this PR to TAS dev1
  2. Launch a new host in tyler/test
    • Launch latency should be emitted.
    • First deploy counter should increment with success=true.
  3. Deploy a bad build to tyler/test
  4. Launch a new host
    • Launch latency should be emitted.
    • First deploy counter should increment with success=false.

UI

  1. Deploy this PR to deploy-board dev1
  2. Visit group status page /groups/tyler-test/
    • Verify the link is updated and working.

@github-actions github-actions bot added the deploy-service Includes changes to deploy-service label Aug 14, 2024
@tylerwowen tylerwowen force-pushed the touyang/lauch-latency branch from 7a6a6b6 to 2e9c31a Compare August 15, 2024 00:32
@github-actions github-actions bot added the deploy-board Includes changes to deploy-board label Aug 15, 2024
@tylerwowen tylerwowen marked this pull request as ready for review August 15, 2024 23:40
@tylerwowen tylerwowen requested a review from a team as a code owner August 15, 2024 23:40
osoriano previously approved these changes Aug 16, 2024
@tylerwowen tylerwowen requested review from osoriano August 16, 2024 18:51
@tylerwowen tylerwowen merged commit 5de012e into master Aug 16, 2024
6 checks passed
@tylerwowen tylerwowen deleted the touyang/lauch-latency branch August 16, 2024 20:59