Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding health report to logstash integration #12464

Closed
wants to merge 25 commits into from
Closed

Conversation

flexitrev
Copy link
Contributor

@flexitrev flexitrev commented Jan 24, 2025

Proposed commit message

Added data and dashboards including Logstash Health report

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Closes #12465

Screenshots

Node Health

Node Health

Pipeline Health

Pipeline Health

@flexitrev flexitrev requested a review from a team as a code owner January 24, 2025 22:23
@flexitrev flexitrev requested a review from robbavey January 24, 2025 22:24
@flexitrev flexitrev linked an issue Jan 24, 2025 that may be closed by this pull request
@andrewkroh andrewkroh added dashboard Relates to a Kibana dashboard bug, enhancement, or modification. Integration:logstash Logstash Team:Stack Monitoring Stack Monitoring team [elastic/stack-monitoring] labels Jan 24, 2025
@jsvd jsvd self-requested a review January 27, 2025 14:25
@yaauie
Copy link
Member

yaauie commented Jan 27, 2025

  • On the "General Pipeline Information" section:
    • I would not expect TERMINATED pipelines to have load averages; the presence of the 0's and green color in those columns visually tells me that everything is A-okay.
    • Is "Load" in this section short-hand for Worker Utilization? Although it's shorter, I'm wary of overloading the word "Load" to mean new things (from a dev-ops perspective, it already has a very specific meaning related to CPU instructions queued per CPU tick)
    • Can we color-code the whole row based on the "Status" column (and if so, do we have prior-art for color-sets that are less in-your-face than #00ff00 #999999 #ffff00 #ff0000 )?
    • It feels like this is a smash-up of our two initial probes, but what happens when we add more?
      • in our diagnostic tool, we can compare two windows, IIRC it is always "current" against a selectable baseline, which lets us reduce the number of columns.

@andrewkroh andrewkroh added the enhancement New feature or request label Jan 27, 2025
@strawgate
Copy link
Contributor

  • I would not expect TERMINATED pipelines to have load averages; the presence of the 0's and green color in those columns visually tells me that everything is A-okay.

@flexitrev With the above sections covering terminated pipelines, we could change the bottom section to be running pipelines and just filter out terminated pipelines.

Can we color-code the whole row based on the "Status" column (and if so, do we have prior-art for color-sets that are less in-your-face than #00ff00 #999999 #ffff00 #ff0000 )?

I do not believe you can color-code the whole row based on a cell -- you can color code the cell based on its value, though. As for the colors themselves, the color sets are the EUI palettes that ship with Kibana and will follow any changes the kibana team makes to the values for the colors. In general I think we should stick with the built-in palettes for integrations we ship to customers.

@flexitrev
Copy link
Contributor Author

Added a filter to the Worker utilization by pipeline to exclude terminated pipelines. Should I exclude finished too?

Colors are expected to change with project borealis - If they don't automatically get picked up by us picking builtin colors, we'll update later. https://github.com/elastic/search-team/issues/8943

@flexitrev
Copy link
Contributor Author

flexitrev commented Jan 28, 2025

Updated dashboard, simplified worker utilization by pipeline table, modified visualization names, changed status to percentage.

hr_screenshot

@flexitrev flexitrev requested a review from strawgate January 28, 2025 20:40
@flexitrev
Copy link
Contributor Author

Thanks! I tried it at the top level manifest but not on the individual streams -
image

@flexitrev
Copy link
Contributor Author

It looks like we removed event.dataset prematurely -- it looks like CI was failing because sample-event.json had two values for event.dataset: https://buildkite.com/elastic/integrations/builds/21145#0194aee4-6d12-43ad-8f03-233889b509f1

build/test-coverage/coverage-logstash-pipeline-1738101594676982809-report.xml

<testcase name="static test: Verify sample_event.json" classname="logstash.health_report" time="0.115375683">
<failure>test case failed: one or more errors found in document: [0] field "event.dataset" should have value in ["logstash.health_report"], it has "logstash.pipeline"</failure>
</testcase>

In the sample_event.json when CI was failing there were two event.dataset values in the document: 13ed93d

"event": {
  "agent_id_status": "verified",
  "ingested": "2025-01-28T18:41:28Z",
  "dataset": "logstash.health_report"
},
"event.dataset": "logstash.pipeline"
}

@strawgate Yes, I thought it best to just keep logstash.health_report. Should I keep both somehow? Rename event.dataset:logstash.pipeline?

@strawgate
Copy link
Contributor

Ideally, there would be two sample events, the node sample event and the pipeline sample event?

@flexitrev
Copy link
Contributor Author

Ideally, there would be two sample events, the node sample event and the pipeline sample event?

That might require defining 2 datastreams health_report.node and health_report.pipeline

@elasticmachine
Copy link

⏳ Build in-progress, with failures

Failed CI Steps

History

@andrewkroh andrewkroh added Integration:1password 1Password Integration:abnormal_security Abnormal Security New Integration Issue or pull request for creating a new integration package. Team:Security-Service Integrations Security Service Integrations Team [elastic/security-service-integrations] labels Feb 3, 2025
@elasticmachine
Copy link

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@flexitrev
Copy link
Contributor Author

Cutting a new branch and creating a new PR - #12601

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dashboard Relates to a Kibana dashboard bug, enhancement, or modification. enhancement New feature or request Integration:logstash Logstash New Integration Issue or pull request for creating a new integration package. Team:Security-Service Integrations Security Service Integrations Team [elastic/security-service-integrations] Team:Stack Monitoring Stack Monitoring team [elastic/stack-monitoring]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants