AAW dev: system nodepools track resource usage #2001

Closed
Jose-Matsuda opened this issue Dec 11, 2024 · 2 comments

@Jose-Matsuda
Contributor

EPIC
Follow-up to changing the workload sizes.

We want to observe whether any workloads are being evicted or otherwise behaving abnormally. This validates the work done in #1992.

@jacek-dudek

Here is a table of resource usage metrics comparing the gap between actual usage and resource requests before and after the requests were adjusted. Every workload whose requests were adjusted now tracks its new requests much more closely, in some cases by a factor of 10 or more.
filtered-resource-utilization-on-aaw-dev-system-nodes.ods
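
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of one way to quantify how closely a workload tracks its request. It assumes the discrepancy is measured as the ratio of request to actual usage; that definition, the function name `overprovision_ratio`, and all numbers are hypothetical and are not taken from the attached spreadsheet.

```python
def overprovision_ratio(requested: float, used: float) -> float:
    """Return how many times larger the request is than the observed usage."""
    if used <= 0:
        raise ValueError("usage must be positive")
    return requested / used


# Hypothetical CPU figures in millicores (not real AAW dev data).
used_m = 50            # observed usage
old_request_m = 1000   # request before adjustment
new_request_m = 100    # request after adjustment

before = overprovision_ratio(old_request_m, used_m)
after = overprovision_ratio(new_request_m, used_m)

# How much closer the workload now tracks its request.
improvement = before / after

print(f"before: {before:.1f}x, after: {after:.1f}x, improvement: {improvement:.1f}x")
```

A ratio close to 1 means the request matches usage; a large ratio means the workload reserves far more than it consumes.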

@Jose-Matsuda
Contributor Author

Reviewed the pods in the .ods file and LGTM.
The metrics on Grafana are fine (memory and CPU are well provisioned).
The pods in k9s show 0 restarts, which is good: it means they are not being starved of resources and going into CrashLoopBackOff.
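
The restart check above was done by eye in k9s. As a rough sketch, the same check could be scripted against pod data shaped like the output of `kubectl get pods -o json`. The `restartCount` and `state.waiting.reason` fields are part of the Kubernetes pod status; the function name `unhealthy_pods` and the sample pods are hypothetical.

```python
def unhealthy_pods(pod_list: dict) -> list:
    """Return names of pods with any container restarts or in CrashLoopBackOff."""
    flagged = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            restarted = cs.get("restartCount", 0) > 0
            crash_looping = waiting.get("reason") == "CrashLoopBackOff"
            if restarted or crash_looping:
                flagged.append(name)
                break  # one bad container is enough to flag the pod
    return flagged


# Hypothetical sample in the shape of `kubectl get pods -o json` output.
sample = {
    "items": [
        {
            "metadata": {"name": "example-a"},
            "status": {"containerStatuses": [
                {"restartCount": 0, "state": {"running": {}}},
            ]},
        },
        {
            "metadata": {"name": "example-b"},
            "status": {"containerStatuses": [
                {"restartCount": 7,
                 "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
            ]},
        },
    ]
}

print(unhealthy_pods(sample))
```

An empty result corresponds to the "0 restarts" observation; note that restarts can have causes other than resource starvation, so a flagged pod still needs a look at its events and logs.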


2 participants