[Flaky test] Metrics-related e2e tests fail occasionally #4139

Open
mimowo opened this issue Feb 3, 2025 · 2 comments · May be fixed by #4207
Labels: kind/bug, kind/flake

Comments

@mimowo
Contributor

mimowo commented Feb 3, 2025

/kind flake

What happened:

Metrics-related e2e tests failed in CI runs on an unrelated branch.

What you expected to happen:

No failures.

How to reproduce it (as minimally and precisely as possible):

Re-run the e2e tests on CI.

Anything else we need to know?:

{Timed out after 10.143s.
The function passed to Eventually failed at /home/prow/go/src/sigs.k8s.io/kueue/test/e2e/singlecluster/metrics_test.go:623 with:
Metric kueue_local_queue_quota_reserved_workloads_total was found in output # HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 23
# HELP controller_runtime_active_workers Number of currently used workers per controller
# TYPE controller_runtime_active_workers gauge
controller_runtime_active_workers{controller="admissioncheck"} 0
controller_runtime_active_workers{controller="appwrapper"} 0
controller_runtime_active_workers{controller="cert-rotator"} 0
controller_runtime_active_workers{controller="clusterqueue"} 0
controller_runtime_active_workers{controller="cohort"} 0
controller_runtime_active_workers{controller="job"} 0
controller_runtime_active_workers{controller="jobset"} 0
controller_runtime_active_workers{controller="leaderworkerset"} 0
controller_runtime_active_workers{controller="leaderworkerset-pod"} 0
controller_runtime_active_workers{controller="localqueue"} 0
controller_runtime_active_workers{controller="multikueue-admissioncheck"} 0
controller_runtime_active_workers{controller="multikueue-workload"} 0
controller_runtime_active_workers{controller="multikueuecluster"} 0
controller_runtime_active_workers{controller="provisioning-admissioncheck"} 0
controller_runtime_active_workers{controller="provisioning-workload"} 0

...

leader_election_master_status{name="c1f6bfd2.kueue.x-k8s.io"} 1
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 9.36
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.073741816e+09
# HELP process_network_receive_bytes_total Number of bytes received by the process over the network.
# TYPE process_network_receive_bytes_total counter
process_network_receive_bytes_total 1.072063e+07
# HELP process_network_transmit_bytes_total Number of bytes sent by the process over the network.
# TYPE process_network_transmit_bytes_total counter
process_network_transmit_bytes_total 8.970838e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 32
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 9.8254848e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.73859753992e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.336942592e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

Expected
    <bool>: true
to be false
In [It] at: /home/prow/go/src/sigs.k8s.io/kueue/test/e2e/singlecluster/metrics_test.go:625 @ 02/03/25 15:49:24.579
}

The failure output does not seem to provide useful information for debugging. It would be great to see exactly which metric is missing.
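One possible way to make the failure easier to read is to report only the sample lines for the metric under test instead of the full endpoint dump. The following is a minimal sketch, assuming the output is in the Prometheus text format shown above. `metricSamples` is a hypothetical helper name and is not part of Kueue's existing test utilities.

```go
package main

import "strings"

// metricSamples returns only the sample lines of a Prometheus text-format
// dump whose metric name equals the given name. "# HELP" and "# TYPE"
// comment lines and blank lines are skipped.
// Hypothetical helper for illustration; not an existing Kueue function.
func metricSamples(output, metric string) []string {
	var samples []string
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// A sample line looks like "<name>{labels} <value>" or "<name> <value>",
		// so the name ends at the first '{' or space.
		name := line
		if i := strings.IndexAny(line, "{ "); i >= 0 {
			name = line[:i]
		}
		// Compare the whole name so that "foo" does not match "foo_total".
		if name == metric {
			samples = append(samples, line)
		}
	}
	return samples
}
```

In a Gomega assertion, the test could then check something like `g.Expect(metricSamples(out, name)).To(BeEmpty())`, so that a failure prints only the offending series with their labels rather than every metric exposed by the controller.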

mimowo added the kind/bug label Feb 3, 2025
k8s-ci-robot added the kind/flake label Feb 3, 2025
@mimowo
Contributor Author

mimowo commented Feb 3, 2025

cc @mbobrovskyi @mykysha PTAL

@mykysha
Contributor

mykysha commented Feb 3, 2025

/assign
