Skip to content

Add higher-level of testing for "queue-name" handling in pod-based workloads #3767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mimowo opened this issue Dec 9, 2024 · 8 comments · Fixed by #4112 or #4431
Closed

Add higher-level of testing for "queue-name" handling in pod-based workloads #3767

mimowo opened this issue Dec 9, 2024 · 8 comments · Fixed by #4112 or #4431
Assignees
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.

Comments

@mimowo
Copy link
Contributor

mimowo commented Dec 9, 2024

What would you like to be cleaned:

Add integration-level testing for pod-based integrations (Pod themselves, STS, Deployment).
In particular cover scenarios related to the recent changes around namespace filtering and LQ defautling.

We may also consider adding a suite for e2e tests with manageJobsWithoutQueueName: true.

Why is this needed:

The logic around "queue-name" handling is getting complex, but the tests for Jobs aren't enough, because the handling is mostly separate for pod-based serving workloads.

@mimowo mimowo added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Dec 9, 2024
@mimowo
Copy link
Contributor Author

mimowo commented Dec 9, 2024

@mimowo mimowo changed the title Add higher-level of testing for "queue-name" handling in pod-based workloads Add integration-level of testing for "queue-name" handling in pod-based workloads Dec 12, 2024
@mimowo mimowo changed the title Add integration-level of testing for "queue-name" handling in pod-based workloads Add higher-level of testing for "queue-name" handling in pod-based workloads Dec 12, 2024
@mimowo
Copy link
Contributor Author

mimowo commented Dec 12, 2024

Actually, I think with #3804 it should be relatively simple to start a new e2e suite within the same CI job with that configuration. So, I would suggest starting with that. The Pod-based integration is easier to test e2e than integration, because pods are actually created by kube-controller-manager's controllers (Deployment or StatefulSets). cc @dgrove-oss @tenzen-y @mbobrovskyi .

We could start with the e2e tests for the "sanity" checking and add more integration tests for corner cases.

@mbobrovskyi
Copy link
Contributor

/assign

@mbobrovskyi
Copy link
Contributor

/unassign

Sorry, I don't have capacity to work on it for now.

@kaisoz
Copy link
Contributor

kaisoz commented Jan 13, 2025

/assign

I'll take this one since is a requirement to tackle #3829

@mimowo
Copy link
Contributor Author

mimowo commented Feb 10, 2025

/reopen
For the follow ups: #4112 (review)

@k8s-ci-robot k8s-ci-robot reopened this Feb 10, 2025
@k8s-ci-robot
Copy link
Contributor

@mimowo: Reopened this issue.

In response to this:

/reopen
For the follow ups: #4112 (review)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mimowo
Copy link
Contributor Author

mimowo commented Feb 14, 2025

I think with the ground work of #4112 we are ready to add some tests coverage which is crucial for correctness of this configuration with the pod-based frameworks.

I'm thinking about covering the following scenarios:

  1. create Pod/Deployment/StatefulSet without queue-name in test namespace and verify it is suspended, then delete the API objects and verify the child pods are removed too
  2. create a Pod/Deployment/StatefulSet without queue-name in kube-system and verify they run, then delete the API objects and verify the child pods are removed too

I believe there are numerous nuances between the integrations which are very hard to test with integration test because they rely on pod creation and integration with kube-controller-manager (especially for StatefulSet).

The scenarios above will require enabling the integrations in the Kueue configuration. IIUC we could just add more "Its" for them here.

My preference is to split the work into small PRs so that we can react and change plans as we continue the work.

cc @dgrove-oss @tenzen-y

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.
Projects
None yet
4 participants