Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] Use RAYCLUSTER_DEFAULT_REQUEUE_SECONDS_ENV as timeout of status check in tests #1755

Merged
merged 1 commit into from
Dec 18, 2023

Conversation

rueian
Copy link
Contributor

@rueian rueian commented Dec 15, 2023

As @kevin85421 found out, we have set RAYCLUSTER_DEFAULT_REQUEUE_SECONDS_ENV to 10 seconds to speed up the tests. But we still wait for utils.RAYCLUSTER_DEFAULT_REQUEUE_SECONDS+5, which is 305 seconds and is too long, to fail a test.

I believe this inconsistency exists because there is no convenient converter for the env variable.

In this PR, I replaced all usages of RAYCLUSTER_DEFAULT_REQUEUE_SECONDS with RAYCLUSTER_DEFAULT_REQUEUE_SECONDS_ENV in tests with the help of a simple MustGetEnvInt converter.

This change can fail our tests faster if any unexpected occurs.

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 kevin85421 self-requested a review December 15, 2023 17:04
@kevin85421 kevin85421 self-assigned this Dec 15, 2023
@rueian rueian force-pushed the test-requeue-seconds-env branch from 5922f36 to 31a8c2e Compare December 16, 2023 02:48
…ut of status check in tests

It is only an insurance to keep reconciliation continuously triggered to hopefully fix an unexpected state.
Relying on it to fix state transitions is not a good pratice. Therefore, We should not reference it in tests at all.
@rueian rueian force-pushed the test-requeue-seconds-env branch from 31a8c2e to b5be747 Compare December 16, 2023 07:32
@rueian
Copy link
Contributor Author

rueian commented Dec 16, 2023

After discussing with @kevin85421, we think replace all the usages of RAYCLUSTER_DEFAULT_REQUEUE_SECONDS in tests with 15 seconds. Because we should only treat the re-queue of reconciliation as insurance to hopefully fix an unexpected state and we should not rely on it in tests.

TODO: We can consider later removing the os.Setenv(utils.RAYCLUSTER_DEFAULT_REQUEUE_SECONDS_ENV, "10") in suite_test.go to further enforce our reconciliation logic to be correct without the help of re-queuing.

@kevin85421 kevin85421 merged commit c5d7de6 into ray-project:master Dec 18, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants