wait_ready() fails when called before MCAD status object is created #226

Closed
MichaelClifford opened this issue Jul 14, 2023 · 2 comments

@MichaelClifford
Collaborator

In one case, when the cluster is out of capacity and MCAD fails to create a cluster, wait_ready() fails with:

TypeError: 'MissingModel' object is not callable
Failed to init Ray cluster, error 'MissingModel' object is not callable

This appears to be caused by line 469 in _map_to_app_wrapper() when no state yet exists.

def _map_to_app_wrapper(cluster) -> AppWrapper:
    cluster_model = cluster.model
    return AppWrapper(
        name=cluster.name(),
        status=AppWrapperStatus(cluster_model.status.state.lower()),
        can_run=cluster_model.status.canrun,
        job_state=cluster_model.status.queuejobstate,
    )

We should update cluster.status() to account for this case and keep the cluster status as "pending" until it is resolved or times out.
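A minimal sketch of the "stay pending until resolved or timed out" behaviour, not the SDK's actual wait_ready(). The function name wait_until_ready, the get_state callback, and the state strings "ready", "pending", and "failed" are all hypothetical and chosen only for illustration. A state of None stands in for "MCAD has not created a status object yet".

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    get_state: Callable[[], Optional[str]],
    timeout: float = 60.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it reports "ready".

    A missing state (None) is treated as "pending" instead of an error,
    so the caller keeps waiting until the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Hypothetical convention: None means no status object exists yet.
        state = get_state() or "pending"
        if state == "ready":
            return state
        if state == "failed":
            raise RuntimeError("cluster entered a failed state")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still {state!r} after {timeout} seconds")
        sleep(poll_interval)
```

The sleep parameter exists only so the loop can be exercised without real delays, for example by passing a no-op function.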

We should also add some error handling to _map_to_app_wrapper() so that the reason for failure is clear and the correct value is passed up to cluster.status().
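One way the error handling could look, as a sketch under stated assumptions rather than the fix that was eventually merged. The AppWrapper and AppWrapperStatus definitions below are simplified stand-ins, the enum values are a hypothetical subset, and map_to_app_wrapper_safe is an invented name. The idea is to check that status.state is really a string before calling .lower() on it, which is the call that appears to fail in the traceback above.

```python
from dataclasses import dataclass
from enum import Enum
from types import SimpleNamespace
from typing import Optional


class AppWrapperStatus(Enum):
    # Hypothetical subset of states, for illustration only.
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"


@dataclass
class AppWrapper:
    name: str
    status: AppWrapperStatus
    can_run: bool
    job_state: Optional[str]


def map_to_app_wrapper_safe(name: str, model) -> AppWrapper:
    """Map a raw model to an AppWrapper, treating a missing status as pending."""
    status = getattr(model, "status", None)
    state = getattr(status, "state", None)
    if not isinstance(state, str):
        # No usable state yet (absent attribute or a placeholder object):
        # report "pending" so the caller can keep waiting.
        return AppWrapper(name, AppWrapperStatus.PENDING, can_run=False, job_state=None)
    try:
        mapped = AppWrapperStatus(state.lower())
    except ValueError as err:
        # Make the reason for failure explicit instead of a bare enum error.
        raise ValueError(f"AppWrapper {name!r} reported unknown state {state!r}") from err
    job_state = getattr(status, "queuejobstate", None)
    return AppWrapper(
        name=name,
        status=mapped,
        can_run=getattr(status, "canrun", None) is True,
        job_state=job_state if isinstance(job_state, str) else None,
    )


# Usage with stand-in objects (SimpleNamespace replaces the real model here).
not_created = SimpleNamespace()
print(map_to_app_wrapper_safe("demo", not_created).status)
```

Checking the type rather than catching AttributeError matters here because, judging by the error message, the client returns a placeholder object for absent attributes instead of raising.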

Original request from Slack:
https://project-codeflare.slack.com/archives/C04PF8V5MB3/p1689336080924819

@Maxusmusti
Collaborator

Update: the Kubernetes support update changes this issue but does not resolve it. It is no longer a MissingModel error; instead, some elements of status are not yet populated and need to be checked for.
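A rough sketch of what checking for unpopulated status elements might look like. It assumes the AppWrapper resource arrives as a plain nested dict and reuses the field names state, canrun, and queuejobstate from the snippet earlier in this issue. The helper name extract_status_fields and the "pending" default are hypothetical.

```python
from typing import Any, Dict


def extract_status_fields(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Read status fields from a resource dict, tolerating missing keys."""
    # "status" may be absent or None before the controller has written it.
    status = resource.get("status") or {}
    return {
        # Hypothetical default: report "pending" until a state is populated.
        "state": (status.get("state") or "pending").lower(),
        "can_run": bool(status.get("canrun", False)),
        "job_state": status.get("queuejobstate"),
    }
```

Using .get() with defaults means a partially populated status yields a well-defined "pending" result instead of a KeyError.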

@Maxusmusti
Collaborator

Update: Addressed with #254

@github-project-automation github-project-automation bot moved this from Ready For Review to Done in Project CodeFlare Sprint Board Aug 7, 2023