katib-db-manager : Ping to Katib db failed: dial tcp connect: connection refused #2980

Closed
5 of 7 tasks
pedrovgp opened this issue Feb 6, 2025 · 2 comments

Comments

@pedrovgp

pedrovgp commented Feb 6, 2025

Validation Checklist

  • Is this a Kubeflow issue?
  • Are you posting in the right repository?
  • Did you follow the Kubeflow installation guideline?
  • Is the issue report properly structured and detailed with version numbers?
  • Is this for Kubeflow development?
  • Would you like to work on this issue?
  • You can join the CNCF Slack and access our meetings at the Kubeflow Community website. Our channel on the CNCF Slack is here #kubeflow-platform.

Version

master

Describe your issue

We are installing Kubeflow 1.10 rc0 on AWS EKS with Kubernetes 1.30.

We did not make any customizations to Katib; we are using the install "apps/katib/upstream/installs/katib-with-kubeflow".

The "katib-db-manager" deployment enters CrashLoopBackOff.

We found some similar issues: 1 (kubeflow/katib#2425), 2, and 3.

We have debugged the problem; I'll open a PR with the investigation and a proposed solution.
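
For reference, the "connection refused" in the title is a TCP-level failure: nothing accepted the connection at the address that was dialed. A minimal probe along the lines below can help check whether the database endpoint is reachable from inside the cluster. This is only a sketch, not Katib's own code: `wait_for_tcp` is a hypothetical helper, and the host `katib-mysql` and port 3306 shown in the usage comment are assumptions about a default install that should be checked against the actual Service.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, attempts: int = 5, delay: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, else False."""
    for attempt in range(1, attempts + 1):
        try:
            # create_connection performs the TCP handshake a database client would start with.
            with socket.create_connection((host, port), timeout=3):
                return True
        except OSError as exc:
            # ConnectionRefusedError, timeouts, and DNS failures are all OSError subclasses.
            print(f"attempt {attempt}/{attempts}: {host}:{port} unreachable: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    return False


# Hypothetical usage from a pod in the same namespace (names are assumptions):
#   wait_for_tcp("katib-mysql", 3306)
```

If the probe keeps failing while the database pod is Running, the cause is more likely in networking or in how the address is resolved than in the database itself.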

Steps to reproduce the issue

  1. Check out the v1.10rc0 branch.
  2. Create an EKS cluster running Kubernetes 1.30.
  3. Configure the cluster to be used by kubectl.
  4. Run kubectl apply -k example

Put here any screenshots or videos (optional)

No response

@juliusvonkohout
Member

We have not yet upgraded Katib. It is still the same as in Kubeflow 1.9.1. Here too, please reproduce with Kind to get help.

@pedrovgp
Author

pedrovgp commented Feb 6, 2025

It's all right. I just raised the ticket for reference before suggesting the solution. Here is the PR

pedrovgp closed this as completed Feb 6, 2025