Enable GPU Testing for LLM Blueprints #2432

Open
andreyvelich opened this issue Feb 11, 2025 · 2 comments

andreyvelich commented Feb 11, 2025

What would you like to be added?

We will soon introduce LLM Blueprints, which typically require GPUs to run: #2410.
To support this, we need to explore using GitHub self-hosted runners with GPU support.

If we can get them, we need to see whether we can deploy a Kubernetes cluster with these runners and the NVIDIA GPU Operator.
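
As a rough sketch of what that step could look like (assuming Helm is available on the runner; the repository URL and chart name below are taken from the GPU Operator documentation, not from this issue), the operator would be installed roughly like this:

```bash
# Add the NVIDIA Helm repository and install the GPU Operator into its own namespace.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

# --wait blocks until the operator components report ready.
helm install --wait gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```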

cc @Electronic-Waste @astefanutti @kubeflow/wg-training-leads @deepanker13 @saileshd1402 @franciscojavierarceo

Why is this needed?

We need GPUs in our testing infrastructure.

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.

mahdikhashan commented Feb 19, 2025

@andreyvelich I did some brief research, and it seems that kind does not support GPUs natively. However, there are custom configurations that enable it; one of them is a maintained fork from NVIDIA itself:

  1. https://github.com/NVIDIA/nvkind

Otherwise, minikube has native support:

  1. https://minikube.sigs.k8s.io/docs/tutorials/nvidia/
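
For reference, the tutorial above boils down to something like this (a sketch assuming a recent minikube with the docker driver and the NVIDIA Container Toolkit already set up on the host):

```bash
# Start a single-node cluster and pass all host GPUs through to the docker driver.
minikube start --driver=docker --container-runtime=docker --gpus=all

# The node should then advertise nvidia.com/gpu in its allocatable resources.
kubectl get nodes -o jsonpath='{.items[*].status.allocatable}'
```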

andreyvelich (Member Author) commented

Good news that minikube supports the NVIDIA device plugin!

I remember that @astefanutti and @franciscojavierarceo also explored how to leverage NVIDIA GPUs with a local Kubernetes cluster: #2231 (comment)
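
Once we have a GPU-enabled cluster from either approach, a simple smoke test like the one below could verify that GPUs are actually schedulable (just a sketch; the pod name and CUDA image tag are placeholders I picked, not anything from this thread):

```bash
# Request one GPU and run nvidia-smi; the pod should complete successfully
# if the device plugin / GPU Operator is working.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# After the pod completes, its logs should show the nvidia-smi output.
kubectl logs gpu-smoke-test
```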
