Enable GPU Testing for LLM Blueprints #2432

Open
andreyvelich opened this issue Feb 11, 2025 · 2 comments

andreyvelich commented Feb 11, 2025

What would you like to be added?

We will soon introduce LLM Blueprints, which typically require GPUs to run: #2410.
To support this, we need to explore using GitHub self-hosted runners with GPU support.

If we can get them, we need to see whether we can deploy a Kubernetes cluster with these runners and the NVIDIA GPU Operator.
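
As a rough sketch of what that step could look like (assuming Helm is available on the runner; the repository URL and chart name below are taken from the GPU Operator documentation, not from this issue), the operator would be installed roughly like this:

```bash
# Add the NVIDIA Helm repository and install the GPU Operator into its own namespace.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

# --wait blocks until the operator components report ready.
helm install --wait gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```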

cc @Electronic-Waste @astefanutti @kubeflow/wg-training-leads @deepanker13 @saileshd1402 @franciscojavierarceo

Why is this needed?

We need GPUs in our testing infrastructure.

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.

mahdikhashan commented Feb 19, 2025

@andreyvelich I did some brief research, and it seems that kind does not support GPUs natively. However, there are custom configurations that enable it; one of them is a maintained fork from NVIDIA itself:

  1. https://github.com/NVIDIA/nvkind

Otherwise, minikube has native support:

  1. https://minikube.sigs.k8s.io/docs/tutorials/nvidia/
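
For reference, the tutorial above boils down to something like this (a sketch assuming a recent minikube with the docker driver and the NVIDIA Container Toolkit already set up on the host):

```bash
# Start a single-node cluster and pass all host GPUs through to the docker driver.
minikube start --driver=docker --container-runtime=docker --gpus=all

# The node should then advertise nvidia.com/gpu in its allocatable resources.
kubectl get nodes -o jsonpath='{.items[*].status.allocatable}'
```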

andreyvelich (Member Author) commented

Good news that minikube supports the NVIDIA device plugin!

I remember that @astefanutti and @franciscojavierarceo also explored how to leverage NVIDIA GPUs with a local Kubernetes cluster: #2231 (comment)
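
Once we have a GPU-enabled cluster from either approach, a simple smoke test like the one below could verify that GPUs are actually schedulable (just a sketch; the pod name and CUDA image tag are placeholders I picked, not anything from this thread):

```bash
# Request one GPU and run nvidia-smi; the pod should complete successfully
# if the device plugin / GPU Operator is working.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# After the pod completes, its logs should show the nvidia-smi output.
kubectl logs gpu-smoke-test
```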
