
Blog: Add post on leveraging Katib for efficient RAG optimization. #161

Open · wants to merge 1 commit into base: master

Conversation

varshaprasad96

Closes: #160

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johnugeorge for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@varshaprasad96
Author

cc: @franciscojavierarceo

Contributor

@franciscojavierarceo left a comment


This is so great! One small nit: can you break up the text so that there are line breaks around every 80 characters or so? It'll make it easier to comment on individual sections.

@franciscojavierarceo
Contributor

franciscojavierarceo commented Feb 22, 2025

FYI @andreyvelich we can put this under GenAI page


# Introduction

As machine learning models become more sophisticated, optimising their performance remains a critical challenge. Kubeflow provides a robust component, [KATIB][Katib], designed for hyperparameter optimization and neural architecture search. As a part of the Kubeflow ecosystem, KATIB enables scalable, automated tuning of underlying machine learning models, reducing the manual effort required for parameter selection while improving model performance across diverse ML workflows.
Contributor

Suggested change
As machine learning models become more sophisticated, optimising their performance remains a critical challenge. Kubeflow provides a robust component, [KATIB][Katib], designed for hyperparameter optimization and neural architecture search. As a part of the Kubeflow ecosystem, KATIB enables scalable, automated tuning of underlying machine learning models, reducing the manual effort required for parameter selection while improving model performance across diverse ML workflows.
As artificial intelligence and machine learning models become more sophisticated, optimising their performance remains a critical challenge. Kubeflow provides a robust component, [KATIB][Katib], designed for hyperparameter optimization and neural architecture search. As a part of the Kubeflow ecosystem, KATIB enables scalable, automated tuning of underlying machine learning models, reducing the manual effort required for parameter selection while improving model performance across diverse ML workflows.


## STEP 1: Setup

Since compute resources are scarcer than a perfectly labeled dataset :), we’ll use a lightweight KinD cluster to run this example locally. Rest assured, this setup can seamlessly scale to larger clusters by increasing the dataset size and the number of hyperparameters to tune.
Contributor

Suggested change
Since compute resources are scarcer than a perfectly labeled dataset :), we’ll use a lightweight KinD cluster to run this example locally. Rest assured, this setup can seamlessly scale to larger clusters by increasing the dataset size and the number of hyperparameters to tune.
Since compute resources are scarcer than a perfectly labeled dataset :), we’ll use a lightweight [Kind cluster (Kubernetes in Docker)](https://kind.sigs.k8s.io/) to run this example locally. Rest assured, this setup can seamlessly scale to larger clusters by increasing the dataset size and the number of hyperparameters to tune.



To get started, we'll first install the KATIB controller in our cluster by following the steps outlined [here][katib_installation].
Contributor

Suggested change
To get started, we'll first install the KATIB controller in our cluster by following the steps outlined [here][katib_installation].
To get started, we'll first install the KATIB controller in our cluster by following the steps outlined [in the documentation][katib_installation].


## STEP 2: Implementing RAG pipeline

In this implementation, we use a retriever model to fetch relevant documents based on a query and a generator model to produce coherent text responses.
Contributor

probably would be good to outline what a retriever model is somewhere


1. Retriever: Sentence Transformer & FAISS Index
* A SentenceTransformer model (paraphrase-MiniLM-L6-v2) encodes predefined documents into vector representations.
* FAISS is used to index these document embeddings and perform efficient similarity searches to retrieve the most relevant documents.
Contributor

Suggested change
* FAISS is used to index these document embeddings and perform efficient similarity searches to retrieve the most relevant documents.
* [FAISS](https://ai.meta.com/tools/faiss/) is used to index these document embeddings and perform efficient similarity searches to retrieve the most relevant documents.

2. Generator: Hugging Face GPT-2 Pipeline
* A Hugging Face GPT-2 text generation pipeline (which can be replaced with any other model) is used to generate responses based on the retrieved documents. I chose GPT-2 for this example as it is lightweight enough to run on my local machine while still generating coherent responses.
3. Query Processing & Response Generation
* When a query is submitted, the retriever encodes it and searches the FAISS index for the top-k most similar documents.
* These retrieved documents are concatenated to form a context, which is then passed to the GPT-2 model to generate a response.
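The retrieve-then-concatenate flow described above can be sketched end to end. The snippet below is a minimal, self-contained illustration: the toy documents are hypothetical, and a simple bag-of-words cosine similarity stands in for the SentenceTransformer embeddings and FAISS index used in the actual pipeline:

```python
import numpy as np

# Hypothetical toy corpus; the real pipeline encodes documents with
# SentenceTransformer("paraphrase-MiniLM-L6-v2") and indexes them with FAISS.
documents = [
    "Katib automates hyperparameter tuning on Kubernetes.",
    "FAISS performs fast similarity search over dense vectors.",
    "GPT-2 generates text conditioned on a context.",
]

# Shared vocabulary for a stand-in bag-of-words embedding.
vocab = sorted({w for doc in documents for w in doc.lower().strip(".").split()})

def embed(text: str) -> np.ndarray:
    """Toy unit-norm embedding; a real pipeline would call the
    SentenceTransformer model's encode() here."""
    words = text.lower().strip(".").split()
    vec = np.array([words.count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

doc_matrix = np.stack([embed(d) for d in documents])

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Return the top_k most similar documents by cosine similarity,
    which is the role the FAISS index plays in the real pipeline."""
    scores = doc_matrix @ embed(query)
    ranked = np.argsort(scores)[::-1][:top_k]
    return [documents[i] for i in ranked]

# The retrieved documents are concatenated into the context for the generator.
context = " ".join(retrieve("fast similarity search", top_k=1))
```

In the real pipeline, `context` would then be passed to the GPT-2 pipeline as its prompt, and `top_k` is the same hyperparameter that Katib tunes later in the post.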
Contributor

Suggested change
* These retrieved documents are concatenated to form a context, which is then passed to the GPT-2 model to generate a response.
* These retrieved documents are concatenated to form the input context, which is then passed to the GPT-2 model to generate a response.

parser.add_argument("--temperature", type=float, required=True, help="Temperature for the generator")
args = parser.parse_args()

#TODO: The quries and ground truth against which the BLEU score needs to be evaluated. They can be provided in the script below or loaded from an external volume.
Contributor

Suggested change
#TODO: The quries and ground truth against which the BLEU score needs to be evaluated. They can be provided in the script below or loaded from an external volume.
#TODO: The queries and ground truth against which the BLEU score needs to be evaluated. They can be provided in the script below or loaded from an external volume.

```yaml
restartPolicy: Never
```
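One detail worth noting about the trial script: Katib's default StdOut metrics collector picks up the objective by parsing `name=value` lines printed to standard output. A minimal sketch of that reporting step follows; the unigram-precision function is an illustrative stand-in, not the post's actual BLEU implementation (which could use nltk or sacrebleu):

```python
import math
from collections import Counter

def bleu_like(candidate: str, reference: str) -> float:
    """Illustrative unigram-precision score with a brevity penalty.
    A real evaluation would use a proper BLEU implementation."""
    cand = candidate.lower().split()
    ref = Counter(reference.lower().split())
    if not cand:
        return 0.0
    overlap = sum(min(count, ref[word]) for word, count in Counter(cand).items())
    precision = overlap / len(cand)
    ref_len = sum(ref.values())
    # Brevity penalty discourages trivially short generations.
    bp = math.exp(1 - ref_len / len(cand)) if len(cand) < ref_len else 1.0
    return bp * precision

# Katib's StdOut metrics collector parses "name=value" lines,
# so the trial reports its objective metric like this:
score = bleu_like("katib tunes models", "katib tunes ml models")
print(f"BLEU={score:.4f}")
```

The printed metric name must match the objective metric name declared in the Experiment spec so that Katib can compare trials.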

On applying this yaml on cluster, we can see our optimization script in action.
Contributor

Suggested change
On applying this yaml on cluster, we can see our optimization script in action.
After applying this yaml on the cluster, we can see our optimization script in action.

```
kubeflow rag-tuning-experiment-hzxrzq2t Running True 10m
```

The list of completed trials and their results will be shown in the UI as below. Steps to access Katib UI are available [here][katib_ui].:
Contributor

Suggested change
The list of completed trials and their results will be shown in the UI as below. Steps to access Katib UI are available [here][katib_ui].:
The list of completed trials and their results will be shown in the UI like below. Steps to access Katib UI are available [in the documentation][katib_ui].:


In this experiment, we leveraged Kubeflow Katib to optimize a Retrieval-Augmented Generation (RAG) pipeline, systematically tuning key hyperparameters like top_k and temperature to enhance retrieval precision and generative response quality.

For anyone working with RAG systems or hyperparameter optimization, Katib is a game-changer—enabling scalable, efficient, and intelligent tuning of machine learning models! In this experiment. Hope this helps you streamline hyperparameter tuning and unlock new efficiencies in your ML workflows!
Contributor

Suggested change
For anyone working with RAG systems or hyperparameter optimization, Katib is a game-changer—enabling scalable, efficient, and intelligent tuning of machine learning models! In this experiment. Hope this helps you streamline hyperparameter tuning and unlock new efficiencies in your ML workflows!
For anyone working with RAG systems or hyperparameter optimization, Katib is a powerful tool—enabling scalable, efficient, and intelligent tuning of machine learning models! We hope this tutorial helps you streamline hyperparameter tuning and unlock new efficiencies in your ML workflows!

Successfully merging this pull request may close these issues.

Create blogpost on optimising RAG Pipelines with Kubeflow Katib