This repository is a starting point for developers looking to integrate with the NVIDIA software ecosystem to speed up their generative AI systems. Whether you are building RAG pipelines, agentic workflows, or fine-tuning models, this repository will help you integrate NVIDIA, seamlessly and natively, with your development stack.

## Table of Contents

<!-- TOC -->
* [What's New?](#whats-new)
  * [Knowledge Graph RAG](#knowledge-graph-rag)
  * [Agentic Workflows with Llama 3.1](#agentic-workflows-with-llama-31)
  * [RAG with Local NIM Deployment and LangChain](#rag-with-local-nim-deployment-and-langchain)
* [Try it Now!](#try-it-now)
* [RAG](#rag)
  * [RAG Notebooks](#rag-notebooks)
  * [RAG Examples](#rag-examples)
  * [RAG Tools](#rag-tools)
  * [RAG Projects](#rag-projects)
* [Documentation](#documentation)
  * [Getting Started](#getting-started)
  * [How To's](#how-tos)
  * [Reference](#reference)
* [Community](#community)

<!-- /TOC -->

## What's New?
### Knowledge Graph RAG
This example implements a GPU-accelerated pipeline for creating and querying knowledge graphs using RAG by leveraging NIM microservices and the RAPIDS ecosystem to process large-scale datasets efficiently.

- [Knowledge Graphs for RAG with NVIDIA AI Foundation Models and Endpoints](community/knowledge_graph_rag)

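The knowledge-graph idea above can be illustrated without any GPU stack: facts are stored as (subject, relation, object) triples, and the triples that mention entities from a question are retrieved to ground the LLM's answer. The sketch below is a toy in-memory stand-in for the NIM/RAPIDS pipeline in the example — all names and data here are illustrative, not the example's actual API.

```python
# Toy triple store illustrating knowledge-graph RAG: retrieve facts whose
# subject or object appears in the question, then splice them into a prompt.
# A real pipeline would extract triples with an LLM and query a graph store.

TRIPLES = [
    ("NVIDIA", "develops", "CUDA"),
    ("CUDA", "accelerates", "deep learning"),
    ("RAPIDS", "builds on", "CUDA"),
]

def retrieve_facts(question: str, triples=TRIPLES):
    """Return triples whose subject or object is mentioned in the question."""
    q = question.lower()
    return [t for t in triples if t[0].lower() in q or t[2].lower() in q]

def build_prompt(question: str) -> str:
    facts = "\n".join(f"{s} {r} {o}." for s, r, o in retrieve_facts(question))
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {question}"

if __name__ == "__main__":
    print(build_prompt("What does RAPIDS build on?"))
```

The retrieved subgraph, rather than raw text chunks, is what gets passed to the model — which is what lets graph RAG answer questions that span multiple documents.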
### Agentic Workflows with Llama 3.1
- Build an Agentic RAG Pipeline with Llama 3.1 and NVIDIA NeMo Retriever NIM microservices [[Blog](https://developer.nvidia.com/blog/build-an-agentic-rag-pipeline-with-llama-3-1-and-nvidia-nemo-retriever-nims/), [Notebook](RAG/notebooks/langchain/agentic_rag_with_nemo_retriever_nim.ipynb)]
- [NVIDIA Morpheus, NIM microservices, and RAG pipelines integrated to create LLM-based agent pipelines](https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/experimental/event-driven-rag-cve-analysis)
### RAG with Local NIM Deployment and LangChain
- Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints by Amit Bleiweiss. [[Blog](https://developer.nvidia.com/blog/tips-for-building-a-rag-pipeline-with-nvidia-ai-langchain-ai-endpoints/), [Notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/08_RAG_Langchain_with_Local_NIM.ipynb)]
For more information, refer to the [Generative AI Example releases](https://github.com/NVIDIA/GenerativeAIExamples/releases/).
## Try it Now!
Experience NVIDIA RAG Pipelines with just a few steps!
1. Get your NVIDIA API key.

   1. Go to the [NVIDIA API Catalog](https://build.ngc.nvidia.com/explore/).

   1. Select any model.

   1. Click **Get API Key**.

   1. Run:

      ```console
      export NVIDIA_API_KEY=nvapi-...
      ```
cd GenerativeAIExamples/RAG/examples/basic_rag/langchain/
docker compose up -d --build
```
1. Go to <https://localhost:8090/> and submit queries to the sample RAG Playground.

1. Stop containers when done.

   ```console
   docker compose down
   ```
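Before launching the containers, you can sanity-check the exported key from Python. The following stdlib-only sketch verifies that `NVIDIA_API_KEY` has the expected `nvapi-` prefix and assembles an OpenAI-style chat request; the endpoint URL and payload shape are assumptions based on the API Catalog's OpenAI-compatible convention, not something this README specifies.

```python
# Hedged sketch: validate the API key format and build (but do not send)
# a chat-completions request for the assumed API Catalog endpoint.
import json
import os

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed endpoint

def build_chat_request(api_key: str, model: str, prompt: str):
    """Return (url, headers, body) for an OpenAI-compatible chat request."""
    if not api_key.startswith("nvapi-"):
        raise ValueError("NVIDIA_API_KEY should start with 'nvapi-'")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return API_URL, headers, body

if __name__ == "__main__":
    key = os.environ.get("NVIDIA_API_KEY", "nvapi-example")
    url, headers, body = build_chat_request(key, "meta/llama-3.1-8b-instruct", "Hello")
    print(url)
```

Sending the request with any HTTP client and getting a `200` back confirms the key is valid before you start the full pipeline.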
## RAG
### RAG Notebooks
NVIDIA has first-class support for popular generative AI developer frameworks like [LangChain](https://python.langchain.com/v0.2/docs/integrations/chat/nvidia_ai_endpoints/), [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/nvidia/), and [Haystack](https://haystack.deepset.ai/integrations/nvidia). These end-to-end notebooks show how to integrate NIM microservices using your preferred generative AI development framework.
Use these [notebooks](./RAG/notebooks/README.md) to learn about the LangChain and LlamaIndex connectors.
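As a frame of reference for what these notebooks implement with LangChain or LlamaIndex, here is a minimal, dependency-free sketch of the retrieve-then-generate pattern. The word-overlap scoring is a naive stand-in for embedding similarity, and the documents and names are illustrative — a real pipeline swaps in NIM-backed embedding and chat models via the framework connectors.

```python
# Minimal retrieve-then-generate sketch: score documents by word overlap
# with the query (a stand-in for embedding similarity), then splice the
# top match into the prompt that would be sent to the LLM.

DOCS = [
    "NIM microservices package optimized inference engines for popular models.",
    "LangChain provides chains and agents for building LLM applications.",
]

def score(query: str, doc: str) -> int:
    """Count shared words between query and document (toy similarity)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs=DOCS) -> str:
    """Return the highest-scoring document for the query."""
    return max(docs, key=lambda d: score(query, d))

def build_prompt(query: str) -> str:
    return f"Context: {retrieve(query)}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("What do NIM microservices do?"))
```

The notebooks follow this same shape, replacing each toy function with a connector-backed component: an embedding model, a vector store retriever, and a chat model.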
#### LangChain Notebooks
- RAG
  - [Basic RAG with ChatNVIDIA LangChain Integration](./RAG/notebooks/langchain/langchain_basic_RAG.ipynb)
  - [RAG using local NIM microservices for LLMs and Retrieval](./RAG/notebooks/langchain/RAG_Langchain_with_Local_NIM.ipynb)
  - [RAG for HTML Documents](./RAG/notebooks/langchain/RAG_for_HTML_docs_with_Langchain_NVIDIA_AI_Endpoints.ipynb)
  - [Chat with NVIDIA Financial Reports](./RAG/notebooks/langchain/Chat_with_nvidia_financial_reports.ipynb)
#### LlamaIndex Notebooks
- [Basic RAG with LlamaIndex Integration](./RAG/notebooks/llamaindex/llamaindex_basic_RAG.ipynb)
### RAG Examples
By default, these end-to-end [examples](RAG/examples/README.md) use preview NIM endpoints on [NVIDIA API Catalog](https://catalog.ngc.nvidia.com). Alternatively, you can run any of the examples [on premises](./RAG/examples/local_deploy/).
We're posting these examples on GitHub to support the NVIDIA LLM community and facilitate feedback.
We invite contributions! Open a GitHub issue or pull request!
### RAG Projects
- [NVIDIA Tokkio LLM-RAG](https://docs.nvidia.com/ace/latest/workflows/tokkio/text/Tokkio_LLM_RAG_Bot.html): Use Tokkio to add avatar animation for RAG responses.
- [Hybrid RAG Project on AI Workbench](https://github.com/NVIDIA/workbench-example-hybrid-rag): Run an NVIDIA AI Workbench example project for RAG.

## Documentation
### Getting Started
- [Prerequisites](./docs/common-prerequisites.md)

### How To's
- [Changing the Inference or Embedding Model](./docs/change-model.md)
- [Customizing the Vector Database](./docs/vector-database.md)
- [Customizing the Chain Server](./docs/chain-server.md):
0 commit comments