From bfe1058433af0351834e7e72b8021a2f4434c3a5 Mon Sep 17 00:00:00 2001
From: Stephen Baione
Date: Wed, 18 Dec 2024 17:57:56 -0600
Subject: [PATCH] Make shortfin server heading for more visible doc links, Add iree-base-runtime, iree-base-compiler and iree-turbine to nightly install instructions

---
 docs/shortfin/llm/user/e2e_llama8b_mi300x.md | 5 +++++
 .../llm/user/shortfin_with_sglang_frontend_language.md | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/docs/shortfin/llm/user/e2e_llama8b_mi300x.md b/docs/shortfin/llm/user/e2e_llama8b_mi300x.md
index 313a8086c..87ceb84dd 100644
--- a/docs/shortfin/llm/user/e2e_llama8b_mi300x.md
+++ b/docs/shortfin/llm/user/e2e_llama8b_mi300x.md
@@ -39,6 +39,11 @@ To install nightly packages:
 ```bash
 pip install shark-ai[apps] sharktank \
     --pre --find-links https://github.com/nod-ai/shark-ai/releases/expanded_assets/dev-wheels
+pip install -f https://iree.dev/pip-release-links.html --pre --upgrade \
+    iree-base-compiler \
+    iree-base-runtime \
+    iree-turbine \
+    "numpy<2.0"
 ```
 
 See also the
diff --git a/docs/shortfin/llm/user/shortfin_with_sglang_frontend_language.md b/docs/shortfin/llm/user/shortfin_with_sglang_frontend_language.md
index 832ec9c3d..0dbbeb56b 100644
--- a/docs/shortfin/llm/user/shortfin_with_sglang_frontend_language.md
+++ b/docs/shortfin/llm/user/shortfin_with_sglang_frontend_language.md
@@ -24,6 +24,9 @@ For this tutorial, you will need to meet the following prerequisites:
 - You can check out [pyenv](https://github.com/pyenv/pyenv)
 as a good tool to be able to manage multiple versions of python
 on the same system.
+
+### Shortfin LLM Server
+
 - A running `shortfin` LLM server.
 Directions on launching the llm server on one system can be found [here](https://github.com/nod-ai/shark-ai/blob/main/docs/shortfin/llm/user/e2e_llama8b_mi300x.md) and for launching on a kubernetes cluster, please look [here](https://github.com/nod-ai/shark-ai/blob/main/docs/shortfin/llm/user/e2e_llama8b_k8s.md)
 - We will use the shortfin server as the `backend` to generate completions