Commit 5effb05

Update README.md
1 parent d2bd00e commit 5effb05

File tree

1 file changed (+13, -7)

README.md

Lines changed: 13 additions & 7 deletions
@@ -1,26 +1,30 @@
 # Azimuth LLM
 
-This repository contains a Helm chart for deploying Large Language Models (LLMs) on Kubernetes. It is developed primarily for use as a pre-packaged application within [Azimuth](https://www.stackhpc.com/azimuth-introduction.html) but is structured such that it can, in principle, be deployed on any Kubernetes cluster with at least 1 GPU node.
+This repository contains a set of Helm charts for deploying Large Language Models (LLMs) on Kubernetes. It is developed primarily for use as a set of pre-packaged applications within [Azimuth](https://www.stackhpc.com/azimuth-introduction.html) but is structured such that the charts can, in principle, be deployed on any Kubernetes cluster with at least 1 GPU node.
 
 ## Azimuth App
 
-This app is provided as part of a standard deployment Azimuth, so no specific steps are required to use this app other than access to an up-to-date Azimuth deployment.
+This primary LLM chat app is provided as part of a standard Azimuth deployment, so no specific steps are required to use it other than access to an up-to-date Azimuth deployment.
 
 ## Manual Deployment
 
-Alternatively, to set up the Helm repository and manually install this chart on an existing Kubernetes cluster, run
+Alternatively, to set up the Helm repository and manually install the LLM chat interface chart on an existing Kubernetes cluster, run
 
 ```
 helm repo add <chosen-repo-name> https://stackhpc.github.io/azimuth-llm/
 helm repo update
-helm install <installation-name> <chosen-repo-name>/azimuth-llm --version <version>
+helm install <installation-name> <chosen-repo-name>/azimuth-llm-chat
 ```
 
-where `version` is the full name of the published version for the specified commit (e.g. `0.1.0-dev.0.main.125`). To see the latest published version, see [this page](https://github.com/stackhpc/azimuth-llm/tree/gh-pages).
+This will install the latest stable [release](https://github.com/stackhpc/azimuth-llm/releases) of the application.
+
+## Chart Structure
+
+Under the charts directory, there is a base [azimuth-llm](./charts/azimuth-llm) Helm chart which uses vLLM to deploy models from HuggingFace. The [azimuth-chat](./charts/azimuth-chat) and [azimuth-image-analysis](./charts/azimuth-image-analysis) charts are wrappers which add different Gradio web interfaces for interacting with the deployed LLM.
 
 ### Customisation
 
-The `chart/values.yaml` file documents the various customisation options which are available. In order to access the LLM from outside the Kubernetes cluster, the API and/or UI service types may be changed to
+The `charts/azimuth-llm/values.yaml` file documents the various customisation options which are available. In order to access the LLM from outside the Kubernetes cluster, the API and/or UI service types may be changed to
 ```
 api:
   service:
@@ -38,6 +42,8 @@ ui:
 
 Both the web-based interface and the backend OpenAI-compatible vLLM API server can also optionally be exposed using [Kubernetes Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/). See the `ingress` section in `values.yaml` for available config options.
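A sketch of such a service override, assuming the `api` and `ui` keys shown under Customisation (the `LoadBalancer` type and file layout are illustrative; `charts/azimuth-llm/values.yaml` is the authoritative reference):

```yaml
# Illustrative values override exposing both services outside the
# cluster; NodePort or an Ingress are alternatives.
api:
  service:
    type: LoadBalancer
ui:
  service:
    type: LoadBalancer
```

Such a file can be supplied at install time with Helm's standard `-f <values-file>` flag.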
 
+When deploying the chat or image-analysis wrapper charts, all configuration options must be nested under the `azimuth-llm` key ([example](https://github.com/stackhpc/azimuth-llm/blob/main/charts/azimuth-chat/values.yaml#L1)) due to the way that Helm passes values between [parent charts and sub-charts](https://helm.sh/docs/chart_template_guide/subcharts_and_globals/#overriding-values-from-a-parent-chart).
+
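For instance, a hypothetical service override for a wrapper-chart deployment would be nested as follows (the `LoadBalancer` value is illustrative; only the `azimuth-llm` top-level key is prescribed by the source):

```yaml
# Illustrative wrapper-chart values file: every base-chart option
# sits under the `azimuth-llm` key.
azimuth-llm:
  api:
    service:
      type: LoadBalancer
```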
 ## Tested Models
 
 The application uses [vLLM](https://docs.vllm.ai/en/latest/index.html) for model serving, so any of the vLLM [supported models](https://docs.vllm.ai/en/latest/models/supported_models.html) should work. Since vLLM pulls the model files directly from [HuggingFace](https://huggingface.co/models), it is likely that some other models will also be compatible with vLLM, but mileage may vary between models and model architectures. If a model is incompatible with vLLM then the API pod will likely enter a `CrashLoopBackOff` state and any relevant error information will be found in the API pod logs. These logs can be viewed with
@@ -46,7 +52,7 @@ The application uses [vLLM](https://docs.vllm.ai/en/latest/index.html) for model
 kubectl (-n <helm-release-namespace>) logs deploy/<helm-release-name>-api
 ```
 
-If you suspect that a given error is not caused by the upstream vLLM support and a problem with this Helm chart then please [open an issue](https://github.com/stackhpc/azimuth-llm/issues).
+If you suspect that a given error is not caused by the upstream vLLM version and is instead a problem with this Helm chart, then please [open an issue](https://github.com/stackhpc/azimuth-llm/issues).
 
 ## Monitoring
 