tianshuc/vllm docker server health check #1124

Draft
wants to merge 46 commits into
base: main
Commits
3622454
Support running raw docker servers directly and avoid truss
bolasim Aug 24, 2024
713a182
Add version to test
bolasim Aug 24, 2024
df1582f
Fix type
bolasim Aug 24, 2024
5d39e6f
Add server port for proper proxy
bolasim Aug 24, 2024
6b38251
Catch all proxy pass
bolasim Aug 24, 2024
dba08fb
Try again to handle all v1 paths
bolasim Aug 24, 2024
3194fc4
update
Sep 4, 2024
9d27f14
update
Sep 4, 2024
eed149d
update
Sep 4, 2024
5bad45d
update
Sep 4, 2024
05a73ca
supervisor checks
Sep 5, 2024
395fd25
docker update
Sep 5, 2024
8c6f2c6
docker update
Sep 5, 2024
2484be0
version
Sep 5, 2024
5d106b3
added supervisor_checks lin
tianshuc0731 Sep 5, 2024
8f47ff2
update
tianshuc0731 Sep 5, 2024
124ec26
supervisor_checks
tianshuc0731 Sep 5, 2024
7f3eafd
tag
tianshuc0731 Sep 5, 2024
76bcb2e
tag
tianshuc0731 Sep 5, 2024
1ca1ec0
Remove submodule supervisor_checks
Sep 5, 2024
37c665b
Add supervisor_checks as a regular directory
Sep 5, 2024
a44ec3b
clone supervisor checks
Sep 5, 2024
855ba3d
recover
Sep 5, 2024
d495132
vllm_check
Sep 5, 2024
0cd0541
update
Sep 5, 2024
d87abaf
update
Sep 5, 2024
a96514c
tag
Sep 5, 2024
7c5ebf6
update
Sep 5, 2024
f02c6f8
tag
Sep 5, 2024
5dce494
update
Sep 5, 2024
7602db1
update
Sep 5, 2024
afbe62b
tag
Sep 5, 2024
b2086e8
remove redundant files
Sep 5, 2024
509fa4a
tag
Sep 5, 2024
d673fcb
tag
Sep 5, 2024
27f4f6b
tag
Sep 5, 2024
c98c4f4
tag
Sep 5, 2024
52c67c1
fix health check
Sep 6, 2024
5f1f2f6
tag
Sep 6, 2024
19fac1d
tag
Sep 6, 2024
082c896
support s3 download
Sep 10, 2024
d848cb3
MULTI_THREAD_WORKERS = 10
Sep 10, 2024
2805ac0
update
Sep 11, 2024
4390b15
update
Sep 11, 2024
53a5c00
support data_dir
Sep 11, 2024
f11856b
tag
Sep 11, 2024
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "truss"
-version = "0.9.30rc3"
+version = "0.9.32.dev-23"
description = "A seamless bridge from model development to model delivery"
license = "MIT"
readme = "README.md"
54 changes: 54 additions & 0 deletions truss/contexts/image_builder/serving_image_builder.py
@@ -56,6 +56,7 @@
from truss.util.jinja import read_template_from_fs
from truss.util.path import (
    build_truss_target_directory,
    copy_file_path,
    copy_tree_or_file,
    copy_tree_path,
    load_trussignore_patterns,
@@ -339,6 +340,59 @@ def copy_into_build_dir(from_path: Path, path_in_build_dir: str):
        # Copy over truss
        copy_tree_path(truss_dir, build_dir, ignore_patterns=truss_ignore_patterns)

        if config.docker_server is not None:
            copy_tree_path(truss_dir, build_dir)
            copy_tree_path(
                TEMPLATES_DIR / "docker_server" / "supervisor_checks",
                build_dir / "supervisor_checks",
            )
            # TODO: to remove
            # copy_file_path(
            #     TEMPLATES_DIR / "docker_server" / "download_model.py",
            #     build_dir / "download_model.py",
            # )
            copy_file_path(
                TEMPLATES_DIR / "docker_server" / "setup_ready_check.py",
                build_dir / "setup_ready_check.py",
            )

            if not build_dir.exists():
                build_dir.mkdir(parents=True)

            dockerfile_template = read_template_from_fs(
                TEMPLATES_DIR, "docker_server/docker_server.Dockerfile.jinja"
            )
            nginx_template = read_template_from_fs(
                TEMPLATES_DIR, "docker_server/proxy.conf.jinja"
            )

            dockerfile_content = dockerfile_template.render(
                base_image_name_and_tag=config.base_image.image,
                config=config,
            )
            dockerfile_filepath = build_dir / "Dockerfile"
            dockerfile_filepath.write_text(dockerfile_content)

            nginx_content = nginx_template.render(
                server_endpoint=config.docker_server.predict_endpoint,
                readiness_endpoint=config.docker_server.readiness_endpoint,
                liveness_endpoint=config.docker_server.liveness_endpoint,
                server_port=config.docker_server.server_port,
            )
            nginx_filepath = build_dir / "proxy.conf"
            nginx_filepath.write_text(nginx_content)

            supervisord_template = read_template_from_fs(
                TEMPLATES_DIR, "docker_server/supervisord.conf.jinja"
            )
            supervisord_contents = supervisord_template.render(
                start_command=config.docker_server.start_command,
                setup_command=config.docker_server.setup_command,
            )
            supervisord_filepath = build_dir / "supervisord.conf"
            supervisord_filepath.write_text(supervisord_contents)
            return

        # Copy over template truss for TRT-LLM (we overwrite the model and packages dir)
        # Most of the code is pulled from upstream triton-inference-server tensorrtllm_backend
        # https://github.com/triton-inference-server/tensorrtllm_backend/tree/v0.9.0/all_models/inflight_batcher_llm
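
The `docker_server` branch in this hunk follows a render-then-write pattern: load a template, fill in values from the config, and write the result into the build directory. A minimal sketch of that pattern is given here, assuming a small regex-based `render` helper as a stand-in for Jinja (the PR itself goes through `read_template_from_fs` and Jinja templates). The names `render`, `write_proxy_conf`, and `PROXY_TEMPLATE`, as well as the sample endpoint and port, are hypothetical and only for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical two-line excerpt written in the style of proxy.conf.jinja.
PROXY_TEMPLATE = (
    "rewrite ^/v1/models/model:predict$ {{server_endpoint}} break;\n"
    "proxy_pass http://127.0.0.1:{{server_port}};\n"
)


def render(template, **values):
    """Fill {{name}} placeholders; a stdlib stand-in for Jinja rendering."""
    def substitute(match):
        # Raises KeyError if the template names a value that was not supplied.
        return str(values[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def write_proxy_conf(build_dir, server_endpoint, server_port):
    """Render the template and write it to build_dir / 'proxy.conf'."""
    # Create the directory first so write_text cannot fail on a missing parent.
    build_dir.mkdir(parents=True, exist_ok=True)
    target = build_dir / "proxy.conf"
    target.write_text(
        render(PROXY_TEMPLATE, server_endpoint=server_endpoint, server_port=server_port)
    )
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_proxy_conf(Path(tmp) / "build", "/v1/chat/completions", 8000)
        print(path.read_text())
```

In this sketch `mkdir(parents=True, exist_ok=True)` runs before any file is written, which makes a separate `exists()` check unnecessary.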
37 changes: 37 additions & 0 deletions truss/templates/docker_server/docker_server.Dockerfile.jinja
@@ -0,0 +1,37 @@
FROM {{base_image_name_and_tag}}

RUN grep -w 'ID=debian\|ID_LIKE=debian' /etc/os-release || { echo "ERROR: Supplied base image is not a debian image"; exit 1; }

EXPOSE 8080-9000

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
curl nginx supervisor python3-pip && \
rm -rf /var/lib/apt/lists/*

COPY ./{{ config.model_module_dir }} /app/model

COPY ./{{ config.data_dir }} /app/data

COPY ./supervisor_checks /app/supervisor_checks

# Install supervisor_checks using pip
RUN pip3 install /app/supervisor_checks

RUN pip3 install boto3

ENV SUPERVISOR_SERVER_URL http://localhost:8080

COPY ./proxy.conf /etc/nginx/conf.d/proxy.conf

RUN mkdir -p /var/log/supervisor
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

{% for env_var_name, env_var_value in config.environment_variables.items() %}
ENV {{ env_var_name }} {{ env_var_value }}
{% endfor %}

ENV SERVER_START_CMD /usr/bin/supervisord

COPY ./setup_ready_check.py /app/setup_ready_check.py

ENTRYPOINT ["/usr/bin/supervisord"]
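
The `grep` guard near the top of this Dockerfile rejects base images whose `/etc/os-release` does not identify as Debian or a Debian derivative. As a rough illustration of what that check accepts, here is a Python sketch. The function name `is_debian_like` is hypothetical, and the sketch is slightly more lenient than the `grep` pattern because it also tolerates quoted values and multi-valued `ID_LIKE` entries.

```python
def is_debian_like(os_release_text):
    """Return True if os-release content has ID=debian or lists debian in ID_LIKE."""
    fields = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    if fields.get("ID") == "debian":
        return True
    # ID_LIKE is a space-separated list of related distributions.
    return "debian" in fields.get("ID_LIKE", "").split()


if __name__ == "__main__":
    ubuntu = 'NAME="Ubuntu"\nID=ubuntu\nID_LIKE=debian\n'
    alpine = 'NAME="Alpine Linux"\nID=alpine\n'
    print(is_debian_like(ubuntu), is_debian_like(alpine))
```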
43 changes: 43 additions & 0 deletions truss/templates/docker_server/proxy.conf.jinja
@@ -0,0 +1,43 @@
server {
    # We use the proxy_read_timeout directive here (instead of proxy_send_timeout) as it sets the timeout for reading a response from the proxied server vs. setting a timeout for sending a request to the proxied server.
    listen 8080;

    # Liveness
    location = / {
        proxy_redirect off;
        proxy_read_timeout 300s;

        rewrite ^/$ {{liveness_endpoint}} break;

        proxy_pass http://127.0.0.1:{{server_port}};
    }

    # Readiness
    location ~ ^/v1/models/model$ {
        proxy_redirect off;
        proxy_read_timeout 300s;

        rewrite ^/v1/models/model$ {{readiness_endpoint}} break;

        proxy_pass http://127.0.0.1:{{server_port}};
    }

    # Predict
    location ~ ^/v1/models/model:predict$ {
        proxy_redirect off;
        proxy_read_timeout 300s;

        rewrite ^/v1/models/model:predict$ {{server_endpoint}} break;

        proxy_pass http://127.0.0.1:{{server_port}};
    }

    # Forward all other paths
    location / {
        proxy_redirect off;
        proxy_read_timeout 300s;

        proxy_pass http://127.0.0.1:{{server_port}};
    }

}
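
The four `location` blocks in this proxy config amount to a small routing table: nginx tries the exact match (`= /`) first, then the regex locations in order, and falls back to the catch-all prefix. The following Python sketch models which upstream path a request ends up on. The function name `upstream_path` and the sample endpoint values are assumptions for illustration, not part of the PR.

```python
import re


def upstream_path(path, liveness_endpoint, readiness_endpoint, server_endpoint):
    """Return the path forwarded to the upstream server for a request path."""
    # location = /  (exact match, rewritten to the liveness endpoint)
    if path == "/":
        return liveness_endpoint
    # location ~ ^/v1/models/model$  (rewritten to the readiness endpoint)
    if re.fullmatch(r"/v1/models/model", path):
        return readiness_endpoint
    # location ~ ^/v1/models/model:predict$  (rewritten to the predict endpoint)
    if re.fullmatch(r"/v1/models/model:predict", path):
        return server_endpoint
    # location /  (everything else is forwarded unchanged)
    return path


if __name__ == "__main__":
    endpoints = ("/health", "/health", "/v1/chat/completions")
    for request in ("/", "/v1/models/model", "/v1/models/model:predict", "/v1/completions"):
        print(request, "->", upstream_path(request, *endpoints))
```

Because the last branch returns the path unchanged, any native route of the wrapped server stays reachable through the proxy.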
45 changes: 45 additions & 0 deletions truss/templates/docker_server/setup_ready_check.py
@@ -0,0 +1,45 @@
import subprocess
import sys
import supervisor.childutils
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description='Check if the first process is ready and start the second process')
    parser.add_argument('--first', type=str, required=True, help='Name of the first process to run')
    parser.add_argument('--second', type=str, required=True, help='Name of the second process to run')
    return parser.parse_args()

def check_model_ready(first_process, second_process):

    while True:
        # Read the event header and payload (stdout carries the listener protocol, so log to stderr)
        headers, payload = supervisor.childutils.listener.wait()

        # Check if the exited process is the first process and its exit was expected
        if headers['eventname'] == 'PROCESS_STATE_EXITED':
            # print(f"Received headers: {headers}")
            # print(f"Received payload: {payload}")
            # fields = dict(field.split(':') for field in payload.split(' '))
            # process_name = fields['processname']
            pheaders, pdata = supervisor.childutils.eventdata(payload+'\n')
            print(f"Received headers: {pheaders}", file=sys.stderr)
            process_name = pheaders['processname']
            is_expected = int(pheaders['expected'])
            print(f"processname {process_name} expected {is_expected}", file=sys.stderr)

            if process_name == first_process and is_expected:
                try:
                    print(f"Running command: supervisorctl start {second_process}", file=sys.stderr)
                    result = subprocess.run(['supervisorctl', 'start', second_process],
                                            capture_output=True, text=True, check=True)
                    return
                except subprocess.CalledProcessError as e:
                    print(f"Error starting process: {e}", file=sys.stderr)
                    raise

        # Acknowledge the event to supervisor
        supervisor.childutils.listener.ok()

if __name__ == "__main__":
    args = parse_args()
    check_model_ready(args.first, args.second)
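
The decision at the heart of this listener can be exercised without a running supervisord. The sketch below parses a `PROCESS_STATE_EXITED` payload by hand (space-separated `key:value` tokens, as in the commented-out line in the script) and decides whether the second process should be started. The helper names `parse_event_payload` and `should_start_second`, and the process name `setup`, are hypothetical; the script in the PR relies on `supervisor.childutils.eventdata` instead.

```python
def parse_event_payload(payload):
    """Split 'key:value key:value ...' tokens into a dict of strings."""
    return dict(token.split(":", 1) for token in payload.split())


def should_start_second(eventname, payload, first_process):
    """True when first_process exited and supervisor marked the exit as expected."""
    if eventname != "PROCESS_STATE_EXITED":
        return False
    fields = parse_event_payload(payload)
    # 'expected' is 1 when the exit code is one of the configured exitcodes.
    return fields.get("processname") == first_process and fields.get("expected") == "1"


if __name__ == "__main__":
    sample = "processname:setup groupname:setup from_state:RUNNING expected:1 pid:2766"
    print(should_start_second("PROCESS_STATE_EXITED", sample, "setup"))
```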
20 changes: 20 additions & 0 deletions truss/templates/docker_server/supervisor_checks/LICENSE.txt
@@ -0,0 +1,20 @@
Copyright 2015 Volodymyr Kuznetsov

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.