1 parent cb7b089 commit 8331875
docker/llm/serving/xpu/docker/vllm_offline_inference.py
@@ -54,6 +54,8 @@
           disable_async_output_proc=True,
           distributed_executor_backend="ray",
           max_model_len=2000,
+          trust_remote_code=True,
+          block_size=8,
           max_num_batched_tokens=2000)
 # Generate texts from the prompts. The output is a list of RequestOutput objects
 # that contain the prompt, generated text, and other information.
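For context, the constructor call in vllm_offline_inference.py ends up looking roughly like the sketch below. This is a minimal, self-contained reconstruction rather than the repo's exact script: the model path, prompts, sampling parameters, and the device argument are illustrative assumptions; only the engine arguments visible in the diff come from this commit.

# Minimal sketch of vLLM offline inference reflecting this commit.
# Assumptions: model path, prompts, sampling settings, and device="xpu"
# are illustrative; only the engine arguments shown in the diff are
# taken from the commit itself.
from vllm import LLM, SamplingParams

prompts = ["What is AI?"]  # hypothetical prompt
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="YOUR_MODEL_PATH",             # hypothetical model path
          device="xpu",                        # assumed for this XPU serving image
          disable_async_output_proc=True,
          distributed_executor_backend="ray",
          max_model_len=2000,
          trust_remote_code=True,              # added: allow custom model code from the hub
          block_size=8,                        # added: smaller KV-cache block size
          max_num_batched_tokens=2000)

# Generate texts from the prompts. The output is a list of RequestOutput
# objects that contain the prompt, generated text, and other information.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)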