Update sharded export command (#1078)

stbaione · web-flow · commit 400e2e219358 · 2025-03-12T15:59:42.000-05:00
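The sharded export command in the `llama_serving` guide still passed
`--bs 4`. We missed it when updating the guide after enabling `prefill`
and `decode` to export with different batch sizes. Use `--bs-prefill 4`
and `--bs-decode 4` instead.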
diff --git a/docs/shortfin/llm/user/llama_serving.md b/docs/shortfin/llm/user/llama_serving.md
@@ -383,7 +383,8 @@ python -m sharktank.examples.export_paged_llm_v1 \
   --irpa-file /path/to/output/llama3.1-405b.irpa \
   --output-mlir /path/to/output/llama3.1-405b.mlir \
   --output-config /path/to/output/llama3.1-405b.config.json \
-  --bs 4
+  --bs-prefill 4 \
+  --bs-decode 4
 ```
 
 ### Compiling to VMFB