tstesco/benchmark-uplift #63
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
def backport_removeprefix(string: str, prefix: str) -> str:
    return string[len(prefix):] if string.startswith(prefix) else string
Could you add a comment to this function specifying why it's needed (& the Python version it's supporting)? EDIT: Nevermind, see other comment about remove_prefix
benchmarks/backend_request_func.py
Outdated
"best_of": request_func_input.best_of, | ||
# "best_of": request_func_input.best_of, | ||
"max_tokens": request_func_input.output_len, | ||
"logprobs": request_func_input.logprobs, | ||
# "logprobs": request_func_input.logprobs, |
Was this changed by mistake? (I don't think we can upstream this)
We don't support those parameters in our fork yet (#44). I can add a TODO pointing to the issue.
# Since vllm must support Python 3.8, we can't use str.removeprefix(prefix)
# introduced in Python 3.9
def remove_prefix(text: str, prefix: str) -> str:
    if text.startswith(prefix):
        return text[len(prefix):]
    return text
They already had a remove_prefix function for supporting 3.8 but intentionally removed it; they may not want us to add it back.
This should only be here until we can move to Python 3.9+, and that hopefully happens before we upstream. I can add e.g.:
# TODO: remove backport after we can drop support of Python 3.8
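Put together, the annotated backport could look something like this (just a sketch combining the function above with the suggested TODO wording):

def backport_removeprefix(string: str, prefix: str) -> str:
    # TODO: remove backport after we can drop support of Python 3.8;
    # str.removeprefix(prefix) is available from Python 3.9.
    return string[len(prefix):] if string.startswith(prefix) else string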
We are aiming to proceed with the rebase + integration of the dev branch onto upstream in the next week or two, so I'm hesitant to push this since we'll have to remove it again.
Closing as per #63 (comment).
changelog:
- Added the max_concurrency feature to allow for large n when running benchmark sweeps while still measuring correct TTFT and E2EL.

Discussion
Previously the benchmarking script sent all requests at time=0 and vLLM queued them on the server side, so the queue time was counted in TTFT and E2EL. The latest version from upstream uses asyncio.Semaphore(max_concurrency) to stop all requests from running at once at a concurrency higher than the model's max batch size. Setting max_concurrency to the model's max batch size allows correct measurement of TTFT and E2EL.
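For illustration, a minimal sketch of the bounded-concurrency pattern (the request coroutine and timings here are placeholders, not the actual benchmark code):

import asyncio
import time
from typing import List

async def send_request(i: int) -> float:
    # Placeholder for one benchmark request; the real script issues an HTTP
    # request to the vLLM server and records TTFT / E2EL.
    start = time.perf_counter()
    await asyncio.sleep(0.1)
    return time.perf_counter() - start

async def run_benchmark(num_requests: int, max_concurrency: int) -> List[float]:
    # The semaphore caps in-flight requests, so measured latencies reflect
    # serving time rather than time spent waiting in a server-side queue.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def limited(i: int) -> float:
        async with semaphore:
            return await send_request(i)

    return await asyncio.gather(*(limited(i) for i in range(num_requests)))

# e.g. 1000 requests with at most 32 in flight at once:
# latencies = asyncio.run(run_benchmark(1000, max_concurrency=32))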