Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tstesco/benchmark-uplift #63

Closed
wants to merge 6 commits into from
Closed

tstesco/benchmark-uplift #63

wants to merge 6 commits into from

Conversation

tstescoTT
Copy link

changelog:

  • Uplift benchmarks/benchmark_serving.py to add support for max_concurrency feature to allow for large n when running benchmark sweeps and measuring correct TTFT and E2EL.
  • add backport_removeprefix in benchmarks/benchmark_serving.py to support Python 3.8+

Discussion

Previously the benchmarking script sends all requests at time=0 and vLLM queues them on the server side, thus measuring the queue time in TTFT and E2EL. The latest version from upstream uses asyncio.Semaphore(max_concurrency) to stop all requests from running at once, at a higher concurrency than the model max batch size. Setting max_concurrency to the max batch size of the model allows for correct measurement of TTFT and E2EL.

Copy link

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

Comment on lines +49 to +50
def backport_removeprefix(string: str, prefix: str) -> str:
return string[len(prefix):] if string.startswith(prefix) else string

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment to this function specifying why it's needed (& the torch version it's supporting)? EDIT: Nevermind, see other comment about remove_prefix

Comment on lines 238 to 255
"best_of": request_func_input.best_of,
# "best_of": request_func_input.best_of,
"max_tokens": request_func_input.output_len,
"logprobs": request_func_input.logprobs,
# "logprobs": request_func_input.logprobs,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this changed by mistake? (I don't think we can upstream this)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont support those parameters in our fork yet #44. I can add a TODO pointing to the issue.

Comment on lines -392 to -397
# Since vllm must support Python 3.8, we can't use str.removeprefix(prefix)
# introduced in Python 3.9
def remove_prefix(text: str, prefix: str) -> str:
if text.startswith(prefix):
return text[len(prefix):]
return text

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They already had a function remove_prefix for supporting 3.8 but intentionally removed it, they may not want us to add it back

Copy link
Author

@tstescoTT tstescoTT Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be here until we can move to python 3.9+, and that hopefully happens before we upstream. I can add e.g.:

 # TODO: remove backport after we can drop support of Python 3.8

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are aiming to proceed with the rebase + integration of the dev branch on to upstream in the next week or two, so I'm hesitant to push this since we'll have to remove it again

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, I can use this off the branch for now and close the PR. If you're rebasing to upstream then I can apply what I need on top of that and keep on a branch if still needed.

We can talk about when #37 and #44 can be addressed to remove restrictions.

@tstescoTT
Copy link
Author

closing as per #63 (comment)

@tstescoTT tstescoTT closed this Feb 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants