RFC: Benchmarking scenarios #99
Conversation
@sjmonson I like the direction of this. A few quick thoughts from my side:
Not sure if there's something out there already to automatically handle the last two points, but if so, that would be a great inclusion.
My plan was to rely on the base class json and yaml loaders since they are cleaner for nested structures, such as synthetic dataset args. I can definitely try pydantic-settings since that does have the advantage of allowing us to unify all GuideLLM options under one file.
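To illustrate the point about nested structures, here is a minimal stdlib sketch of loading a scenario from a JSON file. The `Scenario` dataclass below is a hypothetical, heavily simplified stand-in for `GenerativeTextScenario` (the real class is a pydantic model with many more fields), and the field names are illustrative:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Scenario:
    # Simplified stand-in for GenerativeTextScenario; names are illustrative.
    target: str = "http://localhost:8000"
    rate_type: str = "sweep"
    data: dict = field(default_factory=dict)  # nested synthetic dataset args

def load_scenario(path: str) -> Scenario:
    # JSON (and likewise YAML) maps cleanly onto nested blocks such as `data`,
    # which is awkward to express as flat CLI flags.
    with open(path) as f:
        return Scenario(**json.load(f))
```

Fields omitted from the file keep the class defaults, which is the behavior the precedence rules below rely on.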
By default pydantic will attempt type coercion during validation, so it's probably as simple as disabling the type handling in click and raising a
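The coercion being described can be sketched with the stdlib alone: click hands every value over as a string, and each one is converted to the type annotated on the scenario field, much as pydantic validation would. The `Scenario` fields here are hypothetical, not GuideLLM's actual options:

```python
from dataclasses import dataclass, fields

@dataclass
class Scenario:
    # Hypothetical fields for illustration only.
    target: str = "http://localhost:8000"
    max_seconds: int = 30
    warmup: float = 0.0

def coerce_cli_args(cli: dict) -> Scenario:
    # Coerce each raw CLI string to the annotated field type; omitted
    # options fall back to the dataclass defaults.
    field_types = {f.name: f.type for f in fields(Scenario)}
    return Scenario(**{k: field_types[k](v) for k, v in cli.items()})
```

In the real model, pydantic would also raise a validation error for values that cannot be coerced, which this sketch leaves to the bare constructor call.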
They already do in this patch, as I have set every click default to pull from the scenario model, unless you mean something else?
@sjmonson, for the last one, yes, something else. Specifically the entrypoint for the benchmarking Python API here: https://github.com/neuralmagic/guidellm/blob/main/src/guidellm/benchmark/entrypoints.py#L22. This goes towards @anmarques's previous issue of needing to set all argument values when not using the CLI. It would also help toward supporting both the CLI and the Python API for scenarios.
Ah, I see. The workflow I imagined for @anmarques's use-case is to call the higher-level entrypoint:

```python
result = await benchmark_with_scenario(
    GenerativeTextScenario(
        target="http://localhost:8000",
        data={
            "prompt_tokens": 128,
            "output_tokens": 128,
        },
        rate_type="sweep",
    ),
    output_path="output.json",
)
```
This PR adds support for "scenarios" that allow specifying benchmark arguments in a file or as a single Pydantic object. CLI argument defaults are loaded from the scenario object's defaults, so benchmark-as-code users get the same defaults as the CLI. Argument values in the CLI follow this precedence:
Scenario (class defaults) < Scenario (CLI provided Scenario) < CLI Arguments
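That precedence chain can be sketched as successive dict merges, where later sources override earlier ones and unset CLI options (here modeled as `None`) do not clobber the scenario. The field names and the `resolve_args` helper are hypothetical, for illustration only:

```python
# Lowest precedence: defaults baked into the scenario class (names illustrative).
CLASS_DEFAULTS = {
    "target": "http://localhost:8000",
    "rate_type": "sweep",
    "max_seconds": 30,
}

def resolve_args(file_scenario: dict, cli_args: dict) -> dict:
    resolved = dict(CLASS_DEFAULTS)   # class defaults
    resolved.update(file_scenario)    # overridden by the CLI-provided scenario
    # CLI arguments win, but only when the user actually set them.
    resolved.update({k: v for k, v in cli_args.items() if v is not None})
    return resolved
```

Filtering out `None` is what lets click defaults come from the scenario model rather than silently resetting scenario-file values.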
This PR is not in a finalized state, but comments on the design are encouraged.
Closes: