
benchmark on audio_transcriptions fails #522

@tukwila

Description


Describe the bug

Precondition:
The following script generates the audio test dataset.

import numpy as np
import pandas as pd
import wave
import struct

def generate_and_save_wav_with_metadata():
    """generate WAV format file and save it into CSV file"""
    
    # audio parameters
    sample_rate = 44100
    duration = 3.0
    frequency = 523.25  # Hz (C5)
    
    # generate audio data
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    audio_data = 0.5 * np.sin(2 * np.pi * frequency * t)
    
    # convert to 16-bit PCM
    audio_int16 = np.int16(audio_data * 32767)
    
    # save as a WAV file
    with wave.open('test_audio.wav', 'w') as wav_file:
        wav_file.setnchannels(1)  # mono (single channel)
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(audio_int16.tobytes())
    
    # create metadata for CSV
    metadata = pd.DataFrame([{
        'filename': 'test_audio.wav',
        'sample_rate': sample_rate,
        'duration': duration,
        'frequency_hz': frequency,
        'channels': 1,
        'bits_per_sample': 16,
        'num_samples': len(audio_data),
        'max_amplitude': float(np.max(np.abs(audio_data))),
        'rms': float(np.sqrt(np.mean(audio_data**2)))
    }])
    
    metadata.to_csv('audio_metadata.csv', index=False)
    
    print(f"WAV save into: test_audio.wav")
    print(f"metadata save into: audio_metadata.csv")
    print("\n metadata content:")
    print(metadata.T)


generate_and_save_wav_with_metadata()
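As a sanity check, the generated file can be read back with the stdlib wave module. A minimal sketch (it writes its own copy of the tone to check_audio.wav so it is self-contained and does not depend on the script above having run):

```python
import math
import struct
import wave

sample_rate = 44100
duration = 3.0
frequency = 523.25
n = int(sample_rate * duration)

# synthesize the same 16-bit PCM sine tone, one frame at a time
frames = b"".join(
    struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * frequency * i / sample_rate)))
    for i in range(n)
)
with wave.open("check_audio.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(sample_rate)
    wf.writeframes(frames)

# read the header back and confirm the parameters match the metadata CSV
with wave.open("check_audio.wav", "rb") as wf:
    assert wf.getnchannels() == 1
    assert wf.getsampwidth() == 2
    assert wf.getframerate() == sample_rate
    print(wf.getnframes())  # 132300 frames = 3.0 s at 44.1 kHz
```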

Benchmark steps:

  1. Check out the branch from https://github.com/vllm-project/guidellm/pull/521, then run: pip install -e ./[dev]
  2. Start the mock server: guidellm mock-server --host 0.0.0.0 --port 8080
  3. Add some console prints in src/guidellm/scheduler/scheduler.py; otherwise the benchmark result contains no useful request info.
  4. Run the audio-generation script to create the WAV file and metadata CSV in the local path

  5. Run the benchmark:

guidellm benchmark \
    --target "http://localhost:8080" \
    --request-type "audio_transcriptions" \
    --rate-type "throughput" \
    --rate 1 \
    --max-requests 1 \
    --data "./audio_metadata.csv"

Benchmark console output:

 request_info in scheduler  request_id='7562ef0e-8c58-435c-a48a-1292511e8d7f'
status='queued' scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766641850.768449
timings=RequestTimings(targeted_start=None, queued=1766641852.279289,
dequeued=None, scheduled_at=None, resolve_start=None, request_start=None,
first_request_iteration=None, first_token_iteration=None,
last_token_iteration=None, last_request_iteration=None, request_iterations=0,
token_iterations=0, request_end=None, resolve_end=None, finalized=None)
error=None started_at=None completed_at=None

 request_info in scheduler  request_id='7562ef0e-8c58-435c-a48a-1292511e8d7f'
status='pending' scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766641850.768449
timings=RequestTimings(targeted_start=1766641850.768449,
queued=1766641852.279289, dequeued=1766641852.2864509, scheduled_at=None,
resolve_start=None, request_start=None, first_request_iteration=None,
first_token_iteration=None, last_token_iteration=None,
last_request_iteration=None, request_iterations=0, token_iterations=0,
request_end=None, resolve_end=None, finalized=None) error=None started_at=None
completed_at=None

 request_info in scheduler  request_id='7562ef0e-8c58-435c-a48a-1292511e8d7f'
status='in_progress' scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766641850.768449
timings=RequestTimings(targeted_start=1766641850.768449,
queued=1766641852.279289, dequeued=1766641852.2864509,
scheduled_at=1766641852.2864509, resolve_start=1766641852.286586,
request_start=None, first_request_iteration=None, first_token_iteration=None,
last_token_iteration=None, last_request_iteration=None, request_iterations=0,
token_iterations=0, request_end=None, resolve_end=None, finalized=None)
error=None started_at=1766641852.286586 completed_at=None

 request_info in scheduler  request_id='7562ef0e-8c58-435c-a48a-1292511e8d7f'
status='errored' scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766641850.768449
timings=RequestTimings(targeted_start=1766641850.768449,
queued=1766641852.279289, dequeued=1766641852.2864509,
scheduled_at=1766641852.2864509, resolve_start=1766641852.286586,
request_start=1766641852.286649, first_request_iteration=None,
first_token_iteration=None, last_token_iteration=None,
last_request_iteration=None, request_iterations=0, token_iterations=0,
request_end=None, resolve_end=1766641852.287371, finalized=1766641852.290621)
error="Invalid type for value. Expected primitive type, got <class 'dict'>:
{'include_usage': True}" started_at=1766641852.286649
completed_at=1766641852.287371
╭─ Benchmarks ─────────────────────────────────────────────────────────────────╮
│ [1… thr… (c… Req:    0.0 req/s,    0.00s Lat,     0.0 Conc,       0 Comp,  … │
│              Tok:    0.0 gen/s,    0.0 tot/s,   0.0ms TTFT,    0.0ms ITL,  … │
╰──────────────────────────────────────────────────────────────────────────────╯
Generating... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ (1/1) [ 0:00:02 < 0:00:00 ]
25-12-25 13:50:53|DEBUG            |guidellm.utils.text:load_text:228 - Loading text: https://blog.vllm.ai/guidellm/ui/v0.5.0/index.html


ℹ Run Summary Info
|============|==========|==========|=====|======|======|======|=====|=====|======|=====|=====|
| Benchmark  | Timings                             ||||| Input Tokens   ||| Output Tokens  |||
| Strategy   | Start    | End      | Dur | Warm | Cool | Comp | Inc | Err | Comp | Inc | Err |
|            |          |          | Sec | Sec  | Sec  | Tot  | Tot | Tot | Tot  | Tot | Tot |
|------------|----------|----------|-----|------|------|------|-----|-----|------|-----|-----|
| throughput | 13:50:50 | 13:50:52 | 1.5 | 0.0  | 0.0  | 0.0  | 0.0 | 0.0 | 0.0  | 0.0 | 0.0 |
|============|==========|==========|=====|======|======|======|=====|=====|======|=====|=====|


ℹ Audio Metrics Statistics (Completed Requests)
|============|=======|======|======|======|=======|======|======|======|=======|======|======|======|
| Benchmark  | Input Samples           |||| Input Seconds           |||| Input Bytes             ||||
| Strategy   | Per Request || Per Second || Per Request || Per Second || Per Request || Per Second ||
|            | Mdn   | p95  | Mdn  | Mean | Mdn   | p95  | Mdn  | Mean | Mdn   | p95  | Mdn  | Mean |
|------------|-------|------|------|------|-------|------|------|------|-------|------|------|------|
| throughput | 0.0   | 0.0  | 0.0  | 0.0  | 0.0   | 0.0  | 0.0  | 0.0  | 0.0   | 0.0  | 0.0  | 0.0  |
|============|=======|======|======|======|=======|======|======|======|=======|======|======|======|


ℹ Request Token Statistics (Completed Requests)
|============|======|=====|======|======|======|=====|=======|======|=========|========|
| Benchmark  | Input Tok || Output Tok || Total Tok || Stream Iter || Output Tok      ||
| Strategy   | Per Req   || Per Req    || Per Req   || Per Req     || Per Stream Iter ||
|            | Mdn  | p95 | Mdn  | p95  | Mdn  | p95 | Mdn   | p95  | Mdn     | p95    |
|------------|------|-----|------|------|------|-----|-------|------|---------|--------|
| throughput | 0.0  | 0.0 | 0.0  | 0.0  | 0.0  | 0.0 | 0.0   | 0.0  | 0.0     | 0.0    |
|============|======|=====|======|======|======|=====|=======|======|=========|========|


ℹ Request Latency Statistics (Completed Requests)
|============|=========|========|=====|=====|=====|=====|=====|=====|
| Benchmark  | Request Latency || TTFT     || ITL      || TPOT     ||
| Strategy   | Sec             || ms       || ms       || ms       ||
|            | Mdn     | p95    | Mdn | p95 | Mdn | p95 | Mdn | p95 |
|------------|---------|--------|-----|-----|-----|-----|-----|-----|
| throughput | 0.0     | 0.0    | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
|============|=========|========|=====|=====|=====|=====|=====|=====|


ℹ Server Throughput Statistics
|============|=====|======|=======|======|=======|=======|========|=======|=======|=======|
| Benchmark  | Requests               |||| Input Tokens || Output Tokens || Total Tokens ||
| Strategy   | Per Sec   || Concurrency || Per Sec      || Per Sec       || Per Sec      ||
|            | Mdn | Mean | Mdn   | Mean | Mdn   | Mean  | Mdn    | Mean  | Mdn   | Mean  |
|------------|-----|------|-------|------|-------|-------|--------|-------|-------|-------|
| throughput | 0.0 | 0.0  | 0.0   | 0.0  | 0.0   | 0.0   | 0.0    | 0.0   | 0.0   | 0.0   |
|============|=====|======|=======|======|=======|=======|========|=======|=======|=======|



✔ Benchmarking complete, generated 1 benchmark(s)

The final request_info reports the error:
error="Invalid type for value. Expected primitive type, got <class 'dict'>:
{'include_usage': True}"
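The wording of this error matches the primitive-type check that HTTP clients such as httpx apply to multipart form-field values, which suggests the stream_options dict ({'include_usage': True}) is being put into the form data as-is instead of being JSON-encoded first. A minimal stdlib sketch of that check and the usual fix; the check_form_value helper is hypothetical and only mirrors the validation, it is not the actual guidellm/httpx call site:

```python
import json

def check_form_value(name, value):
    # multipart form-field values must be primitives; this mirrors the
    # validation that produces the error seen in the benchmark output
    if value is not None and not isinstance(value, (str, bytes, int, float)):
        raise TypeError(
            f"Invalid type for value. Expected primitive type, got {type(value)}: {value!r}"
        )

# a nested dict triggers the same error text as the benchmark run
try:
    check_form_value("stream_options", {"include_usage": True})
except TypeError as e:
    print(e)  # Invalid type for value. Expected primitive type, got <class 'dict'>: {'include_usage': True}

# JSON-encoding the nested options before adding them to the form avoids it
check_form_value("stream_options", json.dumps({"include_usage": True}))
```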

Expected behavior

The audio_transcriptions benchmark should run successfully against the mock server.

Environment
Include all relevant environment information:

  1. OS: macOS 12.7.6
  2. Python version: 3.11.5
  3. guidellm version: 0.5.0.dev0

To Reproduce
See the benchmark steps above.


