[Feature Request] During streaming, return usage #3412

@hzzhyj

Description

Motivation

When a response is streamed, token usage is not returned.

Even after enabling usage reporting via

model_config_dict={
    "stream": True,
    "stream_options": {"include_usage": True},
},

the input and output token counts are not returned. Calling the Gemini API directly does return both input and output token counts.

Only a total token count is returned, which is not enough for billing purposes because input tokens and output tokens are priced differently.
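
To illustrate why the total alone is not enough, here is a small sketch. The prices are made-up placeholders, not real Gemini rates, and `billing_cost` is a hypothetical helper, not something from this library. Two requests with the same total of 1000 tokens end up with different costs depending on the input/output split:

```python
def billing_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Cost of one request when input and output tokens are priced separately."""
    return (input_tokens / 1000) * input_price_per_1k + (
        output_tokens / 1000
    ) * output_price_per_1k


# Hypothetical prices per 1K tokens (placeholders, not real rates).
INPUT_PRICE = 0.001
OUTPUT_PRICE = 0.002

# Both requests have total_tokens == 1000, but a different split.
cost_mostly_input = billing_cost(900, 100, INPUT_PRICE, OUTPUT_PRICE)
cost_mostly_output = billing_cost(100, 900, INPUT_PRICE, OUTPUT_PRICE)

print(cost_mostly_input, cost_mostly_output)
```

With only `total_tokens` available, these two cases cannot be told apart.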

Solution

Support returning input and output token counts for streamed responses.

Alternatives

There is no real alternative; this simply needs to be supported.

Additional context

No response
