What is the maximum token limit for response generation in DSPY #1223
Replies: 1 comment
The 128k figure is the limit on the combined total of input and output tokens (the context window). If you need the model to produce 3k tokens of output, you can provide up to 125k tokens of input (including system and user messages); input plus output can never exceed 128k. Separately, the current model is restricted to a maximum of 4k (more precisely, 4096) tokens for its output, so `max_tokens` cannot exceed 4096.
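The budget arithmetic above can be sketched in a few lines. This is an illustration of the two limits described in the answer, not an official OpenAI API; the constant values come from the thread (128k context window, 4096-token completion cap for GPT-4o):

```python
# Two separate limits, as described above:
CONTEXT_WINDOW = 128_000   # combined input + output tokens
COMPLETION_CAP = 4_096     # maximum output (completion) tokens

def max_input_tokens(desired_output_tokens: int) -> int:
    """Largest prompt that still leaves room for the desired output."""
    if desired_output_tokens > COMPLETION_CAP:
        # Mirrors the API's 400 error for an oversized max_tokens
        raise ValueError(
            f"max_tokens is too large: {desired_output_tokens}. "
            f"This model supports at most {COMPLETION_CAP} completion tokens."
        )
    return CONTEXT_WINDOW - desired_output_tokens

print(max_input_tokens(3_000))  # 125000 tokens left for the prompt
```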
I am using the GPT-4o LM for my project. I checked that its token limit is 128,000. However, when I set max_tokens=50000, I get the following error:
```
   1029             log.debug("Re-raising status error")
-> 1030             raise self._make_status_error_from_response(err.response) from None
   1031
   1032         return self._process_response(

BadRequestError: Error code: 400 - {'error': {'message': 'max_tokens is too large: 50000. This model supports at most 4096 completion tokens, whereas you provided 50000.', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}
```
Can someone explain why this is happening?
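One way to avoid this error is to clamp the requested budget to the model's completion cap before making the request. A minimal sketch, assuming the 4096-token cap reported in the error message (the helper name is hypothetical, and the commented `dspy.LM` usage assumes DSPy's standard LM constructor):

```python
# Clamp the requested completion budget so the request never
# triggers the 400 BadRequestError shown above.
GPT4O_COMPLETION_CAP = 4_096  # from the error message

def safe_max_tokens(requested: int, cap: int = GPT4O_COMPLETION_CAP) -> int:
    """Return a max_tokens value the model will accept."""
    return min(requested, cap)

# Hypothetical usage when configuring the LM:
# lm = dspy.LM("openai/gpt-4o", max_tokens=safe_max_tokens(50_000))
print(safe_max_tokens(50_000))  # 4096
```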