fix: text and character limits on Cohere embedding API #376
In a previous PR, I added support for retrieving more than one embedding at once from the Cohere embedding API.
@romenlee pointed out that this causes problems when embedding a large number of texts. It looks like the Cohere embedding API has a limit of 96 texts or 2048 characters per request.
This PR implements more intelligent batching that respects these limits: it splits the texts into chunks of at most 96 texts or 2048 characters, submits each chunk to the embedding API, and combines the results in the original order.
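For reviewers, here is a minimal sketch of the batching idea, not the exact code in this PR. It assumes the 2048-character limit applies to the combined length of all texts in a request, and that a single text longer than the limit is sent alone in its own batch. The names `batch_texts`, `embed_all`, and `embed_batch` are hypothetical; `embed_batch` stands in for whatever function actually calls the Cohere API.

```python
from typing import Callable, Iterator, List, Sequence

MAX_TEXTS = 96    # assumed per-request text limit, as described above
MAX_CHARS = 2048  # assumed per-request character limit, as described above


def batch_texts(
    texts: Sequence[str],
    max_texts: int = MAX_TEXTS,
    max_chars: int = MAX_CHARS,
) -> Iterator[List[str]]:
    """Yield consecutive batches that stay within both limits.

    A text that is longer than max_chars on its own is yielded as a
    single-item batch (an assumption; the caller may prefer to truncate).
    """
    batch: List[str] = []
    chars = 0
    for text in texts:
        # Close the current batch if adding this text would exceed a limit.
        if batch and (len(batch) >= max_texts or chars + len(text) > max_chars):
            yield batch
            batch, chars = [], 0
        batch.append(text)
        chars += len(text)
    if batch:
        yield batch


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
) -> List[List[float]]:
    """Embed every text, one request per batch, preserving input order."""
    embeddings: List[List[float]] = []
    for batch in batch_texts(texts):
        embeddings.extend(embed_batch(batch))
    return embeddings
```

Because batches are built from consecutive texts and the results are appended in order, the combined output lines up index-for-index with the input list.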
There is a complexity vs. efficiency tradeoff here for the maintainers to decide. This code is much more complex than the original naive code from before #350, but it makes far fewer network calls and so will execute faster. If we don't want to take on this complexity, an alternative would be to just revert #350, which would restore functionality for embedding large texts at the cost of giving up the performance improvements.