
feat(embedder): add Cohere dense embedder #941

Merged
MaojiaSheng merged 6 commits into volcengine:main from Dicoangelo:feat/cohere-embedder
Mar 30, 2026

@Dicoangelo
Contributor

Summary

  • Adds CohereDenseEmbedder using Cohere's Embed API v2 (/v2/embed)
  • Supports models: embed-v4.0, embed-english-v3.0, embed-multilingual-v3.0, embed-*-light-v3.0
  • Server-side dimension reduction for embed-v4.0 (256/512/1024/1536) via output_dimension
  • Client-side truncation + L2 renormalization fallback for v3 models
  • Asymmetric retrieval via input_type (search_query / search_document)
  • Batch embedding with 96-item chunking (Cohere API limit)
  • Full factory integration: provider validation, dimension auto-resolution, _create_embedder registry
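
The client-side fallback for v3 models (truncate, then L2-renormalize) can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `truncate_and_renormalize` is hypothetical, and it assumes the embedding arrives as a plain list of floats.

```python
import math
from typing import List


def truncate_and_renormalize(vector: List[float], target_dim: int) -> List[float]:
    """Keep the first `target_dim` components, then rescale to unit L2 norm.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if target_dim <= 0 or target_dim > len(vector):
        raise ValueError(
            f"target_dim must be in 1..{len(vector)}, got {target_dim}"
        )
    truncated = vector[:target_dim]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        # An all-zero vector cannot be normalized; return it unchanged.
        return truncated
    return [x / norm for x in truncated]
```

Renormalizing after truncation matters because dropping components shrinks the vector's length, which would otherwise skew cosine or dot-product scores computed against full-length unit vectors.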

Changes

  • New: openviking/models/embedder/cohere_embedders.py: CohereDenseEmbedder class
  • Modified: openviking/models/embedder/__init__.py: export + __all__ entry
  • Modified: openviking_cli/utils/config/embedding_config.py: "cohere" in provider validation, dimension resolution, factory registry

Config example

{
  "embedding": {
    "dense": {
      "provider": "cohere",
      "model": "embed-v4.0",
      "api_key": "your-cohere-api-key",
      "dimension": 1024
    }
  }
}
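
How a configured `dimension` might be resolved against the model can be sketched like this. It is an illustration under stated assumptions, not the logic in `embedding_config.py`: `resolve_dimension` is a hypothetical name, the 1536 default and the allowed sizes for embed-v4.0 come from this PR description, and the 1024 default for embed-english-v3.0 is an assumption.

```python
from typing import Optional

# Allowed server-side sizes for embed-v4.0, as listed in the PR summary.
V4_ALLOWED_DIMS = (256, 512, 1024, 1536)

# Native output sizes. The embed-v4.0 value is from the PR description;
# the embed-english-v3.0 value is an assumption for this sketch.
DEFAULT_DIMS = {
    "embed-v4.0": 1536,
    "embed-english-v3.0": 1024,
}


def resolve_dimension(model: str, requested: Optional[int] = None) -> int:
    """Return the vector size to use for `model` (hypothetical helper)."""
    default = DEFAULT_DIMS.get(model)
    if default is None:
        raise ValueError(f"unknown model: {model}")
    if requested is None:
        return default
    if model == "embed-v4.0":
        # embed-v4.0 reduces dimensions server-side, so only fixed sizes apply.
        if requested not in V4_ALLOWED_DIMS:
            raise ValueError(f"embed-v4.0 supports {V4_ALLOWED_DIMS}, got {requested}")
        return requested
    # v3 models are truncated client-side, so any size up to the native one works.
    if not 0 < requested <= default:
        raise ValueError(f"dimension must be in 1..{default}, got {requested}")
    return requested
```

With the config above, `resolve_dimension("embed-v4.0", 1024)` yields 1024, while omitting `dimension` falls back to 1536.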

Test plan

  • Validated config parsing with provider: "cohere" passes Pydantic validation
  • Tested with live Cohere API: 528+ vectors indexed, 237 embeddings, 0 errors
  • Semantic search quality verified: 0.43-0.55 relevance scores on targeted queries
  • Batch processing tested with 96-item chunking on 12K+ token document
  • Dimension auto-resolution tested: embed-v4.0 → 1536 default, configurable to 1024
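
The 96-item chunking exercised in the batch test can be sketched as below. The helper name `chunked` and the constant `COHERE_MAX_BATCH` are hypothetical; only the limit of 96 comes from this PR.

```python
from typing import Iterator, List, Sequence

# Per-request text limit cited in the PR summary.
COHERE_MAX_BATCH = 96


def chunked(texts: Sequence[str], size: int = COHERE_MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])
```

Because the slices are yielded in order, concatenating the per-chunk embeddings reproduces the order of the input texts.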


@CLAassistant

CLAassistant commented Mar 24, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

Failed to generate code suggestions for PR

@MaojiaSheng
Collaborator

@Dicoangelo Thanks, but there are some conflicts that need to be resolved.

Adds CohereDenseEmbedder using Cohere's Embed API v2.
- Supports embed-v4.0, embed-english-v3.0, embed-multilingual-v3.0
- Server-side dimension reduction for embed-v4.0 (256/512/1024/1536)
- Client-side truncation + renormalization fallback for v3 models
- Asymmetric search via input_type (search_query/search_document)
- Batch embedding with 96-item chunking (Cohere API limit)
- Full factory integration: provider validation, dimension resolution
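
A rough sketch of how the asymmetric `input_type` and the `output_dimension` field could be assembled into a /v2/embed request body. The helper `build_embed_payload` and its `is_query` flag are hypothetical, and the field names reflect my reading of Cohere's public Embed v2 reference rather than the code in this commit. No network call is made here.

```python
from typing import Any, Dict, List, Optional


def build_embed_payload(
    texts: List[str],
    model: str,
    is_query: bool,
    output_dimension: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a JSON body for POST /v2/embed (hypothetical helper)."""
    payload: Dict[str, Any] = {
        "model": model,
        "texts": texts,
        # Queries and documents are embedded differently for retrieval.
        "input_type": "search_query" if is_query else "search_document",
        "embedding_types": ["float"],
    }
    # Only embed-v4.0 reduces dimensions server-side; v3 models are
    # truncated on the client instead, so the field is omitted for them.
    if output_dimension is not None and model == "embed-v4.0":
        payload["output_dimension"] = output_dimension
    return payload
```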

16 tests covering:
- Init validation (api_key required, defaults, model dimensions)
- Dimension handling (v4 server-side, v3 client-side truncation, invalid dims)
- Embedding calls (single, batch, query vs document input_type)
- output_dimension sent for embed-v4.0
- Error handling (API errors → RuntimeError)
- Resource cleanup (close)

Extends RerankConfig with provider field and api_key for Cohere.
Adds CohereRerankClient with same interface as VikingDB RerankClient.
HierarchicalRetriever auto-selects rerank backend based on provider.

Config example:
  "rerank": {"provider": "cohere", "api_key": "...", "threshold": 0.15}

Quality improvement: META tokenomics query 0.55 → 0.77 relevance score.

9 tests covering:
- Rerank batch scoring with index-to-order mapping
- Empty input handling
- API error graceful fallback (returns None)
- Original order preservation from Cohere's sorted response
- Resource cleanup
- RerankConfig provider auto-detection (cohere/vikingdb/empty)
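
The index-to-order mapping these tests cover can be sketched as follows. It assumes each rerank result is a dict with `index` and `relevance_score` keys, sorted by score, as I understand Cohere's rerank response to be shaped. The function name is hypothetical.

```python
from typing import Dict, List, Sequence, Union

RerankResult = Dict[str, Union[int, float]]


def scores_in_input_order(
    results: Sequence[RerankResult], num_documents: int
) -> List[float]:
    """Map score-sorted rerank results back to the original document order.

    Documents missing from `results` keep a score of 0.0 (an assumption
    of this sketch, not necessarily the PR's behavior).
    """
    scores = [0.0] * num_documents
    for item in results:
        index = int(item["index"])
        if not 0 <= index < num_documents:
            raise ValueError(f"result index {index} out of range")
        scores[index] = float(item["relevance_score"])
    return scores
```

Returning scores aligned with the input list lets the caller keep the same interface as a backend that scores documents in place.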

More vector candidates for reranker to evaluate = better precision.
With Cohere rerank-v3.5, 10 candidates gives the cross-encoder enough
material to find the best match without excessive latency.

@Dicoangelo force-pushed the feat/cohere-embedder branch from ac0552f to 5fd06bb on March 29, 2026 04:34
…lient.from_config()

Cohere was special-cased in hierarchical_retriever.py while openai/litellm
went through the centralized RerankClient.from_config() dispatch. This commit
adds CohereRerankClient.from_config() and routes it through the same path.

Also fixes a bug where from_config() used config.provider directly instead
of _effective_provider(), which meant auto-detected providers (e.g. api_key
without explicit provider="cohere") would not dispatch correctly.
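
The difference between dispatching on the raw provider and on the effective provider can be sketched roughly as below. Everything here is hypothetical: `RerankConfigSketch`, the `ak` field, the detection order, and `create_rerank_client` are stand-ins and do not reproduce the real RerankConfig or RerankClient.from_config().

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class RerankConfigSketch:
    """Stand-in for RerankConfig; field names are assumptions."""

    provider: Optional[str] = None
    api_key: Optional[str] = None
    ak: Optional[str] = None  # hypothetical VikingDB credential

    def effective_provider(self) -> str:
        # An explicit provider wins; otherwise infer from the credentials present.
        if self.provider:
            return self.provider.lower()
        if self.api_key:
            return "cohere"
        if self.ak:
            return "vikingdb"
        return ""


def create_rerank_client(
    config: RerankConfigSketch,
    factories: Dict[str, Callable[[RerankConfigSketch], Any]],
) -> Optional[Any]:
    """Dispatch on the effective provider, not on config.provider."""
    factory = factories.get(config.effective_provider())
    return factory(config) if factory else None
```

With only `api_key` set, `config.provider` is `None`, so a lookup keyed on it finds nothing, while the effective provider resolves to "cohere" and dispatches correctly.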

@MaojiaSheng merged commit 9bcdaf4 into volcengine:main on Mar 30, 2026
2 checks passed
@github-project-automation bot moved this from Backlog to Done in OpenViking project on Mar 30, 2026
