
Conversation

@Kaos-dawn (Collaborator) commented Apr 18, 2025

feat: add embed methods to each of the major model adapters
feat: add an abstract embed method to LLMBackendAdapter.
feat: add a provider that offers embedding-only models
tips: the new provider is not registered yet, pending an update to the LLM registration logic.

Sourcery Summary

Add embedding support to multiple LLM backend adapters, introducing standardized embedding request and response formats across different AI providers.

New Features:

  • Implement embedding methods for the OpenAI, Gemini, Ollama, and Voyage adapters.
  • Create standardized LLMEmbeddingRequest and LLMEmbeddingResponse formats for embedding operations.

Enhancements:

  • Extend LLMBackendAdapter with an abstract embed method.
  • Add support for text and multi-modal embeddings across different AI providers.


Summary by Sourcery

Add embedding and reranking capabilities to LLM adapters.

New Features:

  • Implement embedding methods for OpenAI, Gemini, and Ollama adapters.
  • Introduce a new Voyage adapter supporting text/multi-modal embedding and reranking.

Enhancements:

  • Define LLMEmbeddingProtocol and LLMReRankProtocol for adapters.
  • Create standardized request and response formats (LLMEmbeddingRequest/Response, LLMReRankRequest/Response).
  • Refactor the OpenAI adapter and update inheriting adapters (Minimax, DeepSeek, Moonshot).

Build:

  • Update Python version requirement to >=3.10.
  • Remove unused dependencies from setup.py.

Tests:

  • Add tests for the new Voyage adapter functionality.

@sourcery-ai sourcery-ai bot (Contributor) commented Apr 18, 2025


Reviewer's Guide

This pull request introduces support for embedding models by defining standard request/response formats (LLMEmbeddingRequest, LLMEmbeddingResponse) and an LLMEmbeddingProtocol. This protocol is implemented in the existing OpenAI, Gemini, and Ollama adapters, as well as a newly added Voyage adapter, by calling the respective provider APIs. The Voyage adapter also introduces reranking capabilities via a new LLMReRankProtocol and corresponding formats. Base adapter classes and protocols were refactored to accommodate these changes, and the minimum Python version was raised to 3.10.
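
For orientation, here is a minimal sketch of what these standardized formats and the embedding protocol might look like. The field names (model, inputs, dimension, encoding_format, input_type, truncate, vectors, usage, total_tokens) are taken from code quoted later in this conversation; the use of plain dataclasses and the exact optionality of each field are assumptions, not the repository's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable


@dataclass
class Usage:
    total_tokens: int = 0


@dataclass
class LLMEmbeddingRequest:
    model: str
    inputs: list                           # text items or multi-modal content items
    dimension: Optional[int] = None
    encoding_format: Optional[str] = None
    input_type: Optional[str] = None       # used by multi-modal providers such as Voyage
    truncate: Optional[bool] = None


@dataclass
class LLMEmbeddingResponse:
    vectors: list[list[float]] = field(default_factory=list)
    usage: Optional[Usage] = None


@runtime_checkable
class LLMEmbeddingProtocol(Protocol):
    def embed(self, req: LLMEmbeddingRequest) -> LLMEmbeddingResponse:
        ...
```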

Sequence Diagram for Embedding Request

sequenceDiagram
    participant Client
    participant Adapter as LLMAdapter (e.g., VoyageAdapter)
    participant ProviderAPI as External Provider API (e.g., Voyage API)

    Client->>+Adapter: embed(LLMEmbeddingRequest)
    Adapter->>+ProviderAPI: POST /v1/embeddings (or similar)
    ProviderAPI-->>-Adapter: EmbeddingResponse (JSON)
    Adapter->>Adapter: Process response
    Adapter-->>-Client: LLMEmbeddingResponse

Sequence Diagram for Reranking Request (Voyage)

sequenceDiagram
    participant Client
    participant VoyageAdapter
    participant VoyageAPI as Voyage Rerank API

    Client->>+VoyageAdapter: rerank(LLMReRankRequest)
    VoyageAdapter->>+VoyageAPI: POST /v1/rerank
    VoyageAPI-->>-VoyageAdapter: RerankResponse (JSON)
    VoyageAdapter->>VoyageAdapter: Process response, sort results
    VoyageAdapter-->>-Client: LLMReRankResponse
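
A matching sketch for the reranking side, following the sequence above. Only the LLMReRankRequest/LLMReRankResponse names, the sort flag, and the POST /v1/rerank endpoint come from this PR; the query, documents, and scores fields, the relevance_score key, and the helper function are illustrative assumptions (the usage field is omitted for brevity).

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable

import requests


@dataclass
class LLMReRankRequest:
    model: str
    query: str                   # assumed field name
    documents: list[str]         # assumed field name
    sort: Optional[bool] = None  # whether results should come back sorted by relevance


@dataclass
class LLMReRankResponse:
    scores: list[float] = field(default_factory=list)  # assumed field name
    sort: bool = False


@runtime_checkable
class LLMReRankProtocol(Protocol):
    def rerank(self, req: LLMReRankRequest) -> LLMReRankResponse:
        ...


def rerank_via_http(req: LLMReRankRequest, api_base: str, api_key: str) -> LLMReRankResponse:
    """Illustrative client for the POST /v1/rerank flow shown in the diagram."""
    response = requests.post(
        f"{api_base}/v1/rerank",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        json={"model": req.model, "query": req.query, "documents": req.documents},
    )
    response.raise_for_status()
    data = response.json()
    # Assume one relevance score per input document in the provider's response.
    scores = [item.get("relevance_score", 0.0) for item in data.get("data", [])]
    return LLMReRankResponse(scores=scores, sort=bool(req.sort))
```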

File-Level Changes

Change | Details | Files
Introduce Embedding Abstraction
  • Define standard LLMEmbeddingRequest and LLMEmbeddingResponse models.
  • Define LLMEmbeddingProtocol for embedding implementation.
  • Add related type hints and utility types.
kirara_ai/llm/format/embedding.py
kirara_ai/llm/adapter.py
Implement Embedding in Existing Adapters
  • Implement the embed method for OpenAI, Gemini, and Ollama adapters.
  • Map standard request format to provider-specific API calls.
  • Parse provider responses into the standard LLMEmbeddingResponse format.
kirara_ai/plugins/llm_preset_adapters/openai_adapter.py
kirara_ai/plugins/llm_preset_adapters/gemini_adapter.py
kirara_ai/plugins/llm_preset_adapters/ollama_adapter.py
Add Voyage Adapter with Embedding and Reranking
  • Create new VoyageAdapter implementing LLMEmbeddingProtocol and LLMReRankProtocol.
  • Implement text and multi-modal embedding using Voyage APIs.
  • Implement reranking using Voyage API.
  • Define standard LLMReRankRequest, LLMReRankResponse models and LLMReRankProtocol.
  • Add helper for resolving media to base64 for multi-modal input.
  • Add unit tests for media resolution.
kirara_ai/plugins/llm_preset_adapters/voyage_adapter.py
kirara_ai/llm/format/rerank.py
kirara_ai/llm/adapter.py
tests/test_voyage.py
kirara_ai/plugins/llm_preset_adapters/__init__.py
Refactor Adapter Structure and Protocols
  • Introduce LLMChatProtocol using @runtime_checkable.
  • Remove abstract chat method from base LLMBackendAdapter.
  • Refactor OpenAIAdapter into OpenAIAdapterChatBase for chat-only functionality.
  • Update adapters inheriting from OpenAI base class.
  • Update other adapters and tests to use the new protocols.
kirara_ai/llm/adapter.py
kirara_ai/plugins/llm_preset_adapters/openai_adapter.py
kirara_ai/plugins/llm_preset_adapters/minimax_adapter.py
kirara_ai/plugins/llm_preset_adapters/deepseek_adapter.py
kirara_ai/plugins/llm_preset_adapters/moonshot_adapter.py
kirara_ai/plugins/llm_preset_adapters/claude_adapter.py
tests/tracing/test_decorator.py
tests/web/api/llm/test_llm.py
Update Project Configuration
  • Increase minimum required Python version from 3.9 to 3.10.
  • Adjust dependencies in adapter setup.
pyproject.toml
kirara_ai/plugins/llm_preset_adapters/setup.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@Kaos-dawn (Collaborator Author)

There's a small issue — code from two branches got mixed up; let me sort it out before we go further.

feat: add an abstract embed method to the base Adapter class
feat: add an embedding-only model provider and its Adapter
feat: add embedding support and the corresponding implementations for the major models
refactor: rework llm_adapter to build its basic functionality from composed interfaces
feature: add a rerank interface for reranker models and design the corresponding models
feature: add an embedding interface for embedding models and design the corresponding models
feature: add embedding API support for the main models
feature: add a provider for embedding and reranker models (not yet registered)
merged: merge the mainline mc-related changes and resolve the resulting conflicts
fix: attempt to fix the failing pytest cases
fix: fix the issues reported by mypy
    super().__init__(config)

def embed(self, req):
    raise NotImplementedError("Minimax does not support embedding")
Owner

If embedding isn't supported here, why define the method at all?

}
data = {
    "model": req.model,
    "inputs": asyncio.run(resolve_media_base64(req.inputs, media_manager)),
Owner

Has this code been tested? Does it actually run?

Collaborator Author

Is there a mock mediaManager I can use? I'll write a pytest and give it a try.



class OpenAIAdapter(LLMBackendAdapter, AutoDetectModelsProtocol):
class OpenAIAdapter(LLMBackendAdapter, AutoDetectModelsProtocol, LLMChatProtocol, LLMEmbeddingProtocol):
Owner

Since this OpenAI Adapter now implements the Embedding Protocol and can no longer be reused by other adapters as-is, I suggest extracting the chat capability into a separate class so that the adapters inheriting from the OpenAI Adapter can still use it.

@github-actions github-actions bot commented Apr 22, 2025

MyPy type check passed ✅

The lines of code modified in this PR passed the type check.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 3 type issues in the lines modified by this PR that need to be fixed.

Created 3 line-level comments on the modified lines.

    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],


MyPy type error: TypedDict "ReRankData" has no key "embedding" (typeddict-item)

For details, see the mypy documentation.

    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],


MyPy type error: TypedDict "ReRankData" has no key "embedding" (typeddict-item)

For details, see the mypy documentation.

"model": req.model,
"dimensions": req.dimension,
"encoding_format": req.encoding_format,
"user": req.user


MyPy type error: "LLMEmbeddingRequest" has no attribute "user" (attr-defined)

For details, see the mypy documentation.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 1 type issue in the lines modified by this PR that needs to be fixed.

Created 1 line-level comment on the modified lines.

"model": req.model,
"dimensions": req.dimension,
"encoding_format": req.encoding_format,
"user": req.user


MyPy type error: "LLMEmbeddingRequest" has no attribute "user" (attr-defined)

For details, see the mypy documentation.

refactor: rewrite Voyage's conversion logic, taking event-loop issues into account.
refactor: remove the user field from LLMEmbeddingRequest.
feature: write a pytest case to verify that resolve_media_base64 in voyage_adapter works correctly.
fix: add return type annotations for Voyage's embedding API and fix the incorrect type annotations.
fix: fix the user field not being cleaned up in the OpenAI embedding API.
tips: providers such as Tencent and Aliyun continue to inherit the full OpenAI interface for now.
@lss233 lss233 (Owner) left a comment

lgtm

@Kaos-dawn Kaos-dawn marked this pull request as ready for review May 3, 2025 14:37
@sourcery-ai sourcery-ai bot (Contributor) left a comment


Hey @teywat - I've reviewed your changes - here's some feedback:

  • The new embed and rerank methods use synchronous network calls (requests), potentially blocking async event loops, notably seen in VoyageAdapter which mixes sync/async calls; consider using an async HTTP client consistently.
  • Clarify if the Python version bump to 3.10 in pyproject.toml is strictly required by the features introduced in this PR.
Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

logger = get_logger("OpenAIAdapter")

async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list[dict]:
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list | list[dict]:
Contributor


suggestion: Revisit the union return type for clarity.

The union list | list[dict] is redundant. Use a precise type like list[dict] or an explicit Union to reflect the expected structure.

Suggested change
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list | list[dict]:
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list[dict]:

Comment on lines +108 to +117
try:
    response.raise_for_status()
    response_data: EmbeddingResponse = response.json()
except Exception as e:
    logger.error(f"Response: {response.text}")
    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],
    usage = Usage(
Contributor


suggestion (bug_risk): Review use of 'run_until_complete' in a potentially running loop.

Convert this to async or only invoke it when no event loop is running, as calling run_until_complete on a running loop raises RuntimeError.
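
To make the pattern concrete (an illustrative sketch, not the adapter's actual code): calling loop.run_until_complete() from a thread that already has a running event loop raises RuntimeError, so a sync-to-async bridge should either fall back to asyncio.run() when no loop is running or stay fully async.

```python
import asyncio


async def _resolve_inputs(inputs):
    # Stand-in for the real async helper (e.g. resolve_media_base64); purely illustrative.
    await asyncio.sleep(0)
    return inputs


def resolve_inputs_sync(inputs):
    """Bridge an async helper from synchronous code without touching a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop in this thread: asyncio.run() creates and closes one cleanly.
        return asyncio.run(_resolve_inputs(inputs))
    # A loop is already running; run_until_complete() would raise RuntimeError here,
    # so the caller should await the coroutine directly instead of using this bridge.
    raise RuntimeError("already inside an event loop; await _resolve_inputs() instead")
```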

# Drop None fields
data = {k: v for k, v in data.items() if v is not None}
logger.debug(f"Request: {data}")
response = requests.post(api_url, headers=headers, json=data)
Contributor


suggestion (performance): Potential blocking network call in a synchronous method.

Ensure the blocking requests.post calls don't stall the event loop when used in async contexts—offload them or switch to an async HTTP client.

Suggested implementation:

import asyncio
# (other imports)
        response = await asyncio.to_thread(requests.post, api_url, headers=headers, json=data)
            await asyncio.to_thread(response.raise_for_status)
                response_data = await asyncio.to_thread(response.json)

Ensure that the parent function is declared as async so that the await calls are valid. If this function is called from synchronous code, adjust the call site accordingly.
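
Assembled into a complete function, the fragments above would look roughly like the following; the function name, parameters, and payload shape are placeholders rather than the adapter's real code.

```python
import asyncio

import requests


async def embed_async(api_url: str, api_key: str, payload: dict) -> dict:
    """Offload the blocking requests call so it does not stall the event loop."""
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    # requests.post blocks; asyncio.to_thread runs it in a worker thread.
    response = await asyncio.to_thread(requests.post, api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```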

Collaborator Author

Given that the scheduler currently executes synchronously and only one HTTP request is in flight at a time, switching to async would not gain much. I'd suggest waiting until the async workflow execution feature lands and then rewriting this with httpx.

Contributor

I understand your concern. The scheduler is indeed fully synchronous right now, with only one HTTP request in flight at a time, so simply changing embed to use asyncio.to_thread would not bring much benefit. We can keep the synchronous implementation for now and migrate all backend requests (including embedding) to async in one pass once the upper layer introduces a true async execution flow or switches to httpx.AsyncClient.

I'd suggest adding a TODO/issue marker in the code and following up later:

  1. Once true async workflows are supported, change the embed method to async and use httpx.AsyncClient (or another async HTTP library).
  2. For the synchronous case, consider adding timeouts, retries, and similar safeguards to requests.post for stability.

This keeps the current functionality intact while leaving plenty of room for the later async refactor.

Comment on lines 7 to 16
class MockMediaManager:
    def get_media(self, media_id: str):
        return MockMedia(media_id)

class MockMedia:
    def __init__(self, media_id: str):
        self.media_id = media_id

    async def get_base64(self) -> str:
        return f"{self.media_id}:base64_data"
Contributor


suggestion (testing): Consider testing the edge case where media is not found

Add a test where media_manager.get_media returns None and verify that resolve_media_base64 raises a ValueError.
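
A minimal sketch of such a test, reusing the inputs fixture from the test module quoted below; the import path of resolve_media_base64 is an assumption.

```python
import asyncio

import pytest

# Hypothetical import path; adjust to wherever resolve_media_base64 actually lives.
from kirara_ai.plugins.llm_preset_adapters.voyage_adapter import resolve_media_base64


class MissingMediaManager:
    """Media manager stub whose lookups always miss."""

    def get_media(self, media_id: str):
        return None


def test_resolve_media_base64_media_not_found(inputs):
    # Expect the helper to raise ValueError when a referenced media item cannot be resolved.
    with pytest.raises(ValueError):
        asyncio.run(resolve_media_base64(inputs, MissingMediaManager()))
```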

Comment on lines 34 to 43
def test_resolve_media_base64(inputs, media_manager):
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

    results = loop.run_until_complete(resolve_media_base64(inputs, media_manager))
    assert results[0] == {'content': [{"type": "text", "text": "hello"}]}
    assert results[1] == {"content": [{"type": "image_base64", "image_base64": "114514:base64_data"}]}
Contributor


issue (testing): Test suite lacks coverage for the new embedding and reranking functionalities

Please add unit or integration tests for the new embed methods in OpenAI, Gemini, Ollama, and Voyage (covering text and multi-modal where applicable), the Voyage rerank method, and error handling (invalid inputs, API errors, unsupported features).
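
One way to add that coverage without calling real APIs is to stub out the HTTP layer; a rough sketch, where the voyage_adapter and text_embedding_request fixtures and the response shape are assumptions:

```python
from unittest.mock import MagicMock, patch


def test_embed_returns_vectors_from_mocked_api(voyage_adapter, text_embedding_request):
    # Fake an embeddings-style JSON response instead of hitting the provider.
    fake_response = MagicMock()
    fake_response.json.return_value = {
        "data": [{"embedding": [0.1, 0.2, 0.3]}],
        "usage": {"total_tokens": 10},
    }
    fake_response.raise_for_status.return_value = None

    with patch("requests.post", return_value=fake_response):
        response = voyage_adapter.embed(text_embedding_request)

    assert response.vectors == [[0.1, 0.2, 0.3]]
```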

)
)

def _multi_modal_embedding(self, req: LLMEmbeddingRequest, media_manager: MediaManager) -> LLMEmbeddingResponse:
Contributor


issue (complexity): Consider separating the asynchronous media resolution from the synchronous embedding process and using asyncio.run to bridge the async operations.

Actionable Refactoring Suggestion

The main complexity comes from mixing synchronous HTTP calls with asynchronous media processing. Consider these two refactoring steps:

  1. Separate Async Media Resolution from Embedding:
    Create a dedicated async method for multi-modal embedding. This isolates event loop handling from your synchronous logic.

  2. Consistently Bridge Async from Synchronous Methods:
    Instead of trying to get or create a running loop inside _multi_modal_embedding, delegate async work to a clearly defined async helper and run it using asyncio.run (or similar) at the outer boundary.

Example Refactoring:

# Extract async functionality
async def async_multi_modal_embedding(
    self, req: LLMEmbeddingRequest, media_manager: MediaManager
) -> LLMEmbeddingResponse:
    api_url = f"{self.config.api_base}/v1/multimodalembeddings"
    headers = {
        "Authorization": f"Bearer {self.config.api_key}",
        "Content-Type": "application/json",
    }
    inputs = await resolve_media_base64(req.inputs, media_manager)
    data = {
        "model": req.model,
        "inputs": inputs,
        "input_type": req.input_type,
        "truncation": req.truncate,
        "output_encoding": req.encoding_format,
    }
    data = {k: v for k, v in data.items() if v is not None}
    response = requests.post(api_url, headers=headers, json=data)
    response.raise_for_status()
    response_data: ModalEmbeddingResponse = response.json()
    return LLMEmbeddingResponse(
        vectors=[item["embedding"] for item in response_data["data"]],
        usage=Usage(total_tokens=response_data["usage"].get("total_tokens", 0)),
    )

# In your embed method, delegate via asyncio.run:
def embed(self, req: LLMEmbeddingRequest) -> LLMEmbeddingResponse:
    if all(isinstance(input, LLMChatTextContent) for input in req.inputs):
        return self._text_embedding(req)
    else:
        return asyncio.run(self.async_multi_modal_embedding(req, self.media_manager))

Summary:
This approach removes nested event loop management from within the adapter, isolates async concerns, and makes it easier to maintain and extend the code without mixing responsibilities.

@Kaos-dawn (Collaborator Author)

By the way, should we switch all of the adapters' API call methods over to async httpx?

@lss233 (Owner) commented May 6, 2025

> By the way, should we switch all of the adapters' API call methods over to async httpx?

I think that's fine.
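
For reference, an async variant built on httpx.AsyncClient could look roughly like this; the endpoint, payload keys, and response parsing are illustrative and not taken from the adapters' actual code.

```python
import httpx


async def embed_with_httpx(api_base: str, api_key: str, model: str, inputs: list[str]) -> list[list[float]]:
    """Sketch of an async embedding call using httpx instead of blocking requests."""
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(
            f"{api_base}/v1/embeddings",
            headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
            json={"model": model, "input": inputs},
        )
        response.raise_for_status()
        data = response.json()
    return [item["embedding"] for item in data["data"]]
```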

fireflyi added 2 commits May 11, 2025 21:16
fix: fix how the event loop is used across all adapters; the original approach never closed the loop and kept references that could not be garbage-collected, a potential resource leak. Replace the hand-rolled loop lifecycle management with the higher-level asyncio.run().
refactored: consulted the API reference and, for now, corrected how the api_key is passed in gemini_adapter (no test environment available, so it has not been verified in real use; pytest passes).
refactored: consulted the Voyage API reference and changed the serialization logic for the texts and images to be embedded. Note: the new logic relies on the media.description attribute.
feature: add a standard test-suite framework for llm_adapter and implement test suites for all embed methods.
@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 8 type issues in the lines modified by this PR that need to be fixed.

Created 8 line-level comments on the modified lines.

async def get_url(self) -> str:
    return "https://example.com/image.png"

@property


MyPy type error: Read-only property cannot override read-write property (misc)

For details, see the mypy documentation.

    raise ValueError(f"Media {input.media_id} not found")
results.append({
    "content": [
        {"type": "text", "text": media.description},


MyPy type error: Dict entry 1 has incompatible type "str": "str | None"; expected "str": "str" (dict-item)

For details, see the mypy documentation.

usage = Usage(
    total_tokens = response_data["usage"].get("total_tokens", 0)
),
sort = req.sort


MyPy type error: Argument "sort" to "LLMReRankResponse" has incompatible type "bool | None"; expected "bool" (arg-type)

For details, see the mypy documentation.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 10


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 3576


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.

def convert_llm_chat_message_to_openai_message(messages: list[LLMChatMessage], media_manager: MediaManager, loop: asyncio.AbstractEventLoop) -> list[dict]:
results = loop.run_until_complete(
def convert_llm_chat_message_to_openai_message(messages: list[LLMChatMessage], media_manager: MediaManager) -> list[dict]:
results = asyncio.run(


MyPy type error: Need type annotation for "results" (var-annotated)

For details, see the mypy documentation.


data = {
"contents": loop.run_until_complete(asyncio.gather(*[convert_llm_chat_message_to_gemini_message(msg, self.media_manager) for msg in req.messages])),
"contents": asyncio.run(asyncio.gather(*[convert_llm_chat_message_to_gemini_message(msg, self.media_manager) for msg in req.messages])),


MyPy type error: Argument 1 to "run" has incompatible type "Future[list[dict[Any, Any]]]"; expected "Coroutine[Any, Any, Never]" (arg-type)

For details, see the mypy documentation.

class TestOllamaAdapter:
    @pytest.fixture(scope="class")
    def ollama_adapter(self, mock_media_manager) -> OllamaAdapter:
        config = OllamaConfig(


MyPy type error: Unexpected keyword argument "api_key" for "OllamaConfig" (call-arg)

For details, see the mypy documentation.

fix: fix the max_tokens parameter being dropped in openai_adapter.
fix: fix the type errors reported by mypy.
@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 4 type issues in the lines modified by this PR that need to be fixed.

Created 4 line-level comments on the modified lines.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 10


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMChatImageContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMToolCallContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMToolResultContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 1 type issue in the lines modified by this PR that needs to be fixed.

Created 1 line-level comment on the modified lines.

router = APIRouter(tags=["gemini"], dependencies=[Depends(gemini_authenticate)])

@router.post("/models/{model}:generateContent")
async def chat(model: str, request:ChatRequest = Body()) -> dict:


MyPy type error: Missing return statement (return)

For details, see the mypy documentation.

feature: further improve the adapters' basic test cases.
@Kaos-dawn (Collaborator Author)

lgtm

@Kaos-dawn Kaos-dawn merged commit 145aa35 into lss233:master May 22, 2025
6 checks passed
