
Conversation

@Kaos-dawn (Collaborator) commented Apr 18, 2025

feat: add embed methods to each of the major model adapters
feat: add an abstract embed method to LLMBackendAdapter.
feat: add a provider that offers embedding-only models
tips: the new provider is not registered yet, pending an update to the LLM registration logic.

Sourcery Summary

Add embedding support to multiple LLM backend adapters, introducing standardized embedding request and response formats across different AI providers.

New Features:

  • Implement embedding methods for the OpenAI, Gemini, Ollama, and Voyage adapters.
  • Create standardized LLMEmbeddingRequest and LLMEmbeddingResponse formats for embedding operations.

Enhancements:

  • Extend LLMBackendAdapter with an abstract embed method.
  • Add support for text and multi-modal embeddings across different AI providers.


Summary by Sourcery

Add embedding and reranking capabilities to LLM adapters.

New Features:

  • Implement embedding methods for OpenAI, Gemini, and Ollama adapters.
  • Introduce a new Voyage adapter supporting text/multi-modal embedding and reranking.

Enhancements:

  • Define LLMEmbeddingProtocol and LLMReRankProtocol for adapters.
  • Create standardized request and response formats (LLMEmbeddingRequest/Response, LLMReRankRequest/Response).
  • Refactor the OpenAI adapter and update inheriting adapters (Minimax, DeepSeek, Moonshot).

Build:

  • Update Python version requirement to >=3.10.
  • Remove unused dependencies from setup.py.

Tests:

  • Add tests for the new Voyage adapter functionality.

@sourcery-ai sourcery-ai bot (Contributor) commented Apr 18, 2025


Reviewer's Guide

This pull request introduces support for embedding models by defining standard request/response formats (LLMEmbeddingRequest, LLMEmbeddingResponse) and an LLMEmbeddingProtocol. This protocol is implemented in the existing OpenAI, Gemini, and Ollama adapters, as well as a newly added Voyage adapter, by calling the respective provider APIs. The Voyage adapter also introduces reranking capabilities via a new LLMReRankProtocol and corresponding formats. Base adapter classes and protocols were refactored to accommodate these changes, and the minimum Python version was raised to 3.10.
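
For orientation, here is a minimal sketch of what these standardized formats and the embedding protocol might look like. The field names (model, inputs, dimension, encoding_format, input_type, truncate, vectors, usage, total_tokens) are taken from code quoted later in this conversation; the use of plain dataclasses and the exact optionality of each field are assumptions, not the repository's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable


@dataclass
class Usage:
    total_tokens: int = 0


@dataclass
class LLMEmbeddingRequest:
    model: str
    inputs: list                           # text items or multi-modal content items
    dimension: Optional[int] = None
    encoding_format: Optional[str] = None
    input_type: Optional[str] = None       # used by multi-modal providers such as Voyage
    truncate: Optional[bool] = None


@dataclass
class LLMEmbeddingResponse:
    vectors: list[list[float]] = field(default_factory=list)
    usage: Optional[Usage] = None


@runtime_checkable
class LLMEmbeddingProtocol(Protocol):
    def embed(self, req: LLMEmbeddingRequest) -> LLMEmbeddingResponse:
        ...
```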

Sequence Diagram for Embedding Request

sequenceDiagram
    participant Client
    participant Adapter as LLMAdapter (e.g., VoyageAdapter)
    participant ProviderAPI as External Provider API (e.g., Voyage API)

    Client->>+Adapter: embed(LLMEmbeddingRequest)
    Adapter->>+ProviderAPI: POST /v1/embeddings (or similar)
    ProviderAPI-->>-Adapter: EmbeddingResponse (JSON)
    Adapter->>Adapter: Process response
    Adapter-->>-Client: LLMEmbeddingResponse

Sequence Diagram for Reranking Request (Voyage)

sequenceDiagram
    participant Client
    participant VoyageAdapter
    participant VoyageAPI as Voyage Rerank API

    Client->>+VoyageAdapter: rerank(LLMReRankRequest)
    VoyageAdapter->>+VoyageAPI: POST /v1/rerank
    VoyageAPI-->>-VoyageAdapter: RerankResponse (JSON)
    VoyageAdapter->>VoyageAdapter: Process response, sort results
    VoyageAdapter-->>-Client: LLMReRankResponse
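
A matching sketch for the reranking side, following the sequence above. Only the LLMReRankRequest/LLMReRankResponse names, the sort flag, and the POST /v1/rerank endpoint come from this PR; the query, documents, and scores fields, the relevance_score key, and the helper function are illustrative assumptions (the usage field is omitted for brevity).

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable

import requests


@dataclass
class LLMReRankRequest:
    model: str
    query: str                   # assumed field name
    documents: list[str]         # assumed field name
    sort: Optional[bool] = None  # whether results should come back sorted by relevance


@dataclass
class LLMReRankResponse:
    scores: list[float] = field(default_factory=list)  # assumed field name
    sort: bool = False


@runtime_checkable
class LLMReRankProtocol(Protocol):
    def rerank(self, req: LLMReRankRequest) -> LLMReRankResponse:
        ...


def rerank_via_http(req: LLMReRankRequest, api_base: str, api_key: str) -> LLMReRankResponse:
    """Illustrative client for the POST /v1/rerank flow shown in the diagram."""
    response = requests.post(
        f"{api_base}/v1/rerank",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        json={"model": req.model, "query": req.query, "documents": req.documents},
    )
    response.raise_for_status()
    data = response.json()
    # Assume one relevance score per input document in the provider's response.
    scores = [item.get("relevance_score", 0.0) for item in data.get("data", [])]
    return LLMReRankResponse(scores=scores, sort=bool(req.sort))
```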

File-Level Changes

Change | Details | Files
Introduce Embedding Abstraction
  • Define standard LLMEmbeddingRequest and LLMEmbeddingResponse models.
  • Define LLMEmbeddingProtocol for embedding implementation.
  • Add related type hints and utility types.
kirara_ai/llm/format/embedding.py
kirara_ai/llm/adapter.py
Implement Embedding in Existing Adapters
  • Implement the embed method for OpenAI, Gemini, and Ollama adapters.
  • Map standard request format to provider-specific API calls.
  • Parse provider responses into the standard LLMEmbeddingResponse format.
kirara_ai/plugins/llm_preset_adapters/openai_adapter.py
kirara_ai/plugins/llm_preset_adapters/gemini_adapter.py
kirara_ai/plugins/llm_preset_adapters/ollama_adapter.py
Add Voyage Adapter with Embedding and Reranking
  • Create new VoyageAdapter implementing LLMEmbeddingProtocol and LLMReRankProtocol.
  • Implement text and multi-modal embedding using Voyage APIs.
  • Implement reranking using Voyage API.
  • Define standard LLMReRankRequest, LLMReRankResponse models and LLMReRankProtocol.
  • Add helper for resolving media to base64 for multi-modal input.
  • Add unit tests for media resolution.
kirara_ai/plugins/llm_preset_adapters/voyage_adapter.py
kirara_ai/llm/format/rerank.py
kirara_ai/llm/adapter.py
tests/test_voyage.py
kirara_ai/plugins/llm_preset_adapters/__init__.py
Refactor Adapter Structure and Protocols
  • Introduce LLMChatProtocol using @runtime_checkable.
  • Remove abstract chat method from base LLMBackendAdapter.
  • Refactor OpenAIAdapter into OpenAIAdapterChatBase for chat-only functionality.
  • Update adapters inheriting from OpenAI base class.
  • Update other adapters and tests to use the new protocols.
kirara_ai/llm/adapter.py
kirara_ai/plugins/llm_preset_adapters/openai_adapter.py
kirara_ai/plugins/llm_preset_adapters/minimax_adapter.py
kirara_ai/plugins/llm_preset_adapters/deepseek_adapter.py
kirara_ai/plugins/llm_preset_adapters/moonshot_adapter.py
kirara_ai/plugins/llm_preset_adapters/claude_adapter.py
tests/tracing/test_decorator.py
tests/web/api/llm/test_llm.py
Update Project Configuration
  • Increase minimum required Python version from 3.9 to 3.10.
  • Adjust dependencies in adapter setup.
pyproject.toml
kirara_ai/plugins/llm_preset_adapters/setup.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@Kaos-dawn (Collaborator Author)

There's a small issue — code from two branches got mixed up; let me sort it out before we go further.

feat: add an abstract embed method to the base Adapter class
feat: add an embedding-only model provider and its Adapter
feat: add embedding support and the corresponding implementations for the major models
refactor: rework llm_adapter to build its basic functionality from composed interfaces
feature: add a rerank interface for reranker models and design the corresponding models
feature: add an embedding interface for embedding models and design the corresponding models
feature: add embedding API support for the main models
feature: add a provider for embedding and reranker models (not yet registered)
merged: merge the mainline mc-related changes and resolve the resulting conflicts
fix: attempt to fix the failing pytest cases
fix: fix the issues reported by mypy
    super().__init__(config)

def embed(self, req):
    raise NotImplementedError("Minimax does not support embedding")
Owner

If embedding isn't supported here, why define the method at all?

}
data = {
    "model": req.model,
    "inputs": asyncio.run(resolve_media_base64(req.inputs, media_manager)),
Owner

Has this code been tested? Does it actually run?

Collaborator Author

Is there a mock mediaManager I can use? I'll write a pytest and give it a try.



class OpenAIAdapter(LLMBackendAdapter, AutoDetectModelsProtocol):
class OpenAIAdapter(LLMBackendAdapter, AutoDetectModelsProtocol, LLMChatProtocol, LLMEmbeddingProtocol):
Owner

Since this OpenAI Adapter now implements the Embedding Protocol and can no longer be reused by other adapters as-is, I suggest extracting the chat capability into a separate class so that the adapters inheriting from the OpenAI Adapter can still use it.

@github-actions github-actions bot commented Apr 22, 2025

MyPy type check passed ✅

The lines of code modified in this PR passed the type check.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 3 type issues in the lines modified by this PR that need to be fixed.

Created 3 line-level comments on the modified lines.

    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],


MyPy type error: TypedDict "ReRankData" has no key "embedding" (typeddict-item)

For details, see the mypy documentation.

    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],


MyPy type error: TypedDict "ReRankData" has no key "embedding" (typeddict-item)

For details, see the mypy documentation.

"model": req.model,
"dimensions": req.dimension,
"encoding_format": req.encoding_format,
"user": req.user


MyPy type error: "LLMEmbeddingRequest" has no attribute "user" (attr-defined)

For details, see the mypy documentation.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 1 type issue in the lines modified by this PR that needs to be fixed.

Created 1 line-level comment on the modified lines.

"model": req.model,
"dimensions": req.dimension,
"encoding_format": req.encoding_format,
"user": req.user


MyPy type error: "LLMEmbeddingRequest" has no attribute "user" (attr-defined)

For details, see the mypy documentation.

refactor: rewrite Voyage's conversion logic, taking event-loop issues into account.
refactor: remove the user field from LLMEmbeddingRequest.
feature: write a pytest case to verify that resolve_media_base64 in voyage_adapter works correctly.
fix: add return type annotations for Voyage's embedding API and fix the incorrect type annotations.
fix: fix the user field not being cleaned up in the OpenAI embedding API.
tips: providers such as Tencent and Aliyun continue to inherit the full OpenAI interface for now.
@lss233 lss233 (Owner) left a comment

lgtm

@Kaos-dawn Kaos-dawn marked this pull request as ready for review May 3, 2025 14:37
@sourcery-ai sourcery-ai bot (Contributor) left a comment


Hey @teywat - I've reviewed your changes - here's some feedback:

  • The new embed and rerank methods use synchronous network calls (requests), potentially blocking async event loops, notably seen in VoyageAdapter which mixes sync/async calls; consider using an async HTTP client consistently.
  • Clarify if the Python version bump to 3.10 in pyproject.toml is strictly required by the features introduced in this PR.
Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

logger = get_logger("OpenAIAdapter")

async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list[dict]:
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list | list[dict]:
Contributor


suggestion: Revisit the union return type for clarity.

The union list | list[dict] is redundant. Use a precise type like list[dict] or an explicit Union to reflect the expected structure.

Suggested change
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list | list[dict]:
async def convert_parts_factory(messages: LLMChatMessage, media_manager: MediaManager) -> list[dict]:

Comment on lines +108 to +117
try:
    response.raise_for_status()
    response_data: EmbeddingResponse = response.json()
except Exception as e:
    logger.error(f"Response: {response.text}")
    raise e

return LLMEmbeddingResponse(
    vectors=[data["embedding"] for data in response_data["data"]],
    usage = Usage(
Contributor


suggestion (bug_risk): Review use of 'run_until_complete' in a potentially running loop.

Convert this to async or only invoke it when no event loop is running, as calling run_until_complete on a running loop raises RuntimeError.
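
To make the pattern concrete (an illustrative sketch, not the adapter's actual code): calling loop.run_until_complete() from a thread that already has a running event loop raises RuntimeError, so a sync-to-async bridge should either fall back to asyncio.run() when no loop is running or stay fully async.

```python
import asyncio


async def _resolve_inputs(inputs):
    # Stand-in for the real async helper (e.g. resolve_media_base64); purely illustrative.
    await asyncio.sleep(0)
    return inputs


def resolve_inputs_sync(inputs):
    """Bridge an async helper from synchronous code without touching a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop in this thread: asyncio.run() creates and closes one cleanly.
        return asyncio.run(_resolve_inputs(inputs))
    # A loop is already running; run_until_complete() would raise RuntimeError here,
    # so the caller should await the coroutine directly instead of using this bridge.
    raise RuntimeError("already inside an event loop; await _resolve_inputs() instead")
```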

# Drop None fields
data = {k: v for k, v in data.items() if v is not None}
logger.debug(f"Request: {data}")
response = requests.post(api_url, headers=headers, json=data)
Contributor


suggestion (performance): Potential blocking network call in a synchronous method.

Ensure the blocking requests.post calls don't stall the event loop when used in async contexts—offload them or switch to an async HTTP client.

Suggested implementation:

import asyncio
# (other imports)
        response = await asyncio.to_thread(requests.post, api_url, headers=headers, json=data)
            await asyncio.to_thread(response.raise_for_status)
                response_data = await asyncio.to_thread(response.json)

Ensure that the parent function is declared as async so that the await calls are valid. If this function is called from synchronous code, adjust the call site accordingly.
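
Assembled into a complete function, the fragments above would look roughly like the following; the function name, parameters, and payload shape are placeholders rather than the adapter's real code.

```python
import asyncio

import requests


async def embed_async(api_url: str, api_key: str, payload: dict) -> dict:
    """Offload the blocking requests call so it does not stall the event loop."""
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    # requests.post blocks; asyncio.to_thread runs it in a worker thread.
    response = await asyncio.to_thread(requests.post, api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```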

Collaborator Author

Given that the scheduler currently executes synchronously and only one HTTP request is in flight at a time, switching to async would not gain much. I'd suggest waiting until the async workflow execution feature lands and then rewriting this with httpx.

Contributor

I understand your concern. The scheduler is indeed fully synchronous right now, with only one HTTP request in flight at a time, so simply changing embed to use asyncio.to_thread would not bring much benefit. We can keep the synchronous implementation for now and migrate all backend requests (including embedding) to async in one pass once the upper layer introduces a true async execution flow or switches to httpx.AsyncClient.

I'd suggest adding a TODO/issue marker in the code and following up later:

  1. Once true async workflows are supported, change the embed method to async and use httpx.AsyncClient (or another async HTTP library).
  2. For the synchronous case, consider adding timeouts, retries, and similar safeguards to requests.post for stability.

This keeps the current functionality intact while leaving plenty of room for the later async refactor.

Comment on lines 7 to 16
class MockMediaManager:
    def get_media(self, media_id: str):
        return MockMedia(media_id)

class MockMedia:
    def __init__(self, media_id: str):
        self.media_id = media_id

    async def get_base64(self) -> str:
        return f"{self.media_id}:base64_data"
Contributor


suggestion (testing): Consider testing the edge case where media is not found

Add a test where media_manager.get_media returns None and verify that resolve_media_base64 raises a ValueError.
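
A minimal sketch of such a test, reusing the inputs fixture from the test module quoted below; the import path of resolve_media_base64 is an assumption.

```python
import asyncio

import pytest

# Hypothetical import path; adjust to wherever resolve_media_base64 actually lives.
from kirara_ai.plugins.llm_preset_adapters.voyage_adapter import resolve_media_base64


class MissingMediaManager:
    """Media manager stub whose lookups always miss."""

    def get_media(self, media_id: str):
        return None


def test_resolve_media_base64_media_not_found(inputs):
    # Expect the helper to raise ValueError when a referenced media item cannot be resolved.
    with pytest.raises(ValueError):
        asyncio.run(resolve_media_base64(inputs, MissingMediaManager()))
```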

Comment on lines 34 to 43
def test_resolve_media_base64(inputs, media_manager):
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

    results = loop.run_until_complete(resolve_media_base64(inputs, media_manager))
    assert results[0] == {'content': [{"type": "text", "text": "hello"}]}
    assert results[1] == {"content": [{"type": "image_base64", "image_base64": "114514:base64_data"}]}
Contributor


issue (testing): Test suite lacks coverage for the new embedding and reranking functionalities

Please add unit or integration tests for the new embed methods in OpenAI, Gemini, Ollama, and Voyage (covering text and multi-modal where applicable), the Voyage rerank method, and error handling (invalid inputs, API errors, unsupported features).
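
One way to add that coverage without calling real APIs is to stub out the HTTP layer; a rough sketch, where the voyage_adapter and text_embedding_request fixtures and the response shape are assumptions:

```python
from unittest.mock import MagicMock, patch


def test_embed_returns_vectors_from_mocked_api(voyage_adapter, text_embedding_request):
    # Fake an embeddings-style JSON response instead of hitting the provider.
    fake_response = MagicMock()
    fake_response.json.return_value = {
        "data": [{"embedding": [0.1, 0.2, 0.3]}],
        "usage": {"total_tokens": 10},
    }
    fake_response.raise_for_status.return_value = None

    with patch("requests.post", return_value=fake_response):
        response = voyage_adapter.embed(text_embedding_request)

    assert response.vectors == [[0.1, 0.2, 0.3]]
```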

)
)

def _multi_modal_embedding(self, req: LLMEmbeddingRequest, media_manager: MediaManager) -> LLMEmbeddingResponse:
Contributor


issue (complexity): Consider separating the asynchronous media resolution from the synchronous embedding process and using asyncio.run to bridge the async operations.

Actionable Refactoring Suggestion

The main complexity comes from mixing synchronous HTTP calls with asynchronous media processing. Consider these two refactoring steps:

  1. Separate Async Media Resolution from Embedding:
    Create a dedicated async method for multi-modal embedding. This isolates event loop handling from your synchronous logic.

  2. Consistently Bridge Async from Synchronous Methods:
    Instead of trying to get or create a running loop inside _multi_modal_embedding, delegate async work to a clearly defined async helper and run it using asyncio.run (or similar) at the outer boundary.

Example Refactoring:

# Extract async functionality
async def async_multi_modal_embedding(
    self, req: LLMEmbeddingRequest, media_manager: MediaManager
) -> LLMEmbeddingResponse:
    api_url = f"{self.config.api_base}/v1/multimodalembeddings"
    headers = {
        "Authorization": f"Bearer {self.config.api_key}",
        "Content-Type": "application/json",
    }
    inputs = await resolve_media_base64(req.inputs, media_manager)
    data = {
        "model": req.model,
        "inputs": inputs,
        "input_type": req.input_type,
        "truncation": req.truncate,
        "output_encoding": req.encoding_format,
    }
    data = {k: v for k, v in data.items() if v is not None}
    response = requests.post(api_url, headers=headers, json=data)
    response.raise_for_status()
    response_data: ModalEmbeddingResponse = response.json()
    return LLMEmbeddingResponse(
        vectors=[item["embedding"] for item in response_data["data"]],
        usage=Usage(total_tokens=response_data["usage"].get("total_tokens", 0)),
    )

# In your embed method, delegate via asyncio.run:
def embed(self, req: LLMEmbeddingRequest) -> LLMEmbeddingResponse:
    if all(isinstance(input, LLMChatTextContent) for input in req.inputs):
        return self._text_embedding(req)
    else:
        return asyncio.run(self.async_multi_modal_embedding(req, self.media_manager))

Summary:
This approach removes nested event loop management from within the adapter, isolates async concerns, and makes it easier to maintain and extend the code without mixing responsibilities.

@Kaos-dawn (Collaborator Author)

By the way, should we switch all of the adapters' API call methods over to async httpx?

@lss233 (Owner) commented May 6, 2025

> By the way, should we switch all of the adapters' API call methods over to async httpx?

I think that's fine.
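
For reference, an async variant built on httpx.AsyncClient could look roughly like this; the endpoint, payload keys, and response parsing are illustrative and not taken from the adapters' actual code.

```python
import httpx


async def embed_with_httpx(api_base: str, api_key: str, model: str, inputs: list[str]) -> list[list[float]]:
    """Sketch of an async embedding call using httpx instead of blocking requests."""
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(
            f"{api_base}/v1/embeddings",
            headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
            json={"model": model, "input": inputs},
        )
        response.raise_for_status()
        data = response.json()
    return [item["embedding"] for item in data["data"]]
```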

fireflyi added 2 commits May 11, 2025 21:16
fix: fix how the event loop is used across all adapters; the original approach never closed the loop and kept references that could not be garbage-collected, a potential resource leak. Replace the hand-rolled loop lifecycle management with the higher-level asyncio.run().
refactored: consulted the API reference and, for now, corrected how the api_key is passed in gemini_adapter (no test environment available, so it has not been verified in real use; pytest passes).
refactored: consulted the Voyage API reference and changed the serialization logic for the texts and images to be embedded. Note: the new logic relies on the media.description attribute.
feature: add a standard test-suite framework for llm_adapter and implement test suites for all embed methods.
@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 8 type issues in the lines modified by this PR that need to be fixed.

Created 8 line-level comments on the modified lines.

async def get_url(self) -> str:
    return "https://example.com/image.png"

@property


MyPy type error: Read-only property cannot override read-write property (misc)

For details, see the mypy documentation.

    raise ValueError(f"Media {input.media_id} not found")
results.append({
    "content": [
        {"type": "text", "text": media.description},


MyPy type error: Dict entry 1 has incompatible type "str": "str | None"; expected "str": "str" (dict-item)

For details, see the mypy documentation.

usage = Usage(
    total_tokens = response_data["usage"].get("total_tokens", 0)
),
sort = req.sort


MyPy type error: Argument "sort" to "LLMReRankResponse" has incompatible type "bool | None"; expected "bool" (arg-type)

For details, see the mypy documentation.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 10


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 3576


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.

def convert_llm_chat_message_to_openai_message(messages: list[LLMChatMessage], media_manager: MediaManager, loop: asyncio.AbstractEventLoop) -> list[dict]:
results = loop.run_until_complete(
def convert_llm_chat_message_to_openai_message(messages: list[LLMChatMessage], media_manager: MediaManager) -> list[dict]:
results = asyncio.run(


MyPy type error: Need type annotation for "results" (var-annotated)

For details, see the mypy documentation.


data = {
"contents": loop.run_until_complete(asyncio.gather(*[convert_llm_chat_message_to_gemini_message(msg, self.media_manager) for msg in req.messages])),
"contents": asyncio.run(asyncio.gather(*[convert_llm_chat_message_to_gemini_message(msg, self.media_manager) for msg in req.messages])),


MyPy type error: Argument 1 to "run" has incompatible type "Future[list[dict[Any, Any]]]"; expected "Coroutine[Any, Any, Never]" (arg-type)

For details, see the mypy documentation.

class TestOllamaAdapter:
    @pytest.fixture(scope="class")
    def ollama_adapter(self, mock_media_manager) -> OllamaAdapter:
        config = OllamaConfig(


MyPy type error: Unexpected keyword argument "api_key" for "OllamaConfig" (call-arg)

For details, see the mypy documentation.

fix: fix the max_tokens parameter being dropped in openai_adapter.
fix: fix the type errors reported by mypy.
@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 4 type issues in the lines modified by this PR that need to be fixed.

Created 4 line-level comments on the modified lines.

response = voyage_adapter.embed(req)
assert isinstance(response, LLMEmbeddingResponse)
assert response.vectors[0] == REFERENCE_VECTOR
assert response.usage.total_tokens == 10


MyPy type error: Item "None" of "Usage | None" has no attribute "total_tokens" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMChatImageContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMToolCallContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.


response = openai_adapter.chat(req)
assert isinstance(response, LLMChatResponse)
assert response.message.content[0].text == "mock_response"


MyPy type error: Item "LLMToolResultContent" of "LLMChatTextContent | LLMChatImageContent | LLMToolCallContent | LLMToolResultContent" has no attribute "text" (union-attr)

For details, see the mypy documentation.

@github-actions github-actions bot left a comment

MyPy type check results ❌

Found 1 type issue in the lines modified by this PR that needs to be fixed.

Created 1 line-level comment on the modified lines.

router = APIRouter(tags=["gemini"], dependencies=[Depends(gemini_authenticate)])

@router.post("/models/{model}:generateContent")
async def chat(model: str, request:ChatRequest = Body()) -> dict:


MyPy type error: Missing return statement (return)

For details, see the mypy documentation.

feature: further improve the adapters' basic test cases.
@Kaos-dawn (Collaborator Author)

lgtm

@Kaos-dawn Kaos-dawn merged commit 145aa35 into lss233:master May 22, 2025
6 checks passed
