Problem Description
The project currently has no support for models deployed via Xinference, including large language models (LLMs), embedding models, rerank models, and multimodal embedding models. Xinference provides a robust and efficient way to deploy and manage such models, so the missing integration limits the project's flexibility and scalability: users cannot leverage the capabilities of Xinference-deployed models within the project at all.
Example: "I often find it frustrating when I need to use advanced models like LLMs or multimodal embeddings in my project, but the current setup doesn’t support Xinference, which is my preferred deployment tool. This forces me to use less efficient or less scalable alternatives."
Proposed Solution
I propose adding support for Xinference-deployed models to the project. This would involve:
1. Integration with the Xinference API: Implement functionality to connect to Xinference-deployed models via its API, covering:
   - Large Language Models (LLMs)
   - Embedding Models
   - Rerank Models
   - Multimodal Embedding Models
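Xinference exposes an OpenAI-compatible REST API, so one possible integration path is plain HTTP. The sketch below is illustrative only: the endpoint, model UID, and helper names are placeholders, not existing project code.

```python
import json
from urllib import request

XINFERENCE_ENDPOINT = "http://localhost:9997"  # assumption: Xinference's default local port


def build_chat_request(model_uid: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat payload; Xinference serves a
    compatible /v1/chat/completions route."""
    return {
        "model": model_uid,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def chat(model_uid: str, prompt: str) -> str:
    """Send one chat turn to a running Xinference server (network call)."""
    payload = json.dumps(build_chat_request(model_uid, prompt)).encode()
    req = request.Request(
        f"{XINFERENCE_ENDPOINT}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same pattern extends to the `/v1/embeddings` route for embedding models.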
2. Model Configuration: Allow users to configure Xinference-deployed models through a simple configuration file or environment variables, including:
   - Model endpoints
   - API keys (if required)
   - Model-specific parameters (e.g., temperature and max tokens for LLMs)
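As a minimal sketch of the environment-variable approach (all variable and field names below are hypothetical, not an existing convention):

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class XinferenceConfig:
    """Connection settings for an Xinference deployment (names illustrative)."""
    endpoint: str
    api_key: Optional[str]
    temperature: float
    max_tokens: int


def load_config(env: Mapping[str, str] = os.environ) -> XinferenceConfig:
    # Fall back to Xinference's default local endpoint when unset.
    return XinferenceConfig(
        endpoint=env.get("XINFERENCE_ENDPOINT", "http://localhost:9997"),
        api_key=env.get("XINFERENCE_API_KEY"),  # optional
        temperature=float(env.get("XINFERENCE_TEMPERATURE", "0.7")),
        max_tokens=int(env.get("XINFERENCE_MAX_TOKENS", "512")),
    )
```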
3. Seamless Usage: Ensure the integration fits the existing project workflow. For example:
   - Embedding models should work with the existing vector search functionality.
   - LLMs should integrate with the project's text generation and chat features.
   - Rerank models should enhance search and recommendation systems.
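To illustrate how such embeddings could plug into vector search, here is a self-contained ranking sketch; in the proposed integration the vectors would come from an Xinference-deployed embedding model rather than being supplied directly.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query embedding."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

A rerank model would then reorder the `top_k` candidates using the query and full document text, rather than embeddings alone.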
4. Documentation: Provide clear documentation on how to set up and use Xinference-deployed models within the project.
Alternatives Considered
- Local Model Deployment: Deploying models locally without Xinference. This approach is less scalable and requires more resources, making it less suitable for production environments.
- Other Model Deployment Tools: I considered other deployment tools, but Xinference stands out for its ease of use, scalability, and support for a wide range of models.
Additional Context
- Xinference Documentation: Xinference Models
- Similar Features: Projects such as LangChain and Haystack integrate with various model deployment tools and could serve as inspiration for this implementation.
- Use Case: This feature would be particularly useful for users who need to deploy and manage multiple models efficiently at scale, especially in production environments.
This feature request outlines the need for Xinference support in the project and provides a clear path for implementation. If accepted, it would significantly enhance the project’s capabilities and flexibility.
@moshilangzi Thanks for such a detailed request. We really appreciate it. We've added this to our community wishlist. However, it will definitely take some time on our end due to current priorities. If you have some time, feel free to explore & contribute.