
[Feature Request] I need #2064

Open
moshilangzi opened this issue Feb 10, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@moshilangzi

Problem Description

Currently, the project lacks support for integrating models deployed via Xinference, such as large language models (LLMs), embedding models, rerank models, and multimodal embedding models. This limits the flexibility and scalability of the project, as Xinference provides a robust and efficient way to deploy and manage these models. Without this integration, users are unable to leverage the advanced capabilities of Xinference-deployed models within the project.

Example: "I often find it frustrating when I need to use advanced models like LLMs or multimodal embeddings in my project, but the current setup doesn’t support Xinference, which is my preferred deployment tool. This forces me to use less efficient or less scalable alternatives."

Proposed Solution

I propose adding support for Xinference-deployed models to the project. This would involve:

Integration with Xinference API: Implement functionality to connect to Xinference-deployed models via its API (a minimal client sketch follows this list). This includes support for:

Large Language Models (LLMs)

Embedding Models

Rerank Models

Multimodal Embedding Models
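
To make the intent concrete, here is a minimal sketch of what the client side of such an integration could look like, assuming Xinference's OpenAI-compatible HTTP endpoint is reachable at `http://localhost:9997/v1`. The endpoint URL, model UIDs (`my-llm`, `my-embedding`), and parameter values are placeholders, not part of any existing project API:

```python
# Minimal sketch: calling Xinference-deployed models through the
# OpenAI-compatible endpoint. Endpoint URL and model UIDs are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # assumed Xinference server address
    api_key="not-needed",                 # Xinference may not require a key
)

# LLM: chat completion
chat = client.chat.completions.create(
    model="my-llm",  # UID of the LLM launched in Xinference
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=256,
)
print(chat.choices[0].message.content)

# Embedding model
emb = client.embeddings.create(
    model="my-embedding",  # UID of the embedding model
    input=["a sentence to embed"],
)
print(len(emb.data[0].embedding))
```

Rerank and multimodal embedding models are not covered by the OpenAI-compatible surface and would likely need calls against Xinference's own REST API or Python client; that part of the integration would follow the Xinference documentation.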

Model Configuration: Allow users to configure Xinference-deployed models through a simple configuration file or environment variables (see the configuration sketch after this list). This should include:

Model endpoints

API keys (if required)

Model-specific parameters (e.g., temperature, max tokens for LLMs)
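
A hypothetical environment-variable-based configuration might look like the sketch below. All variable names and defaults are illustrative, not existing project settings:

```python
# Hypothetical configuration loader for Xinference settings.
# Variable names are assumptions for illustration only.
import os

XINFERENCE_CONFIG = {
    "endpoint": os.environ.get("XINFERENCE_ENDPOINT", "http://localhost:9997/v1"),
    "api_key": os.environ.get("XINFERENCE_API_KEY", ""),  # only if the server requires one
    "llm_model": os.environ.get("XINFERENCE_LLM_MODEL", "my-llm"),
    "embedding_model": os.environ.get("XINFERENCE_EMBEDDING_MODEL", "my-embedding"),
    "rerank_model": os.environ.get("XINFERENCE_RERANK_MODEL", "my-rerank"),
    # Model-specific parameters, e.g. for the LLM:
    "temperature": float(os.environ.get("XINFERENCE_TEMPERATURE", "0.7")),
    "max_tokens": int(os.environ.get("XINFERENCE_MAX_TOKENS", "512")),
}
```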

Seamless Usage: Ensure that the integration allows for seamless usage of these models within the existing project workflow (an adapter sketch follows these examples). For example:

Embedding models should work with existing vector search functionality.

LLMs should integrate with the project’s text generation or chat features.

Rerank models should enhance search or recommendation systems.
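
As one possible shape for this, an Xinference embedding model could be wrapped in a thin adapter that matches whatever embedding interface the vector-search code already expects. The `XinferenceEmbeddingBackend` class and its `embed` method below are hypothetical; the real project interface may differ:

```python
# Sketch of a thin adapter so an Xinference embedding model can slot into
# an existing vector-search pipeline. Class and method names are illustrative.
from typing import List

from openai import OpenAI


class XinferenceEmbeddingBackend:
    def __init__(self, endpoint: str, model_uid: str, api_key: str = "not-needed"):
        self._client = OpenAI(base_url=endpoint, api_key=api_key)
        self._model_uid = model_uid

    def embed(self, texts: List[str]) -> List[List[float]]:
        # Returns one embedding vector per input text.
        resp = self._client.embeddings.create(model=self._model_uid, input=texts)
        return [item.embedding for item in resp.data]


# Usage: vectors returned here could be handed to whatever vector store
# the project already uses.
# backend = XinferenceEmbeddingBackend("http://localhost:9997/v1", "my-embedding")
# vectors = backend.embed(["query text"])
```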

Documentation: Provide clear documentation on how to set up and use Xinference-deployed models within the project.

Alternatives Considered

Local Model Deployment: One alternative is to deploy models locally without using Xinference. However, this approach is less scalable and requires more resources, making it less ideal for production environments.

Other Model Deployment Tools: I considered using other model deployment tools, but Xinference stands out due to its ease of use, scalability, and support for a wide range of models.

Additional Context

Xinference Documentation: Xinference Models

Similar Features: Other projects like LangChain and Haystack have integrations with various model deployment tools, which could serve as inspiration for this implementation.

Use Case: This feature would be particularly useful for users who need to deploy and manage multiple models in a scalable and efficient manner, especially in production environments.

This feature request outlines the need for Xinference support in the project and provides a clear path for implementation. If accepted, it would significantly enhance the project’s capabilities and flexibility.

@moshilangzi moshilangzi added the enhancement New feature or request label Feb 10, 2025
@pritipsingh
Contributor

@moshilangzi Thanks for such a detailed request. We really appreciate it. We've added this to our community wishlist. However, it will definitely take some time on our end due to current priorities. If you have some time, feel free to explore & contribute.
