Feature: Add LlamaCppChatCompletionClient and llama-cpp #5326
base: main
Conversation
…d chat capabilities
@aribornstein please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

Contributor License Agreement: This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”), …
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #5326      +/-   ##
==========================================
- Coverage   76.09%   75.15%    -0.95%
==========================================
  Files         157      159        +2
  Lines        9475     9595      +120
==========================================
+ Hits         7210     7211        +1
- Misses       2265     2384      +119
```
```diff
@@ -31,6 +31,11 @@ file-surfer = [
     "autogen-agentchat==0.4.5",
     "markitdown>=0.0.1a2",
 ]
+
+llama-cpp = [
+    "llama-cpp-python"
+]
```
Let's add a minimum version bound for this, e.g., the current stable version.
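For example, with a lower bound at the current stable release (the exact version here is a hypothetical placeholder, not taken from the PR): `"llama-cpp-python>=0.3.1"`.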
```python
    from ._llama_cpp_completion_client import LlamaCppChatCompletionClient
except ImportError as e:
    raise ImportError(
        "Dependencies for Llama Cpp not found. " "Please install llama-cpp-python: " "pip install llama-cpp-python"
```
"Dependencies for Llama Cpp not found. " "Please install llama-cpp-python: " "pip install llama-cpp-python" | |
"Dependencies for Llama Cpp not found. " "Please install llama-cpp-python extra: " "pip install autogen-ext[llama-cpp]" |
```python
    repo_id: str,
    filename: str,
    n_gpu_layers: int = -1,
    seed: int = 1337,
    n_ctx: int = 1000,
    verbose: bool = True,
```
https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#llama_cpp.Llama.from_pretrained

Let's keep it the same as the underlying API? We can use **kwargs to pass in additional arguments.
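A minimal sketch of that suggestion, assuming the client simply wraps `Llama.from_pretrained` (the class body here is illustrative, not the PR's actual implementation):

```python
from typing import Any

from llama_cpp import Llama


class LlamaCppChatCompletionClient:
    def __init__(self, repo_id: str, filename: str, **kwargs: Any) -> None:
        # Name only the arguments this client needs; forward everything else
        # (n_gpu_layers, seed, n_ctx, verbose, ...) to the underlying API so
        # the signature stays aligned with llama-cpp-python.
        self._llm = Llama.from_pretrained(repo_id=repo_id, filename=filename, **kwargs)
```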
Add unit tests in the python/packages/autogen-ext/tests directory.
Will work on this tomorrow
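A minimal sketch of such a test, assuming the constructor calls `Llama.from_pretrained` as the snippet above suggests (the test name and placeholder model values are assumptions):

```python
from unittest.mock import MagicMock, patch

from autogen_ext.models.llama_cpp import LlamaCppChatCompletionClient


def test_client_constructs_without_downloading_weights() -> None:
    # Patch the loader so no model weights are downloaded during the test.
    with patch("llama_cpp.Llama.from_pretrained", return_value=MagicMock()) as mocked:
        client = LlamaCppChatCompletionClient(repo_id="org/model-GGUF", filename="model.gguf")
        mocked.assert_called_once()
        assert client is not None
```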
```python
    self,
    filename: str,
    verbose: bool = True,
    **kwargs: Any,
```
nit: There is a way to allow typing for **kwargs by using Unpack[] on a TypedDict. See example here:
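A sketch of that pattern; the field set mirrors the parameters shown above and is otherwise an assumption:

```python
from typing import TypedDict

from typing_extensions import Unpack  # typing.Unpack on Python >= 3.11


class LlamaCppParams(TypedDict, total=False):
    n_gpu_layers: int
    seed: int
    n_ctx: int
    verbose: bool


class LlamaCppChatCompletionClient:
    def __init__(self, repo_id: str, filename: str, **kwargs: Unpack[LlamaCppParams]) -> None:
        # Static type checkers now validate keyword arguments against LlamaCppParams.
        ...
```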
This pull request introduces the integration of the `llama-cpp` library into the `autogen-ext` package, with significant changes to the project dependencies and the implementation of a new chat completion client. The most important changes include updating the project dependencies, adding a new module for the `LlamaCppChatCompletionClient`, and implementing the client with various functionalities.

Project Dependencies:

- `python/packages/autogen-ext/pyproject.toml`: Added `llama-cpp-python` as a new dependency under the `llama-cpp` section.

New Module:

- `python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/__init__.py`: Introduced the `LlamaCppChatCompletionClient` class and handled import errors with a descriptive message for missing dependencies.

Implementation of `LlamaCppChatCompletionClient`:

- `python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/_llama_cpp_completion_client.py`: Added the `LlamaCppChatCompletionClient` class with methods to initialize the client, create chat completions, detect and execute tools, and handle streaming responses.
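A hypothetical usage sketch, assuming the client follows autogen's async `ChatCompletionClient` interface; the model repo and filename are placeholders, not artifacts from the PR:

```python
import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.llama_cpp import LlamaCppChatCompletionClient


async def main() -> None:
    client = LlamaCppChatCompletionClient(repo_id="org/model-GGUF", filename="model.gguf")
    result = await client.create([UserMessage(content="What is the capital of France?", source="user")])
    print(result.content)


asyncio.run(main())
```

Why are these changes needed?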
Related issue number
Checks