
Commit 6ddd568

removed extensive code and add reranker in retriever in model client
1 parent 4cc0106 commit 6ddd568

File tree

5 files changed: +64 -151 lines changed


docs/source/developer_notes/base_data_class.rst

+1-1
@@ -300,7 +300,7 @@ The ``exclude`` parameter works the same across all methods.
 
 **DataClassFormatType**
 
-For data class format, we have :class:``core.base_data_class.DataClassFormatType`` along with ``format_class_str`` method to specify the format type for the data format methods.
+For data class format, we have :class:`DataClassFormatType<core.base_data_class.DataClassFormatType>` along with the ``format_class_str`` method to specify the format type for the data format methods.
 
 .. code-block:: python
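To illustrate the cross-reference fixed above, here is a minimal sketch of calling ``format_class_str`` with ``DataClassFormatType`` (a sketch only: the import paths and the ``SIGNATURE_JSON`` member are assumed from the library's tutorials, and the rendered output is not reproduced verbatim).

.. code-block:: python

    from dataclasses import dataclass, field

    from lightrag.core import DataClass
    from lightrag.core.base_data_class import DataClassFormatType


    @dataclass
    class TriviaAnswer(DataClass):
        answer: str = field(default="", metadata={"desc": "The final answer"})


    # Render the class "signature" in JSON form, ready to embed in a prompt.
    print(TriviaAnswer.format_class_str(DataClassFormatType.SIGNATURE_JSON))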

docs/source/developer_notes/index.rst

+5
@@ -52,6 +52,11 @@ A `Prompt` will work with `DataClass` to ease data interaction with the LLM mode
 A `Retriever` will work with databases to retrieve context and overcome the hallucination and knowledge limitations of LLM, following the paradigm of Retrieval-Augmented Generation (RAG).
 An `Agent` will work with tools and an LLM planner for enhanced ability to reason, plan, and act on real-world tasks.
 
+
+Additionally, what shines in LightRAG is that all orchestrator components, like `Retriever`, `Embedder`, `Generator`, and `Agent`, are model-agnostic.
+You can easily make each component work with different models from different providers by switching out the `ModelClient` and its `model_kwargs`.
+
+
 We will introduce the libraries starting from the core base classes, then move to the RAG essentials, and finally to the agent essentials.
 With these building blocks, we will further introduce optimizing, where the optimizer uses building blocks such as Generator for auto-prompting and retriever for dynamic few-shot in-context learning (ICL).
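To make the model-agnostic claim concrete, a minimal sketch of such a switch (a sketch under the assumption that the ``OpenAIClient`` and ``GroqAPIClient`` integrations listed later in this commit are installed and their API keys are set; the template and model names are illustrative):

.. code-block:: python

    from lightrag.core.generator import Generator
    from lightrag.components.model_client import OpenAIClient, GroqAPIClient

    template = r"""User: {{input_str}}"""

    # The same orchestrator component pointed at two different providers:
    # only the ModelClient and its model_kwargs change.
    gpt_generator = Generator(
        model_client=OpenAIClient(),
        model_kwargs={"model": "gpt-3.5-turbo"},
        template=template,
    )
    llama_generator = Generator(
        model_client=GroqAPIClient(),
        model_kwargs={"model": "llama3-8b-8192"},
        template=template,
    )

    print(gpt_generator(prompt_kwargs={"input_str": "What is LLM?"}))
    print(llama_generator(prompt_kwargs={"input_str": "What is LLM?"}))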

docs/source/developer_notes/model_client.rst

+54-147
@@ -6,32 +6,37 @@ ModelClient
 
 .. `Li Yin <https://github.com/liyin2015>`_
 
-What you will learn?
+.. What you will learn?
 
-1. What is ``ModelClient`` and why is it designed this way?
-2. How to intergrate your own ``ModelClient``?
-3. How to use ``ModelClient`` directly?
+.. 1. What is ``ModelClient`` and why is it designed this way?
+.. 2. How to integrate your own ``ModelClient``?
+.. 3. How to use ``ModelClient`` directly?
+
+
+:ref:`ModelClient<core-model_client>` is the standardized protocol and base class for all model inference SDKs (either via APIs or local) to communicate with LightRAG internal components.
+Therefore, by switching out the ``ModelClient`` in a ``Generator``, ``Embedder``, or ``Retriever`` (those components that take models), you can make these functional components model-agnostic.
 
-:ref:`ModelClient<core-model_client>` is the standardized protocol and base class for all model inference SDKs (either via APIs or local) to communicate with LightRAG internal components/classes.
-Because so, by switching off ``ModelClient`` in a ``Generator`` or ``Embedder`` component, you can make your prompt or ``Retriever`` model-agnostic.
 
 
 .. figure:: /_static/images/model_client.png
     :align: center
     :alt: ModelClient
     :width: 400px
 
-    The interface to internal components in LightRAG
+    The bridge between all model inference SDKs and internal components in LightRAG
 
 .. note::
 
-    All users are encouraged to customize your own ``ModelClient`` whenever you need to do so. You can refer our code in ``components.model_client`` dir.
+    All users are encouraged to customize their own ``ModelClient`` whenever needed. You can refer to our code in the ``components.model_client`` directory.
+
 
 Model Inference SDKs
 ------------------------
-With cloud API providers like OpenAI, Groq, Anthropic, it often comes with a `sync` and an `async` client via their SDKs.
+
+With cloud API providers like OpenAI, Groq, and Anthropic, you often get both a `sync` and an `async` client via their SDKs.
 For example:
 
+
 .. code-block:: python
 
     from openai import OpenAI, AsyncOpenAI
@@ -42,128 +47,32 @@ For example:
     # sync call using APIs
     response = sync_client.chat.completions.create(...)
 
-For local models, such as using `huggingface transformers`, you need to create this model inference SDKs yourself.
-How you do this is highly flexible. Here is an example to use local embedding model (e.g. ``thenlper/gte-base``) as a model (Refer :class:`components.model_client.transformers_client.TransformerEmbedder` for details).
+For local models, such as using `huggingface transformers`, you need to create these model inference SDKs yourself.
+How you do this is highly flexible.
+Here is an example of using a local embedding model (e.g., ``thenlper/gte-base``) as a model (refer to :class:`TransformerEmbedder<components.model_client.transformers_client.TransformerEmbedder>` for details).
 It really is just normal model inference code.
 
-.. code-block:: python
-
-    from transformers import AutoTokenizer, AutoModel
-
-    class TransformerEmbedder:
-        models: Dict[str, type] = {}
-
-        def __init__(self, model_name: Optional[str] = "thenlper/gte-base"):
-            super().__init__()
-            if model_name is not None:
-                self.init_model(model_name=model_name)
-
-        @lru_cache(None)
-        def init_model(self, model_name: str):
-            try:
-                self.tokenizer = AutoTokenizer.from_pretrained(model_name)
-                self.model = AutoModel.from_pretrained(model_name)
-                # register the model
-                self.models[model_name] = self.model
-            except Exception as e:
-                log.error(f"Error loading model {model_name}: {e}")
-                raise e
-
-        def infer_gte_base_embedding(
-            self,
-            input=Union[str, List[str]],
-            tolist: bool = True,
-        ):
-            model = self.models.get("thenlper/gte-base", None)
-            if model is None:
-                # initialize the model
-                self.init_model("thenlper/gte-base")
-
-            if isinstance(input, str):
-                input = [input]
-            # Tokenize the input texts
-            batch_dict = self.tokenizer(
-                input, max_length=512, padding=True, truncation=True, return_tensors="pt"
-            )
-            outputs = model(**batch_dict)
-            embeddings = average_pool(
-                outputs.last_hidden_state, batch_dict["attention_mask"]
-            )
-            # (Optionally) normalize embeddings
-            embeddings = F.normalize(embeddings, p=2, dim=1)
-            if tolist:
-                embeddings = embeddings.tolist()
-            return embeddings
-
-        def __call__(self, **kwargs):
-            if "model" not in kwargs:
-                raise ValueError("model is required")
-            # load files and models, cache it for the next inference
-            model_name = kwargs["model"]
-            # inference the model
-            if model_name == "thenlper/gte-base":
-                return self.infer_gte_base_embedding(kwargs["input"])
-            else:
-                raise ValueError(f"model {model_name} is not supported")
-
 
 
 ModelClient Protocol
 -----------------------------------------------------------------------------------------------------------
-A model client can be used to manage different types of models, we defined a ``ModelType`` to categorize the model type.
+A model client can be used to manage different types of models; we defined a :class:`ModelType<core.types.ModelType>` to categorize the model type.
 
 .. code-block:: python
 
     class ModelType(Enum):
         EMBEDDER = auto()
         LLM = auto()
+        RERANKER = auto()
         UNDEFINED = auto()
 
-We designed 6 abstract methods in the ``ModelClient`` class to be implemented by the subclass model type.
-We will use :class:`components.model_client.OpenAIClient` along with the above ``TransformerEmbedder`` as examples.
-
-First, we offer two methods to initialize the model SDKs:
-
-.. code-block:: python
-
-    def init_sync_client(self):
-        raise NotImplementedError(
-            f"{type(self).__name__} must implement _init_sync_client method"
-        )
-
-    def init_async_client(self):
-        raise NotImplementedError(
-            f"{type(self).__name__} must implement _init_async_client method"
-        )
+We designed 6 abstract methods in the `ModelClient` class that can be implemented by subclasses to integrate with different model inference SDKs.
+We will use :class:`OpenAIClient<components.model_client.OpenAIClient>` as the cloud API example and :class:`TransformersClient<components.model_client.transformers_client.TransformersClient>`, along with the local inference code :class:`TransformerEmbedder<components.model_client.transformers_client.TransformerEmbedder>`, as an example for local model clients.
 
-This is how `OpenAIClient` implements these methods along with ``__init__`` method:
-
-.. code-block:: python
-
-    class OpenAIClient(ModelClient):
-
-        def __init__(self, api_key: Optional[str] = None):
-            super().__init__()
-            self._api_key = api_key
-            self.sync_client = self.init_sync_client()
-            self.async_client = None  # only initialize if the async call is called
-
-        def init_sync_client(self):
-            api_key = self._api_key or os.getenv("OPENAI_API_KEY")
-            if not api_key:
-                raise ValueError("Environment variable OPENAI_API_KEY must be set")
-            return OpenAI(api_key=api_key)
-
-        def init_async_client(self):
-            api_key = self._api_key or os.getenv("OPENAI_API_KEY")
-            if not api_key:
-                raise ValueError("Environment variable OPENAI_API_KEY must be set")
-            return AsyncOpenAI(api_key=api_key)
+First, we offer two methods, `init_async_client` and `init_sync_client`, for subclasses to initialize the SDK client.
+You can refer to :class:`OpenAIClient<components.model_client.OpenAIClient>` to see how these methods, along with the `__init__` method, are implemented.
 
 This is how ``TransformerClient`` does the same thing:
 
@@ -183,8 +92,7 @@ This is how ``TransformerClient`` does the same thing:
     def init_sync_client(self):
         return TransformerEmbedder()
 
-
-Second. we use `convert_inputs_to_api_kwargs` for subclass to convert LightRAG inputs into the `api_kwargs` (SDKs arguments).
+Second, we use `convert_inputs_to_api_kwargs` for subclasses to convert LightRAG inputs into the `api_kwargs` (SDK arguments).
 
 .. code-block:: python
 
@@ -228,6 +136,15 @@ This is how `OpenAIClient` implements this method:
             raise ValueError(f"model_type {model_type} is not supported")
         return final_model_kwargs
 
+.. For embedding, as `Embedder` takes both `str` and `List[str]` as input, we need to convert the input to a list of strings.
+.. For LLM, as `Generator` takes a `prompt_kwargs` (dict) and converts it into a single string, we need to convert the input to a list of messages.
+.. For Rerankers, you can refer to :class:`CohereAPIClient<components.model_client.cohere_client.CohereAPIClient>` for an example.
+
+
+For embedding, as ``Embedder`` takes both `str` and `List[str]` as input, we need to convert the input to a list of strings acceptable by the SDK.
+For LLM, as ``Generator`` takes a `prompt_kwargs` (dict) and converts it into a single string, we need to convert the input to a list of messages.
+For Rerankers, you can refer to :class:`CohereAPIClient<components.model_client.cohere_client.CohereAPIClient>` for an example.
+
 This is how ``TransformerClient`` does the same thing:
 
 .. code-block:: python
@@ -245,37 +162,15 @@ This is how ``TransformerClient`` does the same thing:
         else:
             raise ValueError(f"model_type {model_type} is not supported")
 
-In addition, you can add any method that parse the SDK specific output to a format compatible with LightRAG components.
-Typically an LLM needs to use `parse_chat_completion` to parse the completion to texts and `parse_embedding_response` to parse the embedding response to a structure LightRAG components can understand.
-
-.. code-block:: python
-
-    def parse_chat_completion(self, completion: Any) -> str:
-        raise NotImplementedError(
-            f"{type(self).__name__} must implement parse_chat_completion method"
-        )
+In addition, you can add any method that parses the SDK-specific output to a format compatible with LightRAG components.
+Typically, an LLM needs to use `parse_chat_completion` to parse the completion to text and `parse_embedding_response` to parse the embedding response to a structure that LightRAG components can understand.
+You can refer to :class:`OpenAIClient<components.model_client.openai_client.OpenAIClient>` for API embedding model integration and :class:`TransformersClient<components.model_client.transformers_client.TransformersClient>` for local embedding model integration.
 
-    def parse_embedding_response(self, response: Any) -> EmbedderOutput:
-        r"""Parse the embedding response to a structure LightRAG components can understand."""
-        raise NotImplementedError(
-            f"{type(self).__name__} must implement parse_embedding_response method"
-        )
 
-You can refer to :class:`components.model_client.openai_client.OpenAIClient` for API embedding model integration and :class:`components.model_client.transformers_client.TransformersClient` for local embedding model integration.
+Lastly, the `call` and `acall` methods are used to call model inference via their own arguments.
+We encourage subclasses to provide error handling and retry mechanisms in these methods.
 
-Then `call` and `acall` methods to call Model inference via their own arguments.
-We encourage the subclass provides error handling and retry mechanism in these methods.
-
-.. code-block:: python
-
-    def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED):
-        raise NotImplementedError(f"{type(self).__name__} must implement _call method")
-
-    async def acall(
-        self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED
-    ):
-        pass
 
 The `OpenAIClient` example:
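The ``OpenAIClient`` example code itself is elided between this hunk and the next. As a minimal sketch of what such a ``call`` can look like (assuming the OpenAI SDK's ``chat.completions.create`` and ``embeddings.create`` endpoints; the class name is hypothetical and error handling/retries are omitted):

.. code-block:: python

    from typing import Any, Dict

    from lightrag.core.model_client import ModelClient
    from lightrag.core.types import ModelType


    class OpenAIStyleClient(ModelClient):
        """Hypothetical sketch; assumes init_sync_client() has set self.sync_client."""

        def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED) -> Any:
            # Dispatch the prepared api_kwargs to the matching SDK endpoint.
            if model_type == ModelType.EMBEDDER:
                return self.sync_client.embeddings.create(**api_kwargs)
            if model_type == ModelType.LLM:
                return self.sync_client.chat.completions.create(**api_kwargs)
            raise ValueError(f"model_type {model_type} is not supported")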

@@ -296,21 +191,28 @@ The `TransformerClient` example:
     def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED):
         return self.sync_client(**api_kwargs)
 
-
-Our library currently integrated with 5 providers: OpenAI, Groq, Anthropic, Huggingface, and Google.
+Our library currently integrates with six providers: OpenAI, Groq, Anthropic, Huggingface, Google, and Cohere.
 Please check out :ref:`ModelClient Integration<components-model_client>`.
 
+
+
 Use ModelClient directly
 -----------------------------------------------------------------------------------------------------------
-Though ``ModelClient`` is often managed in a ``Generator`` or ``Embedder`` component, you can use it directly if you ever plan to write your own component.
-Here is an example to use ``OpenAIClient`` directly, first on LLM model:
+
+
+Though ``ModelClient`` is often managed in a ``Generator``, ``Embedder``, or ``Retriever`` component, you can use it directly if you plan to write your own component.
+Here is an example of using ``OpenAIClient`` directly, first on an LLM model:
+
 
 .. code-block:: python
 
     from lightrag.components.model_client import OpenAIClient
     from lightrag.core.types import ModelType
     from lightrag.utils import setup_env
 
+    setup_env()
+
     openai_client = OpenAIClient()
 
     query = "What is the capital of France?"
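The remainder of the LLM example is elided from this hunk. A minimal sketch of how the flow continues, using the ``convert_inputs_to_api_kwargs`` and ``call`` methods described above (the prompt string and ``model_kwargs`` values here are illustrative, not the tutorial's exact ones):

.. code-block:: python

    prompt = f"User: {query}\n"
    model_kwargs = {"model": "gpt-3.5-turbo", "temperature": 0.5, "max_tokens": 100}

    api_kwargs = openai_client.convert_inputs_to_api_kwargs(
        input=prompt,
        model_kwargs=model_kwargs,
        model_type=ModelType.LLM,
    )
    print(f"api_kwargs: {api_kwargs}")

    response = openai_client.call(api_kwargs=api_kwargs, model_type=ModelType.LLM)
    response_text = openai_client.parse_chat_completion(response)
    print(f"response_text: {response_text}")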
@@ -361,6 +263,10 @@ The output will be:
     api_kwargs: {'model': 'text-embedding-3-small', 'dimensions': 8, 'encoding_format': 'float', 'input': ['What is the capital of France?', 'What is the capital of France?']}
     reponse_embedder_output: EmbedderOutput(data=[Embedding(embedding=[0.6175549, 0.24047995, 0.4509756, 0.37041178, -0.33437008, -0.050995983, -0.24366009, 0.21549304], index=0), Embedding(embedding=[0.6175549, 0.24047995, 0.4509756, 0.37041178, -0.33437008, -0.050995983, -0.24366009, 0.21549304], index=1)], model='text-embedding-3-small', usage=Usage(prompt_tokens=14, total_tokens=14), error=None, raw_response=None)
 
+
+.. TODO: add optional package introduction here
+
+
 .. admonition:: API reference
    :class: highlight
 

@@ -370,3 +276,4 @@ The output will be:
    - :class:`components.model_client.groq_client.GroqAPIClient`
    - :class:`components.model_client.anthropic_client.AnthropicAPIClient`
    - :class:`components.model_client.google_client.GoogleGenAIClient`
+   - :class:`components.model_client.cohere_client.CohereAPIClient`
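To recap the protocol this file documents, a minimal end-to-end sketch of a custom client (``EchoClient`` and its behavior are invented purely for illustration; a real client would wrap an actual SDK and implement the async side too):

.. code-block:: python

    from typing import Any, Dict

    from lightrag.core.model_client import ModelClient
    from lightrag.core.types import ModelType


    class EchoClient(ModelClient):
        """Hypothetical sync-only client that echoes prompts back."""

        def __init__(self):
            super().__init__()
            self.sync_client = self.init_sync_client()

        def init_sync_client(self):
            # A local callable stands in for a real SDK client.
            return lambda **api_kwargs: {"text": api_kwargs.get("input", "")}

        def init_async_client(self):
            raise NotImplementedError("EchoClient is sync-only")

        def convert_inputs_to_api_kwargs(
            self,
            input: Any = None,
            model_kwargs: Dict = {},
            model_type: ModelType = ModelType.UNDEFINED,
        ) -> Dict:
            if model_type == ModelType.LLM:
                return {"input": str(input), **model_kwargs}
            raise ValueError(f"model_type {model_type} is not supported")

        def parse_chat_completion(self, completion: Any) -> str:
            return completion["text"]

        def call(self, api_kwargs: Dict = {}, model_type: ModelType = ModelType.UNDEFINED):
            return self.sync_client(**api_kwargs)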

docs/source/developer_notes/output_parsers.rst

+3-2
@@ -1,7 +1,9 @@
 Parser
 =============
 
-In this note, we will explain LightRAG parser and output parsers.
+Parser is the `interpreter` of the LLM output.
+
+
 
 Context
 ----------------
@@ -21,7 +23,6 @@ It is an important step for the LLM applications to interact with the external w
 - to list to support multiple choice selection.
 - to json/yaml which will be extracted to dict, and optionally further to a data class instance to support cases like function calls.
 
-Parsing is the `interpreter` of the LLM output.
 
 Scope and Design
 ------------------
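The hunk above describes converting LLM output to json/yaml, then to a dict, and optionally to a data class instance. To make that concrete, a minimal, library-independent sketch of the json-to-dict step (``parse_llm_json`` and ``FunctionCall`` are invented here for illustration; LightRAG's own output parsers cover this and more):

.. code-block:: python

    import json
    from dataclasses import dataclass


    @dataclass
    class FunctionCall:
        name: str
        kwargs: dict


    def parse_llm_json(raw: str) -> dict:
        """Extract the first JSON object from raw LLM text and load it as a dict."""
        start, end = raw.find("{"), raw.rfind("}")
        if start == -1 or end == -1:
            raise ValueError(f"no JSON object found in: {raw!r}")
        return json.loads(raw[start : end + 1])


    # LLMs often wrap JSON in prose or code fences; the parser recovers the dict,
    # which can then back a data class instance for cases like function calls.
    raw_response = 'Sure! ```json\n{"name": "add", "kwargs": {"a": 2, "b": 3}}\n```'
    data = parse_llm_json(raw_response)
    print(FunctionCall(**data))  # FunctionCall(name='add', kwargs={'a': 2, 'b': 3})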

docs/source/get_started/installation.rst

+1-1
@@ -53,7 +53,7 @@ Or, you can load it yourself with ``python-dotenv``:
 
 This setup ensures that LightRAG can access all necessary configurations during runtime.
 
-1. Install Optional Packages
+4. Install Optional Packages
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 
