docs/source/developer_notes/base_data_class.rst
+1-1
Original file line number
Diff line number
Diff line change
@@ -300,7 +300,7 @@ The ``exclude`` parameter works the same across all methods.
300
300
301
301
**DataClassFormatType**
302
302
303
-
For data class format, we have :class:``core.base_data_class.DataClassFormatType`` along with ``format_class_str`` method to specify the format type for the data format methods.
303
+
For data class format, we have :class:`DataClassFormatType<core.base_data_class.DataClassFormatType>` along with the ``format_class_str`` method to specify the format type for the data format methods.
docs/source/developer_notes/component.rst
+90-37
@@ -6,24 +6,30 @@ Component
6
6
7
7
.. `Li Yin <https://github.com/liyin2015>`_
8
8
9
-
What you will learn?
9
+
.. What you will learn?
10
+
11
+
.. 1. What is ``Component`` and why is it designed this way?
12
+
.. 2. How to use ``Component`` along with helper classes like ``FunComponent`` and ``Sequential``?
13
+
14
+
15
+
:ref:`Component<core-component>` is to LLM task pipelines what ``nn.Module`` is to PyTorch models.
16
+
It is the base class for components such as ``Prompt``, ``ModelClient``, ``Generator``, ``Retriever`` in LightRAG.
17
+
Your task pipeline should also subclass from ``Component``.
18
+
10
19
11
-
1. What is ``Component`` and why is it designed this way?
12
-
2. How to use ``Component`` along with helper classes like ``FunComponent`` and ``Sequential``?
13
20
14
21
Design
15
22
---------------------------------------
16
-
:ref:`Component<core-component>` is to LLM task pipelines what ``nn.Module`` is to PyTorch models.
17
23
18
-
It is the base class for components, such as ``Prompt``, ``ModelClient``, ``Generator``, ``Retriever`` in LightRAG.
19
-
Your task pipeline should subclass from ``Component`` too. Instead of working with ``Tensor`` and ``Parameter`` to train models with weights and biases, our component works with any data, ``Parameter`` that can be any data type for LLM in-context learning, from manual to auto prompt engineering.
20
-
We name it differently to avoid confusion and also for better compatibility with `PyTorch`.
24
+
Unlike PyTorch's ``nn.Module``, which works exclusively with ``Tensor`` and ``Parameter`` to train models with weights and biases, our component can work with different types of data, from a string or a list of strings to a list of :class:`Document<core.types.Document>`.
25
+
26
+
.. `Parameter` that can be any data type for LLM in-context learning, from manual to auto prompt engineering.
21
27
22
28
29
+
Here is the comparison of writing a PyTorch model and a LightRAG task pipeline.
23
30
24
-
Here is the comparison of writing a PyTorch model and a LightRAG task component.
25
31
26
-
.. grid:: 2
32
+
.. grid:: 1
27
33
:gutter: 1
28
34
29
35
.. grid-item-card:: PyTorch
@@ -65,28 +71,49 @@ Here is the comparison of writing a PyTorch model and a LightRAG task component.
As the fundamental building block in LLM task pipelines, the component is designed to serve five main purposes:
75
+
76
+
1. **Standardize the interface for all components.**
77
+
This includes the ``__init__`` method, the ``call`` method for synchronous calls, the ``acall`` method for asynchronous calls, and the ``__call__`` method, which by default calls the ``call`` method.
68
78
69
-
As the foundamental building block in LLM task pipeline, the component is designed to serve five main purposes:
79
+
2. **Provide a unified way to visualize the structure of the task pipeline**
80
+
via the ``__repr__`` method. Subclasses can additionally define the ``_extra_repr`` method to include more information than the default ``__repr__`` output.
70
81
71
-
1. **Standarize the interface for all components.** This includes the `__init__` method, the `call` method for synchronous call, the `acall` method for asynchronous call, and the `__call__` which in default calls the `call` method.
72
-
2. **Provide a unified way to visualize the structure of the task pipeline** via `__repr__` method. And subclass can additional add `_extra_repr` method to add more information than the default `__repr__` method.
73
-
3. **Tracks, adds all subcomponents and parameters automatically and recursively** to assistant the building and optimizing process of the task pipeline.
74
-
4. **Manages the states and serialization**, with `state_dict` and `load_state_dict` methods in particular for parameters and `to_dict` method for serialization of all the states fall into the component's attributes, from subcomponents to parameters, to any other attributes of various data type.
75
-
5. **Make all components configurable from using `json` or `yaml` files**. This is especially useful for experimenting or building data processing pipelines.
82
+
3. **Track and add all subcomponents and parameters automatically and recursively**
83
+
to assist in the building and optimizing process of the task pipeline.
76
84
77
-
These features are key to keep LightRAG pipeline transparent, flexible, and easy to use.
85
+
4. **Manage the states and serialization**,
86
+
with the ``state_dict`` and ``load_state_dict`` methods specifically for parameters, and the ``to_dict`` method for serializing all states held in the component's attributes, from subcomponents to parameters to any other attributes of various data types.
87
+
88
+
5. **Make all components configurable using JSON or YAML files.**
89
+
This is especially useful for experimenting or building data processing pipelines.
90
+
91
+
These features are key to keeping the LightRAG pipeline transparent, flexible, and easy to use.
78
92
By subclassing from the `Component` class, you will get most of these features out of the box.
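The following is a minimal sketch of how such a base class could provide a standard interface, automatic subcomponent tracking, and a structured ``__repr__``. It does not reproduce LightRAG's implementation: ``MiniComponent``, ``Upper``, ``Pipeline``, and ``named_components`` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Any, Iterator, Tuple


class MiniComponent:
    """Hypothetical, simplified stand-in for a component base class."""

    def __init__(self) -> None:
        # Registry of child components, filled automatically by __setattr__.
        object.__setattr__(self, "_components", {})

    def __setattr__(self, name: str, value: Any) -> None:
        # Any attribute that is itself a component is tracked as a child.
        if isinstance(value, MiniComponent):
            self._components[name] = value
        object.__setattr__(self, name, value)

    def call(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError("Subclasses implement the synchronous call.")

    async def acall(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError("Subclasses implement the asynchronous call.")

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Calling the instance dispatches to ``call`` by default.
        return self.call(*args, **kwargs)

    def named_components(self, prefix: str = "") -> Iterator[Tuple[str, "MiniComponent"]]:
        # Walk the tree of subcomponents recursively.
        for name, child in self._components.items():
            full_name = f"{prefix}.{name}" if prefix else name
            yield full_name, child
            yield from child.named_components(full_name)

    def _extra_repr(self) -> str:
        # Subclasses override this to show extra information.
        return ""

    def __repr__(self) -> str:
        header = f"{type(self).__name__}({self._extra_repr()}"
        if not self._components:
            return header + ")"
        body = [
            f"  ({name}): " + repr(child).replace("\n", "\n  ")
            for name, child in self._components.items()
        ]
        return "\n".join([header, *body, ")"])


class Upper(MiniComponent):
    def call(self, text: str) -> str:
        return text.upper()

    async def acall(self, text: str) -> str:
        return self.call(text)


class Pipeline(MiniComponent):
    def __init__(self, suffix: str) -> None:
        super().__init__()
        self.suffix = suffix
        self.upper = Upper()  # registered as a subcomponent automatically

    def call(self, text: str) -> str:
        return self.upper(text) + self.suffix

    def _extra_repr(self) -> str:
        return f"suffix={self.suffix!r}"


pipeline = Pipeline(suffix="!")
result = pipeline("hello")
async_result = asyncio.run(pipeline.upper.acall("hi"))
names = [name for name, _ in pipeline.named_components()]
print(pipeline)
```

Intercepting attribute assignment is what lets the sketch discover subcomponents without any explicit registration call, which is the same idea ``nn.Module`` uses in PyTorch.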
79
93
80
94
95
+
.. As the foundamental building block in LLM task pipeline, the component is designed to serve five main purposes:
96
+
97
+
.. 1. **Standarize the interface for all components.** This includes the `__init__` method, the `call` method for synchronous call, the `acall` method for asynchronous call, and the `__call__` which in default calls the `call` method.
98
+
.. 2. **Provide a unified way to visualize the structure of the task pipeline** via `__repr__` method. And subclass can additional add `_extra_repr` method to add more information than the default `__repr__` method.
99
+
.. 3. **Tracks, adds all subcomponents and parameters automatically and recursively** to assistant the building and optimizing process of the task pipeline.
100
+
.. 4. **Manages the states and serialization**, with `state_dict` and `load_state_dict` methods in particular for parameters and `to_dict` method for serialization of all the states fall into the component's attributes, from subcomponents to parameters, to any other attributes of various data type.
101
+
.. 5. **Make all components configurable from using `json` or `yaml` files**. This is especially useful for experimenting or building data processing pipelines.
102
+
103
+
.. These features are key to keep LightRAG pipeline transparent, flexible, and easy to use.
104
+
.. By subclassing from the `Component` class, you will get most of these features out of the box.
105
+
106
+
81
107
Component in Action
82
108
---------------------------------------
83
109
84
-
.. Transparency
85
-
.. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
110
+
111
+
86
112
87
113
In this note, we are creating an AI doctor to answer medical questions.
88
114
Run the ``DocQA`` on a query:
89
115
116
+
90
117
.. code-block:: python
91
118
92
119
doc = DocQA()
@@ -133,6 +160,7 @@ Configure from file
133
160
As the above example shows, we added subcomponents via attributes.
134
161
We can also use methods to add more subcomponents or parameters.
135
162
163
+
136
164
.. code-block:: python
137
165
138
166
from lightrag.core.parameter import Parameter
@@ -141,8 +169,12 @@ We can also use methods to add more subcomponnents or parameters.
141
169
# list all parameters
142
170
for param in doc.named_parameters():
143
171
print(param)
144
-
# output
145
-
# ('demo', Parameter: demo)
172
+
173
+
The output:
174
+
175
+
.. code-block::
176
+
177
+
('demo', Parameter: demo)
146
178
147
179
You can easily save the detailed states:
148
180
@@ -152,21 +184,25 @@ You can easily save the detailed states:
152
184
153
185
save_json(doc.to_dict(), "doc.json")
154
186
187
+
To add even more flexibility, we provide :class:`FunComponent<core.component.FunComponent>` and :class:`Sequential<core.container.Sequential>` for more advanced use cases.
155
188
156
-
To adds even more flexibility, we provide :class:`core.component.FunComponent` and :class:`core.component.Sequential` for more advanced use cases.
157
189
158
190
159
191
Serialization and deserialization
160
192
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
161
193
162
-
We provide ``is_pickable`` method to check if the component is pickable.
163
-
And any of your component, it is a good practise to ensure it is pickable.
194
+
We provide the ``is_pickable`` method to check if the component is picklable.
195
+
It is good practice to ensure that all of your components are picklable.
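As a rough illustration of what such a check involves, the sketch below tests whether an object survives ``pickle.dumps`` from the standard library. The helper ``can_pickle`` and the two example state dictionaries are hypothetical and are not part of LightRAG.

```python
import pickle
import threading
from typing import Any


def can_pickle(obj: Any) -> bool:
    """Return True if ``obj`` can be serialized with pickle (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# State made of plain data pickles without problems.
plain_state = {"template": "Answer the question: {query}", "top_k": 3}

# State that holds a lambda or a lock is a common reason pickling fails.
lambda_state = {"transform": lambda text: text.strip()}
lock_state = {"lock": threading.Lock()}

plain_ok = can_pickle(plain_state)
lambda_ok = can_pickle(lambda_state)
lock_ok = can_pickle(lock_state)
print(plain_ok, lambda_ok, lock_ok)
```

Keeping lambdas, open handles, and locks out of component attributes is usually enough to keep a component picklable.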
196
+
197
+
198
+
199
+
164
200
165
201
FunComponent
166
202
--------------
167
-
Use :func:`core.component.fun_to_component` as a decorator to convert any function to a Component with its unique class name.
203
+
Use :func:`fun_to_component<core.component.fun_to_component>` as a decorator to convert any function into a ``Component`` with its own unique class name.
168
204
169
-
:class:`core.component.FunComponent` is a subclass of :class:`core.component.Component` that allows you to define a component with a function.
205
+
:class:`FunComponent<core.component.FunComponent>` is a subclass of :class:`Component<core.component.Component>` that allows you to define a component with a function.
170
206
You can directly use this class as:
171
207
172
208
.. code-block:: python
@@ -180,16 +216,21 @@ You can directly use this class as:
180
216
print(fun_component(1))
181
217
print(type(fun_component))
182
218
183
-
# output:
184
-
# 2
185
-
# <class 'core.component.FunComponent'>
219
+
The printout:
220
+
221
+
.. code-block::
186
222
223
+
2
224
+
<class 'core.component.FunComponent'>
187
225
188
-
We also have :func:`core.component.fun_to_component` to convert a function to a FunComponent via decorator or directly call the function.
226
+
227
+
228
+
We also have :func:`fun_to_component<core.component.fun_to_component>` to convert a function to a ``FunComponent`` via a decorator or by directly calling the function.
189
229
This approach gives you a unique component converted from the function name.
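To make the idea concrete, here is a small sketch of how a function name could be turned into a uniquely named component class. It is not LightRAG's implementation: ``FunWrapper``, ``to_component``, and the ``<Name>Component`` naming scheme are assumptions made for illustration.

```python
from typing import Any, Callable


class FunWrapper:
    """Hypothetical minimal component that wraps a plain function."""

    def __init__(self, fun: Callable[..., Any]) -> None:
        self.fun = fun

    def call(self, *args: Any, **kwargs: Any) -> Any:
        return self.fun(*args, **kwargs)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.call(*args, **kwargs)


def to_component(fun: Callable[..., Any]) -> FunWrapper:
    """Create a FunWrapper subclass named after ``fun`` and return an instance."""
    # Assumed naming scheme: "add_one" becomes "AddOneComponent".
    camel = "".join(part.capitalize() for part in fun.__name__.split("_"))
    component_cls = type(camel + "Component", (FunWrapper,), {})
    return component_cls(fun)


@to_component
def add_one(x: int) -> int:
    return x + 1


value = add_one(1)
class_name = type(add_one).__name__
print(value, class_name)
```

Because the class is created dynamically with ``type``, each wrapped function gets its own class, so it shows up under a distinct name when the pipeline structure is printed.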
190
230
191
231
Via direct call:
192
232
233
+
193
234
.. code-block:: python
194
235
195
236
from lightrag.core.component import fun_to_component
Via decorator will be even more convenient to have a component from a function:
250
+
251
+
252
+
Using a decorator is an even more convenient way to create a component from a function:
207
253
208
254
.. code-block:: python
209
255
@@ -220,8 +266,12 @@ Via decorator will be even more convenient to have a component from a function:
220
266
221
267
Sequential
222
268
--------------
223
-
We have :class:`core.component.Sequential` class to PyTorch's ``nn.Sequential`` class. This is especially useful to chain together components in a sequence. Much like the concept of ``chain`` or ``pipeline`` in other LLM libraries.
224
-
Let's put the FunComponent and DocQA together in a sequence:
269
+
270
+
271
+
272
+
We have the :class:`Sequential<core.container.Sequential>` class, which is similar to PyTorch's ``nn.Sequential`` class.
273
+
This is especially useful for chaining together components in a sequence, much like the concept of ``chain`` or ``pipeline`` in other LLM libraries.
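The chaining behavior itself is simple to picture. The sketch below is a self-contained approximation, not the ``Sequential`` class from LightRAG: ``MiniSequential``, ``enhance_query``, and ``fake_doc_qa`` are hypothetical names, and the answer step is a stub rather than a real model call.

```python
from typing import Any, Callable


class MiniSequential:
    """Hypothetical container that feeds each step's output into the next step."""

    def __init__(self, *steps: Callable[[Any], Any]) -> None:
        self.steps = list(steps)

    def call(self, value: Any) -> Any:
        for step in self.steps:
            value = step(value)
        return value

    def __call__(self, value: Any) -> Any:
        return self.call(value)


def enhance_query(query: str) -> str:
    # Stand-in for a query-enhancing step.
    return query + " Please be concise."


def fake_doc_qa(query: str) -> str:
    # Stub for a question-answering step; no model is called here.
    return f"[answer to: {query}]"


seq = MiniSequential(enhance_query, fake_doc_qa)
answer = seq("What is the best treatment for headache?")
print(answer)
```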
274
+
Let's put the ``FunComponent`` and ``DocQA`` together in a sequence:
225
275
226
276
.. code-block:: python
227
277
@@ -236,9 +286,12 @@ Let's put the FunComponent and DocQA together in a sequence:
236
286
query = "What is the best treatment for headache?"
237
287
print(seq(query))
238
288
239
-
We automatically enhance users' queries before passing them to the DocQA component.
289
+
We automatically enhance users' queries before passing them to the ``DocQA`` component.
240
290
The output is:
241
291
292
+
293
+
294
+
242
295
.. code-block::
243
296
244
297
1. Over-the-counter pain relievers like acetaminophen, ibuprofen, or aspirin
@@ -269,4 +322,4 @@ The structure of the sequence using ``print(seq)``:
269
322
- :func:`core.component.fun_to_component`
270
323
271
324
272
-
We will have more advanced use cases in the upcoming tutorials.
325
+
We will cover more advanced use cases in the upcoming tutorials.
docs/source/developer_notes/index.rst
+19
@@ -41,11 +41,30 @@ We have a clear :doc:`lightrag_design_philosophy`, which results in this :doc:`c
41
41
class_hierarchy
42
42
43
43
44
+
Introduction
45
+
-------------------
46
+
47
+
48
+
:ref:`Component<core-component>` is to LLM task pipelines what ``nn.Module`` is to PyTorch models.
49
+
An LLM task pipeline in LightRAG mainly consists of components, such as a ``Prompt``, ``ModelClient``, ``Generator``, ``Retriever``, ``Agent``, or any other custom components.
50
+
This pipeline can be ``Sequential`` or a Directed Acyclic Graph (DAG) of components.
51
+
A ``Prompt`` will work with ``DataClass`` to ease data interaction with the LLM.
52
+
A ``Retriever`` will work with databases to retrieve context and overcome the hallucination and knowledge limitations of LLMs, following the paradigm of Retrieval-Augmented Generation (RAG).
53
+
An ``Agent`` will work with tools and an LLM planner for an enhanced ability to reason, plan, and act on real-world tasks.
54
+
44
55
56
+
A further strength of LightRAG is that all orchestrator components, like ``Retriever``, ``Embedder``, ``Generator``, and ``Agent``, are model-agnostic.
57
+
You can easily make each component work with different models from different providers by switching out the ``ModelClient`` and its ``model_kwargs``.
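The sketch below illustrates the general pattern behind this kind of model-agnostic design. It is an assumption-laden approximation rather than LightRAG's API: ``ModelClientLike``, ``EchoClient``, ``ReverseClient``, ``MiniGenerator``, and the model names are all hypothetical, and the clients return canned strings instead of calling a provider.

```python
from typing import Any, Dict, Protocol


class ModelClientLike(Protocol):
    """Hypothetical interface that every provider-specific client satisfies."""

    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str: ...


class EchoClient:
    # Stand-in for provider A; returns the prompt unchanged.
    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str:
        return f"{model_kwargs['model']}: {prompt}"


class ReverseClient:
    # Stand-in for provider B; returns the prompt reversed.
    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str:
        return f"{model_kwargs['model']}: {prompt[::-1]}"


class MiniGenerator:
    """Orchestrator that depends only on the client interface, not on a provider."""

    def __init__(self, model_client: ModelClientLike, model_kwargs: Dict[str, Any]) -> None:
        self.model_client = model_client
        self.model_kwargs = model_kwargs

    def call(self, prompt: str) -> str:
        return self.model_client.call(prompt, self.model_kwargs)


# Switching provider means swapping the client and its keyword arguments only.
gen_a = MiniGenerator(EchoClient(), {"model": "provider-a-small"})
gen_b = MiniGenerator(ReverseClient(), {"model": "provider-b-large"})
out_a = gen_a.call("abc")
out_b = gen_b.call("abc")
print(out_a)
print(out_b)
```

The orchestrator code stays identical across providers because every provider-specific detail lives behind the client object and its keyword arguments.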
58
+
59
+
60
+
We will introduce the library starting from the core base classes, then move to the RAG essentials, and finally to the agent essentials.
61
+
With these building blocks, we will then introduce optimization, where the optimizer uses building blocks such as ``Generator`` for auto-prompting and ``Retriever`` for dynamic few-shot in-context learning (ICL).