Commit d0db80f

Merge pull request SylphAI-Inc#103 from SylphAI-Inc/li
[Tutorial editing] Edit all raw tutorials to fix grammar errors and smooth the content
2 parents 2be590a + 6ddd568 commit d0db80f

6 files changed: +168 -188 lines changed

docs/source/developer_notes/base_data_class.rst

+1 -1
@@ -300,7 +300,7 @@ The ``exclude`` parameter works the same across all methods.
300300

301301
**DataClassFormatType**
302302

303-
For data class format, we have :class:``core.base_data_class.DataClassFormatType`` along with ``format_class_str`` method to specify the format type for the data format methods.
303+
For data class format, we have :class:`DataClassFormatType<core.base_data_class.DataClassFormatType>` along with ``format_class_str`` method to specify the format type for the data format methods.
304304

305305
.. code-block:: python
306306

docs/source/developer_notes/component.rst

+90 -37
@@ -6,24 +6,30 @@ Component
66
77
.. `Li Yin <https://github.com/liyin2015>`_
88
9-
What you will learn?
9+
.. What you will learn?
10+
11+
.. 1. What is ``Component`` and why is it designed this way?
12+
.. 2. How to use ``Component`` along with helper classes like ``FunComponent`` and ``Sequential``?
13+
14+
15+
:ref:`Component<core-component>` is to LLM task pipelines what `nn.Module` is to PyTorch models.
16+
It is the base class for components such as ``Prompt``, ``ModelClient``, ``Generator``, ``Retriever`` in LightRAG.
17+
Your task pipeline should also subclass from ``Component``.
18+
1019

11-
1. What is ``Component`` and why is it designed this way?
12-
2. How to use ``Component`` along with helper classes like ``FunComponent`` and ``Sequential``?
1320

1421
Design
1522
---------------------------------------
16-
:ref:`Component<core-component>` is to LLM task pipelines what ``nn.Module`` is to PyTorch models.
1723

18-
It is the base class for components, such as ``Prompt``, ``ModelClient``, ``Generator``, ``Retriever`` in LightRAG.
19-
Your task pipeline should subclass from ``Component`` too. Instead of working with ``Tensor`` and ``Parameter`` to train models with weights and biases, our component works with any data, ``Parameter`` that can be any data type for LLM in-context learning, from manual to auto prompt engineering.
20-
We name it differently to avoid confusion and also for better compatibility with `PyTorch`.
24+
Different from PyTorch's nn.Module, which works exclusively with Tensor and Parameter to train models with weights and biases, our component can work with different types of data, from a string or a list of strings to a list of :class:`Document<core.types.Document>`.
25+
26+
.. `Parameter` that can be any data type for LLM in-context learning, from manual to auto prompt engineering.
2127
2228
29+
Here is the comparison of writing a PyTorch model and a LightRAG task pipeline.
2330

24-
Here is the comparison of writing a PyTorch model and a LightRAG task component.
2531

26-
.. grid:: 2
32+
.. grid:: 1
2733
:gutter: 1
2834

2935
.. grid-item-card:: PyTorch
@@ -65,28 +71,49 @@ Here is the comparison of writing a PyTorch model and a LightRAG task component.
6571
def call(self, query: str) -> str:
6672
return self.doc(prompt_kwargs={"input_str": query}).data
6773
74+
As the fundamental building block in LLM task pipelines, the component is designed to serve five main purposes:
75+
76+
1. **Standardize the interface for all components.**
77+
This includes the `__init__` method, the `call` method for synchronous calls, the `acall` method for asynchronous calls, and the `__call__` method, which by default calls the `call` method.
6878

69-
As the foundamental building block in LLM task pipeline, the component is designed to serve five main purposes:
79+
2. **Provide a unified way to visualize the structure of the task pipeline**
80+
via the `__repr__` method. Subclasses can additionally add the `_extra_repr` method to include more information than the default `__repr__` method.
7081

71-
1. **Standarize the interface for all components.** This includes the `__init__` method, the `call` method for synchronous call, the `acall` method for asynchronous call, and the `__call__` which in default calls the `call` method.
72-
2. **Provide a unified way to visualize the structure of the task pipeline** via `__repr__` method. And subclass can additional add `_extra_repr` method to add more information than the default `__repr__` method.
73-
3. **Tracks, adds all subcomponents and parameters automatically and recursively** to assistant the building and optimizing process of the task pipeline.
74-
4. **Manages the states and serialization**, with `state_dict` and `load_state_dict` methods in particular for parameters and `to_dict` method for serialization of all the states fall into the component's attributes, from subcomponents to parameters, to any other attributes of various data type.
75-
5. **Make all components configurable from using `json` or `yaml` files**. This is especially useful for experimenting or building data processing pipelines.
82+
3. **Track and add all subcomponents and parameters automatically and recursively**
83+
to assist in the building and optimizing process of the task pipeline.
7684

77-
These features are key to keep LightRAG pipeline transparent, flexible, and easy to use.
85+
4. **Manage the states and serialization**,
86+
with `state_dict` and `load_state_dict` methods specifically for parameters, and the `to_dict` method for serialization of all states within the component's attributes, from subcomponents to parameters, to any other attributes of various data types.
87+
88+
5. **Make all components configurable using `json` or `yaml` files**.
89+
This is especially useful for experimenting or building data processing pipelines.
90+
91+
These features are key to keeping the LightRAG pipeline transparent, flexible, and easy to use.
7892
By subclassing from the `Component` class, you will get most of these features out of the box.
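
Reviewer note: the five purposes listed above can be illustrated with a small, self-contained sketch. This is not LightRAG's implementation. ``MiniComponent``, ``Upper``, and ``Pipeline`` are hypothetical names, and the sketch only assumes the behavior described in the text: ``__call__`` delegates to ``call``, subcomponents are registered automatically when assigned as attributes, ``__repr__`` shows the nested structure, and ``to_dict`` serializes it.

```python
from typing import Any, Dict


class MiniComponent:
    """Hypothetical stand-in for a Component-like base class (not LightRAG's code)."""

    def __init__(self) -> None:
        # Create the registry with object.__setattr__ so it exists before
        # our own __setattr__ hook starts consulting it.
        object.__setattr__(self, "_components", {})

    def __setattr__(self, name: str, value: Any) -> None:
        # Automatically register any attribute that is itself a component.
        if isinstance(value, MiniComponent):
            self._components[name] = value
        object.__setattr__(self, name, value)

    def call(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError("Subclasses implement call().")

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # By default, calling the component delegates to call().
        return self.call(*args, **kwargs)

    def _extra_repr(self) -> str:
        # Subclasses may override this to show extra details.
        return ""

    def __repr__(self) -> str:
        header = f"{type(self).__name__}({self._extra_repr()}"
        if not self._components:
            return header + ")"
        lines = [header]
        for name, child in self._components.items():
            # Indent nested representations to show the tree structure.
            child_repr = repr(child).replace("\n", "\n  ")
            lines.append(f"  ({name}): {child_repr}")
        lines.append(")")
        return "\n".join(lines)

    def to_dict(self) -> Dict[str, Any]:
        # Recursively serialize the component tree.
        return {
            "type": type(self).__name__,
            "components": {n: c.to_dict() for n, c in self._components.items()},
        }


class Upper(MiniComponent):
    def call(self, text: str) -> str:
        return text.upper()


class Pipeline(MiniComponent):
    def __init__(self) -> None:
        super().__init__()
        self.upper = Upper()  # registered as a subcomponent automatically

    def call(self, text: str) -> str:
        return self.upper(text) + "!"


pipeline = Pipeline()
print(pipeline("hi"))
print(pipeline)
print(pipeline.to_dict())
```

Registering subcomponents inside ``__setattr__`` is one common way to get recursive tracking without asking users to call a registration method. The real ``Component`` class may differ in how it does this.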
7993

8094

95+
.. As the foundamental building block in LLM task pipeline, the component is designed to serve five main purposes:
96+
97+
.. 1. **Standarize the interface for all components.** This includes the `__init__` method, the `call` method for synchronous call, the `acall` method for asynchronous call, and the `__call__` which in default calls the `call` method.
98+
.. 2. **Provide a unified way to visualize the structure of the task pipeline** via `__repr__` method. And subclass can additional add `_extra_repr` method to add more information than the default `__repr__` method.
99+
.. 3. **Tracks, adds all subcomponents and parameters automatically and recursively** to assistant the building and optimizing process of the task pipeline.
100+
.. 4. **Manages the states and serialization**, with `state_dict` and `load_state_dict` methods in particular for parameters and `to_dict` method for serialization of all the states fall into the component's attributes, from subcomponents to parameters, to any other attributes of various data type.
101+
.. 5. **Make all components configurable from using `json` or `yaml` files**. This is especially useful for experimenting or building data processing pipelines.
102+
103+
.. These features are key to keep LightRAG pipeline transparent, flexible, and easy to use.
104+
.. By subclassing from the `Component` class, you will get most of these features out of the box.
105+
106+
81107
Component in Action
82108
---------------------------------------
83109

84-
.. Transparency
85-
.. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
110+
111+
86112

87113
In this note, we are creating an AI doctor to answer medical questions.
88114
Run the ``DocQA`` on a query:
89115

116+
90117
.. code-block:: python
91118
92119
doc = DocQA()
@@ -133,6 +160,7 @@ Configure from file
133160
As the above example shows, we added subcomponent via attributes.
134161
We can also use methods to add more subcomponnents or parameters.
135162

163+
136164
.. code-block:: python
137165
138166
from lightrag.core.parameter import Parameter
@@ -141,8 +169,12 @@ We can also use methods to add more subcomponnents or parameters.
141169
# list all parameters
142170
for param in doc.named_parameters():
143171
print(param)
144-
# output
145-
# ('demo', Parameter: demo)
172+
173+
The output:
174+
175+
.. code-block::
176+
177+
('demo', Parameter: demo)
146178
147179
You can easily save the detailed states:
148180

@@ -152,21 +184,25 @@ You can easily save the detailed states:
152184
153185
save_json(doc.to_dict(), "doc.json")
154186
187+
To add even more flexibility, we provide :class:`FunComponent<core.component.FunComponent>` and :class:`Sequential<core.container.Sequential>` for more advanced use cases.
155188

156-
To adds even more flexibility, we provide :class:`core.component.FunComponent` and :class:`core.component.Sequential` for more advanced use cases.
157189

158190

159191
Searalization and deserialization
160192
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
161193

162-
We provide ``is_pickable`` method to check if the component is pickable.
163-
And any of your component, it is a good practise to ensure it is pickable.
194+
We provide the ``is_pickable`` method to check if the component is pickable.
195+
It is good practice to ensure that any of your components are pickable.
196+
197+
198+
199+
164200

165201
FunComponent
166202
--------------
167-
Use :func:`core.component.fun_to_component` as a decorator to convert any function to a Component with its unique class name.
203+
Use :func:`fun_to_component<core.component.fun_to_component>` as a decorator to convert any function to a Component with its unique class name.
168204

169-
:class:`core.component.FunComponent` is a subclass of :class:`core.component.Component` that allows you to define a component with a function.
205+
:class:`FunComponent<core.component.FunComponent>` is a subclass of :class:`Component<core.component.Component>` that allows you to define a component with a function.
170206
You can directly use this class as:
171207

172208
.. code-block:: python
@@ -180,16 +216,21 @@ You can directly use this class as:
180216
print(fun_component(1))
181217
print(type(fun_component))
182218
183-
# output:
184-
# 2
185-
# <class 'core.component.FunComponent'>
219+
The printout:
220+
221+
.. code-block::
186222
223+
2
224+
<class 'core.component.FunComponent'>
187225
188-
We also have :func:`core.component.fun_to_component` to convert a function to a FunComponent via decorator or directly call the function.
226+
227+
228+
We also have :func:`fun_to_component<core.component.fun_to_component>` to convert a function to a `FunComponent` via a decorator or by directly calling the function.
189229
This approach gives you a unique component converted from the function name.
190230

191231
Via direct call:
192232

233+
193234
.. code-block:: python
194235
195236
from lightrag.core.component import fun_to_component
@@ -198,12 +239,17 @@ Via direct call:
198239
print(fun_component(1))
199240
print(type(fun_component))
200241
201-
# output:
202-
# 2
203-
# <class 'lightrag.core.component.AddOneComponent'>
242+
The output:
243+
244+
.. code-block::
245+
246+
2
247+
<class 'lightrag.core.component.AddOneComponent'>
204248
205249
206-
Via decorator will be even more convenient to have a component from a function:
250+
251+
252+
Using a decorator is an even more convenient way to create a component from a function:
207253

208254
.. code-block:: python
209255
@@ -220,8 +266,12 @@ Via decorator will be even more convenient to have a component from a function:
220266
221267
Sequential
222268
--------------
223-
We have :class:`core.component.Sequential` class to PyTorch's ``nn.Sequential`` class. This is especially useful to chain together components in a sequence. Much like the concept of ``chain`` or ``pipeline`` in other LLM libraries.
224-
Let's put the FunComponent and DocQA together in a sequence:
269+
270+
271+
272+
We have the :class:`Sequential<core.container.Sequential>` class, which is similar to PyTorch's ``nn.Sequential`` class.
273+
This is especially useful for chaining together components in a sequence, much like the concept of ``chain`` or ``pipeline`` in other LLM libraries.
274+
Let's put the `FunComponent`` and `DocQA`` together in a sequence:
225275

226276
.. code-block:: python
227277
@@ -236,9 +286,12 @@ Let's put the FunComponent and DocQA together in a sequence:
236286
query = "What is the best treatment for headache?"
237287
print(seq(query))
238288
239-
We automatically enhance users' queries before passing them to the DocQA component.
289+
We automatically enhance users' queries before passing them to the `DocQA` component.
240290
The output is:
241291

292+
293+
294+
242295
.. code-block::
243296
244297
1. Over-the-counter pain relievers like acetaminophen, ibuprofen, or aspirin
@@ -269,4 +322,4 @@ The structure of the sequence using ``print(seq)``:
269322
- :func:`core.component.fun_to_component`
270323

271324

272-
We will have more advanced use cases in the upcoming tutorials.
325+
We will cover more advanced use cases in the upcoming tutorials.
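
Reviewer note: the decorator and chaining ideas from the ``FunComponent`` and ``Sequential`` sections of this file can be approximated in plain Python. The sketch below is an assumption-laden illustration, not the library's code. ``to_component`` and ``Chain`` are hypothetical names, and the class-naming rule (``add_one`` becomes ``AddOneComponent``) is inferred only from the printed output shown in the diff.

```python
from typing import Any, Callable


def to_component(fun: Callable[..., Any]) -> Any:
    """Hypothetical helper: wrap `fun` in an instance of a class named after it."""
    # add_one -> AddOneComponent (naming rule assumed from the tutorial output)
    class_name = "".join(p.capitalize() for p in fun.__name__.split("_")) + "Component"

    def call(self: Any, *args: Any, **kwargs: Any) -> Any:
        return fun(*args, **kwargs)

    # Build a new class at runtime so each function gets its own type.
    cls = type(class_name, (), {"__call__": call})
    return cls()


class Chain:
    """Hypothetical sequential container: feeds each step's output to the next."""

    def __init__(self, *steps: Callable[[Any], Any]) -> None:
        self.steps = steps

    def __call__(self, value: Any) -> Any:
        for step in self.steps:
            value = step(value)
        return value


@to_component
def add_one(x: int) -> int:
    return x + 1


@to_component
def double(x: int) -> int:
    return x * 2


seq = Chain(add_one, double)
print(type(add_one).__name__)
print(seq(1))
```

Giving each wrapped function its own class name keeps a printed pipeline readable, which matches the tutorial's emphasis on transparency.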

docs/source/developer_notes/index.rst

+19
@@ -41,11 +41,30 @@ We have a clear :doc:`lightrag_design_philosophy`, which results in this :doc:`c
4141
class_hierarchy
4242

4343

44+
Introduction
45+
-------------------
46+
47+
48+
:ref:`Component<core-component>` is to LLM task pipelines what `nn.Module` is to PyTorch models.
49+
An LLM task pipeline in LightRAG mainly consists of components, such as a `Prompt`, `ModelClient`, `Generator`, `Retriever`, `Agent`, or any other custom components.
50+
This pipeline can be `Sequential` or a Directed Acyclic Graph (DAG) of components.
51+
A `Prompt` will work with `DataClass` to ease data interaction with the LLM model.
52+
A `Retriever` will work with databases to retrieve context and overcome the hallucination and knowledge limitations of LLM, following the paradigm of Retrieval-Augmented Generation (RAG).
53+
An `Agent` will work with tools and an LLM planner for enhanced ability to reason, plan, and act on real-world tasks.
54+
4455

56+
Additionally, what shines in LightRAG is that all orchestrator components, like `Retriever`, `Embedder`, `Generator`, and `Agent`, are model-agnostic.
57+
You can easily make each component work with different models from different providers by switching out the `ModelClient` and its `model_kwargs`.
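
Reviewer note: the model-agnostic claim above amounts to dependency injection. A minimal sketch follows, assuming only that an orchestrator holds a client object plus a ``model_kwargs`` dictionary. ``ClientLike``, ``EchoClient``, ``ShoutClient``, and ``MiniGenerator`` are hypothetical names and do not reflect the real ``ModelClient`` or ``Generator`` signatures.

```python
from typing import Any, Dict, Protocol


class ClientLike(Protocol):
    """Hypothetical interface that every client in this sketch satisfies."""

    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str: ...


class EchoClient:
    # Stand-in for one provider: returns the prompt unchanged.
    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str:
        return f"[{model_kwargs['model']}] {prompt}"


class ShoutClient:
    # Stand-in for another provider: returns the prompt upper-cased.
    def call(self, prompt: str, model_kwargs: Dict[str, Any]) -> str:
        return f"[{model_kwargs['model']}] {prompt.upper()}"


class MiniGenerator:
    """Hypothetical orchestrator that does not care which client it holds."""

    def __init__(self, model_client: ClientLike, model_kwargs: Dict[str, Any]) -> None:
        self.model_client = model_client
        self.model_kwargs = model_kwargs

    def call(self, prompt: str) -> str:
        return self.model_client.call(prompt, self.model_kwargs)


# Swapping providers only changes the constructor arguments.
gen_a = MiniGenerator(EchoClient(), {"model": "model-a"})
gen_b = MiniGenerator(ShoutClient(), {"model": "model-b"})
print(gen_a.call("hi"))
print(gen_b.call("hi"))
```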
58+
59+
60+
We will introduce the libraries starting from the core base classes, then move to the RAG essentials, and finally to the agent essentials.
61+
With these building blocks, we will further introduce optimizing, where the optimizer uses building blocks such as Generator for auto-prompting and retriever for dynamic few-shot in-context learning (ICL).
4562

4663
Building
4764
-------------------
4865

66+
67+
4968
Base classes
5069
~~~~~~~~~~~~~~~~~~~~~~
5170
Code path: :ref:`lightrag.core <apis-core>`.
