@@ -124,6 +124,58 @@ For more information, please see the following code files:

For an example of how to add embeddings to your data, please see [this embeddings example](../embedding.md).

+ ## RAG with Semaphore Models
+
+ [Progress Semaphore](https://www.progress.com/semaphore/platform) is a modular semantic AI platform that provides the
+ semantic layer of your digital ecosystem. With it, you can manage knowledge models, extract facts, classify the context
+ and meaning of structured and unstructured information, and generate rich semantic metadata.
+
+ Details for classifying text are specific to your Semaphore installation. However, for a Progress Data Cloud
+ installation, see the
+ [Classification and Language Service Developer's Guide](https://portal.smartlogic.com/docs/5.6/classification_server_-_developers_guide/welcome).
+
+ Once you have [classified](https://www.progress.com/semaphore/platform/semantic-knowledge-classification) your documents
+ and stored the extracted concepts on the documents, you can also search for those concepts as part of the RAG
+ retriever. A typical strategy is to use your custom model and the Semaphore Classifier to extract concepts from the
+ user's question. With that list of concepts, you can search your target documents for those that have matching
+ concepts, and then include those documents in the list of documents returned by the retriever.
+
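As a rough illustration of the last step in that strategy, the sketch below merges the documents found by a concept search into the list a retriever has already produced, skipping duplicates. The `merge_by_uri` helper and the dictionary shape with a `uri` key are assumptions made for this example; they are not part of the project's retrievers.

```python
from typing import Dict, Iterable, List


def merge_by_uri(
    retrieved: Iterable[Dict[str, str]],
    concept_matches: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Append concept-matched documents to the retrieved list, skipping duplicate URIs.

    Hypothetical helper: each document is assumed to be a dict with a "uri" key.
    """
    merged: List[Dict[str, str]] = []
    seen = set()
    for doc in list(retrieved) + list(concept_matches):
        if doc["uri"] not in seen:
            seen.add(doc["uri"])
            merged.append(doc)
    return merged


# Example: one document appears in both result sets and is kept only once.
vector_docs = [{"uri": "/events/1.json"}, {"uri": "/events/2.json"}]
concept_docs = [{"uri": "/events/2.json"}, {"uri": "/events/3.json"}]
merged_docs = merge_by_uri(vector_docs, concept_docs)
```

Keeping the original retriever order and appending concept matches afterwards is only one possible choice; a real retriever might instead re-rank the combined list.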
+ For instance, assume that you have extracted the concepts from a document and stored those concepts in a new JSON block in the
+ document that looks something like this:
+ ```
+ "concepts": [
+   {
+     "CrimeReportsModel-Crimes": "Public Order Crime"
+   },
+   {
+     "CrimeReportsModel-Crimes": "Disturbing the Peace"
+   },
+   ...
+ ]
+ ```
+ You can search for all documents that have been classified with the `Crimes` concept in the `CrimeReportsModel` model using
+ a CTS query:
+ ```
+ cts.jsonPropertyValueQuery('CrimeReportsModel-Crimes', 'Crimes')
+ ```
+ That query can be used on its own or as part of a more complex query that retrieves the documents that provide the best
+ context information to your LLM. One possibility is to adapt the vector retriever to use that query in the initial
+ documents query. The following adaptation of `vector_query_retriever.py` uses the `jsonPropertyValueQuery` instead
+ of the `wordQuery`.
+ ```
+ op.fromSearchDocs(
+   cts.andQuery([
+     cts.jsonPropertyValueQuery('CrimeReportsModel-Crimes', 'Crimes'),
+     cts.collectionQuery('events')
+   ]),
+   null,
+   {
+     'scoreMethod': 'score-bm25',
+     'bm25LengthWeight': 0.5
+   }
+ )
+ ```
+
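When the concepts come from the user's question rather than being hard-coded, the query text has to be assembled at request time. The following is a minimal sketch of one way to do that in Python, assuming the concepts have already been extracted into a list of strings. The `build_concept_search` helper is hypothetical and does not exist in `vector_query_retriever.py`. It relies on `cts.jsonPropertyValueQuery` accepting an array of values and matching a document when any one of them matches.

```python
import json
from typing import List


def build_concept_search(property_name: str, concepts: List[str], collection: str) -> str:
    """Return Optic JavaScript that searches one collection for any of the given concepts.

    Hypothetical helper: json.dumps is used so that names and concept strings are
    quoted and escaped as valid JavaScript literals.
    """
    if not concepts:
        raise ValueError("at least one concept is required")
    return (
        "op.fromSearchDocs(\n"
        "  cts.andQuery([\n"
        f"    cts.jsonPropertyValueQuery({json.dumps(property_name)}, {json.dumps(concepts)}),\n"
        f"    cts.collectionQuery({json.dumps(collection)})\n"
        "  ]),\n"
        "  null,\n"
        "  {'scoreMethod': 'score-bm25', 'bm25LengthWeight': 0.5}\n"
        ")"
    )


# Example using the concept values from the JSON block shown earlier.
query = build_concept_search(
    "CrimeReportsModel-Crimes",
    ["Public Order Crime", "Disturbing the Peace"],
    "events",
)
print(query)
```

Serializing the values with `json.dumps` rather than concatenating raw strings keeps quotes in a concept name from breaking the generated query. How the resulting text is passed into the rest of the retriever's query is left out here, since that depends on your retriever.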
## Summary

The three RAG approaches shown above - a simple word query, a contextual query, and a vector query - demonstrate how