Commit a81d414

Couple docs edits
1 parent 8ed927e commit a81d414

2 files changed: +15 -17 lines changed

docs/embedding.md

Lines changed: 7 additions & 9 deletions

@@ -4,14 +4,6 @@ title: Embedding Examples
 nav_order: 5
 ---
 
-## Table of contents
-{: .no_toc .text-delta }
-
-- TOC
-{:toc}
-
-## Adding embeddings with langchain4j
-
 The vector queries shown in the [langchain](../rag-langchain-python/README.md),
 [langchain4j](../rag-langchain-java), and [langchain.js](../rag-langchain-js/README.md) RAG examples
 depend on embeddings - vector representations of text - being added to documents in MarkLogic. Vector queries can
@@ -21,6 +13,12 @@ This project demonstrates the use of a
 the [MarkLogic Data Movement SDK](https://docs.marklogic.com/guide/java/data-movement) for adding embeddings to
 documents in MarkLogic.
 
+## Table of contents
+{: .no_toc .text-delta }
+
+- TOC
+{:toc}
+
 ## Setup
 
 This example depends both on the [main setup for all examples](../setup/README.md) and also on having run the
@@ -29,7 +27,7 @@ This example depends both on the [main setup for all examples](../setup/README.m
 the text in Enron email documents and write each chunk of text to a separate document. This example will then use
 langchain4j to generate an embedding for the chunk of text and add it to each chunk document.
 
-## Add embeddings example
+## Adding embedding to documents
 
 To try the embedding example, run the following Gradle task:
 
docs/splitting.md

Lines changed: 8 additions & 8 deletions

@@ -4,21 +4,21 @@ title: Splitting Examples
 nav_order: 4
 ---
 
-## Table of contents
-{: .no_toc .text-delta }
-
-- TOC
-{:toc}
-
-## Splitting documents with langchain4j
-
 A RAG approach typically benefits from sending multiple smaller segments or "chunks" of text to an LLM. While MarkLogic
 can efficiently ingest and index large documents, sending all the text in even a single document may either exceed
 the number of tokens allowed by your LLM or may result in slower and more expensive responses from the LLM. Thus,
 when importing or reprocessing documents in MarkLogic, your RAG approach may benefit from splitting the searchable
 text in a document into smaller segments or "chunks" that allow for much smaller and more relevent segments of text
 to be sent to the LLM.
 
+## Table of contents
+{: .no_toc .text-delta }
+
+- TOC
+{:toc}
+
+## Overview
+
 This project demonstrates two different approaches to splitting documents:
 
 1. Splitting the text in a document and storing each chunk in a new separate document.
