diff --git a/modules/genai-ecosystem/images/defaultEmbeddingNode.png b/modules/genai-ecosystem/images/defaultEmbeddingNode.png
new file mode 100644
index 0000000..8c13310
Binary files /dev/null and b/modules/genai-ecosystem/images/defaultEmbeddingNode.png differ
diff --git a/modules/genai-ecosystem/images/nodeWithCustomLabelAndProps.png b/modules/genai-ecosystem/images/nodeWithCustomLabelAndProps.png
new file mode 100644
index 0000000..13805e3
Binary files /dev/null and b/modules/genai-ecosystem/images/nodeWithCustomLabelAndProps.png differ
diff --git a/modules/genai-ecosystem/images/nodeWithCustomLabelPropsAndMetadata.png b/modules/genai-ecosystem/images/nodeWithCustomLabelPropsAndMetadata.png
new file mode 100644
index 0000000..efeb3af
Binary files /dev/null and b/modules/genai-ecosystem/images/nodeWithCustomLabelPropsAndMetadata.png differ
diff --git a/modules/genai-ecosystem/pages/langchain4j.adoc b/modules/genai-ecosystem/pages/langchain4j.adoc
index 5aefaaa..eda0b87 100644
--- a/modules/genai-ecosystem/pages/langchain4j.adoc
+++ b/modules/genai-ecosystem/pages/langchain4j.adoc
@@ -19,26 +19,447 @@ The Neo4j Integration makes the xref:vector-search.adoc[Neo4j Vector] index avai
 
 == Installation
 
+[NOTE]
+----
+Unless otherwise specified, the following features are available from version 1.0.0-beta3 of LangChain4j Community.
+Previously they were also available in LangChain4j Core, but since version 1.0.0-beta3 they have been migrated to the Community project.
+----
+
 .pom.xml
 [source,xml]
 ----
 include::https://github.com/langchain4j/langchain4j-examples/raw/main/neo4j-example/pom.xml[lines=19..23]
 ----
+[source,xml]
+----
+<dependency>
+    <groupId>dev.langchain4j</groupId>
+    <artifactId>langchain4j-community-neo4j</artifactId>
+    <version>{langchain4j.version}</version>
+</dependency>
+
+<dependency>
+    <groupId>dev.langchain4j</groupId>
+    <artifactId>langchain4j-community-neo4j-retriever</artifactId>
+    <version>{langchain4j.version}</version>
+</dependency>
+----
+
 == Functionality Includes
 
 * Create vector index
 * Populate nodes and vector index from documents
 * Query vector index
+* Convert natural language questions into Cypher queries
 
 == Documentation
 
-An example is avalable at: https://github.com/langchain4j/langchain4j-examples/tree/main/neo4j-example
+An example is available at: https://github.com/langchain4j/langchain4j-examples/tree/main/neo4j-example
+
+[source,java]
+----
+include::https://raw.githubusercontent.com/vga91/langchain4j-examples/refs/heads/test-tags/neo4j-example/src/main/java/Neo4jEmbeddingStoreExample.java[tags=test]
+----
+
+LangChain4j provides the following classes for Neo4j integration:
+
+- `Neo4jEmbeddingStore`: Implements the https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/store/embedding/EmbeddingStore.java[EmbeddingStore] interface; it stores and queries vector embeddings in a Neo4j database.
+- `Neo4jText2CypherRetriever`: Implements the https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/rag/content/retriever/ContentRetriever.java[ContentRetriever] interface; it generates Cypher queries from user questions and executes them against a Neo4j database.
+
+
+=== Connection Setup
+
+We can connect to Neo4j and instantiate the above classes via builders.
+
+==== Neo4jEmbeddingStore
+
+Here is how to create a `Neo4jEmbeddingStore` instance:
+[source,java]
+----
+Neo4jEmbeddingStore embeddingStore = Neo4jEmbeddingStore.builder()/* builder parameters */.build();
+----
+
+The builder parameters must include `dimension` and either `driver` or `withBasicAuth`;
+all other parameters are optional.
+
+Here is the complete list of builder parameters:
+
+[options="header",cols="m,m,a"]
+|===
+| Key | Default Value | Description
+| driver | | the https://neo4j.com/docs/api/java-driver/current/org.neo4j.driver/org/neo4j/driver/Driver.html[Java Driver instance]
+| withBasicAuth | | creates the https://neo4j.com/docs/api/java-driver/current/org.neo4j.driver/org/neo4j/driver/Driver.html[Java Driver instance] used by the `Neo4jEmbeddingStore` from `uri`, `user` and `password`
+| dimension | | the dimension of the embedding vectors
+| config | org.neo4j.driver.SessionConfig.forDatabase(`databaseName`) | the https://neo4j.com/docs/api/java-driver/current/org.neo4j.driver/org/neo4j/driver/SessionConfig.html[SessionConfig instance]
+| label | "Document" | the label name
+| embeddingProperty | "embedding" | the embedding property name
+| idProperty | "id" | the id property name
+| metadataPrefix | "" | the prefix added to metadata property names
+| textProperty | "text" | the text property name
+| indexName | "vector" | the vector index name
+| databaseName | "neo4j" | the database name
+| retrievalQuery | "RETURN properties(node) AS metadata, node.idProperty AS idProperty, node.textProperty AS textProperty, node.embeddingProperty AS embeddingProperty, score" | the retrieval query
+|===
+
+==== Neo4jText2CypherRetriever
+
+The `Neo4jText2CypherRetriever` translates natural language questions into Cypher queries,
+leveraging the Neo4j schema computed via the https://neo4j.com/docs/apoc/current/overview/apoc.meta/apoc.meta.data/[apoc.meta.data] procedure.
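The schema ends up in the prompt as plain text, in the form shown in the usage example further down this page. As a rough illustration only, the sketch below renders that text from in-memory maps using plain Java. The class `SchemaTextSketch` and its methods are hypothetical and are not part of LangChain4j or APOC; the retriever's actual implementation may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only: not a LangChain4j or APOC class.
public class SchemaTextSketch {

    // Renders one line per label or relationship type, e.g. ":Book {title: STRING}".
    // Property order follows the iteration order of the inner map.
    static String renderProperties(Map<String, Map<String, String>> propsByName) {
        return propsByName.entrySet().stream()
                .map(e -> ":" + e.getKey() + " {" + e.getValue().entrySet().stream()
                        .map(p -> p.getKey() + ": " + p.getValue())
                        .collect(Collectors.joining(", ")) + "}")
                .collect(Collectors.joining("\n"));
    }

    // Assembles the three sections of the schema text.
    static String renderSchema(Map<String, Map<String, String>> nodeProps,
                               Map<String, Map<String, String>> relProps,
                               List<String> patterns) {
        return "Node properties are the following:\n" + renderProperties(nodeProps)
                + "\n\nRelationship properties are the following:\n" + renderProperties(relProps)
                + "\n\nThe relationships are the following:\n" + String.join("\n", patterns);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the labels in insertion order.
        Map<String, Map<String, String>> nodeProps = new LinkedHashMap<>();
        nodeProps.put("Book", Map.of("title", "STRING"));
        nodeProps.put("Person", Map.of("name", "STRING"));
        Map<String, Map<String, String>> relProps = Map.of("WROTE", Map.of("when", "DATE"));

        System.out.println(renderSchema(nodeProps, relProps, List.of("(:Person)-[:WROTE]->(:Book)")));
    }
}
```

In a real application the maps would be filled from the rows returned by `apoc.meta.data`; that step is left out here because it requires a running database.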
+
+
+Here is how to create a `Neo4jText2CypherRetriever` instance:
+[source,java]
+----
+Neo4jText2CypherRetriever retriever = Neo4jText2CypherRetriever.builder()/* builder parameters */.build();
+----
+
+Here is the complete list of builder parameters:
+[options="header",cols="m,m,a"]
+|===
+| Key | Default Value | Description
+| graph | | see below
+| chatModel | | the https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/model/chat/ChatModel.java[ChatModel] implementation used to create the Cypher query from a natural language question
+| prompt | see example below | the prompt that will be used with the chatModel
+| examples | empty string | additional examples to enrich and improve the result
+|===
+
+To connect to Neo4j, we use the `Neo4jGraph` class as follows:
+[source,java]
+----
+// Neo4j Java Driver connection instance
+Driver driver = GraphDatabase.driver("<uri>", AuthTokens.basic("<user>", "<password>"));
+
+Neo4jGraph neo4jGraph = Neo4jGraph.builder()
+    .driver(driver)
+    .build();
+----
+
+or with `withBasicAuth`, as for the `Neo4jEmbeddingStore`:
+
+[source,java]
+----
+Neo4jGraph neo4jGraph = Neo4jGraph.builder()
+    .withBasicAuth("<uri>", "<user>", "<password>")
+    .build();
+----
+
+and then pass it to the builder:
+
+[source,java]
+----
+Neo4jGraph neo4jGraph = /* Neo4jGraph instance */;
+
+// ChatModel instance, e.g. OpenAiChatModel
+ChatModel chatLanguageModel = OpenAiChatModel.builder()
+    .apiKey(OPENAI_API_KEY)
+    .modelName(GPT_4_O_MINI)
+    .build();
+
+// Neo4jText2CypherRetriever instance
+Neo4jText2CypherRetriever retriever = Neo4jText2CypherRetriever.builder()
+    .graph(neo4jGraph)
+    .chatLanguageModel(chatLanguageModel)
+    .build();
+----
+
+== Usage Examples
+
+=== Neo4jEmbeddingStore
+
+We can define a `Neo4jEmbeddingStore` with the required Java Driver instance and dimension:
+
+[source,java]
+----
+Driver driver = GraphDatabase.driver("<uri>", AuthTokens.basic("<user>", "<password>"));
+
+Neo4jEmbeddingStore embeddingStore = Neo4jEmbeddingStore.builder()
+    .driver(driver)
+    .dimension(384)
+    .build();
+----
+
+Or use `withBasicAuth` as an alternative to `driver()`:
+[source,java]
+----
+Neo4jEmbeddingStore embeddingStore = Neo4jEmbeddingStore.builder()
+    .withBasicAuth("<uri>", "<user>", "<password>")
+    .dimension(384)
+    .build();
+----
+
+We can also define a `Neo4jEmbeddingStore` with the required configuration and an optional label name:
+[source,java]
+----
+Driver driver = GraphDatabase.driver("<uri>", AuthTokens.basic("<user>", "<password>"));
+
+Neo4jEmbeddingStore embeddingStoreWithCustomLabel = Neo4jEmbeddingStore.builder()
+    .driver(driver)
+    .dimension(384)
+    .label("CustomLabel")
+    .build();
+----
+
+
+To add a single embedding with a `TextSegment`:
+
+[source,java]
+----
+TextSegment segment = TextSegment.from("I like football.");
+Embedding embedding = embeddingModel.embed(segment.text()).content();
+String id = embeddingStore.add(embedding, segment);
+// output: id of the embedding
+----
+
+The above code creates a node with the label `Document` and the properties `embedding` (the embedding vector), `text` (the original text) and `id` (a UUID):
+
+image::defaultEmbeddingNode.png[]
+
+To add multiple embeddings:
+
+[source,java]
+----
+Embedding firstEmbedding = embeddingModel.embed("firstEmbedText").content();
+Embedding secondEmbedding = embeddingModel.embed("secondEmbedText").content();
+List<String> ids = embeddingStore.addAll(asList(firstEmbedding, secondEmbedding));
+// output: list of the embedding ids
+----
+
+
+To add an embedding with a `TextSegment` and a metadata entry with key "foo" and value "bar":
+
+[source,java]
+----
+TextSegment segment = TextSegment.from(randomUUID(), Metadata.from("foo", "bar"));
+Embedding embedding = embeddingModel.embed(segment.text()).content();
+String id = embeddingStore.add(embedding, segment);
+// output: id of the embedding
+----
+
+To add an embedding with a `TextSegment` and three metadata entries with keys "foo1", "foo2", "foo3" and values "bar1", "bar2", "bar3" respectively:
+[source,java]
+----
+TextSegment segment = TextSegment.from(randomUUID(), Metadata.from(Map.of("foo1", "bar1", "foo2", "bar2", "foo3", "bar3")));
+Embedding embedding = embeddingModel.embed(segment.text()).content();
+String id = embeddingStore.add(embedding, segment);
+// output: id of the embedding
+----
+
+To add multiple embeddings with segments:
+
+[source,java]
+----
+TextSegment firstSegment = TextSegment.from("firstText");
+Embedding firstEmbedding = embeddingModel.embed(firstSegment.text()).content();
+
+TextSegment secondSegment = TextSegment.from("secondText");
+Embedding secondEmbedding = embeddingModel.embed(secondSegment.text()).content();
+
+List<String> ids = embeddingStore.addAll(
+    asList(firstEmbedding, secondEmbedding),
+    asList(firstSegment, secondSegment)
+);
+// output: list of the embedding ids
+----
+
+
+
+To add embeddings with a custom label, index name and property names:
 [source,java]
 ----
-include::https://github.com/langchain4j/langchain4j-examples/raw/main/neo4j-example/src/main/java/Neo4jEmbeddingStoreExample.java[]
+Neo4jEmbeddingStore customEmbeddingStore = Neo4jEmbeddingStore.builder()
+    .withBasicAuth("<uri>", "<user>", "<password>")
+    .dimension(embeddingModel.dimension())
+    .indexName("customIdx")
+    .label("MyCustomLabel")
+    .embeddingProperty("customProp")
+    .idProperty("customId")
+    .textProperty("customText")
+    .build();
+
+TextSegment segment = TextSegment.from(randomUUID(), Metadata.from("test-key", "test-value"));
+Embedding embedding = embeddingModel.embed(segment.text()).content();
+String id = customEmbeddingStore.add(embedding, segment);
+// output: id of the embedding
+
+final EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
+    .queryEmbedding(embedding)
+    .maxResults(10)
+    .build();
+final List<EmbeddingMatch<TextSegment>> relevant = customEmbeddingStore.search(request).matches();
+// output: list of embeddings
+----
+
+The above example creates a node with the label `MyCustomLabel` and the properties `customId`, `customProp` and `customText`:
+
+image::nodeWithCustomLabelAndProps.png[]
+
+
+To add embeddings with a custom metadata prefix:
+
+[source,java]
+----
+String metadataPrefix = "metadata.";
+String labelName = "CustomLabelName";
+
+Neo4jEmbeddingStore embeddingStore = Neo4jEmbeddingStore.builder()
+    .withBasicAuth("<uri>", "<user>", "<password>")
+    .dimension(384)
+    .metadataPrefix(metadataPrefix)
+    .label(labelName)
+    .indexName("customIdxName")
+    .build();
+
+// property key expected on the node for the metadata entry below: "metadata.test-key"
+String metadataCompleteKey = metadataPrefix + "test-key";
+
+TextSegment segment = TextSegment.from(randomUUID(), Metadata.from("test-key", "test-value"));
+Embedding embedding = embeddingModel.embed(segment.text()).content();
+String id = embeddingStore.add(embedding, segment);
+// output: id of the embedding
+----
+
+image::nodeWithCustomLabelPropsAndMetadata.png[]
+
+
+Search embeddings:
+
+[source,java]
 ----
+Embedding embedding = embeddingModel.embed("embedText").content();
+String id = embeddingStore.add(embedding);
+final EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
+    .queryEmbedding(embedding)
+    .maxResults(10) // Optional `max results`, default: 3
+    .minScore(0.15) // Optional `min score`, default: 0.0
+    .build();
+
+final List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.search(request).matches();
+// output: list of embeddings
+----
+
+=== Neo4jText2CypherRetriever
+
+
+Here is a basic example:
+[source,java]
+----
+// create dataset, for example:
+// CREATE (book:Book {title: 'Dune'})<-[:WROTE {when: date('1999')}]-(author:Person {name: 'Frank Herbert'})
+
+
+// create a Neo4jGraph instance
+Neo4jGraph neo4jGraph = Neo4jGraph.builder()
+    .driver(/* Neo4j Driver instance */)
+    .build();
+
+// create a Neo4jText2CypherRetriever instance
+Neo4jText2CypherRetriever retriever = Neo4jText2CypherRetriever.builder()
+    .graph(neo4jGraph)
+    .chatLanguageModel(chatLanguageModel)
+    .build();
+
+Query query = new Query("Who is the author of the book 'Dune'?");
+
+// retrieve result
+List<Content> contents = retriever.retrieve(query);
+
+System.out.println(contents.get(0).textSegment().text());
+// example output: "Frank Herbert"
+----
+
+The example above sends a chat request with the following prompt:
+
+[source,text]
+----
+Task:Generate Cypher statement to query a graph database.
+Instructions
+Use only the provided relationship types and properties in the schema.
+Do not use any other relationship types or properties that are not provided.
+Schema:
+
+Node properties are the following:
+:Book {title: STRING}
+:Person {name: STRING}
+
+Relationship properties are the following:
+:WROTE {when: DATE}
+
+The relationships are the following:
+(:Person)-[:WROTE]->(:Book)
+
+Note: Do not include any explanations or apologies in your responses.
+Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
+Do not include any text except the generated Cypher statement.
+The question is: {{question}}
+----
+where `question` is "Who is the author of the book 'Dune'?"
+and `schema` is the current Neo4j schema, retrieved and converted to text via the `apoc.meta.data` procedure.
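The substitution described above can be pictured as plain placeholder replacement. The sketch below uses only the Java standard library and is a simplified stand-in, not LangChain4j's own template engine; the class `PromptFillSketch`, its `fill` method and the `{{schema}}` placeholder name are assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical helper, for illustration only: not a LangChain4j class.
public class PromptFillSketch {

    // Replaces each {{name}} placeholder with its value.
    // Placeholders without a matching variable are left untouched.
    // Note: values are substituted one after another, so a value that itself
    // contains "{{...}}" could be replaced again; a real template engine avoids this.
    static String fill(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> variable : variables.entrySet()) {
            result = result.replace("{{" + variable.getKey() + "}}", variable.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened template; the {{schema}} placeholder name is an assumption.
        String template = "Schema:\n{{schema}}\nThe question is: {{question}}";

        String prompt = fill(template, Map.of(
                "schema", "(:Person)-[:WROTE]->(:Book)",
                "question", "Who is the author of the book 'Dune'?"));

        System.out.println(prompt);
    }
}
```

The resulting string is what would be sent to the chat model, which is expected to answer with a single Cypher statement.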
+For the dataset above, the schema is:
+[source,text]
+----
+Node properties are the following:
+:Book {title: STRING}
+:Person {name: STRING}
+
+Relationship properties are the following:
+:WROTE {when: DATE}
+
+The relationships are the following:
+(:Person)-[:WROTE]->(:Book)
+----
+
+We can also change the default prompt if needed:
+[source,java]
+----
+Neo4jGraph neo4jGraph = /* Neo4jGraph instance */;
+
+Neo4jText2CypherRetriever.builder()
+    .graph(neo4jGraph)
+    .promptTemplate(/* custom prompt */)
+    .build();
+----
+
+Moreover, we can improve the results by adding few-shot examples to the prompt.
+For instance:
+[source,java]
+----
+Neo4jGraph neo4jGraph = /* Neo4jGraph instance */;
+
+List<String> examples = List.of(
+        """
+        # Which streamer has the most followers?
+        MATCH (s:Stream)
+        RETURN s.name AS streamer
+        ORDER BY s.followers DESC LIMIT 1
+        """,
+        """
+        # How many streamers are from Norway?
+        MATCH (s:Stream)-[:HAS_LANGUAGE]->(:Language {{name: 'Norwegian'}})
+        RETURN count(s) AS streamers
+        """);
+
+Neo4jText2CypherRetriever neo4jContentRetriever = Neo4jText2CypherRetriever.builder()
+    .graph(neo4jGraph)
+    .chatLanguageModel(openAiChatModel)
+    // add the above examples
+    .examples(examples)
+    .build();
+
+// retrieve the optimized results
+final String textQuery = "Which streamer from Italy has the most followers?";
+Query query = new Query(textQuery);
+List<Content> contents = neo4jContentRetriever.retrieve(query);
+
+System.out.println(contents.get(0).textSegment().text());
+// output: "The most followed Italian streamer"
+----
+
+
 == Relevant Links
 [cols="1,4"]