Regarding this blog:
Qdrant + Neo4j Graph RAG
Doubt about how IDs are shared between Qdrant and Neo4j
While generating the graph DB, we send the entire raw text, and GPT generates some number n of nodes from it.
As far as I understand, there is no restriction on the number of nodes, and the nodes are not ordered by the line numbers of the raw text.
But when working with the vector database, we chunk the text on line breaks and then link each chunk to an ID from the Neo4j node collection. My specific question is about the ID assignment in this snippet:
    def ingest_to_qdrant(collection_name, raw_data, node_id_mapping):
        embeddings = [openai_embeddings(paragraph) for paragraph in raw_data.split("\n")]
        qdrant_client.upsert(
            collection_name=collection_name,
            points=[
                {
                    "id": str(uuid.uuid4()),
                    "vector": embedding,
                    "payload": {"id": node_id}
                }
                for node_id, embedding in zip(node_id_mapping.values(), embeddings)
            ]
        )
Example scenario
Say the raw text contains 10 lines and the number of generated nodes is 12. How will the mapping work then?
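To make the concern concrete, here is a minimal sketch of how I think the pairing behaves. The line texts and node IDs are made up for illustration and are not from the blog, and I have left out the embedding and Qdrant calls because only the `zip` pairing matters here.

```python
# Hypothetical stand-ins: 10 lines of raw text and 12 node IDs.
raw_data = "\n".join(f"line {i}" for i in range(10))
node_id_mapping = {f"node_{i}": f"id-{i}" for i in range(12)}

chunks = raw_data.split("\n")

# Same pairing as in the snippet: zip matches items purely by position
# and stops as soon as the shorter sequence runs out.
pairs = list(zip(node_id_mapping.values(), chunks))

# Node IDs that never get paired with any chunk.
unpaired = list(node_id_mapping.values())[len(pairs):]

print(len(chunks), len(node_id_mapping), len(pairs))  # 10 12 10
print(pairs[0])   # ('id-0', 'line 0')
print(unpaired)   # ['id-10', 'id-11']
```

So, if I read it correctly, the first node ID is attached to the first line, the second to the second line, and so on, whether or not that node was actually extracted from that line, and the last two node IDs get no vector at all.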
This is just my own doubt and I may be wrong. Sorry in advance for any mistake.