
Getting warnings while doing data ingestion #220

Closed
Anindyadeep opened this issue Nov 27, 2024 · 1 comment
@Anindyadeep
First off, this is an amazing library; hats off for such a great community effort. While using graphiti, I encountered a lot of warnings during data ingestion.

Here is the code that I used for data ingestion:

import os
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType
from datetime import datetime
from graphiti_core.llm_client import openai_client
from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client import LLMConfig

import nest_asyncio
nest_asyncio.apply()

# I am using Neo4J AuraDB
username="neo4j"
password="xxxxxx"
url="neo4j+s://xxxxx.neo4j.io"

config = LLMConfig(
    api_key=os.environ.get("OPENAI_API_KEY"),
    model="gpt-4o-mini"
)
client = openai_client.OpenAIClient(config=config)
embedding_config = OpenAIEmbedderConfig(
    embedding_model="text-embedding-3-small",
    api_key=os.environ.get("OPENAI_API_KEY"),
    embedding_dim=1024
)
embedder = OpenAIEmbedder(config=embedding_config)

graphiti = Graphiti(
    uri=url,
    user=username,
    password=password,
    llm_client=client,
    embedder=embedder
)

await graphiti.build_indices_and_constraints() 


from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader(input_dir="./documents").load_data()
splitter = SentenceSplitter(
    chunk_size=512,
    chunk_overlap=0
)
nodes = splitter.get_nodes_from_documents(documents=documents)
nodes = nodes[:30]

for node in nodes:
    id_ = node.id_
    filename = node.metadata.get("file_name")
    text = node.text
    
    # Now make episodes
    await graphiti.add_episode(
        name=id_,
        episode_body=text,
        source=EpisodeType.text,
        source_description=filename,
        reference_time=datetime.now()
    )

When I ran this, I encountered two types of errors:

First error (I received many warnings like this one, one after another):

Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownPropertyKeyWarning} {category: UNRECOGNIZED} {title: The provided property key is not in the database} {description: One of the property names in your query is not available in the database, make sure you didn't misspell it or that the label is available when you run this statement in your application (the missing property name is: content)} {position: line: 4, column: 18, offset: 143} for query: '\n        MATCH (e:Episodic) WHERE e.valid_at <= $reference_time \n        AND ($group_ids IS NULL) OR e.group_id in $group_ids\n        RETURN e.content AS content,\n            e.created_at AS created_at,\n            e.valid_at AS valid_at,\n            e.uuid AS uuid,\n            e.group_id AS group_id,\n            e.name AS name,\n            e.source_description AS source_description,\n            e.source AS source\n        ORDER BY e.created_at DESC\n        LIMIT $num_episodes\n        '
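This `UnknownPropertyKeyWarning` is typically harmless on a fresh database: the retrieval query reads `e.content`, but Neo4j only registers a property key once at least one node has been written with it, so the server flags it as possibly misspelled until the first episode lands. If the log noise is a problem while ingesting, one option is to raise the logging threshold for the driver's notification messages. A minimal sketch, assuming the neo4j Python driver emits these through a logger named `neo4j.notifications` (raising the level on the parent `neo4j` logger would also work):

```python
import logging

# Assumption: the neo4j Python driver logs server notifications such as
# "Received notification from DBMS server: {severity: WARNING} ..." under
# the "neo4j.notifications" logger. Raising its level to ERROR hides
# WARNING-severity notifications without touching driver error logging.
logging.getLogger("neo4j.notifications").setLevel(logging.ERROR)
```

This only silences the messages; it does not change query behavior, and the warnings stop on their own once episodes with a `content` property exist in the database.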

and finally got this:

[#DC36]  _: <CONNECTION> error: Failed to read from defunct connection IPv4Address(('3c4c6c3f.databases.neo4j.io', 7687)) (ResolvedIPv4Address(('54.216.115.14', 7687))): ConnectionResetError(104, 'Connection reset by peer')

Also, ingesting just 30 chunks took around 20 minutes. Let me know if I am missing anything here, or if there is any way I can improve this and avoid these errors. Thanks!

@prasmussen15
Collaborator

These issues should now be solved. The connection error is likely caused by a stale connection in the Neo4j client. The pool's connection lifetime can be set to a lower value (for example, 200 seconds) to prevent this issue.
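The suggestion above maps onto the Neo4j Python driver's `max_connection_lifetime` setting, which retires pooled connections after the given number of seconds so the pool never reuses a connection the server (e.g. AuraDB) has already dropped as idle. Whether Graphiti's constructor exposes this knob is not shown in this thread, so this sketch just builds the driver keyword arguments separately; the helper name and the 200-second value are illustrative:

```python
def neo4j_driver_kwargs(user: str, password: str,
                        max_lifetime_s: int = 200) -> dict:
    """Build keyword arguments for neo4j.GraphDatabase.driver().

    max_connection_lifetime (seconds) makes the pool close connections
    older than this instead of reusing them, avoiding the "defunct
    connection" / ConnectionResetError(104) seen in the report above.
    """
    return {
        "auth": (user, password),
        "max_connection_lifetime": max_lifetime_s,
    }

# Usage (requires the `neo4j` package and a reachable database):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("neo4j+s://xxxxx.neo4j.io",
#                               **neo4j_driver_kwargs("neo4j", "xxxxxx"))
```

The lifetime should be set below the server side's idle timeout; 200 seconds is just the maintainer's example value, not a documented constant.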
