
problem adding episode with neo4j in docker-compose environment #325

@lingster

Description

I've configured graphiti to use OpenAIGenericClient() and connected it to a LiteLLM proxy, which can be configured to call GPT-4o or Gemini.
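Roughly, the setup looks like the sketch below. This is simplified from my actual code: the proxy URL, API key, and model name are placeholders, and the exact import paths/constructor arguments may differ slightly between graphiti versions.

from graphiti_core import Graphiti
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient

# All LLM calls go through a local LiteLLM proxy, which routes them to
# GPT-4o or Gemini depending on the model name (values are placeholders).
llm_client = OpenAIGenericClient(
    config=LLMConfig(
        api_key="sk-litellm-placeholder",
        model="gpt-4o",
        base_url="http://localhost:4000",  # LiteLLM proxy endpoint
    )
)

# Neo4j connection details match the docker-compose service below.
graphiti = Graphiti(
    "bolt://localhost:7687",
    "neo4j",
    "password",
    llm_client=llm_client,
)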

I'm trying to call add_episode() using the Kamala Harris example.
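The call itself is roughly the sketch below (the episode text is the one from the example; the name, source description, and timestamp are placeholder values):

from datetime import datetime, timezone

from graphiti_core.nodes import EpisodeType

# Build indices/constraints once, then add one of the example episodes.
await graphiti.build_indices_and_constraints()

await graphiti.add_episode(
    name="Kamala Harris - AG term",  # placeholder name
    episode_body="As AG, Harris was in office from January 3, 2011 – January 3, 2017",
    source=EpisodeType.text,
    source_description="Kamala Harris example",  # placeholder description
    reference_time=datetime.now(timezone.utc),
)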

But when I run it, I get these errors:

Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownPropertyKeyWarning} {category: UNRECOGNIZED} {title: The provided property key is not in the database} {description: One of the property names in your query is not available in the database, make sure you didn't misspell it or that the label is available when you run this statement in your application (the missing property name is: fact_embedding)} {position: line: 4, column: 45, offset: 250} for query:

MATCH (n:Entity)-[r:RELATES_TO]->(m:Entity)
WITH DISTINCT r, vector.similarity.cosine(r.fact_embedding, $search_vector) AS score
WHERE score > $min_score
RETURN
    r.uuid AS uuid,
    r.group_id AS group_id,
    startNode(r).uuid AS source_node_uuid,
    endNode(r).uuid AS target_node_uuid,
    r.created_at AS created_at,
    r.name AS name,
    r.fact AS fact,
    r.fact_embedding AS fact_embedding,
    r.episodes AS episodes,
    r.expired_at AS expired_at,
    r.valid_at AS valid_at,
    r.invalid_at AS invalid_at
ORDER BY score DESC
LIMIT $limit
{code: Neo.ClientError.Statement.SyntaxError} {message: Setting labels or properties dynamically is not supported. (line 4, column 9 (offset: 74))
"    SET n:$(node.labels)"
         ^}
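For the second error, a quick way to check whether the server itself rejects this kind of dynamic-label statement (independently of graphiti) is to run a similar query directly with the neo4j Python driver. The query below is only a simplified stand-in for whatever graphiti actually generates; the credentials/port are the ones from my compose file.

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    # Simplified stand-in using the same dynamic-label syntax ("SET n:$(...)")
    # that the SyntaxError above complains about.
    session.run(
        "MERGE (n:Entity {uuid: $uuid}) SET n:$($labels)",
        uuid="test-uuid",
        labels=["Entity"],
    )
driver.close()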

My Neo4j instance is started via docker compose with the following settings:

neo4j:
    image: neo4j:5.22.0
    healthcheck:
      test: wget "http://localhost:${NEO4J_PORT}" || exit 1
      interval: 1s
      timeout: 10s
      retries: 20
      start_period: 3s
    ports:
      - "7474:7474"  # HTTP
      - "${NEO4J_PORT}:${NEO4J_PORT}"  # Bolt
    volumes:
      - neo4j_data:/data
      - ./neo4j/data:/backup
      - ./neo4j/plugins:/plugins
      - ./neo4j/logs:/logs
    environment:
      - NEO4J_AUTH=neo4j/password
      - NEO4J_PLUGINS=["apoc"]
      - NEO4J_apoc_export_file_enabled=true
      - NEO4J_apoc_import_file_enabled=true
      - NEO4J_apoc_import_file_use__neo4j__config=true

I have connected graphiti to Langfuse and for some of the calls I can see the following:

{
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant that determines which facts have not been extracted from the given context\nDo not escape unicode characters.\n"
        },
        {
            "role": "user",
            "content": "\n<PREVIOUS MESSAGES>\n[\n  \"Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.\",\n  \"As AG, Harris was in office from January 3, 2011 \\u2013 January 3, 2017\",\n  \"Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.\",\n  \"As AG, Harris was in office from January 3, 2011 \\u2013 January 3, 2017\",\n  \"Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.\"\n]\n</PREVIOUS MESSAGES>\n<CURRENT MESSAGE>\nAs AG, Harris was in office from January 3, 2011 – January 3, 2017\n</CURRENT MESSAGE>\n\n<EXTRACTED ENTITIES>\n[]\n</EXTRACTED ENTITIES>\n\n<EXTRACTED FACTS>\n[]\n</EXTRACTED FACTS>\n\nGiven the above MESSAGES, list of EXTRACTED ENTITIES entities, and list of EXTRACTED FACTS; \ndetermine if any facts haven't been extracted.\n\n\nRespond with a JSON object in the following format:\n\n{\"properties\": {\"missing_facts\": {\"description\": \"facts that weren't extracted\", \"items\": {\"type\": \"string\"}, \"title\": \"Missing Facts\", \"type\": \"array\"}}, \"required\": [\"missing_facts\"], \"title\": \"MissingFacts\", \"type\": \"object\"}"
        }
    ]
}

and the LLM output returned:

{
    "content": {
        "properties": {
            "missing_facts": {
                "description": "facts that weren't extracted",
                "items": {
                    "type": "string"
                },
                "title": "Missing Facts",
                "type": "array"
            }
        },
        "required": [
            "missing_facts"
        ],
        "title": "MissingFacts",
        "type": "object"
    },
    "role": "assistant",
    "tool_calls": null,
    "function_call": null
}

So instead of returning an instance of the schema (something like {"missing_facts": ["As AG, Harris was in office from January 3, 2011 – January 3, 2017"]}), the model echoes the schema itself back, and the required facts are never returned.

I see a similar issue with the entity extraction call:

{
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant that extracts entity nodes from text. Your primary task is to identify and extract the speaker and other significant entities mentioned in the provided text.\nDo not escape unicode characters.\n"
        },
        {
            "role": "user",
            "content": "\n<TEXT>\nAs AG, Harris was in office from January 3, 2011 – January 3, 2017\n</TEXT>\n\nThe following entities were missed in a previous extraction: \nJanuary 3, 2011,\nJanuary 3, 2017,\n\nGiven the above text, extract entity nodes from the TEXT that are explicitly or implicitly mentioned:\n\nGuidelines:\n1. Extract significant entities, concepts, or actors mentioned in the conversation.\n2. Avoid creating nodes for relationships or actions.\n3. Avoid creating nodes for temporal information like dates, times or years (these will be added to edges later).\n4. Be as explicit as possible in your node names, using full names and avoiding abbreviations.\n\n\nRespond with a JSON object in the following format:\n\n{\"properties\": {\"extracted_node_names\": {\"description\": \"Name of the extracted entity\", \"items\": {\"type\": \"string\"}, \"title\": \"Extracted Node Names\", \"type\": \"array\"}}, \"required\": [\"extracted_node_names\"], \"title\": \"ExtractedNodes\", \"type\": \"object\"}"
        }
    ]
}

which returns:

{
    "content": {
        "properties": {
            "extracted_node_names": {
                "description": "Name of the extracted entity",
                "items": {
                    "type": "string"
                },
                "title": "Extracted Node Names",
                "type": "array"
            }
        },
        "required": [
            "extracted_node_names"
        ],
        "title": "ExtractedNodes",
        "type": "object"
    },
    "role": "assistant",
    "tool_calls": null,
    "function_call": null
}

However, if I revert to using OpenAIClient() with LiteLLM, this works fine (with both closed-source models and open/local Ollama models): edges/relationships are correctly created and the search query can be answered.
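For comparison, the working configuration is the same sketch as above with only the client class swapped (again simplified; placeholder values, and import paths may vary by version):

from graphiti_core.llm_client import LLMConfig, OpenAIClient

# Same LiteLLM proxy endpoint as before; only the client implementation changes.
llm_client = OpenAIClient(
    config=LLMConfig(
        api_key="sk-litellm-placeholder",
        model="gpt-4o",
        base_url="http://localhost:4000",
    )
)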
