-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search optimizations #280
Merged
Merged
Search optimizations #280
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
paul-paliychuk
approved these changes
Feb 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 9acc4a2 in 2 minutes and 9 seconds
More details
- Looked at
74
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
11
drafted comments based on config settings.
1. graphiti_core/search/search_utils.py:232
- Draft comment:
Ensure indentation consistency in the multiline Cypher query string formatting. - Reason this comment was not posted:
Confidence changes required:80%
<= threshold80%
None
2. graphiti_core/search/search_utils.py:768
- Draft comment:
Conditionally re-adding the center UUID is now based on its presence in the original list. Confirm this behavior is intended. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =30%
<= threshold80%
The comment is asking the PR author to confirm their intention, which violates the rule against asking for confirmation of intended behavior. However, it does point out a specific change in behavior, which could be useful if rephrased to suggest a review of the logic or to ensure that the change aligns with the intended functionality.
3. graphiti_core/utils/maintenance/node_operations.py:175
- Draft comment:
Avoid comparing 'entity_type' against the string 'None'; it's better to handle type conversion earlier. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 80%
The current code handles two valid cases: when entity_type is None (from dict.get()) and when entity_type is the string "None" from classification. Removing the string comparison would make the code handle fewer valid cases. The comment seems to misunderstand that both cases are intentional and valid. The code is actually more robust by handling both cases.
Maybe there's a deeper issue with why the classification sometimes returns "None" as a string that should be fixed upstream?
Even if there is an upstream issue, this code correctly and defensively handles both cases, making it more robust. Removing the string check would make the code more brittle.
The comment should be deleted because its suggestion would make the code handle fewer valid cases. The current code is correct in checking for both None and "None".
4. graphiti_core/search/search_utils.py:232
- Draft comment:
Review the query string formatting: the extra indentation in the triple-quoted string may be unintended. Confirm that additional spaces do not affect the Cypher query parsing. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =40%
<= threshold80%
The comment is asking the author to confirm the impact of extra indentation in a query string. This is a request for confirmation, which violates the rule against asking the author to confirm their intention. However, it does point out a potential issue with the query string formatting, which could be useful. I need to decide if the comment is more about asking for confirmation or if it provides a useful suggestion.
5. graphiti_core/search/search_utils.py:768
- Draft comment:
The conditional addition of the center UUID now checks 'if center_node_uuid in node_uuids'. Consider checking if it's missing from the filtered list (e.g., 'if center_node_uuid not in filtered_uuids') to more explicitly capture the intention. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 80%
The current code checks if center_node_uuid was in the original input list (node_uuids) to decide whether to add it back. The suggested change would check if it's missing from filtered_uuids. Since we explicitly filter out center_node_uuid on line 739, checking filtered_uuids would be more direct and clearer about the intent - we want to add it back if it's not already there. The current code could theoretically have a bug if center_node_uuid was in node_uuids but somehow didn't get filtered out.
The current code might actually be intentional - maybe we only want to add back center_node_uuid if it was in the original input list, regardless of whether it's in filtered_uuids or not.
Looking at the code more carefully, the comment on line 768 says "add back in filtered center uuid if it was filtered out", which suggests the intent is to restore it if it's missing from filtered_uuids, not to check if it was in the original input.
The comment suggests a valid improvement that would make the code more directly match its stated intent in the comment.
6. graphiti_core/utils/maintenance/node_operations.py:175
- Draft comment:
The label assignment now checks for both None and the string 'None'. This workaround may hide upstream data issues. Consider normalizing the entity type so that it’s consistently None when not set. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 80%
The comment makes a valid point about data consistency - having both None and 'None' represent the same concept is not ideal. However, this change appears to be a pragmatic fix to handle both cases, likely because the upstream data source might provide either format. The suggestion to normalize upstream is speculative and may not be feasible.
The comment could be too speculative - we don't know if normalizing upstream is possible or if this dual handling is actually the correct solution for the system's needs.
While data consistency is important, this comment is more speculative than actionable, and the current change might be the most practical solution given the constraints.
The comment should be deleted as it's speculative and doesn't provide a clear, actionable improvement to the current code change.
7. tests/test_graphiti_int.py:75
- Draft comment:
An explicit 'search_filter=SearchFilters(node_labels=["Entity"])' has been added. Ensure this filter covers all intended scenarios in integration tests. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold80%
This comment is asking the PR author to ensure that the filter covers all intended scenarios in integration tests, which violates the rule against asking the author to ensure their change is tested. It does not provide a specific suggestion or point out a specific issue with the code.
8. graphiti_core/search/search_utils.py:745
- Draft comment:
There appears to be a typographical error in the shortest path query in the node_distance_reranker function. The query uses "MATCH p = SHORTEST 1 (center:Entity {uuid: $center_uuid})-[:RELATES_TO]-+(n:Entity {uuid: node_uuid})" where the '1' seems out of place. Consider reviewing this portion to ensure it matches the intended Cypher syntax (possibly intended to use shortestPath or similar). - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. graphiti_core/search/search_utils.py:129
- Draft comment:
In the Cypher query within the get_communities_by_nodes function, the SELECT clause appears to be missing a comma between 'c.name_embedding AS name_embedding' and 'c.created_at AS created_at'. This could lead to a concatenated field and should be corrected. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. graphiti_core/utils/maintenance/node_operations.py:127
- Draft comment:
Minor: Consider renaming variables and functions using 'reflexion' (e.g.,reflexion_iterations
andextract_nodes_reflexion
) to 'reflection' to follow common modern English usage and improve consistency. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. tests/test_graphiti_int.py:41
- Draft comment:
Typographical inconsistency: The variables 'NEO4j_USER' and 'NEO4j_PASSWORD' use a lowercase 'j'. It is recommended to rename them to 'NEO4J_USER' and 'NEO4J_PASSWORD' to maintain consistent capitalization. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_aVSbrEMZoMktMA4m
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Important
Optimized search logic in
search_utils.py
andnode_operations.py
, updated version to0.7.4
, and enhanced tests intest_graphiti_int.py
.search_utils.py
, adjustededge_similarity_search()
to ensurecenter_node_uuid
is added back only if it was filtered out.extract_nodes()
innode_operations.py
to handleentity_type == 'None'
when setting labels.pyproject.toml
from0.7.3
to0.7.4
.test_graphiti_int.py
to includesearch_filter=SearchFilters(node_labels=['Entity'])
intest_graphiti_init()
.This description was created by
for 9acc4a2. It will automatically update as commits are pushed.