-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Entity classification updates #281
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 3311065 in 1 minute and 44 seconds
More details
- Looked at
73
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
12
drafted comments based on config settings.
1. graphiti_core/prompts/extract_nodes.py:33
- Draft comment:
Field names updated to use two separate lists. Consider adding a comment explaining the index relationship between 'entities' and 'entity_classifications' for clarity. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold80%
None
2. graphiti_core/prompts/extract_nodes.py:187
- Draft comment:
The updated guidelines now explicitly state to only use provided entity types. This change is clear; ensure LLM responses adhere to these constraints. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold80%
None
3. graphiti_core/utils/maintenance/node_operations.py:166
- Draft comment:
Ensure the lengths of 'entities' and 'entity_classifications' match. Mismatches could lead to incorrect pairing. Consider adding a validation check or a comment to clarify expected response. - Reason this comment was not posted:
Confidence changes required:66%
<= threshold80%
None
4. graphiti_core/utils/maintenance/node_operations.py:177
- Draft comment:
For label assignment, consider normalizing the classification value (e.g., converting to lowercase) to robustly handle variations like 'None', 'null', etc. - Reason this comment was not posted:
Confidence changes required:66%
<= threshold80%
None
5. pyproject.toml:3
- Draft comment:
Version update to 0.7.5 looks appropriate. Ensure that any necessary documentation or changelog entries are updated to reflect the changes. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold80%
None
6. graphiti_core/prompts/extract_nodes.py:180
- Draft comment:
The classify_nodes guidelines have been updated. Confirm that the numbering (now three steps) correctly reflects the intended classification logic and requirements. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold80%
None
7. graphiti_core/utils/maintenance/node_operations.py:166
- Draft comment:
Using zip to combine 'entities' and 'entity_classifications' will silently truncate if lengths differ. It's recommended to check that both lists have equal length or handle mismatches explicitly. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 80%
This is a valid concern about data integrity. If the LLM response has mismatched list lengths, data could be silently lost. The code assumes the LLM response will have matching lengths but doesn't verify this. This could lead to subtle bugs where some classifications are dropped without any error or warning. The comment suggests a concrete improvement to make the code more robust.
The LLM response may be validated by a Pydantic model (EntityClassification) which could ensure the lists match in length, making this check redundant. The code may be relying on that validation.
Looking at the code, while EntityClassification is used as the response_model, we don't see the model definition to confirm if it enforces matching lengths. Even with validation, being explicit about length checking would make the code more maintainable and self-documenting.
Keep the comment as it identifies a real potential issue with data integrity and suggests a concrete improvement to make the code more robust.
8. graphiti_core/utils/maintenance/node_operations.py:177
- Draft comment:
Consider normalizing the entity type (e.g., converting to lowercase) before comparing to 'None' or 'null' to avoid potential case sensitivity issues. - Reason this comment was not posted:
Confidence changes required:66%
<= threshold80%
None
9. pyproject.toml:1
- Draft comment:
Version bump to 0.7.5 looks correct. Also, verify that including pydantic in both dependencies and dev-dependencies is intentional to avoid duplication. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold80%
None
10. graphiti_core/prompts/extract_nodes.py:38
- Draft comment:
Consider revising the description for 'entity_classifications'. Instead of 'List of entities classifications. The index of the classification should match the index of the entity it corresponds to.', it might be clearer to say something like 'List of entity classifications. The index of each classification should correspond to the index of its entity.' This will help improve readability. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 80%
The original text is already clear and unambiguous. The suggested rewording is only slightly different and doesn't fix any actual issues. This seems like a purely subjective style preference. The comment doesn't point out any actual problems or confusion.
Maybe clearer documentation could help future developers better understand the code structure? The suggested wording is arguably more elegant.
While better docs are good, this change is too minor to justify a comment. The original text already clearly conveys the meaning.
Delete this comment as it suggests a minor stylistic change that doesn't materially improve code quality or fix any actual issues.
11. graphiti_core/prompts/extract_nodes.py:110
- Draft comment:
In the guidelines for extract_json, the sentence '1. Always try to extract an entities that the JSON represents. This will often be something like a "name" or "user field"' has a grammatical error. Consider changing it to 'Always try to extract the entity that the JSON represents' or 'Always try to extract entities that the JSON represents' depending on the intended meaning. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. pyproject.toml:8
- Draft comment:
Please double-check the spelling of 'Daniel Chalef' in the authors list. It might be a typographical error if the correct spelling is different. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_zSU1S44c6TSFjMcM
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on ba36716 in 1 minute and 19 seconds
More details
- Looked at
14
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. graphiti_core/prompts/extract_nodes.py:37
- Draft comment:
Good cleanup: removing commented-out tuple alternative to 'entity_classifications'. Ensure this change is intentional to avoid future confusion. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. graphiti_core/prompts/extract_nodes.py:40
- Draft comment:
Good cleanup! Removing the old commented code forentity_classifications
improves clarity. Ensure historical reference is available via version control if needed. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold80%
The comment is purely informative, suggesting that the historical reference is available via version control. It doesn't provide a specific code suggestion or ask for a test to be written. It also doesn't address any of the additional rules provided.
3. graphiti_core/prompts/extract_nodes.py:107
- Draft comment:
Typographical error: In the first guideline for extracting from JSON (line ~107), the phrase "extract an entities" should be corrected. Consider changing it to "extract entities" or "extract an entity" to improve clarity. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_whBCPNUNA6ihJLmb
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on a0175df in 48 seconds
More details
- Looked at
12
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:14
- Draft comment:
Good cleanup: removal of the unused 'ast' import adheres to DRY principles. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold80%
The comment is purely informative and does not provide any actionable feedback or suggestions. It simply states that the removal of an unused import adheres to DRY principles, which is not necessary to point out.
2. graphiti_core/utils/maintenance/node_operations.py:14
- Draft comment:
Unused import 'ast' removed. This cleanup is good for code simplicity. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold80%
This comment is purely informative, praising the removal of an unused import. It doesn't provide any actionable feedback or suggestions for improvement. According to the rules, purely informative comments should be removed.
Workflow ID: wflow_IVglj3OyRoOdNDII
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Important
Update entity classification to use lists for entities and classifications, adjust node operations accordingly, and increment version to 0.7.5.
extract_nodes.py
,EntityClassification
now usesentities
andentity_classifications
lists instead of a dictionary.classify_nodes()
guidelines to restrict entity types to provided ones only.node_operations.py
, updatedextract_nodes()
to handle newEntityClassification
structure.None
andnull
entity types.pyproject.toml
from0.7.4
to0.7.5
.This description was created by
for a0175df. It will automatically update as commits are pushed.