Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entity classification updates #281

Merged
merged 4 commits into from
Feb 27, 2025
Merged

Conversation

prasmussen15
Copy link
Collaborator

@prasmussen15 prasmussen15 commented Feb 27, 2025

Important

Update entity classification to use lists for entities and classifications, adjust node operations accordingly, and increment version to 0.7.5.

  • Entity Classification Updates:
    • In extract_nodes.py, EntityClassification now uses entities and entity_classifications lists instead of a dictionary.
    • Updated classify_nodes() guidelines to restrict entity types to provided ones only.
  • Node Operations:
    • In node_operations.py, updated extract_nodes() to handle new EntityClassification structure.
    • Adjusted logic to handle None and null entity types.
  • Misc:
    • Incremented version in pyproject.toml from 0.7.4 to 0.7.5.

This description was created by Ellipsis for a0175df. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 3311065 in 1 minute and 44 seconds

More details
  • Looked at 73 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 drafted comments based on config settings.
1. graphiti_core/prompts/extract_nodes.py:33
  • Draft comment:
    Field names updated to use two separate lists. Consider adding a comment explaining the index relationship between 'entities' and 'entity_classifications' for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 80%
    None
2. graphiti_core/prompts/extract_nodes.py:187
  • Draft comment:
    The updated guidelines now explicitly state to only use provided entity types. This change is clear; ensure LLM responses adhere to these constraints.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 80%
    None
3. graphiti_core/utils/maintenance/node_operations.py:166
  • Draft comment:
    Ensure the lengths of 'entities' and 'entity_classifications' match. Mismatches could lead to incorrect pairing. Consider adding a validation check or a comment to clarify expected response.
  • Reason this comment was not posted:
    Confidence changes required: 66% <= threshold 80%
    None
4. graphiti_core/utils/maintenance/node_operations.py:177
  • Draft comment:
    For label assignment, consider normalizing the classification value (e.g., converting to lowercase) to robustly handle variations like 'None', 'null', etc.
  • Reason this comment was not posted:
    Confidence changes required: 66% <= threshold 80%
    None
5. pyproject.toml:3
  • Draft comment:
    Version update to 0.7.5 looks appropriate. Ensure that any necessary documentation or changelog entries are updated to reflect the changes.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 80%
    None
6. graphiti_core/prompts/extract_nodes.py:180
  • Draft comment:
    The classify_nodes guidelines have been updated. Confirm that the numbering (now three steps) correctly reflects the intended classification logic and requirements.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 80%
    None
7. graphiti_core/utils/maintenance/node_operations.py:166
  • Draft comment:
    Using zip to combine 'entities' and 'entity_classifications' will silently truncate if lengths differ. It's recommended to check that both lists have equal length or handle mismatches explicitly.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 80%
    This is a valid concern about data integrity. If the LLM response has mismatched list lengths, data could be silently lost. The code assumes the LLM response will have matching lengths but doesn't verify this. This could lead to subtle bugs where some classifications are dropped without any error or warning. The comment suggests a concrete improvement to make the code more robust.
    The LLM response may be validated by a Pydantic model (EntityClassification) which could ensure the lists match in length, making this check redundant. The code may be relying on that validation.
    Looking at the code, while EntityClassification is used as the response_model, we don't see the model definition to confirm if it enforces matching lengths. Even with validation, being explicit about length checking would make the code more maintainable and self-documenting.
    Keep the comment as it identifies a real potential issue with data integrity and suggests a concrete improvement to make the code more robust.
8. graphiti_core/utils/maintenance/node_operations.py:177
  • Draft comment:
    Consider normalizing the entity type (e.g., converting to lowercase) before comparing to 'None' or 'null' to avoid potential case sensitivity issues.
  • Reason this comment was not posted:
    Confidence changes required: 66% <= threshold 80%
    None
9. pyproject.toml:1
  • Draft comment:
    Version bump to 0.7.5 looks correct. Also, verify that including pydantic in both dependencies and dev-dependencies is intentional to avoid duplication.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 80%
    None
10. graphiti_core/prompts/extract_nodes.py:38
  • Draft comment:
    Consider revising the description for 'entity_classifications'. Instead of 'List of entities classifications. The index of the classification should match the index of the entity it corresponds to.', it might be clearer to say something like 'List of entity classifications. The index of each classification should correspond to the index of its entity.' This will help improve readability.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 80%
    The original text is already clear and unambiguous. The suggested rewording is only slightly different and doesn't fix any actual issues. This seems like a purely subjective style preference. The comment doesn't point out any actual problems or confusion.
    Maybe clearer documentation could help future developers better understand the code structure? The suggested wording is arguably more elegant.
    While better docs are good, this change is too minor to justify a comment. The original text already clearly conveys the meaning.
    Delete this comment as it suggests a minor stylistic change that doesn't materially improve code quality or fix any actual issues.
11. graphiti_core/prompts/extract_nodes.py:110
  • Draft comment:
    In the guidelines for extract_json, the sentence '1. Always try to extract an entities that the JSON represents. This will often be something like a "name" or "user field"' has a grammatical error. Consider changing it to 'Always try to extract the entity that the JSON represents' or 'Always try to extract entities that the JSON represents' depending on the intended meaning.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. pyproject.toml:8
  • Draft comment:
    Please double-check the spelling of 'Daniel Chalef' in the authors list. It might be a typographical error if the correct spelling is different.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_zSU1S44c6TSFjMcM


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on ba36716 in 1 minute and 19 seconds

More details
  • Looked at 14 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. graphiti_core/prompts/extract_nodes.py:37
  • Draft comment:
    Good cleanup: removing commented-out tuple alternative to 'entity_classifications'. Ensure this change is intentional to avoid future confusion.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. graphiti_core/prompts/extract_nodes.py:40
  • Draft comment:
    Good cleanup! Removing the old commented code for entity_classifications improves clarity. Ensure historical reference is available via version control if needed.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    The comment is purely informative, suggesting that the historical reference is available via version control. It doesn't provide a specific code suggestion or ask for a test to be written. It also doesn't address any of the additional rules provided.
3. graphiti_core/prompts/extract_nodes.py:107
  • Draft comment:
    Typographical error: In the first guideline for extracting from JSON (line ~107), the phrase "extract an entities" should be corrected. Consider changing it to "extract entities" or "extract an entity" to improve clarity.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_whBCPNUNA6ihJLmb


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on a0175df in 48 seconds

More details
  • Looked at 12 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:14
  • Draft comment:
    Good cleanup: removal of the unused 'ast' import adheres to DRY principles.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    The comment is purely informative and does not provide any actionable feedback or suggestions. It simply states that the removal of an unused import adheres to DRY principles, which is not necessary to point out.
2. graphiti_core/utils/maintenance/node_operations.py:14
  • Draft comment:
    Unused import 'ast' removed. This cleanup is good for code simplicity.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    This comment is purely informative, praising the removal of an unused import. It doesn't provide any actionable feedback or suggestions for improvement. According to the rules, purely informative comments should be removed.

Workflow ID: wflow_IVglj3OyRoOdNDII


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@prasmussen15 prasmussen15 merged commit 6f87473 into main Feb 27, 2025
7 checks passed
@prasmussen15 prasmussen15 deleted the entity-classification-updates branch February 27, 2025 20:12
@github-actions github-actions bot locked and limited conversation to collaborators Feb 27, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants