update lucene escaping #233

Merged
merged 2 commits into from
Dec 9, 2024

Conversation

prasmussen15
Collaborator

@prasmussen15 prasmussen15 commented Dec 6, 2024

Important

Update lucene_sanitize() to escape additional characters and adjust tests accordingly.

  • Functionality:
    • Update lucene_sanitize() in helpers.py to escape additional characters: O, R, N, T, A, D.
  • Tests:
    • Update test_lucene_sanitize() in helpers_test.py to include new escape characters in test cases.
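
For orientation, here is a minimal sketch of what the updated sanitizer might look like. It assumes a translation-table approach built with `str.maketrans`; the name `lucene_sanitize_sketch` is hypothetical, and the actual body of `lucene_sanitize()` in helpers.py may differ. Only the map entries visible in the diff (`':'`, `'\\'`, `'/'`, `'O'`) are confirmed by this PR; the rest are inferred from the test strings.

```python
def lucene_sanitize_sketch(query: str) -> str:
    """Backslash-escape characters before the text is used in a Lucene query.

    Hypothetical stand-in for lucene_sanitize(); not the project's code.
    """
    escape_map = str.maketrans(
        {
            # Lucene query-syntax characters (inferred from the test case).
            '+': r'\+',
            '-': r'\-',
            '&': r'\&',
            '|': r'\|',
            '!': r'\!',
            '(': r'\(',
            ')': r'\)',
            '{': r'\{',
            '}': r'\}',
            '[': r'\[',
            ']': r'\]',
            '^': r'\^',
            '"': r'\"',
            '~': r'\~',
            '*': r'\*',
            '?': r'\?',
            ':': r'\:',
            '\\': r'\\',
            '/': r'\/',
            # Letters added by this PR (only uppercase forms are escaped).
            'O': r'\O',
            'R': r'\R',
            'N': r'\N',
            'T': r'\T',
            'A': r'\A',
            'D': r'\D',
        }
    )
    # translate() replaces each mapped character in a single pass.
    return query.translate(escape_map)


print(lucene_sanitize_sketch('title:Tornado'))  # prints title\:\Tornado
```

Because the table is keyed on single characters, every uppercase O, R, N, T, A, or D is escaped wherever it appears, not only inside whole words.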

This description was created by Ellipsis for 8b8b9a4. It will automatically update as commits are pushed.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to 3b1205c in 30 seconds

More details
  • Looked at 17 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. graphiti_core/helpers.py:60
  • Draft comment:
    The characters 'O', 'R', 'N', 'T', 'A', 'D' are not typically special characters in Lucene syntax and do not need escaping. Consider removing these from the escape_map.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_wOBPD7BKQ8NCLHiz


@@ -57,6 +57,12 @@ def lucene_sanitize(query: str) -> str:
 ':': r'\:',
 '\\': r'\\',
 '/': r'\/',
+'O': r'\O',
Contributor

The characters 'O', 'R', 'N', 'T', 'A', 'D' are not special characters in Lucene and do not need escaping. Consider removing these from the escape map.
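
To make the trade-off in this comment concrete, the sketch below shows what per-letter escaping does to a few sample inputs. The six letters are exactly those spelling Lucene's uppercase boolean operators AND, OR, and NOT, so the PR is presumably aimed at keeping those words from being parsed as operators; that motivation is an assumption, since the PR does not state it. The helper name `escape_added_letters` is hypothetical.

```python
# The letters this PR adds to the escape map.
ADDED_LETTERS = 'ORNTAD'

# Map each letter to a backslash followed by the same letter.
LETTER_TABLE = str.maketrans({c: '\\' + c for c in ADDED_LETTERS})


def escape_added_letters(text: str) -> str:
    """Escape only the six uppercase letters, leaving everything else alone."""
    return text.translate(LETTER_TABLE)


for sample in ('cats AND dogs', 'NATO', 'Tom Ford'):
    print(escape_added_letters(sample))
# prints:
# cats \A\N\D dogs
# \N\A\T\O
# \Tom Ford
```

The operator word is neutralized, but ordinary capitalized text such as `NATO` or `Tom` picks up backslashes as well, which is the side effect the review comment is pointing at.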

@prasmussen15 prasmussen15 merged commit 732b2f3 into main Dec 9, 2024
7 checks passed
@prasmussen15 prasmussen15 deleted the lucene-escape branch December 9, 2024 15:36
@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2024
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Incremental review on 8b8b9a4 in 21 minutes and 46 seconds

More details
  • Looked at 18 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. tests/helpers_test.py:27
  • Draft comment:
    The backslash at the beginning of the string seems unnecessary and could lead to confusion. Consider removing it for clarity.
            'This has every escape character \+ \- \&\& \|\| \! \( \) \{ \} \[ \] \^ \\" \~ \* \? \: \\\\ \/',
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The copyright notice is present, and the code is generally well-structured. However, there is a minor issue with the escape character in the test case.

Workflow ID: wflow_6IDW8MYzWHYWOU5v


-'This has every secape character + - && || ! ( ) { } [ ] ^ " ~ * ? : \\ /',
-'This has every secape character \+ \- \&\& \|\| \! \( \) \{ \} \[ \] \^ \\" \~ \* \? \: \\\ \/',
+'This has every escape character + - && || ! ( ) { } [ ] ^ " ~ * ? : \\ /',
+'\This has every escape character \+ \- \&\& \|\| \! \( \) \{ \} \[ \] \^ \\" \~ \* \? \: \\\ \/',
Contributor

The expected result for the first query seems incorrect. The leading backslash is unnecessary and should be removed.
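
For context on the leading backslash: once `T` is in the escape map, the capital `T` of `This` is itself escaped, so the expected string has to start with `\T`. The check below is a sketch under that reading, using a hypothetical stand-in (`sanitize_stand_in`) rather than the project's `lucene_sanitize()`, and raw strings so that each backslash in the source is one literal backslash.

```python
# Characters assumed to be escaped after this PR: the Lucene syntax
# characters from the test string plus the six added letters.
ESCAPED_CHARS = '+-&|!(){}[]^"~*?:\\/' + 'ORNTAD'
TABLE = str.maketrans({c: '\\' + c for c in ESCAPED_CHARS})


def sanitize_stand_in(query: str) -> str:
    """Hypothetical stand-in for lucene_sanitize(); prefixes each mapped char."""
    return query.translate(TABLE)


def test_every_escape_character() -> None:
    query = r'This has every escape character + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /'
    expected = (
        r'\This has every escape character '
        r'\+ \- \&\& \|\| \! \( \) \{ \} \[ \] \^ \" \~ \* \? \: \\ \/'
    )
    assert sanitize_stand_in(query) == expected


test_every_escape_character()
```

Under this map the leading backslash in the expected value is a direct consequence of escaping `T`, so dropping it would only be correct if `T` were also removed from the escape map.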

2 participants