Appropriately handle race unknown #196

Merged
merged 4 commits into from
Feb 7, 2025

Collaborator

@bamader bamader commented Feb 6, 2025

Description

This PR removes the feature_iter yield for the RACE field whenever an incoming record's value for that field is UNKNOWN or ASKED_UNKNOWN. This ensures that, downstream, we don't run fuzzy string comparisons between known race values and UNKNOWN, which would award log-odds points where none are warranted.
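
As a rough illustration of the behavior described above (not the code in this PR), the sketch below shows a generator that yields nothing for uninformative race values. The `Race` enum members, the `UNINFORMATIVE_RACES` set, and the `race_feature_iter` helper are all hypothetical names chosen for this example; the real model and `feature_iter` signature may differ.

```python
import enum
from typing import Iterator, Optional


class Race(enum.Enum):
    # Hypothetical subset of race values; the project's real enum may differ.
    ASIAN = "ASIAN"
    WHITE = "WHITE"
    ASKED_UNKNOWN = "ASKED_UNKNOWN"
    UNKNOWN = "UNKNOWN"


# Values that carry no information worth comparing (assumed for this sketch).
UNINFORMATIVE_RACES = {Race.UNKNOWN, Race.ASKED_UNKNOWN}


def race_feature_iter(race: Optional[Race]) -> Iterator[str]:
    """Yield the race value for comparison, or nothing if it is missing or unknown."""
    if race is None or race in UNINFORMATIVE_RACES:
        # No yield: downstream comparison code never sees this value,
        # so it cannot contribute log-odds points.
        return
    yield race.value
```

With this shape, `list(race_feature_iter(Race.UNKNOWN))` is empty while a known value such as `Race.WHITE` is still yielded, so comparisons on known races are unaffected.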

Related Issues

#193

Additional Notes

n/a


codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.65%. Comparing base (4d0ccfa) to head (0582e82).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #196   +/-   ##
=======================================
  Coverage   97.65%   97.65%           
=======================================
  Files          32       32           
  Lines        1576     1576           
=======================================
  Hits         1539     1539           
  Misses         37       37           


Collaborator

@ericbuckley ericbuckley left a comment

@bamader would you mind adding another test or two for this? The existing test doesn't account for ASKED_UNKNOWN. It would also be good to assert that the other races are still being used appropriately.

@ericbuckley ericbuckley linked an issue Feb 6, 2025 that may be closed by this pull request
Collaborator Author

bamader commented Feb 7, 2025

@ericbuckley No problem, I've added some test cases covering the other races as well as ASKED_UNKNOWN.

@bamader bamader requested a review from ericbuckley February 7, 2025 19:51
Collaborator

@ericbuckley ericbuckley left a comment

LGTM

@bamader bamader merged commit 1507bea into main Feb 7, 2025
15 checks passed
@bamader bamader deleted the unknown-fix branch February 7, 2025 21:09
Successfully merging this pull request may close these issues.

unknown RACE values should not be compared
2 participants