Nel Report Precision, Recall, and F1 scores are unreplicable #68

Open
BenLambright opened this issue Aug 16, 2024 · 0 comments
Labels
🐛B Something isn't working

Comments

@BenLambright
Contributor

Bug Description

When I run `python evaluate.py preds@dbpedia-spotlight-wrapper@aapb-collaboration-21 golds`, I get the same counts of gold and system entities as the report, but not the same precision, recall, and F1 scores: those all come out as zero or near-zero numbers.
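
For reference, scores like these are typically computed from the overlap of the gold and system link sets; here is a minimal sketch, assuming exact-match set comparison (`score_links` is a hypothetical helper, not taken from evaluate.py):

```python
# Minimal sketch of set-based precision/recall/F1, assuming the evaluator
# compares gold and system NamedEntityLink objects via equality/hashing.
# If that equality contract changed, the intersection collapses and all
# three scores drop to (near) zero even when the entity counts match.
def score_links(golds: set, preds: set) -> tuple[float, float, float]:
    tp = len(golds & preds)  # true positives: links present in both sets
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(golds) if golds else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Under this reading, matching counts alongside zero scores are consistent: the counts only measure set sizes, while the scores depend on membership tests.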

Reproduction steps

  1. cd into nel_eval
  2. Remove the GUID cpb-aacip-507-nk3610wp6s from both the preds and the golds; its gold data is defunct and the script errors out otherwise (see the shell sketch after this list)
  3. Run `python evaluate.py preds@dbpedia-spotlight-wrapper@aapb-collaboration-21 golds`
  4. View the results
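
For convenience, the steps above as shell commands; the assumption that preds and golds are directories holding one file per GUID is mine, so adjust the removal step to the actual layout:

```sh
cd nel_eval
# Assumption: one file per GUID in each directory; adapt to the real layout.
rm preds@dbpedia-spotlight-wrapper@aapb-collaboration-21/cpb-aacip-507-nk3610wp6s*
rm golds/cpb-aacip-507-nk3610wp6s*
python evaluate.py preds@dbpedia-spotlight-wrapper@aapb-collaboration-21 golds
```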

Expected behavior

See the report for the expected behavior.

Log output

No response

Screenshots

No response

Additional context

I have tried several ways of comparing the golds and preds (hashing, string comparison, manual inspection), and as far as I can tell, the criteria by which evaluate.py builds and matches NamedEntityLink objects for the preds and golds must have changed between the version used for the report and the current one.
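
To make the suspicion concrete: if equality/hashing on the link objects now depends on an extra or differently normalized field, set intersection between golds and preds silently empties out. A hedged sketch with a hypothetical stand-in class (the field names are illustrative, not from this repo):

```python
from dataclasses import dataclass

# Hypothetical stand-in for NamedEntityLink; fields are illustrative only.
@dataclass(frozen=True)
class Link:
    guid: str
    span: tuple[int, int]
    grounding: str  # e.g. a DBpedia URI

gold = {Link("cpb-aacip-xxx", (0, 5), "http://dbpedia.org/resource/Boston")}
# Any drift in how a field is built or normalized (casing, trailing slash,
# offset convention) breaks both __eq__ and __hash__, so the sets no
# longer intersect even though they "look" the same when printed:
pred = {Link("cpb-aacip-xxx", (0, 5), "http://dbpedia.org/resource/boston")}
print(len(gold & pred))  # 0 -> precision, recall, and F1 all (near) zero
```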

@BenLambright BenLambright added the 🐛B Something isn't working label Aug 16, 2024
@clams-bot clams-bot added this to infra Aug 16, 2024
@github-project-automation github-project-automation bot moved this to Todo in infra Aug 16, 2024