Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data build files following mondo release #782

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

twhetzel
Copy link
Contributor

@twhetzel twhetzel commented Feb 12, 2025

Resolves #ISSUE(s).

Overview

This PR:
Is a full data build from main following the Mondo release last week to be merged back into main.

Pre-merge checklist

Documentation

Was the documentation added/updated under docs/?

  • Yes
  • No, updates to the docs were not necessary after careful consideration

QC

Was the full pipeline run before submitting this PR using sh run.sh make build-mondo-ingest on this branch (after
docker pull obolibrary/odkfull:dev), and no errors occurred?

  • Yes
  • No, there are no functional (code-related) changes to the pipeline in the PR, so no re-run is necessary

New Packages

Were any new Python packages added?

Were any other non-Python packages added?

PR Review and Conversations Resolved

Has the PR been sufficiently reviewed by at least 1 team member of the Mondo Technical team and all threads resolved?

  • Yes

@twhetzel twhetzel requested a review from matentzn February 12, 2025 02:32
@twhetzel twhetzel added the build Mostly for build PRs: when changes only to data files post `build-mondo-ingest`; no code changes label Feb 12, 2025
@@ -200,7 +200,6 @@ http://purl.obolibrary.org/obo/MONDO_0007672 http://identifiers.org/hgnc/14373 O
http://purl.obolibrary.org/obo/MONDO_0007681 http://identifiers.org/hgnc/17098 OMIM:138800 https://omim.org/entry/606241
http://purl.obolibrary.org/obo/MONDO_0007686 http://identifiers.org/hgnc/31928 OMIM:139090 https://omim.org/entry/614169
http://purl.obolibrary.org/obo/MONDO_0007688 http://identifiers.org/hgnc/6770 OMIM:139210 https://omim.org/entry/600993
http://purl.obolibrary.org/obo/MONDO_0007690 http://identifiers.org/hgnc/2594 OMIM:139300 https://omim.org/entry/107910
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed this OMIM https://omim.org/entry/139300 no longer matches the rules for disease defining.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm debugging the OMIM pipeline and it's currently not even processing OMIM:139300 or OMIM:107910 to add associations for them. It's possible that the data has changed since your build was run, but that would be surprising since the OMIM release ran just 4 days ago. Won't be able to look into this until next week.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the disease defining association is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When did you run this? I just downloaded the file from the release and this entry is not there.

If you didn't run this before Sunday, then maybe there is some problem with mondo-ingest fetching old data.

@@ -2353,7 +2353,6 @@ http://purl.obolibrary.org/obo/MONDO_0013259 http://identifiers.org/hgnc/10013 O
http://purl.obolibrary.org/obo/MONDO_0013263 http://identifiers.org/hgnc/34383 OMIM:613428 https://omim.org/entry/613425
http://purl.obolibrary.org/obo/MONDO_0013264 http://identifiers.org/hgnc/17142 OMIM:613435 https://omim.org/entry/602432
http://purl.obolibrary.org/obo/MONDO_0013268 http://identifiers.org/hgnc/450 OMIM:613451 https://omim.org/entry/605420
http://purl.obolibrary.org/obo/MONDO_0013270 http://identifiers.org/hgnc/3811 OMIM:613454 https://omim.org/entry/164874
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MONDO_0013270 is obsolete in Mondo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. The OMIM pipeline doesn't have access to see which Mondo terms are deprecated. It will look for anything inside of these files mondo_exactmatch_omim.sssom.owl and mondo_exactmatch_omimps.sssom.owl, which it downloads from the Mondo master branch.

I don't think it would be easy to filter deprecated in the OMIM repo, but mondo-ingeset is set up much more easily to do that.

Can you make a ticket for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joeflack4 why is a ticket needed? Seeing this line removed is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry it is red not green. I was focusing more on your words -_- / I figured that the comment was an indication of an issue, rather than showing things were working as expected

@twhetzel twhetzel requested a review from joeflack4 February 12, 2025 16:20
Copy link
Contributor

@joeflack4 joeflack4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passes my general spot check, though these seem surprisingly high to me:

src/mappings/ordo.sssom.tsv | 91302 ++++++++--------------
src/ontology/unmapped/ncit-unmapped.owl | 3704 +-

ICD11 is higher because I just updated the release for the first time in many months:
.../reports/mirror_signature-icd11foundation.tsv | 822 +-

@twhetzel
Copy link
Contributor Author

Thanks Joe. Remember these results (src/mappings/ordo.sssom.tsv | 91302) include the "fix ordo" query update that was recently added and the numbers and file changes look to match up with that change.

Copy link
Member

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment but everything looks good to me!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an enormous change - is this explainable by recent changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i believe it's explainable by the recent update to the "fix ordo query"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Mostly for build PRs: when changes only to data files post `build-mondo-ingest`; no code changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants