map NCIT biological process terms to GO #68
Hi @nicolevasilevsky, this was actually done by design, because way back we were thinking of one day reusing GO biological processes. So NCIt processes only covered the minimum wild-type processes needed for specific DL modeling, and paid more attention to pathologic processes (this has changed a bit more recently). Our last foray into the reuse question was an analysis of the various things we needed to do in order to take this to production (e.g. how to deal with deprecation, mapping, remapping on deprecation, and so on). But we were stretched thin and it's now on the back burner. I don't know whether what we did would help you, but I would be happy to go over it with you if you wish.
Thanks @fragosog! Let me check with Jim and Gaurav. :)
Hi everybody! Sorry, it took me a while to get my UMLS mapping tool running again, but I've loaded up the UMLS 2020AB (Fall) release and it should be good to go now. Before proceeding any further, let me verify that I understand what we're trying to do:
Does that sound right? If so, I think we should start by trying to complete step 2 first, so we get a list of all the NCIt terms from the OWL logical expressions that we want to try to map to GO, which I can then use as input to my program. @balhoff: could you please point me to the code that you used to generate the logical expressions in NCIt-OBO in the first place? I think it might be easier to extract the disease findings from that than to read them from the OWL logical expressions. If you want to take a stab at solving step 2 yourself, that'd be great -- I don't have any programs that do that yet, so I'd be starting from scratch, whereas you probably have more experience than I do in working with NCIt-OBO!
@gaurav I was thinking of something a bit simpler. Just extract mappings from NCIt terms to GO terms. It doesn't matter whether they are used in axioms or not.
Ah, okay! That is a lot simpler :). Do you mean something like this: https://docs.google.com/spreadsheets/d/1_ERXJJjQsHNza5MihO0EU2eegmJUOANUqbiR1nQ_ju8/edit?usp=sharing? There are 180,003 Gene Ontology concepts in the UMLS, but only 800 of those concepts appear to be mapped to NCI concepts. That seems like quite a discrepancy -- if you find any GO concepts that should be in this list but aren't, please let me know!
@gaurav yep that's perfect, thanks! There are only about 45,000 GO concepts. There must be something off about the 180,000 number. |
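For anyone who wants to reproduce this kind of extraction, here is a minimal sketch (not the tool used above). It assumes you have a licensed copy of the UMLS `MRCONSO.RRF` file, that NCIt and GO rows carry the source abbreviations `NCI` and `GO`, and that two codes count as "mapped" when they share a UMLS concept identifier (CUI). The function name `ncit_go_mappings` is made up for illustration. It also returns the number of GO rows next to the number of distinct GO codes, which is one way to check whether a count like 180,003 refers to rows (one per name or synonym) rather than to terms.

```python
from collections import defaultdict

# Column positions in MRCONSO.RRF (pipe-delimited, no header row).
CUI, SAB, CODE = 0, 11, 13


def ncit_go_mappings(lines):
    """Pair NCIt and GO codes that share a UMLS concept (CUI).

    Returns (pairs, go_row_count, distinct_go_code_count).
    """
    by_cui = defaultdict(lambda: {"NCI": set(), "GO": set()})
    go_rows = 0
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= CODE:
            continue  # skip blank or truncated lines
        sab = fields[SAB]
        if sab not in ("NCI", "GO"):
            continue  # ignore every other source vocabulary
        if sab == "GO":
            go_rows += 1
        by_cui[fields[CUI]][sab].add(fields[CODE])

    # A CUI yields a mapping only if it has at least one code from each source.
    pairs = sorted(
        (cui, nci, go)
        for cui, codes in by_cui.items()
        for nci in codes["NCI"]
        for go in codes["GO"]
    )
    go_codes = {code for codes in by_cui.values() for code in codes["GO"]}
    return pairs, go_rows, len(go_codes)
```

Usage would look something like `with open("MRCONSO.RRF", encoding="utf-8") as f: pairs, go_rows, go_codes = ncit_go_mappings(f)`, where the path is an assumption about where the release was unpacked. Sharing a CUI only says that the UMLS treats the two terms as the same concept, so the resulting pairs are candidates for review, not curated mappings.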
@gaurav create an output of the available mappings from UMLS to GO
Background:
I started working on mapping NCIT processes to GO terms in this spreadsheet.
I started by looking at disease terms that were in the 'DiseaseMapping' tab, and looking at which NCIT terms in the logical definitions could be mapped to GO terms.
I have reviewed ~50 terms so far and have noticed that not many of the terms in the NCIT equivalence axioms can be mapped to GO terms.
NCIT uses the relation 'Disease_Has_Finding', which seems to be related to all sorts of terms, including 'morphologic findings'. Some of these map to GO terms, but so far, the majority do not.