GTEX Identifier dump #3

Open
jmcmurry opened this issue Mar 9, 2018 · 8 comments
Labels
GTEx help wanted Extra attention is needed

Comments

@jmcmurry
Contributor

jmcmurry commented Mar 9, 2018

We would like a dump of all of the GTEx identifiers in this format.

@jmcmurry jmcmurry mentioned this issue Mar 9, 2018
3 tasks
@owhite owhite added the help wanted Extra attention is needed label Mar 12, 2018
@owhite owhite added the GTEx label Apr 24, 2018
@jnedzel
Contributor

jnedzel commented Apr 25, 2018

jmcmurry, can you please give me more detail on what you want? The data commons project is mostly focused on the raw GTEx data (e.g., BAM files). So what identifier do you want? Filename? Path to the Google bucket? I really need more context to understand what it is that you want and how you are going to use it.

@ctb
Contributor

ctb commented Apr 26, 2018

@jmcmurry ^^

@jmcmurry
Contributor Author

Good question. Off the top of my head, the highest priority is anything required for search and retrieval: species, anatomy, and gene ID (or, if not the ID, at least the gene symbol). Also useful is anything that helps with filtering or making sense of the data once you have found it (like genome assembly ID).

That's my 100% naive take but don't jump to it until we get confirmation from @cmungall and others.

@jnedzel
Contributor

jnedzel commented Apr 27, 2018

I'm sorry, but I'm still completely lost as to what you want. Could we schedule a call to clarify?

@jnedzel
Contributor

jnedzel commented May 25, 2018

I've committed ID dump files for GTEx samples and subjects to this GitHub repo. I will continue working on other entities. There are some fields in your format that we don't understand, so I've left those blank:

  • We don't have URIs for many of these entities.
  • I don't know what an "outgoing URI" is.
  • I don't understand the distinction between a "Native form" ID and a "Prefixed form" ID.

@jmcmurry
Contributor Author

jmcmurry commented May 29, 2018

Thanks @jnedzel
If you reference an ID that you did not mint in house, the outgoing URI is the URI you reference.
In your particular case, it looks like you're not using any such identifiers; however, it would be great if you could map these tissue IDs to Uberon (documenting the caveat that these are not pre-mapped in situ). I have made a sample change here for two of the terms so you can see what I mean.

@jnedzel
Contributor

jnedzel commented May 30, 2018

We do have Uberon IDs. Once I confirm our mapping, I will update.

@jnedzel
Contributor

jnedzel commented Jun 6, 2018

@jmcmurry I've committed a new version of the tissue ID file, with the Uberon mapping as you requested. I've also included a separate column that contains just the Uberon ID, in addition to the outgoing URI. I can remove that column if you would prefer.
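
For readers following along, here is a minimal sketch of how a prefixed Uberon ID column and an outgoing URI column could relate to each other. It is not taken from the committed file: the row contents, the column names, and the helper `curie_to_obo_uri` are hypothetical, and it assumes the outgoing URI is the standard OBO PURL for the term.

```python
# Base of the OBO Foundry PURLs, e.g. http://purl.obolibrary.org/obo/UBERON_0002107
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_obo_uri(curie: str) -> str:
    """Turn a prefixed ID such as 'UBERON:0002107' into its OBO PURL.

    Hypothetical helper for illustration; not part of any GTEx tooling.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {curie!r}")
    # OBO PURLs join the prefix and the local ID with an underscore.
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


# Hypothetical row; the column names are assumptions, not the file's real headers.
row = {"tissue": "Liver", "uberon_id": "UBERON:0002107"}
row["outgoing_uri"] = curie_to_obo_uri(row["uberon_id"])
print(row["outgoing_uri"])
```

Under these assumptions the two columns carry the same information, so either one can be derived from the other.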
