Convert PAINT tool IBA XML to GAF #52

Open
4 tasks
dustine32 opened this issue Apr 29, 2021 · 3 comments

dustine32 commented Apr 29, 2021

This is the official ticket for converting the PAINT tool's IBA XML to GAF. It is related to pantherdb/features-bugs#20, which covers formatting the input that this ticket's new code will consume.

Currently being developed in https://github.com/dustine32/pthr_db_caller but will be hooked into the update pipeline to generate the release GAFs (and soon GPADs).

This should take the XML output from Anushya's PAINT tool IBA propagator and convert it to GAF 2.2 format:

$ format_xml_iba_to_gaf.py --file_xml PTHR12548.xml
UniProtKB	C3YJJ8	BRAFLDRAFT_207856	involved_in	GO:0006357	PMID:21873635	IBA	PANTHER:PTN000284480|WB:WBGene00001061|UniProtKB:Q14188|UniProtKB:Q14186|MGI:MGI:101934	P	Uncharacterized protein (Fragment)	UniProtKB:C3YJJ8|PTN001756489	protein	taxon:7739	20200808	GO_Central
UniProtKB	M4CZ83	M4CZ83	involved_in	GO:0006357	PMID:21873635	IBA	PANTHER:PTN000284480|WB:WBGene00001061|UniProtKB:Q14188|UniProtKB:Q14186|MGI:MGI:101934	P	Uncharacterized protein	UniProtKB:M4CZ83|PTN004368046	protein	taxon:51351	20200808	GO_Central
UniProtKB	F6ZPV4	TFDP1	enables	GO:0000981	PMID:21873635	IBA	PANTHER:PTN000284480|UniProtKB:Q14188|UniProtKB:Q14186	F	Uncharacterized protein	UniProtKB:F6ZPV4|PTN001394910	protein	taxon:9796	20200914	GO_Central
...
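
For reference, the column layout of the lines above can be sketched roughly as follows. This is only an illustration of the 17-column GAF 2.2 layout, not the code in pthr_db_caller: the names `GAF_COLUMNS`, `format_gaf_line` and `parse_gaf_line` are hypothetical, and the example record is copied from the third output line above. The last two columns (annotation extension and gene product form ID) are assumed to be empty for these IBA lines.

```python
# Hypothetical helper names; the real format_xml_iba_to_gaf.py may be organised differently.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by",
    "annotation_extension", "gene_product_form_id",
]


def format_gaf_line(annotation):
    """Join one annotation into a tab-separated GAF line (17 columns)."""
    fields = []
    for column in GAF_COLUMNS:
        value = annotation.get(column, "")
        # Multi-valued columns (e.g. with_from) are pipe-separated in GAF.
        if isinstance(value, (list, tuple)):
            value = "|".join(value)
        fields.append(value)
    return "\t".join(fields)


def parse_gaf_line(line):
    """Split a GAF line back into a dict, padding any missing trailing columns."""
    values = line.rstrip("\n").split("\t")
    values += [""] * (len(GAF_COLUMNS) - len(values))
    return dict(zip(GAF_COLUMNS, values))


# Example record taken from the third output line above.
example = {
    "db": "UniProtKB",
    "db_object_id": "F6ZPV4",
    "db_object_symbol": "TFDP1",
    "qualifier": "enables",
    "go_id": "GO:0000981",
    "db_reference": "PMID:21873635",
    "evidence_code": "IBA",
    "with_from": ["PANTHER:PTN000284480", "UniProtKB:Q14188", "UniProtKB:Q14186"],
    "aspect": "F",
    "db_object_name": "Uncharacterized protein",
    "db_object_synonym": ["UniProtKB:F6ZPV4", "PTN001394910"],
    "db_object_type": "protein",
    "taxon": "taxon:9796",
    "date": "20200914",
    "assigned_by": "GO_Central",
}

line = format_gaf_line(example)
print(line)
```

Keeping the column names in one list means the writer and any later checks (for example, that column 4 carries a relation) share the same definition of the layout.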

The following also needs to be worked in here:

  • Assemble together annotations from multiple input files (corresponding to PANTHER families)
  • Split annotations by organism to write to separate files (e.g. gene_association.paint_pombase, gene_association.paint_other)
  • Construct header
  • Confirm the total output matches (to a reasonable degree) the GAFs currently coming out of createGAF.pl
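
A rough sketch of how the four tasks above might fit together, under several assumptions: every function name here is hypothetical, the taxon-to-file mapping has to be supplied by the caller (the `horse` suffix in the usage lines is made up for illustration, only `paint_other` and `paint_pombase` appear in this ticket), and the comparison keys on gene product ID, GO ID and evidence code so that it does not depend on the qualifier column.

```python
from collections import defaultdict

TAXON_INDEX = 12  # zero-based position of the taxon column (GAF column 13)


def read_annotations(paths):
    """Assemble annotation lines from several per-family GAF files."""
    lines = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            lines.extend(handle.read().splitlines())
    return lines


def build_header(extra_lines=()):
    """Header lines start with '!'; the version line comes first."""
    return ["!gaf-version: 2.2"] + [f"!{text}" for text in extra_lines]


def split_by_organism(gaf_lines, taxon_to_suffix, default_suffix="other"):
    """Group annotation lines by output file name, based on the taxon column."""
    groups = defaultdict(list)
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header/comment lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) <= TAXON_INDEX:
            raise ValueError(f"too few columns: {line!r}")
        # A taxon cell may hold 'taxon:A|taxon:B'; the first ID is the organism.
        taxon = columns[TAXON_INDEX].split("|")[0]
        suffix = taxon_to_suffix.get(taxon, default_suffix)
        groups[f"gene_association.paint_{suffix}"].append(line.rstrip("\n"))
    return dict(groups)


def annotation_keys(gaf_lines):
    """Reduce each annotation to (object ID, GO ID, evidence code)."""
    keys = set()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue
        columns = line.split("\t")
        keys.add((columns[1], columns[4], columns[6]))
    return keys


def compare_outputs(new_lines, old_lines):
    """Count annotations shared between, or unique to, two sets of GAF lines."""
    new_keys, old_keys = annotation_keys(new_lines), annotation_keys(old_lines)
    return {
        "shared": len(new_keys & old_keys),
        "only_new": len(new_keys - old_keys),
        "only_old": len(old_keys - new_keys),
    }


# Usage with two shortened lines based on the example output above.
sample = [
    "UniProtKB\tF6ZPV4\tTFDP1\tenables\tGO:0000981\tPMID:21873635\tIBA\t"
    "PANTHER:PTN000284480\tF\tUncharacterized protein\tUniProtKB:F6ZPV4\t"
    "protein\ttaxon:9796\t20200914\tGO_Central",
    "UniProtKB\tC3YJJ8\tBRAFLDRAFT_207856\tinvolved_in\tGO:0006357\tPMID:21873635\tIBA\t"
    "PANTHER:PTN000284480\tP\tUncharacterized protein (Fragment)\tUniProtKB:C3YJJ8\t"
    "protein\ttaxon:7739\t20200808\tGO_Central",
]
groups = split_by_organism(sample, {"taxon:9796": "horse"})  # suffix is made up
for file_name, lines in sorted(groups.items()):
    print(file_name, len(lines))
```

Comparing sets of keys rather than raw line counts should make the "reasonable degree" check more informative, since it shows which annotations only one of the two pipelines produces.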
dustine32 self-assigned this Apr 29, 2021
dustine32 added a commit to pantherdb/pthr_db_caller that referenced this issue Apr 29, 2021

pgaudet commented Apr 30, 2021

Hi @dustine32
This is great! However, I think the example above is GAF 2.1, not 2.2? (I don't see the GP2term relations.)

Thanks, Pascale

dustine32 commented

@pgaudet Thanks for catching this! My bad, I was just so excited about my progress that these examples were pulled from the code I was developing. Unfortunately, that was before I plugged in the relation handling.

I updated the example lines to correct this!

pgaudet commented Apr 30, 2021 via email
