Create folder structure for versioning IBA GAF releases #34
Comments
To clarify, if I'm reading this correctly, this would mean that every few months somebody would need to update the PAINT GAF location (by date), while the tree location remains the same for a given PANTHER release? This would certainly meet our criteria. Possible implementation of geneontology/pipeline#86
@kltm Yep correct for all that! At IBA GAF release time, we would have to create the new folder (e.g. When we release the new 15.0 version of Panther trees we'll swap the
So I decided to add another folder level to each dated "release" folder that will contain the IBA GAFs meant for the GO pipeline. To maintain convention, this new folder is called
Intuitively, the With this extra level, we can separate out the IBA GAFs from other products we want to attach to a specific release. For sure the "IBD" file that's generated parallel to the IBAs as well as other metadata will go somewhere here (not in
@dustine32 A URL is a URL is a URL, but I'm a little confused as to what "presubmission" means in this case. These are essentially the versioned locations of the product that the GO will consume? To re-clarify, we are concerned with three things:
Noting that there is no way to get the latest release of a particular PANTHER version in this setup. Does this all sound correct to you? With this, we'd essentially have two new variables in the pipeline: PANTHER_VERSION and PANTHER_RELEASE and would thread those in.
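As a rough sketch of what threading those two values through could look like, the snippet below derives both locations from a version and a release date. The function names are hypothetical, and the path layouts are assumptions based on the example URLs given in this issue (`paint/<version>/<date>/` on the FTP site and `PANTHER<version>/globals/tree_files.tar.gz` for the trees). Any subfolder beneath the dated release folder is not modeled here.

```python
import re
from datetime import date

# Base locations taken from the example URLs in this issue.
FTP_BASE = "ftp://ftp.pantherdb.org/downloads/paint"
TREE_BASE = "http://data.pantherdb.org"


def _check_version(panther_version: str) -> None:
    # Assumption: versions look like "14.1" or "15.0".
    if not re.fullmatch(r"\d+\.\d+", panther_version):
        raise ValueError(f"unexpected PANTHER version: {panther_version!r}")


def release_folder_url(panther_version: str, panther_release: str) -> str:
    """Hypothetical helper: dated IBA GAF release folder for a PANTHER version."""
    _check_version(panther_version)
    # Require a zero-padded YYYY-MM-DD string that is also a real calendar date.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", panther_release):
        raise ValueError(f"release must be YYYY-MM-DD: {panther_release!r}")
    date.fromisoformat(panther_release)  # raises ValueError for e.g. 2017-02-30
    return f"{FTP_BASE}/{panther_version}/{panther_release}/"


def tree_files_url(panther_version: str) -> str:
    """Hypothetical helper: tree files archive for a PANTHER version."""
    _check_version(panther_version)
    return f"{TREE_BASE}/PANTHER{panther_version}/globals/tree_files.tar.gz"


if __name__ == "__main__":
    print(release_folder_url("11.1", "2017-05-11"))
    print(tree_files_url("14.1"))
```

With this shape, the version and the date are the only values that change between runs, and a malformed value fails early instead of producing a URL that silently points nowhere.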
@kltm Sorry, I'm actually not sure what is meant by "presubmission" either. I just reused the name to hopefully reduce confusion during this versioning transition, though I probably just created more confusion. @mugitty Thoughts on where the "presubmission" name came from? Re: the URL path variables, I believe this new setup should satisfy those three patterns. For example:
Will all get you the latest data. And these all currently point to:
Do you mean the latest Panther version's Also I should note the two new variables in
But you can't get the
I'd actually consider calling (I know I'm being nitpicky) the release variable PAINT_RELEASE since the date primarily reflects the PAINT curation data as of that date. Anyhow, is this flexibility required? If yes, there are ways we could support this. But it would probably require discussion over whether these different
@huaiyumi once instructed me to copy the GAF files into the 'presubmission' directory. But I don't know why it is called 'presubmission'.
@dustine32 You're right: I missed the pattern for the most current trees. I think all the uses are satisfied. I'm happy with any input as to the variable names. Mainly, I just want to prevent copy/paste mistakes from creeping in whenever we change what we're actively pointing at, with the "version" and "date" being the only things that change. Let's pencil in what I believe is your suggestion, having:
Thanks @mugitty ! I guess it's probably not that big of a deal what it's called? But just so we're all aware: if we did decide to change it, we would need to update the URLs in the paint.yaml.
@kltm Awesome thank you! The I can also set up some go-site/pipeline test branches (similar to go-site/dustine32-issue-1127 and pipeline/issue-78-test-panther-14_1) for Jenkins to run.
@dustine32 Yes, let's go ahead and do that /but/ wait until after the current release, hopefully this week. Would this be something that you could tackle? We can talk sometime next week about details and then you could switch us over and close out geneontology/pipeline#86 ?
OK @kltm , yep, that would be "fun" for me to set up. I'll wait until I hear about the release and then see when you're available to chat.
Currently we really only have one set of IBA GAFs available at any time, which is the "published", "released" version, accessible via:
We could retain this URL as a pointer to the "published", "released" version of the IBA GAFs while also creating dated folders for each release that can then be used for testing/reproducing fixes/bugs:
But then how do we highlight which version of Panther was used to generate the IBAs? Try this:
The panther version should also be in the GAF header if anyone needs more assurance:
And we already do version the Panther tree files (up to a certain point in history):
http://data.pantherdb.org/PANTHER14.1/globals/tree_files.tar.gz
http://data.pantherdb.org/PANTHER13.1/globals/tree_files.tar.gz
We have a somewhat regular history of these GAFs since late 2017 stored under an "archive" folder. Just need to move these to match the above path convention. Ex:
ftp://ftp.pantherdb.org/downloads/paint/archive/09302017/
-> ftp://ftp.pantherdb.org/downloads/paint/11.1/2017-05-11/
Note that the date discrepancy here is due to 2017-05-11 being the file generation date and 09302017 being the date the files were archived/replaced by a newer version.
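To make the move concrete, here is a minimal sketch of the mapping. It assumes the archive folder names are `MMDDYYYY` (inferred from `09302017`), and both helper names are hypothetical. Because the archive date and the generation date differ, the generation date has to be supplied separately (for example, read from the files themselves); it cannot be derived from the archive folder name.

```python
from datetime import date, datetime


def parse_archive_folder(name: str) -> date:
    """Interpret an archive folder name such as '09302017' as MMDDYYYY (assumed)."""
    if len(name) != 8 or not name.isdigit():
        raise ValueError(f"unexpected archive folder name: {name!r}")
    return datetime.strptime(name, "%m%d%Y").date()


def new_folder_path(panther_version: str, generated_on: date) -> str:
    """Relative destination under downloads/, following paint/<version>/<date>/."""
    return f"paint/{panther_version}/{generated_on.isoformat()}/"


if __name__ == "__main__":
    archived_on = parse_archive_folder("09302017")
    # The generation date comes from elsewhere, not from the archive folder name.
    destination = new_folder_path("11.1", date(2017, 5, 11))
    print(f"archive/09302017/ (archived {archived_on}) -> {destination}")
```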
I'll set this up and we can see if it satisfies the needs of the GO pipeline workflow. @mugitty Would you be affected if we moved around the contents of
ftp://ftp.pantherdb.org/downloads/paint/archive/
?