hg38 support #3
Comments
Hi Peter, The training data was indeed hg19. In short, hg38 is not on the roadmap. The regional mutation density features (RMD; no. of SNVs per 1 Mb bin across the genome) are the only features sensitive to genomic position, and they are by far the most important ones. I'm not sure how much these 1 Mb bins will differ in hg38. Ideally CUPLR would be retrained on hg38 data. However, I don't know whether the Hartwig Medical Foundation has rerun any/all of their samples with hg38. Also, the PCAWG samples would need to be rerun through the Hartwig pipeline with hg38 (we did this, but with hg19; it took us 6 months!). So overall, retraining isn't possible in the short term. An alternative is to lift over the hg19 1 Mb bin coordinates to hg38 coordinates, count the no. of SNVs for the RMD features using the hg38 coordinates, then run the existing model. Let me know if you'd like this as an option and I can add this to Luan
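The counting step of the liftover alternative described above can be sketched as follows. This is only an illustration under stated assumptions, not CUPLR's implementation: the function name `count_snvs_per_bin`, the bin IDs, and the coordinates are all hypothetical, and the liftover itself is assumed to have been done beforehand with an external tool. One hg19 bin may map to several hg38 intervals after liftover, so each interval carries the ID of the bin it came from.

```python
from bisect import bisect_right
from collections import defaultdict


def count_snvs_per_bin(lifted_bins, snvs):
    """Count SNVs per original bin using lifted-over intervals.

    lifted_bins: list of (chrom, start, end, bin_id); 0-based, half-open,
                 non-overlapping hg38 intervals (assumed input format).
    snvs:        iterable of (chrom, pos) with 0-based hg38 positions.
    Returns a dict mapping bin_id to the SNV count.
    """
    lifted_bins = list(lifted_bins)

    # Group intervals by chromosome and sort them by start position.
    by_chrom = defaultdict(list)
    for chrom, start, end, bin_id in lifted_bins:
        by_chrom[chrom].append((start, end, bin_id))
    starts = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts[chrom] = [iv[0] for iv in intervals]

    # Every bin gets an entry, so bins without SNVs report 0.
    counts = {bin_id: 0 for _, _, _, bin_id in lifted_bins}

    for chrom, pos in snvs:
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue  # chromosome has no lifted bins
        # Last interval whose start is <= pos.
        i = bisect_right(starts[chrom], pos) - 1
        if i < 0:
            continue
        start, end, bin_id = intervals[i]
        if start <= pos < end:
            counts[bin_id] += 1  # SNVs in gaps between intervals are skipped
    return counts


# Hypothetical example: "chr1_bin1" was split into two intervals by liftover.
example_bins = [
    ("chr1", 10_000, 1_010_000, "chr1_bin0"),
    ("chr1", 1_010_000, 1_500_000, "chr1_bin1"),
    ("chr1", 1_600_000, 2_010_000, "chr1_bin1"),
]
example_snvs = [("chr1", 10_000), ("chr1", 1_550_000), ("chr1", 1_700_000)]
print(count_snvs_per_bin(example_bins, example_snvs))
```

SNVs that fall in regions with no lifted-over interval are dropped in this sketch; whether that is acceptable for the RMD features would need to be checked against the model's behaviour on real data.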
We do have fairly comprehensive test data (which was used on CUPPA before). Could be worth a shot, but we also need to think about priorities here and where to slot this in. Thanks for the feedback!
OK, I can add hg38 support to
Hi Oliver and Peter, I've now added hg38 support to
Thank you!
Hello all, Did you test on hg38, @ohofmann? Are there any limitations to running it that way, even though the training was performed on hg19, @luannnguyen? Thanks in advance :) Impressive work, by the way
@boutrys We went with a slightly different path in the end, wrapping CUPPA (which now has hg38 support) into a Nextflow pipeline, and have started testing that.
Hello! Great work on CUPLR! I am running
In fact, I get this error when doing Have you come across this before? I have BSgenome installed as well as both the hg19 and hg38 packages. EDIT: I believe I've isolated the issue to the following line in the function
Hi @luannnguyen!
Is hg38 supported? We've noticed a couple hg19-specific code chunks in featureExtractor, and (I believe?) those HMF/PCAWG training samples were hg19. Or is hg38 on the roadmap?
Cheers - Peter