Memory issues with flair collapse #432
Comments
Hi! I found a workaround: I split the 'corrected.bed' output of 'flair correct' by chromosome, then split my reads so that 'flair collapse' for each chromosome only receives the reads that aligned to it. However, this produces two issues:
(1) Some reads map to other chromosomes during 'flair collapse', which shouldn't be possible given that I only input reads that aligned to the focal chromosome during 'flair align'. Could this be caused by the difference in parameters between the two minimap commands?
(2) The reads that map off-chromosome are also reported repeatedly. For example, when I run 'flair collapse' for NC_024238.1, the output bed file contains the following lines. Here you can see that these isoform models are mapped to another chromosome and are reported multiple times. There are two versions of the same transcript (LOC103458740_XM_008399755.2 and LOC103458740_XM_008399753.2), and both are repeated multiple times.
This appears to be some sort of bug, but I am not sure what its origin is.
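To put numbers on the two symptoms described above, a small script can count how many records in a 'flair collapse' output bed file fall outside the focal chromosome and how many records are repeated. This is only a sketch under assumptions: the function name `summarize_collapse_bed` is made up for illustration, the input is assumed to be tab-separated BED with the chromosome in column 1 and the isoform name in column 4, and the example records are toy data, not the real output.

```python
from collections import Counter


def summarize_collapse_bed(bed_lines, focal_chrom):
    """Count off-chromosome records and repeated records in BED lines.

    Hypothetical helper: assumes tab-separated BED with the chromosome in
    column 1 and the isoform name in column 4.
    """
    off_chrom = Counter()  # chromosome -> number of records not on focal_chrom
    seen = Counter()       # (chrom, start, end, name) -> number of occurrences
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if fields[0] != focal_chrom:
            off_chrom[fields[0]] += 1
        seen[tuple(fields[:4])] += 1
    # Keep only records that occur more than once.
    duplicates = {key: n for key, n in seen.items() if n > 1}
    return off_chrom, duplicates


# Toy records (made up for illustration only).
example = [
    "NC_024238.1\t10\t200\tisoA",
    "chrOther\t5\t90\tisoB",
    "chrOther\t5\t90\tisoB",
]
off_chrom, duplicates = summarize_collapse_bed(example, "NC_024238.1")
print(off_chrom)
print(duplicates)
```

On a real file, the same function can be called with an open file handle in place of the list, since it only iterates over lines.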
(flair) bash-4.2$ flair collapse -q ./06_FLAIR/NW_007617753.1.bed \
    -g ./guppy_genomes/ncbi_female.fa \
    -r ./03_updatebam/*.fastq \
    -o ./06_FLAIR/NW_007617753.1 \
    --gtf ./guppy_genomes/ncbi_female_sorted.gtf \
    --stringent \
    --check_splice \
    --generate_map \
    --annotation_reliant generate \
    --no_gtf_end_adjustment \
    --isoformtss \
    --trust_ends \
    -t 40
How did you install Flair?
conda create -n flair -c conda-forge -c bioconda flair
What happened?
What else do we need to know?
Hi! I am running into memory limits when trying to run flair collapse. I've split my corrected .bed file into <1GB chunks, but I still can't get past the counting step. I think this is because of the quantity of reads I have (840GB), so I am wondering whether splitting my fastq files into chunks containing only the reads that match each bed chunk would solve the issue.
More generally, I am confused as to why flair collapse realigns everything as its first step, given that my data is already aligned from flair align + correct. Is there a way to skip that step and instead input a corrected bam file and have flair collapse the isoforms from that? I didn't see this option in the docs, but given the size of my data I am wondering if there is a workaround.
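The splitting idea above (one fastq chunk per bed chunk) could be sketched roughly as follows. This is not part of FLAIR; the helper names `read_ids_by_chrom` and `filter_fastq` are hypothetical, and the sketch assumes that column 4 of the corrected bed file holds the read ID exactly as it appears as the first token of the fastq header, and that the fastq uses four lines per record.

```python
from collections import defaultdict


def read_ids_by_chrom(bed_lines):
    """Map each chromosome to the set of read IDs aligned to it.

    Assumption: tab-separated BED, chromosome in column 1, read ID in column 4.
    """
    ids = defaultdict(set)
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        ids[fields[0]].add(fields[3])
    return ids


def filter_fastq(fastq_lines, wanted_ids):
    """Yield 4-line FASTQ records whose read ID is in wanted_ids.

    Assumption: unwrapped FASTQ, i.e. exactly four lines per record.
    """
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # Header looks like "@<read_id> <optional description>".
            read_id = record[0][1:].split()[0]
            if read_id in wanted_ids:
                yield record
            record = []


# Toy input (made up for illustration only).
bed = ["chrA\t0\t100\tread1", "chrB\t0\t100\tread2"]
fastq = ["@read1 extra", "ACGT", "+", "IIII", "@read2", "TTTT", "+", "IIII"]
ids = read_ids_by_chrom(bed)
for rec in filter_fastq(fastq, ids["chrA"]):
    print("\n".join(rec))
```

Both functions stream over their input, so only the read IDs are held in memory. A read with alignments on more than one chromosome ends up in several ID sets, which may be worth checking when reads appear off-chromosome later on.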