Flair collapse error after filtering cov step #420

Open
NHoang98 opened this issue Feb 14, 2025 · 2 comments

@NHoang98

Copy and paste the exact command you tried to run
flair collapse -g Documents/ref/Momordica_charantia/goya2_genome.fasta -q Documents/Gac/RNAseq/4.mapping/flair/Momordica_charantia/Mcha_all_corrected.bed -r Documents/Gac/RNAseq/3.filtering/Aril/15libs/*.fq -f Documents/ref/Momordica_charantia/goya2_polish_chr.gff --stringent --check_splice --annotation_reliant generate

How did you install Flair?
(We'd prefer it if you used one of the top two because they are the least likely to have package compatibility problems.)

  1. bioconda (with newly created env)

What happened?
Writing temporary files to /tmp/tmpz6eh3m8k/
Making transcript fasta using annotated gtf and genome sequence
Aligning reads to reference transcripts
Counting supporting reads for annotated transcripts
Setting up unassigned reads for flair-collapse novel isoform detection
Annotated ends extracted from GTF
Read data extracted
Single-exon genes grouped, collapsing
Renaming isoforms using gtf
Aligning reads to first-pass isoform reference
[M::mm_idx_gen::0.716*1.03] collected minimizers
[M::mm_idx_gen::0.815*1.39] sorted minimizers
[M::main::0.883*1.28] loaded/built the index for 87370 target sequence(s)
[M::mm_mapopt_update::0.940*1.27] mid_occ = 32
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 87370
[M::mm_idx_stat::0.981*1.26] distinct minimizers: 5272412 (62.62% are singletons); average occurrences: 1.774; average spacing: 5.513; total length: 51552793
[M::worker_pipeline::48.862*3.96] mapped 1427131 sequences
[M::worker_pipeline::91.053*4.01] mapped 1074126 sequences
[M::worker_pipeline::133.877*4.03] mapped 1017947 sequences
[M::worker_pipeline::178.344*4.04] mapped 1089105 sequences
[M::worker_pipeline::222.145*4.04] mapped 1028696 sequences
[M::worker_pipeline::265.911*4.04] mapped 990891 sequences
[M::worker_pipeline::311.130*4.05] mapped 1036441 sequences
[M::worker_pipeline::355.807*4.05] mapped 892957 sequences
[M::worker_pipeline::400.123*4.05] mapped 986732 sequences
[M::worker_pipeline::444.602*4.05] mapped 828357 sequences
[M::worker_pipeline::488.222*4.05] mapped 959489 sequences
[M::worker_pipeline::533.740*4.05] mapped 1257936 sequences
[M::worker_pipeline::575.507*4.05] mapped 1114138 sequences
[M::worker_pipeline::618.281*4.05] mapped 1287061 sequences
[M::worker_pipeline::661.366*4.05] mapped 1148202 sequences
[M::worker_pipeline::690.813*4.04] mapped 765094 sequences
[M::main] Version: 2.24-r1122
[M::main] CMD: minimap2 -a -t 4 -N 4 flair.collapse.firstpass.fa flair.collapse.unassigned.fasta
[M::main] Real time: 690.862 sec; CPU: 2790.606 sec; Peak RSS: 2.284 GB
Filtering isoforms by read coverage
Traceback (most recent call last):
  File "/home/cmmr/anaconda3/envs/flair/lib/python3.8/site-packages/flair/filter_collapsed_isoforms_from_annotation.py", line 229, in <module>
    if isoforms[chrom][n]['jname'][1:-1] in isoforms[chrom][n_]['jname'] and
KeyError: 'jname'
Traceback (most recent call last):
  File "/home/cmmr/anaconda3/envs/flair/bin/flair", line 10, in <module>
    sys.exit(main())
  File "/home/cmmr/anaconda3/envs/flair/lib/python3.8/site-packages/flair/flair.py", line 1035, in main
    status = collapse()
  File "/home/cmmr/anaconda3/envs/flair/lib/python3.8/site-packages/flair/flair.py", line 646, in collapse
    subprocess.check_call([sys.executable, path+'filter_collapsed_isoforms_from_annotation.py', '-s', str(min_reads),
  File "/home/cmmr/anaconda3/envs/flair/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/cmmr/anaconda3/envs/flair/bin/python', '/home/cmmr/anaconda3/envs/flair/lib/python3.8/site-packages/flair/filter_collapsed_isoforms_from_annotation.py', '-s', '3.0', '-i', 'flair.collapse.isoforms.bed', '--map_i', 'flair.collapse.isoform.read.map.txt', '-a', 'flair.collapse.annotated_transcripts.supported.bed', '--map_a', 'flair.collapse.annotated_transcripts.isoform.read.map.txt', '-o', 'flair.collapse.isoforms.bed', '--new_map', 'flair.collapse.combined.isoform.read.map.txt']' returned non-zero exit status 1.

What else do we need to know?
Hi, we are trying to construct the transcriptome of a plant species using the genome of its sister species as the reference. The reference GFF file was generated with the "Liftoff" package (https://github.com/agshumate/Liftoff), since only the previous (scaffold-level) genome assembly has been annotated.

These are the previous steps:

flair align -g Documents/ref/Momordica_charantia/goya2_genome.fasta -r Documents/Gac/RNAseq/3.filtering/Aril/15libs/*.fq -t 16 -o Documents/Gac/RNAseq/4.mapping/flair/Momordica_charantia/Mcha

flair correct -g Documents/ref/Momordica_charantia/goya2_genome.fasta -q Documents/Gac/RNAseq/4.mapping/flair/Momordica_charantia/Mcha.bed -f Documents/ref/Momordica_charantia/goya2_polish_chr.gff -o Documents/Gac/RNAseq/4.mapping/flair/Momordica_charantia/Mcha --threads 16
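
With a lifted-over annotation, it may be worth checking how many transcripts have only a single exon, since those have no splice junctions. Below is a minimal sketch for that check, assuming the file is GFF3 with `exon` features that carry a `Parent=` attribute. The function names `exon_counts` and `single_exon_transcripts` are made up for this illustration and are not part of FLAIR or Liftoff.

```python
from collections import Counter


def exon_counts(gff_lines):
    """Count exon features per parent transcript ID in GFF3-formatted lines."""
    counts = Counter()
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # GFF3 records have 9 tab-separated columns; column 3 is the feature type.
        if len(fields) != 9 or fields[2] != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            item.strip().split("=", 1)
            for item in fields[8].split(";")
            if "=" in item
        )
        # An exon may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts


def single_exon_transcripts(gff_lines):
    """Return sorted IDs of transcripts that have exactly one exon feature."""
    return sorted(tx for tx, n in exon_counts(gff_lines).items() if n == 1)
```

Passing an open file handle for the GFF (for example the one given to `-f`) to `single_exon_transcripts` would list the affected transcript IDs. This only reports them; whether they are involved in a given error still has to be confirmed separately.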

@cafelton
Collaborator

@diekhans I think this is an issue in filter_collapsed_isoforms_from_annotation.py lines 154-162 - annotated transcripts that are single exon (no splice junctions) don't get assigned a name, so the name cannot be referenced later. Is this something we can address in the 2.1 release?
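
To illustrate the failure mode described above, here is a minimal sketch, assuming isoforms are stored as nested dictionaries keyed by chromosome and isoform name, as the traceback suggests. The example data, the format of the `jname` string, and both function names are assumptions for illustration only. This is not FLAIR's actual code, and the real fix may differ (for example, assigning a name to single-exon transcripts when they are read in).

```python
# Hypothetical, simplified stand-in for the isoform table; not FLAIR's data.
# Multi-exon entries carry a 'jname' string; the single-exon entry does not.
isoforms = {
    "chr1": {
        "tx_multi": {"jname": ",100,200,300,400,", "reads": 5},
        "tx_short": {"jname": ",200,300,", "reads": 3},
        "tx_single": {"reads": 4},  # no splice junctions, so 'jname' was never set
    }
}


def unguarded(entry, other):
    """Same shape as the comparison in the traceback; raises KeyError if 'jname' is missing."""
    return entry["jname"][1:-1] in other["jname"]


def is_junction_subset(entry, other):
    """Guarded variant: entries without 'jname' never count as a subset."""
    jname = entry.get("jname")
    other_jname = other.get("jname")
    if jname is None or other_jname is None:
        return False
    return jname[1:-1] in other_jname
```

Calling `unguarded` with `tx_single` reproduces `KeyError: 'jname'`, while `is_junction_subset` returns `False` for the same pair. Whether skipping such entries is the right behaviour for the filtering step is a design decision for the maintainers.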

@diekhans diekhans self-assigned this Feb 14, 2025
@diekhans diekhans mentioned this issue Feb 14, 2025
17 tasks
@diekhans
Collaborator

diekhans commented Feb 14, 2025 via email

@diekhans diekhans added this to the FLAIR 2.1 milestone Feb 14, 2025