feat: Return the status of unproductive contigs in is_valid function #287
vdj_ann/src/transcript.rs
Outdated
pub enum UnproductiveContigCause {
    NoCdr3,
    Misordered,
    NotFull,
    TooLarge,
}
Can we add comments above each failure state describing exactly what criteria need to be met to qualify?
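For illustration, here is a rough sketch of what per-variant doc comments could look like. The criteria in these comments are assumptions inferred from the variant names alone, not taken from the `is_valid` implementation, so they would need to be checked against the real code. The `description` helper is hypothetical.

```rust
/// Reasons a contig can be classified as unproductive.
///
/// NOTE: the criteria written on each variant are ASSUMPTIONS inferred from
/// the variant names; they must be verified against the actual checks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnproductiveContigCause {
    /// Assumed: no CDR3 could be located on the contig.
    NoCdr3,
    /// Assumed: the annotated segments are not in the expected order
    /// along the contig.
    Misordered,
    /// Assumed: the contig does not span from the start of V to the end of J.
    NotFull,
    /// Assumed: the V..J span is larger than expected.
    TooLarge,
}

impl UnproductiveContigCause {
    /// Hypothetical helper: a short human-readable label for logs or reports.
    pub fn description(&self) -> &'static str {
        match self {
            UnproductiveContigCause::NoCdr3 => "no CDR3 found",
            UnproductiveContigCause::Misordered => "segments misordered",
            UnproductiveContigCause::NotFull => "not full length",
            UnproductiveContigCause::TooLarge => "V..J span too large",
        }
    }
}
```

Keeping the criteria in doc comments on the variants means they show up in rustdoc next to the type that callers match on.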
// two passes, one for light chains and one for heavy chains
for pass in 0..2 {
    let mut m = "A";
    if pass == 1 {
The ann tuple below is all over the vdj code base, and it makes everything confusing and hard to follow. Converting it into a struct is a lot more work, but it would really help if, at least in this function, we renamed the unpacked fields to something more intuitive. I think it's defined here:
rust-toolbox/vdj_ann/src/annotate.rs
Line 1098 in 03aa9d3
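As a possible starting point for the rename, the tuple could be unpacked once into a small named struct local to this function. This is only a sketch: the five-`i32` layout and the field meanings (start on contig, match length, reference id, start on reference, mismatch count) are assumptions about the tuple, and the `Annotation` type and its field names are hypothetical.

```rust
/// Hypothetical named view of one annotation tuple.
/// ASSUMPTION: the tuple is (start on contig, match length, reference id,
/// start on reference, mismatch count); verify against annotate.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Annotation {
    pub tig_start: i32,
    pub match_len: i32,
    pub ref_id: i32,
    pub ref_start: i32,
    pub mismatches: i32,
}

impl From<(i32, i32, i32, i32, i32)> for Annotation {
    fn from(t: (i32, i32, i32, i32, i32)) -> Self {
        // Destructure once so the positional indices never leak further.
        let (tig_start, match_len, ref_id, ref_start, mismatches) = t;
        Annotation {
            tig_start,
            match_len,
            ref_id,
            ref_start,
            mismatches,
        }
    }
}

impl Annotation {
    /// Exclusive end of the match on the contig.
    pub fn tig_stop(&self) -> i32 {
        self.tig_start + self.match_len
    }
}
```

With something like this, `Annotation::from(ann[j])` followed by `a.tig_start` reads more clearly than `ann[j].0`, without changing the type used across the rest of the code base.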
if pass == 2 || n % 3 == 1 {
// on second pass, go through with checking for stop codon regardless of n % 3 value
if inner_pass == 2 || n % 3 == 1 {
I think the original code was conflating two checks: (1) full length, i.e. having a vstart and a jstop, and (2) finding a frameshift and/or a stop codon. We should split the two checks and specify them as separate variants in the enum. FYI, these are the categories described on our software support site: https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/algorithms/annotation#productive
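To make the split concrete, here is a minimal sketch of the two checks run separately, each reporting its own cause. It is not the `is_valid` logic: the `VjCheck` enum, the `classify` function, and the parameter names are hypothetical. It assumes `n` is `jstop - vstart` with `jstop` exclusive, mirrors the `n % 3 == 1` in-frame test from the diff, and scans only uppercase bases.

```rust
/// Hypothetical causes, one per check, so the two are never conflated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VjCheck {
    /// Check 1 failed: no usable vstart/jstop pair.
    NotFull,
    /// Check 2a failed: V start to J stop is out of frame.
    Frameshift,
    /// Check 2b failed: a stop codon lies between V start and J stop.
    StopCodon,
}

/// True if any in-frame codon is a stop codon (uppercase bases only).
fn has_stop_codon(cds: &[u8]) -> bool {
    cds.chunks_exact(3)
        .any(|codon| matches!(codon, b"TAA" | b"TAG" | b"TGA"))
}

/// ASSUMPTION: `jstop` is exclusive and "in frame" means
/// `(jstop - vstart) % 3 == 1`, as in the `n % 3 == 1` test in the diff.
pub fn classify(seq: &[u8], vstart: Option<usize>, jstop: Option<usize>) -> Vec<VjCheck> {
    // Check 1: full length. Without both ends, nothing else can be assessed.
    let (vs, je) = match (vstart, jstop) {
        (Some(v), Some(j)) if v < j && j <= seq.len() => (v, j),
        _ => return vec![VjCheck::NotFull],
    };

    // Check 2: frameshift and stop codon, each reported on its own.
    let mut causes = Vec::new();
    let n = je - vs;
    if n % 3 != 1 {
        causes.push(VjCheck::Frameshift);
    }
    if has_stop_codon(&seq[vs..je]) {
        causes.push(VjCheck::StopCodon);
    }
    causes
}
```

Returning a list rather than the first failure lets a contig that is both frameshifted and carries a stop codon report both causes. Whether that is wanted here is a design decision for the PR.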
63e7479 to 16a96cf
Defining failure categories and returning why a contig is not productive.
Todo:
JIRA: https://10xtech.atlassian.net/browse/CELLRANGER-7568