gff3_fix.py documentation
dytk2134 edited this page Jan 2, 2018 · 9 revisions
The gff3_fix program fixes certain types of GFF3 errors.
Errors are divided into three types, based on how they can be corrected:
- automatic – no review necessary, can be fixed by a program
- semi-automatic – needs manual review, but can be fixed by a program once approved
- manual – needs to be fixed manually
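
The three categories above amount to a simple gate on whether a program may apply a fix. The following is a minimal sketch of that gate, not code from gff3_fix.py; the names `FixType` and `can_apply` and the `approved` flag are hypothetical and exist only for illustration.

```python
from enum import Enum


class FixType(Enum):
    """Hypothetical labels for the three error types listed above."""
    AUTOMATIC = "automatic"
    SEMI_AUTOMATIC = "semi-automatic"
    MANUAL = "manual"


def can_apply(fix_type, approved=False):
    """Return True if a program may apply the fix.

    automatic      -> always
    semi-automatic -> only after a manual review has approved it
    manual         -> never; a person has to make the change
    """
    if fix_type is FixType.AUTOMATIC:
        return True
    if fix_type is FixType.SEMI_AUTOMATIC:
        return approved
    return False
```

For example, `can_apply(FixType.SEMI_AUTOMATIC)` is false until a reviewer passes `approved=True`.
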
Error code | Error tag | Priority | Fix function |
---|---|---|---|
Emr0001 | Duplicate transcript found | high | remove_duplicate_trans |
Esf0003 | strand information missing | high | delete_model |
Esf0028 | Attributes must escape the percent (%) sign and any control characters | high | None |
Ema0001 | Parent feature start and end coordinates exceed those of child features | high | fix_boundary |
Ema0003 | This feature is not contained within the parent feature coordinates | high | fix_boundary |
Ema0005 | Pseudogene has invalid child feature type | high | pseudogene |
Ema0009 | Incorrectly merged gene parent? Isoforms that do not share coding sequences are found | high | split |
Emr0002 | Incorrectly split gene parent? | high | merge |
Esf0001 | Feature type may need to be changed to pseudogene | high | pseudogene |
Esf0002 | Start/Stop is not a valid 1-based integer coordinate | high | delete_model |
Esf0013 | White chars not allowed at the start of a line | high | gff3 parse |
Esf0017 | Start/End is not a valid integer | high | delete_model |
Esf0018 | Start is not less than or equal to end | high | delete_model |
Esf0022 | Features should contain 9 fields | high | delete_model |
Esf0025 | Strand has illegal characters | high | delete_model |
Ema0007 | CDS and parent feature on different strands | high | delete_model |
Ema0006 | Wrong phase | medium | fix_phase |
Esf0026 | Phase is not 0, 1, or 2, or not a valid integer | medium | fix_phase |
Esf0027 | Phase is required for all CDS features | medium | fix_phase |
Esf0014 | "##gff-version" missing from the first line | medium | add_gff3_version |
Esf0029 | Attributes must contain one and only one equal (=) sign | medium | fix_attribute |
Esf0030 | Empty attribute tag | medium | fix_attribute |
Esf0031 | Empty attribute value | medium | fix_attribute |
Esf0033 | Found ", " in a attribute, possible unescaped | medium | fix_attribute |
Esf0034 | attribute has identical values (count, value) | medium | fix_attribute |
Esf0036 | Value of a attribute contains unescaped "," | medium | fix_attribute |
Esf0020 | Version is not a valid integer | low | remove_directive |
Esf0032 | Found multiple attribute tags | low | remove_directive |
Esf0016 | ##sequence-region seqid may only appear once | low | remove_directive |
Esf0021 | Unknown directive | low | remove_directive |
Esf0041 | Unknown reserved (uppercase) attribute | low | fix_attributes |
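
The table above can be read as a lookup from error code to fix function. The sketch below copies a few rows into a dictionary and orders the fix functions by priority. It is an illustration only: `FIX_TABLE`, `PRIORITY_RANK`, and `plan_fixes` are hypothetical names, and the idea that priority decides the order in which fixes run is an assumption that this page does not state.

```python
# A few rows from the table above: error code -> (priority, fix function name).
# None means the table lists no fix function for that code.
FIX_TABLE = {
    "Emr0001": ("high", "remove_duplicate_trans"),
    "Ema0001": ("high", "fix_boundary"),
    "Ema0003": ("high", "fix_boundary"),
    "Esf0028": ("high", None),
    "Ema0006": ("medium", "fix_phase"),
    "Esf0014": ("medium", "add_gff3_version"),
    "Esf0021": ("low", "remove_directive"),
}

# Assumption: "high" fixes run before "medium", which run before "low".
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def plan_fixes(error_codes):
    """Return the fix function names to run, highest priority first, each once.

    Codes missing from FIX_TABLE and codes without a fix function are skipped.
    """
    known = [code for code in set(error_codes) if code in FIX_TABLE]
    ordered = sorted(known, key=lambda code: (PRIORITY_RANK[FIX_TABLE[code][0]], code))
    plan = []
    for code in ordered:
        fix_name = FIX_TABLE[code][1]
        if fix_name is not None and fix_name not in plan:
            plan.append(fix_name)
    return plan
```

Calling `plan_fixes(["Esf0021", "Ema0006", "Ema0003", "Ema0001"])` lists `fix_boundary` once, even though two of the reported codes map to it.
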
The following errors have no fix function (None):
Error code | Error tag |
---|---|
Ema0002 | Protein sequence contains internal stop codons |
Ema0004 | Incomplete gene feature that should contain at least one mRNA, exon, and CDS |
Ema0008 | Warning for distinct isoforms that do not share any regions |
Emr0003 | Duplicate ID |
Esf0004 | Seqid not found in any ##sequence-region |
Esf0005 | Start is less than the ##sequence-region start |
Esf0006 | End is greater than the ##sequence-region end |
Esf0007 | Seqid not found in the embedded ##FASTA |
Esf0008 | End is greater than the embedded ##FASTA sequence length |
Esf0009 | Found Ns in a feature using the embedded ##FASTA |
Esf0010 | Seqid not found in the external FASTA file |
Esf0011 | End is greater than the external FASTA sequence length |
Esf0012 | Found Ns in a feature using the external FASTA |
Esf0015 | Expecting certain fields in the feature |
Esf0019 | Version is not "3" |
Esf0023 | escape certain characters |
Esf0024 | Score is not a valid floating point number |
Esf0035 | attribute has unresolved forward reference |
Esf0037 | Target attribute should have 3 or 4 values |
Esf0038 | Start/End value of Target attribute is not a valid integer coordinate |
Esf0039 | Strand value of Target attribute has illegal characters |
Esf0040 | Value of Is_circular attribute is not "true" |
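
The second table has no fix-function column, so a report that contains these codes still needs attention after gff3_fix has run. The sketch below picks such codes out of a list of reported codes. It assumes that the missing column means no automated fix exists; `NO_FIX_CODES` and `codes_needing_manual_fix` are hypothetical names and are not part of gff3_fix.py.

```python
# Error codes from the second table above (no fix function listed).
NO_FIX_CODES = {
    "Ema0002", "Ema0004", "Ema0008", "Emr0003",
    "Esf0004", "Esf0005", "Esf0006", "Esf0007", "Esf0008", "Esf0009",
    "Esf0010", "Esf0011", "Esf0012", "Esf0015", "Esf0019", "Esf0023",
    "Esf0024", "Esf0035", "Esf0037", "Esf0038", "Esf0039", "Esf0040",
}


def codes_needing_manual_fix(reported_codes):
    """Return the reported codes that have no fix function.

    Order of first appearance is kept and duplicates are dropped.
    """
    result = []
    for code in reported_codes:
        if code in NO_FIX_CODES and code not in result:
            result.append(code)
    return result
```

For example, given the reported codes `["Esf0014", "Emr0003", "Ema0002", "Emr0003"]`, only `Emr0003` and `Ema0002` are returned, because `Esf0014` has a fix function in the first table.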