Formats

Lydia Buntrock edited this page Nov 3, 2020 · 9 revisions

Annotation I/O

VCF & BCF GFF GFF2 & GTF* GFF3 BED
Specific Header / Metainformation x no header no header No header
Header line x no
CHROM x seqname Reference sequence x
feature*** Name (optional)
source x x
method x type
POS x start & end start & end start & end start & end
thickStart x (optional)
thickEnd x (optional)
itemRgb x (optional)
blockCount x (optional)
blockSizes x (optional)
blockStarts x (optional)
ID x seqid
ALT x
QUAL x score score score score (optional)
strand x x x x (optional)
frame x phase phase
group x
FILTER x
INFO**** x attribute attributes**** *
FORMAT**** ** (optional) x
SAMPLES (optional) x

*The GTF is identical to GFF version 2.

***feature: the feature type name, e.g. Gene, Variation, Similarity

****INFO: Arbitrary keys are permitted, although some sub-fields are reserved (albeit optional).

**** *attributes: ID, Name, Alias, Parent, Target, Gap, Derives_from, Note, Dbxref, Ontology_term

**** **FORMAT Tags: AD, ADF, ADR, DP, EC, FT, GL, GP, GQ, GT, HQ, MQ, PL, PQ, PS
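The INFO column of VCF and the attributes column of GFF3 are both semicolon-separated lists of `key=value` entries, so a reader for either can follow the same pattern. The following is a minimal sketch, not a full parser: the function names `parse_vcf_info` and `parse_gff3_attributes` are hypothetical, all values are kept as strings, and no validation against the reserved keys listed above is attempted.

```python
from urllib.parse import unquote


def parse_vcf_info(info: str) -> dict:
    """Parse a VCF INFO column such as 'DP=14;AF=0.5;DB' into a dict."""
    if info in ("", "."):  # '.' marks a missing value in VCF
        return {}
    result = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Keys without '=' are flags; keys with '=' may hold comma-separated values.
        result[key] = value.split(",") if sep else True
    return result


def parse_gff3_attributes(attributes: str) -> dict:
    """Parse a GFF3 attributes column such as 'ID=gene1;Parent=a,b' into a dict."""
    if attributes in ("", "."):
        return {}
    result = {}
    for entry in attributes.split(";"):
        if not entry:
            continue
        key, _, value = entry.partition("=")
        # GFF3 percent-encodes reserved characters such as ';', '=' and ','.
        # Split on ',' first, then decode each value.
        result[unquote(key)] = [unquote(v) for v in value.split(",")]
    return result
```

For example, `parse_vcf_info("DP=14;DB")` yields `{"DP": ["14"], "DB": True}`. Converting values to the numeric types declared in the VCF header is deliberately left out of this sketch.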

Input / Output:

Formats:

VCF (Variant Call Format), BCF, GFF (General Feature Format), GFF2 (deprecated), GTF (General Transfer Format), GFF3, BED (Browser Extensible Data), GVF (Genome Variation Format)

HGVS vs BED vs GVF vs VCF Format example: https://www.ncbi.nlm.nih.gov/variation/tools/reporter/docs/examples#section-1.2.3

Format overviews: Broad Institute of MIT and Harvard; UCSC (University of California, Santa Cruz)

Format specification:

Desired format conversions

VCF to GFF: http://seqanswers.com/forums/showthread.php?t=9796&highlight=gff+vcf
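As an illustration of what a VCF to GFF conversion could look like, the sketch below maps one VCF data line onto the nine GFF3 columns, following the correspondence in the table above (CHROM to the sequence column, POS to start and end, QUAL to score, INFO to attributes). Several choices here are assumptions rather than anything the formats prescribe: the function name `vcf_record_to_gff3`, the default `source` value, the feature type `sequence_variant`, the lower-case attribute keys `ref`, `alt` and `vcf_info`, and the decision to compute `end` from the length of REF.

```python
from urllib.parse import quote


def vcf_record_to_gff3(line: str, source: str = "vcf",
                       feature_type: str = "sequence_variant") -> str:
    """Convert one tab-separated VCF data line into one GFF3 feature line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt, qual, _filter, info = fields[:8]

    # VCF POS and GFF3 start/end are both 1-based; the feature spans REF.
    start = int(pos)
    end = start + len(ref) - 1

    attributes = []
    if variant_id != ".":
        attributes.append(f"ID={variant_id}")
    # Assumed application-specific keys (lower case, so they do not
    # collide with the reserved GFF3 attribute names).
    attributes.append(f"ref={ref}")
    attributes.append(f"alt={alt}")
    if info != ".":
        # Percent-encode INFO so its ';' and '=' do not break the column.
        attributes.append(f"vcf_info={quote(info, safe='')}")

    # VCF carries no strand or phase information, hence '.' for both.
    columns = [chrom, source, feature_type, str(start), str(end),
               qual, ".", ".", ";".join(attributes)]
    return "\t".join(columns)


def convert_vcf_lines(lines):
    """Yield a GFF3 version pragma followed by one line per VCF record."""
    yield "##gff-version 3"
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip VCF meta-information, header line, blank lines
        yield vcf_record_to_gff3(line)
```

The FILTER, FORMAT and sample columns are dropped in this sketch; a real converter would need a policy for them, for multi-allelic records, and for symbolic alleles.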
