Formats

Supported import/update/export file formats

variant tools uses file format specification (.fmt) files to describe file formats, so that commands such as vtools import, vtools update, and vtools export know how to import data from and export data to files in these formats.

  • variant tools can import variants, variant info fields, genotypes, and genotype info fields from a file. The file must contain information about variants (chr, pos, ref, alt), although variant tools can obtain some of this information from other sources if it is missing from the file (for example, it can retrieve reference alleles from a reference genome).
  • variant tools can update variant info fields and genotype info fields for existing variants and genotypes. The file can be variant-based (contains chr, pos, ref and alt), position-based (contains chr and pos), or range-based (contains chr, starting and ending positions). In the latter two cases, a record in the input file can update multiple variants at the specified location or range.
  • variant tools can export variants, variant info fields, genotypes, and genotype info fields to a file. The format description file must define columns, which specify what and in which format to export to each column of the output file.
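
The difference between variant-based, position-based, and range-based update records can be illustrated with a small Python sketch. This is not how variant tools is implemented; the names Variant, record_kind, and matching_variants are hypothetical, the column names start and end are assumed, and the range is assumed to be inclusive at both ends.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    """A hypothetical in-memory variant (not a variant tools type)."""
    chr: str
    pos: int
    ref: str
    alt: str


def record_kind(record: Dict[str, object]) -> str:
    """Classify an update record by the columns it provides."""
    names = set(record)
    if {"chr", "pos", "ref", "alt"} <= names:
        return "variant"
    if {"chr", "start", "end"} <= names:
        return "range"
    if {"chr", "pos"} <= names:
        return "position"
    raise ValueError("record needs at least chr and pos, or chr, start and end")


def matching_variants(record: Dict[str, object],
                      variants: List[Variant]) -> List[Variant]:
    """Return the existing variants that one update record would touch."""
    kind = record_kind(record)
    if kind == "variant":
        # All four fields must match, so at most one distinct variant is hit.
        key = (record["chr"], record["pos"], record["ref"], record["alt"])
        return [v for v in variants if tuple(v) == key]
    if kind == "position":
        # Every variant at this location matches, whatever its alleles.
        return [v for v in variants
                if v.chr == record["chr"] and v.pos == record["pos"]]
    # Range-based: assumed inclusive of both the start and end positions.
    return [v for v in variants
            if v.chr == record["chr"]
            and record["start"] <= v.pos <= record["end"]]


variants = [
    Variant("1", 100, "A", "G"),
    Variant("1", 100, "A", "T"),
    Variant("1", 150, "C", "T"),
    Variant("2", 100, "A", "G"),
]
```

With this data, a variant-based record for (1, 100, A, G) matches one variant, a position-based record for chr 1 at position 100 matches both variants at that location, and a range-based record covering positions 90 to 150 on chr 1 matches three.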

variant tools can import and export data in the following formats. We try to keep the descriptions of these formats up to date, but please use commands such as

% vtools show formats
% vtools show format basic

to get the most up-to-date information about these formats.

Name Import Update Export Comment
basic Y a Y Import variants in tab-delimited format, export variants and optional variant info fields and genotypes
VCF Y Y Y Variant Call Format (VCF version 4.0 and 4.1)
CSV Y Y CSV format
ANNOVAR Y Format of ANNOVAR input file.
ANNOVAR_variant_function Y Used to import annotations from ANNOVAR .variant_function files.
ANNOVAR_exonic_variant_function Y Imports annotations from ANNOVAR-generated files of the form .exonic_variant_function.
CASAVA18_snps Y Illumina snps.txt format
CASAVA18_indels Y indels.txt from Illumina
CGA Y Complete Genomics CGA masterVarBeta$ID.tsv.bz2 file
Pileup_indel Y Pileup Indel format
MAP Y Import variants from files with only chr and pos information. Reference and alternative alleles are retrieved from dbSNP.
PLINK Y Y Import variants and sample genotypes from PLINK file format. Currently only PLINK binary file input is supported.
Polyphen2 Y Y Export data in Polyphen2 batch query format; import information from results returned by the Polyphen2 batch query server.
TPED Y
twoalleles Y Import alleles as allele 1 and allele 2, using a reference genome to determine which one is the reference allele.
rsname Y Import variants from rsnames, using the dbSNP database to look up the variants.

Customize import/export format: