% vtools liftover -h
usage: vtools liftover [-h] [--flip] [-v STD[LOG]] build

Convert coordinates of existing variants to alternative coordinates in an
alternative reference genome. The UCSC liftover tool will be automatically
downloaded if it is not available.

positional arguments:
  build                 Name of the alternative reference genome

optional arguments:
  -h, --help            show this help message and exit
  --flip                Flip primary and alternative reference genomes so that
                        the specified build will become the primary reference
                        genome of the project.
  -v STD[LOG], --verbosity STD[LOG]
                        Output error and warning (0), info (1) and debug (2)
                        information to standard output (default to 1), and to
                        a logfile (default to 2).
Vtools provides a command, based on the UCSC liftOver tool, that maps variants from the existing reference genome to an alternative build. After the command has run, the master variant table holds the chromosome, position, reference allele and alternative allele of each variant, with coordinates in both the existing and the alternative reference genome.
An illustration of the liftover process
The command adds alt_chr and alt_pos columns to the master variant table. vtools output and vtools export can output alternative coordinates using the --build parameter.
This feature is unavailable under Windows because the UCSC liftOver tool does not support Windows.
Because the UCSC liftOver tool does not guarantee complete translation, variants that fail to map will have missing alternative coordinates.
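The effect of an incomplete translation can be pictured with a small Python sketch. It is not how vtools is implemented; the function name `attach_alt_coordinates` and the dictionary layout are hypothetical, and variant 9999 is a made-up entry that stands in for a variant that fails to map. The other two coordinates come from the example output later in this document.

```python
from typing import Dict, Optional, Tuple

Coordinate = Tuple[str, int]  # (chromosome, 1-based position)


def attach_alt_coordinates(
    variants: Dict[int, Coordinate],
    lifted: Dict[int, Coordinate],
) -> Dict[int, Tuple[Coordinate, Optional[Coordinate]]]:
    """Pair each variant with its lifted coordinate, or None if it did not map."""
    return {vid: (coord, lifted.get(vid)) for vid, coord in variants.items()}


# Primary (hg19) coordinates keyed by variant_id.
variants = {1: ("1", 14677), 52: ("1", 230047), 9999: ("1", 12345)}
# Hypothetical liftover result: variant 9999 is absent because it failed to map.
lifted = {1: ("1", 14677), 52: ("1", 260296)}

merged = attach_alt_coordinates(variants, lifted)
# Variants whose alternative coordinate is missing.
unmapped = [vid for vid, (_, alt) in merged.items() if alt is None]
print(unmapped)
```

Variants left with a missing alternative coordinate simply have nothing to report in the alternative build, which is worth checking before exporting with --build.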
% vtools init -f liftover
% vtools admin --load_snapshot vt_testData_v3
% vtools import V1-3_hg19_combine.vcf --build hg19
% vtools liftover hg38
INFO: Downloading liftOver chain file from UCSC
INFO: Exporting variants in BED format
Exporting variants: 100% [===============================] 288 110.5K/s in 00:00:00
INFO: Running UCSC liftOver tool
Updating table variant: 100% [============================] 288 780.0/s in 00:00:00
After the liftOver operation, three more fields (alt_bin, alt_chr, alt_pos) are added to the master variant table.
% vtools show table variant
Name: variant
Description: Master variant table
Creation date: May29
Command:
Fields: variant_id, bin, chr, pos, ref, alt, alt_bin, alt_chr, alt_pos
Number of variants: 1611
% vtools output variant variant_id bin chr pos ref alt alt_bin alt_chr alt_pos -l 15
variant_id, bin, chr, pos, ref, alt, alt_bin, alt_chr, alt_pos
1 585 1 14677 G A 585 1 14677
2 585 1 15820 G T 585 1 15820
... ...
52 586 1 230047 A T 586 1 260296
53 586 1 230058 T G 586 1 260307
54 586 1 231480 G C 586 1 261729
55 586 1 231504 G A 586 1 261753
56 586 1 231526 C T 586 1 261775
57 586 1 232223 C T 587 1 262472
58 586 1 234301 T C 587 1 264550
59 586 1 234308 A G 587 1 264557
... ...
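The bin and alt_bin values in the listing above are consistent with the standard UCSC binning scheme applied to a single-base interval. The sketch below assumes that scheme (this document does not state how vtools computes the field) and uses a hypothetical helper name, `ucsc_bin`.

```python
def ucsc_bin(pos: int) -> int:
    """Standard UCSC bin for the single base at 1-based position pos.

    Assumption: the bin/alt_bin columns follow the standard UCSC scheme
    (128 kb finest bins, five levels); this is not confirmed by the document.
    """
    start, end = pos - 1, pos            # 0-based, half-open interval
    offsets = [585, 73, 9, 1, 0]         # first bin number at each level
    start_bin = start >> 17              # 2**17 = 128 kb finest bin size
    end_bin = (end - 1) >> 17
    for offset in offsets:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= 3                  # each coarser level is 8x wider
        end_bin >>= 3
    raise ValueError("position out of range for the standard binning scheme")


# Positions taken from the listing: hg19 position, then lifted hg38 position.
for pos in (14677, 232223, 262472):
    print(pos, ucsc_bin(pos))
```

Under this assumption, variant 57 moves from bin 586 to alt_bin 587 because its lifted position 262472 crosses the 262144 boundary between the second and third 128 kb bins.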
% vtools show
Project name: test
Primary reference genome: hg19
Secondary reference genome: hg38
Storage method: hdf5
Variant tables: variant
Annotation databases:
% vtools liftover hg38 --flip
INFO: Downloading liftOver chain file from UCSC
INFO: Exporting variants in BED format
Exporting variants: 100% [===============================] 288 116.2K/s in 00:00:00
INFO: Running UCSC liftOver tool
INFO: Flipping primary and alternative reference genome
Updating table variant: 100% [============================] 288 612.1/s in 00:00:00
Interrupting the flipping process will leave the project unusable because the variant table would contain mixed coordinates.
% vtools show
Project name: test
Primary reference genome: hg38
Secondary reference genome: hg19
Storage method: hdf5
Variant tables: variant
Annotation databases:
% vtools output variant variant_id bin chr pos ref alt alt_bin alt_chr alt_pos -l 15
variant_id, bin, chr, pos, ref, alt, DP, alt_bin, alt_chr, alt_pos
1 585 1 14677 G A 585 1 14677
2 585 1 15820 G T 585 1 15820
... ...
52 586 1 260296 A T 586 1 230047
53 586 1 260307 T G 586 1 230058
54 586 1 261729 G C 586 1 231480
55 586 1 261753 G A 586 1 231504
56 586 1 261775 C T 586 1 231526
57 587 1 262472 C T 586 1 232223
58 587 1 264550 T C 586 1 234301
59 587 1 264557 A G 586 1 234308
... ...
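Comparing the two listings, flipping amounts to swapping the primary fields (bin, chr, pos) with the alternative fields (alt_bin, alt_chr, alt_pos) for every variant. The following Python sketch illustrates that swap on one row; `VariantRow` and `flip` are hypothetical names used only for illustration and are not part of vtools.

```python
from typing import NamedTuple, Optional


class VariantRow(NamedTuple):
    """One row of the master variant table, as shown in the listings."""
    variant_id: int
    bin: int
    chr: str
    pos: int
    ref: str
    alt: str
    alt_bin: Optional[int]
    alt_chr: Optional[str]
    alt_pos: Optional[int]


def flip(row: VariantRow) -> VariantRow:
    """Swap primary and alternative coordinates; ref and alt are unchanged."""
    return row._replace(
        bin=row.alt_bin, chr=row.alt_chr, pos=row.alt_pos,
        alt_bin=row.bin, alt_chr=row.chr, alt_pos=row.pos,
    )


# Variant 57 before the flip (hg19 primary, hg38 alternative).
before = VariantRow(57, 586, "1", 232223, "C", "T", 587, "1", 262472)
after = flip(before)
print(after)
```

If only some rows were swapped, the table would hold hg19 positions for some variants and hg38 positions for others in the same column, which is the mixed-coordinate state that an interrupted flip would leave behind.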