This is implementation of the fixed threshold aggregation methods for disease and quantitative traits. Originally described in Morris and Zeggni 2010[^Andrew P. Morris and Eleftheria Zeggini (2010) An evaluation of statistical approaches to rare variant analysis in genetic association studies. Genetic Epidemiology doi:10.1002/gepi.20450
. http://doi.wiley.com/10.1002/gepi.20450^] and known as Gene- or Region-based Analysis of Variants of Intermediate and Low frequency (GRANVIL), the Aggregation method for rare variants codes observed genotype of a genetic region the count of minor alleles: {$$X = \sum_i^N X_i$$}
Our program implements the aggregation methods in a logistic regression framework for disease traits analysis (case control data) as BurdenBt
method, and a linear regression framework for quantitative traits analysis as BurdenQt
method. {$p$} value for aggregation method is based on asymptotic normal distribution of the Wald statistic in generalized linear models. One could incorporate a number of phenotype covariates in collapsing tests and evaluate the significance of the genetics component.
The same --NA_adjust
option is avaliable as with collapsing methods? although they slightly differ in details, as described in Auer et al 2013[^personal communication with Paul L. Auer at Fred Hutchinson Cancer Research Center^].
vtools show test BurdenBt
Name: BurdenBt
Description: Burden test for disease traits, Morris & Zeggini 2009
usage: vtools associate --method BurdenBt [-h] [--name NAME]
[--mafupper MAFUPPER]
[--alternative TAILED] [--NA_adjust]
[--moi {additive,dominant,recessive}]
Fixed threshold burden test for disease traits (Morris & Zeggini 2009).
p-value is based on the significance level of the regression coefficient for
genotypes. If --group_by option is specified, the group of variants will be
coded using the counts of variants within the group.
optional arguments:
-h, --help show this help message and exit
--name NAME Name of the test that will be appended to names of
output fields, usually used to differentiate output of
different tests, or the same test with different
parameters.
--mafupper MAFUPPER Minor allele frequency upper limit. All variants
having sample MAF<=m1 will be included in analysis.
Default set to 0.01
--alternative TAILED Alternative hypothesis is one-sided ("1") or two-sided
("2"). Default set to 1
--NA_adjust This option, if evoked, will replace missing genotype
values with a score relative to sample allele
frequencies. The association test will be adjusted to
incorporate the information. This is an effective
approach to control for type I error due to
differential degrees of missing genotypes among
samples.
--moi {additive,dominant,recessive}
Mode of inheritance. Will code genotypes as 0/1/2/NA
for additive mode, 0/1/NA for dominant or recessive
model. Default set to additive
vtools show test BurdenQt
Name: BurdenQt
Description: Burden test for quantitative traits, Morris & Zeggini 2009
usage: vtools associate --method BurdenQt [-h] [--name NAME]
[--mafupper MAFUPPER]
[--alternative TAILED] [--NA_adjust]
[--moi {additive,dominant,recessive}]
Fixed threshold burden test for quantitative traits (Morris & Zeggini 2009).
p-value is based on the significance level of the regression coefficient for
genotypes. If --group_by option is specified, the group of variants will be
coded using the counts of variants within the group.
optional arguments:
-h, --help show this help message and exit
--name NAME Name of the test that will be appended to names of
output fields, usually used to differentiate output of
different tests, or the same test with different
parameters.
--mafupper MAFUPPER Minor allele frequency upper limit. All variants
having sample MAF<=m1 will be included in analysis.
Default set to 0.01
--alternative TAILED Alternative hypothesis is one-sided ("1") or two-sided
("2"). Default set to 1
--NA_adjust This option, if evoked, will replace missing genotype
values with a score relative to sample allele
frequencies. The association test will be adjusted to
incorporate the information. This is an effective
approach to control for type I error due to
differential degrees of missing genotypes among
samples.
--moi {additive,dominant,recessive}
Mode of inheritance. Will code genotypes as 0/1/2/NA
for additive mode, 0/1/NA for dominant or recessive
model. Default set to additive
Example using snapshot
vt_ExomeAssociation
vtools associate rare status --covariates gender age bmi exposure -m "BurdenBt --name Burde\
nBt --alternative 2" --group_by name2 --to_db burdenBt -j8 > burdenBt.txt
INFO: 3180 samples are found
INFO: 2632 groups are found
INFO: Starting 8 processes to load genotypes
Loading genotypes: 100% [========================================] 3,180 33.0/s in 00:01:36
Testing for association: 100% [=========================================] 2,632/147 25.3/s in 00:01:43
INFO: Association tests on 2632 groups have completed. 147 failed.
INFO: Using annotation DB burdenBt in project test.
INFO: Annotation database used to record results of association tests. Created on Wed, 30 Jan 2013 17:39:05
vtools show fields | grep burdenBt
burdenBt.name2 name2
burdenBt.sample_size_BurdenBt sample size
burdenBt.num_variants_BurdenBt number of variants in each group (adjusted for specified MAF
burdenBt.total_mac_BurdenBt total minor allele counts in a group (adjusted for MOI)
burdenBt.beta_x_BurdenBt test statistic. In the context of regression this is estimate of
burdenBt.pvalue_BurdenBt p-value
burdenBt.wald_x_BurdenBt Wald statistic for x (beta_x/SE(beta_x))
burdenBt.beta_2_BurdenBt estimate of beta for covariate 2
burdenBt.beta_2_pvalue_BurdenBt p-value for covariate 2
burdenBt.wald_2_BurdenBt Wald statistic for covariate 2
burdenBt.beta_3_BurdenBt estimate of beta for covariate 3
burdenBt.beta_3_pvalue_BurdenBt p-value for covariate 3
burdenBt.wald_3_BurdenBt Wald statistic for covariate 3
burdenBt.beta_4_BurdenBt estimate of beta for covariate 4
burdenBt.beta_4_pvalue_BurdenBt p-value for covariate 4
burdenBt.wald_4_BurdenBt Wald statistic for covariate 4
burdenBt.beta_5_BurdenBt estimate of beta for covariate 5
burdenBt.beta_5_pvalue_BurdenBt p-value for covariate 5
burdenBt.wald_5_BurdenBt Wald statistic for covariate 5
head burdenBt.txt
name2 sample_size_BurdenBt num_variants_BurdenBt total_mac_BurdenBt beta_x_BurdenBt pvalue_BurdenBt wald_x_BurdenBt beta_2_BurdenBt beta_2_pvalue_BurdenBt wald_2_BurdenBt beta_3_BurdenBt beta_3_pvalue_BurdenBt wald_3_BurdenBt beta_4_BurdenBt beta_4_pvalue_BurdenBt wald_4_BurdenBt beta_5_BurdenBt beta_5_pvalue_BurdenBt wald_5_BurdenBt
AADACL4 3180 5 138 -0.314582 0.321174 -0.992049 -0.295836 0.0157002 -2.41581 0.031285 4.33616E-09 5.87083 0.129902 1.92805E-40 13.3137 0.437291 0.00133887 3.20752
AAMP 3180 3 35 0.00135633 0.997852 0.0026919 -0.298944 0.0146254 -2.44152 0.0312624 4.39097E-09 5.86875 0.130231 1.24946E-40 13.346 0.43547 0.00139464 3.19576
ABCG8 3180 12 152 -0.432823 0.171192 -1.36838 -0.295762 0.0157794 -2.41398 0.0314772 3.67916E-09 5.89801 0.130108 1.52929E-40 13.331 0.440976 0.001228 3.2323
ABCG5 3180 6 87 0.324674 0.3172 1.00023 -0.2988 0.0146577 -2.44073 0.0312857 4.15942E-09 5.87773 0.130409 9.33403E-41 13.3677 0.439149 0.00127711 3.22107
ABCB10 3180 6 122 0.333178 0.219379 1.22818 -0.301597 0.013796 -2.46253 0.0312644 4.40563E-09 5.8682 0.130493 9.8029E-41 13.3641 0.431826 0.00154525 3.16605
ABHD1 3180 5 29 -0.149027 0.813232 -0.236258 -0.298211 0.0148918 -2.435 0.0312405 4.49306E-09 5.86494 0.130264 1.16337E-40 13.3513 0.436326 0.001369 3.20111
ABCB6 3180 7 151 -0.00762322 0.977401 -0.028327 -0.299001 0.0146089 -2.44193 0.0312671 4.42259E-09 5.86756 0.130228 1.17642E-40 13.3505 0.435506 0.00139372 3.19595
ABI2 3180 1 25 0.982737 0.0422609 2.03094 -0.30075 0.0140623 -2.45567 0.0311325 4.9292E-09 5.84954 0.129821 1.95802E-40 13.3125 0.436794 0.00135518 3.20403
ABL2 3180 4 41 0.192361 0.698251 0.387682 -0.298745 0.0146809 -2.44016 0.0312678 4.39516E-09 5.86859 0.130322 1.10243E-40 13.3553 0.436387 0.00136405 3.20215
QQ-plot
Attach:burdenBt.jpg
vtools associate rare bmi –covariates gender age exposure -m “BurdenQt –name BurdenQt –a
lternative 2” –group_by name2 –to_db burdenQt -j8 > burdenQt.txt
INFO: 3180 samples are found INFO: 2632 groups are found INFO: Starting 8 processes to load genotypes Loading genotypes: 100% [===========================] 3,180 33.7/s in 00:01:34 Testing for association: 100% [=========================] 2,632⁄147 26.2/s in 00:01:40 INFO: Association tests on 2632 groups have completed. 147 failed. INFO: Using annotation DB burdenQt in project test. INFO: Annotation database used to record results of association tests. Created on Wed, 30 Jan 2013 23:01:44
vtools show fields | grep burdenQt
burdenQt.name2 name2 burdenQt.sample_size_BurdenQt sample size burdenQt.num_variants_BurdenQt number of variants in each group (adjusted for specified MAF burdenQt.total_mac_BurdenQt total minor allele counts in a group (adjusted for MOI) burdenQt.beta_x_BurdenQt test statistic. In the context of regression this is estimate of burdenQt.pvalue_BurdenQt p-value burdenQt.wald_x_BurdenQt Wald statistic for x (beta_x/SE(beta_x)) burdenQt.beta_2_BurdenQt estimate of beta for covariate 2 burdenQt.beta_2_pvalue_BurdenQt p-value for covariate 2 burdenQt.wald_2_BurdenQt Wald statistic for covariate 2 burdenQt.beta_3_BurdenQt estimate of beta for covariate 3 burdenQt.beta_3_pvalue_BurdenQt p-value for covariate 3 burdenQt.wald_3_BurdenQt Wald statistic for covariate 3 burdenQt.beta_4_BurdenQt estimate of beta for covariate 4 burdenQt.beta_4_pvalue_BurdenQt p-value for covariate 4 burdenQt.wald_4_BurdenQt Wald statistic for covariate 4
head burdenQt.txt
name2 sample_size_BurdenQt num_variants_BurdenQt total_mac_BurdenQt beta_x_BurdenQt pvalue_BurdenQt wald_x_BurdenQt beta_2_BurdenQt beta_2_pvalue_BurdenQt wald_2_BurdenQt beta_3_BurdenQt beta_3_pvalue_BurdenQt wald_3_BurdenQt beta_4_BurdenQt beta_4_pvalue_BurdenQt wald_4_BurdenQt AADACL4 3180 5 138 -0.461457 0.308686 -1.01815 -0.0716573 0.726877 -0.349314 0.0150768 0.0574562 1.90051 -0.939843 2.75155E-05 -4.19925 ABCB10 3180 6 122 0.119 0.814045 0.23523 -0.0795874 0.697984 -0.388079 0.0150143 0.0585374 1.89233 -0.945568 2.50587E-05 -4.2205 ABHD1 3180 5 29 0.00268703 0.997829 0.00272057 -0.0787737 0.700934 -0.384095 0.0150272 0.0583193 1.89397 -0.943247 2.5858E-05 -4.21338 ABCA4 3180 43 492 0.0820913 0.74098 0.330584 -0.0787295 0.701037 -0.383955 0.0150093 0.0586193 1.89172 -0.942885 2.59414E-05 -4.21265 ABI2 3180 1 25 1.19633 0.276415 1.0886 -0.081478 0.691101 -0.397397 0.0150043 0.0586562 1.89144 -0.941765 2.64399E-05 -4.20833 ABL2 3180 4 41 -0.613866 0.475633 -0.713429 -0.0781101 0.703263 -0.380954 0.0150498 0.0579226 1.89697 -0.945432 2.46814E-05 -4.22394 ACADL 3180 5 65 1.30339 0.0536027 1.93075 -0.0819058 0.689433 -0.39966 0.0150828 0.0572499 1.90209 -0.940465 2.6925E-05 -4.20419 ACADM 3180 4 103 0.0561593 0.916101 0.105355 -0.0778643 0.704415 -0.379401 0.0150232 0.0583868 1.89347 -0.942722 2.61415E-05 -4.2109 ACAP3 3180 3 17 0.296682 0.823678 0.222835 -0.07936 0.698787 -0.386993 0.0150257 0.0583418 1.8938 -0.942487 2.61991E-05 -4.2104
QQ-plot Attach:burdenQt.jpg
[^#^]