This implements the \(C(\alpha)\) test (Neale et al 2011) for disease traits. It tests the hypothesis of rare-variant disease association under the assumption that the rare variants observed in cases and controls are a mixture of phenotypically deleterious, protective and neutral variants. Instead of using a cumulative dosage (or “burden”) summary statistic over a gene region, it directly contrasts the observed and expected distributions of minor alleles in cases and controls at each locus as evidence of an “unusual distribution”, and combines the evidence from multiple loci (whether protective or deleterious) to form the \(C(\alpha)\) statistic:
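$$T=\sum_{i=1}^m[(y_i-n_ip_0)^2-n_ip_0(1-p_0)]$$
where \(m\) is the number of variants in the group, \(n_i\) is the total number of minor alleles observed at variant \(i\), \(y_i\) is the number of those minor alleles observed in cases, and \(p_0\) is the proportion of cases in the sample.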
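As a minimal sketch of the formula above (not the program's own code), the statistic can be computed from per-variant counts. The function and argument names are hypothetical: `case_counts` stands for the \(y_i\), `total_counts` for the \(n_i\), and `p0` for \(p_0\), taken here to be the proportion of cases as in Neale et al 2011.

```python
def calpha_statistic(case_counts, total_counts, p0):
    """Sum over variants of (y_i - n_i*p0)^2 - n_i*p0*(1 - p0).

    case_counts  -- y_i: minor allele copies of variant i seen in cases
    total_counts -- n_i: minor allele copies of variant i in all samples
    p0           -- expected share of copies in cases under the null
                    (assumed here to be the proportion of cases)
    """
    if len(case_counts) != len(total_counts):
        raise ValueError("case_counts and total_counts must have equal length")
    t = 0.0
    for y_i, n_i in zip(case_counts, total_counts):
        expected = n_i * p0                 # binomial mean
        binom_var = n_i * p0 * (1 - p0)     # binomial variance
        t += (y_i - expected) ** 2 - binom_var
    return t


# Two variants with two copies each: one seen only in cases,
# the other only in controls. Both push the statistic upward.
print(calpha_statistic([2, 0], [2, 2], 0.5))  # 1.0
```

Because the deviation is squared, a variant enriched in controls contributes in the same direction as one enriched in cases, which is what lets the test combine protective and deleterious signals.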
The original paper evaluates the p-value of the test under a large-sample normal approximation, which usually does not hold for real-world data. The implementation in this program also provides a permutation-based \(C(\alpha)\) test, which is used when the parameter -p/--permutations is set greater than 0.
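The permutation approach can be sketched as follows: shuffle the case/control labels, recompute the statistic, and count how often the permuted value is at least as large as the observed one. This is an illustration under stated assumptions, not the program's implementation. The function names, the genotype layout (one row per sample, one column per variant), the one-sided comparison and the \((k+1)/(N+1)\) estimate are all assumptions of this sketch.

```python
import random


def calpha_from_genotypes(genotypes, phenotypes):
    """C(alpha) statistic from a sample-by-variant matrix of minor allele
    counts (0/1/2) and a list of 0/1 phenotypes (1 = case)."""
    p0 = sum(phenotypes) / len(phenotypes)   # proportion of cases
    t = 0.0
    for j in range(len(genotypes[0])):
        n_j = sum(row[j] for row in genotypes)
        y_j = sum(row[j] for row, ph in zip(genotypes, phenotypes) if ph == 1)
        t += (y_j - n_j * p0) ** 2 - n_j * p0 * (1 - p0)
    return t


def permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Empirical p-value: share of label shuffles whose statistic is at
    least as large as the observed one (assumed one-sided comparison)."""
    rng = random.Random(seed)
    observed = calpha_from_genotypes(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                # break genotype-phenotype link
        if calpha_from_genotypes(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: 10 cases then 10 controls; variant 0 is carried by six cases,
# variant 1 by six controls.
phen = [1] * 10 + [0] * 10
geno = [[1, 0] if i < 6 else [0, 1] if 10 <= i < 16 else [0, 0]
        for i in range(20)]
print(permutation_pvalue(geno, phen, n_perm=200))
```

Shuffling labels keeps the number of cases fixed, so \(p_0\) is the same in every permutation and only the \(y_i\) change.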
vtools show test Calpha
Name: Calpha
Description: c-alpha test for unusual distribution of variants between cases and
controls, Neale et al 2011
usage: vtools associate --method Calpha [-h] [--name NAME] [-q1 MAFUPPER]
[-q2 MAFLOWER] [-p N] [--adaptive C]
[--moi {additive,dominant,recessive}]
c-alpha test for unusual distribution of variants between cases and controls,
Neale et al 2011. It tests for deviation of variance of minor allele counts in
cases/ctrls from its exception based on binomial distribution. The statistic
is asymptotically normally distributed. p-value can be evaluated using either
permutation or asymptotic distribution as described in Neale et al 2011,
although it is recommended to use permutation to estimate a reliable p-value.
Calpha test is a two-tailed test
optional arguments:
-h, --help show this help message and exit
--name NAME Name of the test that will be appended to names of
output fields, usually used to differentiate output of
different tests, or the same test with different
parameters.
-q1 MAFUPPER, --mafupper MAFUPPER
Minor allele frequency upper limit. All variants
having sample MAF<=m1 will be included in analysis.
Default set to 0.01
-q2 MAFLOWER, --maflower MAFLOWER
Minor allele frequency lower limit. All variants
having sample MAF>m2 will be included in analysis.
Default set to 0.0
-p N, --permutations N
Number of permutations
--adaptive C Adaptive permutation using Edwin Wilson 95 percent
confidence interval for binomial distribution. The
program will compute a p-value every 1000 permutations
and compare the lower bound of the 95 percent CI of
p-value against "C", and quit permutations with the
p-value if it is larger than "C". It is recommended to
specify a "C" that is slightly larger than the
significance level for the study. To disable the
adaptive procedure, set C=1. Default is C=0.1
--moi {additive,dominant,recessive}
Mode of inheritance. Will code genotypes as 0/1/2/NA
for additive mode, 0/1/NA for dominant or recessive
model. Default set to additive
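The genotype coding described under --moi can be illustrated with a short sketch. The function name is hypothetical, and the dominant and recessive rules below are the conventional ones (at least one minor allele, and two minor alleles, respectively); they are assumed here rather than taken from the program's source. `None` plays the role of NA.

```python
def code_genotype(minor_allele_count, moi="additive"):
    """Recode a minor allele count (0, 1, 2 or None for missing)
    according to an assumed mode of inheritance."""
    if minor_allele_count is None:
        return None                                   # NA stays NA
    if moi == "additive":
        return minor_allele_count                     # 0/1/2
    if moi == "dominant":
        return 1 if minor_allele_count >= 1 else 0    # any minor allele
    if moi == "recessive":
        return 1 if minor_allele_count == 2 else 0    # two minor alleles
    raise ValueError("unknown mode of inheritance: %s" % moi)


for moi in ("additive", "dominant", "recessive"):
    print(moi, [code_genotype(g, moi) for g in (0, 1, 2, None)])
```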
vtools associate rare status -m "Calpha --name Calpha -p 5000" --group_by name2 --to_db calpha -j8 > calpha.txt
INFO: 3180 samples are found
INFO: 2632 groups are found
Loading genotypes: 100% [=====================] 3,180 27.6/s in 00:01:55
Testing for association: 100% [=====================] 2,632/591 11.6/s in 00:03:46
INFO: Association tests on 2632 groups have completed. 591 failed.
INFO: Using annotation DB calpha in project test.
INFO: Annotation database used to record results of association tests. Created on Wed, 30 Jan 2013 15:54:03
vtools show fields | grep calpha
calpha.refGene_name2 refGene_name2
calpha.sample_size_Calpha sample size
calpha.num_variants_Calpha number of variants in each group (adjusted for specified MAF
calpha.total_mac_Calpha total minor allele counts in a group (adjusted for MOI)
calpha.statistic_Calpha test statistic.
calpha.pvalue_Calpha p-value
head calpha.txt
name2 sample_size_Calpha num_variants_Calpha total_mac_Calpha statistic_Calpha pvalue_Calpha std_error_Calpha num_permutations_Calpha
AADACL4 3180 5 138 0.0229344 0.407592 1.08434 1000
AAMP 3180 3 35 -0.444631 0.601399 0.896954 1000
ABCD3 3180 3 42 -0.911816 0.93007 1.0528 1000
ABCB6 3180 7 151 -0.751779 0.757243 1.05563 1000
ABCG8 3180 12 152 -0.0149743 0.36963 0.981793 1000
ABHD1 3180 5 29 -0.744439 0.845155 1.0768 1000
ABCB10 3180 6 122 1.14261 0.12094 1.02364 2000
ABL2 3180 4 41 -0.76715 0.966034 0.866904 1000
ACADL 3180 5 65 -0.50523 0.642358 0.943209 1000
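The num_permutations_Calpha column shows 1000 or 2000 rather than the 5000 requested, which is consistent with the adaptive procedure described under --adaptive stopping early at the default C=0.1. The sketch below follows that help text, not the program's source: it checks the Wilson score interval after every batch of permutations and stops once the lower bound exceeds the cutoff. The function names, the callback `permuted_exceeds`, the use of z=1.96 and the \((k+1)/(N+1)\) p-value are assumptions of this example; the p-values in the output above happen to be consistent with that last convention (for instance 0.407592 = 408/1001).

```python
import math
import random


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion hits/n."""
    p_hat = hits / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def adaptive_permutation(permuted_exceeds, max_perm=5000, cutoff=0.1, batch=1000):
    """Run permutations in batches; stop early when the lower bound of the
    95% interval for the p-value is above `cutoff`.

    permuted_exceeds -- hypothetical callable that performs one permutation
                        and returns True if its statistic is at least as
                        extreme as the observed one.
    Returns (p_value, permutations_done).
    """
    hits = 0
    done = 0
    while done < max_perm:
        step = min(batch, max_perm - done)
        for _ in range(step):
            if permuted_exceeds():
                hits += 1
        done += step
        lower, _upper = wilson_interval(hits, done)
        if lower > cutoff:       # clearly not significant, stop early
            break
    return (hits + 1) / (done + 1), done


# A group with no signal: roughly 40% of permutations exceed the observed
# statistic, so the procedure stops after the first batch.
rng = random.Random(1)
print(adaptive_permutation(lambda: rng.random() < 0.4))
```

With cutoff=1 the lower bound can never exceed the cutoff, so all `max_perm` permutations are run, which matches the help text's note that C=1 disables the adaptive procedure.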
Benjamin M. Neale, Manuel A. Rivas, Benjamin F. Voight, David Altshuler, Bernie Devlin, Marju Orho-Melander, Sekar Kathiresan, Shaun M. Purcell, Kathryn Roeder and Mark J. Daly (2011) Testing for an Unusual Distribution of Rare Variants. PLoS Genetics doi:10.1371/journal.pgen.1001322. http://dx.plos.org/10.1371/journal.pgen.1001322