To run an allelic series test, there are 4 key inputs:
A numeric annotation vector, of the same length as the number of variants, coded as 0 for benign missense variants (BMVs), 1 for deleterious missense variants (DMVs), and 2 for protein truncating variants (PTVs).
A covariates matrix, with as many rows as subjects and including columns such as age and sex. If omitted, defaults to an intercept only.
A genotype matrix, with subjects as rows and variants as columns. The number of columns should correspond to the length of the annotation vector.
A numeric phenotype vector, either continuous or binary.
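The shape requirements above can be checked before running a test. The following is a minimal base-R sketch; `check_inputs` and the toy objects are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper (not part of AllelicSeries): checks that the four
# inputs agree in shape before they are passed to a test.
check_inputs <- function(anno, geno, pheno, covar = NULL) {
  n_subj <- nrow(geno)
  if (is.null(covar)) {
    # Mirror the documented default: an intercept-only design.
    covar <- matrix(1, nrow = n_subj, ncol = 1)
  }
  stopifnot(
    "anno must be coded 0, 1, or 2" = all(anno %in% c(0, 1, 2)),
    "length(anno) must equal ncol(geno)" = length(anno) == ncol(geno),
    "length(pheno) must equal nrow(geno)" = length(pheno) == n_subj,
    "nrow(covar) must equal nrow(geno)" = nrow(covar) == n_subj
  )
  invisible(TRUE)
}

# Toy inputs: 4 subjects (rows) and 3 variants (columns).
toy_anno <- c(0, 1, 2)
toy_geno <- matrix(c(0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 2), nrow = 4)
toy_pheno <- c(0.1, -0.4, 1.2, 0.7)
ok <- check_inputs(toy_anno, toy_geno, toy_pheno)
```

The named conditions in `stopifnot` require R 4.0.0 or later; on older versions the names can simply be dropped.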
The example data used below were generated using the DGP
function provided with the package. The data set includes 100 subjects,
300 variants, and a continuous phenotype. The true effect sizes follow
an allelic series, with magnitudes proportional to
c(1, 2, 3) for BMVs, DMVs, and PTVs respectively.
set.seed(101)
n <- 100
data <- AllelicSeries::DGP(
  n = n,
  snps = 300,
  beta = c(1, 2, 3) / sqrt(n)
)

# Annotations.
anno <- data$anno
head(anno)
## [1] 0 0 0 1 1 0

# Covariates.
covar <- data$covar
head(covar)
##      int       age sex        pc1        pc2        pc3
## [1,]   1 0.9227292   0 -0.9179036 -1.6327648 -0.1540658
## [2,]   1 0.5888415   0  0.9207933 -1.3529452 -1.3514130
## [3,]   1 1.2530388   1  0.6014231 -1.4208441 -1.1318932
## [4,]   1 0.8227599   0 -0.7964197  1.1976729  0.7436693
## [5,]   1 0.1778293   1 -0.5023568 -0.8110003 -1.7642698
## [6,]   1 1.3477782   0  1.2106348  0.6395943  0.4450291

# Genotypes.
geno <- data$geno
head(geno[, 1:5])
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    0    0    0
## [2,]    0    0    0    0    0
## [3,]    0    0    0    0    0
## [4,]    0    0    0    0    0
## [5,]    0    0    0    0    0
## [6,]    0    1    0    0    1

# Phenotype.
pheno <- data$pheno
head(pheno)
## [1] 2.1541823 0.4310216 2.2026698 4.0619662 2.0134196 2.0927631
The example data generated by the preceding are available under
The COding-variant Allelic Series Test (COAST) is run using the
COAST function. By default, p-values for the component
tests, as well as the overall omnibus test (p_omni), are
returned. Inspection of the component p-values is useful for determining
which model(s) drove an association. In the present case, the
association was most evident via the baseline count model (p_count).
results <- AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar
)
show(results)
##        p_count          p_ind    p_max_count      p_max_ind    p_sum_count
##   1.992602e-19   6.426260e-05   1.816140e-08   4.870339e-06   4.915204e-17
##      p_sum_ind p_allelic_skat         p_omni
##   2.274531e-06   3.105850e-07   2.381468e-18
apply_int = TRUE applies the rank-based inverse normal transformation from the RNOmni package. This transformation is expected to improve power for phenotypes that have a skewed or kurtotic (e.g. long-tailed) distribution. It is applied by default for a continuous phenotype and is ignored for a binary phenotype.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  apply_int = TRUE
)
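To illustrate what a rank-based inverse normal transformation does, here is a minimal base-R sketch. It is not the RNOmni implementation; `rank_int` is a hypothetical name, and the offset `k = 0.375` (the Blom offset) is an assumption made for this example.

```r
# Illustrative rank-based inverse normal transform (base R only).
# Assumption: Blom offset k = 0.375; RNOmni provides the reference version.
rank_int <- function(y, k = 0.375) {
  n <- length(y)
  r <- rank(y, ties.method = "average")
  # Map ranks into (0, 1), then to standard normal quantiles.
  stats::qnorm((r - k) / (n - 2 * k + 1))
}

set.seed(1)
skewed <- rexp(200)               # phenotype with a long right tail
transformed <- rank_int(skewed)   # same ordering, approximately normal
```

The transformation depends on the phenotype only through its ranks, so the ordering of subjects is preserved while the long tail is pulled in.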
include_orig_skato_all = TRUE includes standard SKAT-O applied to all variants as a component of the omnibus test, while
include_orig_skato_ptv = TRUE includes standard SKAT-O applied to PTVs only. Including standard SKAT-O as a component of the omnibus test can improve power to detect associations between the phenotype and genes that may not be allelic series.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  include_orig_skato_all = TRUE,
  include_orig_skato_ptv = TRUE
)
is_pheno_binary = TRUE is required to indicate that the supplied phenotype is binary and should be analyzed using a logistic regression model.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = 1 * (pheno > 0),
  covar = covar,
  is_pheno_binary = TRUE
)
return_omni_only = TRUE returns only p_omni, which is useful when the component p-values are not of interest:
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  return_omni_only = TRUE
)
score_test = TRUE specifies the use of a score-type allelic series burden test. The default of
score_test = FALSE specifies a Wald-type allelic series burden test.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  score_test = TRUE
)
weights specifies the relative phenotypic effects of BMVs, DMVs, and PTVs. An increasing pattern such as the default setting of
weights = c(1, 2, 3) targets allelic series. Setting
weights = c(1, 1, 1) would target a genetic architecture where all variants have equivalent expected magnitudes.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  weights = c(1, 2, 3)
)
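To build intuition for how category weights change a burden score, the sketch below computes a weighted sum of allele counts per subject in base R. This is an illustration under stated assumptions, not the package's internal computation; `weighted_burden` and the toy data are hypothetical.

```r
# Illustrative only: one way category weights can enter a burden score.
# `weighted_burden` is a hypothetical helper, not an AllelicSeries function.
weighted_burden <- function(anno, geno, weights = c(1, 2, 3)) {
  # anno is coded 0/1/2, so anno + 1 indexes the weight for each variant.
  per_variant_weight <- weights[anno + 1]
  as.numeric(geno %*% per_variant_weight)
}

toy_anno <- c(0, 1, 2)  # one BMV, one DMV, one PTV
toy_geno <- rbind(
  c(1, 0, 0),  # subject carrying one BMV allele
  c(0, 1, 0),  # subject carrying one DMV allele
  c(0, 0, 1),  # subject carrying one PTV allele
  c(1, 1, 1)   # subject carrying one allele of each category
)
series_burden <- weighted_burden(toy_anno, toy_geno, weights = c(1, 2, 3))
flat_burden <- weighted_burden(toy_anno, toy_geno, weights = c(1, 1, 1))
```

With increasing weights, a PTV carrier receives a larger score than a BMV carrier; with equal weights, the score reduces to a plain count of minor alleles.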