This vignette contains supplementary information regarding the usage of the `corrcoverage`

R package.

The two main functions are:

`corrcov`

(or analogously`corrcov_bhat`

): Provides a corrected coverage estimate of credible sets obtained using the Bayesian approach for fine-mapping (see “Corrected Coverage” vignette).

- Use
`corrcov`

for \(Z\)-scores and minor allele frequencies - Use
`corrcov_bhat`

for beta hat estimates and their standard errors.

`corrected_cs`

(or analogously`corrected_cs_bhat`

): Finds a corrected credible set, which is the smallest set of variants such that the corrected coverage is above some user defined “desired coverage” (see "Corrected credible set vignette).

- Use
`corrected_cs`

for \(Z\)-scores and minor allele frequencies - Use
`corrected_cs_bhat`

for beta hat estimates and their standard errors.

`corrcov_nvar`

(or analogously`corrcov_nvar_bhat`

): Finds a corrected coverage estimate whereby the simulated credible sets used to derive the estimate are limited to those which contain a specified number of variants (parameter ‘nvar’). These functions should be used with caution and only ever for very small credible sets (fewer than 4 variants).`corrcov_CI`

(or analogously`corrcov_CI_bhat`

): Finds a confidence interval for the corrected coverage estimate. The default is a 95% confidence interval (parameter`CI = 0.95`

), but can be adjusted accordingly. This function involves repeating the correction procedure 100 times and therefore requires lots of memory.

The Bayesian method for fine-mapping involves finding the posterior probability of causality for each SNP, before sorting these into descending order and adding variants to a ‘credible set’ until the combined posterior probabilities of these SNPs exceed some threshold. The supplementary text of Maller’s paper (available here) shows that these posterior probabilities are normalised Bayes factors.

Asymptotic Bayes factors (Wakefield, 2009) are commonly used in genetic association studies as these only require the specification of \(Z\)-scores (or equivalently the effect size coefficients, \(\beta\), and their standard errors, \(V\)), the standard errors of the effect sizes (\(V\)) and the prior variance of the estimated effect size (\(W^2\)), thus only requiring summary data from genetic association studies plus an estimate for the \(W\) parameter.

Consequently, the `corrcoverage`

package contains functions for converting between \(P\)-values, \(Z\)-scores, asymptotic Bayes factors (ABFs) and posterior probabilities of causality (PPs). The following table shows what input these conversion functions require and what output they produce. The ‘include null model’ column is for whether the null model of no genetic effect is included in the calculation (PPs obtained using the standard Bayesian approach ignore this).

Function | Include null model? | Input | Output |
---|---|---|---|

`approx.bf.p` |
YES | \(P\)-values | log(ABF) |

`pvals_pp` |
YES | \(P\)-values | Posterior Probabilities |

`z0_pp` |
YES | Marginal \(Z\)-scores | Posterior Probabilities |

`ppfunc` |
NO | Marginal \(Z\)-scores | Posterior Probabilities |

Functions are also provided to simulate marginal \(Z\)-scores from joint \(Z\)-scores (\(Z_j\)). The joint \(Z\)-scores are all 0, except at the causal variant where it is the “true effect”, \(\mu\).

\(\mu\) can be estimated using the `est_mu`

function which requires sample sizes, marginal \(Z\)-scores and minor allele frequencies. We estimate \(\mu\) by \[\hat\mu=\sum_{j}|Z_j|\times PP_j\]

The `z_sim`

function simulates marginal \(Z\)-scores from joint \(Z\)-scores, whilst `zj_pp`

can be used to simulate posterior probability systems from a joint \(Z\)-score vector. These functions first calculate the expected marginal \(Z\) scores, \(E(Z)\), \[E(Z)=Z_j \times \Sigma\] where \(\Sigma\) is the correlation matrix between SNPs.

We can then simulate more \(Z\)-score systems from a multivariate normal distribution with mean \(E(Z)\) and variance \(\Sigma\). This is a key step in our corrected coverage method.

- Joint \(Z\) score vectors (
`Z_j`

) are used to derive expected marginal \(Z\) score vectors, by multiplying with the SNP correlation matrix.

\[E(Z)=Z_j \times \Sigma\]

- The expected marginal \(Z\) score vectors can be used as the mean in a multi-variate normal distribution with variance equal to the SNP correlation matrix to simulate more marginal \(Z\) score vectors.

\[Z \sim MVN(E(Z),\Sigma)\]

The

`z_sim`

function follows these steps to simulate`nrep`

marginal \(Z\) score vectors.The

`zj_pp`

function goes one step further and converts these simulated marginal \(Z\) scores to posterior probabilities of causality.