UMVCUE aims to produce less biased SNP-trait association estimates for SNPs deemed significant in the discovery GWAS, using summary statistics from both the discovery and replication GWASs. The function implements the uniformly minimum variance conditionally unbiased estimator (UMVCUE) described in Bowden and Dudbridge (2009).

UMVCUE(summary_disc, summary_rep, alpha = 5e-08)

Arguments

summary_disc

A data frame containing summary statistics from the discovery GWAS. It must have three columns with column names rsid, beta and se, respectively, and columns beta and se must contain numerical values. Each row must correspond to a unique SNP, identified by rsid.

summary_rep

A data frame containing summary statistics from the replication GWAS. It must have three columns with column names rsid, beta and se, respectively, and columns beta and se must contain numerical values. Each row must correspond to a unique SNP, identified by rsid. SNPs must be listed in exactly the same order as in summary_disc, i.e. summary_rep$rsid must be identical to summary_disc$rsid.
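The requirements on the two input data frames can be checked before calling the function. The following is a minimal base R sketch; check_umvcue_inputs is a hypothetical helper written for illustration and is not part of the package, and the toy values are made up.

```r
# Hypothetical helper (not part of the package): checks the documented
# requirements on summary_disc and summary_rep and stops with an error
# if any of them is violated.
check_umvcue_inputs <- function(summary_disc, summary_rep) {
  required <- c("rsid", "beta", "se")
  stopifnot(
    is.data.frame(summary_disc), is.data.frame(summary_rep),
    # both data frames need the columns rsid, beta and se
    all(required %in% names(summary_disc)),
    all(required %in% names(summary_rep)),
    # beta and se must be numeric
    is.numeric(summary_disc$beta), is.numeric(summary_disc$se),
    is.numeric(summary_rep$beta), is.numeric(summary_rep$se),
    # each row must be a unique SNP
    !anyDuplicated(summary_disc$rsid),
    !anyDuplicated(summary_rep$rsid),
    # SNPs must appear in the same order in both data frames
    identical(as.character(summary_disc$rsid), as.character(summary_rep$rsid))
  )
  invisible(TRUE)
}

# Toy data (illustrative values only)
summary_disc <- data.frame(rsid = 1:4, beta = c(0.30, -0.02, 0.25, 0.01), se = rep(0.04, 4))
summary_rep  <- data.frame(rsid = 1:4, beta = c(0.21, 0.01, 0.18, -0.03), se = rep(0.05, 4))

inputs_ok <- check_umvcue_inputs(summary_disc, summary_rep)
```

If the rows of summary_rep were in a different order from those of summary_disc, the final check would fail and the helper would stop with an error.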

alpha

A numerical value specifying the genome-wide significance threshold for the discovery GWAS. The default is 5e-8.
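To give a sense of what the threshold means on the scale of the test statistic, the sketch below converts alpha into a critical value for the absolute \(z\)-statistic. It assumes a two-sided test based on the standard normal distribution with \(z\) = beta/se; this convention is an assumption made for illustration, not something stated on this page.

```r
alpha <- 5e-8

# Assumed convention: two-sided p-value from z = beta / se,
# i.e. p = 2 * P(Z > |z|) for a standard normal Z.
p_from_z <- function(z) 2 * pnorm(abs(z), lower.tail = FALSE)

# |z| must exceed this critical value for the p-value to fall below alpha
z_crit <- qnorm(alpha / 2, lower.tail = FALSE)

# A SNP exactly at the critical value has a p-value equal to alpha
p_at_crit <- p_from_z(z_crit)
```

Under this assumption, the default threshold corresponds to an absolute \(z\)-statistic of roughly 5.45.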

Value

A data frame containing the summary statistics and adjusted association estimates of only those SNPs deemed significant in the discovery GWAS at the specified threshold alpha, i.e. SNPs with \(p\)-values less than alpha. The input summary data occupy the first five columns: beta_disc and se_disc contain the statistics from the discovery GWAS, and beta_rep and se_rep hold the replication GWAS statistics. The adjusted association estimate for each SNP, as defined in Bowden and Dudbridge (2009), is contained in the final column, beta_UMVCUE. SNPs are ordered by significance, so that the most significant SNP, i.e. the SNP with the largest absolute \(z\)-statistic, is located in the first row of the data frame. If no SNPs are significant in the discovery GWAS, UMVCUE returns a data frame that simply combines the two input data sets.
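The selection and ordering of SNPs described above can be sketched in base R as follows. This is an illustration only: it reproduces the filtering and sorting steps, not the computation of beta_UMVCUE, and it assumes a two-sided normal \(p\)-value computed from \(z\) = beta/se. The toy values are made up.

```r
# Toy data (illustrative values only)
summary_disc <- data.frame(rsid = 1:4, beta = c(0.30, -0.02, 0.25, -0.40), se = rep(0.04, 4))
summary_rep  <- data.frame(rsid = 1:4, beta = c(0.21, 0.01, 0.18, -0.33), se = rep(0.05, 4))
alpha <- 5e-8

# Combine the two inputs using the column names described above
combined <- data.frame(
  rsid      = summary_disc$rsid,
  beta_disc = summary_disc$beta,
  se_disc   = summary_disc$se,
  beta_rep  = summary_rep$beta,
  se_rep    = summary_rep$se
)

# Assumed convention: two-sided p-value from the discovery z-statistic
z <- combined$beta_disc / combined$se_disc
p <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Keep SNPs with p < alpha, then sort by decreasing absolute z-statistic
keep <- p < alpha
sig  <- combined[keep, ]
sig  <- sig[order(abs(z[keep]), decreasing = TRUE), ]
```

Here three of the four toy SNPs pass the threshold, and the SNP with the largest absolute \(z\)-statistic (rsid 4) ends up in the first row of sig.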

References

Bowden, J., & Dudbridge, F. (2009). Unbiased estimation of odds ratios: combining genomewide association scans with replication studies. Genetic Epidemiology, 33(5), 406-418. doi:10.1002/gepi.20394

See also

https://amandaforde.github.io/winnerscurse/articles/discovery_replication.html for an illustration of the use of UMVCUE with toy data sets and further information on how the adjusted SNP-trait association estimates are computed for significant SNPs.