UMVCUE.Rd
UMVCUE is a function which aims to produce less biased SNP-trait association estimates for SNPs deemed significant in the discovery GWAS, using summary statistics from both discovery and replication GWASs. The function implements the method described in Bowden and Dudbridge (2009), which was established for this purpose.
UMVCUE(summary_disc, summary_rep, alpha = 5e-08)
summary_disc
A data frame containing summary statistics from the discovery GWAS. It must have three columns with column names rsid, beta and se, respectively, and columns beta and se must contain numerical values. Each row must correspond to a unique SNP, identified by rsid.
summary_rep
A data frame containing summary statistics from the replication GWAS. It must have three columns with column names rsid, beta and se, respectively, and all columns must contain numerical values. Each row must correspond to a unique SNP, identified by the numerical value rsid. SNPs must be ordered in exactly the same manner as those in summary_disc, i.e. summary_rep$rsid must be equivalent to summary_disc$rsid.
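The input requirements above can be sketched in a few lines of R. This is only an illustration: the toy values are made up, and check_inputs is a hypothetical helper that is not part of the package. It assumes that "equivalent" means the two rsid columns match element by element.

```r
# Toy discovery and replication summary statistics (made-up values).
summary_disc <- data.frame(
  rsid = 1:4,
  beta = c(0.31, -0.02, 0.18, -0.27),
  se   = c(0.05, 0.04, 0.06, 0.04)
)
summary_rep <- data.frame(
  rsid = 1:4,
  beta = c(0.22, 0.01, 0.09, -0.24),
  se   = c(0.06, 0.05, 0.06, 0.05)
)

# Hypothetical helper (not part of the package): stops with an error if
# the documented requirements on the two data frames are not met.
check_inputs <- function(disc, repl) {
  required <- c("rsid", "beta", "se")
  stopifnot(
    is.data.frame(disc), is.data.frame(repl),
    identical(names(disc), required),
    identical(names(repl), required),
    is.numeric(disc$beta), is.numeric(disc$se),
    is.numeric(repl$beta), is.numeric(repl$se),
    !anyDuplicated(disc$rsid), !anyDuplicated(repl$rsid),
    nrow(disc) == nrow(repl),
    all(disc$rsid == repl$rsid)
  )
  invisible(TRUE)
}

check_inputs(summary_disc, summary_rep)

# If the replication data arrive in a different row order, match() can
# realign them to the discovery data before the check.
rep_shuffled <- summary_rep[c(3, 1, 4, 2), ]
summary_rep_aligned <- rep_shuffled[match(summary_disc$rsid, rep_shuffled$rsid), ]
rownames(summary_rep_aligned) <- NULL
check_inputs(summary_disc, summary_rep_aligned)
```

Realigning with match() rather than sorting each data frame separately keeps the discovery data untouched and makes the row correspondence explicit.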
alpha
A numerical value which specifies the desired genome-wide significance threshold for the discovery GWAS. The default is 5e-8.
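As a rough sketch of how such a threshold selects SNPs, the snippet below flags discovery SNPs whose p-value falls below alpha. It assumes a two-sided normal test on z = beta / se, which is a common convention but is an assumption here rather than something stated on this page. The data values and variable names are illustrative only.

```r
# Toy discovery summary statistics (made-up values).
disc <- data.frame(
  rsid = 1:4,
  beta = c(0.31, -0.02, 0.18, -0.27),
  se   = c(0.05, 0.04, 0.06, 0.04)
)
alpha <- 5e-8

# z-statistics and two-sided p-values under a standard normal reference
# (assumed convention).
z <- disc$beta / disc$se
p <- 2 * stats::pnorm(abs(z), lower.tail = FALSE)

# SNPs deemed significant at the chosen threshold.
significant <- p < alpha

# Equivalent cut-off on the absolute z-statistic scale.
threshold_z <- stats::qnorm(alpha / 2, lower.tail = FALSE)

disc$rsid[significant]
```

With these toy values, the SNPs with rsid 1 and 4 pass the threshold, since their absolute z-statistics (6.2 and 6.75) exceed the cut-off of roughly 5.45.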
A data frame with summary statistics and adjusted association estimates for only those SNPs which have been deemed significant in the discovery GWAS according to the specified threshold, alpha, i.e. SNPs with \(p\)-values less than alpha. The input summary data occupy the first five columns, in which the columns beta_disc and se_disc contain the statistics from the discovery GWAS and the columns beta_rep and se_rep hold the replication GWAS statistics. The new adjusted association estimate for each SNP, as defined in the aforementioned paper, is contained in the final column, namely beta_UMVCUE. The SNPs are ordered in this data frame by significance, with the most significant SNP, i.e. the SNP with the largest absolute \(z\)-statistic, located in the first row. If no SNPs are detected as significant in the discovery GWAS, UMVCUE merely returns a data frame which combines the two input data sets.
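The layout of the returned data frame, apart from the beta_UMVCUE column, can be mimicked with base R as below. This sketch does not compute the adjusted estimate itself. It assumes that rsid is the first of the five input columns and that significance is judged by a two-sided normal test on beta / se; the data values are made up.

```r
# Toy inputs (made-up values) in the documented three-column format.
summary_disc <- data.frame(
  rsid = 1:4,
  beta = c(0.31, -0.02, 0.18, -0.27),
  se   = c(0.05, 0.04, 0.06, 0.04)
)
summary_rep <- data.frame(
  rsid = 1:4,
  beta = c(0.22, 0.01, 0.09, -0.24),
  se   = c(0.06, 0.05, 0.06, 0.05)
)
alpha <- 5e-8

# Combine both inputs into the five-column layout described above
# (column order is an assumption).
combined <- data.frame(
  rsid      = summary_disc$rsid,
  beta_disc = summary_disc$beta,
  se_disc   = summary_disc$se,
  beta_rep  = summary_rep$beta,
  se_rep    = summary_rep$se
)

# Keep only SNPs significant in the discovery GWAS, then order them by
# decreasing absolute discovery z-statistic.
z <- combined$beta_disc / combined$se_disc
p <- 2 * stats::pnorm(abs(z), lower.tail = FALSE)
keep <- p < alpha
selected <- combined[keep, ]
selected <- selected[order(abs(z[keep]), decreasing = TRUE), ]
rownames(selected) <- NULL

selected
```

With the package installed, the corresponding call would be UMVCUE(summary_disc, summary_rep, alpha = 5e-8), whose output additionally carries the beta_UMVCUE column. If no SNP passed the threshold, selected would have zero rows, whereas the documented behaviour of UMVCUE in that case is to return the combined data frame.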
Bowden, J., & Dudbridge, F. (2009). Unbiased estimation of odds ratios: combining genomewide association scans with replication studies. Genetic Epidemiology, 33(5), 406-418. doi:10.1002/gepi.20394
https://amandaforde.github.io/winnerscurse/articles/discovery_replication.html for illustration of the use of UMVCUE with toy data sets and further information regarding computation of the adjusted SNP-trait association estimates for significant SNPs.