An example GAMuT analysis
In the code that follows, we use notation consistent with the corresponding AJHG paper by Broadaway et al. (2016).The non-printing lines of R code shown in this page are contained in the file '
GAMuT.R
' found within the archive file
GAMuT.zip.
Preliminary code dependencies
For this example, we will be using thedavies()
function
defined in the R package
CompQuadForm.
This package is loaded with the library
command:
library(CompQuadForm)
GAMuT-functions.R
found in the zip archive.
For the functions to be recognized, this needs to be sourced:
source("GAMuT-functions.R")
ls(pattern="GAMuT")
## [1] "linear_GAMuT_geno" "linear_GAMuT_pheno" "proj_GAMuT_pheno" ## [4] "TestGAMuT"
Step 1: Read in PHENO
, GENO
, and COVAR
matrices
The sample phenotype, genotype, and covariate data are provided in the
respective files,
phenotypes.csv,
variants.csv, and
covariates.csv.
These files are also provided within the zip archive file.
We read this data into R with the following commands:
GENO <- read.csv("variants.csv") PHENO <- read.csv("phenotypes.csv") COVAR <- read.csv("covariates.csv")
GENO
, so here we print only the first eight columns
and four rows using
head()
:
head(GENO[,1:8], n=4)
## V1 V2 V3 V4 V5 V6 V7 V8 ## 1 0 0 0 0 0 0 0 0 ## 2 0 0 0 0 0 0 0 0 ## 3 0 0 0 0 0 0 0 0 ## 4 0 0 0 0 0 0 0 0
head(PHENO, n=4)
## Phenotype1 Phenotype2 Phenotype3 Phenotype4 Phenotype5 Phenotype6 ## 1 -1.5761 -0.5599 -0.96585 -1.1779 -0.7366 -0.8885 ## 2 0.5411 1.3688 -0.01711 0.7002 2.0259 1.9142 ## 3 -0.5115 -0.2910 -0.03845 -0.7068 -0.0155 0.3282 ## 4 -0.5073 0.2174 0.49525 0.3535 -0.7514 1.0904
head(COVAR, n=4)
## COVAR1 COVAR2 ## 1 -0.01153 0.2955 ## 2 1.38593 0.2443 ## 3 0.08433 0.0934 ## 4 0.49490 2.7669
Step 2: Residualize the phenotypes in PHENO
on the covariates in COVAR
Note: Each phenotype in PHENO
will be residualized
on each covariate in COVAR
.
If you wish to only residualize phenotypes on a subset of covariates in COVAR,
make sure to make appropriate adjustments to this block of code:
# Matrix of residualized phenotypes: residuals = matrix(NA, nrow=nrow(PHENO), ncol=ncol(PHENO)) for(l in 1:ncol(PHENO)){ model.residual = lm(PHENO[,l] ~ as.matrix(COVAR)) residuals[,l] = resid(model.residual) }
residuals
matrix are:
head(residuals, n=4)
## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] -1.5964 -0.53340 -1.01969 -1.21835 -0.8288 -0.9447 ## [2,] 0.3916 1.22380 -0.31997 0.46899 1.6950 1.5994 ## [3,] -0.5360 -0.26521 -0.09175 -0.74692 -0.1035 0.2680 ## [4,] -0.6326 0.04302 0.13016 0.07602 -1.1876 0.7673
Step 3: Form the phenotypic similarity matrix and corresponding eigenvalues based on residualized phenotypes from Step 2
## centered & scaled matrix of residualized phenotypes: P0 <- apply(residuals, 2, scale, scale=T, center=T) P <- as.matrix(P0) # convert dataframes into matrix
Yc
using the projection matrix:
The function proj_GAMuT_pheno()
constructs the projection matrix
and corresponding eigenvalues:
proj_pheno = proj_GAMuT_pheno(P) Yc = proj_pheno$Kc # projection matrix lambda_Y = proj_pheno$ev_Kc # eigenvalues of Yc
If one wishes to construct
Yc
using the linear kernel instead of projection matrix,
then use the linear_GAMuT_pheno()
function instead.
Uncommenting the following code would produce the linear kernel and the corresponding eigenvalues:
## linear_pheno <- linear_GAMuT_pheno(P) ## Yc = linear_pheno$Kc # linear kernel similarity matrix ## lambda_Y = linear_pheno$ev_Kc # eigenvalues of Yc
Step 4: Form the genotypic similarity matrix and corresponding eigenvalues
We assume weighted linear kernel for genotypes where weights are function of minor-allele frequency (MAF). We apply the beta-distribution MAF weights of Wu et al. (2011) in the SKAT paper:MAF = colMeans(GENO)/2 # sample MAF of each variant in the sample beta_weight = dbeta(MAF,1,25)/dbeta(0,1,25) # assume beta-distribution weights
beta_weight
as desired.
The first eight (out of 368) components for the
MAF
array are:
MAF[1:8]
## V1 V2 V3 V4 V5 V6 V7 V8 ## 0.0010 0.0005 0.0010 0.0005 0.0005 0.0010 0.0010 0.0015
G0 = as.matrix(GENO)%*%diag(beta_weight) # Weighted rare variants G = as.matrix(scale(G0,center=T,scale=F)) # centered genotype matrix
linear_GAMuT_geno()
to construct the weighted linear kernel matrix for genotypes along with the eigenvalues:
linear_geno <- linear_GAMuT_geno(G) Xc <- linear_geno$Lc # linear kernel similarity matrix lambda_X <- linear_geno$ev_Lc # eigenvalues of Xc
Step 5: Construct the GAMuT and obtain the p-value
Finally, the pvalue is calculated using the functionTestGAMuT()
:
GAMuT_pvalue <- TestGAMuT(Yc,lambda_Y,Xc,lambda_X) GAMuT_pvalue
## [1] 0.02144