File(s) under permanent embargo
Evaluation of a Two-Stage Statistical Learning Design for Genome-Wide Studies
thesis
posted on 2013-04-16, 00:00 authored by Raymond Kenney Walters
Twin and family studies show that many common traits and disorders are highly heritable, but genome-wide association studies (GWAS) have been largely unable to identify specific single nucleotide polymorphisms (SNPs) explaining this heritability at the genetic level. Recent work suggests statistical learning methods like gradient boosting (GBM) may be a viable alternative to conventional methods, especially after adjustments for the structure of SNP data. The current research evaluates a two-stage research design for GWAS. GBM is used as a first-stage variable selection screen to substantially reduce the dimensionality of SNP data while maintaining sensitivity to additive, nonlinear, and interaction effects, allowing hypothesis testing with a reduced multiple testing burden in the second-stage analysis. Thorough simulations show that the proposed two-stage design can substantially improve power to detect effect SNPs in a wide range of conditions. The limitations of this design and potential improvements to it are explored.
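The reduced multiple testing burden mentioned in the abstract can be illustrated with a small arithmetic sketch. It assumes a Bonferroni correction, which the abstract does not name, and the SNP counts and the function name `bonferroni_threshold` are hypothetical values chosen for illustration, not figures from the thesis.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that holds the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Hypothetical numbers for illustration only (not taken from the thesis).
ALPHA = 0.05
N_SNPS_GENOME_WIDE = 500_000   # assumed number of SNPs tested in a single-stage GWAS
N_SNPS_AFTER_SCREEN = 1_000    # assumed number of SNPs retained by a first-stage screen

single_stage = bonferroni_threshold(ALPHA, N_SNPS_GENOME_WIDE)
two_stage = bonferroni_threshold(ALPHA, N_SNPS_AFTER_SCREEN)

# How many times less stringent the per-SNP threshold is after screening.
relaxation = two_stage / single_stage

print(f"single-stage threshold: {single_stage:.2e}")
print(f"two-stage threshold:    {two_stage:.2e}")
print(f"relaxation factor:      {relaxation:.0f}x")
```

Under these assumed counts, the per-SNP threshold in the second stage is less stringent by the ratio of the two SNP counts, which is the mechanism by which a screen could improve power. The sketch says nothing about how the screen itself is fit or validated.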
History
Date Modified
2017-06-02
Research Director(s)
Gitta Lubke
Committee Members
Scott Maxwell
Jiahan Li
Degree
- Master of Arts
Degree Level
- Master's Thesis
Language
- English
Alternate Identifier
etd-04162013-072727
Publisher
University of Notre Dame
Program Name
- Psychology