Penalized Methods and Their Applications to Genetic Research and Economic Forecasting

Doctoral Dissertation


This dissertation studies lasso-based and SCAD-based penalties and their applications to genetic research and economic forecasting. Motivated by functional genome-wide association studies (fGWAS), it proposes a procedure that penalizes both the linear regression coefficients and the inverse covariance matrix. Motivated by macroeconomic variable forecasting, it compares the forecasting performance of lasso regression and SCAD regression and investigates the improvements in forecast accuracy that result from including group structure, residual bootstrap, and forecast combination.

The first proposed procedure applies lasso-based penalties to both coefficient estimation and inverse covariance matrix estimation in nonparametric varying-coefficient models for fGWAS, in order to select SNPs that are significantly associated with a phenotypic trait of interest based on a limited number of measurements. The genetic effects of the SNPs are time-varying, and the phenotypic trait is measured repeatedly. The procedure provides satisfactory variable selection results in simulation, facilitates model interpretation, and enhances variable selection power through sparse inverse covariance matrix estimation.

The rest of the dissertation concerns penalized linear regression in macroeconomic forecasting. Lasso and SCAD regressions are first examined as two alternative forecasting methods. Based on comparisons in simulations and real examples, the SCAD penalty is recommended over the lasso penalty for macroeconomic data, because such data are grouped and the risk of model misspecification has to be considered. Following this recommendation, the SCAD penalty is extended to group SCAD regression, SCAD regression with residual bootstrap, and group SCAD regression with residual bootstrap. The group SCAD penalty provides more consistent variable selection results and enhances model interpretability. Residual bootstrap increases model selection stability. Group SCAD regression with residual bootstrap and SCAD regression significantly improve forecast accuracy for most macroeconomic variables. Finally, the combination of forecasts from SCAD-related models and a dynamic factor model is studied. The combined forecast shows advantages over individual forecasts in out-of-sample forecasting.
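For readers unfamiliar with the two penalties compared above, the following minimal sketch evaluates the lasso penalty and the SCAD penalty in their standard textbook forms. It is an illustration only, not the dissertation's implementation: the function names are hypothetical, and the default a = 3.7 is the conventional choice in the SCAD literature, assumed here because the abstract does not state a value.

```python
def lasso_penalty(beta, lam):
    """Lasso (L1) penalty: grows linearly in |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty in its standard piecewise form (requires a > 2).

    a = 3.7 is an assumed conventional default, not a value from the source.
    """
    b = abs(beta)
    if b <= lam:
        # Near zero, SCAD coincides with the lasso penalty.
        return lam * b
    if b <= a * lam:
        # Quadratic transition that gradually relaxes the penalization rate.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Beyond a * lam the penalty is constant.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    lam = 1.0
    for beta in (0.5, 2.0, 5.0, 10.0):
        print(beta, lasso_penalty(beta, lam), scad_penalty(beta, lam))
```

The sketch shows the one structural difference between the penalties: the lasso penalty keeps increasing with the coefficient size, whereas the SCAD penalty levels off for large coefficients.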


Attribute Name  Values
  • etd-03242015-195334

Author Weiye Chen
Advisor Jiahan Li
Contributor Fang Liu, Committee Member
Contributor Jun Li, Committee Member
Contributor Jiahan Li, Committee Chair
Contributor Steven Buechler, Committee Member
Degree Level Doctoral Dissertation
Degree Discipline Applied and Computational Mathematics and Statistics
Degree Name PhD
Defense Date
  • 2015-01-19

Submission Date 2015-03-24
  • United States of America

  • macroeconomic forecasting

  • SCAD

  • Lasso

  • fGWAS

  • University of Notre Dame

  • English

Record Visibility Public
Content License
  • All rights reserved


