Missing Data Methods for Exploratory Dynamic Factor Analysis

Trichtinger, Lauren A.

doi:10.7274/9s161547h3m

thesis

posted on 2021-07-27, 00:00 authored by Lauren A. Trichtinger

New technologies such as smartphones and wearable devices make it more feasible to collect intensive longitudinal data (ILD). ILD allows researchers to study intra-individual differences in finer detail, but it also brings new challenges. This type of research design involves repeated measures from the same individuals at many time points, and missing data is ubiquitous in such designs. Even though advances in data collection technologies have reduced the burden on participants in longitudinal studies, problems such as equipment malfunctions and limited battery power still make missing data a relevant issue. Missing data can lead to smaller sample sizes, reduced power, and biased estimates. Missingness in studies with intensive longitudinal data is more complex because of dependencies between nearby time points. Therefore, effectively accommodating missing data is essential to analyzing intensive longitudinal data.

Dynamic factor analysis, a procedure combining factor analysis and time series analysis, is a popular data analytic tool for ILD. Previous studies have considered methods for handling missing data for ILD, but few of them considered latent variables. Examples include vector autoregressive models or Kalman filters for parameter estimation in dynamic factor analysis models. However, none of them compared popular missing data methods and examined their effects on estimates in exploratory dynamic factor analysis under a variety of conditions. In this dissertation, I compare four methods of handling missing data for exploratory dynamic factor analysis: pairwise deletion, listwise deletion, cross-sectional multiple imputation, and time series multiple imputation. I conduct a simulation study to examine the implications of these methods of handling missing data for point estimates. The simulation study varies four features: missing data mechanism, amount of missing data, time series length, and model size. I also illustrate the four missing data methods with two empirical illustrations. The first uses a mood study of daily diary entries. The second involves physiological measurements from functional magnetic resonance imaging (fMRI).
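To make the distinction between the two deletion methods concrete, the following is a minimal sketch of how listwise and pairwise deletion might differ when computing a lagged covariance from a multivariate time series. It is an illustration only, not the dissertation's implementation: the function names (`covariance`, `lagged_cov`), the use of `None` to mark missing values, the rule that listwise deletion drops a lagged pair whenever either time point has any missing variable, and the toy data are all assumptions made here.

```python
import math


def covariance(pairs):
    """Sample covariance of (a, b) pairs; NaN if fewer than two pairs."""
    n = len(pairs)
    if n < 2:
        return math.nan
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    return sum((a - mean_a) * (b - mean_b) for a, b in pairs) / (n - 1)


def lagged_cov(data, i, j, lag, method):
    """Covariance between variable i at time t and variable j at time t + lag.

    data   : list of rows (lists), one row per time point; None marks missing.
    method : "listwise" keeps a pair only if both rows are fully observed
             (an assumed convention); "pairwise" keeps a pair whenever the
             two values actually needed are observed.
    Returns (covariance, number of pairs used).
    """
    pairs = []
    for t in range(len(data) - lag):
        now, later = data[t], data[t + lag]
        if method == "listwise":
            keep = all(v is not None for v in now + later)
        elif method == "pairwise":
            keep = now[i] is not None and later[j] is not None
        else:
            raise ValueError(f"unknown method: {method}")
        if keep:
            pairs.append((now[i], later[j]))
    return covariance(pairs), len(pairs)


# Hypothetical two-variable series with two missing entries.
data = [
    [1.0, 2.0],
    [2.0, None],
    [3.0, 1.0],
    [4.0, 3.0],
    [None, 5.0],
    [6.0, 4.0],
    [7.0, 5.0],
]

cov_list, n_list = lagged_cov(data, 0, 0, lag=1, method="listwise")
cov_pair, n_pair = lagged_cov(data, 0, 0, lag=1, method="pairwise")
print("listwise:", cov_list, "from", n_list, "pairs")
print("pairwise:", cov_pair, "from", n_pair, "pairs")
```

In this toy series, pairwise deletion retains more lagged pairs than listwise deletion because a missing value on the second variable does not remove time points from the lag-1 autocovariance of the first variable.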

The simulation study shows that (1) listwise deletion performs poorly in most cases, except in some MCAR conditions; (2) pairwise deletion and the two multiple imputation procedures perform similarly in many cases; and (3) time series multiple imputation has large RMSE and bias in certain cases, such as the MCAR condition for some measurement variables. The results of the empirical illustrations are comparable with those of the simulation study. I discuss the implications and limitations of the current study.
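For readers unfamiliar with the two accuracy measures named above, they are conventionally defined as follows in simulation studies. This is the standard textbook form, given here as an assumption; the dissertation may use a variant (for example, relative bias). The symbols are introduced for illustration: \theta is the true parameter value, \hat{\theta}_r is its estimate in replication r, and R is the number of replications.

```latex
\mathrm{Bias}(\hat{\theta}) = \frac{1}{R} \sum_{r=1}^{R} \left( \hat{\theta}_r - \theta \right),
\qquad
\mathrm{RMSE}(\hat{\theta}) = \sqrt{ \frac{1}{R} \sum_{r=1}^{R} \left( \hat{\theta}_r - \theta \right)^{2} }
```

Under these definitions, RMSE reflects both systematic error (bias) and sampling variability, so a method can have small bias but still a large RMSE.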

History

Date Modified

2021-10-26

Defense Date

2021-07-14

CIP Code

42.2799

Research Director(s)

Guangjian Zhang

Committee Members

Ke-Hai Yuan, Zhiyong Zhang, Lijuan Wang

Degree

Doctor of Philosophy

Degree Level

Doctoral Dissertation

Alternate Identifier

1280313239

Library Record

6135212

OCLC Number

1280313239

Additional Groups

Psychology

Program Name

Psychology, Research and Experimental

Keywords

Dynamic factor analysis, Missing data, Time series
