University of Notre Dame

File(s) under embargo

Identifying Groups with Different Dynamic Patterns Using Cluster Analysis

dataset
posted on 2024-05-10, 20:54 authored by Dayoung Lee
Clustering is an exploratory analysis technique that uncovers subgroups within a population and facilitates the development of subgroup-specific interventions or treatments. Although it is most commonly used with cross-sectional data, several researchers have applied it to multivariate time series with the goal of grouping together individuals with similar dynamic patterns (Aghabozorgi, Shirkhorshidi, & Wah, 2015). The common raw data-based approach may fail to identify subgroups with distinctive dynamic factor structures, which are often the focus of psychological research but are not directly visible in the time series patterns. In this dissertation, I develop a model-based clustering method to identify subgroups characterized by the dynamic patterns of their latent factors. In particular, the method first fits a dynamic factor analysis model (Molenaar, 1985; Nesselroade, McArdle, Aggen, & Meyers, 2002; Browne & Zhang, 2007) to each individual's time series, then calculates distances between the parameter estimates of the fitted models, and finally groups individuals based on those distances. I present an empirical illustration of four clustering methods: model-based clustering with the K-means algorithm, model-based clustering with a hierarchical algorithm, raw data-based clustering with the K-means algorithm, and raw data-based clustering with a hierarchical algorithm. I then conduct a simulation study to compare the performance of the four methods on multivariate time series. The simulation results indicate that (1) the model-based approach had higher cluster recovery rates and adjusted Rand indices than the raw data-based approach; (2) under the same clustering approach, the K-means algorithm had slightly higher cluster recovery rates and adjusted Rand indices than the hierarchical algorithm in most conditions; and (3) the cluster recovery rates of the clustering validation indices varied depending on the number of population clusters.
I conclude by discussing the methodological and applied implications of the study, its limitations, and directions for future research.
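
As an illustration only, the model-based pipeline described in the abstract (fit a model to each individual's series, compare parameter estimates, group individuals, then score recovery with the adjusted Rand index) can be sketched in Python. This is not the dissertation's implementation: a univariate AR(1) model fitted by least squares stands in for the dynamic factor analysis model, a simple one-dimensional K-means stands in for the clustering step, and all function names, sample sizes, and parameter values below are hypothetical.

```python
import random


def simulate_ar1(phi, n, rng):
    """Simulate a univariate AR(1) series x_t = phi * x_{t-1} + e_t."""
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def fit_ar1(series):
    """Least-squares AR(1) coefficient (assumed stand-in for a DFA fit)."""
    num = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def kmeans_1d(values, k, iters=50):
    """Plain K-means on scalar parameter estimates with a deterministic start."""
    ordered = sorted(values)
    n = len(values)
    # Start the centers at evenly spaced order statistics.
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign each estimate to its nearest center (absolute distance).
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same individuals."""
    def comb2(m):
        return m * (m - 1) // 2

    cells, rows, cols = {}, {}, {}
    for x, y in zip(a, b):
        cells[(x, y)] = cells.get((x, y), 0) + 1
        rows[x] = rows.get(x, 0) + 1
        cols[y] = cols.get(y, 0) + 1
    sum_cells = sum(comb2(v) for v in cells.values())
    sum_rows = sum(comb2(v) for v in rows.values())
    sum_cols = sum(comb2(v) for v in cols.values())
    expected = sum_rows * sum_cols / comb2(len(a))
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # both labelings are trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical population: two subgroups that differ only in their dynamics.
rng = random.Random(0)
true_labels = [0] * 10 + [1] * 10
phis = [-0.6] * 10 + [0.6] * 10
series = [simulate_ar1(phi, 300, rng) for phi in phis]

estimates = [fit_ar1(s) for s in series]   # step 1: one model per individual
labels = kmeans_1d(estimates, k=2)         # steps 2-3: distances and grouping
ari = adjusted_rand_index(true_labels, labels)
print(f"adjusted Rand index: {ari:.2f}")
```

The sketch clusters on estimated parameters rather than on the raw series. A raw data-based variant would instead compute distances between the series themselves, which is the comparison the simulation study in the abstract evaluates.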

History

Date Created

2024-04-15

Date Modified

2024-05-10

Defense Date

2024-03-27

CIP Code

  • 42.2799

Research Director(s)

Guangjian Zhang

Committee Members

Ke-Hai Yuan, Johnny Zhang, Fang Liu

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Library Record

6584700

OCLC Number

1433162504

Publisher

University of Notre Dame

Program Name

  • Psychology, Research and Experimental
