Nonlinear Structural Equation Modeling with Text Data
dataset
posted on 2025-07-28, 15:03, authored by Lingbo Tong
Psychological research increasingly uses unstructured text, creating a need for models that can integrate it with traditional quantitative data. This dissertation introduces NeuralSEM, a novel framework that combines deep learning with structural equation modeling (SEM) for the joint analysis of numerical and textual data. By leveraging a conditional variational autoencoder (CVAE) architecture, NeuralSEM employs modality-specific encoders and an attention mechanism to fuse survey scores and text into an interpretable latent space. This allows for modeling complex, non-linear relationships while retaining theoretical structure.
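The architecture described above can be illustrated with a small forward-pass sketch. This is not the dissertation's implementation: it is a minimal NumPy toy with untrained random weights, and every name, dimension, and design choice in it is an assumption made for illustration. In particular, it assumes the text has already been turned into a fixed-length embedding, that attention is a softmax over the two modality encodings, and that the decoder is conditioned on the text embedding; the abstract does not specify any of these details.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Return randomly initialised (weight, bias) for a toy linear layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: 6 survey items, 16-dim text embedding,
# 8 hidden units per encoder, 2 latent factors.
N_ITEMS, N_TEXT, N_HIDDEN, N_LATENT = 6, 16, 8, 2

W_num, b_num = linear(N_ITEMS, N_HIDDEN)           # encoder for survey scores
W_txt, b_txt = linear(N_TEXT, N_HIDDEN)            # encoder for text embeddings
w_att = rng.normal(scale=0.1, size=N_HIDDEN)       # attention scoring vector
W_mu, b_mu = linear(N_HIDDEN, N_LATENT)            # latent mean head
W_lv, b_lv = linear(N_HIDDEN, N_LATENT)            # latent log-variance head
W_dec, b_dec = linear(N_LATENT + N_TEXT, N_ITEMS)  # decoder (assumed: conditioned on text)

def forward(scores, text_emb):
    """One forward pass: encode each modality, fuse by attention, sample, decode."""
    h_num = np.tanh(scores @ W_num + b_num)        # modality-specific encodings
    h_txt = np.tanh(text_emb @ W_txt + b_txt)
    h = np.stack([h_num, h_txt], axis=1)           # (batch, 2, hidden)
    att = softmax(h @ w_att, axis=1)               # (batch, 2) weights over modalities
    fused = (att[..., None] * h).sum(axis=1)       # attention-weighted fusion
    mu = fused @ W_mu + b_mu
    logvar = fused @ W_lv + b_lv
    # Reparameterisation trick: z = mu + sigma * eps
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    recon = np.concatenate([z, text_emb], axis=1) @ W_dec + b_dec
    # Negative ELBO: squared reconstruction error plus KL to a standard normal prior
    rec_loss = 0.5 * ((scores - recon) ** 2).sum(axis=1)
    kl = -0.5 * (1.0 + logvar - mu**2 - np.exp(logvar)).sum(axis=1)
    return {"att": att, "z": z, "recon": recon, "loss": float((rec_loss + kl).mean())}
```

Calling `forward` on a batch of survey scores with shape `(n, 6)` and text embeddings with shape `(n, 16)` returns the per-respondent attention weights, the sampled latent factors, the reconstructed item scores, and the batch loss. In a trained model, the attention weights would indicate how much each modality contributes to a respondent's latent scores.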
The utility of NeuralSEM is demonstrated through three applications: assessing perceived teaching quality from ratings and tags, examining menstrual health beliefs using survey items and explanations, and modeling the Big Five personality traits from questionnaire data and synthetic self-descriptions. Together, these studies validate NeuralSEM as a powerful and flexible methodology that bridges quantitative and qualitative analysis. By advancing latent variable modeling for multimodal data, this work opens new directions for research at the intersection of psychology, machine learning, and computational social science.