Exploring the Effects of Frontalization and Data Synthesis on Face Recognition

Doctoral Dissertation


Automatic face recognition performance has improved remarkably in the last decade. Much of this success can be attributed to the development of deep learning techniques such as convolutional neural networks (CNNs). However, training CNNs requires a large amount of clean and correctly labelled data. In the first part of this work, we try to find the ideal orientation (facial pose, shape, context) of this data for training and testing such CNNs. If a CNN is intended to work with non-frontal face images, should the training data be diverse in terms of facial poses, or should face images be frontalized as a pre-processing step? To answer these questions, we evaluate a set of popular facial landmarking and pose frontalization algorithms to understand their effect on face recognition performance. We also introduce a new landmarking and frontalization scheme that operates on a single image without the need for a subject-specific 3D model, and perform a comparative analysis between the new scheme and other methods in the literature.

In the second part, we analyze the usefulness of synthetic images in improving the face recognition pipeline while taking into account their practicality from a computational standpoint. To this end, we propose a novel face synthesis method for augmenting existing face image datasets. An augmented dataset reduces overfitting, which, in turn, can enhance the face representation capability of a CNN. Starting from actual face images in an existing dataset, our method can generate a large number of synthetic images of real and synthetic identities, without the identity-labeling and privacy complications that come from downloading images from the web. Additionally, we develop a multi-scale generative adversarial network (GAN) model to hallucinate realistic context (forehead, hair, neck, clothes) and background pixels automatically from a single input face mask, without any user supervision. Our model is composed of a cascaded network of GAN blocks, each tasked with hallucinating missing pixels at a particular resolution while guiding the synthesis process of the next GAN block. Multiple experiments are performed to assess the realism of our synthetic face images and validate their effectiveness as supplemental data for training CNNs, and as distractors to test the robustness of trained model snapshots.


Attribute Name Values
Author Sandipan Banerjee
Contributor Chaoli Wang, Committee Member
Contributor Walter J. Scheirer, Committee Member
Contributor Patrick J. Flynn, Research Director
Contributor Kevin W. Bowyer, Research Director
Contributor Domingo Mery, Committee Member
Degree Level Doctoral Dissertation
Degree Discipline Computer Science and Engineering
Degree Name Doctor of Philosophy
Defense Date 2019-05-06
Submission Date 2019-07-25
Record Visibility Public
Content License All rights reserved
