University of Notre Dame

Learning Hyperparameters for Neural Machine Translation

thesis
Posted on 2020-04-21; authored by Kenton Murray

Machine Translation, the subfield of Computer Science that focuses on translating between two human languages, has greatly benefited from neural networks. However, these neural machine translation systems have complicated architectures with many hyperparameters that need to be manually chosen. Frequently, these are selected either through a grid search over values or by reusing values that are commonplace in the literature. Neither approach is theoretically justified, and the same values are not optimal for all language pairs and datasets.

Fortunately, the innate structure of the problem allows for optimization of these hyperparameters during training. Traditionally, the hyperparameters of a system are chosen first, and then a learning algorithm optimizes all of the parameters within the model. In this work, I propose three methods to learn the optimal hyperparameters during the training of the model, allowing for one step instead of two. First, I propose using group regularizers to learn the number and size of the hidden neural network layers. Second, I demonstrate how to use a perceptron-like tuning method to solve known problems of undertranslation and label bias. Finally, I propose an Expectation-Maximization-based method to learn the optimal vocabulary size and granularity. Using various techniques from machine learning and numerical optimization, this dissertation covers how to learn hyperparameters of a Neural Machine Translation system while training the model itself.
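
To illustrate the first idea, the sketch below shows how a group regularizer can shrink a layer during training. It is not the thesis's implementation: the abstract does not say which group norm or optimizer is used, so the sketch assumes an ℓ2,1 penalty (the sum of per-neuron Euclidean norms) and a proximal shrinkage step. All function names and the example weights are hypothetical.

```python
import math

def group_l21_penalty(weights):
    """Sum of the Euclidean norms of the rows (assumption: one row = one neuron)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)

def prox_group_l21(weights, threshold):
    """Proximal step for the penalty above: scale every row toward zero and
    zero out any row whose norm is at most `threshold`."""
    shrunk = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0.0 else 0.0
        shrunk.append([scale * w for w in row])
    return shrunk

def active_neurons(weights):
    """Count rows that still have at least one nonzero weight."""
    return sum(1 for row in weights if any(w != 0.0 for w in row))

# Hypothetical layer with three neurons of two incoming weights each.
layer = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
penalty = group_l21_penalty(layer)
pruned = prox_group_l21(layer, threshold=1.0)
remaining = active_neurons(pruned)
```

Because the penalty acts on whole rows rather than on individual weights, a neuron with a small norm is removed entirely, so the effective layer size (here `remaining`) is an outcome of training rather than a value fixed beforehand.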

History

Date Modified

2020-05-22

Defense Date

2019-12-18

CIP Code

  • 40.0501

Research Director(s)

David Chiang

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1155111876

Library Record

5503685

OCLC Number

1155111876

Additional Groups

  • Computer Science and Engineering

Program Name

  • Computer Science and Engineering
