University of Notre Dame

File(s) under permanent embargo

Accelerating Natural Language Processing Algorithms Using Graphics Processing Units

thesis
Posted on 2019-03-19, authored by Arturo Argueta

Natural Language Processing (NLP) encompasses the techniques computers use to understand and interpret human language. NLP covers a wide range of subtopics such as syntax (analyzing whether the words in an utterance are well arranged), semantics (understanding the meaning of combined words), and discourse. Most state-of-the-art NLP systems feed large amounts of natural language text into models for training and testing.

One problem with natural language corpora is the imbalance between rare terms and commonly used words. Word frequencies in natural language are highly skewed, which creates irregular sparsity patterns, and these patterns yield sparse data structures that perform poorly on parallel architectures. Asynchronous methods work well only on specific sparse distributions. Ideally, all computation time should be spent on dense values, and time spent on sparse regions should be minimized.
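The skew described above can be made concrete with a small sketch (illustrative only, not taken from the dissertation): building a bigram count matrix over even a tiny text leaves most cells zero, because most word pairs never co-occur.

```python
from collections import Counter

# Toy text chosen for illustration; real corpora are far more skewed.
text = ("the cat sat on the mat the dog saw the cat "
        "a rare word appears once").split()
counts = Counter(text)

vocab = sorted(counts)
index = {w: i for i, w in enumerate(vocab)}

# Bigram count matrix stored densely: one row and one column per word.
dense = [[0] * len(vocab) for _ in vocab]
for prev, cur in zip(text, text[1:]):
    dense[index[prev]][index[cur]] += 1

total = len(vocab) ** 2
nonzero = sum(1 for row in dense for v in row if v)
# Most cells stay zero: the structure is sparse.
print(f"{nonzero}/{total} bigram cells are nonzero")
```

A dense layout wastes both memory and compute on those zero cells, which is exactly the inefficiency the abstract refers to.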

Graphics Processing Units (GPUs) are widely used to execute large numbers of operations in parallel. One problem with these accelerators is that not all computations can be parallelized, and some parallel adaptations run slower than their serial CPU counterparts. Using GPUs to process sparse structures of varying sizes poses additional problems: a large share of the computation time is wasted on sparse regions if the parallel implementation does not exploit the partially dense properties of the input.
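One standard way to avoid spending time on zero entries is a compressed sparse format. The sketch below uses CSR (compressed sparse row) storage for a toy matrix; it is an assumption for illustration and does not reproduce the dissertation's GPU kernels. A matrix-vector product over CSR touches each stored nonzero exactly once instead of scanning every cell.

```python
# Toy matrix with mostly zero entries.
dense = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 3.0],
]

# Build the three CSR arrays: nonzero values, their column indices,
# and row pointers delimiting each row's slice of the value array.
values, col_idx, row_ptr = [], [], [0]
for row in dense:
    for j, v in enumerate(row):
        if v != 0.0:
            values.append(v)
            col_idx.append(j)
    row_ptr.append(len(values))

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x, iterating over stored nonzeros only."""
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y

print(csr_matvec(values, col_idx, row_ptr, [1.0, 1.0, 1.0, 1.0]))
# → [2.0, 0.0, 4.0]
```

On a GPU, rows (or slices of the value array) can be assigned to threads, but irregular row lengths cause load imbalance, which is why tailoring the kernel to the sparsity pattern matters.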

Significant speedups are achieved when a parallel implementation is tailored to the sparsity pattern of the problem being solved and to the target architecture. Our work adapts NLP methods to run efficiently on parallel architectures using high-performance computing concepts. All contributions target the GPU, a device designed to carry out large numbers of computations faster than off-the-shelf CPU architectures.

This dissertation covers different adaptations of sparse NLP algorithms to the GPU architecture. We carry out experiments on different GPU architectures and compare performance across datasets. Our results demonstrate that GPU adaptations can significantly reduce the execution time of sparse NLP algorithms: a 6,000x speedup on the Viterbi task, 4.5x on the composition task, 7x on a batched Forward-Backward method, and a 50x improvement on batched operations common in deep learning.
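For reference, a serial version of the Viterbi algorithm named above can be sketched on a toy hidden Markov model (the HMM parameters here are invented for illustration; the dissertation's GPU-parallel version and its datasets are not shown).

```python
import math

# Toy two-state HMM with two observation symbols (illustrative values).
states = ["A", "B"]
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.1, "y": 0.9}}

def viterbi(obs):
    """Return the most probable state sequence for an observation list."""
    # Log-space DP table: best score of any path ending in each state.
    v = {s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        nv, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda p: v[p] + math.log(trans[p][s]))
            nv[s] = v[best] + math.log(trans[best][s]) + math.log(emit[s][o])
            ptr[s] = best
        back.append(ptr)
        v = nv
    # Follow backpointers from the best final state.
    path = [max(states, key=lambda s: v[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["x", "y", "y"]))
# → ['A', 'B', 'B']
```

The per-timestep maximization over state pairs is the part that parallelizes naturally, since each destination state's update is independent of the others.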

History

Date Modified

2019-05-18

Defense Date

2019-02-27

CIP Code

  • 40.0501

Research Director(s)

David Chiang

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1101623841

Library Record

5101828

OCLC Number

1101623841

Program Name

  • Computer Science and Engineering
