Accelerating Memory Intensive Algorithms and Applications using In-Memory Computing

Laguna, Ann Franchesca

doi:10.7274/ms35t725z1p

thesis

Posted on 2022-04-07; authored by Ann Franchesca Laguna

Data-intensive applications do not fully utilize the compute capabilities of von Neumann architectures because of the memory bandwidth bottleneck. These memory-bandwidth-limited applications can be accelerated by minimizing data movement between the memory and the compute units through in-memory computing (IMC). Using IMC, this work accelerates four different types of applications and algorithms.

The first part focuses on accelerating the attention mechanism of few-shot learning algorithms such as Memory Augmented Neural Networks and Prototypical Networks by utilizing different distance metrics and in-memory computing circuits. The L∞ distance is implemented using content-addressable memories (CAMs) via range encoding. A combined L∞ + L1 distance is implemented using a combination of CAMs and general-purpose computing-in-memory. CAMs are also used to implement locality-sensitive hashing. Multi-bit CAMs, with their corresponding distance metric, are also utilized.
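To make the link between range matching and the L∞ distance concrete, the following is a minimal software sketch, not the circuit or encoding used in the thesis. It assumes that each CAM row reports a match when every query element lies within a fixed radius of the stored element, which is equivalent to the L∞ distance being at most that radius. The function names, the integer data, and the radius sweep are illustrative assumptions.

```python
def linf_distance(a, b):
    """L-infinity (Chebyshev) distance: the largest per-dimension difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def range_match(stored, query, radius):
    """Emulate one CAM row: match only if every query element
    falls inside [stored - radius, stored + radius]."""
    return all(s - radius <= q <= s + radius for s, q in zip(stored, query))


def nearest_by_range_search(memory, query, max_radius):
    """Widen the search radius until at least one row matches.

    Returns (matching row indices, radius), or ([], None) if nothing
    matches within max_radius. In hardware all rows would be compared
    in parallel; here the rows are scanned in a loop.
    """
    for radius in range(max_radius + 1):
        hits = [i for i, row in enumerate(memory)
                if range_match(row, query, radius)]
        if hits:
            return hits, radius
    return [], None


# Hypothetical stored feature vectors and query.
memory = [[3, 7, 1], [4, 4, 3], [0, 9, 2]]
query = [4, 5, 2]
hits, radius = nearest_by_range_search(memory, query, max_radius=15)
```

The radius at which the first match appears equals the L∞ distance to the nearest stored vector, which is why a range search can stand in for an explicit distance computation.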

Transformer networks have outperformed other deep neural networks (DNNs) in various sequential tasks. However, memory and compute bottlenecks prevent transformer networks from scaling to long sequences due to their high execution time and energy consumption. We propose an in-memory transformer network accelerator (iMTransformer) that uses a combination of crossbars and CAMs to accelerate transformer networks.

Deep random forests (DRFs) have comparable classification accuracy, easier interpretability, and lower memory and computational requirements than DNNs. However, the development of efficient hardware to accelerate DRFs lags behind that for DNNs. The key to hardware acceleration of DRFs lies in efficiently realizing the branch-split operation at decision nodes when traversing a decision tree. We propose implementing DRFs through simple associative searches using ferroelectric analog CAMs (ACAMs).
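One way to picture how tree traversal can become an associative search is sketched below. This is an illustrative software model under stated assumptions, not the ACAM design from the thesis: each root-to-leaf path is flattened into one row of per-feature intervals, and a lookup returns the label of the row whose intervals all contain the input. The tuple-based tree format, the "go left if value <= threshold" convention, and the function names are hypothetical.

```python
import math

# Hypothetical tree: internal nodes are (feature, threshold, left, right),
# leaves are class labels. Convention assumed here: go left if x <= threshold.
tree = (0, 5.0,
        (1, 2.0, "A", "B"),
        "C")


def tree_to_rows(node, n_features, bounds=None):
    """Flatten every root-to-leaf path into (intervals, label).

    Each interval is (low, high) and is read as low < value <= high.
    """
    if bounds is None:
        bounds = [(-math.inf, math.inf)] * n_features
    if not isinstance(node, tuple):
        return [(list(bounds), node)]
    feature, threshold, left, right = node
    low, high = bounds[feature]
    left_bounds = list(bounds)
    left_bounds[feature] = (low, min(high, threshold))
    right_bounds = list(bounds)
    right_bounds[feature] = (max(low, threshold), high)
    return (tree_to_rows(left, n_features, left_bounds)
            + tree_to_rows(right, n_features, right_bounds))


def associative_search(rows, x):
    """Return the label of the row whose intervals all contain x.

    An analog CAM would compare all rows at once; this loop is a
    sequential stand-in for that parallel search.
    """
    for intervals, label in rows:
        if all(low < v <= high for (low, high), v in zip(intervals, x)):
            return label
    return None


rows = tree_to_rows(tree, n_features=2)
label = associative_search(rows, [3.0, 2.5])
```

Because the paths of a decision tree partition the input space, exactly one row matches any input, so the branch-by-branch traversal collapses into a single search.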

Genome analysis is becoming more important in forensic science, medicine, and history. Sequencing technologies such as High Throughput Sequencing and Third Generation Sequencing have greatly accelerated genome sequencing. However, genome read mapping remains significantly slower than sequencing. This research presents a genome read mapping accelerator that uses TCAMs to execute the Fast Seed and Vote algorithm that can map both short and long reads.
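The general seed-and-vote idea can be illustrated with a short software sketch. This is a simplified version written for illustration, not the Fast Seed and Vote algorithm or the TCAM mapping evaluated in the thesis: short substrings (seeds) of the read are looked up in an index of the reference, and each hit votes for the read start position it implies. The dictionary lookup stands in for an exact-match TCAM search, and the function names, the seed length, and the toy sequences are assumptions.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every length-k substring of the reference to its positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, index, k):
    """Vote for candidate start positions of the read in the reference.

    Each seed hit at reference position pos, taken from read offset
    `offset`, votes for start position pos - offset.
    Returns (best start, vote count), or (None, 0) if no seed hits.
    """
    votes = Counter()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, []):
            votes[pos - offset] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]


# Toy reference and a read copied from position 5 with one substitution.
reference = "ACGTTGCAAGCTTAGGCTAACG"
read = "GCAATCTTAG"
index = build_index(reference, k=4)
start, vote_count = map_read(read, index, k=4)
```

Seeds that overlap the substituted base find no hit, but the remaining seeds still agree on the same start position, which is what makes voting tolerant of sequencing errors.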

This research demonstrates a hardware-software co-design of data-intensive algorithms and applications, particularly few-shot learning, transformer networks, deep random forests, and DNA read mapping. Each accelerator is evaluated in terms of accuracy, latency, and energy improvements.

History

Date Modified

2022-05-04

Defense Date

2022-03-30

CIP Code

40.0501

Research Director(s)

X. Sharon Hu

Committee Members

Meng Jiang, Siddharth Joshi, Sri Parameswaran, Xunzhao Yin

Degree

Doctor of Philosophy

Degree Level

Doctoral Dissertation

Language

English

Alternate Identifier

1313829071

Library Record

6208960

OCLC Number

1313829071

Program Name

Computer Science and Engineering

Keywords

Computer Architecture, Machine Learning, Bioinformatics, Deep Neural Networks, In-Memory Computing, Algorithms
