The Challenges of Scaling Up High-Throughput Workflow with Container Technology

Zheng, Chao

doi:10.7274/g158bg28c5m

File(s) under permanent embargo

thesis

posted on 2019-10-03, 00:00 authored by Chao Zheng

High-throughput computing (HTC) uses large amounts of computing resources over long periods to complete many independent, parallel computational tasks. HTC workloads are often described as workflows and run on distributed systems through workflow systems. However, because most workflow systems are not responsible for managing the task execution environment, HTC workflows are usually confined to dedicated HTC facilities that provide the required settings.

Lately, container runtimes have been widely deployed across public clouds because of their ability to deliver execution environments with lower overhead than virtual machines. This trend gives users of HTC workflows an opportunity to use unlimited computing power on the cloud. However, migrating complex workflow systems to a container environment is cumbersome.

To containerize HTC workflows and scale them up on the cloud, I synthesize my experience with container technologies and develop a methodology comprising seven design factors: i) Isolation Granularity – the granularity of isolation should be determined by the characteristics of the target workloads; ii) Container Management – container runtimes must be adapted to the distributed environment, and containers are best managed by the underlying distributed system; iii) Image Management – a cooperative mechanism can speed up image distribution and make it more efficient in a distributed environment; iv) Garbage Collection – timely garbage collection is necessary given the massive amount of intermediate data generated by HTC workflows; v) Network Connection – excessive network connections should be avoided given the large number of small transmissions; vi) Resource Management – customized resource management mechanisms that fully account for the characteristics of the target workflow are required; vii) Cross-layer Cooperation – implementing advanced features requires cooperation between the upper-layer workflow system and the underlying cluster manager.

Beyond HTC workflows, I validate these factors through my work on standardizing the resource provisioning process for extreme-scale online workloads, and observe that they apply equally to HTC workflows and to extreme-scale online workloads.

History

Date Modified

2019-10-31

Defense Date

2019-08-22

CIP Code

40.0501

Research Director(s)

Douglas L. Thain

Committee Members

Christian Poellabauer, Dong Wang, Lukas Rupprecht

Degree

Doctor of Philosophy

Degree Level

Doctoral Dissertation

Language

English

Alternate Identifier

1125224074

Library Record

5261608

OCLC Number

1125224074

Program Name

Computer Science and Engineering

Keywords

High-Throughput Computing, Cloud Computing, Distributed System
