University of Notre Dame

File(s) under permanent embargo

Topology-Aware Job Scheduling and Placement in High Performance Computing and Edge Computing Systems

thesis
posted on 2019-04-08, 00:00 authored by Kangkang Li

The interconnection topology of the computing nodes in a distributed system plays an important role in how jobs should be scheduled and allocated. In this work, I address two resource allocation problems. The first is the topology-aware job scheduling and placement problem in high performance computing (HPC) systems that use a 3D torus-based interconnection topology. The second is the networked virtual machine (VM) and job placement problem in edge cloud systems, where the considered edge cloud architecture uses a two-layer star topology.

For the first resource allocation problem, I address topology-aware job scheduling and placement in a 3D torus-based HPC system, with the objective of reducing system fragmentation and improving system utilization. First, for the job scheduling problem, I propose a packing-based job scheduling strategy that reduces the external fragmentation caused by the First Come First Served (FCFS) + backfilling strategy. Second, I study the first case of the job placement problem, in which each job is allocated a convex prism shape. I propose a job placement algorithm based on a local migration process and a global migration process, which aims to reduce both internal and external fragmentation during job placement. Third, I study the second case of the job placement problem, in which the shapes allocated to communication non-sensitive jobs are not limited to convex prisms. I propose two shape allocation methods to determine the topological shape for each input job: a zigzag allocation method for communication non-sensitive jobs and a convex allocation method for communication-sensitive jobs. I then propose a communication-aware job placement algorithm, consisting of a target bin selection method and a bi-directional job placement method, to reduce both internal and external fragmentation during job placement. The evaluation results show that the proposed strategies and algorithms reduce system fragmentation and improve system utilization.
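The terms in the paragraph above can be made concrete with a small sketch. It is not the thesis's algorithm. It only illustrates, under stated assumptions, what a wraparound hop distance on a 3D torus looks like and how allocating a prism to a job can waste nodes. It assumes that internal fragmentation means the nodes allocated to a job beyond the number it requested. The function names `torus_hops`, `smallest_prism`, and `internal_fragmentation` are hypothetical.

```python
from itertools import product


def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b on a torus with wraparound links."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def smallest_prism(n, dims):
    """Return the prism (x, y, z) with the fewest nodes that still holds n nodes.

    Brute-force search over all prism sizes that fit in the torus.
    Ties on volume are broken in favour of the most cube-like shape.
    Returns None if the job does not fit in the torus at all.
    """
    candidates = (
        shape
        for shape in product(*(range(1, d + 1) for d in dims))
        if shape[0] * shape[1] * shape[2] >= n
    )
    return min(
        candidates,
        key=lambda s: (s[0] * s[1] * s[2], max(s) - min(s)),
        default=None,
    )


def internal_fragmentation(n, shape):
    """Assumed definition: nodes in the allocated prism that the job did not request."""
    return shape[0] * shape[1] * shape[2] - n
```

For example, on a 4 x 4 x 4 torus a job requesting 7 nodes cannot receive a 7-node prism, so the smallest fitting prism is 2 x 2 x 2 and one node is left idle inside the allocation.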

For the second resource allocation problem, I address the networked VM and job placement problem in the edge cloud system. First, for the homogeneous edge cloud system, I propose an optimal algorithm that maximizes the number of VMs accepted into the system, and then design a second optimal algorithm that minimizes the total inter-node communication cost. Second, for the heterogeneous edge cloud system, I propose an optimal algorithm that maximizes the number of VMs accepted into the system, and then design a second algorithm that minimizes the total inter-node communication cost. Third, I study the job placement problem under the multi-tenant scenario, which is NP-hard, and propose a heuristic algorithm that provides an efficient solution. The evaluation results validate the efficiency of my algorithms.
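The quantity being minimized above, total inter-node communication cost, can be illustrated with a short sketch. This is not the thesis's cost model. It assumes a star in which any two distinct nodes are two hops apart through a central node, that VMs on the same node communicate at zero cost, and that pairwise traffic volumes are known. The function name `inter_node_cost` and the parameter `hop_cost` are hypothetical.

```python
from itertools import combinations


def inter_node_cost(placement, traffic, hop_cost=2):
    """Sum traffic-weighted hop cost over VM pairs placed on different nodes.

    placement: maps each VM name to the node hosting it.
    traffic:   maps a pair (u, v) with u < v to its traffic volume.
    hop_cost:  assumed hop count between two distinct nodes of the star.
    """
    total = 0
    for u, v in combinations(sorted(placement), 2):
        if placement[u] != placement[v]:
            total += hop_cost * traffic.get((u, v), 0)
    return total


placement = {"vm1": "A", "vm2": "A", "vm3": "B"}
traffic = {("vm1", "vm2"): 5, ("vm1", "vm3"): 3, ("vm2", "vm3"): 1}
cost = inter_node_cost(placement, traffic)
```

In this example the heavy `vm1`/`vm2` traffic stays on node A and contributes nothing, so only the two cross-node pairs are charged.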

History

Date Modified

2019-07-06

Defense Date

2019-03-29

CIP Code

  • 40.0501

Research Director(s)

Jaroslaw Nabrzyski

Committee Members

  • Gregory Madey
  • Scott Nestler
  • Maciej Malawski

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1105929477

Library Record

5114058

OCLC Number

1105929477

Program Name

  • Computer Science and Engineering
