University of Notre Dame
MaY112018D.pdf (2.65 MB)

Improving Reliability of Real-Time Embedded Systems

thesis
posted on 2018-11-20, 00:00 authored by Yue Ma

Multi-processor system-on-chips (MPSoCs) provide high performance and power efficiency. They have been widely used in many real-time embedded applications such as automotive electronics, industrial automation, and avionics. Most of these applications must satisfy deterministic or probabilistic timing constraints. However, due to CMOS technology scaling, MPSoCs increasingly have higher power density and temperature, which reduce system lifetime reliability. Meanwhile, the decreasing feature size of transistors and low supply voltage and frequency make the chip more vulnerable to soft errors and degrade soft-error reliability. Maintaining the quality of service, improving lifetime reliability and soft-error reliability, and satisfying the real-time requirement have become major concerns in current MPSoCs, especially for such systems deployed in harsh environments.

This thesis investigates techniques to improve reliability for MPSoCs. Two methods are first developed to improve lifetime reliability and soft-error reliability, respectively. The first method improves lifetime reliability by dynamically reducing the operating temperature. The second method maximizes soft-error reliability by dynamically allocating recoveries to tasks that fail because of soft errors.
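As a rough illustration of dynamic recovery allocation, the sketch below spends a fixed amount of slack on re-executing failed tasks, shortest first, so that as many as possible are recovered. It is not taken from the thesis: the function name `allocate_recoveries`, the `(name, execution_time_ms)` task tuples, and the shortest-first rule are all assumptions made for this example.

```python
def allocate_recoveries(failed_tasks, slack_ms):
    """Pick which failed tasks to re-execute within the available slack.

    failed_tasks: list of (name, execution_time_ms) tuples (hypothetical format).
    slack_ms: idle time left before the deadline (hypothetical input).
    Greedy shortest-first choice, which maximizes the number of recovered tasks.
    """
    recovered = []
    for name, exec_ms in sorted(failed_tasks, key=lambda task: task[1]):
        if exec_ms <= slack_ms:
            recovered.append(name)
            slack_ms -= exec_ms  # the re-execution consumes slack
    return recovered


if __name__ == "__main__":
    failed = [("a", 3), ("b", 1), ("c", 2)]
    print(allocate_recoveries(failed, slack_ms=4))
```

A real scheduler would also have to account for per-task deadlines and for recoveries that fail again, which this sketch ignores.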

Based on these methods, two frameworks are designed to improve reliability for homogeneous MPSoCs. The first framework maximizes lifetime reliability under a real-time constraint. It captures the system's run-time status and uses the first method to reduce the operating temperature by dynamically lowering core voltages and frequencies. The second framework builds on the dynamic recovery allocation method and maximizes soft-error reliability under lifetime reliability and real-time constraints. It improves soft-error reliability by allocating recoveries to failed tasks and by scheduling tasks so that more of them can be recovered.
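The trade-off between lowering frequency and meeting a real-time constraint can be sketched as follows: among the available frequency levels, choose the lowest one at which a job still finishes by its deadline. This is a minimal sketch under stated assumptions, not the framework described in the thesis; the function name, the cycle-count model of execution time, and the example frequency levels are hypothetical.

```python
def lowest_feasible_frequency(worst_case_cycles, deadline_s, freq_levels_hz):
    """Return the lowest frequency that meets the deadline, or None if none does.

    Assumes execution time = worst_case_cycles / frequency (a simplification
    that ignores memory stalls and voltage-switching overhead).
    """
    for freq in sorted(freq_levels_hz):
        if worst_case_cycles / freq <= deadline_s:
            return freq
    return None


if __name__ == "__main__":
    levels = [0.6e9, 1.0e9, 1.5e9, 2.0e9]  # hypothetical frequency levels in Hz
    print(lowest_feasible_frequency(8e6, 0.010, levels))
```

Running at the lowest feasible level reduces power and therefore temperature, at the cost of leaving less slack for recoveries.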

MPSoCs that have the 'big-little' architecture or that integrate a CPU and a GPU have been widely used in applications such as autonomous vehicles, in-vehicle infotainment systems, and mobile devices. Since such applications benefit when more tasks complete successfully before their deadlines, we develop two frameworks to maximize soft-error reliability under lifetime reliability and real-time constraints. Both frameworks improve soft-error reliability by dynamically increasing core frequencies, and each accounts for the distinct features of its target MPSoC.

The first framework targets 'big-little' MPSoCs. It examines the power features of the 'big' and 'little' cores and dynamically selects the most power-efficient cores to execute tasks. The second framework targets MPSoCs with an integrated CPU and GPU. Based on an analysis of how task mapping affects task execution times, this framework dynamically migrates tasks to reduce their execution times and achieve high soft-error reliability.
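One way to picture power-efficient core selection on a 'big-little' chip is to estimate, for each core, the time and energy a task would need, and then pick the lowest-energy core that still meets the deadline. The sketch below is illustrative only and does not reproduce the thesis's framework; the `Core` class, the constant-power model, the assumption that both core types need the same cycle count, and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    freq_hz: float   # operating frequency (hypothetical value)
    power_w: float   # average active power (hypothetical value)


def pick_core(cycles, deadline_s, cores):
    """Return the core with the lowest energy that meets the deadline, or None."""
    best_core, best_energy = None, None
    for core in cores:
        exec_time = cycles / core.freq_hz   # assumes same cycle count on every core
        if exec_time > deadline_s:
            continue                        # this core would miss the deadline
        energy = core.power_w * exec_time
        if best_energy is None or energy < best_energy:
            best_core, best_energy = core, energy
    return best_core


if __name__ == "__main__":
    cores = [Core("big", 2.0e9, 2.0), Core("little", 1.0e9, 0.5)]
    print(pick_core(1e7, 0.020, cores).name)  # loose deadline
    print(pick_core(1e7, 0.006, cores).name)  # tight deadline
```

With a loose deadline the slower, lower-power core wins on energy; with a tight deadline only the faster core is feasible.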

All frameworks proposed in this thesis have been evaluated with tasks from multiple benchmark suites and on different hardware platforms, including a simulator, Nvidia's TK1 chip, and Nvidia's TX2 chip. Experimental results show that these frameworks can improve lifetime reliability and/or soft-error reliability while satisfying the real-time requirement.

History

Date Created

2018-11-20

Date Modified

2018-12-18

Defense Date

2018-11-09

CIP Code

  • 40.0501

Research Director(s)

Xiaobo Sharon Hu

Committee Members

Michael Niemier, Yiyu Shi, Robert P. Dick

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1078222302

Library Record

5012612

OCLC Number

1078222302

Program Name

  • Computer Science and Engineering
