Leveraging Large-Scale Distributed Systems for Massive Biological Datasets

Doctoral Dissertation

Abstract

The tremendous growth of available data in bioinformatics necessitates computational approaches that leverage a wide variety of computing resources. These systems must be easy for users to adopt and must add minimal overhead for developers. Developers, in turn, must take advantage of these systems to build tools and applications that are highly portable, incur low overhead, and run on a wide variety of resources. To this end, we present multiple applications and modifications, along with lessons learned and guidelines for development. These applications also provide useful modifications to the community. Finally, we present a use case that leverages distributed computing and that would be computationally infeasible without it.

Attributes

Attribute Name Values
URN
  • etd-07262013-115037

Author Andrew David Thrasher
Advisor Dr. Scott Emrich
Contributor Dr. Adam Phillippy, Committee Member
Contributor Dr. Scott Emrich, Committee Chair
Contributor Dr. Michael Pfrender, Committee Member
Contributor Dr. Douglas Thain, Committee Member
Degree Level Doctoral Dissertation
Degree Discipline Computer Science and Engineering
Degree Name PhD
Defense Date
  • 2013-07-05

Submission Date 2013-07-26
Country
  • United States of America

Subject
  • distributed systems

  • bioinformatics

Publisher
  • University of Notre Dame

Language
  • English

Record Visibility and Access Public
Content License
  • All rights reserved
