Leveraging Large-Scale Distributed Systems for Massive Biological Datasets

Doctoral Dissertation


The tremendous increase in available bioinformatics data necessitates computational approaches that leverage a wide variety of computing resources. The underlying systems must be easy to use and must impose minimal overhead on developers. Developers, in turn, must take advantage of these systems to build tools and applications that are highly portable, have low overhead, and can run on a wide variety of resources. To this end, we present multiple applications and modifications, along with lessons learned and guidelines for development. In addition, these applications provide modifications that are useful to the community. Finally, we present a use case that leverages distributed computing and would be computationally infeasible without it.


Attribute Name  Values
  • etd-07262013-115037

Author Andrew David Thrasher
Advisor Dr. Scott Emrich
Contributor Dr. Adam Phillippy, Committee Member
Contributor Dr. Scott Emrich, Committee Chair
Contributor Dr. Michael Pfrender, Committee Member
Contributor Dr. Douglas Thain, Committee Member
Degree Level Doctoral Dissertation
Degree Discipline Computer Science and Engineering
Degree Name Doctor of Philosophy
Defense Date
  • 2013-07-05

Submission Date 2013-07-26
  • United States of America

  • distributed systems

  • bioinformatics

  • University of Notre Dame

  • English

Record Visibility Public
Content License
  • All rights reserved

Departments and Units

Digital Object Identifier



