Posted on 2013-04-19, 00:00
Authored by Rory Carmichael
Though many common bioinformatics problems are amenable to parallelization and large datasets are becoming the norm for biological inquiry, biologists do not generally have the skillset to effectively automate, parallelize, and scale their workflows. This document describes contributions to bioinformatics ranging from collaborative frameworks to the automation of common workflows to the development of novel algorithms. We begin by describing Biocompute, a web portal that overcomes challenges in user interface design and resource sharing to facilitate collaborations between systems programmers, bioinformatics software developers, and biologists. Next, we highlight several parallel workflow implementations developed to serve the needs of the University of Notre Dame's biologists. These leverage insights from both biology and distributed systems to achieve their goals. In implementing them we encountered and solved several practical challenges on the path to scaling up. We close with the introduction of a bioinformatics algorithm to detect loci-specific selective pressure favoring codon rarity in ortholog groups that span Archaea, Prokaryota, and Eukaryota.
History
Date Modified
2017-06-02
Research Director(s)
Scott Emrich
Committee Members
Douglas Thain
Scott Emrich
Patricia Clark
Degree
Master of Science in Computer Science and Engineering