Scaling Collaborative Bioinformatics
thesis
posted on 2013-04-19, 00:00 authored by Rory Carmichael

Though many common bioinformatics problems are amenable to parallelization and large datasets are becoming the norm for biological inquiry, biologists do not generally have the skillset to effectively automate, parallelize, and scale their workflows. This document describes contributions to bioinformatics ranging from collaborative frameworks to the automation of common workflows to the development of novel algorithms. We begin by describing Biocompute, a web portal that overcomes challenges in user interface design and resource sharing to facilitate collaborations between systems programmers, bioinformatics software developers, and biologists. Next, we highlight several parallel workflow implementations developed to serve the needs of the University of Notre Dame's biologists. These leverage insights from both biology and distributed systems to achieve their goals. In implementing them we encountered and solved several practical challenges on the path to scaling up. We close with the introduction of a bioinformatics algorithm to detect loci-specific selective pressure favoring codon rarity in ortholog groups that span Archaea, Prokaryota, and Eukaryota.
History
Date Modified
2017-06-02
Research Director(s)
Scott Emrich
Committee Members
Douglas Thain, Scott Emrich, Patricia Clark
Degree
- Master of Science in Computer Science and Engineering
Degree Level
- Master's Thesis
Language
- English
Alternate Identifier
etd-04192013-135548
Publisher
University of Notre Dame
Program Name
- Computer Science and Engineering