University of Notre Dame
File(s) under permanent embargo

A Flexible Comparative Genomics Framework for Integrating Heterogeneous Sequence Data

thesis
posted on 2011-07-22, 00:00; authored by Allison Ann Regier
Genome sequencing technologies have revolutionized biology in the past two decades, yet data analysis has lagged behind data production. In this thesis, we present a framework for analyzing genomic data more flexibly than previous techniques allow. First, the framework lets researchers design analyses that compare genomic samples directly, instead of relying on reference-relative variant calls as most current tools do. Second, we provide utilities for examining assembly data and resequencing data in the same analysis, whereas previous tools were restricted to one or the other. Finally, the framework lets researchers flexibly incorporate alignments to arbitrarily many reference sequences into their analyses.

We describe FlexReseq, the software implementation of this framework. FlexReseq allows researchers to customize resequencing analyses using a simple configuration file that defines positions of interest. We present results from applications of these tools, including genotyping strains of Plasmodium falciparum, measuring diversity and divergence between strains of Anopheles gambiae, detecting inversions in A. gambiae from assembly and alignment information, and exploring resequencing analysis using alignments to multiple reference sequences.

History

Date Modified

2017-06-05

Defense Date

2011-07-07

Research Director(s)

Scott J. Emrich

Committee Members

Mihai Pop, Frank Collins, Kevin Bowyer, Nora Besansky

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Alternate Identifier

etd-07222011-111630

Publisher

University of Notre Dame

Program Name

  • Computer Science and Engineering
