University of Notre Dame
RegierA102008.pdf (1.93 MB)

Challenges in working with draft genomes

thesis
posted on 2008-10-20, 00:00 authored by Allison A.P. Regier
As the cost of DNA sequencing falls, the relative cost of finishing steps (e.g., error correction and gap closing) is increasing. As a result, many genome projects are completed only to the draft stage and may not provide full information about the location of sequences on the chromosomes. Further, draft assemblies may contain gaps and assembly errors. Whether draft or finished, the output of a genome sequencing project serves as the input to a host of analysis tools, such as those for gene finding or variation analysis. Many of these tools have been designed for and tested on high-quality, finished genomes such as those of human or the fruit fly Drosophila melanogaster. In this thesis we discuss specific challenges in working with draft genomes and show how methods can be adapted to be more effective on them. First, we examine computational methods for finding errors in draft assemblies. Next, we modify a technique for finding DNA inversions between two genomes to account for gaps in the genomes. Finally, we develop a pipeline to construct chromosomes out of draft scaffolds using a closely related reference genome. We use examples from three species of importance to global health: the body louse (Pediculus humanus), a malaria mosquito (Anopheles gambiae), and the human malaria parasite (Plasmodium falciparum).

History

Date Modified

2017-06-02

Research Director(s)

Kevin W Bowyer, Scott J Emrich

Degree

  • Master of Science in Computer Science and Engineering

Degree Level

  • Master's Thesis

Language

  • English

Alternate Identifier

etd-10202008-153947

Publisher

University of Notre Dame

Program Name

  • Computer Science and Engineering
