Computational Analyses of Codon Usage Bias and Its Effects on Protein Translation, Expression, and Folding
Synonymous codons (i.e., codons that code for the same amino acid) are not used uniformly within a species' genome, resulting in measurable codon usage bias (CUB) for each species. Despite not affecting the resulting amino acid sequence, alterations to synonymous codon usage in a gene have been shown to affect the expression, folding, and function of the resulting protein. One prominent hypothesis about why this occurs is that a codon's translation rate is directly related to how 'preferred' the codon is in a given species, and that altered translation rates can potentially affect co-translational protein folding. However, there is debate about how best to define CUB so that it most accurately depicts codon translation rates. A better understanding of codon translation rates is imperative to understanding the processes of co-translational protein folding and heterologous gene expression. The debate about how best to define CUB has resulted in the development of numerous codon usage models, all of which define preferred codons in a distinct way. However, a rigorous comparison of these codon usage models has not yet been completed.
To that end, this work discusses a novel, rigorous comparison of these codon usage models relative to two different types of experimental data, as well as the development of two hybrid codon usage models that address a weakness of one of the most popular CUB models. Additionally, a novel 'codon harmonization' algorithm is developed, whose aim is to replicate the CUB of a native gene sequence in a synonymous mutant (for heterologous gene expression). This algorithm has the ability to incorporate a number of codon usage models, and therefore allows each model's biological efficacy to be tested in vivo. Finally, a novel framework for evaluating the effects of CUB on three-dimensional protein structures is discussed. This framework aims to illuminate the effects of hypothesized co-translational folding on resulting protein structures.
History
Date Modified
2021-05-18
Defense Date
2021-03-26
CIP Code
- 40.0501
Research Director(s)
Scott Emrich
Committee Members
Patricia Clark, Tim Weninger, Collin McMillan
Degree
- Doctor of Philosophy
Degree Level
- Doctoral Dissertation
Alternate Identifier
1250640528
Library Record
6022711
OCLC Number
1250640528
Rights Statement
https://creativecommons.org/licenses/by-nc/4.0/
Program Name
- Computer Science and Engineering