Identification and Quantification of Amino Acid Substitutions in Bottom-up Proteomics
dataset
posted on 2025-05-08, 15:26, authored by Taylor Lundgren
An amino acid substitution is a change in the amino acid (aa) sequence of a protein. Changing even one amino acid can alter the function and stability of the protein. To preserve protein function, many cellular mechanisms ensure that protein synthesis is faithful to the genome-defined sequence. Studying these mechanisms is difficult because proteins with substituted aa sequences exist in a complex mixture with proteins that adhere to the genome-defined sequence. Bottom-up proteomics (BUP) is a tool for measuring such complex protein mixtures. The standard BUP methodology extracts proteins, digests them into peptides, and measures the peptides with liquid chromatography-mass spectrometry. However, identifying the measured peptides in BUP relies on matching them to genome-defined sequences, so peptides that deviate from those sequences are not identified by default. An adaptation to BUP is therefore needed to measure aa substitutions.
I attempted to study all proteins that deviate from the genome-defined aa sequence. I found that data processing tools being developed to identify peptides with chemically modified amino acids could be adapted to identify single aa substitutions in bottom-up proteomics, and I determined the limitations of the bioinformatic analysis process. The identification of aa substitutions in particular is improving alongside the general progress of peptide identification tools. Substitutions that accumulate in vivo are measured effectively with BUP. However, many aa substitutions showed a survivorship bias: the abundance at which a substitution is observed differs from the rate at which it occurs. Such substitutions are only observed under conditions that increase substitution incidence or reduce degradation, or after specific enrichment of nascent peptides. I further used bottom-up proteomics to validate known determinants of substitution abundance and to model their relative contributions. These key findings on how aa substitutions are measured define the scope of testable hypotheses and the variables to consider in experimental designs.
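To illustrate why tools built for chemical modifications can plausibly be repurposed for substitutions, the sketch below treats a single aa substitution as a mass shift on one residue, which is how a modification search would see it. This is a minimal illustration and not the analysis pipeline used for this dataset: the function names (`peptide_mass`, `substitution_shift`, `single_substitution_variants`) are hypothetical, and the residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids,
# rounded to five decimals. Names below are illustrative assumptions.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the peptide termini


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def substitution_shift(original, substitute):
    """Mass change (Da) when `original` is replaced by `substitute`."""
    return RESIDUE_MASS[substitute] - RESIDUE_MASS[original]


def single_substitution_variants(seq):
    """Yield every single-residue variant of `seq` with its mass shift.

    Each item is (position, original, substitute, variant_sequence, shift).
    """
    for i, original in enumerate(seq):
        for substitute in RESIDUE_MASS:
            if substitute != original:
                variant = seq[:i] + substitute + seq[i + 1:]
                yield i, original, substitute, variant, substitution_shift(
                    original, substitute
                )


if __name__ == "__main__":
    peptide = "PEPTIDE"
    print(f"{peptide}: {peptide_mass(peptide):.4f} Da")
    for pos, orig, sub, variant, shift in single_substitution_variants(peptide):
        # Report only small shifts to keep the output short.
        if abs(shift) < 2.0:
            print(f"{orig}{pos + 1}{sub}  {variant}  {shift:+.4f} Da")
```

Two limitations show up directly in the mass table. An Ala to Ser substitution shifts the mass by about +15.995 Da, the same as an oxidation, so a substitution can be confused with a chemical modification on mass alone. Leu and Ile have identical masses, so that substitution produces no shift and cannot be detected this way.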