Masters Theses
Date of Award
5-2014
Degree Type
Thesis
Degree Name
Master of Science
Major
Life Sciences
Major Professor
Loren J. Hauser
Committee Members
Elizabeth Fozo, Brian O'Meara, Chongle Pan
Abstract
The growing adoption of next-generation sequencing technologies gives numerous fields the opportunity to identify bacteria in near real-time. Fields such as counter-terrorism, forensics, medicine, and even microbial ecology are positioned to benefit from these advances. However, the ability to rapidly produce high-quality sequence data brings with it the need to interpret these data as quickly as they are produced. While gene prediction algorithms have kept pace, functional prediction methods have not.
To bypass the need for large-scale queries to multiple databases for each newly sequenced genome, the project detailed herein seeks to identify the genes shared within a taxonomic group using the pan-genome for that group. Doing so allows the pan-genome to be queried against these databases a single time, then rapidly searched with new genomes using k-mer peptide matching to make functional predictions.
Thirty-one strains of Salmonella enterica subsp. enterica were used to build the pan-genome for this taxon as a test model. Proteins in a new genome could then be matched with complete consistency to the resulting database in a matter of seconds (per genome) using a k-mer peptide search algorithm. This represents a major advancement in annotation speed over existing pipelines.
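The k-mer peptide matching described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the two toy sequences, their identifiers, and their function labels are all hypothetical, and the rule of picking the pan-genome protein that shares the most 10-mers is an assumption made for illustration. Only the peptide length of 10 comes from the thesis title.

```python
from collections import Counter, defaultdict

K = 10  # peptide k-mer length, taken from the thesis title


def kmers(protein, k=K):
    """Yield every overlapping k-mer of a protein sequence."""
    for i in range(len(protein) - k + 1):
        yield protein[i:i + k]


def build_index(annotated_proteins, k=K):
    """Map each k-mer to the set of pan-genome protein IDs that contain it."""
    index = defaultdict(set)
    for protein_id, (sequence, _function) in annotated_proteins.items():
        for kmer in kmers(sequence, k):
            index[kmer].add(protein_id)
    return index


def annotate(query, index, annotated_proteins, k=K):
    """Return (protein_id, function, shared_kmer_count) for the best hit, or None.

    Assumed scoring rule: the hit is the pan-genome protein sharing the
    most k-mers with the query.
    """
    votes = Counter()
    for kmer in kmers(query, k):
        for protein_id in index.get(kmer, ()):
            votes[protein_id] += 1
    if not votes:
        return None
    best_id, shared = votes.most_common(1)[0]
    return best_id, annotated_proteins[best_id][1], shared


# Toy pan-genome: invented IDs, sequences, and function labels.
PAN_GENOME = {
    "pg_0001": ("MKTAYIAKQRQISFVKSHFSRQ", "hypothetical function A"),
    "pg_0002": ("MSDNELLKRAEELGIDTTGLSW", "hypothetical function B"),
}
INDEX = build_index(PAN_GENOME)

# A protein from a "new genome" that is identical to pg_0001.
hit = annotate("MKTAYIAKQRQISFVKSHFSRQ", INDEX, PAN_GENOME)
print(hit)
```

In this sketch the index is built once, which mirrors the idea of annotating the pan-genome against external databases a single time; each new protein then costs only dictionary lookups, one per 10-mer.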
Recommended Citation
Utley, Jordan Matthew, "R-FAP: Rapid Functional Annotation of Prokaryotes Using Taxon-specific Pan-genomes and 10-mer Peptides." Master's Thesis, University of Tennessee, 2014.
https://trace.tennessee.edu/utk_gradthes/2780
Included in
Bioinformatics Commons, Biology Commons, Computational Biology Commons, Genomics Commons