Title

Automated Genome-Wide Protein Domain Exploration

Date of Award

12-2007

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Engineering

Major Professor

Gregory D. Peterson

Committee Members

Igor Jouline, Michael Berry, Ethan Farquhar

Abstract

Exploiting the exponentially growing genomics and proteomics data requires high quality, automated analysis. Protein domain modeling is a key area of molecular biology as it unravels the mysteries of evolution, protein structures, and protein functions. A plethora of sequences exist in protein databases with incomplete domain knowledge. Hence this research explores automated bioinformatics tools for faster protein domain analysis. Automated tool chains described in this dissertation generate new protein domain models thus enabling more effective genome-wide protein domain analysis. To validate the new tool chains, the Shewanella oneidensis and Escherichia coli genomes were processed, resulting in a new peptide domain database, detection of poor domain models, and identification of likely new domains. The automated tool chains will require months or years to model a small genome when executing on a single workstation. Therefore the dissertation investigates approaches with grid computing and parallel processing to significantly accelerate these bioinformatics tool chains.

Files over 3MB may be slow to open. For best results, right-click and select "save as..."

Share

COinS