
QUANTITATIVE AND FUNCTIONAL ANALYSIS PIPELINE FOR LABEL-FREE METAPROTEOMICS DATA AND ITS APPLICATIONS

Date Issued
August 1, 2015
Author(s)
Lee, Lang Ho  
Advisor(s)
Nathan C. VerBerkmoes, Tim E. Sparer
Additional Advisor(s)
Tamah Fridman
Arnold M. Saxton
Chongle Pan
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/24552
Abstract

Since the first large-scale metaproteome was reported in 2005, metaproteomics has advanced rapidly in both its quantitative and qualitative metrics. Furthermore, metaproteomics is now applied as a general tool in microbial ecology across a wide variety of environmental studies. Although metaproteomics is becoming a useful and even standard tool for the microbial ecologist, standardized bioinformatics pipelines are not readily available. Therefore, we developed a quantitative and functional analysis pipeline for metaproteomics (QFAM) to help analyze large and complicated metaproteomics data in a robust and timely fashion, with outputs designed to be simple and clearly understood by the microbial ecologist.


QFAM starts by running peptide-spectrum searches of the resulting MS/MS datasets against a mixed metagenome/appropriate-protein FASTA database. Its primary search algorithm is MyriMatch/IDPicker. MyriMatch/IDPicker uses multiple CPUs effectively, has an accurate scoring system, correctly uses high-mass-accuracy MS data, and has a robust method for protein determination. These features are required for metaproteomics, which involves large protein databases and complicated peptide structure.

QFAM comprises quantitative (QAM) and functional (FAM) analysis components that provide dependable protein signatures and confident information for understanding the characteristics of the metaproteome. QAM employs the ‘selfea’ R package, which provides probability models as well as Cohen’s effect sizes. Our benchmark data test and Monte Carlo simulation results show that selfea can reduce false positives efficiently while losing few true positives, which is one of the key goals of proteomics and metaproteomics experiments.
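As a rough illustration of how an effect size can complement a significance test, the sketch below computes Cohen's d (one of Cohen's effect sizes) with a pooled standard deviation and keeps a protein only when both a p-value threshold and an effect-size threshold are met. This is not the selfea API: the function names `cohens_d` and `keep_feature`, the thresholds `alpha=0.05` and `min_effect=0.8`, and the idea of combining the two criteria in exactly this way are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def keep_feature(p_value, effect_size, alpha=0.05, min_effect=0.8):
    """Hypothetical filter: require statistical significance AND a large effect."""
    return p_value < alpha and abs(effect_size) >= min_effect


# Hypothetical spectral counts for one protein in two conditions.
treated = [2, 4, 6]
control = [1, 3, 5]
d = cohens_d(treated, control)
```

Requiring both criteria is one plausible way to discard proteins whose differences are statistically significant but too small to matter, which is the kind of false positive the abstract refers to.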

FAM has two modules: BioSystems and COG analysis. The BioSystems module is most appropriate for well-annotated model organisms, such as humans, whereas the COG module is useful for less-annotated microorganisms and metagenome sequences. Both modules provide an enrichment test using Fisher’s exact test and a significance test using selfea. With these two statistics, FAM generates differentially enriched functional terms that are insightful for discerning the biological information underlying the metaproteome data.
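The enrichment step can be pictured as a one-sided Fisher's exact test on a 2x2 table, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below is a minimal standard-library version, not FAM's actual code; the function name `fisher_enrichment_p` and its parameter names are hypothetical, and the choice of a one-sided test is an assumption.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: selected proteins annotated with the functional term
    n: selected proteins in total
    K: background proteins annotated with the term
    N: background proteins in total
    Returns P(X >= k) for X hypergeometric with parameters (N, K, n).
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 3 of 3 selected proteins carry a term that
# annotates 3 of the 10 background proteins.
p = fisher_enrichment_p(k=3, n=3, K=3, N=10)
```

A small p-value suggests the term is over-represented among the selected proteins compared with the background. In practice, the p-values from many terms would also need multiple-testing correction.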

Two application studies, presented in Chapters 4 and 5, show how QFAM can be employed for metaproteomics data analysis. QFAM is distinguished from other proteomics pipelines by its multiprocessing support and its combined quantitative and functional analysis.

Subjects

Metaproteomics

Mass spectrometry

Quantitative analysis...

Functional analysis

Cohen's effect sizes

Quasi-Poisson

Disciplines
Biochemistry
Bioinformatics
Integrative Biology
Degree
Doctor of Philosophy
Major
Life Sciences
Embargo Date
January 1, 2011
File(s)
Name

Doctoral_dissertation_LHL.docx

Size

25.75 MB

Format

Microsoft Word XML

Checksum (MD5)

e9ac21075d2b275076ed4299b7284d3b

Name

Doctoral_dissertation_LHL_2015726.pdf

Size

7.37 MB

Format

Adobe PDF

Checksum (MD5)

5eed141a490d0e86eaa86fd91409eac5
