While many tools exist for analysing and identifying spectra generated in shotgun proteomics experiments, few easily implemented tools automate the assessment of spectral quality. Knowing the quality of the spectra from an experiment can help a researcher determine why spectra in a sample were misidentified or not identified at all.
Materials and methods
We are developing an automated, high-throughput method that analyses spectra from 2D-LC-MS/MS datasets to determine the quality of each spectrum and of the run as a whole. We will then compare our program with existing programs that perform a similar function. Our program calculates a quality score based on the following metrics: signal-to-noise ratio, absolute signal intensity, number of peaks, predicted mass distances between peaks, and percent of incoming mass accounted for by peaks. These scores are then plotted against the outputs of common database search algorithms to sort spectra into four categories: High-quality/Identified, High-quality/Unidentified, Low-quality/Identified, and Low-quality/Unidentified. We are currently testing the algorithm against 2D-LC-MS/MS runs of a mixed protein standard and of blanks that contain no peptide spectra. The application samples are a time series of metaproteomes collected from environmental groundwater after biostimulation.
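As a rough illustration of how such a score and the four categories could fit together, the following Python sketch computes three of the five metrics named above (signal-to-noise ratio, number of peaks, and mass distances between peaks) and combines them into a single value. It is not the program described here: the function names, the median-based noise estimate, the equal weighting, the normalisation constants, the 0.5 Da tolerance, the 0.5 quality cutoff, and the partial residue mass table are all assumptions made for the example.

```python
from statistics import median

# Monoisotopic residue masses (Da) for a subset of amino acids.
# Illustrative only; a real table would cover all residues and modifications.
RESIDUE_MASSES = [57.02146, 71.03711, 87.03203, 97.05276, 99.06841, 113.08406]


def signal_to_noise(intensities):
    """Most intense peak divided by the median intensity (assumed noise level)."""
    noise = median(intensities)
    return max(intensities) / noise if noise > 0 else 0.0


def residue_gap_fraction(mzs, tol=0.5):
    """Fraction of adjacent peak gaps matching a residue mass within tol.

    Assumes singly charged fragment ions.
    """
    ordered = sorted(mzs)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0
    hits = sum(1 for g in gaps if any(abs(g - m) <= tol for m in RESIDUE_MASSES))
    return hits / len(gaps)


def quality_score(peaks):
    """Combine the metrics into a score in [0, 1].

    peaks is a list of (m/z, intensity) pairs. Equal weights and the
    normalisation constants (10 and 50) are arbitrary assumptions.
    """
    if not peaks:
        return 0.0
    mzs = [mz for mz, _ in peaks]
    intensities = [inten for _, inten in peaks]
    terms = [
        min(signal_to_noise(intensities) / 10.0, 1.0),
        min(len(peaks) / 50.0, 1.0),
        residue_gap_fraction(mzs),
    ]
    return sum(terms) / len(terms)


def classify(quality, identified, quality_cutoff=0.5):
    """Assign one of the four categories from a quality score and a search result."""
    quality_label = "High-quality" if quality >= quality_cutoff else "Low-quality"
    id_label = "Identified" if identified else "Unidentified"
    return f"{quality_label}/{id_label}"
```

In use, `identified` would come from whatever acceptance threshold is applied to the database search output, so a call such as `classify(quality_score(peaks), identified=False)` flags spectra that look good but were not matched.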
BMC Bioinformatics 2010, 11(Suppl 4):P27 doi:10.1186/1471-2105-11-S4-P27