Reliability and reproducibility are key metrics for gene expression assays. This report assesses the utility of the correlation coefficient in the analysis of reproducibility and reliability of gene expression data.


The correlation coefficient alone is not sufficient to assess equality among sample replicates, but when it is coupled with the slope and with scatter plots, the equality of expression data can be assessed better. Scatter plots restricted to narrow intervals should be shown as a tool for inspecting the actual level of noise within the data. Here we propose a method for examining expression data reproducibility that is based on the ratios of both the means and the standard deviations of the inter-treatment expression ratios of genes. In addition, we introduce a fold-change threshold whose likelihood of occurring between replicates is lower than 5%, so that analysis can be performed even when reproducibility is not acceptable. A perfect correlation between transcript and protein levels cannot be found even in the absence of any post-transcriptional regulatory mechanism. We therefore propose adjusting protein abundance relative to transcript abundance based on open reading frame length.
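The two ideas that lend themselves most directly to computation, pairing the correlation coefficient with the slope and deriving a fold-change threshold from the replicates themselves, can be illustrated with a minimal Python sketch. It is not the procedure used in the report: the function names, the least-squares slope, and the nearest-rank 95th percentile of absolute log2 replicate ratios are all assumptions chosen for illustration.

```python
import math


def pearson_and_slope(x, y):
    """Return (r, slope) for paired expression values x and y.

    The slope is that of the least-squares line y = slope * x + intercept.
    (Illustrative helper; not taken from the report.)
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope


def replicate_fold_change_threshold(rep1, rep2, quantile=0.95):
    """Fold change exceeded by at most (1 - quantile) of genes between replicates.

    Assumption: the threshold is the nearest-rank quantile of the absolute
    log2 ratios between two replicates of the same treatment.
    """
    abs_log2 = sorted(abs(math.log2(a / b)) for a, b in zip(rep1, rep2))
    rank = math.ceil(quantile * len(abs_log2))  # 1-based nearest rank
    return 2 ** abs_log2[rank - 1]


if __name__ == "__main__":
    # Hypothetical replicate intensities: the second is exactly twice the first.
    replicate_a = [10, 20, 40, 80, 160]
    replicate_b = [2 * v for v in replicate_a]
    r, slope = pearson_and_slope(replicate_a, replicate_b)
    print(f"r = {r:.3f}, slope = {slope:.3f}")

    threshold = replicate_fold_change_threshold([8, 8, 8, 8], [8, 16, 4, 8])
    print(f"fold-change threshold = {threshold:.2f}")
```

In the first toy case the correlation is perfect although every value in the second replicate is double its counterpart, and only the slope of 2 exposes the inequality. In the second, an inter-treatment change would be considered only if it exceeds the fold change that replicates of the same sample rarely reach by noise alone.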


Here, we introduce a very efficient approach to assessing reproducibility. Our method detects very small changes in large datasets, which was not possible with regular correlation analysis. We also introduce a correction to protein quantities that allows post-transcriptional regulatory effects to be examined with higher accuracy.
