Doctoral Dissertations

Date of Award

8-2008

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Chemistry

Major Professor

Frank Vogt

Committee Members

Michael Sepaniak, Michael Best, Nicole Labbé

Abstract

Spectroscopic imaging is a vital tool for studying heterogeneous samples such as bacteria and tissue. Its ability to acquire spatially resolved information allows for identification and classification of the various constituents within a sample. Spectroscopic imagers quickly acquire thousands to tens of thousands of spectra per measurement. These data are often arranged in the form of a 3-dimensional (3D) data cube which contains two spatial dimensions and one spectral dimension. This large amount of data is beneficial for gaining a thorough understanding of how chemical information is distributed across a sample. If too little information is measured, important chemical behavior may be overlooked. Statistical analysis algorithms (chemometrics) are required to determine the relevant spectroscopic information within a data cube. Applying chemometrics to such large volumes of data presents computational difficulties regarding computer memory and processing speed. To overcome these burdens, wavelet transform compression is applied prior to chemometric evaluation to accelerate computations and reduce data storage requirements.

To optimize compression by enhancing acceleration and reducing approximation errors, different wavelets, or "hybrid wavelets", can be applied to the different dimensions of a 3D data set. Determining which combination of wavelets will yield the most compression and best data representation is difficult since many possibilities exist. A compression method is presented that automatically determines the optimum wavelet combinations for a given data set. Principal component analysis (PCA) is used to demonstrate the capabilities of this new procedure, but the compression routine is advantageous for many chemometric techniques.
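
The idea of applying a different wavelet to each dimension of a data cube can be illustrated with a minimal sketch. It is not the dissertation's routine: it performs a single decomposition level, keeps only the approximation coefficients, uses periodic boundary handling, and takes the wavelet for each axis as a fixed input rather than selecting it automatically. The function names (`approx`, `rotate`, `compress_cube`), the cube size, and the choice of Haar for the spatial axes and Daubechies-4 for the spectral axis are all assumptions made for illustration.

```python
import math

# Low-pass (scaling) filters of two standard orthonormal wavelets.
HAAR = [1 / math.sqrt(2), 1 / math.sqrt(2)]
_S3 = math.sqrt(3)
DB4 = [(1 + _S3) / (4 * math.sqrt(2)), (3 + _S3) / (4 * math.sqrt(2)),
       (3 - _S3) / (4 * math.sqrt(2)), (1 - _S3) / (4 * math.sqrt(2))]


def approx(signal, lowpass):
    """One wavelet level, approximation only: periodic low-pass filtering
    followed by downsampling by 2. Assumes an even-length signal."""
    n = len(signal)
    return [sum(h * signal[(2 * k + j) % n] for j, h in enumerate(lowpass))
            for k in range(n // 2)]


def rotate(cube):
    """Move the last axis to the front: new[c][a][b] = old[a][b][c]."""
    A, B, C = len(cube), len(cube[0]), len(cube[0][0])
    return [[[cube[a][b][c] for b in range(B)] for a in range(A)]
            for c in range(C)]


def compress_cube(cube, lp_x, lp_y, lp_spec):
    """Compress cube[x][y][wavelength] with one (possibly different)
    wavelet per axis. Each pass filters the current last axis and then
    rotates, so after three passes the axis order is (x, y, wavelength)
    again."""
    for lowpass in (lp_spec, lp_y, lp_x):
        cube = [[approx(vec, lowpass) for vec in plane] for plane in cube]
        cube = rotate(cube)
    return cube


# Hypothetical 4 x 4 pixel image with 8 spectral channels.
cube = [[[1.0] * 8 for _ in range(4)] for _ in range(4)]
small = compress_cube(cube, HAAR, HAAR, DB4)
print(len(small), len(small[0]), len(small[0][0]))  # each axis is halved
```

One level along all three axes reduces the number of stored values by a factor of eight. A routine that searches for the best combination would, under the same assumptions, repeat this for each candidate triple of filters and compare the resulting approximation errors.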

Although linear algorithms like PCA work well in many situations, they are not well-adapted for explaining nonlinear relationships. Kernel principal component analysis (KPCA) has recently been developed to overcome the limitations of linear algorithms. However, when applied to spectroscopic imaging, KPCA calculations require multiple gigabytes of RAM just for holding the data. Therefore, routine use of the algorithm is often prohibited on personal computers. To circumvent such situations, a wavelet compression algorithm is presented that avoids ever having to hold all data in memory at any point during the calculations. The goal is to enable the application of KPCA to large imaging data sets of heterogeneous samples.
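
The memory burden can be made concrete with a rough estimate. The sketch below assumes that the dominant cost is the n × n kernel matrix KPCA builds over n spectra, stored in double precision (8 bytes per value); the 128 × 128 pixel image size and the helper name `kernel_matrix_bytes` are hypothetical and not taken from the dissertation.

```python
def kernel_matrix_bytes(n_spectra, bytes_per_value=8):
    """Memory needed for a dense n x n kernel matrix (assumed float64)."""
    return n_spectra ** 2 * bytes_per_value


GIB = 2 ** 30

# Hypothetical image: 128 x 128 pixels, one spectrum per pixel.
n_full = 128 * 128
# Assumed effect of halving both spatial dimensions by compression.
n_compressed = n_full // 4

full = kernel_matrix_bytes(n_full)
compressed = kernel_matrix_bytes(n_compressed)

print(f"uncompressed kernel matrix: {full / GIB:.3f} GiB")
print(f"after spatial compression:  {compressed / GIB:.3f} GiB")
```

Because the kernel matrix grows with the square of the number of spectra, reducing the spatial dimensions by a factor of four shrinks it by a factor of sixteen under these assumptions.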
