Date of Award

8-2013

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Engineering

Major Professor

Itamar Arel

Committee Members

Hairong Qi, Husheng Li, J. Wesley Hines

Abstract

Multi-stage visual architectures have recently found success in achieving high classification accuracies over image datasets with large variations in pose, lighting, and scale. Inspired by techniques currently at the forefront of deep learning, such architectures are typically composed of one or more layers of preprocessing, feature encoding, and pooling to extract features from raw images. Training these components traditionally relies on large sets of patches that are extracted from a potentially large image dataset. In this context, high-dimensional feature space representations are often helpful for obtaining the best classification performance and providing a higher degree of invariance to object transformations. Large datasets with high-dimensional features complicate the implementation of visual architectures in memory-constrained environments. This dissertation constructs online learning replacements for the components within a multi-stage architecture and demonstrates that the proposed replacements (namely fuzzy competitive clustering, an incremental covariance estimator, and a multi-layer neural network) can offer performance competitive with their offline batch counterparts while providing a reduced memory footprint. The online nature of this solution allows for the development of a method for adjusting parameters within the architecture via stochastic gradient descent. Testing over multiple datasets shows the potential benefits of this methodology when appropriate priors on the initial parameters are unknown. Alternatives to batch-based decompositions for a whitening preprocessing stage, which take advantage of natural image statistics and allow simple dictionary learners to work well in the problem domain, are also explored. Expansions of the architecture using additional pooling statistics and multiple layers are presented and indicate that larger codebook sizes are not the only step forward to higher classification accuracies. Experimental results from these expansions further indicate the important role of sparsity and appropriate encodings within multi-stage visual feature extraction architectures.

Recommended Citation

Rose, Derek Christopher, "Online Multi-Stage Deep Architectures for Feature Extraction and Object Recognition." PhD diss., University of Tennessee, 2013.
https://trace.tennessee.edu/utk_graddiss/2473

Included in

Artificial Intelligence and Robotics Commons, Other Computer Engineering Commons, Other Statistics and Probability Commons
