Online Multi-Stage Deep Architectures for Feature Extraction and Object Recognition

Date Issued
August 1, 2013
Author(s)
Rose, Derek Christopher
Advisor(s)
Itamar Arel
Additional Advisor(s)
Hairong Qi
Husheng Li
J. Wesley Hines
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/23501
Abstract

Multi-stage visual architectures have recently found success in achieving high classification accuracies over image datasets with large variations in pose, lighting, and scale. Inspired by techniques currently at the forefront of deep learning, such architectures are typically composed of one or more layers of preprocessing, feature encoding, and pooling to extract features from raw images. Training these components traditionally relies on large sets of patches that are extracted from a potentially large image dataset. In this context, high-dimensional feature space representations are often helpful for obtaining the best classification performance and providing a higher degree of invariance to object transformations. Large datasets with high-dimensional features complicate the implementation of visual architectures in memory-constrained environments. This dissertation constructs online learning replacements for the components within a multi-stage architecture and demonstrates that the proposed replacements (namely fuzzy competitive clustering, an incremental covariance estimator, and a multi-layer neural network) can offer performance competitive with their offline batch counterparts while providing a reduced memory footprint. The online nature of this solution allows for the development of a method for adjusting parameters within the architecture via stochastic gradient descent. Testing over multiple datasets shows the potential benefits of this methodology when appropriate priors on the initial parameters are unknown. Alternatives to batch-based decompositions for a whitening preprocessing stage, which take advantage of natural image statistics and allow simple dictionary learners to work well in the problem domain, are also explored. Expansions of the architecture using additional pooling statistics and multiple layers are presented and indicate that larger codebook sizes are not the only step forward to higher classification accuracies. Experimental results from these expansions further indicate the important role of sparsity and appropriate encodings within multi-stage visual feature extraction architectures.
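
As an illustration of what an incremental covariance estimator can look like, the following is a minimal sketch in Python. It is not taken from the dissertation: it assumes a standard Welford-style running update, and the class name `OnlineCovariance` and its methods are hypothetical. It shows why an online estimator needs memory only for a mean vector and one matrix of the feature dimension, rather than for the full set of patches.

```python
class OnlineCovariance:
    """Running mean and sample covariance, updated one sample at a time.

    Hypothetical illustration (Welford-style update), not the
    dissertation's own estimator.
    """

    def __init__(self, dim):
        self.dim = dim
        self.n = 0
        self.mean = [0.0] * dim
        # Accumulated sum of outer products of deviations.
        self.m2 = [[0.0] * dim for _ in range(dim)]

    def update(self, x):
        """Fold a single sample (a sequence of `dim` numbers) into the estimate."""
        if len(x) != self.dim:
            raise ValueError("sample has wrong dimension")
        self.n += 1
        # Deviation from the mean before this sample is included.
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + di / self.n for mi, di in zip(self.mean, delta_old)]
        # Deviation from the mean after this sample is included.
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        for i in range(self.dim):
            for j in range(self.dim):
                self.m2[i][j] += delta_old[i] * delta_new[j]

    def covariance(self):
        """Return the unbiased sample covariance matrix as nested lists."""
        if self.n < 2:
            raise ValueError("need at least two samples")
        return [[v / (self.n - 1) for v in row] for row in self.m2]


if __name__ == "__main__":
    estimator = OnlineCovariance(2)
    for sample in [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]:
        estimator.update(sample)
    print(estimator.mean)
    print(estimator.covariance())
```

Each sample can be discarded after `update` is called, so storage stays fixed at one mean vector and one square matrix regardless of how many patches are processed. A whitening stage could in principle be derived from such an estimate, though how the dissertation does this is not stated in the abstract.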

Subjects

machine learning

image recognition

deep learning

clustering

high-dimensional feat...

feature pooling

Disciplines
Artificial Intelligence and Robotics
Other Computer Engineering
Other Statistics and Probability
Degree
Doctor of Philosophy
Major
Computer Engineering
Embargo Date
January 1, 2011
File(s)
Name

drosefinalwithTP.pdf

Size

6.83 MB

Format

Adobe PDF

Checksum (MD5)

4f3268a3a2f129abfa8abe560cff9ca9
