Masters Theses

Date of Award

12-2002

Degree Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Major Professor

Bruce A. Whitehead

Committee Members

Kenneth R. Kimble, Bruce W. Bomar

Abstract

A typical feedforward neural network relies solely on its training algorithm, such as backpropagation or Quickprop, to determine suitable weight values for an architecture chosen by the human operator. The architecture itself is typically a fully connected arrangement of neurons and synapses in which each hidden neuron is connected to every neuron in the next layer. Such an architecture does not reflect the structure of the data used to train it. Similarly, when random initial weight values are used, these initial weights are also unlikely to relate to the training set. The training algorithm must therefore adjust the weights without any initial hint of the structure and general trends present in the training data.

This thesis investigates the effect of restructuring a typical fully connected architecture into a collection of subnets and processing modules arranged in an application-specific ordering. The conglomeration of these modules and subnets will be called a supernet.

The processing modules use techniques such as cluster analysis to find general patterns within the training set, somewhat like a low-resolution representation of trends within the data. The subnets are then used for “drilling deeper” into examples that exhibit these trends, producing a higher-resolution representation of aspects of the training set. Additional modules, referred to in the text as dicer and splicer agents, are used respectively to direct training examples to particular subnets and to join the outputs of different subnets. The resulting structure of a supernet is similar to that of a neural network, but instead of being made up purely of neurons and synapses, a supernet is an assemblage of processing modules and neural subnets connected by dicers and splicers.
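To make the roles of these components concrete, the following is a minimal sketch of one possible supernet assembly, written in Python. The class names (Dicer, Splicer, SuperNet), the use of k-means for the clustering step, and the fit/predict subnet interface are assumptions made for illustration rather than details taken from the thesis.

import numpy as np
from sklearn.cluster import KMeans

class Dicer:
    # Directs each training example to the subnet responsible for its cluster.
    def __init__(self, n_clusters):
        self.kmeans = KMeans(n_clusters=n_clusters, n_init=10)

    def fit(self, X):
        self.kmeans.fit(X)
        return self

    def route(self, X):
        # Index of the subnet that should receive each example.
        return self.kmeans.predict(X)

class Splicer:
    # Joins the per-subnet outputs back into one array, preserving example order.
    @staticmethod
    def join(routes, per_subnet_outputs, n_examples, out_dim):
        y = np.zeros((n_examples, out_dim))
        for subnet_idx, outputs in per_subnet_outputs.items():
            y[routes == subnet_idx] = outputs
        return y

class SuperNet:
    # An assemblage of a dicer, neural subnets, and a splicer.
    def __init__(self, subnets):
        self.dicer = Dicer(n_clusters=len(subnets))
        self.subnets = subnets  # e.g. small feedforward networks

    def fit(self, X, y):
        routes = self.dicer.fit(X).route(X)
        for i, subnet in enumerate(self.subnets):
            mask = routes == i
            if mask.any():
                subnet.fit(X[mask], y[mask])  # each subnet specializes on its cluster
        return self

    def predict(self, X):
        routes = self.dicer.route(X)
        outputs = {}
        for i, subnet in enumerate(self.subnets):
            mask = routes == i
            if mask.any():
                out = np.asarray(subnet.predict(X[mask]))
                outputs[i] = out.reshape(mask.sum(), -1)
        out_dim = next(iter(outputs.values())).shape[1]
        return Splicer.join(routes, outputs, len(X), out_dim)

In such a sketch the subnets could be any small feedforward models exposing fit and predict; the point is only that the dicer, the subnets, and the splicer remain separate, reusable modules that can be connected quickly.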

The goal of this thesis is to demonstrate the practical advantages of supernets, which are reviewed and compared with a standard fully connected neural network structure.

The results of this practical evaluation show that there are definite benefits to the supernet technique, such as the Classification Based on Subnet Error (CBSE) model, which works well with clusters that exhibit dissimilar local behavior. Since a supernet is designed specifically for certain types of application, it cannot be expected to improve performance for applications in general; it is therefore necessary to develop a specially tailored supernet for the particular type of application in which it will be used. Although this can mean more work for the neural network developer, the workload can be radically reduced by reusing modules and providing a quick and simple means of connecting them. If similar types of applications occur frequently, the effort of developing supernets for these recurring applications should provide long-term benefits, since in the long run it will be easier to obtain trained neural networks for them. Such an effect is not necessarily achieved with standard neural networks, since they are designed for universal use and therefore do not have a structure optimized for specific types of application.
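As one illustration of how a subnet-error criterion of this kind might operate, the short Python sketch below assigns an example to the class whose dedicated subnet reproduces it with the smallest error. The per-class autoassociative subnets and their predict interface are assumptions for this sketch; the thesis's exact CBSE formulation may differ.

import numpy as np

def cbse_classify(x, subnets_by_class):
    # subnets_by_class: dict mapping class label -> trained autoassociative
    # subnet exposing predict(); both the mapping and the interface are
    # assumed here for illustration.
    errors = {}
    for label, net in subnets_by_class.items():
        reconstruction = np.asarray(net.predict(x[None, :])).ravel()
        errors[label] = float(np.mean((reconstruction - x) ** 2))
    # Assign the example to the class whose subnet shows the lowest error.
    return min(errors, key=errors.get)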
