Date of Award

12-2015

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Engineering

Major Professor

Itamar Arel

Committee Members

Jamie Coble, Jeremy Holleman, Jens Gregor

Abstract

Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has prevented them from performing well in more realistic online environments is catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or non-stationary. Most real-world problems are non-stationary in nature, so prolonged periods of time can separate inputs drawn from different regions of the input space.

Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks. Meaningful training examples are acquired as the agent explores different regions of its state/action space. When the agent is in one such region, only highly correlated samples from that region are typically acquired. Moreover, the regions that the agent is likely to visit depend on its current policy, suggesting that an agent with a good policy may avoid exploring particular regions. The confluence of these factors means that, without some mitigation technique, neural networks used as function approximators in temporal-difference learning will be restricted to the simplest test cases.

This work explores catastrophic forgetting in neural networks in the context of supervised and reinforcement learning. A simple mathematical model is introduced to argue that catastrophic forgetting is a result of overlapping representations in the hidden layers, in which updates to the weights can affect multiple unrelated regions of the input space. A novel neural network architecture, dubbed "cluster-select," is introduced which utilizes online clustering for the selection of a subset of hidden neurons to be activated in the feedforward and backpropagation stages. Cluster-select is demonstrated to outperform leading techniques in both classification and regression. In the context of reinforcement learning, cluster-select is studied for both fully and partially observable Markov decision processes and is demonstrated to converge faster and behave in a more stable manner when compared to other state-of-the-art algorithms.
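
The general idea of selecting hidden neurons by online clustering can be illustrated with a minimal sketch. This is not the dissertation's implementation; every detail below is an assumption made for illustration: the class name ClusterSelectSketch, the fixed block of hidden units per cluster, nearest-centroid selection with a running-mean centroid update, tanh hidden units, a single linear output without bias, and squared-error training. It only shows how restricting the forward and backward passes to one cluster's units leaves the weights tied to other input regions untouched.

```python
import numpy as np


class ClusterSelectSketch:
    """Illustrative one-hidden-layer regressor; all design details are assumed."""

    def __init__(self, n_in, n_clusters, units_per_cluster,
                 lr=0.1, centroid_lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.k = n_clusters
        self.m = units_per_cluster
        n_hidden = n_clusters * units_per_cluster
        # One centroid per cluster; each cluster owns a contiguous block of units.
        self.centroids = rng.uniform(-1.0, 1.0, size=(n_clusters, n_in))
        self.W1 = rng.normal(0.0, 0.5, size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, size=n_hidden)
        self.lr = lr
        self.centroid_lr = centroid_lr

    def _select(self, x):
        # Pick the nearest centroid and build a 0/1 mask over its hidden units.
        c = int(np.argmin(np.sum((self.centroids - x) ** 2, axis=1)))
        mask = np.zeros(self.k * self.m)
        mask[c * self.m:(c + 1) * self.m] = 1.0
        return c, mask

    def predict(self, x):
        _, mask = self._select(x)
        h = np.tanh(self.W1 @ x + self.b1) * mask
        return float(self.w2 @ h)

    def train_step(self, x, y):
        c, mask = self._select(x)
        # Online clustering: move the winning centroid toward the sample.
        self.centroids[c] += self.centroid_lr * (x - self.centroids[c])
        # Forward pass through the selected units only.
        a = np.tanh(self.W1 @ x + self.b1)
        h = a * mask
        err = float(self.w2 @ h) - y
        # Backward pass: the mask zeroes every gradient of unselected units.
        grad_w2 = err * h
        grad_pre = err * self.w2 * mask * (1.0 - a ** 2)
        self.w2 -= self.lr * grad_w2
        self.W1 -= self.lr * np.outer(grad_pre, x)
        self.b1 -= self.lr * grad_pre
        return c
```

In this sketch a training step on a sample from one region changes only the weights of the block owned by the winning cluster, so units associated with other regions keep what they have learned. How the actual architecture assigns neurons to clusters and updates its clusters is described in the dissertation itself.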
