Date of Award
Doctor of Philosophy
Jamie Coble, Jens Gregor, Hairong Qi
Recently, deep learning models such as convolutional and recurrent neural networks have displaced state-of-the-art techniques in a variety of application domains. While the computationally heavy process of training is usually conducted on powerful graphics processing units (GPUs) distributed in large computing clusters, the resulting models can still be somewhat heavy, making deployment in resource-constrained environments potentially problematic. In this work, we build upon the idea of conditional computation, where the model is given the capability to learn how to avoid computing parts of the graph. This allows for models where the number of parameters (and, in a sense, the model’s capacity to learn) can grow at a faster rate than the computation required to propagate information through the graph. We apply conditional computation to feedforward and recurrent neural networks. In the feedforward case, we demonstrate a technique that trades off accuracy for potential computational benefits, and in the recurrent case, we demonstrate techniques that yield practical speed benefits on a language modeling task. Given the rapidly expanding domain of problems where deep learning proves useful, the work presented here can help meet the future scalability requirements of deploying trained models.
Davis, Andrew Scott, "Conditional Computation in Deep and Recurrent Neural Networks." PhD diss., University of Tennessee, 2016.