
Masters Theses
Date of Award
12-2024
Degree Type
Thesis
Degree Name
Master of Science
Major
Computer Science
Major Professor
Catherine Schuman
Committee Members
Piotr Luszczek, James Plank
Abstract
In this modern era of AI revolution, there have been massive and rapid investments in data-driven, large-scale AI systems. However, the high-performance computing techniques that power these rapidly growing AI systems consume a staggering amount of energy and resources. The proliferation of AI thus brings new optimization challenges: achieving sustainability without sacrificing scalability or performance. This thesis aims to tackle these challenges and provide a way forward for scalable and sustainable AI. Among the energy-efficient alternatives to the traditional von Neumann architecture, neuromorphic computing and its Spiking Neural Networks (SNNs) are a promising choice due to their inherent energy efficiency. In some real-world application scenarios, however, such as complex continuous control tasks, SNNs often lack the performance optimizations of traditional artificial neural networks. To address this issue, researchers have combined SNNs with Deep Reinforcement Learning (DeepRL) algorithms to leverage their optimization techniques. Although this integration manages to accomplish these complex tasks, the question of scalability remains unexplored. Hence, this thesis presents a novel model called SpikeRL, a scalable and efficient framework for DeepRL-based SNNs on complex continuous control tasks. The SpikeRL framework consists of three major components. First, a DeepRL-based SNN model that utilizes population encoding. Second, distributed computing across models and environments, implemented with PyTorch's distributed package using both the Message Passing Interface (MPI) and NVIDIA Collective Communications Library (NCCL) backends. Third, mixed-precision parameter updates that further optimize model training.
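The population encoding mentioned above can be illustrated with a minimal sketch: each continuous input dimension is represented by a group of neurons with Gaussian receptive fields, whose activations can serve as spike probabilities for the SNN. The function name, neuron count, and receptive-field width below are illustrative assumptions, not the thesis's actual implementation.

```python
import math

def population_encode(x, x_min, x_max, n_neurons=10, sigma=None):
    """Encode a scalar observation as activations of a neuron population.

    Each neuron has a Gaussian receptive field centered at an evenly
    spaced point in [x_min, x_max]; its activation in [0, 1] can be
    interpreted as a spike probability or injected current.
    (Illustrative sketch; parameters are assumed, not from the thesis.)
    """
    if sigma is None:
        sigma = (x_max - x_min) / n_neurons  # assumed default width
    centers = [x_min + i * (x_max - x_min) / (n_neurons - 1)
               for i in range(n_neurons)]
    return [math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers]

# A continuous state vector is encoded dimension by dimension:
state = [0.3, -0.7]
encoded = [population_encode(s, -1.0, 1.0) for s in state]
```

A neuron whose center matches the input fires with activation 1, and activity falls off smoothly for neighboring neurons, giving the SNN a distributed, noise-tolerant representation of the continuous state.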
Comparative analyses with state-of-the-art SNN methods demonstrated that the SpikeRL model achieves an overall performance increase of 40%, a 39% improvement in energy efficiency, and a 28% reduction in carbon emissions. SpikeRL was also tested on neuromorphic hardware at TENNLab, using the Reduced Instruction Spiking Processor (RISP) simulator to run model inference. Although the deployment of SpikeRL on neuromorphic hardware is still a work in progress, the research findings presented in this thesis demonstrate the scalability and energy efficiency of SpikeRL for training complex continuous control agents, advancing the domain of scalable and sustainable AI.
Recommended Citation
Tahmid, Tokey, "Energy-Efficient Computing for Scalable and Sustainable AI." Master's Thesis, University of Tennessee, 2024.
https://trace.tennessee.edu/utk_gradthes/12870
Included in
Computer and Systems Architecture Commons, Other Computer Engineering Commons, Robotics Commons