Date of Award
Master of Science
Hairong Qi, Garrett Rose
The deep learning technique of convolutional neural networks (CNNs) has greatly advanced the state of the art for computer vision tasks such as image classification and object detection. These solutions rely on large systems that use power-hungry GPUs to supply the computation needed for such performance. However, the size, weight and power (SWaP) requirements of these conventional GPU-based deep learning systems make them unsuitable when a solution must be deployed to so-called "Edge" environments such as autonomous vehicles, unmanned aerial vehicles (UAVs) and smart security cameras.
The objective of this work is to benchmark FPGA-based alternatives to conventional GPU systems that have the potential to offer similar CNN inference performance while being delivered in a low SWaP platform suitable for Edge deployment. In this thesis we create equivalent pipelines for GPU and FPGA that implement deep learning models for both image classification and object detection tasks. Beyond baseline benchmarking, we quantify how inference performance is affected by two common real-world image degradation scenarios, selected as illustrative examples (simulated reduced-contrast capture and salt-and-pepper sensor noise), and by their associated correction methods (gamma correction and median kernel filtering). The baseline system analysis, coupled with these additional robustness evaluations, provides a statistically significant benchmark comparison of broad interest to the computer vision community.
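For readers unfamiliar with the two degradation and correction pairs named above, the following is a minimal illustrative sketch in plain Python, not the thesis implementation. It assumes 8-bit grayscale images stored as lists of rows, models reduced contrast as a power-law (gamma) darkening, and uses a 3x3 median window with replicated borders. The function names, the noise density parameter, and the window size are all hypothetical choices made for this example.

```python
import random
from statistics import median


def gamma_adjust(img, gamma):
    """Apply a power-law curve to an 8-bit grayscale image.

    Assumption for this sketch: gamma > 1 simulates a darker,
    reduced-contrast capture, and applying 1 / gamma corrects it.
    """
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]


def add_salt_and_pepper(img, density, rng):
    """Set each pixel to 0 or 255 with probability `density`."""
    noisy = [row[:] for row in img]  # copy so the input is not modified
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < density:
                row[x] = rng.choice((0, 255))
    return noisy


def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Border pixels reuse the nearest valid pixel (replicated border).
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out_row.append(int(median(window)))
        out.append(out_row)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    image = [[100] * 8 for _ in range(8)]
    degraded = add_salt_and_pepper(gamma_adjust(image, 2.0), 0.05, rng)
    restored = gamma_adjust(median_filter_3x3(degraded), 0.5)
    print(restored[0])
```

In this sketch the median filter is applied before the inverse gamma curve so that isolated noise pixels are removed before intensities are rescaled; the ordering is a choice made for the example, and the abstract does not state how the thesis pipelines combine the two corrections.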
We conducted the following experiments to demonstrate the FPGA as an effective alternative to the GPU implementation when deployed to Edge environments: (1) we developed a hardware video processing architecture with an associated library of hardware processing functions to prototype a base FPGA ecosystem, (2) we established through benchmarking that two common CNN models (ResNet-50 and YOLO version 3) have a mere 1% drop in performance on FPGA versus GPU, (3) we performed a quantitative baseline analysis of image degradation and correction on the associated testing datasets, and (4) we demonstrated that our FPGA-based computer vision system is an ideal platform for Edge deployment given its comparable robustness to input degradation when optimal correction is applied.
The significance of these findings is that they demonstrate our FPGA-based solution as the superior candidate for Edge-deployed vision systems. Our experiments show that its inference performance is competitive with the conventional GPU solution and that, with correction methods applied, it is equally robust to noise encountered during in-the-wild imaging, all while being delivered with far lower SWaP requirements.
Cornett, David Carter, "Evaluation of Robust Deep Learning Pipelines Targeting Low SWaP Edge Deployment." Master's Thesis, University of Tennessee, 2021.