Masters Theses

Orcid ID

orcid.org/0000-0002-2291-0860

Date of Award

12-2021

Degree Type

Thesis

Degree Name

Master of Science

Major

Computer Engineering

Major Professor

Mongi Abidi

Committee Members

Hairong Qi, Garrett Rose

Abstract

The deep learning technique of convolutional neural networks (CNNs) has greatly advanced the state-of-the-art for computer vision tasks such as image classification and object detection. These solutions rely on large systems leveraging wattage-hungry GPUs to provide the computational power to achieve such performance. However, the size, weight and power (SWaP) requirements of these conventional GPU-based deep learning systems are not suitable when a solution requires deployment to so called "Edge" environments such as autonomous vehicles, unmanned aerial vehicles (UAVs) and smart security cameras.

The objective of this work is to benchmark FPGA-based alternatives to conventional GPU systems that have the potential to offer similar CNN inference performance while being delivered in a low SWaP platform suitable for Edge deployment. In this thesis we create equivalent pipelines for both GPU and FPGA which implement deep learning models for both image classification and object detection tasks. Beyond baseline benchmarking, we additionally quantify the impact on inference performance of two common real-world image degradation scenarios (simulated contrast reduced capture and salt-and-pepper sensor noise) and their associated correction methods (gamma correction and median kernel filtering) we selected as illustrative examples. The baseline system analysis, coupled with these additional robustness evaluations, provides a statistically significant benchmark comparison targeting a breadth of interest for the computer vision community.

We have conducted the following experiments to demonstrate the FPGA as an effective alternative to the GPU implementation when deployed to Edge environments: (1) we developed a hardware video processing architecture with an associated library of hardware processing functions to prototype a base FPGA ecosystem, (2) we established through benchmarking that two common CNN models (ResNet-50 and YOLO version 3) have a mere 1\% drop in performance on FPGA versus GPU, (3) we show a quantitative baseline analysis for the image degradation/correction on the associated testing datasets, and (4) we proved that our FPGA-based computer vision system is an ideal platform for Edge deployment given its comparable robustness to input degradation when optimal correction is applied.

The significance of these findings is the demonstration of our FPGA-based solution as the superior candidate for Edge deployed vision systems evidenced by our experiments which illustrate its competitive inference performance to the conventional GPU solution and its equivalent robustness provided by correction methods to noise encountered during in-the-wild imaging while being delivered with far lower SWaP requirements.

Comments

This is the FINAL version of my thesis approved by my committee and my advisor. The only changes from the previous submission are minor grammatical changes.

Files over 3MB may be slow to open. For best results, right-click and select "save as..."

Share

COinS