Publication: Optimizing stochastic computing based convolutional neural network implementation for image processing applications
Date
2024-02-01
Authors
Lee, Yang Yang
Abstract
Stochastic computing (SC) has been studied extensively for application-specific
integrated circuits (ASICs) because of its logic-level design. However, a
field-programmable gate array (FPGA) can be heavily underutilised when an ASIC
design is transcoded literally, making FPGA implementations of SC inefficient.
In addition, not all convolutional neural network (CNN) algorithms fit, or are
compatible with, the SC domain.
This research reinvented several FPGA-optimised SC CNN architectures and
successfully simulated and implemented them on a Xilinx Kintex-7 FPGA via
high-level synthesis (HLS) with genetic algorithm (GA) optimisation. The SC
multiply-accumulate (MAC) component is 35x to 332x more resource efficient than
an equivalent fixed-point binary computing MAC, and is estimated to be 2.967x
more resource efficient than the conventional SC method in implementing SC CNN.
At least 368x
better energy efficiency and 31x higher data throughput than binary computing
are achieved, with only 0.14% accuracy loss relative to a binary CNN that
attained 98.36% accuracy in the MNIST handwriting classification task. The early
decision termination capability of SC CNN extended the performance envelope
exponentially, making SC CNN highly attractive for image classification
applications on edge devices. However, SC CNN YOLOv3 in license plate (LP)
detection is prone to SC noise, which degrades the average precision (AP) score
to 71.21% compared with the binary CNN YOLOv3’s 90.16% AP score, despite
efficiency and throughput gains of 1054x and 77x, respectively. Thus, SC is
ill-suited for CNN regression applications.
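
As background on why an SC MAC can be so small and why SC noise matters, the following is a minimal sketch of textbook unipolar SC multiplication, not the architecture developed in this work. It assumes values in [0, 1] are encoded as random bitstreams whose fraction of ones equals the value, so a single AND gate per bit pair acts as a multiplier. The names `to_stream` and `sc_multiply` and the chosen stream lengths are hypothetical and used only for illustration.

```python
import random


def to_stream(p, length, rng):
    """Encode a probability p in [0, 1] as a unipolar stochastic bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def sc_multiply(a, b, length, rng):
    """Estimate a * b by ANDing two independent bitstreams and counting ones."""
    stream_a = to_stream(a, length, rng)
    stream_b = to_stream(b, length, rng)
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    exact = 0.5 * 0.6
    for length in (16, 256, 4096):
        estimate = sc_multiply(0.5, 0.6, length, rng)
        # The error is the "SC noise"; it tends to shrink as streams get longer.
        print(length, estimate, abs(estimate - exact))
```

Short streams give a coarse estimate quickly, which is the intuition behind terminating a classification early once the decision is clear. A regression output such as a bounding-box coordinate needs the value itself to be precise, so the same noise is more damaging there.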