Implementation of Data-optimized FPGA-based Accelerator for Convolutional Neural Network
Convolutional Neural Networks (CNNs) are widely used for image recognition, and FPGAs are considered a suitable platform for CNNs because of their low power consumption and reconfigurability. Although CNNs are mostly trained with floating-point data types to achieve high inference accuracy, fixed-point data types can be used to reduce data size and exploit the computational efficiency of FPGAs without any accuracy loss. In this paper, we propose an accelerator design for the LeNet-5 CNN architecture for MNIST handwritten digit recognition. The accelerator is synthesized with the Xilinx Vivado High-Level Synthesis (HLS) tool (v2017.2), targeting the xczu9eg-ffvb1156-2-i FPGA device. The proposed accelerator focuses on reducing latency and memory usage, and its performance is compared with that of a conventional floating-point design. Compared with the conventional design, the proposed accelerator reduces latency by up to 90% and memory usage by up to 40% without any accuracy loss.
Convolutional Neural Network, FPGA, High-Level Synthesis, Accelerator