FPGA-based Implementation of a Real-Time Object Recognition System using Convolutional Neural Network
High computational complexity and power consumption make Convolutional Neural Networks (CNNs) impractical for real-time embedded applications. In this work, we introduce a low-power, flexible platform as a hardware accelerator for CNNs. The proposed architecture is fully configurable through a software library, so that different CNN models can run on the same reconfigurable hardware. The hardware accelerator is evaluated on a ZC706 evaluation board. We use the AlexNet architecture in a real-time object recognition application to demonstrate the effectiveness of the proposed CNN accelerator. The results show a performance of 198.1 GOP/s using 512 DSP blocks for the convolution layers and 23.14 GOP/s using 64 DSP blocks for the fully connected layers. Moreover, images are processed at 82 frames per second, which is significantly faster than existing implementations.
Convolutional Neural Network, Object Recognition, FPGA, Configurable Architecture.
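
To put the headline figures in perspective, the following minimal sketch normalizes the reported throughput by the number of DSP blocks and converts the frame rate into a per-frame time budget. It uses only the numbers quoted in the abstract. The function and variable names are illustrative assumptions and are not part of the proposed software library.

```python
# Figures quoted in the abstract.
CONV_GOPS = 198.1   # convolution layers, GOP/s
CONV_DSPS = 512     # DSP blocks used for the convolution layers
FC_GOPS = 23.14     # fully connected layers, GOP/s
FC_DSPS = 64        # DSP blocks used for the fully connected layers
FPS = 82            # reported frame rate, frames per second


def gops_per_dsp(gops: float, dsps: int) -> float:
    """Throughput normalized by the number of DSP blocks (hypothetical helper)."""
    return gops / dsps


def frame_time_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds (hypothetical helper)."""
    return 1000.0 / fps


conv_per_dsp = gops_per_dsp(CONV_GOPS, CONV_DSPS)
fc_per_dsp = gops_per_dsp(FC_GOPS, FC_DSPS)
budget_ms = frame_time_ms(FPS)

print(f"convolution:     {conv_per_dsp:.3f} GOP/s per DSP block")
print(f"fully connected: {fc_per_dsp:.3f} GOP/s per DSP block")
print(f"frame budget:    {budget_ms:.1f} ms per frame")
```

Under these figures, both layer types reach a similar per-DSP throughput (roughly 0.39 and 0.36 GOP/s per DSP block), and 82 frames per second corresponds to about 12.2 ms per image.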