An FPGA based Accelerator for Clustering Algorithms with Custom Instructions
Clustering algorithms are becoming popular and widely applied in many academic fields, such as machine learning, pattern recognition, and artificial intelligence. It has posed significant challenges to accelerate the algorithms due to the explosive data scale and wide variety of applications. However, previous studies mainly focus on the raw speedup with insufficient attention to the flexibility of the accelerator to support various applications. In order to accelerate different clustering algorithms in one accelerator, in this paper we design an accelerating framework based on FPGA for four state-of-the-art clustering methods, including K-means, PAM, SLINK, and DBSCAN algorithms. Moreover, we provide both Euclidean and Manhattan distances as similarity metrics in the accelerator design paradigm. Moreover, we provide a custom instruction set to operate the accelerators within each application. In order to evaluate the performance and hardware cost of the accelerator, we constructed a hardware prototype on the state-of-the-art Xilinx FPGA platform. Experimental results demonstrate that the accelerator framework is able to achieve up to 23 speedup than Intel Xeon processor, and is 9.46 more energy efficient than NVIDIA GTX 750 GPU accelerators.
Accelerators, Clustering, Custom Instructions, Machine Learning, FPGA