TY - GEN
T1 - Acceleration of Fully Connected Layers on FPGA using the Strassen Matrix Multiplication
AU - Leon-Vega, Luis G.
AU - Chaon-Rodriguez, Alex
AU - Salazar-Villalobos, Eduardo
AU - Castro-Godinez, Jorge
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep Learning is one of the most popular techniques of Machine Learning (ML) but also one of the most computationally intensive and energy-demanding tasks in High-Performance Computing (HPC), resulting in concerns about the sustainability of executing Artificial Intelligence at massive scale. The challenge becomes harder when performing inference tasks at the Edge, where computational resources and energy are scarce. Field Programmable Gate Arrays (FPGAs) are hardware-level reconfigurable devices with greater benefits in compute power versus energy consumption compared to CPUs and GPUs across the wide spectrum of HPC. High-Level Synthesis (HLS) has helped reduce the complexity of using FPGAs to accelerate algorithms using high-level languages like C++. This opens opportunities to research new hardware architectures for accelerating Deep Learning Inference (DLI) using FPGAs for energy-constrained applications. Most ML applications involve Deep Neural Networks (DNNs), where matrix multiplication is one of the most common operations. This work covers the optimisation of matrix multiplication by proposing a generic, HLS-compliant Strassen Matrix Multiplication Processing Element (PE) capable of adapting its implementation under different numerical precisions and hardware approximations. We discuss the PE's numerical and resource consumption analysis under several configurations and describe how it behaves in an actual DNN model dedicated to anomaly detection, showing promising results for FPGA-based DLI at the Edge, saving up to 12.5% of DSP cells compared to the standard multiplication units on an XC7A50T (low-end FPGA) with negligible accuracy loss.
KW - deep learning
KW - edge computing
KW - field programmable gate arrays
KW - high-level synthesis
KW - low-power electronics
KW - Strassen multiplication
UR - http://www.scopus.com/inward/record.url?scp=85184353316&partnerID=8YFLogxK
U2 - 10.1109/BIP60195.2023.10379257
DO - 10.1109/BIP60195.2023.10379257
M3 - Conference contribution
AN - SCOPUS:85184353316
T3 - 5th IEEE International Conference on BioInspired Processing, BIP 2023
BT - 5th IEEE International Conference on BioInspired Processing, BIP 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th IEEE International Conference on BioInspired Processing, BIP 2023
Y2 - 28 November 2023 through 30 November 2023
ER -