TY - GEN
T1 - Generic Accuracy Configurable Matrix Multiplication-Addition Accelerator using HLS
AU - Leon-Vega, Luis G.
AU - Castro-Godinez, Jorge
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Matrix Multiplication-Addition is one of the most common calculations when implementing Machine Learning (ML) algorithms for inference. Edge devices have limited processing power, due to resource and energy constraints, making the execution of these calculations a challenging task. This paper proposes the design of a generic configurable accelerator architecture for Generic Matrix Multiplication-Additions (GEMMA), implemented in untimed C++ for High-Level Synthesis and adaptable in matrix size, data bit-width, and data type for accuracy configuration, allowing tuning the impact on the overall design resource consumption. The overall analysis utilises existing Processing Elements (PE) from previous work as the execution units to perform the matrix operations. This work analyses the proposed architecture to spot design compromises regarding numerical accuracy. Also, this work points out that the proposed architecture inherits the behaviour of implementing each PE, presenting a trade-off between granularity and design efficiency.
KW - approximate computing
KW - design automation
KW - field programmable gate arrays
KW - High-Level Synthesis
KW - inference
UR - http://www.scopus.com/inward/record.url?scp=85169441089&partnerID=8YFLogxK
U2 - 10.1109/DSN-W58399.2023.00048
DO - 10.1109/DSN-W58399.2023.00048
M3 - Conference contribution
AN - SCOPUS:85169441089
T3 - Proceedings - 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops Volume, DSN-W 2023
SP - 171
EP - 174
BT - Proceedings - 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops Volume, DSN-W 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops Volume, DSN-W 2023
Y2 - 27 June 2023 through 30 June 2023
ER -