WinTA: An Efficient Reconfigurable CNN Training Accelerator With Decomposition Winograd
Abstract:
Convolutional neural networks (CNNs) are expected to bridge the domain shift between training data and real-world tasks. Moreover, efficient training of CNNs on resource-constrained platforms has become increasingly important because of communication latency and privacy concerns. However, deploying CNN training on edge devices is challenging due to the intensive computation and diverse computational patterns involved. In this work, we first propose a hybrid decomposition Winograd (HDW) method that significantly reduces the number of multiplications and flexibly handles the various convolution operations that arise during training. Second, we design a reconfigurable CNN training accelerator, named WinTA, which uses a set of unified transformation units to support various Winograd operations. Third, we implement an efficient and flexible data access scheme using a hierarchical barrel shifter network (HBSN). Experimental results on the Xilinx Alveo U50 FPGA card demonstrate that WinTA effectively accelerates CNN training. Compared to CPU and GPU implementations, WinTA achieves speedups of 7.1× and 1.65×, respectively, while improving energy efficiency by 26.6× and 10.4×, respectively. Additionally, our design provides 1.24× and 2.04× improvements in throughput and resource efficiency over a prior state-of-the-art FPGA-based training accelerator.
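To illustrate why Winograd-based convolution reduces multiplication count (the core saving the abstract attributes to the HDW method), here is a minimal sketch of the classic 1D Winograd algorithm F(2,3). Note this is the textbook transform, not the paper's hybrid decomposition variant; the function names are illustrative.

```python
import numpy as np

# Standard F(2,3) transform matrices: produce 2 outputs of a 3-tap
# convolution using 4 multiplications instead of the direct method's 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=float)  # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Two outputs of the valid correlation of 4-tap input d with 3-tap filter g."""
    U = G @ g        # transformed filter (4 values)
    V = BT @ d       # transformed input (4 values)
    M = U * V        # only 4 elementwise multiplications -- the savings
    return AT @ M    # inverse transform -> 2 outputs

def direct_f23(d, g):
    """Direct sliding-window correlation: 6 multiplications for the same 2 outputs."""
    return np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
```

The same transform structure generalizes to 2D tiles (e.g., F(2×2, 3×3), cutting 36 multiplications to 16), which is why hardware that reuses a unified set of transformation units, as WinTA does, can cover forward, backward, and weight-gradient convolutions efficiently.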