A Reconfigurable Floating-Point Division and Square Root Architecture for High-Precision Softmax
Abstract:
With the advancement of deep learning models, the Softmax function used in self-attention has become pervasive in everyday applications. Division and square root operations appear in the Softmax function and in the computation of its inputs, so both affect its accuracy. However, these two non-linear operations incur significant area and power costs in hardware. To address these challenges, this paper proposes a reconfigurable floating-point division and square root (FDSR) architecture that achieves low resource consumption and high accuracy for general-purpose computation. The FDSR enhances the traditional non-restoring algorithm by using shift registers and by optimizing the leading-one detection and shift operations, reducing hardware resource usage while maintaining high accuracy (0.5 ULP). In the mantissa calculation, division can be converted to square root simply by switching the path to the subtractor through multiplexers. Additionally, a triple-mode reconfigurable iteration unit is introduced, featuring a multi-layer pipelined architecture that adapts to different applications. By redesigning the pipeline depth and reusing logic units, the FDSR efficiently addresses the issue of large-scale integration. Experimental results show that in 40 nm CMOS technology, the FDSR achieves a 14.69% area reduction and a 38.95% power reduction for floating-point division compared with Synopsys DesignWare, at 91.55% accuracy. For floating-point square root, the FDSR reduces area by 64.63% and power consumption by 87.81% compared with Synopsys DesignWare. In the FPGA implementation, the FDSR reduces power consumption by 83.23% for floating-point division and by 87.81% for floating-point square root, outperforming state-of-the-art designs.
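To illustrate how non-restoring division and square root can share a single add/subtract path, the following is a minimal integer-level sketch in Python. It is not the paper's hardware design: the function name `iterate`, the `mode` argument, and the bit widths are assumptions made for illustration, and exponent handling, normalization, rounding, and pipelining are all omitted. The `mode` branch stands in for the multiplexers that select what is fed to the subtractor.

```python
def iterate(mode, x, d=0, bits=8):
    """Bit-serial non-restoring division or square root on unsigned integers.

    mode == "div":  x is a `bits`-bit dividend, d a non-zero divisor.
    mode == "sqrt": x is a radicand of up to 2*bits bits; d is ignored.
    Returns (q, r): quotient and remainder, or root and remainder.
    """
    r, q = 0, 0  # partial remainder (may go negative) and result bits
    for i in range(bits - 1, -1, -1):
        if mode == "div":
            # Shift in one dividend bit; the operand is the divisor.
            shifted = 2 * r + ((x >> i) & 1)
            operand = d
        else:
            # Shift in two radicand bits; the operand is built from the
            # root bits found so far and the sign of the remainder.
            shifted = 4 * r + ((x >> (2 * i)) & 3)
            operand = 4 * q + (1 if r >= 0 else 3)
        # Shared step: subtract when the remainder is non-negative,
        # otherwise add (no restoring step inside the loop).
        r = shifted - operand if r >= 0 else shifted + operand
        q = 2 * q + (1 if r >= 0 else 0)
    if r < 0:
        # Single final correction so the remainder is non-negative.
        r += d if mode == "div" else 2 * q + 1
    return q, r


print(iterate("div", 100, 7))   # (14, 2): 100 = 14 * 7 + 2
print(iterate("sqrt", 200))     # (14, 4): 200 = 14 * 14 + 4
```

Only the two lines that form `shifted` and `operand` differ between the modes; the add/subtract and result-bit logic is common, which is the property the abstract attributes to the shared mantissa datapath.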
Index Terms —
Floating-point division, floating-point square root, low resource consumption, high-accuracy computation.