Design of Low-Cost and High-Accuracy 8-bit Logarithmic Floating-Point Arithmetic Circuits
Abstract:
Recent studies suggest that the 8-bit floating-point (FP) format plays an important role in deep learning, where the E4M3 format (4-bit exponent, 3-bit mantissa) is suited for natural language processing models and E3M4 performs better on computer vision tasks. In this brief, the logarithmic number system (LNS) is used to design multipliers and dividers, because multiplication and division can be performed by addition and subtraction in the logarithmic domain. Furthermore, this brief finds that the 3- and 4-bit logarithmic and anti-logarithmic (antilog) converters can be effectively realized by {x, x + 1} and {x, x − 1}. As a result, compared to the standard E4M3 and E3M4 multipliers, the cell area is reduced by 32% and 40%, respectively. Compared to the standard E4M3 and E3M4 dividers, the cell area is reduced by 61% and 67%. In addition, compared with the INT8-based design, the area of a convolution core using the proposed multiplier is reduced by 33%. The accuracy losses of the quantized ResNet-50, MobileNet, and ViT-B models based on the proposed converters are −0.12%, +0.38%, and +0.68%, respectively, which are better than the INT8-based design. Finally, the proposed divider can be used in image change detection, where the false rate is slightly reduced from 2.97% to 2.94% compared to the standard E3M4 divider.
Index Terms —
Division, image change detection, logarithmic number system (LNS), low bit-width floating-point (FP) representation, multiplication, neural networks.
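The core LNS idea summarized in the abstract, mapping multiplication and division to addition and subtraction between approximate logarithms, can be illustrated with a short sketch. The Python code below uses Mitchell's classic piecewise-linear log/antilog approximation as a stand-in for the paper's 3-/4-bit converters; this choice of approximation and all function names are illustrative assumptions, not the proposed {x, x + 1}/{x, x − 1} converter hardware.

```python
# Minimal sketch (assumed Mitchell-style approximation, not the authors' circuit):
# in the logarithmic number system, a product becomes a sum of logarithms and a
# quotient becomes a difference, so only adders/subtractors are needed.

import math

def mitchell_log2(x: float) -> float:
    """Approximate log2(x) for x > 0: exact integer part, linear mantissa term."""
    e = math.floor(math.log2(x))   # exponent (position of the leading one)
    m = x / (2.0 ** e) - 1.0       # normalized mantissa in [0, 1)
    return e + m                   # log2(1 + m) approximated by m

def mitchell_antilog2(y: float) -> float:
    """Approximate 2**y: exact integer part, linear fractional term."""
    e = math.floor(y)
    f = y - e
    return (2.0 ** e) * (1.0 + f)  # 2**f approximated by 1 + f

def lns_multiply(a: float, b: float) -> float:
    """Multiplication realized as addition in the log domain."""
    return mitchell_antilog2(mitchell_log2(a) + mitchell_log2(b))

def lns_divide(a: float, b: float) -> float:
    """Division realized as subtraction in the log domain."""
    return mitchell_antilog2(mitchell_log2(a) - mitchell_log2(b))

if __name__ == "__main__":
    print(lns_multiply(6.5, 3.0), 6.5 * 3.0)  # approximate vs. exact product
    print(lns_divide(6.5, 3.0), 6.5 / 3.0)    # approximate vs. exact quotient
```

The piecewise-linear error of this style of converter is what the quantized ResNet-50, MobileNet, and ViT-B experiments in the brief are measuring at the network level.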