MBS: A High-Precision Approximation Method for Softmax and Efficient Hardware Implementation
Abstract:
The softmax function is invoked frequently in the multi-head attention layers of Transformer networks. Because Transformers have higher computational complexity than DNNs and other networks, they place stricter demands on both the accuracy and the hardware performance of softmax computation. We therefore propose mixed-base softmax (MBS), to our knowledge the first approximation of the softmax function that combines exponential functions with bases 2 and 4, both of which are advantageous for hardware implementation. MBS closely matches the softmax function and delivers strong inference performance in Transformer networks. Through algorithmic transformation and hardware optimization, we design a low-complexity, highly parallel hardware architecture that occupies only a small amount of additional hardware resources compared with base-2 softmax while achieving higher accuracy. Experimental results show that, under TSMC 90 nm CMOS technology at a frequency of 0.5 GHz, our design achieves an efficiency of 236.18 G/(s·mm²·mW) with an area of 324 μm². Furthermore, MBS exhibits higher computational accuracy and inference precision than base-2 softmax.
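To make the mixed-base idea concrete, below is a minimal NumPy sketch of one way base-2 and base-4 exponentials can combine to approximate e^x: rewrite e^x as 2^(x·log2 e), realize the integer part of the exponent as a shift, and handle the quantized fractional part r with a base-4 term (4^(r/2) = 2^r). The decomposition, the frac_bits parameter, and the function names are our illustration under these assumptions, not the paper's exact MBS formulation.

```python
import numpy as np

LOG2E = np.log2(np.e)  # ~1.4427, converts e^x to a base-2 exponent

def softmax(x):
    """Reference softmax with the usual max-subtraction for stability."""
    e = np.exp(x - x.max())
    return e / e.sum()

def base2_softmax(x):
    """Base-2 approximation: replaces e^x with the shift-friendly 2^x."""
    p = np.exp2(x - x.max())
    return p / p.sum()

def mbs_softmax(x, frac_bits=4):
    """Illustrative mixed-base softmax sketch (assumed scheme, not the
    paper's exact method): e^x == 2^(x*log2 e); the integer part of the
    exponent maps to a hardware shift, and the fractional part, quantized
    to `frac_bits` bits, is realized as a base-4 power (4^(r/2) == 2^r)."""
    u = (x - x.max()) * LOG2E
    k = np.floor(u)                                       # shift amount
    r = np.round((u - k) * 2**frac_bits) / 2**frac_bits   # quantized fraction
    p = np.exp2(k) * 4.0 ** (r / 2.0)                     # 2^k * 4^(r/2)
    return p / p.sum()

# Quick comparison against the exact softmax.
x = np.array([3.2, 1.0, 0.5, 2.7])
print(softmax(x))
print(base2_softmax(x))
print(mbs_softmax(x))
```

In such a scheme, accuracy is controlled by how finely the fractional exponent is quantized; the paper's hardware-level decomposition and coefficient choices may differ.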