Approximate Error Detection With Stochastic Checkers
[/vc_column_text][vc_column_text] Abstract:
Designing reliable systems, while eschewing the high overheads of conventional fault tolerance techniques, is a critical challenge in the deeply scaled CMOS and post CMOS era. To address this challenge, we leverage the intrinsic resilience of application domains such as multimedia, recognition, mining, search, and analytics where acceptable outputs are produced despite occasional approximate computations. We propose stochastic checkers (checkers designed using stochastic logic) as a new approach to performing error checking in an approximate manner at greatly reduced overheads. Stochastic checkers are inherently inaccurate and require long latencies for computation. To limit the loss in error coverage, as well as false positives (correct outputs flagged as erroneous), caused due to the approximate nature of stochastic checkers, we propose input permuted partial replicas of stochastic logic, which improves their accuracy with minimal increase in overheads. To address the challenge of long error detection latency, we propose progressive checking policies that provide an early decision based on a prefix of the checker’s output bit stream. This technique is further enhanced by employing progressively accurate binary-to-stochastic converters. Across a suite of error-resilient applications, we observe that stochastic checkers lead to greatly reduced overheads (29.5% area and 21.5% power, on average) compared with traditional fault tolerance techniques while maintaining high coverage and very low false positives.