
  Emerging Computing Technology Laboratory at SJTU



Approximate Computing
Three important goals of VLSI design are reducing circuit area, improving circuit frequency, and reducing power consumption, all of which are pursued under the basic assumption that the circuit correctly implements the specified function. However, many widely used applications, such as signal processing, pattern recognition, and machine learning, do not require perfect computation; results with small errors are still acceptable. A new design paradigm, known as approximate computing, has recently been proposed to design circuits for such error-tolerant applications. By exploiting the error tolerance of applications, it deliberately sacrifices a small amount of accuracy to achieve improvements in area, performance, and power consumption.

An example of approximate computing is shown in Fig. 1. Fig. 1(a) shows the Karnaugh map of an accurate 2-bit multiplier. If we change the output "1001" in the red circle in Fig. 1(a) to "111", we obtain the Karnaugh map of an approximate 2-bit multiplier, as shown in Fig. 1(b). The circuit for the accurate multiplier and that for the approximate multiplier are shown in Fig. 1(c) and 1(d), respectively. Notice that by introducing a small amount of inaccuracy, we reduce both the area and the delay of the original multiplier significantly.

(a) Karnaugh map of accurate multiplier

(b) Karnaugh map of approximate multiplier

(c) Circuit of accurate multiplier

(d) Circuit of approximate multiplier

Fig. 1. Accurate and approximate designs of 2-bit multiplier from the paper "Trading accuracy for power with an underdesigned multiplier architecture" by Kulkarni et al., 2011.
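The single modified entry in Fig. 1 can be sketched as a behavioral model in a few lines of Python (the function names are illustrative; the gate-level circuits in Fig. 1(c) and 1(d) are abstracted away):

```python
# Behavioral sketch of the 2x2 approximate multiplier in Fig. 1: identical
# to the accurate multiplier except that 3 x 3 yields 7 (binary 111)
# instead of 9 (binary 1001), so the output fits in 3 bits.

def accurate_mult2(a, b):
    """Accurate 2-bit multiplier (inputs in 0..3)."""
    return a * b

def approx_mult2(a, b):
    """Approximate 2-bit multiplier: only the entry 3 x 3 is changed."""
    return 7 if (a == 3 and b == 3) else a * b

# Exhaustive comparison over all 16 input pairs.
errors = [(a, b) for a in range(4) for b in range(4)
          if approx_mult2(a, b) != accurate_mult2(a, b)]
print(errors)   # [(3, 3)] -> only 1 of the 16 entries is wrong
```

Only one Karnaugh-map entry out of sixteen is inexact, which is why the area and delay savings come at a very small accuracy cost.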

We study the following fundamental problems related to approximate computing:
  1. Approximate logic synthesis. We develop algorithms to automatically synthesize approximate circuits. Specifically, given a target function and an error specification, the algorithm produces an optimized circuit that approximates the target function while satisfying the error specification.
  2. The design, synthesis, and analysis of approximate arithmetic circuits. Arithmetic circuits, including adders and multipliers, are key building blocks of many applications. We study how to design and synthesize high-performance energy-efficient approximate adders and multipliers by deliberately introducing errors. We also study how to analyze the error of these approximate circuits.
  3. The design of approximate computing architectures. We study how to design novel architectures exploiting approximate computing to improve energy efficiency.
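Two error metrics that commonly serve as the error specification in problems 1 and 2 are the error rate (fraction of input patterns with a wrong output) and the mean error distance (average magnitude of the output error). A minimal sketch, using the 2-bit approximate multiplier of Fig. 1 as the circuit under analysis:

```python
# Sketch of two error metrics used when synthesizing or analyzing
# approximate circuits: error rate and mean error distance. The helper
# name error_metrics is illustrative, not from any specific tool.

def error_metrics(exact, approx, inputs):
    """Exhaustively compare two functions over the given input patterns."""
    diffs = [abs(exact(x) - approx(x)) for x in inputs]
    error_rate = sum(d != 0 for d in diffs) / len(diffs)
    mean_error_distance = sum(diffs) / len(diffs)
    return error_rate, mean_error_distance

inputs = [(a, b) for a in range(4) for b in range(4)]
exact = lambda ab: ab[0] * ab[1]                     # accurate 2-bit multiplier
approx = lambda ab: 7 if ab == (3, 3) else ab[0] * ab[1]  # Fig. 1(b)

er, med = error_metrics(exact, approx, inputs)
print(er, med)   # 0.0625 0.125  (1 of 16 patterns wrong; |9 - 7| / 16)
```

For wider circuits, exhaustive enumeration becomes infeasible and the error must instead be bounded analytically or estimated by sampling, which is part of the analysis problem described above.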

Selected Publications on Approximate Logic Synthesis

Selected Publications on the Design, Synthesis, and Analysis of Approximate Arithmetic Circuits

Selected Publications on the Design of Approximate Computing Architectures



Stochastic Computing
Traditional arithmetic circuits operate on numbers encoded in binary radix, a deterministic way to represent numerical values with zeros and ones. Fundamentally different from binary radix, stochastic encoding is another way to represent numerical values by logical zeros and ones. In such an encoding, a real value p in the unit interval is represented by a sequence of N random bits X1, X2, ..., XN, with each Xi having probability p of being one and probability (1-p) of being zero, i.e., P(Xi = 1) = p and P(Xi = 0) = 1-p. Fig. 2(a) shows an example of a stochastic bit stream encoding the value 5/8.

Since the random sequences are composed of binary digits, we can apply digital circuits to process them. Thus, instead of mapping Boolean values into Boolean values, a digital circuit now maps real probability values into real probability values. We refer to this type of computation as stochastic computing. Fig. 2(b) illustrates an AND gate performing multiplication on stochastic bit streams.

Stochastic Encoding and Circuit

Fig. 2. Stochastic encoding and computation on stochastic encoding: (a) A stochastic bit stream encoding the value x = 5/8; (b) An AND gate multiplying two values encoded by two input stochastic bit streams.
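The encoding and the AND-gate multiplication of Fig. 2 can be sketched in software (the stream length and the value 4/8 for the second input are illustrative choices):

```python
import random

# Sketch of stochastic encoding and computation (Fig. 2): a value p in
# [0, 1] becomes a bit stream in which each bit is 1 with probability p;
# an AND gate then multiplies two independently encoded values, since
# P(X AND Y = 1) = P(X = 1) * P(Y = 1) for independent streams.

def encode(p, n, rng):
    """Generate an n-bit stochastic stream encoding the value p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def decode(stream):
    """Estimate the encoded value as the fraction of ones."""
    return sum(stream) / len(stream)

rng = random.Random(0)
n = 100_000
x = encode(5/8, n, rng)                   # the value from Fig. 2(a)
y = encode(4/8, n, rng)
product = [a & b for a, b in zip(x, y)]   # bitwise AND gate
print(decode(product))                    # close to 5/8 * 4/8 = 0.3125
```

Note that the result is only statistically accurate: the estimate converges to the true product as the stream length grows, which is the fundamental accuracy/latency trade-off of stochastic computing.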

We study the following fundamental problems related to stochastic computing:
  1. The synthesis of stochastic computing circuits. In earlier works on stochastic computing, a number of stochastic circuits were designed manually, but they were restricted to basic computation units such as adders and multipliers. We explore algorithms to automatically synthesize stochastic circuits for arbitrary arithmetic functions.
  2. The generation of stochastic bit streams. Stochastic computing relies on stochastic number generators (SNGs) to generate input stochastic bit streams with the desired probabilities. However, SNGs consume far more area and power than the stochastic computing core itself, offsetting its advantage. We explore solutions to reduce the area and power consumption of SNGs.
  3. Architectures and applications of stochastic computing. We explore architectures for stochastic computing and apply this computational paradigm to different applications, such as image processing and numerical integration.
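A typical SNG, as mentioned in problem 2, pairs a linear-feedback shift register (LFSR) with a comparator: the comparator emits a 1 whenever the pseudo-random LFSR value falls below a threshold proportional to the target probability. A minimal sketch, assuming an 8-bit LFSR with one known maximal-length tap choice:

```python
# Sketch of an LFSR-plus-comparator stochastic number generator (SNG).
# The taps below correspond to the maximal-length polynomial
# x^8 + x^6 + x^5 + x^4 + 1, giving a period of 2^8 - 1 = 255 states.

def lfsr8(seed=1):
    """8-bit Fibonacci LFSR; yields the nonzero states in sequence."""
    state = seed
    while True:
        yield state
        bit = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | bit) & 0xFF

def sng(p, n, seed=1):
    """Emit an n-bit stream encoding p: compare LFSR output to p * 256."""
    threshold = int(p * 256)
    gen = lfsr8(seed)
    return [1 if next(gen) < threshold else 0 for _ in range(n)]

stream = sng(5/8, 255)             # one full LFSR period
print(sum(stream) / len(stream))   # close to 5/8
```

In hardware, the LFSR and the comparator dominate the area of the whole stochastic datapath, which is exactly the cost problem 2 targets.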

Selected Publications on the Synthesis of Stochastic Computing Circuits

Selected Publications on the Generation of Stochastic Bit Streams

Selected Publications on Architectures and Applications of Stochastic Computing



Deep Neural Network Accelerators
Deep neural networks (DNNs) have achieved great success in the past decade due to their high accuracy and have been successfully applied in many domains, including computer vision and speech recognition. However, they involve a huge number of multiply-and-accumulate (MAC) operations, which makes the traditional von Neumann architecture inefficient for them. To address this challenge, novel DNN accelerators are being actively explored. We study how to apply stochastic computing and approximate computing to design energy-efficient DNN accelerators: both computing paradigms allow efficient designs of multipliers and adders, while their loss in computational accuracy is well tolerated by DNN applications. We also study how to design DNN accelerators using emerging devices. One particular device we are interested in is resistive random access memory (ReRAM). ReRAM-based crossbars can naturally implement matrix-vector multiplication; however, they suffer from notorious reliability issues. We study how to design reliable DNN accelerators using ReRAM-based crossbars.
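The crossbar computation mentioned above can be modeled simply: weights are stored as cell conductances G, input activations are applied as row voltages V, and by Ohm's and Kirchhoff's laws each column current is the dot product I_j = sum_i V_i * G_ij. A minimal sketch, where the 5% multiplicative perturbation standing in for device variation is an illustrative assumption:

```python
import random

# Behavioral sketch of matrix-vector multiplication on a ReRAM crossbar,
# with an optional Gaussian multiplicative perturbation of each cell
# conductance to mimic device variation (magnitude is illustrative).

def crossbar_mvm(G, V, variation=0.0, rng=None):
    """Compute column currents I = V . G, optionally perturbing each cell."""
    rng = rng or random.Random(0)
    cols = len(G[0])
    I = []
    for j in range(cols):
        total = 0.0
        for i, v in enumerate(V):
            g = G[i][j] * (1.0 + rng.gauss(0, variation))
            total += v * g
        I.append(total)
    return I

G = [[0.2, 0.8],
     [0.5, 0.1],
     [0.3, 0.6]]          # 3x2 weight matrix stored as conductances
V = [1.0, 0.5, 0.25]      # input activations applied as voltages

print(crossbar_mvm(G, V))                   # ideal: 0.525 and 1.0 per column
print(crossbar_mvm(G, V, variation=0.05))   # perturbed by device variation
```

The gap between the two printed results illustrates the reliability problem: conductance drift and programming noise perturb every MAC result, and the accelerator must be designed so that DNN accuracy survives this perturbation.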

Selected Publications



Design Automation for Emerging Technologies
As CMOS technology is scaled into the nanometer regime, power consumption has become one of the paramount concerns in designing very large scale integrated (VLSI) circuits. To address this challenge, alternatives to CMOS technology are being actively explored. We are exploring two kinds of promising substitutes for CMOS devices and addressing challenges in designing VLSI circuits with them:
  1. Single-electron transistors (SETs). The SET is an ultra-low-power device. A suitable structure for realizing logic functions with SETs is a binary decision diagram (BDD)-based SET array. The special connection pattern of this structure places constraints on realizing digital logic and hence requires new mapping methods. We explore methods to map a target Boolean function onto an SET array with minimal area cost.
  2. Carbon nanotube field-effect transistors (CNFETs). CNFETs, which use carbon nanotubes (CNTs) as the transistor channel, are a promising substitute for conventional CMOS technology. However, due to the randomness of the CNFET fabrication process, the number of CNTs in each CNFET varies widely, degrading circuit performance. We explore a timing-driven placement method for CNFET circuits that effectively improves the timing yield.

Selected Publications