Proposed Title:
CMOS Design Methodology on 32 nm Voltage Scaled Clock Distribution Network
A conventional DFF cell designed for full-swing (FS) operation cannot be used when the clock voltage swing is reduced, because both reliability and power consumption degrade.
Reliability: In a typical DFF cell, clock signals drive both nMOS and pMOS transistors. If the same DFF topology is used with a low-swing (LS) clock signal (while the data signal remains at FS to maintain performance), the pMOS transistors driven by the clock signal fail to turn OFF completely when the clock signal is high. For example, consider a 45-nm technology with a nominal VDD of 1 V. If the clock swing is reduced to 0.7×VDD, the gate-to-source voltage of the pMOS transistors is −0.3 V, since the data signal is at FS and the inverters within the flip-flop are connected to the nominal (FS) VDD. Since −0.3 V is close to the threshold voltage of pMOS transistors in this technology, this behavior significantly degrades the reliability of a traditional DFF cell driven by an LS clock signal.
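The gate-to-source arithmetic above can be checked with a minimal sketch. The function names, the threshold voltage of −0.4 V, and the 0.2 V safety margin are illustrative assumptions, not values taken from the text or from any specific 45-nm process.

```python
def pmos_vgs(clock_swing_ratio: float, vdd: float = 1.0) -> float:
    """Gate-to-source voltage of a clocked pMOS while the clock is at logic high.

    The source sits at the full-swing supply (vdd), while the gate only
    reaches clock_swing_ratio * vdd.
    """
    v_gate = clock_swing_ratio * vdd
    return v_gate - vdd


def is_fully_off(vgs: float, vth_p: float, margin: float) -> bool:
    """Return True if vgs stays at least `margin` volts above the (negative)
    pMOS threshold vth_p, i.e. the device is comfortably OFF.

    vth_p and margin are hypothetical values chosen for illustration.
    """
    return vgs >= vth_p + margin


if __name__ == "__main__":
    VTH_P = -0.4   # assumed pMOS threshold voltage (V), illustrative only
    MARGIN = 0.2   # assumed safety margin (V), illustrative only

    for ratio in (1.0, 0.7):
        vgs = pmos_vgs(ratio)
        print(f"swing = {ratio:.1f} x VDD, Vgs = {vgs:.2f} V, "
              f"fully off: {is_fully_off(vgs, VTH_P, MARGIN)}")
```

Under these assumed numbers, the FS clock leaves the pMOS with zero gate-to-source voltage, whereas the 0.7×VDD clock leaves it only about 0.1 V away from the assumed threshold, which is why the transistor cannot be considered fully OFF.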
To better illustrate the unreliability of conventional DFF cells operating with an LS clock signal, a traditional transmission gate-based DFF, as shown in Fig. 1, is simulated in a 45-nm technology node with a clock swing of 0.7 V. Note that the clock signal and the inverted clock signal are generated internally using two inverters. This circuit is referred to as the clock subcircuit, as also shown in Fig. 4. The inverters within the clock subcircuit are connected to a low supply voltage to provide LS clock signals. Since the pMOS transistors driven by the clock signals are not completely turned OFF, internal nodes experience a glitch as high as 400 mV. Furthermore, at the slow corner, the DFF cell fails to latch the data signal correctly. Thus, a new topology is required that can operate reliably with an LS clock signal and an FS data signal.
Alternatively, the inverters within the clock subcircuit can be connected to the nominal VDD. In this case, these inverters also function as single-voltage, low-to-high level shifters, and the transmission gates receive FS clock signals. The primary limitation of this approach is an unavoidable increase in power consumption due to the significant static current drawn by the inverters within the clock subcircuit. To better illustrate this behavior, a conventional DFF is simulated with an LS clock signal applied to the clock pin while the clock subcircuit is connected to the nominal VDD.
- Low performance
- More Area and Power
The proposed low-swing DFF (LSDFF) cell enables reliable LS clock operation at a low power consumption level.
The proposed DFF topology, shown in Fig. 2, is based on the most commonly used static DFF shown in Fig. 4. Rather than transmission gates, however, nMOS pass gates (N1, N2, N5, and N6) are used as the switches in both the master and slave latches. Thus, when the LS clock signal is at logic high, N1 and N6 can turn OFF completely. Pass gates, however, cannot transfer a full voltage to the output. This issue is critical, since the incoming data signal operates at FS. Node A therefore cannot reach full VDD, which increases the short-circuit and leakage current in the following stages and also increases the clock-to-Q delay. Furthermore, pass transistors are less robust to process variations. To alleviate these issues, a pull-up network consisting of two pMOS transistors is added to each of the master and slave latches (P1–P4). When the master node M transitions to logic low, P1 turns ON. If the data signal is also at logic low, node A is pulled to full VDD through P1 and P2. Note that P2 (in the master latch) and P4 (in the slave latch) are added to prevent contention current (and, therefore, reduce power consumption) when the data signal is at logic high and the clock signal is at logic low. In this situation, N1 is ON and node A is discharged through N1 and the inverter. If P2 did not exist, a race condition would occur at node A, since N1 would have to be stronger than P1, which pulls node Y to full VDD. Finally, pull-down logic (N3, N4, N7, and N8) is added to both the master and slave latches to improve the clock-to-Q delay. In particular, when the data and clock signals are at logic low, the pull-down logic is active and pulls master node M to ground, triggering P1. Thus, node A quickly reaches full VDD. Note that the master node does not need to wait for node A to rise through a weak pass transistor and activate the inverter. Instead, the pull-down logic completes this transition faster.
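The reason the pass gates need a pull-up network can be sketched with the usual first-order rule that an nMOS pass transistor passes a high level of at most its gate voltage minus its threshold voltage. The function names and the nMOS threshold of 0.3 V below are hypothetical, body effect is ignored, and the model is only a rough illustration rather than the simulation setup used for the proposed cell.

```python
def pass_gate_high_level(v_in: float, v_gate: float, vth_n: float) -> float:
    """First-order high level passed by an nMOS pass transistor.

    The output follows v_in but is clamped to (v_gate - vth_n); body effect
    and leakage are ignored in this sketch.
    """
    return max(0.0, min(v_in, v_gate - vth_n))


def restored_level(v_node: float, pull_up_active: bool, vdd: float) -> float:
    """Node voltage after the pMOS pull-up path (e.g. P1 and P2) acts on it.

    When the pull-up path conducts, the node is restored to full vdd;
    otherwise it keeps the degraded level.
    """
    return vdd if pull_up_active else v_node


if __name__ == "__main__":
    VDD = 1.0      # nominal supply (V), as in the 45-nm example in the text
    V_CLK = 0.7    # LS clock high level (V), as in the text
    VTH_N = 0.3    # assumed nMOS threshold voltage (V), illustrative only

    degraded = pass_gate_high_level(VDD, V_CLK, VTH_N)
    print(f"node A without pull-up: {degraded:.2f} V")
    print(f"node A with pull-up:    {restored_level(degraded, True, VDD):.2f} V")
```

With an LS clock on the gate, the degraded level is well below VDD under these assumptions, which is consistent with the increased short-circuit and leakage current described above when no pull-up path is present.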
- High performance
- Less Area and Power
- TANNER EDA