Hybrid FAs are made of two modules: a 2-input XOR/XNOR (or simultaneous XOR–XNOR) gate and a 2-to-1 multiplexer (2-1-MUX) gate. The XOR/XNOR gate is the major consumer of power in the FA cell, so the power consumption of the FA cell can be reduced by optimizing the design of the XOR/XNOR gate. The XOR/XNOR gate also has many other applications in digital circuit design. Many circuits have been proposed to implement the XOR/XNOR gate; a few of the most efficient ones are shown in Fig. 1.
Fig. 1(a) shows the full-swing XOR/XNOR gate circuit designed in the double pass-transistor logic (DPL) style. This structure has eight transistors. The main problem of this circuit is that it uses two high-power-consumption NOT gates on the critical path, because the NOT gates must drive the output capacitance (that is, the NOT gates drive the output of the circuit through, for example, a pass transistor or TG). Therefore, the size of the transistors in the NOT gates should be increased to obtain a lower critical path delay, which in turn creates an intermediate node with a large capacitance. As a result, the short-circuit power and, thus, the total power dissipation of this circuit increase considerably. Moreover, at the optimum PDP point, the critical path delay also increases slightly.
Fig. 1(b) shows another example of full-swing XOR and XNOR gates, each made of six transistors. This circuit is based on the PTL logic style, and its delay and power consumption are better than those of the circuit depicted in Fig. 1(a). The only problem of this structure is that it uses a NOT gate on the critical path of the circuit. The XOR circuit of Fig. 1(b) has a lower delay than its XNOR circuit, because the critical path of the XOR circuit consists of a NOT gate and an nMOS transistor (N3), whereas the critical path of the XNOR circuit consists of a NOT gate and a pMOS transistor (P5), and a pMOS transistor is slower than an nMOS transistor. Therefore, to improve the speed of the XNOR circuit, the size of the pMOS transistor (P5) and of the NOT gate should be increased.
Simultaneous XOR–XNOR Circuits
In recent years, the simultaneous XOR–XNOR circuit has been widely used in hybrid FA structures. In hybrid FAs, the XOR–XNOR signals are commonly connected to the 2-1-MUX as select lines. Therefore, two simultaneous signals with the same delay are necessary to avoid glitches at the output nodes of the FA.
Fig. 1(c) shows an example of the simultaneous XOR–XNOR circuit. This circuit is based on the CPL logic style and is designed with ten transistors. In this structure, the outputs are driven only by nMOS transistors, and thus, two cross-coupled pMOS transistors are connected to the outputs (XOR and XNOR) to restore the output voltage levels. One problem of this XOR–XNOR circuit is the feedback (cross-coupled structure) on the outputs, which increases the delay and short-circuit power of the structure. Therefore, to mitigate the added delay, the size of the transistors should be increased. Another disadvantage of this structure is the presence of two NOT gates on the critical path.
Goel et al. removed two transistors (one NOT gate) from the XOR–XNOR circuit of Fig. 1(c) to reduce the power dissipation of the circuit. In the resulting circuit of Fig. 1(d), when the inputs are AB = 00, the transistors N3, N4, and N5 are turned OFF and logic “0” is passed through the transistor N2 to the XOR output. This “0” on XOR charges the XNOR output to VDD through transistor P3. Therefore, the critical path of this circuit is longer than that of the circuit of Fig. 1(c). In addition, a short-circuit current passes through this structure when the input changes from AB = 01 to AB = 00. When the inputs are in state AB = 01, logic “1” is passed through the transistors N2, N3, and P2 to the XOR output and logic “0” is passed through the transistor N4 to the XNOR output. When the inputs change to AB = 00, all transistors are turned OFF except N2 (through the input A) and P2 (through the XNOR output, which has not yet changed). Therefore, a short-circuit current flows through the transistors P2 and N2.
If the current sourced by transistor P2 is larger than the current sunk by transistor N2, the short-circuit current will continue to be drawn from VDD and the XOR and XNOR outputs will never switch. This situation also occurs when the input changes from AB = 11 to AB = 10, and it impairs the proper functioning of the circuit. To guarantee proper operation of this circuit, the ON-state resistances of transistors P2 and P3 must be larger than those of transistors N2 and N5, respectively (RP2 > RN2, RP3 > RN5). Furthermore, this structure is very sensitive to process variation; if the sizes of the transistors change, the circuit may not operate properly.
A full-swing XOR–XNOR gate with only six transistors has also been proposed [shown in Fig. 1(e)]. The two complementary feedback transistors (N3 and P3) restore the weak logic levels at the output nodes (XOR and XNOR) when the inputs are AB = 00 or 11. However, this circuit suffers from a high worst case delay, because when the inputs change from AB = 01 or 10 to AB = 11 or 00, the outputs reach their final voltage values in two steps. To clarify the issue, when the inputs are AB = 10, logic “1” and logic “0” are passed through the N2 (XOR output) and P2 (XNOR output) transistors, respectively. When the inputs change to AB = 11, the transistors P1 and P2 are turned OFF (the XOR node is initially high impedance) and a weak logic “1” (VDD − Vthn) is passed through the transistors N1 and N2 to the XNOR output. The weak logic “1” on XNOR turns ON the feedback transistor N3, so that the XOR output is pulled down to a weak logic “0,” and this weak logic “0” in turn turns ON the feedback transistor P3. Eventually, positive feedback is established and the XNOR and XOR outputs reach strong logic “1” and logic “0,” respectively.
This slow-response problem is worse in low-voltage operation and also increases the short-circuit current [when one of the outputs (XOR or XNOR) is high impedance and the circuit feedback has not yet acted completely, a short-circuit current passes through the circuit]. Also, if the sizes of the transistors in this circuit are not properly selected, the circuit may not operate correctly. Thus, this structure is very sensitive to process–voltage–temperature (PVT) variations.
Chang et al. have proposed a new structure for the simultaneous XOR–XNOR gate [shown in Fig. 1(f)] by improving the six-transistor XOR–XNOR circuit of Fig. 1(e). In the circuit of Fig. 1(f), to solve the slow-response problem and allow operation at low supply voltages, two nMOS transistors (for AB = 11) and two pMOS transistors (for AB = 00) have been added to the XOR and XNOR outputs, respectively. The advantages of this structure are good driving capability, full-swing output, and robustness against transistor sizing and supply voltage scaling. The main problem of this circuit is its feedback structure, which adds extra parasitic capacitance to the XOR and XNOR output nodes. Thus, the delay and power consumption increase significantly.
Fig. 1(g) shows another circuit that improves the structure of Fig. 1(e). In this structure, a NOT gate is used to improve the circuit speed. This circuit is faster than that of Fig. 1(e), because in Fig. 1(g), the transistors N5 and P5 provide a path from GND or VDD to the output nodes in two input states (AB = X1 for N5 and AB = X0 for P5). But in Fig. 1(e), the transistors N4 and P5 provide the same path for only one input state (AB = 11 for N4 and AB = 00 for P5). However, the added NOT gate creates an intermediate node with a large capacitance, which increases the power consumption of the circuit. Therefore, Fig. 1(g) consumes more power than Fig. 1(e).
Combining the XOR and XNOR circuits of Fig. 1(a) and of Fig. 1(b) results in two further simultaneous XOR–XNOR gates. These structures inherit all the advantages and disadvantages of their constituent XOR/XNOR circuits.
- More Power Consumption
- More Critical Path Delay
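The sizing constraint stated for Fig. 1(d) (RP2 > RN2, RP3 > RN5) can be illustrated with a rough resistive-divider sketch. This is not the authors' analysis: it assumes each ON transistor behaves as a fixed linear resistor and that the output can switch once the contended node settles below VDD/2. The function names and resistance values are hypothetical.

```python
def contention_voltage(vdd, r_pull_up, r_pull_down):
    """Steady-state node voltage while a pull-up and a pull-down conduct together.

    Simplifying assumption: both ON transistors act as fixed linear resistors,
    so the node sits at the output of a plain resistive divider.
    """
    return vdd * r_pull_down / (r_pull_up + r_pull_down)


def can_switch(vdd, r_pull_up, r_pull_down):
    """True when the pull-down wins, i.e. the node settles below VDD / 2.

    Using VDD / 2 as the switching point is an assumption of this sketch;
    under it, the test is equivalent to r_pull_up > r_pull_down.
    """
    return contention_voltage(vdd, r_pull_up, r_pull_down) < vdd / 2


VDD = 1.2  # supply voltage used for the simulations reported in the text

# Hypothetical ON resistances in ohms, chosen only for illustration.
print(can_switch(VDD, r_pull_up=20e3, r_pull_down=10e3))  # case RP2 > RN2
print(can_switch(VDD, r_pull_up=10e3, r_pull_down=20e3))  # case RP2 < RN2
```

Under these assumptions the node can only be pulled low when the pMOS path is the more resistive one, which matches the direction of the inequality in the text. A real circuit needs a transistor-level simulation to fix the actual sizing margin.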
The non-full-swing XOR/XNOR circuit of Fig. 2(a) is efficient in terms of power and delay. Moreover, this structure suffers an output voltage drop for only one input combination. To solve this problem and provide an optimum structure for the XOR/XNOR gate, we propose the circuit shown in Fig. 2(b). The output of this structure is full swing for all possible input combinations. The proposed XOR/XNOR gate does not have NOT gates on the critical path of the circuit. Thus, it has a lower delay and good driving capability in comparison with the structures of Fig. 1(a) and (b). Although the proposed XOR/XNOR gate has one more transistor than the structure of Fig. 1(b), it demonstrates lower power dissipation and higher speed. The capacitances of inputs A and B of the XOR circuit shown in Fig. 2(b) are not symmetric, because one of the two inputs must be connected to the input of the NOT gate and the other to the diffusion of an nMOS transistor.
Furthermore, the input capacitances of transistors N2 and N3 are not equal in the optimal situation (minimum PDP). Also, the order in which the inputs are connected to transistors N2 and N3 does not affect the function of the circuit. Thus, it is better to connect the input A, which is also connected to the NOT gate, to the transistor with the smaller input capacitance. By doing this, the input capacitances become more symmetrical, and thus, the delay and power consumption of the circuit are reduced. To clarify which transistor (N2 or N3) has the larger input capacitance, let us consider the condition in which the inputs change from AB = 00 to AB = 10.
and B capacitances are not equal (the inputs A and B are connected to the same transistor count). Thus, to equalize the input capacitances, they are connected to the circuit. In this case, the input capacitances are approximately equal and the power and delay are optimized. This structure does not have any NOT gates on the critical path, and its output capacitance is very small. For this reason, it is very fast and consumes little power. The delays of the XOR and XNOR outputs of this circuit are almost identical, which reduces glitches in the next stage. Other advantages of this circuit are good driving capability, full-swing output, and robustness against transistor sizing and supply voltage scaling.
The proposed XOR/XNOR and simultaneous XOR–XNOR structures were compared with all the above-mentioned structures (Fig. 1). The simulation results for TSMC 65-nm technology and a 1.2-V power supply voltage (VDD) are shown in Table I. The input pattern was chosen so that all possible input combinations are included [Fig. 5(a)]. The maximum frequency of the inputs was 1 GHz, and a 4× unit-size inverter (FO4) was connected to the output as a load. The transistor sizes were selected for optimum PDP by using the proposed transistor sizing method, which is described in Section VI. The optimum transistor sizes for each XOR/XNOR and XOR–XNOR circuit are given in Table I. For both rising and falling output transitions, the delay is measured from 50% of the input voltage level to 50% of the output voltage level. The PDP is calculated by multiplying the worst case delay by the average power consumption of the main circuit. The results indicate that the performance of the proposed XOR/XNOR and simultaneous XOR–XNOR structures is better
than that of the compared structures. The proposed XOR and XNOR circuits [Fig. 2(b)] have the lowest PDP and delay, respectively, compared with the other XOR/XNOR circuits. Also, the delays of these two proposed circuits are very close to each other, which prevents glitches in the next stage. The delay, power consumption, and PDP of the XOR and XNOR circuits of Fig. 1(a) are almost equal, because the two circuits have the same structure. As mentioned earlier and according to the obtained results, the XOR circuit of Fig. 1(b) performs better than its XNOR circuit. The proposed simultaneous XOR–XNOR circuit is better in all three calculated parameters (delay, power dissipation, and PDP) than the other XOR–XNOR gates. It saves roughly 16.2%–85.8% in PDP and is 9%–83.2% faster than the other circuits. The circuits of Fig. 1(d) and (e) have very high delays because of their output feedback (which causes the slow-response problem). As can be seen in Table I, the efficiency of Fig. 1(e) is much worse, and its delay is four times that of the other circuits. Table I indicates that the structures with the fewest NOT gates on the critical path and without output feedback for correcting the output voltage level perform better.
To better evaluate the XOR–XNOR circuits, they were simulated at power supply voltages from 0.6 to 1.5 V and at output loads from FO1 to FO16. The results of these two simulations are shown in Fig. 5(b) and (c). As seen there, the proposed XOR–XNOR circuit has the best performance in both simulations when compared with the other structures.
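The measurement conventions described above (delay from the 50% point of the input to the 50% point of the output, PDP as worst case delay times average power, and relative savings) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the waveform is assumed to be a list of sampled points, and the numbers in the usage lines are invented rather than taken from Table I.

```python
def crossing_time(times, volts, level):
    """First time a sampled waveform crosses `level`, by linear interpolation."""
    points = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(times, v_in, v_out, vdd):
    """Delay from 50% of the input swing to 50% of the output swing.

    Assumes a single transition on each waveform within the sampled window.
    """
    half = vdd / 2
    return crossing_time(times, v_out, half) - crossing_time(times, v_in, half)


def pdp(worst_case_delay, average_power):
    """Power-delay product: worst case delay times average power."""
    return worst_case_delay * average_power


def saving_percent(reference, proposed):
    """Relative improvement of `proposed` over `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


# Invented example values, for illustration only.
t = [0.0, 1.0, 2.0]          # arbitrary time units
vin = [0.0, 1.2, 1.2]        # rising input
vout = [1.2, 1.2, 0.0]       # falling output
print(propagation_delay(t, vin, vout, vdd=1.2))
print(saving_percent(reference=10.0, proposed=8.0))
```

Taking the worst case delay over all input transitions, rather than an average, makes the PDP figure conservative. That matters for circuits such as Fig. 1(e), whose slow transitions occur only for particular input changes.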
We propose six new FA circuits for various applications, which are shown in Fig. 6. Fig. 7 shows the circuit layout of the proposed FA cell of Fig. 6(a). These new FAs employ the hybrid logic style, and all of them are designed by using the proposed XOR/XNOR or XOR–XNOR circuit. The well-known four-transistor 2-1-MUX structure [Fig. 8(a)] is used to implement the proposed hybrid FA cells. This 2-1-MUX is built in the TG logic style, which has no static or short-circuit power dissipation.
Fig. 6(a) shows the circuit of the first proposed hybrid FA (HFA-20T), which is made of two 2-1-MUX gates and the XOR–XNOR gate of Fig. 2(e). HFA-20T consists of 20 transistors and has no high-power-consumption NOT gates on the critical path. The advantages of this structure are full-swing output, low power dissipation, very high speed, and robustness against supply voltage scaling and transistor sizing. If A ⊙ B = 1 (that is, the XNOR of A and B is 1), the output Cout signal equals the input signal A or B. To equalize the input capacitances, both input signals A and B are used in the implementation and are connected to the transistors N9 and P10 [in Fig. 6(a)], respectively. The only problem of HFA-20T is its reduced output driving capability when it is used in chain structures, such as the ripple carry adder. Of course, this problem exists in all circuits that are implemented using the transmission function theory without output buffering. The layout of the proposed HFA-20T shown in Fig. 7 was designed for minimum power consumption.
One way to reduce the power consumption of FA structures is to use a single XOR/XNOR gate together with a NOT gate that generates the other (XNOR or XOR) signal. The proposed hybrid FA cell (HFA-17T) shown in Fig. 6(b) is designed by using the XOR gate of Fig. 2(b). This structure is made of 17 transistors, three fewer than HFA-20T.
The delay of HFA-17T is higher than that of HFA-20T because of the additional NOT gate on the critical path of HFA-17T (which generates the XNOR signal from the XOR signal). The power consumption of HFA-17T might be expected to be lower than that of HFA-20T because of the reduced number of transistors. However, the NOT gate on the critical path increases the short-circuit power, so there is no significant reduction in the total power dissipation of HFA-17T. The NOT gate also slightly improves the output driving capability of the circuit.
As mentioned earlier, a buffer on the output of a circuit is almost mandatory, especially in applications in which the output capacitance of each stage is high. In practice, the driving capability of VLSI circuits is degraded by the parasitic capacitors and resistors created during fabrication, as well as by the increase in the threshold voltage of the transistors over time; an output buffer improves this situation. Fig. 6(c) presents the third proposed hybrid FA, which has buffers on the Sum and Cout outputs (HFA-B-26T) and is made of 26 transistors. The critical path of HFA-B-26T contains the XOR–XNOR gate, one 2-1-MUX gate, and the NOT gates. The output NOT gates prevent the output nodes from being driven by the inputs of the circuit and also reduce the resistance from the output node of the circuit to the sources (VDD and GND). The power consumption and delay of HFA-B-26T are higher than those of HFA-20T and HFA-17T.
Fig. 6(d) shows another proposed hybrid FA with new buffers (HFA-NB-26T), in which the buffers are placed at the data inputs of the 2-1-MUX gates instead of at the outputs. If the input signals A and C are produced by buffers, then for all possible input combinations, the Sum and Cout outputs are not driven by the inputs of the circuit. For this purpose, three additional NOT gates are enough, because the inverted A signal is already available and the buffered A signal can be produced from it with one extra NOT gate. So the HFA-NB-26T circuit is made of 26 transistors.
The data input nodes of the 2-1-MUXs reach their final values (GND or VDD) before the XOR and XNOR signals are produced. Thus, the critical path of HFA-NB-26T consists of an XOR–XNOR gate and a 2-1-MUX gate, and its delay is lower than that of HFA-B-26T. The driving capability of HFA-NB-26T is slightly lower than that of HFA-B-26T, because the 2-1-MUX gate between the buffer and the output node increases the resistance from the output node to the sources (VDD and GND).
HFA-20T and HFA-17T have been designed to use as few transistors as possible. To produce the Sum output, only the XOR, XNOR, and C signals are used, so no additional NOT gate is needed to generate the inverted C signal. If the inverted C signal is also used to produce the Sum output, however, the XOR and XNOR signals no longer drive the Sum output through the TG multiplexer; they are connected only to the select lines of the 2-1-MUX. The capacitance of the XOR and XNOR nodes therefore becomes smaller, and the delay of the circuit improves. The circuits of Fig. 6(e) and (f) (named HFA-22T and HFA-19T, respectively) have been created by applying this idea to HFA-20T and HFA-17T, respectively. The power consumption and delay of HFA-22T and HFA-19T are expected to be lower than those of HFA-20T and HFA-17T, respectively (despite having two more transistors), because of the lower capacitance of the XOR and XNOR nodes. Also, with the added C signal, the driving capability of HFA-22T and HFA-19T is better than that of HFA-20T and HFA-17T, respectively.
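The logic function shared by these hybrid FA cells can be checked with a small behavioral model. It is a gate-level sketch only, not a transistor-level description of any circuit in Fig. 6: the XOR and XNOR signals act as MUX select lines, Cout takes A when XNOR is 1 and C otherwise, and Sum takes C or its complement. The function names are hypothetical.

```python
from itertools import product


def xor_xnor(a, b):
    """Simultaneous XOR-XNOR gate modelled as a pair of complementary outputs."""
    x = a ^ b
    return x, 1 - x


def mux2(select, d0, d1):
    """2-to-1 multiplexer: returns d1 when select is 1, otherwise d0."""
    return d1 if select else d0


def hybrid_fa(a, b, c):
    """Behavioral full adder built from one XOR-XNOR gate and two 2-1-MUXs."""
    xor, _xnor = xor_xnor(a, b)  # _xnor is the complementary select line
    # Sum = A xor B xor C: pass C when XNOR = 1, the complement of C when XOR = 1.
    s = mux2(xor, c, 1 - c)
    # Cout equals A (or B) when A xnor B = 1, and C otherwise.
    cout = mux2(xor, a, c)
    return s, cout


# Exhaustive check against integer addition for all eight input combinations.
for a, b, c in product((0, 1), repeat=3):
    s, cout = hybrid_fa(a, b, c)
    assert s + 2 * cout == a + b + c
print("all 8 input combinations match")
```

The model abstracts away everything that distinguishes the six proposed cells (buffer placement, which nodes drive the outputs, node capacitances), so it confirms only the shared logic function and says nothing about delay or power.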
- Circuit has very good speed, accuracy and convergence.
- Less Critical Path Delay.
- It has superior speed and power compared with other FA designs.
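The transistor counts quoted for the proposed cells can be cross-checked with simple arithmetic. Some building-block sizes below are inferred rather than stated outright: the seven-transistor XOR follows from "one more transistor" than the six-transistor gate of Fig. 1(b), the twelve-transistor XOR–XNOR gate follows from the 20-transistor total of HFA-20T, and a NOT gate is assumed to be a two-transistor static CMOS inverter. HFA-B-26T is left out because the text does not itemize its buffers.

```python
MUX_2_1 = 4            # four-transistor TG multiplexer (stated in the text)
NOT_GATE = 2           # assumption: static CMOS inverter, one pMOS + one nMOS
XOR_FIG_2B = 6 + 1     # inferred: one more transistor than the gate of Fig. 1(b)
XOR_XNOR_FIG_2E = 12   # inferred from the HFA-20T total: 20 - 2 * 4

hfa_20t = XOR_XNOR_FIG_2E + 2 * MUX_2_1
hfa_17t = XOR_FIG_2B + NOT_GATE + 2 * MUX_2_1   # NOT gate derives XNOR from XOR

counts = {
    "HFA-20T": hfa_20t,
    "HFA-17T": hfa_17t,
    "HFA-NB-26T": hfa_20t + 3 * NOT_GATE,       # three additional NOT gates
    "HFA-22T": hfa_20t + NOT_GATE,              # "two more transistors"
    "HFA-19T": hfa_17t + NOT_GATE,              # "two more transistors"
}

for name, total in counts.items():
    print(f"{name}: {total} transistors")
```

Each total agrees with the number embedded in the cell's name, which suggests the inferred block sizes are consistent with the descriptions above.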