# A Path Sensitized Glitch Free Approach using NCL with Return to One Protocol: TSNCL+

J. Sudhakar, Associate Professor, Department of ECE, Vignan's Institute of Engg for Women, Vishakapatnam, A.P

A. Mallikarjuna Prasad, Professor & Head, Department of ECE, JNTUK-Kakinada, A.P

Ajit Kumar Panda, Professor & Dean, Department of ECE, NIST-Berahmpur, Odisha

Abstract- Owing to a number of intrinsic overheads in synchronous circuit design, asynchronous designs have drawn attention in the semiconductor industry. Self-timed, clockless architectures provide a practical solution to several constraints that current and future technologies place on building integrated circuits (ICs) and systems. For designing clockless asynchronous circuits, Delay-Insensitive (DI) approaches are preferred because their timing analysis is easier. NULL Convention Logic (NCL) is a delay-insensitive paradigm for implementing self-timed circuits, and it enables power-, delay- and area-efficient designs in a standard-cell-based template.

This work examines a new asynchronous (clockless) logic design, NCL+, an advancement of NCL that supports the RTO (Return-to-One) protocol and mitigates glitch power for significant power savings. In this paper a low-power full adder is designed using NCL+ threshold gates, and a path sensitization technique is incorporated to further improve the efficiency of the proposed method. The TSNCL+ full adder is compared with traditional NCL+, NCL and CMOS logic in terms of glitches, power, propagation delay, power-delay product (PDP) and noise. The evaluation results for glitches, power and noise show the advantages of the proposed logic design.

Key words: Synchronous circuits, Asynchronous circuits, VLSI, CMOS, DI, NCL, NCL+, RTO, RTZ, Threshold gates, Glitch power dissipation, Transistor Sizing.

## Introduction

Synchronous circuit design is currently the dominant methodology in the semiconductor industry. However, the synchronous (clocked) approach faces major constraints, such as the increasing complexity of the clock distribution network, shrinking feature sizes, and difficulty of design reuse due to increased clock rates [1, 2]. To attain higher performance, considerable chip area is dedicated to clock drivers, causing the chips to dissipate high dynamic power during switching. Although glitches do not affect the performance of clocked architectures, they have a significant effect on power, accounting for 20%-70% of the total dynamic power. Several studies have assessed the importance of glitch power optimization [3, 4, 5]. As demand continues for templates with no glitch power, smaller feature sizes, less noise, easier component reuse, increased robustness and lower power, delay-insensitive self-timed asynchronous paradigms are widely used in the VLSI community [6, 7]. Designing DI circuits requires a 4-phase handshake protocol combined with a 1-of-n DI code for data [8].

NULL Convention Logic (NCL) is one of the promising methods for designing delay-insensitive asynchronous logic [1, 2, 9, 10, 13, 19]. To sustain delay insensitivity, NCL circuits use threshold gates with hysteresis. NCL gates use the return-to-zero protocol, in which the absence of data is signaled by driving all wires of a data channel to zero. Various CMOS schemes, including dynamic, semi-static and static implementations, have been introduced for designing delay-insensitive self-timed asynchronous NCL gates [8, 9, 14, 20].

In this paper, we propose a novel approach for designing self-timed asynchronous circuits, NCL+, which supports the return-to-one protocol [11], meaning that the spacer is encoded by all wires of a channel being at 1. As a result, the series transistors that are unavoidable in the pull-up network are moved to the pull-down network, which benefits from the higher mobility of electrons.

To achieve low power, NCL+ minimizes dynamic power dissipation by reducing the switching activity caused by unbalanced delay paths, thus mitigating glitch power significantly. The paper is organized as follows. Section II gives an overview of NCL gates and briefly explains the RTO NCL+ design methodology. Section III briefly explains the generation and propagation of glitches in conventional CMOS and NCL gates. In Section IV, a transistor sizing methodology is incorporated into the NCL+ concept, called TSNCL+, to achieve better circuit performance, and the static implementation of the TSNCL+ full adder is compared with traditional CMOS logic, NCL and NCL+ in terms of glitches, power consumption, propagation delay, noise and power-delay product. Section VI concludes the paper.

## **Literature Survey**

In return-to-zero (RTZ) DI paradigms, data is encoded using DI codes that belong to the m-of-n class [15], which consists of n-bit code words in which exactly m bits are at 1 and all other (n-m) bits are at 0. The dual-rail (DR) code, or 1-of-2 code, is a case of the m-of-n code that uses two wires (d.t, d.f) to represent one DATA bit. Regardless of the data encoding scheme, handshake protocols can be classified as 2-phase or 4-phase. 4-phase protocols are simpler to design than 2-phase protocols, which reduces hardware overhead. When a 1-of-n code is coupled to a 4-phase handshake protocol, communication is started by the sender. Data is asserted exactly when one of the n wires is at a specified logic value, and the absence of data, called a spacer, can be marked by any of the other $2^n - n$ code words [17].



Figure 1: RTZ 1-of-2 data transmission.

Figure 1 shows the RTZ protocol with a 1-of-2 code. Transmission begins with all wires at 0. When valid encoded data is propagated on the channel, the receiver acknowledges it with the ack signal.

TABLE I: RTZ Wire Encoding.

| Wire Name | Spacer | Bit '0' | Bit '1' |
|-----------|--------|---------|---------|
| D.t       | 0      | 0       | 1       |
| D.f       | 0      | 1       | 0       |

A spacer is represented by all data wires returning to 0, which ends the transmission. After the receiver acknowledges the data with a low-to-high transition, the spacer is in turn acknowledged by the receiver, ending the communication and allowing a new transmission to start on the channel [18].
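
The dual-rail RTZ encoding of Table I can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the `(D.t, D.f)` tuple order is an assumption made here for readability.

```python
# Sketch of the RTZ 1-of-2 (dual-rail) encoding of Table I.
# All names are illustrative and do not come from the paper.

RTZ_SPACER = (0, 0)  # (D.t, D.f): both wires low means "no data"


def rtz_encode(bit):
    """Return the (D.t, D.f) wire pair for one data bit under RTZ."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 1 else (0, 1)


def rtz_decode(wires):
    """Return 0 or 1 for valid data and None for the spacer."""
    if wires == RTZ_SPACER:
        return None
    if wires == (1, 0):
        return 1
    if wires == (0, 1):
        return 0
    raise ValueError("both rails asserted: illegal 1-of-2 code word")


def rtz_transmit(bits):
    """Yield the channel values for a bit stream: data, then spacer."""
    for bit in bits:
        yield rtz_encode(bit)
        yield RTZ_SPACER
```

Every data word is followed by a spacer, which is what lets the receiver distinguish two consecutive identical bits without a clock.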

NULL Convention Logic (NCL) [13, 19] is a technique developed for implementing asynchronous circuits. NCL threshold gates satisfy the weak conditions of Seitz's delay-insensitive signaling: "all inputs of a combinational circuit must be NULL before all outputs become NULL", along with the condition that "all inputs of the circuit must be DATA before all outputs become DATA". The main advantage of NCL is its delay insensitivity, which makes timing analysis unnecessary and reduces glitch propagation.



Figure 2: THmn threshold gate.

NCL circuits are built from 27 fundamental threshold gates [12], which can implement all functions of four or fewer inputs. The rudimentary element of NCL is the threshold gate. The THmn gate shown in Figure 2 is the first of two types of NCL threshold gate: it has n inputs, and the threshold value m, where $1 \leq m \leq n$, is written inside the gate. A THmn gate requires at least m of its inputs to be asserted before the output becomes asserted.



Figure 3: THmnWw1w2w3...wR weighted gate.

The other type is the weighted threshold gate, denoted THmnWw1w2w3...wR. A weighted threshold gate has an integer weight wR, with $m \ge wR > 1$, applied to input R. Here $1 \le R < n$, where n is the number of inputs, m is the threshold value of the gate, and w1, w2, w3, ..., wR are the integer weights of inputs 1 ... R, respectively. For example, consider the TH24W2 gate shown in Figure 4, which has n=4 inputs, labeled A, B, C and D, and a threshold of 2. Input A has a weight of 2, so asserting A alone meets the threshold, as does asserting any two of B, C and D.



Figure 4: TH24W2 threshold gate.

NCL threshold gates are implemented with hysteresis, a state-holding capability such that all asserted inputs must be de-asserted before the output is de-asserted. This ensures that the inputs switch completely back to NULL before the output is asserted for the next wave-front of input data. NCL gates may include a RESET input that initializes the output to 0 or 1, denoted by a d or an n placed after the threshold labeled inside the gate. A d indicates that the output is reset to DATA, or 1, and an n indicates that the output is reset to NULL, or 0 [13].
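
The set, hold and reset behaviour of THmn and weighted THmnWw1...wR gates can be captured in a small behavioural model. The sketch below is a software illustration only, not a transistor-level description; the class name and its parameters are assumptions introduced here.

```python
class ThresholdGate:
    """Behavioural model of an NCL THmn / THmnWw1..wR gate with hysteresis.

    Illustrative sketch: `weights` lists the weights of the first R inputs,
    and every remaining input has weight 1.
    """

    def __init__(self, m, n, weights=None, reset_value=0):
        weights = list(weights or [])
        if len(weights) > n:
            raise ValueError("more weights than inputs")
        self.m = m
        self.weights = weights + [1] * (n - len(weights))
        self.output = reset_value  # 0 models an 'n' gate, 1 a 'd' gate

    def evaluate(self, inputs):
        if len(inputs) != len(self.weights):
            raise ValueError("wrong number of inputs")
        weighted_sum = sum(w * x for w, x in zip(self.weights, inputs))
        if weighted_sum >= self.m:
            self.output = 1   # threshold met: assert the output
        elif not any(inputs):
            self.output = 0   # every input de-asserted: reset the output
        # any other input pattern: hold the previous output (hysteresis)
        return self.output


th23 = ThresholdGate(m=2, n=3)
th24w2 = ThresholdGate(m=2, n=4, weights=[2])  # input A carries weight 2
```

For example, `th23.evaluate((1, 1, 0))` asserts the output, and a following `th23.evaluate((1, 0, 0))` keeps it asserted until all three inputs return to 0.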

The RTO protocol [17] is similar to RTZ, the only difference being that the wires are inverted. Communication starts with all wires of the channel at 1. When the sender places valid data on the channel, the receiver acknowledges it by lowering the ack signal. To denote a spacer, all data wires must return to 1, ending the transmission. When the receiver detects the spacer, it raises the ack signal so that a new transmission can begin.

Both RTZ and RTO protocols support m-of-n DI codes, and the two can be interfaced using only n inverters.



Figure 5: RTO 1-of-2 data transmission.

Table III: RTO Wire Encoding

| Wire Name | Spacer | Bit '0' | Bit '1' |
|-----------|--------|---------|---------|
| D.t       | 1      | 1       | 0       |
| D.f       | 1      | 0       | 1       |

For the generalization to m-of-n codes, the logical value of an RTO D.x wire can be obtained from its RTZ counterpart:

$$\forall x \in \mathbb{N},\ 0 \le x \le n-1: \quad \text{RTO}(D.x) = \neg\,\text{RTZ}(D.x) \qquad (1)$$

where RTZ (D.x) and RTO (D.x) refer to the logical wire values in the RTZ and RTO domains, respectively. Thus the conversion of data from one domain to the other is delay-insensitive [16].
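
The wire-by-wire inversion between the two domains can be checked against the encoding tables with a short Python sketch. The helper names and the `(D.t, D.f)` tuple order are assumptions made for this illustration.

```python
def rtz_to_rto(wires):
    """Invert every wire of an RTZ code word to obtain the RTO code word."""
    return tuple(1 - w for w in wires)


def rto_to_rtz(wires):
    """Inversion is its own inverse, so the same operation converts back."""
    return tuple(1 - w for w in wires)


# RTZ dual-rail code words as (D.t, D.f), taken from the RTZ encoding table.
RTZ_TABLE = {"spacer": (0, 0), "bit0": (0, 1), "bit1": (1, 0)}

# Applying the inversion wire by wire yields the RTO encoding.
RTO_TABLE = {name: rtz_to_rto(code) for name, code in RTZ_TABLE.items()}
```

The derived `RTO_TABLE` has the spacer at `(1, 1)`, bit '0' at `(1, 0)` and bit '1' at `(0, 1)`, which agrees with the RTO wire encoding table.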

Threshold functions can be implemented with a basic set of 14 NCL+ gates. Under the RTO protocol, the switching function of an NCL+ gate is the reverse of that of its NCL counterpart [11].



Figure 6: Basic set of 14 NCL+ gates.

In NCL+, the output switches to logic 1 only when all inputs are at logic 1 and switches to logic 0 only when at least m of its inputs are at logic 0; for any other combination of inputs, the output holds its previous value. NCL and NCL+ gates require the same number of transistors and are typically equivalent in terms of cell area and topology. In this paper, we use the static implementations of NCL and NCL+ gates. Figure 7 shows the traditional CMOS implementation of the TH23 threshold gate, while Figure 8 and Figure 9 show its static implementations in NCL and NCL+.
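
The relationship between an NCL gate and its NCL+ counterpart can be illustrated with a behavioural sketch for unweighted THmn gates. The function names are hypothetical, and the exhaustive check only covers the abstract switching rule stated above, not any electrical property.

```python
from itertools import product


def ncl_next(m, inputs, prev):
    """NCL THmn: set on at least m ones, reset on all zeros, else hold."""
    if sum(inputs) >= m:
        return 1
    if not any(inputs):
        return 0
    return prev


def ncl_plus_next(m, inputs, prev):
    """NCL+ THmn: go to 0 on at least m zeros, go to 1 on all ones, else hold."""
    zeros = len(inputs) - sum(inputs)
    if zeros >= m:
        return 0
    if all(inputs):
        return 1
    return prev


def is_dual(m, n):
    """Check that NCL+ equals NCL with inputs, state and output inverted."""
    for inputs in product((0, 1), repeat=n):
        inverted = tuple(1 - x for x in inputs)
        for prev in (0, 1):
            expected = 1 - ncl_next(m, inverted, 1 - prev)
            if ncl_plus_next(m, inputs, prev) != expected:
                return False
    return True
```

Calling `is_dual(2, 3)` walks through every input pattern and stored state of a TH23 gate and confirms that the two rules mirror each other under inversion.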



Figure 7: TH23 using traditional CMOS logic.



Figure 8: TH23 NCL gate.



Figure 9: TH23 NCL+ gate.

## **Glitch Generation and Propagation**

Spurious transitions due to computational activity are a well-known source of power dissipation. Curtailing glitch power is a desirable target in CMOS design because only one transition per clock cycle is functionally necessary. Unfortunately, glitch power depends mostly on misalignment of the input signals and on propagation delays. The differential path delay is defined as the maximum difference in the arrival times of the input signals of a gate. It equivalently gives the maximum width of the spurious spike, or glitch, at the output, which switches to a faulty value before settling to the correct one.



Figure 10: Propagation of glitch in CMOS gates

Consider the 3-input AND gate shown in Figure 10. Assume that the signal path of input A is fast (i.e. t=0) and that B and C are slow, with unit delay (t=1). Initially, if A=0, B=1 and C=1, the output Z is 0. If A then changes to 1 and B and C change to 0, the 0 values arrive late at inputs B and C because their paths are slow, and hence Z switches to 1 momentarily before switching back to 0, resulting in an unwanted transition called a glitch. Thus the probability of propagating an output glitch, P(OG), in conventional digital logic gates is due to the probability of a glitch occurring at input A, P(GA), or input B, P(GB), or input C, P(GC), or at both inputs (A, B), P(GA $\cap$ GB), or (B, C), P(GB $\cap$ GC), or (C, A), P(GC $\cap$ GA), or at all of (A, B, C), P(GA $\cap$ GB $\cap$ GC).

$$P(OG) = P(GA) + P(GB) + P(GC) + P(GA \cap GB) + P(GB \cap GC) + P(GC \cap GA) + P(GA \cap GB \cap GC) = 1$$

where '1' represents that propagation of a glitch exists.
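
The AND-gate example can be reproduced with a unit-delay simulation. The sketch below samples the gate at integer time steps; the function names and the sampling window are assumptions made for illustration.

```python
def and3_waveform(old, new, delays, t_end=3):
    """Sample Z = A & B & C at integer times t = -1 .. t_end - 1.

    Each input switches from its old to its new value once its own
    path delay has elapsed (the source changes at t = 0).
    """
    wave = []
    for t in range(-1, t_end):
        seen = [n if t >= d else o for o, n, d in zip(old, new, delays)]
        wave.append((t, int(all(seen))))
    return wave


def count_transitions(wave):
    """Count output changes between consecutive samples."""
    values = [z for _, z in wave]
    return sum(a != b for a, b in zip(values, values[1:]))


# A is fast (delay 0); B and C arrive one time unit later.
skewed = and3_waveform(old=(0, 1, 1), new=(1, 0, 0), delays=(0, 1, 1))
# Same input change with balanced path delays, for comparison.
balanced = and3_waveform(old=(0, 1, 1), new=(1, 0, 0), delays=(0, 0, 0))
```

With the skewed delays the output starts and ends at 0 yet makes two transitions, which is the glitch; with balanced delays it makes none.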



Figure 11: Optimization of glitches using NCL

In this paper, we use NULL Convention Logic as an optimization that mitigates glitch propagation through the circuit compared with standard static CMOS design. NCL suppresses glitches because of the completion of NULL and DATA wave-fronts and because data transitions are monotonic. This can be illustrated by the following example. Consider a TH23 threshold gate with 3 inputs A, B, C and threshold value m=2, which implies that for the output to be asserted, either (A, B), (B, C) or (A, C) must be asserted, or all of the inputs (A, B, C) must be asserted. Assume that path A propagates with no delay and paths B and C with a unit delay. To assert the output, inputs A and B must both be asserted; while B is still in flight on its unit-delay path, the output is not asserted because the threshold requirement is not met, so no glitch propagates to the output, as shown in Figure 11. Thus the probability of a glitch occurring at the output is zero. Similarly, for the other combinations (A, C), (B, C) and (A, B, C), the output is not asserted unless the threshold value of 2 is met, which eliminates unnecessary transitions at the output and results in glitch power reduction.

$$P(OG) = P(GA) + P(GB) + P(GC) + P(GA \cap GB) + P(GB \cap GC) + P(GC \cap GA) + P(GA \cap GB \cap GC) = 0$$

where '0' represents no glitch at the output.
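
The same unit-delay experiment can be run on a behavioural TH23 gate with hysteresis. This is a sketch under the assumption that inputs change monotonically within a wave-front (NULL to DATA, then DATA to NULL); the function names are illustrative.

```python
def th23_next(inputs, prev):
    """TH23 with hysteresis: set on two or more ones, reset on all zeros."""
    if sum(inputs) >= 2:
        return 1
    if not any(inputs):
        return 0
    return prev


def th23_waveform(old, new, delays, start_output, t_end=3):
    """Sample the gate output at integer times t = -1 .. t_end - 1."""
    out = start_output
    wave = []
    for t in range(-1, t_end):
        seen = tuple(n if t >= d else o for o, n, d in zip(old, new, delays))
        out = th23_next(seen, out)
        wave.append((t, out))
    return wave


def count_transitions(wave):
    """Count output changes between consecutive samples."""
    values = [z for _, z in wave]
    return sum(a != b for a, b in zip(values, values[1:]))


# DATA wave-front: A and B become asserted, A with no delay, B one unit late.
data_wave = th23_waveform((0, 0, 0), (1, 1, 0), (0, 1, 1), start_output=0)
# NULL wave-front: the asserted inputs return to 0 with the same path skew.
null_wave = th23_waveform((1, 1, 0), (0, 0, 0), (0, 1, 1), start_output=1)
```

Each wave-front produces exactly one output transition, the functional one, even though the inputs arrive skewed.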

## **Proposed TSNCL+**

Advances in VLSI continually scale down silicon technology to meet the rising demand for low power, better functionality and high efficiency. Low power has become an important concern in the semiconductor design flow, and most power consumption is due to switching activity, i.e. dynamic power. This paper designs a full adder circuit in which glitches are reduced and power and delay are improved by means of transistor-sized NCL+. Transistor sizing is one of the significant methods for determining circuit performance [24, 25], and it has been a significant design automation application for many years. It is essentially the process of scaling the transistors according to the performance requirements of the circuit, which also yields a considerable reduction in area. To reduce the effect of propagation delay due to the unbalanced delay paths of a gate, a slack is computed at each gate; the slack corresponds to the amount by which the gate can be slowed down without affecting the critical path delay. The transistors on the critical path (shown in Figure 12) are scaled to improve the delay and power of the circuit and, mainly, to reduce the propagation of glitches through the circuit, thereby achieving higher efficiency and reduced delay, power, power-delay product and noise with transistor-sized NCL+.
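
The notion of per-gate slack can be illustrated with a generic static-timing pass over a small gate graph. This is a minimal sketch, not the sizing flow used in this work: the netlist, gate names and delay values below are hypothetical.

```python
def compute_slack(gates, delay):
    """Return (arrival, slack) for a combinational gate graph.

    gates: dict mapping gate name -> list of fan-in gate names, listed in
           topological order (fan-ins before the gates they drive).
    delay: dict mapping gate name -> gate delay.
    """
    # Forward pass: latest time at which each gate output settles.
    arrival = {}
    for g, fanin in gates.items():
        arrival[g] = delay[g] + max((arrival[f] for f in fanin), default=0)
    critical = max(arrival.values())

    # Backward pass: latest settle time that keeps the critical delay fixed.
    required = {g: critical for g in gates}
    for g in reversed(list(gates)):
        for f in gates[g]:
            required[f] = min(required[f], required[g] - delay[g])

    slack = {g: required[g] - arrival[g] for g in gates}
    return arrival, slack


# Hypothetical five-gate netlist with made-up unit delays.
netlist = {"g1": [], "g2": [], "g3": ["g1", "g2"], "g4": ["g2"], "g5": ["g3", "g4"]}
delays = {"g1": 2, "g2": 1, "g3": 3, "g4": 1, "g5": 2}
arrival, slack = compute_slack(netlist, delays)
critical_path = [g for g in netlist if slack[g] == 0]
```

Gates with zero slack form the critical path and are the candidates for upsizing, while gates with positive slack can be slowed down by that amount without lengthening the critical path.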



Figure 12: Critical path of TH23 NCL+ gate.

In this paper a full adder is implemented using the static CMOS approach [21, 22, 23]. The sum of the inputs X and Y is written to the output S, and a carry signal C is produced. We assume dual-rail encoding, such that each data bit takes two wires, one for true (1) and one for false (0).



Figure 13: Gate level schematic of TSNCL+ full adder using threshold gates.

The gate-level implementation of the TSNCL+ full adder is shown in Figure 13. The gate-level schematics of NCL and NCL+ are the same; the only difference is that the logical data rails are inverted.




Figure 14: TSNCL+ full adder system.

The full adder uses four threshold gates to add the input data and produce the output data. The inputs and outputs are configured from the input and output registers in mutually exclusive assertion groups. The input registers receive alternating DATA and NULL wave-fronts. The adder does not hold a valid sum until all inputs of the DATA wave-front have been received and propagated. Therefore a DATA wave-front cannot pass through the TSNCL+ combinational circuit until all the inputs are DATA, or the subsequent register has requested to pass the DATA. After the valid DATA is acknowledged, the output register sends a request for a NULL wave-front, rfn, to the completion circuitry in order to detect the completion of the DATA wave-front.
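
A four-gate dual-rail full adder can be modelled behaviourally as follows. The sketch assumes the commonly published NCL structure of two TH23 gates for the carry rails and two TH34W2 gates for the sum rails; whether Figure 13 uses exactly this arrangement is an assumption, as are the carry-in name `ci` and all class and function names. It is written in the RTZ (NCL) domain; the NCL+ version would use the same structure with inverted rails.

```python
class Gate:
    """Threshold gate with hysteresis; weights[i] is the weight of input i."""

    def __init__(self, m, weights):
        self.m, self.weights, self.out = m, weights, 0

    def step(self, *inputs):
        total = sum(w * x for w, x in zip(self.weights, inputs))
        if total >= self.m:
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out  # otherwise hold the previous value


class NclFullAdder:
    """Dual-rail full adder: signals are (true_rail, false_rail) pairs."""

    def __init__(self):
        self.co0 = Gate(2, (1, 1, 1))       # TH23 on the false rails
        self.co1 = Gate(2, (1, 1, 1))       # TH23 on the true rails
        self.s0 = Gate(3, (2, 1, 1, 1))     # TH34W2, carry rail has weight 2
        self.s1 = Gate(3, (2, 1, 1, 1))     # TH34W2, carry rail has weight 2

    def step(self, x, y, ci):
        co0 = self.co0.step(x[1], y[1], ci[1])
        co1 = self.co1.step(x[0], y[0], ci[0])
        s0 = self.s0.step(co1, x[1], y[1], ci[1])
        s1 = self.s1.step(co0, x[0], y[0], ci[0])
        return (s1, s0), (co1, co0)


NULL = (0, 0)


def dual_rail(bit):
    """Encode a bit as (true_rail, false_rail)."""
    return (1, 0) if bit else (0, 1)


adder = NclFullAdder()
sum_out, carry_out = adder.step(dual_rail(1), dual_rail(1), dual_rail(0))
null_out = adder.step(NULL, NULL, NULL)   # NULL wave-front clears the gates
```

In this model the sum rails stay NULL until all three inputs are DATA, because each sum gate needs the carry rail plus at least one input rail, or all three input rails, to reach its threshold.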

## **Results & Discussions**

The proposed full adder was also implemented using NCL and traditional CMOS logic for comparison. Table IV shows the measured values of noise, power, delay and PDP for CMOS logic, NCL, NCL+ and TSNCL+. The main objective of the proposed TSNCL+ design methodology is to reduce glitch power and to achieve better circuit performance.

Table IV. Comparison of CMOS, NCL, NCL+ and TSNCL+

| LOGIC           | NOISE (V) | POWER (W)             | DELAY (ns) | PDP (fJ) |
|-----------------|-----------|-----------------------|------------|----------|
| CMOS            | 752.65G   | 5.35×10<sup>-7</sup>  | 51.45      | 27.52    |
| NCL             | 505.54K   | 2.77×10<sup>-7</sup>  | 151.82     | 41.82    |
| NCL+            | 182.99K   | 1.52×10<sup>-7</sup>  | 101.51     | 15.42    |
| Proposed TSNCL+ | 208.78    | 1.19×10<sup>-7</sup>  | 101.49     | 12.07    |
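
The PDP column can be cross-checked from the power and delay columns, since PDP is the product of the two. The short sketch below recomputes it in femtojoules from the tabulated values; the function names are illustrative, and small differences from the reported figures are expected because the tabulated power values are rounded.

```python
# (power in W, delay in ns, reported PDP in fJ), copied from Table IV.
results = {
    "CMOS":   (5.35e-7, 51.45, 27.52),
    "NCL":    (2.77e-7, 151.82, 41.82),
    "NCL+":   (1.52e-7, 101.51, 15.42),
    "TSNCL+": (1.19e-7, 101.49, 12.07),
}


def pdp_fj(power_w, delay_ns):
    """Power-delay product in femtojoules: W * s = J, then J * 1e15 = fJ."""
    return power_w * (delay_ns * 1e-9) * 1e15


def saving_percent(reference, improved):
    """Relative reduction of `improved` with respect to `reference`."""
    return 100.0 * (reference - improved) / reference


recomputed = {name: pdp_fj(p, d) for name, (p, d, _) in results.items()}
power_saving_vs_ncl_plus = saving_percent(results["NCL+"][0], results["TSNCL+"][0])
```

The recomputed products agree with the reported PDP values to within about 0.3 fJ, which confirms that the column is expressed in femtojoules.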

The inputs are received as alternating DATA and NULL wave-fronts. The adder does not hold a valid sum output until all inputs of the DATA wave-front have been received and propagated. After the valid DATA is acknowledged, the output register sends a request for NULL, rfn, to the completion circuitry in order to detect the completion of DATA, and sends a request for DATA, the rfd signal, to the input register. This cycle continues until all the inputs have been received and propagated to the output register.

Because of the hysteresis behavior, the gates do not pass NULL until all inputs of the NULL wave-front have been received and propagated. So when a NULL wave-front arrives at the input register, the NULL values are passed to the output only when all the inputs become NULL. After the NULL wave-front has passed, the completion circuitry detects the NULL and sends a request for DATA, the rfd signal, to the previous register, indicating that it has acknowledged and stored the NULL wave-front and that the previous register can pass a DATA wave-front. Because NULL and DATA wave-fronts run to completion, unwanted switching transitions are reduced, thereby reducing the glitching effect.
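
The role of the completion circuitry can be sketched as a function over the dual-rail register outputs. This is a behavioural illustration in the RTZ domain only; the function names are hypothetical, and the request is represented by the strings "rfn" and "rfd" rather than by logic levels.

```python
def is_data(pair):
    """A dual-rail signal is DATA when exactly one rail is asserted."""
    return pair in ((1, 0), (0, 1))


def is_null(pair):
    """A dual-rail signal is NULL when both rails are de-asserted (RTZ)."""
    return pair == (0, 0)


def completion(outputs, previous_request):
    """Return the request issued after observing the register outputs.

    "rfn" (request for NULL) once every output is DATA,
    "rfd" (request for DATA) once every output is NULL,
    otherwise the previous request is held.
    """
    if all(is_data(p) for p in outputs):
        return "rfn"
    if all(is_null(p) for p in outputs):
        return "rfd"
    return previous_request


request = "rfd"                                    # start by asking for DATA
request = completion([(1, 0), (0, 0)], request)    # partial DATA: still "rfd"
request = completion([(1, 0), (0, 1)], request)    # complete DATA: "rfn"
```

Holding the previous request while the wave-front is only partially present is what keeps a new wave-front from overwriting one that has not yet been fully captured.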

The experimental results in Table IV suggest that traditional CMOS logic has the lowest delay, but its drawback is that it consumes more power and generates more noise due to the propagation of glitches. The proposed TSNCL+ design is used to reduce the glitches caused by unbalanced delay paths and to achieve low power. It offers significant power savings, which is particularly attractive for self-timed asynchronous templates, a lower power-delay product, and robustness to noise owing to the inverted data wire logic (all the wires of traditional NCL are inverted); the downside is that it incurs a delay overhead compared with Boolean CMOS logic. Figure 16 shows that glitches are present at the output of the CMOS full adder due to unwanted switching activity and need to be reduced. These output glitches are a major source of the total dynamic power dissipation.



Figure 15: Gate level implementation of full adder using CMOS logic.



Figure 16: Propagation of glitch in static CMOS full adder.

NCL threshold gates are employed to reduce the effect of glitches seen in CMOS logic. However, Figure 18 shows that a small amount of glitching is still produced at the output, and this can be further reduced by using the NCL+ design methodology. The simulation waveforms in Figure 20 show that glitches are suppressed by curtailing the switching transitions and that a considerable reduction in power is obtained.



Figure 17: Gate level implementation of full adder using NCL.



Figure 18: Glitch propagation of full adder using NCL.



Figure 19: Gate level implementation of full adder using NCL+.



Figure 20: Glitching effect in full adder using NCL+.

To attain better performance than the NCL+ design and to suppress the remaining glitches, transistor-sized NCL+ is used, which offers significant power savings and reduces delay compared with traditional NCL+ logic. Figure 21 identifies the critical path of the TSNCL+ full adder, which is scaled to improve propagation delay and to greatly reduce glitch power (Figure 22 shows that the propagation of glitches is reduced significantly) without affecting the performance of the circuit. The layout of the TSNCL+ full adder is shown in Figure 23.



Figure 21: Critical path determination in TSNCL+



Figure 22: Reduction of glitch signals in TSNCL+



Figure 23: Mask Layout for proposed TSNCL+ Full adder

## **Future Scope & Conclusion**

This paper presents a new design methodology that incorporates delay-insensitive NCL+ into the NCL threshold paradigm for low power consumption. Owing to the RTO protocol and its delay-insensitive nature, drawbacks of traditional CMOS and NCL designs can be eliminated. The results suggest that the proposed technique significantly reduces the glitches caused by unbalanced delay paths by minimizing unwanted switching transitions, thereby reducing power, noise and PDP, although it has a larger propagation delay than CMOS logic. To achieve better circuit performance, the transistor-sized NCL+ methodology is used. The main motive of TSNCL+ is to reduce glitch power, which provides further room to trade for low power consumption. As clockless asynchronous architectures are an enticing solution to low-power challenges, the proposed TSNCL+ is a candidate for future low-power and high-precision applications.

### References

- [1]. J. T. Roark and S. C. Smith, "Demonstration of the Benefit of Asynchronous vs. Synchronous Circuits," ASEE Midwest Section Conference, September 2012.
- [2]. Scott C. Smith, Jia Di, "Designing Asynchronous Circuits using NULL Convention Logic (NCL)", Synthesis Lectures on Digital Circuits and Systems, Vol. 4/1, July 2009, Morgan & Claypool Publishers
- [3]. J.Sudhakar, K.Tirupathi Rao, B.Suresh, "Glitch Power Minimization Techniques in Low Power VLSI Circuits", International Journal of Emerging Technology and Advanced Engineering, November 2012.
- [4]. M. Favalli and L. Benini, "Analysis of glitch power dissipation in CMOS IC's," in Proc. Int. Symp. Low-Power Design, Apr. 1995.
- [5]. Vikas, Deepak, "A Review on Glitch Reduction Techniques", International Journal of Research in Engineering and Technology, Volume: 03 Issue: 02, Feb-2014.
- [6]. V. Satagopan, B. Bhaskaran, A. Singh, and S. C. Smith, "Energy Calculation and Estimation for Delay-Insensitive Digital Circuits," Elsevier's Microelectronics Journal, Vol. 38/10-11, October/November 2007.
- [7]. B. Bhaskaran, V. Satagopan, and S. C. Smith, "High-Speed Energy Estimation for Delay-Insensitive Circuits", The 2005 International Conference on Computer Design, June 2005.
- [8]. P. A. Beerel, R. O. Ozdag and M. Ferretti, "A Designer's Guide to Asynchronous VLSI", Cambridge University Press, 2010.
- [9]. K. M. Fant, and S. A. Brandt, "NULL Convention Logic: a complete and consistent logic for asynchronous digital circuit synthesis," in Appl. Specific Syst., Architectures and Processors, Proc. of Int. Conf. on, Aug. 1996.
- [10]. Farhad A. Parsan and Scott C. Smith, "CMOS Implementation of Static Threshold Gates with Hysteresis: A New Approach", IEEE/IFIP Int. Conf. on VLSI and System-on-Chip (VLSI-SoC), Oct. 2012.
- [11]. Matheus T. Moreira, Carlos H. M. Oliveira, Ricardo C. Porto, Ney L. V. Calazans, "NCL+: Return-to-One Null Convention Logic", Circuits and Systems (MWSCAS), aug 2013.
- [12]. Farhad A. Parsan and Scott C. Smith, "CMOS Implementation Comparison of NCL Gates," IEEE Int. Midwest Sym. on Circuits and Systems (MWSCAS), Aug. 2012.
- [13]. K. M. Fant, S. A. Brandt. "NULL convention logic: a complete and consistent logic for asynchronous digital circuit synthesis". In ASAP'96, 1996.
- [14]. S. Yancey, and S. C. Smith, "A differential design for C-elements and NCL gates," in Circ. and Syst., 53rd IEEE Int. Midwest Symp. on, Aug. 2010.
- [15]. A. J. Martin and M. Nyström, "Asynchronous Techniques for System on-Chip Design", Proceedings of the IEEE, 2006, vol. 94(6).
- [16]. A. J. Martin, "The limitations to delay insensitivity in asynchronous circuits", In ARVLSI, 1990.
- [17]. Matheus T. Moreira, Ney L. V. Calazans, "Quasi-Delay-Insensitive Return-to-One Design", 2014.
- [18]. Matheus T. Moreira, Carlos H. M. Oliveira, Ricardo C. Porto, Ney L. V. Calazans, "Design of NCL Gates with the ASCEnD Flow", IEEE Circuits and Systems (LASCAS), March 1 2013.
- [19]. Scott Christopher Smith, "Gate and Throughput Optimizations for Null Convention Self-Timed Digital Circuits", Dissertation, 2001.
- [20]. Harish Gopalakrishnan, "Energy Reduction for Asynchronous Circuits in Soc Applications", Dissertation, 2005.
- [21]. S. C. Smith, R. F. DeMara, J. S. Yuan, D. Ferguson, and D. Lamb, "Optimization of NULL Convention Self-Timed Circuits", Elsevier's Integration, the VLSI Journal, Vol. 37/3, August 2004.
- [22]. David A. Duncan, Gerald E. Sobelman, Karl M. Fant, "Null convention adder", US patent US 5793662 A, 11 Aug 1998.
- [23]. B. Bhaskaran, V. Satagopan, W. Al-Assadi, and S. C. Smith, "Implementation of Design For Test for Asynchronous NCL Designs," The 2005 International Conference on Computer Design, June 2005.
- [24]. Kumara Swamy H. L, Kotresh E. Marali, Siddalingesh, S. Navalgund, "Novel Techniques For Circumventing The Glitch Effects On Digital Circuits For Low Power VLSI Design", International Journal of Engineering Research & Technology (IJERT), Vol. 2 Issue 3, March – 2013.
- [25]. M. Jasmi, "Optimization Techniques for Low Power VLSI Circuits", Middle-East Journal of Scientific Research 20 (9): 1082-1087, 2014.