A Low Area Overhead Fault Tolerant Strategy for Multiple Stuck-At-Faults in Digital Circuits

John Kalloor*
Research Scholar, Department of Electrical and Electronics Engineering, Annamalai University, Chidambaram, Tamil Nadu, India.
Orcid Id: 0000-0002-3091-0617

B.Baskaran
Professor, Department of Electrical and Electronics Engineering, Annamalai University, Chidambaram, Tamil Nadu, India.

Abstract
The paper proposes a design strategy to retain the true output in the event of stuck-at faults occurring at the interconnect level of digital circuits. The procedure strives to ensure a reliable flow of signals to the utility end and fault free performance. The scheme identifies the presence of repairable faults in combinational circuits and corrects them through a predictive mechanism. The in-built fault tolerant facility enables the system to deliver the fault free output in the presence of faults. The Modelsim based simulation results obtained for a decoder designated as the circuit under test (CUT) testify to the performance of the proposed fault tolerant methodology, while the use of an FPGA as the target device exhibits its real world viability.

Keywords: Fault Tolerance, FPGA, Reliability, Stuck at faults, VHDL.

INTRODUCTION
Two fundamental approaches serve to increase the reliability of computing systems in view of the ongoing developments in semiconductor technology. The first, called fault prevention, aims to avoid the inconsistent operation of digital circuits. The second, termed fault tolerance, accepts that faults occur and calls for techniques to heal them. In the fault tolerant approach, faults occur during computation, but redundancy offsets their effects by incorporating additional resources and permits computation to continue even in the presence of faults.

Though scaling of semiconductor devices brings out smaller transistors and interconnect features, the lower supply voltage and increased clock frequency contribute to higher error rates [13,14]. The reduced supply voltages, and therefore reduced noise margins, together with reduced internal capacitances increase the susceptibility and sensitivity of devices to radiation, thereby making the system error prone. The next generation of digital systems consequently has to be built from fault-prone devices, where the handling of fault sensitive logic holds the key to meeting the needs.

Fault injection subjects the system under test to deliberately introduced faults in order to forecast their effects and pave the way for fault tolerance. Fault tolerance refers to the ability of the system to operate normally in the presence of faults [9]. The single stuck-at fault model may not suffice, especially for nano-electronic digital designs that encounter greater variability and reliability issues, which in turn calls for a multiple fault model [4].

Fault tolerance enables the system to suppress the ill effects of faults and regain its original performance level. It is an important feature for many operating environments, from automotive to space exploration. The extensive role of digital circuits in critical applications further emphasizes the need for fault free operation. The ability to manage inconsistent resources and service disparate user requirements has become imperative.

In spite of the continuing efforts on fault sensitization, a fresh direction assumes significance in the evolution of fault tolerant circuits. The proposed scheme endeavours to detect the occurrence of stuck-at faults in combinational circuits and devise appropriate steps to nullify their impact on the system. The rest of the paper is organized under six headings: literature review, design methodology, simulation results, hardware implementation, performance analysis and conclusion.

LITERATURE REVIEW

A cost effective, non-intrusive technique similar to duplication with comparison has been presented in [5] to detect any erroneous response of the original function module. Implementation of separable codes for concurrent error detection within VLSI ICs has been described in [11]. The problem of synthesizing totally self checking two level combinational circuits employing three different concurrent error detection schemes has been addressed in [12].

A detailed survey of various fault injection techniques along with their merits and demerits has been presented in [6]. A hardware-based technique to inject both transient and permanent faults inside the VHDL descriptions of combinational as well as sequential digital circuits has been proposed in [10].

A VHDL based fault injection tool has been described in [3] and its effectiveness has been verified through various fault injection experiments carried out using different parameters. During the fault injection campaign, a wide range of transient and permanent faults have been injected through the proposed injection tool on the signals and variables of the chosen model using simulator commands.

An efficient approach for testing, detecting and tolerating single stuck-at faults at the interconnect levels of digital circuits has been proposed in [7]. The possible interconnect faults for wiring channels have been considered and signal routing in the presence of faulty interconnect resources has been analyzed at both circuit level and design level. A fault tolerant algorithm for stuck-at faults in digital circuits has been presented in [8]. The design, based on hardware redundancy, has been suggested for a single fault model to tolerate transient and permanent faults.

Different TMR based majority voter designs have been analyzed and realized using a cutting edge 32/28 nm CMOS technology in [1]. Further, a new voter design has been presented with improved fault tolerant ability against both single and multiple stuck-at-faults that occur either internally or externally to the voter circuit. A number of standard-cell based majority voter designs relevant to TMR architectures have been presented and their power, delay and area parameters estimated based on physical realization using a 32/28nm CMOS process in [2].

DESIGN METHODOLOGY

The scheme evolves a fault tolerant strategy to suppress the effect of stuck-at faults occurring in the combinational logic of any digital circuit. The aim is to develop a reliable system that provides fault free output even on the occurrence of faults. It involves a saboteur based simulated fault injection procedure to inject faults at the interconnect levels of the system and lays down measures to identify their occurrence in order to formulate the correcting strategy. The fault injection campaign creates a testing environment in the sense that it generates different faults through the stuck-at fault model.

The scheme incorporates an in-built healing procedure that eliminates the impact of stuck-at faults present at the interconnect levels and produces true values on the primary output lines, thereby making the system fault tolerant by its very nature. The functional correctness of the proposed architecture is verified using the Modelsim platform and its practical suitability is established with the help of a Xilinx FPGA.

Generally a tester does not possess any prior knowledge of the internal structure and functions of the system but knows exactly that a particular input brings out a certain invariable output. The procedure utilizes this fact to develop the in-built fault tolerant strategy, which picks the desired output based on the already established input-output relationship of the system regardless of the presence of faults at the interconnect levels.

Based on the relationship that exists between the inputs and outputs, the CUT, a 2:4 decoder, needs to bring out the desired output for every input combination irrespective of the eventual (possibly faulty) outcome. The expressions given below establish the relationship between the inputs and the desired outputs of the 2:4 decoder in the fault free state.

\[
D_{out}(0) = \overline{D_{in}(1)} \text{ and } \overline{D_{in}(0)}
\]
\[
D_{out}(1) = \overline{D_{in}(1)} \text{ and } D_{in}(0)
\]
\[
D_{out}(2) = D_{in}(1) \text{ and } \overline{D_{in}(0)}
\]
\[
D_{out}(3) = D_{in}(1) \text{ and } D_{in}(0)
\]
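As an illustration, the fault free input-output relationship can be written as a small Python function. This is only a sketch assuming a standard active-high 2:4 decoder, in which Dout(0) is high only when both inputs are low; the function name `decoder_2to4` is hypothetical and does not come from the VHDL design.

```python
def decoder_2to4(din1, din0):
    """Fault free 2:4 decoder: returns [Dout(0), Dout(1), Dout(2), Dout(3)]."""
    n1, n0 = 1 - din1, 1 - din0  # complemented inputs
    return [n1 & n0, n1 & din0, din1 & n0, din1 & din0]


# Exactly one output line is high for each input combination.
for din1 in (0, 1):
    for din0 in (0, 1):
        print(din1, din0, decoder_2to4(din1, din0))
```

This table of desired outputs is what the healing logic later compares against the observed lines.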

Figure 1 shows the block diagram of the fault tolerant system. The scheme houses a tolerant circuit consisting of four EX-OR gates, equal to the number of outputs of the decoder, as an integral part of the system. The desired output and the corresponding eventual outcome from each output line of the decoder together form the inputs to each EX-OR gate. The approach monitors the outputs of the EX-OR gates and senses a fault when the output of any of them goes high. It then toggles the logic state of the faulty interconnect line and brings it back to its fault free state. The continuous monitoring of signal flow and the ability to take corrective action on the fly make the system truly fault tolerant. The flow diagram in Figure 2, along with the algorithmic steps presented below, enumerates the steps that integrate the fault injection mechanism and the healing part with the rest of the system.
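The detection and correction described above can be sketched in Python as follows. The sketch assumes that the toggling of a faulty line can be modelled as a second EX-OR of the observed value with the error flag; the name `detect_and_correct` is hypothetical and the actual correction block may be realized differently.

```python
def detect_and_correct(desired, observed):
    """Return (error_flags, healed_outputs) for one set of decoder lines."""
    # One EX-OR per line: the flag is 1 only where desired and observed differ.
    flags = [d ^ o for d, o in zip(desired, observed)]
    # Toggling a flagged line (observed XOR flag) restores its fault free value.
    healed = [o ^ f for o, f in zip(observed, flags)]
    return flags, healed


desired = [0, 1, 0, 0]   # Din = "01" selects Dout(1)
observed = [1, 1, 0, 0]  # line 0 stuck-at-1
flags, healed = detect_and_correct(desired, observed)
print(flags, healed)     # [1, 0, 0, 0] [0, 1, 0, 0]
```

Because each line has its own flag, any number of simultaneously faulty lines is handled in the same way as a single one.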

Figure 1: Block diagram of the proposed fault tolerant system (blocks: 2:4 decoder, error detection block, error correction block)

Figure 2: Flow diagram of the proposed scheme

ALGORITHM
1. Determine the primary outputs for the given set of primary inputs
2. Check the status of the control signal
3. If “control” is not enabled then
4. Simulate and record the primary outputs produced by the system without fault injection
5. Else
6. Choose any one of the interconnect lines of the system randomly
7. Inject fault on the chosen line
8. Heal the system with the in-built fault tolerant facility
9. Perform simulation again and record the output values
10. End if
11. Compare the values obtained in steps (4) and (9) to validate the performance of the proposed scheme
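
The steps above can be mimicked by a short behavioural model in Python. This is a sketch under stated assumptions rather than the VHDL implementation: the helper names (`decoder_2to4`, `saboteur`, `heal`, `run`) are hypothetical, faults are injected on the four decoder output lines, and the random seed is fixed only for repeatability.

```python
import random


def decoder_2to4(din1, din0):
    """Fault free 2:4 decoder outputs [Dout(0)..Dout(3)]."""
    n1, n0 = 1 - din1, 1 - din0
    return [n1 & n0, n1 & din0, din1 & n0, din1 & din0]


def saboteur(lines, index, stuck_value):
    """Force one interconnect line to a stuck-at value; pass the others through."""
    faulty = list(lines)
    faulty[index] = stuck_value
    return faulty


def heal(desired, observed):
    """EX-OR detection followed by toggling of the flagged lines."""
    return [o ^ (d ^ o) for d, o in zip(desired, observed)]


def run(din1, din0, control, rng):
    desired = decoder_2to4(din1, din0)                        # step 1
    if not control:                                           # steps 2 to 4
        return desired
    index = rng.randrange(len(desired))                       # step 6
    observed = saboteur(desired, index, rng.choice((0, 1)))   # step 7
    return heal(desired, observed)                            # steps 8 and 9


rng = random.Random(0)  # fixed seed, chosen only for repeatability
for din1 in (0, 1):
    for din0 in (0, 1):
        # Step 11: outputs with and without fault injection must agree.
        assert run(din1, din0, False, rng) == run(din1, din0, True, rng)
```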

SIMULATION RESULTS

The Modelsim based simulation results presented in Figure 3 exhibit the fault free state of the CUT without fault injection, while the results in Figures 4 to 6 elucidate the ability of the proposed strategy to keep the decoder output correct under injected faults. The response in Figure 3 shows that as long as the control signal is at logic 0 it inhibits the fault injection campaign and the system operates in its normal operating environment.

However, enabling the control signal as seen in Figures 4 to 6 permits the introduction of stuck-at faults according to the logic values of the 3-bit fault generator “f”: at 500 ns the least significant bit of the interconnect signal, int(0), is made stuck-at-1; at 1000 ns the most significant bit, int(3), is made stuck-at-0; and at 1500 ns int(1) and int(2) are made stuck-at-1. Despite the introduction of faults, the circuit continues to generate the correct information on its primary output lines thanks to the in-built healing mechanism.
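The fault sequence used in the simulation can be replayed with a small Python model. The sketch assumes that each fault set remains in force until the next event and that the control signal is at logic 0 before 500 ns; the encoding of the fault generator “f” is not modelled, and the names `SCHEDULE`, `faults_at` and `simulate` are hypothetical.

```python
# (start time in ns, {bit of "int": stuck-at value}); an empty dict means no fault.
SCHEDULE = [
    (0, {}),
    (500, {0: 1}),          # int(0) stuck-at-1
    (1000, {3: 0}),         # int(3) stuck-at-0
    (1500, {1: 1, 2: 1}),   # int(1) and int(2) stuck-at-1
    (2000, {}),             # control back at logic 0
]


def decoder_2to4(din1, din0):
    """Fault free 2:4 decoder outputs [Dout(0)..Dout(3)]."""
    n1, n0 = 1 - din1, 1 - din0
    return [n1 & n0, n1 & din0, din1 & n0, din1 & din0]


def faults_at(t_ns):
    """Fault set in force at time t_ns (assumes each set holds until the next entry)."""
    active = {}
    for start, faults in SCHEDULE:
        if t_ns >= start:
            active = faults
    return active


def simulate(t_ns, din1, din0):
    """Return (observed interconnect values, healed outputs) at time t_ns."""
    desired = decoder_2to4(din1, din0)
    active = faults_at(t_ns)
    observed = [active.get(i, bit) for i, bit in enumerate(desired)]
    healed = [o ^ (d ^ o) for d, o in zip(desired, observed)]
    return observed, healed


for t_ns in (250, 750, 1250, 1750, 2250):
    print(t_ns, *simulate(t_ns, 0, 0))
```

In this model the healed outputs equal the fault free decoder outputs in every time window, including the window with two simultaneous faults.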

Figure 7 shows the normal operating state of the system as the control signal reverts to logic 0 at 2000 ns. The simulation results show that the proposed fault tolerant architecture produces the correct outputs and meets the design specifications under all the injected fault conditions.

Figure 3: Output of the fault tolerant decoder when the signal “cont” is at logic 0

Figure 4: Output of the fault tolerant decoder when the LSB of the signal “int” is stuck-at-1

Figure 5: Output of the fault tolerant decoder when the MSB of the signal “int” is stuck-at-0

Figure 6: Output of the fault tolerant decoder when “int (1 & 2)” are stuck-at-1

Figure 7: Output of the fault tolerant decoder when the signal “cont” reverts to logic 0

HARDWARE IMPLEMENTATION

The use of reconfigurable hardware allows fault tolerant features to be incorporated and serves to extend the lifetime of systems. The exercise explores the suitability of implementing the methodology in VLSI hardware for the same decoder. The FPGA turns out to be the preferred choice over discrete devices as it can be reprogrammed even during run time and can realize any digital function. The real time implementation of the proposed fault tolerant decoder using an XC3S500E FPGA serves to validate the simulated performance. The technology schematic shown in Figure 8, obtained using Xilinx Foundation series ISE 9.2i, endorses the practical applicability of the scheme.

Figure 8: The technology schematic view of the fault tolerant 2:4 decoder

PERFORMANCE ANALYSIS

Table 1 compares the proposed scheme with the TMR approach in terms of fault coverage and the area overhead needed for the CUT. The fault coverage expresses the ability of the system to tolerate faults at a given point of time, while the overhead gives an idea of the amount of additional chip area needed to implement the design. The percentage overhead is calculated using the following relation:

\[
\text{\% Overhead} = \left( \frac{\text{No. of redundant gates}}{\text{No. of actual gates to implement the logic}} \right) \times 100
\]
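
As a worked check of this relation, the Python sketch below reproduces the percentages reported in Table 1. The gate counts are assumptions chosen to be consistent with those percentages and are not stated in the paper: 6 gates for the plain decoder (4 AND and 2 NOT), 8 redundant gates for the proposed scheme, and 28 redundant gates for TMR (two extra decoders plus four 4-gate majority voters).

```python
def percent_overhead(redundant_gates, actual_gates):
    """Percentage area overhead as defined by the relation above."""
    return redundant_gates / actual_gates * 100


ACTUAL = 6          # assumed: 4 AND + 2 NOT gates for the 2:4 decoder
PROPOSED_EXTRA = 8  # assumed redundant gates in the proposed scheme
TMR_EXTRA = 28      # assumed: 2 extra decoders (12) + 4 voters of 4 gates (16)

print(round(percent_overhead(PROPOSED_EXTRA, ACTUAL), 2))  # 133.33
print(round(percent_overhead(TMR_EXTRA, ACTUAL), 2))       # 466.67
```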

Table 1: Comparison Summary

<table>
<thead>
<tr>
<th>Methodology</th>
<th>Fault Coverage</th>
<th>% Area Overhead</th>
</tr>
</thead>
<tbody>
<tr>
<td>Proposed Fault tolerant scheme</td>
<td>100%</td>
<td>133.33%</td>
</tr>
<tr>
<td>TMR</td>
<td>33.33%</td>
<td>466.67%</td>
</tr>
</tbody>
</table>

The proposed fault tolerant architecture outperforms the TMR technique in the sense that the output of TMR is not necessarily fault free: since the voter selects the output based on the majority of its inputs, it permits a faulty signal to proceed further in the system if two or all three of its inputs turn out to be faulty. In short, the TMR approach remains suitable for single bit faults. On the contrary, the proposed fault tolerant technique possesses the ability to tolerate the presence of multiple faults in the system and produce fault free outputs. The other important issue concerning TMR relates to the area overhead, which is as high as 466.67% since it requires two additional identical copies of the CUT along with the voter, whereas the proposed scheme needs far fewer redundant components for implementation, which in turn reduces the area overhead to 133.33%.
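
The limitation of majority voting can be seen in a minimal Python sketch of a 2-of-3 voter. The function name `majority` is hypothetical, and the sketch models a single output bit of three replicated modules rather than any particular voter design from the literature.

```python
def majority(a, b, c):
    """2-of-3 majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)


correct = 1
print(majority(correct, correct, 0))  # 1: a single faulty replica is outvoted
print(majority(correct, 0, 0))        # 0: two faulty replicas win the vote
```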

CONCLUSION

A novel fault tolerant strategy has been evolved to address stuck-at faults occurring in digital circuits. The procedure has been formulated to sense the occurrence of faults at the intermediate levels of combinational circuits and correct them to produce the desired output. The simulated fault injection procedure has been used to attain a very high degree of controllability and observability in the process of generating faults. The Modelsim based simulation results obtained for a 2:4 decoder confirm the ability of the fault tolerant strategy to produce true outputs even in the presence of faults. The benefits of the proposed scheme have been realized primarily in terms of fault coverage and the small number of redundant components involved in the fault tolerant architecture, which in turn lead to a lower area overhead as compared to traditional TMR based fault tolerant systems. The VHDL code developed for the proposed fault tolerant architecture has been validated on an XC3S500E FPGA using Xilinx Foundation series ISE 9.2i to exhibit its practical applicability.

REFERENCES
