Introduction
When high-energy ions present in space strike the substrate of an IC, their impact can cause momentary current/voltage pulses in the IC’s circuitry. When these pulses are sufficient to change the data on the circuit, they are referred to collectively as single-event effects (SEEs). Two subclasses of SEEs were of particular interest to the designers of RTAX-DSP:
- Single-event upsets (SEUs). SEUs are probably the best understood class of SEEs. An SEU occurs when sufficient charge is collected in a static memory element (latch, register or SRAM cell) that the resulting voltage causes the static memory element to change state (flip its bit). These errors or upsets last until the next time new data is written to the memory element.
- Single-event transients (SETs). When impacting ions induce voltage pulses on combinatorial circuitry in a device, these effects are known as SETs. If the induced voltage level exceeds that of the switching threshold and is of sufficient pulse-width, erroneous data values can be propagated through the circuit. As the name implies, these errors are temporary in nature, with pulse-widths on the order of 500 ps.
Mitigation Techniques
A number of mitigation techniques exist to counteract SEEs in digital logic. The effects of SEUs on SRAM blocks can be mitigated by the use of error detection and correction (EDAC) schemes in the memory.
The most common scheme is an error-correcting code (ECC), in which redundant memory bits are stored alongside user data to help detect and correct any errors or upsets to the memory. Depending upon the type of ECC employed, single- and double-bit errors in a given data word can be corrected. EDAC solutions are readily available and can be easily implemented by designers.
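The idea of storing redundant bits alongside user data can be illustrated with a small software model. The sketch below assumes a textbook Hamming(7,4) code, chosen only for brevity; it is not the ECC used in any particular device, it corrects single-bit errors only, and the function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (list of bits, positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = d     # data bits sit at non-power-of-two positions
    code[1] = code[3] ^ code[5] ^ code[7]      # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]      # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]      # parity over positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Return (corrected nibble, syndrome). A nonzero syndrome is the flipped position."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                    # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1                    # repair the single upset bit
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    word = hamming74_encode(0b1010)
    word[4] ^= 1                               # model an SEU on codeword position 5
    print(hamming74_decode(word))
```

In this model, the syndrome is zero for an undisturbed word and otherwise names the bit position that was upset, which is what allows the decoder to restore the original data on the next read.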
Mitigating errors in register elements is less straightforward than for blocks of SRAM. Because SEEs are highly localized (an impacting ion affects only a single p-n junction), parallel circuitry can be employed to correct register SEUs. The technique most commonly employed is a form of redundancy called triple modular redundancy (TMR). In TMR, three registers (or flip-flops) are used in parallel, with their outputs feeding a majority-voting circuit. If an SEU occurs in one flip-flop, the other two still hold the correct data, so the voting circuit passes the correct value.
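The majority-voting step can be sketched as a bitwise function over three redundant copies of a register value. This is a behavioral model under simple assumptions (integer registers, an upset modeled as an XOR with a one-bit mask); the names are hypothetical and it does not represent any specific hardware implementation.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit matches at least two of the inputs."""
    return (a & b) | (b & c) | (a & c)


def upset(value, bit):
    """Model an SEU by flipping one bit of a stored value."""
    return value ^ (1 << bit)


if __name__ == "__main__":
    stored = 0b10110010
    # Three redundant copies; the second copy suffers a single upset on bit 3.
    copies = [stored, upset(stored, 3), stored]
    assert majority_vote(*copies) == stored

    # The same bit upset in two copies defeats the voter.
    copies = [upset(stored, 3), upset(stored, 3), stored]
    assert majority_vote(*copies) != stored
```

The second case shows the limit of the scheme in this model: the voter masks an upset in any one copy, but it cannot recover a bit that has flipped in two of the three copies.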
Correcting SETs is more complicated, due to the transient nature of the errors. Two techniques can be employed, depending upon the circuitry involved and the tolerance to delay.