Soft Error

In computer science, a soft error is a special kind of error: an unexpected and unintended state of a logic circuit or data store. Unlike errors caused by hardware defects, which change the system permanently, soft errors cause only temporary state changes. Once the wrong data has been corrected, the soft error has no further effect on the system; in particular, the reliability of the system is not affected.

Soft errors are caused primarily by high-energy radiation, e.g. cosmic radiation (cosmic rays) or radiation from radioactive decay. In a broader sense, soft errors can also be caused by, for example, crosstalk between signals or by (external) noise.

History

Soft errors were first observed in the early semiconductor memories, particularly DRAMs. In these, information is stored as electric charge, i.e. electrons, in a capacitor. Because every stored bit requires a capacitor with an associated access transistor, the capacitance of the capacitor is kept small so that a large number of memory cells fit on one chip. As integration density has increased from the original 1024 x 1 bit (Intel 1103) in 1970 to the 8 Gb DRAMs of today (2011), fewer and fewer electrons are available to distinguish a logical "0" from a "1".

Flash memories, in which information is likewise stored as electrons, here on the insulated gates of MOS transistors, show a sensitivity similar to that of DRAMs. Because semiconductor structures keep shrinking, even SRAMs are now at risk, although they are inherently more stable; their memory element usually consists of six transistors.

Causes

If a high-energy radiation particle knocks some electrons off the storage capacitor, the state of the memory cell can change. The resulting error is reversible: it can be corrected by rewriting the memory cell with the correct information.

Even the integrated circuit itself or its package unavoidably contains a few radioactive atoms that emit alpha particles as they decay. These helium nuclei, consisting of two protons and two neutrons, have a relatively large mass and therefore a very short range (a few centimeters in air and up to about 0.1 mm in solid materials), because they quickly collide with other atoms along their path. Along this short path, however, the alpha particles ionize many other atoms, that is, they separate electrons from the nuclei, and can thereby change the information stored in a memory cell.

Similarly, alpha rays can momentarily flip the state of a logic circuit, which in sequential circuits can then lead to a lasting change of state.

Through the selection of improved materials, the error rate induced by alpha radiation has been reduced over the last few decades.

Another source of interfering radiation is cosmic radiation, especially fast neutrons. Being electrically neutral, they penetrate the Earth's atmosphere largely unimpeded and, through various complex processes such as interaction with the silicon of the semiconductor, generate ionizing particles that can in turn alter the stored information. Since neutrons are difficult to shield (if at all, then only at the system level, not at the IC level), cosmic radiation is now regarded as the main cause of soft errors.

If the high-energy radiation destroys the atomic structure of the circuit, this can lead to a permanent fault (hardware error).

Protection against soft errors

The probability of soft errors occurring is called the soft error rate (SER). Since it is normally very low, it is difficult to measure. To estimate the sensitivity (or insensitivity) of the semiconductor circuit itself, the bare chip is exposed to a standardized alpha emitter (with the package etched away if necessary) and the resulting error rate is measured. This accelerated measurement yields the accelerated soft error rate (ASER).

Since the presence of radioactive atoms and cosmic rays cannot be ruled out entirely, circuit-level measures must be taken to reduce the impact of soft errors. One option is to introduce redundancy so that faults can at least be reliably detected or, with a suitable error-correction method, the failure of one or more (memory) bits can be detected and corrected in hardware.
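As an illustration of redundancy with error correction, the following minimal sketch models a Hamming(7,4) code in software. This code is only one possible choice and is not named in the text above; real memories implement such schemes in hardware, typically over wider words. The function names hamming74_encode and hamming74_decode are hypothetical and chosen only for this example.

```python
from typing import Tuple


def hamming74_encode(data: int) -> int:
    """Encode a 4-bit value into a 7-bit Hamming codeword (illustrative only)."""
    d = [(data >> i) & 1 for i in range(4)]
    code = [0] * 8  # positions 1..7 are used; index 0 is unused
    # Data bits occupy positions 3, 5, 6, 7; parity bits occupy 1, 2, 4.
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return sum(code[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(word: int) -> Tuple[int, int]:
    """Return (corrected 4-bit data, position of the flipped bit or 0)."""
    code = [0] + [(word >> (pos - 1)) & 1 for pos in range(1, 8)]
    # XOR of the positions of all set bits: 0 for a valid codeword,
    # otherwise the position (1..7) of a single flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    if syndrome:
        code[syndrome] ^= 1  # undo the single-bit upset
    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return data, syndrome


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    upset = stored ^ (1 << 4)  # simulate a soft error in one bit
    print(hamming74_decode(upset))
```

Three additional bits per four data bits are enough to locate and correct any single flipped bit. Two simultaneous flips in the same word are not corrected by this code; they are decoded to the wrong value.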

In computer systems, software techniques can also be used to check the integrity of data and to restore it if necessary.
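A software-level check of this kind can be sketched as follows, assuming the data is stored together with a CRC-32 checksum and a redundant backup copy. The class ProtectedRecord and its layout are hypothetical; the only library call used is zlib.crc32 from the Python standard library.

```python
import zlib


class ProtectedRecord:
    """Hypothetical record that keeps a checksum and a backup copy of its payload."""

    def __init__(self, payload: bytes):
        self.primary = bytearray(payload)   # working copy, may suffer bit flips
        self.backup = bytes(payload)        # redundant copy used for restoring
        self.checksum = zlib.crc32(payload)

    def read(self) -> bytes:
        """Return the payload, restoring it from the backup if it was corrupted."""
        if zlib.crc32(bytes(self.primary)) != self.checksum:
            if zlib.crc32(self.backup) != self.checksum:
                raise ValueError("both copies fail the integrity check")
            # Rewriting the data with the correct value removes the soft error.
            self.primary = bytearray(self.backup)
        return bytes(self.primary)


if __name__ == "__main__":
    record = ProtectedRecord(b"soft error")
    record.primary[3] ^= 0x10  # simulate a single-bit upset in the working copy
    print(record.read())
```

The checksum only detects the corruption; the backup copy is what makes restoring possible. If both copies are damaged, the sketch reports the error instead of returning wrong data.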

For components that are used in the automotive industry and are to be qualified according to AEC-Q100, the current standard recommends tests according to JESD89 when they contain SRAM/DRAM blocks larger than 1 Mbit.
