ECC Memory
ECC (error-correcting code) memory is a type of DRAM. In satellites, probes, and other devices exposed to radiation or to magnetic or electrical interference, DRAM experiences many soft errors. To deal with this, the DRAM used in these devices is organized as ECC memory. This memory uses a scheme such as a Hamming code or triple modular redundancy to detect and correct these soft errors.
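To illustrate the idea behind triple modular redundancy, here is a minimal Python sketch. It assumes three copies of each word are stored and that at most one copy is corrupted at any bit position. The function name tmr_vote is hypothetical and is not taken from any particular memory controller.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three stored copies of the same word.

    Each output bit takes the value held by at least two of the three copies.
    """
    return (a & b) | (a & c) | (b & c)


original = 0b10110010
corrupted = original ^ 0b00001000  # simulate a soft error: one bit flipped in one copy

# The two intact copies outvote the corrupted one.
assert tmr_vote(original, corrupted, original) == original
```

The trade-off is storage: this scheme triples the memory required, whereas a Hamming code adds only a few check bits per word.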
A soft error, also called a one-off error, occurs when background radiation or other interference changes the contents of a memory cell. To get around this, ECC memory includes extra memory bits and a controller that uses them. These bits record parity or an error-correcting code (ECC). A parity bit can detect a single-bit error but cannot correct it. Error-correcting codes, such as the Hamming code, can locate and correct a single-bit error. When combined with an extra parity bit, an error-correcting code can also detect, but not correct, double-bit errors.
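The combination of a Hamming code and an extra parity bit can be sketched in Python on a small scale. The example below is an illustration only: it protects 4 data bits with a Hamming(7,4) code plus one overall parity bit, whereas real ECC memory protects much wider words in hardware. The function names encode and decode and the bit layout are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit. Indices 1..7 are the usual
    Hamming(7,4) positions, with check bits at positions 1, 2 and 4.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total number of 1s even
    return code


def decode(received: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data); data is None when the error cannot be corrected."""
    code = list(received)
    # The syndrome is the XOR of the positions of all set bits.
    # It is 0 for a valid codeword and equals the flipped position otherwise.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall_parity = sum(code) % 2

    if syndrome == 0 and overall_parity == 0:
        status = "ok"
    elif overall_parity == 1:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips but a nonzero syndrome: two bits are wrong.
        return "double-bit error", None

    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, data


word = encode(0b1011)
word[6] ^= 1                 # simulate a single soft error
print(decode(word))          # status "corrected", data restored to 0b1011
word[2] ^= 1                 # a second flipped bit in the same word
print(decode(word))          # detected but not correctable
```

Note that the decoder assumes any odd-parity failure is a single-bit error; three or more flipped bits in one word can be miscorrected, which is a known limit of this class of code.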
Error detection and correction have been used on and off over the years. The original IBM PC used parity checking, and most PCs up until the 1990s did as well. Today, however, many computers do no parity checking at all. Many processor memory controllers support ECC, but most motherboards, especially those with low-end chipsets, do not, even though ECC is much more affordable today than it once was.
Computers that use ECC can typically detect and correct any single-bit error in a 64-bit group. They can also detect, but not correct, any two-bit error. These errors are handled in a few different ways. Some systems write the corrected version of the 64-bit group back to memory. In many computers, the BIOS can also record the errors that were detected and corrected, which helps identify failing memory modules.
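As a rough worked example of what protecting a 64-bit group costs, the sketch below counts the check bits needed by a Hamming code with one extra overall parity bit. It uses the standard Hamming bound, 2^r >= m + r + 1 for m data bits and r check bits. The function name secded_check_bits is hypothetical, and actual memory controllers may use a different code with the same capabilities.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed to correct 1-bit and detect 2-bit errors.

    Finds the smallest r with 2**r >= data_bits + r + 1 (single-error
    correction), then adds one overall parity bit (double-error detection).
    """
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1


for width in (8, 16, 32, 64):
    extra = secded_check_bits(width)
    print(f"{width} data bits + {extra} check bits = {width + extra} stored bits")
```

For a 64-bit group this gives 8 check bits, or 72 stored bits in total. The relative overhead shrinks as the group gets wider, which is one reason to protect 64 bits at a time rather than individual bytes.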
Generally, only computers used as servers support ECC; most standard home computers do not, especially since most memory sold for PCs is not ECC memory. There are several reasons why home users don’t use ECC. ECC memory is more expensive than non-ECC memory, as are motherboards that can use ECC memory. It’s also harder to overclock ECC memory, and ECC memory is often a bit slower than non-ECC memory because of the error checking it must perform.
While home users may not know anything about ECC, it is the standard for computers and servers dedicated to space exploration, finance, and major scientific work. All of these applications handle very sensitive data that must be as correct as possible at all times.