Understanding Error Correction in Computer Memory Systems

AKT211 – CAO08 – Computer Memory (2) Ghifar Parahyangan Catholic University Oct 31, 2011

Last Course Review • Computer Memory System • Memory Characteristics • Memory Hierarchy • RAM Basic Technology • Semiconductor • SRAM vs DRAM • Advanced RAM Organization • SDRAM vs DDR-RAM

Outline • Error Correction • Single error correction • Double error correction

THE BASICS OF ERROR CORRECTION

Semiconductor System Error • Hard Failures • Permanent physical defect, so that the affected cell or cells can’t reliably store data • Cells become stuck at 0 or 1, or switch erratically between 0 and 1 • Soft Error • Random, nondestructive event that alters the contents of one or more cells without damaging the memory • Caused by power supply problems or alpha particles • Most modern main memory systems include logic for both detecting and correcting errors

Single-bit error • Only 1 bit in the data unit has changed

Error Correcting Code (ECC) Function

Simplest form of Error Detection • Using a parity bit • A bit that is added to ensure that the number of bits with the value ‘1’ in a set of bits is even or odd • Reliably detects a 1-bit error only; an even number of flipped bits goes unnoticed, and no error can be corrected
• E.g.: no error
A wants to transmit: 1001
A computes parity bit value: 1^0^0^1 = 0
A adds parity bit and sends: 10010
B receives: 10010
B computes parity: 1^0^0^1 = 0
B reports correct transmission after observing expected result
• E.g.: 1 bit error
A wants to transmit: 1001
A computes parity bit value: 1^0^0^1 = 0
A adds parity bit and sends: 10010
*** TRANSMISSION ERROR ***
B receives: 11010
B computes parity: 1^1^0^1 = 1
B reports incorrect transmission after observing unexpected result
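The parity exchange between A and B can be sketched in a few lines of Python. This is only an illustration, assuming even parity with the parity bit appended as the last bit; the function names are made up for this sketch.

```python
from functools import reduce
from operator import xor

def parity_bit(bits):
    """XOR of all bits: 1 when the number of 1s is odd, else 0."""
    return reduce(xor, bits, 0)

def add_parity(bits):
    """Sender side: append the parity bit to the data bits."""
    return bits + [parity_bit(bits)]

def check_parity(received):
    """Receiver side: recompute parity over the data bits and
    compare it with the parity bit that arrived."""
    *data, p = received
    return parity_bit(data) == p

sent = add_parity([1, 0, 0, 1])             # A sends 10010
print(check_parity(sent))                   # no error
print(check_parity([1, 1, 0, 1, 0]))        # one flipped bit is caught
print(check_parity([1, 1, 1, 1, 0]))        # two flipped bits slip through
```

The last call shows the limitation stated on the slide: with two flipped bits the parity still matches, so the error is not noticed.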

Hamming Error-Correcting Code • A linear error-correcting code • Can detect up to d-1 bit errors • Can correct up to ⌊(d-1)/2⌋ bit errors • d is the minimum Hamming distance between any two code words
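To make the role of d concrete, the following sketch computes the minimum Hamming distance of a small code and derives the two limits from it. The 3-bit repetition code used here is a hypothetical example chosen for brevity, not one taken from the slides.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance(codewords):
    """Smallest distance over all pairs of code words."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

code = ["000", "111"]            # hypothetical 3-bit repetition code
d = min_distance(code)
detectable = d - 1               # errors that can always be detected
correctable = (d - 1) // 2       # errors that can always be corrected
print(d, detectable, correctable)
```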

Hamming (7, 4) Code • encodes 4 data bits (d1, d2, d3, d4) into 7 bits by adding 3 parity bits (p1, p2, p3) • single error correction

Hamming (7,4) Example
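The figure for this slide is not part of the transcript, so the sketch below assumes the conventional bit order p1 p2 d1 p3 d2 d3 d4 (positions 1 to 7), with each parity bit covering the positions whose binary position number contains that parity bit's power of two. The function names are illustrative only.

```python
def encode74(d1, d2, d3, d4):
    """Encode 4 data bits into a 7-bit word (assumed order: p1 p2 d1 p3 d2 d3 d4)."""
    p1 = d1 ^ d2 ^ d4    # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4    # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode74(word):
    """Return (data bits, error position); position 0 means no error seen."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    pos = s1 + 2 * s2 + 4 * s3   # syndrome read as a position number
    if pos:
        w[pos - 1] ^= 1          # flip the bit the syndrome points at
    return [w[2], w[4], w[5], w[6]], pos

word = encode74(1, 0, 1, 1)
damaged = list(word)
damaged[4] ^= 1                  # flip position 5 (d2)
print(word, decode74(damaged))
```

Under these assumptions, a single flipped bit anywhere in the 7-bit word is located by the syndrome and flipped back.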

HAMMING ALGORITHM GENERALIZATION FOR SINGLE ERROR CORRECTION

Generalization of the Hamming Single Error Correction • The comparison logic receives as input two K-bit values: the check bits stored with the data and the check bits recomputed from the data that was read • A bit-by-bit comparison is done by taking the XOR of the two inputs • The result is called the syndrome word • A syndrome of 0 indicates that no error was detected; any other value indicates an error • The position of the error can be determined from the syndrome word

Required criteria for Hamming Error Correction • If the syndrome contains all 0s, no error has been detected • If the syndrome contains one and only one bit set to 1, then an error has occurred in one of the K check bits. No correction is needed • If the syndrome contains more than one bit set to 1, then the numerical value of the syndrome indicates the position of the data bit in error

SEC Step-by-Step • Determine how long the code (check bits) must be • Determine the stored position for each of the M data bits and K check bits • Construct the appropriate XOR functions that match the required criteria
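The three criteria map directly onto a small decision function. The sketch below takes the stored and recomputed check bits as integers; the function names and the sample values in the last lines are illustrative assumptions, not taken from the slides.

```python
def syndrome_word(stored_check, recomputed_check):
    """Bit-by-bit comparison of the two K-bit values via XOR."""
    return stored_check ^ recomputed_check

def interpret_syndrome(syndrome):
    """Apply the three criteria to a syndrome word."""
    if syndrome == 0:
        return "no error"
    if syndrome & (syndrome - 1) == 0:        # exactly one bit set
        return "check bit error"
    return f"data bit error at position {syndrome}"

# Illustrative values only: stored 0111, recomputed 0001.
s = syndrome_word(0b0111, 0b0001)
print(format(s, "04b"), interpret_syndrome(s))
```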

1. Determine how long the code must be • M: number of data bits • K: number of check bits • Because an error could occur on any of the M data bits or K check bits, the K-bit syndrome must be able to name every one of those positions, so we must have: $2^K - 1 \ge M + K$ • E.g.: for a word of 8 data bits (M = 8): K = 3: $2^3 - 1 < 8 + 3$ (too few check bits) K = 4: $2^4 - 1 > 8 + 4$ (sufficient)
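The smallest K that satisfies the inequality can be found by simple trial, as in this short sketch (the function name is an assumption made for illustration):

```python
def check_bits_needed(m):
    """Smallest K with 2**K - 1 >= M + K for M data bits."""
    k = 1
    while 2**k - 1 < m + k:
        k += 1
    return k

for m in (8, 16, 64):
    print(m, check_bits_needed(m))
```

For M = 8 the loop rejects K = 3 (7 < 11) and stops at K = 4 (15 ≥ 12), matching the example above.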

2. Determine the stored position • Let’s see the explanation!

3. Construct the XOR Function • Again, let’s see the explanation!
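The explanation slides themselves are missing from the transcript, so the following is a sketch under the usual Hamming layout, which is an assumption here: positions are numbered from 1, check bits sit at the power-of-two positions (1, 2, 4, 8, ...), and data bits D1, D2, ... fill the remaining positions in order. Each check bit is then the XOR of the data bits whose position number contains that power of two, which is equivalent to XOR-ing together the position numbers of all data bits equal to 1. Data words are written with the highest data bit on the left.

```python
def data_positions(m, k):
    """Positions 1..m+k that are not powers of two; they hold D1..Dm."""
    return [p for p in range(1, m + k + 1) if p & (p - 1)]

def compute_check(word, k):
    """Check bits for a data word given as a bit string (leftmost bit is
    the highest data bit). Returned as an integer whose lowest bit is C1."""
    bits = [int(b) for b in reversed(word)]       # D1 first
    check = 0
    for pos, bit in zip(data_positions(len(bits), k), bits):
        if bit:
            check ^= pos      # toggles every check bit covering this position
    return check

def correct(word, stored_check, k):
    """Return (possibly corrected data word, syndrome)."""
    syndrome = stored_check ^ compute_check(word, k)
    positions = data_positions(len(word), k)
    if syndrome in positions:                     # a data bit is in error
        j = len(word) - 1 - positions.index(syndrome)
        flipped = "1" if word[j] == "0" else "0"
        word = word[:j] + flipped + word[j + 1:]
    return word, syndrome

print(format(compute_check("00111001", 4), "04b"))
print(correct("00111101", 0b0111, 4))             # D3 was flipped on read
```

With this layout the 8-bit word 00111001 yields the check bits 0111, the same pair that appears in the exercises, which suggests the assumed layout agrees with the one used in the course.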

Hamming SEC-DED Code • More commonly, semiconductor memory today is equipped with a single-error-correcting, double-error-detecting (SEC-DED) code • Needs 1 extra parity bit that indicates whether the total number of 1s is even or odd • Enhances the reliability of the memory at the cost of added complexity • E.g.: • The IBM 30xx implementations used an 8-bit SEC-DED code for each 64 bits of data in main memory • The size of main memory is therefore actually about 12% larger than is apparent to the user

Hamming SEC-DED Code (2)
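The figure for this slide is not in the transcript. As a rough sketch of the idea, the code below builds a SEC code word with check bits at power-of-two positions (an assumed layout) and stores the extra overall parity bit at index 0. On decoding, a nonzero syndrome together with unchanged overall parity signals a double error. All names are illustrative.

```python
from functools import reduce
from operator import xor

def sec_ded_encode(data):
    """data: list of bits D1, D2, ... Returns a list where index i holds
    position i of the SEC word and index 0 holds the overall parity bit."""
    k = 1
    while 2**k - 1 < len(data) + k:
        k += 1
    n = len(data) + k
    word = [0] * (n + 1)
    it = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):                # not a power of two: data bit
            word[pos] = next(it)
    check = reduce(xor, (p for p in range(1, n + 1) if word[p]), 0)
    for i in range(k):
        word[1 << i] = (check >> i) & 1    # check bits at 1, 2, 4, ...
    word[0] = reduce(xor, word[1:], 0)     # makes the total number of 1s even
    return word

def sec_ded_decode(word):
    """Return (status, word), correcting a single error when possible."""
    syndrome = reduce(xor, (p for p in range(1, len(word)) if word[p]), 0)
    overall = reduce(xor, word, 0)         # 1 if total parity became odd
    if syndrome == 0 and overall == 0:
        return "no error", word
    if overall == 1 and syndrome < len(word):
        fixed = list(word)
        fixed[syndrome] ^= 1               # syndrome 0: the parity bit itself
        return "corrected", fixed
    if overall == 1:
        return "uncorrectable", word       # syndrome points outside the word
    return "double error detected", word

w = sec_ded_encode([1, 0, 1, 1])
one = list(w); one[5] ^= 1
two = list(one); two[3] ^= 1
print(sec_ded_decode(one)[0], sec_ded_decode(two)[0])
```

For 64 data bits the SEC part needs 7 check bits, so the extra parity bit brings the total to 8, which is consistent with the 8-bit code per 64 data bits mentioned for the IBM 30xx.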

Any Questions?

Reference • Chapter 5.2: Error Correction (Stallings, William. Computer Organization and Architecture, 8th ed. Prentice Hall. 2010)

Exercises • Using the Hamming algorithm, how many check bits are needed if the data word is 1024 bits long? • An 8-bit data word stored in memory has the contents 11000010. Using the Hamming algorithm, determine the check bits that will be stored in memory. • For the 8-bit data word 00111001, the stored check bits are 0111. Suppose an error occurs when the memory is read. When the data word is read back from memory, the calculated check bits are 1101. What is the actual value of the data word in error?

Week 8 Assignment • Construct the XOR equations that determine the SEC code (check bits) using the Hamming algorithm for a 16-bit data word. What are the resulting check bits for the input data word 0101000000111001? Simulate how the Hamming algorithm corrects the error when an error occurs in data bit position 5 (0101000000101001). Explain your answer as completely as possible.

THANK YOU
