
MIT 6.02 DRAFT Lecture Notes

Last update: September 23, 2012

CHAPTER 5

Coping with Bit Errors using Error Correction Codes

Recall our main goal in designing digital communication networks: to send information reliably and efficiently between nodes. Meeting that goal requires the use of techniques to combat bit errors, which are inevitable in both communication channels and storage media (storage may be viewed as "communication across time"; you store something now and usually want to be able to retrieve it later).

The key idea we will apply to achieve reliable communication is the addition of redundancy to the transmitted data, to improve the probability that the original message can be reconstructed from the possibly corrupted data that is received. The sender has an encoder whose job is to take the message and process it to produce the coded bits that are then sent over the channel. The receiver has a decoder whose job is to take the received (coded) bits and to produce its best estimate of the message. The encoder-decoder procedures together constitute channel coding; good channel codes provide error correction capabilities that reduce the bit error rate (i.e., the probability of a bit error).

With proper design, full error correction may be possible, provided only a small number of errors has occurred. Even when too many errors have occurred to permit correction, it may be possible to perform error detection. Error detection provides a way for the receiver to tell (with high probability) if the message was decoded correctly or not. Error detection usually works by the sender and receiver using a different code from the one used to correct errors; common examples include the cyclic redundancy check (CRC) or hash functions. These codes take n-bit messages and produce a compact "signature" of that message that is much smaller than the message (e.g., the popular CRC-32 scheme produces a 32-bit signature of an arbitrarily long message). The sender computes and transmits the signature along with the message bits, usually appending it to the end of the message. The receiver, after running the decoder to correct errors, then computes the signature over its estimate of the message bits and compares that signature to its estimate of the signature bits in the received data. If the computed and estimated signatures are not equal, then the receiver considers the message to have one or more bit errors; otherwise, it assumes that the message has been received correctly. This latter assumption is probabilistic: there is some non-zero (though very small, for good signatures) probability that the estimated
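The append-and-check procedure described above can be sketched in a few lines of Python, using the standard library's `zlib.crc32` as the signature function. This is only an illustration of the mechanism, not the exact framing any particular protocol uses; the helper names `append_crc32` and `check_crc32` are ours.

```python
import zlib

def append_crc32(message: bytes) -> bytes:
    """Append a 4-byte CRC-32 signature to the end of the message."""
    sig = zlib.crc32(message)
    return message + sig.to_bytes(4, "big")

def check_crc32(frame: bytes) -> bool:
    """Recompute the signature over the message bits and compare it
    to the signature carried at the end of the received frame."""
    message, received_sig = frame[:-4], frame[-4:]
    return zlib.crc32(message) == int.from_bytes(received_sig, "big")

frame = append_crc32(b"hello, channel")
assert check_crc32(frame)                          # intact frame passes
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit
assert not check_crc32(corrupted)                  # mismatch: error detected
```

Note that a matching signature does not *prove* the message is correct; it only makes an undetected error very unlikely, which is exactly the probabilistic guarantee discussed above.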


and computed signatures match, but the receiver's decoded message is different from the sender's. If the signatures don't match, the receiver and sender may use some higher-layer protocol to arrange for the message to be retransmitted; we will study such schemes later. We will not study error detection codes like CRC or hash functions in this course.

Our plan for this chapter is as follows. To start, we will assume a binary symmetric channel (BSC). In a BSC, the probability of any given bit "flipping" (a 0 sent over the channel is received as a 1, or vice versa) is ε, independent of all other bits. Then, we will discuss and analyze an elementary redundancy scheme called a repetition code, which will simply make n copies of any given bit. The repetition code has a code rate of 1/n--that is, for every useful message bit, we end up transmitting n total bits. The overhead of the repetition code of rate 1/n is n - 1, which is rather high for the error correcting power of the code. We will then turn to the key ideas that allow us to build powerful codes capable of correcting errors without such a high overhead (or equivalently, capable of correcting far more errors at a given code rate compared to the repetition code).
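A repetition code is simple enough to write down completely. The sketch below (an illustration, with function names of our choosing) encodes each message bit as n copies and decodes by majority vote over each group of n received bits, which corrects up to ⌊n/2⌋ flips per group:

```python
def encode_repetition(bits, n=3):
    """Repeat each message bit n times (code rate 1/n)."""
    return [b for bit in bits for b in [bit] * n]

def decode_repetition(coded, n=3):
    """Majority vote over each group of n received copies."""
    out = []
    for i in range(0, len(coded), n):
        group = coded[i:i + n]
        out.append(1 if sum(group) > n // 2 else 0)
    return out

msg = [1, 0, 1, 1]
coded = encode_repetition(msg)          # 12 bits sent for 4 message bits
coded[0] ^= 1                           # the channel flips one bit
assert decode_repetition(coded) == msg  # one error per group is corrected
```

The example makes the overhead concrete: to protect 4 message bits against one flip per group, we had to transmit 12 bits.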

There are two big, inter-related ideas used in essentially all error correction codes. The first is the notion of embedding, where the messages one wishes to send are placed in a geometrically pleasing way in a larger space so that the distance between any two valid points in the embedding is large enough to enable the correction and detection of errors. The second big idea is to use parity calculations, which are linear functions over the bits we wish to send, to generate the redundancy in the bits that are actually sent. We will study examples of embeddings and parity calculations in the context of two classes of codes: linear block codes (which are an instance of the broad class of algebraic codes) and convolutional codes (which are perhaps the simplest instance of the broad class of graphical codes).

We start with a brief discussion of bit errors.

5.1 Bit Errors and BSC

A BSC is characterized by one parameter, ε, which we can assume to be < 1/2, the probability of a bit error. It is a natural discretization of a noise model over signals (a common model for noise, as we will see in Chapter 9, is additive Gaussian noise, which is also a single-parameter model fully characterized by the variance, σ^2). We can determine ε empirically by noting that if we send N bits over the channel, the expected number of erroneously received bits is N·ε. By sending a long known bit pattern and counting the fraction of erroneously received bits, we can estimate ε, thanks to the law of large numbers.

In practice, even when the BSC is a reasonable error model, the range of ε could be rather large, between 10^-2 (or even higher) all the way to 10^-10 or even 10^-12. A value of ε of about 10^-2 means that messages longer than 100 bits will see at least one error on average; given that the typical unit of communication over a channel (a "packet") is generally between 500 bits and 12000 bits (or more, in some networks), such an error rate is too high.
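The empirical estimation procedure just described is easy to check in simulation: model the BSC as an independent coin flip per bit, send a long known pattern, and count disagreements. This is a sketch of our own, with the true ε set to 10^-2 so the estimate can be compared against it:

```python
import random

def bsc(bits, eps, rng):
    """Flip each bit independently with probability eps (the BSC model)."""
    return [b ^ (rng.random() < eps) for b in bits]

rng = random.Random(6002)                         # seeded for reproducibility
N = 200_000
known = [rng.randint(0, 1) for _ in range(N)]     # long known bit pattern
received = bsc(known, eps=0.01, rng=rng)

errors = sum(s != r for s, r in zip(known, received))
eps_hat = errors / N     # fraction of erroneously received bits
# By the law of large numbers, eps_hat concentrates around the true eps = 0.01
assert abs(eps_hat - 0.01) < 0.002
```

With N = 200,000 bits, the standard deviation of the estimate is about sqrt(ε(1 - ε)/N) ≈ 0.0002, so the estimate lands very close to the true value.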

But is an ε of 10^-12 small enough that we don't need to bother about doing any error correction? The answer often depends on the data rate of the channel. If the channel has a rate of 10 Gigabits/s (available today even on commodity server-class computers), then the "low" ε of 10^-12 means that the receiver will see one error every 100 seconds on average if the channel is continuously loaded. Unless we include some mechanisms to mitigate the situation, the applications using the channel may find errors occurring too frequently.

On the other hand, an ε of 10^-12 may be fine over a communication channel running at 10 Megabits/s, as long as there is some way to detect errors when they occur.
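The arithmetic behind these two claims is just rate × ε errors per second; a two-line helper (our own, for illustration) makes the contrast between the two channels explicit:

```python
def seconds_between_errors(rate_bps, eps):
    """Average time between bit errors on a continuously loaded channel:
    errors arrive at rate_bps * eps per second, so invert that rate."""
    return 1.0 / (rate_bps * eps)

# 10 Gigabits/s with eps = 1e-12: one error every ~100 seconds on average.
assert abs(seconds_between_errors(10e9, 1e-12) - 100.0) < 1e-6
# 10 Megabits/s with the same eps: one error every ~10^5 seconds, about a day.
assert abs(seconds_between_errors(10e6, 1e-12) - 1e5) < 1e-3
```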

In the BSC model, a transmitted bit b (0 or 1) is interpreted by the receiver as 1 - b with probability ε and as b with probability 1 - ε. In this model, each bit is corrupted independently and with equal probability (which makes this an "iid" random process, for "independent and identically distributed"). We call ε the bit-flip probability or the "error probability", and sometimes abuse notation and call it the "bit error rate" (it isn't really a "rate", but the term is still used in the literature). Given a packet of size S bits, it is straightforward to calculate the probability of the entire packet being received correctly when sent over a BSC with bit-flip probability ε:

P(packet received correctly) = (1 - ε)^S.

The packet error probability, i.e., the probability of the packet being incorrect, is 1 minus this quantity, because a packet is correct if and only if all its bits are correct. Hence,

P(packet error) = 1 - (1 - ε)^S.    (5.1)
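Equation (5.1) is easy to evaluate numerically; the sketch below (our own helper) computes it for a 10,000-bit packet. Note also that when S·ε is much smaller than 1, the packet error probability is approximately S·ε.

```python
def packet_error_prob(S, eps):
    """Equation (5.1): probability that an S-bit packet over a BSC
    with bit-flip probability eps contains at least one bit error."""
    return 1 - (1 - eps) ** S

# A 10,000-bit packet with eps = 1e-6:
p = packet_error_prob(10_000, 1e-6)
assert abs(p - 0.00995) < 1e-4   # close to S * eps = 0.01, since S * eps << 1
```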

When ε ...
