Understand how information is encoded, complexity of code symbols, efficiency of different coding methods, and quantification of information in digital communication systems. Explore the concept of entropy and its role in communication theory.
Topic 3: Information Theory Arab Open University-Lebanon Tutorial
Introduction
• In a digital communication system, messages are produced by a source and transmitted over a channel to the destination. The messages produced by the source need to be encoded for transmission via the communication channel, and then decoded at the destination. A simple model is shown below.
Arab Open University-Lebanon Tutorial 9
• Messages from a digital source are made up from a fixed set of source symbols. These might be, for instance, numbers or letters. The set of source symbols is sometimes called the source alphabet.
• Source symbols need to be represented by code words for transmission. In digital communication systems, code words are usually sequences of binary digits. The code words may be of equal length, or may vary in length.
• The set of symbols which are used in the code words is sometimes described as the code alphabet. For a binary code, the code alphabet consists of only the symbols 0 and 1. The number of symbols in the code alphabet is known as the radix of the code, so a binary code has radix 2. Morse code, which uses the symbols dot, dash and space, has radix 3.
• Messages from a source can be coded in different ways, and it is often important to choose an efficient code that will minimize the number of binary digits needed.
• Another example is ASCII coding, where the radix is 2 and there are 128 symbols. Using 7 binary digits per character gives $2^7 = 128$ combinations, exactly enough to represent all the characters. In this sense ASCII is an efficient fixed length code, as there are no left-over combinations.
• Example: Morse code
Question
• Suppose the decimal digits 0-9 are to be coded in binary.
• How many binary digits are required per decimal digit?
• Would you describe binary coded decimal as an efficient code?
• The number 12345678 is written as a sequence of singly coded digits. If the digits are instead grouped in pairs before coding (a different grouping of the source symbols, called the second extension), the number is represented as 12 34 56 78.
• How many different numbers would then need to be represented?
• How many binary digits are needed per number?
• Hint: compare the groupings in the examples that follow.
• Example (single digits, 0-9): 12345678
• Example (pairs, 00-99): 12 34 56 78
• Example (triples, 000-999): 123 456
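The grouping question can be explored numerically. The sketch below is only an illustration: the function name `bits_for_group` is hypothetical, and it assumes a fixed length binary code that must cover every possible group of decimal digits.

```python
def bits_for_group(group_size: int) -> int:
    """Binary digits needed by a fixed length code that covers every
    group of `group_size` decimal digits (assumed: all groups can occur)."""
    largest_value = 10 ** group_size - 1      # 9, 99, 999, ...
    return largest_value.bit_length()         # digits needed for 0..largest_value


for group_size in (1, 2, 3):
    bits = bits_for_group(group_size)
    print(group_size, 10 ** group_size, bits, bits / group_size)
```

Each printed row gives the group size, the number of different groups, the binary digits per group and the binary digits per decimal digit. Coding single digits needs 4 binary digits each, while coding pairs needs 7 binary digits per pair, that is 3.5 per decimal digit.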
Information and entropy
• The concept of information enables the efficiency of a code to be quantified.
• In communication theory, the term ‘information’ has a very specific meaning.
• A message is said to contain a certain amount of information, which is related to how much is learnt by the recipient of the message.
• The information conveyed by a message depends on the probability of receiving that particular message. The least probable messages convey most information. The information, I, gained by receiving a message of probability P is given by:
$I = -\log_2 P$
• With logarithms taken to base 2, the information is measured in bits.
• Because a probability is never greater than 1, the value of $\log_2 P$ is never positive; the minus sign ensures that information is never negative. A message that is certain (P = 1) conveys no information.
• If the probability of receiving a particular message is independent of the messages which have been received before, the source is described as memoryless.
• For a memoryless source, the amount of information provided by two separate messages can be added to give the information conveyed when both messages have been received. So,
$I = -\log_2 P_1 + (-\log_2 P_2) = -\log_2 (P_1 P_2)$
• Many sources do have memory. For instance, consecutive video frames typically have very similar content. In this case the information provided by two consecutive frames is considerably less than the sum of the information in each individual frame.
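The additivity of information for independent messages can be checked with a few lines of code. This is a minimal sketch that assumes logarithms to base 2, so that information is in bits; the function name `information` is illustrative only.

```python
import math


def information(probability: float) -> float:
    """Information in bits gained from a message of the given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the range (0, 1]")
    return -math.log2(probability)


p1, p2 = 0.5, 0.25
separate = information(p1) + information(p2)   # two independent messages
combined = information(p1 * p2)                # both messages taken together
print(separate, combined)                      # 3.0 3.0
```

The less probable message (0.25) contributes 2 bits and the more probable one (0.5) contributes 1 bit, in line with the statement that the least probable messages convey most information.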
• If the probability of a given source symbol is known, the information provided by that symbol can be calculated. If the probabilities of all the possible source symbols are known, the average information for the source can be found. This is described as the entropy of the source.
• For a memoryless source, if the source can generate n different symbols and the ith symbol has probability $P_i$, then the entropy H is given by:
$H = -\sum_{i=1}^{n} P_i \log_2 P_i$
• The entropy is a characteristic of the source, representing the average amount of information it provides.
• This is independent of how symbols from the source are coded.
• A source has maximum entropy if all its symbols are equally probable.
• This corresponds to maximum uncertainty about the outcome of a message.
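As an illustration of these properties, the sketch below computes the entropy of a memoryless source, assuming the standard definition of entropy as the probability-weighted average of $-\log_2 P_i$. The three example distributions are made up for the demonstration.

```python
import math


def entropy(probabilities):
    """Entropy in bits per symbol of a memoryless source."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Symbols with zero probability contribute nothing to the average.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


equal = [0.25, 0.25, 0.25, 0.25]          # all four symbols equally probable
skewed = [0.5, 0.25, 0.125, 0.125]
very_skewed = [0.97, 0.01, 0.01, 0.01]    # outcome is almost certain

print(entropy(equal))        # 2.0
print(entropy(skewed))       # 1.75
print(entropy(very_skewed))
```

The equally probable source gives the largest value, and the entropy falls as the outcome becomes more predictable.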
• An efficient code will represent the information from the source using as few binary digits as possible.
• Shannon has shown that, for any uniquely decodable code, the source entropy H is the minimum achievable value for the average length L of the code words:
$L \ge H$
• The average code word length L is given by:
$L = \sum_{i=1}^{n} P_i l_i$
• where $l_i$ is the length of the ith code word and $P_i$ is the probability of receiving it.
• If the entropy of a source is known, the efficiency of any code used for that source can be calculated.
• The efficiency E is defined as the ratio of the entropy H to the average length L of the code words:
$E = \frac{H}{L}$
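The ratio of entropy to average code word length can be worked through on a small example. The sketch below assumes a four-symbol memoryless source with made-up probabilities and compares a fixed length code with a variable length one; all names are illustrative.

```python
import math


def entropy(probabilities):
    """Entropy in bits per symbol of a memoryless source."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def average_length(probabilities, lengths):
    """Average code word length: each length weighted by its probability."""
    return sum(p * length for p, length in zip(probabilities, lengths))


def efficiency(probabilities, lengths):
    """Efficiency E = H / L of a code with the given code word lengths."""
    return entropy(probabilities) / average_length(probabilities, lengths)


probabilities = [0.5, 0.25, 0.125, 0.125]   # assumed example source
fixed_lengths = [2, 2, 2, 2]                # e.g. 00, 01, 10, 11
variable_lengths = [1, 2, 3, 3]             # e.g. 0, 10, 110, 111

print(efficiency(probabilities, fixed_lengths))      # 0.875
print(efficiency(probabilities, variable_lengths))   # 1.0
```

For this particular source the variable length code reaches an efficiency of 1 because every probability is a power of 1/2; in general the efficiency is below 1.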
Source coding
• The process of representing individual source symbols by appropriate code words is called source coding. In source coding the aim is to represent the source symbols for efficient transmission or storage.
• In some cases the most efficient code is one which uses code words of fixed length. A fixed length code with code words of length n binary digits can represent $2^n$ different source symbols.
• The average number of binary digits required per source symbol can sometimes be reduced by grouping source symbols before coding. In this case the messages are considered to come from a new source, described as an extension of the original source. For example, for a source with three symbols, the symbols can be taken in pairs to form the second extension of the source. This extended source would have nine symbols.
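A source extension can be built mechanically by listing every group of symbols. The sketch below assumes a memoryless source, so that the probability of a group is the product of the probabilities of its symbols. The symbol names, probabilities and the function name `extension` are hypothetical, and symbols are assumed to be single characters so that joined names stay unambiguous.

```python
import itertools
import math


def extension(probabilities, order=2):
    """Symbol probabilities of the given extension of a memoryless source.

    `probabilities` maps single-character symbols to their probabilities.
    """
    extended = {}
    for group in itertools.product(probabilities, repeat=order):
        extended["".join(group)] = math.prod(probabilities[s] for s in group)
    return extended


source = {"A": 0.5, "B": 0.25, "C": 0.25}   # assumed three-symbol source
pairs = extension(source, order=2)

print(len(pairs))            # 9
print(pairs["AB"])           # 0.125
print(sum(pairs.values()))   # 1.0
```

The nine pairs form a valid source in their own right, since their probabilities sum to 1.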
• If the probabilities of the source symbols are not all equal, a variable length code may be more efficient. When using a variable length code, the average code word length can be minimized by using short code words for the most probable source symbols and longer ones for the least probable.
• When decoding a message using a fixed length code, the start of each new code word can be found by counting binary digits. Each code word can then be translated unambiguously to a single source symbol; the code is uniquely decodable. A code word can also be translated as soon as it arrives, so we say that the code is instantaneously decodable.
Example of fixed length coding tree
Example of variable length coding
• For a variable length code, however, the code needs to be carefully designed if it is to be uniquely and instantaneously decodable. For example, if four source symbols are represented by the binary sequences 0, 1, 01 and 10, the sequence 01 could be one code word or a sequence of two code words. Therefore this code is not uniquely decodable.
• Not all uniquely decodable variable length codes are also instantaneously decodable. For example, if four source symbols are represented by the binary sequences 0, 01, 011 and 111, a typical message is:
00101101110
• Although this cannot be decoded instantaneously (from left to right), it can be decoded uniquely by working from the right to the left. In terms of code words, the sequence then becomes:
0, 01, 011, 0, 111, 0
• Because the message cannot be instantaneously decoded, the complete message must be stored before it can be decoded.
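The right-to-left decoding of the message above can be sketched in code. This illustration assumes that no code word is the ending of another code word, which holds for 0, 01, 011 and 111; the function name is hypothetical.

```python
def decode_right_to_left(message, code_words):
    """Split a stored message into code words, scanning from the right.

    Assumes no code word forms the last part of another code word, so a
    match found while scanning backwards is always a complete code word.
    """
    reversed_words = {word[::-1] for word in code_words}
    decoded = []
    buffer = ""
    for digit in reversed(message):
        buffer += digit
        if buffer in reversed_words:
            decoded.append(buffer[::-1])   # restore the original digit order
            buffer = ""
    if buffer:
        raise ValueError("message does not start on a code word boundary")
    return decoded[::-1]                   # restore the original word order


print(decode_right_to_left("00101101110", ["0", "01", "011", "111"]))
# ['0', '01', '011', '0', '111', '0']
```

The whole message has to be available before the scan can start, which is why such a code is uniquely but not instantaneously decodable.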
• To design a code which is instantaneously decodable, the code designer must ensure that no code word forms the first part (called the prefix) of any other code word. If this condition is met, the decoder can accumulate binary digits until the sequence received corresponds to a complete code word.
• Instantaneous codes can be generated and decoded by means of coding trees as shown below. Left and right ‘branches’ represent 0s and 1s or vice versa. All the code words for a given instantaneous code (filled circles) correspond to the end-points of ‘branches’.
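The prefix condition and the accumulate-until-match decoder can be sketched as follows. The code table mapping 0, 10, 110 and 111 to the symbols A to D is an assumed example, and the function names are illustrative.

```python
def is_prefix_free(code_words):
    """True if no code word forms the first part of a different code word."""
    words = list(code_words)
    return not any(a != b and b.startswith(a) for a in words for b in words)


def decode_instantaneous(bits, table):
    """Decode left to right, emitting a symbol as soon as a code word completes.

    `table` maps code words to source symbols and must be prefix-free.
    """
    symbols = []
    buffer = ""
    for bit in bits:
        buffer += bit
        if buffer in table:
            symbols.append(table[buffer])
            buffer = ""
    if buffer:
        raise ValueError("message ends in the middle of a code word")
    return symbols


table = {"0": "A", "10": "B", "110": "C", "111": "D"}   # assumed example code

print(is_prefix_free(table))                     # True
print(is_prefix_free(["0", "01", "011", "111"])) # False
print(decode_instantaneous("0101100111", table)) # ['A', 'B', 'C', 'A', 'D']
```

The second check fails because 0 is a prefix of 01, which matches the earlier observation that the code 0, 01, 011, 111 is not instantaneously decodable.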
Which of the trees is most efficient if the symbols are equally probable?
Huffman coding.
Huffman code Saving
• One way of finding an instantaneous code which is as efficient as possible is to use Huffman coding. In Huffman coding, source symbols with the largest probabilities are allocated systematically to the shortest code words as shown below.
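A common way to carry out this allocation is to merge the two least probable entries repeatedly, adding one binary digit to their code words at each merge. The sketch below follows that textbook procedure using Python's `heapq` module; the symbol names and probabilities are assumed, and the exact 0/1 labelling is arbitrary, so other equally good Huffman codes exist for the same source.

```python
import heapq
import itertools


def huffman_code(probabilities):
    """Return {symbol: code word} for a dict {symbol: probability}."""
    if len(probabilities) == 1:
        return {symbol: "0" for symbol in probabilities}
    tie_breaker = itertools.count()   # keeps heap entries comparable on ties
    heap = [(p, next(tie_breaker), {symbol: ""})
            for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Take the two least probable groups and merge them into one.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        merged = {s: "0" + word for s, word in group0.items()}
        merged.update({s: "1" + word for s, word in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), merged))
    return heap[0][2]


source = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}   # assumed example
code = huffman_code(source)
average = sum(source[s] * len(word) for s, word in code.items())
print(code)
print(average)   # 1.75
```

The most probable symbol receives a one-digit code word and the two least probable symbols receive three-digit code words.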
• As with all codes, the average length $L_h$ of the code words for Huffman coding is greater than or equal to the source entropy H. However, for Huffman coding the average code word length exceeds the entropy by less than 1 binary digit: $H \le L_h < H + 1$.
• If the source messages are combined, forming a source extension, and then Huffman coded, this results in longer code words and a larger source entropy. One of the features of extensions of a memoryless source is that if $H_1$ is the entropy of the first extension (i.e. original source symbols) then the entropy, $H_2$, of the second extension (i.e. pairs of the original source symbols), is:
$H_2 = 2 H_1$
• This can be generalized to higher-order extensions, and the entropy of the rth extension is:
$H_r = r H_1$
• If the process of taking higher and higher extensions is continued, the entropy $H_r$ and the average code-word length $L_h$ both become large, while the difference between them stays below 1 binary digit. Measured per original source symbol, the average code-word length therefore approaches the entropy $H_1$. This result is Shannon’s first theorem.
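The step from the extension entropy to Shannon's first theorem can be written out as a short derivation. The notation $L_r$ for the average Huffman code-word length of the rth extension is introduced here for illustration, and the derivation assumes a memoryless source and the bound that a Huffman code exceeds the entropy by less than 1 binary digit.

```latex
\begin{align*}
  H_r &\le L_r < H_r + 1
      && \text{(Huffman bound applied to the $r$th extension)} \\
  r H_1 &\le L_r < r H_1 + 1
      && \text{(using } H_r = r H_1 \text{)} \\
  H_1 &\le \frac{L_r}{r} < H_1 + \frac{1}{r}
      && \text{(dividing by } r \text{)} \\
  \lim_{r \to \infty} \frac{L_r}{r} &= H_1
\end{align*}
```

Here $L_r / r$ is the average number of binary digits per original source symbol, so the excess over the entropy shrinks as $1/r$.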
Channel coding
• Communication systems do experience noise, for example from electrical interference. A model often used to study the effects of this noise has a single source of noise connected to the channel.
• Noise distorts the signals used for communication and can cause errors in the received messages. Errors can be detected and sometimes corrected by adding redundant digits to source-coded messages. This means that the average number of binary digits per source symbol, L, is necessarily larger than the entropy, H, and so the efficiency, E, of the code must be less than 1. The redundancy, R, of a code is defined as:
$R = 1 - E = \frac{L - H}{L}$
• In many cases, channel codes are designed assuming:
  • that the error rate is low,
  • that errors occur independently of each other,
  • that there is a negligible probability of several errors occurring together.
• For burst noise, where sequences of errors can occur, these assumptions are not all valid. Some codes are designed specifically to cope with burst noise; one example is the cyclic redundancy check. Alternatively, interleaving can be used to spread the effect of a noise burst over several code words, reducing the impact on any one of them.
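The effect of interleaving on a noise burst can be illustrated with a simple block interleaver. This is a sketch of one possible scheme, not necessarily the one intended in the tutorial: code words of equal length are written as rows and transmitted column by column. The code words and all function names are made up for the example.

```python
def interleave(code_words):
    """Transmit equal-length code words column by column."""
    return "".join("".join(column) for column in zip(*code_words))


def deinterleave(stream, n_words):
    """Recover the code words from an interleaved stream."""
    return [stream[i::n_words] for i in range(n_words)]


def flip_burst(stream, start, length):
    """Simulate a noise burst that inverts `length` consecutive digits."""
    burst = "".join("1" if d == "0" else "0" for d in stream[start:start + length])
    return stream[:start] + burst + stream[start + length:]


def count_errors(sent, received):
    """Number of digit positions in which two words differ."""
    return sum(a != b for a, b in zip(sent, received))


words = ["0000", "1111", "0101"]                  # assumed example code words
received = deinterleave(flip_burst(interleave(words), 3, 3), len(words))
print([count_errors(s, r) for s, r in zip(words, received)])   # [1, 1, 1]
```

A burst of three consecutive errors in the channel ends up as a single error in each of the three code words, which a single-error-correcting code could then repair.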
• The figure below illustrates a channel code called a rectangular code. Sequences of message digits are grouped as ‘rows’, making up a fixed size block. A ‘horizontal’ parity check digit is inserted at the end of each row. Then a final row of ‘vertical’ parity checks is added, one for each ‘column’. If no more than one error occurs per block, a rectangular code can locate and hence correct it.
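The way a rectangular code locates a single error can be sketched as follows. The example assumes even parity and assumes that the final row also includes a check digit for the column of row parities; neither detail is stated in the text, and the function names are illustrative.

```python
def encode_rectangular(rows):
    """Add a parity digit to each row, then a final row of column parities.

    `rows` is a list of equal-length lists of 0/1 digits (even parity assumed).
    """
    with_row_parity = [row + [sum(row) % 2] for row in rows]
    column_parity = [sum(column) % 2 for column in zip(*with_row_parity)]
    return with_row_parity + [column_parity]


def correct_single_error(block):
    """Return a corrected copy of a received block with at most one error."""
    bad_rows = [r for r, row in enumerate(block) if sum(row) % 2]
    bad_columns = [c for c, column in enumerate(zip(*block)) if sum(column) % 2]
    corrected = [row[:] for row in block]
    if len(bad_rows) == 1 and len(bad_columns) == 1:
        # The failing row and failing column intersect at the wrong digit.
        corrected[bad_rows[0]][bad_columns[0]] ^= 1
    elif bad_rows or bad_columns:
        raise ValueError("more than one error: cannot correct")
    return corrected


message = [[1, 0, 1], [0, 1, 1], [1, 1, 1]]
sent = encode_rectangular(message)
received = [row[:] for row in sent]
received[1][2] ^= 1                            # one digit corrupted in transit
print(correct_single_error(received) == sent)  # True
```

A single error makes exactly one row check and one column check fail, and their intersection identifies the digit to invert.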
• A more efficient use of parity digits is made in Hamming codes. Hamming showed that channel codes can be designed which use m parity check digits within a block (sequence) of n digits, where:
$n = 2^m - 1$
• Here n includes the parity digits. A Hamming code with a block size of n = 15 uses only 4 parity digits for 11 message digits. Hamming codes can only correct one error per block.
• The figure below represents a 7-digit Hamming code for a 4-digit message. The original message digits become digits 3, 5, 6 and 7 of the 7-digit Hamming coded word. Digit 1 of this word forms a parity digit for digits 3, 5 and 7; digit 2 forms a parity digit for digits 3, 6 and 7; and digit 4 forms a parity digit for digits 5, 6 and 7.
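The digit positions described above translate directly into an encoder and a single-error-correcting decoder. The sketch below assumes even parity, which the text does not specify, and uses illustrative function names.

```python
def hamming74_encode(message):
    """Encode 4 message digits as a 7-digit word (positions 1 to 7).

    Message digits go to positions 3, 5, 6 and 7; even parity is assumed.
    """
    d3, d5, d6, d7 = message
    p1 = (d3 + d5 + d7) % 2   # digit 1 checks digits 3, 5 and 7
    p2 = (d3 + d6 + d7) % 2   # digit 2 checks digits 3, 6 and 7
    p4 = (d5 + d6 + d7) % 2   # digit 4 checks digits 5, 6 and 7
    return [p1, p2, d3, p4, d5, d6, d7]


def hamming74_decode(word):
    """Correct up to one error and return the 4 message digits."""
    w = [0] + list(word)      # pad so that w[k] is digit k (1-based)
    check1 = (w[1] + w[3] + w[5] + w[7]) % 2
    check2 = (w[2] + w[3] + w[6] + w[7]) % 2
    check4 = (w[4] + w[5] + w[6] + w[7]) % 2
    error_position = check1 + 2 * check2 + 4 * check4   # 0 means no error
    if error_position:
        w[error_position] ^= 1
    return [w[3], w[5], w[6], w[7]]


sent = hamming74_encode([1, 0, 1, 1])
received = sent[:]
received[4] ^= 1                      # corrupt digit 5 (list index 4)
print(hamming74_decode(received))     # [1, 0, 1, 1]
```

The three parity checks that fail, read as a binary number, give the position of the digit in error, so no search is needed.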
• Alternatively, look-up tables can be used by the encoder to convert the original 4-digit message to a 7-digit Hamming code, and by the decoder to convert the received 7-digit sequence to the corresponding 4-digit message.
Channel capacity
• If a transmission channel is noisy, this affects the rate at which information can be transmitted. If a message is transmitted over a noisy channel, one or more of the digits could be corrupted, and some of the information in the message would be lost.
• The effect of errors can be minimized by using channel codes which add redundant digits for error detection and correction. However, the redundant digits carry no new information, so for a channel which can transmit a fixed number of binary digits per second, using redundant codes reduces the information rate.