Telecommunication Networks - Information Theory
By: Vinh Dang
The word is derived from the Greek "tele", meaning "far off", and the Latin "communicare", meaning "to share". Hence telecommunication is communication at a distance.

The true nature of telecommunications is the passing of information to one or more others, in any form that may be used.
People tend to think of telecommunications in terms of telephones, computer networks, the Internet, and maybe even cable television. Telecommunication includes the often-considered electrical, electromagnetic, and optical means already mentioned, but it also includes simple wire, radio, or even other visual forms.
Early Telecommunications
- Drum and horn
- Smoke/fire signal
- Light
- Tower using mirrors
- Switching systems
- Satellite communication systems
- Data communication systems
Telecommunication Networks

A telecommunication network is a network of telecommunication links and nodes arranged so that messages may be passed from one part of the network to another over multiple links and through various nodes.
- Transport: transmission facilities
- Switching: switch, exchange, or central office (CO)
- Access: equipment for the access of subscribers (access networks, AN)
- Customer Premises Equipment (CPE): subscriber terminal equipment
[Figure: Access Network (AN) with subscriber terminals (CPE) connected to an Exchange.]
Basic concepts

[Figure: Block diagram of a digital communication system: Source -> Source encoder -> Channel encoder -> Modulator -> Noisy channel -> Channel decoder -> Source decoder -> Destination.]
What is Information Theory?

- Information theory provides a quantitative measure of source information and of the information capacity of a channel.
- It deals with coding as a means of utilizing channel capacity for information transfer.
- Shannon's coding theorem: "If the rate of information from a source does not exceed the capacity of a communication channel, then there exists a coding technique such that the information can be transmitted over the channel with an arbitrarily small probability of error, despite the presence of noise."
Information measure

Information theory asks: how much information is contained in a signal? Information is the commodity produced by the source for transfer to some user at the destination.

Example: Barcelona vs. GĐT-LA
Information measure

Consider the three possible results: win, draw, loss. The less likely the message, the more information it conveys. How is information mathematically defined?

1. Barca wins: no information; probability close to 1, quite sure.
2. Barca draws with GĐT-LA: more information; relatively low probability.
3. Barca loses: a vast amount of information; very low probability of occurrence in a typical situation.
Let x_j be an event and p(x_j) the probability that x_j is selected for transmission. The self-information of x_j is

$I(x_j) = \log_b\frac{1}{p(x_j)} = -\log_b p(x_j)$
The base of the logarithm determines the unit of information:
- base 10: the measure of information is the hartley
- base e: the measure of information is the nat
- base 2: the measure of information is the bit

Example: a random experiment with 16 equally likely outcomes:

$I(x_j) = -\log_2(1/16) = \log_2 16 = 4$ bits

The information is greater than one bit, since the probability of each outcome is much less than 1/2.
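A minimal Python sketch of the self-information calculation above (added for illustration, not part of the original slides); the unit is set by the logarithm base:

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information I(x) = -log_b p(x); base 2 -> bits, e -> nats, 10 -> hartleys."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p) / math.log(base)

# 16 equally likely outcomes -> 4 bits per outcome, as on the slide
print(self_information(1 / 16))            # 4.0 bits
print(self_information(1 / 16, math.e))    # ~2.77 nats
print(self_information(1 / 16, 10))        # ~1.20 hartleys
```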
Entropy and Information rate

Consider an information source emitting a sequence of symbols from the set X = {x_1, x_2, ..., x_M}. Each symbol x_i is treated as a message with probability p(x_i) and self-information I(x_i). The source emits symbols at an average rate of r symbols/sec and is assumed to be a discrete memoryless source.

The amount of information produced by the source during an arbitrary symbol interval is a discrete random variable X. The average information per symbol is then given by

$H(X) = E\{I(x_j)\} = -\sum_{j=1}^{M} p(x_j)\log_2 p(x_j)$  bits/symbol

Entropy = information = uncertainty. If a signal is completely predictable, it has zero entropy and carries no information. Entropy is the average number of bits required to transmit the signal.
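The entropy definition above can be computed directly; here is a minimal sketch (my own, not from the slides):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """H(X) = -sum p_i log_b p_i; zero-probability symbols contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin carries 1 bit/symbol; a heavily biased coin carries much less.
print(entropy([0.5, 0.5]))   # 1.0
print(entropy([0.9, 0.1]))   # ~0.469 (the binary source used later in the slides)
```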
Example: a random variable with a uniform distribution over 32 outcomes.

# bits required = log_2 32 = 5 bits!

Therefore H(X) = the number of bits required to represent a random event.

How many bits are needed for:
- the outcome of a coin toss?
- the statement "tomorrow is a Thursday"?
The value of H(X) for a given source depends upon the symbol probabilities p(x_i) and M. However,

$0 \le H(X) \le \log_2 M$

The lower bound corresponds to no uncertainty; the upper bound corresponds to maximum uncertainty, which occurs when all symbols are equally likely. The proof of this inequality is shown in [2], Chapter 15.
The lower bound for arbitrary M follows by noting that $a\log a \to 0$ as $a \to 0$. The proof of the upper bound is more involved; we invoke the inequality $\ln a \le a - 1$ and write

$H(X) - \log_2 M = \sum_{i=1}^{M} p(x_i)\log_2\frac{1}{p(x_i)} - \log_2 M = \sum_{i=1}^{M} p(x_i)\log_2\frac{1}{M\,p(x_i)}$
Consider:

$H(X) - \log_2 M = \sum_{i=1}^{M} p(x_i)\log_2\frac{1}{M\,p(x_i)} = \log_2 e\sum_{i=1}^{M} p(x_i)\ln\frac{1}{M\,p(x_i)}$

Applying $\ln a \le a - 1$ with $a = \frac{1}{M\,p(x_i)}$:

$H(X) - \log_2 M \le \log_2 e\sum_{i=1}^{M} p(x_i)\left(\frac{1}{M\,p(x_i)} - 1\right) = \log_2 e\left(\sum_{i=1}^{M}\frac{1}{M} - \sum_{i=1}^{M} p(x_i)\right) = \log_2 e\,(1 - 1) = 0$

Hence $H(X) \le \log_2 M$, with equality if and only if $p(x_i) = 1/M$ for all $i$.
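A quick numerical check of the bound (my own illustration, not from the slides): the entropy of any M-symbol distribution stays between 0 and log2(M), and the uniform distribution attains the maximum.

```python
import math, random

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

M = 8
uniform = [1 / M] * M
raw = [random.random() for _ in range(M)]
total = sum(raw)
arbitrary = [p / total for p in raw]          # an arbitrary valid distribution

print(entropy(uniform))                        # exactly log2(8) = 3.0
print(0 <= entropy(arbitrary) <= math.log2(M)) # always True
```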
Source coding theorem

Information from a source producing different symbols can be described by the entropy H(X).

Source information rate (bits/s): Rs = r·H(X), where
- H(X): source entropy (bits/symbol)
- r: symbol rate (symbols/s)

Assume this source is the input to a channel with
- C: capacity (bits/symbol)
- S: available symbol rate (symbols/s)
- S·C: channel capacity in bits/s
Source coding theorem (cont'd)

Shannon's first theorem (noiseless coding theorem): "Given a channel and a source that generates information at a rate less than the channel capacity, it is possible to encode the source output in such a manner that it can be transmitted through the channel."

Demonstration of source encoding by an example:

Discrete binary source (symbol rate r = 3.5 symbols/s) -> Source encoder -> Binary channel (C = 1 bit/symbol, S = 2 symbols/s)
Example of source encoding

Discrete binary source: A (p = 0.9), B (p = 0.1). The source symbols cannot be transmitted directly, since the source rate (3.5 symbols/s) exceeds the channel symbol rate (2 symbols/s).

Check Shannon's theorem:

$H(X) = -0.1\log_2 0.1 - 0.9\log_2 0.9 = 0.469$ bits/symbol
$R_s = rH(X) = 3.5(0.469) = 1.642$ bits/s $< S \cdot C = 2$ bits/s

Transmission is therefore possible by source encoding to decrease the average symbol rate.
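A small sketch (my own, using the numbers on this slide) that reproduces the check:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

r = 3.5        # source symbol rate (symbols/s)
S, C = 2, 1    # channel symbol rate (symbols/s) and capacity (bits/symbol)

H = entropy([0.9, 0.1])    # ~0.469 bits/symbol
Rs = r * H                 # ~1.642 bits/s
print(f"Rs = {Rs:.3f} bits/s, channel capacity = {S * C} bits/s")
print("Encoding can succeed" if Rs < S * C else "Rate exceeds capacity")
```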
Example of source encoding (cont'd)

Codewords are assigned to n-symbol groups of source symbols. Rules:
- the shortest codeword goes to the most probable group
- the longest codeword goes to the least probable group

The n-symbol groups of source symbols form the nth-order extension of the original source.
First-order extension

The symbol rate at the encoder output equals the symbol rate of the source (3.5 symbols/s), which is larger than the channel can accommodate.
Second-order extension

Grouping two source symbols at a time, the average codeword length per group is

$\bar{L} = \sum_{i=1}^{2^n} p(x_i)\,l_i$

which gives

$\bar{L}/n = 1.29/2 = 0.645$ code symbols/source symbol

The symbol rate at the encoder output:

$r\,\bar{L}/n = 3.5(0.645) = 2.258$ code symbols/s > 2

This is still greater than the 2 symbols/s of the channel, so we continue with the third-order extension.
Third-order extension

Grouping three source symbols at a time:

$\bar{L}/n = 1.598/3 = 0.533$ code symbols/source symbol

The symbol rate at the encoder output:

$r\,\bar{L}/n = 3.5(0.533) = 1.864$ code symbols/s < 2

This rate is accepted by the channel.
Efficiency of a source code

Efficiency is a useful measure of the goodness of a source code:

$\mathrm{eff} = \frac{\bar{L}_{\min}}{\bar{L}}$, where $\bar{L}_{\min} = \frac{H(X)}{\log_2 D}$

H(X): the entropy of the source; D: the number of symbols in the coding alphabet.

For a binary alphabet (D = 2):

$\mathrm{eff} = \frac{H(X)}{\bar{L}}$
Entropy and efficiency of an extended binary source

The entropy of the nth-order extension of a discrete memoryless source is $H(X^n) = n\,H(X)$.

The efficiency of the extended source is therefore

$\mathrm{eff} = \frac{n\,H(X)}{\bar{L}}$
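To tie the extension example together, here is a rough sketch (mine, not from the slides) that uses the average group lengths quoted on the preceding slides for the A/B source and reports the rate per source symbol, the encoder output rate, and the efficiency:

```python
import math

H = -(0.9 * math.log2(0.9) + 0.1 * math.log2(0.1))   # ~0.469 bits/source symbol

# Average codeword length per n-symbol group, taken from the slides
avg_group_length = {1: 1.0, 2: 1.29, 3: 1.598}

for n, L in avg_group_length.items():
    per_symbol = L / n            # code symbols per source symbol
    eff = n * H / L               # efficiency of the extended code
    print(f"n={n}: L/n={per_symbol:.3f}, "
          f"encoder rate={3.5 * per_symbol:.3f} code symbols/s, eff={eff:.3f}")
```

Only the third-order extension drops the encoder rate below the 2 code symbols/s the channel accepts, matching the conclusion on the slides.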
Decreasing the average codeword length leads to increasing decoding complexity.
Shannon-Fano Coding [1]

Procedure (3 steps):
1. List the source symbols in order of decreasing probability.
2. Partition the set into two subsets that are as close to equiprobable as possible; 0s are assigned to the upper subset and 1s to the lower subset.
3. Continue to partition the subsets until further partitioning is not possible.

Example:
Example of Shannon-Fano Coding

[Table: Shannon-Fano construction for a seven-symbol source; successive near-equiprobable partitions assign 0s and 1s and yield the codewords 00, 01, 10, 110, 1110, 11110, 11111.]
Shannon-Fano coding

$\bar{L} = \sum_{i=1}^{7} p_i\,l_i = 2.41$ code symbols/source symbol

$H(U) = -\sum_i p_i \log_2 p_i = 2.37$ bits/symbol

$\mathrm{eff} = 2.37/2.41 = 0.98$

The code generated is a prefix code, due to the equiprobable partitioning. The procedure does not lead to a unique prefix code: many prefix codes have the same efficiency.
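Below is a compact Python sketch of the recursive Shannon-Fano procedure described above (my own illustration; the symbol probabilities are hypothetical, not those of the slide's table):

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, probability); returns dict symbol -> codeword."""
    symbols = sorted(symbols, key=lambda sp: sp[1], reverse=True)  # step 1
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total, running, cut, best = sum(p for _, p in group), 0.0, 1, float("inf")
        # step 2: find the split point closest to an equiprobable partition
        for i in range(1, len(group)):
            running += group[i - 1][1]
            if abs(total - 2 * running) < best:
                best, cut = abs(total - 2 * running), i
        upper, lower = group[:cut], group[cut:]
        for s, _ in upper:
            codes[s] += "0"
        for s, _ in lower:
            codes[s] += "1"
        split(upper)   # step 3: keep partitioning each subset
        split(lower)

    split(symbols)
    return codes

# Hypothetical source for illustration
probs = [("a", 0.35), ("b", 0.20), ("c", 0.15), ("d", 0.15), ("e", 0.10), ("f", 0.05)]
print(shannon_fano(probs))
```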
Huffman Coding [1][2][3]

Procedure (3 steps):
1. List the source symbols in order of decreasing probability. The two source symbols of lowest probability are assigned a 0 and a 1.
2. These two source symbols are combined into a new source symbol with probability equal to the sum of the two original probabilities. The new probability is placed in the list in accordance with its value.
3. Repeat until the final probability of the new combined symbol is 1.0.

Example:
Examples of Huffman Coding

[Figure: Huffman coding tree for source symbols U_i; successive combinations produce probabilities 0.14, 0.24, 0.42, 0.58 and finally 1.0, with a 0 and a 1 assigned at each merge to form the codewords.]
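The following is a short, self-contained Python sketch of the Huffman procedure (my own illustration; the probabilities are hypothetical, not those of the slide's example):

```python
import heapq
import itertools

def huffman(symbols):
    """symbols: list of (symbol, probability); returns dict symbol -> codeword."""
    counter = itertools.count()          # tie-breaker so heapq never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in symbols]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # the two least probable entries
        p1, _, c1 = heapq.heappop(heap)
        for s in c0:                     # prepend 0 / 1 at each merge
            c0[s] = "0" + c0[s]
        for s in c1:
            c1[s] = "1" + c1[s]
        heapq.heappush(heap, (p0 + p1, next(counter), {**c0, **c1}))
    return heap[0][2]

probs = [("A", 0.42), ("B", 0.24), ("C", 0.14), ("D", 0.12), ("E", 0.08)]
codes = huffman(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs)
print(codes, f"average length = {avg_len:.2f} code symbols/symbol")
```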
Huffman Coding: disadvantages

- When the source has many symbols (outputs/messages), the code becomes bulky -> combine a Huffman code with a fixed-length code.
- Some redundancy remains, and the redundancy is large with a small set of messages -> group multiple independent messages.
Huffman Coding: disadvantages (cont'd)

See Examples 9.8 and 9.9 ([2], pp. 437-438). Grouping makes the redundancy small, but the number of codewords grows exponentially; the code becomes more complex and delay is introduced.
How many bits do we need to describe the winner of a race among eight horses?
(a) Index each horse: log_2 8 = 3 bits.
(b) Assign shorter codes to horses with higher probability:
0, 10, 110, 1110, 111100, 111101, 111110, 111111
We need at least H(X) bits to represent X: H(X) is a lower bound on the required description length. Entropy = uncertainty of a random variable.
Joint and conditional entropy

Joint entropy (a simple extension of entropy to two random variables):

$H(X,Y) = -\sum_x \sum_y p(x,y)\log_2 p(x,y)$

Conditional entropy ("what is the uncertainty of Y if X is known?"):

$H(Y|X) = \sum_x p(x)\,H(Y|X=x) = -\sum_x \sum_y p(x,y)\log_2 p(y|x)$

It is easy to verify that:
- if X and Y are independent, then H(Y|X) = H(Y)
- if Y = X, then H(Y|X) = 0
Mutual Information

Mutual information is the reduction of uncertainty about one variable due to knowledge of another: "how much information about Y is contained in X?"

$I(X;Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y)$

- If X and Y are independent, then I(X;Y) = 0.
- If X and Y are the same, then I(X;Y) = H(X) = H(Y).
- I(X;Y) is symmetric and non-negative.
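A self-contained sketch (mine, with a made-up joint distribution) of the quantities defined on the last two slides:

```python
import math

# Hypothetical joint distribution p(x, y) over X in {0,1}, Y in {0,1}
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    """Entropy of a distribution given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in {0, 1}}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in {0, 1}}

H_xy = H(p_xy)                                    # joint entropy H(X,Y)
H_y_given_x = -sum(p * math.log2(p / p_x[x])      # H(Y|X) = -sum p(x,y) log2 p(y|x)
                   for (x, y), p in p_xy.items() if p > 0)
I_xy = H(p_x) + H(p_y) - H_xy                     # mutual information I(X;Y)

print(f"H(X,Y)={H_xy:.3f}  H(Y|X)={H_y_given_x:.3f}  I(X;Y)={I_xy:.3f} bits")
```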
Mutual Information

[Figure: relationship between entropy, joint entropy, and mutual information.]
Mutual Information

I(X;Y) is a great measure of the similarity between X and Y, and is widely used in image/signal processing.

Medical imaging example: MI-based image registration. Why? MI is insensitive to gain and bias.
Homework 1

Calculate H(X) for a discrete memoryless source having six symbols with probabilities:
PA = 1/2, PB = 1/4, PC = 1/8, PD = PE = 1/20, PF = 1/40

Then find the amount of information contained in the messages ABABBA and FDDFDF, and compare with the expected amount of information in a six-symbol message.
Homework 2

A certain data source has 16 equiprobable symbols, each 1 ms long. The symbols are produced in blocks of 15, separated by 5-ms spaces. Find the source symbol rate.
Homework 3

Obtain the Shannon-Fano code for the source in Homework 1, and calculate the efficiency.
References

[1] R. E. Ziemer & W. H. Tranter, "Information Theory and Coding", in Principles of Communications: Systems, Modulation, and Noise, 2002.
[2] A. Bruce Carlson, Communication Systems, McGraw-Hill, 1986, ISBN 0-07-100560-9.
[3] S. Haykin, "Fundamental Limits in Information Theory", in Communication Systems, Wiley, pp. 567-625, 2001.