CM3106 Chapter 9: Basic Compression Algorithms
Prof David Marshall

This chapter covers: modeling and compression, basics of information theory, entropy examples, Shannon's coding theorem, compression in multimedia data, lossless vs lossy compression, and the basic lossless algorithms (repetition suppression, run-length encoding, pattern substitution, Shannon-Fano coding, Huffman coding and arithmetic coding).
Modeling and Compression

We are interested in modeling multimedia data.

To model means to replace something complex with a simpler (= shorter) analog.

Some models help understand the original phenomenon/data better:

Example: Laws of physics
Huge arrays of astronomical observations (e.g. Tycho Brahe's logbooks) summarised in a few characters (e.g. Kepler, Newton):

    |F| = G M1 M2 / r^2

We will look at models whose purpose is primarily compression of multimedia data.
Recap: The Need for Compression
Raw video, image, and audio files can be very large
Example: One minute of uncompressed audio
Audio Type 44.1 KHz 22.05 KHz 11.025 KHz
16 Bit Stereo: 10.1 MB 5.05 MB 2.52 MB
16 Bit Mono: 5.05 MB 2.52 MB 1.26 MB
8 Bit Mono: 2.52 MB 1.26 MB 630 KB
Example: Uncompressed images
512 x 512 Monochrome 0.25 MB
512 x 512 8-bit colour image 0.25 MB
512 x 512 24-bit colour image 0.75 MB
Recap: The Need for Compression

Example: Videos (involve a stream of audio plus video imagery)

Raw Video — uncompressed image frames, 512x512 True Colour at 25 FPS = 1125 MB/min

HDTV (1920 x 1080) — gigabytes per minute uncompressed, True Colour at 25 FPS = 8.7 GB/min

Relying on higher bandwidths is not a good option — M25 Syndrome: traffic will always increase to fill the current bandwidth limit, whatever this is.

Compression HAS TO BE part of the representation of audio, image, and video formats.
Basics of Information Theory

Suppose we have an information source (random variable) S which emits symbols {s1, s2, ..., sn} with probabilities p1, p2, ..., pn. According to Shannon, the entropy of S is

    H(S) = p1 log2(1/p1) + p2 log2(1/p2) + ... + pn log2(1/pn) = −Σi pi log2(pi)

When a symbol with probability pi is transmitted, it reduces the amount of uncertainty in the receiver by a factor of 1/pi.

log2(1/pi) = −log2(pi) indicates the amount of information conveyed by si, i.e., the number of bits needed to code si (Shannon's coding theorem).
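A minimal Python sketch of this formula (added for illustration, not part of the original slides; the function name entropy is an arbitrary choice):

import math

def entropy(probs):
    """Shannon entropy H(S) = -sum(p_i * log2(p_i)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))        # fair coin: 1.0 bit
print(entropy([1 / 256] * 256))   # uniform 256-level source: 8.0 bits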
Entropy Example

Example: Entropy of a fair coin

The coin emits symbols s1 = heads and s2 = tails with p1 = p2 = 1/2. Therefore, the entropy of this source is:

H(coin) = −(1/2 × log2(1/2) + 1/2 × log2(1/2)) = −(1/2 × −1 + 1/2 × −1) = −(−1/2 − 1/2) = 1 bit

Example: Grayscale image

In an image with a uniform distribution of gray-level intensity (and all pixels independent), i.e. pi = 1/256 for each of the 256 levels:

The number of bits needed to code each gray level is −log2(1/256) = 8 bits.
The entropy of this image is 8 bits/pixel.
Entropy Example

Example: Breakfast order #1

Alice: "What do you want for breakfast: pancakes or eggs? I am unsure, because you like them equally (p1 = p2 = 1/2)."

Bob: "I want pancakes."

Question: How much information has Bob communicated to Alice?

Answer: He has reduced her uncertainty by a factor of 2, therefore transmitted log2(2) = 1 bit.
Entropy Example

Example: Breakfast order #2

Alice: "What do you want for breakfast: pancakes, eggs, or salad? I am unsure, because you like them equally (p1 = p2 = p3 = 1/3)."

Bob: "Eggs."

Question: What is Bob's entropy, assuming he behaves like a random variable? (Equivalently: how much information has Bob communicated to Alice?)

Answer: He has reduced her uncertainty by a factor of 3, therefore transmitted log2(3) ≈ 1.58 bits.
Entropy Example

Example: Breakfast order #3

Alice: "What do you want for breakfast: pancakes, eggs, or salad? I am unsure, because you like them equally (p1 = p2 = p3 = 1/3)."

This time Bob only rules out one of the three options, without saying which of the remaining two he wants.

Question: How much information has Bob communicated to Alice?

Answer: He has reduced her uncertainty by a factor of 3/2 (leaving 2 out of 3 equal options), therefore transmitted log2(3/2) ≈ 0.58 bits.
Shannon's Experiment (1951)

Estimated entropy for English text: H_English ≈ 0.6 - 1.3 bits/letter. (If all letters and the space character were equally probable, it would be H_0 = log2(27) ≈ 4.755 bits/letter.)
External link: Shannon’s original 1951 paper
External link: Java applet recreating Shannon’s experiment
Shannon's Coding Theorem (Shannon 1948)

Basically: the optimal code length for an event with probability p is

    L(p) = −log2(p)

ones and zeros (or, in general, −logb(p) if instead we use b possible values for codes).
External link: Shannon’s original 1948 paper
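A tiny illustration (added, not from the slides) of how the optimal length varies with probability:

import math

# Optimal (Shannon) code length L(p) = -log2(p); note it need not be an integer.
for p in (0.5, 0.125, 0.999):
    print(p, -math.log2(p), "bits")
# 0.5 -> 1.0, 0.125 -> 3.0, 0.999 -> ~0.00144 (cf. the Huffman failure example later)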
Shannon vs Kolmogorov

What if we have a finite string?

Shannon's entropy is a statistical measure of information. We can "cheat" and regard a string as an infinitely long sequence of i.i.d. random variables; Shannon's theorem then approximately applies.

Kolmogorov complexity: basically, the length of the shortest program that outputs a given string. An algorithmic measure of information.

K(S) is not computable! Practical algorithmic compression is hard.
Compression in Multimedia Data

Compression basically employs redundancy in the data:

Temporal: in 1D data, 1D signals, audio, between video frames, etc.
Spatial: correlation between neighbouring pixels or data items.
Spectral: e.g. correlation between colour or luminance components. This uses the frequency domain to exploit relationships between the frequency of change in data.
Psycho-visual: exploit perceptual properties of the human visual system.
Lossless vs Lossy Compression

Compression can be categorised in two broad ways:

Lossless compression: decompression gives an exact copy of the original data.
Examples: entropy encoding schemes (Shannon-Fano, Huffman coding), arithmetic coding, the LZW algorithm (used in the GIF image file format).

Lossy compression: decompression gives a "close" approximation of the original data, ideally perceptually lossless.
Examples: transform coding (FFT/DCT-based quantisation used in JPEG/MPEG), differential encoding, vector quantisation.
Why Lossy Compression?

Lossy methods are typically applied to high-resolution audio and image compression.

They have to be employed in video compression (apart from special cases).

Basic reason: the compression ratio of lossless methods (e.g. Huffman coding, arithmetic coding, LZW) is not high enough for audio/video.

By cleverly making a small sacrifice in terms of fidelity of the data, we can often achieve very high compression ratios.

Cleverly = sacrifice information that is psycho-physically unimportant.
Lossless Compression Algorithms
Repetitive Sequence Suppression
Run-Length Encoding (RLE)
Simple Repetition Suppression

If a series of n successive identical tokens appears:

Replace the series with a single token and a count of the number of occurrences.

Usually we need a special flag to denote when the repeated token appears (see the sketch below).
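A minimal Python sketch of this idea (added for illustration, not part of the original slides): runs of a chosen repeated token (here '0') are replaced by a flag character followed by the run length. The flag 'f' and the threshold of 3 are arbitrary choices for this sketch.

def suppress_runs(text, token="0", flag="f", min_run=3):
    """Replace runs of `token` (length >= min_run) with flag + run length."""
    # Note: a real scheme must use a flag value that cannot appear in the data.
    out, i = [], 0
    while i < len(text):
        if text[i] == token:
            j = i
            while j < len(text) and text[j] == token:
                j += 1
            run = j - i
            out.append(f"{flag}{run}" if run >= min_run else token * run)
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(suppress_runs("894" + "0" * 14 + "12"))   # -> "894f1412"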
Simple Repetition Suppression

Fairly straightforward to understand and implement.

Simplicity is its downfall: poor compression ratios.

Compression savings depend on the content of the data.

Applications of this simple compression technique include:

Suppression of zeros in a file (zero length suppression)
Silence in audio data, pauses in conversation, etc.
Sparse matrices
Component of JPEG
Bitmaps, e.g. backgrounds in simple images
Blanks in text or program source files
Other regular image or data tokens
Run-length Encoding (RLE)

This encoding method is frequently applied to graphics-type images (or pixels in a scan line) — a simple compression algorithm in its own right.

It is also a component used in the JPEG compression pipeline.

Basic RLE approach (e.g. for images):
Sequences of image elements X1, X2, ..., Xn (row by row) are mapped to pairs (c1, L1), (c2, L2), ..., (ck, Lk), where ci is the repeated value and Li the length of its run.
Run-length Encoding Example

Original sequence:

111122233333311112222

can be encoded as:

(1,4),(2,3),(3,6),(1,4),(2,4)

How much compression?

The savings are dependent on the data. In the worst case (random noise) the encoding is heavier than the original file: 2 x integer rather than 1 x integer if the original data is an integer vector/array.

MATLAB example code: rle.m (run-length encode), rld.m (run-length decode)
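A minimal Python sketch of run-length encoding and decoding (added for illustration; this is not the rle.m/rld.m code referenced above):

from itertools import groupby

def rle_encode(seq):
    """Map a sequence to (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(seq)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back to the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

data = [int(c) for c in "111122233333311112222"]
encoded = rle_encode(data)
print(encoded)                        # [(1, 4), (2, 3), (3, 6), (1, 4), (2, 4)]
assert rle_decode(encoded) == data    # lossless round trip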
Pattern Substitution

This is a simple form of statistical encoding.

Here we substitute a frequently repeating pattern (or patterns) with a code.

The code is shorter than the pattern, giving us compression.

The simplest scheme could employ predefined codes:

Example: Basic Pattern Substitution
Replace all occurrences of the pattern of characters 'and' with the predefined code '&'. So:

and you and I

becomes:

& you & I
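A minimal sketch of predefined-code pattern substitution (added for illustration; the substitution table here is just an example):

# Predefined substitution table: frequent pattern -> shorter code.
# Caveat: this naive sketch assumes the code character never occurs in the original text.
CODES = {"and": "&"}

def substitute(text, codes=CODES):
    for pattern, code in codes.items():
        text = text.replace(pattern, code)
    return text

def desubstitute(text, codes=CODES):
    for pattern, code in codes.items():
        text = text.replace(code, pattern)
    return text

print(substitute("and you and I"))    # "& you & I"
print(desubstitute("& you & I"))      # "and you and I"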
Reducing Number of Bits per Symbol

For the sake of example, consider character sequences here. (Other token streams can be used — e.g. vectorised image blocks, binary streams.)

Example: Compressing the ASCII characters EIEIO

E (69)     I (73)     E (69)     I (73)     O (79)
01000101   01001001   01000101   01001001   01001111   = 5 x 8 = 40 bits

To compress, we aim to find a way to describe the same information using fewer bits per symbol, e.g.:

E (2 bits)   I (2 bits)   E (2 bits)   I (2 bits)   O (3 bits)
   xx           yy           xx           yy           zzz       = 2 + 2 + 2 + 2 + 3 = 11 bits
Code Assignment

A predefined codebook may be used, i.e. assign code ci to symbol si. (E.g. some dictionary of common words/tokens.)

Better: dynamically determine the best codes from the data.

The entropy encoding schemes (next topic) basically attempt to decide the optimum assignment of codes to achieve the best compression.

Example (see the sketch below):

Count occurrences of tokens (to estimate probabilities).
Assign shorter codes to more probable symbols and vice versa.

Ideally we should aim to achieve Shannon's limit: −logb(p)!
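A minimal sketch of the counting step and of Shannon's limit (added for illustration): estimate symbol probabilities from counts and compute the ideal code length −log2(p) for each symbol.

import math
from collections import Counter

def ideal_code_lengths(tokens):
    """Estimate p(s) from counts and return Shannon's ideal length -log2(p) per symbol."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}

for symbol, bits in sorted(ideal_code_lengths("EIEIO").items()):
    print(symbol, round(bits, 3))
# E 1.322, I 1.322, O 2.322  (more probable symbols need fewer bits)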
Morse code

Morse code makes an attempt to approach optimal code length: observe that frequent characters (E, T, ...) are encoded with few dots/dashes and vice versa.
The Shannon-Fano Algorithm — Learn by Example

This is a basic information theoretic algorithm.

A simple example will be used to illustrate the algorithm: the 39-character string ACABADADEAABBAAAEDCACDEAAABCDBBEDCBACAE (also used in the Huffman example below), with symbol counts:

Symbol   A    B    C    D    E
Count    15   7    6    6    5
The Shannon-Fano Algorithm — Learn by Example
Encoding for the Shannon-Fano Algorithm

A top-down approach:

1. Sort symbols according to their frequencies/probabilities, e.g. ABCDE.

2. Recursively divide into two parts, each with approximately the same number of counts, i.e. split so as to minimise the difference in counts. The left group gets 0, the right group gets 1.
The Shannon-Fano Algorithm — Learn by Example

3. Assemble the codebook by a depth-first traversal of the tree:

Symbol   Count   Code   Bits used
A        15      00     30
B        7       01     14
C        6       10     12
D        6       110    18
E        5       111    15

Raw token stream: 8 bits per symbol x 39 chars = 312 bits
Coded data stream = 89 bits
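A minimal recursive Python sketch of the Shannon-Fano split described above (added for illustration, not the lecture's own code):

from collections import Counter

def shannon_fano(counts):
    """Return {symbol: code} by recursively splitting the sorted symbols in two."""
    symbols = sorted(counts, key=counts.get, reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(counts[s] for s in group)
        # Choose the cut point that minimises the difference between the two halves.
        best_cut, best_diff, running = 1, float("inf"), 0
        for i in range(1, len(group)):
            running += counts[group[i - 1]]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_cut, best_diff = i, diff
        left, right = group[:best_cut], group[best_cut:]
        for s in left:
            codes[s] += "0"     # left group gets 0
        for s in right:
            codes[s] += "1"     # right group gets 1
        split(left)
        split(right)

    split(symbols)
    return codes

data = "ACABADADEAABBAAAEDCACDEAAABCDBBEDCBACAE"
counts = Counter(data)
codes = shannon_fano(counts)
print(codes)
print(sum(counts[s] * len(codes[s]) for s in counts), "bits")   # 89 bits for this string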
Shannon-Fano Algorithm: Entropy

For the above example:

H = (15/39) log2(39/15) + (7/39) log2(39/7) + 2 x (6/39) log2(39/6) + (5/39) log2(39/5) ≈ 2.19 bits/symbol

The Shannon-Fano code above uses 89/39 ≈ 2.28 bits/symbol, close to this lower bound.
Best way to understand: consider best case example
If we couldalways subdivide exactly in half, we would get
ideal code:
uncertainty by a factor 2, so transmit 1 bit
Otherwise, when counts are only approximately equal, weget only good, but not ideal code
Compare with a fair vs biased coin
Huffman Algorithm

Can we do better than Shannon-Fano? Huffman! It always produces the best binary tree for the given probabilities.

A bottom-up approach:

1. Initialisation: put all nodes in a list L, kept sorted at all times (e.g., ABCDE).

2. Repeat until the list L has only one node left:
   From L pick the two nodes having the lowest frequencies/probabilities and create a parent node for them.
   Assign the sum of the children's frequencies/probabilities to the parent node and insert it into L.
   Assign code 0/1 to the two branches of the tree, and delete the children from L.

3. The code for each symbol is read top-down as the sequence of branch labels.
Huffman Encoding Example

ACABADADEAABBAAAEDCACDEAAABCDBBEDCBACAE (the same string as in the Shannon-Fano example)
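A minimal Huffman sketch in Python using a priority queue (added for illustration; it follows the bottom-up procedure above, though tie-breaking details may differ from the tree shown in the lecture):

import heapq
from collections import Counter

def huffman_codes(counts):
    """Build a Huffman code {symbol: bitstring} from symbol counts."""
    # Each heap entry: (total count, tie-breaker, {symbol: code-so-far}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        c2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

data = "ACABADADEAABBAAAEDCACDEAAABCDBBEDCBACAE"
counts = Counter(data)
codes = huffman_codes(counts)
encoded = "".join(codes[s] for s in data)
print(codes)
print(len(encoded), "bits vs", 8 * len(data), "bits raw ASCII")   # 87 bits vs 312 bits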
Huffman Encoder Discussion

The following points are worth noting about the above algorithm:

Decoding for the above two algorithms is trivial as long as the coding table/book is sent before the data.
There is a bit of an overhead for sending this, but it is negligible if the data file is big.

Unique prefix property: no code is a prefix of any other code (all symbols are at the leaf nodes) → great for the decoder: decoding is unambiguous.

If prior statistics are available and accurate, then Huffman coding is very good.
Huffman Coding of Images

In order to encode images:

Divide the image up into (typically) 8x8 blocks.
Each block is a symbol to be coded.
Compute Huffman codes for the set of blocks.
Encode blocks accordingly.

In JPEG: blocks are DCT coded first, before Huffman coding may be applied (more soon).

Coding the image in blocks is common to all image coding methods.

MATLAB example code: huffman.m (used with the JPEG code later), huffman.zip (alternative with tree plotting)
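A minimal sketch of the block-as-symbol idea (added for illustration; it assumes the image is a list of rows whose dimensions are multiples of 8). The resulting block frequencies would then be fed to a Huffman coder such as the sketch earlier.

from collections import Counter

def blocks_8x8(image):
    """Yield each 8x8 block of a 2D image (list of rows) as a hashable tuple-of-tuples."""
    height, width = len(image), len(image[0])
    for r in range(0, height, 8):
        for c in range(0, width, 8):
            yield tuple(tuple(image[r + dr][c + dc] for dc in range(8)) for dr in range(8))

# Toy 16x16 image containing only two distinct 8x8 blocks (all-0 and all-1).
image = [[0] * 8 + [1] * 8 for _ in range(8)] + [[0] * 16 for _ in range(8)]
block_counts = Counter(blocks_8x8(image))
print(block_counts.values())   # counts 3 and 1: frequent blocks would get shorter Huffman codes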
Arithmetic Coding

What is wrong with Huffman?

Huffman coding etc. use an integer number (k) of 1/0s for each symbol, hence k is never less than 1.

The ideal code according to Shannon may not be an integer number of 1/0s!

Example: Huffman failure case

Consider a biased coin with p_heads = q = 0.999 and p_tails = 1 − q.
Suppose we use Huffman to generate codes for heads and tails and send 1000 heads.
This would require 1000 ones and zeros with Huffman!
Shannon tells us: ideally this should be −log2(p_heads) ≈ 0.00144 ones and zeros per head, so ≈ 1.44 for the entire string.
Arithmetic Coding

Solution: arithmetic coding.

A widely used entropy coder.
Also used in JPEG — more soon.
The only problem is its speed, due to the possibly complex computations caused by large symbol tables.
Good compression ratio (better than Huffman coding); entropy is around the Shannon ideal value.

Here we describe the basic approach of arithmetic coding.
Arithmetic Coding: Basic Idea

The idea behind arithmetic coding is to encode the entire message into a single number, n, with 0.0 ≤ n < 1.0.

Consider a probability line segment, [0, 1), and assign to every symbol a range in this interval:

Range proportional to the symbol's probability, positioned at its cumulative probability.

Once we have defined the ranges and the probability line:

Start to encode symbols.
Every symbol defines where the output real number lands within the range.
Simple Arithmetic Coding Example

Assume we have the following string: BACA

Therefore:

A occurs with probability 0.5
B and C occur with probabilities 0.25 each

Start by assigning each symbol its range on the probability line:

A: [0.0, 0.5)    B: [0.5, 0.75)    C: [0.75, 1.0)

The first symbol is B, so we now know that the code will be in the range 0.5 to 0.74999..., i.e. in [0.5, 0.75).
Simple Arithmetic Coding Example

The range is not yet unique.

We need to narrow down the range to give us a unique code. Basic arithmetic coding iteration: subdivide the range for the first symbol given the probabilities of the second symbol, then the third symbol, etc.

For all the symbols:

range = high - low
high  = low + range * high_range of the symbol being coded
low   = low + range * low_range of the symbol being coded

Where:

range keeps track of where the next range should be,
high and low specify the output number,
initially high = 1.0 and low = 0.0.
Simple Arithmetic Coding Example

For the second symbol (range = 0.25, low = 0.5, high = 0.75), subdividing [0.5, 0.75) gives:

BA [0.5, 0.625)
BB [0.625, 0.6875)
BC [0.6875, 0.75)

Subdividing BA's interval [0.5, 0.625) for the third symbol (range = 0.125) gives:

BAA [0.5, 0.5625)
BAB [0.5625, 0.59375)
BAC [0.59375, 0.625)
Simple Arithmetic Coding Example

Subdivide again for the fourth symbol (range = 0.03125, low = 0.59375, high = 0.625):

BACA [0.59375, 0.609375)
BACB [0.609375, 0.6171875)
BACC [0.6171875, 0.625)

So the (unique) output code for BACA is any number in the range:

[0.59375, 0.609375)
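A minimal Python sketch of the encoding iteration above (added for illustration), using the ranges A [0.0, 0.5), B [0.5, 0.75), C [0.75, 1.0):

# Symbol ranges on the probability line: (low_range, high_range)
RANGES = {"A": (0.0, 0.5), "B": (0.5, 0.75), "C": (0.75, 1.0)}

def arithmetic_encode(message, ranges=RANGES):
    """Return the (low, high) interval for the whole message; any number in it is a valid code."""
    low, high = 0.0, 1.0
    for symbol in message:
        span = high - low
        low_range, high_range = ranges[symbol]
        high = low + span * high_range   # uses the old low
        low = low + span * low_range
    return low, high

print(arithmetic_encode("BACA"))   # (0.59375, 0.609375)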
To decode is essentially the opposite:

We compile the same table of symbol ranges, given the probabilities.
Find the range within which the code number lies, output that symbol, rescale the code number into that range, and carry on (see the sketch below).
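A minimal decoding sketch matching the encoder above (added for illustration; it assumes the decoder knows the symbol ranges and the message length):

RANGES = {"A": (0.0, 0.5), "B": (0.5, 0.75), "C": (0.75, 1.0)}

def arithmetic_decode(code, length, ranges=RANGES):
    """Recover `length` symbols from a code number in [0, 1)."""
    message = []
    for _ in range(length):
        for symbol, (low_range, high_range) in ranges.items():
            if low_range <= code < high_range:
                message.append(symbol)
                # Rescale the code into the chosen symbol's range and repeat.
                code = (code - low_range) / (high_range - low_range)
                break
    return "".join(message)

print(arithmetic_decode(0.59375, 4))   # "BACA"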