Multimedia Technology - Chapter 1: Background of compression techniques - vuson.tk


Model-based coding (MBC) (brief introduction)

Chapter 3: Multimedia Network

Introduction

The importance of multimedia technologies: -> Multimedia is everywhere

On PCs:

Real Player, QuickTime, Windows Media

Free music and video on the Internet (mp2, mp3, mp4, asf, ra, ram, mid, DIVX, etc.)

Audio/video conferencing

Web advertising services, data transmission

Distance learning

Telemedicine

On TVs and consumer electronic devices:

DVB-T/DVB-C/DVB-S (Digital Video Broadcasting - Terrestrial/Cable/Satellite) -> shows MPEG-2 quality far superior to traditional analog TV.

Interactive TV -> Internet applications on the TV (mail, web, e-commerce) -> no need to wait for a PC to start up and shut down.

CD/VCD/DVD/MP3 players

Also appearing on handheld devices (3G mobile phones, wireless PDAs)

Introduction

The importance of Multimedia technologies: -> Multimedia everywhere !!

On PCs:

Real Player, QuickTime, Windows Media

Music and Video are free on the INTERNET (mp2, mp3, mp4, asf, mpeg, mov, ra, ram, mid, DIVX, etc)

Video/Audio Conferences.

Webcast/ Streaming Applications

Distance Learning (or Tele-Education)

Tele-Medicine

Tele-xxx (Let’s imagine !!)

On TVs and other home electronic devices:

DVB-T/DVB-C/DVB-S (Digital Video Broadcasting - Terrestrial/Cable/Satellite) -> shows MPEG-2 superior quality over traditional analog TV !!

Interactive TV -> Internet applications (Mail, Web, E-commerce) on a TV !!

-> No need to wait for a PC to start up and shut down !!

CD/VCD/DVD/Mp3 players

Introduction (2)

Multimedia networks

The Internet was designed in the 1960s for low-speed networks carrying boring text applications -> high delay, high jitter.

-> Multimedia applications require drastic modifications of the Internet infrastructure.

Many frameworks have been investigated and deployed to support the next-generation multimedia Internet (e.g., IntServ, DiffServ)

In the future, all TVs (and PCs) will be connected to the Internet and freely tuned to any of millions of broadcast stations all over the world.

At present, multimedia networks run over ATM (nearly obsolete), IPv4, and in the future IPv6 -> should guarantee QoS (Quality of Service)

-> Multimedia applications require drastic modifications of the Internet infrastructure.

Many frameworks have been investigated and deployed to support the next-generation multimedia Internet (e.g., IntServ, DiffServ)

In the future, all TVs (and PCs) will be connected to the Internet and freely tuned to any of millions of broadcast stations all over the world.

At present, multimedia networks run over ATM (almost obsolete), IPv4, and in the future IPv6 -> should guarantee QoS (Quality of Service) !!

Chapter 1: Background of compression techniques

For communication: reduce bandwidth in multimedia network applications such as streaming, video on demand (VOD), Internet phone.

Digital storage (VCD, DVD, tape, etc.) -> reduce size and cost, increase the capacity and quality of stored audio and video.

Chapter 1: Background of compression techniques

For communication: reduce bandwidth in multimedia network applications such as Streaming media, Video-on-Demand (VOD), Internet Phone

Digital storage (VCD, DVD, tape, etc.) -> Reduce size & cost, increase media capacity & quality.

Ratio between the source data and the compressed data (e.g., 10:1)

Lossless compression

Lossy compression

2.1 Information content and redundancy

Entropy is the measure of information content. Entropy sets the lower bound on the bit rate of the data stream.

-> Expressed in bits/source output unit (such as bits/pixel)

The more information in the signal, the higher the entropy

Lossy compression reduces entropy while lossless compression does not

Redundancy is the difference between the information rate and the bit rate

Usually the information rate is much lower than the bit rate

Compression removes the redundancy

Information content and redundancy

Information rate

Entropy is the measure of information content.

-> Expressed in bits/source output unit (such as bits/pixel)

The more information in the signal, the higher the entropy.

Lossy compression reduces entropy while lossless compression does not.

It is easy to show (using the method of Lagrange multipliers) that the uniform distribution achieves maximum entropy, given by H(X) = log2 N.

A uniformly distributed source can be considered to have maximum randomness when compared with sources having other distributions.

Combining this with the intuitive English text example mentioned previously, it is apparent that entropy provides a measure of the compressibility of a source -> High entropy indicates more randomness; hence the source requires more bits on average to describe a symbol.
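
The Lagrange-multiplier argument mentioned above can be sketched as follows. The symbols p_i (probability of the i-th of N source symbols) and λ (the multiplier) are notation assumed here for illustration; they do not appear in the slides.

```latex
% Maximize H(X) subject to the probabilities summing to one.
\[
\mathcal{L}(p_1,\dots,p_N,\lambda)
  = -\sum_{i=1}^{N} p_i \log_2 p_i + \lambda\Big(\sum_{i=1}^{N} p_i - 1\Big)
\]
% Stationarity in each p_i:
\[
\frac{\partial \mathcal{L}}{\partial p_i}
  = -\log_2 p_i - \frac{1}{\ln 2} + \lambda = 0
  \quad\Longrightarrow\quad
  p_i = 2^{\lambda - 1/\ln 2}
\]
% The right-hand side does not depend on i, so all p_i are equal,
% and the constraint forces p_i = 1/N. Substituting back:
\[
H(X) = -\sum_{i=1}^{N} \frac{1}{N}\log_2\frac{1}{N} = \log_2 N
\]
```

Because the entropy is concave in the probabilities, this stationary point is the maximum.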

Entropy (supplement 2)

Calculating Entropy: An Example

An example illustrates the computation of entropy and the difficulty in determining the entropy of a fixed-length signal. Consider the four-point signal [3/4 1/4 0 0].

There are three distinct values (or symbols) in this signal, with probabilities 1/4, 1/4, and 1/2 for the symbols 3/4, 1/4, and 0, respectively. The entropy of the signal is then computed as

$$H = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1.5 \text{ bits/symbol}$$

This indicates that a variable-length code requires 1.5 bits/symbol on average to represent this source.

In fact, a variable-length code that achieves this entropy is [10 11 0] for the symbols [3/4 1/4 0].
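
The four-point example can be checked numerically. The sketch below is only an illustration: the function name `entropy_bits` is hypothetical, and probabilities are estimated from relative symbol frequencies in the signal, as the example does.

```python
from collections import Counter
from fractions import Fraction
from math import log2

def entropy_bits(signal):
    """Empirical entropy in bits/symbol, using relative symbol frequencies."""
    counts = Counter(signal)
    n = len(signal)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# The four-point signal from the example: [3/4 1/4 0 0]
signal = [Fraction(3, 4), Fraction(1, 4), 0, 0]
h = entropy_bits(signal)

# The variable-length code quoted in the example: 3/4 -> 10, 1/4 -> 11, 0 -> 0
code = {Fraction(3, 4): "10", Fraction(1, 4): "11", 0: "0"}
average_length = sum(len(code[s]) for s in signal) / len(signal)

print(h, average_length)
```

Both values come out at 1.5 bits/symbol, so the quoted code meets the entropy bound for this source.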

2.3 Lossless compression

The decoded data is identical to the source data

such as PKZIP or gzip

information)

Cannot guarantee a fixed transmission rate -> because the output data rate varies -> causes problems for recording mechanisms and communication.

Cannot guarantee a fixed compression ratio -> The output data rate is variable -> problems for recording mechanisms or communication channels.
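
A small experiment makes the variable ratio concrete. This is a sketch that uses Python's standard `zlib` module (DEFLATE, the algorithm also used by gzip); the helper name `compression_ratio` and the two test inputs are assumptions chosen for illustration.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Source size divided by compressed size, after checking the round trip."""
    packed = zlib.compress(data, 9)
    # Lossless: the expander output must be identical to the source data.
    assert zlib.decompress(packed) == data
    return len(data) / len(packed)

repetitive = b"blue sky " * 1000      # highly redundant input
random_like = os.urandom(9000)        # almost no redundancy to remove

ratio_repetitive = compression_ratio(repetitive)
ratio_random = compression_ratio(random_like)
print(ratio_repetitive, ratio_random)
```

The same coder gives a large ratio on the redundant input and a ratio of about 1 on the random input, which is why a lossless coder cannot promise a fixed output rate.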

2.4 Lossy compression:

The decompressed data differs from the source data, but the difference cannot be clearly distinguished by ear or eye.

Higher compression factor than lossless (up to 100:1)

Based on knowledge of visual and auditory perception

Can be set to a fixed compression factor

Lossy Compression

The data from the expander is not identical to the source data, but the difference cannot be distinguished auditorily or visually.

Higher compression factor than lossless (up to 100:1)

Based on the understanding of psychoacoustic and psychovisual perception.

Can be forced to operate at a fixed compression factor.

2.5 The compression process:

Communication (reduces the cost of the data link)

Data -> Compressor (coder) -> Transmission channel -> Expander (decoder) -> Data

Recording (extends playing time: in proportion to the compression factor)

Data -> Compressor (coder) -> Storage device (tape, disk, RAM) -> Expander (decoder) -> Data

Process of Compression

Communication (reduce the cost of the data link)

Data -> Compressor (coder) -> Transmission channel -> Expander (decoder) -> Data'

Recording (extend playing time: in proportion to compression factor)

Data -> Compressor (coder) -> Storage device (tape, disk, RAM, etc.) -> Expander (decoder) -> Data'

2.6 Sampling and quantization:

Computers cannot process analog signals directly

Sample the analog signal at a constant rate and use a fixed number of bits (usually 8 or 16) to represent the samples.

Bit rate = sampling rate * number of bits per sample
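
The bit-rate formula and a uniform quantizer can be sketched as below. The function names, the `channels` parameter, and the input range [-1.0, 1.0) are assumptions made for illustration; the slides only give the formula and the typical 8 or 16 bits per sample.

```python
def bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
    """Bit rate = sampling rate * bits per sample (times channels, an added assumption)."""
    return sampling_rate_hz * bits_per_sample * channels

def quantize(x, bits):
    """Map x in [-1.0, 1.0) to one of 2**bits integer levels (uniform quantizer)."""
    levels = 2 ** bits
    index = int((x + 1.0) / 2.0 * levels)
    return min(max(index, 0), levels - 1)   # clamp values at the range edges

def dequantize(index, bits):
    """Return the midpoint of the quantization interval for this level."""
    levels = 2 ** bits
    return (index + 0.5) / levels * 2.0 - 1.0

telephone = bit_rate(8000, 8)        # 8 kHz, 8 bits/sample, mono
cd_audio = bit_rate(44100, 16, 2)    # 44.1 kHz, 16 bits/sample, stereo
print(telephone, cd_audio)
```

With 8 bits the step size is 2/256, so the reconstruction error of any in-range sample is at most half a step, 1/256.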

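Sampling and quantization

Why sampling?

Computers cannot process analog signals directly.

Sample the analog signal at a constant rate and use a fixed number of bits (usually 8 or 16) to represent the samples.

Bit rate = sampling rate * number of bits per sample

Quantization

Map continuous values (infinite precision) to discrete levels (finite precision).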

2.7 Predictive coding:

Use previous samples to estimate the current sample.

For most signals, the difference between the predicted value and the actual value is small -> we can use fewer bits to code the difference while maintaining the same accuracy.

Send the difference between the sample and a predicted value formed from previous samples

Most codecs require the data to be preprocessed; otherwise the codec performs poorly in the presence of noise.

Predictive Coding (supplement)

In predictive coding, rather than directly coding the data itself, the coded data consists of a difference signal formed by subtracting a prediction of the data from the data itself.

The prediction for the current sample is usually formed using past data. A predictive encoder and decoder are shown in the figure, with the difference signal given by d. If the internal loop states are initialized to the same values at the beginning of the signal, then y = x.

If the predictor is ideal at removing redundancy, then the difference signal contains only the “new” information at each time instant that is unrelated to previous data.

This “new” information is sometimes referred to as the innovation, and d is called the innovations process. If predictive coding is used, an appropriate predictor must be determined.
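
A minimal sketch of the encoder/decoder loop described above, assuming the simplest possible predictor (the previous sample) and no quantizer, so the scheme is lossless. The names `dpcm_encode`, `dpcm_decode` and the sample values are hypothetical; `x`, `d` and `y` follow the slide's notation.

```python
def dpcm_encode(samples, initial=0):
    """Emit the difference d between each sample and its prediction."""
    prediction = initial
    diffs = []
    for x in samples:
        diffs.append(x - prediction)
        prediction = x            # predictor: the previous sample
    return diffs

def dpcm_decode(diffs, initial=0):
    """Rebuild y by adding each difference to the same prediction."""
    prediction = initial
    out = []
    for d in diffs:
        y = prediction + d
        out.append(y)
        prediction = y            # decoder loop state mirrors the encoder
    return out

x = [100, 102, 105, 105, 104, 101, 99, 100]
d = dpcm_encode(x)
y = dpcm_decode(d)
print(d)  # [100, 2, 3, 0, -1, -3, -2, 1]
```

Because both loops start from the same `initial` state, `y` equals `x`. After the first sample, every difference fits in a few bits, whereas the raw samples need about seven.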

Predictive coding

Prediction

Use previous samples to estimate the current sample.

For most signals, the difference between the predicted and actual values is small -> We can use a smaller number of bits to code the difference while maintaining the same accuracy !!

Most codecs require the data to be preprocessed; otherwise they may perform badly when the data contains noise.

2.8 Statistical coding: the Huffman code

Assign short codes to the most probable patterns and long codes to the less frequent patterns

-> Bit assignment is based on the statistics of the source data.

The statistics of the source data are gathered before the bit assignment.

Also called VLC - Variable Length Coding

(A familiar example of a variable-length code) -> Morse code

Statistical coding: the Huffman code

Assign short codes to the most probable data patterns and long codes to the less frequent data patterns.

Bit assignment is based on the statistics of the source data.

The statistics of the data should be known prior to the bit assignment.
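
The bit assignment can be sketched with the standard Huffman construction: repeatedly merge the two least probable entries. The function name `huffman_code` and the tie-breaking counter are implementation choices assumed here; the probabilities are those of the four-point entropy example earlier in the chapter.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a prefix code: frequent symbols get short codewords."""
    tiebreak = count()   # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate one-symbol source
        return {sym: "0" for sym in probabilities}
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)   # two least probable subtrees
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in code0.items()}
        merged.update({s: "1" + w for s, w in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"0": 0.5, "3/4": 0.25, "1/4": 0.25}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
print(code, average_length)
```

For these statistics the average length is 1.5 bits/symbol, equal to the entropy of the source. The statistics must be known before `huffman_code` runs, as the slide notes.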

2.9 Drawbacks of compression:

Prone to data errors

Compression eliminates redundancy, which is exactly what is needed to make data resistant to errors.

Requires error concealment for real-time applications

An error correction code is required, which adds redundancy back to the compressed data.

Artifacts:

Drawbacks of compression

Compression eliminates the redundancy which is essential to making data resistant to errors.

Error correction code is required, hence, adds redundancy to the compressed data.

2.10 A coding example: clustering color pixels

In an image, pixel values are clustered around a few peaks.

Each cluster represents the color range of one object in the image (e.g., blue sky)

Coding process:

Separate the pixel values into a limited number of data clusters (e.g., clustered pixels of sky blue or grass green)

Send, as side information for the image, the average color of each cluster and an identifying number for each cluster.

For each pixel, transmit:

The number of the average cluster color that it is closest to

Its difference from that average cluster color (-> can be coded to reduce redundancy since the differences are often similar)

A coding example: Clustering color pixels

In an image, pixel values are clustered in several peaks

Each cluster corresponds to the color range of one object in the image (e.g., blue sky)

1. Separate the pixel values into a limited number of data clusters (e.g., clustered pixels of sky blue or grass green)

2. Send the average color of each cluster and an identifying number for each cluster as side information.

3. Transmit, for each pixel:

The number of the average cluster color that it is close to.

Its difference from that average cluster color (-> can be coded to reduce redundancy since the differences are often similar !!) -> Prediction
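
The three steps can be sketched on single-channel pixel values. Everything named here is hypothetical: the helpers `cluster_encode` and `cluster_decode`, the pixel values, and the two cluster averages, which are assumed to have been computed beforehand and rounded.

```python
def cluster_encode(pixels, centers):
    """For each pixel send (cluster number, difference from the cluster average)."""
    encoded = []
    for p in pixels:
        k = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        encoded.append((k, p - centers[k]))
    return encoded

def cluster_decode(encoded, centers):
    """Rebuild each pixel from its cluster average plus the difference."""
    return [centers[k] + diff for k, diff in encoded]

pixels = [200, 203, 198, 60, 62, 201, 58, 61]   # bright "sky" and dark "grass" values
centers = [60, 200]                             # side information: average per cluster
encoded = cluster_encode(pixels, centers)
decoded = cluster_decode(encoded, centers)
print(encoded)
```

The differences stay within a few units of zero, so they can be coded with far fewer bits than the raw values, and the decoder still recovers the pixels exactly.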

2.11 Frame-differential coding:

Frame-differential coding = prediction from the previous video frame.

One frame is stored in the encoder for comparison with the current frame -> causes an encoding latency of one frame time

For still images:

Data need be sent only for the first frame

All subsequent prediction errors are zero

Retransmit the frame occasionally so that receivers that have just been turned on have a starting point

-> FDC reduces the information for still images but leaves significant data for moving images (e.g., a movement of the camera)

Frame-Differential Coding

Frame-differential coding = prediction from a previous video frame

A video frame is stored in the encoder for comparison with the present frame -> causes encoding latency of one frame time

For still images:

Data can be sent only for the first instance of a frame

All subsequent prediction error values are zero

Retransmit the frame occasionally to allow receivers that have just been turned on to have a starting point

-> FDC reduces the information for still images but leaves significant data for moving images (e.g., a movement of the camera)
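
A sketch of frame differencing on tiny frames stored as nested lists. The function names and the two 2x3 test frames are assumptions for illustration only.

```python
def frame_difference(previous, current):
    """Prediction error: current frame minus the stored previous frame."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(previous, current)]

def reconstruct(previous, difference):
    """Decoder side: add the prediction error back to the previous frame."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(previous, difference)]

still = [[10, 10, 50],
         [10, 10, 50]]
panned = [[10, 50, 50],      # hypothetical: the bright edge moved one pixel left
          [10, 50, 50]]

diff_still = frame_difference(still, still)
diff_moved = frame_difference(still, panned)
print(diff_still, diff_moved)
```

For the still image every prediction error is zero. After a one-pixel pan, large errors appear along the moving edge, which is the leftover data the last slide item refers to.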

2.12 Motion-compensated prediction

More data in FDC can be eliminated by comparing the current pixel with the location of the corresponding object in the previous frame (-> not with the same spatial location in the previous frame)

The encoder estimates the motion in the image to find the corresponding area in the previous frame

The encoder searches for a portion of the previous frame that is similar to the part of the new frame about to be transmitted.

It then sends a motion vector, which tells the decoder which portion of the previous frame will be used to predict the new frame

It also sends the prediction error so that the new frame can be reconstructed

Top figure -> without motion compensation

Bottom figure -> with motion compensation.

Motion Compensated Prediction

More data in Frame-Differential Coding can be eliminated by comparing the present pixel to the location of the same object in the previous frame (-> not to the same spatial location in the previous frame)

The encoder estimates the motion in the image to find the corresponding area in a previous frame.

The encoder searches for a portion of a previous frame which is similar to the part of the new frame to be transmitted

It then sends (as side information) a motion vector telling the decoder what portion of the previous frame it will use to predict the new frame

It also sends the prediction error so that the exact new frame may be reconstituted

See top figure -> without motion compensation; bottom figure -> with motion compensation

Motion compensation (supplement)

Actions:

1. Compute Motion Vector

2. Shift Data from Picture N Using Vector to Make Predicted Picture N+1

3. Compare Actual Picture with Predicted Picture

4. Send Vector and Prediction Error
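
The four actions can be sketched on a single scanline with one global motion vector. This is a simplification assumed for brevity: real coders search per block in two dimensions. The function names, the search range, the edge-repeat rule in `shift`, and the sample data are all hypothetical.

```python
def shift(line, v):
    """Shift a scanline right by v pixels (left if v < 0), repeating the edge pixel."""
    n = len(line)
    return [line[min(max(i - v, 0), n - 1)] for i in range(n)]

def estimate_motion(previous, current, search_range=2):
    """Exhaustive search for the shift with the smallest sum of absolute differences."""
    def sad(v):
        return sum(abs(c - p) for c, p in zip(current, shift(previous, v)))
    return min(range(-search_range, search_range + 1), key=sad)

def mc_encode(previous, current, search_range=2):
    v = estimate_motion(previous, current, search_range)   # 1. compute motion vector
    predicted = shift(previous, v)                         # 2. shift picture N
    error = [c - p for c, p in zip(current, predicted)]    # 3. compare with actual
    return v, error                                        # 4. send vector and error

def mc_decode(previous, v, error):
    """Decoder repeats the shift and adds the prediction error."""
    return [p + e for p, e in zip(shift(previous, v), error)]

previous = [10, 10, 80, 90, 80, 10, 10, 10]
current = [10, 10, 10, 80, 90, 80, 10, 12]   # object moved right by 1; last pixel is new
vector, error = mc_encode(previous, current)
plain_difference = [c - p for c, p in zip(current, previous)]
print(vector, error, plain_difference)
```

With the vector found, only one error value is nonzero, compared with five for plain frame differencing on the same pair of lines.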

2.12.1 Unpredictable information

Information that cannot be predicted from the previous frame:

change)

by the motion of an object across a background, or at the edges of a panned scene (e.g., a soccer player's face uncovered by a flying ball)

motion across a background, or at the edges of a panned scene (e.g., a soccer player’s face uncovered by a flying ball)

2.12.2 Dealing with unpredictable information (supplement)

Scene change

-> An intra-coded picture must be sent first -> requires more data than a predicted picture (P picture)

Intra-coded pictures are sent about twice per second -> their timing and sending frequency can be adjusted to accommodate scene changes.

Uncovered information:

Bi-directionally coded picture

There must be enough frame storage in the system to wait for the later picture that has the desired information.

To limit the decoder's memory, the encoder stores pictures and sends the required reference pictures before sending the bi-directionally coded picture

In MPEG compression:

Pictures that are intra-coded are called I pictures

Pictures coded using only backward references are called P pictures, or predicted pictures

Pictures coded by interpolating both a backward reference and a forward reference are called B pictures

Dealing with unpredictable Information

Scene change

-> An Intra-coded picture (MPEG I picture) must be sent for a starting point -> requires more data than a Predicted picture (P picture)

I pictures are sent about twice per second -> Their time and sending frequency may be adjusted to accommodate scene changes

Uncovered information

Bi-directionally coded type of picture, or B picture

There must be enough frame storage in the system to wait for the later picture that has the desired information

To limit the amount of decoder’s memory, the encoder stores pictures and sends the required reference pictures before sending the B picture

In MPEG:

Pictures which are intracoded only are termed I pictures;

Pictures which are encoded using only backward references are termed P pictures for Predictive

Pictures which are encoded from interpolation of both a backward reference and a forward reference are termed B pictures
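
The reordering that lets the encoder send reference pictures before B pictures can be sketched as follows. The labels such as "I0" or "B1" (picture type plus display position), the function name `coding_order`, and the assumption that each B picture depends on the nearest I or P picture on either side are illustrative choices, not details from the slides.

```python
def coding_order(display_order):
    """Reorder pictures so each B picture follows both of its references."""
    out, pending_b = [], []
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)   # hold until the later reference is sent
        else:
            out.append(picture)         # I or P: the reference goes out first
            out.extend(pending_b)       # then the B pictures that needed it
            pending_b = []
    return out + pending_b              # trailing B pictures, if any, go last

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
transmitted = coding_order(display)
print(transmitted)  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The encoder buffers B1 and B2 until P3 has been sent, so the decoder only needs to hold the two reference pictures while it decodes them.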

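2.13 Transform coding

Convert spatial image pixel values to transform coefficient values in the frequency domain

-> the number of coefficients produced is equal to the number of pixels transformed

A few coefficients contain most of the energy of the picture -> these coefficients can be further coded by lossless entropy coding

The transform process concentrates the energy into particular coefficients (mainly the low-frequency coefficients)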

Transform Coding

Convert spatial image pixel values to transform coefficient values

-> the number of coefficients produced is equal to the number of pixels transformed

Few coefficients contain most of the energy in a picture -> coefficients may be further coded by lossless entropy coding

The transform process concentrates the energy into particular coefficients (generally the “low frequency” coefficients)
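
Energy compaction can be illustrated with a one-dimensional transform. The slides do not name a specific transform; the orthonormal DCT-II used below is an assumption, chosen because it is the transform family used in JPEG and MPEG. The function name `dct` and the eight sample values are hypothetical.

```python
from math import cos, pi, sqrt

def dct(x):
    """Orthonormal 1-D DCT-II: n input samples give n coefficients."""
    n = len(x)
    out = []
    for k in range(n):
        scale = sqrt(1 / n) if k == 0 else sqrt(2 / n)
        out.append(scale * sum(x[i] * cos(pi * (i + 0.5) * k / n) for i in range(n)))
    return out

pixels = [10, 11, 12, 13, 14, 15, 16, 17]     # a smooth ramp of pixel values
coeffs = dct(pixels)

energy_in = sum(p * p for p in pixels)
energy_out = sum(c * c for c in coeffs)       # preserved by an orthonormal transform
dc_share = coeffs[0] ** 2 / energy_out        # fraction held by the lowest frequency
print(len(coeffs), energy_in, round(energy_out, 6), round(dc_share, 3))
```

Eight pixels produce eight coefficients with the same total energy, and the lowest-frequency coefficient alone carries about 97% of it. The remaining coefficients are small and cheap to entropy code.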

Title	Background of compression techniques
School	Hanoi University of Technology
Major	Multimedia Technology
Type	Introductory course notes
Year of publication	2006
City	Hanoi

Format
Pages	171
File size	0.95 MB