
Image Processing: The Fundamentals Maria Petrou and Panagiota Bosdogianni

Copyright © 1999 John Wiley & Sons Ltd. Print ISBN 0-471-99883-4. Electronic ISBN 0-470-84190-7.

Introduction

Image Processing has been developed in response to three major problems concerned with pictures:

• Picture digitization and coding to facilitate transmission, printing and storage of pictures.

• Picture enhancement and restoration in order, for example, to interpret more easily pictures of the surface of other planets taken by various probes.

• Picture segmentation and description as an early stage in Machine Vision.

What is an image?

A monochrome image is a 2-dimensional light intensity function, $f(x, y)$, where $x$ and $y$ are spatial coordinates and the value of $f$ at $(x, y)$ is proportional to the brightness of the image at that point. If we have a multicolour image, $f$ is a vector, each component of which indicates the brightness of the image at point $(x, y)$ at the corresponding colour band.

A digital image is an image $f(x, y)$ that has been discretized both in spatial coordinates and in brightness. It is represented by a 2-dimensional integer array, or a series of 2-dimensional arrays, one for each colour band. The digitized brightness value is called the grey level value.

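Each element of the array is called a pixel or a pel, derived from the term “picture element”. Usually, the size of such an array is a few hundred pixels by a few hundred pixels and there are several dozens of possible different grey levels. Thus, a digital image looks like this:

$$
f(x, y) = \begin{bmatrix}
f(0,0) & f(0,1) & \cdots & f(0,N-1) \\
f(1,0) & f(1,1) & \cdots & f(1,N-1) \\
\vdots & \vdots & & \vdots \\
f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1)
\end{bmatrix}
$$

with $0 \le f(x, y) \le G - 1$, where usually $N$ and $G$ are expressed as integer powers of 2 ($N = 2^n$, $G = 2^m$).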
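As a small illustration of this definition, the following Python sketch stores a digital image as a 2-dimensional integer array and checks the constraint on the grey levels. The sizes, the pixel values and the helper name `is_valid_image` are made up for the example.

```python
# Hypothetical 4 x 4 image (N = 2**2) with G = 2**3 = 8 grey levels.
N = 4
G = 8

# f[x][y] is the grey level at spatial position (x, y); the values are made up.
f = [
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 7],
]


def is_valid_image(image, size, levels):
    """Check that image is a size x size integer array with 0 <= f(x, y) <= levels - 1."""
    if len(image) != size or any(len(row) != size for row in image):
        return False
    return all(isinstance(v, int) and 0 <= v <= levels - 1
               for row in image for v in row)


print(is_valid_image(f, N, G))
```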

What is the brightness of an image at a pixel position?

Each pixel of an image corresponds to a part of a physical object in the 3D world. This physical object is illuminated by some light which is partly reflected and partly absorbed by it. Part of the reflected light reaches the sensor used to image the scene and is responsible for the value recorded for the specific pixel. The recorded value, of course, depends on the type of sensor used to image the scene, and the way this sensor responds to the spectrum of the reflected light. However, as a whole scene is imaged by the same sensor, we usually ignore these details. What is important to remember is that the brightness values of different pixels have significance only relative to each other and they are meaningless in absolute terms. So, pixel values between different images should only be compared if either care has been taken for the physical processes used to form the two images to be identical, or the brightness values of the two images have somehow been normalized so that the effects of the different physical processes have been removed.

Why are images often quoted as being 512 × 512, 256 × 256, 128 × 128 etc?

Many calculations with images are simplified when the size of the image is a power of 2.

How many bits do we need to store an image?

The number of bits, $b$, we need to store an image of size $N \times N$ with $2^m$ different grey levels is:

$$
b = N \times N \times m
$$

So, for a typical $512 \times 512$ image with 256 grey levels ($m = 8$) we need 2,097,152 bits or 262,144 8-bit bytes. That is why we often try to reduce $m$ and $N$, without significant loss in the quality of the picture.
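The arithmetic for the 512 × 512 example can be checked with a few lines of Python. The function name `bits_needed` is only illustrative.

```python
def bits_needed(N, m):
    """Number of bits for an N x N image with 2**m grey levels: b = N * N * m."""
    return N * N * m


bits = bits_needed(512, 8)
print(bits)       # 2097152 bits
print(bits // 8)  # 262144 bytes of 8 bits each
```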

What is meant by image resolution?

The resolution of an image expresses how much detail we can see in it and clearly depends on both $N$ and $m$.

Keeping $m$ constant and decreasing $N$ results in the checkerboard effect (Figure 1.1). Keeping $N$ constant and reducing $m$ results in false contouring (Figure 1.2). Experiments have shown that the more detailed a picture is, the less it improves by keeping $N$ constant and increasing $m$. So, for a detailed picture, like a picture of crowds (Figure 1.3), the number of grey levels we use does not matter much.
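The two kinds of degradation can be imitated on a tiny array. This is only a sketch under simple assumptions: spatial resolution is reduced by keeping one pixel per block and replicating it, and grey level resolution is reduced by dropping low-order bits. The function names and the test image are hypothetical and are not taken from the book's figures.

```python
def reduce_spatial(image, factor):
    """Keep one pixel per factor x factor block and replicate it over the block.

    The blocky result is a crude imitation of the checkerboard effect.
    """
    n = len(image)
    return [[image[(x // factor) * factor][(y // factor) * factor]
             for y in range(n)] for x in range(n)]


def reduce_grey_levels(image, m_old, m_new):
    """Requantize from 2**m_old to 2**m_new grey levels by dropping low-order bits.

    Smooth ramps turn into flat bands, which is what causes false contouring.
    """
    shift = m_old - m_new
    return [[(v >> shift) << shift for v in row] for row in image]


# A made-up 4 x 4 ramp image with 16 grey levels (m = 4).
ramp = [[4 * x + y for y in range(4)] for x in range(4)]

blocky = reduce_spatial(ramp, 2)         # N effectively halved, m unchanged
banded = reduce_grey_levels(ramp, 4, 2)  # N unchanged, m reduced from 4 to 2
```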


How do we do Image Processing?

We perform Image Processing by using Image Transformations. Image Transformations are performed using Operators. An Operator takes as input an image and produces another image. In this book we shall concentrate mainly on a particular class of operators, called Linear Operators.

What is a linear operator?

Consider $O$ to be an operator which takes images into images. If $f$ is an image, $O(f)$ is the result of applying $O$ to $f$. $O$ is linear if

$$
O(af + bg) = aO(f) + bO(g)
$$

for all images $f$ and $g$ and all scalars $a$ and $b$.
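The linearity condition can be tried out numerically. The sketch below uses two made-up operators: `neighbour_sum`, which adds each pixel to its right-hand neighbour (with wrap-around, an assumption made only to keep the example short), and `square`, which squares each pixel. A single test pair cannot prove linearity, but one failing pair is enough to show that an operator is not linear.

```python
def combine(f, g, a, b):
    """Return the image a*f + b*g, computed pixel by pixel."""
    return [[a * fv + b * gv for fv, gv in zip(f_row, g_row)]
            for f_row, g_row in zip(f, g)]


def neighbour_sum(f):
    """Hypothetical operator: each pixel plus its right neighbour (wrapping around)."""
    return [[row[y] + row[(y + 1) % len(row)] for y in range(len(row))]
            for row in f]


def square(f):
    """Hypothetical operator: the square of each pixel value."""
    return [[v * v for v in row] for row in f]


def satisfies_linearity(op, f, g, a, b):
    """Check O(a f + b g) == a O(f) + b O(g) for one particular choice of f, g, a, b."""
    return op(combine(f, g, a, b)) == combine(op(f), op(g), a, b)


f = [[1, 2], [3, 4]]
g = [[0, 1], [1, 0]]
print(satisfies_linearity(neighbour_sum, f, g, 2, 3))
print(satisfies_linearity(square, f, g, 2, 3))
```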

How are operators defined?

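Operators are defined in terms of their point spread functions. The point spread function of an operator is what we get out if we apply the operator on a point source:

$$
O[\text{point source}] \equiv \text{point spread function}
$$

Or:

$$
O[\delta(\alpha - x, \beta - y)] \equiv h(x, \alpha, y, \beta)
$$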

How does an operator transform an image?

If the operator is linear, when the point source is $a$ times brighter, the result will be $a$ times larger:

$$
O[a\,\delta(\alpha - x, \beta - y)] = a\,h(x, \alpha, y, \beta)
$$

An image is a collection of point sources (the pixels) each with its own brightness value. We may say that an image is the sum of these point sources. Then the effect of an operator characterized by point spread function $h(x, \alpha, y, \beta)$ on an image $f(x, y)$ can be written as:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, h(x, \alpha, y, \beta) \tag{1.6}
$$

where $g(\alpha, \beta)$ is the output “image”, $f(x, y)$ is the input image and the size of the images is $N \times N$.
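Equation (1.6) translates directly into a double loop. The following Python sketch assumes the point spread function is supplied as an ordinary function `h(x, alpha, y, beta)`; the two example point spread functions are invented for illustration. Feeding in a single point source of brightness 1 returns the point spread function itself, as the definition of the operator requires.

```python
def apply_operator(f, h):
    """g(alpha, beta) = sum over x, y of f(x, y) * h(x, alpha, y, beta)  (equation 1.6)."""
    N = len(f)
    return [
        [sum(f[x][y] * h(x, alpha, y, beta) for x in range(N) for y in range(N))
         for beta in range(N)]
        for alpha in range(N)
    ]


def identity_psf(x, alpha, y, beta):
    """Each input pixel influences only the output pixel at the same position."""
    return 1 if (x == alpha and y == beta) else 0


def example_psf(x, alpha, y, beta):
    """An arbitrary made-up point spread function, used only for the demonstration."""
    return (x + 1) * (alpha + 2) + y * beta


f = [[1, 2], [3, 4]]
g_identity = apply_operator(f, identity_psf)  # reproduces f

point = [[0, 0], [0, 1]]  # point source of brightness 1 at (x, y) = (1, 1)
g_point = apply_operator(point, example_psf)  # equals example_psf(1, alpha, 1, beta)
```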


The point spread function $h(x, \alpha, y, \beta)$ expresses how much the input value at position $(x, y)$ influences the output value at position $(\alpha, \beta)$. If the influence expressed by the point spread function is independent of the actual positions but depends only on the relative position of the influencing and the influenced pixels, we have a shift invariant point spread function:

$$
h(x, \alpha, y, \beta) = h(\alpha - x, \beta - y) \tag{1.7}
$$

Then equation (1.6) is a convolution:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, h(\alpha - x, \beta - y) \tag{1.8}
$$
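A short Python sketch of the shift invariant case follows. The point spread function `cross_psf` is a made-up example that depends only on the relative position of input and output pixels and is zero outside a small neighbourhood. Moving the point source simply moves the same pattern in the output.

```python
def convolve(f, h):
    """g(alpha, beta) = sum over x, y of f(x, y) * h(alpha - x, beta - y)."""
    N = len(f)
    return [
        [sum(f[x][y] * h(alpha - x, beta - y) for x in range(N) for y in range(N))
         for beta in range(N)]
        for alpha in range(N)
    ]


def cross_psf(dx, dy):
    """Hypothetical shift invariant PSF: 2 at the centre, 1 at the four nearest neighbours."""
    if dx == 0 and dy == 0:
        return 2
    if abs(dx) + abs(dy) == 1:
        return 1
    return 0


centre_source = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # point source at (1, 1)
corner_source = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]  # point source at (0, 0)

g_centre = convolve(centre_source, cross_psf)
g_corner = convolve(corner_source, cross_psf)  # same pattern, shifted and cut by the border
```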

If the columns are influenced independently from the rows of the image, then the point spread function is separable:

$$
h(x, \alpha, y, \beta) \equiv h_c(x, \alpha)\, h_r(y, \beta) \tag{1.9}
$$

where the above expression serves also as the definition of functions $h_c(x, \alpha)$ and $h_r(y, \beta)$. Then equation (1.6) can be written as a cascade of two 1D transformations:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha) \sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \tag{1.10}
$$

If the point spread function is both shift invariant and separable, then equation (1.6) can be written as a cascade of two 1D convolutions:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(\alpha - x) \sum_{y=0}^{N-1} f(x, y)\, h_r(\beta - y) \tag{1.11}
$$
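The equivalence between the full double sum and the cascade of two 1D transformations can be verified numerically. In the Python sketch below, `hc` and `hr` are arbitrary made-up functions standing in for $h_c(x, \alpha)$ and $h_r(y, \beta)$; integer values are used so that the comparison is exact.

```python
def full_sum(f, hc, hr):
    """Double sum with the separable PSF h(x, alpha, y, beta) = hc(x, alpha) * hr(y, beta)."""
    N = len(f)
    return [
        [sum(f[x][y] * hc(x, alpha) * hr(y, beta) for x in range(N) for y in range(N))
         for beta in range(N)]
        for alpha in range(N)
    ]


def cascade(f, hc, hr):
    """The same result computed as two successive 1D transformations."""
    N = len(f)
    # First 1D transformation, along y: s(x, beta) = sum_y f(x, y) * hr(y, beta)
    s = [[sum(f[x][y] * hr(y, beta) for y in range(N)) for beta in range(N)]
         for x in range(N)]
    # Second 1D transformation, along x: g(alpha, beta) = sum_x hc(x, alpha) * s(x, beta)
    return [[sum(hc(x, alpha) * s[x][beta] for x in range(N)) for beta in range(N)]
            for alpha in range(N)]


def hc(x, alpha):
    return x + 2 * alpha + 1  # made-up values


def hr(y, beta):
    return (y + 1) * (beta + 1) - 1  # made-up values


f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(full_sum(f, hc, hr) == cascade(f, hc, hr))
```

The cascade needs of the order of $N^3$ multiplications instead of $N^4$ for the full double sum, which is the practical attraction of separability.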

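B1.1: The formal definition of a point source in the continuous domain

Define an extended source of constant brightness:

$$
\delta_n(x, y) \equiv n^2\, \mathrm{rect}(nx, ny) \tag{1.12}
$$

where $n$ is a positive constant and

$$
\mathrm{rect}(nx, ny) \equiv \begin{cases} 1 & \text{inside a rectangle } |nx| \le \tfrac{1}{2},\ |ny| \le \tfrac{1}{2} \\ 0 & \text{elsewhere} \end{cases} \tag{1.13}
$$

The total brightness of this source is given by

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta_n(x, y)\, dx\, dy = n^2 \underbrace{\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \mathrm{rect}(nx, ny)\, dx\, dy}_{\text{area of rectangle}} = 1 \tag{1.14}
$$

and is independent of $n$.

As $n \to \infty$, we create a sequence, $\delta_n$, of extended square sources which gradually shrink with their brightness remaining constant. At the limit, $\delta_n$ becomes Dirac's delta function

$$
\delta(x, y) \begin{cases} \ne 0 & \text{for } x = y = 0 \\ = 0 & \text{elsewhere} \end{cases} \tag{1.15}
$$

with the property

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(x, y)\, dx\, dy = 1 \tag{1.16}
$$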

The integral

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta_n(x, y)\, g(x, y)\, dx\, dy \tag{1.17}
$$

is the average of image $g(x, y)$ over a square with sides $\frac{1}{n}$ centred at $(0, 0)$. At the limit we have:

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(x, y)\, g(x, y)\, dx\, dy = g(0, 0) \tag{1.18}
$$

which is the value of the image at the origin. Similarly

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, \delta_n(x - a, y - b)\, dx\, dy \tag{1.19}
$$

is the average value of $g$ over a square $\frac{1}{n} \times \frac{1}{n}$ centred at $x = a$, $y = b$, since:

$$
\delta_n(x - a, y - b) = n^2\, \mathrm{rect}[n(x - a), n(y - b)] \tag{1.20}
$$

We can see that this is a square source centred at $(a, b)$ by considering that $|n(x - a)| \le \frac{1}{2}$ means $-\frac{1}{2} \le n(x - a) \le \frac{1}{2}$, i.e. $-\frac{1}{2n} \le x - a \le \frac{1}{2n}$, or $a - \frac{1}{2n} \le x \le a + \frac{1}{2n}$. Thus we have that $\delta_n(x - a, y - b) = n^2$ in the region $a - \frac{1}{2n} \le x \le a + \frac{1}{2n}$, $b - \frac{1}{2n} \le y \le b + \frac{1}{2n}$.

At the limit of $n \to \infty$, integral (1.19) is the value of the image $g$ at $x = a$, $y = b$, i.e.

$$
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, \delta(x - a, y - b)\, dx\, dy = g(a, b) \tag{1.21}
$$

This equation is called the shifting property of the delta function. This equation also shows that any image $g(x, y)$ can be expressed as a superposition of point sources.
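The limiting behaviour of integral (1.19) can be observed numerically. The Python sketch below approximates the integral with a midpoint rule over the small square on which the extended source equals $n^2$; the continuous image `g_example`, the centre `(a, b)` and the number of sample points are all assumptions chosen for the demonstration. As $n$ grows, the average approaches the value of the image at the centre of the square.

```python
def average_over_square(g, a, b, n, samples=20):
    """Midpoint-rule approximation of the integral of g(x, y) * delta_n(x - a, y - b).

    delta_n equals n**2 on a square of side 1/n centred at (a, b) and 0 elsewhere.
    """
    side = 1.0 / n
    step = side / samples
    total = 0.0
    for i in range(samples):
        x = a - side / 2 + (i + 0.5) * step
        for j in range(samples):
            y = b - side / 2 + (j + 0.5) * step
            total += g(x, y) * n * n * step * step
    return total


def g_example(x, y):
    """A made-up smooth continuous image."""
    return x * x + 3 * y


a, b = 0.5, 0.25
errors = [abs(average_over_square(g_example, a, b, n) - g_example(a, b))
          for n in (1, 10, 100)]
print(errors)  # the error shrinks as n grows
```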

How can we express in practice the effect of a linear operator on an image?

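This is done with the help of matrices. We can rewrite equation (1.6) as follows:

$$
\begin{aligned}
g(\alpha, \beta) = {} & f(0,0)h(0,\alpha,0,\beta) + f(1,0)h(1,\alpha,0,\beta) + \cdots + f(N-1,0)h(N-1,\alpha,0,\beta) \\
& + f(0,1)h(0,\alpha,1,\beta) + f(1,1)h(1,\alpha,1,\beta) + \cdots + f(N-1,1)h(N-1,\alpha,1,\beta) \\
& + \cdots + f(0,N-1)h(0,\alpha,N-1,\beta) + f(1,N-1)h(1,\alpha,N-1,\beta) + \cdots \\
& + f(N-1,N-1)h(N-1,\alpha,N-1,\beta)
\end{aligned} \tag{1.22}
$$

The right hand side of this expression can be thought of as the dot product of vector

$$
\mathbf{h}_{\alpha\beta}^T \equiv [h(0,\alpha,0,\beta),\ h(1,\alpha,0,\beta),\ \ldots,\ h(N-1,\alpha,0,\beta),\ h(0,\alpha,1,\beta),\ \ldots,\ h(N-1,\alpha,N-1,\beta)] \tag{1.23}
$$

with vector

$$
\mathbf{f}^T \equiv [f(0,0),\ f(1,0),\ \ldots,\ f(N-1,0),\ f(0,1),\ \ldots,\ f(N-1,N-1)] \tag{1.24}
$$

This last vector is actually the image $f(x, y)$ written as a vector by stacking its columns one under the other. If we imagine writing $g(\alpha, \beta)$ in the same way, then vectors $\mathbf{h}_{\alpha\beta}^T$ will arrange themselves as the rows of a matrix $H$, where for $\beta = 0$, $\alpha$ will run from 0 to $N - 1$ to give the first $N$ rows of the matrix, then for $\beta = 1$, $\alpha$ will run again from 0 to $N - 1$ to give the second $N$ rows of the matrix, and so on. Thus, equation (1.6) can be written in a more compact way as:

$$
\mathbf{g} = H\mathbf{f} \tag{1.25}
$$

This is the Fundamental Equation of linear Image Processing. $H$ here is a square $N^2 \times N^2$ matrix that is made up of $N \times N$ submatrices of size $N \times N$ each, arranged in the following way:

$$
H = \begin{pmatrix}
\left(\begin{matrix} x \rightarrow \\ \alpha \downarrow \end{matrix}\ \begin{matrix} y = 0 \\ \beta = 0 \end{matrix}\right) & \left(\begin{matrix} y = 1 \\ \beta = 0 \end{matrix}\right) & \cdots & \left(\begin{matrix} y = N-1 \\ \beta = 0 \end{matrix}\right) \\
\left(\begin{matrix} y = 0 \\ \beta = 1 \end{matrix}\right) & \left(\begin{matrix} y = 1 \\ \beta = 1 \end{matrix}\right) & \cdots & \left(\begin{matrix} y = N-1 \\ \beta = 1 \end{matrix}\right) \\
\vdots & \vdots & & \vdots \\
\left(\begin{matrix} y = 0 \\ \beta = N-1 \end{matrix}\right) & \left(\begin{matrix} y = 1 \\ \beta = N-1 \end{matrix}\right) & \cdots & \left(\begin{matrix} y = N-1 \\ \beta = N-1 \end{matrix}\right)
\end{pmatrix} \tag{1.26}
$$

In this representation each bracketed expression represents an $N \times N$ submatrix made up from function $h(x, \alpha, y, \beta)$ for fixed values of $y$ and $\beta$ and with variables $x$ and $\alpha$ taking up all their possible values in the directions indicated by the arrows. This schematic structure of matrix $H$ is said to correspond to a partition of this matrix into $N^2$ square submatrices.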
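The matrix form can be checked against the double sum on a small example. The Python sketch below assumes that both the input and the output image are stacked column by column, so that pixel $(x, y)$ goes to position $x + Ny$ of the vector; the point spread function `h` is an arbitrary made-up function and the helper names are illustrative only.

```python
def stack_columns(image):
    """Write an N x N image as a vector of length N*N, one column under the other."""
    N = len(image)
    return [image[x][y] for y in range(N) for x in range(N)]


def build_H(h, N):
    """Row alpha + N*beta and column x + N*y of H hold h(x, alpha, y, beta)."""
    return [[h(x, alpha, y, beta) for y in range(N) for x in range(N)]
            for beta in range(N) for alpha in range(N)]


def matvec(H, v):
    """Product of a matrix with a vector."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in H]


def direct_sum(f, h):
    """Double sum of equation (1.6), used as the reference result."""
    N = len(f)
    return [[sum(f[x][y] * h(x, alpha, y, beta) for x in range(N) for y in range(N))
             for beta in range(N)] for alpha in range(N)]


def h(x, alpha, y, beta):
    return x + 2 * alpha + 3 * y + 5 * beta + x * beta  # made-up values


f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
H = build_H(h, 3)                           # a 9 x 9 matrix
g_vector = matvec(H, stack_columns(f))      # g = H f
print(g_vector == stack_columns(direct_sum(f, h)))
```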

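B1.2: What is the stacking operator?

The stacking operator allows us to write an $N \times N$ image array as an $N^2 \times 1$ vector, or an $N^2 \times 1$ vector as an $N \times N$ square array.

We define some vectors $V_n$ and some matrices $N_n$ as:

$$
V_n \equiv \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad
N_n \equiv \begin{pmatrix} 0 \\ \vdots \\ 0 \\ I \\ 0 \\ \vdots \\ 0 \end{pmatrix}
$$

In $V_n$, rows 1 to $n - 1$ are 0, row $n$ is 1 and rows $n + 1$ to $N$ are 0. In $N_n$ there are $n - 1$ square $N \times N$ matrices on top of each other with all their elements 0, the $n$th matrix is the unit matrix, and below it there are $N - n$ square $N \times N$ matrices on top of each other with all their elements 0.

The dimensions of $V_n$ are $(N \times 1)$ and of $N_n$ $(N^2 \times N)$. Then vector $\mathbf{f}$ which corresponds to the $(N \times N)$ square matrix $f$ is given by:

$$
\mathbf{f} = \sum_{n=1}^{N} N_n f V_n \tag{1.27}
$$

It can be shown that if $\mathbf{f}$ is an $N^2 \times 1$ vector, we can write it as an $N \times N$ matrix $f$, the first column of which is made up from the first $N$ elements of $\mathbf{f}$, the second column from the second $N$ elements of $\mathbf{f}$, and so on, by using the following expression:

$$
f = \sum_{n=1}^{N} N_n^T\, \mathbf{f}\, V_n^T \tag{1.28}
$$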
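Equations (1.27) and (1.28) can be tried out with plain nested lists. In the Python sketch below the helper names (`V_vector`, `N_matrix`, `stack`, `unstack`) are illustrative; matrices are lists of rows and the index $n$ runs from 1 to $N$ as in the text.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def V_vector(n, N):
    """N x 1 column vector with a 1 in row n (n counted from 1) and 0 elsewhere."""
    return [[1 if i == n - 1 else 0] for i in range(N)]


def N_matrix(n, N):
    """N*N x N matrix: N stacked N x N blocks, all zero except the nth, which is the unit matrix."""
    return [[1 if (r // N == n - 1 and r % N == c) else 0 for c in range(N)]
            for r in range(N * N)]


def stack(f):
    """Equation (1.27): the N*N x 1 vector is the sum over n of N_n f V_n."""
    N = len(f)
    result = [[0] for _ in range(N * N)]
    for n in range(1, N + 1):
        result = add(result, matmul(matmul(N_matrix(n, N), f), V_vector(n, N)))
    return result


def unstack(vec, N):
    """Equation (1.28): the N x N matrix is the sum over n of N_n^T vec V_n^T."""
    result = [[0] * N for _ in range(N)]
    for n in range(1, N + 1):
        term = matmul(matmul(transpose(N_matrix(n, N)), vec), transpose(V_vector(n, N)))
        result = add(result, term)
    return result


f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_vector = stack(f)          # columns of f one under the other
f_back = unstack(f_vector, 3)
```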

Example 1.1 (B)

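You are given a 3 × 3 image $f$ and you are asked to use the stacking operator to write it in vector form.

Let us say that:

$$
f = \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix}
$$

We define vectors $V_n$ and matrices $N_n$ for $n = 1, 2, 3$:

$$
V_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad V_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad V_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
$$

$$
N_1 = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{pmatrix}, \quad
N_2 = \begin{pmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 1&0&0 \\ 0&1&0 \\ 0&0&1 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{pmatrix}, \quad
N_3 = \begin{pmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}
$$

According to equation (1.27):

$$
\mathbf{f} = N_1 f V_1 + N_2 f V_2 + N_3 f V_3 \tag{1.29}
$$

We shall calculate each term separately:

$$
N_1 f V_1 = N_1 \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = N_1 \begin{pmatrix} f_{11} \\ f_{21} \\ f_{31} \end{pmatrix} = \begin{pmatrix} f_{11} \\ f_{21} \\ f_{31} \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{1.30}
$$

Similarly

$$
N_2 f V_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ f_{12} \\ f_{22} \\ f_{32} \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad
N_3 f V_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ f_{13} \\ f_{23} \\ f_{33} \end{pmatrix}
$$

Then by substituting in (1.29) we get vector $\mathbf{f}$:

$$
\mathbf{f} = (f_{11},\ f_{21},\ f_{31},\ f_{12},\ f_{22},\ f_{32},\ f_{13},\ f_{23},\ f_{33})^T
$$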

Similarly

Then by substituting in (1.31) we get matrix $f$.

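What is the implication of the separability assumption on the structure of matrix H?

According to the separability assumption, we can replace $h(x, \alpha, y, \beta)$ by $h_c(x, \alpha) h_r(y, \beta)$. Then inside each partition of $H$ in equation (1.26), where $y$ and $\beta$ are fixed, $h_r(y, \beta)$ remains constant and we may write for $H$:

$$
H = \begin{pmatrix}
h_{r00} \begin{pmatrix} h_{c00} & \cdots & h_{cN-1\,0} \\ \vdots & & \vdots \\ h_{c0\,N-1} & \cdots & h_{cN-1\,N-1} \end{pmatrix} & \cdots & h_{rN-1\,0} \begin{pmatrix} h_{c00} & \cdots & h_{cN-1\,0} \\ \vdots & & \vdots \\ h_{c0\,N-1} & \cdots & h_{cN-1\,N-1} \end{pmatrix} \\
\vdots & & \vdots \\
h_{r0\,N-1} \begin{pmatrix} h_{c00} & \cdots & h_{cN-1\,0} \\ \vdots & & \vdots \\ h_{c0\,N-1} & \cdots & h_{cN-1\,N-1} \end{pmatrix} & \cdots & h_{rN-1\,N-1} \begin{pmatrix} h_{c00} & \cdots & h_{cN-1\,0} \\ \vdots & & \vdots \\ h_{c0\,N-1} & \cdots & h_{cN-1\,N-1} \end{pmatrix}
\end{pmatrix}
$$

where $h_{cx\alpha} \equiv h_c(x, \alpha)$ and $h_{ry\beta} \equiv h_r(y, \beta)$.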

How can a separable transform be written in matrix form?

Consider again equation (1.10) which expresses the separable linear transform of an image:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha) \sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \tag{1.35}
$$

Notice that factor $\sum_{y=0}^{N-1} f(x, y) h_r(y, \beta)$ actually represents the product of two $N \times N$ matrices, which must be another matrix of the same size. Let us define it as:

$$
\sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \equiv s(x, \beta) \tag{1.36}
$$

Then (1.35) can be written as:

$$
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha)\, s(x, \beta) \tag{1.37}
$$

Thus in matrix form

$$
g = h_c^T s = h_c^T f h_r \tag{1.38}
$$

The separability assumption implies that our operator $O$ (the point spread function of which, $h(x, \alpha, y, \beta)$, is separable) operates on the rows of the image matrix $f$ independently from the way it operates on its columns. These independent operations are expressed by the two matrices $h_r$ and $h_c$ respectively. That is why we chose subscripts $r$ and $c$ to denote these matrices ($r$ = rows, $c$ = columns).
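The matrix form of the separable transform can be compared with the double sum on a small example. In the Python sketch below, `hc[x][alpha]` and `hr[y][beta]` hold made-up integer values of $h_c(x, \alpha)$ and $h_r(y, \beta)$, and the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def separable_sum(f, hc, hr):
    """g(alpha, beta) = sum_x hc(x, alpha) * sum_y f(x, y) * hr(y, beta)."""
    N = len(f)
    return [[sum(hc[x][alpha] * sum(f[x][y] * hr[y][beta] for y in range(N))
                 for x in range(N))
             for beta in range(N)] for alpha in range(N)]


# Made-up 3 x 3 example: hc[x][alpha] and hr[y][beta].
hc = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]
hr = [[1, 0, 1], [2, 1, 0], [0, 3, 1]]
f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

s = matmul(f, hr)                    # s = f h_r
g_matrix = matmul(transpose(hc), s)  # g = h_c^T s = h_c^T f h_r
print(g_matrix == separable_sum(f, hc, hr))
```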

B1.3 The formal derivation of the separable matrix equation

We can use equations (1.27) and (1.28) with (1.25) as follows: First express the output image $g$ using (1.28) in terms of $\mathbf{g}$:

$$
g = \sum_{m=1}^{N} N_m^T\, \mathbf{g}\, V_m^T \tag{1.39}
$$

Then express $\mathbf{g}$ in terms of $H$ and $\mathbf{f}$ from (1.25) and replace $\mathbf{f}$ in terms of $f$ using (1.27):

$$
\mathbf{g} = H \sum_{n=1}^{N} N_n f V_n \tag{1.40}
$$

Substitute (1.40) into (1.39) and group factors with the help of brackets to get:

$$
g = \sum_{m=1}^{N} \sum_{n=1}^{N} (N_m^T H N_n)\, f\, (V_n V_m^T) \tag{1.41}
$$

$H$ is a $(N^2 \times N^2)$ matrix. We may think of it as partitioned in $N \times N$ submatrices stacked together. Then it can be shown that $N_m^T H N_n$ is the $H_{mn}$ such submatrix. Under the separability assumption, matrix $H$ is the Kronecker product of matrices $h_r$ and $h_c$:

$$
H = h_r^T \otimes h_c^T \tag{1.42}
$$

Then partition $H_{mn}$ is essentially $h_r^T(m, n)\, h_c^T$. If we substitute this in (1.41) we obtain:

$$
g = \sum_{m=1}^{N} \sum_{n=1}^{N} h_r^T(m, n)\, h_c^T f\, (V_n V_m^T) = h_c^T f \sum_{m=1}^{N} \sum_{n=1}^{N} h_r^T(m, n)\, V_n V_m^T \tag{1.43}
$$

The product $V_n V_m^T$ is the product between an $(N \times 1)$ matrix with the only non-zero element at position $n$, with a $(1 \times N)$ matrix, with the only non-zero element at position $m$. So, it is an $N \times N$ square matrix with the only non-zero element at position $(n, m)$.

When multiplied by $h_r^T(m, n)$ it places the $(m, n)$ element of the $h_r^T$ matrix in position $(n, m)$ and sets to zero all other elements. The sum over all $m$'s and $n$'s is $h_r$. So from (1.43) we have:

$$
g = h_c^T f h_r
$$
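The Kronecker product structure can be verified numerically on a small example. The Python sketch below assumes column-by-column stacking of both images and uses made-up integer matrices for $h_c$ and $h_r$; the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
            for i in range(len(A) * p)]


def stack_columns(image):
    """Write an N x N image as a vector, one column under the other."""
    N = len(image)
    return [image[x][y] for y in range(N) for x in range(N)]


def matvec(H, v):
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in H]


# Made-up 3 x 3 example.
hc = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]
hr = [[1, 0, 1], [2, 1, 0], [0, 3, 1]]
f = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

H = kron(transpose(hr), transpose(hc))           # H = h_r^T (Kronecker) h_c^T, 9 x 9
g_from_H = matvec(H, stack_columns(f))           # stacked g from g = H f
g_direct = matmul(matmul(transpose(hc), f), hr)  # g = h_c^T f h_r
print(g_from_H == stack_columns(g_direct))
```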

You are given a 9 × 9 matrix $H$ which is partitioned into nine 3 × 3 submatrices. Show that $N_2^T H N_3$, where $N_2$ and $N_3$ are matrices of the stacking operator, is partition $H_{23}$ of matrix $H$.

Let us say that:

Title	Introduction
Authors	Maria Petrou, Panagiota Bosdogianni
Subject	Image Processing
Type	Book
Year of publication	1999
