Image Processing: The Fundamentals
Maria Petrou and Panagiota Bosdogianni
Copyright © 1999 John Wiley & Sons Ltd
Print ISBN 0-471-99883-4, Electronic ISBN 0-470-84190-7
Introduction
Image Processing has been developed in response to three major problems concerned with pictures:
• Picture digitization and coding to facilitate transmission, printing and storage of pictures.
• Picture enhancement and restoration in order, for example, to interpret more easily pictures of the surface of other planets taken by various probes.
• Picture segmentation and description as an early stage in Machine Vision.
What is an image?
A monochrome image is a 2-dimensional light intensity function, f(x, y), where x and y are spatial coordinates and the value of f at (x, y) is proportional to the brightness of the image at that point. If we have a multicolour image, f is a vector, each component of which indicates the brightness of the image at point (x, y) at the corresponding colour band.

A digital image is an image f(x, y) that has been discretized both in spatial coordinates and in brightness. It is represented by a 2-dimensional integer array, or a series of 2-dimensional arrays, one for each colour band. The digitized brightness value is called the grey level value.

Each element of the array is called a pixel or a pel, derived from the term "picture element". Usually, the size of such an array is a few hundred pixels by a few hundred pixels and there are several dozen possible different grey levels. Thus, a digital image looks like this:

\[
f(x, y) =
\begin{pmatrix}
f(0,0) & f(0,1) & \cdots & f(0,N-1) \\
f(1,0) & f(1,1) & \cdots & f(1,N-1) \\
\vdots & \vdots & & \vdots \\
f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1)
\end{pmatrix}
\]

with $0 \le f(x, y) \le G - 1$, where usually N and G are expressed as integer powers of 2 ($N = 2^n$, $G = 2^m$).
What is the brightness of an image at a pixel position?
Each pixel of an image corresponds to a part of a physical object in the 3D world. This physical object is illuminated by some light which is partly reflected and partly absorbed by it. Part of the reflected light reaches the sensor used to image the scene and is responsible for the value recorded for the specific pixel. The recorded value, of course, depends on the type of sensor used to image the scene, and the way this sensor responds to the spectrum of the reflected light. However, as a whole scene is imaged by the same sensor, we usually ignore these details. What is important to remember is that the brightness values of different pixels have significance only relative to each other and they are meaningless in absolute terms. So, pixel values between different images should only be compared if either care has been taken to make the physical processes used to form the two images identical, or the brightness values of the two images have somehow been normalized so that the effects of the different physical processes have been removed.
Why are images often quoted as being 512 × 512, 256 × 256, 128 × 128 etc?
Many calculations with images are simplified when the size of the image is a power of 2.
How many bits do we need to store an image?
The number of bits, b, we need to store an image of size N × N with $2^m$ different grey levels is:

\[
b = N \times N \times m
\]

So, for a typical 512 × 512 image with 256 grey levels (m = 8) we need 2,097,152 bits or 262,144 8-bit bytes. That is why we often try to reduce m and N, without significant loss in the quality of the picture.
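The storage count above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `bits_needed` is an assumption made for illustration and does not come from the text.

```python
def bits_needed(N: int, m: int) -> int:
    """Bits needed for an N x N image with 2**m grey levels: b = N * N * m."""
    return N * N * m


b = bits_needed(512, 8)
print(b, "bits =", b // 8, "bytes")

# Halving N divides the storage by 4, while halving m only divides it by 2.
print(bits_needed(256, 8), bits_needed(512, 4))
```

The comparison in the last line suggests why reducing N saves storage faster than reducing m, although the visual cost of each reduction is a separate question.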
What is meant by image resolution?
The resolution of an image expresses how much detail we can see in it and clearly depends on both N and m.

Keeping m constant and decreasing N results in the checkerboard effect (Figure 1.1). Keeping N constant and reducing m results in false contouring (Figure 1.2). Experiments have shown that the more detailed a picture is, the less it improves by keeping N constant and increasing m. So, for a detailed picture, like a picture of crowds (Figure 1.3), the number of grey levels we use does not matter much.
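The two ways of lowering resolution can be imitated on a tiny synthetic image. The sketch below is an illustration only: the helper names `reduce_spatial` and `reduce_levels` and the 8 × 8 ramp image are assumptions, and plain Python lists are used so that no image library is presumed.

```python
def reduce_spatial(img, factor):
    """Keep one sample per factor x factor block and replicate it over the block.

    This mimics decreasing N while displaying the image at its original size,
    which is what produces the blocky "checkerboard" look.
    """
    n = len(img)
    return [[img[(x // factor) * factor][(y // factor) * factor]
             for y in range(n)] for x in range(n)]


def reduce_levels(img, m_old, m_new):
    """Requantize from 2**m_old to 2**m_new grey levels (values stay on the old scale)."""
    step = 2 ** (m_old - m_new)
    return [[(v // step) * step for v in row] for row in img]


# Hypothetical 8 x 8 test image: a smooth diagonal ramp stored with m = 4 bits.
N, m = 8, 4
ramp = [[x + y for y in range(N)] for x in range(N)]

coarse = reduce_spatial(ramp, 4)    # only 2 x 2 distinct samples remain
banded = reduce_levels(ramp, 4, 2)  # only 2**2 = 4 grey levels remain

for row in banded:
    print(row)
```

In `banded` the smooth ramp turns into a few flat bands with abrupt steps between them, which is the mechanism behind false contouring in smooth regions of an image.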
How do we do Image Processing?
We perform Image Processing by using Image Transformations. Image Transformations are performed using Operators. An Operator takes as input an image and produces another image. In this book we shall concentrate mainly on a particular class of operators, called Linear Operators.
What is a linear operator?
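Consider $\mathcal{O}$ to be an operator which takes images into images. If f is an image, $\mathcal{O}(f)$ is the result of applying $\mathcal{O}$ to f. $\mathcal{O}$ is linear if

\[
\mathcal{O}[af + bg] = a\,\mathcal{O}[f] + b\,\mathcal{O}[g]
\]

for all images f and g and all scalars a and b.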
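The linearity condition can be probed numerically. The sketch below uses two made-up operators, `neighbour_sum` and `square`, chosen purely for illustration. Passing the check for one pair of images does not prove linearity, but a single failure is enough to disprove it.

```python
def neighbour_sum(img):
    """Each output pixel is the input pixel plus its right-hand neighbour (wrapping round)."""
    n = len(img)
    return [[img[x][y] + img[x][(y + 1) % n] for y in range(n)] for x in range(n)]


def square(img):
    """Pointwise squaring: an example of a non-linear operator."""
    return [[v * v for v in row] for row in img]


def combine(a, f, b, g):
    """Return the image a*f + b*g."""
    return [[a * fv + b * gv for fv, gv in zip(fr, gr)] for fr, gr in zip(f, g)]


def is_linear_on(op, f, g, a, b):
    """Test whether op[a f + b g] equals a op[f] + b op[g] for this particular input."""
    return op(combine(a, f, b, g)) == combine(a, op(f), b, op(g))


f = [[1, 2], [3, 4]]
g = [[0, 5], [7, 1]]
print(is_linear_on(neighbour_sum, f, g, 2, -3))
print(is_linear_on(square, f, g, 2, -3))
```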
How are operators defined?
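Operators are defined in terms of their point spread functions. The point spread function of an operator is what we get out if we apply the operator on a point source:

\[
\mathcal{O}[\text{point source}] = \text{point spread function}
\]

Or:

\[
\mathcal{O}[\delta(\alpha - x, \beta - y)] = h(x, \alpha, y, \beta)
\]

where $\delta(\alpha - x, \beta - y)$ stands for a point source of unit brightness at position (x, y).
How does an operator transform an image?
If the operator is linear, when the point source is a times brighter, the result will be a times larger:

\[
\mathcal{O}[a\,\delta(\alpha - x, \beta - y)] = a\,h(x, \alpha, y, \beta)
\]

An image is a collection of point sources (the pixels), each with its own brightness value. We may say that an image is the sum of these point sources. Then the effect of an operator characterized by point spread function $h(x, \alpha, y, \beta)$ on an image f(x, y) can be written as:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, h(x, \alpha, y, \beta) \tag{1.6}
\]

where $g(\alpha, \beta)$ is the output "image", f(x, y) is the input image and the size of the images is N × N.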
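A direct transcription of the double sum in equation (1.6) may help to fix the roles of the four arguments of the point spread function. This is a minimal sketch: the function `apply_operator` and the 3 × 3 box-shaped point spread function `h_box` are assumptions introduced for illustration.

```python
def apply_operator(f, h):
    """g(alpha, beta) = sum over x, y of f(x, y) * h(x, alpha, y, beta)."""
    n = len(f)
    return [[sum(f[x][y] * h(x, a, y, b) for x in range(n) for y in range(n))
             for b in range(n)] for a in range(n)]


def h_box(x, alpha, y, beta):
    """Hypothetical PSF: input pixel (x, y) influences outputs at most one pixel away."""
    return 1 if abs(x - alpha) <= 1 and abs(y - beta) <= 1 else 0


# A single point source of brightness 5 at position (1, 1) of a 4 x 4 image.
N = 4
f = [[0] * N for _ in range(N)]
f[1][1] = 5

g = apply_operator(f, h_box)
for row in g:
    print(row)
```

Since the input is a single point source of brightness 5, the output is 5 times the point spread function evaluated at x = 1, y = 1, which is the statement made in the text for linear operators.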
The point spread function $h(x, \alpha, y, \beta)$ expresses how much the input value at position (x, y) influences the output value at position $(\alpha, \beta)$. If the influence expressed by the point spread function is independent of the actual positions but depends only on the relative position of the influencing and the influenced pixels, we have a shift invariant point spread function:

\[
h(x, \alpha, y, \beta) = h(\alpha - x, \beta - y) \tag{1.7}
\]

Then equation (1.6) is a convolution:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, h(\alpha - x, \beta - y) \tag{1.8}
\]

If the columns are influenced independently from the rows of the image, then the point spread function is separable:

\[
h(x, \alpha, y, \beta) \equiv h_c(x, \alpha)\, h_r(y, \beta) \tag{1.9}
\]

where the above expression serves also as the definition of functions $h_c(x, \alpha)$ and $h_r(y, \beta)$. Then equation (1.6) can be written as a cascade of two 1D transformations:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha) \sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \tag{1.10}
\]

If the point spread function is both shift invariant and separable, then equation (1.6) can be written as a cascade of two 1D convolutions:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(\alpha - x) \sum_{y=0}^{N-1} f(x, y)\, h_r(\beta - y) \tag{1.11}
\]
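The equivalence between the full 2D convolution and the cascade of two 1D convolutions can be checked on a small example. In the sketch below the 1D kernels `h_c` and `h_r` and the 3 × 3 image are invented for illustration, and samples outside the image are simply not summed.

```python
def h_c(d):
    """Hypothetical 1D kernel along x, as a function of the offset d = alpha - x."""
    return {0: 2, 1: 1, -1: 1}.get(d, 0)


def h_r(d):
    """Hypothetical 1D kernel along y, as a function of the offset d = beta - y."""
    return {0: 1, 1: 1}.get(d, 0)


def full_sum(f):
    """Double sum with the separable, shift invariant PSF h_c(alpha - x) * h_r(beta - y)."""
    n = len(f)
    return [[sum(f[x][y] * h_c(a - x) * h_r(b - y)
                 for x in range(n) for y in range(n))
             for b in range(n)] for a in range(n)]


def cascade(f):
    """Same result as two 1D convolutions: first along y, then along x."""
    n = len(f)
    # Inner sum: 1D convolution of every row x with h_r, giving s(x, beta).
    s = [[sum(f[x][y] * h_r(b - y) for y in range(n)) for b in range(n)]
         for x in range(n)]
    # Outer sum: 1D convolution of every column beta of s with h_c.
    return [[sum(h_c(a - x) * s[x][b] for x in range(n)) for b in range(n)]
            for a in range(n)]


f = [[1, 2, 0], [0, 3, 1], [4, 0, 2]]
print(full_sum(f) == cascade(f))
```

For each output pixel the double sum runs over all N² input pixels, whereas the cascade runs two sums of N terms each, which is the practical reason separable point spread functions are attractive.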
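B1.1: The formal definition of a point source in the continuous domain

Define an extended source of constant brightness:

\[
\delta_n(x, y) \equiv n^2\, \mathrm{rect}(nx, ny) \tag{1.12}
\]

where n is a positive constant and

\[
\mathrm{rect}(nx, ny) \equiv
\begin{cases}
1 & \text{inside a rectangle } |nx| \le \tfrac{1}{2},\ |ny| \le \tfrac{1}{2} \\
0 & \text{elsewhere}
\end{cases} \tag{1.13}
\]

The total brightness of this source is given by

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \delta_n(x, y)\,dx\,dy
= n^2 \underbrace{\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \mathrm{rect}(nx, ny)\,dx\,dy}_{\text{area of rectangle}} = 1 \tag{1.14}
\]

and is independent of n (the rectangle has sides 1/n, so its area is $1/n^2$).

As $n \to \infty$, we create a sequence, $\delta_n$, of extended square sources which gradually shrink with their brightness remaining constant. At the limit, $\delta_n$ becomes Dirac's delta function

\[
\delta(x, y)
\begin{cases}
\ne 0 & \text{for } x = y = 0 \\
= 0 & \text{elsewhere}
\end{cases} \tag{1.15}
\]

with the property

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \delta(x, y)\,dx\,dy = 1 \tag{1.16}
\]

The integral

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \delta_n(x, y)\, g(x, y)\,dx\,dy \tag{1.17}
\]

is the average of image g(x, y) over a square with sides 1/n centred at (0, 0). At the limit we have:

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \delta(x, y)\, g(x, y)\,dx\,dy = g(0, 0) \tag{1.18}
\]

which is the value of the image at the origin. Similarly

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x, y)\, \delta_n(x - a, y - b)\,dx\,dy \tag{1.19}
\]

is the average value of g over a square 1/n × 1/n centred at x = a, y = b, since:

\[
\delta_n(x - a, y - b) = n^2\, \mathrm{rect}[n(x - a), n(y - b)] \tag{1.20}
\]

We can see that this is a square source centred at (a, b) by considering that $|n(x - a)| \le \tfrac{1}{2}$ means $-\tfrac{1}{2} \le n(x - a) \le \tfrac{1}{2}$, i.e. $-\tfrac{1}{2n} \le x - a \le \tfrac{1}{2n}$, or $a - \tfrac{1}{2n} \le x \le a + \tfrac{1}{2n}$. Thus we have that $\delta_n(x - a, y - b) = n^2$ in the region $a - \tfrac{1}{2n} < x < a + \tfrac{1}{2n}$, $b - \tfrac{1}{2n} < y < b + \tfrac{1}{2n}$.

At the limit of $n \to \infty$, integral (1.19) is the value of the image g at x = a, y = b, i.e.

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x, y)\, \delta(x - a, y - b)\,dx\,dy = g(a, b) \tag{1.21}
\]

This equation is called the shifting property of the delta function. This equation also shows that any image g(x, y) can be expressed as a superposition of point sources.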
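The limiting argument of this box can be watched numerically: the average of a smooth function over a square of side 1/n centred at (a, b) should approach its value at (a, b) as n grows. The sketch below estimates integral (1.19) with a simple midpoint rule. The test function `g`, the point (2, 1) and the grid size `k` are arbitrary choices made for illustration.

```python
def average_over_square(g, a, b, n, k=50):
    """Midpoint-rule estimate of the integral of n^2 rect[n(x-a), n(y-b)] * g(x, y)."""
    side = 1.0 / n          # the square source has sides 1/n
    cell = side / k         # split the square into k x k small cells
    total = 0.0
    for i in range(k):
        for j in range(k):
            x = a - side / 2 + (i + 0.5) * cell
            y = b - side / 2 + (j + 0.5) * cell
            # Inside the square the source has the constant value n^2.
            total += n * n * g(x, y) * cell * cell
    return total


def g(x, y):
    """Hypothetical smooth 'image'; g(2, 1) = 7."""
    return x * x + 3 * y


for n in (1, 10, 100):
    print(n, average_over_square(g, 2.0, 1.0, n))
```

For this particular g the exact average over the square is $g(a, b) + 1/(12 n^2)$, so the printed values should drift towards 7 as n increases.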
How can we express in practice the effect of a linear operator on an image?
This is done with the help of matrices. We can rewrite equation (1.6) as follows:

\[
\begin{aligned}
g(\alpha, \beta) ={}& f(0,0)h(0,\alpha,0,\beta) + f(1,0)h(1,\alpha,0,\beta) + \ldots + f(N-1,0)h(N-1,\alpha,0,\beta) \\
&+ f(0,1)h(0,\alpha,1,\beta) + f(1,1)h(1,\alpha,1,\beta) + \ldots + f(N-1,1)h(N-1,\alpha,1,\beta) \\
&+ \ldots + f(0,N-1)h(0,\alpha,N-1,\beta) + f(1,N-1)h(1,\alpha,N-1,\beta) + \ldots \\
&+ f(N-1,N-1)h(N-1,\alpha,N-1,\beta)
\end{aligned} \tag{1.22}
\]

The right hand side of this expression can be thought of as the dot product of vector

\[
\mathbf{h}_{\alpha\beta}^T \equiv [h(0,\alpha,0,\beta),\ h(1,\alpha,0,\beta),\ \ldots,\ h(N-1,\alpha,0,\beta),\ h(0,\alpha,1,\beta),\ \ldots,\ h(N-1,\alpha,N-1,\beta)] \tag{1.23}
\]

with vector

\[
\mathbf{f}^T \equiv [f(0,0),\ f(1,0),\ \ldots,\ f(N-1,0),\ f(0,1),\ f(1,1),\ \ldots,\ f(N-1,N-1)] \tag{1.24}
\]

This last vector is actually the image f(x, y) written as a vector by stacking its columns one under the other. If we imagine writing $g(\alpha, \beta)$ in the same way, then vectors $\mathbf{h}_{\alpha\beta}^T$ will arrange themselves as the rows of a matrix H, where for $\beta = 0$, $\alpha$ will run from 0 to N − 1 to give the first N rows of the matrix, then for $\beta = 1$, $\alpha$ will run again from 0 to N − 1 to give the second N rows of the matrix, and so on. Thus, equation (1.6) can be written in a more compact way as:

\[
\mathbf{g} = H\mathbf{f} \tag{1.25}
\]
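This is the Fundamental Equation of linear Image Processing. H here is a square N² × N² matrix that is made up of N × N submatrices of size N × N each, arranged in the following way:

\[
H =
\begin{pmatrix}
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=0,\ \beta=0 \end{smallmatrix}\right] &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=1,\ \beta=0 \end{smallmatrix}\right] &
\cdots &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=N-1,\ \beta=0 \end{smallmatrix}\right] \\
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=0,\ \beta=1 \end{smallmatrix}\right] &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=1,\ \beta=1 \end{smallmatrix}\right] &
\cdots &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=N-1,\ \beta=1 \end{smallmatrix}\right] \\
\vdots & \vdots & & \vdots \\
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=0,\ \beta=N-1 \end{smallmatrix}\right] &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=1,\ \beta=N-1 \end{smallmatrix}\right] &
\cdots &
\left[\begin{smallmatrix} \alpha\downarrow\ x\rightarrow \\ y=N-1,\ \beta=N-1 \end{smallmatrix}\right]
\end{pmatrix} \tag{1.26}
\]

In this representation each bracketed expression represents an N × N submatrix made up from function $h(x, \alpha, y, \beta)$ for fixed values of y and $\beta$ and with variables x and $\alpha$ taking up all their possible values in the directions indicated by the arrows. This schematic structure of matrix H is said to correspond to a partition of this matrix into N² square submatrices.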
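The passage from the double sum (1.6) to the matrix equation (1.25) can be sketched as follows. The index conventions are assumptions spelled out in the comments: the image is stacked column by column, so that row $\alpha + N\beta$ of H holds the point spread function values for output pixel $(\alpha, \beta)$ and column $x + Ny$ corresponds to input pixel (x, y). The point spread function `h` is an arbitrary made-up example.

```python
def stack(img):
    """Column-stack an image: f(0,0), f(1,0), ..., f(N-1,0), f(0,1), ..."""
    n = len(img)
    return [img[x][y] for y in range(n) for x in range(n)]


def build_H(h, n):
    """N^2 x N^2 matrix with row index alpha + n*beta and column index x + n*y."""
    return [[h(x, a, y, b) for y in range(n) for x in range(n)]
            for b in range(n) for a in range(n)]


def matvec(H, v):
    """Plain matrix-vector product."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in H]


def direct(f, h):
    """Equation (1.6) evaluated as a double sum, for comparison."""
    n = len(f)
    return [[sum(f[x][y] * h(x, a, y, b) for x in range(n) for y in range(n))
             for b in range(n)] for a in range(n)]


def h(x, alpha, y, beta):
    """Hypothetical PSF, neither separable nor shift invariant."""
    return (x + 1) * (alpha + 2) + y * beta


f = [[1, 2], [3, 4]]
H = build_H(h, 2)
print(matvec(H, stack(f)) == stack(direct(f, h)))
```

With these conventions the N × N block of H in block-row $\beta$ and block-column y contains $h(x, \alpha, y, \beta)$ with $\alpha$ running down and x running across.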
B1.2: What is the stacking operator?

The stacking operator allows us to write an N × N image array as an N² × 1 vector, or an N² × 1 vector as an N × N square array.

We define some vectors $V_n$ and some matrices $N_n$ as:

\[
V_n \equiv (\underbrace{0, \ldots, 0}_{n-1},\ 1,\ \underbrace{0, \ldots, 0}_{N-n})^T,
\qquad
N_n \equiv
\begin{pmatrix}
\mathbf{0} \\ \vdots \\ \mathbf{0} \\ I \\ \mathbf{0} \\ \vdots \\ \mathbf{0}
\end{pmatrix}
\]

In $N_n$ the first n − 1 blocks are square N × N matrices on top of each other with all their elements 0, the nth block is the N × N unit matrix, and the remaining N − n blocks are square N × N matrices on top of each other with all their elements 0. Vector $V_n$ has its only non-zero element, equal to 1, at position n.

The dimensions of $V_n$ are (N × 1) and of $N_n$ (N² × N). Then vector $\mathbf{f}$ which corresponds to the (N × N) square matrix f is given by:

\[
\mathbf{f} = \sum_{n=1}^{N} N_n f V_n \tag{1.27}
\]

It can be shown that if $\mathbf{f}$ is an N² × 1 vector, we can write it as an N × N matrix f, the first column of which is made up from the first N elements of $\mathbf{f}$, the second column from the second N elements of $\mathbf{f}$, and so on, by using the following expression:

\[
f = \sum_{n=1}^{N} N_n^T \mathbf{f} V_n^T \tag{1.28}
\]
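Equations (1.27) and (1.28) can be tried out directly. The sketch below builds $V_n$ and $N_n$ as nested lists and uses small helper functions (`matmul`, `transpose`, `add`, `V`, `Nmat`, all of them names assumed here for illustration) instead of a matrix library.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def V(n, N):
    """N x 1 column vector with a 1 at position n (counting from 1)."""
    return [[1 if i == n - 1 else 0] for i in range(N)]


def Nmat(n, N):
    """N^2 x N matrix: N stacked N x N blocks, the n-th is the unit matrix, the rest are 0."""
    return [[1 if (r // N == n - 1 and r % N == c) else 0 for c in range(N)]
            for r in range(N * N)]


def stack(f):
    """Equation (1.27): the N^2 x 1 vector is the sum over n of N_n f V_n."""
    N = len(f)
    out = [[0] for _ in range(N * N)]
    for n in range(1, N + 1):
        out = add(out, matmul(matmul(Nmat(n, N), f), V(n, N)))
    return out


def unstack(fvec, N):
    """Equation (1.28): the N x N matrix is the sum over n of N_n^T fvec V_n^T."""
    out = [[0] * N for _ in range(N)]
    for n in range(1, N + 1):
        term = matmul(matmul(transpose(Nmat(n, N)), fvec), transpose(V(n, N)))
        out = add(out, term)
    return out


# Entries are written as "row, column" digits so the ordering is easy to read.
f = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
fvec = stack(f)
print([row[0] for row in fvec])
print(unstack(fvec, 3) == f)
```

The first printed list shows the columns of f placed one under the other, and the second line confirms that (1.28) undoes (1.27) for this example.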
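Example 1.1 (B)

You are given a 3 × 3 image f and you are asked to use the stacking operator to write it in vector form.

Let us say that:

\[
f = \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix}
\]

We define vectors $V_n$ and matrices $N_n$ for n = 1, 2, 3:

\[
V_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\quad
V_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\quad
V_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
\]

\[
N_1 = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{pmatrix},\quad
N_2 = \begin{pmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 1&0&0 \\ 0&1&0 \\ 0&0&1 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{pmatrix},\quad
N_3 = \begin{pmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 0&0&0 \\ 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}
\]

According to equation (1.27):

\[
\mathbf{f} = N_1 f V_1 + N_2 f V_2 + N_3 f V_3 \tag{1.29}
\]

We shall calculate each term separately:

\[
N_1 f V_1 = N_1 \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= N_1 \begin{pmatrix} f_{11} \\ f_{21} \\ f_{31} \end{pmatrix}
= (f_{11},\ f_{21},\ f_{31},\ 0,\ 0,\ 0,\ 0,\ 0,\ 0)^T
\]

Similarly

\[
N_2 f V_2 = (0,\ 0,\ 0,\ f_{12},\ f_{22},\ f_{32},\ 0,\ 0,\ 0)^T,
\qquad
N_3 f V_3 = (0,\ 0,\ 0,\ 0,\ 0,\ 0,\ f_{13},\ f_{23},\ f_{33})^T \tag{1.30}
\]

Then by substituting in (1.29) we get vector $\mathbf{f}$.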
Similarly
Then by substituting in (1.31) we get matrix f.
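What is the implication of the separability assumption on the structure of matrix H?
According to the separability assumption, we can replace $h(x, \alpha, y, \beta)$ by $h_c(x, \alpha) h_r(y, \beta)$. Then inside each partition of H in equation (1.26), where y and $\beta$ are fixed, $h_r(y, \beta)$ remains constant and we may write for H:

\[
H =
\begin{pmatrix}
h_r(0,0)\,[h_c] & h_r(1,0)\,[h_c] & \cdots & h_r(N-1,0)\,[h_c] \\
h_r(0,1)\,[h_c] & h_r(1,1)\,[h_c] & \cdots & h_r(N-1,1)\,[h_c] \\
\vdots & \vdots & & \vdots \\
h_r(0,N-1)\,[h_c] & h_r(1,N-1)\,[h_c] & \cdots & h_r(N-1,N-1)\,[h_c]
\end{pmatrix}
\]

where $[h_c]$ is shorthand for the N × N submatrix in which $\alpha$ runs down and x runs across:

\[
[h_c] \equiv
\begin{pmatrix}
h_c(0,0) & h_c(1,0) & \cdots & h_c(N-1,0) \\
h_c(0,1) & h_c(1,1) & \cdots & h_c(N-1,1) \\
\vdots & \vdots & & \vdots \\
h_c(0,N-1) & h_c(1,N-1) & \cdots & h_c(N-1,N-1)
\end{pmatrix}
\]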
How can a separable transform be written in matrix form?
Consider again equation (1.10) which expresses the separable linear transform of an image:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha) \sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \tag{1.35}
\]

Notice that factor $\sum_{y=0}^{N-1} f(x, y) h_r(y, \beta)$ actually represents the product of two N × N matrices, which must be another matrix of the same size. Let us define it as:

\[
s(x, \beta) \equiv \sum_{y=0}^{N-1} f(x, y)\, h_r(y, \beta) \tag{1.36}
\]

Then (1.35) can be written as:

\[
g(\alpha, \beta) = \sum_{x=0}^{N-1} h_c(x, \alpha)\, s(x, \beta) \tag{1.37}
\]

Thus in matrix form

\[
g = h_c^T s = h_c^T f h_r \tag{1.38}
\]

The separability assumption implies that our operator $\mathcal{O}$ (the point spread function of which, $h(x, \alpha, y, \beta)$, is separable) operates on the rows of the image matrix f independently from the way it operates on its columns. These independent operations are expressed by the two matrices $h_r$ and $h_c$ respectively. That is why we chose subscripts r and c to denote these matrices (r = rows, c = columns).
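The matrix form of the separable transform can be compared with the double sum of equation (1.10) on a 2 × 2 example. The sketch assumes that the matrices are stored so that `h_c[x][alpha]` holds $h_c(x, \alpha)$ and `h_r[y][beta]` holds $h_r(y, \beta)$, in which case (1.10) reads $g = h_c^T f h_r$. The numerical values are made up.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def double_sum(f, h_c, h_r):
    """g(alpha, beta) = sum_x h_c(x, alpha) * sum_y f(x, y) * h_r(y, beta)."""
    n = len(f)
    return [[sum(h_c[x][a] * f[x][y] * h_r[y][b]
                 for x in range(n) for y in range(n))
             for b in range(n)] for a in range(n)]


h_c = [[1, 2], [0, 1]]   # hypothetical column kernel, indexed [x][alpha]
h_r = [[1, 0], [3, 1]]   # hypothetical row kernel, indexed [y][beta]
f = [[1, 2], [3, 4]]

s = matmul(f, h_r)                 # inner sum over y
g = matmul(transpose(h_c), s)      # outer sum over x
print(g)
print(g == double_sum(f, h_c, h_r))
```

The intermediate matrix `s` is the result of processing each row of f with $h_r$, and multiplying by $h_c^T$ from the left then processes each column.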
B1.3: The formal derivation of the separable matrix equation

We can use equations (1.27) and (1.28) with (1.25) as follows: First express the output image g using (1.28) in terms of $\mathbf{g}$:

\[
g = \sum_{m=1}^{N} N_m^T \mathbf{g} V_m^T \tag{1.39}
\]

Then express $\mathbf{g}$ in terms of H and $\mathbf{f}$ from (1.25) and replace $\mathbf{f}$ in terms of f using (1.27):

\[
\mathbf{g} = H \sum_{n=1}^{N} N_n f V_n \tag{1.40}
\]

Substitute (1.40) into (1.39) and group factors with the help of brackets to get:

\[
g = \sum_{m=1}^{N} \sum_{n=1}^{N} \left(N_m^T H N_n\right) f \left(V_n V_m^T\right) \tag{1.41}
\]

H is an (N² × N²) matrix. We may think of it as partitioned in N × N submatrices stacked together. Then it can be shown that $N_m^T H N_n$ is the $H_{mn}$ such submatrix. Under the separability assumption, matrix H is the Kronecker product of matrices $h_r$ and $h_c$:

\[
H = h_r^T \otimes h_c^T \tag{1.42}
\]

Then partition $H_{mn}$ is essentially $h_r^T(m, n)\, h_c^T$. If we substitute this in (1.41) we obtain:

\[
g = \sum_{m=1}^{N} \sum_{n=1}^{N} h_r^T(m, n)\, h_c^T f \left(V_n V_m^T\right)
= h_c^T f \sum_{m=1}^{N} \sum_{n=1}^{N} h_r^T(m, n)\, V_n V_m^T \tag{1.43}
\]

The product $V_n V_m^T$ is the product between an (N × 1) matrix with the only non-zero element at position n, with a (1 × N) matrix, with the only non-zero element at position m. So, it is an N × N square matrix with the only non-zero element at position (n, m).

When multiplied by $h_r^T(m, n)$ it places the (m, n) element of the $h_r^T$ matrix in position (n, m) and sets to zero all other elements. The sum over all m's and n's is $h_r$. So from (1.43) we have:

\[
g = h_c^T f h_r
\]
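The two claims used in this box can be verified on a small case: that $N_m^T H N_n$ picks out partition $H_{mn}$, and that a Kronecker-structured H applied to the stacked image reproduces $h_c^T f h_r$. The sketch assumes the ordering $H = h_r^T \otimes h_c^T$, which is the one consistent with stacking the image column by column. The helper names and numerical values are illustrative only.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result equals A[i][j] * B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def Nmat(n, N):
    """N^2 x N stacking matrix: the n-th of N stacked blocks is the unit matrix."""
    return [[1 if (r // N == n - 1 and r % N == c) else 0 for c in range(N)]
            for r in range(N * N)]


N = 2
h_c = [[1, 2], [0, 1]]   # hypothetical, indexed [x][alpha]
h_r = [[1, 0], [3, 1]]   # hypothetical, indexed [y][beta]
f = [[1, 2], [3, 4]]

H = kron(transpose(h_r), transpose(h_c))

# Claim 1: N_m^T H N_n is partition H_mn, which here equals h_r^T(m, n) * h_c^T.
partitions_ok = True
for m in range(1, N + 1):
    for n in range(1, N + 1):
        part = matmul(matmul(transpose(Nmat(m, N)), H), Nmat(n, N))
        scale = transpose(h_r)[m - 1][n - 1]
        expected = [[scale * v for v in row] for row in transpose(h_c)]
        partitions_ok = partitions_ok and (part == expected)

# Claim 2: H times the column-stacked f equals the column-stacked h_c^T f h_r.
fvec = [[f[x][y]] for y in range(N) for x in range(N)]
gvec = [row[0] for row in matmul(H, fvec)]
g = matmul(matmul(transpose(h_c), f), h_r)
print(partitions_ok, gvec == [g[a][b] for b in range(N) for a in range(N)])
```

Swapping the two factors of the Kronecker product would correspond to stacking the image row by row instead, so the ordering and the stacking convention have to be chosen together.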
You are given a 9 × 9 matrix H which is partitioned into nine 3 × 3 submatrices. Show that $N_2^T H N_3$, where $N_2$ and $N_3$ are matrices of the stacking operator, is partition $H_{23}$ of matrix H.
Let us say that: