FIGURE 18.14
Tiling representations of several expansions for 1D signals: (a) STFT-like decomposition; (b) wavelet decomposition; (c) wavelet packet decomposition; and (d) "anti-wavelet" packet decomposition.
Figure 18.14(d) highlights a wavelet packet expansion where the time-frequency attributes are exactly the reverse of the wavelet case: the expansion has good frequency resolution at higher frequencies and good time localization at lower frequencies, so we might call this the "anti-wavelet" packet. There are a plethora of other options for the time-frequency resolution tradeoff, and these all correspond to admissible wavelet packet choices. The extra adaptivity of the wavelet packet framework is obtained at the price of added computation in searching for the best wavelet packet basis, so an efficient fast search algorithm is key in applications involving wavelet packets. The problem of searching for the best basis in the wavelet packet library for the compression problem, using an RD optimization framework and a fast tree-pruning algorithm, was described in [22].
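The tree-pruning search can be made concrete with a small sketch. The following is a minimal 1D illustration of the single-tree pruning idea, not the algorithm of [22] itself: the Haar split, the Lagrangian cost `rd_cost` (distortion plus lambda times a crude nonzero-coefficient rate proxy), and all names are illustrative assumptions; a real coder would substitute its actual rate and distortion measures.

```python
import numpy as np

def rd_cost(coeffs, lam, step=1.0):
    # Hypothetical Lagrangian cost J = D + lambda*R for a uniform
    # scalar quantizer: D is the squared quantization error, R is
    # crudely proxied by the number of nonzero quantized coefficients.
    q = np.round(coeffs / step)
    return np.sum((coeffs - q * step) ** 2) + lam * np.count_nonzero(q)

def haar_split(x):
    # One wavelet packet split into low/high halves (orthonormal Haar).
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def best_basis(x, lam, depth=4):
    # Single-tree pruning: keep this node as a leaf unless the total
    # cost of its two children (each themselves pruned) is lower.
    leaf_cost = rd_cost(x, lam)
    if depth == 0 or len(x) < 2:
        return leaf_cost, "leaf"
    lo, hi = haar_split(x)
    cost_lo, tree_lo = best_basis(lo, lam, depth - 1)
    cost_hi, tree_hi = best_basis(hi, lam, depth - 1)
    if cost_lo + cost_hi < leaf_cost:
        return cost_lo + cost_hi, ("split", tree_lo, tree_hi)
    return leaf_cost, "leaf"

x = np.sin(np.linspace(0.0, 8.0 * np.pi, 256)) + 0.1 * np.random.randn(256)
cost, tree = best_basis(x, lam=0.5)
print(cost, tree)
```

Because each node of the packet tree is visited exactly once, the search is a single fast pass over the tree rather than an enumeration of the enormous number of admissible bases.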
The 1D wavelet packet bases can be easily extended to 2D by writing a 2D basis function as the product of two 1D basis functions. In other words, we can treat the rows and columns of an image separately as 1D signals. The performance gains associated with wavelet packets are obviously image-dependent. For difficult images such as Barbara in Fig. 18.12, the wavelet packet decomposition shown in Fig. 18.15(a) gives much better coding performance than the wavelet decomposition. The wavelet packet decoded Barbara image at 0.1825 b/p is shown in Fig. 18.15(b); its visual quality (or PSNR) is the same as that of the wavelet SPIHT decoded Barbara image at 0.25 b/p in Fig. 18.12. The bit rate saving achieved by using a wavelet packet basis instead of the wavelet basis in this case is 27% at the same visual quality.
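As a quick check, the quoted saving follows directly from the two operating points:

$$\frac{0.25 - 0.1825}{0.25} = \frac{0.0675}{0.25} = 0.27,$$

i.e., a 27% bit rate reduction at the same visual quality.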
An important practical application of wavelet packet expansions is the FBI wavelet scalar quantization (WSQ) standard for fingerprint image compression [23]. Because of the complexity associated with adaptive wavelet packet transforms, the FBI WSQ standard uses a fixed wavelet packet decomposition in the transform stage. The transform structure specified by the FBI WSQ standard is shown in Fig. 18.16. It was designed for 500 dots per inch fingerprint images by spectral analysis and trial and error. A total of 64 subbands are generated with a five-level wavelet packet decomposition. Trials by the FBI have shown that the WSQ standard benefited from having fine frequency partitions in the middle frequency region containing the fingerprint ridge patterns.
FIGURE 18.15
(a) A wavelet packet decomposition for the Barbara image. White lines represent frequency boundaries. Highpass bands are processed for display; (b) wavelet packet decoded Barbara at 0.1825 b/p.
FIGURE 18.16
The wavelet packet transform structure given in the FBI WSQ specification. The number sequence shows the labeling of the different subbands.
FIGURE 18.17
Space-frequency segmentation and tiling for the Building image. The image to the left shows that spatial segmentation separates the sky in the background from the building and the pond in the foreground. The image to the right gives the best wavelet packet decomposition of each spatial segment. Dark lines represent spatial segments; white lines represent subband boundaries of wavelet packet decompositions. Note that the upper-left corners are the lowpass bands of wavelet packet decompositions.
As an extension of adaptive wavelet packet transforms, one can introduce time variation by segmenting the signal in time and allowing the wavelet packet bases to evolve with the signal. The result is a time-varying transform coding scheme that can adapt to signal nonstationarities. Computationally fast algorithms are again very important for finding the optimal signal expansions in such a time-varying system. For 2D images, the simplest of these algorithms performs adaptive frequency segmentations over regions of the image selected through a quadtree decomposition. More complicated algorithms provide combinations of frequency decomposition and spatial segmentation. These jointly adaptive algorithms work particularly well for highly nonstationary images. Figure 18.17 shows the space-frequency tree segmentation and tiling for the Building image [24]. The image to the left shows the spatial segmentation result that separates the sky in the background from the building and the pond in the foreground. The image to the right gives the best wavelet packet decomposition for each spatial segment.
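The joint adaptation can be sketched in the same hypothetical cost model as before: at each block, the coder compares the cost of coding the block with its best packet basis against the total cost of splitting the block into four quadrants and recursing, mirroring the quadtree decomposition described above. This is an illustration of the structure only, not the algorithm of [24] itself; the Haar split and Lagrangian cost are stand-in assumptions.

```python
import numpy as np

def rd_cost(c, lam, step=1.0):
    # Hypothetical Lagrangian cost J = D + lambda*R, with rate proxied
    # by the count of nonzero quantized coefficients.
    q = np.round(c / step)
    return np.sum((c - q * step) ** 2) + lam * np.count_nonzero(q)

def haar2(b):
    # One separable 2D Haar split into LL, LH, HL, HH subbands.
    lo = (b[:, 0::2] + b[:, 1::2]) / np.sqrt(2.0)
    hi = (b[:, 0::2] - b[:, 1::2]) / np.sqrt(2.0)
    def rows(x):
        return (x[0::2] + x[1::2]) / np.sqrt(2.0), (x[0::2] - x[1::2]) / np.sqrt(2.0)
    ll, lh = rows(lo)
    hl, hh = rows(hi)
    return [ll, lh, hl, hh]

def best_packet(b, lam, depth):
    # RD-pruned 2D wavelet packet basis cost for one block.
    leaf = rd_cost(b, lam)
    if depth == 0 or min(b.shape) < 2:
        return leaf
    return min(leaf, sum(best_packet(s, lam, depth - 1) for s in haar2(b)))

def best_space_frequency(b, lam, depth):
    # Jointly choose: code this block with its best packet basis, or
    # split it spatially (quadtree) and recurse on the four quadrants.
    freq = best_packet(b, lam, depth)
    if depth == 0 or min(b.shape) < 4:
        return freq
    h, w = b.shape[0] // 2, b.shape[1] // 2
    space = sum(best_space_frequency(q, lam, depth - 1)
                for q in (b[:h, :w], b[:h, w:], b[h:, :w], b[h:, w:]))
    return min(freq, space)

img = np.random.randn(64, 64)
print(best_space_frequency(img, lam=0.5, depth=3))
```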
18.7 JPEG2000 AND RELATED DEVELOPMENTS
JPEG2000 by default employs the dyadic wavelet transform for natural images in many standard applications. It also allows the choice of the more general wavelet packet transforms for certain types of imagery (e.g., fingerprints and radar images). Instead of using the zerotree-based SPIHT algorithm, JPEG2000 relies on embedded block coding with optimized truncation (EBCOT) [25] to provide a rich set of features such as quality scalability, resolution scalability, spatial random access, and region-of-interest coding. Besides robustness to image type changes in terms of compression performance, the main advantage of the block-based EBCOT algorithm is that it provides easier random access to local image components. On the other hand, both encoding and decoding in SPIHT require nonlocal memory access to the whole tree of wavelet coefficients, causing reduction in throughput when coding large-size images. A thorough description of the JPEG2000 standard is in [1]. Other JPEG2000 related references are Chapter 17 and [26, 27].
Although this chapter is about wavelet coding of 2D images, the wavelet coding framework and its extension to wavelet packets apply to 3D video as well. Recent research works (see [28] and references therein) on 3D scalable wavelet video coders based on the framework of motion-compensated temporal filtering (MCTF) [29] have shown performance competitive with or better than the best MC-DCT-based standard video coders (e.g., H.264/AVC [30]). They have stirred considerable excitement in the video coding community and stimulated research efforts toward subband/wavelet interframe video coding, especially in the area of scalable motion coding [31] within the context of MCTF. MCTF can be conceptually viewed as the extension of wavelet-based coding in JPEG2000 from 2D images to 3D video. It nicely combines the scalability features of wavelet-based coding with motion compensation, which has been proven to be very efficient and necessary in MC-DCT-based standard video coders. We refer the readers to a recent special issue [32] on the latest results and Chapter 11 in [9] for an exposition of 3D subband/wavelet video coding.
18.8 CONCLUSION
Since the introduction of wavelets as a signal processing tool in the late 1980s, a variety of wavelet-based coding algorithms have advanced the limits of compression performance well beyond that of the current commercial JPEG image coding standard. In this chapter, we have provided very simple high-level insights, based on the intuitive concept of time-frequency representations, into why wavelets are good for image coding. After introducing the salient aspects of the compression problem in general and the transform coding problem in particular, we have highlighted the key differences between the early class of subband coders and the more advanced class of modern-day wavelet image coders. Selecting the EZW coding structure embodied in the celebrated SPIHT algorithm as a representative of this latter class, we have detailed its operation by using a simple illustrative example. We have also described the role of wavelet packets as a simple but powerful generalization of the wavelet decomposition in order to offer a more robust and adaptive transform image coding framework.
JPEG2000 is the result of the rapid progress made in wavelet image coding research in the 1990s. The triumph of the wavelet transform in the evolution of the JPEG2000 standard underlines the importance of the fundamental insights provided in this chapter into why wavelets are so attractive for image compression.
REFERENCES
[1] D. Taubman and M. Marcellin. JPEG2000: Image Compression Fundamentals, Standards, and Practice. Kluwer, New York, 2001.
[2] G. Strang and T. Nguyen. Wavelets and Filter Banks. Wellesley-Cambridge Press, New York, 1996.
[3] M. Vetterli and J. Kovačević. Wavelets and Subband Coding. Prentice-Hall, Englewood Cliffs, NJ, 1995.
[12] M. W. Marcellin and T. R. Fischer. Trellis coded quantization of memoryless and Gauss-Markov sources. IEEE Trans. Commun., 38(1):82–93, 1990.
[13] T. Berger. Rate Distortion Theory. Prentice-Hall, Englewood Cliffs, NJ, 1971.
[14] N. Farvardin and J. W. Modestino. Optimum quantizer performance for a class of non-Gaussian memoryless sources. IEEE Trans. Inf. Theory, 30:485–497, 1984.
[15] D. A. Huffman. A method for the construction of minimum redundancy codes. Proc. IRE, 40:1098–1101, 1952.
[16] T. C. Bell, J. G. Cleary, and I. H. Witten. Text Compression. Prentice-Hall, Englewood Cliffs, NJ, 1990.
[17] J. W. Woods, editor. Subband Image Coding. Kluwer Academic, Boston, MA, 1991.
[18] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies. Image coding using wavelet transform. IEEE Trans. Image Process., 1(2):205–220, 1992.
[19] J. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Trans. Signal Process., 41(12):3445–3462, 1993.
[20] A. Said and W. A. Pearlman. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Trans. Circuits Syst. Video Technol., 6(3):243–250, 1996.
[21] R. R. Coifman and M. V. Wickerhauser. Entropy-based algorithms for best basis selection. IEEE Trans. Inf. Theory, 38(2):713–718, 1992.
[22] K. Ramchandran and M. Vetterli. Best wavelet packet bases in a rate-distortion sense. IEEE Trans. Image Process., 2(2):160–175, 1993.
[23] Criminal Justice Information Services. WSQ Gray-Scale Fingerprint Image Compression Specification (Ver. 2.0). Federal Bureau of Investigation, 1993.
[24] K. Ramchandran, Z. Xiong, K. Asai, and M. Vetterli. Adaptive transforms for image coding using spatially-varying wavelet packets. IEEE Trans. Image Process., 5:1197–1204, 1996.
[25] D. Taubman. High performance scalable image compression with EBCOT. IEEE Trans. Image Process., 9(7):1151–1170, 2000.
[26] Special Issue on JPEG2000. Signal Process. Image Commun., 17(1), 2002.
[27] D. Taubman and M. Marcellin. JPEG2000: standard for interactive imaging. Proc. IEEE, 90(8):1336–1357, 2002.
[28] J. Ohm, M. van der Schaar, and J. Woods. Interframe wavelet coding – motion picture representation for universal scalability. Signal Process. Image Commun., 19(9):877–908, 2004.
[29] S.-T. Hsiang and J. Woods. Embedded video coding using invertible motion compensated 3D subband/wavelet filter bank. Signal Process. Image Commun., 16(8):705–724, 2001.
[30] T. Wiegand, G. Sullivan, G. Bjøntegaard, and A. Luthra. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol., 13:560–576, 2003.
[31] A. Secker and D. Taubman. Highly scalable video compression with scalable motion coding. IEEE Trans. Image Process., 13(8):1029–1041, 2004.
[32] Special issue on subband/wavelet interframe video coding. Signal Process. Image Commun., 19, 2004.
19
Gradient and Laplacian Edge Detection
Phillip A. Mlsna¹ and Jeffrey J. Rodríguez²
¹Northern Arizona University; ²University of Arizona
19.1 INTRODUCTION
One of the most fundamental image analysis operations is edge detection. Edges are often vital clues toward the analysis and interpretation of image information, both in biological vision and in computer image analysis. Some sort of edge detection capability is present in the visual systems of a wide variety of creatures, so it is obviously useful in their abilities to perceive their surroundings.
For this discussion, it is important to define what is and is not meant by the term "edge." The everyday notion of an edge is usually a physical one, caused by either the shapes of physical objects in three dimensions or by their inherent material properties. Described in geometric terms, there are two types of physical edges: (1) the set of points along which there is an abrupt change in local orientation of a physical surface, and (2) the set of points describing the boundary between two or more materially distinct regions of a physical surface. Most of our perceptual senses, including vision, operate at a distance and gather information using receptors that work in, at most, two dimensions. Only the sense of touch, which requires direct contact to stimulate the skin's pressure sensors, is capable of direct perception of objects in three-dimensional (3D) space. However, some physical edges of the second type may not be perceptible by touch because material differences (for instance, different colors of paint) do not always produce distinct tactile sensations. Everyone first develops a working understanding of physical edges in early childhood by touching and handling every object within reach.
The imaging process inherently performs a projection from a 3D scene to a two-dimensional (2D) representation of that scene, according to the viewpoint of the imaging device. Because of this projection process, edges in images have a somewhat different meaning than physical edges. Although the precise definition depends on the application context, an edge can generally be defined as a boundary or contour that separates adjacent image regions having relatively distinct characteristics according to some feature of interest. Most often this feature is gray level or luminance, but others, such as reflectance, color, or texture, are sometimes used. In the most common situation where luminance is of primary interest, edge pixels are those at the locations of abrupt gray level change. To eliminate single-point impulses from consideration as edge pixels, one usually requires that edges be sustained along a contour; i.e., an edge point must be part of an edge structure having some minimum extent appropriate for the scale of interest. Edge detection is the process of determining which pixels are the edge pixels. The result of the edge detection process is typically an edge map, a new image that describes each original pixel's edge classification and perhaps additional edge attributes, such as magnitude and orientation.
There is usually a strong correspondence between the physical edges of a set of objects and the edges in images containing views of those objects. Infants and young children learn this as they develop hand-eye coordination, gradually associating visual patterns with touch sensations as they feel and handle items in their vicinity. There are many situations, however, in which edges in an image do not correspond to physical edges. Illumination differences are usually responsible for this effect, for example, the boundary of a shadow cast across an otherwise uniform surface.
Conversely, physical edges do not always give rise to edges in images. This, too, can be caused by certain cases of lighting and surface properties. Consider what happens when one wishes to photograph a scene rich with physical edges, for example, a craggy mountain face consisting of a single type of rock. When this scene is imaged while the sun is directly behind the camera, no shadows are visible in the scene and hence shadow-dependent edges are nonexistent in the photo. The only edges in such a photo are produced by the differences in material reflectance, texture, or color. Since our rocky subject material has little variation of these types, the result is a rather dull photograph because of the lack of apparent depth caused by the missing edges. Thus images can exhibit edges having no physical counterpart, and they can also miss capturing edges that do. Although edge information can be very useful in the initial stages of such image processing and analysis tasks as segmentation, registration, and object recognition, edges are not completely reliable for these purposes.
If one defines an edge as an abrupt gray level change, then the derivative, or gradient, is a natural basis for an edge detector. Figure 19.1 illustrates the idea with a continuous, one-dimensional (1D) example of a bright central region against a dark background. The left-hand portion of the gray level function $f_c(x)$ shows a smooth transition from dark to bright as $x$ increases. There must be a point $x_0$ that marks the transition from the low-amplitude region on the left to the adjacent high-amplitude region in the center. The gradient approach to detecting this edge is to locate $x_0$ where $|f_c'(x)|$ reaches a local maximum or, equivalently, where $f_c'(x)$ reaches a local extremum, as shown in the second plot of Fig. 19.1. The second derivative, or Laplacian, approach locates $x_0$ where a zero-crossing of $f_c''(x)$ occurs, as in the third plot of Fig. 19.1. The right-hand side of Fig. 19.1 illustrates the case for a falling edge located at $x_1$.
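These two criteria are easy to verify numerically. Below is a minimal numpy sketch that locates a smooth rising edge both ways; the sigmoid gray level profile and all names are illustrative assumptions, not part of the chapter.

```python
import numpy as np

# A smooth rising edge: dark-to-bright transition centered at x = 0.
x = np.linspace(-5.0, 5.0, 1001)
f = 1.0 / (1.0 + np.exp(-3.0 * x))      # sigmoid gray level profile

d1 = np.gradient(f, x)                   # first derivative  f'(x)
d2 = np.gradient(d1, x)                  # second derivative f''(x)

# Gradient criterion: local maximum of |f'(x)|.
edge_grad = x[np.argmax(np.abs(d1))]

# Laplacian criterion: zero-crossing of f''(x); among any numerical
# crossings, keep the one with the strongest gradient magnitude.
crossings = np.where(np.diff(np.sign(d2)) != 0)[0]
edge_lap = x[crossings[np.argmax(np.abs(d1[crossings]))]]

print(edge_grad, edge_lap)               # both near 0, the true edge
```

Both criteria recover the same location on this clean signal; they differ mainly in their behavior under noise, which the following paragraphs take up.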
FIGURE 19.1
Edge detection in the 1D continuous case; changes in $f_c(x)$ indicate edges, and $x_0$ and $x_1$ are the edge locations found by local extrema of $f_c'(x)$ or by zero-crossings of $f_c''(x)$.

To use the gradient or the Laplacian approaches as the basis for practical image edge detectors, one must extend the process to two dimensions, adapt to the discrete case, and somehow deal with the difficulties presented by real images. Relative to the 1D edges
shown in Fig. 19.1, edges in 2D images have the additional quality of direction. One usually wishes to find edges regardless of direction, but a directionally sensitive edge detector can be useful at times. Also, the discrete nature of digital images requires the use of an approximation to the derivative. Finally, there are a number of problems that can confound the edge detection process in real images. These include noise, crosstalk or interference between nearby edges, and inaccuracies resulting from the use of a discrete grid. False edges, missing edges, and errors in edge location and orientation are often the result.
Because the derivative operator acts as a highpass filter, edge detectors based on it are sensitive to noise. It is easy for noise inherent in an image to corrupt the real edges by shifting their apparent locations and by adding many false edge pixels. Unless care is taken, seemingly moderate amounts of noise are capable of overwhelming the edge detection process, rendering the results virtually useless. The wide variety of edge detection algorithms developed over the past three decades exists, in large part, because of the many ways proposed for dealing with noise and its effects. Most algorithms employ noise-suppression filtering of some kind before applying the edge detector itself. Some decompose the image into a set of lowpass or bandpass versions, apply the edge detector to each, and merge the results. Still others use adaptive methods, modifying the edge detector's parameters and behavior according to the noise characteristics of the image data. Some recent work by Mathieu et al. [20] on fractional derivative operators shows some promise for enriching the gradient and Laplacian possibilities for edge detection. Fractional derivatives may allow better control of noise sensitivity, edge localization, and error rate under various conditions.
An important tradeoff exists between correct detection of the actual edges and precise location of their positions. Edge detection errors can occur in two forms: false positives, in which nonedge pixels are misclassified as edge pixels, and false negatives, which are the reverse. Detection errors of both types tend to increase with noise, making good noise suppression very important in achieving a high detection accuracy. In general, the potential for noise suppression improves with the spatial extent of the edge detection filter. Hence, the goal of maximum detection accuracy calls for a large-sized filter. Errors in edge localization also increase with noise. To achieve good localization, however, the filter should generally be of small spatial extent. The goals of detection accuracy and location accuracy are thus put into direct conflict, creating a kind of uncertainty principle for edge detection [28].
In this chapter, we cover the basics of gradient and Laplacian edge detection methods in some detail. Following each, we also describe several of the more important and useful edge detection algorithms based on that approach. While the primary focus is on gray level edge detectors, some discussion of edge detection in color and multispectral images is included.
19.2 GRADIENT-BASED METHODS
19.2.1 Continuous Gradient
The core of gradient edge detection is, of course, the gradient operator, $\nabla$. In continuous form, applied to a continuous-space image, $f_c(x,y)$, the gradient is defined as

$$\nabla f_c(x,y) = \frac{\partial f_c(x,y)}{\partial x}\,\mathbf{i}_x + \frac{\partial f_c(x,y)}{\partial y}\,\mathbf{i}_y, \qquad (19.1)$$

where $\mathbf{i}_x$ and $\mathbf{i}_y$ are the unit vectors in the $x$ and $y$ directions. Notice that the gradient is a vector, having both magnitude and direction. Its magnitude, $|\nabla f_c(x_0,y_0)|$, measures the maximum rate of change in the intensity at the location $(x_0,y_0)$. Its direction is that of the greatest increase in intensity; i.e., it points "uphill."
To produce an edge detector, one may simply extend the 1D case described earlier. Consider the effect of finding the local extrema of $\nabla f_c(x,y)$ or the local maxima of the gradient magnitude,

$$|\nabla f_c(x,y)| = \sqrt{\left(\frac{\partial f_c(x,y)}{\partial x}\right)^2 + \left(\frac{\partial f_c(x,y)}{\partial y}\right)^2}. \qquad (19.2)$$

Searching for local maxima over a full 2D neighborhood, however, tends to retain only the locally strongest of the edge contour points. To fully construct edge contours, it is better to apply Eq. (19.2) to a 1D local neighborhood, namely a line segment, whose direction is chosen to cross the edge. The situation is then similar to that of Fig. 19.1, where the point of locally maximum gradient magnitude is the edge point. Now the issue becomes how to select the best direction for the line segment used for the search.
The most commonly used method of producing edge segments or contours from Eq. (19.2) consists of two stages: thresholding and thinning. In the thresholding stage, the gradient magnitude at every point is compared with a predefined threshold value, $T$. All points satisfying the following criterion are classified as candidate edge points:

$$|\nabla f_c(x,y)| \geq T. \qquad (19.3)$$

The set of candidate edge points tends to form strips, which have positive width. Since the desire is usually for zero-width boundary segments or contours to describe the edges, a subsequent processing stage is needed to thin the strips to the final edge contours.
Edge contours derived from continuous-space images should have zero width because any local maxima of $|\nabla f_c(x,y)|$, along a line segment that crosses the edge, cannot be adjacent points. For the case of discrete-space images, the nonzero pixel size imposes a minimum practical edge width.
Edge thinning can be accomplished in a number of ways, depending on the application, but thinning by nonmaximum suppression is usually the best choice. Generally speaking, we wish to suppress any point that is not, in a 1D sense, a local maximum in gradient magnitude. Since a 1D local neighborhood search typically produces a single maximum, those points that are local maxima will form edge segments only one point wide. One approach classifies an edge-strip point as an edge point if its gradient magnitude is a local maximum in at least one direction. However, this thinning method sometimes has the side effect of creating false edges near strong edge lines [17]. It is also somewhat inefficient because of the computation required to check along a number of different directions. A better, more efficient thinning approach checks only a single direction, the gradient direction, to test whether a given point is a local maximum in gradient magnitude. The points that pass this scrutiny are classified as edge points. Looking in the gradient direction essentially searches perpendicular to the edge itself, producing a scenario similar to the 1D case shown in Fig. 19.1. The method is efficient because it is not necessary to search in multiple directions. It also tends to produce edge segments having good localization accuracy. These characteristics make the gradient-direction, local-extremum method quite popular. The following steps summarize its implementation.
1. Using one of the techniques described in the next section, compute $\nabla f$ for all pixels.
2. Determine candidate edge pixels by thresholding all pixels' gradient magnitudes by $T$.
3. Thin by suppressing all candidate edge pixels whose gradient magnitude is not a local maximum along the gradient direction. Those that survive nonmaximum suppression are classified as edge pixels.
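The three steps above map directly to a few lines of code. The following is a minimal numpy/scipy sketch, not the chapter's exact procedure: it uses the Sobel operator named in Fig. 19.3, and it quantizes the gradient direction to four directions (0°, 45°, 90°, 135°) as a common simplification of checking along the exact gradient direction. The threshold value and all function names are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_edges(img, T):
    # Step 1: gradient via the Sobel operators.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gx = convolve(img, kx)        # horizontal difference
    gy = convolve(img, kx.T)      # vertical difference
    mag = np.hypot(gx, gy)
    # Step 2: threshold to get candidate edge pixels (Eq. 19.3).
    cand = mag >= T
    # Step 3: nonmaximum suppression along the (quantized) gradient
    # direction; border pixels are skipped for simplicity.
    ang = (np.rad2deg(np.arctan2(gy, gx)) + 180.0) % 180.0
    offs = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    edges = np.zeros_like(cand)
    h, w = img.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if not cand[i, j]:
                continue
            # Nearest of the four directions, with angular wraparound.
            d = min(offs, key=lambda a: min(abs(ang[i, j] - a),
                                            180.0 - abs(ang[i, j] - a)))
            di, dj = offs[d]
            if (mag[i, j] >= mag[i + di, j + dj] and
                    mag[i, j] >= mag[i - di, j - dj]):
                edges[i, j] = True
    return edges

# Example call: edges = sobel_edges(image.astype(float), T=100.0)
```

Thresholding before suppression follows the step order above; as the next paragraph notes, the two stages can be interchanged with different cost and predictability tradeoffs.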
The order of the thinning and thresholding steps might be interchanged. If thresholding is accomplished first, the computational cost of thinning can be significantly reduced. However, it can become difficult to predict the number of edge pixels that will be produced by a given threshold value. By thinning first, there tends to be somewhat better predictability of the richness of the resulting edge map as a function of the applied threshold.

Consider the effect of performing the thresholding and thinning operations in isolation. If thresholding alone were done, the edges would show as strips or patches instead of thin segments. If thinning were done without thresholding, that is, if edge points were simply those having locally maximum gradient magnitude, many false edge points would likely result because of noise. Noise tends to create false edge points because some points in edge-free areas happen to have locally maximum gradient magnitudes. The thresholding step of Eq. (19.3) is often useful to reduce noise either prior to or following thinning. A variety of adaptive methods have been developed that adjust the threshold according to certain image characteristics, such as an estimate of local signal-to-noise ratio. Adaptive thresholding can often do a better job of noise suppression while reducing the amount of edge fragmentation.
FIGURE 19.3
Gradient edge detection steps, using the Sobel operator: (a) after thresholding $|\nabla f|$; (b) after thinning (a) by finding the local maximum of $|\nabla f|$ along the gradient direction.
As $T$ increases, edge contours become more broken and fragmented. By decreasing $T$, one can obtain more connected and richer edge contours, but the greater noise sensitivity is likely to produce more false edges. If only thresholding is used, as in Eq. (19.3) and Fig. 19.3(a), the edge strips tend to narrow as $T$ increases and widen as it decreases. Figure 19.4 compares edge maps obtained from several different threshold values.
Sometimes a directional edge detector is useful. One can be obtained by decomposing the gradient into horizontal and vertical components and applying them separately. Expressed in the continuous domain, the operators become

$$\left|\frac{\partial f_c(x,y)}{\partial x}\right| \geq T \quad \text{for edges in the } x \text{ direction},$$

$$\left|\frac{\partial f_c(x,y)}{\partial y}\right| \geq T \quad \text{for edges in the } y \text{ direction}.$$

An example of directional edge detection is illustrated in Fig. 19.5.
A directional edge detector can be constructed for any desired direction by using the directional derivative along a unit vector $\mathbf{n}$,

$$\frac{\partial f_c}{\partial \mathbf{n}} = \nabla f_c(x,y) \cdot \mathbf{n} = \frac{\partial f_c}{\partial x}\cos\theta + \frac{\partial f_c}{\partial y}\sin\theta, \qquad (19.4)$$

where $\theta$ is the angle of $\mathbf{n}$ relative to the positive $x$ axis. The directional derivative is most sensitive to edges perpendicular to $\mathbf{n}$.
FIGURE 19.5
Directional edge detection comparison, using the Sobel operator: (a) results of the horizontal difference operator; (b) results of the vertical difference operator.

The continuous-space gradient magnitude produces an isotropic or rotationally symmetric edge detector, equally sensitive to edges in any direction [17]. It is easy to show why $|\nabla f|$ is isotropic. In addition to the original $X$-$Y$ coordinate system, let us introduce a new system, $X'$-$Y'$, which is rotated by an angle of $\theta$ relative to $X$-$Y$. Let $\mathbf{n}_{x'}$ and $\mathbf{n}_{y'}$ be
the unit vectors in the $x'$ and $y'$ directions, respectively. For the gradient magnitude to be isotropic, the same result must be produced in both coordinate systems, regardless of $\theta$. Using Eq. (19.4) along with abbreviated notation, the partial derivatives with respect to the new coordinate axes are

$$f_{x'} = \nabla f \cdot \mathbf{n}_{x'} = f_x\cos\theta + f_y\sin\theta,$$

$$f_{y'} = \nabla f \cdot \mathbf{n}_{y'} = -f_x\sin\theta + f_y\cos\theta.$$

From this point, it is a simple matter of plugging into Eq. (19.2) to show the gradient magnitudes are identical in both coordinate systems, regardless of the rotation angle, $\theta$.
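Writing out this plugging-in step makes the cancellation explicit:

$$f_{x'}^2 + f_{y'}^2 = (f_x\cos\theta + f_y\sin\theta)^2 + (-f_x\sin\theta + f_y\cos\theta)^2 = f_x^2 + f_y^2,$$

since the cross terms $2f_xf_y\sin\theta\cos\theta$ appear with opposite signs and $\cos^2\theta + \sin^2\theta = 1$. Substituting into Eq. (19.2) therefore yields the same gradient magnitude in both coordinate systems.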
Occasionally, one may wish to reduce the computation load of Eq. (19.2) by approximating the square root with a computationally simpler function. Three possibilities are given by Eqs. (19.5), (19.6), and (19.7); such approximations alter the behavior of the gradient somewhat. For instance, the approximated gradient magnitudes of Eqs. (19.5), (19.6), and (19.7) are not isotropic and produce their greatest errors for purely diagonally oriented edges.