The Essential Guide to Image Processing



FIGURE 20.11
(a) In tracking a white blood cell, the GVF vector diffusion fails to attract the active contour; (b) successful detection is yielded by MGVF.

Thus (20.48) provides an external force that can guide an active contour to a moving object boundary. The capture range of GVF is increased using the motion gradient vector flow (MGVF) vector diffusion [51]. With MGVF, a tracking algorithm can simply use the final position of the active contour from a previous video frame as the initial contour in the subsequent frame. For an example of tracking using MGVF, see Fig. 20.11.
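The tracking loop this describes is simple enough to sketch. The outline below is schematic only: `mgvf_field` and `evolve_contour` are hypothetical stand-ins for an MGVF vector-diffusion solver (cf. (20.48) and [51]) and a standard active-contour evolution routine; neither is specified in this text.

```python
import numpy as np

def track_object(frames, init_contour, mgvf_field, evolve_contour):
    """Frame-to-frame tracking: the converged contour from frame t
    initializes the snake in frame t+1, as described for MGVF tracking.

    `mgvf_field` and `evolve_contour` are hypothetical stand-ins for an
    MGVF vector-diffusion solver and a snake evolution routine."""
    contour = np.asarray(init_contour, dtype=np.float64)
    results = []
    for frame in frames:
        force = mgvf_field(frame)                 # external force field
        contour = evolve_contour(contour, force)  # converge the active contour
        results.append(contour.copy())            # final position seeds the next frame
    return results
```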

Anisotropic diffusion is an effective precursor to edge detection. The main benefit of anisotropic diffusion over isotropic diffusion and linear filtering is edge preservation. By properly specifying the diffusion PDE and the diffusion coefficient, an image can be scaled, denoised, and simplified for boundary detection. For edge detection, the most critical design step is specification of the diffusion coefficient. The variants of the diffusion coefficient involve tradeoffs between sensitivity to noise, the ability to specify scale, convergence issues, and computational cost. The diverse implementations of the anisotropic diffusion PDE result in improved fidelity to the original image, mean curvature motion, and convergence to LOMO signals. As the diffusion PDE may be considered a descent on an energy surface, the diffusion operation can be viewed in a variational framework. Recent variational solutions produce optimized edge maps and image segmentations in which certain edge-based features, such as edge length, curvature, thickness, and connectivity, can be optimized.

The computational cost of anisotropic diffusion may be reduced by using multiresolution solutions, including the anisotropic diffusion pyramid and multigrid anisotropic diffusion. Application of edge detection to multispectral imagery and to radar/ultrasound imagery is possible through techniques presented in the literature. In general, the edge detection step after anisotropic diffusion of the image is straightforward. Edges may be detected using a simple gradient magnitude threshold, using robust statistics, or using a feature extraction technique. Active contours, used in conjunction with vector diffusion, can be employed to extract meaningful object boundaries.
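As a concrete illustration of the diffusion-then-detect pipeline, here is a minimal sketch of Perona-Malik anisotropic diffusion [6] followed by a gradient-magnitude threshold. The exponential diffusion coefficient, the parameter values, and the periodic border handling are illustrative choices, and the input is assumed to be a grayscale array scaled to [0, 1].

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=0.1, lam=0.25):
    """Minimal Perona-Malik sketch on a grayscale image in [0, 1].
    Uses the exponential coefficient g(s) = exp(-(s/kappa)^2); lam <= 0.25
    keeps the explicit 4-neighbor scheme stable. Borders are handled
    periodically (via np.roll) purely for brevity."""
    u = np.asarray(img, dtype=np.float64)
    for _ in range(n_iter):
        diffs = [np.roll(u, 1, axis=0) - u,    # north neighbor difference
                 np.roll(u, -1, axis=0) - u,   # south
                 np.roll(u, -1, axis=1) - u,   # east
                 np.roll(u, 1, axis=1) - u]    # west
        u = u + lam * sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
    return u

def edge_map(u, tau):
    """Post-diffusion edge detection by a simple gradient-magnitude threshold."""
    gy, gx = np.gradient(u)
    return np.hypot(gx, gy) > tau
```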

REFERENCES

[1] D. G. Lowe. Perceptual Organization and Visual Recognition. Kluwer Academic, New York, 1985.

[2] V. Caselles, J.-M. Morel, G. Sapiro, and A. Tannenbaum. Introduction to the special issue on partial differential equations and geometry-driven diffusion in image processing and analysis. IEEE Trans. Image Process., 7:269–273, 1998.

[3] A. P. Witkin. Scale-space filtering. In Proc. Int. Joint Conf. Artif. Intell., 1019–1021, 1983.

[4] J. J. Koenderink. The structure of images. Biol. Cybern., 50:363–370, 1984.

[5] D. Marr and E. Hildreth. Theory of edge detection. Proc. R. Soc. Lond. B, Biol. Sci., 207:187–217, 1980.

[6] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-12:629–639, 1990.

[7] S. Teboul, L. Blanc-Feraud, G. Aubert, and M. Barlaud. Variational approach for edge-preserving regularization using coupled PDE's. IEEE Trans. Image Process., 7:387–397, 1998.

[8] R. T. Whitaker and S. M. Pizer. A multi-scale approach to nonuniform diffusion. Comput. Vis. Graph. Image Process.: Image Underst., 57:99–110, 1993.

[9] Y.-L. You, M. Kaveh, W. Xu, and A. Tannenbaum. Analysis and design of anisotropic diffusion for image processing. In Proc. IEEE Int. Conf. Image Process., Austin, TX, November 13–16, 1994.

[10] Y.-L. You, W. Xu, A. Tannenbaum, and M. Kaveh. Behavioral analysis of anisotropic diffusion in image processing. IEEE Trans. Image Process., 5:1539–1553, 1996.

[11] F. Catte, P.-L. Lions, J.-M. Morel, and T. Coll. Image selective smoothing and edge detection by nonlinear diffusion. SIAM J. Numer. Anal., 29:182–193, 1992.

[12] L. Alvarez, P.-L. Lions, and J.-M. Morel. Image selective smoothing and edge detection by nonlinear diffusion II. SIAM J. Numer. Anal., 29:845–866, 1992.

[13] C. A. Segall and S. T. Acton. Morphological anisotropic diffusion. In Proc. IEEE Int. Conf. Image Process., Santa Barbara, CA, October 26–29, 1997.

[14] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259–268, 1992.

[18] K. N. Nordstrom. Biased anisotropic diffusion: a unified approach to edge detection. Tech. Report, Dept. of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA, 1989.

[19] J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-8:679–698, 1986.


[20] A. El-Fallah and G. Ford. The evolution of mean curvature in image filtering. In Proc. IEEE Int. Conf. Image Process., Austin, TX, November 1994.

[21] S. Osher and J. Sethian. Fronts propagating with curvature dependent speed: algorithms based on the Hamilton-Jacobi formulation. J. Comput. Phys., 79:12–49, 1988.

[22] N. Sochen, R. Kimmel, and R. Malladi. A general framework for low level vision. IEEE Trans. Image Process., 7:310–318, 1998.

[25] D. Mumford and J. Shah. Boundary detection by minimizing functionals. In Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., San Francisco, CA, 1985.

[26] S. T. Acton and A. C. Bovik. Anisotropic edge detection using mean field annealing. In Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP-92), San Francisco, CA, March 23–26, 1992.

[27] D. Geman and G. Reynolds. Constrained restoration and the recovery of discontinuities. IEEE Trans. Pattern Anal. Mach. Intell., 14:376–383, 1992.

[28] P. J. Burt, T. Hong, and A. Rosenfeld. Segmentation and estimation of region properties through cooperative hierarchical computation. IEEE Trans. Syst. Man Cybern., 11(12), 1981.

[29] P. J. Burt. Smart sensing within a pyramid vision machine. Proc. IEEE, 76(8):1006–1015, 1988.

[30] S. T. Acton. A pyramidal edge detector based on anisotropic diffusion. In Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP-96), Atlanta, GA, May 7–10, 1996.

[31] S. T. Acton, A. C. Bovik, and M. M. Crawford. Anisotropic diffusion pyramids for image segmentation. In Proc. IEEE Int. Conf. Image Process., Austin, TX, November 1994.

[32] A. Morales, R. Acharya, and S. Ko. Morphological pyramids with alternating sequential filters. IEEE Trans. Image Process., 4(7):965–977, 1995.

[33] C. A. Segall, S. T. Acton, and A. K. Katsaggelos. Sampling conditions for anisotropic diffusion. In Proc. SPIE Symp. Vis. Commun. Image Process., San Jose, CA, January 23–29, 1999.

[34] R. M. Haralick, X. Zhuang, C. Lin, and J. S. J. Lee. The digital morphological sampling theorem. IEEE Trans. Acoust. Speech Signal Process., 37(12):2067–2090, 1989.

[35] S. T. Acton. Multigrid anisotropic diffusion. IEEE Trans. Image Process., 7:280–291, 1998.

[36] J. H. Bramble. Multigrid Methods. John Wiley, New York, 1993.

[37] W. Hackbush and U. Trottenberg, editors. Multigrid Methods. Springer-Verlag, New York, 1982.

[38] R. T. Whitaker and G. Gerig. Vector-valued diffusion. In B. ter Haar Romeny, editor, Geometry-Driven Diffusion in Computer Vision, 93–134. Kluwer, 1994.

[39] S. T. Acton and J. Landis. Multispectral anisotropic diffusion. Int. J. Remote Sens., 18:2877–2886, 1997.

[40] G. Sapiro and D. L. Ringach. Anisotropic diffusion of multivalued images with applications to color filtering. IEEE Trans. Image Process., 5:1582–1586, 1996.

[41] S. DiZenzo. A note on the gradient of a multi-image. Comput. Vis. Graph. Image Process., 33:116–125, 1986.

[42] Y. Yu and S. T. Acton. Speckle reducing anisotropic diffusion. IEEE Trans. Image Process., 11:1260–1270, 2002.

[43] Y. Yu and S. T. Acton. Edge detection in ultrasound imagery using the instantaneous coefficient of variation. IEEE Trans. Image Process., 13(12):1640–1655, 2004.

[44] P. J. Rousseeuw and A. M. Leroy. Robust Regression and Outlier Detection. Wiley, New York, 1987.

[45] W. K. Pratt. Digital Image Processing. Wiley, New York, 495–501, 1978.

[46] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: active contour models. Int. J. Comput. Vis., 1:321–331, 1988.


Image Quality Assessment

Kalpana Seshadrinathan¹, Thrasyvoulos N. Pappas², Robert J. Safranek³, Junqing Chen⁴, Zhou Wang⁵, Hamid R. Sheikh⁶, and Alan C. Bovik⁷

¹The University of Texas at Austin; ²Northwestern University; ³Benevue, Inc.; ⁴Northwestern University; ⁵University of Waterloo; ⁶Texas Instruments, Inc.; ⁷The University of Texas at Austin

Recent advances in digital imaging technology, computational speed, storage capacity, and networking have resulted in the proliferation of digital images, both still and video. As the digital images are captured, stored, transmitted, and displayed in different devices, there is a need to maintain image quality. The end users of these images, in an overwhelmingly large number of applications, are human observers. In this chapter, we examine objective criteria for the evaluation of image quality as perceived by an average human observer. Even though we use the term image quality, we are primarily interested in image fidelity, i.e., how close an image is to a given original or reference image. This paradigm of image quality assessment (QA) is also known as full reference image QA. The development of objective metrics for evaluating image quality without a reference image is quite different and is outside the scope of this chapter.

Image QA plays a fundamental role in the design and evaluation of imaging and image processing systems. As an example, QA algorithms can be used to systematically evaluate the performance of different image compression algorithms that attempt to minimize the number of bits required to store an image while maintaining sufficiently high image quality. Similarly, QA algorithms can be used to evaluate image acquisition and display systems. Communication networks have developed tremendously over the past decade, and images and video are frequently transported over optic fiber, packet-switched networks like the Internet, wireless systems, etc. Bandwidth efficiency of applications such as video conferencing and Video on Demand can be improved using QA systems to evaluate the effects of channel errors on the transported images and video. Further, QA algorithms can be used in "perceptually optimal" design of various components of an image communication system. Finally, QA and the psychophysics of human vision are closely related disciplines. Research on image and video QA may lend deep insights into the functioning of the human visual system (HVS), which would be of great scientific value.

Subjective evaluations are accepted to be the most effective and reliable, albeit quite cumbersome and expensive, way to assess image quality. A significant effort has been dedicated to the development of subjective tests for image quality [56, 57]. There has also been standards activity on subjective evaluation of image quality [58]. The study of subjective evaluation of image quality is beyond the scope of this chapter.

The goal of an objective perceptual metric for image quality is to determine the differences between two images that are visible to the HVS. Usually one of the images is the reference, which is considered to be "original," "perfect," or "uncorrupted." The second image has been modified or distorted in some sense. The output of the QA algorithm is often a number that represents the probability that a human eye can detect a difference in the two images, or a number that quantifies the perceptual dissimilarity between the two images. Alternatively, the output of an image quality metric could be a map of detection probabilities or perceptual dissimilarity values.

Perhaps the earliest image quality metrics were the mean squared error (MSE) and peak signal-to-noise ratio (PSNR) between the reference and distorted images. These metrics are still widely used for performance evaluation, despite their well-known limitations, due to their simplicity. Let f(n) and g(n) represent the value (intensity) of an image pixel at location n. Usually the image pixels are arranged in a Cartesian grid and n = (n₁, n₂). The MSE between f(n) and g(n) is defined as

$$ \mathrm{MSE} = \frac{1}{N} \sum_{\mathbf{n}} \big[ f(\mathbf{n}) - g(\mathbf{n}) \big]^2, $$

where N is the total number of pixel locations in f(n) or g(n). The PSNR between these image patches is defined as

$$ \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}, $$

for 8-bit images whose peak value is 255. Consider Fig. 21.1, in which two distorted images were generated from the original image by adding a constant to every pixel: in Fig. 21.1(b) the constant is always positive, while in Fig. 21.1(c) the sign of the same constant is randomly chosen to be positive or negative. It can be easily shown that the MSE/PSNR between the original image and both of the distorted images are exactly the same. However, the visual quality of the two distorted images is drastically different. Another example is shown in Fig. 21.2, where Fig. 21.2(b) was generated by adding independent white Gaussian noise to the original texture image in Fig. 21.2(a). In Fig. 21.2(c), the signal sample values remained the same as in Fig. 21.2(a), but the spatial ordering of the samples has been changed (through a sorting procedure). Figure 21.2(d) was obtained from Fig. 21.2(b) by following the same reordering procedure used to create Fig. 21.2(c). Again, the MSE/PSNR between Figs. 21.2(a) and 21.2(b) and between Figs. 21.2(c) and 21.2(d) is exactly the same. However, Fig. 21.2(d) appears to be significantly noisier than Fig. 21.2(b).

FIGURE 21.1
Failure of the Minkowski metric for image quality prediction. (a) original image; (b) distorted image by adding a positive constant; (c) distorted image by adding the same constant, but with random sign. Images (b) and (c) have the same Minkowski metric with respect to image (a), but drastically different visual quality.
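The constant-addition failure is easy to reproduce numerically. In the sketch below, the 64×64 random array and the constant 16 are arbitrary stand-ins for the images of Fig. 21.1; both distortions yield an identical PSNR (about 24 dB) despite drastically different visual quality.

```python
import numpy as np

def mse(f, g):
    return np.mean((np.asarray(f, float) - np.asarray(g, float)) ** 2)

def psnr(f, g, peak=255.0):
    return 10.0 * np.log10(peak ** 2 / mse(f, g))

rng = np.random.default_rng(0)
f = rng.integers(32, 224, size=(64, 64)).astype(np.float64)

g1 = f + 16.0                                          # constant, as in Fig. 21.1(b)
g2 = f + 16.0 * rng.choice([-1.0, 1.0], size=f.shape)  # random sign, as in Fig. 21.1(c)

# Both distortions have MSE = 256, hence identical PSNR,
# even though the visual quality of the two results differs drastically.
print(psnr(f, g1), psnr(f, g2))
```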

The above examples clearly illustrate the failure of PSNR as an adequate measure of visual quality. In this chapter, we will discuss three classes of image QA algorithms that correlate with visual perception significantly better: human vision based metrics, Structural SIMilarity (SSIM) metrics, and information theoretic metrics. Each of these techniques approaches the image QA problem from a different perspective and uses different first principles. As we proceed in this chapter, in addition to discussing these QA techniques, we will also attempt to shed light on the similarities, dissimilarities, and interplay between these seemingly diverse techniques.

FIGURE 21.2
Failure of the Minkowski metric for image quality prediction. (a) original texture image; (b) distorted image by adding independent white Gaussian noise; (c) reordering of the pixels in image (a) (by sorting pixel intensity values); (d) reordering of the pixels in image (b), by following the same reordering used to create image (c). The Minkowski metrics between images (a) and (b) and images (c) and (d) are the same, but image (d) appears much noisier than image (b).

21.2 Human Vision Modeling Based Metrics

Human vision modeling based metrics utilize mathematical models of certain stages of processing that occur in the visual systems of humans to construct a quality metric. Most HVS-based methods take an engineering approach to solving the QA problem by measuring the threshold of visibility of signals and noise in the signals. These thresholds are then utilized to normalize the error between the reference and distorted images to obtain a perceptually meaningful error metric. To measure visibility thresholds, different aspects of visual processing need to be taken into consideration, such as response to average brightness, contrast, spatial frequencies, orientations, etc. Other HVS-based methods attempt to directly model the different stages of processing that occur in the HVS that result in the observed visibility thresholds. In Section 21.2.1, we will discuss the individual building blocks that comprise a HVS-based QA system. The function of these blocks is to model concepts from the psychophysics of human perception that apply to image quality metrics. In Section 21.2.2, we will discuss the details of several well-known HVS-based QA systems. Each of these QA systems is comprised of some or all of the building blocks discussed in Section 21.2.1, but uses different mathematical models for each block.

21.2.1 Building Blocks

21.2.1.1 Preprocessing

Most QA algorithms include a preprocessing stage that typically comprises calibration and registration. The array of numbers that represents an image is often mapped to units of visual frequencies or cycles per degree of visual angle, and the calibration stage receives input parameters such as viewing distance and physical pixel spacings (screen resolution) to perform this mapping. Other calibration parameters may include fixation depth and eccentricity of the images in the observer's visual field [37, 38]. Display calibration or an accurate model of the display device is an essential part of any image quality metric [55], as the HVS can only see what the display can reproduce. Many quality metrics require that the input image values be converted to physical luminances¹ before they enter the HVS model. In some cases, when the perceptual model is obtained empirically, the effects of the display are incorporated in the model [40]. The obvious disadvantage of this approach is that when the display changes, a new set of model parameters must be obtained [43]. The study of display models is beyond the scope of this chapter.

Registration, i.e., establishing point-by-point correspondence between two images, is also necessary in most image QA systems. Oftentimes, the performance of a QA model can be extremely sensitive to registration errors, since many QA systems operate pixel by pixel (e.g., PSNR) or on local neighborhoods of pixels. Errors in registration would result in a shift in the pixel or coefficient values being compared and degrade the performance of the system.

21.2.1.2 Frequency Analysis

The frequency analysis stage decomposes the reference and test images into different channels (usually called subbands) with different spatial frequencies and orientations using a set of linear filters. In many QA models, this stage is intended to mimic similar processing that occurs in the HVS: neurons in the visual cortex respond selectively to stimuli with particular spatial frequencies and orientations. Other QA models that target specific image coders utilize the same decomposition as the compression system and model the thresholds of visibility for each of the channels. Some examples of such decompositions are shown in Fig. 21.3. The range of each axis is from −u_s/2 to u_s/2 cycles per degree, where u_s is the sampling frequency. Figures 21.3(a)–(c) show transforms that are polar separable and belong to the former category of decompositions (mimicking processing in the visual cortex). Figures 21.3(d)–(f) are used in QA models in the latter category and depict transforms that are often used in compression systems.
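A toy decomposition along these lines can be built with a small Gabor filter bank. This is only loosely analogous to the transforms of Fig. 21.3 (it is neither the Cortex transform nor any specific coder's decomposition); the frequencies, orientations, and bandwidths below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, sigma, size=31):
    """Real Gabor kernel: frequency `freq` in cycles/pixel, orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)            # coordinate along the grating
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))  # Gaussian window
    return envelope * np.cos(2.0 * np.pi * freq * xr)

def subband_decompose(img, freqs=(0.05, 0.1, 0.2), n_orient=4):
    """Split `img` into frequency/orientation channels using linear filters."""
    bands = {}
    for f in freqs:
        for i in range(n_orient):
            theta = i * np.pi / n_orient
            k = gabor_kernel(f, theta, sigma=1.0 / (2.0 * f))
            bands[(f, int(round(np.degrees(theta))))] = fftconvolve(img, k, mode='same')
    return bands
```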

In the remainder of this chapter, we will use f(n) to denote the value (intensity, grayscale, etc.) of an image pixel at location n. Usually the image pixels are arranged in a Cartesian grid and n = (n₁, n₂). The value of the kth image subband at location n will be denoted by b(k, n). The subband indexing k = (k₁, k₂) could be in Cartesian or polar or even scalar coordinates. The same notation will be used to denote the kth coefficient of the nth discrete cosine transform (DCT) block (both Cartesian coordinate systems). This notation underscores the similarity between the two transformations.

¹ In video practice, the term luminance is sometimes, incorrectly, used to denote a nonlinear transformation of luminance [75, p. 24].

FIGURE 21.3
Examples of subband decompositions: (a) Cortex transform (Watson); (b) Cortex transform (Daly).

21.2.1.3 Contrast Sensitivity

The HVS's contrast sensitivity function (CSF, also called the modulation transfer function) provides a characterization of its frequency response. The CSF can be thought of as a bandpass filter. There have been several different classes of experiments used to determine its characteristics, which are described in detail in [59, Chapter 12].

One of these methods involves the measurement of visibility thresholds of sine-wave gratings. For a fixed frequency, a set of stimuli consisting of sine waves of varying amplitudes are constructed. These stimuli are presented to an observer, and the detection threshold for that frequency is determined. This procedure is repeated for a large number of grating frequencies. The resulting curve is called the CSF and is illustrated in Fig. 21.4. Note that these experiments used sine-wave gratings at a single orientation. To fully characterize the CSF, the experiments would need to be repeated with gratings at various orientations. This has been accomplished, and the results show that the HVS is not perfectly isotropic. However, for the purposes of QA, it is close enough to isotropic that this assumption is normally used.
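A widely used analytic approximation to a curve of this kind is the Mannos-Sakrison CSF fit. Whether it matches the specific curve of Fig. 21.4 is an assumption, but it captures the bandpass shape described above, peaking at a few cycles per degree and rolling off at both ends.

```python
import numpy as np

def csf_mannos_sakrison(f):
    """Analytic CSF approximation of Mannos and Sakrison (1974).
    `f` is spatial frequency in cycles per degree of visual angle.
    Bandpass shape: low sensitivity at low and high frequencies."""
    f = np.asarray(f, dtype=np.float64)
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)
```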

It should also be noted that the spatial frequencies are in units of cycles per degree of visual angle. This implies that the visibility of details at a particular frequency is a function of viewing distance. As an observer moves away from an image, a fixed-size feature in the image takes up fewer degrees of visual angle. This action moves it to the right on the contrast sensitivity curve, possibly requiring it to have greater contrast to remain visible. On the other hand, moving closer to an image can allow previously imperceivable details to rise above the visibility threshold. Given these observations, it is clear that the minimum viewing distance is where distortion is maximally detectable. Therefore, quality metrics often specify a minimum viewing distance and evaluate the distortion metric at that point. Several "standard" minimum viewing distances have been established for subjective quality measurement and have generally been used with objective models as well: six times image height for standard definition television and three times image height for high definition television.
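The geometry behind this is a one-line conversion. The sketch below maps viewing distance and pixel pitch to pixels per degree of visual angle; the 1080-line display and 0.60 m picture height are invented numbers used only to exercise the three-picture-heights HDTV rule.

```python
import numpy as np

def pixels_per_degree(viewing_distance_m, pixel_pitch_m):
    """Number of pixels subtended by one degree of visual angle."""
    return 2.0 * viewing_distance_m * np.tan(np.deg2rad(0.5)) / pixel_pitch_m

# Example: a 1080-line display viewed at three times the picture height.
image_height_px = 1080
picture_height_m = 0.60                      # hypothetical display size
d = 3.0 * picture_height_m                   # HDTV minimum viewing distance
ppd = pixels_per_degree(d, picture_height_m / image_height_px)
nyquist_cpd = ppd / 2.0                      # highest representable frequency, cycles/degree
print(ppd, nyquist_cpd)                      # about 56.6 pixels/degree, 28.3 cycles/degree
```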

The baseline contrast sensitivity determines the amount of energy in each subband that is required in order to detect the target in an arbitrary or flat mid-gray image. This is sometimes referred to as the just noticeable difference (JND). We will use t_b(k) to denote the baseline sensitivity of the kth band or DCT coefficient. Note that the base sensitivity is independent of the location n.

21.2.1.4 Luminance Masking

It is well known that the perception of lightness is a nonlinear function of luminance. Some authors call this "light adaptation." Others prefer the term "luminance masking," which groups it together with the other types of masking we will see below [41]. It is called masking because the luminance of the original image signal masks the variations in the distorted signal.

Consider the following experiment: create a series of images consisting of a background of uniform intensity, I, each with a square of a different intensity, I + δI, inserted into its center. Show these to an observer in order of increasing δI. Ask the observer to determine the point at which she can first detect the square. Then, repeat this experiment for a large number of different values of background intensity. For a wide range of background intensities, the ratio of the threshold value δI divided by I is a constant; this relation is known as Weber's law.

21.2.1.5 Contrast Masking

Contrast masking refers to the reduction in visibility of one image component caused by the presence of another image component with similar spatial location and frequency content. For example, the energy of the high-frequency components in a textured beach image can hide or mask the presence of an added noise field. As we mentioned earlier, the visual cortex in the HVS can be thought of as a spatial frequency filter bank with octave spacing of subbands in radial frequency and angular bands of roughly 30 degree spacing. The presence of a signal component in one of these subbands will raise the detection threshold for other signal components in the same subband [64–66] or even neighboring subbands.
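One concrete way to combine the luminance and contrast masking effects described above is the threshold-elevation form used in Watson-style DCT metrics. The exponents below (a_t = 0.649, w = 0.7) come from that family of models and are an assumption here, not the formula of any particular metric in this chapter.

```python
import numpy as np

def masked_threshold(t_base, c_mask, mean_lum, ref_lum=128.0, a_t=0.649, w=0.7):
    """Elevate a baseline threshold t_b(k) by luminance and contrast masking.

    t_base:   baseline sensitivity of the band (from the CSF, Section 21.2.1.3)
    c_mask:   masker coefficient with similar location and frequency content
    mean_lum: local mean luminance (gray level); ref_lum is its nominal reference
    """
    t_lum = t_base * (mean_lum / ref_lum) ** a_t  # luminance masking: brighter regions tolerate more
    # Contrast masking: a strong masker raises the threshold above t_lum.
    return np.maximum(t_lum, np.abs(c_mask) ** w * t_lum ** (1.0 - w))
```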

21.2.1.6 Error Pooling

The final step of an image quality metric is to combine the errors (at the output of the models for various psychophysical phenomena) that have been computed for each spatial frequency and orientation band and each spatial location into a single number for each pixel of the image, or a single number for the whole image. Some metrics convert the errors to threshold (JND) units and combine them with a Minkowski summation:

$$ e(\mathbf{n}) = \left( \sum_{k=1}^{M} \left| \frac{b_k(\mathbf{n}) - \hat{b}_k(\mathbf{n})}{t(k,\mathbf{n})} \right|^{Q} \right)^{1/Q}, $$

where b_k(n) and b̂_k(n) are the nth element of the kth subband of the original and coded image, respectively, t(k, n) is the corresponding sensitivity threshold, and M is the total number of subbands. In this case, the errors are pooled across frequency to obtain a distortion measure for each spatial location. The value of Q varies from 2 (energy summation) to infinity (maximum error).
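The pooling equation translates directly into code. The sketch below pools threshold-normalized subband errors across frequency, producing one distortion value per pixel.

```python
import numpy as np

def minkowski_pool(b, b_hat, t, Q=2.0):
    """Pool threshold-normalized subband errors across frequency.

    b, b_hat, t: arrays of shape (M, H, W) holding b_k(n), the coded
    subbands, and the sensitivity thresholds t(k, n), respectively.
    Q = 2 gives energy summation; large Q approaches the maximum error.
    Returns an (H, W) distortion map, one value per spatial location."""
    e = np.abs(b - b_hat) / t
    return np.sum(e ** Q, axis=0) ** (1.0 / Q)
```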

21.2.2 HVS-Based Models

In this section, we will discuss some well-known HVS modeling based QA systems. We will first discuss four general purpose QA models: the visible differences predictor (VDP), the Sarnoff JND vision model, the Teo and Heeger model, and the visual signal-to-noise ratio (VSNR).

We will then discuss quality models that are designed specifically for different compression systems: the perceptual image coder (PIC) and Watson's DCT and wavelet-based metrics. While still based on the properties of the HVS, these models adopt the frequency decomposition of a given coder, which is chosen to provide high compression efficiency as well as computational efficiency. The block diagram of a generic perceptually based coder is shown in Fig. 21.5. The frequency analysis decomposes the image into several components (subbands, wavelets, etc.) which are then quantized and entropy coded. The frequency analysis and entropy coding are virtually lossless; the only losses occur at the quantization step. The perceptual masking model is based on the frequency analysis and regulates the quantization parameters to minimize the visibility of the errors. The visual models can be incorporated in a compression scheme to minimize the visibility of the quantization errors, or they can be used independently to evaluate its performance. While coder-specific image quality metrics are quite effective in predicting the performance of the coder they are designed for, they may not be as effective in predicting performance across different coders [36, 83].

FIGURE 21.5
Perceptual coder (the diagram's blocks include frequency analysis, a contrast sensitivity model, and a masking model controlling quantization).
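In a coder like Fig. 21.5, the masking model typically sets the quantizer step from the visibility threshold. Below is a minimal sketch assuming the common step = 2t heuristic, so the per-coefficient quantization error stays at or below one JND; the chapter does not prescribe this exact rule.

```python
import numpy as np

def perceptual_quantize(coeffs, thresholds):
    """Uniform quantization with step sizes tied to visibility thresholds.
    With step = 2*t the rounding error per coefficient is at most t,
    i.e., nominally at or below one JND (a heuristic assumed here)."""
    step = 2.0 * np.asarray(thresholds, dtype=np.float64)
    indices = np.round(coeffs / step)   # these indices would be entropy coded
    return indices, step

def perceptual_dequantize(indices, step):
    return indices * step
```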

21.2.2.1 Visible Differences Predictor

The VDP is a model developed by Daly for the evaluation of high quality imaging systems [37]. It is one of the most general and elaborate image quality metrics in the literature. It accounts for variations in sensitivity due to light level, spatial frequency (CSF), and signal content (contrast masking).

To model luminance masking or amplitude nonlinearities in the HVS, Daly includes a simple point-by-point amplitude nonlinearity where the adaptation level for each image pixel is solely determined from that pixel (as opposed to using the average luminance in a neighborhood of the pixel). To account for contrast sensitivity, the VDP filters the image by the CSF before the frequency decomposition. Once this normalization is accomplished to account for the varying sensitivities of the HVS to different spatial frequencies, the thresholds derived in the contrast masking stage become the same for all frequencies.

A variation of the Cortex transform shown in Fig. 21.3(b) is used in the VDP for the frequency decomposition. Daly proposes two alternatives to convert the output of the linear filter bank to units of contrast: local contrast, which uses the value of the baseband at any given location to divide the values of all the other bands, and global contrast, which divides all subbands by the average value of the input image. The conversion to contrast is performed since, to a first approximation, the HVS produces a neural image of local contrast [35]. The masking stage in the VDP utilizes a "threshold elevation" approach, where a masking function is computed that measures the contrast threshold of a signal as a function of the background (masker) contrast. This function is computed for the case when the masker and signal are single, isolated frequencies. To obtain a masking model for natural images, the VDP considers the results of experiments that have measured the masking thresholds for both single frequencies and additive noise. The VDP also allows for mutual masking, which uses both the original and distorted images to determine the degree of masking. The masking function used in the VDP is illustrated in Fig. 21.6. Although the threshold elevation paradigm works quite well in determining the discriminability between the reference and distorted images, it fails to generalize to the case of supra-threshold distortions.

In the error pooling stage, a psychometric function is used to compute the probability of discrimination at each pixel of the reference and test images to obtain a spatial map. Further details of this algorithm can be found in [37], along with an interesting discussion of different approaches used in the literature to model various stages of processing in the HVS, including their merits and drawbacks.

FIGURE 21.6
Contrast masking function (horizontal axis: log(mask contrast × CSF)).

21.2.2.2 Sarnoff JND Vision Model

The Sarnoff JND vision model received a technical Emmy award in 2000 and is one of the best known QA systems based on human vision models. This model was developed by Lubin and coworkers, and details of this algorithm can be found in [38].

Preprocessing steps in this model include calibration for the distance of the observer from the images. In addition, this model also accounts for fixation depth and eccentricity of the observer's visual field. The human eye does not sample an image uniformly, since the density of retinal cells drops off with eccentricity, resulting in a decreased spatial resolution as we move away from the point of fixation of the observer. To account for this effect, the Lubin model resamples the image to generate a modeled retinal image.

The Laplacian pyramid of Burt and Adelson [77] is used to decompose the image into seven radial frequency bands. At this stage, the pyramid responses are converted to units of local contrast by dividing each point in each level of the Laplacian pyramid by the corresponding point obtained from the Gaussian pyramid two levels down in resolution. Each pyramid level is then convolved with eight spatially oriented filters of Freeman and Adelson [78], which constitute Hilbert transform pairs for four different orientations. The frequency decomposition so obtained is illustrated in Fig. 21.3(c). The two Hilbert transform pair outputs are squared and summed to obtain a local energy measure at each pixel location, pyramid level, and orientation. To account for the contrast sensitivity …
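The local-contrast conversion can be sketched with a Gaussian pyramid: divide each Laplacian band by the Gaussian level two steps coarser, upsampled back to size. The blur width, bilinear upsampling, and the eps stabilizer below are implementation assumptions, not parameters taken from the Lubin model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def expand_to(a, shape):
    """Bilinearly upsample 2D array `a` to the given shape."""
    return zoom(a, (shape[0] / a.shape[0], shape[1] / a.shape[1]), order=1)

def gaussian_pyramid(img, levels=7):
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        pyr.append(gaussian_filter(pyr[-1], sigma=1.0)[::2, ::2])  # blur, then decimate by 2
    return pyr

def local_contrast_pyramid(img, levels=7, eps=1e-6):
    """Lubin-style local contrast: Laplacian band divided by the Gaussian
    level two steps coarser (yields levels - 2 contrast bands)."""
    g = gaussian_pyramid(img, levels)
    bands = []
    for l in range(levels - 2):
        lap = g[l] - expand_to(g[l + 1], g[l].shape)   # Laplacian band at level l
        mean_lum = expand_to(g[l + 2], g[l].shape)     # local mean, two levels down
        bands.append(lap / (mean_lum + eps))           # contrast = band / local mean
    return bands
```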
