Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman. Copyright 2002 John Wiley & Sons, Inc. ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
by their texture. Brodatz [1], in his introduction to Textures: A Photographic Album, states "The age of photography is likely to be an age of texture." His texture photographs, which range from man-made textures (woven aluminum wire, brick walls, handwoven rugs, etc.) to natural objects (water, clouds, sand, grass, lizard skin, etc.), are used as a standard data set for image-texture analysis. Such textured objects are difficult to describe in qualitative terms, let alone in the quantitative descriptions required for machine analysis. The observed texture often depends on the lighting conditions, viewing angle, and distance, and may change over a period of time, as in pictures of landscapes. Texture is a property of image regions, as is evident from the examples. Texture has no universally accepted formal definition, although it is easy to visualize what one means by texture. One can think of a texture as consisting of some basic primitives (texels or Julesz's textons [2,3], also referred to as micropatterns), whose spatial distribution in the image creates the appearance of a texture. Most man-made objects have easily identifiable texels. The spatial distribution of texels can be regular (or periodic) or random. In Figure 12.1a, "brick" is a micropattern whose particular distribution in a "brick-wall" image constitutes a structured pattern. The individual primitives need not be of the same size and shape, as illustrated by the bricks and pebbles textures (Fig. 12.1b). Well-defined micropatterns may not exist in many cases, such as pictures of sand on the beach, water, and clouds. Some examples of textured images are shown in Figure 12.1. Detection of the micropatterns, if they exist, and of their spatial arrangement offers important depth cues to the human visual system (see Fig. 12.2).
Figure 12.1 Examples of some textured images: (a) brick wall; (b) stones and pebbles.
Image-texture analysis during the past three decades has primarily focused on texture classification, texture segmentation, and texture synthesis. In texture classification the objective is to assign a unique label to each homogeneous region. For example, regions in a satellite picture may be classified into ice, water, forest, agricultural areas, and so on. In medical image analysis, texture is used to classify magnetic resonance (MR) images of the brain into gray and white matter or to detect cysts in X-ray computed tomography (CT) images of the kidneys. If the images are preprocessed to extract homogeneous-textured regions, the pixel data within these regions can be used for classifying the regions. In doing so we associate each pixel in the image with a corresponding class label, the label of the region to which that particular pixel belongs. An excellent overview of some of the early methods for texture classification can be found in the survey by Haralick [4].
Texture, together with color and shape, helps distinguish objects in a scene. Figure 12.2c shows a scene consisting of multiple textures. Texture segmentation refers to computing a partitioning of the image, each of the partitions being homogeneous in some sense. Note that homogeneity in color and texture may not ensure segmenting the image into semantically meaningful objects. Typically, segmentation results in an overpartitioning of the objects of interest. Segmentation and classification often go together: classifying the individual pixels in the image produces a segmentation. However, to obtain a good classification, one needs homogeneous-textured regions, that is, one must segment the image first.
Texture adds realism to synthesized images. The objective of texture synthesis is to generate texture that is perceptually indistinguishable from that of a provided example. Such synthesized textures can then be used in applications such as texture mapping. In computer graphics, texture mapping is used to generate surface details of synthesized objects; it refers to mapping an image, usually a digitized image, onto a surface [5]. Generative models that can synthesize textures under varying imaging conditions would aid texture mapping and facilitate the creation of realistic scenes.
In addition, texture is considered an important visual cue in the emerging application area of content-based access to multimedia data. One particular aspect that has received much attention in recent years is query by example: given a query image, one is interested in finding visually similar images in the database. As a basic image feature, texture is very useful in similarity search. This is conceptually similar to the texture-classification problem in that we are interested in computing texture descriptions that allow us to make comparisons between different textured images in the database. Recall that in texture classification
we compute a label for a given textured image. This label may have semantics associated with it, for example, water texture or cloud texture. If the textures in the database are similarly classified, their labels can then be used to retrieve other images containing the water or cloud texture. The requirements for similarity retrieval, however, are somewhat different. First, it may not be feasible to create an exhaustive class-label dictionary. Second, even if such class-label information is available, one is interested in finding the top N matches within that class that are visually similar to the given pattern. The database should therefore store detailed texture descriptions to allow search and retrieval of similar texture patterns. The focus of this chapter is on the use of texture features for similarity search.
12.1.1 Organization of the Chapter
Our main focus is on descriptors that are useful for representing texture for similarity search. We begin with an overview of image texture, emphasizing characteristics and properties that are useful for indexing and retrieving images using texture. In typical applications, a number of top matches with rank-ordered similarities to the query pattern will be retrieved. In presenting this overview, we can only sketch the rich and diverse work in this area, and we strongly encourage the reader to follow up on the numerous references provided.

An overview of texture features is given in the next section. For convenience, the existing texture descriptors are classified into three categories: features computed in the spatial domain (Section 12.3), features computed using random field models (Section 12.4), and features computed in a transform domain (Section 12.5). Section 12.6 contains a comparison of different texture descriptors in terms of image-retrieval performance. Section 12.7 describes the use of texture features in image segmentation and in constructing a texture thesaurus for browsing and searching an aerial image database. Ongoing work related to texture in the Moving Picture Experts Group (MPEG-7) standardization within the International Standards Organization (ISO) MPEG subcommittee is also described briefly.
12.2 TEXTURE FEATURES
A feature is defined as a distinctive characteristic of an image, and a descriptor is a representation of a feature [6]. A descriptor defines the syntax and the semantics of the feature representation. Thus, a texture feature captures one specific attribute of an image, such as coarseness, and a coarseness descriptor is used to represent that feature. In the image-processing and computer-vision literature, however, the terms feature and descriptor (of a feature) are often used synonymously. We also drop this distinction and use these terms interchangeably in the following discussion.
Initial work on texture discrimination used various image texture statistics.
For example, one can consider the gray-level histogram as representing the first-order distribution of pixel intensities; the mean and the standard deviation computed from this histogram can be used as texture descriptors for discriminating different textures. First-order statistics treat pixel-intensity values as independent random variables; hence, they ignore the dependencies between neighboring pixel intensities and do not capture most textural properties well. One can use second-order or higher-order statistics to develop more effective descriptors. Consider the pixel value at position s, I(s) = l. The joint distribution is specified by P(l, m, r) = Prob(I(s) = l and I(s + r) = m), where s and r denote 2D pixel coordinates. One of the popular second-order statistical features is the gray-level co-occurrence matrix, which is generated from the empirical version of P(l, m, r), obtained by counting the pixel pairs for which one pixel has value l and the pixel displaced by r has value m. Many statistical features computed from co-occurrence matrices have been used in texture discrimination (for a detailed discussion refer to Ref. [7], Chapter 9). The popularity of this descriptor is due to Julesz, who first proposed the use of co-occurrence matrices for texture discrimination [8]. He was motivated by his conjecture that humans are not able to discriminate textures that have identical second-order statistics (this conjecture has since been proven false).
During the 1970s, research mostly focused on statistical texture features for discrimination; in the 1980s, there was considerable excitement and interest in generative models of textures. These models were used for both texture synthesis and texture classification. Numerous random field models for texture representation [9–12] were developed in this spirit, and a review of some of the recent work can be found in Ref. [13]. Once the appropriate model features are computed, the problem of texture classification can be addressed using techniques from traditional pattern classification [14].

Multiresolution analysis and filtering influenced many areas of image analysis, including texture, during the 1990s. We refer to these as spatial-filtering-based methods in the following sections. Some of these methods are motivated by seeking models that capture human texture discrimination. In particular, preattentive texture discrimination, the ability of humans to distinguish between textures in an image without any detailed scene analysis, has been extensively studied. Some of the early work in this field can be attributed to Julesz [2,3] for his theory of textons as basic textural elements. Spatial-filtering approaches have been used by many researchers for detecting texture boundaries [15,16]. In these studies, texture discrimination is generally modeled as a sequence of filtering operations without any prior assumptions about the texture-generation process. Some of the recent work involves multiresolution filtering for both classification and segmentation [17,18].
12.2.1 Human Texture Perception
Texture, as one of the basic visual features, has been studied extensively by psychophysicists for over three decades. Texture helps in studying and understanding early visual mechanisms in human vision. In particular, Julesz and his colleagues [2,3,8,19] have studied texture in the context of preattentive vision. Julesz defines a "preattentive visual system" as one that "cannot process complex forms, yet can, almost instantaneously, without effort or scrutiny, detect differences in a few local conspicuous features, regardless of where they occur" (quoted from Ref. [3]). Julesz coined the word textons to describe such features, which include elongated blobs (together with their color, orientation, length, and width), line terminations, and crossings of line segments. Only differences in textons or in their density can be preattentively discriminated. The observations in Ref. [3] are mostly limited to line-drawing patterns and do not include gray-scale textures. Julesz's work focused on low-level texture characterization using textons, whereas Rao and Lohse [20] addressed issues related to high-level features for texture perception. In contrast with preattentive perception, high-level features
are concerned with attentive analysis. There are many applications, including some in image retrieval, that require such analysis. Examples include medical image analysis (detection of skin cancer, analysis of mammograms, and analysis of brain MR images for tissue classification and segmentation, to mention a few) and many process-control applications. Rao and Lohse identify three features as being important in human texture perception: repetition, orientation, and complexity.
Repetition refers to periodic patterns and is often associated with regularity. A brick wall is a repetitive pattern, whereas a picture of ocean water is nonrepetitive (and has no structure). Orientation refers to the presence or absence of directional textures. Directional textures have a flowlike pattern, as in a picture of wood grain or waves [21]. Complexity refers to the descriptional complexity of the textures and, as the authors state in Ref. [20], "if one had to describe the texture symbolically, it (complexity) indicates how complex the resulting description would be." Complexity is related to Tamura's coarseness feature.
certain direction, occur in an image. Let I(x, y) ∈ {1, ..., N} be the intensity value of an image pixel at (x, y), and let d = [(x1 − x2)^2 + (y1 − y2)^2]^{1/2} be the distance that separates two pixels at locations (x1, y1) and (x2, y2) with intensities i and j, respectively. The co-occurrence matrices for a given d are defined as follows:

C(d) = [c(i, j)],   i, j ∈ {1, ..., N},   (12.1)

where c(i, j) is the cardinality of the set of pixel pairs that satisfy I(x1, y1) = i and I(x2, y2) = j and are separated by a distance d. Note that the direction between the pixel pairs can be used to further distinguish co-occurrence matrices for a given distance d. Haralick and coworkers [25] describe 14 texture
features based on various statistical and information-theoretic properties of the co-occurrence matrices. Some of them can be associated with texture properties such as homogeneity, coarseness, and periodicity. Despite the significant amount of work on this feature descriptor, it now appears that this characterization of texture is not very effective for classification and retrieval. In addition, these features are expensive to compute; hence, co-occurrence matrices are rarely used in image database applications.
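As a concrete illustration, the empirical counting step behind C(d) can be sketched in a few lines of NumPy. The function name and the (dy, dx) displacement convention are our own choices, not from the chapter:

```python
import numpy as np

def cooccurrence_matrix(img, offset, levels):
    """Empirical co-occurrence counts: C[i, j] = number of pixel pairs
    with I(s) = i and I(s + r) = j, for displacement r = offset.

    `offset` is (dy, dx); `levels` is the number of gray levels N.
    """
    img = np.asarray(img)
    dy, dx = offset
    h, w = img.shape
    # Clip the scan region so that s + r stays inside the image.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    a = img[y0:y1, x0:x1]                        # values I(s)
    b = img[y0 + dy:y1 + dy, x0 + dx:x1 + dx]    # values I(s + r)
    C = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(C, (a.ravel(), b.ravel()), 1)
    return C

img = np.array([[0, 0, 1],
                [0, 1, 1],
                [2, 2, 2]])
C = cooccurrence_matrix(img, offset=(0, 1), levels=3)   # horizontal neighbor
```

A directional variant is obtained simply by changing `offset`; summing the matrices for r and −r gives the symmetric form often used in practice.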
12.3.2 Tamura’s Features
One of the influential works on texture features that correspond to human texture perception is the paper by Tamura, Mori, and Yamawaki [27]. They characterized image texture along the dimensions of coarseness, contrast, directionality, line-likeness, regularity, and roughness.
Coarseness relates to the image resolution. Consider two aerial pictures of Manhattan taken from two different heights: the one taken from a larger distance is said to be less coarse than the one taken from a shorter distance, in which the blocky appearance of the buildings is more evident. In this sense, coarseness also refers to the size of the underlying elements forming the texture. Note that an image with finer resolution will have a coarser texture. An estimator of this parameter would then be the best scale or resolution that captures the image texture. Many computational approaches to measuring this texture property have been described in the literature. In general, these approaches try to measure the spatial rate of change in image intensity and thereby indicate the level of coarseness of the texture. The particular procedure proposed in Ref. [27] can be summarized as follows:
1. Compute moving averages in windows of size 2^k × 2^k at each pixel (x, y), where k = 0, 1, ..., 5.

2. At each pixel, compute the difference E_k(x, y) between pairs of nonoverlapping moving averages in the horizontal and vertical directions.

3. At each pixel, the value of k that maximizes E_k(x, y) in either direction is used to set the best size: S_best(x, y) = 2^k.

4. The coarseness measure F_crs is then computed by averaging S_best(x, y) over the entire image.
Instead of taking the average of S_best, an improved version of the coarseness feature can be obtained by using a histogram to characterize the distribution of S_best. This modified feature can be used to deal with a texture that has multiple coarseness properties.
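The four steps above can be transcribed directly. The border handling and the reduced range of k below are our simplifications, not part of the reference formulation:

```python
import numpy as np

def tamura_coarseness(img, kmax=3):
    """Direct transcription of steps 1-4 of Tamura's coarseness.

    Simplifications (ours, not from Ref. [27]): k runs from 1 to kmax
    instead of 0 to 5, and pixels whose windows would leave the image
    are simply skipped.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape

    def window_mean(y, x, k):
        # Step 1: moving average over a 2^k x 2^k window centred near (y, x).
        d = 2 ** k
        y0, x0 = y - d // 2, x - d // 2
        if y0 < 0 or x0 < 0 or y0 + d > h or x0 + d > w:
            return None
        return img[y0:y0 + d, x0:x0 + d].mean()

    sbest = []
    for y in range(h):
        for x in range(w):
            e_best, s_best = -1.0, 2.0
            for k in range(1, kmax + 1):
                half = 2 ** k // 2
                # Step 2: differences E_k between non-overlapping window
                # pairs in the horizontal and vertical directions.
                pairs = [(window_mean(y, x - half, k), window_mean(y, x + half, k)),
                         (window_mean(y - half, x, k), window_mean(y + half, x, k))]
                for a, b in pairs:
                    if a is not None and b is not None and abs(a - b) > e_best:
                        # Step 3: the k maximizing E_k sets S_best(x, y) = 2^k.
                        e_best, s_best = abs(a - b), float(2 ** k)
            if e_best >= 0:
                sbest.append(s_best)
    # Step 4: F_crs is the average of S_best over the image.
    return float(np.mean(sbest)) if sbest else 0.0
```

On a pattern of wide stripes this returns a larger value than on single-pixel stripes, matching the intuition that larger texels mean a coarser texture.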
Contrast measures the amount of local intensity variation present in an image. Contrast also refers to the overall picture quality: a high-contrast picture is often considered to be of better quality than a low-contrast version. The dynamic range of the intensity values and the sharpness of the edges in the image are two indicators of picture contrast. In Ref. [27], contrast is defined as

F_con = σ / (α4)^n,   (12.2)

where n is a positive number, σ is the standard deviation of the gray-level probability distribution, and α4 is the kurtosis, a measure of the polarization between black and white regions in the image. The kurtosis is defined as

α4 = µ4 / σ^4,   (12.3)

where µ4 is the fourth central moment of the gray-level probability distribution.
In the experiments in Ref. [27], n = 1/4 resulted in the best texture-discrimination performance.
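Since σ, µ4, and α4 are all moments of the empirical gray-level distribution, the contrast measure is a few lines of code (the function name is ours):

```python
import numpy as np

def tamura_contrast(img, n=0.25):
    """F_con = sigma / (alpha4 ** n), with alpha4 = mu4 / sigma**4 the
    kurtosis of the gray-level distribution; n = 1/4 follows Ref. [27]."""
    img = np.asarray(img, dtype=float)
    sigma = img.std()
    if sigma == 0:
        return 0.0               # a perfectly flat image has no contrast
    mu4 = ((img - img.mean()) ** 4).mean()   # fourth central moment
    alpha4 = mu4 / sigma ** 4                # kurtosis
    return float(sigma / alpha4 ** n)
```

A half-black, half-white image (strongly polarized, large σ) scores far higher than an image whose gray levels differ by one step.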
Directionality (or lack of it) is due to both the basic shape of the texture element and the placement rule used in creating the texture. Patterns can be highly directional (e.g., a brick wall) or nondirectional, as in the case of a picture of a cloud. The degree of directionality, measured on a scale of 0 to 1, can be used as a descriptor (see, for example, Ref. [27]). Thus, two patterns that differ only in their orientation are considered to have the same degree of directionality. These descriptors can be computed either in the spatial domain or in the frequency domain. In Ref. [27], the oriented-edge histogram (the number of pixels at which the edge strength in a certain direction exceeds a given threshold) is used to measure the degree of directionality. Edge strength and direction are computed using the Sobel edge detector [28]. A histogram H(φ) of direction values φ is then constructed by quantizing φ and counting the pixels with magnitude larger than a predefined threshold. This histogram exhibits strong peaks for highly directional images and is relatively flat for images without strong orientation. A quantitative measure of directionality can be computed from the sharpness of the peaks as follows:
F_dir = 1 − r · n_p · Σ_p Σ_{φ ∈ w_p} (φ − φ_p)^2 H(φ),   (12.4)

where n_p is the number of peaks and φ_p is the pth peak position of H. For each peak p, w_p is the set of bins distributed over it, and r is a normalizing factor related to the quantization levels of φ.
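The Sobel-plus-histogram construction of H(φ) described above can be sketched as follows. The bin count, the threshold, and the folding of directions into [0, π) are our choices; peak detection and the final F_dir normalization are left out:

```python
import numpy as np

def direction_histogram(img, nbins=16, thresh=1e-6):
    """Oriented-edge histogram H(phi) used for Tamura's directionality.

    Sobel responses are computed with explicit 3x3 kernels; only pixels
    whose gradient magnitude exceeds `thresh` are counted.
    """
    img = np.asarray(img, dtype=float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):           # correlate with both Sobel kernels
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    mag = np.hypot(gx, gy)
    phi = np.mod(np.arctan2(gy, gx), np.pi)   # fold direction into [0, pi)
    bins = np.minimum((phi / np.pi * nbins).astype(int), nbins - 1)
    H = np.zeros(nbins)
    np.add.at(H, bins[mag > thresh], 1)
    return H / max(H.sum(), 1)               # normalized histogram
```

For an image of vertical stripes, all the edge energy lands in the φ = 0 bin, giving the single sharp peak the text describes.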
In addition to the three components discussed earlier, Tamura and coworkers [27] also consider three other features, which they term line-likeness, regularity, and roughness. There appears to be a significant correlation between these three features and coarseness, contrast, and directionality, and it is not clear that adding these dimensions enhances the effectiveness of the description. These additional dimensions are not used in the comparison experiments described in Section 12.7.

Tamura's features capture the high-level perceptual attributes of a texture well and are useful for image browsing. However, they are not very effective for finer texture discrimination.
12.4 AUTOREGRESSIVE AND RANDOM FIELD TEXTURE MODELS
One can think of a textured image as a two-dimensional (2D) array of random numbers; the pixel intensity at each location is then a random variable. One can model the image as a function f(r, ω), where r is the position vector representing the pixel location in the 2D space and ω is a random parameter. For a given value of r, f(r, ω) is a random variable (because ω is a random variable). Once we select a specific texture ω, f(r, ω) is an image, namely, a function over the two-dimensional grid indexed by r. f(r, ω) is called a random field [29]. Thus, one can think of a texture-intensity distribution as a realization of a random field. Random field models (also referred to as spatial-interaction models) impose assumptions on the intensity distribution. One of the initial motivations for such model-based analysis of texture is that these models can be used for texture synthesis. There is a rich literature on random field models for texture analysis dating back to the early 1970s, and these models have found applications not only in texture synthesis but also in texture classification and segmentation [9,11,13,30–34].
A typical random field model is characterized by a set of neighbors (typically, a symmetric neighborhood around the pixel), a set of model coefficients, and a noise sequence with certain specified characteristics. Given an array of observations {y(s)} of pixel-intensity values, it is natural to expect that the pixel values are locally correlated. This leads to the well-known Markov model

P(y(s) | y(s + r), r ≠ 0) = P(y(s) | y(s + r), r ∈ N),   (12.5)

where N is a symmetric neighborhood set. For example, if the neighborhood consists of the four immediate neighbors of a pixel on a rectangular grid, then N = {(0, 1), (1, 0), (−1, 0), (0, −1)}.
We refer to Besag [35,36] for the constraints on the conditional probability density required for the resulting random field to be Markov. If, in addition to being Markov, {y(s)} is also Gaussian, then a pixel value at s, y(s), can be written as a linear combination of the pixel values y(s + r), r ∈ N, and an additive correlated noise (see Ref. [34]).
A special case of the Markov random field (MRF) that has received much attention in the image-retrieval community is the simultaneous autoregressive (SAR) model, given by

y(s) = Σ_{r ∈ N} θ(r) y(s + r) + β ε(s),   (12.6)

where ε(s) is a white noise sequence with zero mean and unit variance at each pixel location. The parameters ({θ(r)}, β) characterize the texture observations {y(s)} and can be estimated from those observations. The SAR and MRF models are related in that for every SAR model there exists an equivalent MRF whose second-order statistics are identical to those of the SAR model. However, the converse is not true: given an MRF, there may not be an equivalent SAR model.
The model parameters ({θ(r)}, β) form the texture feature vector that can be used for classification and similarity retrieval. The second-order neighborhood has been widely used; it consists of the 8-neighborhood of a pixel, N = {(0, 1), (1, 0), (0, −1), (−1, 0), (1, 1), (1, −1), (−1, −1), (−1, 1)}. For a symmetric model, θ(r) = θ(−r); hence five parameters are needed to specify a symmetric second-order SAR model.
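Under the symmetric second-order model, the four θ(r) coefficients and the noise scale β can be fit by least squares. Least squares is one common estimation strategy, not the only one in the literature; the zero-mean preprocessing and interior-pixel restriction below are our conventions:

```python
import numpy as np

def estimate_sar(img):
    """Least-squares sketch of fitting a symmetric second-order SAR model.

    Model: y(s) = sum_r theta(r) * [y(s + r) + y(s - r)] + beta * eps(s),
    over the four symmetric neighbor pairs of the 8-neighborhood.
    """
    y = np.asarray(img, dtype=float)
    y = y - y.mean()                             # remove the mean first
    h, w = y.shape
    pairs = [(0, 1), (1, 0), (1, 1), (1, -1)]    # r and -r handled together
    target = y[1:h - 1, 1:w - 1].ravel()         # interior pixels only
    X = np.stack([(y[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
                   + y[1 - dy:h - 1 - dy, 1 - dx:w - 1 - dx]).ravel()
                  for dy, dx in pairs], axis=1)
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ theta
    beta = float(np.sqrt(np.mean(resid ** 2)))   # noise standard deviation
    return theta, beta
```

The returned (θ, β), five numbers in all, is the SAR feature vector described above; an MRSAR feature simply concatenates these fits at several levels of a Gaussian pyramid.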
In order to define an appropriate SAR model, one has to determine the size of the neighborhood. This is a nontrivial problem, and often a fixed-size neighborhood does not represent all texture variations very well. In order to address this issue, the multiresolution simultaneous autoregressive (MRSAR) model has been proposed [37,38]. The MRSAR model tries to account for the variability of texture primitives by defining the SAR model at different resolutions of a Gaussian pyramid. Thus, three levels of the Gaussian pyramid, together with a second-order symmetric model, require 15 (3 × 5) parameters to specify the texture.
12.4.1 Wold Model
Liu and Picard propose the Wold model for image-retrieval applications [39]. It is based on the Wold decomposition of stationary stochastic processes. In the Wold model, a 2D homogeneous random field is decomposed into three mutually orthogonal components, which approximately correspond to the three dimensions (periodicity, directionality, and complexity or randomness) identified by Rao and Lohse [20]. The construction of the Wold model proceeds as follows. First, the periodicity of the texture pattern is analyzed by considering the autocorrelation function of the image. Note that for periodic patterns the autocorrelation function is also periodic. The corresponding Wold feature set consists of the frequencies and the magnitudes of the harmonic spectral peaks.
In the experiments in Ref. [39], the 10 largest peaks are kept for each image. The indeterministic (random) components of the texture image are modeled using the MRSAR process described in the preceding section. For similarity retrieval, two separate sets of ordered retrievals are computed, one using the harmonic-peak matching and the other using the distances between the MRSAR features. A weighted ordering is then computed using a confidence measure (the posterior probability) of the query pattern's regularity.

The experimental results on the Brodatz database presented in Ref. [39] show that the Wold model provides perceptually better-quality results than the MRSAR model. The comparative results shown in Ref. [39] also indicate that the Tamura features fare significantly worse than the MRSAR or Wold models.
12.5 SPATIAL FREQUENCY AND TRANSFORM DOMAIN FEATURES
Instead of computing texture features in the spatial domain, an attractive alternative is to use transform-domain features. The discrete Fourier transform (DFT), the discrete cosine transform (DCT), and the discrete wavelet transform (DWT) have been used quite extensively for texture classification in the past. Some of the early work on the use of Fourier-transform features in analyzing texture in satellite imagery can be found in Refs. [40–44]. The power spectrum computed from the DFT is used in the computation of texture features [45] and in analyzing texture properties such as periodicity and regularity [46]. In Ref. [46], the two spatial frequencies, f1 and f2, that represent the periodicity of the texture primitives are first identified. If the texture is perfectly periodic, most of the energy in the power spectrum is concentrated at frequencies corresponding to f = m f1 + n f2 (m, n integers). The corresponding spatial grid is overlaid on the original texture, and the texture cell is defined by the grid cell. If the texture has a strong repetitive pattern, this method appears to work well in identifying the basic texture elements forming the repetitive pattern. In general, power-spectrum-based features have not been very effective in texture classification and retrieval; this could primarily be a result of the manner in which the power spectrum is estimated. Laws [47] makes a strong case for computing local features using small windows instead of global texture features. Although some of the work in image retrieval has used block-based features (such as the DCT coefficients of 8 × 8 blocks of JPEG-compressed images), detailed and systematic studies of their performance are not available at this time.
The last decade has seen significant progress in multiresolution analysis of images, and much work has been done on the use of multiresolution features to characterize image texture. Two of the more popular approaches, one based on orthogonal wavelet transforms and the other based on Gabor filtering, are reviewed here; both appear very promising in the context of image retrieval. Conceptually, these features characterize the distribution of oriented edges in the image at multiple scales.
12.5.1 Wavelet Features
The wavelet transform [48,49] is a multiresolution approach that has been used quite frequently in image-texture analysis and classification [17,50]. Wavelet transforms refer to the decomposition of a signal with a family of basis functions obtained through translation and dilation of a special function called the mother wavelet. The computation of a 2D wavelet transform involves recursive filtering and subsampling; at each level, it decomposes a 2D signal into four subbands, often referred to as LL, LH, HL, and HH according to their frequency characteristics (L = low, H = high). Two types of wavelet transforms have been used for texture analysis: the pyramid-structured wavelet transform (PWT) and the tree-structured wavelet transform (TWT) (see Figure 12.3). The PWT recursively decomposes the LL band. However, for some textures the most important information often appears in the middle-frequency channels, and further decomposition of the lower-frequency band alone may not be sufficient for analyzing the texture. The TWT has been suggested as an alternative in which the recursive decomposition is not restricted to the LL bands (see Figure 12.3). For more details, we refer the reader to Ref. [17].
A simple wavelet-transform feature of an image can be constructed using the mean and standard deviation of the energy distribution in each of the subbands at each decomposition level; this corresponds to the distribution of "edges" in the horizontal, vertical, and diagonal directions at different resolutions. For a three-level decomposition, the PWT results in a feature vector of 3 × 3 × 2 + 2 = 20 components. With the TWT, the feature depends on how the subbands at each level are decomposed. A fixed decomposition tree can be obtained by sequentially decomposing the LL, LH, and HL bands, resulting in a feature vector of 40 × 2 components. Note that, in this example, the features obtained by the PWT can be considered a subset of the TWT features. The specific choice of wavelet basis functions does not seem to significantly affect the image-retrieval performance of the descriptors.
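The 20-component PWT feature can be illustrated with a hand-rolled Haar decomposition. The Haar basis is just one convenient choice (the text notes the basis matters little for retrieval), and taking the mean and standard deviation of coefficient magnitudes as the "energy" statistics is one common reading of the description above:

```python
import numpy as np

def haar_level(a):
    """One level of a 2D Haar transform: returns (LL, LH, HL, HH)."""
    a = a[:a.shape[0] // 2 * 2, :a.shape[1] // 2 * 2]  # even dimensions
    # Filter and subsample along rows ...
    lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # ... then along columns, yielding the four subbands.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

def pwt_features(img, levels=3):
    """PWT texture feature: mean and std of |coefficients| in the LH, HL,
    HH subbands at each level, plus the final LL band
    (3 x levels x 2 + 2 = 20 components for levels = 3)."""
    feats = []
    band = np.asarray(img, dtype=float)
    for _ in range(levels):
        band, lh, hl, hh = haar_level(band)   # PWT: keep splitting LL
        for sb in (lh, hl, hh):
            feats += [np.abs(sb).mean(), np.abs(sb).std()]
    feats += [np.abs(band).mean(), np.abs(band).std()]
    return np.array(feats)
```

A TWT variant would recurse into the LH and HL bands as well, growing the vector accordingly.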
12.5.2 Gabor Features

Gabor filters are optimal in the sense of minimizing the joint 2D uncertainty in space and frequency [52,59]. These filters can be considered orientation- and scale-tunable edge and line (bar) detectors, and the statistics of these microfeatures in a homogeneous region are often used to characterize the underlying texture information. Gabor features have been used in several image-analysis applications, including texture classification and segmentation [51,60,61], image recognition [62,63], image registration, and motion tracking [57].
A 2D Gabor function is defined as

g(x, y) = [1 / (2π σ_x σ_y)] exp[−(1/2)(x²/σ_x² + y²/σ_y²) + 2πjWx],   (12.7)

where σ_x and σ_y define the widths of the Gaussian envelope and W is the modulation frequency. Figure 12.4 shows 3D profiles of the real (even) and imaginary (odd) components of such a Gabor function. A class of self-similar Gabor filters can be obtained by appropriate dilations and rotations of g(x, y).
To compute the Gabor texture feature vector, a given image I(x, y) is first filtered with a set of scale- and orientation-tuned Gabor filters. Let m and n index the scale and orientation, respectively, of these Gabor filters, and let α_mn and β_mn denote the mean and the standard deviation, respectively, of the energy distribution of the transform coefficients. If S is the total number of scales and K is the number of orientations, then the total number of filters used is SK. A texture feature vector can then be constructed as

f = [α_00 β_00 α_01 β_01 · · · α_(S−1)(K−1) β_(S−1)(K−1)].   (12.8)

In the experiments described in the next section, S = 4 scales and K = 6 orientations are used to construct a texture feature vector of 4 × 6 × 2 = 48 dimensions.
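A minimal sketch of this pipeline follows. The envelope width, the center-frequency spacing, and the kernel size are our assumptions for illustration, not the tuned filter design used in the chapter's experiments:

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=2.0, size=15):
    # Complex Gabor kernel: isotropic Gaussian envelope modulated by a
    # complex exponential at centre frequency `freq` (cycles/pixel),
    # rotated to orientation `theta`. (A simplification of the general
    # g(x, y) with separate sigma_x and sigma_y.)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def gabor_features(img, scales=4, orientations=6):
    # Mean and standard deviation of |filter response| for every
    # scale/orientation pair: scales x orientations x 2 components.
    img = np.asarray(img, dtype=float)
    feats = []
    for m in range(scales):
        freq = 0.25 / (2 ** m)          # assumed dyadic frequency spacing
        for n in range(orientations):
            g = gabor_kernel(freq, n * np.pi / orientations)
            # Circular convolution via the FFT (one simple choice).
            resp = np.abs(np.fft.ifft2(np.fft.fft2(img) *
                                       np.fft.fft2(g, s=img.shape)))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)
```

With 4 scales and 6 orientations this yields the 48-dimensional vector of means and standard deviations described above.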
12.6 COMPARISON OF DIFFERENT TEXTURE FEATURES FOR IMAGE RETRIEVAL
12.6.1 Similarity Measures for Textures
We have presented several texture descriptors that are useful for texture discrimination. As mentioned earlier, texture descriptors are quite useful in searching for similar patterns in a large database. In a typical query-by-example scenario, the user is interested in retrieving several similar images, not just the best match. This requires comparing two descriptors to obtain a measure of similarity (or dissimilarity) between the two image patterns. Similarity judgments are perceptual and subjective; the computed similarity, however, depends not only on the specific feature descriptors but also on the choice of metric. It is generally assumed that the descriptors lie in a Euclidean space (for a detailed discussion of similarity metrics, see Ref. [64]). Some of the commonly used dissimilarity measures are listed below (see Ref. [65]).
Let the descriptor be represented as an m-dimensional vector f = [f_1 · · · f_m]^T. Given two images I and J, let D(I, J) be the distance between the two images as measured using the descriptors f_I and f_J.
Euclidean distance (squared) (also called the L2 distance).

D(I, J) = ||f_I − f_J||^2 = (f_I − f_J)^T (f_I − f_J)    (12.9)
Mahalanobis distance.

D(I, J) = (f_I − f_J)^T Σ^−1 (f_I − f_J),    (12.10)

where the covariance matrix

Σ = E[(f − µ_f)(f − µ_f)^T] and µ_f = E[f]    (12.11)
L1 distance.

D(I, J) = Σ_k |f_k,I − f_k,J|    (12.12)
In Ref. [58], a weighted L1 norm is used.
Kullback–Leibler (K–L) divergence. If f is considered as a probability distribution (e.g., a normalized histogram), then

D(I, J) = Σ_k f_k,I log(f_k,I / f_k,J)

denotes the "distance" between two distributions. In information theory it is commonly known as relative entropy. Note that this function is not symmetric and does not satisfy the triangle inequality. However, one can use a symmetric distance by taking the average of D(I, J) and D(J, I).
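The dissimilarity measures above can be sketched in a few lines. The helper names and the small smoothing constant guarding the logarithm in the K–L divergence are illustrative assumptions:

```python
# Sketch of the dissimilarity measures of Eqs. (12.9)-(12.12) and the
# K-L divergence. Helper names and the eps smoothing are illustrative.
import numpy as np

def l2_sq(fi, fj):
    d = fi - fj
    return float(d @ d)                        # Eq. (12.9)

def mahalanobis(fi, fj, cov):
    d = fi - fj
    return float(d @ np.linalg.inv(cov) @ d)   # Eq. (12.10)

def l1(fi, fj):
    return float(np.sum(np.abs(fi - fj)))      # Eq. (12.12)

def weighted_l1(fi, fj, sigma):
    # Weight each component by (e.g.) the std of that feature component
    return float(np.sum(np.abs(fi - fj) / sigma))

def kl(p, q, eps=1e-12):
    # Relative entropy; asymmetric, no triangle inequality
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def symmetric_kl(p, q):
    # Symmetrized by averaging D(I, J) and D(J, I)
    return 0.5 * (kl(p, q) + kl(q, p))
```

Note that the Mahalanobis distance with an identity covariance matrix reduces to the squared Euclidean distance.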
Reference [65] contains a systematic performance evaluation of different dissimilarity measures for texture retrieval. Performance is measured by precision, which is the percentage of relevant retrievals relative to the total number of retrieved images. The Gabor feature descriptor described in Section 12.5.2 is used in the comparisons. The weighted L1 distance performs as well as some of the other distances for small sample sizes, whereas measures based on distributions, such as the K–L divergence, perform better for larger sample sizes. The authors conclude that no single measure achieves the best overall performance.
12.6.2 A Comparison of Texture Descriptors for Image Retrieval
In this section, experimental results intended to compare the effectiveness of several popular texture descriptors on a set of image retrieval tasks are presented. The image database used consists of 19,800 color natural images from Corel photo galleries and 116 512 × 512 texture images from the Brodatz album [1] and the USC texture database [66].
Except for the MRSAR feature vector, the weighted L1 distance is used in computing the dissimilarity between two texture descriptors. For the MRSAR, it is observed that the Mahalanobis distance gives the best retrieval performance [39]. Note that a covariance matrix for the feature vectors needs to be computed in this case, and using an image-dependent covariance matrix (one such matrix for each image pattern in the database) gives better performance than using a global covariance matrix (one matrix for all the images in the database). In addition to the texture features described in the previous sections, we also used an edge histogram descriptor [67]. In constructing this histogram, we quantize the edge orientation into eight bins and use a predefined fixed threshold to remove weak edges. Each bin in the histogram represents the number of edges having a certain orientation.
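A minimal sketch of such an edge histogram follows, assuming Sobel gradients and a threshold expressed as a fraction of the maximum gradient magnitude; the actual edge detector and threshold used in [67] may differ:

```python
# Sketch of an 8-bin edge-orientation histogram with a fixed threshold
# to remove weak edges. Detector and threshold value are illustrative.
import numpy as np
from scipy.ndimage import sobel

def edge_histogram(image, bins=8, threshold=0.1):
    img = image.astype(float)
    gx = sobel(img, axis=1)                   # horizontal gradient
    gy = sobel(img, axis=0)                   # vertical gradient
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi          # orientation in [0, pi)
    strong = mag > threshold * mag.max()      # drop weak edges
    hist, _ = np.histogram(ang[strong], bins=bins, range=(0, np.pi))
    return hist / max(hist.sum(), 1)          # normalized counts per bin
```

Each bin then holds the (normalized) number of strong edge pixels whose orientation falls in that bin.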
Figure 12.5a shows the retrieval performance of different texture features using the Corel photo galleries. The queries and ground truth (relevant retrievals for a given query) are manually established. The performance is measured in terms of the average number of relevant retrievals as a function of the number of retrievals, averaged over all the queries. The texture features, ordered from best to worst performance, are: MRSAR (using image-dependent covariance), Gabor, TWT, PWT, MRSAR (using global covariance), modified Tamura, coarseness histogram and directionality, Canny edge histogram, and traditional Tamura features. Note that using an image-dependent covariance matrix significantly increases the size of the MRSAR features.
In addition to the Corel photo database, the Brodatz texture album was also used to evaluate the performance of texture features. This database has been widely used for research on texture classification and analysis and is, therefore, very appropriate for benchmarking texture features. Each 512 × 512 Brodatz texture image is divided into 16 nonoverlapping subimages, each 128 × 128 pixels in size. Thus, for each of the 16 subimages from a 512 × 512 texture image, there are 15 other subimages from the same texture. Having chosen any one subimage as the query, we would like to retrieve the remaining 15 subimages from that texture as the top matched retrievals. The percentage of correct retrievals in the top 15 retrieved images is then used as a performance measure. Figure 12.5b shows the experimental results based on averaging the performance over all the database images. As can be seen, the features, ordered from best to worst performance, are Gabor, MRSAR (using image-dependent covariance), TWT, PWT, modified Tamura, MRSAR (using global covariance), traditional Tamura, coarseness histogram, directionality, and Canny edge histogram. The results are similar to the ones shown in Figure 12.5a, except that the Gabor and traditional Tamura features show improved performance (for more details and discussion, see [58,68]). Note that, using the image class
[Figure 12.5 legend — Solid: Gabor; Dashdot: MRSAR (image-dependent); Dashed: TWT; Dotted: PWT; Diamond: Tamura (modified); Plus: MRSAR; Square: Tamura (traditional); Triangle: coarseness histogram; Circle: directionality; Star: Canny edge histogram]
Figure 12.5. Retrieval performance of different texture features: (a) on the Corel photo database and (b) on the Brodatz texture image set.
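The Brodatz evaluation protocol described above can be sketched as follows. The feature extractor is left abstract, and the default L1 distance and helper names are illustrative:

```python
# Sketch of the Brodatz retrieval benchmark: split each 512x512 texture
# into 16 subimages of 128x128, query with one, and score the fraction
# of the other 15 same-texture subimages among the top-15 retrievals.
import numpy as np

def split_16(texture):
    """16 nonoverlapping 128x128 subimages of a 512x512 texture."""
    return [texture[r:r + 128, c:c + 128]
            for r in range(0, 512, 128) for c in range(0, 512, 128)]

def retrieval_rate(features, labels, query_idx,
                   dist=lambda a, b: float(np.sum(np.abs(a - b)))):
    """Fraction of the top-15 retrievals sharing the query's label."""
    q = features[query_idx]
    d = [dist(q, f) for f in features]
    order = np.argsort(d)                       # nearest first
    top15 = [i for i in order if i != query_idx][:15]
    hits = sum(labels[i] == labels[query_idx] for i in top15)
    return hits / 15.0
```

Averaging retrieval_rate over every subimage in the database as the query gives the performance measure plotted in Figure 12.5b.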