
DOCUMENT INFORMATION

Title: Image Databases: Search and Retrieval of Digital Imagery
Author: Christos Faloutsos
Institution: Carnegie Mellon University
Field: Computer Science
Type: Thesis
Year: 2002
City: Pittsburgh
Pages: 30
Size: 249.26 KB


Edited by Vittorio Castelli, Lawrence D. Bergman. Copyright © 2002 John Wiley & Sons, Inc. ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic).

CHRISTOS FALOUTSOS

Carnegie Mellon University, Pittsburgh, Pennsylvania

15.1 INTRODUCTION

In this chapter we focus on the design of methods for rapidly searching a database of multimedia objects, allowing us to locate objects that match a query object, exactly or approximately. We want a method that is general and that can handle any type of multimedia objects. Objects can be two-dimensional (2D) color images, gray scale medical images in two or three dimensions (3D) (e.g., MRI brain scans), one-dimensional (1D) time series, digitized voice or music, video clips, and so on. A typical query-by-content is “in a collection of color photographs, find ones with the same color distribution as a sample sunset photograph.”

Specific applications include the following:

• Image databases [1], in which we would like to support queries on color (Chapter 11), shape (Chapter 13), and texture (Chapter 12).

• Video databases [2,3].

• Financial, marketing, and production time series, such as stock prices, sales numbers, and so on. In such databases, typical queries would be “find companies whose stock prices move similarly,” or “find other companies that have sales patterns similar to our company,” or “find cases in the past that resemble last month’s sales pattern of our product” [4].

• Scientific databases (Chapters 3 and 5), with collections of sensor data. In this case, the objects are time series or, more generally, vector fields, that is, tuples of the form <x, y, z, t, pressure, temperature, ...>. For example, in weather data [5], geologic, environmental, and astrophysics databases, and so on, we want to ask queries of the form “find previous days in which the solar magnetic wind showed patterns similar to today’s pattern,” to help in predictions of the Earth’s magnetic field [6].


• Multimedia databases, with audio (voice, music), video, and so on [7]. Users might want to retrieve, for example, music scores or video clips that are similar to provided examples.

• Medical databases (Chapter 4), in which 1D objects (e.g., ECGs), 2D images (e.g., X rays), and 3D images (e.g., MRI brain scans) are stored. The ability to rapidly retrieve past cases with similar symptoms would be valuable for diagnosis and for medical and research purposes [8,9].

• Text and photographic archives [10] and digital libraries [11,12] containing ASCII text, bitmaps, gray scale, and color images.

• DNA databases [13] containing large collections of long strings (hundreds or thousands of characters long) from a four-letter alphabet (A,G,C,T); a new string has to be matched against the old strings to find the best candidates.

Searching for similar patterns in databases such as these is essential because it helps in predictions, computer-aided medical diagnosis and teaching, hypothesis testing and, in general, in “data mining” [14–16] and rule discovery.

Of course, the dissimilarity between two objects has to be quantified. Dissimilarity is measured as a distance between feature vectors extracted from the objects to be compared; we rely on a domain expert to supply such a distance function D():

Definition 1. The distance (= dissimilarity) between two objects O1 and O2 is denoted by D(O1, O2).

For example, if the objects are two (equal length) time series, the distance D() could be their Euclidean distance (sum of squared differences), whereas for DNA sequences, the editing distance (the smallest number of insertions, deletions, and substitutions needed to transform the first string into the second) is customarily used.
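For concreteness, the two distance functions just mentioned can be sketched in a few lines of Python (an illustration only; the function names are not from the chapter):

```python
import math

def euclidean_distance(s, q):
    """Euclidean distance between two equal-length time series."""
    assert len(s) == len(q), "whole-match comparison assumes equal lengths"
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, q)))

def edit_distance(s, q):
    """Editing distance between two strings: the smallest number of insertions,
    deletions, and substitutions that transform s into q (dynamic programming)."""
    m, n = len(s), len(q)
    dp = [[0] * (n + 1) for _ in range(m + 1)]  # dp[i][j]: cost of s[:i] -> q[:j]
    for i in range(m + 1):
        dp[i][0] = i                            # delete all of s[:i]
    for j in range(n + 1):
        dp[0][j] = j                            # insert all of q[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == q[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution (or match)
    return dp[m][n]

# Example: edit_distance("AGCT", "ACT") == 1 (one deletion).
```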

Similarity queries can be classified into two categories:

Whole Match. Given a collection of N objects O1, O2, ..., ON and a query object Q, we want to find those data objects that are within distance ε from Q. Notice that the query and the objects are of the same type: for example, if the objects are 512 × 512 gray scale images, so is the query.

Subpattern Match. Here, the query is allowed to match only part of the objects being searched. Specifically, given N data objects (e.g., images) O1, O2, ..., ON, a query object Q, and a tolerance ε, we want to identify the parts of the data objects that match the query. If the objects are, for example, 512 × 512 gray scale images (such as medical X-rays), the query might be a 16 × 16 subpattern (e.g., a typical X-ray of a tumor).

Additional types of queries include “nearest neighbors” queries (e.g., “find the five most similar stocks to IBM’s stock”) and “all pairs” queries or “spatial joins” (e.g., “report all the pairs of stocks that are within distance ε from each other”). Both these types of queries can be supported by our approach: as we shall see, we can reduce the problem to searching for multidimensional points that will be organized into R-trees; in this case, nearest-neighbor search can be handled with a branch-and-bound algorithm [17,18], and the spatial-join query can be handled with recently developed, finely tuned algorithms [19].

For both “whole match” and “subpattern match,” the ideal method should fulfill the following requirements:

• It should be fast. Sequential scanning and computing distances for each and every object can be too slow for large databases.

• It should be correct. In other words, it should return all the qualifying objects, without missing any (i.e., no “false dismissals”). Notice that “false alarms” are acceptable because they can be discarded easily through a post-processing step.

• The ideal method should require a small amount of additional memory.

• The method should be dynamic. It should be easy to insert, delete, and update objects.

The remainder of the chapter is organized as follows. Section 15.2 describes the main ideas of “GEMINI,” a generic approach to indexing multimedia objects. Section 15.3 shows the application of the approach to 1D time-series indexing. Section 15.4 focuses on indexing methods for shape, texture, and, particularly, color. Section 15.5 shows how to extend the ideas to handle subpattern matching. Section 15.6 summarizes the chapter and lists problems for future research. The appendix gives some background material on related past work, on image indexing, and on spatial access methods (SAMs).

15.2 GEMINI: FUNDAMENTALS

To illustrate the basic concepts of indexing, we shall focus on “whole match” queries. The problem is defined as follows:

• We have a collection of N objects: O1, O2, ..., ON;

• the distance (= dissimilarity) between two objects (Oi, Oj) is given by the function D(Oi, Oj);

• the user specifies a query object Q and a tolerance ε.

Our goal is to find the objects in the collection that are within distance ε of the query object. An obvious solution is to apply sequential scanning: for each and every object Oi (1 ≤ i ≤ N), we can compute its distance from Q and report the objects with distance D(Q, Oi) ≤ ε.
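The sequential-scanning baseline is worth writing down, if only as a point of comparison for what follows (a minimal sketch; D stands for whatever object-level distance function the domain expert supplies):

```python
def sequential_scan(objects, query, eps, D):
    """Baseline whole-match search: compute D(query, obj) for every object
    and report those within the tolerance eps."""
    return [obj for obj in objects if D(query, obj) <= eps]
```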

However, sequential scanning may be slow, for two reasons:

1. The distance computation might be expensive. For example, the editing distance in DNA strings requires a dynamic-programming algorithm, which grows with the product of the string lengths (typically in the hundreds or thousands, for DNA databases);

2. the database size N might be huge.

Thus, we look for a faster alternative. The “GEMINI” (GEneric Multimedia object INdexIng) approach is based on two ideas, each of which tries to avoid the two disadvantages of sequential scanning:

• a “quick-and-dirty” test, to discard quickly the vast majority of nonqualifying objects (possibly allowing some false alarms);

• the use of a SAM (spatial access method), to achieve faster-than-sequential searching.

To illustrate the first idea, consider a database of yearly stock-price sequences, each holding 365 daily values and compared with the Euclidean distance (sum of squared differences)

D(S, T) = (S[1] − T[1])² + (S[2] − T[2])² + ··· + (S[365] − T[365])²,

where S[i] stands for the value of stock S on the i-th day. Clearly, computing the distance between two stocks will take 365 subtractions and 365 squarings. The idea behind the “quick-and-dirty” test is to characterize a sequence with a single number, which will help us discard many nonqualifying sequences. Such a number could be, for example, the average stock price over the year. Clearly, if two stocks differ in their averages by a large margin, they cannot be similar. The converse is not true, which is exactly the reason we may have false alarms. Numbers that contain some information about a sequence (or a multimedia object, in general) will be referred to as “features” for the rest of this paper. A good feature (such as the “average” in the stock-price example) will allow us to perform a quick test, which will discard many items, using a single numerical comparison for each.

If using a single feature is good, using two or more features might be even better because they may reduce the number of false alarms, at the cost of making the “quick-and-dirty” test a bit more elaborate and expensive. In our stock-price example, additional features might include the standard deviation or some of the discrete Fourier transform (DFT) coefficients, as we shall see in Section 15.3.

By using f features, we can map each object into a point in f-dimensional (f-d) space. We shall refer to this mapping as F():

Definition 2. Let F() be the mapping of objects to f-d points; that is, F(O) will be the f-d point that corresponds to object O.

This mapping provides the key to improving on the second drawback of sequential scanning: by organizing these f-d points into a SAM, we can cluster them in a hierarchical structure, for example, an R*-tree. In processing a query, we use the R*-tree to prune out large portions of the database that are not promising. Such a structure will be referred to as an F-index (for “Feature index”). By using an F-index, we do not even have to do the “quick-and-dirty” test on all of the f-d points!

Figure 15.1 Illustration of the basic idea: a database of sequences S1, ..., Sn; each sequence is mapped to a point in feature space; a query with tolerance ε becomes a sphere of radius ε.

Figure 15.1 illustrates the basic idea: objects (e.g., time series that are 365 points long) are mapped into 2D points (e.g., using the average and the standard deviation as features). Consider the “whole match” query that requires all the objects that are similar to Sn within tolerance ε: this query becomes an f-d sphere in feature space, centered on the image F(Sn) of Sn. Such queries on multidimensional points are exactly what R-trees and other SAMs are designed to answer efficiently. More specifically, the search algorithm for a whole-match query is as follows:

Algorithm 1. Search an F-index:

1. Map the query object Q into a point F(Q) in feature space;

2. using the SAM, retrieve all points within the desired tolerance ε from F(Q);

3. retrieve the corresponding objects, compute their actual distance from Q, and discard the false alarms.
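A compact Python sketch of Algorithm 1, assuming a SAM object that supports a range query in feature space (the names `sam.range_query`, `extract_features`, and `D` are illustrative placeholders, not an API from the chapter):

```python
def f_index_search(sam, objects_by_id, query, eps, extract_features, D):
    """Whole-match search over an F-index (Algorithm 1).

    sam              -- spatial access method storing (object_id, feature_point) pairs
    objects_by_id    -- mapping from object id to the full object
    extract_features -- the mapping F() from objects to f-d points
    D                -- the object-level distance function
    """
    q_point = extract_features(query)              # step 1: map Q to F(Q)
    candidate_ids = sam.range_query(q_point, eps)  # step 2: range query in feature space
    results = []
    for oid in candidate_ids:                      # step 3: discard the false alarms
        obj = objects_by_id[oid]
        if D(query, obj) <= eps:
            results.append(obj)
    return results
```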

Intuitively, an F-index has the potential to relieve both problems of the sequential scan, presumably resulting in much faster searches.

However, the mapping F() from objects to f-d points must not distort the distances. More specifically, let D() be the distance function between two objects and Dfeature() be the distance between the corresponding feature vectors. Ideally, the mapping F() should preserve the distances exactly, in which case the SAM will have neither false alarms nor false dismissals. However, preserving distances exactly might be very difficult: for example, it is not obvious which features can be used to match the editing distance between two DNA strings. Even if the features are obvious, there might be practical problems: for example, we could treat every stock-price sequence as a 365-dimensional vector. Although in theory a SAM can support an arbitrary number of dimensions, in practice they all suffer from the “dimensionality curse” discussed in the survey appendix.

The crucial observation is that we can avoid false dismissals completely in the “F-index” method if the distance in feature space never overestimates the distance between two objects. Intuitively, this means that our mapping F() from objects to points should make things look closer. Mathematically, let O1 and O2 be two objects (e.g., same-length sequences) with distance function D() (e.g., the Euclidean distance), and let F(O1), F(O2) be their feature vectors (e.g., their first few Fourier coefficients), with distance function Dfeature() (e.g., the Euclidean distance, again). Then we have:

Lemma 1. To guarantee no false dismissals for whole-match queries, the feature-extraction function F() should satisfy the following formula:

Dfeature[F(O1), F(O2)] ≤ D(O1, O2).   (15.3)

Proof. Let Q be the query object, O be a qualifying object, and ε be the tolerance. We want to prove that if the object O qualifies for the query, then it will be retrieved when we issue a range query on the feature space. That is, we want to show that D(Q, O) ≤ ε implies Dfeature[F(Q), F(O)] ≤ ε. This follows immediately from Eq. (15.3), since Dfeature[F(Q), F(O)] ≤ D(Q, O) ≤ ε. We shall use this lemma to prove the correctness of the feature-extraction functions in Section 15.3 and Section 15.4.

In conclusion, the approach to indexing multimedia objects for fast similarity searching is as follows:

Algorithm 2. The “GEMINI” approach:

1. Determine the distance function D() between two objects;

2. find one or more numerical feature-extraction functions, to provide a “quick-and-dirty” test;

3. prove that the distance in feature space lower-bounds the actual distance D(), to guarantee correctness;

4. choose a SAM and use it to manage the f-d feature vectors.
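The four steps map naturally onto a small Python skeleton, shown here purely as an illustration with the single “average” feature from the stock-price example (the lower-bounding property of step 3 holds here because |avg(S) − avg(Q)| ≤ D(S, Q) for the Euclidean distance, but in general it must be proved for whichever features are chosen):

```python
import random

def D(s, q):
    """Step 1: object-level distance; the Euclidean distance between equal-length series."""
    return sum((a - b) ** 2 for a, b in zip(s, q)) ** 0.5

def F(s):
    """Step 2: quick-and-dirty feature(s); here, just the yearly average."""
    return (sum(s) / len(s),)

def D_feature(p, q):
    """Distance in feature space; Euclidean again."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

# Step 3 (sanity check only -- the real obligation is a proof): the feature distance
# never exceeds the object distance, so Lemma 1 guarantees no false dismissals.
random.seed(0)
s = [random.gauss(0, 1) for _ in range(365)]
q = [random.gauss(0, 1) for _ in range(365)]
assert D_feature(F(s), F(q)) <= D(s, q)

# Step 4 would insert F(s) for every sequence s into a SAM such as an R*-tree.
```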

In the next sections we show two case studies of applying this approach to 2D color images and to 1D time series. We shall see that the philosophy of the “quick-and-dirty” filter, in conjunction with the lower-bounding lemma, can lead to solutions to two problems:

• the dimensionality curse (time series);

• the “cross talk” of features (color images).

For each case study we (1) describe the objects and the distance function, (2) show how to apply the lower-bounding lemma, and (3) give experimental results on real or realistic data.

15.3 1D TIME SERIES

Here the goal is to search a collection of (equal length) time series to find the ones that are similar to a desired series. For example, in a collection of yearly stock-price movements, we want to find the ones that are similar to IBM. For the rest of the paper, we shall use the following notational conventions: if S and Q are two sequences, then

• Len(S) denotes the length of S;

• S[i : j] denotes the subsequence that includes entries in positions i through j;

• S[i] denotes the ith entry of sequence S;

• D(S, Q) denotes the distance of the two (equal length) sequences S and Q.

15.3.1 Distance Function

The first step in the GEMINI algorithm is to determine the distance measure between two time series. This is clearly application-dependent. Several measures have been proposed for 1D and 2D signals. In a recent survey for images (2D signals), Brown [21] mentions that one of the typical similarity measures is the cross-correlation (which reduces to the Euclidean distance, plus some additive and multiplicative constants).

As in Ref. [23], we chose the Euclidean distance because (1) it is useful in many cases and (2) other similarity measures can often be expressed as the Euclidean distance between feature vectors after some appropriate transformation [22].

We denote the Euclidean distance between two sequences S and Q by D(S, Q). Additional and more elaborate distance functions, such as time-warping [24], can also be handled [4], as long as we are able to extract appropriate features from the time series.

15.3.2 Feature Extraction and Lower-Bounding

Having decided on the Euclidean distance as the dissimilarity measure, the next step is to find some features that can lower-bound it. We would like a set of features that preserve or lower-bound the distance and carry enough information about the corresponding time series to limit the number of false alarms. The second requirement suggests that we use “good” features, namely, features with more discriminatory power. In the stock-price example, a “bad” feature would be, for example, the value during the first day: two stocks might have similar first-day values, yet they may differ significantly from then on. Conversely, two otherwise similar sequences may agree everywhere except for the first day’s values.

A natural feature to use is the average. Additional features might include the average of the first half, of the second half, of the first quarter, and so on. These features resemble the first coefficients of the Hadamard transform [25]. In signal processing, the most well-known transform is the Fourier transform and, for our case, the discrete Fourier transform (DFT). Before we describe the desirable features of the DFT, we proceed with its definition and some of its properties.
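As a small illustration (not code from the chapter), such segment-average features might be computed as follows; whether, and with what scaling, they lower-bound the Euclidean distance is exactly the question the DFT analysis below settles:

```python
def segment_averages(s):
    """Illustrative features: the overall average, the averages of the two halves,
    and the averages of the four quarters of the sequence."""
    def avg(seg):
        return sum(seg) / len(seg)
    n = len(s)
    return [
        avg(s),
        avg(s[: n // 2]), avg(s[n // 2:]),
        avg(s[: n // 4]), avg(s[n // 4: n // 2]),
        avg(s[n // 2: 3 * n // 4]), avg(s[3 * n // 4:]),
    ]
```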

15.3.3 Introduction to DFT

The n-point DFT [26,27] of a signal x = [x_i], i = 0, ..., n − 1, is defined to be a sequence X of n complex numbers X_F, F = 0, ..., n − 1, given by

X_F = (1/√n) Σ_{i=0..n−1} x_i exp(−j2πFi/n),   F = 0, 1, ..., n − 1,

where j is the imaginary unit (j² = −1). The DFT is an orthonormal (energy-preserving) transformation: by Parseval’s theorem, it preserves the sum of the energies (squares of the amplitude |x_i|) at every point of the sequence:

Σ_{i=0..n−1} |x_i|² = Σ_{F=0..n−1} |X_F|².

Because the DFT is also linear, it follows that it preserves the Euclidean distance between two sequences,

D(x, y) = D(X, Y),

where X and Y are the Fourier transforms of x and y, respectively. Thus, if we keep the first f coefficients of the DFT as the features, we have

D²feature[F(x), F(y)] = Σ_{F=0..f−1} |X_F − Y_F|² ≤ Σ_{F=0..n−1} |X_F − Y_F|² = D²(x, y);

that is, the resulting distance in the f-d feature space will clearly underestimate the distance of two sequences. Thus, according to Lemma 1, there will be no false dismissals.
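The lower-bounding property is easy to check numerically with numpy’s orthonormal FFT (a sketch of my own; the chapter only notes that DFT code is readily available in Mathematica or C):

```python
import numpy as np

def dft_features(s, f=2):
    """Keep the first f complex coefficients of the orthonormal (energy-preserving)
    DFT, returned as 2f real numbers."""
    coeffs = np.fft.fft(np.asarray(s, dtype=float), norm="ortho")[:f]
    return np.concatenate([coeffs.real, coeffs.imag])

rng = np.random.default_rng(0)
x, y = rng.standard_normal(1024), rng.standard_normal(1024)

d_true = np.linalg.norm(x - y)                              # D(x, y)
d_feat = np.linalg.norm(dft_features(x) - dft_features(y))  # Dfeature[F(x), F(y)]
assert d_feat <= d_true + 1e-9   # no false dismissals, by Lemma 1
print(f"feature distance {d_feat:.2f} <= true distance {d_true:.2f}")
```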

Note that the F-index approach can be applied with any orthonormal transform, such as the discrete cosine transform (DCT) [28], the wavelet transform [29], and so on, because they all preserve the distance between the original and the transformed space. In fact, our response time will improve with the ability of the transform to concentrate the energy: the fewer the coefficients that contain most of the energy, the fewer the false alarms, and the faster our response time. Thus, the performance results presented next are pessimistic bounds; better transforms will achieve even better response times.

We have chosen the DFT because it is the most well known, its code is readily available (e.g., in the Mathematica package [30] or in “C” [31]), and it does a good job of concentrating the energy in the first few coefficients. In addition, the DFT has the attractive property that the amplitude of the Fourier coefficients is invariant under time shifts. Thus, using the DFT for feature extraction allows us to extend our technique to finding similar sequences while ignoring shifts.

15.3.4 Energy-Concentrating Properties of DFT

Having proved that keeping the first few DFT coefficients lower-bounds the actual distance, we address the question of how good the DFT is, that is, whether it produces few false alarms. To achieve that, we have to argue that the first few DFT coefficients will usually contain most of the information about the signal.

The worst-case signal for the method is white noise, in which each value x_i is completely independent of its neighbors x_{i−1} and x_{i+1}. The energy spectrum of white noise follows O(F^0) [32]; that is, it has the same energy at every frequency. This is bad for the F-index because it implies that all the frequencies are equally important. However, many real signals have a skewed energy

spectrum. For example, random walks (also known as brown noise or brownian walks) exhibit an energy spectrum of O(F^−2) [32] and therefore an amplitude spectrum of O(F^−1). Random walks follow the formula

x_i = x_{i−1} + z_i,

where z_i is noise (e.g., white noise). Figure 15.3 shows the amplitude of the Fourier transform of such a signal, the Swiss franc exchange rate, in a log–log plot. Notice that, because it is a random walk, the amplitude of the Fourier coefficients follows the 1/F line.
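The 1/F behavior is easy to reproduce (an illustration, not an experiment from the chapter): generate a random walk and fit a line to the log–log amplitude spectrum of its low frequencies; the slope should come out close to −1.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1024
walk = np.cumsum(rng.standard_normal(n))      # random walk: x_i = x_{i-1} + z_i

spectrum = np.abs(np.fft.fft(walk, norm="ortho"))
freqs = np.arange(1, n // 8)                  # low frequencies, skipping the DC term
amplitudes = spectrum[1: n // 8]

# Least-squares fit in log-log space; amplitude ~ 1/F gives a slope near -1.
slope, _ = np.polyfit(np.log(freqs), np.log(amplitudes), 1)
print(f"log-log slope of the amplitude spectrum: {slope:.2f} (roughly -1 expected)")
```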

Figure 15.3 (Log–log) amplitude of the Fourier transform of the Swiss franc exchange rate, along with the 1/F line.

The mathematical argument for keeping the first few Fourier coefficients agrees with the intuitive argument of the Dow Jones theory for stock price movement [35]. This theory tries to detect primary and secondary trends in the stock market movement and ignores minor trends. Primary trends are defined as changes that are larger than 20 percent, typically lasting more than a year; secondary trends show 1/3 to 2/3 relative change over primary trends, with a typical duration of a few months; minor trends last approximately a week. From the foregoing definitions, we conclude that primary and secondary trends correspond to strong, low-frequency signals, whereas minor trends correspond to weak, high-frequency signals. Thus, the primary and secondary trends are exactly the ones that our method will automatically choose for indexing.

In addition to the signals mentioned earlier, there is another group of signals, called black noise [32]. Their energy spectrum follows O(F^−b), b > 2, which is even more skewed than the spectrum of brown noise. Such signals model successfully, for example, the water level of rivers as they vary over time [34]. In addition to stock price movements and exchange rates, it is believed that several families of real signals belong to the family of “colored noises,” with skewed spectra. For example, 2D signals, like photographs, are far from white noise, exhibiting a few strong coefficients in the lower spatial frequencies. The JPEG image-compression standard [28] exploits this phenomenon, effectively ignoring the high-frequency components of the DCT (which is closely related to the Fourier transform). If the image consisted of white noise, no compression would be possible at all. Birkhoff’s theory [32] claims that “interesting” signals, such as musical scores and other works of art, consist of pink noise, whose energy spectrum follows O(F^−1). The theory argues that white noise with an O(F^0) energy spectrum is completely unpredictable, whereas brown noise with an O(F^−2) energy spectrum is too predictable and therefore uninteresting, and so is black noise. The energy spectrum of pink noise lies in between. Signals with pink noise also have their energy concentrated in the first few frequencies (but not as few as in the random walk).


15.3.5 Experiments

To determine the effectiveness of the F-index method, we compare it to a sequential-scanning method. Experiments in Ref. [23] used an R*-tree and showed that the response time has a (rather flat) minimum when we retain f = 2 or 3 features. For the rest of the experiments, we kept f = 2 Fourier coefficients for indexing, resulting in a four-dimensional (4D) R*-tree (two real numbers for each complex DFT coefficient). The sequences were artificially generated random walks with length n = 1024; their number N varied from 50 to 400. Figure 15.4 shows the response time for the two methods (F-index and sequential scan) as a function of the number of sequences N. The main observations are the following:

1. Clearly, the F-index method outperforms sequential scanning;

2. for signals with skewed spectra, the minimum response time is achieved for a small number of Fourier coefficients (f = 1–3); moreover, the minimum is rather flat, which implies that a suboptimal choice for f will give a search time that is close to the minimum. Thus, with the help of the lower-bounding lemma and the energy-concentrating properties of the DFT, we avoid the “dimensionality curse”;

3. the success in 1D series suggests that the F-index method is promising for 2D or higher-dimensionality signals, if those signals also have a skewed spectrum. The success of JPEG (which uses the DCT) indicates that real images indeed have a skewed spectrum.

Figure 15.4 Search time per query versus number N of sequences, for whole-match queries; F-index method (black line) and sequential scanning (gray line).


15.4 2D COLOR IMAGES

One of the earliest systems for content search in large image databases was Query By Image Content (QBIC) [36]. The QBIC project studies methods to query large on-line image databases using image content as the basis for the queries. Types of content include color, texture, shape, position, and dominant edges of image items and regions. Potential applications include medical (“Give me other images that contain a tumor with a texture like this one”) and photojournalism (“Give me images that have blue at the top and red at the bottom”), as well as art, fashion, cataloging, retailing, and industry.

In this section, we give an overview of the indexing aspects of QBIC, specifically the distance functions and the application of the lower-bounding lemma. More details about the algorithms and the implementation are in Refs. [1,37].

We restrict the discussion to databases of still images, with two main data types: “images” (≡ “scenes”) and “items.” A scene is a (color) image; an item is a part of a scene, for example, a person, a piece of outlined texture, or an apple. Each scene has zero or more items. The identification and extraction of items is beyond the scope of this paper [1].

Given that semantic features are outside the capability of current machine vision technology, QBIC uses (in addition to text key words) the properties of color, texture, shape, location, and overall sketch-like appearance as the basis for retrieval. These properties are computable (to a greater or lesser degree) and have broad, intuitive applicability. For either a scene or an item, the user may query on any one or a combination of the above properties. All queries are “approximate” or “similarity” queries, and the system ranks the images based on the selected similarity function.

As an example, a user interested in retrieving a beach scene needs to map the query into the available parameters offered by QBIC: for instance, the color distribution would be set to 35 percent white and 65 percent blue, and the texture to “sand texture.” QBIC will retrieve and display images with these properties, ranked by the selected similarity measure. The result will include beach scenes and false alarms (images that happened to have similar color and texture distributions). The user is then given an opportunity to select items of interest and discard unwanted information, thereby guiding the retrieval process.

QBIC supports two ways of specifying a query:

1. “Direct query,” in which the user specifies the desired color, shape, texture, or sketch directly, through methods such as picking colors from a palette on the screen or drawing a freehand shape with the mouse.

2. “Query-by-example,” closely related to the concept of relevance feedback [38], in which the user chooses a displayed image or item (say, the result of a previous query) and asks for additional, similar images or items.

The two modes of query may be used interchangeably within the same query session.

15.4.1 Image Features and Distance Functions

In this section, we describe the feature sets used to characterize images and items, and the associated distance functions that try to capture the similarity that a human perceives. The features are computed once, during image ingest, whereas the matching functions are applied at query time using those features. We focus mainly on color features because color presents an interesting problem (namely, the “cross talk” between features), which can be resolved by the GEMINI approach (Algorithm 2).

Color. A typical method for representing color is to compute a k-element color histogram for each item and scene. Conceptually, k can be as high as 16 × 10^6 colors, with each individual color being denoted by a point in a 3D color space. In practice, we can cluster similar colors together using an agglomerative clustering technique [25], divide the color space into nonoverlapping buckets (called “color bins”), and choose one representative color for each color bin. In the experiments reported later, the number of clusters was k = 256 and k = 64. Each component in the color histogram is the fraction of pixels that are most similar to the corresponding representative color. For such a histogram of a fictitious photograph of a sunset, there are many red, pink, orange, and purple pixels, but few white and green ones (Fig. 15.5).
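A minimal sketch of this histogram computation (my own illustration; the representative bin colors stand in for the output of the agglomerative clustering step):

```python
import numpy as np

def color_histogram(pixels, bin_colors):
    """k-element color histogram: assign each pixel to the most similar
    representative color and return the fraction of pixels per color bin.

    pixels     -- (num_pixels, 3) array of colors (e.g., RGB)
    bin_colors -- (k, 3) array of representative colors, one per color bin
    """
    pixels = np.asarray(pixels, dtype=float)
    bin_colors = np.asarray(bin_colors, dtype=float)
    # Squared distance from every pixel to every representative color.
    d2 = ((pixels[:, None, :] - bin_colors[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)                          # most similar bin per pixel
    counts = np.bincount(nearest, minlength=len(bin_colors))
    return counts / len(pixels)                          # fractions summing to 1
```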

Determining the similarity of two images now reduces to the problem of measuring the distance between their color histograms. One such method defines the distance between two color histograms, viewed as k × 1 vectors x and y, by the quadratic form

d²hist(x, y) = (x − y)ᵗ A (x − y) = Σ_i Σ_j a_ij (x_i − y_i)(x_j − y_j),   (15.14)

where the color-to-color similarity matrix A = [a_ij] has entries a_ij describing the similarity between color i and color j.

This measure correctly accounts for color similarity (an orange image is similar to a red one) and color distribution (a half red–half blue image is different from an all-purple one).
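A sketch of this quadratic-form distance is shown below. The chapter does not spell out how A is chosen; the `similarity_matrix` helper assumes one common convention, a_ij = 1 − d_ij/d_max, with d_ij the distance between representative colors i and j.

```python
import numpy as np

def histogram_distance_sq(x, y, A):
    """Quadratic-form color-histogram distance: d_hist^2 = (x - y)^t A (x - y)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ A @ diff)

def similarity_matrix(bin_colors):
    """Assumed convention: a_ij = 1 - d_ij / d_max, where d_ij is the Euclidean
    distance between representative colors i and j."""
    c = np.asarray(bin_colors, dtype=float)
    d = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
    return 1.0 - d / d.max()
```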

Shape Features. Shape similarity has proven to be a difficult problem [40,41] in model-based vision applications, and the problem remains difficult in content-based image retrieval. Typical features are the area, circularity, eccentricity, major axis orientation, and a set of algebraic moment invariants. The distance between two shape vectors is the (weighted) Euclidean distance, in which the weights reflect the importance of each feature.

Texture Features. Our texture features are modifications of the coarseness, contrast, and directionality features proposed in Ref. [42]. See Ref. [37] for more details on our implementation of texture features, including descriptions of how we improved the robustness and efficiency of these measures. The distance function is the (weighted) Euclidean distance in the 3D texture space. Because indexing points in the 3D texture space is straightforward, we do not discuss texture further.

Figure 15.5 An example of a color histogram of a fictitious sunset photograph: many red, pink, orange, purple, and bluish pixels; few yellow, white, and green ones.

Sketch. The system supports the image-retrieval method described in Refs. [43,44] that allows images to be retrieved based on a rough user sketch. A Canny edge operator is applied to the luminance of each data image, transforming it into a black-and-white sketch (“edge map”), and the user’s sketch is compared with each edge map in order to retrieve the similar ones. Because of the complexity of the matching function, the current implementation uses sequential scanning. Applying the lower-bounding lemma for faster indexing is the topic of future research.

15.4.2 Lower-Bounding

As mentioned earlier, we focus on indexing color; texture creates no indexing problems, whereas shapes present only the “dimensionality curse” (f ≈ 20 features), which can be resolved with an energy-concentrating transformation: we can use the Karhunen-Loève (K-L) transform [25]. Experiments on collections of 1,000 and 10,000 shapes showed that an R-tree with the first f = 2 K-L coefficients gives the best overall response time [45].

There are two obstacles to applying the F-index method for color indexing: (1) the “dimensionality curse” (see Chapter 14), because the histogram can have numerous bins, for instance 64 or 256, and, most importantly, (2) the quadratic nature of the distance function. The distance function in the feature space involves “cross talk” among the features (see Eq. 15.14), and it is thus a full quadratic form involving all cross terms. Not only is such a function much more expensive to compute than a Euclidean (or any Lp) distance, but it also precludes the efficient implementation of commonly used multikey indexing methods. Figure 15.6 illustrates the situation: to compute the distance between the two color histograms
