

Content-Based Image Retrieval Using Moments of Local Ternary Pattern

Prashant Srivastava, Nguyen Thanh Binh, Ashish Khare

Published online: 18 July 2014
© Springer Science+Business Media New York 2014

Abstract Due to the availability of a large number of digital images, the development of an efficient content-based indexing and retrieval method is required. Also, the emergence of smartphones and modern PDAs has further substantiated the need for such systems. This paper proposes a combination of Local Ternary Pattern (LTP) and moments for Content-Based Image Retrieval. The image is divided into blocks of equal size and LTP codes of each block are computed. Geometric moments of the LTP codes of each block are computed, followed by computation of the distance between the moments of LTP codes of query and database images. Then, a threshold on the distance values is applied to retrieve images similar to the query image. Performance of the proposed method is compared with other state-of-the-art methods on the basis of results obtained on the Corel-1,000 database. The comparison shows that the proposed method gives better results in terms of precision and recall than other state-of-the-art image retrieval methods.

Keywords Image retrieval · Content-based image retrieval · Local ternary pattern · Geometric moments

1 Introduction

With the advent of numerous digital image libraries containing huge amounts of different types of images, it has become necessary to develop systems that are capable of performing efficient browsing and retrieval of images. Also, with the emergence of mobiles and smartphones, the number of images is increasing day by day. Pure text-based image retrieval systems are prevalent but are unable to retrieve visually similar images. Also, it is practically difficult to manually annotate a large number of images. Hence, a pure text-based approach is insufficient for image retrieval.

Content-Based Image Retrieval (CBIR), the retrieval of images on the basis of features present in the image, is an important problem of Computer Vision. Content-based image retrieval, instead of using keywords and text, uses visual features such as colour, texture and shape to search for an image in a large database [1, 2]. These features form a feature set which acts as an indexing scheme to perform search in an image database. The feature sets of query images are compared with those of database images to retrieve visually similar images. Since retrieval is based on the contents of an image, the process of arrangement and classification of images is easier, as it does not require manual annotation. The automatic classification of similar images together makes their access easier for users.

Early image retrieval systems were based on primitive features such as colour, texture and shape. The field of image retrieval has witnessed substantial work on the colour feature. Colour is a visible property of an object and a powerful descriptor of it. Colour-based CBIR systems use the conventional colour histogram to perform retrieval. Texture is another feature that has been used extensively for image retrieval. The texture feature represents the structural arrangement of a region and describes characteristics such as smoothness, coarseness and roughness of a region.

P. Srivastava · A. Khare (*)
Department of Electronics and Communication, University of Allahabad, Allahabad, Uttar Pradesh, India
e-mail: ashishkhare@hotmail.com

A. Khare
e-mail: khare@allduniv.ac.in

P. Srivastava
e-mail: prashant.jk087@gmail.com

N. T. Binh
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh, Vietnam
e-mail: ntbinh@cse.hcmut.edu.vn

DOI 10.1007/s11036-014-0526-7


One such texture feature is Local Binary Pattern (LBP) [3], which is applied on gray-level images. LBP is a very powerful descriptor as it is practically easy to compute and is invariant to gray-level transformations. However, being based on the bit values 0 and 1, the LBP operator fails to discriminate between multiple patterns. Also, the presence of noise in an image affects the LBP operator, as it is highly sensitive to noise. Tan et al. [4] provided an extension of LBP as Local Ternary Pattern (LTP). LTP thresholds neighbourhood pixels to three values and is less sensitive to noise as compared to LBP. However, LTP is not invariant to gray-level transformations.

Content-based retrieval methods based on the shape feature have also been used extensively. Shape here does not mean the shape of the whole image but the shape of a particular object or region in the image. Shape features generally act as global features. Global features consider the whole image to extract features; however, they do not consider local variations in the image. Unlike colour and texture, shape features are generally used after segmentation of objects from images [5]. Since segmentation is a difficult problem, shape features have not been exploited much, but shape is still considered a powerful descriptor. A single feature is insufficient to construct an efficient feature vector, which is essential for efficient image retrieval. The combination of more than one feature attempts to solve this problem. The combinations of colour and texture [6], colour and shape [7], and colour, texture and shape [8] have been widely used for this purpose.

Modern image retrieval methods combine local and global features of an image to perform efficient retrieval. The combination of local and global features exploits the advantages of both. This property has motivated us to combine the local feature LTP with the global feature moments. This paper combines LTP and moments in the form of moments of LTP. Grayscale images are divided into blocks of equal size and LTP codes of each block are computed. Geometric moments of these LTP codes are then computed to form the feature vector. Euclidean distance is computed between blocks of the query image and database images to measure similarity, followed by computation of threshold values to find images similar to the query image.

The rest of the paper is organized as follows. Section 2 discusses some of the related work in the field of image retrieval. Section 3 describes the fundamentals of LTP and image moments along with their properties. Section 4 is concerned with the proposed method. Section 5 discusses experimental results and Section 6 concludes the paper.

2 Related work

Over the past few decades the field of image retrieval has witnessed a number of approaches to improve retrieval performance. Text-based approaches are still in use and almost all web search engines follow this approach. Early CBIR systems were based on colour features. Later on, colour-based techniques saw the use of colour histograms. Texture features caught the attention of researchers and were used extensively for the purpose of image retrieval. Texture features such as LBP and LTP are considered to be powerful descriptive features and have been used for various applications. Pietikäinen et al. [9] proposed a block-based method for image retrieval using LBP. Murala et al. proposed two new features, namely Local Tetra Patterns (LTrP) [10] and Directional Local Extrema Patterns (DLEP) [11], based on the concept of Local Binary Pattern (LBP), as features for image retrieval. Liu et al. [12] proposed the concept of the Multi-texton Histogram (MTH), which is considered an improvement of the Texton Co-occurrence Matrix (TCM) [13]. The concept of MTH works for natural images. The concept of the Micro-structure Descriptor (MSD) has been described in [14]. This feature computes local features by identifying colours that have similar edge orientations.

Shape has also been exploited as a single feature as well as in combination with other features. Zhang et al. [15] proposed a region-based shape descriptor, namely the Generic Fourier Descriptor (GFD). A two-dimensional Fourier descriptor was applied on a polar raster sampled shape image in order to extract the GFD, which was applied on the image to determine the shape of the object. Lin et al. [16] proposed a rotation, translation and scale invariant method for shape identification which is also applicable to objects with a modest level of deformation. Yoo et al. [17] proposed the concept of a histogram of edge directions, called edge angles, to perform shape-based retrieval. In [18], the concept of moments was used for CBIR: images were divided into blocks and geometric moments of each block were computed. Euclidean distance between blocks of the query image and database image was computed, followed by computation of a threshold to retrieve visually similar images.

However, these features have been exploited as single features, which are not sufficient for constructing a powerful feature vector. Therefore, the combination of two or more features emerged as a silver lining in the field of image retrieval, as it combines the advantages of all the features. In [19], the combination of SIFT, LBP and HOG descriptors was proposed as a bag-of-features model in order to exploit both local and global features of an image. The combination of wavelets with other features has also been exploited for image retrieval. The combination of the Gabor filter and Zernike moments has been proposed in [20]: the Gabor filter performs texture extraction while the Zernike moments perform shape extraction. This method has been applied to face recognition, fingerprint recognition and shape recognition. Wavelets have also been used with colour in the form of the wavelet correlogram in [21]. Wavelets have the powerful characteristic of multiresolution analysis.


It is because of this property that wavelets have been used extensively for image retrieval. The combination of the à trous wavelet with the micro-structure descriptor (MSD), as the à trous gradient structure descriptor, has been proposed in [22]. Wang et al. [8] incorporated colour, texture and shape features for image retrieval: the colour feature is exploited using fast colour quantization, texture features are extracted using filter decomposition and, finally, shape features are exploited using pseudo-Zernike moments. Li et al. [23] proposed the use of the phase and magnitude of Zernike moments for image retrieval. Deselaers et al. [24] compared a number of features for image retrieval on different databases.

3 Features used and their properties

3.1 Local ternary patterns

Local Ternary Pattern (LTP) is an extension of Local Binary Pattern (LBP). Whereas the LBP operator thresholds a pixel to the 2-valued codes 0 and 1, LTP thresholds a pixel to 3-valued codes. The gray levels in a zone of width ±t around the centre pixel c are quantized to 0, those above this zone are quantized to +1 and those below it are quantized to −1. That is,

$$\mathrm{LTP}(p, c, t) = \begin{cases} 1, & p \ge c + t \\ 0, & |p - c| < t \\ -1, & p \le c - t \end{cases} \qquad (1)$$

where t is a user-specified threshold.

In order to eliminate negative values, the LTP codes are divided into two channels, the upper LTP (ULTP) and the lower LTP (LLTP). The ULTP is obtained by replacing the negative values by 0, while the LLTP is obtained by replacing the +1 values by 0 and the −1 values by 1. The two channels of LTP are treated as separate entities for which separate histograms and similarity metrics are computed, and these are combined at the end. Computation of LTP is shown with the help of an example in Fig. 1 (t = 5).
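As a concrete illustration of this definition, the following Python/NumPy sketch computes per-pixel LTP codes and splits them into the two channels. The function name, the 8-neighbour ordering and the packing of the +1/−1 patterns into two 8-bit codes are our own illustrative choices; the paper does not prescribe an implementation.

```python
import numpy as np

def ltp_codes(gray, t=5):
    """Upper/lower LTP code maps of a grayscale image (sketch).

    Each of the 8 neighbours p of a centre pixel c is quantised to
    +1 (p >= c + t), 0 (|p - c| < t) or -1 (p <= c - t), as in eq. (1).
    The +1 pattern is packed into an 8-bit ULTP code and the -1
    pattern into an 8-bit LLTP code.
    """
    g = gray.astype(np.int32)
    h, w = g.shape
    centre = g[1:-1, 1:-1]
    ultp = np.zeros((h - 2, w - 2), dtype=np.uint8)
    lltp = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Clockwise 8-neighbourhood; each neighbour contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        p = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        ultp += (p >= centre + t).astype(np.uint8) * np.uint8(1 << bit)
        lltp += (p <= centre - t).astype(np.uint8) * np.uint8(1 << bit)
    return ultp, lltp
```

With t = 5 this reproduces the kind of computation illustrated in Fig. 1.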

3.2 Properties of LTP

LTP holds the following important properties:

1 LTPs are less sensitive to noise as compared to LBP.
2 LTP is not invariant to gray-level transformations.

3.3 Moments

A moment is a measure of the shape of an object. Image moments are useful to describe objects after segmentation. Image moments and various types of moment-based invariants play an important role in object recognition and shape analysis. The (p + q)th order geometric moment M_pq of a gray-level image f(x, y) is defined as

$$M_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p\, y^q\, f(x, y)\, dx\, dy \qquad (2)$$

In the discrete case [25], the integral in equation (2) reduces to a summation and equation (2) becomes

$$M_{pq} = \sum_{x=1}^{n} \sum_{y=1}^{m} x^p\, y^q\, f(x, y) \qquad (3)$$

where n × m is the size of the gray-level image f(x, y).
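For reference, a direct NumPy transcription of equation (3) could look as follows; the function name and the 1-based coordinate convention follow the summation limits above and are not taken from the paper.

```python
import numpy as np

def geometric_moment(f, p, q):
    """(p+q)-th order geometric moment M_pq of a 2-D array f, eq. (3).

    x runs over the n rows and y over the m columns, both starting at 1
    to match the summation limits in the text.
    """
    n, m = f.shape
    x = np.arange(1, n + 1, dtype=np.float64).reshape(-1, 1)  # row coordinates
    y = np.arange(1, m + 1, dtype=np.float64).reshape(1, -1)  # column coordinates
    return float(np.sum((x ** p) * (y ** q) * f))
```

For example, geometric_moment(f, 0, 0) is simply the sum of all values of f (the zeroth-order moment).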

Simple properties of an image which are found via image moments include its area, its centroid and information about its orientation. Moment features are invariant to geometric transformations. Such features are useful to identify objects with unique shapes regardless of their size and orientation. Being invariant under linear coordinate transformations, moment invariants are useful features in pattern recognition problems. Moments have been used for distinguishing between shapes of different aircraft, character recognition and scene matching applications. The following properties of image moments are very useful in image retrieval:

1 Moment features are invariant to geometric transformations.
2 Moment features provide enough discrimination power to distinguish among objects of different shapes.
3 Moment features provide efficient local descriptors for identifying the shape of objects.
4 An infinite sequence of moments uniquely identifies an object.

3.4 Local ternary patterns and moments

A single feature fails to capture the complete information of an image. A combination of features is required to incorporate fine details of an image while constructing the feature vector. The combination of local and global features is one such approach in this direction. Local features help in capturing local variations, while global features capture a holistic view of an image. This approach also combines the advantages of both types of features. The combination of LTP and moments helps in fulfilling these criteria. LTP, a local feature, captures texture details and acts as a powerful classifier.


Moment, a global feature, determines the shape of an object in the image and is invariant to geometric transformations. The advantages of this combination are summarized as follows:

1 LTP, as compared to LBP, is less sensitive to noise and hence the combination of LTP with moments is less affected by the presence of noise.
2 The use of geometric moments as a single feature creates numerical instabilities, as higher-order moments take high values [26]. The combination of LTP and moments overcomes this disadvantage, as the moment values of LTP codes are not very high.
3 Geometric moments are invariant to geometric transformations. Hence their combination with LTP incorporates this advantage in the LTP-moment feature vector.

4 The proposed method

The proposed method consists of three steps:

1 The first step is concerned with division of the image into blocks and computation of LTP codes of each block.
2 In the second step, we compute geometric moments of the LTP codes of the query and database images.
3 A threshold is computed to perform retrieval in the third step.

The schematic diagram of the proposed method is shown in Fig. 2.

4.1 Computation of LTP codes

The algorithm for computation of LTP codes is as follows:

1 Convert the image into grayscale.
2 Rescale the image to 252 × 252.
3 Divide the image into blocks of size 84 × 84 and compute LTP codes of each block.
4 Computation of LTP yields two values: the upper LTP (ULTP) and the lower LTP (LLTP).
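A compact sketch of steps 1-4 is given below, reusing the illustrative ltp_codes helper from Section 3.1. Pillow is used here only for loading and resizing (the paper does not specify a library), and t = 5 is taken from the example in Fig. 1.

```python
import numpy as np
from PIL import Image

def block_ltp_codes(path, size=252, block=84, t=5):
    """Steps 1-4: grayscale conversion, rescaling to 252 x 252,
    division into 84 x 84 blocks, and per-block ULTP/LLTP codes."""
    gray = np.array(Image.open(path).convert("L").resize((size, size)))
    blocks = []
    for r in range(0, size, block):
        for c in range(0, size, block):
            ultp, lltp = ltp_codes(gray[r:r + block, c:c + block], t=t)
            blocks.append((ultp, lltp))
    return blocks  # 9 (ULTP, LLTP) pairs for a 252 x 252 image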

4.2 Computation of moments

Geometric moments of the ULTP and LLTP codes are computed using eqn (3). The sequence of moments chosen here is 0 to 15. The moment values of ULTP and LLTP are computed separately.
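Continuing the sketch, one possible way to turn the per-block ULTP/LLTP code maps into moment feature vectors is shown below. The paper states that moments 0 to 15 are used but does not spell out how this sequence maps onto (p, q) orders, so the row-major enumeration of the first 16 (p, q) pairs here is purely an assumption.

```python
import numpy as np

def block_moment_features(blocks):
    """Per-block geometric moments of the ULTP and LLTP code maps.

    Assumes 'blocks' is the list of (ULTP, LLTP) pairs from Section 4.1
    and reuses geometric_moment() from Section 3.3.  The 16 moment
    orders below are an illustrative choice, not the authors' exact
    sequence.
    """
    orders = [(p, q) for p in range(4) for q in range(4)]  # 16 (p, q) pairs
    feats_ultp, feats_lltp = [], []
    for ultp, lltp in blocks:
        feats_ultp.append([geometric_moment(ultp, p, q) for p, q in orders])
        feats_lltp.append([geometric_moment(lltp, p, q) for p, q in orders])
    # One 16-dimensional moment vector per block and per channel.
    return np.array(feats_ultp), np.array(feats_lltp)
```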

Fig. 1 Computation of LTP

4.3 Distance measurement

Let the moments of the LTP codes for the different blocks of the query image be represented as $m_Q = (m_{Q_1}, m_{Q_2}, \ldots, m_{Q_n})$, and let the moments of the LTP codes for the different blocks of a database image be represented as $m_{DB} = (m_{DB_1}, m_{DB_2}, \ldots, m_{DB_n})$. Then the Euclidean distance between the block LTP moments of the query and database image is given as

$$D(m_Q, m_{DB}) = \sqrt{\sum_{i=1}^{n} \left(m_{Q_i} - m_{DB_i}\right)^2} \qquad (4)$$
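In code, equation (4) amounts to an ordinary Euclidean distance over the flattened per-block moment vectors (illustrative names, continuing the sketches above):

```python
import numpy as np

def ltp_moment_distance(feat_q, feat_db):
    """Euclidean distance of eq. (4) between the block LTP moment
    features of a query image and a database image."""
    diff = (np.asarray(feat_q, dtype=np.float64).ravel()
            - np.asarray(feat_db, dtype=np.float64).ravel())
    return float(np.sqrt(np.sum(diff ** 2)))
```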

4.4 Computation of threshold

A threshold is used to perform retrieval. Use of a threshold improves the retrieval results as compared to the results obtained without using a threshold. The basic idea behind threshold computation is to find the range of distance values which return images similar to the query image. The Euclidean distance values computed using equation (4) are sorted in ascending order so that images are arranged according to similarity to the query image, that is, the most similar image first and the others after it. The indices of similar images are stored along with their distance values to identify the minimum and maximum values of the range. This determines the range of similarity to a query image. This procedure is repeated for every image of the database to find the range of similarity. Finally, the minimum and maximum over all ranges of values are determined. These values determine the threshold of the entire category of similar images. This is done for all categories of images in the database. The threshold values for the upper LTP and the lower LTP are computed separately. To compute the threshold, let

(i) N be the total number of relevant images in the database and NDB be the total number of images in the database,
(ii) sortmat be the sorted matrix (ascending order) of distance values and minix be the first N indices of images in the sortmat matrix,
(iii) start_range and end_range be the range of relevant images in the database,
(iv) maxthreshold and minthreshold be respectively the maximum and minimum distance values of each query image,
(v) mthreshmat be the maximum of all the values of maxthreshold.

Then the algorithm to compute the threshold is given below.
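The pseudo-code listing referred to here appears as a figure in the original paper and is not reproduced in this text, so the following Python sketch is only our reconstruction from the description above; the variable names mirror the notation in (i)-(v), but the exact control flow is an assumption.

```python
import numpy as np

def category_threshold(dist_matrix, n_relevant):
    """Reconstructed threshold computation for one image category.

    dist_matrix[q, d] is the distance (eq. 4) between query image q of
    the category and database image d; n_relevant is N from (i).  For
    each query the distances are sorted in ascending order (sortmat),
    the first N indices (minix) give the range of relevant images
    (start_range .. end_range), and the per-query minima and maxima
    (minthreshold, maxthreshold) are combined into one (min, max)
    threshold pair for the whole category (mthreshmat).
    """
    min_thresholds, max_thresholds = [], []
    for dists in np.asarray(dist_matrix, dtype=np.float64):
        order = np.argsort(dists)          # ascending-distance ordering
        minix = order[:n_relevant]         # first N indices
        in_range = dists[minix]            # distances of the relevant range
        min_thresholds.append(in_range.min())
        max_thresholds.append(in_range.max())
    return min(min_thresholds), max(max_thresholds)
```

As the text specifies, the same procedure would be run separately on the ULTP and LLTP distance values.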


5 Experiment and results

To perform experiments with the proposed method, images from the Corel-1K database [27] have been used. The images in this database are classified into ten categories, namely Africans, Beaches, Buildings, Buses, Dinosaurs, Elephants, Flowers, Horses, Mountains and Food. Each image is of size either 256 × 384 or 384 × 256. Each category consists of 100 images. Each image has been rescaled to 252 × 252 to ease the computation. Sample images from each category are shown in Fig. 3.

Each image of this database is taken as a query image. If the retrieved images belong to the same category as the query image, the retrieval is considered successful; otherwise the retrieval fails.

5.1 Performance evaluation

Performance of the proposed method has been measured in terms of precision and recall. Precision is defined as the ratio of the total number of relevant images retrieved to the total number of images retrieved. Mathematically, precision can be formulated as

$$P = \frac{I_R}{T_R}$$

where $I_R$ denotes the total number of relevant images retrieved and $T_R$ denotes the total number of images retrieved.

Recall is defined as the ratio of the total number of relevant images retrieved to the total number of relevant images in the database. Mathematically, recall can be formulated as

$$R = \frac{I_R}{C_R}$$

where $C_R$ denotes the total number of relevant images in the database. In this experiment, $T_R = 10$ and $C_R = 100$.
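For completeness, these two measures translate directly into code (an illustrative helper, with T_R = 10 and C_R = 100 as stated above):

```python
def precision_recall(retrieved_ids, relevant_ids, c_r=100):
    """Precision P = I_R / T_R and recall R = I_R / C_R for one query."""
    i_r = len(set(retrieved_ids) & set(relevant_ids))  # relevant images retrieved
    t_r = len(retrieved_ids)                           # images retrieved (10 here)
    return i_r / t_r, i_r / c_r
```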

5.2 Retrieval results

For the experimentation purpose, each image is divided into blocks of size 84 × 84. Local Ternary Pattern codes of each block are computed, followed by computation of geometric moments of the LTP codes. The distance between the block moments of the query image and database images is determined. Then retrieval is performed using the threshold obtained from the threshold algorithm.

Fig. 2 Schematic diagram of the proposed method

Fig. 3 Sample images from the Corel-1,000 database

The computation of the local ternary pattern yields two values, namely the upper LTP and the lower LTP. These two values are treated as separate entities of the LTP codes. Separate moment, distance and threshold values are computed, which are subsequently combined at the end of the threshold computation. After computing the distance measurements for the two moment values, the threshold is computed for the purpose of retrieval. This produces two sets of similar images. The union of these two sets is taken to produce the final set of similar images. Recall is computed by counting the total number of relevant images in the final set. Similarly, for precision, the top n matches for each image set are counted and then the union of these two sets is taken to produce the final set. Mathematically, this can be formulated as follows. Let $f_{ULTP}$ be the set of similar images obtained from moments of the upper LTP codes and $f_{LLTP}$ be the set of similar images obtained from moments of the lower LTP codes. Then the final set of similar images, denoted by $f_{RS}$, is given by

$$f_{RS} = f_{ULTP} \cup f_{LLTP}$$

Similarly, let $f^{n}_{ULTP}$ and $f^{n}_{LLTP}$ be the sets of top n images obtained from moments of the upper LTP codes and moments of the lower LTP codes respectively. Then the final set of top n images, denoted by $f^{n}_{PS}$, is given as

$$f^{n}_{PS} = f^{n}_{ULTP} \cup f^{n}_{LLTP}$$
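In code, this fusion step is a plain set union over the image identifiers returned by the two channels (illustrative names, not from the paper):

```python
def fuse_channels(f_ultp, f_lltp):
    """Final set of similar images: union of the ULTP and LLTP results."""
    return set(f_ultp) | set(f_lltp)

# Example: image ids retrieved by the two channels for one query.
print(fuse_channels({3, 7, 12}, {7, 12, 25}))   # -> {3, 7, 12, 25}
```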

Table 1 Average precision and recall values for each category of image

Category Precision (%) Recall (%)

Fig. 4 a Precision vs. category plot. b Recall vs. category plot

Table 2 Comparison of the proposed method with other methods

Method                      Average precision (%)
CBIR using moments [18]     35.94
Gabor histogram [24]        41.30
Image-based HOG-LBP [19]    46.00
LF SIFT histogram [24]      48.20
Color histogram [24]        50.50

Fig. 5 Comparison of the proposed method (PM) with other methods in terms of average precision


Retrieval is considered good if the values of precision and recall are high. Table 1 shows the performance of the proposed method for each category of images in the database in terms of precision and recall. Fig. 4 shows the plots of precision and recall values for the different image categories.

The proposed method is compared with other state-of-the-art methods such as the Block-based LBP method [9], Image-based HOG-LBP [19], and LF SIFT Histogram [24]. Table 2 shows the performance comparison of the proposed method with the other methods in terms of average precision. Fig. 5 shows the plot of precision against methods. Values of precision and recall were computed on the same Corel-1K image database. From Table 2 and Fig. 5 it can be observed that the proposed method outperforms, in terms of precision, Block-based LBP [9] by 30.70 %, CBIR using Moments [18] by 17.76 %, Gabor Histogram [24] by 12.4 %, Image-based HOG-LBP [19] by 7.7 %, LF SIFT Histogram [24] by 5.5 %, and Color Histogram [24] by 3.2 %.

6 Conclusion

In this paper, we have presented the combination of LTP and moments. Local Ternary Pattern codes of blocks of a gray-level image are computed. Geometric moments of the resulting LTP codes are then computed. The method then computes the distance between blocks of query and database images, and finally retrieval is performed on the basis of a threshold. This method combines the advantage of the low noise sensitivity of LTP with the invariance of moments to geometric transformations. Also, this method exploits the advantages of the fusion of local and global features of an image.

Performance of the proposed method was measured in terms of precision and recall. The experimental results showed that the proposed method outperformed other state-of-the-art methods. Results of the proposed method can be further improved by dividing the moments into a larger number of sequences.

References

1 Long H, Zhang H, Feng DD (2003) Fundamentals of content-based image retrieval. Multimedia information retrieval and management. Springer, Berlin, Heidelberg, pp 1–26
2 Rui Y, Huang TS, Chang S (1999) Image retrieval: current techniques, promising directions, and open issues. J Vis Commun Image Represent 10:39–62
3 Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
4 Tan X, Triggs B (2010) Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans Image Process 19(6):1635–1650
5 Khare M, Srivastava RK, Khare A (2013) Moving object segmentation in Daubechies complex wavelet domain. Signal, Image and Video Processing, accepted. doi:10.1007/s11760-013-0496-4, Springer
6 Wang X, Zhang B, Yang H (2002) Content-based image retrieval by integrating color and texture features. Multimedia Tools Appl 1–25
7 Gevers T, Smeulders AW (2000) Pictoseek: combining color and shape invariant features for image retrieval. IEEE Trans Image Process 33(1):102–119
8 Wang X, Yu Y, Yang H (2011) An effective image retrieval scheme using color, texture and shape features. Comput Stand Interfaces 33(1):59–68
9 Pietikäinen M, Takala V, Ahonen T (2005) Block-based methods for image retrieval using local binary patterns. 14th Scandinavian Conference on Image Analysis, pp 882–891
10 Murala S, Maheshwari RP, Balasubramanian R (2012) Local tetra patterns: a new descriptor for content-based image retrieval. IEEE Trans Image Process 21(5):2874–2886
11 Murala S, Maheshwari RP, Balasubramanian R (2012) Directional local extrema patterns: a new descriptor for content-based image retrieval. Int J Multimedia Inf Retrieval 1(3):191–203
12 Liu G, Zhang L, Hou Y, Yang J (2008) Image retrieval based on multi-texton histogram. Pattern Recogn 43(7):2380–2389
13 Liu G, Yang Y (2008) Image retrieval based on texton co-occurrence matrix. Pattern Recogn 41(12):3521–3527
14 Liu G, Li Z, Zhang L, Xu Y (2011) Image retrieval based on microstructure descriptor. Pattern Recogn. doi:10.1016/j.patcog.2011.02.003
15 Zhang D, Lu G (2002) Shape-based image retrieval using generic Fourier descriptor. Signal Process Image Commun 17(10):825–848
16 Lin H, Kao Y, Yen S, Wang C (2004) A study of shape-based image retrieval. In: Proc 24th International Conference on Distributed Computing Workshops, pp 118–123
17 Yoo H, Jang D, Jung S, Park J, Song K (2002) Visual information retrieval via content-based approach. J Pattern Recognit Soc 35:749–769
18 Srivastava P, Binh NT, Khare A (2013) Content-based image retrieval using moments. In: Proc 2nd International Conference on Context-Aware Systems and Applications, pp 228–237
19 Yu J, Qin Z, Wan T, Zhang X (2013) Feature integration analysis of bag-of-features model for image retrieval. Neurocomputing 120:355–364
20 Fu X, Li Y, Harrison R, Belkasim S (2006) Content-based image retrieval using Gabor-Zernike features. 18th International Conference on Pattern Recognition, Hong Kong, 2:417–420
21 Moghaddam HA, Khajoie TT, Rouhi AH, Tarzjan MS (2005) Wavelet correlogram: a new approach for image indexing and retrieval. Pattern Recogn 38:2506–2518
22 Agarwal M, Maheshwari RP (2012) À trous gradient structure descriptor for content based image retrieval. Int J Multimedia Inf Retr 1(2):129–138
23 Li S, Lee MC, Pun CM (2009) Complex Zernike moments shape-based image retrieval. IEEE Trans Syst Man Cybern Part A Syst Hum 39(1):227–237
24 Deselaers T, Keysers D, Ney H (2008) Features for image retrieval: an experimental comparison. Inf Retr 11:77–107
25 Flusser J (2005) Moment invariants in image analysis. Enformatika 11
26 Kotoulas L, Andreadis I (2005) Image analysis using moments. 5th International Conference on Technology and Automation, Thessaloniki, Greece, pp 360–364
27 http://wang.ist.psu.edu/docs/related/
