International Journal of Distributed Sensor Networks
2017, Vol. 13(1)
© The Author(s) 2017
DOI: 10.1177/1550147716686978
journals.sagepub.com/home/ijdsn
Deep combining of local phase quantization and histogram of oriented gradients for indoor positioning based on smartphone camera
Jichao Jiao and Zhongliang Deng
Abstract
To achieve high-accuracy indoor positioning using a smartphone, two limitations must be addressed: (1) the limited computational and memory resources of the smartphone and (2) the image degradation caused by human walking in large buildings. To address these issues, we propose a new feature descriptor that deeply combines the histogram of oriented gradients and local phase quantization. This feature is a local phase quantization of a salient histogram-of-oriented-gradients visualization image, which is robust in indoor scenarios. Moreover, we introduce a base station-based indoor positioning system that reduces the image matching required at runtime. The experimental results show that accurate and efficient indoor positioning is achieved.
Keywords
Indoor positioning, smartphone, salient region detection, deep combining of histogram of oriented gradients and local phase quantization, histogram of oriented gradient visualization
Date received: 27 June 2016; accepted: 24 November 2016
Academic Editor: Gang Wang
Introduction
Indoor positioning is considered an enabler for a variety of applications, such as guiding passengers in airports, conference attendees, and visitors in shopping malls, and for many novel context-aware services, which can play a significant role in monetization. The demand for indoor positioning services, or indoor location-based services (iLBS), has also accelerated, given that people spend the majority of their time indoors.1 Over the last decade, researchers have studied many indoor positioning techniques.2 In addition, with the development of integrated circuit technology, multiple sensors, for example, the camera, Earth's magnetic field sensor, WiFi, Bluetooth, and the inertial module, have been integrated into smartphones. Therefore, smartphones are becoming powerful platforms for location awareness.
The traditionally used outdoor localization method, the Global Navigation Satellite System (GNSS), is not available in indoor environments, even though navigation tasks at street level are very precise. A catalog of alternative localization techniques has been investigated, such as infrared-,3 sensor-,3,4 wireless-,5,6 and communication base station-based technologies,7 pseudolites,8 and visual markers.9 However, most of these technologies, relying on wireless technology, face issues in the presence of radio frequency interference (RFI) and non-line-of-sight (NLOS) interference caused by dense forests, urban canyons, and terrain.1 Moreover, some of these technologies work only in a limited area, such
School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China
Corresponding author:
Jichao Jiao, School of Electronic Engineering, Beijing University of Posts and Telecommunications, Xitu Road, Haidian, Beijing 100876, China. Email: jiaojichao@gmail.com
Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License
(http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (http://www.uk.sagepub.com/aboutus/openaccess.htm).
as inertial sensor-based approaches, or need a particular environmental infrastructure and augmentation, such as Locata, that is, a pseudolite positioning system.8 Therefore, smartphone camera-based indoor positioning is a promising approach for accurate indoor positioning without the need for expensive infrastructure such as access points or beacons.
The key method of camera-based localization is image matching. Images taken by a smartphone camera are matched to previously acquired reference images with known position and orientation. The matching of smartphone recordings with a database of geo-referenced images allows for meter-accurate, infrastructure-free localization.10 According to the matched reference image, the location of the smartphone is calculated. In mobile indoor scenarios, as shown in Figure 2, users usually walk during the positioning and navigation procedure. Therefore, the images captured by smartphone cameras are scaled, rotated, and even blurred because of hand shaking. Consequently, most researchers have recently focused on invariant feature extraction. Ravi et al.11 extracted color histograms, wavelet decompositions, and image shapes for image matching to locate a user's position. Kim and Jun12 proposed a method based on image color histogram features for positioning using an augmented reality tool. However, those two methods would work inefficiently in varying-light and crowded scenarios. In order to extract invariant features, SIFT and its improved algorithms are widely used for image-based indoor localization. Kawaji et al. used the principal component analysis-scale-invariant feature transform (PCA-SIFT) feature for indoor positioning in a railway museum. Werner et al.13 proposed camera-based indoor positioning using the speeded-up robust features (SURF) descriptor to speed up the image matching. Li and Wang14 introduced the affine-scale-invariant feature transform (A-SIFT) feature for image matching, refined with random sample consensus (RANSAC), which increased the matching accuracy. Heikkilä et al.15 proposed a similar method14 for indoor positioning.
However, such computationally complex methods are not suitable for smartphone-based indoor positioning because of the limited computational resources of mobile devices. The authors of ref. 16 extracted edge-based features from a visual tag image and fused those features with inertial information for indoor navigation. Kim and Jun12 used a Sobel filter integrated with the mean structural similarity index to estimate the angle of arrival and the height during indoor localization. However, these two methods need additional visual marks to assist the smartphone camera in detecting features, which increases the indoor positioning cost. Meanwhile, all of these research works mainly focus on improving image-matching accuracy. Some of these algorithms are, however, quite demanding in terms of their computational complexity and are therefore not suited to run on mobile devices without a high hardware configuration. Although smartphones are inexpensive, they have even more limited performance than tablets and PCs. Phones are embedded systems with severe limitations in both computational facilities and memory bandwidth. Therefore, natural feature extraction and matching on phones had largely been considered prohibitive and had not been successfully demonstrated to date.17 To address these issues, Van Opdenbosch et al.10 used an improved vector of locally aggregated descriptors (VLAD) image signature and the emerging binary feature descriptor, binary robust independent elementary features (BRIEF), to achieve smartphone camera-based indoor positioning. In addition, to reduce the overall computational complexity, they proposed a scalable streaming approach for loading the reference images onto the phones. Different from their method, this article proposes an efficient feature descriptor named the Turbo Fusing HOG and LPQ Salient feature (TFHLS), where HOG denotes the histogram of oriented gradients and LPQ denotes local phase quantization. The TFHLS features are extracted from partial images, namely salient image regions, and they are invariant to the illumination, scale, rotation, and blur caused by camera shaking. Moreover, a wireless indoor positioning system, time and code division-orthogonal frequency division multiplexing (TC-OFDM), is introduced to calculate coarse positions that supply the floor number to the smartphone, which reduces the number of images downloaded to the smartphone. Using this approach, our camera-based indoor positioning algorithm reduces the computational complexity, the hardware requirements, and the network latency.
This article is organized as follows. First, we discuss related work on HOG and LPQ feature extraction in section ''Related work.'' Then, we introduce our image feature extraction based on fusing HOG and LPQ in section ''Proposed smartphone camera-based indoor positioning.'' After that, we test the proposed algorithm on the Technische Universität München (TUM) indoor dataset18 and the Beijing University of Posts and Telecommunications (BUPT) indoor dataset collected by our lab, and the evaluation of our algorithm is also presented in that section. Finally, in section ''Conclusion,'' we conclude the article and outline future work on possible extensions.
Related work

Finding efficient and discriminative descriptors is crucial for complex indoor scenarios. The HOG descriptor was proposed by Dalal and Triggs19 for human detection.
The main idea behind HOG is based on local edge information.15 Because of its efficient performance, HOG features are widely used in human detection,20,21 face recognition,22,23 and image searching.24 All of these applications show that the HOG feature is invariant to illumination. According to our experiments, however, the HOG feature is not robust when humans are crowded and the images are blurred. Wang et al.25 combined the HOG and local binary pattern (LBP) features for human detection. However, they concluded that their detector cannot handle the articulated deformation of people. Our visualizations reveal that the world that features see is slightly different from the world that the human eye perceives.
LPQ is insensitive to image blurring, and it has recently proven to be a very efficient descriptor for face recognition from blurred and sharp images.15,26 LPQ was originally designed by Ojansivu and Heikkilä as a texture descriptor, with a methodology similar to that of LBP.27 In our opinion, robust and efficient image matching requires several different kinds of appearance information to be taken into account, suggesting the use of heterogeneous feature sets. In our proposed algorithm, the HOG features are extracted from the salient regions, and the LPQ features are extracted from the HOG visualization image. Therefore, HOG and LPQ are integrated to build an efficient feature, that is, TFHLS, for indoor image matching.
Proposed smartphone camera-based indoor positioning

The smartphone camera-based indoor positioning procedure using the TFHLS feature is shown in Figure 1.
Study materials
In order to test and evaluate the proposed algorithm, two databases are used. The first is provided by TUM.28 The TUM dataset contains 54,896 reference views, which cover 3431 positions with 1-m accuracy. The other dataset was collected by our lab, which captured 1000 indoor images using smartphone cameras on the BUPT campus. Different from the TUM dataset in how the reference positions are calculated, a static measurement system based on TC-OFDM and BeiDou real-time kinematic positioning is introduced. Scalable locations with a positioning accuracy of 0.1-1 m are obtained. The BUPT dataset covers four buildings and comprises a total of 2189 positions.
Superpixel-based sparsifying of high-resolution images
Inspired by the human vision system (HVS), features extracted from salient regions offer invariance to viewpoint change, insensitivity to image perturbations, and repeatability under intra-class variation.29 These features are extracted from selected regions of the image rather than the whole image, a procedure we call sparsifying the image in this article. Therefore, salient region detection is introduced for the image matching. In this article, a superpixel-based approach, simple linear iterative clustering (SLIC), proposed by Achanta et al.,30 is used to pre-segment an image. The SLIC method generates superpixels by clustering pixels based on their combined five-dimensional similarity and proximity in the image plane, as shown by the following functions
$$d_{lab} = \sqrt{(l_k - l_i)^2 + (a_k - a_i)^2 + (b_k - b_i)^2} \qquad (1)$$

$$d_{xy} = \sqrt{(x_k - x_i)^2 + (y_k - y_i)^2} \qquad (2)$$
Figure 1. Flowchart of smartphone camera-based indoor positioning.
$$D_s = d_{lab} + \frac{m}{S}\, d_{xy} \qquad (3)$$
where $D_s$ is the sum of the $d_{lab}$ distance and the $d_{xy}$ distance normalized by the grid interval $S$. A variable $m$ is introduced in $D_s$, allowing us to control the compactness of a superpixel. Equation (1) calculates the distance between two pixels in the CIELAB color space. Equation (2) obtains the Euclidean distance between two pixels in the image plane. Equation (3) combines the differently dimensioned distances into a single distance. Based on equation (3), the size of each superpixel can vary with $D_s$, which makes our proposed segmentation approach robust and accurate.
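As a concrete illustration, the following minimal Python sketch computes the combined SLIC distance of equations (1)-(3) between a cluster center k and a pixel i; the function name and interface are ours, and the m/S weighting in equation (3) is reconstructed from the description of the grid interval S above.

```python
import numpy as np

def slic_distance(lab_k, xy_k, lab_i, xy_i, S, m):
    """Combined SLIC distance between cluster center k and pixel i."""
    d_lab = np.linalg.norm(np.asarray(lab_k) - np.asarray(lab_i))  # equation (1): CIELAB distance
    d_xy = np.linalg.norm(np.asarray(xy_k) - np.asarray(xy_i))     # equation (2): spatial distance
    return d_lab + (m / S) * d_xy                                  # equation (3): combined distance
```

A larger m weights spatial proximity more heavily and therefore yields more compact superpixels, matching the role of m described above.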
In the SLIC method, the desired number of superpixels must be specified, which increases the computational complexity and is unsuitable for segmenting image sequences. Salient regions are therefore detected from the superpixel image, rather than the pixel-level image, using equation (4)
$$\Re(s_i) = \alpha \times C(s_i) + \beta \times T(s_i) \qquad (4)$$
where $\Re$ is the candidate salient region, $s_i$ is the $i$th superpixel, $C$ is the contrast, $T$ is the superpixel entropy, and $\alpha + \beta = 1$. The contrast $C$ used for detecting salient superpixels is calculated using equation (5)
$$C(s_i) = \frac{m_{s_i}}{m_f} \qquad (5)$$
where $m_{s_i}$ is the mean of the $i$th superpixel and $m_f$ is the mean of the image. Then, salient superpixel regions are detected. Moreover, in order to extract the HOG features, each salient superpixel region is extended into a related rectangle, named a salient rectangle. The sizes of those rectangles are calculated using equation (6)
$$R_a = |x_{max} - x_{min}|, \qquad R_b = |y_{max} - y_{min}| \qquad (6)$$
where $x_{max}$ is the position of the rightmost pixel in the horizontal direction, $x_{min}$ is the position of the leftmost pixel in the horizontal direction, $y_{max}$ is the position of the topmost pixel in the vertical direction, and $y_{min}$ is the position of the bottommost pixel in the vertical direction. The center of the salient rectangle is the centroid of the related superpixel.
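To make the salient-region pipeline concrete, the following Python sketch scores superpixels with equations (4) and (5) and derives salient rectangles with equation (6), given a precomputed SLIC label map. The entropy estimator, the weights, and the score threshold are illustrative assumptions rather than the paper's tuned parameters.

```python
import numpy as np

def salient_rectangles(img, labels, alpha=0.6, beta=0.4, thresh=1.2):
    """Score superpixels by contrast and entropy; return salient rectangles."""
    rects = []
    m_f = img.mean()                                    # global mean intensity
    for s in np.unique(labels):
        mask = labels == s
        vals = img[mask]
        contrast = vals.mean() / (m_f + 1e-9)           # equation (5): C(s_i)
        p, _ = np.histogram(vals, bins=32)
        p = p[p > 0] / p.sum()
        entropy = -(p * np.log2(p)).sum()               # T(s_i): superpixel entropy
        if alpha * contrast + beta * entropy > thresh:  # equation (4): saliency score
            ys, xs = np.nonzero(mask)
            # equation (6): rectangle extent from the superpixel's pixel span
            rects.append((xs.min(), ys.min(),
                          xs.max() - xs.min(), ys.max() - ys.min()))
    return rects
```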
TFHLS feature extraction approach
HOG feature extraction. HOG descriptors are invariant to two-dimensional (2D) rotation and have been used in many different computer vision problems, such as pedestrian detection. Compared to the original HOG, the integrated HOG feature proposed by Zhu et al.,21 which omits trilinear interpolation, is easier and faster to compute. However, its performance is worse than that of the original HOG. Therefore, we introduce a constrained trilinear interpolation approach to replace the general trilinear interpolation. Moreover, it should be noted that both Wang et al.25 and Li et al.31 proposed a 7 × 7 kernel, shown in equation (7), to convolve the gradients when calculating the gradient orientation at each pixel. However, convolving with a 7 × 7 kernel is a computationally heavy procedure. We therefore design a novel 5 × 5 convolution kernel. For an 8-bit image, the kernel template is shown in equation (8)
$$\mathrm{Conv}_{7\times 7} = \frac{1}{256} \begin{bmatrix} \cdot & \cdots & \cdot \\ \vdots & \ddots & \vdots \\ \cdot & \cdots & \cdot \end{bmatrix}_{7 \times 7} \qquad (7)$$

$$\mathrm{Conv}_{HOG} = \frac{1}{256} \begin{bmatrix} \cdot & \cdots & \cdot \\ \vdots & \ddots & \vdots \\ \cdot & \cdots & \cdot \end{bmatrix}_{5 \times 5} \qquad (8)$$

(the individual kernel coefficients did not survive extraction from the source)
Moreover, in order to reduce the space complexity of the integral image method, the kernel in equation (8) is convolved with the salient rectangle rather than the whole original image, which decreases the computational complexity.
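The Python sketch below illustrates this step on a single salient-rectangle patch: gradients are smoothed with a 5 × 5 kernel and then accumulated into per-cell orientation histograms. Because the coefficient values of equations (7) and (8) are not reproduced above, the binomial kernel here (whose entries also sum to 256 before normalization) is only an assumption, and the cell size and bin count are likewise illustrative.

```python
import numpy as np
from scipy.ndimage import convolve

def hog_cells(patch, cell=8, bins=9):
    """HOG over a salient-rectangle patch: smoothed gradients, cell histograms."""
    gx = np.gradient(patch.astype(float), axis=1)
    gy = np.gradient(patch.astype(float), axis=0)
    # Smooth the gradients with a 5x5 kernel in the spirit of equation (8);
    # the binomial coefficients below are an assumption, not the paper's values.
    k = np.outer([1, 4, 6, 4, 1], [1, 4, 6, 4, 1]) / 256.0
    gx, gy = convolve(gx, k), convolve(gy, k)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation
    b = (ang / (180.0 / bins)).astype(int) % bins      # orientation bin per pixel
    ch, cw = patch.shape[0] // cell, patch.shape[1] // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            ys = slice(i * cell, (i + 1) * cell)
            xs = slice(j * cell, (j + 1) * cell)
            np.add.at(hist[i, j], b[ys, xs].ravel(), mag[ys, xs].ravel())
    # Per-cell L2 normalization (block normalization omitted for brevity)
    return hist / (np.linalg.norm(hist, axis=2, keepdims=True) + 1e-9)
```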
HOG feature visualization. In this article, we build on the HOG visualization method proposed by Vondrick et al.32 Different from their complex method, a simpler method based on equation (9) is proposed
$$f^{-1}(y) = \operatorname*{arg\,min}_{x \in \mathbb{R}^D} \lVert f(x) - y \rVert_2^2 \qquad (9)$$

where $x \in \mathbb{R}^D$ is a salient-rectangle sub-image and $y = f(x)$ is the corresponding HOG feature descriptor. In this article, HOG feature visualization is posed as a feature inversion procedure. In order to optimize equation (9), we use a gradient-descent strategy, numerically evaluating the derivative in image space with the least-squares method.
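A minimal sketch of this inversion under the stated least-squares formulation is shown below; the feature map f is left generic, and the image-space derivative is evaluated numerically by finite differences. This brute-force loop is practical only for tiny patches and is not the paper's optimized solver.

```python
import numpy as np

def invert_feature(f, y, x0, lr=0.05, iters=100, eps=1e-3):
    """Gradient descent on ||f(x) - y||^2 (equation (9)) with numeric derivatives."""
    x = x0.astype(float).copy()
    for _ in range(iters):
        base = np.sum((f(x) - y) ** 2)            # current least-squares loss
        grad = np.zeros_like(x)
        for idx in np.ndindex(*x.shape):          # finite difference per pixel
            xp = x.copy()
            xp[idx] += eps
            grad[idx] = (np.sum((f(xp) - y) ** 2) - base) / eps
        x -= lr * grad                            # descend in image space
    return x
```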
LPQ feature extraction from the HOG visualization image. After inverting the HOG features into an image Y_HOG, LPQ features are extracted from Y_HOG using the simple scalar quantizer of equation (10). The LPQ feature quantifies the Fourier transform phase by considering the sign of each component of the Fourier coefficients G(x). Different from LBP, LPQ features are calculated in the frequency domain of the image, obtained by the fast Fourier transform (FFT). However, like LBP, the LPQ feature is extracted in a local region of the FFT domain. Following Dhall et al.,33 the local Fourier coefficients of each pixel are computed around its four frequency points. After that, in order to obtain the phase information of each pixel in a superpixel area, a binary scalar quantizer quantifies the signs of the real and imaginary parts of each coefficient. Finally, the quantization result of each coefficient is coded into an 8-bit binary string
$$q_i(x) = \begin{cases} 1, & \text{if } g_i(x) \geq 0 \\ 0, & \text{otherwise} \end{cases} \qquad (10)$$
where $g_i(x)$ is the $i$th component of $G(x)$. Then, the phase information of the 8-bit HOG visualization image is described using equation (11)
$$f_{LPQ}(x) = \sum_{n=1}^{8} q_n(x)\, 2^{n-1} \qquad (11)$$
The final LPQ features are used as feature vectors to
represent an indoor sub-image
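The following Python sketch illustrates LPQ extraction in this spirit: local Fourier coefficients at four low-frequency points are obtained by convolution, their real and imaginary signs are quantized per equation (10), and the bits are packed into 8-bit codes per equation (11). The window size, frequency layout, and bit ordering follow common LPQ conventions and are assumptions here, not parameters quoted from this article.

```python
import numpy as np
from scipy.signal import convolve2d

def lpq_descriptor(img, win=7):
    """LPQ: sign-quantized local Fourier coefficients packed into 8-bit codes."""
    a = 1.0 / win                                 # lowest non-zero frequency
    x = np.arange(win) - (win - 1) / 2.0
    w0 = np.ones(win)                             # DC window
    w1 = np.exp(-2j * np.pi * a * x)              # complex exponential at frequency a
    filters = [np.outer(w0, w1),                  # u1 = (a, 0)
               np.outer(w1, w0),                  # u2 = (0, a)
               np.outer(w1, w1),                  # u3 = (a, a)
               np.outer(w1, np.conj(w1))]         # u4 = (a, -a)
    code = np.zeros(img.shape, dtype=np.uint8)
    for n, flt in enumerate(filters):
        resp = convolve2d(img.astype(float), flt, mode='same')
        code |= (resp.real >= 0).astype(np.uint8) << (2 * n)      # equation (10), real part
        code |= (resp.imag >= 0).astype(np.uint8) << (2 * n + 1)  # equation (10), imaginary part
    hist, _ = np.histogram(code, bins=256, range=(0, 256))        # equation (11): 8-bit codes
    return hist / hist.sum()                      # normalized 256-bin descriptor
```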
TFHLS feature matching
The main advantage of the binarization, apart from a reduced memory footprint, is a very fast matching process using the normalized Hamming distance of equation (12)
$$d = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N}\Big[P_M(i,j) \cap Q_M(i,j) \cap \big(P_R(i,j) \oplus Q_R(i,j)\big) + P_M(i,j) \cap Q_M(i,j) \cap \big(P_I(i,j) \oplus Q_I(i,j)\big)\Big]}{2\,\sum_{i=1}^{N}\sum_{j=1}^{N} P_M(i,j) \cap Q_M(i,j)} \qquad (12)$$
where $P_R$ ($Q_R$), $P_I$ ($Q_I$), and $P_M$ ($Q_M$) are the real part, the imaginary part, and the mask of $P$ ($Q$), respectively. The result of the Boolean operator $\oplus$ (XOR) is equal to zero if and only if the two bits are equal. The symbol $\cap$ represents the AND operator, and the size of the feature matrices is $N \times N$.
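As an illustration, here is a minimal Python sketch of equation (12), assuming the real part, imaginary part, and mask of each feature matrix are stored as boolean N × N NumPy arrays:

```python
import numpy as np

def normalized_hamming(P_R, P_I, P_M, Q_R, Q_I, Q_M):
    """Normalized Hamming distance (equation (12)) between binary features."""
    valid = P_M & Q_M                             # compare only jointly valid bits
    diff = (valid & (P_R ^ Q_R)).sum() \
         + (valid & (P_I ^ Q_I)).sum()            # XOR real and imaginary bit planes
    return diff / (2.0 * valid.sum())             # normalize by twice the valid count
```

The resulting distance lies in [0, 1]: 0 for identical codes and 1 for codes whose valid bits all differ.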
Experimental results

Query dataset and setup description
We recorded a query set of 128 images captured by an iPhone 6 with manually annotated position information. The images are approximately 5 megapixels in size and were taken using the default settings of the iPhone 6 camera application. Furthermore, the images consist of landscape photos taken either head-on in front of a building or at a slanted angle of approximately 30°. After obtaining the images, we ran the remaining query images with successfully retrieved database images through the pose estimation part of the pipeline. In order to characterize pose estimation accuracy, we first manually established the ground truth for the position and pose of each query image. This was done using the computer-aided design (CAD) map of the buildings at BUPT and distance measurements recorded during the query dataset collection. For a detailed evaluation, the query set was split into the same classes as the TUM database: high texture, low texture, hallways, ambiguous objects, and building structure, where each query can be assigned to more than one class (Figures 2 and 3). Meanwhile, the framework of our smartphone camera-based indoor positioning system is shown in Figure 4. It should be noted that we ignore the orientation information calculation.

Our method was implemented using MATLAB 2015a and was programmed by integrating C# and MATLAB code. The hardware configuration of the experimental platform on which our method ran is shown in Table 1.
Figure 2. Exemplary queries for all classes from TUM: (a) low textures, (b) high textures, (c) blurred image, (d) building hall, (e) hallway, and (f) illumination change.
It is noted that the camera-based positioning method proposed by Ravi et al.11 is used for comparison with our proposed method; both the test data and the MATLAB code of that method were provided by Van Opdenbosch.
Evaluation of high-resolution image sparsifying
Figure 5 shows qualitative results for image sparsifying by detecting salient regions based on superpixels. The second row of Figure 5, obtained by the proposed HVS-based approach for a variety of images from the TUM and BUPT databases, preserves the salient regions in each image while remaining compact and uniform in object size. Moreover, the detected salient superpixels contain sparse features, which reduces the computation of indoor positioning.

According to Figure 5, we find that salient regions are detected even when the image is blurred, as shown by the three images in the second column of Figure 5.

According to our statistics, the number of TFHLS features in Figure 6(b) is 69% less than that in Figure 6(a). It is noted that the features in Figure 6(b) are extracted from the salient regions of an image, which shows that our salient region detection approach is efficient and powerful. Therefore, fewer features are used for image matching, which speeds up the image matching while maintaining a high matching ratio, according to Table 2.
Figure 3. Exemplary queries for all classes from BUPT: (a) low textures, (b) high textures, (c) blurred image, (d) building hall, (e) hallway, and (f) illumination change.
Figure 4. The module of the navigation and positioning system.
Table 1. Hardware configuration.

Processor: Core i7, 2.5 GHz
Qualitative evaluation of HOG visualization

The third row of Figure 5 shows the HOG feature visualization results under different indoor scenarios. These visualizations allow us to analyze objects
Figure 5. Exemplary queries for salient region detection and HOG feature visualization: (a) indoor scene of our lab, (b) hall of our research building, (c) corridor of a building at TUM, (d) corridor of our research building, (e) salient map of Figure 5(a), (f) salient map of Figure 5(b), (g) salient map of Figure 5(c), (h) salient map of Figure 5(d), (i) HOG feature visualization of Figure 5(e), (j) HOG feature visualization of Figure 5(f), (k) HOG feature visualization of Figure 5(g), and (l) HOG feature visualization of Figure 5(h).
Figure 6. TFHLS feature matching for BUPT images: (a) TFHLS feature extraction with high density and (b) TFHLS feature sparsifying.
Table 2. Matching results of different image features: matching rate (%) and running time (ms).
from the view of the HOG detector, which is a new approach that gains new insight into the detector's failures and differs from human salient vision. From the first and third rows of Figure 5, the high-frequency details in the original images have high contrast in the HOG visualization images. Paired dictionary learning tends to produce the best visualizations for HOG descriptors. Although HOG does not explicitly describe color, we found that the paired dictionary is able to recover color from HOG descriptors. Therefore, by visualizing feature spaces, we can obtain a more intuitive understanding of recognition systems.
Evaluation of TFHLS feature extraction and matching
In order to identify optimal parameters for the approach described above, several experiments were conducted with varying settings. Figure 7 summarizes the performance of the TFHLS feature matching compared to the method proposed by Van Opdenbosch et al.10 A smartphone running Android OS 4.4 was used to implement the positioning methods used in this article.
Qualitative results. Figure 7 shows the TFHLS feature matching results in four different scenarios. As shown in Figure 7(a), successful retrieval usually involves matching of object textures in both query and database images. According to Figure 7(b), we find that our proposed TFHLS feature matches blurred images efficiently.
Quantitative results. Table 2 shows that we successfully match 113 of 128 images, achieving a retrieval rate of 93%, where LS means linear search and LSH means locality-sensitive hashing. Moreover, as shown in Table 2, the proposed method matches the images of the TUM database with the highest success in
Figure 7. TFHLS feature matching for BUPT images: (a) TFHLS feature matching for a high-texture image, (b) TFHLS feature matching for a blurred image, (c) TFHLS feature matching for a low-texture image, and (d) TFHLS feature matching for an indoor image.
Figure 8. The performance comparison between the proposed detector and the state-of-the-art detectors on the BUPT database.
13.2 ms per image. Figure 8 shows the performance comparison in miss rate between our proposed method and two other LBP-based methods.
Positioning result evaluation
Figure 9 summarizes the performance of the location estimation and the comparison results. From Figure 9(a) and (b), we can localize the position to within sub-meter accuracy for over 56% of the query images. Furthermore, 85% of the query images are successfully localized to within 2 m of the ground-truth position. As seen in Figure 7(a), when the location error is less than 1 m, the TFHLS features of the corresponding corridor signs present in both query and database images are matched together. Moreover, we find that the TFHLS detector extracts more features than the compared method,10 even when the images are blurred, as shown in Figure 9(b). In Figure 10(a) and (b), we plot the estimated and ground-truth locations in the horizontal and vertical directions. In addition, Figure 10(c) compares the locations of the query images on the New Research Building's 2D floor plan. As seen from Figure 10, there is close agreement between the ground truth and the TFHLS-based results. The root mean square error (RMSE)
Figure 9. Smartphone camera-based indoor positioning results: (a) positioning result based on the TUM dataset and (b) positioning result based on the BUPT dataset.
Figure 10. The location comparison results: (a) positioning result in the horizontal direction, (b) positioning result in the vertical direction, and (c) locations on the 2D floor plan.
Figure 11. Performance comparison between the proposed indoor positioning method and the state-of-the-art positioning methods.
between the estimated and the ground-truth positioning results is 1.253 m.
Figure 11 shows the indoor positioning comparison in terms of RMSE. From this figure, we find that the proposed approach achieves higher-accuracy indoor locations than the VLAD- and TC-OFDM-based methods. Most of the VLAD and TC-OFDM indoor positioning errors are more than 3 m, while the positioning errors of our method are less than 1.5 m. Moreover, the proposed method is robust because its RMSE curve is smooth, which shows that our method produces stable results. The performance gap between the ground truth and the estimates in both Figures 9 and 11 suggests that the TFHLS-based method can adapt to the illumination changes and the dense-multipath indoor environments, resulting in higher indoor positioning accuracy.
Conclusion
We presented a scalable and efficient mobile camera-based localization system. To this end, we built a modified feature model that deeply combines HOG and LPQ, jointly addressing the problems of limited computational capacity and the required memory footprint. Moreover, we employed the TC-OFDM indoor positioning system to supply coarse positioning knowledge related to the camera location. According to our tests on the TUM and BUPT databases, the indoor positioning error of the proposed algorithm is less than 1.5 m. Furthermore, the RMSE between the estimated and ground-truth positioning results is 1.25 m, which shows that our smartphone camera-based indoor positioning algorithm is precise and accurate. In future work, we will study sub-meter indoor positioning algorithms based on the fusion of image and wireless signals.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with
respect to the research, authorship, and/or publication of this
article.
Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was sponsored by the National Key Research and Development Program (no. 2016YFB0502002), the National High Technology Research and Development Program of China (no. 2015AA124103), the National Natural Science Foundation of China (no. 61401040), and the Beijing University of Posts and Telecommunications Young Special Scientific Research Innovation Plan (2016RC13).
References
1. Gonzalez MC, Hidalgo CA and Barabasi AL. Understanding individual human mobility patterns. Nature 2008; 453(7196): 779-782.
2. Liu H, Darabi H, Banerjee P, et al. Survey of wireless indoor positioning techniques and systems. IEEE T Syst Man Cy C 2007; 37(6): 1067-1080.
3. Lee C, Chang Y, Park G, et al. Indoor positioning system based on incident angles of infrared emitters. In: Proceedings of the 30th annual conference of IEEE Industrial Electronics Society, Busan, South Korea, 2-6 November 2004, pp.2218-2222. New York: IEEE.
4. Li B, Gallagher T, Dempster AG, et al. How feasible is the use of magnetic field alone for indoor positioning? In: Proceedings of the international conference on indoor positioning and indoor navigation (IPIN), Sydney, NSW, Australia, 13-15 November 2012, pp.1-9. New York: IEEE.
5. Zou H, Jiang H, Lu X, et al. An online sequential extreme learning machine approach to WiFi-based indoor positioning. In: Proceedings of the 2014 IEEE World Forum on Internet of Things (WF-IoT), Seoul, Korea, 6-8 March 2014, pp.111-116. New York: IEEE.
6. Huang CH, Lee LH, Ho CC, et al. Real-time RFID indoor positioning system based on Kalman-filter drift removal and Heron-bilateration location estimation. IEEE T Instrum Meas 2015; 64(3): 728-739.
7. Zhongliang D, Yanpei Y, Xie Y, et al. Situation and development tendency of indoor positioning. China Commun 2013; 10(3): 42-55.
8. Politi N, Yong L, Faisal K, et al. Locata: a new technology for high precision positioning. In: European navigation conference, Naples, 3-6 May 2009. Institute of Navigation.
9. Kalkusch M, Lidy T, Knapp N, et al. Structured visual markers for indoor pathfinding. In: Proceedings of the 1st IEEE international workshop on augmented reality toolkit, Darmstadt, 29 September 2002, p.8. New York: IEEE.
10. Van Opdenbosch D, Schroth G, Huitl R, et al. Camera-based indoor positioning using scalable streaming of compressed binary image signatures. In: Proceedings of the IEEE international conference on image processing (ICIP), Paris, 27-30 October 2014. New York: IEEE.
11. Ravi N, Shankar P, Frankel A, et al. Indoor localization using camera phones. In: Proceedings of the 7th IEEE workshop on mobile computing systems and applications (WMCSA'06), Semiahmoo Resort, Blaine, WA, USA, 1 August 2005, p.49. New York: IEEE.
12. Kim J and Jun H. Vision-based location positioning using augmented reality for indoor navigation. IEEE T Consum Electr 2008; 54(3): 954-962.
13. Werner M, Kessel M and Marouane C. Indoor positioning using smartphone camera. In: Proceedings of the international conference on indoor positioning and indoor navigation (IPIN), Sydney, NSW, Australia, 21-23 September 2011, pp.1-6. New York: IEEE.
14. Li X and Wang J. Image matching techniques for vision-based indoor navigation systems: performance analysis for 3D map-based approach. In: Proceedings of the