A detection for positioning sensor node based on multilayer perceptron


Thi-Kien Dao 1, Shi-Jie Jiang 1, Truong-Giang Ngo 2 (B), Thi-Thanh-Tan Nguyen 3, and Trong-The Nguyen 1,4

1 Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350014, China
vnthe@hpu.edu.vn

2 Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Vietnam
giangnt@tlu.edu.vn

3 Information Technology Faculty, Electric Power University, Hanoi, Vietnam

4 Haiphong University of Manage and Technology, Haiphong 180000, Vietnam

Abstract. Node positioning accuracy and the environmental impact on devices in wireless sensor networks (WSN) have recently received much attention from scholars. This paper proposes a prediction method for sensor node positioning based on a multilayer perceptron neural network. Node locations, together with their signal strength characteristics, are captured to form a dataset. The signal strength features of a node are extracted from a large number of noisy signal strength samples and are measured by nearest neighbor estimation to serve as the inputs of the scheme. Experimental results, compared with other methods in the literature, show that the proposed scheme provides higher positioning accuracy and a lower average error than its competitors.

Keywords: Wireless sensor networks · Multilayer perceptron · Indoor positioning node · Predictive positioning

Thanks to the development of computer technology and smartphones, sensor-equipped smart devices have become popular in daily life, e.g., in health care with positioning services and in environment monitoring [1–3]. Node device localization aims to find a node's position with an estimation technique in the indoor environment [4,5]. Several factors can influence node localization accuracy, e.g., the complex indoor radio transmission environment, the indoor building layout, and personnel mobility [6]. Because the indoor signal fading model cannot be established accurately, progress in indoor positioning lags far behind outdoor positioning technology [7]. The global positioning system (GPS) and cellular base stations are widely deployed and applied in outdoor positioning technology.

Low-cost, high-precision indoor node positioning solutions have received increasing attention from scholars. Wireless communication technologies, e.g., WiFi, ZigBee, Cellular, and Bluetooth, can effectively be used to solve the blocking problem with GPS

J.-S Pan et al (eds.), Advances in Intelligent Information Hiding and Multimedia

Signal Processing, Smart Innovation, Systems and Technologies 212,


as used outdoors [8]. However, localization accuracy is affected by obstacles, non-line-of-sight propagation, and noise due to the complex indoor environment [9]. Traditional methods, such as continuous positioning, cannot achieve high accuracy because of the complex environment or the overfitting problem. Studying indoor positioning algorithms for smart node devices is therefore of primary practical significance.

This paper considers an indoor positioning method for node devices based on multilayer perceptron learning, in which the hidden structural features of the data are extracted by direct learning. The generalization ability of the network is used to avoid the overfitting problem. A deep learning model with input, hidden, and output layers is applied to continuous prediction positioning: the output is set approximately equal to the input, the network weights are learned, and the encoding model is then built. The locations of the nodes are estimated from the captured signal strengths with the nearest neighbor algorithm. The signal strength data produced by a stacked coding scheme are used to build a position database for fitting and generalization.

A multilayer perceptron (MLP) is a multilayer neural network that usually contains multiple hidden layers, which improve the network's expressive ability for prediction. It is similar to the three-layer structure of the traditional neural network, which includes an input layer, a hidden layer, and an output layer. The gradient in deep learning is transmitted effectively through layer-by-layer training methods.

The MLP is a feed-forward network and a typical deep learning model. It consists of multiple layers of nodes, where each layer is fully connected to the next layer, and each node in the hidden layers applies a nonlinear activation function. Several activation functions are available, e.g., the Sigmoid, Tanh, and ReLU functions.

The Sigmoid function is an S-shaped function used in neural networks that compresses a real number into the interval [0, 1], which gives it strong explanatory power. However, when the output of a neuron approaches 0 or 1, saturation occurs, leading to vanishing gradients. Therefore, the weights should be initialized carefully. The Sigmoid function is expressed as follows:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{1}$$

The Tanh function has good data control ability and maps real numbers to the interval [−1, 1], but it still has a saturation problem:

$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2}$$

The ReLU function is a rectified linear unit whose output is 0 when x < 0 and x when x > 0, so its gradient is 0 or 1, respectively. ReLU converges faster, but it is also more fragile: a large gradient flow may lead to the permanent failure of neurons, which can be avoided by selecting an appropriate learning rate or inter-layer batch normalization. The formula is as follows:

$$f(x) = \max(0, x) \tag{3}$$
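The three activation functions above can be illustrated with a minimal Python sketch. The function names and the sign-split form of the Sigmoid (used only to avoid floating-point overflow) are choices made for this illustration and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps a real number into (0, 1)."""
    # Split on the sign of x so that math.exp never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent: maps a real number into (-1, 1)."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified linear unit: 0 for negative inputs, x otherwise."""
    return max(0.0, x)


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; it shrinks toward 0 when the unit saturates."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 0 for x < 0 and 1 for x > 0 (0 is used at x = 0 here)."""
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    for x in (-10.0, 0.0, 10.0):
        print(x, sigmoid(x), tanh(x), relu(x), sigmoid_grad(x), relu_grad(x))
```

Evaluating `sigmoid_grad` at large positive or negative inputs shows the saturation effect described in the text: the gradient becomes very small, whereas `relu_grad` stays at 1 for any positive input.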


The network model is trained by back-propagation. The training sample set is $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, where m is the number of samples, and it is used to train the neural network. The loss function in the experiment is expressed as follows:

$$J(W, b; x, y) = \frac{1}{2} \left\| h_{W,b}(x) - y \right\|^{2} \tag{4}$$

The critical step of the gradient descent method is the calculation of the partial derivatives. The iterative formulas are given as follows:

$$W_{ij}^{(l)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b) \tag{5}$$

$$b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W, b) \tag{6}$$

where W and b are the weights and the bias terms of layer l of the network, respectively, and α is the learning rate.
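As a rough illustration of the update rule, the following sketch trains a very small network with one sigmoid hidden layer and a linear output layer by per-sample gradient descent on the squared-error loss. The layer sizes, the toy data, the learning rate, and all function names are assumptions made for this example; the paper does not specify them.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Sigmoid hidden layer followed by a linear output layer."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y_hat = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y_hat


def loss(y_hat, y):
    """Squared-error loss for one sample: 0.5 * ||h(x) - y||^2."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(y_hat, y))


def sgd_step(x, y, W1, b1, W2, b2, alpha):
    """One in-place gradient descent update using back-propagation."""
    h, y_hat = forward(x, W1, b1, W2, b2)
    # Error signal at the linear output layer.
    delta2 = [p - t for p, t in zip(y_hat, y)]
    # Error signal at the hidden layer, computed before W2 is modified.
    delta1 = [
        sum(W2[k][j] * delta2[k] for k in range(len(W2))) * h[j] * (1.0 - h[j])
        for j in range(len(h))
    ]
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] -= alpha * delta2[k] * h[j]
        b2[k] -= alpha * delta2[k]
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] -= alpha * delta1[j] * x[i]
        b1[j] -= alpha * delta1[j]


def train_demo(epochs=500, alpha=0.1, seed=0):
    """Train on a toy dataset (hypothetical) and return the mean loss before and after."""
    rng = random.Random(seed)
    samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
               ([1.0, 0.0], [1.0]), ([1.0, 1.0], [2.0])]
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
    b1 = [0.0] * 3
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(3)]]
    b2 = [0.0]

    def mean_loss():
        return sum(loss(forward(x, W1, b1, W2, b2)[1], y) for x, y in samples) / len(samples)

    before = mean_loss()
    for _ in range(epochs):
        for x, y in samples:
            sgd_step(x, y, W1, b1, W2, b2, alpha)
    return before, mean_loss()


if __name__ == "__main__":
    print(train_demo())
```

The hidden-layer error signal is computed before the output weights are changed, so both layers are updated with gradients taken at the same parameter values.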

A deep learning regression prediction model with an indoor location scheme can predict and estimate discrete points. For a more accurate continuous prediction of location, a regression prediction model is used to build a dataset by using meaningful learning. The linear regression model can be expressed as follows:

$$f(x) = w^{T} x + b \tag{7}$$

where x represents the input, w represents the weight, and b represents the bias. w and b are trained by minimizing the objective function. The model first processes the input data and then performs pre-training. When the output layer is reached, the model propagates the error back. The algorithm stops when it converges.
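The linear regression model can be fitted by gradient descent on a mean squared-error objective, as in the following minimal sketch. The objective, the learning rate, the iteration count, and the function names are assumptions for illustration, since the paper does not state its objective function explicitly.

```python
def predict(w, b, x):
    """Linear model f(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_linear(xs, ys, alpha=0.05, iterations=5000):
    """Batch gradient descent on the mean of 0.5 * (f(x) - y)^2."""
    n_features = len(xs[0])
    m = len(xs)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(iterations):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = [sum(e * x[i] for e, x in zip(errors, xs)) / m for i in range(n_features)]
        grad_b = sum(errors) / m
        w = [wi - alpha * g for wi, g in zip(w, grad_w)]
        b -= alpha * grad_b
    return w, b


if __name__ == "__main__":
    # Noise-free toy data generated from y = 2x + 1.
    w, b = fit_linear([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0])
    print(w, b)
```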

The location information of the sensor nodes in the deployed network environment is estimated by a deep learning indoor location algorithm from the signal strengths of the various signal source devices. The dataset of position points is established on the principle of extracting features or reducing noise, and it is trained and tested with a multilayer perceptron. The signal strength features are matched against the node position dataset, and the nearest neighbor algorithm is used to estimate the location of the measured points by choosing the best matching position.

3.1 The Positioning Algorithm

The relevant features can be extracted from the high-dimensional collected data, which also reduces the data dimension. Let v be the input layer and h be the hidden layer. The hidden layer is calculated as follows:

$$h = \frac{1}{1 + \exp(-w v - b)} \tag{8}$$

The hidden layer h is then used to calculate the reconstructed output layer u′ as follows:

$$u' = \frac{1}{1 + \exp(-w' h - b')} \tag{9}$$

where w and w′ are, respectively, the connection weights between the input layer and the hidden layer and between the hidden layer and the reconstructed output layer. The weight matrix w′ is constrained to be the transpose of the weight matrix w, that is, w′ = w^T. b and b′ are the bias units of the hidden layer and the reconstructed output layer, respectively, and h is the hidden layer unit data. Training the autoencoder minimizes the reconstruction error between the input layer v and the reconstructed output layer u′ obtained from it. The smaller the error is, the closer the reconstructed output layer is to the input layer, and the better the hidden layer expresses the information of the input layer, which achieves the purpose of feature extraction.
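The encoding and reconstruction steps with tied weights can be sketched as follows. The squared-error form of the reconstruction error, the assumption that RSS values are scaled to [0, 1] before encoding, and all names are illustrative and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(v, w, b):
    """Hidden layer h = sigmoid(w v + b); w has one row per hidden unit."""
    return [sigmoid(sum(wji * vi for wji, vi in zip(row, v)) + bj) for row, bj in zip(w, b)]


def decode(h, w, b_prime):
    """Reconstruction u' = sigmoid(w^T h + b'), reusing the transposed encoder weights."""
    n_input = len(w[0])
    return [
        sigmoid(sum(w[j][i] * h[j] for j in range(len(h))) + b_prime[i])
        for i in range(n_input)
    ]


def reconstruction_error(v, u_prime):
    """Squared error between the input and its reconstruction (assumed error measure)."""
    return 0.5 * sum((a - c) ** 2 for a, c in zip(v, u_prime))


if __name__ == "__main__":
    # Hypothetical normalized RSS readings from three signal sources.
    v = [0.2, 0.7, 0.5]
    w = [[0.1, -0.3, 0.2], [0.4, 0.0, -0.1]]  # 2 hidden units, 3 inputs
    b = [0.0, 0.0]
    b_prime = [0.0, 0.0, 0.0]
    h = encode(v, w, b)
    u_prime = decode(h, w, b_prime)
    print(h, u_prime, reconstruction_error(v, u_prime))
```

Tying the decoder to the transpose of the encoder weights halves the number of weight parameters, which is one common motivation for this constraint.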

The K-dimensional vector $v = \{v_i \mid i = 1, 2, \ldots, K\}$ is the input layer, and the number of neurons in each hidden layer is an encoding value chosen experimentally. The offline data $v_{\text{offline}} = \{v_{ji} \mid j = 1, 2, \ldots, J;\ i = 1, 2, \ldots, K\}$ are fed to the input of the stacked autoencoder for training, where J is the number of data records collected in the offline phase, and each dimension of each record corresponds to the RSS of a fixed AP or iBeacon. Training yields the newly collected data

$$\text{DATA}_{\text{offline}} = \{h^{3}_{ji,\text{offline}} \mid j = 1, 2, \ldots, J;\ i = 1, 2, \ldots, n\} = \{\text{data}_{ji} \mid j = 1, 2, \ldots, J;\ i = 1, 2, \ldots, n\}$$

where n represents the dimension of the data.

3.2 Nearest Neighbor Technique

In the online phase, the data $v_{\text{online}} = \{v_i \mid i = 1, 2, \ldots, K\}$ are fed to the input layer of the structure for forward propagation, where the parameters w and b are those trained in the offline phase, and $\text{DATA}_{\text{online}} = \{h^{3}_{i,\text{online}} \mid i = 1, 2, \ldots, n\} = \{\text{DATA}_i \mid i = 1, 2, \ldots, n\}$ is the input of the nearest neighbor classifier. The iBeacon corresponding to the RSS [10] in each dimension is the same for the original fingerprint database and the online-phase data $v_{\text{online}}$, so each dimension of the new fingerprint database and of the online DATA also expresses corresponding information. The nearest neighbor method calculates the Euclidean distance between the online-phase data and the j-th record in the new dataset:

$$d_j = \sqrt{\sum_{i=1}^{n} \left( \text{DATA}_i - \text{data}_{ji} \right)^{2}} \tag{10}$$

where $\text{data}_{ji}$ represents the i-th dimension of the j-th record in the new fingerprint database, $\text{DATA}_i$ represents the i-th dimension of the online-phase data, and n represents the dimension of the data processed by the stacked autoencoder. Finally, the distances $d_j$ are sorted from small to large (the shorter the distance, the higher the similarity of the two records), and the coordinate of the sampling point with the smallest distance is the positioning result.
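The nearest neighbor matching step can be sketched in a few lines of Python. The fingerprint values, the coordinates, and the function names below are hypothetical and serve only to show how the smallest Euclidean distance selects the positioning result.

```python
import math


def euclidean_distance(online, record):
    """Distance d_j between the online feature vector and one fingerprint record."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(online, record)))


def nearest_position(online, fingerprints, coordinates):
    """Return the coordinate of the closest fingerprint record and its distance."""
    distances = [euclidean_distance(online, record) for record in fingerprints]
    j_best = min(range(len(distances)), key=distances.__getitem__)
    return coordinates[j_best], distances[j_best]


if __name__ == "__main__":
    # Hypothetical encoded offline fingerprints and their sampling-point coordinates (m).
    fingerprints = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
    coordinates = [(0.0, 0.0), (3.0, 4.0), (1.5, 2.0)]
    online = [0.75, 0.25]
    print(nearest_position(online, fingerprints, coordinates))
```

Because the square root is monotonic, ranking by squared distance would select the same sampling point; the root is kept here so that the value matches the Euclidean distance named in the text.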


4 Experimental Results

A deployed network area is used for node device localization to verify the effectiveness of the proposed scheme. The deployed coverage area includes corridors and offices equipped with desks, chairs, bookcases, and other office items. At each location, groups of signal strength samples were collected at a constant time interval.

The multilayer neural network is tested in simulation with a set of 100 samples. The number of hidden layers in the multilayer neural network classifier is set to L (L = 3, 5, 10, 20, 50), and ReLU is adopted as the activation function of the hidden layers, whose weights are initialized beforehand. Regression fitting is carried out on the test set to predict the coordinate points.

Table 1 compares the prediction performance of multiple hidden layers with that of a single hidden layer, with the coordinates of the node points estimated on the test set by regression fitting. The prediction with multiple hidden layers is clearly better than that with a single layer, although the positioning error is still substantial.

Table 1 Comparison of results of multi-hidden layers with the single hidden layer

| Hidden layer setting | A single hidden layer | Multi-hidden layers |
|---|---|---|
| Mean positioning error (m) | 0.336 | 0.268 |
| Registration points with error less than 0.25 m (%) | 19.5 | 51.8 |

Table 2 shows the specific results for the selected hidden-layer activation functions, e.g., ReLU, Sigmoid, and Tanh. Adequate positioning accuracy is obtained with each activation function. The ReLU function performs well in classification tasks; however, the Sigmoid and other functions are less practical in the case of uneven data distribution because of their weak ability to control the data.

Table 2 Comparison of positioning accuracy of selected activation functions for the hidden layers

| The activation function | Relu | Sigmoid | Tanh |
|---|---|---|---|
| Mean positioning error (m) | 0.2716 | 0.2158 | 0.1796 |
| Registration points with error less than 0.25 m (%) | 37 | 54 | 71 |

The experimental results of the proposed method are compared with those of other techniques, e.g., the grey wolf optimizer (GWO) [4], firefly algorithm (FA) [5], pigeon-inspired optimization (PIO) [9], and ion motion optimization (IMO) [11], for constructing the relationship between positioning features and positioning coordinates under the same experimental conditions. Figure 1 compares the values obtained by the proposed method with those of several node positioning methods, e.g., the GWO, FA, IMO, and PIO algorithms. Subfigure (a) is the average positioning error comparison, and subfigure (b) is the cumulative probability distribution of the average positioning error. It can be observed that the proposed method provides smaller errors in the device positioning problem.

Table 3 compares the time consumption (ms) of the proposed method with that of the GWO, FA, IMO, and PIO approaches for the node positioning problem with various numbers of nodes in the deployed networks. It can be seen that, in most cases, the proposed method runs in a shorter time than the competitors.

Table 3 Comparison of the time consumption (ms) of the proposed method with the GWO, FA, IMO, and PIO approaches for the node positioning problem with different numbers of nodes in the deployed networks

| Algorithms | N = 20 | N = 50 | N = 80 | N = 110 | N = 130 | N = 160 |
|---|---|---|---|---|---|---|
| Proposed method | 297.476 | 350.721 | 398.991 | 451.081 | 502.766 | 552.004 |
| GWO | 301.350 | 357.109 | 408.329 | 459.231 | 513.221 | 565.171 |
| FA | 302.701 | 353.281 | 405.421 | 458.341 | 508.217 | 557.124 |
| IMO | 6.213 | 102.334 | 166.210 | 217.662 | 275.329 | 327.371 |
| PIO | 60.001 | 101.219 | 159.296 | 241.002 | 266.737 | 319.534 |

Overall, the comparison of location accuracy and calculation time shows that the proposed algorithm can achieve better performance than the competitors.

In this study, we proposed a prediction scheme for sensor node positioning based on a multilayer perceptron neural network. The nearest neighbor method was used for estimation on the inputs of the scheme. The signal strengths at the node locations were used as the input parameters of the classification system. The signal strength features of the nodes were extracted from a large number of signal strength samples, even those that included noise. Experimental results, compared with other approaches in the literature, e.g., the GWO, FA, IMO, and PIO methods, show that the proposed scheme provides higher positioning accuracy and a lower average error than its competitors.


Fig. 1 Comparison of the values obtained by the proposed method with those of several node positioning methods, e.g., the GWO, FA, IMO, and PIO algorithms. Subfigure (a) is the average positioning error comparison, and subfigure (b) is the cumulative probability distribution of the average positioning error.

References

1. Clemensen, J., Larsen, S.B., Kyng, M., Kirkevold, M.: Participatory design in health sciences: using cooperative experimental methods in developing health services and computer technology. Qual. Health Res. 17, 122–130 (2007)

2. Dao, T., Nguyen, T., Pan, J., Qiao, Y., Lai, Q.: Identification failure data for cluster heads aggregation in WSN based on improving classification of SVM. IEEE Access 8, 61070–61084 (2020). https://doi.org/10.1109/ACCESS.2020.2983219

3. Nguyen, T.-T., Qiao, Y., Pan, J.-S., Chu, S.-C., Chang, K.-C., Xue, X., Dao, T.-K.: A hybridized parallel bats algorithm for combinatorial problem of traveling salesman. J. Intell. Fuzzy Syst. Preprint, 1–10 (2020). https://doi.org/10.3233/JIFS-179668

4. Nguyen, T.-T., Thom, H.T.H., Dao, T.-K.: Estimation localization in wireless sensor network based on multi-objective grey wolf optimizer. In: Akagi, M., Nguyen, T.-T., Vu, D.-T., Phung, T.-N., Huynh, V.-N. (eds.) Adv. Inf. Commun. Technol. Proc. Int. Conf. ICTA 2016, Springer International Publishing, Cham, pp. 228–237 (2017). https://doi.org/10.1007/978-3-319-49073-1_25

5. Nguyen, T.-T., Pan, J.-S., Chu, S.-C., Roddick, J.F., Dao, T.-K.: Optimization localization in wireless sensor network based on multi-objective firefly algorithm. J. Netw. Intell. 1, 130–138 (2016)

6. Nguyen, T.T., Pan, J.S., Dao, T.K.: An improved flower pollination algorithm for optimizing layouts of nodes in wireless sensor network. IEEE Access 7, 75985–75998 (2019). https://doi.org/10.1109/ACCESS.2019.2921721

7. Pei, L., Chen, R., Chen, Y., Leppäkoski, H., Perttula, A.: Indoor/outdoor seamless positioning technologies integrated on smart phone. In: 2009 First Int. Conf. Adv. Satell. Sp. Commun., IEEE, pp. 141–145 (2009)

8. Yeh, S.-C., Hsu, W.-H., Su, M.-Y., Chen, C.-H., Liu, K.-H.: A study on outdoor positioning technology using GPS and WiFi networks. In: 2009 Int. Conf. Networking, Sens. Control, IEEE, pp. 597–601 (2009)

9. Nguyen, T.-T., Pan, J.-S., Dao, T.-K., Sung, T.-W., Ngo, T.-G.: Pigeon-inspired optimization for node location in wireless sensor network. In: Sattler, K.-U., Nguyen, D.C., Vu, N.P., Tien Long, B., Puta, H. (eds.) Advances in Engineering Research and Application, Springer International Publishing, Cham, pp. 589–598 (2020)

10. Vaghefi, R.M., Gholami, M.R., Ström, E.G.: RSS-based sensor localization with unknown transmit power. In: 2011 IEEE Int. Conf. Acoust. Speech Signal Process., pp. 2480–2483 (2011)

11. Pan, J.-S., Nguyen, T.-T., Chu, S.-C., Dao, T.-K., Ngo, T.-G.: Diversity enhanced ion motion optimization for localization in wireless sensor network. J. Inf. Hiding Multimed. Signal Process. 10, 221–229 (2019)


and Feature Scoring Method

Tserenpurev Chuluunsaikhan 1, Kwan-Hee Yoo 1, HyungChul Rah 2, and Aziz Nasridinov 1 (B)

1 Department of Computer Science, Chungbuk National University, Cheongju, South Korea
{teo,khyoo,aziz}@chungbuk.ac.kr

2 Department of Management Information System, Chungbuk National University, Cheongju, South Korea
hrah@chungbuk.ac.kr

Abstract. A large amount of text data may hide a numeric connection to some other subject, for example, price. In this paper, we aim to predict pork prices based on topic modeling and a word scoring method. The study consists of four steps: feature extraction, word scoring, feature selection, and prediction. Any prediction model has inputs (features) and an output. We extracted our features from online news data using the topic modeling technique LDA, and we selected the daily pork price as the output. After that, we created a word scoring corpus using the result of LDA and the price movements. Because our features and output are numeric values, we applied Pearson's correlation for feature selection. To check our word scoring method, we built a pork price prediction model using LSTM and evaluated it with and without feature selection. We used RMSE, MAE, and MAPE to measure the accuracy of the model. The results show that our model can be used for price prediction of pork and other agricultural commodities.

Keywords: Price prediction · Topic modeling · Word scoring · LSTM

Agriculture is an important sector that has always accompanied human growth. It is the process of producing food by cultivating grain and raising domesticated animals (livestock). All humans participate in this process, for example, as governors, farmers, and consumers. The main thing that connects them is the price of agricultural commodities. High prices benefit farmers but not consumers, who always want low prices. In this situation, governors attempt to keep the price at a proper level. Predicting the price helps governors make decisions in the future. If farmers know the price in advance, they can also regulate the production of agricultural commodities, which helps them avoid economic losses.


Many research works attempt to predict agricultural commodity prices [1–3]. Notably, online text data (online news, Twitter, and others) make a significant contribution to predicting agricultural commodity prices [2,3], because online text data can include the reasons (words) for price changes. We therefore need to extract good keywords to obtain good agricultural price prediction results.

Pork is one of the agricultural commodities that people consume most in Korea. As with any other product, market supply and demand determine the price of pork. Some unexpected and unplanned events can change supply and demand. For example, African Swine Fever is a hot topic in the Korean pork market. During an outbreak of this disease, people decrease their consumption of pork, and the pork price falls as a result. When demand decreases continuously, the governors begin to take action to support consumption. Consumers usually learn this kind of information from the news and adjust their consumption. That is why we believe that news can affect the price of pork.

Price prediction is the process of trying to calculate the future price using inputs (features) that can affect the price. The inputs can take different values depending on the sector and goals. In this paper, we propose a pork price prediction model using topic modeling and a word scoring method. First, we applied a topic modeling technique to obtain the most relevant words from online news data. Then, we compared each day's price with that of the previous day to determine whether the price rose or fell. Finally, we scored each word from topic modeling using the price changes. We evaluated our model using the LSTM algorithm. To increase the accuracy of the model, we also applied a feature selection method. We measured accuracy using the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
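The three accuracy measures named above have standard definitions, which the following sketch implements. The function names and the example prices are hypothetical, and MAPE is undefined when an actual value is zero, which this sketch does not guard against.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical daily prices and model predictions.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the price level, which makes results comparable across commodities with different price scales.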

We explain our proposed method in this section. The section consists of feature extraction (2.1), feature scoring (2.2), feature selection (2.3), and price prediction (2.4). Figure 1 shows an overview of our proposed method, which mainly consists of feature extraction, feature scoring, and feature selection. Additionally, we applied some other data preprocessing techniques to increase the accuracy of our model. We explain them in detail in the following subsections.
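The abstract states that Pearson's correlation is used for feature selection. A minimal sketch of that idea follows; the threshold of 0.5, the rule of keeping features by absolute correlation, the feature names, and the data are assumptions made for illustration and are not values reported by the authors.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.5):
    """Keep the names of features whose |r| with the target reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]


if __name__ == "__main__":
    # Hypothetical word-score features and daily prices.
    price = [2.0, 4.0, 6.0, 8.0]
    features = {
        "score_a": [1.0, 2.0, 3.0, 4.0],
        "score_b": [1.0, -1.0, 1.0, -1.0],
        "score_c": [3.0, 3.0, 3.0, 3.0],
    }
    print(select_features(features, price))
```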

2.1 Feature Extraction

Feature extraction is the initial step of the methodology. We collected online news data from PigTimes [4], a web portal that publishes news about pigs. Since the portal publishes only pig-related news, we can be sure that our dataset relates only to pigs, pork, pork farms, and the pork market. Topic modeling is an important technique in natural language processing (NLP) that extracts relevant topics from a large amount of text data. There are many approaches to obtaining topics from text data; LDA is one of the most popular. We extracted the input features of our prediction model using the LDA technique.

Our output feature is the daily price of pork. We collected the retail price of pork from KAMIS [5], which provides various information related to the distribution of agricultural
