HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

Master's Thesis in Data Science and Artificial Intelligence

An Efficient, On-demand Charging for WRSNs Using Fuzzy Logic and Q-Learning

La Van Quan
Quan.LV202335M@sis.hust.edu.vn

Department: Department of Software Engineering
Institute: School of Information and Communication Technology

Hanoi, 2022
Declaration of Authorship and Topic Sentences

1. Personal information
Full name: La Van Quan
Phone number: 039 721 1659
Email: Quan.LV202335M@sis.hust.edu.vn
Major: Data Science and Artificial Intelligence

2. Topic
An Efficient, On-demand Charging for WRSNs Using Fuzzy Logic and Q-Learning
3. Contributions
• Propose a Fuzzy logic-based algorithm that determines the energy level to be charged to the sensors.
• Introduce a new method that determines the optimal charging time at each charging location to maximize the number of alive sensors.
• Propose Fuzzy Q-charging, which uses Q-learning in its charging scheme to guarantee the target coverage and connectivity.
4. Declaration of Authorship
I hereby declare that my thesis, titled "An Efficient, On-demand Charging for WRSNs Using Fuzzy Logic and Q-Learning", is the work of myself and my supervisor Dr. Nguyen Phi Le. All papers, sources, tables, and so on used in this thesis have been thoroughly cited.
Acknowledgments

I would like to thank my supervisor, Dr. Nguyen Phi Le, for her continued support and guidance throughout the course of my Master's studies. She has been a great teacher and mentor for me since my undergraduate years, and I am proud to have completed this thesis under her supervision.

I want to thank my family and my friends, who have given me their unconditional love and support to finish my Master's studies.

Finally, I would like to again thank Vingroup and the Vingroup Innovation Foundation, who have supported my studies through their Domestic Master/Ph.D Scholarship program.
Parts of this work were published in the paper "Q-learning Based, Optimized On-demand Charging Algorithm in WRSN" by La Van Quan, Phi Le Nguyen, Thanh Hung Nguyen, and Kien Nguyen, in the Proceedings of the 19th IEEE International Symposium on Network Computing and Applications, 2020.

La Van Quan was funded by Vingroup Joint Stock Company and supported by the Domestic Master/Ph.D Scholarship Programme of Vingroup Innovation Foundation (VINIF), Vingroup Big Data Institute, code VINIF.2020.ThS.BK.03.
Sensor nodes perform several important tasks, two of which are sensing and communication. Every time these tasks are performed, the sensor's energy is consumed. Therefore, some sensor nodes may die; a sensor node is considered dead when it runs out of energy. Correspondingly, the lifetime of a WSN is defined as the duration of operation until a sensor dies.

The limited battery capacity of the sensor is always a "bottleneck" that greatly affects the life of the network. To solve this problem, Wireless Rechargeable Sensor Networks (WRSNs) were born. WRSNs include sensors equipped with battery chargers and one or more mobile chargers (Mobile Chargers, MCs) responsible for adding power to the sensors. In WRSNs, MCs move around the network, stopping at specific locations (called charging sites) and charging the sensors. Thus, it is necessary to find a charging route for the MC to improve the lifetime of WRSNs [2][3].
Contents
4.3 Charging time determination
4.4 Fuzzy logic-based safe energy level determination
4.4.1 Motivation
4.4.2 Fuzzification
4.4.3 Fuzzy controller
4.4.4 Defuzzification
4.5 Reward function
5.2 Comparison with existing algorithms

List of Figures
Comparison of non-monitored targets over time
Comparison of dead sensors over time
Fuzzy rules evaluation
A Wireless Sensor Network (WSN) includes many battery-powered sensor nodes monitoring several targets. In the WSNs, it is necessary to provide sufficient monitoring quality surrounding the targets (i.e., guaranteeing target coverage). Moreover, the WSNs need to have adequate capacity for the communication between the sensors and the base station (i.e., ensuring connectivity) [6][7][8]. The target coverage and connectivity are severely affected by the depletion of the battery on sensor nodes. When a node runs out of battery, it becomes a dead node without sensing and communication capability, damaging the whole network in terms of coverage and connectivity. The Wireless Rechargeable Sensor Network (WRSN) leverages the advances in wireless power transfer to deal with this problem.
In a normal operation, the MC moves around the network and performs a charging strategy, which can be classified as either periodic charging [9][10][11][12] or on-demand charging [13][14][15][16][17][18]. In the former, the MC, with a predefined trajectory, stops at charging locations to charge the nearby sensors' batteries. In the latter, the MC moves and charges upon receiving requests from the sensors whose remaining energy is below a threshold. The periodic strategy is limited since it cannot adapt to the dynamics of the sensors' energy consumption rate. On the contrary, the on-demand charging approach potentially deals with the uncertainty of the energy consumption rate. Since a sensor with a draining battery triggers the on-demand operation, the MC's charging strategy faces a new time-constraint challenge. The MC needs to handle two crucial issues: deciding the next charging location and the staying period at that location.
Although numerous, the existing on-demand charging schemes in the literature face two serious problems. The first one is the consideration of the same role for all the sensor nodes in WRSNs. That is somewhat unrealistic since, intuitively, several sensors, depending on their locations, impact the target coverage and the connectivity more significantly than others. Hence, the existing charging schemes may enrich unnecessary sensors' power while letting necessary ones run out of energy, leading to the charging algorithms' inefficiency. It is of great importance to take into account the target coverage and connectivity simultaneously. The second problem is about the MC's charging amount, which is either the full capacity (of the sensor battery) or a fixed amount of energy. The former case may cause: 1) a long waiting time for other sensors staying near the charging location; 2) quick exhaustion of the MC's energy. In contrast, charging a too small amount to a node may lead to its lack of power to operate. Therefore, the MC should adjust the transferred energy level dynamically following the network condition.
My proposal, named Fuzzy Q-charging, is a charging scheme for WRSNs that assures the target coverage and connectivity and adjusts the energy level charged to the sensors dynamically. Fuzzy Q-charging aims to maximize the network lifetime, which is the time until the first target is not monitored. First, this work exploits Fuzzy logic in an optimization algorithm that determines the optimal charging time at each charging location, aiming to maximize the numbers of alive sensors and monitored targets. Fuzzy logic is used to cope with network dynamics by taking various network parameters into account during the determination of the optimal charging time. Second, this thesis leverages the Q-learning technique in a new algorithm that selects the next charging location to maximize the network lifetime. The MC maintains a Q-table containing the charging locations' Q-values, which represent the charging locations' goodness. The Q-values are updated in a real-time manner whenever there is a new charging request from a sensor. I design the Q-value to prioritize charging locations at which the MC can charge a node depending on its critical role. After finishing its tasks in one place, the MC chooses the next one, which has the highest Q-value, and determines an optimal charging time. The main contributions of this thesis are as follows.
• This thesis proposes a Fuzzy logic-based algorithm that determines the energy level to be charged to the sensors. The energy level is adjusted dynamically following the network condition.
• Based on the above algorithm, this thesis introduces a new method that determines the optimal charging time at each charging location. It considers several parameters (i.e., remaining energy, energy consumption rate, sensor-to-charging-location distance) to maximize the number of alive sensors.
• This thesis proposes Fuzzy Q-charging, which uses Q-learning in its charging scheme to guarantee the target coverage and connectivity. Fuzzy Q-charging's reward function is designed to maximize the charged amount to essential sensors and the number of monitored targets.
The rest of this thesis is constructed as follows.
• Chapter 2 describes the theoretical basis of this study, including an overview of the WSN and the WRSN, and the concepts of Q-learning and Fuzzy logic.
• Chapter 3 reviews the previous works on the charging scheme optimization problem.
• Chapter 4 presents the proposed algorithms, which combine the Fuzzy logic and Q-learning approaches.
• Chapter 5 evaluates and compares the performance of the proposed algorithms with existing research.
• Chapter 6 concludes the thesis and discusses future works.
Chapter 2
Theoretical Basis
A Wireless Sensor Network (WSN) is a network that consists of several spatially distributed and specialized sensors connected by a communications infrastructure to monitor and record physical conditions in a variety of situations. The sensors in WSNs collectively convey the sensing data to the Base Station (BS), also known as a sink, where the data is gathered, processed, and/or acted upon as needed. A typical WSN connecting with end-users is shown in Fig. 2.1.
Sensors play an important role in a sensor network. To monitor the physical environments and communicate with others efficiently, sensors have to satisfy many requirements. They not only need to record surroundings accurately and precisely and be capable of computing, analyzing, and storing the sensing data, but also have to be small in size, low in cost, and effective in power consumption.

Sensors commonly comprise four fundamental units: a sensing unit, which monitors environments and converts the analog signal into a digital signal; a processing unit, which processes the digital data and stores it in memory; a transceiver unit, which provides communication capability; and a power unit, which supplies energy to the sensor [1]. In addition, some sensors also have a navigation system to determine their location.

WSNs have a wide range of applications in a variety of fields. They were first deployed in the 1950s as part of a sound surveillance system designed by the US Navy to detect and track Soviet submarines. WSNs are now used in a variety of civilian applications, including environmental monitoring, health monitoring, smart agriculture, and so on. A WSN can be used in a forest, for example, to alert authorities to the risk of a forest fire. Furthermore, a WSN can track the location of a fire before it has a chance to expand out of control. WSNs also have a lot of potential in health monitoring; for example, an implanted system can record the fluctuation rate of glucose in a patient's blood.
Figure 2.3: Network model (legend: sensor, base station, mobile charger (MC), depot, charging location, monitored target, non-monitored target, data transmission, sensing range).
Many efforts have been made to reduce the energy usage of WSNs. Researchers have attempted to optimize radio signals using cognitive radio standardization, lower the data rate using data aggregation, save more energy using sleep/wake-up schemes, and pick energy-efficient routing protocols. However, none of them completely solves the energy problem of the sensor node in WSNs. The battery will ultimately run out if there is no external source of electricity for the sensors. Gathering energy from the environment is another way to overcome the sensor's energy depletion problem.
In recent years, thanks to advancements in wireless energy transfer and rechargeable battery technology, a recharging device can be used to recharge the battery of sensors in WSNs. As a result, WRSNs, a new generation of sensor networks, were born (Fig. 2.3). The sensor nodes in WRSNs are equipped with a wireless energy receiver and are charged via radio waves based on electromagnetic radiation and magnetic resonant coupling technology, giving them an edge over standard WSNs. WRSNs use one or more chargers to recharge sensor nodes on a regular basis; as a result, the lifetime of the network is optimally prolonged for eternal operations. WRSNs, in particular, necessitate a charger employment approach. Charging terminals and MCs are the two types of chargers available. A charging terminal is a device that has a fixed placement in the network and can recharge many sensors. Because the network scale is normally large, a significant number of charging terminals would be needed. An MC, in contrast, can move around the network, allowing it to cover a large region. If it runs out of power, it will return to the BS to replenish its battery. As a result, the only issue is that we need to figure out an efficient charging strategy for the MC.
Q-Learning [19] is one of the most often used Reinforcement Learning (RL) algorithms. It learns to predict the quality, in terms of the expected cumulative reward, of an action in a specific state (the Q-value) [1]. Moreover, as it is a model-free reinforcement learning algorithm [20], [15], the agent does not have a model representation of the environment; it simply learns and acts without knowing the changes being caused in the environment. The methods in which an environment model is known are called model-based. In this case, the agent knows approximately how the environment is going to evolve. This is the reason why model-based methods focus on planning while model-free ones focus on learning [19].
The standard Q-learning framework consists of four components: an environment, one or more agents, a state space, and an action space, as shown in Fig. 2.4. The Q-value represents the approximate goodness of an action concerning the agent's goal. An agent chooses actions according to the policy and the Q-value. After performing an action, the agent modifies its policy to attain its goal. The Q-value is updated using the Bellman equation as follows:

$Q(S_t, A_t) \leftarrow (1 - \alpha)\, Q(S_t, A_t) + \alpha \left[ R_t + \gamma \max_a Q(S_{t+1}, a) \right]$,   (2.1)

where $Q(S_t, A_t)$ is the Q-value of action $A_t$ at a given state $S_t$, and $R_t$ is the reward obtained when performing action $A_t$ in the state $S_t$. Moreover, $\max_a Q(S_{t+1}, a)$ is the maximum possible Q-value in the next state $S_{t+1}$ over all possible actions $a$. $\alpha$ and $\gamma$ are the learning rate and the future reward discount factor, respectively; their values are set between 0 and 1.
An explicit procedure to implement the Q-learning algorithm is provided in Algorithm 1.

Figure 2.4: The agent-environment interaction in Q-learning (state $S_t$, reward $R_t$, action $A_t$).

Algorithm 1: Q-learning
1: Initialize Q(s, a);
2: for each episode do
3:     Get initial state s;
4:     Select a using the policy derived from Q;
5:     Take action a, observe the next state s' and obtain the reward r;
6:     Update Q(s, a) by equation (2.1);
7:     s ← s';
8: end for
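To make the update concrete, the following is a minimal Python sketch of tabular Q-learning following equation (2.1). The state/action sizes, the epsilon-greedy policy, and all variable names are illustrative assumptions rather than part of the thesis.

import numpy as np

n_states, n_actions = 10, 4
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate
Q = np.zeros((n_states, n_actions))

def choose_action(s):
    # Epsilon-greedy policy derived from Q (Algorithm 1, line 4).
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))

def update(s, a, r, s_next):
    # Equation (2.1):
    # Q(S_t, A_t) <- (1 - alpha) Q(S_t, A_t) + alpha [R_t + gamma max_a Q(S_{t+1}, a)]
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (r + gamma * Q[s_next].max())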
A statement is not certainly true or false in some cases. In such situations, fuzzy logic can be used as a flexible method for reasoning under the uncertainty.

In Boolean logic, a classic logical statement is a declarative sentence that delivers factual information. If the information is correct, the statement is true; if the information is erroneous, the statement is false. However, sometimes, true or false values are not enough.

Lotfi Zadeh [7] coined the term "fuzzy logic" in the 1960s to describe a type of logic processing that contains more than two truth values. The fact that some assertions contain imprecise or non-numerical information influences fuzzy logic. The term "fuzzy" was also used to describe ambiguity and unclear information. As a result, fuzzy logic can describe and manipulate ambiguous and uncertain data, and it has been used in a variety of industries.

Following the fuzzy method, fuzzy logic uses particular input values, such as multi-numeric values or linguistic variables, to produce a specific output. The fuzzy technique will determine whether an object fully or partially possesses a property, even if the property is ambiguous. For example, the term "extremely strong engine" is based on the fuzzy method: there are hidden degrees of intensity ("very") of the trait in question ("strong").
A fuzzy rule with multiple inputs and one output has the following form:

R_k: IF (I_1 is A_1) ⊙ (I_2 is A_2) ⊙ ... ⊙ (I_n is A_n),

where {I_1, ..., I_n} represents the crisp inputs to the rule, {A_1, ..., A_n} are linguistic variables, and the operator ⊙ can be AND, OR, or NOT. The Inference Engine is in charge of the estimation of the fuzzy output set. It calculates the membership degree ($\mu$) of the output for all linguistic variables by applying the rule set described in the Knowledge Base. For fuzzy rules with many inputs, the output calculation depends on the operators used inside them, i.e., AND, OR, or NOT. The calculation for each type of operator is described as follows:

(I_i is A_i AND I_j is A_j): $\mu_{A_i \wedge A_j}(u) = \min\left(\mu_{A_i}(u), \mu_{A_j}(u)\right)$,
(I_i is A_i OR I_j is A_j): $\mu_{A_i \vee A_j}(u) = \max\left(\mu_{A_i}(u), \mu_{A_j}(u)\right)$,

where $\mu_Z(u)$ is the output membership function of the linguistic variable Z.
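As an illustration of the min/max semantics above, here is a small Python sketch. The triangular membership function and the linguistic variables ("energy is Low", "distance is Near") are hypothetical examples, not the thesis's actual fuzzy sets.

def triangular(x, a, b, c):
    # Membership degree of x in a triangular fuzzy set with feet a, c and peak b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

mu_low = triangular(0.25, 0.0, 0.2, 0.5)     # degree to which "energy is Low"
mu_near = triangular(30.0, 0.0, 20.0, 60.0)  # degree to which "distance is Near"

mu_and = min(mu_low, mu_near)  # (energy is Low) AND (distance is Near)
mu_or = max(mu_low, mu_near)   # (energy is Low) OR (distance is Near)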
Chapter 3
Literature Review
Initially, this thesis introduces the existing works related to periodic charging in WRSNs. In [1], the authors leverage PSO and GA to propose a charging path determination algorithm that minimizes the docking time during which the MC stays at the depot; [1] jointly considers the charging path planning and the depot positioning. Lin et al. derive a new energy transfer model with distance and angle factors to charge all nodes. They use linear programming and obtain the optimal solution. As the charging schedule is always fixed, the periodic scheme fails to adapt to the dynamics of sensors' energy consumption.
Regarding the on-demand charging, the authors in [17] address the node failure problem. They first propose to choose the next charging node based on the charging probability. Second, they introduce a charging node selection method to minimize the number of other requesting nodes suffering from energy depletion. In [2, 11], aiming to maximize the charging throughput, the authors propose a double warning threshold charging scheme; two dynamic warning thresholds are triggered depending on the residual energy of sensors. The authors in [15] studied how to optimize the serving order of the charging requests waiting in the queue using the gravitational search algorithm. In [10], X. Cao et al. introduce a new metric (i.e., the charging reward), which quantifies the charging scheme's quality. The authors then address the problem of maximizing the total reward in each charging tour under the constraints of the MC's energy and the sensors' charging time windows. They use a deep reinforcement learning-based on-demand charging algorithm to solve the addressed problem.
The existing charging algorithms have two serious problems. First, the charging time problem has not been thoroughly considered. Most of the charging schemes leverage either the full charging approach [1, 2, 9, 10, 14, 17] or the partial charging one [21]. I want to emphasize that the charging time is an essential factor that decides how much the charging algorithm can prolong the network lifetime. Moreover, there is no existing work considering the target coverage and connectivity constraints concurrently. Most previous works treat all sensors in WRSNs evenly; hence, the MC may charge unnecessary sensors while necessary ones may run out of energy. Unlike them, this work addresses the target coverage and connectivity constraints in the charging schedule optimization. This thesis uniquely considers the different roles of the sensors concerning the target coverage and connectivity.

Fuzzy logic has been applied in many fields, such as robotics [24] and embedded controllers [25]. In WSNs, Fuzzy logic is a promising technique in dealing with various problems, including localization, routing [26, 27], clustering [19], and data aggregation [28, 29]. R. M. Al-Kiyumi et al. in [26] propose a Fuzzy logic-based routing for lifetime enhancement in WSNs, which maps network status into corresponding cost values to calculate the shortest path. In [20], the authors also leverage Fuzzy logic and Q-learning, but in a cooperative multi-agent system for controlling the energy of a microgrid. In [11], Fuzzy logic and Q-learning are combined to address the problem of thermal unit commitment. Specifically, each input state vector is mapped with the Fuzzy rules to determine all the possible actions with corresponding Q-values. Recently, the authors in [15] use Fuzzy logic in an algorithm for adaptively determining the charging threshold and deciding the charging schedule. Different from the others, I use Fuzzy logic and Q-learning in my unique Fuzzy Q-charging proposal. The earlier version of this work has been published in [31], which considers only Q-charging.
Figure 3.1: Network model (legend: sensor, base station, mobile charger (MC), depot, charging location).

Figure 3.1 shows the considered network model, in which a WRSN monitors several targets. The network has three main components: an MC, sensor nodes, and a base station. The MC is a robot that can move and carry a wireless power charger. The sensor nodes can receive charged energy from the MC via a wireless medium. The base station is static and responsible for gathering the sensing information. We assume that there are n sensors S_j (j = 1, ..., n) and m targets T_k (k = 1, ..., m). We call a sensor a target-covering sensor if it covers at least one target. Moreover, if there exists an alive routing path between a sensor and the base station, the sensor is connected to the base station. A target is defined to be monitored when at least one sensor connected to the base station covers it.
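The monitored-target definition above can be checked algorithmically: a target is monitored if some alive sensor covering it can reach the base station over alive nodes. The following Python sketch assumes hypothetical adjacency and coverage structures; it illustrates the definition and is not code from the thesis.

from collections import deque

def sensors_connected_to_bs(alive, neighbors, bs):
    # Breadth-first search from the base station over alive nodes only.
    reached, queue = {bs}, deque([bs])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v in alive and v not in reached:
                reached.add(v)
                queue.append(v)
    return reached

def monitored_targets(cover, alive, neighbors, bs):
    # cover[t] is the set of sensors covering target t.
    reached = sensors_connected_to_bs(alive, neighbors, bs)
    return {t for t, sensors in cover.items() if any(s in reached for s in sensors)}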
A sensor node whose remaining energy is below E_th (i.e., a predefined threshold) sends a charging request to the MC. We target a non-preemptive charging schedule, in which charging requests from sensors are queued at the MC. We assume that there are k charging locations denoted by D_1, ..., D_k in the network. When the MC completes its tasks at a charging location, it runs the proposed algorithm to select the next optimal charging location from D_1, ..., D_k. Moreover, the MC also determines the optimal charging time at that charging location. When the energy of the MC goes below a threshold, it returns to the depot to recharge itself. Besides gathering the sensing information, the base station is also responsible for collecting information about the remaining energy of the sensors. Based on that, the MC performs the following procedures to update the Q-table:
• The MC leverages Fuzzy logic to calculate a so-called safe energy level, which is sufficiently higher than E_th. The MC then uses the algorithm described in Section 4.3 to determine the charging time at each charging location. The charging time is optimized to maximize the number of sensors that reach the safe energy level.
• The MC calculates the reward of every charging location using (4.9) and updates the Q-table using equation (4.1).

After finishing charging at a charging location, the MC selects the next charging location as the one with the highest Q-value. Finally, the MC moves to the next charging location and charges for the determined charging time. When the energy of the MC goes below a threshold, it returns to the depot to recharge itself. Figure 4.1 presents the overview of our charging algorithm, and a sketch of the decision loop is given below.
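The following Python sketch summarizes the loop just described. The mc object and the helper functions (optimal_charging_time, update_q_table) stand in for the procedures of Sections 4.3-4.6 and are assumptions for illustration only.

def mc_decision_loop(Q, locations, mc, optimal_charging_time, update_q_table):
    # Q: dict of dicts mapping (current location -> next location -> Q-value).
    current = mc.location
    while mc.network_alive():
        if mc.energy < mc.energy_threshold:
            mc.return_to_depot()  # recharge the MC itself, then resume
            continue
        # Greedy step: the next location is the one with the highest Q-value.
        nxt = max(locations, key=lambda d: Q[current][d])
        tau = optimal_charging_time(nxt)  # optimized as in Section 4.3
        mc.move_to(nxt)
        mc.charge_for(tau)
        update_q_table(Q, current, nxt)   # update by equation (4.1)
        current = nxt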
In our Q-learning-based model, the network is considered the environment while the MC is the agent. A state is defined by the current charging location of the MC, and an action is a move to the next charging location. Each MC maintains its own Q-table, which is a two-dimensional array. Each row represents a state, and each column represents an action. An item Q(D_j, D_k) in the j-th row and k-th column represents the Q-value of the action in which the MC moves from the current charging location D_j to the next charging location D_k. Figure 4.2 shows an illustration of our Q-table. In the figure, the gray row represents the Q-values concerning all possible actions when the MC stays at the current charging location, and the green cell depicts the maximum Q-value regarding the next charging location.

Figure 4.2: Illustration of the Q-table (rows: states, i.e., charging locations; columns: actions, i.e., moves to the next charging location).
Let D_c be the current charging location and D_i be an arbitrary charging location; then the Q-value of the action of moving from D_c to D_i is iteratively updated by using the Bellman equation as follows:

$Q(D_c, D_i) \leftarrow Q(D_c, D_i) + \alpha \left( r(D_i) + \gamma \max_{D_j} Q(D_i, D_j) - Q(D_c, D_i) \right)$   (4.1)

The equation's right side consists of two elements: the current Q-value and the temporal difference. The temporal difference measures the gap between the estimated target, i.e., $r(D_i) + \gamma \max_{D_j} Q(D_i, D_j)$, and the old Q-value, i.e., $Q(D_c, D_i)$. $\alpha$ and $\gamma$ are two hyper-parameters named the learning rate and the discount factor, respectively. $r(D_i)$ is our proposed reward function, which will be detailed in Section 4.5.
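For concreteness, here is a minimal Python sketch of the update in equation (4.1). Q is assumed to be a dictionary of dictionaries indexed by charging locations, and reward stands for the r(D_i) function of Section 4.5; these representations are illustrative assumptions.

def update_q_value(Q, D_c, D_i, alpha, gamma, reward):
    # Estimated target: r(D_i) + gamma * max_{D_j} Q(D_i, D_j).
    target = reward(D_i) + gamma * max(Q[D_i].values())
    # Move the old Q-value toward the target by the temporal difference.
    Q[D_c][D_i] += alpha * (target - Q[D_c][D_i])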
In the following, we first describe our algorithms to determine the optimal charging time and the safety energy level in Sections 4.3 and 4.4. Then, we present the details of the reward function and the mechanism for updating the Q-table in Sections 4.5 and 4.6.
This work aims to design a charging strategy so that the number of sensors reaching a safe energy level is as large as possible after each charging round. Here, the safe energy level means an energy amount that is sufficiently greater than E_th. We define the safe energy level, E_s, as a function of E_th and E_max that is determined by the Fuzzy logic-based algorithm of Section 4.4, where E_max is the maximum energy capacity of the sensor. We then determine the optimal charging time T_i to minimize the number of critical sensors.

We adopt the multi-node charging model, in which the MC can charge multiple sensors simultaneously. Let $p^i_j$ denote the charging rate received by sensor S_j when the MC stays at D_i, and let $e_j$ denote the energy consumption rate of S_j, which is estimated by the MC. Suppose that the MC charges S_j at D_i; we denote the remaining energy of S_j when the charging process starts and finishes as $E_j$ and $E'_j$, respectively. Then $E'_j = E_j + (p^i_j - e_j) \times T_i$. At the charging location D_i, we call $p^i_j - e_j$ the energy gain of S_j. The remaining energy of S_j will increase if its energy gain is positive and decrease otherwise. Note that the energy of S_j reaches the safety energy level E_s if the charging time equals $(E_s - E_j)/(p^i_j - e_j)$; this value is named the safety charging time of S_j with respect to the charging location D_i and is denoted as $\Delta^i_j$.
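A short Python sketch of these quantities, under the stated model, may make them concrete; the variable names mirror the notation above and the code is an illustration, not part of the thesis.

def energy_gain(p_ji, e_j):
    # p_ji: charging rate of S_j at D_i; e_j: energy consumption rate of S_j.
    return p_ji - e_j

def safety_charging_time(E_s, E_j, p_ji, e_j):
    # Time needed for S_j to climb from E_j to the safe energy level E_s.
    gain = energy_gain(p_ji, e_j)
    if gain <= 0:
        return float("inf")  # S_j can never reach E_s while the MC stays at D_i
    return (E_s - E_j) / gain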
The sensors can be classified into four groups. The first and second groups contain normal sensors with positive energy gain and critical sensors with negative energy gain, respectively. The third and fourth groups contain normal sensors with negative energy gain and critical sensors with positive energy gain, respectively. Obviously, the sensors in the first and second groups do not change their status no matter how long the MC charges at D_i. In contrast, a sensor S_j in the third group will fall into the critical status, and a sensor in the fourth group can alleviate the critical status, if the