ODLIE: On-Demand Deep Learning Framework for Edge Intelligence in
Industrial Internet of Things
Khanh-Hoi Le Minh∗, Kim-Hung Le† University of Information Technology, Vietnam National University Ho Chi Minh City
Ho Chi Minh City, Vietnam. Email: ∗hoilmk@uit.edu.vn, †hunglk@uit.edu.vn
Abstract—Recently, we have witnessed the evolution of Edge Computing (EC) and Deep Learning (DL) serving Industrial Internet of Things (IIoT) applications, in which the execution of DL models is shifted from cloud servers to edge devices to reduce latency. However, achieving low latency for IoT applications remains a critical challenge because of the massive time required to deploy and operate complex DL models on constrained edge devices. In addition, the heterogeneity of IoT data and device types raises edge-cloud collaboration issues. To address these challenges, in this paper we introduce ODLIE, an on-demand deep learning framework for IoT edge devices. ODLIE employs DL right-selecting and DL right-sharing features to reduce inference time while maintaining high accuracy and edge collaboration. In detail, DL right-selecting chooses the appropriate DL model for various deployment contexts and user-desired qualities, while DL right-sharing exploits W3C semantic descriptions to mitigate the heterogeneity of IoT data and devices. To prove the applicability of our proposal, we present and analyze latency requirements of IIoT applications that are thoroughly satisfied by ODLIE.
Index Terms—Industry 4.0, Edge Intelligence, Deep Learning
framework, Industrial Internet of Things
I. INTRODUCTION
In the open world economy, industrial companies are under pressure to enhance product quality and integrate information technologies into traditional industry [1]. This raises the prevalence of the smart industry, also known as Industry 4.0, a revolution in manufacturing technologies enabling automation and data analysis across machines through connected smart objects (such as sensors and actuators). As a result, an enormous amount of data from various sources is generated for several manufacturing processes [2]. These data demand AI models to process and infer valuable knowledge to promote automation in smart factories. For example, Figure 1 presents the process of detecting anomalies on a product surface using scanned images from a camera. These images are then processed by an AI model on edge servers to decide whether a product is qualified.
Among AI approaches, deep learning is a breakthrough technology adopted in several scenarios because of its ability to derive high-level knowledge from a complex input space by using DL models [3]. Training these models may need a large amount of computational resources and input data; thus, the cloud-centric paradigm, constituted of powerful servers, is leveraged to perform these heavy tasks. In more detail, IoT data is transmitted from devices to cloud servers, where it is processed and analyzed by DL models. The model output
Fig. 1: An example scenario of surface inspection for a smart factory
is then returned to the devices. However, transmitting data over a long network distance from devices to the cloud may cause high end-to-end latency and various security issues. These drawbacks bring out the edge computing paradigm, which is low-latency and energy-efficient by leveraging the computation of network edge devices [4]. These devices act as cloud servers to collect IoT data, perform inference tasks, and respond to IoT devices. This shortens the data transmission route, thus reducing end-to-end latency. Therefore, converging IoT and AI at the edge has received substantial interest from both the research and industrial communities [5].
Fig. 2: Overview of edge intelligence
Despite several advantages of running DL models on edge devices, it is a non-trivial challenge because: (1) DL tasks demand excessive computational resources (storing models, inferring knowledge) compared with edge device capacity; (2) DL models are trained on a general dataset in the cloud and are thus inefficient in specific edge contexts; (3) the heterogeneity of hardware specifications and collected data
limits the collaboration between edge devices. To overcome
these challenges, we introduce an On-demand Deep Learning framework for Edge devices, namely ODLIE. The superiority of our proposal comes from two key features. The first one is DL right-selecting, which selects suitable DL models for various deployment contexts, minimizing the end-to-end latency while guaranteeing the user-desired performance. To propose an appropriate selection, DL right-selecting examines the model running information formed as a tuple <Accuracy, Latency, Computation, Memory>. The second one is DL right-sharing, which exploits the semantic description provided by W3C to describe the edge device resources (performance capability, DL models, collected data). This enables direct access to these resources via a uniform RESTful service, and thus the heterogeneity and complexity of edge systems are removed. In summary, our contributions are listed below:
• A systematic overview of edge intelligence (EI) is presented. Based on this knowledge, fundamental requirements of EI are also identified.
• We propose an on-demand deep learning framework for edge devices, namely ODLIE. Our framework addresses the current EI limitations (latency, computational power, and data sharing and collaboration) by employing DL right-selecting and DL right-sharing features. The applicability of the proposal in various IoT deployment contexts is also discussed.
The remainder of the paper is organized as follows. In Section II, we present our motivation and a systematic overview of edge intelligence. The ODLIE framework and its deployment contexts are depicted in Section III. The conclusion is reported in Section IV.
II. EDGE INTELLIGENCE OVERVIEW
Edge intelligence has been emerging as a consequence of the exponential growth of IoT devices (smart objects, sensors, actuators) and stringent requirements on latency, accuracy, and security. Applying EI to IoT is expected not only to satisfy these requirements but also to derive more business value from sensory data [6]. Being closer to data sources than the core cloud, EI owns advantages regarding end-to-end latency, efficiency, and security. Instead of sending data to the central cloud, the analysis is located close to the data sources. It thus reduces the end-to-end latency and the security risks relating to data interception and violation during transfer. On the other hand, IoT data is processed by well-trained models and converted into valuable information, enabling end-users to react in real time to observed events. In summary, the presence of EI is to deal with massive data at edge nodes in an intelligent way [7].
A. Definition
Recently, several research groups have been working on EI definitions and related concepts. The International Electrotechnical Commission (IEC) presents EI as a process to collect and store raw data from IoT devices before performing machine learning tasks at edge nodes [8]. IEC also discusses the shift of information technologies from the cloud to the edge level to deal with the challenges of latency, security, and connectivity. In our vision, we define EI as the capability to perform AI operations on edge devices. Depending on the edge hardware and AI algorithms, each edge device has a different capacity, which is represented via four key factors: accuracy, latency, computation, and memory.
• Accuracy (higher is better) represents the correctness of AI outputs, indicated by different metrics. For example, accuracy in object detection is measured by the mean average precision, whereas the F-score is widely used in anomaly detection tasks. Since the computational resources of edge devices are constrained (such as computational capacity and memory), most AI mechanisms deployed on the edge must be optimized by compression, quantization, or other methods. However, these optimizations may reduce inference accuracy.
• Latency (lower is better) is the end-to-end delay representing the total time from sending a request to receiving a response from the edge device.
• Computation and Memory (lower is better) refer to the increase of CPU and RAM usage when performing AI inference tasks. These two factors represent the computational requirements of each AI model.
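The four factors above can be recorded as a simple per-model profile; a minimal Python sketch (the class and field names are our own illustration, not part of ODLIE, and the accuracy value is purely assumed):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EdgeProfile:
    """Running profile of one AI model on one edge device,
    mirroring the <Accuracy, Latency, Computation, Memory> tuple."""
    accuracy: float     # correctness metric (e.g., mAP or F-score), higher is better
    latency: float      # end-to-end seconds per request, lower is better
    computation: float  # extra CPU usage fraction during inference, lower is better
    memory: float       # extra RAM usage fraction during inference, lower is better

# Latency/CPU/RAM follow the Raspberry Pi 4 figures reported later in the
# paper for "Mobilenet v2"; the accuracy number is illustrative only.
mobilenet_v2 = EdgeProfile(accuracy=0.71, latency=0.687,
                           computation=0.07, memory=0.37)
```

Such a record is what the model selector in Section III compares across candidate models.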
B. Overview of EI
Fig. 3: Collaboration of edge intelligence
1) EI Collaboration: Figure 3 shows an overview of EI including several technologies, such as algorithms, software, and hardware platforms. These technologies cooperate in providing intelligent services (e.g., defect detection, vehicle counting). There are two types of collaboration [9]:
• Cloud to Edge: The AI model is trained on the cloud before being transferred to edge devices, where inference tasks are performed. If these tasks exceed the device capacity, offloading between edge devices may be employed. In some cases, edges retrain the received model with incoming data and then synchronize with the original model on the
cloud. This process is also known as transfer learning [10].
• Edge to Edge: A set of edges collaborates on intensive tasks demanding high computational capability. For example, to train a complex model, the model and training data are separated and allocated to edges based on their computational capability. After successful training, the models are synchronized together [11].
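The allocation step in edge-to-edge collaboration can be sketched as a proportional split of the training workload; a hedged Python illustration (the function and its capability-proportional heuristic are our assumption, not the scheme of [11]):

```python
def partition_batches(num_batches, capabilities):
    """Allocate training batches to edges in proportion to their
    computational capability; leftover batches from integer rounding
    go to the most capable edge."""
    total = sum(capabilities)
    shares = [num_batches * c // total for c in capabilities]
    fastest = max(range(len(capabilities)), key=lambda i: capabilities[i])
    shares[fastest] += num_batches - sum(shares)  # hand out the remainder
    return shares

# Three edges with relative capabilities 3:1:1 splitting 10 batches
print(partition_batches(10, [3, 1, 1]))  # → [6, 2, 2]
```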
2) EI Dataflow: Depending on the EI collaboration type, EI data flows are operated in different ways. As shown in Figure 3, the collected data has three major flows:
• The edge devices send the data to cloud servers, which store well-trained models. These servers then perform inference tasks on the received data and return the outputs to the edges. However, the drawback of this flow is high latency, since the data has to travel over the network backbone.
• The inference tasks are executed directly on edge devices using well-trained models downloaded from the cloud. In some cases, the model must be optimized to fit the edge capacity. The advantages of this data flow are low latency and reduced security risks.
• The general AI model loaded from the cloud to the edge is locally retrained with the collected data. This process makes the model more specific to each edge, increasing model accuracy. However, training complex AI models on a single edge may drain the device battery or even crash the device. In this case, edge-to-edge collaboration is required.
III. ODLIE: ON-DEMAND DEEP LEARNING FRAMEWORK FOR EDGE INTELLIGENCE
In this section, we present an on-demand deep learning framework for edge intelligence (ODLIE) supporting AI model selection and data sharing. The goal of ODLIE is to reduce the runtime of inference tasks while maximizing accuracy and edge collaboration. Applying our proposed framework could make EI suitable for various edge devices (e.g., Raspberry Pi 3, NVIDIA Jetson Nano).
A. Requirements
In general, the main goal of ODLIE is to turn a single-board computer (such as a Raspberry Pi, Jetson Nano, or BeagleBone) into an edge intelligence node able to run complex AI models or algorithms. For example, a Raspberry Pi equipped with ODLIE could perform real-time object detection based on an onboard camera. In this case, several challenges may emerge: (1) How does a Raspberry Pi device meet the real-time requirement? (2) How do edges collaborate together through ODLIE?
Following the mentioned example, we define three key requirements of an EI framework as follows:
• Facility: Currently, deploying and running an AI model on an edge device is a complicated process. By wrapping this process behind uniform RESTful web services, ODLIE is easy to use even for non-technical users.
Fig. 4: The overview of ODLIE
• Adaptation: In the AI world, there is a large number of AI models with different properties (size, accuracy, format) and purposes. Thus, selecting a suitable AI model for edges based on the deployment context is essential in EI.
• Interoperability: To collaborate and share data (output results, collected data) with heterogeneous edges, the EI platform has to semantically describe its resources using uniform descriptions and access methods.
Our proposed EI framework could fulfill the above requirements. As shown in Figure 4, the ODLIE architecture has three components: (1) DL sharing is used to interact with other edge devices and end-users in a semantic manner; (2) DL right-selecting takes responsibility for selecting the most suitable model for edges; (3) the Package manager is used to run or train the chosen model directly on edge devices.
B. DL right-selecting
Along with the development of AI applications, deep learning models based on neural networks have been significantly growing in terms of quality and categorization. Each model is designed for dedicated purposes. For example, a generic object detection model is insufficient for detecting defects on product surfaces. Thus, a method for selecting the most suitable model for different edge capabilities in different deployment scenarios is necessary. Aware of this demand, ODLIE arms the DL right-selecting feature, including model selector and model optimizer components. After receiving a
Fig. 5: Model Selector
deployment request from the end-user, the model selector extracts user-desired configurations such as desired accuracy and latency. This can be considered a multi-dimensional space selection problem. As an example shown in Figure 5, ODLIE deployed for object detection purposes takes at least three dimensions into account: AI models (ResNet, MobileNets), DL software platforms (TensorFlow, PyTorch, MXNet), and edge hardware (NVIDIA Jetson Nano, Raspberry Pi 3 and 4). In more detail, the model selector first evaluates the capacity of the hardware platform via four main factors formed as a tuple <Accuracy, Latency, Computation, Memory>. This information is obtained by running AI models or shared between edges. Then, the most suitable model is selected by solving the optimization problem
\[
\underset{m \in \text{Models}}{\arg\min} \; \langle A, L, C, M \rangle
\quad \text{s.t.} \quad A \geq A_{req},\; L \leq L_{req},\; C \leq C_{pro},\; M \leq M_{pro}
\tag{1}
\]
where <A, L, C, M> represents the tuple <Accuracy, Latency, Computation, Memory>. A_req and L_req refer to the user-desired accuracy and latency. C_pro and M_pro are the computation (CPU) and memory (RAM) footprints while running the model on edge devices. As shown in Equation 1, the goal of the model selector is to select the best-fit model, which meets not only the accuracy and latency requirements but also the CPU and RAM consumption limits. A reinforcement learning algorithm could be exploited to enhance the selection performance.
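Equation 1 can be sketched as a brute-force filter-and-rank over profiled models; a minimal Python illustration (the latency, CPU, and RAM numbers echo the Raspberry Pi 4 figures in the deployment scenarios, while the accuracy values, requirement thresholds, and tie-breaking rule are our assumptions):

```python
def select_model(profiles, acc_req, lat_req, cpu_avail, mem_avail):
    """Solve Eq. 1 by exhaustive search: keep models satisfying all four
    constraints, then prefer higher accuracy, breaking ties on latency.
    Returns None when no model is feasible."""
    feasible = [p for p in profiles
                if p["accuracy"] >= acc_req and p["latency"] <= lat_req
                and p["cpu"] <= cpu_avail and p["mem"] <= mem_avail]
    return min(feasible, key=lambda p: (-p["accuracy"], p["latency"]),
               default=None)

profiles = [
    {"name": "rcnn_resnet",  "accuracy": 0.85, "latency": 16.3,  "cpu": 1.00, "mem": 0.50},
    {"name": "mobilenet_v2", "accuracy": 0.71, "latency": 0.687, "cpu": 0.07, "mem": 0.37},
]
best = select_model(profiles, acc_req=0.7, lat_req=1.0, cpu_avail=0.8, mem_avail=0.5)
# Only "mobilenet_v2" meets the 1-second latency and 80% CPU budget here.
```

A reinforcement learning agent, as mentioned above, could replace the exhaustive search when the model space grows large.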
C. DL sharing
DL sharing supports interaction with end-users via the cloud, other edges, and IoT devices. Based on the semantic description framework provided by W3C [12], all edge resources are described using a semantic language, such as processed data, running AI models, and device information. These resources are encoded in the JSON-LD format1 and directly accessed via uniform URIs. Figure 6 shows a simple
1 https://json-ld.org/
Fig. 6: An example of an edge description
example of an edge resource description, which has three main sections. The first section presents the general information of the edge device, such as name, id, model, and security method. The next section, “properties”, describes the available resources of edge devices and their access methods. In the example, we can access and capture an image from the onboard camera of the edge device by calling an HTTP “GET” request. The last section, namely “actions”, describes the supported actions of services provided by the edge. As shown in the example, the edge device supports an image detection service via the HTTP “POST” method. The other components in DL sharing are:
• Lab Notebook provides a programming environment to create and evaluate AI models before importing them into DL right-selecting. Its interface is similar to Jupyter Notebook2.
• Service Editor is built on the Node-RED3 platform, supporting end-users in creating simple applications from edge resources, which are described in properties and actions in the edge resource description.
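The uniform access pattern above can be sketched by resolving a resource's URI and HTTP method from its description; a minimal Python illustration (the fragment loosely follows the W3C Thing Description style of Figure 6, but the exact field names, id, and URLs are our assumptions, not ODLIE's actual schema):

```python
import json

# Illustrative fragment of an edge resource description (cf. Fig. 6)
edge_td = json.loads("""
{
  "id": "urn:example:edge-cam-01",
  "properties": {
    "camera": {"forms": [{"href": "http://edge-01.local/properties/camera",
                          "htv:methodName": "GET"}]}
  },
  "actions": {
    "detect": {"forms": [{"href": "http://edge-01.local/actions/detect",
                          "htv:methodName": "POST"}]}
  }
}
""")

def resource_endpoint(td, section, name):
    """Look up the HTTP method and URI advertised for one resource."""
    form = td[section][name]["forms"][0]
    return form["htv:methodName"], form["href"]

# A client captures an image via GET, then could POST it for detection
method, uri = resource_endpoint(edge_td, "properties", "camera")
```

Because every edge publishes the same structure, a client can drive heterogeneous devices through this single lookup.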
D. Package manager
Similar to TensorFlow Lite4 for mobile phones, the Package manager of ODLIE aims to provide a lightweight DL software environment for edge devices. It supports inferring and training AI models in an optimized way (low computational and memory footprint). There are three key components in the package manager: (1) Model training is used to retrain the model based on collected data. It makes the model well fit the specific data features of each edge device; thus, the model has better performance than general models. (2) Model inference aims to perform near-real-time inference tasks by marking all AI tasks as high-priority operations to the system. Thus, these
2 https://jupyter.org/
3 https://nodered.org/
4 https://www.tensorflow.org/lite
tasks are provided with the maximum computational resources of the device. (3) The hardware connection is used to connect ODLIE to hardware components of the edge device. Through this connection, our framework could effectively manage the generated training and inference tasks. The optimization in both the models and the DL running environment is capable of accelerating the EI framework performance. As a result, a single-board computer with limited memory and computational capability could run a demanding DL task smoothly.
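The high-priority marking of inference tasks can be sketched with POSIX process niceness; a hedged Python illustration (ODLIE's actual prioritization mechanism is not specified in the text, and lowering niceness below zero requires elevated privileges, so this sketch degrades gracefully when unprivileged):

```python
import os

def run_with_priority(task, *args, niceness=-5):
    """Run an inference task with raised scheduling priority when
    permitted; unprivileged processes keep their current priority."""
    try:
        os.nice(niceness - os.nice(0))  # attempt to lower our niceness
    except (AttributeError, PermissionError, OSError):
        pass  # not on Unix or lacking privileges; run at normal priority
    return task(*args)

# The task itself is a stand-in for a real model inference call
result = run_with_priority(lambda pixels: sum(pixels) / len(pixels), [0.2, 0.4])
```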
E. Deployment scenarios
With the significant growth of IoT and AI, several intelligent applications have been applied in different aspects of life, from home safety and smart transportation to health care services. We describe some typical EI scenarios where our proposal could be deployed.
1) Surface inspection for smart factory: EI is expected to significantly reduce the latency of industrial applications compared with the central cloud paradigm. Consider the surface inspection application in the automated assembly line shown in Figure 1: the products are conveyed to the right position, and an inspection camera captures their surfaces. The output images are processed at an edge server, which validates whether the product passes or not. If processing each product incurs even a small delay, the cumulative delay will be considerable. Moreover, the product types in the factory are various, and there is no general model fitting all of them. We set up an illustration with a Raspberry Pi 4 as the edge server. The latency and demanded resource consumption of inspection tasks of various AI models are reported in Figure 7 and Figure 8, respectively. The differences between models are notable. The slowest model is “Rcnn resnet”, taking 16.3 seconds to process, while the best one (“Mobilenet v2”) only needs 0.687 seconds. Similar results are found for resource consumption: “Rcnn resnet” consumes around 100% CPU and 50% RAM when performing the inference tasks, whereas “Mobilenet v2” is significantly lower at about 7% CPU and 37% RAM. All these results demonstrate the need for an on-demand deep learning framework supporting the selection of the most suitable DL models and the sharing of edge resources.
Fig. 7: Surface inspection latency breakdown
Fig. 8: Surface inspection resource consumption breakdown
2) Real-time camera monitoring system for large warehouses: A large number of cameras is deployed to enhance the safety of large warehouses (such as access control by face recognition, or theft detection). These monitoring applications require real-time video analytics to detect abnormal situations, which generates several challenges. Firstly, the edge devices are not powerful enough to execute an extensive convolutional neural network that provides high detection accuracy. Secondly, the vibration and noise in the captured video make the analysis process more difficult. Pre-processing the models' input data is necessary to achieve high efficiency, but this additional step may increase the end-to-end latency, so we have to consider the balance between latency and accuracy of the system. Finally, the interoperation of large camera monitoring systems with other management systems in the whole smart factory context is also a considerable challenge. To mitigate all the mentioned issues, ODLIE could be deployed directly on cameras or edge servers to support complex detection tasks, such as face recognition and people counting.
IV. CONCLUSION
With the explosion of AI and edge computing, along with the strict requirements of IIoT applications, edge intelligence is a potential solution to reduce end-to-end latency while maintaining the service quality offered by the cloud. Addressing the challenges relating to limited computational capability as well as collaboration, we introduce ODLIE, an on-demand deep learning framework supporting the selection of deep learning models based on the deployment context and user-desired quality. Besides, ODLIE could enhance data sharing capability by leveraging the semantic description concept. We hope that ODLIE could serve as a model when developing applications and frameworks for EI.
ACKNOWLEDGEMENT
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number DSC2021-26-04.
REFERENCES
[1] Y. Lu, “Industry 4.0: A survey on technologies, applications and open research issues,” Journal of Industrial Information Integration, vol. 6, pp. 1–10, 2017.
[2] L. D. Xu, E. L. Xu, and L. Li, “Industry 4.0: state of the art and future trends,” International Journal of Production Research, vol. 56, no. 8, pp. 2941–2962, 2018.
[3] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing, vol. 234, pp. 11–26, 2017.
[4] J. Wang, Y. Ma, L. Zhang, R. X. Gao, and D. Wu, “Deep learning for smart manufacturing: Methods and applications,” Journal of Manufacturing Systems, vol. 48, pp. 144–156, 2018.
[5] W. Yu, F. Liang, X. He, W. G. Hatcher, C. Lu, J. Lin, and X. Yang, “A survey on the edge computing for the internet of things,” IEEE Access, vol. 6, pp. 6900–6919, 2017.
[6] A. Yousefpour, C. Fung, T. Nguyen, K. Kadiyala, F. Jalali, A. Niakanlahiji, J. Kong, and J. P. Jue, “All one needs to know about fog computing and related edge computing paradigms: A complete survey,” Journal of Systems Architecture, 2019.
[7] H. El-Sayed, S. Sankar, M. Prasad, D. Puthal, A. Gupta, M. Mohanty, and C.-T. Lin, “Edge of things: the big picture on the integration of edge, IoT and the cloud in a distributed computing environment,” IEEE Access, vol. 6, pp. 1706–1717, 2017.
[8] IEC, “Edge intelligence (white paper),” 2018.
[9] X. Wang, Y. Han, V. C. Leung, D. Niyato, X. Yan, and X. Chen, “Convergence of edge computing and deep learning: A comprehensive survey,” IEEE Communications Surveys & Tutorials, 2020.
[10] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: Deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.
[11] W. Z. Khan, E. Ahmed, S. Hakak, I. Yaqoob, and A. Ahmed, “Edge computing: A survey,” Future Generation Computer Systems, vol. 97, pp. 219–235, 2019.
[12] K.-H. Le, S. K. Datta, C. Bonnet, and F. Hamon, “WoT-AD: A descriptive language for group of things in massive IoT,” in 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), pp. 257–262, IEEE, 2019.