A Practical High Efficiency Video Coding Solution for Visual Sensor Network using Raspberry Pi Platform
Thao Nguyen Thi Huong1, Huy Phi Cong1, Tien Vu Huu1, Xiem HoangVan2
1 PTIT – Posts and Telecommunications Institute of Technology
2 VNU – University of Engineering and Technology
thaonth@ptit.edu.vn; huypc@ptit.edu.vn; tienvh@ptit.edu.vn; xiemhoang@vnu.edu.vn
Abstract
Visual sensor networks (VSNs) have recently emerged as a promising solution for a wide range of new vision-sensor based applications, from video surveillance and environmental monitoring to remote sensing. However, practical VSNs currently face visual processing and transmission problems due to the limited power at sensor nodes and the restricted transmission bandwidth. In this context, selecting a suitable video compression algorithm is of utmost importance for achieving a practical VSN. To address this problem, this paper introduces a practical Raspberry Pi based High Efficiency Video Coding (HEVC) solution for visual sensor networks. The selected video coding solution is one of the most up-to-date compression engines yet still offers a low complexity capability. Experimental results show that the proposed video coding architecture achieves good compression performance with acceptable complexity.
Keywords: Visual sensor network, Raspberry Pi,
HEVC
1 Introduction
Nowadays, Visual Sensor Networks (VSNs) [1, 2] play an important role in the era of the Internet of Things. A VSN typically consists of a large number of sensor nodes, i.e., cameras. VSNs have been successfully applied in many applications, such as video surveillance and security systems, where a network of nodes can identify and track objects from their visual information, i.e., video. Such networks are made up of multiple cameras capable of capturing visual information from their surrounding environment, performing simple processing on the captured data, and transmitting it to remote locations for further content analysis and distribution.
However, in a VSN, sensor nodes usually have limited processing capabilities and power budgets. This constraint naturally requires lightweight video signal processing and compression algorithms at individual sensor nodes. At the same time, the restricted transmission bandwidth in a VSN also calls for an efficient video compression solution at each sensor node. Meeting these two requirements is critical to achieving a practical VSN system.
Video coding aims to reduce the size of video data by exploiting the spatial, temporal and statistical correlation of video as well as the characteristics of the human visual system. Current video coding standards, such as H.264/AVC [3] and High Efficiency Video Coding (HEVC) [4], can drastically reduce the size of the transmitted video data while still guaranteeing acceptable decoded quality at the receiver. HEVC is the most recent video coding standard and provides around 50% bitrate reduction compared with the widely deployed H.264/AVC standard [3] while preserving the same subjective quality. However, the compression efficiency of HEVC usually comes from a large number of coding modes and an expensive selection process, e.g., 35 intra prediction modes and a costly motion estimation process. This may restrict the use of this video compression engine in a practical VSN.
In this context, we present a practical, low complexity HEVC solution for visual sensor networks using the common Raspberry Pi platform [5]. The low complexity characteristic is achieved by using an appropriate HEVC compression profile, as described later. The Raspberry Pi platform is chosen because it is popular, low cost and able to play the role of a sensor node in a visual sensor network. The HEVC Test Model (HM) reference software [6] is used as the implementation of the HEVC encoder.
2018 IEEE 12th International Symposium on Embedded Multicore/Many-core Systems-on-Chip
The rest of the paper is organized as follows. Section 2 gives a brief overview of visual sensor networks and some related works. Section 3 presents the selected video compression solution and describes the Raspberry Pi platform. Afterwards, Section 4 provides the performance evaluation of the proposed video coding solution, including compression performance and encoding complexity assessments. Finally, Section 5 gives some conclusions and remarks on future work.
2 Related Work
A VSN usually consists of tiny visual sensor nodes, such as camera sensors, which integrate an image sensor, an embedded processor and a wireless transceiver. Fig. 1 illustrates an example of a VSN that consists of hundreds of camera nodes and a base station (BS).
[Figure: camera sensor nodes (CSNs) sending video bitstreams to a base station]
Fig. 1. An example of visual sensor networks
In the VSN architecture, camera nodes capture the visual data, process it and transmit valuable video information to the BS for further analysis. Usually, camera nodes are small and require a long battery lifetime. Meanwhile, they must perform visual data processing and communication, which are computationally expensive, under limited bandwidth. Therefore, the data collected by a sensor node should be compressed at that node before being sent to the destination. However, this is not easy because traditional video codecs are usually designed for broadcasting (one-to-many) applications, in which the encoder is much more complex than the decoder. This complexity balance is naturally reversed in a VSN, which follows a many-to-one information flow and therefore needs a light encoder at each node.
In the literature, some works focus on solutions for the communication of video data on devices with limited hardware resources. For example, distributed video coding architectures were proposed in [7, 8] to meet the low complexity requirement of VSNs. Although experimental results showed that this is a promising direction, there is still a gap in compression efficiency between distributed video coding solutions and the current video coding standards, e.g., H.264/AVC or HEVC. The works in [9, 10] implemented traditional video codecs such as H.264/AVC on low complexity devices. Both approaches reported good results with low delay under the constraints of typical low complexity devices.
3 Proposed Video Compression Platform
3.1 Proposed Video Coding based VSN
The overall Raspberry Pi based low complexity HEVC architecture is illustrated in Fig. 2.
[Figure: a YUV video sequence enters the HEVC encoder on the Raspberry Pi module; the HEVC video stream travels through the video sensor network to the HEVC decoder at the base station]
Fig. 2. Proposed video coding architecture
In this architecture, the Raspberry Pi platform plays the role of a sensor node in a VSN. Raw video sequences are fed into the Raspberry Pi to be encoded, and the Raspberry Pi produces a video bitstream using the most recent HEVC standard. The HEVC bitstream is then transmitted through the VSN to the base station, a higher complexity device (a computer in this case), for decoding and further processing.
3.2 Raspberry Pi Platform
Raspberry Pi is an embedded platform that runs the Linux operating system. It is manufactured in the UK with the purpose of inspiring the teaching of basic computer science in educational institutions [5]. In this research, the most recent Raspberry Pi 3 model is used. Fig. 3 illustrates the Raspberry Pi platform.
Fig. 3. Selected Raspberry Pi 3 model
The Raspberry Pi 3 is built around the Broadcom BCM2837 processor, which integrates the CPU, GPU, audio/video processor and other features into a single low-power chip.
The Raspberry Pi 3 has a Camera Serial Interface (CSI) connector to attach a camera module directly to the Broadcom VideoCore IV Graphics Processing Unit (GPU) using the CSI protocol. Although it is as small as a credit card, the Raspberry Pi can still work as a normal computer and can play 1080p video without lagging.
However, the Raspberry Pi cannot completely replace a computer. One disadvantage is that it does not support the Windows operating system; instead, it runs Linux with utilities including a web browser, a desktop environment and other tools. In addition, the Raspberry Pi is inexpensive compared to a computer and requires much less power, which is a necessary feature in sensor networks.
3.3 HEVC Low Complexity Profile
The first version of the HEVC standard was finalized in January 2013 to fulfill emerging video resolution and quality requirements in traditional video broadcasting, tele-conferencing and mobile applications. The HEVC standard still adopts the hybrid predictive and transform coding architecture, which has been widely used in video coding standards since H.261 [11]. In HEVC, the correlation between consecutive frames is mainly exploited by the Inter coding modes, while the spatial correlation between samples inside each frame is exploited by the Intra coding modes. As reported, HEVC Inter coding significantly outperforms HEVC Intra coding in terms of compression performance. However, due to the large number of computations associated with the motion estimation process, the HEVC Inter coding profile may not be suitable for video applications with low complexity requirements.
HEVC Intra coding contains several improvements over the prior H.264/AVC Intra coding solution. The novelties of HEVC Intra coding [12] are the following:
1) Larger and more flexible coding block sizes: The Coding Tree Unit (CTU) in HEVC can be as large as 64×64 pixels in order to better exploit spatial correlation, especially for high definition pictures, and to adapt better to different video content.
2) Angular prediction with 33 prediction directions: When large block sizes are used, more prediction directions help to predict directional structures in video content accurately.
3) Removing intra artifacts by boundary smoothing: The discontinuities along block boundaries introduced by intra prediction are removed.
4) Removing intra artifacts by reference sample smoothing: Depending on the block size and prediction mode, the reference samples are smoothed to reduce contouring artifacts.
5) Block size-dependent transform selection: HEVC utilizes intra mode dependent transforms and coefficient scanning for coding the residual information.
6) Intra mode coding based on contextual information: Due to the substantially increased number of intra modes, more efficient coding techniques are required for mode coding in HEVC.
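As a rough illustration of items 1) and 2), the following Python sketch counts the 64×64 CTUs needed to cover one 832×480 frame (the resolution of the test sequences in Section 4) and enumerates the 35 HEVC intra prediction modes, i.e., planar, DC and the 33 angular modes. The function and variable names are illustrative and are not taken from the HM software.

```python
import math

def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows) of CTUs needed to cover a frame."""
    cols = math.ceil(width / ctu_size)
    rows = math.ceil(height / ctu_size)
    return cols, rows

# HEVC intra modes: index 0 is planar, 1 is DC, 2..34 are the angular modes.
INTRA_MODES = {0: "planar", 1: "DC"}
INTRA_MODES.update({m: f"angular_{m}" for m in range(2, 35)})

cols, rows = ctu_grid(832, 480)
num_angular = sum(1 for name in INTRA_MODES.values() if name.startswith("angular"))
print(cols, rows, cols * rows)        # 13 8 104
print(len(INTRA_MODES), num_angular)  # 35 33
```

The last CTU row is only partially filled, since 480 is not a multiple of 64.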
An important feature of the HEVC encoder is its fast encoding mode. As the number of intra prediction modes increases, the rate-distortion (RD) optimization process becomes more complex. To address this problem, the HEVC reference software introduces a fast encoding algorithm for the large set of prediction candidates. Experiments performed with the official HM 6.0 reference software [6] show that the fast encoding algorithm can reduce the encoding time by a factor of about three with only a slight reduction in coding gain. In other words, the HM 6.0 encoder can provide a better compromise between coding efficiency and complexity.
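The idea behind a fast intra mode decision can be sketched as a two-stage search: a cheap cost ranks all candidate modes, and the expensive RD cost is evaluated only for the few best-ranked ones. The Python sketch below is a simplified illustration and not the HM implementation; the cost functions, the shortlist size `num_full` and the toy cost values are assumptions made for the example.

```python
def fast_mode_decision(modes, rough_cost, full_rd_cost, num_full=3):
    """Pick a mode by full RD cost among the num_full cheapest rough candidates."""
    # Stage 1: rank every mode with the cheap estimate and keep a shortlist.
    shortlist = sorted(modes, key=rough_cost)[:num_full]
    # Stage 2: run the expensive RD evaluation on the shortlist only.
    best = min(shortlist, key=full_rd_cost)
    return best, len(shortlist)

# Toy costs (hypothetical): the rough cost is an imperfect proxy of the full cost.
full = {m: abs(m - 10) + 5.0 for m in range(35)}
rough = {m: abs(m - 11) + 5.0 for m in range(35)}

best, evaluated = fast_mode_decision(range(35), rough.get, full.get)
print(best, evaluated)  # 10 3
```

In this toy case only 3 of the 35 modes go through the full RD evaluation, and the best mode is still found because it survives the rough ranking. When the rough cost is a poor proxy, the true best mode can be missed, which corresponds to the slight coding loss mentioned above.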
4 Experimental Results
4.1 Test Methodology
In order to evaluate the proposed video coding architecture, the common rate-distortion (RD) performance and complexity performance metrics are used [13]. The RD performance metric represents the relationship between the bitrate needed (in kbps) and the peak signal-to-noise ratio (PSNR, in dB) achieved. For the same bitrate, the higher the PSNR, the better the quality of the decoded frames. In other words, RD performance shows the quality of the encoded video sequence at a given bitrate. The second metric, complexity performance, is measured as the time consumed for encoding. In addition, in order to evaluate the feasibility of the proposed architecture on Raspberry Pi, the results on the Raspberry Pi are compared to the results on a Personal Computer (PC). The basic configurations of the Raspberry Pi and the PC are shown in Table 1.
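For reference, PSNR for 8-bit video is commonly computed from the mean squared error (MSE) between the original and decoded samples as 10·log10(255² / MSE). The Python sketch below assumes 8-bit samples held in flat lists; the function name is illustrative and the sample values are made up for the example.

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized sample sequences."""
    if len(original) != len(decoded):
        raise ValueError("sequences must have the same length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

# Made-up sample values, for illustration only.
orig = [100, 120, 140, 160]
dec = [101, 119, 142, 158]
print(round(psnr(orig, dec), 2))
```

A smaller MSE gives a higher PSNR, which is why a lower QP (finer quantization) leads to higher PSNR values at the cost of a higher bitrate.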
Table 1. Configuration of Raspberry Pi and PC
CPU type/speed   ARM
Fig. 4. The first frames of the test sequences: RaceHorses, BasketballDrill, BQMall and PartyScene
Table 2. Characteristics of test video sequences
Test sequence     Spatial resolution   Temporal resolution   Number of frames   QP
RaceHorses        832x480              30 Hz                 300                7, 17, 27, 37, 47
BasketballDrill   832x480              50 Hz                 500                7, 17, 27, 37, 47
BQMall            832x480              60 Hz                 600                7, 17, 27, 37, 47
PartyScene        832x480              50 Hz                 500                7, 17, 27, 37, 47
In this implementation, four common video sequences are used for the assessment: RaceHorses, BasketballDrill, BQMall and PartyScene, with the characteristics summarized in Table 2. These sequences were selected because they are representative of different motion and texture characteristics. Each sequence is assessed at five RD points corresponding to quantization parameters (QP) 7, 17, 27, 37 and 47. The first frame of each sequence is shown in Fig. 4.
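One way to script the 20 encoding runs (four sequences, five QPs) is to generate one HM encoder command line per run. The Python sketch below only builds the argument lists and does not launch the encoder. The executable name `TAppEncoderStatic`, the configuration file `encoder_intra_main.cfg` and the YUV and bitstream file names are assumptions for illustration; the paper does not state the exact files or configuration used.

```python
SEQUENCES = {
    # name: (frame rate in Hz, number of frames), taken from Table 2
    "RaceHorses": (30, 300),
    "BasketballDrill": (50, 500),
    "BQMall": (60, 600),
    "PartyScene": (50, 500),
}
QPS = [7, 17, 27, 37, 47]
WIDTH, HEIGHT = 832, 480

def build_hm_command(name, qp, encoder="TAppEncoderStatic",
                     cfg="encoder_intra_main.cfg"):
    """Return the argument list for one HM encoding run (paths are assumed)."""
    fps, frames = SEQUENCES[name]
    return [
        encoder, "-c", cfg,
        "-i", f"{name}.yuv",                       # raw input sequence
        "-wdt", str(WIDTH), "-hgt", str(HEIGHT),   # frame size
        "-fr", str(fps), "-f", str(frames),        # frame rate, frames to encode
        "-q", str(qp),                             # quantization parameter
        "-b", f"{name}_qp{qp}.bin",                # output bitstream
    ]

commands = [build_hm_command(n, q) for n in SEQUENCES for q in QPS]
print(len(commands))
print(" ".join(commands[0]))
```

Each list could be passed to `subprocess.run` on the Raspberry Pi or on the PC, with the wall-clock time of the call recorded as the encoding time.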
Table 3. RD performance of test video sequences
Sequence          QP   Bitrate (kbps)   PSNR (dB)
RaceHorses         7   53748.58         55.13
RaceHorses        17   25379.81         46.06
RaceHorses        27   9345.58          38.80
RaceHorses        37   2803.44          32.19
RaceHorses        47   574.37           27.05
BasketballDrill    7   89934.62         55.14
BasketballDrill   17   38533.27         45.58
BasketballDrill   27   12277.56         38.49
BasketballDrill   37   3668.84          32.82
BasketballDrill   47   1022.31          27.57
BQMall             7   106343.36        55.20
BQMall            17   45068.94         45.49
BQMall            27   14626.96         38.98
BQMall            37   5023.23          32.90
BQMall            47   1392.92          27.10
PartyScene         7   118482.63        55.39
PartyScene        17   65960.88         45.49
PartyScene        27   27871.62         36.69
PartyScene        37   9228.99          29.34
PartyScene        47   1701.17          23.38
Fig. 5. RD performance of the test video sequences
4.2 Performance Evaluation
a Compression performance
In this experiment, the Raspberry Pi and the PC use the same test video sequences and the same coding profile. The experimental results show that the RD performance of the two platforms is similar, whereas the encoding time differs because of the configuration difference between the two platforms. In particular, the RD performance results for the four test video sequences are presented in Table 3 and visualized in Fig. 5.
The results in Fig. 5 and Table 3 show that the quality of the decoded video decreases as the QP value increases. However, at the middle QP value of 27, the video quality is still high (PSNR between 36.69 dB and 38.98 dB). Therefore, QP 27 can be considered the most suitable selection in terms of RD performance for coding video on the Raspberry Pi in a visual sensor network.
b Complexity performance
Fig. 6 compares the encoding time of the two platforms, while Table 4 shows the differential percentage. The differential percentage is computed as in Equation (1):
$$DP = \left(\frac{ET_{RB} - ET_{PC}}{ET_{PC}}\right) \times 100\% \quad (1)$$
where $DP$ is the differential percentage, and $ET_{RB}$ and $ET_{PC}$ are the encoding times of the Raspberry Pi and the PC, respectively.
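Equation (1) can be evaluated with a few lines of Python. The sketch assumes that the time difference is normalized by the PC encoding time, which is one plausible reading of the equation, and the timing values in the example are hypothetical, not measurements from the paper.

```python
def differential_percentage(et_rb, et_pc):
    """Differential percentage of encoding time, normalized by the PC time (assumed)."""
    if et_pc <= 0:
        raise ValueError("PC encoding time must be positive")
    return (et_rb - et_pc) / et_pc * 100.0

# Hypothetical encoding times in seconds, for illustration only.
print(differential_percentage(et_rb=900.0, et_pc=300.0))  # 200.0
```

Under this normalization, a value of 200% means the Raspberry Pi needs three times as long as the PC to encode the same sequence.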
[Figure: bar chart of encoding time for QP = 7, 17, 27, 37 and 47]
Fig. 6. Encoding time comparison between Raspberry Pi (RB) and Personal Computer (PC) for each test video sequence
Table 4. Differential percentage of encoding time between Raspberry Pi and PC
QP   BasketballDrill   PartyScene   RaceHorses   BQMall
The results show that the encoding time on the Raspberry Pi is always higher than the encoding time on the PC. However, the encoding time difference varies in proportion to the QP. Therefore, in terms of encoding time, QP 27 is also the most suitable selection for the Raspberry Pi.
In summary, the disadvantage of implementing the video codec on the Raspberry Pi platform is the longer encoding time compared to the PC. However, the advantages of the Raspberry Pi are its compact size and low cost. Therefore, if the performance of the Raspberry Pi improves in the future, it can be considered a suitable platform for video sensor network applications.
5 Conclusion
This paper presents a Raspberry Pi based HEVC platform for visual sensor networks. The results show that our platform can achieve a good compression ratio with moderate computational complexity, even when encoding high resolution video sequences, which satisfies the stated requirements for sensor nodes in visual sensor networks. Our future work will carry out more comprehensive assessments of different video compression algorithms on various low complexity devices, such as the Raspberry Pi Zero and smartphones.
References
[1] Y. Charfi et al., “Challenging issues in visual sensor networks”, IEEE Wireless Communications, vol. 16, no. 2, pp. 44-49, Apr. 2009.
[2] S. Soro and W. Heinzelman, “A Survey of Visual Sensor Networks”, Advances in Multimedia, vol. 2009, pp. 1-21, May 2009.
[3] T. Wiegand et al., “Overview of the H.264/AVC video coding standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, 2003.
[4] G. J. Sullivan et al., “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[5] Raspberry Pi Org. [Online]. Available: http://www.raspberrypi.org
[6] HM reference software, https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/
[7] R. Puri and K. Ramchandran, “PRISM: A new robust video coding architecture based on distributed compression principles”, in Proceedings of the 40th Allerton Conference on Communication, Control, and Computing, Allerton, IL, pp. 301-304, Oct. 2002.
[8] B. Girod et al., “Distributed Video Coding”, in Proceedings of the IEEE, vol. 93, no. 1, pp. 71-83, Jan. 2005.
[9] R. Pereira and E. Pereira, “Video Streaming: H.264 and the Internet of Things”, in Proceedings of the 2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops (WAINA), Gwangju, Korea, pp. 711-714, Mar. 2015.
[10] U. Jennehag, S. Forsstrom and F. V. Fiordigigli, “Low Delay Video Streaming on the Internet of Things Using Raspberry Pi”, Electronics, vol. 5, no. 3, pp. 1-11, Sep. 2016.
[11] T. Turletti, “H.261 software codec for videoconferencing over the Internet”, in Rapports de Recherche 1834, Institut National de Recherche en Informatique et en Automatique (INRIA), Sophia-Antipolis, France, Jan. 1993.
[12] J. Lainema et al., “Intra Coding of the HEVC Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1792-1801, Dec. 2012.
[13] Z. Kotevski and P. Mitrevski, “Experimental Comparison of PSNR and SSIM Metrics for Video Quality Estimation”, in Proceedings of ICT Innovations-09 International Conference, Macedonian Society on Information and Communication Technologies, Ohrid, Macedonia, pp. 357-366, 2009.