IERI Procedia 4 (2013) 181 – 187

2013 International Conference on Electronic Engineering and Computer Science

A Practical and Automated Image-based Framework for Tracking Pedestrian Movements from a Video

Halimatul Saadiah Md Yatim1, Abdullah Zawawi Talib1, Fazilah Haron1,2

1 School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia
2 Department of Computer Science, Taibah University, Al-Madinah Al-Munawwarah, Kingdom of Saudi Arabia

Abstract

Video tracking of pedestrian movements can be used to gain a better understanding of crowd features and behaviors, as nowadays pedestrian safety is a very important consideration in many situations. The existing works on video tracking have some limitations. For example, some works focus on a specific event and a specific place, and some require a lot of human intervention. In this paper, a practical and automated image-based framework for tracking pedestrian movements from a video is presented. The proposed framework consists of several steps such as detection, tracking and extracting characteristics of a pedestrian from a video. Consecutive frames in the video are processed to extract moving objects of interest, which are assumed to be pedestrians. The extracted data obtained are filtered to remove unwanted objects, and then the objects are labelled for identification. These steps are automated and less human effort is needed. The centroid positions of all objects are composed to obtain the movement vectors, which are used to plot a graph and visualize the movement path of a pedestrian. Besides, it is also possible to estimate the speed of a pedestrian. The results of the preliminary experiment for the proposed framework using three videos with different scenarios are presented in the paper. The pedestrian movement is plotted accurately and the maximum and minimum numbers of pedestrians in the video are recorded correctly. However, the speed of the pedestrian is slightly inaccurate.

Keywords: Object tracking; video; pedestrian; monitoring; movement path; speed

2212-6678 © 2013 The Authors. Published by Elsevier B.V.
Selection and peer review under responsibility of Information Engineering Research Institute
doi: 10.1016/j.ieri.2013.11.026
1 Introduction
For the past few decades, videos have been used mainly for watching recorded events. Later, they have been used in monitoring and surveillance through closed-circuit television (CCTV). Manual human monitoring through video recording is not always practical. In order to help and automate video monitoring, research on surveillance and monitoring [1] has become a growing research area. These applications can help to minimize human effort as they can be run automatically or semi-automatically to meet specific objectives. The main idea in automated monitoring is to extract and analyze the macroscopic and even microscopic data from the images of the video camera automatically, without manual inspection of the video and therefore with less or no human effort. Some research applications are aimed at obtaining pedestrian characteristics from a video, such as the speed and trajectory of moving objects and the density of pedestrians in a specific area. The data obtained could be used to validate and calibrate simulation models for safety purposes, enhance the design architecture of a building, or alert security personnel to anomalous events. In this paper, a practical and automated framework for the entire process of pedestrian tracking from video footage is proposed. The framework can assist in understanding crowd characteristics and behaviors with little or no human effort.
2 Related Work
There exist a number of works on extracting data from video footage with varying focus and targets [1]. The works can be classified into extracting the density of a crowd in an image [2–5], counting the number of pedestrians [2,6,7], and extracting trajectories of moving objects from video footage [4,5,8,9]. In terms of detecting the trajectory of an object, the object can be a pedestrian [10], a vehicle [11], a human fingertip [12] and many more. Different researchers have used different methods to obtain an object's trajectory, such as supervised learning [8], unsupervised learning [13] and, in some cases, clustering of trajectories [14]. From the trajectory, it can be observed whether the movement pattern of the object of interest is abnormal or normal [14,15].

In order to extract data of a pedestrian from video footage, the first and most important step is to detect the individual pedestrian in the video. This step is quite challenging, and existing methods focus on specific and controlled situations. A popular method for pedestrian detection is background subtraction [16–18]. However, it cannot guarantee that a detected object is the object of interest. Therefore, there are methods that combine background subtraction with other methods [19,20]. Object classification is another method for detection. For example, there is a method that classifies pedestrians based on their color [21,22]. Once the pedestrian is properly detected, tracking of the detected pedestrian throughout the video frames takes place.
A number of tracking algorithms are available, such as the Kalman filter [6,21], the particle filter [23], feature-based tracking [9,22] and active contour-based tracking [24]. The existing automatic detection and tracking methods focus on some specific events and have some limitations. Therefore, it is not guaranteed that they can be applied in different situations.
After detection and tracking, the next step is to extract the position of a pedestrian accurately in order to obtain information from the video footage, especially speed measurements. Because of the camera placement and the lens characteristics, an accurate pedestrian position can hardly be obtained due to the distortion of the image. This phenomenon, called geometric distortion [25], must be corrected through a technique called image calibration [7]. The data are further analyzed to get the speed, trajectory and the number of pedestrians in the images. The conventional way to get the speed of a pedestrian is to calculate it manually; another way to obtain the speed requires a GPS device [26]. However, this device requires a pedestrian to carry it along in order to get speed measurements. Extracting pedestrian data from a sequence of images is an alternative way of getting speed measurements without the need for the pedestrian to carry anything. Nonetheless, this is a new area of study.
3 The Proposed Framework
The purpose of the framework is to automate the entire process of detecting, tracking and extracting pedestrian characteristics from video footage. Therefore, the overall framework, as shown in Figure 1, consists of object detection, object tracking and lastly extracting and visualizing the object's characteristics. Object here refers to a pedestrian. Object detection is implemented by getting the background image from the video [27] and then applying the frame differencing method [28] in order to extract moving objects in the video. We assume that a moving object in the video is a pedestrian and that no other objects are present in the background image. Then, each frame undergoes image processing techniques to extract individual objects. The objects are filtered, and unwanted objects (e.g. very small objects) are removed. Then they are labeled in such a way that each specific object can be identified using the same label in every frame. After obtaining the desired objects, the centroid coordinate [16] of each object is identified and the movement vector for each of them is extracted. The next step is extracting characteristics from the video footage such as the speed, the trajectory and the number of pedestrians detected. The steps used in the implementation of the framework are adapted from existing works of other researchers. The techniques used in our approach include: image processing, extracting objects and filtering unwanted objects, labeling and identifying objects, extracting the movement vectors, and finally plotting the movement vectors to visualize the trajectory, measuring the speed and counting the number of pedestrians.
3.1 Detection
The first step in the framework involves extracting the background image [27] from a video frame. In order to detect objects in a video, consecutive frames are processed using frame differencing techniques [29] between the video frames and the background image in order to get the objects of interest. Moving objects in the scene are identified as the objects of interest. Pixels which change during the video are grouped as foreground blobs. The foreground blobs are the objects present in the image [30]. They are then further processed using image processing techniques [31], namely converting the images to grayscale and thresholding them using the method by Otsu [32]. As the objects in the image might not be filled completely (in the form of a blob), the holes present in each object are filled. We also apply morphological closing to get a more precise object. This process produces a binary image that contains the background and the foreground.
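The paper gives no code for this step; the sketch below illustrates the core of the detection stage under simplifying assumptions: frames are already grayscale NumPy arrays, the background image is known, and the hole-filling and morphological closing steps are omitted for brevity. The Otsu threshold is implemented directly from its histogram definition [32].

```python
import numpy as np

def otsu_threshold(gray):
    # Otsu's method: choose the threshold that maximizes the
    # between-class variance of the gray-level histogram.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum_w = np.cumsum(hist)                      # class-0 weight up to t
    cum_mean = np.cumsum(hist * np.arange(256))  # class-0 intensity mass up to t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0, w1 = cum_w[t], total - cum_w[t]
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_mean[t] / w0
        mu1 = (cum_mean[-1] - cum_mean[t]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def detect_foreground(frame_gray, background_gray):
    # Frame differencing against the background image, then
    # Otsu thresholding of the absolute difference.
    diff = np.abs(frame_gray.astype(int) - background_gray.astype(int)).astype(np.uint8)
    return diff > otsu_threshold(diff)

# Toy scene: flat background and one bright 10x10 "pedestrian" blob.
background = np.zeros((40, 40), dtype=np.uint8)
frame = background.copy()
frame[10:20, 10:20] = 200
mask = detect_foreground(frame, background)
```

In the full pipeline, the resulting binary mask would then be hole-filled and morphologically closed as described above.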
Fig. 1. The proposed framework

Fig. 2. The overall process for detection: converting the RGB image to grayscale and thresholding, filling the holes of the objects of interest, and removing objects of non-interest

Since the focus is on detecting pedestrians, objects of non-interest which have been detected during frame differencing are removed. The filtering process is applied in order to remove small objects and unwanted lines in every frame. Thus, we can assume that the remaining objects are humans and that there is no false detection due to shadows, reflections or other causes. Figure 2 shows the overall detailed process for detection.
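The small-object filtering described above can be sketched as connected-component labeling followed by an area test. This is an illustrative NumPy implementation, not the authors' actual code, and the minimum-area threshold is a hypothetical tuning parameter.

```python
import numpy as np
from collections import deque

def label_components(mask):
    # 4-connected component labeling of a binary mask via BFS flood fill.
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        count += 1
        labels[sy, sx] = count
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

def remove_small_objects(mask, min_area):
    # Keep only blobs whose pixel area reaches min_area.
    labels, n = label_components(mask)
    keep = np.zeros_like(mask)
    for i in range(1, n + 1):
        blob = labels == i
        if blob.sum() >= min_area:
            keep |= blob
    return keep

# Toy mask: a 5x5 "pedestrian" blob plus a 1-pixel noise speck.
m = np.zeros((20, 20), dtype=bool)
m[2:7, 2:7] = True    # area 25: kept
m[15, 15] = True      # area 1: removed as noise
cleaned = remove_small_objects(m, min_area=5)
```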
3.2 Tracking
The object tracking method that we have applied in the implementation of the proposed framework is based on the blob tracking method. The blobs detected are labeled for identification. The objects in subsequent frames are labeled in such a way that each object has the same label in all frames. The movement of each blob is tracked by its centroid coordinates [16]. The coordinates for each object obtained from each frame are stored as a sequence of coordinates throughout the video frames. These coordinates, compiled for each object across the video frames, give the movement vector for the object.
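The paper does not specify how labels are matched between frames; one simple (assumed) realization of blob tracking is greedy nearest-centroid association, sketched below. The max_jump threshold is a hypothetical parameter bounding how far a blob may move between consecutive frames.

```python
import math

def track_centroids(frames_centroids, max_jump=50.0):
    # frames_centroids: per frame, a list of (x, y) blob centroids.
    # Each detection inherits the label of the closest unmatched track
    # from the previous frame; otherwise it starts a new track.
    tracks = {}        # label -> movement vector (list of centroids)
    prev = {}          # label -> centroid in the previous frame
    next_label = 0
    for centroids in frames_centroids:
        assigned = {}
        free = dict(prev)               # tracks not yet matched this frame
        for c in centroids:
            best, best_d = None, max_jump
            for label, p in free.items():
                d = math.hypot(c[0] - p[0], c[1] - p[1])
                if d < best_d:
                    best, best_d = label, d
            if best is None:            # no nearby track: new pedestrian
                best = next_label
                next_label += 1
                tracks[best] = []
            else:
                free.pop(best)          # at most one detection per track
            tracks[best].append(c)
            assigned[best] = c
        prev = assigned
    return tracks

# One pedestrian moving right roughly 10 px per frame.
frames = [[(0.0, 0.0)], [(10.0, 1.0)], [(20.0, 0.0)]]
tracks = track_centroids(frames)
```

The per-label sequence of centroids is exactly the movement vector described above.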
3.3 Extracting and visualizing characteristics
From the movement vector obtained from the video, the graph of pedestrian movement is plotted. Hence, the movement pattern of a pedestrian can be visualized automatically by plotting the direction of the movement of the pedestrians. The speed of the pedestrians is one of the useful pieces of information that we can extract from a raw video. Speed is obtained from the distance traveled by the object in pixels per time in seconds. The speed that we get is therefore measured in pixels per second, which depends on the size of a pixel in the image frame. The time is extracted directly from the video duration in seconds, since the video is captured in real time and the object is assumed to be present in the video scene throughout the video duration. The distance d is calculated as follows:

d = √((x2 − x1)² + (y2 − y1)²)  (1)

where the initial point is (x1, y1) and the final point is (x2, y2). The speed is averaged over every two seconds of the video for a more accurate calculation. This avoids computing the result from the straight-line distance between the initial and final points, since a pedestrian might move in random directions from one point to another. Thus, the average speed over a two-second interval is more practical and reliable for an accurate speed calculation. It is calculated using the following equation:
Average speed = total distance / total time  (2)
The speed in pixels per second is hard to validate. Thus, a real measurement of distance in meters per second is suggested. It is calculated by mapping the pixel distance to a centimeter distance. In order to do pixel mapping, a user must choose two points on the image and provide the real-world distance between these two points. This step requires an effort from the user.
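Equations (1) and (2), together with the pixel-to-meter mapping, can be combined as follows. The two reference points and their real-world distance are user-supplied; the values used in the example are hypothetical.

```python
import math

def path_length(points):
    # Eq. (1) summed over consecutive centroid pairs.
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def average_speed_pps(points, fps, window_s=2.0):
    # Eq. (2): average the per-window speeds over two-second windows,
    # so the path length inside each window (not the straight-line
    # endpoint distance of the whole video) determines the speed.
    per_window = int(round(window_s * fps))
    speeds = []
    for start in range(0, len(points) - 1, per_window):
        seg = points[start:start + per_window + 1]
        if len(seg) >= 2:
            duration = (len(seg) - 1) / fps
            speeds.append(path_length(seg) / duration)
    return sum(speeds) / len(speeds)

def pixels_to_meters(speed_pps, ref_px, ref_m):
    # Pixel mapping: the user marks two image points ref_px pixels apart
    # whose real-world separation is ref_m meters.
    return speed_pps * (ref_m / ref_px)

# A pedestrian moving 2 px/frame at 25 fps over 4 s of video: 50 pps.
pts = [(2.0 * i, 0.0) for i in range(101)]
pps = average_speed_pps(pts, fps=25)
mps = pixels_to_meters(pps, ref_px=100.0, ref_m=1.0)  # assume 100 px = 1 m
```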
This framework also identifies the maximum and minimum numbers of pedestrians detected in the video frames.
4 Preliminary Experimental Results and Discussion
For the purpose of a preliminary test of the framework, we captured several videos from a top view using a single fixed camera. We used three different scenarios, all taken at the same place. Table 1 shows the results of the experiment, which consist of the trajectories, the maximum and minimum numbers of pedestrians, the speeds in pixels per second (pps) and the speeds in meters per second (mps) for the three different scenarios. We have also indicated the actual speed of the pedestrians.
The graph shows an accurate movement path of the pedestrian for all the scenarios. For the counting part in all three scenarios, the exact maximum and minimum numbers of pedestrians detected were obtained, which are one and zero respectively. As shown in the table, the speed of the pedestrian is slightly inaccurate for all three scenarios. This may be due to the geometric distortion phenomenon, which has not been tackled in the implementation.
Table 1. The results of the experiment

Scenario | Video information | Pedestrians | Speed
1 | 6 sec, 25 fps; a pedestrian walks in a straight line | Max 1, Min 0 | 64.86 pps; 0.42 mps (actual 0.69 mps)
2 | 8 sec, 25 fps; a pedestrian walks in a straight line but slower than in Scenario 1 | Max 1, Min 0 | 39.58 pps; 0.33 mps (actual 0.46 mps)
3 | 9 sec, 25 fps; a pedestrian walks in a zigzag manner | Max 1, Min 0 | 80.40 pps; 0.72 mps (actual 0.78 mps)
5 Conclusion and Future Work
The framework presented in this paper can be used to obtain the movement path and speed of pedestrians, as well as the maximum and minimum numbers of pedestrians detected in a video, and it also provides some analysis and visualization of the results in an automated manner with less human effort. Accurate or exact results are obtained for the movement paths and pedestrian counts in all cases. However, the result for the speed is slightly inaccurate due to geometric distortion of the video frames. We have also successfully developed and implemented this practical framework by using and adapting some existing methods and techniques in order to reduce and minimize the constraints and limitations of automating the entire process of detecting, tracking, and extracting and visualizing the characteristics of a pedestrian.

For our future work, we plan to improve the result of the speed measurement by applying geometric distortion correction. The framework should also be enhanced for dense crowds, which appear in places like Masjid al-Haram in Saudi Arabia. To ensure the robustness of the system, a wider variety of videos will be used to test it. More effort will be focused on occlusion handling, placement of the video camera, and proper testing and validation.
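As a rough illustration of the planned geometric distortion correction, a one-parameter radial model, x_d = x_u(1 + k1·r²) (cf. [25]), can be inverted per tracked point by fixed-point iteration. The distortion coefficient k1 and the image center below are hypothetical; in practice they would come from image calibration, e.g. with a checkerboard pattern.

```python
def undistort_point(xd, yd, cx, cy, k1, iters=20):
    # Invert (xd, yd) = (cx, cy) + (xu, yu) * (1 + k1 * r^2), where r is
    # the radius of the undistorted point, by fixed-point iteration.
    xu, yu = xd - cx, yd - cy          # initial guess: no distortion
    for _ in range(iters):
        r2 = xu * xu + yu * yu
        xu = (xd - cx) / (1 + k1 * r2)
        yu = (yd - cy) / (1 + k1 * r2)
    return xu + cx, yu + cy

# With k1 = 1e-6 about center (0, 0), the point (100, 0) distorts to
# (101, 0) since r^2 = 10000; the inversion should recover it.
ux, uy = undistort_point(101.0, 0.0, 0.0, 0.0, 1e-6)
```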
Acknowledgements
The authors would like to acknowledge the support of the Ministry of Higher Education Malaysia for this research under the Fundamental Research Grant Scheme entitled "More Accurate Models for Movements of Pedestrians in Big Crowds".
References
[1] W. Hu, T. Tan, L. Wang, S. Maybank, A survey on visual surveillance of object motion and behaviors, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews) 34 (2004) 334–352.
[2] R. Ma, L. Li, W. Huang, Q. Tian, On pixel count based crowd density estimation for visual surveillance, in: IEEE Conference on Cybernetics and Intelligent Systems 1 (2004) 170–173.
[3] X. Liu, W. Song, J. Zhang, Extraction and quantitative analysis of microscopic evacuation characteristics based on digital image processing, Physica A: Statistical Mechanics and Its Applications 388 (2009) 2717–2726.
[4] J. Zheng, D. Yao, Intelligent pedestrian flow monitoring systems in shopping areas, in: 2nd International Symposium on Information Engineering and Electronic Commerce (2010) 1–4.
[5] B. Steffen, A. Seyfried, Methods for measuring pedestrian density, flow, speed and direction with minimal scatter, Physica A: Statistical Mechanics and Its Applications 389 (2010) 1902–1910.
[6] H. Celik, A. Hanjalic, E. Hendriks, Towards a robust solution to people counting, in: IEEE International Conference on Image Processing (2006) 2401–2404.
[7] D. Conte, P. Foggia, G. Percannella, F. Tufano, M. Vento, A method for counting moving people in video surveillance videos, EURASIP Journal on Advances in Signal Processing 2010 (2010) 231–240.
[8] J. Albusac, J.J. Castro-Schez, L.M. Lopez-Lopez, D. Vallejo, L. Jimenez-Linares, A supervised learning approach to automate the acquisition of knowledge in surveillance systems, Signal Processing 89 (2009) 2400–2414.
[9] M. Boltes, A. Seyfried, B. Steffen, A. Schadschneider, Automatic extraction of pedestrian trajectories from video recordings, in: W.W.F. Klingsch, C. Rogsch, A. Schadschneider, M. Schreckenberg (Eds.), Pedestrian and Evacuation Dynamics 2008, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010: pp. 43–54.
[10] D. Makris, T. Ellis, Path detection in video surveillance, Image and Vision Computing 20 (2002) 895–903.
[11] Z. Zhang, K. Huang, T. Tan, L. Wang, Trajectory series analysis based event rule induction for visual surveillance, in: IEEE Conference on Computer Vision and Pattern Recognition (2007) 1–8.
[12] D. Ren, J. Li, Vision-based dynamic tracking of motion trajectories of human fingertips, Robotic Welding, Intelligence and Automation (2007) 429–435.
[13] N. Johnson, D. Hogg, Learning the distribution of object trajectories for event recognition, Image and Vision Computing 14 (1996) 609–615.
[14] C. Piciarelli, G. Foresti, L. Snidaro, Trajectory clustering and its applications for video surveillance, in: Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2005), IEEE, 2005: pp. 40–45.
[15] A. Fernández-Caballero, J.C. Castillo, J.M. Rodríguez-Sánchez, A proposal for local and global human activities identification, in: Articulated Motion and Deformable Objects, Springer Berlin Heidelberg, 2010: pp. 78–87.
[16] H. Yue, C. Shao, Y. Zhao, X. Chen, Study on moving pedestrian tracking based on video sequences, Journal of Transportation Systems Engineering and Information Technology 7 (2007) 47–51.
[17] C. Hao-li, S. Zhong-ke, F. Qing-hua, The study of the detection and tracking of moving pedestrian using monocular-vision, in: Computational Science – ICCS 2006, 2006: pp. 878–885.
[18] Y. Dedeoglu, B.U. Toreyin, U. Gudukbay, A.E. Cetin, Silhouette-based method for object classification and human action recognition in video, Computer Vision in Human-Computer Interaction 3979 (2006) 64–77.
[19] L. Bazzani, D. Bloisi, V. Murino, A comparison of multi hypothesis Kalman filter and particle filter for multi-target tracking, in: 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS 2009), 2009: pp. 47–55.
[20] J. Berclaz, A. Shahrokni, F. Fleuret, Evaluation of probabilistic occupancy map people detection for surveillance systems, in: 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS 2009), 2009: pp. 55–62.
[21] S. Hoogendoorn, W. Daamen, P.H.L. Bovy, Extracting microscopic pedestrian characteristics from video data, in: 82nd Annual Meeting of the Transportation Research Board, 2003: pp. 1–15.
[22] J. Ma, W. Song, Z. Fang, S. Lo, G. Liao, Experimental study on microscopic moving characteristics of pedestrians in built corridor based on digital image processing, Building and Environment 45 (2010) 2160–2169.
[23] I. Ali, M.N. Dailey, Multiple human tracking in high-density crowds, Image and Vision Computing (2012) 540–549.
[24] A. Yilmaz, O. Javed, M. Shah, Object tracking: a survey, ACM Computing Surveys 38 (2006) 1–45.
[25] S. Lee, S. Lee, J. Choi, Correction of radial distortion using a planar checkerboard pattern and its image, IEEE Transactions on Consumer Electronics 55 (2009) 27–33.
[26] S. Bandini, M. Federici, S. Manzoni, A qualitative evaluation of technologies and techniques for data collection on pedestrians and crowded situations, in: Proceedings of the 2007 Summer Computer Simulation Conference, Society for Computer Simulation International, 2007: pp. 1057–1064.
[27] S.S. Cheung, C. Kamath, Robust techniques for background subtraction in urban traffic video, in: Proceedings of SPIE, SPIE, 2004: pp. 881–892.
[28] M. Karaman, L. Goldmann, D. Yu, T. Sikora, Comparison of static background segmentation methods, in: S. Li, F. Pereira, H.-Y. Shum, A.G. Tescher (Eds.), Visual Communications and Image Processing 2005, 2006: pp. 1–12.
[29] C. Zhan, X. Duan, S. Xu, Z. Song, M. Luo, An improved moving object detection algorithm based on frame difference and edge detection, in: Fourth International Conference on Image and Graphics (ICIG 2007) (2007) 519–523.
[30] D. Kong, D. Gray, A viewpoint invariant approach for crowd counting, in: 18th International Conference on Pattern Recognition (ICPR'06) 1 (2006) 1187–1190.
[31] N. Hussain, H.S.M. Yatim, N.L. Hussain, J.L.S. Yan, F. Haron, CDES: A pixel-based crowd density estimation system for Masjid al-Haram, Safety Science 49 (2011) 824–833.
[32] N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man, and Cybernetics 9 (1979) 62–66.