IERI Procedia 4 (2013) 181 – 187

2013 International Conference on Electronic Engineering and Computer Science

A Practical and Automated Image-based Framework for Tracking Pedestrian Movements from a Video

Halimatul Saadiah Md Yatim1, Abdullah Zawawi Talib1, Fazilah Haron1,2

1 School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia
2 Department of Computer Science, Taibah University, Al-Madinah Al-Munawwarah, Kingdom of Saudi Arabia

Abstract

Video tracking of pedestrian movements can be used to gain a better understanding of crowd features and behaviors, as nowadays pedestrian safety is a very important consideration in many situations. The existing works on video tracking have some limitations. For example, some works focus on a specific event and a specific place, and some require a lot of human intervention. In this paper, a practical and automated image-based framework for tracking pedestrian movements from a video is presented. The proposed framework consists of several steps such as detection, tracking and extracting characteristics of a pedestrian from a video. Consecutive frames in the video are processed to extract moving objects of interest, which are assumed to be pedestrians. The extracted data obtained are filtered to remove unwanted objects, and then the objects are labelled for identification. These steps are automated and less human effort is needed. The centroid positions of all objects are composed to obtain the movement vectors, which are used to plot a graph and visualize the movement path of a pedestrian. Besides, it is also possible to estimate the speed of a pedestrian. The results of the preliminary experiment for the proposed framework using three videos with different scenarios are presented in the paper. The pedestrian movement is plotted accurately and the maximum and minimum numbers of pedestrians in the video are recorded correctly. However, the speed of the pedestrian is slightly inaccurate.

Keywords: Object tracking; video; pedestrian; monitoring; movement path; speed

2212-6678 © 2013 The Authors. Published by Elsevier B.V.
Selection and peer review under responsibility of Information Engineering Research Institute
doi: 10.1016/j.ieri.2013.11.026
1 Introduction
For the past few decades, videos have been used mainly for watching recorded events. Later, they have been used in monitoring and surveillance through closed-circuit television (CCTV). Manual human monitoring through video recording is not always practical. In order to help and automate video monitoring, research on surveillance and monitoring [1] has become a growing research area. These applications can help to minimize human effort as they can be run automatically or semi-automatically to meet specific objectives. The main idea in automated monitoring is to extract and analyze the macroscopic and even microscopic data from the images of the video camera automatically, without manual inspection of the video and therefore with less or no human effort. Some research applications are aimed at obtaining pedestrian characteristics from a video, such as the speed and trajectory of moving objects and the density of pedestrians in a specific area. The data obtained could be used to validate and calibrate simulation models for safety purposes, enhance the design architecture of a building, or alert security personnel to anomalous events. In this paper, a practical and automated framework for the entire process of pedestrian tracking from video footage is proposed. The framework can assist in understanding crowd characteristics and behaviors with little or no human effort.
2 Related Work
There exist a number of works on extracting data from video footage with varying focus and targets [1]. The works can be classified into extracting the density of a crowd in an image [2–5], counting the number of pedestrians [2,6,7], and extracting trajectories of moving objects from video footage [4,5,8,9]. In terms of detecting the trajectory of an object, the object can be a pedestrian [10], a vehicle [11], a human fingertip [12] and many more. Different researchers have used different methods to obtain an object's trajectory, such as supervised learning [8], unsupervised learning [13] and, in some cases, clustering of trajectories [14]. From the trajectory, it can be observed whether the movement pattern of the object of interest is abnormal or normal [14,15].

In order to extract data of a pedestrian from video footage, the first and most important step is to detect the individual pedestrian in the video. This step is quite challenging, and existing methods focus on specific and controlled situations. A popular method for pedestrian detection is background subtraction [16–18]. However, it cannot guarantee that a detected object is the object of interest. Therefore, there are methods that combine background subtraction with other methods [19,20]. Object classification is another method for detection. For example, there is a method that classifies pedestrians based on their color [21,22]. Once the pedestrian is properly detected, tracking of the detected pedestrian throughout the video frames takes place.
A number of tracking algorithms are available, such as the Kalman filter [6,21], the particle filter [23], feature-based tracking [9,22] and active contour-based tracking [24]. The existing automatic detection and tracking methods focus on some specific events and have some limitations. Therefore, it is not guaranteed that they can be applied in different situations.
After detection and tracking, the next step is to extract the position of a pedestrian accurately in order to obtain information from the video footage, especially speed measurements. Because of the camera placement and the lens characteristics, an accurate pedestrian position can hardly be obtained due to the distortion of the image. This phenomenon, called geometric distortion [25], must be corrected through a technique called image calibration [7]. The data are further analyzed to get the speed, trajectory and the number of pedestrians in the images. The conventional way to get the speed of a pedestrian is to calculate it manually; another way to obtain the speed requires a GPS device [26]. However, this device requires a pedestrian to carry it along in order to get speed measurements. Extracting pedestrian data from a sequence of images is an alternative way of getting speed measurements without the need for the pedestrian to carry anything. Nonetheless, this is a new area of study.
3 The Proposed Framework
The purpose of the framework is to automate the entire process of detecting, tracking and extracting pedestrian characteristics from video footage. Therefore, the overall framework, as shown in Figure 1, consists of object detection, object tracking and lastly extracting and visualizing the object's characteristics. Object here refers to a pedestrian. Object detection is implemented by getting the background image from the video [27] and then applying the frame differencing method [28] in order to extract moving objects in the video. We assume that a moving object in the video is a pedestrian and that no other objects are present in the background image. Then, each frame undergoes image processing techniques to extract individual objects. The objects are filtered, and unwanted objects (e.g. very small objects) are removed. Then they are labeled in such a way that each specific object can be identified using the same label in every frame. After obtaining the desired objects, the centroid coordinate [16] of each object is identified and the movement vector for each of them is extracted. The next step is extracting characteristics from the video footage such as the speed, the trajectory and the number of pedestrians detected. The steps used in the implementation of the framework are adapted from existing works of other researchers. The techniques used in our approach include: image processing, extracting objects and filtering unwanted objects, labeling and identifying objects, extracting the movement vectors, and finally plotting the movement vectors to visualize the trajectory, measuring the speed and counting the number of pedestrians.
3.1 Detection
The first step in the framework involves extracting the background image [27] from a video frame. In order to detect objects in a video, consecutive frames are processed using frame differencing techniques [29] between the video frames and the background image in order to get the objects of interest. Moving objects in the scene are identified as the objects of interest. Pixels which change during the video are grouped as foreground blobs. The foreground blobs are the objects present in the image [30]. They are then further processed using image processing techniques [31], namely converting the images to grayscale and thresholding them using the method by Otsu [32]. As the objects in the image might not be filled completely (in the form of a blob), the holes present in each object are filled. We also apply morphological closing to get a more precise object. This process produces a binary image that contains the background and the foreground.
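The paper gives no code for this step; the sketch below illustrates the core of the detection stage under simplifying assumptions: frames are already grayscale NumPy arrays, the background image is known, and the hole-filling and morphological closing steps are omitted for brevity. The Otsu threshold is implemented directly from its histogram definition [32].

```python
import numpy as np

def otsu_threshold(gray):
    # Otsu's method: choose the threshold that maximizes the
    # between-class variance of the gray-level histogram.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum_w = np.cumsum(hist)                      # class-0 weight up to t
    cum_mean = np.cumsum(hist * np.arange(256))  # class-0 intensity mass up to t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0, w1 = cum_w[t], total - cum_w[t]
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_mean[t] / w0
        mu1 = (cum_mean[-1] - cum_mean[t]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def detect_foreground(frame_gray, background_gray):
    # Frame differencing against the background image, then
    # Otsu thresholding of the absolute difference.
    diff = np.abs(frame_gray.astype(int) - background_gray.astype(int)).astype(np.uint8)
    return diff > otsu_threshold(diff)

# Toy scene: flat background and one bright 10x10 "pedestrian" blob.
background = np.zeros((40, 40), dtype=np.uint8)
frame = background.copy()
frame[10:20, 10:20] = 200
mask = detect_foreground(frame, background)
```

In the full pipeline, the resulting binary mask would then be hole-filled and morphologically closed as described above.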
Fig. 1. The proposed framework

Fig. 2. The overall process for detection: converting the RGB image to grayscale and thresholding, filling the holes of the objects of interest, and removing objects of non-interest

Since the focus is on detecting pedestrians, objects of non-interest which have been detected during frame differencing are removed. The filtering process is applied in order to remove small objects and unwanted lines in every frame. Thus, we can assume that the remaining objects are humans and that there is no false detection due to shadows, reflections or other causes. Figure 2 shows the overall detailed process for detection.
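The small-object filtering described above can be sketched as connected-component labeling followed by an area test. This is an illustrative NumPy implementation, not the authors' actual code, and the minimum-area threshold is a hypothetical tuning parameter.

```python
import numpy as np
from collections import deque

def label_components(mask):
    # 4-connected component labeling of a binary mask via BFS flood fill.
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        count += 1
        labels[sy, sx] = count
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

def remove_small_objects(mask, min_area):
    # Keep only blobs whose pixel area reaches min_area.
    labels, n = label_components(mask)
    keep = np.zeros_like(mask)
    for i in range(1, n + 1):
        blob = labels == i
        if blob.sum() >= min_area:
            keep |= blob
    return keep

# Toy mask: a 5x5 "pedestrian" blob plus a 1-pixel noise speck.
m = np.zeros((20, 20), dtype=bool)
m[2:7, 2:7] = True    # area 25: kept
m[15, 15] = True      # area 1: removed as noise
cleaned = remove_small_objects(m, min_area=5)
```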
3.2 Tracking
The object tracking method that we have applied in the implementation of the proposed framework is based on the blob tracking method. The blobs detected are labeled for identification. The objects in subsequent frames are labeled in such a way that each object has the same label in all frames. The movement of each blob is tracked by its centroid coordinates [16]. The coordinates for each object obtained from each frame are stored as a sequence of coordinates throughout the video frames. These coordinates, compiled for each object across the video frames, give the movement vector for the object.
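The paper does not specify how labels are matched between frames; one simple (assumed) realization of blob tracking is greedy nearest-centroid association, sketched below. The max_jump threshold is a hypothetical parameter bounding how far a blob may move between consecutive frames.

```python
import math

def track_centroids(frames_centroids, max_jump=50.0):
    # frames_centroids: per frame, a list of (x, y) blob centroids.
    # Each detection inherits the label of the closest unmatched track
    # from the previous frame; otherwise it starts a new track.
    tracks = {}        # label -> movement vector (list of centroids)
    prev = {}          # label -> centroid in the previous frame
    next_label = 0
    for centroids in frames_centroids:
        assigned = {}
        free = dict(prev)               # tracks not yet matched this frame
        for c in centroids:
            best, best_d = None, max_jump
            for label, p in free.items():
                d = math.hypot(c[0] - p[0], c[1] - p[1])
                if d < best_d:
                    best, best_d = label, d
            if best is None:            # no nearby track: new pedestrian
                best = next_label
                next_label += 1
                tracks[best] = []
            else:
                free.pop(best)          # at most one detection per track
            tracks[best].append(c)
            assigned[best] = c
        prev = assigned
    return tracks

# One pedestrian moving right roughly 10 px per frame.
frames = [[(0.0, 0.0)], [(10.0, 1.0)], [(20.0, 0.0)]]
tracks = track_centroids(frames)
```

The per-label sequence of centroids is exactly the movement vector described above.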
3.3 Extracting and visualizing characteristics
From the movement vector obtained from the video, the graph of pedestrian movement is plotted. Hence, the movement pattern of a pedestrian can be visualized automatically by plotting the direction of the movement of the pedestrians. The speed of the pedestrians is one of the useful pieces of information that we can extract from a raw video. Speed is obtained from the distance traveled by the object in pixels per time in seconds. The speed that we get is therefore measured in pixels per second, which depends on the size of a pixel in the image frame. The time is extracted directly from the video duration in seconds, since the video is captured in real time and the object is assumed to be present in the video scene throughout the video duration. The distance d is calculated as follows:

d = √((x2 − x1)² + (y2 − y1)²)  (1)

where the initial point is (x1, y1) and the final point is (x2, y2). The speed is averaged over every two seconds of the video for a more accurate calculation. This avoids computing the result from the straight-line distance between the initial and final points, since a pedestrian might move in random directions from one point to another. Thus, the average speed over a two-second interval is more practical and reliable for an accurate speed calculation. It is calculated using the following equation:
Average speed = total distance / total time  (2)
The speed in pixels per second is hard to validate. Thus, a real measurement of distance in meters per second is suggested. It is calculated by mapping the pixel distance to a centimeter distance. In order to do pixel mapping, a user must choose two points on the image and provide the real-world distance between these two points. This step requires an effort from the user.
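Equations (1) and (2), together with the pixel-to-meter mapping, can be combined as follows. The two reference points and their real-world distance are user-supplied; the values used in the example are hypothetical.

```python
import math

def path_length(points):
    # Eq. (1) summed over consecutive centroid pairs.
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def average_speed_pps(points, fps, window_s=2.0):
    # Eq. (2): average the per-window speeds over two-second windows,
    # so the path length inside each window (not the straight-line
    # endpoint distance of the whole video) determines the speed.
    per_window = int(round(window_s * fps))
    speeds = []
    for start in range(0, len(points) - 1, per_window):
        seg = points[start:start + per_window + 1]
        if len(seg) >= 2:
            duration = (len(seg) - 1) / fps
            speeds.append(path_length(seg) / duration)
    return sum(speeds) / len(speeds)

def pixels_to_meters(speed_pps, ref_px, ref_m):
    # Pixel mapping: the user marks two image points ref_px pixels apart
    # whose real-world separation is ref_m meters.
    return speed_pps * (ref_m / ref_px)

# A pedestrian moving 2 px/frame at 25 fps over 4 s of video: 50 pps.
pts = [(2.0 * i, 0.0) for i in range(101)]
pps = average_speed_pps(pts, fps=25)
mps = pixels_to_meters(pps, ref_px=100.0, ref_m=1.0)  # assume 100 px = 1 m
```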
This framework also identifies the maximum and minimum numbers of pedestrians detected in the video frames.
4 Preliminary Experimental Results and Discussion
For the purpose of a preliminary test of the framework, we captured several videos from a top view using a single fixed camera. We used three different scenarios, all taken at the same place. Table 1 shows the results of the experiment, which consist of the trajectories, the maximum and minimum numbers of pedestrians, the speeds in pixels per second (pps) and the speeds in meters per second (mps) for the three different scenarios. We have also indicated the actual speed of the pedestrians.
The graph shows an accurate movement path of the pedestrian for all the scenarios. For the counting part in all three scenarios, the exact maximum and minimum numbers of pedestrians detected were obtained, which are one and zero respectively. As shown in the table, the speed of the pedestrian is slightly inaccurate for all three scenarios. This may be due to the geometric distortion phenomenon, which has not been tackled in the implementation.
Table 1. The results of the experiment

Scenario | Video information | Pedestrians | Speed
1 | 6 sec, 25 fps; a pedestrian walks in a straight line | Max 1, Min 0 | 64.86 pps; 0.42 mps (actual 0.69 mps)
2 | 8 sec, 25 fps; a pedestrian walks in a straight line but slower than in Scenario 1 | Max 1, Min 0 | 39.58 pps; 0.33 mps (actual 0.46 mps)
3 | 9 sec, 25 fps; a pedestrian walks in a zigzag manner | Max 1, Min 0 | 80.40 pps; 0.72 mps (actual 0.78 mps)
5 Conclusion and Future Work
The framework presented in this paper can be used to obtain the movement path and speed of pedestrians, as well as the maximum and minimum numbers of pedestrians detected in a video, and it also provides some analysis and visualization of the results in an automated manner with less human effort. Accurate or exact results are obtained for the movement paths and pedestrian counts in all cases. However, the result for the speed is slightly inaccurate due to geometric distortion of the video frames. We have also successfully developed and implemented this practical framework by using and adapting some existing methods and techniques in order to reduce and minimize the constraints and limitations of automating the entire process of detecting, tracking, and extracting and visualizing the characteristics of a pedestrian.

For our future work, we plan to improve the result of the speed measurement by applying geometric distortion correction. The framework should also be enhanced for dense crowds, which appear in places like Masjid al-Haram in Saudi Arabia. To ensure the robustness of the system, a wider variety of videos will be used to test it. More effort will be focused on occlusion handling, placement of the video camera, and proper testing and validation.
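As a rough illustration of the planned geometric distortion correction, a one-parameter radial model, x_d = x_u(1 + k1·r²) (cf. [25]), can be inverted per tracked point by fixed-point iteration. The distortion coefficient k1 and the image center below are hypothetical; in practice they would come from image calibration, e.g. with a checkerboard pattern.

```python
def undistort_point(xd, yd, cx, cy, k1, iters=20):
    # Invert (xd, yd) = (cx, cy) + (xu, yu) * (1 + k1 * r^2), where r is
    # the radius of the undistorted point, by fixed-point iteration.
    xu, yu = xd - cx, yd - cy          # initial guess: no distortion
    for _ in range(iters):
        r2 = xu * xu + yu * yu
        xu = (xd - cx) / (1 + k1 * r2)
        yu = (yd - cy) / (1 + k1 * r2)
    return xu + cx, yu + cy

# With k1 = 1e-6 about center (0, 0), the point (100, 0) distorts to
# (101, 0) since r^2 = 10000; the inversion should recover it.
ux, uy = undistort_point(101.0, 0.0, 0.0, 0.0, 1e-6)
```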
Acknowledgements
The authors would like to acknowledge the support of the Ministry of Higher Education Malaysia for this research under the Fundamental Research Grant Scheme entitled "More Accurate Models for Movements of Pedestrians in Big Crowds".
References
[1] W. Hu, T. Tan, L. Wang, S. Maybank, A survey on visual surveillance of object motion and behaviors, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews) 34 (2004) 334–352.
[2] R. Ma, L. Li, W. Huang, Q. Tian, On pixel count based crowd density estimation for visual surveillance, in: IEEE Conference on Cybernetics and Intelligent Systems 1 (2004) 170–173.
[3] X. Liu, W. Song, J. Zhang, Extraction and quantitative analysis of microscopic evacuation characteristics based on digital image processing, Physica A: Statistical Mechanics and Its Applications 388 (2009) 2717–2726.
[4] J. Zheng, D. Yao, Intelligent pedestrian flow monitoring systems in shopping areas, in: 2nd International Symposium on Information Engineering and Electronic Commerce (2010) 1–4.
[5] B. Steffen, A. Seyfried, Methods for measuring pedestrian density, flow, speed and direction with minimal scatter, Physica A: Statistical Mechanics and Its Applications 389 (2010) 1902–1910.
[6] H. Celik, A. Hanjalic, E. Hendriks, Towards a robust solution to people counting, in: IEEE International Conference on Image Processing (2006) 2401–2404.
[7] D. Conte, P. Foggia, G. Percannella, F. Tufano, M. Vento, A method for counting moving people in video surveillance videos, EURASIP Journal on Advances in Signal Processing 2010 (2010) 231–240.
[8] J. Albusac, J.J. Castro-Schez, L.M. Lopez-Lopez, D. Vallejo, L. Jimenez-Linares, A supervised learning approach to automate the acquisition of knowledge in surveillance systems, Signal Processing 89 (2009) 2400–2414.
[9] M. Boltes, A. Seyfried, B. Steffen, A. Schadschneider, Automatic extraction of pedestrian trajectories from video recordings, in: W.W.F. Klingsch, C. Rogsch, A. Schadschneider, M. Schreckenberg (Eds.), Pedestrian and Evacuation Dynamics 2008, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010: pp. 43–54.
[10] D. Makris, T. Ellis, Path detection in video surveillance, Image and Vision Computing 20 (2002) 895–903.
[11] Z. Zhang, K. Huang, T. Tan, L. Wang, Trajectory series analysis based event rule induction for visual surveillance, in: IEEE Conference on Computer Vision and Pattern Recognition (2007) 1–8.
[12] D. Ren, J. Li, Vision-based dynamic tracking of motion trajectories of human fingertips, Robotic Welding, Intelligence and Automation (2007) 429–435.
[13] N. Johnson, D. Hogg, Learning the distribution of object trajectories for event recognition, Image and Vision Computing 14 (1996) 609–615.
[14] C. Piciarelli, G. Foresti, L. Snidaro, Trajectory clustering and its applications for video surveillance, in: Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2005), IEEE, 2005: pp. 40–45.
[15] A. Fernández-Caballero, J.C. Castillo, J.M. Rodríguez-Sánchez, A proposal for local and global human activities identification, in: Articulated Motion and Deformable Objects, Springer Berlin Heidelberg, 2010: pp. 78–87.
[16] H. Yue, C. Shao, Y. Zhao, X. Chen, Study on moving pedestrian tracking based on video sequences, Journal of Transportation Systems Engineering and Information Technology 7 (2007) 47–51.
[17] C. Hao-li, S. Zhong-ke, F. Qing-hua, The study of the detection and tracking of moving pedestrian using monocular-vision, in: Computational Science – ICCS 2006, 2006: pp. 878–885.
[18] Y. Dedeoglu, B.U. Toreyin, U. Gudukbay, A.E. Cetin, Silhouette-based method for object classification and human action recognition in video, Computer Vision in Human-Computer Interaction 3979 (2006) 64–77.
[19] L. Bazzani, D. Bloisi, V. Murino, A comparison of multi hypothesis Kalman filter and particle filter for multi-target tracking, in: 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS 2009), 2009: pp. 47–55.
[20] J. Berclaz, A. Shahrokni, F. Fleuret, Evaluation of probabilistic occupancy map people detection for surveillance systems, in: 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS 2009), 2009: pp. 55–62.
[21] S. Hoogendoorn, W. Daamen, P.H.L. Bovy, Extracting microscopic pedestrian characteristics from video data, in: 82nd Annual Meeting of the Transportation Research Board, 2003: pp. 1–15.
[22] J. Ma, W. Song, Z. Fang, S. Lo, G. Liao, Experimental study on microscopic moving characteristics of pedestrians in built corridor based on digital image processing, Building and Environment 45 (2010) 2160–2169.
[23] I. Ali, M.N. Dailey, Multiple human tracking in high-density crowds, Image and Vision Computing (2012) 540–549.
[24] A. Yilmaz, O. Javed, M. Shah, Object tracking: a survey, ACM Computing Surveys 38 (2006) 1–45.
[25] S. Lee, S. Lee, J. Choi, Correction of radial distortion using a planar checkerboard pattern and its image, IEEE Transactions on Consumer Electronics 55 (2009) 27–33.
[26] S. Bandini, M. Federici, S. Manzoni, A qualitative evaluation of technologies and techniques for data collection on pedestrians and crowded situations, in: Proceedings of the 2007 Summer Computer Simulation Conference, Society for Computer Simulation International, 2007: pp. 1057–1064.
[27] S.S. Cheung, C. Kamath, Robust techniques for background subtraction in urban traffic video, in: Proceedings of SPIE, SPIE, 2004: pp. 881–892.
[28] M. Karaman, L. Goldmann, D. Yu, T. Sikora, Comparison of static background segmentation methods, in: S. Li, F. Pereira, H.-Y. Shum, A.G. Tescher (Eds.), Visual Communications and Image Processing 2005, 2006: pp. 1–12.
[29] C. Zhan, X. Duan, S. Xu, Z. Song, M. Luo, An improved moving object detection algorithm based on frame difference and edge detection, in: Fourth International Conference on Image and Graphics (ICIG 2007) (2007) 519–523.
[30] D. Kong, D. Gray, A viewpoint invariant approach for crowd counting, in: 18th International Conference on Pattern Recognition (ICPR'06) 1 (2006) 1187–1190.
[31] N. Hussain, H.S.M. Yatim, N.L. Hussain, J.L.S. Yan, F. Haron, CDES: A pixel-based crowd density estimation system for Masjid al-Haram, Safety Science 49 (2011) 824–833.
[32] N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man, and Cybernetics 9 (1979) 62–66.