
motion analysis from encoded video bitstream


UNIVERSITY OF ENGINEERING AND TECHNOLOGY

NGUYEN MINH HOA

MOTION ANALYSIS FROM ENCODED VIDEO BITSTREAM

MASTER’S THESIS

HA NOI – 2018


“I hereby declare that the work contained in this thesis is my own and that I have not submitted this thesis at any other institution in order to obtain a degree. To the best of my knowledge and belief, the thesis contains no materials previously published or written by another person other than those listed in the bibliography and identified as references.”

Signature: ………


SUPERVISOR’S APPROVAL

“I hereby approve that the thesis in its current form is ready for committee examination as a requirement for the Master of Computer Science degree at the University of Engineering and Technology.”

Signature: ………

Signature: ………


First of all, I would like to express special gratitude to my supervisors, Dr. Do Van Nguyen and Dr. Tran Quoc Long, for their enthusiastic instruction, technical explanations, and advice during this project.

I also want to give sincere thanks to Assoc. Prof. Dr. Ha Le Thanh and Assoc. Prof. Dr. Nguyen Thi Thuy for their instruction and for the background knowledge for this thesis. I would also like to thank my teachers and my friends in the Human Machine Interaction Lab for their support.

I thank my friends and colleagues in the project "Nghiên Cứu Công Nghệ Tóm Tắt Video" and the project “Multimedia application tools for intangible cultural heritage conservation and promotion”, project number ĐTDL.CN-34/16, for their work and support.

Last but not least, I want to thank my family and all of my friends for their motivation and support. They stand by me and inspire me whenever I face tough times.

TABLE OF CONTENTS

AUTHORSHIP
SUPERVISOR’S APPROVAL
ACKNOWLEDGMENTS
TABLE OF CONTENTS
ABBREVIATIONS
List of Figures
List of Tables
INTRODUCTION
CHAPTER 1 LITERATURE REVIEW
Moving object detection in the pixel domain
Moving object detection in the compressed domain
1.2.1 Motion vector approaches
1.2.2 Size of Macroblock approaches
Chapter Summarization
CHAPTER 2 METHODOLOGY
Video compression standard H264
2.1.1 H264 file structure
2.1.2 Macroblock
2.1.3 Motion vector
Proposed method
2.2.1 Process video bitstream
2.2.2 Macroblock-based Segmentation
2.2.3 Object-based Segmentation
2.2.4 Object Refinement
Chapter Summarization
CHAPTER 3 RESULTS
The moving object detection application
3.1.1 The process of application
3.1.2 The motion information
3.1.3 Synthesizing movement information
3.1.4 Storing Movement Information
Experiments
3.2.1 Dataset
3.2.2 Evaluation methods
3.2.3 Implementations
3.2.4 Experimental results
Chapter Summarization
CONCLUSIONS
List of author’s publications related to thesis
REFERENCES

ABBREVIATIONS

NALU    Network Abstraction Layer Unit

List of Figures

Figure 1.1 The process of moving object detection with data in the pixel domain
Figure 1.2 The process of moving object detection with data in the compressed domain
Figure 2.1 The structure of an H264 file
Figure 2.2 RBSP structure
Figure 2.3 Slice structure
Figure 2.4 Macroblock structure
Figure 2.5 The motion vector of a Macroblock
Figure 2.6 The process of the moving object detection method
Figure 2.7 Skipped Macroblock
Figure 2.8 (a) Outdoor and indoor frames, (b) the "size-map" of the frames, (c) the "motion-map" of the frames
Figure 2.9 Example of the “consistency” of motion vectors
Figure 3.1 The implementation process of the approach
Figure 3.2 Data structure for storing motion information
Figure 3.3 Example frames of test videos
Figure 3.4 Example frames and their ground truth
Figure 3.5 An example frame of Pedestrians (a) and ground truth image (b)

List of Tables

Table 2.1 NALU types
Table 2.2 Slice types
Table 3.1 The information of test videos
Table 3.2 The information of test sequences in group 1
Table 3.3 The performance of two approaches with Pedestrians, PETS2006, Highway, and Office
Table 3.4 The experimental result of Poppe’s approach on 2nd group
Table 3.5 The experimental result of proposed method on 2nd group

INTRODUCTION

Today, video content is used extensively in many areas of life, such as indoor monitoring and traffic monitoring. The number of videos shared over the Internet at any given time is also extremely large. According to statistics, hundreds of hours of video are uploaded to YouTube every minute [1]. In addition, surveillance cameras are increasingly installed in homes for monitoring and security purposes. These cameras normally operate and store surveillance videos automatically. Only when a special situation or event occurs do people revisit the video data. The problem is how such a large volume of video can be evaluated in a short amount of time. For example, when a burglary or an intrusion occurs, we cannot spend hours checking every stored video. A tool that determines the moments when an object is moving in a long video is therefore essential for reducing the time and effort of searching.

Normally, in order to reduce the size of videos for transmission or storage, a video compression procedure is performed at the surveillance camera. After that, the compressed information, in the form of a bitstream, is stored or transmitted to a server for analysis. The video analysis process needs many features to describe different aspects of vision. Typically, these features are extracted from the pixel values of each video frame by fully decompressing the bitstream. The decompression procedure requires a device with high computational capacity. However, with the trend of the "Internet of Things", there are many devices with low processing capacity that are not capable of performing full video decompression at high speed. It is therefore difficult to run an approach that requires a lot of computing power in real time.

Another way to extract features from a video is to use the data in the compressed video. These data can be transform coefficients, motion vectors, quantization steps, quantization parameters, etc. From these data, through processing and analysis, we can handle some important computer vision tasks, including moving object detection, human action detection, face recognition, and moving object tracking.

This thesis proposes a new method to determine moving objects by exploring and applying some motion estimation techniques in the video compression domain. The method is then used to build an application that supports movement searching in home surveillance videos. The compression format of the videos in the thesis is the H264 compression standard (MPEG-4 Part 10), a popular video compression standard today.

Aims

The goal of the thesis is to propose a method for determining moving objects in the compressed domain of a video. I then build an application that uses the method to support searching for the moments in a video that contain moving objects.

Object and Scope of the study

Within the framework of the thesis, I study algorithms related to determining moving objects in video, especially algorithms that determine moving objects in the compressed domain. The video compression standard used in the thesis is H264/AVC.

The theory of video compression and computer vision is taken from scientific articles related to video analysis in the compressed domain and to determining motion in the compressed domain of a video.

The videos for testing and experiments are obtained from both indoor and outdoor surveillance cameras.

Method and procedures

- Research on existing motion analysis and evaluation systems for compressed video, and on scientific articles related to the analysis and evaluation of motion in compressed video.

- Experimental research: conduct experimental setups for each theoretical part, such as extracting video data, compiling data, and evaluating motion based on the obtained data.

- Experimental evaluation: each experiment is conducted independently on each module, and the modules are then integrated and deployed.

Contributions

The thesis proposes a new moving object detection method for surveillance video encoded with the H264 compression standard, using the motion vectors and the sizes of the macroblocks.

Chapter 2 presents the basic knowledge about the H264 video compression standard, such as the H264 file structure, macroblocks, and motion vectors, and describes the moving object detection method in detail, including the processing of video bitstreams, the macroblock-based segmentation phase, the object-based segmentation phase, and the object refinement phase.

Chapter 3 shows the results of the method, including an application that uses the proposed method and the experimental results.


CHAPTER 1.

LITERATURE REVIEW

Today, surveillance cameras are used extensively around the world, and the volume of surveillance video has grown tremendously. Problems often encountered with surveillance video include event searching, motion tracking, abnormal behavior detection, etc. To handle these tasks, it is necessary to have a method that can determine which moments in each video contain movement.

Usually, video is compressed for storage and transmission. Previous moving object detection methods usually use data from the pixel images, such as color values and edges. To obtain images that can be displayed or processed, the system must fully decode the video. This consumes a large amount of the device's computing resources, time, and memory. I suggest a method that can quickly determine the moving objects in high-resolution videos. The data used in the method are taken from the compressed video domain, including the motion vectors and the size of each macroblock (in bits) after encoding. This reduces the processing time considerably compared to methods that use data in the pixel domain.

The problem of motion detection in video has long been studied. It is the first step in a series of computer vision problems such as object tracking, object detection, and abnormal movement detection. There are usually two approaches to this problem: using fully decoded video data (pixel domain data) or using data taken directly from the undecoded video (compressed domain data). The following sections outline the studies based on these two approaches.

Moving object detection in the pixel domain

Typically, to reduce the size of the video for transmission, a video encoding process is performed inside the surveillance camera, and the compressed information is transmitted as a bitstream to a server for video analysis. Common video compression standards in use today include MPEG-4, H264, and H265. To be viewable, these compressed videos need to be decoded into image frames. We call these image frames the pixel domain, and the data obtained from them the data in the pixel domain. Fig. 1.1 describes the process of moving object detection methods in the pixel domain. The data in the pixel domain include the color values of the pixels, the number of color channels of each pixel, the edges, etc.

Figure 1.1 The process of moving object detection with data in the pixel domain

To determine moving objects in the pixel domain, background subtraction algorithms are commonly used. Many such methods were introduced long ago. They usually use the relationship between frames in a time series as their data.

Background subtraction is defined in [2] as follows: “Background subtraction is a widely used approach for detecting moving objects in videos from static cameras. The rationale in the approach is that of detecting the moving objects from the difference between the current frame and a reference frame, often called the “background image”, or “background model”. As a basic, the background image must be a representation of the scene with no moving objects and must be kept regularly updated so as to adapt to the varying luminance conditions and geometry settings.”
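The definition above can be illustrated with a minimal sketch, assuming grayscale frames stored as lists of rows and a running-average background update. The names `subtract_background` and `update_background` and the values of `ALPHA` and `THRESHOLD` are illustrative assumptions, not taken from [2] or from the thesis.

```python
# Minimal sketch of background subtraction with a running-average background.
# ALPHA and THRESHOLD are illustrative values, not taken from the thesis.

ALPHA = 0.05     # assumed learning rate of the background update
THRESHOLD = 25   # assumed minimum gray-level difference that counts as motion


def subtract_background(frame, background, threshold=THRESHOLD):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def update_background(frame, background, alpha=ALPHA):
    """Blend the current frame into the background so it follows slow changes."""
    return [
        [(1 - alpha) * b + alpha * p for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# A 3x4 grayscale scene: an empty background, then a bright object appears.
background = [[10.0] * 4 for _ in range(3)]
frame = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]

mask = subtract_background(frame, background)
background = update_background(frame, background)
```

Every pixel of every frame has to be available for this comparison, which is why such methods need the fully decoded video.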

Research results include methods that use a Gaussian average, such as those of Wren et al. [3] and Koller et al. [4]; methods that use a temporal median filter, such as those of Lo and Velastin [5] and Cucchiara et al. [6]; methods that use a mixture of Gaussians, such as those of Stauffer and Grimson [7] and Power and Schoonees [8]; etc.

The above methods have a common characteristic: the data they process are obtained by fully decompressing the compressed bitstream, and this decompression requires a device with high computational capacity. However, with the trend of the "Internet of Things", most low-end devices are not capable of performing decompression at high speed. Therefore, there should be a video analysis mechanism that works directly on the compressed video, without decompressing it.

Moving object detection in the compressed domain

Normally, videos are encoded using some compression standard. Each compression standard specifies how to shrink the video size using a certain structure, so the compressed videos contain less data. For example, with the H264 compression standard, the data contained in the compressed video include information about macroblocks, motion vectors, frame information, etc. We call these data the data in the compressed domain, or the video compression domain. Fig. 1.2 shows the process of moving object detection methods that use the data in the compressed domain.

Figure 1.2 The process of moving object detection with data in the compressed domain

In general, the amount of data in the video compression domain is much smaller than in the pixel domain. The idea of using data in the compressed domain of the H264 compression standard for video analysis has been investigated by researchers around the world. To detect motion in the compressed video domain, two types of data are usually used: the motion vectors and the size (in bits) of each macroblock.

1.2.1 Motion vector approaches

A number of algorithms have been proposed to analyze video content in the H264 compressed domain, and good performance has been reported [9] [10]. Zeng et al. [11] proposed a method to detect moving objects in H264 compressed videos based on motion vectors. Motion vectors are extracted from the motion field and classified into several types. Then, they are grouped into blocks through a Markov Random Field (MRF) classification process. Liu et al. [12] recognized the shape of an object by using a map for each object. This approach is based on a binary partition tree created from macroblocks. Cipres et al. [13] presented a moving object detection approach in the H264 compressed domain based on fuzzy logic. The motion vectors are used to remove the noise that appears during the encoding process and to represent the concepts that describe the detected regions. Then, the valid motion vectors are grouped into blocks, each of which can be identified as a moving object in the video scene. The moving objects of each frame are described with common terms such as shape, size, position, and velocity. Mak et al. [14] used the length, angle, and direction of motion vectors to track objects by applying the MRF. Bruyne et al. [15] estimated the reliability of motion vectors by comparing them with projected motion vectors from surrounding frames. Then, they combined this information with the magnitude of the motion vectors to distinguish foreground objects from the background. This method can localize noisy motion vectors, so that their effect during classification is diminished.

Wang et al. [16] proposed a background modeling method that uses motion vectors and the local binary pattern (LBP) to detect moving objects. When a background block is similar to a foreground block, a noisy motion vector appears. To obtain a more reliable and dense motion vector field, the initial motion vector fields are preprocessed by temporal accumulation within three inter frames and 3×3 median filtering. After that, the LBP feature is introduced to describe the spatial correlation among neighboring blocks. This approach can reduce the time needed to extract moving objects while also performing an effective synopsis analysis.

Marcus Laumer [17] proposed an approach that segments video frames into foreground and background and, according to this segmentation, identifies regions containing moving objects. The approach uses a map to indicate the "weight" of each (sub-)macroblock for the presence of a moving object. This map is the input of a new spatiotemporal detection algorithm that refines the weight indicating the level of motion of each block. Then, the quantization parameters of the macroblocks are used to apply individual thresholds to the block weights in order to segment the video frames. The accuracy of the approach was approximately 50%.
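The kind of preprocessing attributed to Wang et al. [16] above, temporal accumulation over three inter frames followed by 3×3 median filtering, can be sketched as follows. This is only an illustration under stated assumptions: accumulation is taken to be a per-block sum, the median is applied to each vector component separately, and neighbors outside the frame are ignored. None of these details is confirmed by the source, and the function names are hypothetical.

```python
from statistics import median


def accumulate(fields):
    """Sum the motion vectors of several frames, block by block (assumed rule)."""
    rows, cols = len(fields[0]), len(fields[0][0])
    return [
        [
            (sum(f[r][c][0] for f in fields), sum(f[r][c][1] for f in fields))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def median_filter_3x3(field):
    """Replace each vector by the component-wise median of its 3x3 neighborhood."""
    rows, cols = len(field), len(field[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Neighbors outside the frame are skipped (assumed border handling).
            window = [
                field[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(
                (median(v[0] for v in window), median(v[1] for v in window))
            )
        out.append(out_row)
    return out


# Three 3x3 motion vector fields: one isolated noisy vector in the second frame.
zero = [[(0, 0)] * 3 for _ in range(3)]
spike = [[(0, 0)] * 3 for _ in range(3)]
spike[1][1] = (8, -8)

accumulated = accumulate([zero, spike, zero])
smoothed = median_filter_3x3(accumulated)
```

In this toy input the isolated vector survives the accumulation but is removed by the median filter, while a field in which all neighboring blocks move consistently would pass through unchanged.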

To identify human actions, Tom et al. [18] proposed a quick action identification algorithm. The algorithm uses the quantization parameter gradient image (QGI) and motion vectors with support vector machines (SVM) to classify the types of actions. The algorithm can also handle variations in lighting, scale, and some other environmental variables, with an accuracy rate of 85% on videos with a resolution of 176x144. It can identify walking, running, etc. Similarly, Tom, Rangarajan, and their colleagues also used the QGI and motion vectors to propose a new method that classifies human actions using Projection Based Learning of the Meta-cognitive Radial Basis Function Network (PBL-McRBFN).

For the motion tracking problem, Biswas et al. [19] proposed a method for detecting abnormal actions by analyzing motion vectors. This method mainly relies on observing the motion vectors to find the difference between abnormal actions and normal situations. The classifier used is the Gaussian Mixture Model (GMM). This approach is based on their earlier approach [20] but improves it by using the direction of the motion vectors. In experiments, the approach runs at about 70 fps. Thilak et al. [21] proposed a Probabilistic Data Association Filter that detects multiple target clusters. This method can handle cases in which targets split into multiple clusters or in which clusters should be detected (classified) as a target. Similarly, You et al. [22] used probabilistic spatio-temporal MB filtering to mark macroblocks as objects and then remove the noise. The algorithm can track many objects in real time, but it can only be applied with a fixed camera, and the objects must cover at least two macroblocks. Kas et al. [23] overcame the fixed camera problem by using Global Motion Estimation and Object History Images to handle background movement. However, the number of moving objects needs to be small, and the moving objects must not occupy most of the frame area.

1.2.2 Size of Macroblock approaches

The methods mentioned above share the trait of using motion vectors to detect moving objects. However, since motion vectors are created by the video encoder to optimize the compression ratio, they do not always represent the real motion in the video sequence. Because of this coding-oriented nature, the motion vector fields must be preprocessed and refined to remove noise before they can be used to detect moving objects.

For this reason, Poppe et al. [24] proposed an approach to detect moving objects in H264 video by using the size of the macroblocks after encoding (in bits). To achieve sub-macroblock-level (4×4) precision, the information from transform coefficients was also utilized. The system achieved high execution speeds, up to 20 times faster than the related motion vector-based works. The analysis was restricted to Predicted (P) frames, and a simple interpolation technique was employed to handle Intra (I) frames. The whole algorithm was based on the assumption that a macroblock that contains an edge of a moving object is more difficult to compress, since it is hard to find a good match for that macroblock in the reference frame(s).

Based on Poppe’s idea, Vacavant et al. [25] used the macroblock size to detect moving objects by applying the Gaussian mixture model (GMM). The approach can represent the distribution of macroblock sizes well.

Although the methods of Poppe and Vacavant are good at removing background motion noise, they cannot produce good motion detection results for videos with high spatial resolution (such as 1920 × 1080 or 1280 × 720). When the moving objects are large and contain a region of uniform color (such as a black car), the sizes of the macroblocks corresponding to the inside region of the moving object are very small (normally around zero), and using a filtering threshold or parameter, however small, is not effective. In those cases, the algorithm determines these regions to be background.
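The limitation described above can be reproduced with a toy example. The sketch below is a simplified size-thresholding rule, not the actual algorithm of Poppe et al. [24] or Vacavant et al. [25]; the macroblock sizes, the `SIZE_THRESHOLD` value, and the function name `foreground_mask` are all made up for illustration.

```python
# Bits spent on each macroblock of a small frame (hypothetical numbers).
# A large uniform object covers the middle 4x4 macroblocks: its edge blocks are
# expensive to encode, while its interior blocks cost almost nothing.
mb_bits = [
    [0, 0, 0, 0, 0, 0],
    [0, 90, 80, 85, 95, 0],
    [0, 70, 1, 0, 75, 0],
    [0, 85, 0, 2, 80, 0],
    [0, 95, 75, 70, 90, 0],
    [0, 0, 0, 0, 0, 0],
]

SIZE_THRESHOLD = 20  # assumed threshold in bits, chosen only for this example


def foreground_mask(sizes, threshold=SIZE_THRESHOLD):
    """Mark a macroblock as foreground when its encoded size exceeds the threshold."""
    return [[1 if bits > threshold else 0 for bits in row] for row in sizes]


mask = foreground_mask(mb_bits)
strict_mask = foreground_mask(mb_bits, threshold=0)
```

With the assumed threshold, only the ring of edge macroblocks is marked as foreground. Even with a threshold of zero, the interior macroblocks that were encoded with zero bits are still classified as background, which matches the behavior described in the paragraph above.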

Chapter Summarization

This chapter reviewed the research on moving object detection in both the pixel domain and the compressed domain. The approaches that use data from the pixel domain usually have high accuracy but consume a large amount of computing resources and time. The approaches that use data in the compressed domain have lower accuracy because the data in the compressed domain usually contain less information. In the next chapters, I propose a method that can efficiently detect moving objects, especially in video streams with high spatial resolution. The method uses data taken from the video compressed domain: the sizes of the macroblocks to detect the skeleton of the moving object, and the motion vectors to detect the detail of the moving object.

CHAPTER 2.

METHODOLOGY

Video compression standard H264

Before proposing the moving object detection method, this chapter presents some information about H264, a popular video compression standard, which is used to encode and decode the surveillance video in the thesis.

Nowadays, the installation of surveillance cameras in homes has become quite common. Video data recorded by a surveillance camera over a long period of time are usually very large. Consequently, videos need to be preprocessed and encoded before being used and transmitted over the network. There are many recognized and widely used compression standards. One of these is H264, or MPEG-4 Part 10 [26], a compression standard recognized by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group.

2.1.1 H264 file structure

Normally, video captured from the camera is compressed using a common video compression standard such as H261, H263, MPEG-4, H264/AVC, H265/HEVC, etc. In the thesis, I encode and decode the video using H264/AVC.

Typically, an H264 file is split into packets called Network Abstraction Layer Units (NALUs) [27], as shown in Fig. 2.1.

Figure 2.1 The structure of an H264 file

The first byte of a NALU indicates the NALU type. The NALU type shows what the structure of the NALU is: it can be a slice or a parameter set for decompression. The meanings of the NALU types are given in Table 2.1.

Table 2.1 NALU types

Type     Description
1        Slice layer without partitioning, non-IDR
2        Slice data partition A layer
3        Slice data partition B layer
4        Slice data partition C layer
5        Slice layer without partitioning, IDR
6        Additional information (SEI)
7        Sequence parameter set
8        Picture parameter set
9        Access unit delimiter
10       End of sequence
11       End of stream
12       Filler data
13-23    Reserved
24-31    Undefined

Other than the NALU header, the rest of the NALU is called the RBSP (Raw Byte Sequence Payload). The RBSP contains the data of the SODB (String Of Data Bits). According to the H264 specification (ISO/IEC 14496-10), if the SODB is empty (no bits are present), the RBSP is also empty. The first byte of the RBSP contains the first (left-most) 8 bits of the SODB; the next byte of the RBSP contains the next 8 bits of the SODB, and so on, until fewer than 8 bits of the SODB remain.

Figure 2.2 RBSP structure
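Reading the NALU type from the first byte can be sketched as below. The bit layout used here is that of the H.264 NAL unit header (a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc, and a 5-bit nal_unit_type); the dictionary simply mirrors Table 2.1, and the function names `parse_nalu_header` and `describe_nalu` are hypothetical helpers, not part of any library.

```python
# Descriptions copied from Table 2.1.
NALU_TYPES = {
    1: "Slice layer without partitioning, non-IDR",
    2: "Slice data partition A layer",
    3: "Slice data partition B layer",
    4: "Slice data partition C layer",
    5: "Slice layer without partitioning, IDR",
    6: "Additional information (SEI)",
    7: "Sequence parameter set",
    8: "Picture parameter set",
    9: "Access unit delimiter",
    10: "End of sequence",
    11: "End of stream",
    12: "Filler data",
}


def parse_nalu_header(first_byte):
    """Split the first NALU byte into its three header fields."""
    forbidden_zero_bit = (first_byte >> 7) & 0x01  # must be 0 in a valid stream
    nal_ref_idc = (first_byte >> 5) & 0x03         # 0 means "not used for reference"
    nal_unit_type = first_byte & 0x1F              # index into Table 2.1
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


def describe_nalu(first_byte):
    """Return the Table 2.1 description for the NALU that starts with this byte."""
    _, _, nal_unit_type = parse_nalu_header(first_byte)
    if nal_unit_type in NALU_TYPES:
        return NALU_TYPES[nal_unit_type]
    if 13 <= nal_unit_type <= 23:
        return "Reserved"
    return "Undefined"  # types 24-31, and type 0, which Table 2.1 does not list


for byte in (0x67, 0x68, 0x65, 0x41):
    print(hex(byte), parse_nalu_header(byte), describe_nalu(byte))
```

A compressed-domain analyzer can use this byte alone to skip NALUs it does not need, for example SEI messages, before any slice data is parsed.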

A video is normally divided into frames, and the encoder encodes them one by one. Each frame is encoded into slices, and each slice is divided into macroblocks (MBs). Typically, each frame corresponds to one slice, but sometimes a frame is split into multiple slices. The slices are divided into the types shown in Table 2.2. A slice consists of a header and a data section (Fig. 2.3). The header of the slice contains information about the type of the slice, the type of the MBs in the slice, and the frame number of the slice. The header also contains information about the reference frame and the quantization parameters. The data section of the slice contains the information about the macroblocks.

Table 2.2 Slice types

Type   Name       Description
0      P-slice    Consists of P-macroblocks (each macroblock is predicted using one reference frame) and/or I-macroblocks
1      B-slice    Consists of B-macroblocks (each macroblock is predicted using one or two reference frames) and/or I-macroblocks
2      I-slice    Contains only I-macroblocks. Each macroblock is predicted from previously coded blocks of the same slice
3      SP-slice   Consists of P and/or I-macroblocks and allows switching between encoded streams
4      SI-slice   Consists of a special type of SI-macroblocks and allows switching between encoded streams

Figure 2.3 Slice structure
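The slice types of Table 2.2 can be captured in a small lookup, sketched below. The macroblock sets follow the table as written; the handling of the values 5 to 9 reflects the H.264 slice header syntax, where they name the same five types as 0 to 4. The function names are hypothetical.

```python
# Slice type names and the macroblock types each may contain, as in Table 2.2.
SLICE_TYPES = {0: "P", 1: "B", 2: "I", 3: "SP", 4: "SI"}
ALLOWED_MACROBLOCKS = {
    "P": {"P", "I"},
    "B": {"B", "I"},
    "I": {"I"},
    "SP": {"P", "I"},
    "SI": {"SI"},
}


def slice_type_name(slice_type):
    """Map a slice_type value from the slice header to its name."""
    if not 0 <= slice_type <= 9:
        raise ValueError(f"invalid slice_type: {slice_type}")
    # Values 5..9 name the same slice types as 0..4.
    return SLICE_TYPES[slice_type % 5]


def allowed_macroblocks(slice_type):
    """Return the macroblock types that Table 2.2 lists for this slice type."""
    return ALLOWED_MACROBLOCKS[slice_type_name(slice_type)]
```

For motion analysis this lookup is mainly useful for deciding which slices can carry motion vectors at all: according to Table 2.2, an I-slice contains only I-macroblocks, which are predicted within the same slice.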

2.1.2 Macroblock

The basic principle of a compression standard is to split the video into groups of frames. Each frame is divided into basic processing units; in the H264/AVC standard, this unit is the macroblock (MB), a region of 16x16 pixels. In regions that carry more detail, the MBs are subdivided into smaller sub-macroblocks (4x4 or 8x8 pixels). After compression, each MB contains the information used to recover the video later, including the motion vector, the residual values, the quantization parameter, etc., as in Fig. 2.4, where:

• ADDR is the position of the Macroblock in the frame;

• TYPE is the Macroblock type;

• QUANT is the quantization parameter;

• VECTOR is the motion vector;

• CBP (Coded Block Pattern) indicates which blocks of the MB contain coded residual data;

• bN is the encoded residual data of the color channels (4 Y blocks, 1 Cr block, 1 Cb block).

Figure 2.4 Macroblock structure
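The fields listed above map naturally onto a small record type. The sketch below is one possible in-memory representation, not the layout used by the thesis or by any decoder; it assumes that ADDR is a raster-scan index and that each bN block is kept as raw bytes, and the helper methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Macroblock:
    addr: int                 # ADDR: position of the MB in the frame (assumed raster index)
    mb_type: str              # TYPE: macroblock type, e.g. "I", "P" or "B"
    quant: int                # QUANT: quantization parameter
    vector: Tuple[int, int]   # VECTOR: motion vector as (x, y)
    cbp: int                  # CBP: coded block pattern bit mask
    blocks: List[bytes] = field(default_factory=list)  # bN: 4 Y, 1 Cr, 1 Cb

    def position(self, mbs_per_row):
        """Return (row, column) of the MB, assuming raster-scan addressing."""
        return divmod(self.addr, mbs_per_row)

    def residual_bits(self):
        """Number of bits stored in the residual blocks kept in this record."""
        return 8 * sum(len(block) for block in self.blocks)


# A 1920-pixel-wide frame holds 1920 / 16 = 120 macroblocks per row.
mb = Macroblock(
    addr=245,
    mb_type="P",
    quant=28,
    vector=(3, -1),
    cbp=0b001111,
    blocks=[b"\x12\x34", b"", b"\x56", b"", b"", b""],
)
row, col = mb.position(mbs_per_row=120)
```

Note that `residual_bits` counts only the residual payload stored in the record; the macroblock size used later in the thesis is the size of the whole encoded macroblock in bits, which a real extractor would obtain from the bitstream position before and after the macroblock is parsed.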

During decompression, the video decoder receives the compressed video data as a stream of binary data, decodes the syntax elements, and extracts the encoded information, including the transform coefficients, the size of each MB (in bits), the motion prediction information, and so on, and then performs the inverse transform to restore the original image data.

2.1.3 Motion vector

With H264 compression, the macroblocks (MBs) of a frame are predicted based on information that has already been transferred from the encoder to the decoder. There are two kinds of prediction: intra-frame prediction and inter-frame prediction. Intra-frame prediction uses already compressed image data in the same frame as the macroblock being compressed, while inter-frame prediction uses previously compressed frames. Inter-frame prediction is accomplished through motion estimation and motion compensation: the motion estimator searches the reference frame for the macroblock closest to the new macroblock and calculates the motion vector, which characterizes the displacement of the macroblock being encoded relative to the reference frame.

The matched macroblock from the reference frame and the new macroblock to be coded are sent to the subtractor to compute the prediction error, or residual signal, which characterizes the difference between the predicted macroblock and the actual macroblock. The residual signal is transformed with the Discrete Cosine Transform and quantized to reduce the number of bits to be stored or transmitted. These coefficients, together with the motion vectors, are passed to the entropy coder to produce the bitstream. The binary video stream includes transform coefficients, motion prediction information, information about the structure of the compressed data, and more.

To perform video compression, the encoder compares the values of two frames, one of which is used as a reference. When we want to compress an MB at position i of a frame, the video compression algorithm tries to find, in the reference frame, the MB with the smallest difference from the MB at position i. If this MB is found in the reference frame at position j, the displacement between i and j is called the motion vector (MV) of the MB at position i (Fig. 2.5). Normally, an MV consists of two values: x (the displacement along the columns) and y (the displacement along the rows).

Figure 2.5 The motion vector of a Macroblock

Note that the MV of an MB does not necessarily describe the motion of the objects in that MB; it merely points to the region of the reference frame whose pixels are closest to the pixels of the MB.
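The search for the best-matching block described in this section can be sketched as a full search over a small window, using the sum of absolute differences (SAD) as the difference measure. This is a simplified illustration: real H264 encoders use variable block sizes, sub-pixel precision, and faster search strategies, and the block size, search range, sign convention, and function names below are assumptions.

```python
def sad(current, reference, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(current[cy + j][cx + i] - reference[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def find_motion_vector(current, reference, cx, cy, size=4, search=2):
    """Full search in a +/-search window; returns ((dx, dy), best_sad).

    (dx, dy) is the displacement from the block at (cx, cy) in the current
    frame to its best match in the reference frame (assumed sign convention).
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would leave the reference frame
            cost = sad(current, reference, cx, cy, rx, ry, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


def make_frame(x0, y0, width=8, height=8, size=4):
    """Black frame with one textured size x size object at column x0, row y0."""
    frame = [[0] * width for _ in range(height)]
    for j in range(size):
        for i in range(size):
            frame[y0 + j][x0 + i] = 10 * (j + 1) + (i + 1)
    return frame


# The object moves one pixel right and one pixel down between the two frames.
reference = make_frame(1, 1)
current = make_frame(2, 2)
mv, cost = find_motion_vector(current, reference, cx=2, cy=2)
```

In this toy case the best match has a SAD of zero, so the vector reflects the true motion. On a flat or noisy background, several candidates can have nearly the same SAD, and the chosen vector is then arbitrary, which is the effect the note above describes.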


Proposed method

This section describes the processing of the proposed moving object detection method. The processing includes three phases: Macroblock-based segmentation, Object-based segmentation, and Object refinement.

2.2.1 Process video bitstream

The video data is taken directly from the surveillance camera in the form of an H264 bitstream and is then transported to the processing device. To get the MV and MB information, I use the libraries LIVE555 [28] and JM 19.0 [29]. LIVE555 is a free, open-source C++ library for sending and receiving media streams through the RTP/RTCP, RTSP, and SIP protocols. The LIVE555 Streaming Media module is responsible for connecting, authenticating, and receiving data from the RTSP stream taken directly from the surveillance camera. In addition to receiving packets, LIVE555 Streaming Media also strips the packet headers. The results from this module are therefore NALUs (refer to ISO/IEC 14496-10 [26]). The NALUs are then transferred to JM 19.0, a free H264 decoding tool commonly used in study and research, for processing. The input of the original JM 19.0 decoder module is a compressed video file in the H264 format (the byte stream format described in Annex B of ISO/IEC 14496-10), and its output is the decompressed video file in YUV format. However, in order to reduce the time and volume of computation as originally planned, I modified this library so that it stops after extracting the required information, without fully decoding the video.
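The Annex B byte stream format mentioned above separates NALUs with start codes (the bytes 00 00 01, optionally preceded by an extra zero byte). The sketch below shows how such a stream can be split into NALUs in plain Python. It is not code from LIVE555 or JM; the function name `split_annexb` is hypothetical, and the payload bytes in the example are made up.

```python
def split_annexb(stream: bytes):
    """Return the NALUs found between Annex B start codes (00 00 01)."""
    nalus = []
    start = None  # index of the first byte of the NALU being collected
    i = 0
    while i + 3 <= len(stream):
        if stream[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                # Zero bytes before a start code belong to the start code
                # (4-byte form) or are padding, not to the previous NALU.
                nalus.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        nalus.append(stream[start:].rstrip(b"\x00"))
    return nalus


# Three NALUs with made-up payloads: SPS (0x67), PPS (0x68), IDR slice (0x65).
stream = (
    b"\x00\x00\x00\x01\x67\x42\x00\x1e"
    b"\x00\x00\x00\x01\x68\xce\x3c\x80"
    b"\x00\x00\x01\x65\x88\x84"
)
nalus = split_annexb(stream)
types = [nalu[0] & 0x1F for nalu in nalus]
```

In a pipeline like the one described above, the reverse step may also be needed: if a decoder expects Annex B input while the receiving library delivers bare NALUs, a start code has to be prepended to each NALU before it is passed on. Whether the thesis implementation does this is not stated in the text.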

The MVs and the MB information are then used to detect the moving objects. I propose a method that uses a combination of both to determine the motion in the video. This method can be applied to both indoor and outdoor video. Because it uses data from the compressed domain, its processing time is lower than that of methods that use data in the pixel domain. The moving object detection method consists of three phases: Macroblock-based segmentation, Object-based segmentation, and Object refinement, as shown in Fig. 2.6.
