UNSUPERVISED LEARNING AND REVERSE OPTICAL FLOW
IN MOBILE ROBOTICS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL
ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Andrew Lookingbill
May 2011
© 2011 by Andrew Lookingbill. All Rights Reserved. Redistributed by Stanford University under license with the author.
This dissertation is online at http://purl.stanford.edu/hn2066k25780
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Sebastian Thrun, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Bernd Girod

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Andrew Ng

Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost for Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
…the environment without having explicitly labeled data is exciting stuff. So, while this thesis may not have any plot twists or surprise endings, I hope you find it interesting reading. … In the coming revolution, I may be partly to blame for any stray toasters.
…fortune to work with some extraordinarily talented people during my time here. I would like to thank the members of the Stanford Autonomous Helicopter Project for their help in acquiring the video used for testing the multi-object tracking algorithm discussed in Chapter 3. I would also like to thank my collaborators David Lieb, David Stavens, John Rogers, Jim Curry, and Itai Katz for their insights as well as the long hours, late nights, sunburns, and mosquito bites we endured as we wrote and then tested new algorithms in the field. I am indebted to the members of my reading committee, Professors Girod and Ng, my thesis defense committee chair, Professor Widrow, and my advisor, Professor Thrun. Finally, I want to thank my mother, for…
3.1 Learning Activity Maps from a Moving Platform 18
3.1.2 Identifying Moving Objects on the Ground 19
3.1.3 Tracking Moving Objects with Particle Filters 21
Bibliography 92
List of Tables
3.1 Single and Multi-Object Tracking Performance
List of Figures
2.1 Features identified using an algorithm by Shi and Tomasi
2.2 Image pyramids, filtered and subsampled, for two consecutive video frames. A sum of squared differences measure is iteratively minimized between the two, moving from coarser to finer levels, to calculate the optical flow for a given feature.
2.3 (a) Optical flow based on a short image sequence for an image containing a moving object (dark car) …
2.11 Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
2.12 Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
2.13 Indoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
2.14 (a) Points on desert roadway selected in initial video frame (b) Corresponding points 200 frames in the past
3.1 The Stanford Helicopter is based on a Bergen Industrial Twin platform and is outfitted with instrumentation for autonomous flight (IMU, GPS, magnetometer, PC104). In the experiments reported here the onboard laser was replaced with a color camera.
3.2 (a) Optical flow based on a short image sequence, for an image containing a moving object (dark car). (b) The "corrected" flow after compensating for the estimated platform motion, which itself is obtained from the image flow.
3.3 (a) Particle filters tracking moving objects (the large truck in the foreground is not moving). (b) The center of each particle filter in a different frame in the sequence clearly identifies all moving objects.
3.4 Two moving objects being tracked in video taken from a helicopter as part of a DARPA demo.
3.5 Example of a learned activity map of an area on campus, using data acquired from a camera platform undergoing unknown motion. The arrows indicate the most likely motion direction modes in each grid cell; their lengths correspond to the most likely velocity of that mode, and the thickness represents the probability of motion. This diagram clearly shows the main traffic flows around the circular object; it also shows the flow of pedestrians that moved th…
3.6 The single-frame alignment of two independent video sequences based on the activity-based models acquired from each. This registration is performed without image pixel information; it uses only activity information from the learning.
3.7 Example of two tracks (a) without and (b) with the learned activity map. The track in (a) is incomplete and misses the moving object for a number of time steps. The activity map enables the tracker to track the top object more reliably.
Selected features in first frame from jogging video
Adaptive road following algorithm
(a) Dark line shows the definition region used in the proposed algorithm (b)-(d) White lines show the locations in previous frames to which optical flow has traced the definition region 41
(a) Input video frame (b) Visualization of SSD matching response for 9 horizontal templates for this frame
4.5 Dark circles represent locations of maximum SSD response along each horizontal search line. Light circles are the output of the dynamic programming routine. The gray region is the final output of the algorithm and is calculated using the dynamic programming output. The width of this region is linearly interpolated from the horizontal template widths. Uneven vertical spacing between circles is the result of …
4.6 Single frame algorithm output for three Mojave Desert data sets. Each column contains results from one of the three video sequences.
4.8 Pixel coverage results on the three desert video sequences 50
4.9 (a) Input frame from the video characterized by a straight dirt road with scrub brush lining the sides (b) Optical flow technique output (c) MRF classifier output
4.10 (a) Input frame from the video with long shadows and sparse vegetation (b) Optical flow technique output (c) MRF classifier output
4.12 (a) Input frame from video characterized by changing elevations and gravel colors (b) Optical flow technique output (c) MRF classifier output 54
4.13 Line coverage results are shown at different distances from the front of the vehicle towards the horizon for the three video sequences 55
4.14 Sample optical flow field in frame from winding desert video 56
4.15 First figure shows the definition region in front of the vehicle; subsequent figures show the location the definition region is tracked back to in successively earlier video frames. The last figure is 200 frames in the past.
4.16 Sample frame with roadway position calculated from positions of horizontal templates …
4.19 Output of naive texture-based road classification algorithm
4.20 Comparison of reverse optical flow, color, and texture-based algorithms
4.21 Comparison of reverse optical flow, color, and texture-based algorithms
5.1 Statistics for Gaussian mixture components after a run where the robot interacted with an orange fence, and avoided subsequent orange objects. Each row has the mean and standard deviation for that Gaussian component, followed by the number of good, bad, and lethal votes for that component based on training data
(a) STFT during a period of normal robot operation (b) STFT during a period of detected wheel slippage
Trees are classified as obstacles by the classification algorithm after interacting with the robot's physical bumper
5.10 Learned optical flow field is used by the robot to determine how to maneuver to push obstacles out of the field of view
5.11 (a) Video frame (b) Hand-labeled obstacle map (c) Segmentation without optical flow (d) Segmentation with optical flow
5.12 (a) Paths taken using data collected without optical flow (b) Paths taken using data collected with optical flow
5.13 Autonomous navigation results with 95% confidence ellipses. The average run duration (in minutes) is indicated on the y-axis, and the average number of obstacles encountered on the x-axis
5.15 (a) Input frame (b) Initial classifier output (c) Traversable pixels (d) Polynomial contour (e) Refined classifier output (f) Estimated path
…sets, are an important tool for dealing with, and benefiting from, this abundance.
This thesis will examine the application of unsupervised learning techniques to three subfields of mobile robotics. The first, tracking multiple moving objects from above, is an area of current interest for unmanned aerial vehicle (UAV) researchers. The second, road following in loosely-structured environments, was made famous by …
This thesis describes three novel contributions, one in each of the subfields listed above. First, the ability to build dynamic, activity-based ground models from a moving platform paves the way for improved multi-object tracking (important for coping with real-world data with multiple objects of interest) and other applications
such as video stream registration in video taken from UAVs. Second, the combination of optical flow techniques and dynamic programming produces a real-time algorithm for accurately estimating the position of traversable areas in a loosely-structured environment, which allows improved road classification in unpaved driving conditions, in turn allowing higher robot travel speeds. Finally, an extension of these optical flow techniques allows an autonomously navigating robot to improve the quality of its obstacle classification in monocular video, which in turn improves its obstacle avoidance performance.
All the work covered in this thesis uses a monocular video camera as the primary sensor. This is challenging in that the camera lacks the dense 3D range information available from laser scanners or stereo vision. It is useful, however, as a sensor which is desirable in some applications, and because it provides information all the way to the horizon.
As the experimental results discussed in this thesis will show, these contributions improve the state of the art in each of these three subfields.
1.1 Thesis structure
The following chapters of this thesis are based on published papers. Chapter 3 is based on a paper published at ICRA with coauthors Lieb, Stavens, and Thrun [1]. Chapter 4 is based on a paper published at RSS with coauthors Lieb and Thrun [2]. Chapter 5 is based on a paper published in IJCV with coauthors Rogers, Lieb, Curry, and Thrun, and a book chapter in the Springer STAR series with coauthors Lieb and Thrun [3], [4]. All of the approaches discussed in this thesis make use of optical flow techniques. These techniques, which are a way of associating the location of an object in a given video frame with its location in an earlier frame in the video, are described in detail in Chapter 2.
Chapter 2
Optical Flow and Reverse Optical Flow
There are two tools from computer vision that play a large part in the work described in Chapters 3, 4, and 5 of this thesis.

Optical Flow: Trucco and Verri define optical flow as "the apparent motion of the image brightness pattern" [7]. The term sparse optical flow is often used to describe this motion for a subset of pixels in an image. The work in Chapter 3 of this thesis combines the techniques in Sections 2.1 and 2.2 to produce a sparse optical flow estimate used for object tracking in a monocular video stream taken from a moving platform. A note on terminology: what I refer to here as sparse optical flow is sometimes referred to in the literature more generally as feature tracking.

Reverse Optical Flow: I will define reverse optical flow as the use of a stored buffer of interframe sparse optical flow vectors, for some number of pairs of consecutive video frames in the past, to associate any pixel in the current image with the location of the object it corresponds to in an earlier video frame. This is interesting because, rather than identifying an object ahead of time and tracking its motion, we can pick an object in the current frame which is interesting (perhaps because of an interaction with short-range sensors) and examine its appearance at some point in the past. The work in Chapters 4 and 5 of this thesis combines the techniques in Sections 2.1, 2.2, and 2.3 into a robust implementation of the calculation of reverse optical flow.
…rithm that does not use color, texture, shape, or size information to track objects, while keeping the details of the implementation simple. In addition, no prior assumptions about the nature of the moving objects being tracked are made.
In the approach proposed here, features are first identified using an algorithm by Shi and Tomasi [5], which selects unambiguous feature points by finding regions in the image containing large spatial image gradients in two orthogonal directions. The features found and tracked by this algorithm are corners. Using Scale Invariant Feature Transform (SIFT) features would have been another option [6]. In order to find corners, the image is first smoothed using a Gaussian filter (with an 11 pixel kernel and standard deviation of 2.15 in both dimensions), and the minimal eigenvalue of the matrix
$$\begin{bmatrix} E_x^2 & E_x E_y \\ E_x E_y & E_y^2 \end{bmatrix}$$
where $E_x$ is the spatial image gradient in the x direction, is then found at each location [5]. These eigenvalues are dropped if smaller than a threshold (0.05 times the maximum eigenvalue in the image), and the OpenCV function cvGoodFeaturesToTrack was used to perform the operation discussed here [9].
A sample of features found by this algorithm, in an image acquired by the Stanford Autonomous Helicopter, is shown in Fig. 2.1.
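As a concrete illustration of this corner-selection step, the sketch below uses the modern Python binding of OpenCV in place of the C function cvGoodFeaturesToTrack named above. The 11-pixel Gaussian kernel, the 2.15 standard deviation, and the 0.05 quality threshold come from the text; the maximum corner count and minimum corner spacing are illustrative assumptions.

    import cv2
    import numpy as np

    def select_corner_features(frame_bgr, max_corners=400, min_distance=5):
        # Smooth with an 11x11 Gaussian kernel, sigma 2.15 in both dimensions.
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        smoothed = cv2.GaussianBlur(gray, (11, 11), sigmaX=2.15, sigmaY=2.15)
        # Shi-Tomasi selection: keep corners whose minimal eigenvalue of the
        # gradient matrix exceeds 0.05 times the largest such eigenvalue.
        corners = cv2.goodFeaturesToTrack(
            smoothed, maxCorners=max_corners, qualityLevel=0.05,
            minDistance=min_distance)
        return np.empty((0, 2)) if corners is None else corners.reshape(-1, 2)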
2.2 Feature Tracking
The tracking of the selected features is achieved using a pyramidal implementation of the Lucas-Kanade tracker [8]. This approach forms image pyramids consisting of filtered and subsampled versions of the original images (see Fig. 2.2). The pyramids used have five levels; each level was half the size, in each dimension, of the level above. The filtering consisted of smoothing with a 5x5 pixel Gaussian kernel with standard deviation equal to 1.25 in both dimensions. The displacement vectors between the feature locations in the two images are found by iteratively minimizing the sum of squared errors over a small window, from the coarsest level up to the original level.
Figure 2.3: (a) Optical flow based on a short image sequence for an image containing a moving object (dark car) …
The window used in this work was 3x3 pixels. The result of tracking features is shown in Fig. 2.3: the optical flow (the movement, in image space, of the pixels corresponding to an object) of a number of features, tracked through consecutive images and indicated by small arrows in the direction of the flow, is shown. This approach has two important benefits: it is robust to fairly large pixel displacements, due to the pyramid structure, and it allows the tracking of a sparse set of features to calculate object motion, yielding a faster implementation than a dense optical flow calculation. Bi-linear interpolation of image values at non-integer pixel locations is used to allow sub-pixel tracking accuracy. The tracker will exit after either a maximum correlation is reached or a maximum number of iterations has occurred for each tracked feature; the precision is limited by how quickly one of these exit criteria is met. The tracking was performed using the OpenCV function cvCalcOpticalFlowPyrLK [9]. The output of this stage is a sparse optical flow field. While this field does not have information about the movement of every pixel, it does give a good overview of the motion of the different objects between frames.
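A minimal sketch of this tracking step, again using the modern OpenCV Python binding in place of cvCalcOpticalFlowPyrLK: the five-level pyramid (maxLevel=4) and 3x3 window follow the text, while the specific iteration count and epsilon used in the exit criteria are assumptions.

    import cv2
    import numpy as np

    def track_features(prev_gray, next_gray, prev_pts):
        # Pyramidal Lucas-Kanade: maxLevel=4 yields five pyramid levels, each
        # half the size of the level above; SSD is minimized over a 3x3 window
        # from the coarsest level up to the original image.
        next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, next_gray,
            prev_pts.astype(np.float32).reshape(-1, 1, 2), None,
            winSize=(3, 3), maxLevel=4,
            criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
        ok = status.ravel() == 1
        # Return matched (previous, next) positions for successfully tracked features.
        return prev_pts.reshape(-1, 2)[ok], next_pts.reshape(-1, 2)[ok]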
Figure 2.4: Changes in texture and color appearance with distance.
2.3 Flow Caching and Traceback
Recall that reverse optical flow is the use of stored optical flow vectors between each video frame and its preceding frame to establish correspondences between pixels corresponding to objects in the current frame and their locations in frames from the past. This approach uses optical flow information to track features on objects from the time they appear on screen until they interact with the local sensors of the robot. Classification and segmentation algorithms can then be trained using the appearance of these features at large distances from the robot. The approach is motivated by the example shown in Fig. 2.4. Where traditional monocular image segmentation approaches use the visual characteristics of the tree at short range, shown in the inset on the right side of the figure after a short-range sensor interaction, the proposed
Figure 2.5: Changes in specular illumination with distance. Specular illumination depends on the angle of incidence at point P, which differs between robot positions a and b.
approach uses the characteristics of the tree at a much greater distance, shown in the inset on the left side of the figure. There are several explanations for why the visual characteristics of obstacles differ so greatly at different distances from the robot. These include possible automatic gain control of the on-board camera; periodic changes in the specular component of object illumination, which depends on the viewing angle of the observer with respect to the surface normal of the object, which in turn is dependent on the distance between the observer and the object (see Fig. 2.5); and textures that are not visible at great distances due to camera resolution.
The approach discussed in this chapter uses the standard optical flow procedures outlined in the preceding sections to assemble a history of interframe optical flow in real time as the robot navigates, in order to combat the distance-dependent changes in visual appearance and still extract useful terrain classification information from monocular images. This information is then used to trace features on any object in the current frame back to their positions in a previous frame. This optical flow field is populated in the following manner. First, the optical flow between adjacent video frames is calculated as discussed in the preceding two sections of this chapter. A typical optical flow field captured in this manner is shown overlaid on the original frame from a dataset taken in the Mojave Desert in Fig. 2.6.
The optical flow field for each consecutive pair of video frames is then subdivided and coarsened by dividing the 720x480 image into a 12x8 grid and averaging the flow vectors that fall within each cell. The resulting grid, with a mean vector for each cell, is then stored in a ring buffer (for the work discussed in this thesis a 200-frame buffer was used), a simplified version of which is pictured in Fig. 2.7. A point in the current frame can be traced back to its approximate previous location in any frame in the history buffer, or to the frame where it first entered the robot's field of view, by adding the offset vector corresponding to the grid cell the point falls into to the point's coordinates for each frame of the traceback. The diagram shown in Fig. 2.8 illustrates how this is done for a 200-frame traceback. Zero flow is assumed when an optical flow grid cell is empty. Fig. 2.6 gives an idea of the relative density of the optical flow field.
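The sketch below shows one way this flow cache and traceback could be implemented. The 720x480 image, 12x8 grid, 200-frame ring buffer, and zero-flow fallback for empty cells come from the text; the function names and data layout are assumptions.

    from collections import deque
    import numpy as np

    GRID_W, GRID_H = 12, 8                      # 720x480 image -> 60x60-pixel cells
    CELL_W, CELL_H = 720 // GRID_W, 480 // GRID_H
    flow_history = deque(maxlen=200)            # ring buffer of per-frame flow grids

    def coarsen_flow(p0, p1):
        # Average the sparse flow vectors (p1 - p0) falling into each grid cell;
        # cells containing no features keep a zero mean vector.
        grid = np.zeros((GRID_H, GRID_W, 2))
        count = np.zeros((GRID_H, GRID_W))
        for (x0, y0), (x1, y1) in zip(p0, p1):
            gx = min(int(x0) // CELL_W, GRID_W - 1)
            gy = min(int(y0) // CELL_H, GRID_H - 1)
            grid[gy, gx] += (x1 - x0, y1 - y0)
            count[gy, gx] += 1
        nonzero = count > 0
        grid[nonzero] /= count[nonzero][:, None]
        return grid

    def trace_back(point, n_frames):
        # Walk the buffer backwards, removing the mean flow of the cell the
        # point falls into at each step (the stored offset for that cell).
        x, y = point
        for grid in list(flow_history)[::-1][:n_frames]:
            gx = min(int(x) // CELL_W, GRID_W - 1)
            gy = min(int(y) // CELL_H, GRID_H - 1)
            dx, dy = grid[gy, gx]
            x, y = x - dx, y - dy
        return x, y

In use, each new frame would append coarsen_flow(p0, p1) for its tracked feature pairs, and trace_back((x, y), 200) would return a pixel's approximate position 200 frames earlier.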
Results of applying reverse optical flow to video taken from a mobile platform are shown in Fig. 2.11, Fig. 2.12, and Fig. 2.13.
Figure 2.8: Operations for tracing the location of a feature backwards in time. (Flow chart: given point p in frame n, the current frame, determine the cached optical flow vector between frames n and n−1, apply the flow to find the location of p in frame n−1, and repeat until the desired earlier frame n−t is reached; return the location of p in frame n−t.)
Variations in texture of different areas of the roadway have slightly degraded the quality of the traceback: all the points in the current frame lie along a single line, but the calculated origin points in the frame 200 frames in the past are scattered slightly vertically. Even this level of accuracy in the traceback can be helpful in learning algorithms, as will be discussed in Chapters 4 and 5.
Recent work by Brox and Malik focuses on incorporating local descriptors, such as SIFT, into the
Figure 2.11: Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
Figure 2.12: Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
variational optical flow model by treating it as an optimization problem [10]. This yields a method capable of dealing robustly with large displacements. Meanwhile, Zang et al. have done work focusing on robust optical flow calculation in the face of brightness variations [12]. That work focuses on replacing the brightness constancy assumption …
Figure 2.13: Indoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
Figure 2.14: (a) Points on desert roadway selected in initial video frame (b) Corresponding points 200 frames in the past.
Other recent work has improved the robustness of dense optical flow by applying RANSAC (RANdom SAmple Consensus)-like techniques to remove outliers [1]. The question of what constitutes a good feature for the purposes of tracking is an open one. Though the approach proposed here uses the simple corner features suggested by Shi and Tomasi, there are many alternatives. These include methods for adaptively selecting features while tracking: Collins et al. select features based on how well they separate sample distributions drawn from the presumed background and foreground [15], while Chen et al. similarly pair adaptive selection of color features
with a particle filter for tracking, to maximize color histogram differences between the background and foreground [16].

Other approaches, such as that proposed by Neumann and You, seek to combine two complementary approaches: using Expectation-Maximization, these are combined to produce a robust tracker [18]. Dorini and Goldenstein's work on unscented feature tracking represents the uncertainty about the location of features using Gaussian Random Variables, and then uses this to improve the performance of the Kanade-Lucas-Tomasi feature tracker [19]. Takada and Sugaya improve the robustness of feature tracking by detecting incorrect feature tracks, imposing an affine constraint on feature trajectories [20]. Finally, the work of Ta et al. increases the efficiency of feature tracking by searching for matches within a neighborhood in a 3D image …

The term reverse optical flow has been used before with other meanings. It has referred to the flow between a video frame and the one preceding it in time, as part of the process of using previously computed disparity maps to improve the quality and speed of disparity map calculation for dynamic stereo [25]. In Benoit's dissertation, the term is used to refer to using the flow between a video frame and both the frame preceding it and the frame following it to accumulate scene information in a way that is robust to temporary occlusions [26]. The definition proposed in this chapter differs from these in that flow
is calculated and stored for a much larger number of consecutive frames, allowing operations between frames significantly separated in time.
…a moving platform. The thrust of the research discussed in this chapter is the acquisition of activity-based models, which are models that characterize places based on the type of motion activities that occur. For example, the activities found on roads differ from those found on sidewalks. Even among roads, motion characteristics vary significantly. Such models facilitate the tracking of individual moving objects, as will be shown in this chapter. This chapter therefore addresses the acquisition of activity-based ground models.
Figure 3.1: The Stanford Helicopter is based on a Bergen Industrial Twin platform and is outfitted with instrumentation for autonomous flight (IMU, GPS, magnetometer, PC104). In the experiments reported here the onboard laser was replaced with a color camera.
…are then applied to identify multiple moving objects on the ground reliably. The resulting tracks from the particle filters are fed into a histogram that characterizes the probability distribution over speeds and orientations of motions on the ground. This probability histogram constitutes the learned activity map. To illustrate the utility of the activity map, two applications are examined in this chapter: an improved particle filter tracker, and an application to the problem of global image registration.
3.1 Learning Activity Maps from a Moving Platform
3.1.1 Feature Selection and Feature Tracking
To determine which pixels in a given video frame potentially correspond to a moving object, the feature selection and feature tracking methods described in Sections 2.1 and 2.2 are applied.
Figure 3.2: (a) Optical flow based on a short image sequence, for an image containing a moving object (dark car). (b) The "corrected" flow after compensating for the estimated platform motion, which itself is obtained from the image flow. The reader will notice that this flow is significantly higher for the moving car. These images were acquired with the Stanford helicopter.
3.1.2 Identifying Moving Objects on the Ground
The principal difficulty in interpreting the optical flow to identify moving objects arises from the fact that most of the flow is caused by the platform's ego-motion. The flow shown in Fig. 3.2a is largely due to the helicopter's own motion; the only exception is the flow associated with the dark vehicle in the scene.
The proposed approach uses an Expectation-Maximization (EM) algorithm to identify the nature of the flow. Let $\{(x_i, y_i, u_i, v_i)\}$ be the set of features returned by Lucas-Kanade, where $(x_i, y_i)$ corresponds to the image coordinates of feature $i$ in one frame, and $(u_i, v_i)$ corresponds to the image coordinates of that feature in the next frame. The displacement between these two sets of coordinates is proportional to the velocity of a feature relative to the camera plane (but not to the ground). The probability that $(x_i, y_i, u_i, v_i)$ corresponds to a moving object on the ground is now estimated with EM. The ego-motion of the platform is modeled as an affine image plane transformation that, applied to the coordinates $(x_i, y_i)$ of a static feature, determines its position $(\hat{u}_i, \hat{v}_i)$ in the subsequent frame:

$$\begin{pmatrix} \hat{u}_i \\ \hat{v}_i \end{pmatrix} = a \begin{pmatrix} x_i \\ y_i \end{pmatrix} + b$$

and the M-step provides the optimal affine parameters $a$ and $b$.
The key to the identification of moving features is now the E-step: based on the estimated image plane transformation, the expectation of the binary variable $c_i$ (which indicates that feature $i$ belongs to a moving object) is

$$E[c_i] = \eta \exp\!\left(\frac{1}{\kappa} \begin{pmatrix} u_i - \hat{u}_i \\ v_i - \hat{v}_i \end{pmatrix}^{\!T} \Sigma^{-1} \begin{pmatrix} u_i - \hat{u}_i \\ v_i - \hat{v}_i \end{pmatrix}\right)$$

where $\eta$ is a normalization factor added to ensure that the probabilities sum to 1, and $\kappa$ is a constant, chosen empirically ($\exp(2)$ in this case). The matrix $\Sigma$ is a diagonal matrix of size 2-by-2, containing variances for the x and y components; in this particular case it was the 2x2 identity matrix. The subsequent M-step re-estimates the affine parameters, weighting each feature by the probability $1 - E[c_i]$ that it is static, and the procedure iterates.
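A compact sketch of this EM loop is given below. The identity Σ and the constant exp(2) come from the text; because the exact E-step normalization above is reconstructed from a garbled original, the bounded soft assignment used here is an assumption, as are all names.

    import numpy as np

    def em_moving_features(p0, p1, n_iters=10, kappa=np.exp(2)):
        # p0: (N, 2) feature positions; p1: (N, 2) positions one frame later.
        n = len(p0)
        q = np.zeros(n)                             # E[c_i]: probability feature i moves
        design = np.hstack([p0, np.ones((n, 1))])   # affine model: [x y 1] @ theta
        for _ in range(n_iters):
            # M-step: least-squares affine fit, down-weighting likely movers.
            w = np.sqrt(1.0 - q)[:, None]
            theta, *_ = np.linalg.lstsq(design * w, p1 * w, rcond=None)
            # E-step: squared residual against the ego-motion prediction (Sigma = I);
            # a large residual means the feature is poorly explained by ego-motion.
            resid = ((p1 - design @ theta) ** 2).sum(axis=1)
            q = 1.0 - np.exp(-resid / kappa)
        return q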
3.1.3 Tracking Moving Objects with Particle Filters
Unfortunately, the data returned by the EM analysis are still too noisy for constructing activity-based maps. The affine model assumes an orthographic projection and is therefore, in general, insufficient to model all possible platform motions. In addition, some features appear to have a high probability of motion simply because of association error in the Lucas-Kanade algorithm (if the interframe tracking fails), leading to high activity in areas where the affine assumption breaks down or Lucas-Kanade errs.

To improve the quality of the tracking, the proposed approach employs multiple particle filters. This approach is capable of tracking a variable number of moving objects, spawning an individual particle filter for each such object. Particle filters in particular provide a way to model a multimodal distribution. Let $(s_k^{(m)}, g_k^{(m)})$ be the m-th particle in the k-th particle filter (corresponding to the k-th moving object), where $s$ (throughout this chapter) will refer to a feature's coordinates and $g$ to its velocity.
The prediction step for this particle assumes Brownian motion:

$$s_k^{(m)} \leftarrow s_k^{(m)} + g_k^{(m)}, \qquad g_k^{(m)} \leftarrow g_k^{(m)} + S$$

where $S$ is a random vector modeling the random changes in vehicle velocity. In this work it was a two-dimensional uniform random vector with zero mean and a range of 30 pixels in each dimension, though a Gaussian random variable might arguably be a better choice. The importance weights are set according to the motion extracted in the previous step. Specifically,

$$w_k^{(m)} = \eta \sum_i E[c_i]\, \exp\!\left(-\lVert s_k^{(m)} - (u_i, v_i)^T \rVert^2 / \sigma^2\right)$$
New particle filters are started if, at the border of the camera's field of view, a large number of features with a high probability of motion exist that are not associated with any of the existing particle filters. This operation uses mean-shift operators to find peaks in the distribution of such features, and spawns new particle filters when no existing filters are within a specified distance (40 pixels in this implementation) of each peak in the image plane. Particle filters are discontinued when particle tracks leave the image or when the total sum of all importance weights drops below a user-defined threshold (40, in this case). To help distinguish slowly moving objects from the background and increase the disparity between ego-motion and object motion, the full calculation is performed once every six frames. Particle filter position information for interleaved frames is interpolated.
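A minimal per-object particle filter consistent with this description is sketched below. The Brownian prediction with a 30-pixel uniform velocity range and the weighting by nearby moving-feature probability follow the text; σ, the systematic resampling scheme, and all names are assumptions.

    import numpy as np

    def predict(particles):
        # particles: (M, 4) rows of [x, y, vx, vy]; Brownian motion model.
        particles[:, :2] += particles[:, 2:]
        # Zero-mean uniform velocity noise with a 30-pixel range per dimension.
        particles[:, 2:] += np.random.uniform(-15.0, 15.0, particles[:, 2:].shape)

    def reweight(particles, feat_pos, feat_q, sigma=20.0):
        # Weight each particle by the moving-feature probability mass nearby.
        d2 = ((particles[:, None, :2] - feat_pos[None, :, :]) ** 2).sum(axis=2)
        w = (feat_q[None, :] * np.exp(-d2 / sigma ** 2)).sum(axis=1)
        return w / w.sum()

    def resample(particles, w):
        # Systematic resampling keeps the particle count fixed.
        positions = (np.arange(len(w)) + np.random.rand()) / len(w)
        idx = np.minimum(np.searchsorted(np.cumsum(w), positions), len(w) - 1)
        return particles[idx]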
Figure 3.3: (a) Particle filters tracking moving objects (the large truck in the foreground is not moving). (b) The center of each particle filter in a different frame in the sequence clearly identifies all moving objects.
Figure 3.4: Two moving objects being tracked in video taken from a helicopter as part of a DARPA demo.
Fig. 3.3 shows the result of the particle filter tracking. Fig. 3.3a shows a situation in which three different particle filters have been spawned, each corresponding to a different object. Fig. 3.3b shows the center of each particle filter. In this example all three moving objects are correctly identified (the large truck in the foreground did not move in the image sequence). Fig. 3.4 shows a shot of tracking video taken from the Stanford helicopter during a demo for the Defense Advanced Research Projects Agency (DARPA). The two moving objects in the video have been correctly identified and tracked from overhead.
3.1.4 Learning the Activity-Based Ground Model
The final step of this approach involves the acquisition of the behavior model. For that, the map is anchored using features in the image plane that, with high likelihood, are not moving. In this way, the activity map refers to a projection of a patch of ground into the camera plane, even when that patch of ground is not presently observable by the camera. This ground plane projection remains static with respect to the ground and does not shift to relative locations in the camera image.
The activity map is then calculated by histogramming the various types of motion observed at different locations. More specifically, the approach learns a 4-dimensional histogram over (x, y, v, θ), indexed over x, y locations in the projection of the ground in the camera plane and the velocity of the objects observed at these locations, represented by a velocity magnitude v and an orientation of object motion θ (30 bins are used for speed, 36 for orientation). Specifically, each time the k-th particle filter's state is updated, the histogram cell corresponding to its position, speed, and orientation is incremented.
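The accumulation of this histogram from tracked states could look like the sketch below; the 12x8 ground grid and the 30 speed and 36 orientation bins come from the text, while the speed cap used for binning and the names are assumptions.

    import numpy as np

    GRID_W, GRID_H = 12, 8                  # ground grid, as in the flow cache
    CELL_W, CELL_H = 60, 60
    N_SPEED, N_THETA = 30, 36               # bin counts from the text
    MAX_SPEED = 60.0                        # assumed cap (pixels/frame) for binning
    activity = np.zeros((GRID_H, GRID_W, N_SPEED, N_THETA))

    def update_activity_map(x, y, vx, vy):
        # Histogram one tracked state: grid cell, speed magnitude, heading.
        gx = min(int(x) // CELL_W, GRID_W - 1)
        gy = min(int(y) // CELL_H, GRID_H - 1)
        speed = min(np.hypot(vx, vy), MAX_SPEED - 1e-6)
        theta = np.arctan2(vy, vx) % (2 * np.pi)
        sbin = int(speed / MAX_SPEED * N_SPEED)
        tbin = int(theta / (2 * np.pi) * N_THETA)
        activity[gy, gx, sbin, tbin] += 1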
Fig. 3.5 shows the result of the learning step. Shown here is an activity map overlaid with one of the images acquired during tracking. Blue arrows correspond to the most likely motion modes for each grid cell in the projection of the ground in the camera plane; if no motion has been observed in a cell, no arrow is displayed. Further, the length of each arrow indicates the most likely orientation and velocity at each location, and its thickness corresponds to the frequency of motion. As this figure shows, the activity models acquired by this approach are informative of the motions that occur on the ground.