A Simulated Autonomous Car
Iain David Graham Macdonald
Master of Science School of Informatics University of Edinburgh
2011
Abstract
This dissertation describes a simulated autonomous car capable of driving on urban-style roads. The system is built around TORCS, an open source racing car simulator. Two real-time solutions are implemented: a reactive prototype using a neural network, and a more complex deliberative approach using a sense, plan, act architecture. The deliberative system uses vision data fused with simulated laser range data to reliably detect road markings. The detected road markings are then used to plan a parabolic path and compute a safe speed for the vehicle. The vehicle uses a simulated global positioning/inertial measurement sensor to guide it along the desired path, with the throttle, brakes, and steering being controlled using proportional controllers. The vehicle is able to reliably navigate the test track, maintaining a safe road position at speeds of up to 40km/h.
Acknowledgements
I would like to thank all of the lecturers who have taught me over the past year, each of whom contributed to this thesis in some way. Particular thanks must go to my supervisor, Prof Barbara Webb, for agreeing to supervise this project and for her advice and encouragement throughout, and to Prof Bob Fisher for his many useful suggestions.
Declaration
I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.
(Iain David Graham Macdonald)
Table of Contents
Chapter 1 Introduction
1.1 Purpose
1.2 Motivation
1.3 Objectives
1.4 Dissertation Outline
Chapter 2 Background
2.1 Introduction
2.1.1 Motivation
2.1.2 A Brief History of Autonomous Vehicles
2.2 The Urban Challenge
2.2.1 The Challenge
2.3 Boss
2.3.1 Route Planning
2.3.2 Intersection Handling
2.4 Junior
2.4.1 Localisation
2.4.2 Obstacle Detection
2.5 Odin
2.5.1 Path Planning
2.5.2 Architecture
2.6 Discussion
2.7 Conclusion
Chapter 3 Simulation System
3.1 Architecture
3.2 Track Selection
3.3 Car Selection
Chapter 4 Reactive Prototype
4.1.1 Image Processing
4.1.2 Training
4.1.3 Results
4.1.4 Evaluation
Chapter 5 Deliberative Approach
5.1 Summary
5.2 Ground Truth Data
5.3 Sensing
5.3.1 Some Initial Experiments
5.3.2 The MIT Approach
5.3.3 Road Geometry Modelling
5.3.4 Lane Marking Verification
5.3.5 Lane Marking Classification
5.4 Planning
5.4.1 Trajectory Calculation
5.4.2 Speed Selection
5.5 Acting
5.5.1 Speed Control
5.5.2 Steering Control
Chapter 6 Evaluation
6.1 Lane Marking Detection and Classification
6.2 Trajectory Planning
6.2.1 Generation of Trajectory Points
6.2.2 Flat Ground Assumption
6.2.3 Non-continuous Path
6.2.4 Look-ahead Distance
6.3 Physical Performance
6.3.1 Path Following
6.3.2 Speed in Bends
6.3.3 G Force Analysis
6.3.4 Maximum Speed
6.4 Real-time Performance
Chapter 7 Conclusion
7.1 Summary
7.2 Future Work and Conclusion
Bibliography
Chapter 1 Introduction
"The weak point of the modern car is the squidgy organic bit behind the
wheel." Jeremy Clarkson
1.1 Purpose
This dissertation describes a simulated autonomous car capable of navigating on urban-style roads at variable speeds whilst staying in-lane. The simulation uses TORCS, an open source racing car simulator which is known for its accurate vehicle dynamics [23]. Two solutions to the problem are provided: firstly, a reactive approach using a neural network and, secondly, a deliberative approach inspired by the recent DARPA Urban Challenge, with separate sensing, planning, and control stages. In particular, the latter system fuses vision and simulated laser range data to reliably detect road markings to guide the vehicle.
1.2 Motivation
The recent DARPA Grand Challenges and the work of teams such as Tartan Racing [8] and Stanford Racing [9] have demonstrated that the goal of fully autonomous vehicles may be within reach. The development of such vehicles has the potential to save many of the thousands of lives that are lost in collisions every year [1].
Due to the need for autonomous vehicles to interact in an environment populated by human drivers and pedestrians, there is a clear safety risk in the development process. This risk can act as a barrier to development, as any autonomous vehicle must prove a sufficient level of safety before it is able to enter such an environment. One approach to this problem, as seen in the DARPA challenges, is to create controlled environments that are representative of the intended operating environment. Whilst the benefits of this approach are clear, the logistics and the expense of organising these events mean that they are likely to remain rare.
Another approach is to simulate the environment, thereby eliminating the safety risk and reducing costs. Simulations have the additional benefit of being able to test performance under unusual circumstances, and allow algorithms to be optimised.
1.3 Objectives
The goal of this project was to develop a simulated autonomous vehicle capable of driving on urban-style roads. The vehicle must be capable of navigating around a test track in a safe and controlled manner. Specifically, the vehicle must remain in the correct lane and drive at an appropriate speed. Although the environment is simulated, the intention is to approach the project as though the vehicle is real. With that in mind, the vehicle should only make use of information that would be available in the real world, and the system must run in real-time. The project looks to the recent DARPA Urban Challenge for inspiration.
1.4 Dissertation Outline
The remainder of this document is structured as follows: Chapter 2 provides a brief history of autonomous vehicle research and discusses some of the techniques used by state of the art vehicles. Chapter 3 describes the simulator and system architecture. Chapter 4 describes a sub-project which was undertaken to establish the feasibility of the main project. Here, a neural network is used to control a simulated autonomous vehicle. Chapter 5 forms the main body of the dissertation and describes a deliberative approach using image processing, data fusion, planning, and control techniques to solve the problem. Chapter 6 provides the results of experiments performed on the completed system as an evaluation. Finally, Chapter 7 offers a summary of the work undertaken, conclusions, and suggestions for future work.
Chapter 2 Background
2.1 Introduction
This chapter examines the current state of the art in driverless cars. It focuses on the 2007 DARPA Urban Challenge, a competition held to promote research in the field and the main inspiration behind this project. The motivation behind autonomous vehicles is discussed, followed by a retrospective that places the Urban Challenge in context. The challenges posed by the Urban Challenge are described and the vehicles which finished in the top three positions, Boss, Junior, and Odin, are examined. As this project aims to build a complete autonomous system, different aspects of each vehicle are examined, giving a broad overview of the field. Although some of the techniques described do not relate directly to this project, they have helped shape it and represent the aspirations of the project had more time been available.
2.1.1 Motivation
The car has been a significant force for social change, improving the mobility of the population. Access to this mobility will increase the quality of life for certain groups, such as the elderly and disabled, who cannot drive themselves. For others, being released from time spent behind the wheel will simply allow that time to be put to better use [1].
2.1.2 A Brief History of Autonomous Vehicles
This section provides a history of the development of autonomous vehicles from the 1980s to the present.
In the early 1980s, pioneer Ernst Dickmanns began developing what can be considered the first real robot cars. He developed a vision system which used saccadic camera movements to focus attention on the most relevant visual input. Probabilistic techniques such as extended Kalman filters were used to improve robustness in the presence of noise and uncertainty. By 1987 his vehicle was capable of driving at high speeds, albeit on empty streets [5].
In the late 80s, Dickmanns participated in the European Prometheus project (PROgraMme for a European Traffic of Highest Efficiency and Unprecedented Safety). With an estimated investment of 1 billion dollars in today's money, the Prometheus project laid the foundation for most subsequent work in the field. By the mid-90s, the project produced vehicles capable of driving on highways at speeds of 80km/h in busy traffic [6]. Techniques such as tracking other vehicles, convoy driving, and autonomous passing were developed.
Another pioneer in the field was Dean Pomerleau, who developed ALVINN (Autonomous Land Vehicle in a Neural Network) in the early 90s [7]. ALVINN was notable for its ability to learn to drive on new road types with only a few minutes training from a human driver.
After the successes of the 80s and early 90s, progress seems to have plateaued in the late 90s. It was not until DARPA (Defense Advanced Research Projects Agency) launched the first of its Grand Challenges that interest in the field was renewed. In 2004, DARPA offered a $1 million prize to the first autonomous vehicle capable of negotiating a 150 mile course through the Mojave Desert. For the first time, the vehicles were required to be fully autonomous, with no humans allowed in the vehicle during the competition. By this time, GPS systems were widely available, significantly improving navigational abilities. Despite several high profile teams, and general advances in computing technology, the competition proved to be a disappointment, with the most successful team reaching only 7 miles before stopping. The following year, DARPA re-held the competition. This time the outcome was very different, with five vehicles completing the 132 mile course and all but one of the 23 finalists surpassing the seven miles achieved the previous year. The competition was won by Stanley, the entry from the Stanford Racing Team headed by Sebastian Thrun [12].
Buoyed by this success, DARPA announced a new challenge, to be held in 2007, named the Urban Challenge. This would see the competition move from the desert to an urban environment, with the vehicles having to negotiate junctions and manoeuvre in the presence of both autonomous and human-driven vehicles.
2.2 The Urban Challenge
This section describes the Urban Challenge and the main sub-challenges that were set.
The competition took place in 2007 on a closed air force base in California. All the autonomous vehicles and multiple human-driven vehicles were present on the course at the same time. The environment can, therefore, be considered a good approximation of a genuine urban environment, even though the roads were not open to the public.
2.2.1 The Challenge
Each vehicle was required to complete a mission specified by an ordered series of checkpoints in a complex route network. Each vehicle was expected to be able to negotiate all hazards, including both static and dynamic obstacles, re-plan for alternative routes, and obey California traffic laws at all times [4].
More specifically, each vehicle had to demonstrate the following abilities:
Safe and correct check-and-go behaviour at junctions, when avoiding obstacles, and when performing manoeuvres.
Safe vehicle following at normal speeds and in slow moving queues.
Safe road following, only changing lane when safe and legal to do so.
GPS-free navigation (GPS may be used but is not reliable in urban environments).
Manoeuvres such as parking and u-turns.
Each vehicle is supplied with two files: the Route Network Definition File (RNDF) and the Mission Definition File (MDF). The RNDF specifies the layout of the road network and is common to all teams. It specifies accessible road segments, lane widths, and stop sign locations. The MDF contains the specific mission that the vehicle must accomplish, with each vehicle having a unique but equivalent mission.
2.3 Boss
Boss was developed by Carnegie Mellon University and finished in 1st place [8]. This section describes the route planning and intersection handling techniques used by Boss. The intersection handling described below would have been of relevance to the project had time been available to add overtaking functionality.
2.3.1 Route Planning
The RNDF is converted to a connected graph with directional edges representing drivable lanes. Each edge is assigned a weight that represents the cost of driving the corresponding road segment. The cost is calculated using the length and speed limit of the segment, as well as a term that represents the complexity or difficulty of the terrain for Boss to negotiate. Graph search techniques are then used to plan a path from the current location to a goal location.
As Boss navigates the chosen path, new information may become available that requires the costs of road segments to be modified. For example, Boss maintains a map of obstacles it believes to be static. If a static obstacle is determined to entirely block the road, it is necessary to find an alternative route. To do this, Boss significantly increases the cost associated with the road segment and re-calculates a new route to the goal. The increased cost is sufficient to cause an alternative route to be selected. However, it is not desirable to permanently avoid the blocked road and so the cost is exponentially reduced over time to its original value.
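As a concrete illustration, the following Python sketch shows one way such a cost model and search could be implemented. The decay constant, cost terms, and function names are illustrative assumptions rather than details taken from the Boss papers.

import heapq
import math

def edge_cost(length_m, speed_limit_ms, complexity=1.0,
              blockage_penalty=0.0, seconds_since_blockage=0.0,
              decay_rate=0.01):
    """Travel-time cost for one road segment (illustrative).

    A blockage adds a large penalty that decays exponentially back
    towards zero, so the road is eventually considered again.
    """
    base = (length_m / speed_limit_ms) * complexity
    decayed = blockage_penalty * math.exp(-decay_rate * seconds_since_blockage)
    return base + decayed

def plan_route(graph, start, goal):
    """Dijkstra search over a {node: [(neighbour, cost), ...]} graph."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(frontier, (cost + edge, neighbour, path + [neighbour]))
    return math.inf, None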
Figure 2-1 Boss at the Urban Challenge
2.3.2 Intersection Handling
A crucial requirement is that the vehicle is capable of negotiating intersections safely and observing correct precedence. Precedence becomes important when the intersection contains more than one stop line (4-way stops are common in the US). The order of precedence is determined by the order in which the vehicles arrive at their respective stop lines. Boss estimates precedence by defining a precedence polygon that starts around three metres prior to the stop line. A vehicle is considered to be in the polygon if its front bumper (or part of it) is within the polygon. The time at which the vehicle is detected as being in the polygon is used to estimate precedence. As vehicles with higher precedence leave their polygons, Boss moves up the precedence order until it determines it has precedence.
Whilst this approach seems straightforward, care must be taken when determining the size of the precedence polygon. Increasing the size of the polygon improves the robustness of the algorithm but risks that two vehicles may be detected as one.
The idea of the precedence polygon is extended to apply to situations where Boss must merge with moving traffic. In this case, yield polygons are calculated based on the time it would take Boss to execute the manoeuvre and the safe inter-vehicle gap. For example, if Boss wishes to cross a lane of traffic coming from the left to join traffic coming from the right, the following times would be considered (a feasibility check based on these times is sketched below):
The time to cross the lane, T_action
The time to accelerate to the appropriate speed for the desired lane, T_accelerate
The minimum safe time gap between vehicles, T_spacing
These times are used to determine the size and location of the yield polygon for both the crossed lane and the destination lane. Any vehicle within a polygon has its velocity tracked and the time at which it will cross Boss's desired path is estimated. Using the estimates for each lane, Boss is able to determine if there is sufficient time to perform the manoeuvre.
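A minimal sketch of this feasibility test, assuming simple constant-velocity arrival estimates for the tracked vehicles (all names and thresholds here are illustrative):

def merge_is_safe(t_action, t_accelerate, t_spacing,
                  crossing_lane_arrivals, destination_lane_arrivals):
    """Return True if Boss has time to cross one lane and join another.

    *_arrivals are estimated times (seconds from now) at which tracked
    vehicles will reach Boss's desired path.
    """
    # Vehicles in the crossed lane must not arrive before we have cleared it.
    needed_to_clear = t_action + t_spacing
    if any(t < needed_to_clear for t in crossing_lane_arrivals):
        return False
    # Vehicles in the destination lane must not arrive before we are up to speed.
    needed_to_join = t_action + t_accelerate + t_spacing
    if any(t < needed_to_join for t in destination_lane_arrivals):
        return False
    return True

# Example: crossing takes 3s, accelerating 4s, with a 2s safety gap.
print(merge_is_safe(3.0, 4.0, 2.0, [8.0], [12.0]))  # True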
2.4 Junior
Junior was developed by Stanford University and finished in 2nd place. This section describes Junior's use of LIDAR and GPS/IMU. See sections 5.3 and 5.5 for how these sensor types are used by this project.
Figure 2-2 Junior at the Urban Challenge and the Velodyne HDL64 LIDAR used by several teams
2.4.1 Localisation
As GPS signals are carried by microwaves, they are absorbed by water, leading to reception problems in bad weather and under foliage. Tall buildings reduce the visible area of the sky and, therefore, the choice of satellites, limiting accuracy. Furthermore, GPS does not provide a means of directly determining the vehicle's orientation. For these reasons, the GPS unit is combined with an inertial measurement unit (IMU), which uses gyroscopes and accelerometers to estimate the velocity and acceleration of the vehicle [2].
Whilst combined GPS/IMU systems can provide sub-metre accuracy, they are still insufficient for safe road following. They provide a pose estimate that is the most probable at the current time and are prone to position jumps. In addition, the data in the RNDF is also GPS based and cannot be guaranteed to be accurate. It is, therefore, necessary to use additional localisation techniques.
Junior uses kerb locations and road markings to localise accurately relative to the RNDF. The detection of kerb locations is described below. Front and side mounted lasers that are angled down are used to measure the infra-red reflectivity of the road. Lane markings can be extracted from this data and compared with lane data in the RNDF.
This fine-grained localisation is used to maintain an internal co-ordinate system that is robust to position jumps.
2.4.2 Obstacle Detection
Five of the six teams that completed the Urban Challenge used a high-definition LIDAR system as their primary sensor. As the name suggests, LIDAR is similar to RADAR, but pulses of laser light are used rather than radio waves. Both Boss and Junior used a system manufactured by Velodyne Inc that was developed for the original Grand Challenge. This roof-mounted system comprises a rotating unit containing 64 separate lasers. Each of the lasers is fixed at a different pitch and therefore scans a different portion of the environment. The result is a highly detailed 3-dimensional map of the environment that can be used to detect kerb-sized objects at 100m [3].
The LIDAR produces a detailed map of the environment in the form of a point-cloud. For obstacle detection, this data must be processed and features of interest extracted. One method of doing this would be to identify points that are the same distance and direction from the vehicle but have different heights. However, the Stanford team found that whilst such a method was suitable for detecting objects with large heights, such as cars and pedestrians, it was not suitable for smaller objects such as kerbs. The problem was setting a threshold that would allow kerbs to be detected without producing a large number of false-positives.
To combat this, Junior uses a novel approach. If the vehicle is on flat ground, each of the LIDAR lasers will scan a circle of known radius around the vehicle. The scans, therefore, generate a series of concentric circles, with each circle a fixed distance apart. On ground that is not flat, the distances between the circles are distorted, much like the contours on a map. By comparing the distances between the contours with the expected value, small objects can be detected with greater sensitivity than using vertical measurements.
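A simplified sketch of this idea, for a single laser ring on flat ground (the geometry and the tolerance value are illustrative assumptions, not Junior's actual parameters):

import math

def expected_ring_radius(sensor_height_m, beam_pitch_deg):
    """Radius of the circle a downward-angled laser traces on flat ground."""
    return sensor_height_m / math.tan(math.radians(beam_pitch_deg))

def flag_obstacles(measured_radii, expected_radius, tolerance=0.15):
    """Flag bearings where the measured ring radius deviates from the
    flat-ground prediction by more than `tolerance` metres.

    measured_radii: {bearing_deg: radius_m} for one laser ring.
    """
    return [bearing for bearing, radius in measured_radii.items()
            if abs(radius - expected_radius) > tolerance]

# A laser pitched 10 degrees down on a 2m-high mount scans at ~11.3m.
r = expected_ring_radius(2.0, 10.0)
scan = {0: r, 10: r - 0.3, 20: r}  # something kerb-like at bearing 10
print(flag_obstacles(scan, r))     # [10]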
One complication with this approach is that of vehicle roll. As the vehicle turns, it has a tendency to tilt outwards, thus reducing the distance between the contours on one side of the vehicle and increasing them on the other. If not compensated for, this roll can be misinterpreted as genuine variation in the ground. Detected obstacles are represented as particles. As more information becomes available, the particles are filtered, allowing the object to be tracked over time.
2.5 Odin
Odin was developed by Virginia Tech as part of team VictorTango and finished in 3rd place [10]. This section describes the path planning and architecture of Odin. See section 5.4 for details of how this project performs path planning.
2.5.1 Path Planning
The RNDF contains a series of waypoints which define the road network. The distance between the waypoints may vary and it is, therefore, necessary to calculate a smooth path from one point to the next. To do this, Odin uses cubic splines. The same technique is used to generate paths through intersections and in unstructured parking zones.
Using splines guarantees a smooth path between points but does not guarantee that the path accurately matches the road. To combat this problem, the curvature of the splines is manually adjusted using the aerial photographs supplied by DARPA.
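For illustration, a cubic spline path through sparse waypoints can be generated in a few lines; this uses SciPy's generic spline routines and made-up waypoints, not VictorTango's actual implementation.

import numpy as np
from scipy.interpolate import CubicSpline

# Sparse RNDF-style waypoints (x, y) in metres.
waypoints = np.array([[0.0, 0.0], [20.0, 5.0], [40.0, 3.0], [60.0, 12.0]])

# Parameterise by cumulative distance along the waypoints.
deltas = np.diff(waypoints, axis=0)
s = np.concatenate([[0.0], np.cumsum(np.hypot(deltas[:, 0], deltas[:, 1]))])

spline_x = CubicSpline(s, waypoints[:, 0])
spline_y = CubicSpline(s, waypoints[:, 1])

# Sample a smooth path every metre.
samples = np.arange(0.0, s[-1], 1.0)
path = np.column_stack([spline_x(samples), spline_y(samples)])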
2.5.2 Architecture
Odin implements a hybrid deliberative-reactive architecture [13]. Such architectures combine the benefits of high-level deliberative planning with low-level reactive simplicity. However, increases in computing power have allowed Odin to add a further deliberative layer to handle low-level motion planning. The reactive driving behaviours are, therefore, sandwiched in a deliberative-reactive-deliberative progression [11].
Figure 2-3 Odin at the Urban Challenge
2.5.2.1 Route Planning
The top-level deliberative component is responsible for route planning. It is invoked on demand when a mission is first loaded or when an existing route is found to be blocked. As with Boss, the road network is searched, using A* graph search, with the aim of finding the route with the shortest time. The time for a route is based on the speed limits and distances, with additional fixed penalties for manoeuvres such as u-turns.
2.5.2.2 Driving Behaviours
The reactive layer comprises a set of independent driving behaviours. Each is dedicated to a specific driving task, such as passing another vehicle or merging with moving traffic. However, not all driving behaviours are applicable all of the time and, therefore, a sub-set is selected based on the current driving context. For example, on a normal section of road the route driver, passing driver, and the blockage driver are applicable, whereas at a junction the precedence, merge, and turn drivers are used. The driving context therefore acts as an arbiter that activates multiple behaviours.
Route Driver - Assumes no other traffic
Passing Driver - Pass other vehicles
Blockage Driver - React to blocked roads
Precedence Driver - Stop sign precedence
Merge Driver - Enters or crosses moving traffic
Left Turn Driver - Yields when turning left across traffic
Zone Driver - Re-route when stuck
The arbiter and each of the driving behaviours are implemented as finite state machines. These are arranged in a hierarchy with the arbiter as the root. The structure of the hierarchy represents a top-down task decomposition rather than any idea of behavioural priority.
As the arbiter is able to select multiple, potentially competing behaviours, an additional mechanism is required to select which commands the vehicle. For this, a form of command fusion is used which allows each behaviour to specify an urgency
parameter. This parameter indicates how strongly the behaviour feels that it should be selected.
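A toy sketch of urgency-based command fusion follows; the behaviour interface and the winner-takes-all rule are assumptions for illustration, since the published description does not specify the exact fusion rule.

from dataclasses import dataclass

@dataclass
class BehaviourOutput:
    name: str
    speed: float       # desired speed, m/s
    steering: float    # desired steering, -1.0 .. 1.0
    urgency: float     # how strongly this behaviour wants control, 0 .. 1

def fuse(commands):
    """Select the command from the most urgent active behaviour."""
    return max(commands, key=lambda c: c.urgency)

active = [
    BehaviourOutput("route", speed=8.0, steering=0.05, urgency=0.3),
    BehaviourOutput("blockage", speed=0.0, steering=0.0, urgency=0.9),
]
print(fuse(active).name)  # "blockage" wins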
2.5.2.3 Low-level Planning and Vehicle Control
The bottom, low-level deliberative layer is concerned with motion planning. Its purpose is to determine a speed and trajectory that will keep Odin in the desired lane whilst avoiding obstacles, or to perform manoeuvres such as parking.
Once the desired path and speed are established, the vehicle needs to be commanded appropriately. To do this, the vehicle's dynamics are modelled using a bicycle model. This simplifies modelling by compressing four wheels into two and has proved to be sufficient for the low speeds experienced in the Urban Challenge [25]. A generic kinematic form of this model is sketched at the end of this section.
The base vehicle chosen for Odin was a hybrid-electric Ford Escape, which has the advantage of an existing built-in drive-by-wire system. Sending the appropriate commands to this system allows the steering, throttle, and gear change to be easily controlled. An additional advantage of hybrid vehicles is that they have sophisticated power generation systems, making it easy to power the computers and sensors.
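The sketch below is a standard textbook kinematic bicycle model, not Odin's actual controller; the wheelbase and time step are placeholder values.

import math

def bicycle_step(x, y, heading, speed, steer_angle, wheelbase, dt):
    """Advance a kinematic bicycle model by one time step.

    x, y: rear-axle position (m); heading, steer_angle: radians;
    speed: m/s; wheelbase: distance between axles (m); dt: seconds.
    """
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    heading += (speed / wheelbase) * math.tan(steer_angle) * dt
    return x, y, heading

# Drive at 5 m/s with a constant 0.1 rad steering angle for one second.
state = (0.0, 0.0, 0.0)
for _ in range(50):
    state = bicycle_step(*state, speed=5.0, steer_angle=0.1, wheelbase=2.7, dt=0.02)
print(state)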
2.6 Discussion
The Urban Challenge has stimulated huge interest in the field of autonomous vehicles, but how realistic a challenge did it represent? Having multiple autonomous vehicles interacting with each other and human-driven vehicles on the scale seen in the Urban Challenge certainly presents a degree of realism not seen before. However, there is a clear gap between the competition format and reality. The challenge did not require vehicles to perceive road signs or traffic lights. Nor were vulnerable road users, such as motorcycles or pedestrians, encountered. These are active areas of research but they were notable by their absence.
The RNDF, together with aerial photography, presented the teams with a rich description of the environment. Despite this, manual modifications were required to ensure correct operation. It is unrealistic to expect this level of detail to be available or maintained on a global basis.
The vehicles must perform well at many different tasks in order to succeed. Some of the problems can be considered solved, whilst others show a trend towards a particular solution. For example, high-level route planning, an important aspect of the challenge, posed little problem, with standard techniques such as A* being used successfully. Likewise, the low speeds encountered in urban driving pose little problem in terms of vehicle control. An autonomous vehicle developed by Stanford, named Shelley, recently competed in an off-road hill climb race, demonstrating the state-of-the-art in vehicle dynamics.
Perception is perhaps the area of most interest in the Urban Challenge. There is a clear trend towards direct sensing technology such as LIDAR and away from vision. This trend is likely to continue, but many vision techniques will remain applicable to images generated by laser sensors.
There is no doubt that the availability of combined GPS and IMU technology has been crucial to the field but, despite these advances, localisation still proves to be a serious problem. In qualifying, Odin experienced a signal jump that caused it to misjudge its position by 10m. A similar, though less severe, problem occurred in the final event. Another competitor, Knight Rider, failed to complete the challenge due to a localisation failure [14]. Accurate localisation is crucial; even an error of 1m could be catastrophic.
2.7 Conclusion
The Urban Challenge and, indeed, the preceding Grand Challenges have been a powerful driving force in the development of autonomous vehicles. Together they mark a significant milestone towards the goal. Progress has, in large part, been due to
Figure 2-4 Shelley, an autonomous Audi TT developed by Stanford
advances in GPS and LIDAR technologies, but limitations still remain. Further improvements in these technologies are required. Solving the technical challenges seems inevitable, but other challenges, such as questions of liability and how to adequately prove safety, lie ahead.
Chapter 3 Simulation System
3.1 Architecture
The Open-source Racing Car Simulator (TORCS) is a racing simulator that has a reputation for having an accurate physics engine [23]. It was developed with the artificial intelligence community in mind, being used as a platform for the development of computer-controlled opponents in racing games. It has also been used as the base platform for the annual WCCI racing challenge [15].
TORCS is implemented in C++ and has an API [29] that provides physical vehicle parameters such as speed, acceleration, wheel rotation rates, and so on, that can be used as sensors (Table 3-1). This information is provided in real-time and is updated at 50Hz. The API also provides a means of commanding the vehicle via the variables shown in Table 3-2. Commanding the vehicle via this interface is analogous to the use of vehicles with built-in drive-by-wire interfaces, such as Odin in the Urban Challenge.
Figure 3-1 Top-level system architecture
The objective of this project is to implement the artificial intelligence vehicle controller (AIC) depicted in Figure 3-1. This controller must be capable of running in real-time. Data is transferred between TORCS and the AIC using sockets, allowing the AIC to run on a separate PC should performance be an issue.
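A skeletal version of such a socket-based control loop might look as follows; the port, message format, and field names are invented for illustration and do not reflect the actual protocol used in this project.

import json
import socket

HOST, PORT = "127.0.0.1", 5001  # hypothetical address of the TORCS-side plugin

def compute_command(sensors):
    # Placeholder controller: hold a target speed with a crude throttle rule.
    throttle = 0.5 if sensors["speed"] < 11.0 else 0.0
    return {"steer": 0.0, "throttle": throttle, "brake": 0.0, "gear": 1}

def control_loop():
    with socket.create_connection((HOST, PORT)) as conn:
        stream = conn.makefile("rw")
        while True:
            line = stream.readline()          # one sensor packet per 20ms tick
            if not line:
                break
            sensors = json.loads(line)        # e.g. {"speed": 9.7, ...}
            command = compute_command(sensors)
            stream.write(json.dumps(command) + "\n")
            stream.flush()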
Rather than using a camera, input images must be captured directly from the simulator's output window. The Windows API provides a means of enumerating all windows currently in use. From this, it is possible to query the title of each window and, therefore, locate the TORCS window. Once identified, a further Windows API call can be used to perform a fast copy of image data from that window into local process memory.
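The implementation described here uses the Windows API directly from C++; the Python sketch below mirrors the same find-the-window-then-grab idea using the pywin32 and Pillow packages, purely as an illustration.

import win32gui                      # pip install pywin32
from PIL import ImageGrab            # pip install Pillow

def capture_window(title_substring="TORCS"):
    """Grab the screen region of the first window whose title matches."""
    matches = []

    def enum_handler(hwnd, _):
        if title_substring in win32gui.GetWindowText(hwnd):
            matches.append(hwnd)

    win32gui.EnumWindows(enum_handler, None)
    if not matches:
        raise RuntimeError("TORCS window not found")
    left, top, right, bottom = win32gui.GetWindowRect(matches[0])
    return ImageGrab.grab(bbox=(left, top, right, bottom))

frame = capture_window()
print(frame.size)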
Table 3-1 TORCS Data and Potential Application
Wheel rotation rates - Odometry
Vehicle's position in the world (x,y,z) - Global positioning system (GPS)
Vehicle acceleration (x,y,z) - Inertial measurement unit (IMU)
Track geometry - Light detection and ranging (LIDAR)
Table 3-2 Vehicle Command Parameters
Steering angle - Float / -1.0 … 1.0 - -1.0 indicates full left-lock, 0.0 straight ahead, and 1.0 full right-lock
Throttle - Float / 0.0 … 1.0 - 0.0 indicates no throttle, 1.0 indicates full throttle
Brake - Float / 0.0 … 1.0 - 0.0 indicates no brake force, 1.0 indicates full brake force
Gear - Integer / 0 … 5 - 0 indicates neutral, 1…5 indicates desired gear
3.2 Track Selection
Figure 3-2 shows the track selected. It is 2.59km long and there are long gentle bends, sharp hair-pin style bends, and tight bends in opposite directions close together. The figure also shows the lane markings, which consist (for the most part)
of a continuous white border line on the left and right marking the edge of the road, and a dashed white line separating the two driving lanes. In lane detection systems, a common problem is that of shadows and poor quality road markings. The simulator image shown exhibits both these features to some extent.
Figure 3-2 Selected track (left) and example screenshot from camera
3.3 Car Selection
As TORCS is a racing simulator, it provides a selection of cars to choose from, the majority of which are dedicated track racing cars. Most of the competitors in the DARPA Urban Challenge used SUV-style vehicles, and the rules stated that the vehicle must be road-legal and of proven safety record [4]. Of the cars provided by TORCS, the one that best matched these requirements was a Peugeot 406. This car was selected on the grounds that it is a typical saloon style car, common on the roads, and has good low speed handling characteristics due to being front-wheel drive. Figure 3-3 shows an image of the selected vehicle.
Figure 3-3 Peugeot 406
Chapter 4 Reactive Prototype
It was necessary to perform a feasibility study to ensure that it was possible to capture the simulator output, process the data, and send control commands back to the simulator, and to assess whether a typical laptop had sufficient processing power. For the purposes of the prototype, it was important to select a technique that was direct, allowing the main parts of the system to be put together relatively quickly. I chose to base the prototype on Dean Pomerleau's ALVINN [7]. This uses a neural network to directly convert an input image of the road into a steering angle for the vehicle. Thus, the control of the vehicle is directly reactive to the current road scene. The feed-forward network is organised as three layers comprising 800 input nodes (conceptually an image grid), 4 hidden nodes, and 31 output nodes. The output o_j of each node j is a function of its weighted inputs x_i:

o_j = f( Σ_i w_ij x_i )

where f is the node's activation function. The network weights w_ij are trained using the back-propagation algorithm [24]. The general training process is described in more detail below. Figure 4-1 illustrates the network's structure.
Figure 4-1 Neural network structure
The input image is at a lower resolution than typically used for modern lane-tracking systems, and certainly lower than the captured image.
Therefore, the image must be down-sampled, in a process described in section 4.1.1, prior to being passed to the neural network.
Each of the 31 output nodes represents a specific steering angle, with sharp-left corresponding to node 0, straight-ahead corresponding to node 15, and sharp-right to node 30. Each output node returns a value between 0.0 and 1.0, indicating the degree to which the network believes that to be the correct steering angle. The output, therefore, represents a distribution of probable steering angles. This distribution is then converted to a single floating point value by computing its centre of mass and rescaling to the range -1.0…1.0 for compatibility with the TORCS API.
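A small sketch of this conversion, assuming a list of 31 activations (the variable names are illustrative):

def distribution_to_steering(outputs):
    """Convert 31 output-node activations into one steering value.

    Node 0 is sharp-left, node 15 straight ahead, node 30 sharp-right.
    The centre of mass of the activations is rescaled to -1.0 .. 1.0.
    """
    total = sum(outputs)
    if total == 0.0:
        return 0.0                      # no opinion: steer straight
    centre = sum(i * v for i, v in enumerate(outputs)) / total
    return (centre - 15.0) / 15.0       # map node index 0..30 to -1..1

# A peak around node 20 gives a gentle right turn.
acts = [0.0] * 31
acts[19], acts[20], acts[21] = 0.5, 1.0, 0.5
print(round(distribution_to_steering(acts), 3))  # ~0.333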
4.1.1 Image Processing
The captured image goes through the following steps to convert it into a format suitable for input to the neural network. The effect of each step is illustrated in Figure 4-2.
Cropping: The captured image is in RGB format. The horizon lies approximately half-way down the image and, for the purposes of this project, is assumed to be fixed. The image is, therefore, cropped, discarding the upper half.
Intensity conversion: The cropped image is converted from RGB format to grey-scale using the standard conversion formula:

Y = 0.299R + 0.587G + 0.114B

Binarisation: A manually selected threshold is used to convert the image from grey to binary. The result is an image that has the road markings highlighted against the black background of the road. Edge features to the sides of the road are also highlighted, but this does not pose a problem.
Down-sampling: At this point, the image resolution is slightly less than that of the cropped image, due to shrinkage during filtering. This is still too high to use as input to the neural network. The next step is, therefore, to reduce the resolution by simply averaging the intensities over blocks of pixels.
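The block-averaging step can be expressed compactly with NumPy; the block size below is a placeholder, since the exact figures did not survive in the source text.

import numpy as np

def block_average(image, block=8):
    """Down-sample a 2-D grey-scale image by averaging block x block tiles."""
    h, w = image.shape
    h, w = h - h % block, w - w % block   # trim to a whole number of blocks
    tiles = image[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3))

img = np.random.rand(160, 320)
small = block_average(img, block=8)
print(small.shape)  # (20, 40)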
Figure 4-2 Image processing steps. Top left, cropped RGB image. Top right, grey scale image. Middle left, smoothed image. Middle right, edge enhancement. Bottom left, binarisation. Bottom right, resolution reduction.
4.1.2 Training
In order for the neural network to operate, it must first be trained. The training data required consists of a set of tuples, each containing an input image and the corresponding desired output steering angle. ALVINN relied upon a human driver to train the network over a period of a few minutes driving on any new road type. I chose to use a computer controlled 'expert' driver to train the network.
This expert driver is used to determine the correct steering angle to be associated with a given input image as follows: the TORCS API makes it easy to determine the exact position of the vehicle relative to the centre-line of the road. This information can be used to make the vehicle follow a given lane using a proportional steering control method. Thus, the steering angle can be captured at the same time as the image, forming a training pair.
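A proportional lane-following rule of the kind described needs only a couple of lines; the gain value here is a made-up placeholder, not the one used in the project.

def expert_steering(lateral_offset_m, gain=0.25):
    """Steer proportionally to the offset from the desired lane centre.

    Positive offset means the car is right of centre, so steer left
    (negative). Output is clamped to the TORCS range -1.0 .. 1.0.
    """
    command = -gain * lateral_offset_m
    return max(-1.0, min(1.0, command))

print(expert_steering(0.8))   # -0.2: gentle correction to the left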
The speed of the expert driver was fixed at 15km/h, this being slow enough that the most severe bends can be safely negotiated.
It would be possible to generate a batch of training data by periodically capturing pairs as the expert driver navigates the track. Once captured, this data could be partitioned into separate training and test sets. However, I chose to train the network in an online manner.
As the vehicle travels around the track, an image is captured and the network generates what it believes to be the correct output steering angle. This output is compared with the correct steering angle as determined by the expert driver. If the two steering angles differ by more than a specified threshold, the expert driver wins the right to control the vehicle. When this happens, the captured image and the correct steering angle are combined into a training pair and added to the current set of training pairs. Conversely, if the steering angles are in reasonable agreement, then the AIC retains control. Thus, training data is generated only in situations where it is required, and this constitutes a supervised learning approach.
The threshold used to determine which driver is in control is initially set to a very small value, so that any deviation between the two steering angles results in the expert driver gaining control. The threshold is relaxed gradually over time. This allows tight control at the outset but also allows more 'wiggle room' as the network becomes more competent, allowing slight deviations from the desired path to go unchecked.
During each image processing cycle, 25ms is allocated to training the network incrementally, using all the training data accumulated so far. The network, therefore, continually improves over time, with training taking place whether the expert or the network is in control.
When using the expert driver, we must convert from its exact steering angle to an output distribution that is compatible with the desired network output. To do this, a Gaussian distribution is created with its mean at the steering angle and a variance of 0.07, as illustrated in Figure 4-3.
Figure 4-3 Illustration of the relationship between the input image and output steering distribution (not to scale)
4.1.3 Results
This online supervised learning approach is highly effective. Within only a few tens of metres, the network is able to take control on the initial straight section. As the first lap progresses, the network is able to maintain control through the more gentle bends. The more severe bends, and bends in opposite directions in quick succession, are the last to be mastered by the network. Often, by the end of the 2nd lap the network is fully trained and the 3rd lap is completed under full autonomous control. Table 4-1 gives the percentage of autonomous control over three test runs of 4 laps each, and Figure 4-4 shows how the capability of the network improves over the 3 laps.
Figure 4-4: Expert versus autonomous control over 3 laps. Red dots indicate areas where the expert driver was in control. The final lap (right) is completely autonomous.
The vehicle starts at the blue circle and travels in a clockwise direction. The red markers indicate where the expert driver has control. The left image shows the 1st lap, with the vehicle starting out under expert control. The neural network quickly takes control and only needs occasional assistance during the first straight. Heavy
assistance is required during the bends on the first lap. The middle image shows the 2nd lap – the expert driver is only required in four locations. An interesting point is that during the 2nd lap, the network requires more assistance to straighten up when exiting a bend than it does on entry to the bend. The right image shows the 3rd lap, which is completed under full autonomous control.
Table 4-1 Percentage of Autonomous Control
The system ran on a standard Windows laptop with an AMD Turion64 processor – a five year old system at the time of writing. The combined processor load of running TORCS and the AI controller was 100%.
An important point about this system is the direct coupling between the frame rate and the steering command rate. Each steering command only applies for the instant that it was generated. If the image processing were interrupted for any reason, the vehicle would immediately lose control.
4.1.4 Evaluation
The development of this system was, in itself, a substantial amount of work (approximately 6 weeks) but was necessary to demonstrate that TORCS could be integrated successfully with an independent vision and control system. However, the basic architecture with regards to image capture, image processing, and inter-process communication would be re-usable. Indeed, the rest of this project would not have been achievable in the available time had this prototype not been developed. Despite the prototype being a success, it highlighted the limitations of the laptop used and, as a result, a new high-performance laptop was used for the remainder of the project.
Chapter 5 Deliberative Approach
Whilst the neural network prototype successfully controls the vehicle, it operates in a reactive way; the steering angle is a direct function of the input image. Furthermore, this function is essentially hidden and does not lend itself to analysis. What features, for example, is the network responding to? In order to deal with more complex driving situations, the entrants to the Urban Challenge required a higher level of scene understanding.
The main body of this project is, therefore, concerned with controlling the vehicle in a deliberative manner. Starting with a captured image, the road markings are explicitly detected and modelled, prior knowledge of the road width is used to classify road markings, a trajectory for the vehicle is computed, and the vehicle controls both its speed and position to follow the desired path. This project, therefore, takes a sense, plan, act approach to vehicle control.
5.1 Summary
This chapter forms the main body of the dissertation. It starts with a description of how some ground truth data was generated for test purposes in Section 5.2. Section 5.3 covers the sensing aspects of the system. It provides a short description of some initial investigations that, whilst useful, were not taken further, before describing the main image processing steps and LIDAR simulation. Section 5.4 describes the techniques used to convert the perceived environment into a path for the vehicle to follow. Finally, section 5.5 describes how the vehicle is controlled. Figure 5-1 gives an overview of the main steps in the system.
Figure 5-1: Main processing steps of the system Sensing steps are shown in blue, planning steps in red, and control steps in purple
5.2 Ground Truth Data
In order to perform experiments and evaluate different approaches, it was necessary
to generate a test set of image pairs comprising an original captured image and the corresponding ‘ground truth’ To do this, a set of 17 images was captured from various points along the track These images were chosen to be representative of the track and, therefore, included straight sections and bends of various degrees
Once the images were captured, they were converted to binary images using a manually selected threshold such that the lane markings were fully present – it is not desirable to lose any of the lane-marking information The resulting images contained a substantial amount of noise which was removed manually
The result is a set of image pairs: the original captured image along with a binary image containing only the lane markings. Figure 5-2 shows an example of such a ground truth pair.
Figure 5-2 Example of a captured image and the corresponding ground truth
5.3 Sensing
This section describes the development of the sensing system and culminates with the fusing of vision and LIDAR data into a virtual lane marking sensor.
5.3.1 Some Initial Experiments
At the project outset, I had no specific technique in mind for detecting the road markings. Therefore, I performed some experiments to explore different options.
5.3.1.1 Simple Thresholding
It is important to try the simple approaches before looking for more sophisticated techniques. Although I did not expect simple thresholding to be a reliable means of distinguishing the lane markings from the background, I decided to start with this approach. A side-effect of this is that it provides a baseline for evaluating other methods.
Using the ground-truth test set, the original image is converted to a grey-scale image. This is then thresholded and the resulting binary image compared with the ground-truth. By doing this, it is possible to obtain a measure of the signal to noise ratio of the binary image. The signal to noise ratio is defined as:

SNR = Σ_p s(p) / Σ_p n(p)

where s(p) = 1 for set pixels that correspond to a marking in the ground truth, and n(p) = 1 for set pixels that do not.
Thus, for each pixel set in the binary image, we determine if it corresponds to a genuine road marking in the ground-truth or whether it is a false positive (noise). By repeating the process, the threshold with the highest SNR can be determined.
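Computed over NumPy boolean masks, the measure reduces to a few lines (a sketch under the definition above):

import numpy as np

def snr(binary, ground_truth):
    """Signal-to-noise ratio of a thresholded image against ground truth.

    binary, ground_truth: 2-D boolean arrays of the same shape.
    """
    signal = np.logical_and(binary, ground_truth).sum()
    noise = np.logical_and(binary, ~ground_truth).sum()
    return signal / noise if noise else float("inf")

def best_threshold(grey, ground_truth):
    """Sweep all 8-bit thresholds and return the one maximising SNR."""
    return max(range(256), key=lambda t: snr(grey > t, ground_truth))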
Figure 5-3 Effect of different thresholds on SNR. Top left, captured image. Right, SNR against threshold. Bottom left, binary image with highest SNR.
Figure 5-3 shows the SNR obtained using different thresholds for a single image. There is a clear peak at a threshold of 150 (for this particular image). Comparing against the ground truth image in Figure 5-2, we can see that, although the signal to noise ratio has been maximised, there are significant sections of the markings missing. This is, in part, caused by the shadows cast over the road. This problem is typical in road marking detection systems and vision systems in general.
It is clear that using a simple threshold is not appropriate, given that substantial portions of the lane markings are absent even when we are in a position to choose the best threshold.
The experiment was repeated using each pair in the test set. The peak SNR value occurred at an average threshold of 136. This threshold was used to generate the SNR values shown in Figure 5-17.
5.3.1.2 Inverse Perspective Mapping
A common approach to lane marking detection is to perform an inverse perspective mapping (IPM) to remove the foreshortening effect due to perspective [16][17]. The result of IPM is an image of the road as though viewed directly from above. The technique works by projecting the perspective image onto the ground-plane, which is assumed to be both flat and horizontal. A description of the technique can be found in [16]. In order to apply IPM, characteristics of the camera, such as height and field of view, must be known. Typically, this information is obtained using a semi-automated calibration process which involves placing a chessboard pattern of known dimensions in front of the camera. Many vision software libraries include routines to facilitate this process. However, as this project uses a simulated camera, the calibration approach is not applicable. Instead, I obtained an approximation of the camera characteristics by working through the simulator source code. It would have been possible to obtain precise information, as this is necessarily encoded within the simulator, but I was reluctant to spend too much time on this in the initial stages of the project. My initial evaluation of this approach involved applying IPM to the ground-truth images and simply evaluating the results by eye.
Figure 5-4: Effect of applying IPM to ground truth images
Examples of applying IPM to ground-truth images are shown in Figure 5-4. The benefits of IPM are clear, with the edges of the road now appearing parallel. Using this approach would facilitate applying constraints when searching for the lane markings – particularly searching for sets of parallel lines rather than individual markings. However, despite the clear visual benefits of IPM, it is not without its problems. As IPM relies on the flat ground-plane assumption, the image can become distorted when the assumption does not hold. Furthermore, pixels in the perspective image that are distant from the camera are mapped to multiple pixels in the IPM image. This produces a block-like effect that becomes more apparent the further the pixel is from the camera [16]. Figure 5-5 shows the effect of applying IPM to more severe bends, where the markings in the perspective image are thin.
Figure 5-5: Left image shows non-parallel lines. Right image illustrates block effect for distant pixels.
In this figure, both of the IPM images show distortion of the road geometry as the distance from the camera increases. In particular, the lane markings cease to be parallel, and the block effect of mapping a single pixel in the perspective image to multiple pixels in the IPM image can be seen (despite anti-aliasing being used).
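For reference, a minimal IPM can be expressed as a planar homography; the sketch below uses OpenCV with made-up calibration points, since the camera parameters used in this project were approximated from the simulator source rather than calibrated.

import cv2
import numpy as np

# Four points on the road in the perspective image (pixels) and their
# corresponding positions in the top-down view. These are placeholders;
# real values come from the camera's height, pitch, and field of view.
src = np.float32([[420, 700], [860, 700], [560, 420], [720, 420]])
dst = np.float32([[300, 700], [500, 700], [300, 100], [500, 100]])

H = cv2.getPerspectiveTransform(src, dst)

def to_birds_eye(image, size=(800, 720)):
    """Warp a perspective road image into an approximate top-down view."""
    return cv2.warpPerspective(image, H, size)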
5.3.1.3 RANSAC Curve Fitting
TORCS represents bends as circular arc segments, although more complex curves can be made by joining segments of different radii together. As this project is concerned with modelling road geometry, I experimented with fitting circular arcs to the IPM images.
Figure 5-6: Result of using RANSAC to fit circles to the IPM image
Figure 5-6 shows the result of fitting circular arcs to an IPM image using the RANSAC algorithm [21]. The thickness of the lane markings causes many circles to pass the acceptance test. This suggests that the approach is unlikely to be reliable at determining the radius of a given bend. There were two further problems with this approach. Firstly, no circles were matched to the centre-line and, secondly, small modifications to the RANSAC parameters seemed to make the difference between many circles being detected and none being detected. Given these problems, I decided that this approach was unlikely to succeed and did not investigate it further. However, with hindsight, there are several things that could have improved the situation, for example, thinning the markings prior to the application of RANSAC and using a parabolic model rather than the somewhat restrictive circular approach. Nonetheless, the time spent understanding these techniques would prove useful elsewhere in the project; both inverse perspective mapping and curve fitting are used in section 5.3.3 on road geometry modelling.
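For completeness, a bare-bones RANSAC circle fit over marking pixels might look as follows; the sample count and inlier tolerance are arbitrary choices for illustration, not the parameters used in the experiment.

import random
import numpy as np

def circle_from_3_points(p1, p2, p3):
    """Centre and radius of the circle through three 2-D points."""
    ax, ay = p1; bx, by = p2; cx, cy = p3
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-9:
        return None                     # collinear points
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), float(np.hypot(ax - ux, ay - uy))

def ransac_circle(points, iterations=500, tolerance=2.0):
    """Fit one circle to marking pixels, maximising the inlier count."""
    best, best_inliers = None, 0
    pts = np.asarray(points, dtype=float)
    for _ in range(iterations):
        candidate = circle_from_3_points(*random.sample(list(pts), 3))
        if candidate is None:
            continue
        (cx, cy), r = candidate
        dists = np.abs(np.hypot(pts[:, 0] - cx, pts[:, 1] - cy) - r)
        inliers = int((dists < tolerance).sum())
        if inliers > best_inliers:
            best, best_inliers = candidate, inliers
    return best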
5.3.2 The MIT Approach
Whilst many lane marking detection systems employ the inverse perspective mapping approach as the first processing step, not all do so. In particular, the approach taken by the MIT team, described in Albert Huang's PhD thesis [19], works around the foreshortening effect by applying filters of different sizes directly to the perspective image.
This approach was of particular interest to me for two reasons. Firstly, their technique proved to be successful in the competitive environment of the Urban Challenge and, secondly, they describe a way in which data from LIDAR sensors can be fused with camera data.
5.3.2.1 Image Capture & Pre-processing
In contrast to the neural network approach, where the input image is used to directly determine the steering angle, the approach taken here involves separating the tasks of scene understanding and vehicle control. More specifically, the lane markings are extracted and used to form a model of the road geometry, and subsequently to plan a path for the vehicle to follow. As we are concerned with detecting specific features and their location in the distance, it makes good sense to increase the resolution of the input image. However, as resolution increases, so does the cost of processing the data. I chose to increase the resolution of the simulator output. The image is captured and cropped, again assuming that the horizon is fixed halfway down.
The image is then converted from RGB to grey-scale in the normal manner. In addition to this, a separate binary image is created, which is used for a verification step described in section 5.3.4. As this binary image is not used directly for feature detection, the choice of threshold need not be too fine-tuned and is selected to provide a reasonable separation of the lane-markings from the road surface.
5.3.2.2 Matched Filters
Huang observes that, as lane markings are typically of a standard width and the rate of foreshortening in a perspective image can be determined [19], it is possible to locate the lane markings by searching for features of a size dependent on the distance from the camera. As with inverse perspective mapping, this relies on the flat ground-plane assumption. The technique, therefore, searches for features of a size that is a function of the marking width and the scanline being searched.
However, this makes the further assumption that a single horizontal scanline represents a line in the world that is a constant distance from the camera. This is not the case, and it is possible that extending the function to include the position within the scanline may improve the algorithm. However, this was not investigated.
In order to detect a feature of a specified size, the filter shown in Figure 5-7 is scaled such that the portion above zero is the same length (in pixels) as the feature to be detected.
Figure 5-7 Feature detection filter template. The filter is scaled to match the desired feature size.
Figure 5-8 illustrates the principle behind matched filters. A single scanline containing two different sized features is convolved with a filter whose size matches one feature but not the other. When the filter exactly matches the feature size, the result is a clear local maximum. When the match is not exact, the result is a truncated peak. We can, therefore, locate features by searching for definite local maxima.
Figure 5-8 Principle behind matched filters. Left, single scanline with two different sized features. Right, the result of filtering. The filter does not match the first feature but matches the second exactly.
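A 1-D version of this test is easy to reproduce; the filter here is a simple box with negative side-lobes, which captures the shape of Figure 5-7 only approximately.

import numpy as np

def matched_filter(width):
    """Box filter of `width` positive taps flanked by negative side-lobes,
    normalised to sum to zero so flat regions give no response."""
    side = max(width // 2, 1)
    kernel = np.concatenate([-np.ones(side), np.ones(width), -np.ones(side)])
    return kernel - kernel.mean()

scanline = np.zeros(100)
scanline[20:24] = 1.0    # 4-pixel feature
scanline[60:68] = 1.0    # 8-pixel feature

response = np.convolve(scanline, matched_filter(8), mode="same")
print(int(np.argmax(response)))  # peaks near 64, the centre of the 8-pixel mark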