Each tree in the Random Forest is grown according to the following parameters:
1. A number m is specified that is much smaller than the total number of input variables M (typically m is proportional to √M).
2. Each tree is grown to maximum depth (until pure nodes are reached) using a bootstrap sample of the training set.
3. At each node, m out of the M variables are selected at random.
4. The split used is the best possible split on these m variables only.
Note that bootstrap sampling is applied for each tree to be constructed: a different sample set of training data is drawn with replacement, with the same size as the original dataset. This means that some individual samples are duplicated, while roughly a third of the data is left out of the sample (out-of-bag). This out-of-bag data provides an unbiased estimate of the performance of the tree.
Also note that the sampled variable set does not remain constant while a tree is grown. Each new node in a tree is constructed based on a different random sample of m variables. The best split among these m variables is chosen for the current node, in contrast to typical decision tree construction, which selects the best split among all variables. This helps keep the errors made by the individual trees of the forest uncorrelated. Once the forest is grown, a new sensor reading vector is classified by every tree of the forest, and majority voting among the trees then produces the final classification decision.
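The construction just described maps directly onto standard Random Forest implementations. Below is a minimal sketch using scikit-learn and synthetic data (both are illustrative assumptions; the experiments reported here used Breiman's Fortran code through a Matlab interface, see the footnote):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical sensor reading vectors (n_samples x M) and class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees in the forest
    max_features="sqrt",   # m proportional to sqrt(M), re-sampled at every node
    bootstrap=True,        # each tree sees a bootstrap sample of the training set
    oob_score=True,        # left-out (out-of-bag) samples give an unbiased estimate
    max_depth=None,        # grow each tree until pure nodes are reached
    random_state=0,
)
forest.fit(X, y)
print("out-of-bag accuracy estimate:", forest.oob_score_)

# A new sensor vector is classified by every tree; majority voting decides.
# With fully grown (pure-leaf) trees, predict_proba equals the fraction of votes.
print(forest.predict(X[:1]), forest.predict_proba(X[:1]))
```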
We use RF throughout our experimentation because of its simplicity and excellent performance.1
In general, RF is resistant to irrelevant variables, it can handle massive numbers of variables and observations, and it can handle mixed-type and missing data. Our data is definitely of mixed type, i.e., some variables are continuous and some are discrete, although we have no missing data since the source is the simulator.
4.3 Random Forests for Driving Maneuver Detection
A characteristic of the driving domain and the chosen 29 driving maneuver classes is that the classes are not mutually exclusive. For example, an instance in time could be classified simultaneously as “SlowMoving” and “TurningRight.” The problem thus cannot be solved by a typical multi-class classifier that assigns a single class label to a given sensor reading vector and excludes the rest. This dictates that the problem should be treated as a detection problem rather than a classification problem.
Furthermore, each maneuver is inherently a sequential operation. For example, “ComingToLeftTurnStop” consists of possibly using the turn signal, changing the lane, slowing down, braking, and coming to a full stop. Ideally, a model of a maneuver would thus describe this sequence of operations with the variations that naturally occur in the data (as evidenced by collected naturalistic data). Earlier, we experimented with Hidden Markov Models (HMM) for maneuver classification [31]. An HMM is able to construct a model of a sequence as a chain of hidden states, each of which has a probability distribution (typically Gaussian) to match that particular portion of the sequence [22]. The sequence of sensor vectors corresponding to a maneuver would thus be detected as a whole.
The alternative to sequential modeling is instantaneous classification. In this approach, the whole duration of a maneuver is given just a single class label, and the classifier is trained to produce this same label for every time instant of the maneuver. The order in which the sensor vectors are observed is thus not exploited, and the classifier carries the burden of capturing all variations occurring inside a maneuver under a single label. Despite these two drawbacks, in our initial experiments the results obtained using Random Forests for instantaneous classification were superior to those of Hidden Markov Models.
Because the maneuver labels may overlap, we trained a separate Random Forest for each maneuver, treating it as a binary classification problem: the data of a particular class against all the other data. This results in 29 trained “detection” forests.
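A sketch of this one-vs-rest setup follows; the maneuver names listed, the data, and the use of scikit-learn are illustrative assumptions, not the original implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

maneuvers = ["Cruising", "TurningRight", "SlowMoving"]   # 29 classes in the real system

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 30))                           # hypothetical sensor vectors
labels = {m: rng.integers(0, 2, 5000).astype(bool)        # possibly overlapping labels
          for m in maneuvers}

# One binary "detection" forest per maneuver: this class against all other data.
detectors = {}
for m in maneuvers:
    rf = RandomForestClassifier(n_estimators=75, max_features="sqrt", random_state=0)
    rf.fit(X, labels[m])
    detectors[m] = rf

# Each forest outputs a "probability" of the maneuver it was trained for.
probs = {m: rf.predict_proba(X[:10])[:, 1] for m, rf in detectors.items()}
```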
1 We use Leo Breiman’s Fortran version 5.1, dated June 15, 2004. An interface to Matlab was written to facilitate easy experimentation. The code is available at http://www.stat.berkeley.edu/users/breiman/
Fig. 5. A segment of driving with corresponding driving maneuver probabilities produced by one Random Forest trained for each maneuver class to be detected. The horizontal axis is the time in tenths of a second; the vertical axis is the probability of a particular class. These “probabilities” can be obtained by normalizing the Random Forest output voting results to sum to one.
New sensor data is then fed to all 29 forests for classification. Each forest produces something of a “probability” of the class it was trained for. An example plot of those probability “signals” is depicted in Fig. 5. The horizontal axis represents the time in tenths of a second; about 45 s of driving is shown. None of the actual sensor signals are depicted here; instead, the “detector” signals from each of the forests are graphed. These show a sequence of driving maneuvers from “Cruising” through “LaneDepartureLeft,” “CurvingRight,” “TurningRight,” and “SlowMoving” to “Parking.”
The final task is to convert the detector signals into discrete and possibly overlapping labels, and to assign a confidence value to each label. To do this, we apply both median filtering and low-pass filtering to the signals. The signal at each time instant is replaced by the maximum of the two filtered signals. This has the effect of patching small discontinuities and smoothing the signal while still retaining fast transitions. Any signal exceeding a global threshold value for a minimum duration is then taken as a segment. The confidence of the segment is determined as the average of the detection signal (the probability) over the segment duration.
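A sketch of this post-processing step follows. The filter window lengths, the threshold, and the minimum duration are illustrative assumptions; the text above only states that a global threshold and a minimum duration are used.

```python
import numpy as np
from scipy.ndimage import median_filter, uniform_filter1d

def segments_from_signal(p, threshold=0.5, min_len=10, win=9):
    """Turn a raw detector signal into (start, end, confidence) segments."""
    p = np.asarray(p, dtype=float)
    # Median filtering patches small dropouts; the moving average acts as a
    # low-pass filter; the pointwise maximum retains fast transitions.
    smoothed = np.maximum(median_filter(p, size=win), uniform_filter1d(p, size=win))
    above = smoothed > threshold
    segments, start = [], None
    for t, flag in enumerate(above):
        if flag and start is None:
            start = t
        elif not flag and start is not None:
            if t - start >= min_len:                          # minimum duration
                segments.append((start, t, float(p[start:t].mean())))
            start = None
    if start is not None and len(p) - start >= min_len:
        segments.append((start, len(p), float(p[start:].mean())))
    return segments   # confidence = mean detection probability over the segment
```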
An example can be seen in the bottom window of Fig. 2: the top panel displays some of the original sensor signals, the bottom panel graphs the raw maneuver detection signals, and the middle panel shows the resulting labels.
We compared the results of the Random Forest maneuver detector to the annotations done by a human expert. On average, the annotations agreed 85% of the time, meaning that only 15% needed to be adjusted by the expert. Using this semi-automatic annotation tool, we can drastically reduce the time required for data processing.
5 Sensor Selection Using Random Forests
In this section we study which sensors are necessary for driving state classification. Sensor data is collected in our driving simulator and annotated with driving state classes, after which the problem reduces to that of feature selection [11]: “Which sensors contribute most to the correct classification of the driving state?”
Some of these sensors would be expensive to arrange in a real vehicle. Furthermore, sensor behavioral models can be created in software, and the goodness or noisiness of a sensor can be modified at will. The simulator-based approach makes it possible to study the problem without implementing the actual hardware in a real car.
Variable selection methods can be divided into three major categories [11, 12, 16]:
1. Filter methods, which evaluate some measure of relevance for all the variables and rank them based on this measure (but the measure may not necessarily be relevant to the task, and any interactions the variables may have are ignored).
2. Wrapper methods, which, using some learner, actually learn the solution to the problem by evaluating all possible variable combinations (this is usually computationally prohibitive for large variable sets).
3. Embedded methods, which use a learner with all variables but infer the set of important variables from the structure of the trained learner.
Random Forests (Sect. 4.2) can act as an embedded variable selection system. As a by-product of the construction, a measure of variable importance can be derived from each tree, basically from how often different variables were used in the splits of the tree and from the quality of those splits [5]. For an ensemble of N trees, the importance measure (1) is simply averaged over the ensemble:
\[
M(x_i) \;=\; \frac{1}{N} \sum_{n=1}^{N} M_n(x_i) \qquad (3)
\]

where M_n(x_i) denotes the importance measure (1) of variable x_i computed from tree n.
The regularization effect of averaging makes this measure much more reliable than a measure extracted from just a single tree.
One must note that, in contrast to simple filter methods of feature selection, this measure considers multiple simultaneous variable interactions, not just two at a time. In addition, the tree is constructed for the exact task of interest. We now apply this importance measure to driving data classification.
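In scikit-learn this ensemble-averaged impurity-reduction importance is exposed directly as feature_importances_; the sketch below (with synthetic data, an assumption) shows how such a ranking can be obtained. The original study extracted the equivalent measure from Breiman's Fortran implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))                 # hypothetical sensor matrix
y = (X[:, 3] - 0.5 * X[:, 7] > 0).astype(int)   # hypothetical class labels

rf = RandomForestClassifier(n_estimators=75, max_features="sqrt", random_state=0)
rf.fit(X, y)

# Impurity-reduction importance averaged over all trees, as in (3).
importances = rf.feature_importances_
for idx in np.argsort(importances)[::-1][:5]:
    print(f"variable {idx}: importance {importances[idx]:.3f}")
```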
5.1 Sensor Selection Results
As the variable importance measure, we use the tree node impurity reduction (1) summed over the forest (3). This measure does not require any extra computation beyond the basic forest construction process. Since a forest was trained for each class separately, we can list the variables in order of importance for each class. These results are combined and visualized in Fig. 6.
This figure has the driving activity classes listed at the bottom and the variables in the left column. Head- and eye-tracking variables were excluded from the figure. Each column of the figure thus displays the importances of all listed variables for the class named at the bottom of the column. White, through yellow, orange, and red, to black denotes decreasing importance.
In an attempt to group together those driving activity classes that require a similar set of variables to be accurately detected, and to group together those variables that are necessary or helpful for a similar set of driving activity classes, we first clustered the variables into six clusters and then the driving activity classes into four clusters. Any clustering method could be used here; we used spectral clustering [19]. Rows and columns are then re-ordered according to the cluster identities, which are indicated in the names by alternating blocks of red and black font in Fig. 6.
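A sketch of this re-ordering step, assuming the per-class importances have been stacked into a variables-by-classes matrix and using scikit-learn's SpectralClustering (the original clustering implementation is not specified beyond [19]):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical importance matrix: one row per variable, one column per class.
imp = np.abs(np.random.default_rng(0).normal(size=(40, 29)))

var_clusters = SpectralClustering(n_clusters=6, random_state=0).fit_predict(imp)
cls_clusters = SpectralClustering(n_clusters=4, random_state=0).fit_predict(imp.T)

# Re-order rows and columns by cluster identity for a heat-map such as Fig. 6.
row_order = np.argsort(var_clusters)
col_order = np.argsort(cls_clusters)
imp_reordered = imp[np.ix_(row_order, col_order)]
```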
Looking at the variable column on the left, the variables within each of the six clusters exhibit a similar behavior in that they are deemed important by approximately the same driving activity classes. The topmost cluster of variables (except “Gear”) appears to be useless in distinguishing most of the classes. The next five variable clusters appear to be important, but for different class clusters. The ordering of the clusters, as well as the variable ordering within the clusters, is arbitrary. It can also be seen that there are variables (sensors) that are important for a large number of classes.
In the same fashion, clustering of the driving activity classes groups together those classes that need similar sets of variables in order to be successfully detected. The rightmost and the leftmost clusters are rather distinct, whereas the two middle clusters do not seem to be that clear. There are only a few classes
Fig. 6. Variable importances for each class. See text for explanation.
that can be detected with just one or two sensors; one such class only needs “Gear,” and “CurvingRight” needs “lateralAcceleration” with the aid of “steeringWheel.” This clustering shows that some classes need quite a wide array of sensors in order to be reliably detected; the rightmost cluster is an example of such classes.
5.2 Sensor Selection Discussion
We present the first results of a large-scale automotive sensor selection study aimed towards intelligent driver assistance systems. In order to include both traditional and advanced sensors, the experiment was done on a driving simulator. This study shows clusters of both sensors and driving state classes: which classes need similar sensor sets, and which sensors provide information for which sets of classes. It also provides a basis for studying the detection of an isolated driving state, or a set of states of interest, and the sensors required for it.
6 Driver Inattention Detection Through Intelligent Analysis of Readily Available Sensors
Driver inattention is estimated to be a significant factor in 78% of all crashes [8]. A system that could accurately detect driver inattention could aid in reducing this number. In contrast to using specialized sensors or video cameras to monitor the driver, we detect driver inattention using only readily available sensors. A classifier trained using Collision Avoidance Systems (CAS) sensors was able to accurately identify 80% of driver inattention and could be added to a vehicle without incurring the cost of additional sensors. Detection of driver inattention could be utilized in intelligent systems to control electronic devices [21] or to redirect the driver’s attention to critical driving tasks [23].
Modern automobiles contain many infotainment devices designed for driver interaction. Navigation modules, entertainment devices, real-time information systems (such as stock prices or sports scores), and communication equipment are increasingly available for use by drivers. In addition to interacting with on-board systems, drivers are also choosing to bring in mobile devices such as cell phones to increase productivity while driving. Because technology is increasingly available for allowing people to stay connected, informed, and entertained while in a vehicle, many drivers feel compelled to use these devices and services in order to multitask while driving.
This increased use of electronic devices, along with typical personal tasks such as eating, shaving, putting on makeup, and reaching for objects on the floor or in the back seat, can cause the driver to become inattentive to the driving task. The resulting driver inattention can increase the risk of injury to the driver, passengers, surrounding traffic, and nearby objects.
The prevailing method for detecting driver inattention involves using a camera to track the driver’s head or eyes [9, 26]. Research has also been conducted on modeling driver behaviors through such methods as building control models [14, 15], measuring behavioral entropy [2], or discovering factors affecting driver intention [10, 20].
Our approach to detecting inattention is to use only sensors currently available on modern vehicles (possibly including Collision Avoidance Systems (CAS) sensors) without using a head- and eye-tracking system. This avoids the additional cost and complication of video systems or dedicated driver monitoring systems. We derive several parameters from commonly available sensors and train an inattention classifier. This results in a sophisticated yet inexpensive system for detecting driver inattention.
6.1 Driver Inattention
What is Driver Inattention?
The secondary activities of drivers during inattention are many, but mundane. The 2001 NETS survey in Table 2 found many activities that drivers perform in addition to driving. A study by the American Automobile Association placed miniature cameras in 70 cars for a week and evaluated three random driving hours from each.
Table 2. 2001 NETS survey: activities drivers engage in while driving
  96%  Talking to passengers
  89%  Adjusting vehicle climate/radio controls
  74%  Eating a meal/snack
  51%  Using a cell phone
  41%  Tending to children
  34%  Reading a map/publication
  19%  Grooming
  11%  Prepared for work
Overall, drivers were inattentive 16.1% of the time they drove. About 97% of the drivers reached or leaned over for something, and about 91% adjusted the radio. Thirty percent of the subjects used their cell phones while driving.
Causes of Driver Inattention
There are at least three factors affecting attention:
1. Workload. Balancing the cognitive and physical workload between overload and boredom is an everyday driving task. This balance varies from instant to instant and depends on many factors. If we choose the wrong fulcrum, we can be overwhelmed or unprepared.
2. Distraction. Distractions might be physical (e.g., passengers, calls, signage) or cognitive (e.g., worry, anxiety, aggression). These can interact and create multiple levels of inattention to the main task of driving.
3. Perceived Experience. Given the overwhelming conceit that almost all drivers rate their driving ability as superior to others’, it follows that they believe they have sufficient driving control to take part of their attention away from the driving task and give it to multitasking. This “skilled operator” over-confidence tends to underestimate the risk involved and the reaction time required. This is especially true for the inexperienced younger driver and the physically challenged older driver.
Effects of Driver Inattention
Drivers involved in crashes often say that circumstances occurred suddenly and could not be avoided. However, due to the laws of physics and visual perception, very few things occur suddenly on the road. Perhaps more realistically, an inattentive driver will suddenly notice that something is going wrong. This inattention or lack of concentration can have catastrophic effects. For example, a car moving at a slow speed with a driver inserting a CD can be as dangerous as an attentive driver going much faster; simply obeying the speed limits may not be enough.
Measuring Driver Inattention
Many approaches to measuring driver inattention have been suggested or researched. Hankey et al. suggested three parameters: average glance length, number of glances, and frequency of use [13]. The glance parameters require visual monitoring of the driver’s face and eyes. Another approach is to use the time and/or accuracy of a surrogate secondary task such as the Peripheral Detection Task (PDT) [32]. These measures are not yet practical real-time measures to use during everyday driving.
Boer [1] used a driver performance measure, steering error entropy, to measure workload. Unlike eye gaze and surrogate secondary tasks, it is unobtrusive, practical for everyday monitoring, and can be calculated in near real time.
We calculate it by first training a linear predictor from “normal” driving [1]. The predictor uses four previous samples of the steering angle to predict the next one, and the prediction error (residual) is computed for the data. A ten-bin discretizer is constructed from the residual, selecting the discretization levels such that all bins become equiprobable. The predictor and discretizer are then fixed and applied
to a new steering angle signal, producing a discretized steering error signal. We then compute the running entropy of the steering error signal over a window of 15 samples using the standard entropy definition
\[
E = -\sum_{i=1}^{10} p_i \log_{10} p_i ,
\]
where the p_i are the proportions of each discretization level observed in the window. Our work indicates that steering error entropy is able to detect driver inattention while the driver is engaged in secondary tasks. Our current study expands and extends this approach and looks at other driver performance variables, as well as the steering error entropy, that may indicate driver inattention during a common driving task, such as looking in the “blind spot.”
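The sketch below implements this computation under the assumption of a least-squares linear predictor over the four previous steering samples; the exact predictor form in [1] may differ, and the steering signal here is synthetic.

```python
import numpy as np

def fit_predictor(theta, order=4):
    """Fit on 'normal' driving: linear predictor plus equiprobable 10-bin discretizer."""
    X = np.stack([theta[i:len(theta) - order + i] for i in range(order)], axis=1)
    y = theta[order:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)      # least-squares prediction weights
    resid = y - X @ w
    edges = np.quantile(resid, np.linspace(0, 1, 11)[1:-1])  # deciles -> equiprobable bins
    return w, edges

def steering_entropy(theta, w, edges, order=4, window=15):
    """Running entropy E = -sum_i p_i log10 p_i of the discretized steering error."""
    X = np.stack([theta[i:len(theta) - order + i] for i in range(order)], axis=1)
    levels = np.digitize(theta[order:] - X @ w, edges)        # levels 0..9
    ent = np.full(len(levels), np.nan)
    for t in range(window, len(levels) + 1):
        p = np.bincount(levels[t - window:t], minlength=10) / window
        p = p[p > 0]
        ent[t - 1] = -np.sum(p * np.log10(p))
    return ent

# Fit on "normal" driving, then apply the fixed predictor/discretizer to new data.
normal = np.cumsum(np.random.default_rng(0).normal(size=2000))  # hypothetical steering angle
w, edges = fit_predictor(normal)
entropy_signal = steering_entropy(normal, w, edges)
```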
Experimental Setup
We designed the following procedure to elicit defined moments of inattention during otherwise normal driving.
The simulator authoring tool, HyperDrive, was used to create the driving scenario for the experiment. The drive simulated a square beltway with curved corners, six kilometers on a side, with three lanes each way (separated by a grass median), on- and off-ramps, overpasses, and heavy traffic in each direction. All drives used daytime dry-pavement driving conditions with good visibility.
For a realistic driving environment, high-density random “ambient” traffic was programmed. All “ambient” vehicles simulated alert, “good” driver behavior, staying at or near the posted speed limit and reacting reasonably to any particular maneuver from the driver.
This arrangement allowed a variety of traffic conditions within a confined but continuous driving space. Opportunities for passing and being passed, traffic congestion, and different levels of driving difficulty were thereby encountered during the drive.
After two orientation and practice drives, we collected data while drivers drove about 15 min in the simulated world. Drivers were instructed to follow all normal traffic laws, maintain the vehicle close to the speed limit (55 mph, 88.5 kph), and drive in the middle lane without lane changes. At 21 “trigger” locations scattered randomly along the road, the driver received a short burst from a vibrator located in the seatback on either the left or right side of their back. This was their alert to look in the corresponding “blind spot” and observe a randomly selected image of a vehicle projected there. The image was projected for 5 s, and the driver could look for any length of time they felt comfortable. They were instructed that they would receive “bonus” points for extra money for each correctly answered question about the images. Immediately after the image disappeared, the experimenter asked the driver questions designed to elicit specific characteristics of the image, e.g., What kind of vehicle was it? Were there humans in the image? What color was the vehicle?
Selecting Data for Inattention Detection
Though the simulator has a variety of vehicle, environment, cockpit, and driver parameters available for our use, our goal was to experiment only with readily extractable parameters that are available on modern vehicles. We experimented with two subsets of these parameter streams: one that used only traditional driver controls (steering wheel position and accelerator pedal position), and a second that included the first subset but added variables available from CAS systems (lane boundaries and upcoming road curvature). A list of the variables used and a brief description of each is displayed in Table 3.
Eye/Head Tracker
In order to avoid having to manually label when the driver was looking away from the simulated road, an eye/head tracker was used (Fig. 7).
When the driver looked over their shoulder at an image in their blind spot, this action caused the eye tracker to lose eye tracking ability (Fig. 8). This loss sent the eye tracking confidence to a low level. These periods of low confidence were used as the periods of inattention. This method avoided the need for hand labeling of the inattention periods.
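A minimal sketch of this automatic labeling; the confidence threshold is an assumption, since only "low confidence" is specified above.

```python
import numpy as np

def label_inattention(tracker_confidence, threshold=0.2):
    """Mark time instants where the eye tracker lost the eyes (driver looked away)."""
    return np.asarray(tracker_confidence, dtype=float) < threshold

# Hypothetical confidence stream sampled at 10 Hz.
labels = label_inattention([0.9, 0.85, 0.10, 0.05, 0.80])
```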
Table 3. Variables used to detect inattention
  steeringWheel          Steering wheel angle
  accelerator            Position of accelerator pedal
  distToLeftLaneEdge     Perpendicular distance of left front wheel from left lane edge
  crossLaneVelocity      Rate of change of distToLeftLaneEdge
  crossLaneAcceleration  Rate of change of crossLaneVelocity
  steeringError          Difference between steering wheel position and the ideal position for the vehicle to travel exactly parallel to the lane edges
  aheadLaneBearing       Angle of the road 60 m in front of the current vehicle position
Fig. 7. Eye/head tracking during attentive driving.
Fig. 8. Loss of eye/head tracking during inattentive driving.
6.2 Inattention Data Processing
Data was collected from six different drivers as described above. This data was later synchronized and re-sampled at a constant sampling rate of 10 Hz, resulting in 40,700 sample vectors. In order to provide more relevant information for the task at hand, further parameters were derived from the original sensors. These parameters are as follows:
1. ra9: running average of the signal over the nine previous samples (a smoothed version of the signal).
2. rd5: running difference five samples apart (trend).
3. rv9: running variance of the nine previous samples according to the standard definition of sample variance.
4. ent15: entropy of the error that a linear predictor makes in trying to predict the signal, as described in [1]. This can be thought of as a measure of the randomness or unpredictability of the signal.
5. stat3: multivariate stationarity of a number of variables simultaneously, three samples apart, as described in [27]. Stationarity gives an overall rate of change for a group of signals: it is one if there are no changes over the time window and approaches zero for drastic transitions in all signals of the group.
The operations can be combined. For example, “rd5 ra9” denotes first computing a running difference five samples apart and then computing the running average over nine samples.
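A sketch of the windowed operators ra, rd, and rv, and of how they are chained (e.g., "rd5 ra9"), follows. ent15 corresponds to the steering error entropy sketched earlier, and stat3 follows [27] and is omitted here; the causal windowing with NaN padding is an assumption.

```python
import numpy as np

def ra(x, n=9):
    """Running average over the n previous samples (causal smoothing)."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    for t in range(n - 1, len(x)):
        out[t] = np.mean(x[t - n + 1:t + 1])
    return out

def rd(x, n=5):
    """Running difference n samples apart (trend)."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    out[n:] = x[n:] - x[:-n]
    return out

def rv(x, n=9):
    """Running sample variance over the n previous samples."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    for t in range(n - 1, len(x)):
        out[t] = np.var(x[t - n + 1:t + 1], ddof=1)
    return out

steering = np.random.default_rng(0).normal(size=1000)   # hypothetical 10 Hz signal
steering_rd5_ra9 = ra(rd(steering, 5), 9)                # "rd5 ra9": difference, then average
```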
Two different experiments were conducted:
1. The first experiment used only two parameters, steeringWheel and accelerator, and derived seven others: steeringWheel rd5 ra9, accelerator rd5 ra9, stat3 of steeringWheel accel, steeringWheel ent15 ra9, accelerator ent15 ra9, steeringWheel rv9, and accelerator rv9.
2. The second experiment used all seven parameters in Table 3 and derived 13 others as follows: steeringWheel rd5 ra9, steeringError rd5 ra9, distToLeftLaneEdge rd5 ra9, accelerator rd5 ra9, aheadLaneBearing rd5 ra9, stat3 of steeringWheel accel, stat3 of steeringError crossLaneVelocity distToLeftLaneEdge aheadLaneBearing, steeringWheel ent15 ra9, accelerator ent15 ra9, steeringWheel rv9, accelerator rv9, distToLeftLaneEdge rv9, and crossLaneVelocity rv9.
Variable Selection for Inattention
In the variable selection experiments we attempt to determine the relative importance of each variable to the task of inattention detection. First, a Random Forest classifier is trained for inattention detection (see also Sect. 6.2). Variable importances can then be extracted from the trained forest using the approach outlined in Sect. 5.
We present the results in Tables 4 and 5. These tables provide answers to the question “Which sensors are most important in detecting the driver’s inattention?” When just the two basic “driver control” sensors were used, some of the newly derived variables may provide as much new information as the original signals, namely the running variance and entropy of steering. When CAS sensors are added, the situation changes: lane
Table 4. Important sensor signals for inattention detection derived from the steering wheel and accelerator pedal
  steeringWheel ent15 ra9               58.44
  stat3 of steeringWheel accelerator    41.38
Table 5. Important sensor signals for inattention detection derived from the steering wheel, accelerator pedal, and CAS sensors
  stat3 of steeringError crossLaneVelocity distToLeftLaneEdge aheadLaneBearing    38.07
position (distToLeftLaneEdge) becomes the most important variable together with the accelerator pedal, and the steering wheel variance becomes the most important variable related to steering.
Inattention Detectors
Detection tasks always involve a tradeoff between desired recall and precision. Recall denotes the percentage of the total events of interest that are detected; precision denotes the percentage of detected events that are true events of interest and not false detections. A trivial classifier that classifies every instant as a true event would have 100% recall (since none were missed), but its precision would be poor. On the other hand, if the classifier is tuned so that only events having high certainty are classified as true events, the recall would be low, missing most of the events, but the precision would be high, since among those classified as true events only a few would be false detections. Usually any classifier has some means of tuning the threshold of detection; where that threshold is set depends on the demands of the application. It is also noteworthy that in tasks involving the detection of rare events, overall classification accuracy is not a meaningful measure. In our case only 7.3% of the database was inattention, so a trivial classifier classifying everything as attention would have an accuracy of 92.7%. Therefore we report our results using the recall and precision statistics for each class.
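For reference, a minimal sketch of the two statistics we report, together with the trivial always-positive classifier mentioned above (the data here is synthetic):

```python
import numpy as np

def recall_precision(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    recall = tp / max(np.sum(y_true), 1)      # fraction of true events that were detected
    precision = tp / max(np.sum(y_pred), 1)   # fraction of detections that are true events
    return recall, precision

# A trivial classifier that flags every instant: recall 1.0, but poor precision
# (conversely, flagging nothing yields the misleading 92.7% accuracy noted above).
y_true = np.random.default_rng(0).random(10000) < 0.073   # ~7.3% inattention
print(recall_precision(y_true, np.ones(10000, dtype=bool)))
```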
First, we constructed a Random Forest (RF) classifier of 75 trees using either the driver controls alone or the driver controls combined with the CAS sensors. Figure 9 depicts the resulting recall/precision graphs. One simple figure of merit that allows comparison of two detectors is the equal error rate, or equal accuracy, which denotes the intersection of the recall and precision curves. By that figure, basing the inattention detector only on driver control sensors results in an equal accuracy of 67% for inattention and 97% for attention, whereas adding CAS sensors raises the accuracies to 80% and 98%, respectively. For comparison, we used the same data to train a quadratic classifier [29]. Compared to the RF classifier, the quadratic classifier performs poorly in this task. We present the results in Table 6 for two different operating points of the quadratic classifier. The first one (middle rows) is tuned not to make false alarms, but its recall rate remains low. The second one is compensated for the less frequent occurrences of inattention, but it makes false alarms about 28% of the time. The Random Forest clearly outperforms the quadratic classifier, with almost no false alarms and good recall.
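A sketch of how such a recall/precision curve can be traced by sweeping a prior for inattention follows. Implementing the prior as a Random Forest class weight, as well as the synthetic data and the weight grid, are assumptions; the equal-accuracy figure of merit is read off where recall and precision meet.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 10))                                      # hypothetical features
y = (X[:, 0] + rng.normal(scale=2.0, size=4000) > 2.5).astype(int)   # rare positive class

curve = []
for w in [1, 2, 5, 10, 20, 50]:                                      # "prior" for inattention
    rf = RandomForestClassifier(n_estimators=75, class_weight={0: 1, 1: w}, random_state=0)
    rf.fit(X[:3000], y[:3000])
    pred = rf.predict(X[3000:])
    curve.append((w,
                  recall_score(y[3000:], pred),
                  precision_score(y[3000:], pred, zero_division=0)))

# The equal-accuracy operating point is where recall and precision (nearly) meet.
equal_acc = min(curve, key=lambda c: abs(c[1] - c[2]))
```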
Fig. 9. Precision/recall figures for detection of inattention using only driver controls (left) and driver controls combined with CAS sensors (right). Random Forest is used as the classifier. The horizontal axis, the prior for inattention, denotes a classifier parameter that can be tuned to produce different precision/recall operating points. This can be thought of as a weight or cost given to missing inattention events. Typically, if it is desirable not to miss events (high recall, in this case a high parameter value), the precision may be low: many false detections will be made. Conversely, if it is desirable not to make false detections (high precision), the recall will be low: not all events will be detected.