control in an active vision system. By extending these types of explicit representations to all processes for perception, decision-making, and mission planning as well as mission performance and monitoring, a very flexible overall system will result. These aspects have been discussed here to motivate the need for both smooth parts of mission performance with nice continuity conditions alleviating perception, and sudden changes in behavior where sticking to the previous mode would lead to failure (or probably disaster).

Efficient dynamic vision systems have to take advantage of continuity conditions as long as they prevail; however, they always have to watch out for discontinuities in motion, both of the subject's body and of other objects observed, to be able to adjust readily. For example, a vehicle following the rightmost lane on a road can be tracked efficiently using a simple motion model. However, when an obstacle occurs suddenly in this lane, for example, a ball or an animal running onto the road, there may be a harsh reaction to one side. At this moment, a new motion phase begins, and it cannot be expected that the filter tuning for optimal tracking remains the same. So the vision process for tracking (similar to the bouncing ball example in Section 2.3.2) has two distinctive phases which should be handled in parallel.

3.4.6.1 Smooth Evolution of a Trajectory
Continuity models and low-pass filtering components can help to easily track phases of a dynamic process in an environment without special events. Measurement values with high-frequency oscillations are considered due to noise, which has to be eliminated in the interpretation process. The natural sciences and engineering have compiled a wealth of models for different domains. The methods described in this book have proven to be well suited for handling these cases on networks of roads.

However, in road traffic environments, continuity is interrupted every now and then due to initiation of new behavioral components by subjects and maybe by weather.
3.4.6.2 Sudden Changes and Discontinuities
The optimal settings of parameters for smooth pursuit lead to unsatisfactory tracking performance in cases of sudden changes. The onset of a harsh braking maneuver of a car or a sudden turn may lead to loss of tracking or at least to a strong transient in the motion estimated, especially so if delay times in the visual perception process are large. If the onsets of these discontinuities could be well predicted, a switch in model or tracking parameters at the right time would yield much better results. The example of a bouncing ball has already been mentioned.

In road traffic, the compulsory introduction of braking (stop) lights serves the same purpose of indicating that there is a sudden change in the underlying behavioral mode (deceleration). Braking lights have to be detected by vision for defensive driving; this event has to trigger a new motion model for the car at which it is observed. The level of braking is not yet indicated by the intensity of the braking lights. There are some studies under way for the new LED braking lights to couple the number of LEDs lighting up to the level of braking applied; this could help finding the right deceleration magnitude for the hypothesis of the observed braking vehicle and thus reduce transients.

Sudden onsets of lateral maneuvers are supposed to be preceded by warning lights blinking at the proper side. However, the reliability of behaving according to this convention is rather low in many parts of the world.
As a general scheme in vision, it can be concluded that partially smooth sections and local discontinuities have to be recognized and treated with proper methods both in the 2-D image plane (object boundaries) and on the time line (events).
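The two-phase idea discussed above can be sketched by running two filters with different process-noise assumptions in parallel and letting the measurements decide which one currently fits, a simplified cousin of interacting-multiple-model estimation. All parameter values below are arbitrary assumptions for illustration, not values from the text:

```python
import numpy as np

class CVKalman:
    """1-D constant-velocity Kalman filter; state = [position, velocity]."""
    def __init__(self, q, r, dt=0.04):                    # 40 ms video cycle
        self.F = np.array([[1.0, dt], [0.0, 1.0]])        # motion model
        self.Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                               [dt**2 / 2, dt]])          # process noise
        self.r = r                                        # measurement variance
        self.x = np.zeros(2)
        self.P = np.eye(2)

    def step(self, z):
        self.x = self.F @ self.x                          # predict
        self.P = self.F @ self.P @ self.F.T + self.Q
        s = float(self.P[0, 0] + self.r)                  # innovation variance
        k = self.P[:, 0] / s                              # Kalman gain
        innov = z - self.x[0]                             # innovation
        self.x = self.x + k * innov                       # update
        self.P = self.P - np.outer(k, self.P[0, :])
        return innov

# smooth-pursuit filter vs. maneuver filter with much larger process noise
smooth, maneuver = CVKalman(q=0.01, r=0.25), CVKalman(q=100.0, r=0.25)
for t in range(100):
    z = 0.5 * t * 0.04 + (5.0 if t >= 80 else 0.0)        # sudden lateral jump
    for f in (smooth, maneuver):
        f.step(z)
# after the jump, the high-noise filter re-locks quickly; the smooth one lags
```

A full system would also monitor the innovations: a persistently large innovation in the smooth filter is exactly the event signature that should trigger the switch to the maneuver model.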
3.4.6.3 A Capability Network for Locomotion
The capability network shows how more complex behaviors depend on more basic ones and finally on the actuators available. The timing (temporal sequencing) of their activation has to be learned by testing and corresponding feedback of errors occurring in the real world. Figure 3.28 shows the capability network for locomotion of a wheeled ground vehicle. Note that some of the parameters determining the trigger point for activation depend on visual perception and on other measurement values. The challenges of system integration will be discussed in later chapters after the aspects of knowledge representation have been discussed.
Figure 3.28 Network of behavioral capabilities of a road vehicle: Longitudinal and lateral control is fully separated only on the hardware level with three actuators; many basic skills are realized by diverse parameterized feed-forward and feedback control schemes. On the upper level, abstract schematic capabilities as triggered from "central decision" are shown [Maurer 2000; Siedersberger 2004]. (The figure arranges, bottom to top: actuators for longitudinal and lateral control, including throttle and brakes; skills such as accelerate, decelerate, keep speed, keep distance, constant steering angle, and turn to commanded heading; and schematic capabilities such as stand still, halt, approach, stop in front of, avoid obstacle, road running along a guide line, and waypoint navigation.)
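One way to make such a network explicit in software is a small dependency graph in which each capability lists the lower-level skills or actuators it needs; a schematic capability is then available only while its whole chain down to the actuators is intact. The capability names below follow Figure 3.28, but the encoding itself is our own illustrative assumption:

```python
# Dependency graph: capability -> list of prerequisite capabilities.
CAPABILITIES = {
    # actuators (leaves)
    "throttle": [], "brakes": [], "steering": [],
    # basic skills depend on actuators
    "accelerate": ["throttle"],
    "decelerate": ["brakes"],
    "keep_speed": ["throttle", "brakes"],
    "keep_distance": ["keep_speed"],
    "constant_steering_angle": ["steering"],
    "turn_to_heading": ["steering"],
    # schematic capabilities depend on skills
    "stand_still": ["decelerate"],
    "halt": ["decelerate", "stand_still"],
    "avoid_obstacle": ["decelerate", "turn_to_heading"],
    "road_running": ["keep_speed", "turn_to_heading"],
    "waypoint_navigation": ["road_running", "turn_to_heading"],
}

def available(cap, broken=frozenset()):
    """A capability is available iff neither it nor any prerequisite is broken."""
    if cap in broken:
        return False
    return all(available(dep, broken) for dep in CAPABILITIES[cap])
```

Such a representation also supports the monitoring aspect mentioned in the text: when an actuator or skill fails, every higher-level capability depending on it is immediately known to be unavailable to "central decision".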
3.5 Situation Assessment and Decision-Making
Subjects differ from objects (proper) in that they have perceptual impressions from the environment and the capability of decision-making with respect to their control options. For subjects, a control term appears in the differential equation constraints on their motion activities, which allows them to influence their motion; this makes subjects basically different from objects.

If decisions on control selection are not implicitly given in the code implementing subject behavior, but may be made according to some explicit goal criteria, something like free will occurs in the behavior decision process of the subject. Because of the fundamentally new properties of subjects, these require separate methods for knowledge representation and for combining this knowledge with actual perception to achieve their goals in an optimal fashion (however defined). The collection of all facts of relevance for decision-making is called the situation. It is especially difficult if other subjects, who also may behave at will to achieve their goals, form part of this process; these behaviors are usually unknown, but may sometimes be guessed from reasoning as for one's own decision-making.

Some expectations for future behavior of other subjects can be derived from trying to understand the situation as it might look for oneself in the situation supposed to be given for the other subject. At the moment, this is beyond the actual state of the art of autonomous systems. But the methods under development for the subject's decision-making will open up this avenue. In the long run, capabilities of situation assessment of other subjects may be a decisive factor in the development of really intelligent systems. Subjects may group together, striving for common goals; this interesting field of group behavior taking real-world constraints into account is even further out in the future than individual behavior. But there is no doubt that the methods will become available in the long run.
3.6 Growth Potential of the Concept, Outlook
The concept of subjects characterized by their capabilities in sensory perception, in data processing (taking large knowledge bases for object/subject recognition and situation assessment into account), in decision-making and planning as well as in behavior generation is very general. Through an explicit representation of these capabilities, avenues for developing autonomous agents with new mental capabilities of learning and cooperation in teams may open up. In preparation for this long-term goal, representing humans with all their diverse capabilities in this framework should be a good exercise. This is especially valuable for mixed teams of humans and autonomous vehicles as well as for generating intelligent behavior of these vehicles in environments abounding with activities of humans, which will be the standard case in traffic situations.

In road traffic, other subjects frequently encountered (at least in rural environments) beside humans are four-legged animals of different sizes: horses, cattle, sheep, goats, deer, dogs, cats, etc.; birds and poultry are two-legged animals, many of which are able to fly.

Because of the eminent importance of humans and four-legged animals in any kind of road traffic, autonomous vehicles should be able to understand the motion capabilities of these living beings in the long run. This is out into the future right now; the final section of this chapter shows an approach and first results developed in the early 1990s for recognition of humans. This field has seen many activities since the early work of Hogg (1984) and has grown to a special area in technical vision; two recent papers with application to road traffic are [Bertozzi et al. 2004; Franke et al. 2005].
3.6.1 Simple Model of Human Body as Traffic Participant
Elaborate models for the motion capabilities of human bodies are available in different disciplines of physiology, sports, and computer animation [Alexander 1984; Bruderlin, Calvert 1989; Kroemer 1988]. Humans as traffic participants with the behavioral modes of walking, running, riding bicycles or motor bikes as well as modes for transmitting information by waving their arms, possibly with additional instruments, show a much reduced set of stereotypical movements. Kinzel (1994a, b), therefore, selected the articulated body model shown in Figure 3.29 to represent humans in traffic activities in connection with the 4-D approach to dynamic vision. Visual recognition of moving humans becomes especially difficult due to the vast variety of clothing encountered and of objects carried. For normal Western-style clothing, the cyclic activities of extremities are characteristic of humans moving. Motion of limbs should be separated from body motion since they usually behave in different modes and at different eigenfrequencies.
Figure 3.29 Simple generic model for human shape with 22 degrees of freedom, after [Kinzel 1994]. (Labeled joints in the figure include: 0, 1 shoulders; 2, 3 elbows; 6, 7 hips; 8, 9 knees; 10, 11 feet joints; 12 waist; further labels mark the upper torso, the lower torso, and the lower arms.)
Limbs tend to be used in typical cyclic motion, while the body moves more steadily. The rotational movements of limbs may be in the same or in opposite direction depending on the style and the phase of grasping or running.

Figure 3.30 shows early results achieved with the lower part of the body model from Figure 3.29; cyclic motion of the upper leg (hip angle, amplitude ≈ 60°, upper graph) and the lower leg (knee angle, amplitude ≈ 100°, bottom graph) has been recognized roughly in a computer simulation with real-time image sequence
Figure 3.30 Quantitative recognition of motion parameters of a human leg while running: simulation with real image sequence processing, after [Kinzel 1994]
evaluation and tracking. At that time, microprocessor resources were not sufficient to do this onboard a car in real time (at least a factor of 5 was missing). In the meantime, computing power has increased by more than two orders of magnitude per processor, and human gesture recognition has attracted quite a bit of attention. Also, the widespread activities in computer animation with humanoid robots, and especially the demanding challenge of the humanoid RoboCup league, have advanced this field considerably lately.

From the field last mentioned and from analysis of sports as well as dancing activities, there will be pressure toward automatically recognizing human(-oid) motion. This field can be considered to be developing on its own; application within semi-autonomous road or autonomous ground vehicles will be more or less a side product. The knowledge base for these application areas of ground vehicles has to be developed as a specific effort, however. In case of construction sites or accident areas with human traffic regulation, future (semi-)autonomous vehicles should also have the capability of proper understanding of regulatory arm gestures and of proper behavior in these unusual situations. Recognizing grown-up people and children wearing various clothing and riding bicycles or carrying bulky loads will remain a challenging task.
3.6.2 Ground Animals and Birds
Beside humans, two superclasses of other animals play a role in rural traffic: four-legged animals of various sizes and with various styles of running, and birds (from crows, hens, geese, and turkeys to ostriches), most of which can fly and run or hop on the ground. This wide field of subjects has hardly been touched for technical vision systems. In principle, there is no basic challenge for successful application of the 4-D approach. In practice, however, a huge volume of work lies ahead until technical vision systems will perceive animals reliably.

4 Application Domains, Missions, and Situations
In the previous chapters, the basic tools have been treated for representing objects and subjects with homogeneous coordinates in a framework of the real 3-D world and with spatiotemporal models for their motion. Their application in combination with procedural computing methods will be the subject of Chapters 5 and 6. The result will be an estimated state of single objects/subjects for the point "here and now" during the visual observation process. These methods can be applied multiple times in parallel to n objects in different image regions representing different spatial angles of the world around the set of cameras.

Vision is not supposed to be a separate exercise of its own but to serve some purpose in a task or mission context of an acting individual (subject). For deeper understanding of what is being seen and perceived, the goals of egomotion and of other moving subjects as well as the future trajectories of objects tracked should be known, at least vaguely. Since there is usually no information exchange between oneself and other subjects, their future behavior can only be hypothesized based on the situation given and the behavioral capabilities of the subjects observed. However, out of the set of all objects and subjects perceived in parallel, generally only a few are of direct relevance to one's own plans of locomotion.

To be efficient in perceiving the environment, special attention and thus perceptual resources and computing power for understanding should be concentrated on the most important objects/subjects. The knowledge needed for this decision is quite different from that needed for visual object and state recognition. The decision has to take into account the mission plan and the likely behavior of other subjects nearby as well as the general environmental conditions (like quality of visual perception, weather conditions, and likely friction coefficient for maneuvering, as well as surface structure). In addition, the sets of rules for traffic regulation valid in the part of the world where the vehicle is in operation have to be taken into account.

4.1 Structuring of Application Domains
To survey where the small regime, onto which the rest of the book will be concentrating, fits in the overall picture, first (contributions to) a loosely defined ontology for ground vehicles will be given. Appendix A shows a structured proposal which, of course, is only one of many possible approaches. Here, only some aspects of certain missions and application domains are discussed to motivate the items selected for presentation in this book. An all-encompassing and complete ontology for ground vehicles would be desirable but has not yet been assembled.
From the general environmental conditions grouped under A.1, up to now only a few have been perceived explicitly by sensing, relying on the human operator to take care of the rest. More autonomous systems have to have perceptual capabilities and knowledge bases available to be able to recognize more of them by themselves. Contrary to humans, intelligent vehicles will have much more extended access to satellite navigation (such as GPS now or Galileo in the future). In combination with digital maps and geodetic information systems, this will allow them improved mission planning and global orientation.
Obstacle detection both on roads and in cross-country driving has to be performed by local perception since temporal changes are, in general, too fast to be reliably represented in databases; this will presumably also be the case in the future. In cross-country driving, beside the vertical surface profiles in the planned tracks for the wheels, the support qualities of the ground for wheels and tracks also have to be estimated from visual appearance. This is a very difficult task, and decisions should always be on the safe side (avoid entering uncertain regions).
Representing national traffic rules and regulations (Appendix A.1.1) is a straightforward task; their ranges of validity (national boundaries) have to be stored in the corresponding databases. One of the most important facts is the general rule of right- or left-hand traffic. Only a few traffic signs like stop and one-way are globally valid. With speed signs (usually a number on a white field in a red circle), the corresponding dimension has to be inferred from the country one is in (km/h in continental Europe or mph in the United Kingdom or the United States, etc.).
Lighting conditions (Appendix A.1.2) affect visual perception directly. The dynamic range of light intensity in bright sunshine with snow and harsh shadows on dark ground can be extremely large (more than six orders of magnitude may be encountered). Special high-dynamic-range cameras (HDRC) have been developed to cope with the situation. The development is still going on, and one has to find the right compromise in the price-performance trade-off. To perceive the actual situation correctly, representing the recent time history of lighting conditions and of potential disturbances from the environment may help. Weather conditions (e.g., blue skies) and time of day in connection with the set of buildings in the vicinity of the trajectory planned (tunnel, underpass, tall houses, etc.) may allow us to estimate expected changes, which can be counteracted by adjusting camera parameters or viewing directions. The most pleasant weather condition for vision is an overcast sky without precipitation.

In normal visibility, contrasts in the scene are usually good. Under foggy conditions, contrasts tend to disappear with increasing distance. The same is true at dusk or dawn when the light intensity level is low. Features linked to intensity gradients tend to become unreliable under these conditions. To better understand results in state estimation of other objects from image sequences (Chapters 5 and 6), it is therefore advantageous to monitor average image intensities as well as maximal and minimal intensity gradients. This may be done over entire images, but computing these characteristic values for certain image regions in parallel (such as sky or larger shaded regions) gives more precise results.
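The kind of per-region monitoring described here can be sketched in a few lines; the region layout, function name, and choice of statistics are illustrative assumptions, not a prescription from the text:

```python
import numpy as np

def region_stats(img, regions):
    """Per-region intensity statistics for monitoring visibility conditions.

    img     : 2-D array of gray values
    regions : dict mapping a region name -> (row_slice, col_slice)
    """
    stats = {}
    for name, (rows, cols) in regions.items():
        patch = img[rows, cols].astype(float)
        gy, gx = np.gradient(patch)            # intensity gradients per axis
        grad = np.hypot(gx, gy)                # gradient magnitude
        stats[name] = {
            "mean": patch.mean(),
            "min": patch.min(), "max": patch.max(),
            "grad_min": grad.min(), "grad_max": grad.max(),
        }
    return stats

# e.g., separate statistics for an assumed "sky" band and the road region below
regions = {"sky": (slice(0, 3), slice(None)),
           "road": (slice(3, 8), slice(None))}
```

Tracking these values (and their trends) over the image sequence then gives the steady representation of lighting conditions recommended below.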
It is recommended to have a steady representation available of intensity statistics and their trends in the image sequence: averages and variances of maximum and minimum image intensities and of maximum and minimum intensity gradients in representative regions. When surfaces are wet and the sun comes out, light reflections may lead to highlights. Water surfaces (like puddles) rippled by wind may exhibit relatively large glaring regions, which have to be excluded from image interpretation for meaningful results. Driving toward a low-standing sun under these conditions can make vision impossible. When there are multiple light sources, like at night in an urban area, regions with stable visual features have to be found, allowing tracking and orientation by avoiding highlighted regions.

Headlights of other vehicles may also become hard to deal with in rainy conditions. Backlights and stoplights when braking are relatively easy to handle but require color cameras for proper recognition. In RGB color representation, stoplights are most efficiently found in the R-image, while flashing blue lights on ambulances or police cars are most easily detected in the B-channel. Yellow or orange lights for signaling intentions (turn direction indicators) require evaluation of several RGB channels or just the intensity signal. Stationary flashing lights at construction sites (light sequencing, looking like a hopping light) for indication of an unusual traffic direction require good temporal resolution and correlation with subject vehicle perturbations to be perceived correctly.
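A minimal sketch of the R-channel idea for stoplights might look as follows; the threshold values and the dominance criterion are arbitrary assumptions for illustration, not values from the text:

```python
import numpy as np

def stoplight_candidates(rgb, r_thresh=200, dominance=1.5):
    """Boolean mask of pixels that look like lit brake lights (heuristic sketch).

    A candidate pixel is assumed bright in R and clearly dominating G and B,
    which rejects white highlights where all three channels are high.
    rgb: H x W x 3 uint8 array.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return (r >= r_thresh) & (r >= dominance * g) & (r >= dominance * b)
```

An analogous mask on the B-channel would serve for flashing blue lights; in practice such masks would only seed region-based feature extraction, not replace it.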
Recognition of weather conditions (Appendix A.1.3) is especially important when they affect the interaction of the vehicle with the ground (acceleration, deceleration through friction between tires and surface material). Recognizing and adjusting behavior to rain, hail, and snow conditions may prevent accidents by cautious driving. Slush and loose or wet dirt or gravel on the road may have similar effects and should thus be recognized. Heavy winds and gusts can have a direct effect on driving stability; however, they are not directly visible but only by secondary effects like dust or leaves whirling up or by moving grass surfaces and plants or branches of trees. Advanced vision systems should be able to perceive these weather conditions (maybe supported by inertial sensors directly feeling the accelerations on the body). Recognizing fine shades of texture may be a capability for achieving this; at present, this is beyond the performance level of microprocessors available at low cost, but the next decade may open up this avenue.
Roadway recognition (Appendix A.2) has been developed to a reasonable state since recursive estimation techniques and differential geometry descriptions were introduced two decades ago. For freeways and other well-kept, high-speed roads (Appendices A.2.1 and A.2.2), lane and road recognition can be considered state of the art. Additional developments are still required for surface state recognition, for understanding the semantics of lane markings, arrows, and other lines painted on the road, as well as for detailed perception of the infrastructure along the road. This concerns repeating poles with different reflecting lights on both sides of the roadway, the meaning of which may differ from one country to the next, guiderails on road shoulders, and many different kinds of traffic and navigation signs which have to be distinguished from advertisements. On these types of roads there is usually only unidirectional traffic (one-way), and navigation has to be done by proper lane selection.
On ordinary state roads with two-way traffic (Appendix A.2.3), the perceptual capabilities required are much more demanding. Checking free lanes for passing has to take oncoming traffic with high speed differences between vehicles and the type of central lane markings into account. With speeds allowed of up to 100 km/h in each direction, relative speed can be close to 60 m/s (or 2.4 m per video cycle of 40 ms). A 4-second passing maneuver thus requires about 250 m look-ahead range, way beyond what is found in most of today's vision systems. With the resolution required for object recognition and the perturbation level in pitch due to nonflat ground, inertial stabilization of gaze direction seems mandatory.
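The look-ahead figures can be checked directly; note that the text rounds the closing speed up to 60 m/s, which yields the quoted 2.4 m per cycle and the roughly 250 m range:

```python
# Closing-speed arithmetic for a passing maneuver on a two-way road.
v_each = 100 / 3.6          # 100 km/h in m/s (about 27.8 m/s)
v_rel = 2 * v_each          # closing speed against oncoming traffic, m/s
cycle = 0.040               # video cycle time, s
per_cycle = v_rel * cycle   # relative displacement per video frame, m
look_ahead = v_rel * 4.0    # distance closed during a 4-second maneuver, m
```

With the exact values, v_rel ≈ 55.6 m/s, about 2.2 m per frame, and about 222 m closed in 4 s; adding a safety margin brings this to the 250 m quoted above.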
These types of roads may be much less well kept. Lane markings may be reduced to a central line indicating by its type whether passing is allowed (dashed line) or not (solid line). To the sides of the road, there may be potholes to be avoided; sometimes these may be found even on the road itself.

On all of these types of road, for short periods after (re-)construction there may be no lane markings at all. In these cases, vehicles and drivers have to orient themselves according to road width and to the distance from "their" side of the sealed surface. "Migrating construction sites", like for lane marking, may be present and have to be dealt with properly. The same is true for maintenance work or for grass cutting in the summer.

Unmarked country roads (Appendix A.2.4) are usually narrow, and oncoming traffic may require slowing down and touching the road shoulders with the outer wheels. The road surface may not be well kept, with patches of dirt and high spatial-frequency surface perturbations. The most demanding item, however, may be the many different kinds of subjects on the road: people and children walking, running, and bicycling, carrying different types of loads, or guarding animals. Wild animals range from hares to deer (even moose in northern countries) and birds feeding on cadavers.
On unsealed roads (Appendix A.2.5), where the speed driven is usually much slower, in addition to the items mentioned above, the vertical surface structure becomes of increasing interest due to its unstable nature. Tracks impressed into the surface by heavily loaded vehicles can easily develop, and the likelihood of potholes (even large ones into which wheels of usual size will fit) requires stereovision for recognition, probably with sequential view fixation on especially interesting areas.

Driving cross-country, tracks (Appendix A.2.6) can alleviate the task in that they show where the ground is sufficiently solid to support a vehicle. However, due to non-homogeneous ground properties, vertical curvature profiles of high spatial frequency may have developed and have to be recognized to adjust speed so that the vehicle is not bounced around, losing ground contact. After a period of rain, when the surface tends to be softer than usual, it has to be checked whether the tracks are so deep that the vehicle touches the ground with its body when the wheels sink into the track. Especially tracks filled with water pose a difficult challenge for decision-making.

In Appendix A.2.7, all infrastructure items for all types of roads are collected to show the gamut of figures and objects which a powerful vision system for traffic application should be able to recognize. Some of these are, of course, specific to certain regions of the world (or countries). There have to be corresponding databases and algorithms for recognizing these items; they have to be swapped when entering a zone with new regulations.
In Appendix A.3, the different types of vehicles are listed. They have to be recognized and treated according to the form (shape), appearance, and function of the vehicle (Appendix A.4). This type of structuring may not seem systematic at first glance. There is, of course, one column like A.4 for each type of vehicle under A.3. Since this book concentrates on the most common wheeled vehicles (cars and trucks), only these types are discussed in more detail here. Geometric size and 3-D shape (Appendix A.4.1) have been treated to some extent in Section 2.2.3 and will be revisited for recognition in Chapters 7 to 10.
Subpart hierarchies (Appendix A.4.2) are only partially needed for vehicles driving, but when standing, open doors and hoods may yield quite different appearances of the same vehicle. The property of glass with respect to mirroring of light rays has a fundamental effect on features detected in these regions. Driving through an environment with tall buildings and trees at the side or with branches partially over the road may lead to strongly varying features on the glass surfaces of the vehicle, which have nothing to do with the vehicle itself. These regions should, therefore, be discarded for vehicle recognition, in general. On the other hand, with low light levels in the environment, the glass surfaces of the lighting elements on the front and rear of the vehicle (or even highlights on windscreens) may be the only parts discernible well and moving in conjunction; under these environmental conditions, these groups are sufficient indication for assuming a vehicle at the location observed.
Variability of image shape over time depending on the 3-D aspect conditions of the 3-D object "vehicle" (Appendix A.4.3) is important knowledge for recognizing and tracking vehicles. When machine vision was started in the second half of the last century, some researchers called the appearance or disappearance of features due to self-occlusion a "catastrophic event" because the structure of their (insufficient) algorithm with fixed feature arrangements changed. In the 4-D approach, where objects and aspect conditions are represented as in reality and where temporal changes also are systematically represented by motion models, there is nothing exciting about the appearance of new or the disappearance of previously stable features. It has been found rather early that whenever the aspect conditions bring two features close to each other so that they may be confused (wrong feature correspondence), it is better to discard these features altogether and to try to find unambiguous ones [Wünsche 1987]. The recursive estimation process to be discussed in Chapter 6 will be perturbed by wrong feature correspondence to a larger extent than by using slightly less well-suited, but unambiguous features. Grouping regimes of aspect conditions with the same highly recognizable set of features into classes is important knowledge for hypothesis generation and tracking of objects. When detecting new feature sets in a task domain, it may be necessary to start more than one object hypothesis for fast recognition of the object observed. Such 4-D object hypotheses allow predicting other features which should be easily visible; in case they cannot be found in the next few images, the hypothesis can be discarded immediately. An early jump to several 4-D hypotheses thus has advantages over trying too many feature combinations before daring an object hypothesis (known as a combinatorial explosion in the vision literature).
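The predict-and-discard scheme for competing object hypotheses can be sketched as follows; the class names, the miss counter, and the "half the predicted features" criterion are our own illustrative assumptions:

```python
# Each hypothesis predicts a feature set; hypotheses whose predictions
# repeatedly find no support in the measured features are discarded.
MAX_MISSES = 3   # assumed patience: "not found in the next few images"

class Hypothesis:
    def __init__(self, name, predicted_features):
        self.name = name
        self.predicted = set(predicted_features)
        self.misses = 0

    def update(self, measured_features):
        # count a miss when fewer than half the predicted features are found
        found = self.predicted & set(measured_features)
        self.misses = 0 if len(found) * 2 >= len(self.predicted) else self.misses + 1
        return self.misses < MAX_MISSES      # False -> discard this hypothesis

def prune(hypotheses, measured_features):
    """Keep only hypotheses still supported by the current image's features."""
    return [h for h in hypotheses if h.update(measured_features)]
```

Starting several such hypotheses early and pruning them against the next few images is exactly the alternative to exhaustive feature-combination search described above.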
Photometric appearance (Appendix A.4.4) can help, in connection with the aspect conditions, to find out the proper hypothesis. Intensity and color shading as well as high resolution in texture discrimination contribute positively to eliminating false object hypotheses. Computing power and algorithms are becoming available now for using these region-based features efficiently. The last four sections discussed are concerned with single-object (vehicle) recognition based on image sequence analysis. In our approach, this is done by specialist processes for certain object classes (roads and lanes, other vehicles, landmarks, etc.).
When it comes to understanding the semantics of processes observed, the functionality aspects (Appendix A.4.5) prevail. For proper recognition, observations have to be based on spatially and temporally more extended representations. Trying to do this with data-intensive images is not yet possible today, and maybe not even desirable in the long run because of data efficiency and the corresponding delay times involved. For this reason, the results of perceiving single objects (subjects) "here and now" directly from image sequence analysis with spatiotemporal models are collected in a "dynamic object database" (DOB) in symbolic form. Objects and subjects are represented as members of special classes with an identification number, their time of appearance, and their relative state defined by homogeneous coordinates, as discussed in Section 2.1.1. Together with the algorithms for homogeneous coordinate transformations and shape computation, this represents a very compact but precise state and shape description. Data volumes required are decreased by two to three orders of magnitude (KB instead of MB). Time histories of state variables are thus manageable for several (the most important) objects/subjects observed.
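A minimal sketch of what one symbolic DOB entry might hold, with a ring buffer for the time history; the field names and buffer length are assumptions for illustration, not the book's actual data layout:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class DOBEntry:
    obj_id: int        # identification number
    obj_class: str     # e.g. "car", "truck", "pedestrian"
    t_appeared: float  # time of first appearance, s
    state: tuple       # relative state in homogeneous coordinates
    # ring buffer of past (time, state) pairs; length 100 is an assumption
    history: deque = field(default_factory=lambda: deque(maxlen=100))

    def update(self, t, new_state):
        """Archive the previous state and install the new estimate."""
        self.history.append((t, self.state))
        self.state = new_state
```

A few floats per cycle and object instead of full images is what makes the two to three orders of magnitude of data reduction, and thus the stored time histories, affordable.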
For subjects, this allows recognizing and understanding maneuvers and behaviors of which members of this type of subject class are known to be capable (Appendix A.4.6). Explicit representations of the perceptual and behavioral capabilities of subjects are a precondition for this performance level. Tables 3.1 and 3.3 list the most essential capabilities and behavioral modes needed for road traffic participants. Based on the data in the ring buffer of the DOB for each subject observed, this background knowledge now allows guessing the intentions of the other subject. This qualitatively new information may additionally be stored in special slots of the subject’s representation. Extended observations and comparisons to standards for decision-making and behavior realization now allow attributing additional characteristic properties to the subject observed. Together with the methods available for predicting movements into the future (fast in-advance simulation), this allows predicting the likely movements of the other subject; both results can be compared and assessed for dangerous situations encountered. Thus, real-time vision as propagated here is an animation process with several individuals, based on previous (actual) observations and on inferences from a knowledge base about their intentions (expected behavior).
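The idea of fast in-advance simulation can be illustrated with a deliberately simple motion model: roll the last estimated state of another subject forward in time to obtain its likely positions. The constant-speed, constant-yaw-rate (unicycle) model, the step size, and the horizon below are assumptions chosen for the sketch, not the book's prediction models.

```python
# Sketch of fast in-advance simulation: forward-integrate a simple planar
# unicycle model from the last estimated state. Model and step size are
# illustrative assumptions.

from math import cos, sin


def predict(x, y, yaw, v, yaw_rate, dt, steps):
    """Euler-integrate the unicycle model; return the predicted (x, y) path."""
    path = []
    for _ in range(steps):
        x += v * cos(yaw) * dt
        y += v * sin(yaw) * dt
        yaw += yaw_rate * dt
        path.append((x, y))
    return path


# Predict 2 s ahead at 40 ms steps for a vehicle going 20 m/s straight on:
path = predict(x=0.0, y=0.0, yaw=0.0, v=20.0, yaw_rate=0.0, dt=0.04, steps=50)
```

Because the model is so cheap, many such rollouts per video cycle are feasible; the predicted path can then be compared against the subject's actually observed motion to flag dangerous deviations.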
This demanding process cannot be performed for all subjects in sight but has to be confined to the most relevant ones nearby. Selecting and perceiving these most relevant subjects correctly and focusing attention on them is one of the decisive tasks to be performed steadily. The judgment of which subject is most relevant also depends on the task to be performed. When just cruising with ample time available, the situation is different from the same cruising state in the leftmost of three lanes when an exit at the right is to be taken in the near future. On a state road, cruising in the rightmost lane but having to take a turnoff to the left from the leftmost lane yields a similar situation. So the situation is not just given by the geometric arrangement of objects and subjects but also depends on the task domain and on the intentions to be realized.
Making predictions about the behavior of other subjects is a difficult task, especially when their perceptual capabilities (Appendix A.4.7) and those for planning and decision-making (Appendix A.4.8) are not known. This may be the case with respect to animals in unknown environments. These topics (Appendix A.6) and the well-known but very complex appearance and behavior of humans (Appendix A.5) are not treated here.
Appendix A.7 is intended to clarify some notions in vehicle and traffic control for which different professional communities have developed different terminologies. (Unfortunately, it cannot be assumed that, for example, the terms “dynamic system” or “state” will be understood with the same meaning by a person from the computer science community and by one from the control engineering community.)
4.2 Goals and Their Relations to Capabilities
To perform a mission efficiently under perturbations, both the goal of the mission, together with some quality criteria for judging mission performance, and the capabilities needed to achieve it have to be known.
The main goal of road vehicle traffic is to transport humans or goods from point A to point B safely and reliably, observing some side constraints and maybe some optimization criteria. A smooth ride with low values of the time integrals of the (longitudinal and lateral) acceleration magnitudes (absolute values) is the normal way of driving (avoiding hectic control inputs). For special missions, e.g., for an ambulance or for touring sightseers, these integrals should be minimized.
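The smoothness criterion above can be evaluated numerically: integrate the magnitudes of longitudinal and lateral acceleration over the mission, here by the trapezoidal rule. The sampling and function names are assumptions for illustration; only the criterion itself comes from the text.

```python
# Sketch: ride-comfort cost J = integral of (|a_long| + |a_lat|) dt,
# approximated by the trapezoidal rule over sampled accelerations.

def comfort_cost(t, a_long, a_lat):
    """Approximate the time integral of |a_long| + |a_lat| (trapezoidal)."""
    cost = 0.0
    for i in range(1, len(t)):
        f0 = abs(a_long[i - 1]) + abs(a_lat[i - 1])
        f1 = abs(a_long[i]) + abs(a_lat[i])
        cost += 0.5 * (f0 + f1) * (t[i] - t[i - 1])
    return cost


# Constant 1 m/s^2 longitudinal braking for 2 s, no lateral acceleration:
J = comfort_cost([0.0, 1.0, 2.0], [-1.0, -1.0, -1.0], [0.0, 0.0, 0.0])
```

A mission optimized for an ambulance would minimize J subject to arrival-time constraints; racing, by contrast, trades a large J for minimal travel time.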
An extreme type of mission is racing, exploiting vehicle capabilities to the utmost and probably reducing safety by taking more risks. Minimal fuel consumption is the other extreme, where travel time is of almost no concern.
Safety and collision avoidance, even under adverse conditions and in totally unexpected situations, is the predominant aspect of vehicle guidance. Driving at lower speed very often increases safety; however, on high-speed roads during heavy traffic, it can sometimes worsen safety. Going downhill, the additional thrust from gravity has to be taken into account, which may increase braking distance considerably. When entering a crossroad or when starting a passing maneuver on a road with two-way traffic, the speed of other vehicles has to be estimated with special care, and an additional safety margin for estimation errors should be allowed. Here, it is important that the acceleration capabilities of the subject vehicle under the given conditions (actual mass, friction coefficient, power reserves) are well known and sufficient.

When passing on high-speed roads with multiple lanes, other vehicles in the convoy being passed sometimes start changing into your lane at short distances without using indicator signals (blinker); even these critical situations, not conforming to standard behavior, have to be coped with successfully.
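The downhill effect mentioned above can be made concrete with a textbook point-mass model (an illustration, not the book's vehicle model): the usable deceleration on a slope of angle θ is g(μ·cosθ − sinθ), so the friction-limited stopping distance is d = v²/(2g(μ·cosθ − sinθ)).

```python
# Sketch: friction-limited braking distance on a downhill grade.
# Point-mass model; G and the example parameters are illustrative.

from math import cos, radians, sin

G = 9.81  # gravitational acceleration, m/s^2


def braking_distance(v, mu, grade_deg=0.0):
    """Stopping distance (m) from speed v (m/s): a downhill grade reduces
    the usable deceleration g*(mu*cos(theta) - sin(theta))."""
    theta = radians(grade_deg)
    decel = G * (mu * cos(theta) - sin(theta))
    if decel <= 0.0:
        raise ValueError("vehicle cannot stop: gravity thrust exceeds friction")
    return v * v / (2.0 * decel)


flat = braking_distance(v=25.0, mu=0.8)                 # level road
hill = braking_distance(v=25.0, mu=0.8, grade_deg=6.0)  # 6 deg downhill
```

Even a moderate 6-degree downgrade lengthens the stop noticeably, which is why the gravity term must enter the safety-margin computation.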
4.3 Situations as Precise Decision Scenarios
The definition of “situation” used here is the following: A situation encompasses all aspects of relevance for decision-making in a given scenario and mission context. This includes environmental conditions affecting perception and limit values for control application (such as wheel-to-ground friction coefficients) as well as the set of traffic regulations actually valid that have been announced by traffic signs (maximum speed allowed, passing prohibited, etc.). With respect to other objects/subjects, a situation is not characterized by a single relation to one other unit but by the relations to all objects of relevance. Deciding which of those detected and tracked are relevant is difficult. Even the selected regions of special attention are of importance. The objects/subjects of relevance are not necessarily the nearest ones; for example, when driving at higher speed, some event happening at a farther look-ahead distance than the two preceding vehicles may be of importance: A patch of dense fog or a front of heavy rain or snow can be detected reliably at relatively long distance. One should start reacting to these signs at a safe distance according to independent judgment, and not only when the preceding vehicles start their reactions.
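One simple way to operationalize "relevance is not the same as nearness" is to rank tracked objects by time-to-contact (range over closing speed) rather than by raw distance, so that a fast-approaching far object can outrank a near, slowly approaching one. This scoring rule is an illustrative assumption, not the book's selection algorithm.

```python
# Sketch: relevance ranking by time-to-contact instead of raw distance.
# The rule and the example data are illustrative assumptions.

def time_to_contact(range_m, closing_speed):
    """Seconds until contact; infinite if the object is not closing in."""
    return range_m / closing_speed if closing_speed > 0 else float("inf")


def most_relevant(objects, k=2):
    """objects: list of (name, range_m, closing_speed_m_s).
    Return the k most urgent ones for focused attention."""
    return sorted(objects, key=lambda o: time_to_contact(o[1], o[2]))[:k]


tracked = [("near_slow", 30.0, 2.0),    # TTC = 15 s
           ("far_fast", 120.0, 30.0),   # TTC = 4 s
           ("receding", 50.0, -5.0)]    # never closes in
top = most_relevant(tracked)
```

A real system would additionally weight the score by the current mission element (cruising, lane change, turnoff), as the text emphasizes.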
Some situational aspects can be taken into account during mission planning. For example, driving on roads heading into the low-standing sun in the morning or evening can be avoided by proper selection of travel time. Traffic congestion during rush hour may also be avoided by proper timing. Otherwise, the driver or autonomous vehicle has to perceive the indicators for situational aspects, and the proper behavior has to be selected from a knowledge base. The three components required to perform this reliably are discussed in the sections below: the environmental background, the objects/subjects of relevance, and the rule systems for decision-making. Besides the rules for handling planned missions, another set of perceptual events has to be monitored, which may require another set of rules for selecting proper reactions to these events.
4.3.1 Environmental Background
This topic has not received sufficient attention in the recent past since, at first, the basic capabilities of perceiving roads and lanes as well as other vehicles had to be demonstrated. Computing power for including at least some basic aspects of environmental conditions at reasonable cost is now coming along. In Section 4.1 and Appendices A.1.2 (lighting conditions) and A.1.3 (weather conditions), some aspects have already been mentioned. Since these environmental conditions change rather slowly, they may be perceived at a low rate (in the range of seconds to minutes). An economical way to achieve this may be to allot the remaining processing time per video cycle of otherwise dedicated image processing computers to this “environmental processing” algorithm. These low-frequency results should be made available to all other processes by providing special slots in the DOB and depositing the values with proper time stamps. The situation assessment algorithm has to check these values regularly for decision-making.
The specialist processes for visual perception should also have a look at them to adjust parameters in their algorithms for improving results. In the long run, a direct feedback component for learning may be derived. Perceiving weather conditions through textures may be very computation-intensive; once the other basic perception tasks for roads and other vehicles run sufficiently reliably, additional computing power becoming available may be devoted to this task, which again can run at a very low rate. Building up a knowledge base for the inference from distributed textures in the images toward environmental conditions will require a large effort. This includes the transitions in behavior required for safe mission performance.
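The time-stamped environment slots described above can be sketched as a small key-value store in which consumers judge freshness themselves. The slot names, the staleness threshold, and the `None`-for-unknown convention are assumptions for illustration.

```python
# Sketch: low-rate environmental results deposited in DOB slots with time
# stamps, so that consumers can detect stale values. Names and the
# staleness threshold are illustrative assumptions.

class EnvironmentSlots:
    def __init__(self, max_age_s=60.0):
        self.max_age_s = max_age_s  # env. conditions change slowly
        self._slots = {}            # slot name -> (timestamp, value)

    def deposit(self, name, t, value):
        """Store a freshly computed low-frequency result with its time stamp."""
        self._slots[name] = (t, value)

    def read(self, name, now):
        """Return the value if fresh enough, else None (treat as unknown)."""
        if name not in self._slots:
            return None
        t, value = self._slots[name]
        return value if now - t <= self.max_age_s else None


env = EnvironmentSlots()
env.deposit("visibility_m", t=100.0, value=400.0)
fresh = env.read("visibility_m", now=130.0)  # within 60 s: usable
stale = env.read("visibility_m", now=200.0)  # too old: treat as unknown
```

The explicit time stamp is what lets a perception process running every 40 ms coexist with an environmental process updating only every minute.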
4.3.2 Objects/Subjects of Relevance
A first essential step is to direct attention (by gaze control and corresponding image evaluation) to the proper environmental regions, depending on the mission element being performed. This is, of course, different for simple road running, for preparing lane changes, or for performing a turnoff maneuver. Turning off to the left on roads with oncoming (right-hand) traffic is especially demanding since the oncoming lane has to be crossed.

Driving in urban environments with right-of-way for vehicles on crossroads coming from the right also requires special attention (looking into the road). Entering traffic circles requires checking traffic in the circle, because these vehicles have the right-of-way. Especially difficult are the four-way stops in use in some countries; here, the right-of-way depends on the times of reaching the stop lines on all four incoming roads.
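The four-way-stop rule above amounts to ordering vehicles by their arrival times at the stop lines: first come, first served. The sketch below makes only that point; tie-breaking rules (e.g., yielding to the right on simultaneous arrival) are omitted, and all names are illustrative assumptions.

```python
# Sketch: four-way-stop right-of-way as first-come, first-served ordering
# of arrival times at the stop lines. Tie-breaking is deliberately omitted.

def departure_order(arrivals):
    """arrivals: dict mapping vehicle -> time of reaching its stop line.
    Returns the vehicles in the order they may proceed."""
    return sorted(arrivals, key=arrivals.get)


# Perceived arrival times (s) on the four incoming roads:
order = departure_order({"north": 3.2, "east": 1.5, "south": 2.8, "west": 4.0})
```

The perceptual difficulty lies in the input, not the rule: the subject vehicle must estimate the stop-line arrival times of three other vehicles from vision alone.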
Humans may be walking on roads through populated areas and in stop-and-go traffic. On state, urban, and minor roads, humans may ride bicycles, roller skate, jog, walk, or stroll leisurely. Children may be playing on the road. Recognizing these situations with their semantic context is currently out of range for machine vision. However, detecting and recognizing moving volumes (partially) filled with massive bodies is in the making and will become available soon for real-time application. Avoiding these areas with a relatively large safety margin may be sufficient for driver assistance and even for autonomous driving. Some nice results for assistance in recognizing humans crossing in front of the vehicle (walking or biking) have been achieved in the framework of the project “Invent” [Franke et al. 2005].
With respect to animals on the road, there are no additional principal difficulties for perception except the perhaps erratic motion behavior some of these animals may show. Birds can both move on the ground and lift off for flying; in the transition period, there are considerable changes in their appearance. Both their shapes and the motion characteristics of their limbs and wings change to a large extent.