
Handbook of Industrial Automation - Richard L. Shell and Ernest L. Hall, Part 9

DOCUMENT INFORMATION

Title: Machine Vision Fundamentals
Authors: Guda et al.
Field: Industrial Automation
Type: handbook chapter
Year: 2000
Pages: 37
Size: 908.12 KB

Contents



basic operations, like linear filtering and modulation, are easily described in the Fourier domain. A common example of Fourier transforms can be seen in the appearance of stars. A star looks like a small point of twinkling light. However, the small point of light we observe is actually the far-field Fraunhofer diffraction pattern, or Fourier transform, of the image of the star. The twinkling is due to the motion of our eyes. The moon image looks quite different, since we are close enough to view the near-field, or Fresnel, diffraction pattern.

While the most common transform is the Fourier transform, there are also several closely related transforms. The Hadamard, Walsh, and discrete cosine transforms are used in the area of image compression. The Hough transform is used to find straight lines in a binary image. The Hotelling transform is commonly used to find the orientation of the maximum dimension of an object [5].

2.4.2.1 Fourier Transform

The one-dimensional Fourier transform may be written as

$F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-j2\pi ux}\, dx$

Figure 6 Images at various gray-scale quantization ranges

Figure 7 Digitized image

Figure 8 Color cube shows the three-dimensional nature of color

Figure 9 Image surface and viewing geometry effects


In the two-dimensional case, the Fourier transform and its corresponding inverse representation are

$F(u, v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, e^{-j2\pi(ux + vy)}\, dx\, dy$

$f(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)\, e^{j2\pi(ux + vy)}\, du\, dv$

The discrete two-dimensional Fourier transform and corresponding inverse relationship may be written, in one common convention, as

$F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j2\pi(ux/M + vy/N)}$

$f(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j2\pi(ux/M + vy/N)}$

The fact that the input and output of a linear, position-invariant system are related by a convolution is an important principle. The basic idea of convolution is that if we have two images, for example, pictures A and B, then the convolution of A and B means repeating the whole of A at every point in B, or vice versa. An example of the convolution theorem is shown in Fig. 12. The convolution theorem enables us to do many important things. During the Apollo 13 space flight, the astronauts took a photograph of their damaged spacecraft, but it was out of focus. Image processing methods allowed such an out-of-focus picture to be put back into focus and clarified.
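The convolution theorem can be demonstrated in a few lines. Below is a minimal sketch using NumPy's FFT; the synthetic image, the averaging kernel, and the circular (periodic) boundary conditions are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))

kernel = np.zeros((64, 64))
kernel[:3, :3] = 1.0 / 9.0   # 3x3 averaging (smoothing) kernel

# Convolution theorem: convolution in the spatial domain is equivalent
# to pointwise multiplication in the Fourier domain.
blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))
```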

2.4.3 Image Enhancement

Image enhancement techniques are designed to improve the quality of an image as perceived by a human [1]. Some typical image enhancement techniques include gray-scale conversion, histograms, color composition, etc. The aim of image enhancement is to improve the interpretability or perception of information in images for human viewers, or to provide ``better'' input for other automated image processing techniques.

2.4.3.1 Histograms

The simplest types of image operations are point operations, which are performed identically on each point in an image. One of the most useful point operations is based on the histogram of an image.

Figure 10 Diffuse surface reflection

Figure 11 Specular reflection


the image enables us to generate another image with a gray-level distribution having a uniform density. This transformation can be implemented by a three-step process:

1. Compute the histogram of the image.
2. Compute the cumulative distribution of the gray levels.
3. Replace the original gray-level intensities using the mapping determined in step 2.

After these processes, the original image, shown in Fig. 13, can be transformed, scaled, and viewed as shown in Fig. 16. The new gray-level value set $S_k$, which represents the cumulative sum, is

$S_k = (1/7,\; 2/7,\; 5/7,\; 5/7,\; 5/7,\; 6/7,\; 6/7,\; 7/7)$
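The three-step process maps directly onto a few lines of code. The following is a minimal sketch of histogram equalization, assuming an 8-bit grayscale image held in a NumPy array; the function name and the use of NumPy are illustrative choices, not from the original text.

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image via the three-step process."""
    # Step 1: compute the histogram of the image.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    # Step 2: compute the cumulative distribution of the gray levels.
    cdf = hist.cumsum() / img.size
    # Step 3: replace each gray level using the cumulative mapping.
    mapping = np.round(255 * cdf).astype(np.uint8)
    return mapping[img]
```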

Histogram Specification. Even after the equalization process, certain levels may still dominate the image so that the eye cannot interpret the contribution of the other levels. One way to solve this problem is to specify a histogram distribution that enhances selected gray levels relative to others and then reconstitutes the original image in terms of the new distribution. For example, we may decide to reduce the levels between 0 and 2, the background levels, and increase the levels between 5 and 7 correspondingly. After steps similar to those of histogram equalization, we can get the new gray-level set $S_k'$:

$S_k' = (1/7,\; 5/7,\; 6/7,\; 6/7,\; 6/7,\; 6/7,\; 7/7,\; 7/7)$

By placing these values into the image, we can get the new histogram-specified image shown in Fig. 17.

Image Thresholding. This is the process of separating an image into different regions. This may be based upon its gray-level distribution. Figure 18 shows how an image looks after thresholding. The percentage

Figure 15 An example of histogram equalization: (a) original image, (b) histogram, (c) equalized histogram, (d) enhanced image

Figure 16 Original image before histogram equalization


Next, we shift the window one pixel to the right and repeat the calculation. After calculating all the pixels in the line, we then reposition the matrix one pixel down and repeat this procedure. At the end of the entire process, we have a set of T values, which enable us to determine the existence of the edge. Depending on the values used in the mask template, various effects such as smoothing or edge detection will result.

Since edges correspond to areas in the image where the image varies greatly in brightness, one idea would be to differentiate the image, looking for places where the magnitude of the derivative is large. The only drawback to this approach is that differentiation enhances noise. Thus, it needs to be combined with smoothing.

Smoothing Using Gaussians. One form of smoothing the image is to convolve the image intensity with a Gaussian function. Let us suppose that the image is of infinite extent and that the image intensity is $I(x, y)$. The Gaussian is a function of the form

$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}$  (12)

The result of convolving the image with this function is equivalent to lowpass filtering the image. The higher the sigma, the greater the lowpass filter's effect. The filtered image is

$\tilde{I}(x, y) = I(x, y) * G(x, y)$  (13)

One effect of smoothing with a Gaussian function is a reduction in the amount of noise, because of the lowpass characteristic of the Gaussian function. Figure 20 shows the image with noise added to the original, Fig. 19. Figure 21 shows the image filtered by a lowpass Gaussian function with $\sigma = 3$.
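As a concrete illustration, Gaussian smoothing is a one-call operation in common image libraries. The sketch below assumes SciPy and a synthetic NumPy image; `sigma=3` mirrors the example of Fig. 21, and the test image is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Simulate a noisy image: a bright square on a dark background plus noise.
rng = np.random.default_rng(0)
img = np.zeros((128, 128))
img[32:96, 32:96] = 1.0
noisy = img + rng.normal(scale=0.3, size=img.shape)

# Convolving with a Gaussian (Eq. 12) lowpass-filters the image;
# a larger sigma gives a stronger smoothing effect.
smoothed = gaussian_filter(noisy, sigma=3)
```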

Vertical Edges. To detect vertical edges we first convolve with a Gaussian function,

$\tilde{I}(x, y) = I(x, y) * G(x, y)$  (14)

and then differentiate the resultant image in the x-direction. This is the same as convolving the image with the derivative of the Gaussian function in the x-direction, that is,

$\frac{\partial G}{\partial x} = -\frac{x}{2\pi\sigma^4}\, e^{-(x^2 + y^2)/2\sigma^2}$  (15)

Then, one marks the peaks in the resultant images that are above a prescribed threshold as edges (the threshold is chosen so that the effects of noise are minimized). The effect of doing this on the image of Fig. 21 is shown in Fig. 22.

Horizontal Edges. To detect horizontal edges we first convolve with a Gaussian and then differentiate the resultant image in the y-direction. But this is the same as convolving the image with the derivative of the Gaussian function in the y-direction, that is,

$\frac{\partial G}{\partial y} = -\frac{y}{2\pi\sigma^4}\, e^{-(x^2 + y^2)/2\sigma^2}$

Figure 19 A digital image from a camera

Figure 20 The original image corrupted with noise

Figure 21 The noisy image filtered by a Gaussian of variance 3
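The convolve-then-differentiate recipe is exposed directly by common libraries as derivative-of-Gaussian filtering. A minimal sketch assuming SciPy; the threshold value is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_edges(img: np.ndarray, sigma: float = 3.0, thresh: float = 0.05):
    """Detect vertical and horizontal edges with derivative-of-Gaussian filters."""
    # order=1 along an axis convolves with the Gaussian derivative in that
    # direction (Eq. 15), combining smoothing and differentiation in one step.
    dx = gaussian_filter(img, sigma=sigma, order=(0, 1))  # d/dx: vertical edges
    dy = gaussian_filter(img, sigma=sigma, order=(1, 0))  # d/dy: horizontal edges
    # Mark peaks whose magnitude exceeds a prescribed threshold as edges.
    return np.abs(dx) > thresh, np.abs(dy) > thresh
```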


Stereometry. This is the technique of deriving a range image from a stereo pair of brightness images. It has long been used as a manual technique for creating elevation maps of the earth's surface.

Stereoscopic Display. If it is possible to compute a range image from a stereo pair, then it should be possible to generate a stereo pair given a single brightness image and a range image. In fact, this technique makes it possible to generate stereoscopic displays that give the viewer a sensation of depth.

Shaded Surface Display. By modeling the imaging system, one can compute the digital image that would result if the object existed and if it were digitized by conventional means. Shaded surface display grew out of the domain of computer graphics and has developed rapidly in the past few years.

2.4.5 Image Recognition and Decisions

2.4.5.1 Neural Networks

Artificial neural networks (ANNs) can be used in image processing applications. Initially inspired by biological nervous systems, the development of artificial neural networks has more recently been motivated by their applicability to certain types of problem and their potential for parallel processing implementations.

Biological Neurons. There are about a hundred billion neurons in the brain, and they come in many different varieties, with a highly complicated internal structure. Since we are more interested in large networks of such units, we will avoid a great level of detail, focusing instead on their salient computational features. A schematic diagram of a single biological neuron is shown in Fig. 27.

The cells at the neuron connections, or synapses, receive information in the form of electrical pulses from the other neurons. The synapses connect to the cell inputs, or dendrites, and the electrical signal output of the neuron is carried by the axon. An electrical pulse is sent down the axon, or the neuron ``fires,'' when the total input stimuli from all of the dendrites exceeds a certain threshold. Interestingly, this local processing of interconnected neurons results in self-organized emergent behavior.

Artificial Neuron Model. The most commonly used neuron model, depicted in Fig. 28, is based on the

Figure 26 Edges of the original image

Figure 27 A schematic diagram of a single biological neuron

Figure 28 ANN model proposed by McCulloch and Pitts in 1943


model proposed by McCulloch and Pitts in 1943 [11]. In this model, each neuron's input, $a_1, \ldots, a_n$, is weighted by the values $w_{i1}, \ldots, w_{in}$. A bias, or offset, in the node is characterized by an additional constant input $w_0$. The output, $a_i$, is obtained in terms of the equation

$a_i = f\Big(\sum_{j=1}^{n} w_{ij} a_j + w_0\Big)$

where $f$ is a threshold activation function.
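A minimal sketch of this neuron in code, assuming NumPy, a hard threshold activation, and illustrative weights (none of these specifics come from the original text):

```python
import numpy as np

def mcculloch_pitts_neuron(inputs: np.ndarray, weights: np.ndarray,
                           bias: float) -> int:
    """Weighted sum of inputs plus bias, passed through a hard threshold."""
    activation = np.dot(weights, inputs) + bias
    # The neuron "fires" (outputs 1) when the total stimulus exceeds zero.
    return 1 if activation > 0 else 0

# Example: a two-input neuron implementing logical AND.
print(mcculloch_pitts_neuron(np.array([1, 1]), np.array([1.0, 1.0]), -1.5))  # 1
print(mcculloch_pitts_neuron(np.array([1, 0]), np.array([1.0, 1.0]), -1.5))  # 0
```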

Feedforward and Feedback Networks. Figure 29 shows a feedforward network in which the neurons are organized into an input layer, hidden layer or layers, and an output layer. The values for the input layer are set by the environment, while the output layer values, analogous to a control signal, are returned to the environment. The hidden layers have no external connections; they only have connections with other layers in the network. In a feedforward network, a weight $w_{ij}$ is only nonzero if neuron $i$ is in one layer and neuron $j$ is in the previous layer. This ensures that information flows forward through the network, from the input layer to the hidden layer(s) to the output layer. More complicated forms for neural networks exist and can be found in standard textbooks.

Training a neural network involves determining the weights $w_{ij}$ such that an input layer presented with information results in the output layer having a correct response. This training is the fundamental concern when attempting to construct a useful network.

Feedback networks are more general than feedforward networks and may exhibit different kinds of behavior. A feedforward network will normally settle into a state that is dependent on its input state, but a feedback network may proceed through a sequence of states, even though there is no change in the external inputs to the network.
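To make the layer-to-layer flow concrete, here is a minimal sketch of a forward pass through a single hidden layer, assuming NumPy and sigmoid activations (both illustrative choices; the chapter does not prescribe an activation function):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate an input vector through hidden and output layers."""
    h = sigmoid(w_hidden @ x + b_hidden)   # input layer -> hidden layer
    y = sigmoid(w_out @ h + b_out)         # hidden layer -> output layer
    return y

rng = np.random.default_rng(1)
x = rng.random(4)                           # input layer set by the environment
y = forward(x, rng.normal(size=(3, 4)), np.zeros(3),
            rng.normal(size=(2, 3)), np.zeros(2))
```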

2.4.5.2 Supervised Learning and Unsupervised Learning

Image recognition and decision making is a process of discovering, identifying, and understanding patterns that are relevant to the performance of an image-based task. One of the principal goals of image recognition by computer is to endow a machine with the capability to approximate, in some sense, a similar capability in human beings. For example, in a system that automatically reads images of typed documents, the patterns of interest are alphanumeric characters, and the goal is to achieve character recognition accuracy that is as close as possible to the superb capability exhibited by human beings for performing such tasks. Image recognition systems can be designed and implemented for limited operational environments. Research in biological and computational systems is continually discovering new and promising theories to explain human visual cognition. However, we do not yet know how to endow these theories and applications with a level of performance that even comes close to emulating human capabilities in performing general image decision functionality. For example, some machines are capable of reading printed, properly formatted documents at speeds that are orders of magnitude faster than the speed that the most skilled human reader could achieve. However, systems of this type are highly specialized and thus have little extendibility. That means that current theoretical and implementation limitations in the field of image analysis and decision making imply solutions that are highly problem dependent.

Different formulations of learning from an environment provide different amounts and forms of information about the individual and the goal of learning. We will discuss two different classes of such formulations of learning.

Supervised Learning. For supervised learning, a ``training set'' of inputs and outputs is provided. The weights must then be determined to provide the correct output for each input. During the training process, the weights are adjusted to minimize the difference between the desired and actual outputs for each input pattern.

If the association is completely predefined, it is easy to define an error metric, for example mean-squared error, of the associated response. This in turn gives us the possibility of comparing the performance with the

Figure 29 A feedforward neural network


predefined responses (the ``supervision''), changing the learning system in the direction in which the error diminishes.
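A minimal sketch of this supervised weight adjustment, assuming NumPy and a simple perceptron-style update rule on a toy training set (the rule, learning rate, and data are all illustrative):

```python
import numpy as np

# Toy training set: inputs and desired outputs (logical OR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 1], dtype=float)

w = np.zeros(2)
b = 0.0
lr = 0.1  # learning rate (illustrative)

for epoch in range(100):
    for x, target in zip(X, d):
        y = 1.0 if w @ x + b > 0 else 0.0
        err = target - y      # difference between desired and actual output
        w += lr * err * x     # adjust weights in the error-diminishing direction
        b += lr * err
```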

Unsupervised Learning. The network is able to discover statistical regularities in its input space and can automatically develop different modes of behavior to represent different classes of inputs. In practical applications, some ``labeling'' is required after training, since it is not known at the outset which mode of behavior will be associated with a given input class. Since the system is given no information about the goal of learning, all that is learned is a consequence of the learning rule selected, together with the individual training data. This type of learning is frequently referred to as self-organization.

A particular class of unsupervised learning rule which has been extremely influential is Hebbian learning [12]. The Hebb rule acts to strengthen often-used pathways in a network, and was used by Hebb to account for some of the phenomena of classical conditioning.

Primarily, some type of regularity in the data can be learned by this learning system. The associations found by unsupervised learning define representations optimized for their information content. Since one of the problems of intelligent information processing deals with selecting and compressing information, the role of unsupervised learning principles is crucial for the efficiency of such intelligent systems.

2.4.6 Image Processing Applications

Artificial neural networks can be used in image processing applications. Many of the techniques used are variants of other commonly used methods of pattern recognition. However, other approaches of image processing may require modeling of the objects to be found within an image, while neural network models often work by a training process. Such models also need attention devices, or invariant properties, as it is usually infeasible to train a network to recognize instances of a particular object class in all orientations, sizes, and locations within an image.

One method commonly used is to train a network using a relatively small window for the recognition of objects to be classified, then to pass the window over the image data in order to locate the sought object, which can then be classified once located. In some engineering applications this process can be performed by image preprocessing operations, since it is possible to capture the image of objects in a restricted range of orientations with predetermined locations and appropriate magnification.

Before the recognition stage, the system parameters have to be determined, such as which image transform is to be used. These transformations include Fourier transforms, or using polar coordinates or other specialized coding schemes, such as the chain code. One interesting neural network model is the neocognitron model of Fukushima and Miyake [13], which is capable of recognizing characters in arbitrary locations, sizes, and orientations, by the use of a multilayered network. For machine vision, the particular operations include setting the quantization levels for the image, normalizing the image size, rotating the image into a standard orientation, filtering out background detail, contrast enhancement, and edge detection. Standard techniques are available for these and it may be more effective to use these before presenting the transformed data to a neural network.

2.4.6.1 Steps in Setting Up an Application

The main steps are shown below.

Physical setup: light source, camera placement, focus, field of view
Software setup: window placement, threshold, image map
Feature extraction: region shape features, gray-scale values, edge detection
Decision processing: decision function, training, testing

2.4.7 Future Development of Machine Vision

Although image processing has been successfully applied to many industrial applications, there are still many definitive differences and gaps between machine vision and human vision. Past successful applications have not always been attained easily. Many difficult problems have been solved one by one, sometimes by simplifying the background and redesigning the objects. Machine vision requirements are sure to increase in the future, as the ultimate goal of machine vision research is obviously to approach the capability of the human eye. Although it seems extremely difficult to attain, it remains a challenge to achieve highly functional vision systems.

The narrow dynamic range of detectable brightness causes a number of difficulties in image processing. A novel sensor with a wide detection range will drastically change the impact of image processing. As microelectronics technology progresses, three-dimensional


compound-sensor large-scale integrated circuits (LSI) are also anticipated, to which at least preprocessing capability should be provided.

As to image processors themselves, the local parallel pipelined processor may be further improved to provide higher processing speeds. At the same time, the multiprocessor image processor may be applied in industry when the key processing element becomes more widely available. The image processor will become smaller and faster, and will have new functions, in response to the advancement of semiconductor technology, such as progress in system-on-chip configurations and wafer-scale integration. It may also be possible to realize one-chip intelligent processors for high-level processing, and to combine these with rather low-level one-chip image processors to achieve intelligent processing, such as knowledge-based or model-based processing. Based on these new developments, image processing and the resulting machine vision improvements are expected to generate new value not merely for industry but for all aspects of human life.

2.5 MACHINE VISION APPLICATIONS

Machine vision applications are numerous, as shown in the following list.

Surface contour accuracy

Part identification and sorting:
  Sorting
  Shape recognition
  Inventory monitoring
  Conveyor picking - nonoverlapping parts
  Conveyor picking - overlapping parts
  Bin picking

Industrial robot control:
  Tracking
  Seam welding guidance
  Part positioning and location determination

2.5.1 Overview

High-speed production lines, such as stamping lines, use machine vision to meet online, real-time inspection needs. Quality inspection involves deciding whether parts are acceptable or defective, then directing motion control equipment to reject or accept them. Machine guidance applications improve the accuracy and speed of robots and automated material handling equipment. Advanced systems enable a robot to locate a part or an assembly regardless of rotation or size. In gaging applications, a vision system works quickly to measure a variety of critical dimensions. The reliability and accuracy achieved with these methods surpasses anything possible with manual methods.

In the machine tool industry, applications for machine vision include sensing tool offset and breakage, verifying part placement and fixturing, and monitoring surface finish. A high-speed processor that once cost $80,000 now uses digital signal processing chip technology and costs less than $10,000. The rapid growth of machine vision usage in electronics, assembly systems, and continuous process monitoring created an experience base and tools not available even a few years ago.

2.5.2 Inspection

The ability of an automated vision system to recognize well-defined patterns and determine if these patterns match those stored in the system's CPU memory makes it ideal for the inspection of parts, assemblies, containers, and labels. Two types of inspection can be performed by vision systems: quantitative and qualitative. Quantitative inspection is the verification that measurable quantities fall within desired ranges of tolerance, such as dimensional measurements and the number of holes. Qualitative inspection is the verification that certain components or properties are present and in a certain position, such as defects, missing parts, extraneous components, or misaligned parts.

Many inspection tasks involve comparing the given object with a reference standard and verifying that there are no discrepancies. One method of inspection is called template matching. An image of the object is compared with a reference image, pixel by pixel. A discrepancy will generate a region of high differences. On the other hand, if the observed image and the reference


are slightly out of registration, differences will be found along the borders between light and dark regions in the image. This is because a slight misalignment can lead to dark pixels being compared with light pixels.
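A minimal sketch of this pixel-by-pixel comparison, assuming NumPy arrays for the observed and reference images and an illustrative difference threshold:

```python
import numpy as np

def template_match_defects(observed: np.ndarray, reference: np.ndarray,
                           thresh: float = 30.0) -> np.ndarray:
    """Compare an image with a reference, pixel by pixel, and flag
    regions of high difference as discrepancies."""
    diff = np.abs(observed.astype(float) - reference.astype(float))
    return diff > thresh  # Boolean mask of candidate defect pixels.
```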

A more flexible approach involves measuring a set of the image's properties and comparing the measured values with the corresponding expected values. An example of this approach is the use of width measurements to detect flaws in printed circuits. Here the expected width values were relatively high; narrow ones indicated possible defects.

2.5.2.1 Edge-Based Systems

Machine vision systems which operate on edge descriptions of objects have been developed for a number of defense applications. Commercial edge-based systems with pattern recognition capabilities have now reached the market. The goal of edge detection is to find the boundaries of objects by marking points of rapid change in intensity. Sometimes, the systems operate on edge descriptions of images as ``gray-level'' image systems. These systems are not sensitive to the individual intensities of patterns, only to changes in pixel intensity.

2.5.2.2 Component or Attribute Measurements

An attribute measurement system calculates specific qualities associated with known object images. Attributes can be geometrical patterns, area, length of perimeter, or length of straight lines. Such systems analyze a given scene for known images with predefined attributes. Attributes are constructed from previously scanned objects and can be rotated to match an object at any given orientation. This technique can be applied with minimal preparation. However, orienting and matching are used most efficiently in applications permitting standardized orientations, since they consume significant processing time. Attribute measurement is effective in the segregating or sorting of parts, counting parts, flaw detection, and recognition decisions.

2.5.2.3 Hole Location

Machine vision is ideally suited for determining if a well-defined object is in the correct location relative to some other well-defined object. Machined objects typically consist of a variety of holes that are drilled, punched, or cut at specified locations on the part. Holes may be in the shape of circular openings, slits, squares, or shapes that are more complex. Machine vision systems can verify that the correct holes are in the correct locations, and they can perform this operation at high speeds. A window is formed around the hole to be inspected. If the hole is not too close to another hole or to the edge of the workpiece, only the image of the hole will appear in the window and the measurement process will simply consist of counting pixels. Hole inspection is a straightforward application for machine vision. It requires a two-dimensional binary image and the ability to locate edges, create image segments, and analyze basic features. For groups of closely located holes, it may also require the ability to analyze the general organization of the image and the position of the holes relative to each other.

2.5.2.4 Dimensional Measurements

A wide range of industries and potential applications require that specific dimensional accuracy for the finished products be maintained within the tolerance limits. Machine vision systems are ideal for performing 100% accurate inspections of items which are moving at high speeds or which have features which are difficult to measure by humans. Dimensions are typically inspected using image windowing to reduce the data processing requirements. A simple linear length measurement might be performed by positioning a long width window along the edge. The length of the edge could then be determined by counting the number of pixels in the window and translating into inches or millimeters. The output of this dimensional measurement process is a ``pass-fail'' signal received by a human operator or by a robot. In the case of a continuous process, a signal that the critical dimension being monitored is outside the tolerance limits may cause the operation to stop, or it may cause the forming machine to automatically alter the process.
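A sketch of the pixel-counting measurement, assuming a binary NumPy image and a calibration factor converting pixels to millimeters (both illustrative assumptions):

```python
import numpy as np

def edge_length_mm(binary_img: np.ndarray, window: tuple,
                   mm_per_pixel: float) -> float:
    """Measure a linear edge length by counting pixels inside a window."""
    r0, r1, c0, c1 = window          # window positioned along the edge
    count = np.count_nonzero(binary_img[r0:r1, c0:c1])
    return count * mm_per_pixel      # translate the pixel count into millimeters
```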

2.5.2.5 Defect Location

In spite of the component being present and in the correct position, it may still be unacceptable because of some defect in its construction. The two types of possible defects are functional and cosmetic.

A functional defect is a physical error, such as a broken part, which can prevent the finished product from performing as intended. A cosmetic defect is a flaw in the appearance of an object, which will not interfere with the product's performance, but may decrease the product's value when perceived by the user. Gray-scale systems are ideal for detecting subtle differences in contrast between various regions on the


surface of the parts, which may indicate the presence of defects. Some examples of defect inspection include the inspection of:

Label position on bottles

Deformations on metal cans

Deterioration of dies

Glass tubing for bubbles

Cap seals for bottles

Keyboard character deformations

2.5.2.6 Surface Contour Accuracy

The determination of whether a three-dimensional curved surface has the correct shape or not is an important area of surface inspection. Complex manufactured parts such as engine block castings or aircraft frames have very irregular three-dimensional shapes. However, these complex shapes must meet a large number of dimensional tolerance specifications. Manual inspection of these shapes may require several hours for each item. A vision system may be used for mapping the surface of these three-dimensional objects.

2.5.3 Part Identification and Sorting

The recognition of an object from its image is the most fundamental use of a machine vision system. Inspection deals with the examination of objects without necessarily requiring that the objects be identified. In part recognition, however, it is necessary to make a positive identification of an object and then make the decision from that knowledge. This is used for categorization of the objects into one of several groups. The process of part identification generally requires strong geometrical feature interpretation capabilities. The applications considered often require an interface capability with some sort of part-handling equipment. An industrial robot provides this capability.

There are manufacturing situations that require that a group of varying parts be categorized into common groups and sorted. In general, parts can be sorted based on several characteristics, such as shape, size, labeling, surface markings, color, and other criteria, depending on the nature of the application and the capabilities of the vision system.

2.5.3.1 Character Recognition

Usually in manufacturing situations, an item can be identified solely on the basis of an alphanumeric character or a set of characters. Serial numbers on labels identify separate batches in which products are manufactured. Alphanumeric characters may be printed, etched, embossed, or inscribed on consumer and industrial products. Recent developments have provided certain vision systems with the capability of reading these characters.

2.5.3.2 Inventory Monitoring

Categories of inventories, which can be monitored for control purposes, need to be created. The sorting process of parts or finished products is then based on these categories. Vision system part identification capabilities make them compatible with inventory control systems for keeping track of raw material, work in process, and finished goods inventories. Vision system interfacing capability allows them to command industrial robots to place sorted parts in inventory storage areas. Inventory level data can then be transmitted to a host computer for use in making inventory-level decisions.

2.5.3.3 Conveyor Picking - Overlap

One problem encountered during conveyor picking is overlapping parts. This problem is complicated by the fact that certain image features, such as area, lose meaning when the images are joined together. In cases of a machined part with an irregular shape, analysis of the overlap may require more sophisticated discrimination capabilities, such as the ability to evaluate surface characteristics or to read surface markings.

2.5.3.4 No Overlap

In manufacturing environments with high-volume mass production, workpieces are typically positioned and oriented in a highly precise manner. Flexible automation, such as robotics, is designed for use in the relatively unstructured environments of most factories. However, flexible automation is limited without the addition of the feedback capability that allows it to locate parts. Machine vision systems have begun to provide this capability. The presentation of parts in a random manner, as on a conveyor belt, is common in flexible automation in batch production. A batch of the same type of parts will be presented to the robot in a random distribution along the conveyor belt. The robot must first determine the location of the part and then the orientation so that the gripper can be properly aligned to grip the part.


2.5.3.5 Bin Picking

The most common form of part presentation is a bin of parts that have no order. While a conveyor belt insures a rough form of organization in a two-dimensional plane, a bin is a three-dimensional assortment of parts oriented randomly through space. This is one of the most difficult tasks for a robot to perform. Machine vision is the most likely tool that will enable robots to perform this important task. Machine vision can be used to locate a part, identify orientation, and direct a robot to grasp the part.

2.5.4 Industrial Robot Control

2.5.4.1 Tracking

In some applications like machining, welding, assembly, or other process-oriented applications, there is a need for the parts to be continuously monitored and positioned relative to other parts with a high degree of precision. A vision system can be a powerful tool for controlling production operations. The ability to measure the geometrical shape and the orientation of the object, coupled with the ability to measure distance, is important. A high degree of image resolution is also needed.

2.5.4.2 Seam Welding Guidance

Vision systems used for this application need more features than the systems used to perform continuous welding operations. They must have the capability to maintain the weld torch, electrode, and arc in the proper positions relative to the weld joint. They must also be capable of detecting weld joint details, such as widths, angles, depths, mismatches, root openings, tack welds, and locations of previous weld passes. The capacity to perform under conditions of smoke, heat, dirt, and operator mistreatment is also necessary.

2.5.4.3 Part Positioning and Location Determination

Machine vision systems have the ability to direct a part to a precise position so that a particular machining operation may be performed on it. As in guidance and control applications, the physical positioning is performed by a flexible automation device, such as a robot. The vision system insures that the object is correctly aligned. This facilitates the elimination of expensive fixturing. The main concern here would be how to achieve a high level of image resolution so that the position can be measured accurately. In cases in which one part would have to touch another part, a touch sensor might also be needed.

2.5.4.4 Collision Avoidance

Occasionally there is a case in industry, where robots are being used with flexible manufacturing equipment, when the manipulator arm can come in contact with another piece of equipment, a worker, or other obstacles, and cause an accident. Vision systems may be effectively used to prevent this. This application would need the capability of sensing and measuring relative motion as well as spatial relationships among objects. A real-time processing capability would be required in order to make rapid decisions and prevent contact before any damage would be done.

2.5.4.5 Machining Monitoring

The popular machining operations like drilling, cutting, deburring, gluing, and others, which can be programmed offline, have employed robots successfully. Machine vision can greatly expand these capabilities in applications requiring visual feedback. The advantage of using a vision system with a robot is that the vision system can guide the robot to a more accurate position by compensating for errors in the robot's positioning accuracy. Human errors, such as incorrect positioning and undetected defects, can be overcome by using a vision system.

2.5.5 Mobile Robot Applications

This is an active research topic in the following areas:

Navigation
Guidance
Tracking
Hazard determination
Obstacle avoidance

2.6 CONCLUSIONS AND RECOMMENDATIONS

Machine vision, even in its short history, has been applied to practically every type of imagery with various degrees of success. Machine vision is a multidisciplinary field. It covers diverse aspects of optics, mechanics, electronics, mathematics, photography, and computer technology. This chapter attempts to collect the fundamental concepts of machine vision for a relatively easy introduction to this field.


The declining cost of both processing devices and required computer equipment makes continued growth likely for the field. Several new technological trends promise to stimulate further growth of computer vision systems. Among these are:

Parallel processing, made practical by low-cost
Inexpensive, high-resolution color display systems

Machine vision systems can be applied to many manufacturing operations where human vision is traditional. These systems are best for applications in which their speed and accuracy over long time periods enable them to outperform humans. Some manufacturing operations depend on human vision as part of the manufacturing process. Machine vision can accomplish tasks that humans cannot perform due to hazardous conditions, and carry out these tasks at a higher confidence level than humans. Beyond inspecting products, the human eye is also valued for its ability to make measurement judgments or to perform calibration. This will be one of the most fruitful areas for using machine vision to replace labor. The benefits

3. JD Murray, W Van Ryper. Encyclopedia of Graphic File Formats. Sebastopol, CA: O'Reilly and Associates, 1994.

4. G Wagner. Now that they're cheap, we have to make them smart. Proceedings of the SME Applied Machine Vision '96 Conference, Cincinnati, OH, June 3-6, 1996.

7. MD Levine. Vision in Man and Machine. McGraw-Hill, New York, 1985, pp 151-170.

8. RM Haralick, LG Shapiro. Computer and Robot Vision. Addison-Wesley, Reading, MA, 1992, pp 509-553.

9. EL Hall. Fundamental principles of robot vision. In: Handbook of Pattern Recognition and Image Processing: Computer Vision. Academic Press, Orlando, FL, 1994, pp 542-575.

10. R Schalkoff. Pattern Recognition. John Wiley, New York, 1992, pp 204-263.

11. WS McCulloch, WH Pitts. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, Vol 5, 1943, pp 115-133.

12. D Hebb. Organization of Behavior. John Wiley & Sons, New York, 1949.

13. K Fukushima, S Miyake. Neocognitron: a new algorithm for pattern recognition tolerant of deformations and shifts in position. Pattern Recognition, Vol 15, No 6, 1982, pp 455-469.

14. M Sonka, V Hlavac, R Boyle. Image Processing, Analysis and Machine Vision. PWS, Pacific Grove, CA, 1999, pp 722-754.


Three-dimensional vision concerns itself with a system that captures three-dimensional displacement information from the surface of an object. Let us start by reviewing dimensions and displacements. A displacement between two points is a one-dimensional measurement. One point serves as the origin and the second point is located by a displacement value. Displacements are described by a multiplicity of standard length units. For example, a displacement can be 3 in. Standard length units can also be used to create a co-ordinate axis. For example, if the first point is the origin, the second point may fall on the co-ordinate 3, which represents 3 in.

Determining the displacements among three points requires a minimum of two co-ordinate axes, assuming the points do not fall on a straight line. With one point as the origin, measurements are taken in perpendicular (orthogonal) directions, once again using a standard displacement unit.

Three-dimensional vision determines displacements along three co-ordinate axes. Three dimensions are required when the relationship among four points that do not fall on the same plane is desired. Three-dimensional sensing systems are usually used to acquire more than just four points. Hundreds or thousands of points are obtained from which critical spatial relationships can be derived. Of course, simple one-dimensional measurements can still be made point to point from the captured data.

The three-dimensional vision systems discussed in this chapter can also be referred to as triangulation systems. These systems typically consist of two cameras, or a camera and a projector. The systems use geometrical relationships to calculate the location of a large number of points simultaneously. Three-dimensional vision systems are computationally intensive. Advances in computer processing and storage technologies have made these systems economical.

3.1.1 Competing Technologies

Before proceeding, let us review other three-dimensional capture technologies that are available. Acquisition of three-dimensional data can be broadly categorized into contact and noncontact methods. Contact methods require the sensing system to make physical contact with the object. Noncontact methods probe the surface unobtrusively.

Scales and calipers are traditional contact measurement devices that require a human operator. When the operator is a computer, the measuring device would be a co-ordinate measuring machine (CMM). A CMM is a rectangular robot that uses a probe to acquire three-dimensional positional data. The probe senses contact with a surface using a force transducer. The CMM records the three-dimensional position of the sensor as it touches the surface point.

Several noncontact methods exist for capturing three-dimensional data. Each has its advantages and disadvantages. One method, known as time of flight,


bounces a laser, sound wave, or radio wave off the surface of interest. By measuring the time it takes for the signal to return, one can calculate a position. Acoustical time-of-flight systems are better known as sonar, and can span enormous distances underwater. Laser time-of-flight systems, on the other hand, are used in industrial settings but also have inherently large work volumes. Long standoffs from the system to the measured surface are required.

Another noncontact technique for acquiring three-dimensional data is image depth of focus. A camera can be fitted with a lens that has a very narrow, but adjustable, depth of field. A computer controls the depth of field and identifies locations in an image that are in focus. A group of points is acquired at a specific distance, then the lens is refocused to acquire data at a new depth.

Other three-dimensional techniques are tailored to specific applications. Interferometry techniques can be used to determine surface smoothness. Interferometry is frequently used in ultrahigh-precision applications that require accuracies up to the wavelength of light. Specialized medical imaging systems such as magnetic resonance imaging (MRI) or ultrasound also acquire three-dimensional data by penetrating the subject of interest. The word ``vision'' usually refers to an outer-shell measurement, putting these medical systems outside the scope of this chapter.

The competing technologies to three-dimensional triangulation vision, as described in this chapter, are CMM machines, time-of-flight devices, and depth-of-field systems. Table 1 shows a brief comparison among different systems representing each of these technologies.

The working volume of a CMM can be scaled up without loss of accuracy. Triangulation systems and depth-of-field systems lose accuracy with large work volumes. Hence, both systems are sometimes moved as a unit to increase work volume. Figure 1 shows a triangulation system, known as a laser scanner. Laser scanners can have accuracies of a thousandth of an inch, but the small work volume requires a mechanical actuator. Triangulation systems acquire an exceptionally large number of points simultaneously. A CMM must repeatedly make physical contact with the object to acquire points and therefore is much slower.

3.1.2 Note on Two-Dimensional Vision Systems

Vision systems that operate with a single camera are two-dimensional vision systems. Three-dimensional information may sometimes be inferred from such a vision system. As an example, a camera acquires two-dimensional information about a circuit board. An operator may wish to inspect the solder joints on the circuit board, a three-dimensional problem. For such a task, lighting can be positioned such that shadows of solder joints will be seen by the vision system. This method of inspecting does not require the direct measurement of three-dimensional co-ordinate locations on the surface of the board. Instead, the three-dimensional information is inferred by a clever setup. Discussion of two-dimensional image processing for inspection of three dimensions by inference can be found in Chap. 5.2. This chapter will concern itself with vision systems that capture three-dimensional position locations.

Table 1 Comparison of three-dimensional technologies: work volume (in.), depth resolution (in.), and speed (points/sec) for representative systems, including a triangulation system (DCS Corp.) and a CMM (Brown & Sharpe Mfg. Co., with a speed of 4 in./sec).


where the slope of the line is the pixel position divided by the focal length:

$\begin{bmatrix} wx_{pixel} \\ wy_{pixel} \\ w \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/f & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$  (8)

Equation (8) can be used to find a pixel location, $(x_{pixel}, y_{pixel})$, for any point $(x, y, z)$ in space. Three-dimensional information is reduced to two-dimensional information by dividing $wx_{pixel}$ by $w$. Equation (8) cannot be inverted. It is not possible to use a pixel location alone to determine a unique $(x, y, z)$ point.

In order to represent the camera in different locations, it is helpful to define a $z_{pixel}$ co-ordinate that will always have a constant value. The equation below is a perspective projection matrix that contains such a constant:

$\begin{bmatrix} wx_{pixel} \\ wy_{pixel} \\ wz_{pixel} \\ w \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/f & 0 \\ 0 & 0 & 1/f & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$  (9)

Figure 3 The pinhole camera is a widely used approximation for a camera or projector

Figure 4 The pinhole camera model leads to the perspective projection matrix
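A minimal numeric sketch of the projection, assuming NumPy and a focal length chosen purely for illustration:

```python
import numpy as np

f = 2.0  # focal length (illustrative)

# Perspective projection matrix in the homogeneous form of Eq. (8).
P = np.array([[1, 0, 0,     0],
              [0, 1, 0,     0],
              [0, 0, 1 / f, 0]])

point = np.array([3.0, 1.0, 10.0, 1.0])   # (x, y, z, 1) in world space
wx, wy, w = P @ point
x_pixel, y_pixel = wx / w, wy / w          # divide by w to reduce 3D to 2D
print(x_pixel, y_pixel)                    # 0.6 0.2
```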


For example, to simulate moving the focal point to a new location, d, on the z-axis, one would use the equation

$\begin{bmatrix} wx_{pixel} \\ wy_{pixel} \\ wz_{pixel} \\ w \end{bmatrix} = P \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -d \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$  (10)

where $P$ is the perspective projection matrix of Eq. (9). This equation subtracts a value of d in the z-direction from every point being viewed in the co-ordinate space. That would be equivalent to moving the camera forward along the z-direction by a value of d.

The co-ordinate space orientation, and hence the camera's viewing angle, can be changed using standard rotation matrices [1]. A pinhole camera, five units away from the origin, viewing the world space at a 45° angle with respect to the x-z axes would have a matrix of the form

$\begin{bmatrix} wx_{pixel} \\ wy_{pixel} \\ wz_{pixel} \\ w \end{bmatrix} = P \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -5 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos 45° & 0 & \sin 45° & 0 \\ 0 & 1 & 0 & 0 \\ -\sin 45° & 0 & \cos 45° & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$  (11)

Once again, the world co-ordinates are changed to reflect the view of the camera, with respect to the pinhole model.

Accuracy in modeling a physical camera is important for obtaining accurate measurements. When setting up a stereo vision system, it may be possible to precisely locate a physical camera and describe that location with displacement and rotation transformation matrices. This will require precision fixtures and lasers to guide setup. Furthermore, special camera lenses should be used, as standard off-the-shelf lenses often deviate from the pinhole model. Rather than try to duplicate transformation matrices in the setup, a different approach can be taken.

Let us consider the general perspective projection matrix for a camera located at some arbitrary location and rotation:

$\begin{bmatrix} wx_{pixel} \\ wy_{pixel} \\ wz_{pixel} \\ w \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$  (12)

Specialized fixtures are not required to assure a specific relationship to some physically defined origin. (Cameras, however, must always be mounted to hardware that prevents dislocation and minimizes vibration.) The location of the camera can be determined by the camera view itself. A calibration object, with known calibration points in space, is viewed by the camera and is used to determine the $a_{ij}$ constants. Equation (12) has 16 unknowns. Sixteen calibration points can be located at 16 different pixel locations, generating a sufficient number of equations to solve for the unknowns [2]. More sophisticated methods of finding the $a_{ij}$ constants exist, and take into account lens deviations from the pinhole model [3,4].
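The calibration solve can be sketched as a linear least-squares problem. The example below uses the common direct linear transformation (DLT) variant, which estimates the 3x4 portion of the matrix that produces $(x_{pixel}, y_{pixel})$ rather than the chapter's full 4x4 form; NumPy and all names are illustrative assumptions.

```python
import numpy as np

def calibrate_dlt(world_pts: np.ndarray, pixel_pts: np.ndarray) -> np.ndarray:
    """Estimate perspective projection constants from calibration points.

    world_pts: (N, 3) known 3D calibration points (N >= 6); pixel_pts: (N, 2)
    observed pixel locations. Solves the homogeneous system for the 3x4
    projection matrix, up to scale, by SVD.
    """
    rows = []
    for (x, y, z), (u, v) in zip(world_pts, pixel_pts):
        # u * (a41 x + a42 y + a43 z + a44) = a11 x + a12 y + a13 z + a14
        rows.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z, -u])
        rows.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z, -v])
    A = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)  # smallest-singular-value solution
```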

3.2.3 System Types

3.2.3.1 Passive Stereo Imaging

Passive stereo refers to two cameras viewing the same scene from different perspectives. Points corresponding to the same location in space must be matched in the images, resulting in two lines of sight. Triangulation will then determine the $(x, y, z)$ point location. Assume the perspective projection transformation matrix of one of the cameras can be described by Eq. (12), where $(x_{pixel}, y_{pixel})$ is replaced by $(x', y')$. The two equations below can be derived by substituting for the term $w$ and ignoring the constant $z_{pixel}$:

$(a_{11} - a_{41}x')x + (a_{12} - a_{42}x')y + (a_{13} - a_{43}x')z = a_{44}x' - a_{14}$  (13)

$(a_{21} - a_{41}y')x + (a_{22} - a_{42}y')y + (a_{23} - a_{43}y')z = a_{44}y' - a_{24}$  (14)


$(b_{11} - b_{41}x'')x + (b_{12} - b_{42}x'')y + (b_{13} - b_{43}x'')z = b_{44}x'' - b_{14}$  (15)

$(b_{21} - b_{41}y'')x + (b_{22} - b_{42}y'')y + (b_{23} - b_{43}y'')z = b_{44}y'' - b_{24}$  (16)

where the $a_{ij}$ constants of Eq. (12) have been replaced with $b_{ij}$ for the second camera. Equations (13)-(16) can be arranged in matrix form:

$A \begin{bmatrix} x \\ y \\ z \end{bmatrix} = c$  (17)

where $A$ is the $4 \times 3$ matrix of coefficients from the left-hand sides of Eqs. (13)-(16) and $c$ is the vector of the corresponding right-hand sides.

The constants $a_{ij}$ and $b_{ij}$ will be set based on the position of the cameras in world space. The cameras view the same point in space at locations $(x', y')$ and $(x'', y'')$ on their respective image planes. Hence, Eqs. (13)-(16) are four linearly independent equations with only three unknowns, $(x, y, z)$. A solution for the point of triangulation, $(x, y, z)$, can be achieved by using least-squares regression. However, more accurate results may be obtained by using other methods [4,5].
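A sketch of the least-squares triangulation step, assuming NumPy and coefficient rows already computed from Eqs. (13)-(16) (the helper and its inputs are illustrative):

```python
import numpy as np

def triangulate(A: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Solve the overdetermined system of Eq. (17) for (x, y, z).

    A is the 4x3 coefficient matrix built from Eqs. (13)-(16) and c the
    corresponding right-hand-side vector; least squares finds the point
    that best satisfies all four line-of-sight constraints.
    """
    xyz, *_ = np.linalg.lstsq(A, c, rcond=None)
    return xyz
```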

Passive stereo vision is interesting because of its similarity to human vision, but it is rarely used by industry. Elements of passive stereo can be found in photogrammetry. Photogrammetry is the use of passive images, taken from aircraft, to determine geographical topology [6]. In the industrial setting, determining points that correspond in the two images is difficult and imprecise, especially on smooth manufactured surfaces. The uncertainty of the lines of sight from the cameras results in poor measurements. Industrial systems usually replace one camera with a projection system, as described in the section below.

3.2.3.2 Active Stereo Imaging (Moire Systems)

In active stereo imaging, one camera is replaced with a projector. Cameras and projectors can both be simulated with a pinhole camera model. For a projector, the focal point of the pinhole camera model is replaced with a point light source. A transmissive image plane is then placed in front of this light source.

A projector helps solve the correspondence problem of the passive system. The projector projects a shadow from a known pixel location on its image plane. The shadow falls on a surface that may be smooth and featureless. The imaging camera locates the shadow in the field of view using algorithms especially designed for the task. The system actively modifies the scene of inspection to simplify the correspondence task and make it more precise. Often the projector projects a simple pattern of parallel stripes known as a Ronchi pattern, as shown in Fig. 5.

Let us assume that the $a_{ij}$ constants in Eq. (17) correspond to the camera. The $b_{ij}$ constants would then describe the location of the projector. Equation (17) was overdetermined. The fourth equation, Eq. (16), which was generated by the $y''$ pixel position, is not needed to determine the three unknowns. A location in space can be found by

$\begin{bmatrix} a_{11} - a_{41}x' & a_{12} - a_{42}x' & a_{13} - a_{43}x' \\ a_{21} - a_{41}y' & a_{22} - a_{42}y' & a_{23} - a_{43}y' \\ b_{11} - b_{41}x'' & b_{12} - b_{42}x'' & b_{13} - b_{43}x'' \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_{44}x' - a_{14} \\ a_{44}y' - a_{24} \\ b_{44}x'' - b_{14} \end{bmatrix}$  (18)

All pixels in the $y''$-direction can be used to project a single shadow, since the specific $y''$ pixel location is not needed. Hence, a pattern of striped shadows is logical. Active stereo systems use a single camera to locate projected striped shadows in the field of view. The stripes can be found using two-dimensional edge detection techniques described in Chap. 5.2. The image processing technique must assign an $x''$ location to the shadow. This can be accomplished by encoding the stripes [7,8]. Assume a simplified Ronchi grid as

Figure 5 Example of an active stereo vision system


shown in Fig 6 Each of the seven stripe positions is

uniquely identi®ed by a binary number The camera

images the ®eld of view three times Stripes are turned

on±off with each new image, based on the 3-bit

numer-ical representation The camera tracks the appearance

of shadows in the images and determines the x00

posi-tion based on the code
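A sketch of the decoding step, assuming NumPy and three thresholded camera images in which each pixel is 1 where its stripe was lit (the array names and the thresholding are illustrative):

```python
import numpy as np

def decode_stripe_index(bit_images) -> np.ndarray:
    """Recover the projector stripe position x'' at every camera pixel.

    bit_images holds one binary image per projected bit pattern, most
    significant bit first; stacking the bits reconstructs each stripe's
    binary code (seven stripes need three images).
    """
    index = np.zeros_like(bit_images[0], dtype=int)
    for bits in bit_images:
        index = (index << 1) | bits.astype(int)  # append the next bit
    return index
```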

Prior to the advent of active stereo imaging, moire fringe patterns were used to determine three-dimensional surface displacements. When a sinusoidal grid is projected on a surface, a viewer using the same grid will see fringes that appear as a relief map of the surface [9-11]. Figure 7 shows a conceptual example using a sphere. The stripe pattern is projected and an observer views the shadows as a contour map of the sphere. In order to translate the scene into measurements, a baseline fringe distance must be established. The moire fringe patterns present an intuitive display for viewing displacements.

The traditional moire technique assumes the lines of sight for the light source and camera are parallel. As discussed in Sec. 3.2.4, shadow interference occurs at discrete distances from the grid. This is the reason for the relief mapping. Numerous variations on the moire system have been made, including specialized projection patterns, dynamically altering projection patterns, and varying the relationship of the camera and projector. The interested reader should refer to the many optical journals available.

The moire technique is a triangulation technique. It is not necessary for the camera to view the scene through a grid. A digital camera consists of evenly spaced pixel rows that can be modeled as a grid. Active stereo imaging could be described as a moire system using a Ronchi grid projection and a digital camera.

3.2.3.3 Laser Scanner

The simplest and most popular industrial triangulation system is the laser scanner. Previously, active stereo vision systems were described as projecting several straight-line shadows simultaneously. A laser scanner projects a single line of light onto a surface, for imaging by a camera. Laser scanners acquire a single slice of the surface that intersects the laser's projected plane of light. The scanner, or object, is then translated and additional slices are captured in order to obtain three-dimensional information.

For the laser scanner shown in Fig. 8, the laser plane is assumed to be parallel to the x-y axis. Each pixel on the image plane is represented on the laser plane. The camera views the laser light reflected from the surface, at various pixel locations. Since the z-co-ordinate is constant, Eq. (18) reduces to

$\begin{bmatrix} a_{11} - a_{41}x' & a_{12} - a_{42}x' \\ a_{21} - a_{41}y' & a_{22} - a_{42}y' \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_{44}x' - a_{14} - (a_{13} - a_{43}x')z \\ a_{44}y' - a_{24} - (a_{23} - a_{43}y')z \end{bmatrix}$
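A sketch of recovering one surface point per camera pixel on the laser plane, assuming NumPy, a calibrated 4x4 matrix of $a_{ij}$ constants, and a known constant z for the laser plane (all illustrative):

```python
import numpy as np

def laser_point(a: np.ndarray, xp: float, yp: float, z: float) -> np.ndarray:
    """Solve the reduced 2x2 system for (x, y) on a laser plane of constant z.

    a is the 4x4 calibrated projection matrix of Eq. (12); (xp, yp) is the
    pixel where the camera sees the reflected laser light.
    """
    M = np.array([[a[0, 0] - a[3, 0] * xp, a[0, 1] - a[3, 1] * xp],
                  [a[1, 0] - a[3, 0] * yp, a[1, 1] - a[3, 1] * yp]])
    rhs = np.array([a[3, 3] * xp - a[0, 3] - (a[0, 2] - a[3, 2] * xp) * z,
                    a[3, 3] * yp - a[1, 3] - (a[1, 2] - a[3, 2] * yp) * z])
    x, y = np.linalg.solve(M, rhs)
    return np.array([x, y, z])
```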

Figure 6 Ronchi grid stripes are turned on (value 1) and off (value 0) to distinguish the $x''$ position of the projector plane

Figure 7 Classic system for moire topology measurements
