E. R. Davies - Computer and Machine Vision: Theory, Algorithms, Practicalities, 4th Edition

DOCUMENT INFORMATION

Basic information

Title: Computer and Machine Vision: Theory, Algorithms, Practicalities
Author: E. R. Davies
Institution: Royal Holloway, University of London
Subject: Computer and Machine Vision
Type: Book
Year of publication: 2012
City: Egham
Format:
Pages: 912
File size: 22.17 MB

Content

Computer and Machine Vision: Theory, Algorithms, Practicalities

To my late father, Arthur Granville Davies, who passed on to me his appreciation of the beauties of mathematics and science.

To my wife, Joan, for love, patience, support, and inspiration.

To my children, Elizabeth, Sarah, and Marion, the music in my life.

To my grandson, Jasper, for reminding me of the carefree joys of youth.

Computer and Machine Vision: Theory, Algorithms, Practicalities

Fourth Edition

E. R. DAVIES

Department of Physics

Royal Holloway, University of London, Egham, Surrey, UK

AMSTERDAM • BOSTON • HEIDELBERG • LONDON

NEW YORK • OXFORD • PARIS • SAN DIEGO

SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Academic Press is an imprint of Elsevier

Second edition 1997

Third edition 2005

Fourth edition 2012

Copyright © 2012 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website:

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-12-386908-1

For information on all Elsevier publications visit our website at elsevierdirect.com

Typeset by MPS Limited, a Macmillan Company, Chennai, India

www.macmillansolutions.com

Printed and bound in United States of America

12 11 10 9 8 7 6 5 4 3 2 1

Foreword xxi

Preface xxiii

About the Author xxvii

Acknowledgements xxix

Glossary of Acronyms and Abbreviations xxxiii

CHAPTER 1 Vision, the Challenge 1

1.1 Introduction—Man and His Senses 1

1.2 The Nature of Vision 2

1.2.1 The Process of Recognition 2

1.2.2 Tackling the Recognition Problem 4

1.2.3 Object Location 6

1.2.4 Scene Analysis 8

1.2.5 Vision as Inverse Graphics 9

1.3 From Automated Visual Inspection to Surveillance 10

1.4 What This Book is About 12

1.5 The Following Chapters 13

1.6 Bibliographical Notes 14

PART 1 LOW-LEVEL VISION 15

CHAPTER 2 Images and Imaging Operations 17

2.1 Introduction 18

2.1.1 Gray Scale Versus Color 19

2.2 Image Processing Operations 23

2.2.1 Some Basic Operations on Grayscale Images 24

2.2.2 Basic Operations on Binary Images 28

2.3 Convolutions and Point Spread Functions 32

2.4 Sequential Versus Parallel Operations 34

2.5 Concluding Remarks 36

2.6 Bibliographical and Historical Notes 36

2.7 Problems 36

CHAPTER 3 Basic Image Filtering Operations 38

3.1 Introduction 38

3.2 Noise Suppression by Gaussian Smoothing 40

3.3 Median Filters 43

3.4 Mode Filters 45

3.5 Rank Order Filters 52

3.6 Reducing Computational Load 54

3.7 Sharp Unsharp Masking 55

3.8 Shifts Introduced by Median Filters 56

3.8.1 Continuum Model of Median Shifts 57

3.8.2 Generalization to Grayscale Images 59

3.8.3 Problems with Statistics 60

3.9 Discrete Model of Median Shifts 62

3.10 Shifts Introduced by Mode Filters 65

3.11 Shifts Introduced by Mean and Gaussian Filters 67

3.12 Shifts Introduced by Rank Order Filters 68

3.12.1 Shifts in Rectangular Neighborhoods 69

3.13 The Role of Filters in Industrial Applications of Vision 74

3.14 Color in Image Filtering 74

3.15 Concluding Remarks 76

3.16 Bibliographical and Historical Notes 77

3.16.1 More Recent Developments 78

3.17 Problems 79

CHAPTER 4 Thresholding Techniques 82

4.1 Introduction 83

4.2 Region-Growing Methods 83

4.3 Thresholding 84

4.3.1 Finding a Suitable Threshold 85

4.3.2 Tackling the Problem of Bias in Threshold Selection 86

4.3.3 Summary 88

4.4 Adaptive Thresholding 88

4.4.1 The Chow and Kaneko Approach 91

4.4.2 Local Thresholding Methods 92

4.5 More Thoroughgoing Approaches to Threshold Selection 93

4.5.1 Variance-Based Thresholding 95

4.5.2 Entropy-Based Thresholding 96

4.5.3 Maximum Likelihood Thresholding 97

4.6 The Global Valley Approach to Thresholding 98

4.7 Practical Results Obtained Using the Global Valley Method 101

4.8 Histogram Concavity Analysis 106

4.9 Concluding Remarks 107

4.10 Bibliographical and Historical Notes 108

4.10.1 More Recent Developments 109

4.11 Problems 110

CHAPTER 5 Edge Detection 111

5.1 Introduction 112

5.2 Basic Theory of Edge Detection 113

5.3 The Template Matching Approach 115

5.4 Theory of 3 × 3 Template Operators 116

5.5 The Design of Differential Gradient Operators 117

5.6 The Concept of a Circular Operator 118

5.7 Detailed Implementation of Circular Operators 120

5.8 The Systematic Design of Differential Edge Operators 122

5.9 Problems with the Above Approach—Some Alternative Schemes 123

5.10 Hysteresis Thresholding 126

5.11 The Canny Operator 128

5.12 The Laplacian Operator 131

5.13 Active Contours 134

5.14 Practical Results Obtained Using Active Contours 137

5.15 The Level Set Approach to Object Segmentation 140

5.16 The Graph Cut Approach to Object Segmentation 141

5.17 Concluding Remarks 145

5.18 Bibliographical and Historical Notes 146

5.18.1 More Recent Developments 147

5.19 Problems 148

CHAPTER 6 Corner and Interest Point Detection 149

6.1 Introduction 150

6.2 Template Matching 150

6.3 Second-Order Derivative Schemes 151

6.4 A Median Filter-Based Corner Detector 153

6.4.1 Analyzing the Operation of the Median Detector 154

6.4.2 Practical Results 156

6.5 The Harris Interest Point Operator 158

6.5.1 Corner Signals and Shifts for Various Geometric Configurations 161

6.5.2 Performance with Crossing Points and Junctions 162

6.5.3 Different Forms of the Harris Operator 165

6.6 Corner Orientation 166

6.7 Local Invariant Feature Detectors and Descriptors 168

6.7.1 Harris Scale and Affine-Invariant Detectors and Descriptors 171

6.7.2 Hessian Scale and Affine-Invariant Detectors and Descriptors 173

6.7.3 The SIFT Operator 173

6.7.4 The SURF Operator 174

6.7.5 Maximally Stable Extremal Regions 176

6.7.6 Comparison of the Various Invariant Feature Detectors 177

6.8 Concluding Remarks 180

6.9 Bibliographical and Historical Notes 181

6.9.1 More Recent Developments 184

6.10 Problems 184

CHAPTER 7 Mathematical Morphology 185

7.1 Introduction 185

7.2 Dilation and Erosion in Binary Images 186

7.2.1 Dilation and Erosion 186

7.2.2 Cancellation Effects 186

7.2.3 Modified Dilation and Erosion Operators 187

7.3 Mathematical Morphology 187

7.3.1 Generalized Morphological Dilation 187

7.3.2 Generalized Morphological Erosion 188

7.3.3 Duality Between Dilation and Erosion 189

7.3.4 Properties of Dilation and Erosion Operators 190

7.3.5 Closing and Opening 193

7.3.6 Summary of Basic Morphological Operations 195

7.4 Grayscale Processing 197

7.4.1 Morphological Edge Enhancement 198

7.4.2 Further Remarks on the Generalization to Grayscale Processing 199

7.5 Effect of Noise on Morphological Grouping Operations 201

7.5.1 Detailed Analysis 203

7.5.2 Discussion 205

7.6 Concluding Remarks 205

7.7 Bibliographical and Historical Notes 206

7.7.1 More Recent Developments 207

7.8 Problem 208

CHAPTER 8 Texture 209

8.1 Introduction 209

8.2 Some Basic Approaches to Texture Analysis 213

8.3 Graylevel Co-occurrence Matrices 213

8.4 Laws’ Texture Energy Approach 217

8.5 Ade’s Eigenfilter Approach 220

8.6 Appraisal of the Laws and Ade Approaches 221

8.7 Concluding Remarks 223

8.8 Bibliographical and Historical Notes 223

8.8.1 More Recent Developments 224

PART 2 INTERMEDIATE-LEVEL VISION 227

CHAPTER 9 Binary Shape Analysis 229

9.1 Introduction 230

9.2 Connectedness in Binary Images 230

9.3 Object Labeling and Counting 231

9.3.1 Solving the Labeling Problem in a More Complex Case 235

9.4 Size Filtering 238

9.5 Distance Functions and Their Uses 240

9.5.1 Local Maxima and Data Compression 243

9.6 Skeletons and Thinning 244

9.6.1 Crossing Number 247

9.6.2 Parallel and Sequential Implementations of Thinning 248

9.6.3 Guided Thinning 251

9.6.4 A Comment on the Nature of the Skeleton 251

9.6.5 Skeleton Node Analysis 251

9.6.6 Application of Skeletons for Shape Recognition 253

9.7 Other Measures for Shape Recognition 254

9.8 Boundary Tracking Procedures 257

9.9 Concluding Remarks 257

9.10 Bibliographical and Historical Notes 259

9.10.1 More Recent Developments 260

9.11 Problems 261

CHAPTER 10 Boundary Pattern Analysis 266

10.1 Introduction 266

10.2 Boundary Tracking Procedures 269

10.3 Centroidal Profiles 269

10.4 Problems with the Centroidal Profile Approach 270

10.4.1 Some Solutions 271

10.5 The (s, ψ) Plot 274

10.6 Tackling the Problems of Occlusion 276

10.7 Accuracy of Boundary Length Measures 279

10.8 Concluding Remarks 280

10.9 Bibliographical and Historical Notes 281

10.9.1 More Recent Developments 282

10.10 Problems 282

CHAPTER 11 Line Detection 284

11.1 Introduction 284

11.2 Application of the Hough Transform to Line Detection 285

11.3 The Foot-of-Normal Method 288

11.3.1 Application of the Foot-of-Normal Method 290

11.4 Longitudinal Line Localization 290

11.5 Final Line Fitting 292

11.6 Using RANSAC for Straight Line Detection 293

11.7 Location of Laparoscopic Tools 297

11.8 Concluding Remarks 299

11.9 Bibliographical and Historical Notes 300

11.9.1 More Recent Developments 301

11.10 Problems 301

CHAPTER 12 Circle and Ellipse Detection 303

12.1 Introduction 304

12.2 Hough-Based Schemes for Circular Object Detection 305

12.3 The Problem of Unknown Circle Radius 308

12.3.1 Some Practical Results 310

12.4 The Problem of Accurate Center Location 311

12.4.1 A Solution Requiring Minimal Computation 313

12.5 Overcoming the Speed Problem 314

12.5.1 More Detailed Estimates of Speed 314

12.5.2 Robustness 315

12.5.3 Practical Results 316

12.5.4 Summary 317

12.6 Ellipse Detection 320

12.6.1 The Diameter Bisection Method 320

12.6.2 The Chord Tangent Method 322

12.6.3 Finding the Remaining Ellipse Parameters 323

12.7 Human Iris Location 325

12.8 Hole Detection 327

12.9 Concluding Remarks 327

12.10 Bibliographical and Historical Notes 328

12.10.1 More Recent Developments 330

12.11 Problems 331

CHAPTER 13 The Hough Transform and Its Nature 333

13.1 Introduction 333

13.2 The Generalized Hough Transform 334

13.3 Setting Up the Generalized Hough Transform—Some Relevant Questions 336

13.4 Spatial Matched Filtering in Images 336

13.5 From Spatial Matched Filters to Generalized Hough Transforms 337

13.6 Gradient Weighting Versus Uniform Weighting 339

13.6.1 Calculation of Sensitivity and Computational Load 339

13.7 Summary 342

13.8 Use of the GHT for Ellipse Detection 343

13.8.1 Practical Details 347

13.9 Comparing the Various Methods 349

13.10 Fast Implementations of the Hough Transform 350

13.11 The Approach of Gerig and Klein 352

13.12 Concluding Remarks 353

13.13 Bibliographical and Historical Notes 354

13.13.1 More Recent Developments 356

13.14 Problems 357

CHAPTER 14 Pattern Matching Techniques 358

14.1 Introduction 359

14.2 A Graph-Theoretic Approach to Object Location 359

14.2.1 A Practical Example—Locating Cream Biscuits 363

14.3 Possibilities for Saving Computation 366

14.4 Using the Generalized Hough Transform for Feature Collation 369

14.4.1 Computational Load 370

14.5 Generalizing the Maximal Clique and Other Approaches 371

14.6 Relational Descriptors 373

14.7 Search 376

14.8 Concluding Remarks 377

14.9 Bibliographical and Historical Notes 378

14.9.1 More Recent Developments 380

14.10 Problems 381

PART 3 3-D VISION AND MOTION 387

CHAPTER 15 The Three-Dimensional World 389

15.1 Introduction 389

15.2 3-D Vision—the Variety of Methods 390

15.3 Projection Schemes for Three-Dimensional Vision 392

15.3.1 Binocular Images 393

15.3.2 The Correspondence Problem 396

15.4 Shape from Shading 398

15.5 Photometric Stereo 402

15.6 The Assumption of Surface Smoothness 405

15.7 Shape from Texture 407

15.8 Use of Structured Lighting 408

15.9 Three-Dimensional Object Recognition Schemes 410

15.10 Horaud’s Junction Orientation Technique 411

15.11 An Important Paradigm—Location of Industrial Parts 415

15.12 Concluding Remarks 417

15.13 Bibliographical and Historical Notes 419

15.13.1 More Recent Developments 420

15.14 Problems 421

CHAPTER 16 Tackling the Perspective n-point Problem 424

16.1 Introduction 424

16.2 The Phenomenon of Perspective Inversion 425

16.3 Ambiguity of Pose under Weak Perspective Projection 427

16.4 Obtaining Unique Solutions to the Pose Problem 430

16.4.1 Solution of the Three-Point Problem 433

16.4.2 Using Symmetric Trapezia for Estimating Pose 434

16.5 Concluding Remarks 434

16.6 Bibliographical and Historical Notes 436

16.6.1 More Recent Developments 437

16.7 Problems 438

CHAPTER 17 Invariants and Perspective 439

17.1 Introduction 440

17.2 Cross-ratios: the “Ratio of Ratios” Concept 441

17.3 Invariants for Noncollinear Points 445

17.3.1 Further Remarks About the Five-Point Configuration 447

17.4 Invariants for Points on Conics 449

17.5 Differential and Semi-differential Invariants 452

17.6 Symmetric Cross-ratio Functions 454

17.7 Vanishing Point Detection 456

17.8 More on Vanishing Points 458

17.9 Apparent Centers of Circles and Ellipses 460

17.10 The Route to Face Recognition 462

17.10.1 The Face as Part of a 3-D Object 464

17.11 Perspective Effects in Art and Photography 466

17.12 Concluding Remarks 472

17.13 Bibliographical and Historical Notes 474

17.13.1 More Recent Developments 475

17.14 Problems 475

CHAPTER 18 Image Transformations and Camera Calibration 478

18.1 Introduction 479

18.2 Image Transformations 479

18.3 Camera Calibration 483

18.4 Intrinsic and Extrinsic Parameters 486

18.5 Correcting for Radial Distortions 488

18.6 Multiple View Vision 490

18.7 Generalized Epipolar Geometry 491

18.8 The Essential Matrix 492

18.9 The Fundamental Matrix 495

18.10 Properties of the Essential and Fundamental Matrices 496

18.11 Estimating the Fundamental Matrix 497

18.12 An Update on the Eight-Point Algorithm 497

18.13 Image Rectification 498

18.14 3-D Reconstruction 499

18.15 Concluding Remarks 501

18.16 Bibliographical and Historical Notes 502

18.16.1 More Recent Developments 503

18.17 Problems 504

CHAPTER 19 Motion 505

19.1 Introduction 505

19.2 Optical Flow 506

19.3 Interpretation of Optical Flow Fields 509

19.4 Using Focus of Expansion to Avoid Collision 511

19.5 Time-to-Adjacency Analysis 513

19.6 Basic Difficulties with the Optical Flow Model 514

19.7 Stereo from Motion 515

19.8 The Kalman Filter 517

19.9 Wide Baseline Matching 519

19.10 Concluding Remarks 521

19.11 Bibliographical and Historical Notes 522

19.12 Problem 522

PART 4 TOWARD REAL-TIME PATTERN RECOGNITION SYSTEMS 523

CHAPTER 20 Automated Visual Inspection 525

20.1 Introduction 525

20.2 The Process of Inspection 527

20.3 The Types of Object to be Inspected 527

20.3.1 Food Products 528

20.3.2 Precision Components 528

20.3.3 Differing Requirements for Size Measurement 529

20.3.4 Three-Dimensional Objects 530

20.3.5 Other Products and Materials for Inspection 530

20.4 Summary: The Main Categories of Inspection 530

20.5 Shape Deviations Relative to a Standard Template 532

20.6 Inspection of Circular Products 533

20.7 Inspection of Printed Circuits 537

20.8 Steel Strip and Wood Inspection 538

20.9 Inspection of Products with High Levels of Variability 539

20.10 X-Ray Inspection 542

20.10.1 The Dual-Energy Approach to X-Ray Inspection 546

20.11 The Importance of Color in Inspection 546

20.12 Bringing Inspection to the Factory 548

20.13 Concluding Remarks 549

20.14 Bibliographical and Historical Notes 550

20.14.1 More Recent Developments 552

CHAPTER 21 Inspection of Cereal Grains 553

21.1 Introduction 553

21.2 Case Study: Location of Dark Contaminants in Cereals 554

21.2.1 Application of Morphological and Nonlinear Filters to Locate Rodent Droppings 555

21.2.2 Problems with Closing 558

21.2.3 Ergot Detection Using the Global Valley Method 558

21.3 Case Study: Location of Insects 560

21.3.1 The Vectorial Strategy for Linear Feature Detection 560

21.3.2 Designing Linear Feature Detection Masks for Larger Windows 563

21.3.3 Application to Cereal Inspection 564

21.3.4 Experimental Results 564

21.4 Case Study: High-Speed Grain Location 566

21.4.1 Extending an Earlier Sampling Approach 566

21.4.2 Application to Grain Inspection 567

21.4.3 Summary 571

21.5 Optimizing the Output for Sets of Directional Template Masks 572

21.5.1 Application of the Formulae 573

21.5.2 Discussion 574

21.6 Concluding Remarks 575

21.7 Bibliographical and Historical Notes 575

21.7.1 More Recent Developments 576

CHAPTER 22 Surveillance 578

22.1 Introduction 579

22.2 Surveillance—The Basic Geometry 580

22.3 Foreground—Background Separation 584

22.3.1 Background Modeling 585

22.3.2 Practical Examples of Background Modeling 591

22.3.3 Direct Detection of the Foreground 593

22.4 Particle Filters 594

22.5 Use of Color Histograms for Tracking 600

22.6 Implementation of Particle Filters 604

22.7 Chamfer Matching, Tracking, and Occlusion 607

22.8 Combining Views from Multiple Cameras 609

22.8.1 The Case of Nonoverlapping Fields of View 613

22.9 Applications to the Monitoring of Traffic Flow 614

22.9.1 The System of Bascle et al. 614

22.9.2 The System of Koller et al. 616

22.10 License Plate Location 619

22.11 Occlusion Classification for Tracking 621

22.12 Distinguishing Pedestrians by Their Gait 623

22.13 Human Gait Analysis 627

22.14 Model-Based Tracking of Animals 629

22.15 Concluding Remarks 631

22.16 Bibliographical and Historical Notes 632

22.16.1 More Recent Developments 634

22.17 Problem 635

CHAPTER 23 In-Vehicle Vision Systems 636

23.1 Introduction 637

23.2 Locating the Roadway 638

23.3 Location of Road Markings 640

23.4 Location of Road Signs 641

23.5 Location of Vehicles 645

23.6 Information Obtained by Viewing License Plates and Other Structural Features 647

23.7 Locating Pedestrians 651

23.8 Guidance and Egomotion 653

23.8.1 A Simple Path Planning Algorithm 656

23.9 Vehicle Guidance in Agriculture 656

23.9.1 3-D Aspects of the Task 660

23.9.2 Real-Time Implementation 661

23.10 Concluding Remarks 662

23.11 More Detailed Developments and Bibliographies Relating to Advanced Driver Assistance Systems 663

23.11.1 Developments in Vehicle Detection 664

23.11.2 Developments in Pedestrian Detection 666

23.11.3 Developments in Road and Lane Detection 668

23.11.4 Developments in Road Sign Detection 669

23.11.5 Developments in Path Planning, Navigation, and Egomotion 671

23.12 Problem 671

CHAPTER 24 Statistical Pattern Recognition 672

24.1 Introduction 673

24.2 The Nearest Neighbor Algorithm 674

24.3 Bayes’ Decision Theory 676

24.3.1 The Naive Bayes’ Classifier 678

24.4 Relation of the Nearest Neighbor and Bayes’ Approaches 679

24.4.1 Mathematical Statement of the Problem 679

24.4.2 The Importance of the Nearest Neighbor Classifier 681

24.5 The Optimum Number of Features 681

24.6 Cost Functions and Error Reject Tradeoff 682

24.7 The Receiver Operating Characteristic 684

24.7.1 On the Variety of Performance Measures Relating to Error Rates 686

24.8 Multiple Classifiers 688

24.9 Cluster Analysis 691

24.9.1 Supervised and Unsupervised Learning 691

24.9.2 Clustering Procedures 692

24.10 Principal Components Analysis 695

24.11 The Relevance of Probability in Image Analysis 699

24.12 Another Look at Statistical Pattern Recognition: The Support Vector Machine 700

24.13 Artificial Neural Networks 701

24.14 The Back-Propagation Algorithm 705

24.15 MLP Architectures 708

24.16 Overfitting to the Training Data 709

24.17 Concluding Remarks 712

24.18 Bibliographical and Historical Notes 713

24.18.1 More Recent Developments 715

24.19 Problems 717

CHAPTER 25 Image Acquisition 718

25.1 Introduction 718

25.2 Illumination Schemes 719

25.2.1 Eliminating Shadows 721

25.2.2 Principles for Producing Regions of Uniform Illumination 724

25.2.3 Case of Two Infinite Parallel Strip Lights 726

25.2.4 Overview of the Uniform Illumination Scenario 729

25.2.5 Use of Line-Scan Cameras 730

25.2.6 Light Emitting Diode (LED) Sources 731

25.3 Cameras and Digitization 732

25.3.1 Digitization 734

25.4 The Sampling Theorem 735

25.5 Hyperspectral Imaging 738

25.6 Concluding Remarks 739

25.7 Bibliographical and Historical Notes 740

25.7.1 More Recent Developments 741

CHAPTER 26 Real-Time Hardware and Systems Design Considerations 742

26.1 Introduction 743

26.2 Parallel Processing 744

26.3 SIMD Systems 745

26.4 The Gain in Speed Attainable with N Processors 747

26.5 Flynn’s Classification 748

26.6 Optimal Implementation of Image Analysis Algorithms 750

26.6.1 Hardware Specification and Design 751

26.6.2 Basic Ideas on Optimal Hardware Implementation 752

26.7 Some Useful Real-Time Hardware Options 754

26.8 Systems Design Considerations 755

26.9 Design of Inspection Systems—the Status Quo 757

26.10 System Optimization 760

26.11 Concluding Remarks 761

26.12 Bibliographical and Historical Notes 763

26.12.1 General Background 763

26.12.2 Developments Since 2000 764

26.12.3 More Recent Developments 765

CHAPTER 27 Epilogue— Perspectives in Vision 767

27.1 Introduction 767

27.2 Parameters of Importance in Machine Vision 768

27.3 Tradeoffs 770

27.3.1 Some Important Tradeoffs 770

27.3.2 Tradeoffs for Two-Stage Template Matching 771

27.4 Moore’s Law in Action 772

27.5 Hardware, Algorithms, and Processes 773

27.6 The Importance of Choice of Representation 774

27.7 Past, Present, and Future 775

27.8 Bibliographical and Historical Notes 777

Appendix A Robust Statistics 778

References 796

Author Index 845

Subject Index 861

Topics Covered in Application Case Studies

Influences Impinging upon Integrated Vision System Design

FOREWORD

Although computer vision is such a relatively young field of study, it has matured immensely over the last 25 years or so, from well-constrained, targeted applications to systems that learn automatically from examples. Such progress over these 25 years has been spurred not least by mind-boggling advances in vision and computational hardware, making possible simple tasks that could take minutes on small images, now integrated as part of real-time systems that do far more in a fraction of a second on much larger images in a video stream.

This all means that the focus of research has been in a perpetual state of change, marked by near-exponential advances and achievements, and witnessed by the quality, and often quantity, of outstanding contributions to the field published in key conferences and journals such as ICCV and PAMI. These advances are most clearly reflected by the growing importance of the application areas to which the novel and real-time developments in computer vision have been applied, or for which they have been developed. Twenty-five years ago, industrial quality inspection and simple military applications ruled the waves, but the emphasis has since spread its wings, some slowly and some like wildfire, to many more areas, for example, from medical imaging and analysis to surveillance and, inevitably, complex military and space applications.

So how does Roy’s book reflect this shift? Naturally, there are many fundamental techniques that remain the same, and this book is a wonderful treasure chest of tools that provides the fundamentals for any researcher and teacher. More modern and state-of-the-art methodologies are also covered in the book, most of them pertinent to the topical application areas currently driving not only the research agenda, but also the market forces. In short, the book is a direct reflection of the progress and key methodologies developed in computer vision over the last 25 years and more.

Indeed, while the third edition of this book was already an excellent, successful, and internationally popular work, this fourth edition is greatly enhanced and updated. All its chapters have been substantially revised and brought up to date by the inclusion of many new references covering advances in the subject made even in the past year. There are now also two entirely new chapters (to reflect the great strides that have been made in the area of video analytics) on surveillance and in-vehicle vision systems. The latter is highly relevant to the coming era of advanced driver assistance systems, and the former’s importance and role require no emphasis in this day and age where so many resources are dedicated to criminal and terrorist activity monitoring and prevention.

The material in the book is written in a way that is both approachable and didactic. It is littered with examples and algorithms. I am sure that this volume will be welcomed by a great many students and workers in computer and machine vision, including practitioners in academia and industry, from beginners who are starting out in the subject to advanced researchers and workers who need to gain insight into video analytics. I will also welcome it personally, for use by my own undergraduate and postgraduate students, and will value its presence on my bookshelf as an up-to-date reference on this important subject.

Finally, I am very happy to go on record as saying that Roy is the right person to have produced this substantial work. His long experience in the field of computer and machine vision surpasses even the “big bang” in computer vision around 25 years ago in the mid-80s when the Alvey Vision Conference (UK) and CVPR (USA) were only inchoates of what they have become today and reaches back to when ICPR and IAPR began to be dominated by image processing in the late 70s.

September 2011

Majid Mirmehdi

University of Bristol, UK

PREFACE TO THE FOURTH EDITION

The first edition came out in 1990, and was welcomed by many researchers and practitioners. However, in the subsequent two decades, the subject moved on at a rapidly accelerating rate, and many topics that hardly deserved a mention in the first edition had to be solidly incorporated in subsequent editions. It seemed particularly important to bring in significant amounts of new material on mathematical morphology, 3-D vision, invariance, motion analysis, object tracking, artificial neural networks, texture analysis, X-ray inspection, foreign object detection, and robust statistics. There are thus new chapters or appendices on these topics, and they have been carefully integrated with the existing material. The greater proportion of the new material has been included in Parts 3 and 4. So great has been the growth in work on 3-D vision and its applications that the original single chapter on 3-D vision had to be expanded into the set of five chapters on 3-D vision and motion forming Part 3, together with a further two chapters on surveillance and in-vehicle vision systems in Part 4. Indeed, these changes have been so radical that the title of the book has had to be modified to reflect them. At this stage, Part 4 encompasses such a range of chapters (covering applications and the components needed for constructing real-time visual pattern recognition systems) that it is difficult to produce a logical ordering for them: notably, the topics interact with each other at a variety of different levels (theory, algorithms, methodologies, practicalities, design constraints, and so on). However, this should not matter in practice, as the reader will be exposed to the essential richness of the subject, and his/her studies should be amply rewarded by increased understanding and capability.

It is worth remarking that, at this point in time, computer vision has attained a level of maturity that has made it substantially more rigorous, reliable, generic, and, in the light of the improved hardware facilities now available for its implementation (not least, FPGA and GPU types of solution), capable of real-time performance. This means that workers are more than ever before using it in serious applications, and with fewer practical difficulties. It is intended that this edition of the book will reflect this radically new and exciting state of affairs at a fundamental level.

A typical final-year undergraduate course on vision for electronic engineering or computer science students might include much of the work of Chapters 1–10 and 14, 15, plus a selection of sections from other chapters, according to requirements. For MSc or PhD research students, a suitable lecture course might go on to cover Part 3 in depth, including several of the chapters in Part 4,[1] with many practical exercises being undertaken on an image analysis system. Here, much will depend on the research program being undertaken by each individual student. At this stage, the text will have to be used more as a handbook for research, and indeed, one of the prime aims of the volume is to act as a handbook for the researcher and practitioner in this important area.

As mentioned in the original Preface, this book leans heavily on experience I have gained from working with postgraduate students: in particular, I would like to express my gratitude to Mark Edmonds, Simon Barker, Daniel Celano, Darrel Greenhill, Derek Charles, Mark Sugrue, and Georgios Mastorakis, all of whom have in their own ways helped to shape my view of the subject. In addition, it is a special pleasure to recall very many rewarding discussions with my colleagues Barry Cook, Zahid Hussain, Ian Hannah, Dev Patel, David Mason, Mark Bateman, Tieying Lu, Adrian Johnstone, and Piers Plummer, the last two named having been particularly prolific in generating hardware systems for implementing my research group’s vision algorithms. Next, I am immensely grateful to Majid Mirmehdi for reading much of the manuscript and making insightful comments and valuable suggestions. Finally, I am indebted to Tim Pitts of Elsevier Science for his help and encouragement, without which this fourth edition might never have been completed.

[1] The importance of the appendix on robust statistics should not be underestimated once one gets onto serious work, although this will probably be outside the restrictive environment of an undergraduate syllabus.

PREFACE TO THE FIRST EDITION (1990)

Over the past 30 years or so, machine vision has evolved into a mature subject embracing many topics and applications: these range from automatic (robot) assembly to automatic vehicle guidance, from automatic interpretation of documents to verification of signatures, and from analysis of remotely sensed images to checking of fingerprints and human blood cells; currently, automated visual inspection is undergoing very substantial growth, necessary improvements in quality, safety and cost-effectiveness being the stimulating factors. With so much ongoing activity, it has become a difficult business for the professional to keep up with the subject and with relevant methodologies: in particular, it is difficult to distinguish accidental developments from genuine advances. It is the purpose of this book to provide background in this area.

The book was shaped over a period of 10–12 years, through material I have given on undergraduate and postgraduate courses at London University, and contributions to various industrial courses and seminars. At the same time, my own investigations coupled with experience gained while supervising PhD and postdoctoral researchers helped to form the state of mind and knowledge that is now set out here. Certainly it is true to say that if I had had this book 8, 6, 4, or even 2 years ago, it would have been of inestimable value to myself for solving practical problems in machine vision. It is therefore my hope that it will now be of use to others in the same way. Of course, it has tended to follow an emphasis that is my own, and in particular one view of one path toward solving automated visual inspection and other problems associated with the application of vision in industry. At the same time, although there is a specialism here, great care has been taken to bring out general principles, including many applying throughout the field of image analysis. The reader will note the universality of topics such as noise suppression, edge detection, principles of illumination, feature recognition, Bayes’ theory, and (nowadays) Hough transforms. However, the generalities lie deeper than this. The book has aimed to make some general observations and messages about the limitations, constraints, and tradeoffs to which vision algorithms are subject. Thus, there are themes about the effects of noise, occlusion, distortion and the need for built-in forms of robustness (as distinct from less successful ad hoc varieties and those added on as an afterthought); there are also themes about accuracy, systematic design, and the matching of algorithms and architectures. Finally, there are the problems of setting up lighting schemes which must be addressed in complete systems, yet which receive scant attention in most books on image processing and analysis. These remarks will indicate that the text is intended to be read at various levels, a factor that should make it of more lasting value than might initially be supposed from a quick perusal of the Contents.

Of course, writing a text such as this presents a great difficulty in that it is necessary to be highly selective: space simply does not allow everything in a subject of this nature and maturity to be dealt with adequately between two covers. One solution might be to dash rapidly through the whole area mentioning everything that comes to mind, but leaving the reader unable to understand anything in detail or to achieve anything having read the book. However, in a practical subject of this nature, this seemed to me a rather worthless extreme. It is just possible that the emphasis has now veered too much in the opposite direction, by coming down to practicalities (detailed algorithms, details of lighting schemes, and so on): individual readers will have to judge this for themselves. On the other hand, an author has to be true to himself and my view is that it is better for a reader or student to have mastered a coherent series of topics than to have a mish-mash of information that he is later unable to recall with any accuracy. This, then, is my justification for presenting this particular material in this particular way and for reluctantly omitting from detailed discussion such important topics as texture analysis, relaxation methods, motion, and optical flow.

As for the organization of the material, I have tried to make the early part of the book lead into the subject gently, giving enough detailed algorithms (especially in Chapters 2 and 6) to provide a sound feel for the subject, including especially vital, and in their own way quite intricate, topics such as connectedness in binary images. Hence, Part 1 provides the lead-in, although it is not always trivial material and indeed some of the latest research ideas have been brought in (e.g., on thresholding techniques and edge detection). Part 2 gives much of the meat of the book. Indeed, the (book) literature of the subject currently has a significant gap in the area of intermediate-level vision; while high-level vision (AI) topics have long caught the researcher’s imagination, intermediate-level vision has its own difficulties which are currently being solved with great success (note that the Hough transform, originally developed in 1962, and by many thought to be a very specialist topic of rather esoteric interest, is arguably only now coming into its own). Part 2 and the early chapters of Part 3 aim to make this clear, while Part 4 gives reasons why this particular transform has become so useful. As a whole, Part 3 aims to demonstrate some of the practical applications of the basic work covered earlier in the book, and to discuss some of the principles underlying implementation: it is here that chapters on lighting and hardware systems will be found. As there is a limit to what can be covered in the space available, there is a corresponding emphasis on the theory underpinning practicalities. Probably, this is a vital feature, since there are many applications of vision both in industry and elsewhere, yet listing them and their intricacies risks dwelling on interminable detail, which some might find insipid; furthermore, detail has a tendency to date rather rapidly. Although the book could not cover 3-D vision in full (this topic would easily consume a whole volume in its own right), a careful overview of this complex mathematical and highly important subject seemed vital. It is therefore no accident that Chapter 16 is the longest in the book. Finally, Part 4 asks questions about the limitations and constraints of vision algorithms and answers them by drawing on information and experience from earlier chapters. It is tempting to call the last chapter the Conclusion. However, in such a dynamic subject area, any such temptation has to be resisted, although it has still been possible to draw a good number of lessons on the nature and current state of the subject. Clearly, this chapter presents a personal view but I hope it is one that readers will find interesting and useful.

Roy Davies is Emeritus Professor of Machine Vision at Royal Holloway, University of London. He has worked on many aspects of vision, from feature detection and noise suppression to robust pattern matching and real-time implementations of practical vision tasks. His interests include automated visual inspection, surveillance, vehicle guidance, and crime detection. He has published more than 200 papers and three books: Machine Vision: Theory, Algorithms, Practicalities (1990), Electronics, Noise and Signal Recovery (1993), and Image Processing for the Food Industry (2000); the first of these has been widely used internationally for more than 20 years, and is now out in this much enhanced fourth edition. Roy is a Fellow of the IoP and the IET, and a Senior Member of the IEEE. He is on the Editorial Boards of Real-Time Image Processing, Pattern Recognition Letters, Imaging Science, and IET Image Processing. He holds a DSc at the University of London: he was awarded BMVA Distinguished Fellow in 2005 and Fellow of the International Association of Pattern Recognition in 2008.

The author would like to credit the following sources for permission to reproduce tables, figures and extracts of text from earlier publications:

Elsevier

For permission to reprint portions of the following papers from Image and Vision Computing as text in Chapters 5 and 14; as Tables 5.1–5.5; and as Figures 3.29, 5.2, 14.1, 14.2, 14.6:

Davies (1984b, 1987c)

Davies, E.R. (1991) Image and Vision Computing 9, 252–261

For permission to reprint portions of the following paper from Pattern Recognition as text in Chapter 9; and as Figure 9.11:

Davies and Plummer (1981)

For permission to reprint portions of the following papers from Pattern Recognition Letters as text in Chapters 3, 5, 11–14, 21, 24; as Tables 3.2; 12.3; 13.1; and as Figures 3.6, 3.8, 3.10; 5.1, 5.3; 11.1, 11.2a, 11.3b; 12.4, 12.5, 12.6, 12.7–12.10; 13.1, 13.3–13.11; 21.3, 21.6:

Davies (1986a,b; 1987a,e,f; 1988b,c,e,f; 1989a)

Davies et al (2003a)

For permission to reprint portions of the following paper from Signal Processing as text in Chapter 3; and as Figures 3.15–3.20:

Davies (1989b)

For permission to reprint portions of the following paper from Advances in Imaging and Electron Physics as text in Chapter 3:

Davies (2003c)

For permission to reprint portions of the following article from Encyclopedia of Physical Science and Technology as Figures 9.9, 9.12, 10.1, 10.4:

Davies, E.R. (1987) Visual inspection, automatic (robotics). In: Meyers, R.A. (ed.) Encyclopedia of Physical Science and Technology, Vol. 14. Academic Press, San Diego, pp. 360–377

The Committee of the Alvey Vision Club

For permission to reprint portions of the following paper as text in Chapter 14; and as Figures 14.1, 14.2, 14.6:

Davies, E.R. (1988) An alternative to graph matching for locating objects from their salient features. Proc. 4th Alvey Vision Conf., Manchester (31 August–2 September), pp. 281–286

CEP Consultants Ltd (Edinburgh)

For permission to reprint portions of the following paper as text in Chapter 20:

Davies, E.R. (1987) Methods for the rapid inspection of food products and small parts. In: McGeough, J.A. (ed.) Proc. 2nd Int. Conf. on Computer-Aided Production Engineering, Edinburgh (13–15 April), pp. 105–110

For permission to reprint portions of the following paper as text in Chapter 3; and

Davies (1985; 1988a; 1997b; 1999f; 2000b,c; 2005; 2008b)

Davies, E.R. (1997) Algorithms for inspection: constraints, tradeoffs and the design process. IEE Digest no. 1997/041, Colloquium on Industrial Inspection, IEE (10 Feb.), pp. 6/1–5

Sugrue and Davies (2007)

Mastorakis and Davies (2011)

Davies et al (1998a)

Davies and Johnstone (1989)

IFS Publications Ltd

For permission to reprint portions of the following paper as text in Chapters 12, 20; and as Figures 12.1, 12.2, 20.5:

Davies (1984c)

The Council of the Institution of Mechanical Engineers

For permission to reprint portions of the following paper as text in Chapter 26; and as Tables 26.1, 26.2:

Davies and Johnstone (1986)

MCB University Press (Emerald Group)

For permission to reprint portions of the following paper as Figure 20.6:

Patel et al (1995)

The Royal Photographic Society

For permission to reprint portions of the following papers¹ as text in Chapter 3; as Table 3.4; and as Figures 3.12, 3.13, 3.25–3.28:

Davies, E.R. (2003) Design of object location algorithms and their use for food and cereals inspection. Chapter 15 in Graves, M. and Batchelor, B.G. (eds.) Machine Vision Techniques for Inspecting Natural Products. Springer-Verlag, pp. 393–420

Peter Stevens Photography

For permission to reprint a photograph as Figure 3.12(a)

F.H Sumner

For permission to reprint portions of the following article from State of the Art Report: Supercomputer Systems Technology as text in Chapter 9; and as Figure 9.4:

Davies, E.R. (1982) Image processing. In: Sumner, F.H. (ed.) State of the Art Report: Supercomputer Systems Technology. Pergamon Infotech, Maidenhead, pp. 223–244

¹ See also the Maney website: www.maney.co.uk/journals/ims

World Scientific

For permission to reprint portions of the following book as text in Chapters 7, 21, 22, 23, 26; and as Figures 7.1–7.4, 21.4, 22.20, 23.15, 23.16, 26.3:

Davies (2000a)

Royal Holloway, University of London

For permission to reprint extracts from the following examination questions, originally written by E.R. Davies:

EL385/97/2; EL333/98/2; EL333/99/2, 3, 5, 6; EL333/01/2, 4–6;

1-D one dimension/one-dimensional

2-D two dimensions/two-dimensional

3-D three dimensions/three-dimensional

3DPO 3-D part orientation system

ACM Association for Computing Machinery (USA)

ADAS advanced driver assistance system

ADC analog to digital converter

AI artificial intelligence

ANN artificial neural network

APF auxiliary particle filter

ASCII American Standard Code for Information Interchange

ASIC application-specific integrated circuit

ATM automated teller machine

AUC area under curve

AVI audio video interleave

BCVM between-class variance method

BetaSAC beta [distribution] sampling consensus

BMVA British Machine Vision Association

BRAM block of RAM

BRDF bidirectional reflectance distribution function

CAD computer-aided design

CAM computer-aided manufacture

CCD charge-coupled device

CCTV closed-circuit television

CDF cumulative distribution function

CIM computer integrated manufacture

CLIP cellular logic image processor

CPU central processing unit

DCSM distinct class based splitting measure

DET Beaudet determinant operator

DEXA dual-emission X-ray absorptiometry

DG differential gradient

DN Dreschler–Nagel corner detector

DoF degree of freedom

DoG difference of Gaussians

DSP digital signal processor

EM expectation maximization

EURASIP European Association for Signal Processing

FAST features from accelerated segment test

FFT fast Fourier transform

FN false negative

fnr false negative rate

FoE focus of expansion

FoV field of view

FP false positive

FPGA field programmable gate array

FPP full perspective projection

fpr false positive rate

GHT generalized Hough transform

GLOH gradient location and orientation histogram

GMM Gaussian mixture model

GPS global positioning system

GPU graphics processing unit

GroupSAC group sampling consensus

GVM global valley method

HOG histogram of orientated gradients

HSI hue, saturation, intensity

HT Hough transform

IBR intensity extrema-based region detector

IDD integrated directional derivative

IEE Institution of Electrical Engineers (UK)

IEEE Institute of Electrical and Electronics Engineers (USA)

IET Institution of Engineering and Technology (UK)

ILW iterated likelihood weighting

IMechE Institution of Mechanical Engineers (UK)

IMPSAC importance sampling consensus

ISODATA iterative self-organizing data analysis

JPEG/JPG Joint Photographic Experts Group

k-NN k-nearest neighbor

KR Kitchen–Rosenfeld corner detector

LED light emitting diode

LFF local-feature-focus method

LIDAR light detection and ranging

LMedS least median of squares

LoG Laplacian of Gaussian

LS least squares

LUT lookup table

MAP maximum a posteriori

MDL minimum description length

MIMD multiple instruction stream, multiple data stream

MIPS millions of instructions per second

MISD multiple instruction stream, single data stream

MLP multi-layer perceptron

MoG mixture of Gaussians

MP microprocessor

MSER maximally stable extremal region

NAPSAC n adjacent points sample consensus

NIR near infra-red

NN nearest neighbor

OCR optical character recognition

PC personal computer

PCA principal components analysis

PCB printed circuit board

PE processing element

PnP perspective n-point

PR pattern recognition

PROSAC progressive sample consensus

PSF point spread function

RAM random access memory

RANSAC random sample consensus

RGB red, green, blue

RHT randomized Hough transform

RKHS reproducing kernel Hilbert space

RMS root mean square

ROC receiver operating characteristic

RoI region of interest

RPS Royal Photographic Society (UK)

SFOP scale-invariant feature operator

SIAM Society for Industrial and Applied Mathematics

SIFT scale-invariant feature transform

SIMD single instruction stream, multiple data stream

SIR sampling importance resampling

SIS sequential importance sampling

SISD single instruction stream, single data stream

SOC sorting optimization curve

SOM self-organizing map

SPIE Society of Photo-optical Instrumentation Engineers

SPR statistical pattern recognition

STA spatiotemporal attention [neural network]

SURF speeded-up robust features

SUSAN smallest univalue segment assimilating nucleus

SVM support vector machine

TP true positive

tpr true positive rate

TV television

ULUT universal lookup table

USEF unit step edge function

VLSI very large scale integration

VMF vector median filter

VP vanishing point

WPP weak perspective projection

ZH Zuniga–Haralick corner detector

1 Vision, the Challenge

Of the five senses (vision, hearing, smell, taste, and touch), vision is undoubtedly the one that man has come to depend upon above all others, and indeed the one that provides most of the data he receives. Not only do the input pathways from the eyes provide megabits of information at each glance but the data rates for continuous viewing probably exceed 10 megabits per second (mbit/s). However, much of this information is redundant and is compressed by the various layers of the visual cortex, so that the higher centers of the brain have to interpret abstractly only a small fraction of the data. Nonetheless, the amount of information the higher centers receive from the eyes must be at least two orders of magnitude greater than all the information they obtain from the other senses.

Another feature of the human visual system is the ease with which interpretation is carried out. We see a scene as it is: trees in a landscape, books on a desk, widgets in a factory. No obvious deductions are needed and no overt effort is required to interpret each scene: in addition, answers are effectively immediate and are normally available within a tenth of a second. Just now and again some doubt arises: e.g., a wire cube might be “seen” correctly or inside out. This and a host of other optical illusions are well known, although for the most part we can regard them as curiosities, irrelevant freaks of nature. Somewhat surprisingly, illusions are quite important, since they reflect hidden assumptions that the brain is making in its struggle with the huge amounts of complex visual data it is receiving. We have to pass by this story here (although it resurfaces now and again in various parts of this book). However, the important point is that we are for the most part unaware of the complexities of vision. Seeing is not a simple process: it is just that vision has evolved over millions of years, and there was no particular advantage in evolution giving us any indication of the difficulties of the task (if anything, to have done so would have cluttered our minds with irrelevant information and slowed our reaction times).

In the present day and age, man is trying to get machines to do much of his work for him. For simple mechanistic tasks this is not particularly difficult, but for more complex tasks the machine must be given the sense of vision. Efforts have been made to achieve this, sometimes in modest ways, for well over 30 years. At first, schemes were devised for reading, for interpreting chromosome images, and so on, but when such schemes were confronted with rigorous practical tests, the problems often turned out to be more difficult. Generally, researchers react to finding that apparent “trivia” are getting in the way by intensifying their efforts and applying great ingenuity, and this was certainly so with early efforts at vision algorithm design. Hence, it soon became evident that the task really is a complex one, in which numerous fundamental problems confront the researcher, and the ease with which the eye can interpret scenes turned out to be highly deceptive.

Of course, one of the ways in which the human visual system gains over the machine is that the brain possesses more than 10^10 cells (or neurons), some of which have well over 10,000 contacts (or synapses) with other neurons. If each neuron acts as a type of microprocessor, then we have an immense computer in which all the processing elements can operate concurrently. Taking the largest single man-made computer to contain several hundred million rather modest processing elements, the majority of the visual and mental processing tasks that the eye–brain system can perform in a flash have no chance of being performed by present-day man-made systems. Added to these problems of scale, there is the problem of how to organize such a large processing system, and also how to program it. Clearly, the eye–brain system is partly hard-wired by evolution but there is also an interesting capability to program it dynamically by training during active use. This need for a large parallel processing system with the attendant complex control problems shows that machine vision must indeed be one of the most difficult intellectual problems to tackle.

So what are the problems involved in vision that make it apparently so easy for the eye, yet so difficult for the machine? In the next few sections an attempt is made to answer this question.

1.2.1 The Process of Recognition

This section illustrates the intrinsic difficulties of implementing machine vision, starting with an extremely simple example, that of character recognition. Consider the set of patterns shown in Fig. 1.1(a). Each pattern can be considered as a set of 25 bits of information, together with an associated class indicating its interpretation. In each case imagine a computer learning the patterns and their classes by rote. Then any new pattern may be classified (or “recognized”) by comparing it with this previously learnt “training set,” and assigning it to the class of the nearest pattern in the training set. Clearly, test pattern (1) (Fig. 1.1(b)) will be allotted to class U on this basis. Chapter 24 shows that this method is a simple form of the nearest-neighbor approach to pattern recognition.
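The rote-learning scheme just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the 5 × 5 patterns below are hypothetical stand-ins, not the patterns of Fig. 1.1; "nearest" is taken to mean the smallest Hamming distance (number of differing bits), which is one natural choice for binary patterns but is not specified in the text; and the names parse, hamming, classify, TRAINING, and NOISY_U are invented for the example.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[int, ...]  # 25 bits, read row by row from a 5x5 grid


def parse(rows: Sequence[str]) -> Pattern:
    """Convert five 5-character rows ('#' = 1, '.' = 0) into a 25-bit tuple."""
    bits = tuple(1 if ch == "#" else 0 for row in rows for ch in row)
    if len(bits) != 25:
        raise ValueError("expected a 5x5 pattern")
    return bits


def hamming(a: Pattern, b: Pattern) -> int:
    """Number of bit positions in which two patterns differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def classify(test: Pattern, training: List[Tuple[Pattern, str]]) -> str:
    """Return the class of the training pattern nearest to `test`.

    Ties go to whichever training pattern appears first in the list.
    """
    _, label = min(training, key=lambda item: hamming(test, item[0]))
    return label


# Hypothetical training set learnt "by rote" (not the patterns of Fig. 1.1).
TRAINING = [
    (parse(["#...#", "#...#", "#...#", "#...#", "#####"]), "U"),
    (parse(["#####", "..#..", "..#..", "..#..", "..#.."]), "T"),
    (parse(["#....", "#....", "#....", "#....", "#####"]), "L"),
]

# A "U" with one noise bit set in the middle of the grid.
NOISY_U = parse(["#...#", "#...#", "#.#.#", "#...#", "#####"])

print(classify(NOISY_U, TRAINING))  # U
```

The noisy pattern differs from the stored U in one bit, from L in five, and from T in fifteen, so it is assigned to class U. Because every test pattern is compared with every stored pattern, the cost of this sketch grows in proportion to the size of the training set.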

The scheme outlined above seems straightforward and is indeed highly effective, even being able to cope with situations where distortions of the test patterns occur or where noise is present: this is illustrated by test patterns (2) and (3). However, this approach is not always foolproof. First, there are situations where distortions or noise are excessive, so errors of interpretation arise. Second, there are situations where

Fig. 1.1 Some simple 25-bit patterns and their recognition classes used to illustrate some of the basic problems of recognition: (a) training set patterns (for which the known classes are indicated); (b) test patterns.
