Slide 1
IMAGE PROCESSING IN MECHATRONICS
Machine Vision
Lecturer: Dr. Nguyễn Thành Hùng, Department of Mechatronics, School of Mechanical Engineering
Hà Nội, 2021
Slide 3
➢ Let R represent the entire spatial region occupied by an image. Image segmentation is a process that partitions R into n subregions, R1, R2, …, Rn, such that:
(a) ∪ᵢ Rᵢ = R;
(b) Rᵢ is a connected set, for i = 1, 2, …, n;
(c) Rᵢ ∩ Rⱼ = ∅ for all i and j, i ≠ j;
(d) Q(Rᵢ) = TRUE for i = 1, 2, …, n;
(e) Q(Rᵢ ∪ Rⱼ) = FALSE for any adjacent regions Rᵢ and Rⱼ;
where Q(Rₖ) is a logical predicate defined over the points in set Rₖ, and ∅ is the null set.
Slide 4
➢ Two regions are said to be disjoint if the set formed by their union is not connected.
➢ The fundamental problem in segmentation is to partition an image into regions that satisfy the preceding conditions.
➢ Segmentation algorithms for monochrome images are generally based on one of two basic categories of intensity-value properties: discontinuity and similarity.
▪ Edge-based segmentation
▪ Region-based segmentation
Slide 5
(a) Image of a constant intensity region. (b) Boundary based on intensity discontinuities. (c) Result of segmentation. (d) Image of a texture region. (e) Result of intensity discontinuity computations (note the large number of small edges). (f) Result of segmentation based on region properties.
Slide 8
➢ The three types of image characteristics: isolated points, lines, and edges
➢ Edge pixels: pixels at which the intensity of an image changes abruptly
➢ Edges (or edge segments): sets of connected edge pixels
➢ Edge detectors: local image processing tools designed to detect edge pixels
Slide 9
➢ Line: a (typically) thin edge segment in which the intensity of the background on either side of the line is either much higher or much lower than the intensity of the line pixels
➢ Isolated point: a foreground (background) pixel surrounded by background (foreground) pixels
Slide 10
➢ An approximation to the first-order derivative at an arbitrary point x of a one-dimensional function f(x) is obtained by expanding f(x + Δx) into a Taylor series about x, with Δx = 1 for the sample following x and Δx = −1 for the sample preceding x.
Slide 11
➢ An approximation to the first-order derivative at an arbitrary point x of a one-dimensional function f(x)
Slide 12
➢ The forward difference: ∂f/∂x = f(x + 1) − f(x)
➢ The backward difference: ∂f/∂x = f(x) − f(x − 1)
Slide 13
➢ The central difference: ∂f/∂x = [f(x + 1) − f(x − 1)] / 2
➢ The second-order derivative based on a central difference: ∂²f/∂x² = f(x + 1) − 2f(x) + f(x − 1)
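These difference formulas are easy to check numerically. A minimal NumPy sketch (the quadratic test function is an arbitrary choice, not from the slides):

```python
import numpy as np

# Sample a 1-D function at integer (pixel) positions.
x = np.arange(10, dtype=float)
f = x**2  # example signal; its true derivative is 2x, second derivative is 2

forward = f[2:] - f[1:-1]              # f(x+1) - f(x)
backward = f[1:-1] - f[:-2]            # f(x) - f(x-1)
central = (f[2:] - f[:-2]) / 2         # [f(x+1) - f(x-1)] / 2
second = f[2:] - 2 * f[1:-1] + f[:-2]  # f(x+1) - 2 f(x) + f(x-1)

print(central)  # exact 2x at interior samples: [ 2.  4.  6.  8. 10. 12. 14. 16.]
print(second)   # constant 2 everywhere
```

For f(x) = x² the central difference is exact at every interior sample, while the forward and backward differences are each off by 1; averaging them recovers the central difference.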
Slide 15
➢ For two variables: ∂²f(x, y)/∂x² = f(x + 1, y) − 2f(x, y) + f(x − 1, y), and similarly ∂²f(x, y)/∂y² = f(x, y + 1) − 2f(x, y) + f(x, y − 1)
Slide 17
➢ Spatial filter kernel
A general 3×3 spatial filter kernel. The w's are the kernel coefficients (weights). The response of the filter at the center point of the neighborhood is
R = w₁z₁ + w₂z₂ + … + w₉z₉ = Σₖ wₖ zₖ,
where zₖ is the intensity of the pixel whose spatial location corresponds to the location of the kth kernel coefficient.
Slide 18
➢ Laplacian: Z(x, y) = ∇²f(x, y)
➢ Detected points: g(x, y) = 1 if |Z(x, y)| > T, and g(x, y) = 0 otherwise, where T is a nonnegative threshold (Eq. 10-15)
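The Laplacian-based point detection described above can be sketched in NumPy. The 8-neighbor Laplacian kernel is the common choice for this task; the synthetic test image and the 90%-of-maximum threshold are illustrative assumptions:

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 2-D convolution with zero padding (kernel is symmetric here)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# 8-neighbor Laplacian kernel used for point detection.
laplacian = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=float)

# Synthetic image: uniform background with one isolated bright pixel.
f = np.full((9, 9), 10.0)
f[4, 4] = 200.0

Z = convolve2d(f, laplacian)
T = 0.9 * np.abs(Z).max()              # threshold at 90% of the max response
g = (np.abs(Z) > T).astype(np.uint8)   # Eq. (10-15): 1 where |Z| > T

print(np.argwhere(g == 1))  # [[4 4]] -- only the isolated point is detected
```

Setting T as a fraction of the maximum response, as in the turbine-blade example on the next slide, suppresses the weaker responses the kernel produces around the point.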
Slide 19
➢ Example
(a) Laplacian kernel used for point detection. (b) X-ray image of a turbine blade with a porosity manifested by a single black pixel. (c) Result of convolving the kernel with the image. (d) Result of using Eq. (10-15): the porosity was detected as a single point (shown enlarged at the tip of the arrow).
Slide 20
➢ EXAMPLE: Using the Laplacian for line detection.
(a) Original image. (b) Laplacian image; the magnified section shows the positive/negative double-line effect characteristic of the Laplacian. (c) Absolute value of the Laplacian. (d) Positive values of the Laplacian.
Slide 21
➢ EXAMPLE: Detecting lines in specified directions.
Line detection kernels. Detection angles are with respect to the axis system in the figure above, with positive angles measured counterclockwise with respect to the (vertical) x-axis.
Slide 22
➢ EXAMPLE: Detecting lines in specified directions.
Slide 23
➢ A step edge
➢ A ramp edge
➢ A roof edge
Slide 24
A 1508×1970 image showing (zoomed) actual ramp (bottom left), step (top right), and roof edge profiles. The profiles are from dark to light, in the areas enclosed by the small circles. The ramp and step profiles span 9 and 2 pixels, respectively. The base of the roof edge is 3 pixels.
Slide 25
(a) Two regions of constant intensity separated by an ideal ramp edge. (b) Detail near the edge, showing a horizontal intensity profile, and its first and second derivatives.
Slide 28
➢ The three steps typically performed for edge detection are:
▪ Image smoothing for noise reduction
▪ Detection of edge points
▪ Edge localization
Slide 29
➢ The Image Gradient and Its Properties
The gradient of f at (x, y): ∇f = [gₓ, g_y]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
Magnitude: M(x, y) = ‖∇f‖ = √(gₓ² + g_y²)
The direction of the gradient vector: α(x, y) = tan⁻¹(g_y / gₓ)
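The gradient magnitude and direction can be sketched directly in NumPy using Sobel kernels (the step-edge test image is an illustrative assumption):

```python
import numpy as np

def sobel_gradients(f):
    """Return gx, gy, gradient magnitude, and angle (degrees); zero padding."""
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # d/dx kernel
    sy = sx.T                                                   # d/dy kernel
    p = np.pad(f.astype(float), 1)
    gx = np.zeros(f.shape)
    gy = np.zeros(f.shape)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            win = p[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(win * sx)
            gy[i, j] = np.sum(win * sy)
    mag = np.hypot(gx, gy)                # M(x, y) = sqrt(gx^2 + gy^2)
    ang = np.degrees(np.arctan2(gy, gx))  # direction of the gradient vector
    return gx, gy, mag, ang

# Vertical step edge: left half dark, right half bright.
f = np.zeros((5, 8))
f[:, 4:] = 100.0
gx, gy, mag, ang = sobel_gradients(f)
print(mag[2, 3], ang[2, 3])  # 400.0 0.0 -- gradient points across the edge (+x)
```

As the slides note, the edge direction is perpendicular to the gradient: here the gradient angle is 0° (horizontal) while the edge itself runs vertically.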
Slide 30
➢ EXAMPLE: Computing the gradient.
Using the gradient to determine edge strength and direction at a point. Note that the edge direction is perpendicular to the direction of the gradient vector at the point where the gradient is computed. Each square represents one pixel.
Slide 31
➢ Gradient Operators
Slide 32
➢ Gradient Operators
Slide 33
Kirsch compass kernels. The edge direction of strongest response of each kernel is labeled below it.
Slide 34
The Sobel absolute value response of the two components of the gradient.
Slide 35
Gradient angle image.
Slide 37
Diagonal edge detection. (a) Result of using the Kirsch kernel in Fig. 10.15(c). (b) Result of using the kernel in Fig. 10.15(d). The input image in both cases was Fig. 10.18(a).
Slide 38
➢ Combining the Gradient with Thresholding
(a) Result of thresholding Fig. 10.16(d), the gradient of the original image. (b) Result of thresholding Fig. 10.18(d), the gradient of the smoothed image.
Slide 39
➢ The Canny Edge Detector: three basic objectives
▪ Low error rate
▪ Edge points should be well localized
▪ Single edge point response
Slide 40
▪ The Gaussian function: G(x, y) = e^(−(x² + y²)/(2σ²))
▪ The smoothed image, formed by convolving f and G: fₛ(x, y) = G(x, y) ⋆ f(x, y)
Slide 41
(a) Two possible orientations of a horizontal edge (shaded) in a 3×3 neighborhood. (b) Range of values (shaded) of α, the direction angle of the edge normal for a horizontal edge. (c) The angle ranges of the edge normals for the four types of edge directions in a 3×3 neighborhood.
Slide 42
▪ Nonmaxima suppression: keep a pixel only if its gradient magnitude is a local maximum along the edge-normal (gradient) direction; interpolating between the two neighboring magnitudes a and b that straddle that direction gives
r = αb + (1 − α)a
Non-maximum suppression with interpolation vs. no interpolation.
Slide 43
▪ Double thresholding
Slide 44
▪ Connectivity analysis
Slide 45
▪ Smooth the input image with a Gaussian filter.
▪ Compute the gradient magnitude and angle images.
▪ Apply nonmaxima suppression to the gradient magnitude image.
▪ Use double thresholding and connectivity analysis to detect and link edges.
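The four steps above can be sketched end to end. The following is a deliberately simplified NumPy illustration (5×5 Gaussian, edge-normal direction quantized to four bins, hysteresis by repeated dilation), not a production detector; in practice a library implementation such as OpenCV's cv2.Canny would be used. The test image and thresholds are illustrative assumptions:

```python
import numpy as np

def conv2(f, k):
    """Naive 2-D convolution (symmetric kernels) with zero padding."""
    kh, kw = k.shape
    p = np.pad(f, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    return np.array([[np.sum(p[i:i + kh, j:j + kw] * k)
                      for j in range(f.shape[1])] for i in range(f.shape[0])])

def canny_sketch(f, t_low, t_high, sigma=1.0):
    # 1) Smooth the input image with a Gaussian filter (5x5 here).
    ax = np.arange(-2, 3)
    g = np.exp(-ax**2 / (2 * sigma**2))
    G = np.outer(g, g) / np.outer(g, g).sum()
    fs = conv2(f.astype(float), G)
    # 2) Compute the gradient magnitude and angle images (Sobel kernels).
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gx, gy = conv2(fs, sx), conv2(fs, sx.T)
    mag, ang = np.hypot(gx, gy), np.degrees(np.arctan2(gy, gx)) % 180
    # 3) Nonmaxima suppression; edge-normal direction quantized to 4 bins.
    nms = np.zeros_like(mag)
    offs = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            d = min(offs, key=lambda a: min(abs(ang[i, j] - a),
                                            180 - abs(ang[i, j] - a)))
            di, dj = offs[d]
            if mag[i, j] >= mag[i + di, j + dj] and mag[i, j] >= mag[i - di, j - dj]:
                nms[i, j] = mag[i, j]
    # 4) Double thresholding + connectivity analysis: a weak pixel survives
    #    only if it is 8-connected to a strong pixel.
    strong, weak = nms >= t_high, (nms >= t_low) & (nms < t_high)
    edges, changed = strong.copy(), True
    while changed:
        pe = np.pad(edges, 1)
        grown = np.zeros_like(edges)
        for di in (0, 1, 2):           # 8-neighbor dilation of current edges
            for dj in (0, 1, 2):
                grown |= pe[di:di + edges.shape[0], dj:dj + edges.shape[1]]
        grown &= strong | weak
        changed, edges = not np.array_equal(grown, edges), grown
    return edges

# Vertical step edge: the detector returns a thin vertical line of edge pixels.
f = np.zeros((12, 12))
f[:, 6:] = 100.0
e = canny_sketch(f, t_low=20, t_high=80)
print(np.unique(np.argwhere(e)[:, 1]))  # [5 6]
```

The smoothing spreads the step, so the gradient peaks in the two columns straddling it; the `>=` tie-breaking in step 3 keeps both.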
Slide 46
(a) Original image of size 834×1114 pixels, with intensity values scaled to the range [0, 1]. (b) Thresholded gradient of the smoothed image. (c) Image obtained using the Marr-Hildreth algorithm. (d) Image obtained using the Canny algorithm. Note the significant improvement of the Canny image compared to the other two.
Slide 47
➢ Line detection with the Hough transform
(a) xy-plane. (b) Parameter space.
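The voting procedure behind the Hough transform can be sketched with a plain accumulator array over the normal representation ρ = x cos θ + y sin θ (the 10×10 test image with five collinear points is an illustrative assumption):

```python
import numpy as np

def hough_lines(edge_img, n_theta=180):
    """Vote in (rho, theta) space using rho = x*cos(theta) + y*sin(theta)."""
    h, w = edge_img.shape
    thetas = np.deg2rad(np.arange(-90, 90, 180 / n_theta))
    diag = int(np.ceil(np.hypot(h, w)))    # rho ranges over [-diag, diag]
    rhos = np.arange(-diag, diag + 1)
    acc = np.zeros((len(rhos), len(thetas)), dtype=int)
    ys, xs = np.nonzero(edge_img)          # foreground (edge) pixels
    for x, y in zip(xs, ys):
        for t, th in enumerate(thetas):
            rho = int(round(x * np.cos(th) + y * np.sin(th)))
            acc[rho + diag, t] += 1        # one vote per (rho, theta) cell
    return acc, rhos, thetas

# Five collinear points on the vertical line x = 3.
img = np.zeros((10, 10), dtype=bool)
img[2:7, 3] = True
acc, rhos, thetas = hough_lines(img)

r3 = np.where(rhos == 3)[0][0]
t0 = np.argmin(np.abs(thetas))             # index of theta = 0
print(acc[r3, t0], acc.max())  # 5 5 -- all five points vote for (rho=3, theta=0)
```

Each edge point traces a sinusoid in parameter space; collinear points' sinusoids intersect in one cell, which is why peaks in the accumulator correspond to lines in the image.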
Slide 49
(a) Image of size 101×101 pixels, containing five white points (four in the corners and one in the center). (b) Corresponding parameter space.
Slide 50
(a) A 502×564 aerial image of an airport. (b) Edge map obtained using Canny's algorithm. (c) Hough parameter space (the boxes highlight the points associated with long vertical lines). (d) Lines in the image plane corresponding to the points highlighted by the boxes. (e) Lines superimposed on the original image.
Slide 52
Single thresholding: g(x, y) = 1 if f(x, y) > T; g(x, y) = 0 if f(x, y) ≤ T   (1)
Multiple thresholding: g(x, y) = a if f(x, y) > T₂; b if T₁ < f(x, y) ≤ T₂; c if f(x, y) ≤ T₁
Slide 53
Intensity histograms that can be partitioned (a) by a single threshold, and (b) by dual thresholds.
Slide 54
(a) Noiseless 8-bit image. (b) Image with additive Gaussian noise of mean 0 and standard deviation of 10 intensity levels. (c) Image with additive Gaussian noise of mean 0 and standard deviation of 50 intensity levels. (d) through (f) Corresponding histograms.
Slide 55
(a) Noisy image. (b) Intensity ramp in the range [0.2, 0.6]. (c) Product of (a) and (b). (d) through (f) Corresponding histograms.
Slide 56
➢ Iterative algorithm
1. Select an initial estimate for the global threshold, T.
2. Segment the image using T in Eq. (1). This will produce two groups of pixels: G1, consisting of pixels with intensity values > T, and G2, consisting of pixels with values ≤ T.
3. Compute the average (mean) intensity values m1 and m2 for the pixels in G1 and G2, respectively.
4. Compute a new threshold value midway between m1 and m2: T = (m1 + m2)/2.
5. Repeat Steps 2 through 4 until the difference between values of T in successive iterations is smaller than a predefined value, ΔT.
Slide 57
(a) Noisy fingerprint. (b) Histogram. (c) Segmented result using a global threshold (thin image border added for clarity).
Slide 59
(a) Original image. (b) Histogram (high peaks were clipped to highlight details in the lower values). (c) Segmentation result using the basic global algorithm. (d) Result using Otsu's method.
Slide 60
(a) Noisy image and (b) its histogram. (c) Result obtained using Otsu's method. (d) Noisy image smoothed using a 5×5 averaging kernel and (e) its histogram. (f) Result of thresholding using Otsu's method.
Slide 61
(a) Noisy image and (b) its histogram. (c) Result obtained using Otsu's method. (d) Noisy image smoothed using a 5×5 averaging kernel and (e) its histogram. (f) Result of thresholding using Otsu's method. Thresholding failed in both cases to extract the object of interest.
Slide 62
(a) Noisy image and (b) its histogram. (c) Mask image formed as the gradient magnitude image thresholded at the 99.7 percentile. (d) Image formed as the product of (a) and (c). (e) Histogram of the nonzero pixels in the image in (d). (f) Result of segmenting image (a) with the Otsu threshold based on the histogram in (e). The threshold was 134, which is approximately midway between the peaks in this histogram.
Slide 63
(a) Image of yeast cells. (b) Histogram of (a). (c) Segmentation of (a) with Otsu's method using the histogram in (b). (d) Mask image formed by thresholding the absolute Laplacian image. (e) Histogram of the nonzero pixels in the product of (a) and (d). (f) Original image thresholded using Otsu's method based on the histogram in (e).
Slide 64
(a) Image of an iceberg. (b) Histogram. (c) Image segmented into three regions using dual Otsu thresholds.
Slide 66
➢ Semantic segmentation: classification of the pixels in an image into semantic classes rather than individual object instances
➢ Panoptic segmentation: the combination of semantic segmentation and instance segmentation, where each instance of an object in the image is segregated and the object's identity is predicted
Slide 68
➢ Semantic segmentation models provide segmentation maps as outputs corresponding to the inputs they are fed.
Slide 69
➢ As in most other vision applications, using a CNN for semantic segmentation is the obvious choice.
➢ When using a CNN for semantic segmentation, the output is an image (a per-pixel class map) rather than a fixed-length vector.
Slide 70
➢ The architecture of the model contains several convolutional layers and non-linear activation layers
➢ The initial layers learn low-level concepts such as edges and colors
➢ The later layers learn higher-level concepts such as different objects
Slide 71
The spatial tensor is downsampled and converted to a vector.
Slide 72
Encoder-Decoder architecture
Slide 73
Encoder-Decoder with skip connections
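The downsample/upsample data flow of an encoder-decoder with a skip connection can be illustrated with array shapes alone. A minimal NumPy sketch using 2×2 max-pooling as the encoder step, nearest-neighbor upsampling as the decoder step, and channel concatenation as the skip (a real model would interleave learned convolutions at every stage):

```python
import numpy as np

def maxpool2x2(x):
    """Encoder step: halve spatial resolution, (C, H, W) -> (C, H/2, W/2)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample2x2(x):
    """Decoder step: nearest-neighbor upsampling, (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

# Input feature map: 8 channels on a 32x32 spatial grid.
feat = np.random.rand(8, 32, 32)

enc1 = maxpool2x2(feat)                       # (8, 16, 16)
enc2 = maxpool2x2(enc1)                       # (8, 8, 8)   -- bottleneck
dec1 = upsample2x2(enc2)                      # (8, 16, 16)
skip = np.concatenate([dec1, enc1], axis=0)   # skip connection: (16, 16, 16)
dec0 = upsample2x2(skip)                      # (16, 32, 32) -- input resolution

print(enc2.shape, skip.shape, dec0.shape)
```

The skip connection reinjects the encoder's higher-resolution features into the decoder, which is what lets architectures such as UNet recover the spatial detail lost in the bottleneck.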
Slide 74
Transfer learning for segmentation
Slide 75
➢ Each pixel of the output of the network is compared with the corresponding pixel in the ground-truth segmentation image. We apply the standard cross-entropy loss on each pixel.
In binary classification, where the number of classes M equals 2, cross-entropy can be calculated as:
CE = −[y log(p) + (1 − y) log(1 − p)]
If M > 2 (i.e., multiclass classification), we calculate a separate loss for each class label per observation and sum the result:
CE = −Σ_{c=1}^{M} y_{o,c} log(p_{o,c})
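Both loss formulas can be sketched per pixel in NumPy (the toy labels and predicted probabilities are illustrative assumptions):

```python
import numpy as np

def binary_ce(y, p, eps=1e-12):
    """Binary cross-entropy -(y log p + (1-y) log(1-p)), averaged over pixels."""
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def multiclass_ce(y_onehot, p, eps=1e-12):
    """Multiclass cross-entropy -sum_c y_c log p_c, averaged over pixels."""
    p = np.clip(p, eps, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=-1))

# Binary example: 4 pixels, ground truth vs. predicted foreground probability.
y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.1, 0.8, 0.6])
print(binary_ce(y, p))  # small loss: predictions agree with the ground truth

# 3-class example: 2 pixels, one-hot labels and softmax-like predictions.
y3 = np.array([[1, 0, 0], [0, 0, 1]])
p3 = np.array([[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]])
print(multiclass_ce(y3, p3))
```

In a segmentation network the same computation runs over every pixel of the output map; deep-learning frameworks provide it as a built-in loss applied to the logits.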
Slide 76
➢ Dataset: the first step in training our segmentation model is to prepare the dataset.
➢ Data augmentation:
Example of image augmentation for segmentation.
Slide 77
➢ Building the model: define our segmentation model with skip connections
➢ Choosing the model:
▪ Choosing the base model: select an appropriate base network → ResNet, VGG-16, MobileNet, custom CNN, …
▪ Selecting the segmentation architecture: FCN, SegNet, UNet, PSPNet, …
Slide 78
Architecture of FCN32
Slide 79
Architecture of SegNet
Slide 80
Architecture of UNet
Slide 81
Architecture of PSPNet
Slide 82
➢ Choosing the input size: if there are a large number of objects in the image, the input size should be larger. The standard input size is somewhere from 200×200 to 600×600.
➢ Training: the output is the model weights.
➢ Testing: get predictions from a saved model.
Slide 83
Image segmentation results of DeepLabV3 on sample images.
Photo credit: https://arxiv.org/pdf/2001.05566