Slide 1
IMAGE PROCESSING IN MECHATRONICS
Machine Vision
Lecturer: Dr. Nguyễn Thành Hùng, Department of Mechatronics, School of Mechanical Engineering
Hà Nội, 2021
Slide 3
➢ Let R represent the entire spatial region occupied by an image. Image segmentation is a process that partitions R into n subregions, R1, R2, …, Rn, such that:
(a) ∪ᵢ Rᵢ = R;
(b) Rᵢ is a connected set, for i = 1, 2, …, n;
(c) Rᵢ ∩ Rⱼ = ∅ for all i and j, i ≠ j;
(d) Q(Rᵢ) = TRUE for i = 1, 2, …, n;
(e) Q(Rᵢ ∪ Rⱼ) = FALSE for any adjacent regions Rᵢ and Rⱼ;
where Q(Rₖ) is a logical predicate defined over the points in set Rₖ, and ∅ is the null set.
Slide 4
➢ Two regions are said to be disjoint if the set formed by their union is not connected.
➢ The fundamental problem in segmentation is to partition an image into regions that satisfy the preceding conditions.
➢ Segmentation algorithms for monochrome images are generally based on one of two basic categories of intensity-value properties: discontinuity and similarity.
▪ Edge-based segmentation
▪ Region-based segmentation
Slide 5
(a) Image of a constant intensity region. (b) Boundary based on intensity discontinuities. (c) Result of segmentation. (d) Image of a texture region. (e) Result of intensity discontinuity computations (note the large number of small edges). (f) Result of segmentation based on region properties.
Slide 8
➢ The three types of image characteristics: isolated points, lines, and edges
➢ Edge pixels: pixels at which the intensity of an image changes abruptly
➢ Edges (or edge segments): sets of connected edge pixels
➢ Edge detectors: local image processing tools designed to detect edge pixels
Slide 9
➢ Line: a (typically) thin edge segment in which the intensity of the background on either side of the line is either much higher or much lower than the intensity of the line pixels
➢ Isolated point: a foreground (background) pixel surrounded by background (foreground) pixels
Slide 10
➢ An approximation to the first-order derivative at an arbitrary point x of a one-dimensional function f(x) is obtained by expanding f(x + Δx) into a Taylor series about x, with Δx = 1 for the sample following x and Δx = −1 for the sample preceding x.
Slide 11
➢ An approximation to the first-order derivative at an arbitrary point x of a one-dimensional function f(x)
Slide 12
➢ The forward difference: ∂f/∂x = f(x + 1) − f(x)
➢ The backward difference: ∂f/∂x = f(x) − f(x − 1)
Slide 13
➢ The central difference: ∂f/∂x = [f(x + 1) − f(x − 1)] / 2
➢ The second-order derivative based on a central difference: ∂²f/∂x² = f(x + 1) − 2f(x) + f(x − 1)
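These difference formulas are easy to check numerically. A minimal NumPy sketch (the quadratic test function is an arbitrary choice, not from the slides):

```python
import numpy as np

# Sample a 1-D function at integer (pixel) positions.
x = np.arange(10, dtype=float)
f = x**2  # example signal; its true derivative is 2x, second derivative is 2

forward = f[2:] - f[1:-1]              # f(x+1) - f(x)
backward = f[1:-1] - f[:-2]            # f(x) - f(x-1)
central = (f[2:] - f[:-2]) / 2         # [f(x+1) - f(x-1)] / 2
second = f[2:] - 2 * f[1:-1] + f[:-2]  # f(x+1) - 2 f(x) + f(x-1)

print(central)  # exact 2x at interior samples: [ 2.  4.  6.  8. 10. 12. 14. 16.]
print(second)   # constant 2 everywhere
```

For f(x) = x² the central difference is exact at every interior sample, while the forward and backward differences are each off by 1; averaging them recovers the central difference.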
Slide 15
➢ For two variables: ∂²f(x, y)/∂x² = f(x + 1, y) − 2f(x, y) + f(x − 1, y), and similarly ∂²f(x, y)/∂y² = f(x, y + 1) − 2f(x, y) + f(x, y − 1)
Slide 17
➢ Spatial filter kernel
A general 3×3 spatial filter kernel. The w's are the kernel coefficients (weights). The response of the filter at the center point of the neighborhood is
R = w₁z₁ + w₂z₂ + … + w₉z₉ = Σₖ wₖ zₖ,
where zₖ is the intensity of the pixel whose spatial location corresponds to the location of the kth kernel coefficient.
Slide 18
➢ Laplacian: Z(x, y) = ∇²f(x, y)
➢ Detected points: g(x, y) = 1 if |Z(x, y)| > T, and g(x, y) = 0 otherwise, where T is a nonnegative threshold (Eq. 10-15)
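The Laplacian-based point detection described above can be sketched in NumPy. The 8-neighbor Laplacian kernel is the common choice for this task; the synthetic test image and the 90%-of-maximum threshold are illustrative assumptions:

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 2-D convolution with zero padding (kernel is symmetric here)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# 8-neighbor Laplacian kernel used for point detection.
laplacian = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=float)

# Synthetic image: uniform background with one isolated bright pixel.
f = np.full((9, 9), 10.0)
f[4, 4] = 200.0

Z = convolve2d(f, laplacian)
T = 0.9 * np.abs(Z).max()              # threshold at 90% of the max response
g = (np.abs(Z) > T).astype(np.uint8)   # Eq. (10-15): 1 where |Z| > T

print(np.argwhere(g == 1))  # [[4 4]] -- only the isolated point is detected
```

Setting T as a fraction of the maximum response, as in the turbine-blade example on the next slide, suppresses the weaker responses the kernel produces around the point.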
Slide 19
➢ Example
(a) Laplacian kernel used for point detection. (b) X-ray image of a turbine blade with a porosity manifested by a single black pixel. (c) Result of convolving the kernel with the image. (d) Result of using Eq. (10-15): the porosity was detected as a single point (shown enlarged at the tip of the arrow).
Slide 20
➢ EXAMPLE: Using the Laplacian for line detection.
(a) Original image. (b) Laplacian image; the magnified section shows the positive/negative double-line effect characteristic of the Laplacian. (c) Absolute value of the Laplacian. (d) Positive values of the Laplacian.
Slide 21
➢ EXAMPLE: Detecting lines in specified directions.
Line detection kernels. Detection angles are with respect to the axis system in the figure above, with positive angles measured counterclockwise with respect to the (vertical) x-axis.
Slide 22
➢ EXAMPLE: Detecting lines in specified directions.
Slide 23
➢ A step edge
➢ A ramp edge
➢ A roof edge
Slide 24
A 1508×1970 image showing (zoomed) actual ramp (bottom left), step (top right), and roof edge profiles. The profiles are from dark to light, in the areas enclosed by the small circles. The ramp and step profiles span 9 and 2 pixels, respectively. The base of the roof edge is 3 pixels.
Slide 25
(a) Two regions of constant intensity separated by an ideal ramp edge. (b) Detail near the edge, showing a horizontal intensity profile, and its first and second derivatives.
Slide 28
➢ The three steps typically performed for edge detection are:
▪ Image smoothing for noise reduction
▪ Detection of edge points
▪ Edge localization
Slide 29
➢ The Image Gradient and Its Properties
The gradient of f at (x, y): ∇f = [gₓ, g_y]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
Magnitude: M(x, y) = ‖∇f‖ = √(gₓ² + g_y²)
The direction of the gradient vector: α(x, y) = tan⁻¹(g_y / gₓ)
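The gradient magnitude and direction can be sketched directly in NumPy using Sobel kernels (the step-edge test image is an illustrative assumption):

```python
import numpy as np

def sobel_gradients(f):
    """Return gx, gy, gradient magnitude, and angle (degrees); zero padding."""
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # d/dx kernel
    sy = sx.T                                                   # d/dy kernel
    p = np.pad(f.astype(float), 1)
    gx = np.zeros(f.shape)
    gy = np.zeros(f.shape)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            win = p[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(win * sx)
            gy[i, j] = np.sum(win * sy)
    mag = np.hypot(gx, gy)                # M(x, y) = sqrt(gx^2 + gy^2)
    ang = np.degrees(np.arctan2(gy, gx))  # direction of the gradient vector
    return gx, gy, mag, ang

# Vertical step edge: left half dark, right half bright.
f = np.zeros((5, 8))
f[:, 4:] = 100.0
gx, gy, mag, ang = sobel_gradients(f)
print(mag[2, 3], ang[2, 3])  # 400.0 0.0 -- gradient points across the edge (+x)
```

As the slides note, the edge direction is perpendicular to the gradient: here the gradient angle is 0° (horizontal) while the edge itself runs vertically.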
Slide 30
➢ EXAMPLE: Computing the gradient.
Using the gradient to determine edge strength and direction at a point. Note that the edge direction is perpendicular to the direction of the gradient vector at the point where the gradient is computed. Each square represents one pixel.
Slide 31
➢ Gradient Operators
Slide 32
➢ Gradient Operators
Slide 33
Kirsch compass kernels. The edge direction of strongest response of each kernel is labeled below it.
Slide 34
The Sobel absolute value response of the two components of the gradient.
Slide 35
Gradient angle image.
Slide 37
Diagonal edge detection. (a) Result of using the Kirsch kernel in Fig. 10.15(c). (b) Result of using the kernel in Fig. 10.15(d). The input image in both cases was Fig. 10.18(a).
Slide 38
➢ Combining the Gradient with Thresholding
(a) Result of thresholding Fig. 10.16(d), the gradient of the original image. (b) Result of thresholding Fig. 10.18(d), the gradient of the smoothed image.
Slide 39
➢ The Canny Edge Detector: three basic objectives
▪ Low error rate
▪ Edge points should be well localized
▪ Single edge point response
Slide 40
▪ The Gaussian function: G(x, y) = e^(−(x² + y²)/(2σ²))
▪ The smoothed image, formed by convolving f and G: fₛ(x, y) = G(x, y) ⋆ f(x, y)
Slide 41
(a) Two possible orientations of a horizontal edge (shaded) in a 3×3 neighborhood. (b) Range of values (shaded) of α, the direction angle of the edge normal for a horizontal edge. (c) The angle ranges of the edge normals for the four types of edge directions in a 3×3 neighborhood.
Slide 42
▪ Nonmaxima suppression: keep a pixel only if its gradient magnitude is a local maximum along the edge-normal (gradient) direction; interpolating between the two neighboring magnitudes a and b that straddle that direction gives
r = αb + (1 − α)a
Non-maximum suppression with interpolation vs. no interpolation.
Slide 43
▪ Double thresholding
Slide 44
▪ Connectivity analysis
Slide 45
▪ Smooth the input image with a Gaussian filter.
▪ Compute the gradient magnitude and angle images.
▪ Apply nonmaxima suppression to the gradient magnitude image.
▪ Use double thresholding and connectivity analysis to detect and link edges.
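The four steps above can be sketched end to end. The following is a deliberately simplified NumPy illustration (5×5 Gaussian, edge-normal direction quantized to four bins, hysteresis by repeated dilation), not a production detector; in practice a library implementation such as OpenCV's cv2.Canny would be used. The test image and thresholds are illustrative assumptions:

```python
import numpy as np

def conv2(f, k):
    """Naive 2-D convolution (symmetric kernels) with zero padding."""
    kh, kw = k.shape
    p = np.pad(f, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    return np.array([[np.sum(p[i:i + kh, j:j + kw] * k)
                      for j in range(f.shape[1])] for i in range(f.shape[0])])

def canny_sketch(f, t_low, t_high, sigma=1.0):
    # 1) Smooth the input image with a Gaussian filter (5x5 here).
    ax = np.arange(-2, 3)
    g = np.exp(-ax**2 / (2 * sigma**2))
    G = np.outer(g, g) / np.outer(g, g).sum()
    fs = conv2(f.astype(float), G)
    # 2) Compute the gradient magnitude and angle images (Sobel kernels).
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gx, gy = conv2(fs, sx), conv2(fs, sx.T)
    mag, ang = np.hypot(gx, gy), np.degrees(np.arctan2(gy, gx)) % 180
    # 3) Nonmaxima suppression; edge-normal direction quantized to 4 bins.
    nms = np.zeros_like(mag)
    offs = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            d = min(offs, key=lambda a: min(abs(ang[i, j] - a),
                                            180 - abs(ang[i, j] - a)))
            di, dj = offs[d]
            if mag[i, j] >= mag[i + di, j + dj] and mag[i, j] >= mag[i - di, j - dj]:
                nms[i, j] = mag[i, j]
    # 4) Double thresholding + connectivity analysis: a weak pixel survives
    #    only if it is 8-connected to a strong pixel.
    strong, weak = nms >= t_high, (nms >= t_low) & (nms < t_high)
    edges, changed = strong.copy(), True
    while changed:
        pe = np.pad(edges, 1)
        grown = np.zeros_like(edges)
        for di in (0, 1, 2):           # 8-neighbor dilation of current edges
            for dj in (0, 1, 2):
                grown |= pe[di:di + edges.shape[0], dj:dj + edges.shape[1]]
        grown &= strong | weak
        changed, edges = not np.array_equal(grown, edges), grown
    return edges

# Vertical step edge: the detector returns a thin vertical line of edge pixels.
f = np.zeros((12, 12))
f[:, 6:] = 100.0
e = canny_sketch(f, t_low=20, t_high=80)
print(np.unique(np.argwhere(e)[:, 1]))  # [5 6]
```

The smoothing spreads the step, so the gradient peaks in the two columns straddling it; the `>=` tie-breaking in step 3 keeps both.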
Slide 46
(a) Original image of size 834×1114 pixels, with intensity values scaled to the range [0, 1]. (b) Thresholded gradient of the smoothed image. (c) Image obtained using the Marr-Hildreth algorithm. (d) Image obtained using the Canny algorithm. Note the significant improvement of the Canny image compared to the other two.
Slide 47
➢ Line detection with the Hough transform
(a) xy-plane. (b) Parameter space.
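The voting procedure behind the Hough transform can be sketched with a plain accumulator array over the normal representation ρ = x cos θ + y sin θ (the 10×10 test image with five collinear points is an illustrative assumption):

```python
import numpy as np

def hough_lines(edge_img, n_theta=180):
    """Vote in (rho, theta) space using rho = x*cos(theta) + y*sin(theta)."""
    h, w = edge_img.shape
    thetas = np.deg2rad(np.arange(-90, 90, 180 / n_theta))
    diag = int(np.ceil(np.hypot(h, w)))    # rho ranges over [-diag, diag]
    rhos = np.arange(-diag, diag + 1)
    acc = np.zeros((len(rhos), len(thetas)), dtype=int)
    ys, xs = np.nonzero(edge_img)          # foreground (edge) pixels
    for x, y in zip(xs, ys):
        for t, th in enumerate(thetas):
            rho = int(round(x * np.cos(th) + y * np.sin(th)))
            acc[rho + diag, t] += 1        # one vote per (rho, theta) cell
    return acc, rhos, thetas

# Five collinear points on the vertical line x = 3.
img = np.zeros((10, 10), dtype=bool)
img[2:7, 3] = True
acc, rhos, thetas = hough_lines(img)

r3 = np.where(rhos == 3)[0][0]
t0 = np.argmin(np.abs(thetas))             # index of theta = 0
print(acc[r3, t0], acc.max())  # 5 5 -- all five points vote for (rho=3, theta=0)
```

Each edge point traces a sinusoid in parameter space; collinear points' sinusoids intersect in one cell, which is why peaks in the accumulator correspond to lines in the image.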
Slide 49
(a) Image of size 101×101 pixels, containing five white points (four in the corners and one in the center). (b) Corresponding parameter space.
Slide 50
(a) A 502×564 aerial image of an airport. (b) Edge map obtained using Canny's algorithm. (c) Hough parameter space (the boxes highlight the points associated with long vertical lines). (d) Lines in the image plane corresponding to the points highlighted by the boxes. (e) Lines superimposed on the original image.
Slide 52
Single thresholding: g(x, y) = 1 if f(x, y) > T; g(x, y) = 0 if f(x, y) ≤ T   (1)
Multiple thresholding: g(x, y) = a if f(x, y) > T₂; b if T₁ < f(x, y) ≤ T₂; c if f(x, y) ≤ T₁
Slide 53
Intensity histograms that can be partitioned (a) by a single threshold, and (b) by dual thresholds.
Slide 54
(a) Noiseless 8-bit image. (b) Image with additive Gaussian noise of mean 0 and standard deviation of 10 intensity levels. (c) Image with additive Gaussian noise of mean 0 and standard deviation of 50 intensity levels. (d) through (f) Corresponding histograms.
Slide 55
(a) Noisy image. (b) Intensity ramp in the range [0.2, 0.6]. (c) Product of (a) and (b). (d) through (f) Corresponding histograms.
Slide 56
➢ Iterative algorithm
1. Select an initial estimate for the global threshold, T.
2. Segment the image using T in Eq. (1). This will produce two groups of pixels: G1, consisting of pixels with intensity values > T, and G2, consisting of pixels with values ≤ T.
3. Compute the average (mean) intensity values m1 and m2 for the pixels in G1 and G2, respectively.
4. Compute a new threshold value midway between m1 and m2: T = (m1 + m2)/2.
5. Repeat Steps 2 through 4 until the difference between values of T in successive iterations is smaller than a predefined value, ΔT.
Slide 57
(a) Noisy fingerprint. (b) Histogram. (c) Segmented result using a global threshold (thin image border added for clarity).
Slide 59
(a) Original image. (b) Histogram (high peaks were clipped to highlight details in the lower values). (c) Segmentation result using the basic global algorithm. (d) Result using Otsu's method.
Slide 60
(a) Noisy image and (b) its histogram. (c) Result obtained using Otsu's method. (d) Noisy image smoothed using a 5×5 averaging kernel and (e) its histogram. (f) Result of thresholding using Otsu's method.
Slide 61
(a) Noisy image and (b) its histogram. (c) Result obtained using Otsu's method. (d) Noisy image smoothed using a 5×5 averaging kernel and (e) its histogram. (f) Result of thresholding using Otsu's method. Thresholding failed in both cases to extract the object of interest.
Slide 62
(a) Noisy image and (b) its histogram. (c) Mask image formed as the gradient magnitude image thresholded at the 99.7 percentile. (d) Image formed as the product of (a) and (c). (e) Histogram of the nonzero pixels in the image in (d). (f) Result of segmenting image (a) with the Otsu threshold based on the histogram in (e). The threshold was 134, which is approximately midway between the peaks in this histogram.
Slide 63
(a) Image of yeast cells. (b) Histogram of (a). (c) Segmentation of (a) with Otsu's method using the histogram in (b). (d) Mask image formed by thresholding the absolute Laplacian image. (e) Histogram of the nonzero pixels in the product of (a) and (d). (f) Original image thresholded using Otsu's method based on the histogram in (e).
Slide 64
(a) Image of an iceberg. (b) Histogram. (c) Image segmented into three regions using dual Otsu thresholds.
Slide 66
➢ Semantic segmentation: classification of the pixels in an image into semantic classes rather than individual object instances
➢ Panoptic segmentation: the combination of semantic segmentation and instance segmentation, where each instance of an object in the image is segregated and the object's identity is predicted
Slide 68
➢ Semantic segmentation models provide segmentation maps as outputs corresponding to the inputs they are fed.
Slide 69
➢ As in most other vision applications, using a CNN for semantic segmentation is the obvious choice.
➢ When using a CNN for semantic segmentation, the output is an image (a per-pixel class map) rather than a fixed-length vector.
Slide 70
➢ The architecture of the model contains several convolutional layers and non-linear activation layers
➢ The initial layers learn low-level concepts such as edges and colors
➢ The later layers learn higher-level concepts such as different objects
Slide 71
The spatial tensor is downsampled and converted to a vector.
Slide 72
Encoder-Decoder architecture
Slide 73
Encoder-Decoder with skip connections
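The downsample/upsample data flow of an encoder-decoder with a skip connection can be illustrated with array shapes alone. A minimal NumPy sketch using 2×2 max-pooling as the encoder step, nearest-neighbor upsampling as the decoder step, and channel concatenation as the skip (a real model would interleave learned convolutions at every stage):

```python
import numpy as np

def maxpool2x2(x):
    """Encoder step: halve spatial resolution, (C, H, W) -> (C, H/2, W/2)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample2x2(x):
    """Decoder step: nearest-neighbor upsampling, (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

# Input feature map: 8 channels on a 32x32 spatial grid.
feat = np.random.rand(8, 32, 32)

enc1 = maxpool2x2(feat)                       # (8, 16, 16)
enc2 = maxpool2x2(enc1)                       # (8, 8, 8)   -- bottleneck
dec1 = upsample2x2(enc2)                      # (8, 16, 16)
skip = np.concatenate([dec1, enc1], axis=0)   # skip connection: (16, 16, 16)
dec0 = upsample2x2(skip)                      # (16, 32, 32) -- input resolution

print(enc2.shape, skip.shape, dec0.shape)
```

The skip connection reinjects the encoder's higher-resolution features into the decoder, which is what lets architectures such as UNet recover the spatial detail lost in the bottleneck.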
Slide 74
Transfer learning for segmentation
Slide 75
➢ Each pixel of the output of the network is compared with the corresponding pixel in the ground-truth segmentation image. We apply the standard cross-entropy loss on each pixel.
In binary classification, where the number of classes M equals 2, cross-entropy can be calculated as:
CE = −[y log(p) + (1 − y) log(1 − p)]
If M > 2 (i.e., multiclass classification), we calculate a separate loss for each class label per observation and sum the result:
CE = −Σ_{c=1}^{M} y_{o,c} log(p_{o,c})
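Both loss formulas can be sketched per pixel in NumPy (the toy labels and predicted probabilities are illustrative assumptions):

```python
import numpy as np

def binary_ce(y, p, eps=1e-12):
    """Binary cross-entropy -(y log p + (1-y) log(1-p)), averaged over pixels."""
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def multiclass_ce(y_onehot, p, eps=1e-12):
    """Multiclass cross-entropy -sum_c y_c log p_c, averaged over pixels."""
    p = np.clip(p, eps, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=-1))

# Binary example: 4 pixels, ground truth vs. predicted foreground probability.
y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.1, 0.8, 0.6])
print(binary_ce(y, p))  # small loss: predictions agree with the ground truth

# 3-class example: 2 pixels, one-hot labels and softmax-like predictions.
y3 = np.array([[1, 0, 0], [0, 0, 1]])
p3 = np.array([[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]])
print(multiclass_ce(y3, p3))
```

In a segmentation network the same computation runs over every pixel of the output map; deep-learning frameworks provide it as a built-in loss applied to the logits.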
Slide 76
➢ Dataset: the first step in training our segmentation model is to prepare the dataset.
➢ Data augmentation:
Example of image augmentation for segmentation.
Slide 77
➢ Building the model: define our segmentation model with skip connections
➢ Choosing the model:
▪ Choosing the base model: select an appropriate base network → ResNet, VGG-16, MobileNet, custom CNN, …
▪ Selecting the segmentation architecture: FCN, SegNet, UNet, PSPNet, …
Slide 78
Architecture of FCN32
Slide 79
Architecture of SegNet
Slide 80
Architecture of UNet
Slide 81
Architecture of PSPNet
Slide 82
➢ Choosing the input size: if there are a large number of objects in the image, the input size should be larger. The standard input size is somewhere from 200×200 to 600×600.
➢ Training: the output is the model weights.
➢ Testing: get predictions from a saved model.
Slide 83
Image segmentation results of DeepLabV3 on sample images.
Photo credit: https://arxiv.org/pdf/2001.05566