
Midterm Report Digital Image Processing Canny Algorithm and Code




DOCUMENT INFORMATION

Basic information

Title: Midterm Report Digital Image Processing Canny Algorithm and Code
Authors: Nguyen Minh Nhut, Tran Nhan Tai
Supervisor: PhD Pham Van Huy
University: Ton Duc Thang University
Major: Digital Image Processing
Document type: Midterm report
Year of publication: 2021
City: Ho Chi Minh City
Format
Number of pages: 32
File size: 3.94 MB


Structure

  • PART 1: CANNY ALGORITHM THEORY
  • CHAPTER 1: WHAT IS EDGE DETECTION?
    • 1.1 What is an edge?
      • 1.1.1 Contour and Edge
    • 1.2 What is edge detection?
  • CHAPTER 2: Canny edge and Canny edge detection algorithm
    • 2.1 What is gradient and gradient based operator?
      • 2.1.1 What is gradient?
      • 2.1.2 What is image gradient?
    • 2.2 Who is Canny?
    • 2.3 Canny Edge Detection algorithm
      • 2.3.1 Smoothing (remove noise)
      • 2.3.2 Finding gradients
      • 2.3.3 Non-maximum suppression
      • 2.3.4 Double thresholding
      • 2.3.5 Edge tracking by hysteresis
  • PART 2: CANNY ALGORITHM SCRIPTING
  • CHAPTER 3: Canny algorithm scripting

Contents

Technology has enriched our lives with a range of valuable inventions, and the camera is among the most essential. This compact device is something many people take everywhere, because it helps capture moments that can be treasured as memories. As images have become extremely popular, the analysis of images has been a growing field of science and application for several decades.

Edge detection is a foundational technique in image processing and computer vision, with classic methods such as Prewitt, Roberts, Sobel, and Marr-Hildreth. In this report we focus on the Canny edge detector, regarded as an efficient and robust tool for edge extraction. To keep the discussion easy to follow, and not to try to rebuild the rest of the algorithm from scratch since "please do not re-invent the wheel when it is already there for you", we will go through all of the following headers:

PART 1: CANNY ALGORITHM THEORY

Chapter 1: What is Edge detection?

Chapter 2: Canny edge and Canny edge detection algorithm

PART 2: CANNY ALGORITHM SCRIPTING

Chapter 3: Canny algorithm scripting


CHAPTER 1: WHAT IS EDGE DETECTION?

Shapes in an image are defined by their boundaries, and edge detection or edge enhancement algorithms are used to detect or sharpen these boundaries. Many texture measures also rely on detecting edges, highlighting the central role that edges play in image analysis. Edge detection is a well-developed field within image processing and functions as a core image segmentation technique, partitioning the image's spatial domain into meaningful regions.

1.1 What is an edge?

In low-level image processing, grayscale images are often used as the basis for edge detection, since edges are typically defined as large or abrupt changes in pixel intensity along a line or curve. This focus on intensity transitions, rather than color, makes edge detection more robust and simplifies downstream tasks such as segmentation, feature extraction, and object recognition.

In image processing, edges appear as large values in the first derivative of the signal, indicating a sudden change in intensity. Because images are two-dimensional, the derivative is defined more intricately than in one dimension, capturing abrupt transitions across neighboring pixels. A hard edge occurs when the intensity changes very abruptly over only about two to three pixels, while a soft edge results from a transition from bright to dark that spans several pixels. Put simply, edges are discontinuities in intensity.
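As a minimal sketch of this idea in one dimension (my own illustration, assuming NumPy is available), the first difference of an intensity profile spikes exactly at the step:

    import numpy as np

    # A 1-D intensity profile with a hard edge (step) at index 5
    profile = np.array([10, 10, 10, 10, 10, 200, 200, 200, 200, 200])

    # First derivative approximated by finite differences
    diff = np.diff(profile.astype(np.int32))

    # The edge shows up as a single large spike in the derivative
    edge_positions = np.where(np.abs(diff) > 50)[0]
    print(edge_positions)  # -> [4], the boundary between pixels 4 and 5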

Viewed as a diagram, the sign shows a color transition from red to white, and the edge is the boundary where this transition changes abruptly. The red area contributes its influence up to a breakpoint, after which the white takes over; many such breakpoints arranged along a line or curve form a complete edge, capturing the discontinuities in intensity that separate adjoining regions. In image processing, detecting these edges helps identify object boundaries and the underlying structure of the scene, making edge detection a key technique for interpreting visual content.

Edges are defined as abrupt changes in image intensity, marking boundaries between regions. This concept is straightforward to grasp, especially when you see a diagram illustrating how sharp intensity transitions create edge structures. The graph below demonstrates edge formats resulting from dramatic intensity changes, illustrating how edge detection adapts to various patterns and even to highly uncertain regions.

This analysis reveals both rising and falling trends in the data, with distinct edges marking sharp transitions. The key question is how to estimate the rate of change for the graph or the function it represents. By examining local slopes, derivatives, or difference quotients, we can quantify how quickly the value changes at each point and identify regions of rapid variation, while also considering how filtering steps might influence those estimates.

Consider an image a[x], where x is a vector that spans the space in which the image exists. During edge enhancement, a specific model E_m is applied to a[x], producing a new image b[x] = E_m(a[x]). This transformation highlights edges and sharpens details, delivering an enhanced image suitable for analysis and visualization. The approach relies on the model E_m to map the original image vector into a refined representation, so that b[x] preserves content while improving boundary clarity.
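The report does not spell out a particular E_m; as a hedged sketch, one common choice of edge-enhancing operator is the Sobel gradient magnitude (assuming SciPy is available; the function name edge_enhance is my own):

    import numpy as np
    from scipy import ndimage

    def edge_enhance(a):
        """A hypothetical E_m: map image a[x] to b[x] = E_m(a[x]) using
        the Sobel gradient magnitude as the enhancement model."""
        gx = ndimage.sobel(a.astype(np.float64), axis=1)  # d/dx
        gy = ndimage.sobel(a.astype(np.float64), axis=0)  # d/dy
        return np.hypot(gx, gy)  # b[x]: large where intensity changes fast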

For a more detailed view of edge detection, see reference [1], Slide Lecture 05. An edge can be defined as the border between two different regions, and edges arise from various discontinuities such as surface normal discontinuities, depth discontinuities, surface color discontinuities, and illumination discontinuities.

Edges and contours are often confused, but they have distinct roles in image analysis. Edge detection aims to identify abrupt intensity changes that reveal the structural outlines of an object, yet these edges may not form a closed boundary. Contour detection, by contrast, seeks to produce closed, continuous boundaries that encircle the object. In short, edges highlight where an object begins and ends, while contours define its complete shape. See the figures below for a clear visual comparison.

1.2 What is edge detection?

Edge detection is a fundamental technique in image processing, used for feature detection and feature extraction. It aims to identify points in a digital image where brightness changes sharply, signaling edges and discontinuities that define the shapes and boundaries within the scene. By highlighting these abrupt intensity transitions, edge detection supports more accurate image segmentation, object recognition, and analysis in applications such as computer vision and medical imaging.

The purpose of edge detection is to significantly reduce the amount of data in an image while preserving the structural properties needed for further image processing.

An edge in a grayscale image is a local feature that marks the boundary between neighboring regions, each of which has a relatively uniform gray level but a different intensity across the boundary. In a neighborhood around the edge, the contrast between the two sides defines the edge, but this process can be affected by the surrounding context and noise. When the image is noisy, distinguishing true edges from random high-frequency fluctuations becomes difficult because both edges and noise involve rapid intensity changes. This overlap often leads to blurred, distorted edge representations, posing challenges for reliable edge detection in grayscale images.
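This is exactly why the Canny algorithm smooths the image before differentiating. As a small sketch (assuming OpenCV's Python bindings; the file name noisy.png is a placeholder), a Gaussian blur suppresses the high-frequency noise that would otherwise masquerade as edges:

    import cv2

    img = cv2.imread("noisy.png", cv2.IMREAD_GRAYSCALE)  # placeholder input

    # Gaussian blur removes high-frequency noise before differentiation,
    # so fewer random fluctuations survive to be mistaken for edges
    blurred = cv2.GaussianBlur(img, (5, 5), 0)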

So the input of our detector is an image, and the output is basically an image as well, but a binary one, called an edge map.

Edge detection has a wide range of applications across several fields. In biometric security, it enhances fingerprint recognition by clarifying ridges and minutiae, with research detailing this problem. In satellite imaging, edge maps support location recognition and shape detection, simplifying interpretation of results. In robotic vision for autonomous driving, edge detection enables lane detection to guide steering decisions and converts lane edges into wheel-angle commands. In medical imaging, edge detection can transform complex X-ray images into edge maps to facilitate the detection of pathological objects such as tumors or cancer cells.

The survey "Algorithm and Technique on Various Edge Detection: A Survey" offers a clear overview of the different edge detection techniques, highlighting the main categories and how they differ in theory and application, while the accompanying diagram below visually maps these methods as a concise guide for researchers and engineers working in image processing and computer vision.

Since Canny edge detection belongs to the first category, gradient based operators, which work on a quantity called the gradient, we will figure the gradient out in Chapter 2.

CHAPTER 2: Canny edge and Canny edge detection algorithm

2.1 What is gradient and gradient based operator?

2.1.1 What is gradient?

Gradient, also called the slope, indicates how steep a line is. It is calculated by dividing the change in height by the change in horizontal distance, which in Cartesian coordinates is expressed as dy/dx (or Δy/Δx), where dy is the change in y and dx is the change in x. For example, a line through (0, 0) and (4, 2) has gradient 2/4 = 0.5. This ratio shows how much the y-value changes per unit of x, revealing whether the line rises or falls and how steeply. In short, a larger gradient means a steeper line, while a smaller gradient corresponds to a gentler slope.

In vector calculus, the gradient of a scalar-valued differentiable function f: R^n → R is the vector field ∇f that assigns to each point p the n-tuple of partial derivatives (∂f/∂x1(p), ∂f/∂x2(p), ..., ∂f/∂xn(p)). It points in the direction of the steepest ascent of f, and its magnitude gives the maximal rate of increase of f at p. Geometrically, ∇f is perpendicular to the level set surface {x : f(x) = c} that passes through p, and the gradient is denoted as ∇f or grad f.
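As a small worked example (my own, for illustration), take f(x, y) = x² + 3y:

    ∇f = (∂f/∂x, ∂f/∂y) = (2x, 3)
    ∇f(1, 2) = (2, 3)
    ‖∇f(1, 2)‖ = √(2² + 3²) = √13

so at the point (1, 2) the function increases fastest in the direction (2, 3), at the rate √13.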

The gradient vector encodes the direction of the fastest increase of a function and its rate of change in that direction. When the gradient at a point p is non-zero, it points in the direction in which the function increases most quickly from p, and its magnitude gives the rate of that increase.

2.1.2 What is image gradient?

- In the simplest terms, the gradient of an image is a measure of the change of the image function in the x direction and the y direction; like the slope, it tells us "how steep it is", that is, the rate of fastest increase, and it points in the direction of the most rapid change in intensity.

Edge detection at the pixel level involves analyzing how image intensity changes in both horizontal and vertical directions at each location. By focusing on the pixel (x, y), we estimate the gradient to capture edge strength and direction precisely at that point. This is done by computing the partial derivatives ∂f/∂x and ∂f/∂y and forming the gradient vector ∇f(x, y). The gradient magnitude |∇f(x, y)| represents the edge strength, while the gradient direction θ = arctan2(∂f/∂y, ∂f/∂x) reveals the edge orientation, enabling accurate edge detection at the pixel level.

That equation denotes the gradient of the image intensity function f at the point (x, y). Because the gradient points in the direction of the greatest rate of change in intensity, it can be treated as a vector; this gradient vector at (x, y) describes the direction and magnitude of the maximum change in f at that location.

To improve clarity, break the equation into two directional components along the x and y axes. The gradient in the x-direction is f_x = ∂f/∂x, and in the y-direction it is f_y = ∂f/∂y; together these x and y terms compose the full gradient. The gradient magnitude shows how steep the change is, while the gradient angle indicates the edge orientation, that is, the direction the edge is heading. This decomposition is central to edge detection in image processing, and the accompanying image below helps visualize the gradient components and their role in defining edges.

From here, we can focus on two key concepts: edge strength and edge direction. Edge strength is the magnitude of the gradient vector, indicating how strong a boundary is at a point, while edge direction is the angle of the gradient, typically computed from the gradient components using arctan2 for robust orientation.
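A compact sketch of all of these quantities (my own illustration, assuming NumPy; the toy image is a placeholder):

    import numpy as np

    img = np.outer(np.arange(8), np.ones(8)) * 10.0  # toy image: ramp along y

    # Partial derivatives: np.gradient returns d/drow (y) then d/dcol (x)
    fy, fx = np.gradient(img)

    strength = np.hypot(fx, fy)             # edge strength: L2 gradient magnitude
    strength_l1 = np.abs(fx) + np.abs(fy)   # cheaper L1 (Manhattan) approximation

    theta = np.arctan2(fy, fx)              # gradient direction (edge normal)
    edge_dir = theta + np.pi / 2            # edge direction: orthogonal to gradient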

- The magnitude (length) of the vector ∇f behaves just like the length of an ordinary vector v = (x, y), which is |v| = √(x² + y²); denoting it ‖∇f‖, we have:

    ‖∇f‖ = √(f_x² + f_y²) = √((∂f/∂x)² + (∂f/∂y)²)

Edge strength is the magnitude of the rate of change in the direction of the gradient vector, and in edge detection this quantity serves as the primary measure of how strong a boundary appears. The gradient magnitude captures how abruptly image intensities change, guiding the identification of edges. In some approaches, edge strength can be interpreted in relation to distance-based metrics, including the Manhattan distance formula, which sums absolute differences to quantify variation across neighboring pixels.

- Then the direction of the gradient is given by the angle θ:

    θ = arctan2(f_y, f_x)

As the figure of these properties shows, the vector points toward the direction in which the color changes most prominently, and the edge lies between those regions; to extract the edge, its direction must be orthogonal to the gradient, which means rotating the gradient by 90 degrees gives the edge direction.

- To see this more clearly, take a look at the figure below:

- In the figure above, the edge normal is in fact our gradient direction, and the edge direction is orthogonal to the normal, and vice versa.

2.2 Who is Canny?

John Francis Canny (born 1958) is an Australian computer scientist who studied at the University of Adelaide and earned his PhD at MIT. He holds the Paul E. Jacobs and Stacy Jacobs Distinguished Professorship of Engineering in the Computer Science Division at the University of California, Berkeley.

He has made significant contributions in various areas of computer science and mathematics including artificial intelligence, robotics, computer graphics, human-computer interaction, computer security, computational algebra, and computational geometry.

Of course, the Canny edge detector was developed by John F. Canny in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.




PART 2: CANNY ALGORITHM SCRIPTING

CHAPTER 3: Canny algorithm scripting

In OpenCV, there is a Canny function with the following signature:

    void cv::Canny( InputArray image,
                    OutputArray edges,
                    double threshold1,
                    double threshold2,
                    int apertureSize = 3,
                    bool L2gradient = false );

Using the Canny edge detector, you input an 8-bit grayscale image and receive a single-channel 8-bit binary edge map of the same size. The hysteresis procedure relies on two thresholds, threshold1 and threshold2, to classify and link edge pixels, while the apertureSize parameter sets the Sobel operator's kernel size used to compute image gradients.

Finally, L2gradient is a flag indicating whether the more accurate L2 norm

    ‖G‖ = √((dI/dx)² + (dI/dy)²)

should be used to calculate the image gradient magnitude (L2gradient = true), or whether the default L1 norm

    ‖G‖ = |dI/dx| + |dI/dy|

is enough (L2gradient = false). For example, for dI/dx = 3 and dI/dy = 4, the L2 norm gives 5 while the L1 norm gives 7.

In the Canny function we do not have to care which of the two threshold parameters is the higher one, because the procedure prepareThresh swaps them when necessary:

    inline void prepareThresh(f64 low_thresh, f64 high_thresh,
                              s32 &low, s32 &high)
    {
        // make sure low_thresh <= high_thresh
        if (low_thresh > high_thresh)
            std::swap(low_thresh, high_thresh);
    #if defined __GNUC__
        low = (s32)low_thresh;
        high = (s32)high_thresh;
        low -= (low > low_thresh);
        high -= (high > high_thresh);
    #else
        low = internal::round(low_thresh);
        high = internal::round(high_thresh);
        f32 ldiff = (f32)(low_thresh - low);
        f32 hdiff = (f32)(high_thresh - high);
        low -= (ldiff < 0);
        high -= (hdiff < 0);
    #endif
    }

The source code shows the difference when L2gradient is changed between true and false:

When L2gradient = true:

    for (; j < cols*cn; j++)
        _norm[j] = s32(_dx[j])*_dx[j] + s32(_dy[j])*_dy[j];       // squared L2 norm

When L2gradient = false:

    for (; j < cols*cn; j++)
        _norm[j] = std::abs(s32(_dx[j])) + std::abs(s32(_dy[j])); // L1 norm

The two loops above are applied in the NormCanny procedure, which takes three pointers as parameters (_dx, _dy and _norm).

In the source code, the BLOB technique (considering the 8-connected neighborhood of each pixel) is applied for the hysteresis step:

    // now track the edges (hysteresis thresholding)
    while (stack_top > stack_bottom)
    {
        u8* m;
        if ((size_t)(stack_top - stack_bottom) + 8u > maxsize)
        {
            ptrdiff_t sz = (ptrdiff_t)(stack_top - stack_bottom);
            maxsize = maxsize * 3/2;
            stack.resize(maxsize);
            stack_bottom = &stack[0];
            stack_top = stack_bottom + sz;
        }

        CANNY_POP(m);

        // push all 8-connected neighbours that are not yet marked
        if (!m[-1])         CANNY_PUSH(m - 1);
        if (!m[1])          CANNY_PUSH(m + 1);
        if (!m[-mapstep-1]) CANNY_PUSH(m - mapstep - 1);
        if (!m[-mapstep])   CANNY_PUSH(m - mapstep);
        if (!m[-mapstep+1]) CANNY_PUSH(m - mapstep + 1);
        if (!m[mapstep-1])  CANNY_PUSH(m + mapstep - 1);
        if (!m[mapstep])    CANNY_PUSH(m + mapstep);
        if (!m[mapstep+1])  CANNY_PUSH(m + mapstep + 1);
    }

To perform Canny edge detection in OpenCV, first load the image (for example test2.jpeg) with cv2.imread in color, then apply a Gaussian blur using a 5×5 kernel to reduce noise, and finally run Canny on the blurred image with thresholds 30 and 100, an aperture size of 3, and L2gradient set to False. Display both the original image and the edge map using cv2.imshow, and conclude with cv2.waitKey(0) followed by cv2.destroyAllWindows to close the windows. This workflow highlights edges while suppressing noise, providing a clear edge map suitable for further image processing tasks.
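A minimal sketch of that workflow (assuming a local file test2.jpeg, as named above):

    import cv2

    # Load the image in color, then blur to suppress noise
    img = cv2.imread("test2.jpeg")
    blurred = cv2.GaussianBlur(img, (5, 5), 0)

    # Canny with the thresholds and options described above
    edges = cv2.Canny(blurred, 30, 100, apertureSize=3, L2gradient=False)

    # Show the original next to its edge map
    cv2.imshow("original", img)
    cv2.imshow("edges", edges)
    cv2.waitKey(0)
    cv2.destroyAllWindows()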

We can then consider the difference between the two equations (Manhattan distance and Euclidean distance).

Then we can see the difference between Canny and Sobel (Sobel will record every intensity change, including ones that are not edges):
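As a hedged sketch of that comparison (the thresholds and file name are my own choices, not the report's):

    import cv2
    import numpy as np

    gray = cv2.imread("test2.jpeg", cv2.IMREAD_GRAYSCALE)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Sobel: raw gradient magnitude, responds to every intensity change
    gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=3)
    sobel_mag = cv2.convertScaleAbs(np.hypot(gx, gy))

    # Canny: thin, thresholded, linked edges only
    canny_edges = cv2.Canny(blurred, 30, 100)

    cv2.imshow("Sobel magnitude", sobel_mag)
    cv2.imshow("Canny edges", canny_edges)
    cv2.waitKey(0)
    cv2.destroyAllWindows()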

