
MIDTERM REPORT: DIGITAL IMAGE PROCESSING - CANNY ALGORITHM AND CODE


DOCUMENT INFORMATION

Basic information

Title: Midterm Report Digital Image Processing Canny Algorithm and Code
Authors: Nguyen Minh Nhut, Tran Nhan Tai
Supervisor: PhD. Pham Van Huy
University: Ton Duc Thang University
Subject: Digital Image Processing
Document type: Midterm report
Year: 2021
City: Ho Chi Minh City
Pages: 32
File size: 2.1 MB


Structure

  • PART 1: CANNY ALGORITHM THEORY (6)
  • CHAPTER 1: WHAT IS EDGE DETECTION? (10)
    • 1.1 What is an edge? (10)
      • 1.1.1 Contour and Edge (12)
    • 1.2 What is edge detection? (13)
  • CHAPTER 2: CANNY EDGE AND CANNY EDGE DETECTION ALGORITHM (15)
    • 2.1 What is gradient and gradient-based operator? (15)
      • 2.1.1 What is gradient? (15)
      • 2.1.2 What is image gradient? (15)
    • 2.2 Who is Canny? (18)
    • 2.3 Canny Edge Detection algorithm (18)
      • 2.3.1 Smoothing (remove noise)
      • 2.3.2 Finding gradients
      • 2.3.3 Non-maximum suppression
      • 2.3.4 Double thresholding
      • 2.3.5 Edge tracking by hysteresis
  • PART 2: CANNY ALGORITHM SCRIPTING
  • CHAPTER 3: Canny algorithm scripting (25)

Content

In today's tech-driven world, the camera is one of the most indispensable inventions, and as images have become ubiquitous, image analysis has grown into a major field of science and application over the past decades. Edge detection is a classic technique in image processing, with established solutions such as Roberts, Prewitt, Sobel, Marr-Hildreth, and the Canny edge detector. In this report we focus on the Canny edge detector, which is widely regarded as an efficient detector for edge extraction. To make the report easier to follow, and not to rebuild the rest of the algorithm from scratch ("please do not re-invent the wheel when it is already there for you"), we will go through the following headings:

PART 1: CANNY ALGORITHM THEORY

Chapter 1: What is Edge detection?

Chapter 2: Canny edge and Canny edge detection algorithm

PART 2: CANNY ALGORITHM SCRIPTING


CHAPTER 1: WHAT IS EDGE DETECTION?

1.1 What is an edge?

Shapes in images are defined by their boundaries, which can be detected or enhanced with edge detection and related enhancement algorithms, and many texture measures rely on edge information to capture structural detail. Edge detection is a mature field within image processing and serves as a core image segmentation technique, partitioning the spatial domain of an image into meaningful regions. In low-level image processing we work with grayscale images, which provide a practical representation of scene intensity. Edges in grayscale images are typically defined as large or abrupt changes in pixel intensity along a line or curve, marking boundaries and transitions in the scene.

Edge detection relies on observing large values in the first derivative of a signal, which highlights where intensity changes abruptly. In two-dimensional images, the derivative is interpreted as image gradients along different directions, making it more complex than in one dimension. A hard edge is an abrupt change, often only 2–3 pixels wide, while a soft edge features a gradual transition from bright to dark that spans several pixels.

- Put more simply, an edge can be defined as follows: "Edges are discontinuities in intensity".

On this sign, the transition from red to white (and back) creates visible edges. Viewed as a diagram, there are regions where red dominates, followed by breakpoints where the color weight shifts toward white. Rather than a single boundary point, these breakpoints align to form a line or curve that becomes a full edge, which is exactly what the discontinuities in intensity at the transition look like.

Edges in images are defined as abrupt changes in image intensity that occur at boundaries between regions. This concept is straightforward and widely used in image processing. Strong intensity transitions form the basic edge structures, while more complex or uncertain changes can produce a variety of edge profiles; the graph below illustrates how these edge manifestations appear under different conditions.

The data show both rising and falling trends, with clear edges marking where the signal shifts. The central question is how to estimate the rate of change of the graph or function these patterns describe. By examining local slopes, transition points, and derivative estimates, we can quantify how quickly the value rises or falls across the signal. With proper smoothing and noise management, the resulting rate of change reflects the true behavior of the underlying function rather than random fluctuations.
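To make the rate-of-change idea concrete, here is a minimal sketch (the 1-D intensity profile is made up for this example, not data from the report) showing how a finite-difference first derivative spikes at a hard edge and stays small but steady across a soft edge:

import numpy as np

# Hand-made 1-D intensity profile: a hard edge (abrupt jump), then a soft edge (gradual ramp)
signal = np.array([10, 10, 10, 200, 200, 200, 180, 160, 140, 120, 100], dtype=float)

# First derivative approximated by finite differences
derivative = np.diff(signal)
print(derivative)
# [  0.   0. 190.   0.   0. -20. -20. -20. -20. -20.]
# The single large spike (190) marks the hard edge; the run of small -20 values marks the soft edge.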

Consider an image a_x that lies in a vector space spanned by x. In edge enhancement, a specific model E_m is applied to a_x to produce a new image b_x, defined by b_x = E_m(a_x). This representation shows how the edge-enhancement operator transforms the input image within its vector space to emphasize edges and sharpen details for downstream image processing tasks.

Reference [1], Slide Lecture 05: Edge detection, provides a more detailed view of what is called an edge. An edge is the border line between two different regions, and it can manifest as surface normal discontinuity, depth discontinuity, surface color discontinuity, or illumination discontinuity.

There is still a common confusion between contour and edge, which are related but not identical. In edge detection, the goal is to identify intensity transitions that mark potential boundaries, while contour detection aims to extract closed structural outlines that enclose an object. An edge may trace features without forming a closed shape, whereas a contour always constitutes a closed boundary around the object. For a clearer view, refer to the two figures below illustrating an edge without a closed contour and a complete contour around an object.

1.2 What is edge detection?

Edge detection is a fundamental tool in image processing used for feature detection and extraction. It aims to identify points in a digital image where brightness changes sharply, revealing abrupt discontinuities in intensity that delineate boundaries and shapes. By detecting these edges, the technique supports downstream tasks like object recognition, segmentation, and scene analysis, making it a core component of many computer vision pipelines.

- The purpose of edge detection is to significantly reduce the amount of data in an image while preserving the structural properties needed for further image processing.

In a grayscale image, an edge is a local feature that, within a neighborhood, separates regions that are approximately uniform in intensity but have different values on each side of the boundary. In the presence of noise, detecting edges becomes challenging because both edges and noise contribute high-frequency content, often resulting in blurred or distorted results.

- So the input of our detector is an image, and the output is basically an image too, but a binary one, called an edge map.

Edge detection has a wide range of practical applications across different fields. In biometric security, it enhances fingerprint recognition by clarifying the ridge and minutiae boundaries. In satellite imaging, edge maps aid location recognition, shape detection, and feature extraction, making analysis faster and more accurate. In robotics and autonomous systems, edge detection supports lane detection for autopilot steering decisions, improving navigation reliability. In medical imaging, converting complex X-ray images into edge maps helps highlight pathological structures such as tumors or cancer cells for easier diagnosis. Together, these applications show how edge detection turns raw images into meaningful, actionable information.

The paper "Algorithm and Technique on Various Edge Detection: A Survey" gives a clear overview of the different edge detection types, together with a diagram that shows how these methods are organized and compared within the broader spectrum of edge-detection techniques.

- As Canny edge detection is classified as a first-order gradient-based operator, i.e. an operator that functions based on something called the gradient, we will figure out what that is in Chapter 2.

CHAPTER 2: Canny edge and Canny edge detection algorithm

2.1 What is gradient and gradient-based operator?

Gradient, also called slope, indicates how steep a line is. It is calculated by dividing the vertical change (rise) by the horizontal change (run):

gradient = dy / dx,

where dy is the change in y and dx is the change in x in the Cartesian coordinate system. For example, the line through the points (1, 2) and (3, 8) has gradient (8 - 2) / (3 - 1) = 3. This concise measure captures the rate of change as you move along the x-axis and is fundamental in graphing, geometry, and calculus for describing linear relationships and steepness.

In vector calculus, the gradient ∇f of a scalar-valued differentiable function f of several variables is the vector field (or vector-valued function) denoted by ∇f. At any point p in its domain, ∇f(p) is the vector whose components are the partial derivatives of f at p, i.e., ∇f(p) = (∂f/∂x1(p), ∂f/∂x2(p), ..., ∂f/∂xn(p)).

The gradient vector encodes the direction of steepest ascent and the speed of that ascent for a scalar function When the gradient at a point p is nonzero, it points in the direction in which the function increases most rapidly from p, and its magnitude measures how fast that increase occurs in that direction.

An image gradient describes how the image intensity changes across pixels, capturing both the rate and direction of change. It is computed by measuring variations in the image function along the x and y directions, effectively the slope or steepness of intensity variation. The gradient's direction points toward the fastest increase in intensity, while its magnitude indicates the strength of that change. This concept underpins image processing tasks like edge detection by highlighting where brightness shifts most rapidly in the image.
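As a small illustration (the pixel values here are made up for the example), the gradient of a tiny image with one vertical edge is large only where the intensity actually changes:

import numpy as np

# A tiny 4x4 image with a vertical edge between columns 1 and 2
img = np.array([[10, 10, 200, 200],
                [10, 10, 200, 200],
                [10, 10, 200, 200],
                [10, 10, 200, 200]], dtype=float)

gy, gx = np.gradient(img)   # rates of change along y (rows) and x (columns)
print(gx[0])                # [ 0. 95. 95.  0.]: large only around the edge
print(gy[0])                # [ 0.  0.  0.  0.]: no change down the columns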

Although image changes are often described in terms of horizontal and vertical shifts, this discussion focuses on pixel-level analysis to measure edge strength and direction at a specific location (x, y) in the image. By computing the local gradient, we capture the edge magnitude (strength) and orientation (direction), enabling precise edge detection and characterization at that pixel.

The gradient of the image f at the point (x, y), denoted ∇f(x, y), indicates the direction of the steepest ascent in intensity. Because it points to the greatest change in f, this gradient can be viewed as a vector whose direction reveals where the image intensity changes most rapidly at (x, y).

To make this easier to read, we can break the gradient down into its components. Along the x direction only, the gradient reduces to gx = [∂f/∂x, 0], the rate of change of f when only x varies, and along the y direction only, it reduces to gy = [0, ∂f/∂y].

By combining these two components, we obtain the full gradient vector ∇f = [∂f/∂x, ∂f/∂y], from which we can solve for the angle theta; this angle defines the steepness of the slope, and for edge analysis it indicates the direction in which the edge is heading. See the image below for a clear view of the components.

From here, we can focus on the two key concepts: edge strength and edge direction. Edge strength is defined as the magnitude of the gradient vector, reflecting how strong the intensity change is at a point, while edge direction is the orientation of that gradient, given by theta = arctan2(Gy, Gx).

- The magnitude (length) of the vector ∇f plays the same role as the length of an ordinary vector, |v| = sqrt(x^2 + y^2). Denoting it ||∇f||, we have:

||∇f|| = sqrt( (∂f/∂x)^2 + (∂f/∂y)^2 )

Gradient magnitude quantifies the rate of change in the direction of the gradient vector, and in edge detection this quantity is known as edge strength. Edge strength captures where image intensity changes most abruptly, providing the core signal for identifying boundaries in an image. In some formulations, edge strength is instead computed with the Manhattan distance (L1 norm) over the gradient components, tying the concept to the discrete geometry of the image grid. This perspective helps translate gradient-based calculations into robust edge maps for image processing and computer vision applications.

- Then the direction of the gradient is given by the angle theta:

theta = arctan( (∂f/∂y) / (∂f/∂x) )

From the image properties above, the vector points toward the direction of the greatest color change, and the edge lies between regions with contrasting colors. To detect and extract the edge, its direction must be orthogonal to the gradient, which means rotating the gradient by 90 degrees to obtain the edge direction.
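Putting the pieces together, here is a short sketch (assuming an input file named test2.jpeg, the same file used in the OpenCV example later in this report) that computes edge strength and gradient direction, then rotates the gradient by 90 degrees to get the edge direction:

import cv2
import numpy as np

# Gradients are defined on intensity values, so read the image as grayscale
gray = cv2.imread('test2.jpeg', cv2.IMREAD_GRAYSCALE)

# Partial derivatives along x and y, approximated with the Sobel operator
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# Edge strength: the gradient magnitude ||grad f|| = sqrt(gx^2 + gy^2)
magnitude = np.sqrt(gx**2 + gy**2)

# Gradient direction theta = arctan2(gy, gx), in radians
theta = np.arctan2(gy, gx)

# The edge runs orthogonal to the gradient, so rotate by 90 degrees
edge_direction = theta + np.pi / 2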

- For a clearer picture, take a look at the figure below:

- In the figure above, the edge normal is actually our gradient direction, and the edge direction is orthogonal to the normal, and vice versa.

2.2 Who is Canny?

John F. Canny (born 1958) is an Australian-born computer scientist who serves as the Paul E. Jacobs and Stacy Jacobs Distinguished Professor of Engineering in the UC Berkeley Computer Science Department. Trained at the University of Adelaide and MIT, he has made influential contributions across artificial intelligence, robotics, computer graphics, human-computer interaction, computer security, computational algebra, and computational geometry, establishing himself as a leading figure at the intersection of mathematics and computation.

The Canny edge detector was, of course, developed by John F. Canny, in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.



PART 2: CANNY ALGORITHM SCRIPTING

CHAPTER 3: Canny algorithm scripting

In OpenCV, there is a Canny function with the following signature:

void cv::Canny( InputArray  image,
                OutputArray edges,
                double      threshold1,
                double      threshold2,
                int         apertureSize = 3,
                bool        L2gradient = false )

- edges: output edge map; a single-channel 8-bit image, which has the same size as image (a binary image)

- threshold1: first threshold for the hysteresis procedure

- threshold2: second threshold for the hysteresis procedure

- apertureSize: aperture size for the Sobel operator

- L2gradient: a flag indicating whether the more accurate L2 norm, L2 = sqrt( (dI/dx)^2 + (dI/dy)^2 ), should be used to calculate the image gradient magnitude (L2gradient = true), or whether the default L1 norm, L1 = |dI/dx| + |dI/dy|, is enough (L2gradient = false)

In the Canny function we do not have to pay attention to which of the two threshold parameters is the higher one, because the prepareThresh procedure swaps them when necessary:

inline void prepareThresh(f64 low_thresh, f64 high_thresh,
                          s32 &low, s32 &high)
{
    if (low_thresh > high_thresh)
        std::swap(low_thresh, high_thresh);

Under GCC, the function then casts the low and high thresholds to 32-bit signed integers and decrements low when the cast value exceeds low_thresh, and high when the cast value exceeds high_thresh. On other compilers, it assigns low and high by rounding the thresholds via internal::round, computes ldiff and hdiff as the floating-point differences between the thresholds and the rounded low and high, and finally decrements low when ldiff is negative and high when hdiff is negative. Either way, the final low and high values are aligned with the specified limits.

The source code shows the difference when L2gradient is set to true or false:

- When L2gradient = true:

    for (; j < colscn; ++j)
        _norm[j] = s32(_dx[j]) * _dx[j] + s32(_dy[j]) * _dy[j];

- When L2gradient = false:

    for (; j < colscn; ++j)
        _norm[j] = std::abs(s32(_dx[j])) + std::abs(s32(_dy[j]));

The two loops above are applied in the NormCanny procedure, which takes three pointers as parameters (_dx, _dy and _norm).

In the source code, they apply the BLOB technique (considering the 8-connected neighborhood of each pixel) when doing hysteresis:

// now track the edges (hysteresis thresholding)
while (stack_top > stack_bottom)
{
    u8* m;
    if ((size_t)(stack_top - stack_bottom) + 8u > maxsize)
    {
        ptrdiff_t sz = (ptrdiff_t)(stack_top - stack_bottom);
        maxsize = maxsize * 3 / 2;
        stack.resize(maxsize);
        stack_bottom = &stack[0];
        stack_top = stack_bottom + sz;
    }

Within the routine, the current pixel pointed to by m is popped from the stack, and the code then examines its eight neighboring pixels in the image grid. If a neighbor's value is zero, indicating it has not yet been processed, that neighbor is pushed for later processing using CANNY_PUSH. The eight neighbors considered are the positions left and right (m-1 and m+1), the three above (m-mapstep-1, m-mapstep, m-mapstep+1), and the three below (m+mapstep-1, m+mapstep, m+mapstep+1), where mapstep is the row stride of the image. This approach propagates processing outward through the 8-connected neighborhood until no unprocessed neighbors remain.
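The following is a minimal Python sketch of the same idea (the strong/weak maps and the function name are made up for illustration; the real OpenCV code works on a byte map with CANNY_PUSH/CANNY_POP macros): starting from strong pixels, it pushes unvisited weak neighbors in the 8-connected neighborhood onto a stack until nothing is left to process:

import numpy as np

def track_edges(strong, weak):
    # strong, weak: boolean maps produced by double thresholding
    h, w = strong.shape
    edges = strong.copy()
    stack = list(zip(*np.nonzero(strong)))      # seed the stack with strong pixels
    while stack:
        y, x = stack.pop()
        # visit the 8-connected neighborhood, as in the OpenCV loop
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True        # promote the weak pixel to an edge
                    stack.append((ny, nx))
    return edges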

An example of using Canny in OpenCV:

import cv2

image = cv2.imread('test2.jpeg', cv2.IMREAD_COLOR)
blur = cv2.GaussianBlur(image, (5, 5), 0)
canny = cv2.Canny(image = blur, threshold1 = 30, threshold2 = 100,
                  apertureSize = 3, L2gradient = False)
cv2.imshow('Result', image)
cv2.imshow('Canny', canny)
cv2.waitKey(0)
cv2.destroyAllWindows()

We can then compare the results of the two magnitude formulas (Manhattan distance and Euclidean distance):
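A quick way to see the effect (reusing the same image and thresholds as the example above) is to run cv2.Canny twice and count how many pixels the two norms classify differently:

import cv2

blur = cv2.GaussianBlur(cv2.imread('test2.jpeg', cv2.IMREAD_COLOR), (5, 5), 0)
canny_l1 = cv2.Canny(blur, 30, 100, apertureSize=3, L2gradient=False)  # L1 = |dI/dx| + |dI/dy|
canny_l2 = cv2.Canny(blur, 30, 100, apertureSize=3, L2gradient=True)   # L2 = sqrt((dI/dx)^2 + (dI/dy)^2)
print((canny_l1 != canny_l2).sum(), 'pixels differ between the two edge maps')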

Then we can see the difference between Canny and Sobel (Sobel will record all the intensity differences, including those that are not edges):
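A sketch of that comparison (again assuming test2.jpeg; the Sobel magnitude is scaled to 8 bits only for display) shows that the raw Sobel magnitude keeps every intensity variation, while Canny keeps only thinned, hysteresis-filtered edges:

import cv2
import numpy as np

gray = cv2.cvtColor(cv2.GaussianBlur(cv2.imread('test2.jpeg', cv2.IMREAD_COLOR), (5, 5), 0),
                    cv2.COLOR_BGR2GRAY)
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
sobel_mag = cv2.convertScaleAbs(np.sqrt(gx**2 + gy**2))  # every variation survives
canny = cv2.Canny(gray, 30, 100)                          # thinned, thresholded edges
cv2.imshow('Sobel magnitude', sobel_mag)
cv2.imshow('Canny', canny)
cv2.waitKey(0)
cv2.destroyAllWindows()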

Posted: 28/12/2022, 15:35
