
Intelligent parking support system


DOCUMENT INFORMATION

Basic information

Title: Intelligent Parking Support System
Authors: Pham Duc Hoang, Nguyen Dac Thang
Advisor: Ph.D. Pham Van Khoa
University: Ho Chi Minh City University of Technology and Education
Major: Electronics and Communication Engineering Technology
Document type: Graduation Thesis
Year: 2022
City: Ho Chi Minh City

Format

Pages: 48
Size: 3.79 MB


Structure

  • CHAPTER 1: INTRODUCTION (15)
    • 1.1 Topic Overview (15)
    • 1.2 Aim and scope (15)
    • 1.3 Methodology (16)
    • 1.4 Thesis summary (16)
  • CHAPTER 2: LITERATURE REVIEW (16)
    • 2.1 Optical Character recognition (17)
      • 2.1.1 OCR (17)
      • 2.1.2 Machine learning OCR with Tesseract (17)
    • 2.2 Methods in image processing (18)
      • 2.2.1 Grayscale Image (18)
      • 2.2.2 Noise reduction with a Gaussian filter (19)
      • 2.2.3 Binary with dynamic threshold (20)
      • 2.2.4 Canny Edge Detection (21)
      • 2.2.5 Filter number plate with contours (23)
    • 2.3 YOLO (24)
      • 2.3.1 About Yolo (24)
      • 2.3.2 How yolo works (24)
      • 2.3.3 Yolo structure (25)
      • 2.3.4 Loss Function (26)
      • 2.3.5 The network (26)
  • CHAPTER 3: SYSTEM DESIGN (16)
    • 3.1 Overview (27)
    • 3.2 Block Diagram (27)
      • 3.2.1 License Plate (28)
      • 3.2.2 Vehicle Color (30)
      • 3.2.3 Vehicle Properties (30)
      • 3.2.4 Other Block (33)
    • 3.3 Flowchart of the system when entering (33)
    • 3.4 Flowchart of the system when going out (35)
  • CHAPTER 4: EXPERIMENT RESULTS (16)
    • 4.1 Simulation (38)
      • 4.1.1 Simulate receiving RFID card code (38)
      • 4.1.2 Simulate number plate recognition (39)
      • 4.1.3 Simulate vehicle paint color recognition (41)
      • 4.1.4 Model of vehicle identification (42)
    • 4.2 Model of the entire system (43)
  • CHAPTER 5: CONCLUSION AND FURTHER WORK (16)
    • 5.1 Conclusion (46)
    • 5.2 Further work (46)

Contents


MINISTRY OF EDUCATION AND TRAINING

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION

FACULTY FOR HIGH QUALITY TRAINING

ELECTRONICS AND TELECOMMUNICATION ENGINEERING TECHNOLOGY

INTELLIGENT PARKING SUPPORT SYSTEM

LECTURER: PHAM VAN KHOA

STUDENTS: LA GIA KIET

NGUYEN DAC THANG

SKL 009309

Ho Chi Minh City, August, 2020


HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION

FACULTY FOR HIGH QUALITY TRAINING

GRADUATION THESIS

MAJOR: ELECTRONICS AND COMMUNICATION ENGINEERING

TECHNOLOGY

INTELLIGENT PARKING SUPPORT SYSTEM

STUDENT: PHAM DUC HOANG - 18161012

NGUYEN DAC THANG - 18161037

ADVISOR: Ph.D. PHAM VAN KHOA

Ho Chi Minh City, July 2022


THE SOCIALIST REPUBLIC OF VIETNAM

Independence – Freedom – Happiness

Ho Chi Minh City, July 28th, 2022

GRADUATION PROJECT ASSIGNMENT

Student name: Pham Duc Hoang Student ID: 18161012

Student name: Nguyen Dac Thang Student ID: 18161037

Major: Electronics and Communication Engineering Technology

Class: 18161CLA

Advisor: Ph.D. Pham Van Khoa Phone number:

Date of assignment: Date of submission:

1. Thesis title: Intelligent Parking Support System

2. Initial materials provided by the advisor:

3. Content of the thesis:

- Consult and read the reference documents and make a summary to determine the thesis's development course

- Design and explain the flowcharts

- Run the program and confirm successful completion

- Prepare the presentation and write the report

CHAIR OF THE PROGRAM

(Sign with full name)

ADVISOR

(Sign with full name)

Ph.D. Pham Van Khoa


THE SOCIALIST REPUBLIC OF VIETNAM

Independence – Freedom – Happiness

Ho Chi Minh City, July 28th, 2022

ADVISOR’S EVALUATION SHEET

Student name: Pham Duc Hoang Student ID: 18161012

Student name: Nguyen Dac Thang Student ID: 18161037

Major: Electronics and Communication Engineering Technology

Thesis title: Intelligent Parking Support System

Advisor: Ph.D. Pham Van Khoa Phone number:

EVALUATION

1. Content of the project:

2. Strengths:

3. Weaknesses:

4. Approval for oral defense? (Approved or denied)

(Sign with full name)

Ph.D. Pham Van Khoa


THE SOCIALIST REPUBLIC OF VIETNAM

Independence – Freedom – Happiness

Ho Chi Minh City, July 28th, 2022

PRE-DEFENSE EVALUATION SHEET

Student name: Pham Duc Hoang Student ID: 18161012

Student name: Nguyen Dac Thang Student ID: 18161037

Major: Electronics and Communication Engineering Technology

Thesis title: Intelligent Parking Support System

Name of Reviewer: Ph.D. Do Duy Tan Phone number:

EVALUATION

1. Content and workload of the project:

2. Strengths:

3. Weaknesses:

4. Approval for oral defense? (Approved or denied)

5. Reviewer questions for project evaluation:

6. Mark: (in words: ……….)

Ho Chi Minh City, July 28th, 2022

REVIEWER

(Sign with full name)

Ph.D. Do Duy Tan


Thesis title: Intelligent Parking Support System

Advisor: PHAM VAN KHOA, Ph.D.

Student 1: PHAM DUC HOANG

Student’s ID: 18161012 Class: 18161CLA1

Email: 18161012@student.hcmute.edu.vn

Student 2: NGUYEN DAC THANG

Student’s ID: 18161037 Class: 18161CLA2

(Sign with full name)

ACKNOWLEDGEMENTS

The most crucial phase of every student's life is the process of finishing their graduation thesis. Our graduation thesis serves to provide us with knowledge we will need in the future, as well as research techniques. We would first like to express our gratitude to Ho Chi Minh City University of Technology and Education. The teachers of the Faculty for High Quality Training, in particular, enthusiastically lectured during our time in the lecture hall and equipped us with essential knowledge, laying the foundation for our capacity to complete this thesis.

We also want to express our gratitude to Ph.D. Pham Van Khoa for his passionate assistance and guidance in terms of scientific thinking and effort. His suggestions are extremely valuable, both for the development of this thesis and as a foundation for our subsequent coursework and professional growth. If anything is missing from the project, our team hopes that Ph.D. Pham Van Khoa and the professors can sympathize and provide suggestions on how to improve it.

Ho Chi Minh City, July 28th, 2022

ABSTRACT

Currently, the number of vehicles participating in road traffic is very large, so managing personal vehicles in parking lots consumes a great deal of human and material resources. Without a convenient tool, managing personal vehicles is very time-consuming and prone to confusion, causing losses for the users of parking services.

To reduce the load of tasks such as collecting fees, safeguarding vehicles, and finding vehicles in parking lots, automatic vehicle monitoring technology has been developed worldwide, thanks to the uniqueness of the license plate, which has become the main object of research and development for this technology.

Therefore, our group chose this topic as a basic step toward understanding more powerful monitoring tools, such as on-road vehicle control or facial recognition, which are currently receiving great attention worldwide.


TABLE OF CONTENTS

DISCLAIMER i

ACKNOWLEDGEMENTS ii

ABSTRACT iii

TABLE OF CONTENTS iv

LIST OF FIGURES vi

ABBREVIATIONS viii

CHAPTER 1: INTRODUCTION 1

1.1 Topic Overview 1

1.2 Aim and scope 1

1.3 Methodology 2

1.4 Thesis summary 2

CHAPTER 2: LITERATURE REVIEW 3

2.1 Optical Character recognition 3

2.1.1 OCR 3

2.1.2 Machine learning OCR with Tesseract 3

2.2 Methods in image processing 4

2.2.1 Grayscale Image 4

2.2.2 Noise reduction with a Gaussian filter 5

2.2.3 Binary with dynamic threshold 6

2.2.4 Canny Edge Detection 7

2.2.5 Filter number plate with contours 9

2.3 YOLO 10

2.3.1 About Yolo 10

2.3.2 How yolo works 10

2.3.3 Yolo structure 11

2.3.4 Loss Function 12

2.3.5 The network 12

CHAPTER 3: SYSTEM DESIGN 13

3.1 Overview 13

3.2 Block Diagram 13

3.2.1 License Plate 14

3.2.2 Vehicle Color 16

3.2.3 Vehicle Properties 16

3.2.4 Other Block 19

3.3 Flowchart of the system when entering 19

3.4 Flowchart of the system when going out 21

CHAPTER 4: EXPERIMENT RESULTS 24

4.1 Simulation 24

4.1.1 Simulate receiving RFID card code 24

4.1.2 Simulate number plate recognition 25

4.1.3 Simulate vehicle paint color recognition 27

4.1.4 Model of vehicle identification 28

4.2 Model of the entire system 29

CHAPTER 5: CONCLUSION AND FURTHER WORK 32

5.1 Conclusion 32

5.2 Further work 32

REFERENCE 33


LIST OF FIGURES

Figure 2.1: Architecture of Tesseract OCR [6] 4

Figure 2.2: Gaussian filter matrix [7] 6

Figure 2.3: Result using Gauss filter 6

Figure 2.4: Photo after detecting the Canny border 9

Figure 2.5: Draw Contour with OpenCV 10

Figure 2.6: The model [10] 11

Figure 2.7: The architecture [10] 12

Figure 3.1: Block diagram of the system when the vehicle comes in 13

Figure 3.2: Block diagram of the system when the vehicle comes out 14

Figure 3.3: Block diagram of license plate block 15

Figure 3.4: Block diagram of vehicle color 16

Figure 3.5: Block diagram of vehicle type identification block 17

Figure 3.6: Scooter Data 18

Figure 3.7: Motorbike Data 18

Figure 3.8: Flowchart overview when entering 20

Figure 3.9: Detailed flowchart upon entering 21

Figure 3.10: Flowchart overview when going out 22

Figure 3.11: Detailed flowchart upon going out 23

Figure 4.1: Function simulation blocks 24


Figure 4.2: Realistic model of RFID tag reading 25

Figure 4.3: Test Read RFID Tag 25

Figure 4.4: Demo of license plate recognition system 26

Figure 4.5: Wrong number plate recognition system 26

Figure 4.6: Demo of motorcycle color recognition 27

Figure 4.7: Wrong color recognition 27

Figure 4.8: Scooter identification 28

Figure 4.9: Identification of motorcycles 28

Figure 4.10: Wrong vehicle identification 29

Figure 4.11: Process of the system when the vehicle enters 30

Figure 4.12: Information storage system 30

Figure 4.13: The system's processing process when a vehicle comes out 31

Figure 4.14: The system recognizes when the outgoing data is different from the card's stored data 31


ABBREVIATIONS

DTCNN: Discrete-time cellular neural network
SVM: Support Vector Machine
RFID: Radio Frequency Identification
OCR: Optical character recognition
LSTM: Long Short Term Memory
PIL: Python Image Library
CNN: Convolutional Neural Network


CHAPTER 1: INTRODUCTION

1.1 Topic Overview

Urbanization is happening quickly, and the population in metropolitan areas is growing along with a significant growth in transportation options. As a result, theft becomes more likely as the number of vehicles in a parking lot grows and becomes harder to manage. The major cause is that thieves employ schemes such as creating bogus tickets, switching license plates, or exploiting misplaced tickets. From there, our group had the idea of creating a smart parking system.

Currently, parking lots in our nation are deemed intelligent when it is possible to identify and store license plates. However, simply identifying number plates is insufficient to ensure the security and safety of the parking lot. So how can we operate the parking lot entirely with machines and do away with the human element? Our group has developed a method that combines license plate recognition with the classification of vehicles and the identification of vehicle colors. Once the three aforementioned aspects are thoroughly validated, the parking lot's security and safety will be improved and the need for human assistance throughout the process will be eliminated.

Previous smart parking projects have addressed vehicle and license plate detection using image warping based on YOLOv2 [1], and some other studies have used OpenCV and Python to identify vehicles [2]. Other approaches exist as well, such as discrete-time cellular neural networks (DTCNN) [3]. In this topic, our group will integrate vehicle identification using YOLOv4 [4] with vehicle color identification using an SVM [5]. By eliminating the need for human supervision and adding all safety and security features, a comprehensive smart parking model is created.

1.2 Aim and scope

In this project, our group will create an automated smart parking lot model. The type of vehicle (motorcycle or scooter), its color, and its license plate can all be recognized by the parking lot and recorded on an RFID card.

The model is only at the level of research and testing, not commercialization. Thus, it will only function under certain conditions, such as the following:

- The number plate includes two lines of characters with black lettering on a white backdrop

- The number plate must be clear and unobstructed, without rust or peeling paint

- The license plate cannot be tilted upward more than 45 degrees from the horizontal

- The license plate image is not blurry, and the curved number plate characters are easily distinct and recognizable

- The brightness is appropriate, and there is no glare or noise from overly bright images

1.3 Methodology

In this project, our team uses YOLOv4 to identify vehicles quickly and accurately, applying artificial intelligence to the processing. Specifically, a machine learning algorithm (SVM) is used to classify data, recognize the characters in license plates, and recognize vehicle colors. In addition, some image processing methods (Gaussian filtering, Otsu thresholding, etc.) are used to recognize the letters and numbers in license plates.

1.4 Thesis summary

Chapter 1: Introduction

Explains the thesis's objectives, methodology, and the reasons for choosing the topic.

Chapter 2: Literature Review

Introduces vehicle recognition and presents the image processing algorithms used to recognize license plates.

Chapter 3: System Design

Presents the process diagrams for entering and leaving the parking lot, as well as the block diagram of the vehicle identification and classification system.

Chapter 4: Experiment Results

Evaluates the accuracy of the model and presents the system's simulation results.

Chapter 5: Conclusion and Further Work

Reviews the results of this study and offers remedies for any issues that need to be resolved.


CHAPTER 2: LITERATURE REVIEW

2.1 Optical Character recognition

2.1.1 OCR

Optical character recognition (OCR), also known as text recognition technology, converts any type of image containing text into machine-readable text data. OCR enables speedy and automatic document digitization without the need for manual data entry. The output of OCR is frequently used for editing electronic documents, for small-footprint data storage, and as the foundation for technologies like cognitive computing, machine translation, and text-to-speech.

2.1.2 Machine learning OCR with Tesseract

Tesseract was originally developed at Hewlett-Packard Laboratories from 1985 to 1994, and HP released it as open source in 2005. By 2006 it was one of the most precise open-source OCR systems available. However, Tesseract's capabilities are restricted to structured text data; on text with substantial noise or an unstructured layout, it performs very poorly. Google has been funding Tesseract development since 2006.

Methods based on deep learning perform better on unstructured data. Tesseract 4 adds deep learning-based capabilities with an LSTM-based OCR engine (a form of recurrent neural network) that concentrates on line recognition, while still supporting the legacy Tesseract 3 OCR engine that relies on character pattern recognition. The most recent stable version, 4.1.0, was released on July 7, 2019; it is also substantially more accurate on unstructured text.


Figure 2.1: Architecture of Tesseract OCR [6]

Tesseract OCR operates sequentially, in accordance with the block diagram in Figure 2.1. The first step, adaptive thresholding, converts the image into a binary image. The next step is connected component analysis, which is used to extract character outlines. This method is quite useful because it can perform OCR on pictures with white text on black backgrounds; Tesseract was perhaps the first engine to provide this kind of processing. After that, the outlines are converted into blobs. Blobs are grouped into text lines, which are then checked for a consistent area or an equivalent text size.
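
The thesis does not list its recognition code in this section, but the pipeline above maps naturally onto the pytesseract wrapper. The following is a minimal sketch rather than the authors' implementation; the file name, the --psm value, and the character whitelist are illustrative assumptions.

```python
# Hedged sketch: OCR on a cropped plate image with pytesseract (assumed setup).
import cv2
import pytesseract

plate = cv2.imread("plate_crop.jpg")                       # hypothetical cropped plate image
gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)             # Tesseract prefers clean gray/binary input
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# --psm 6 treats the crop as a single uniform block of text (e.g. a two-line plate);
# the whitelist restricting output to plate characters is an assumption, not a thesis setting.
config = "--psm 6 -c tessedit_char_whitelist=ABCDEFGHKLMNPSTUVXYZ0123456789"
text = pytesseract.image_to_string(binary, config=config)
print(text.strip())
```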

2.2 Methods in image processing

2.2.1 Grayscale Image

A grayscale image is simply an image in which the colors are shades of gray, with 256 gray levels varying from black to white (values from 0 to 255), i.e. 8 bits or 1 byte is required to represent each pixel. The reason it is important to distinguish grayscale images from other images is that grayscale images carry less information per pixel. In a normal color image each pixel usually carries 3 channels of information, while a gray image has only 1; reducing the amount of information increases processing speed and simplifies the algorithm while still retaining the information required for the task.


In this project, we derive the gray image from the HSV color system instead of RGB, because the HSV color space has three main components: hue, saturation, and value (light intensity). For that reason, the HSV color space adapts better to changes in ambient light. After conversion, the gray image we need is the matrix of intensity values (the V channel) extracted from the HSV color system.
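
A minimal OpenCV sketch of that conversion, assuming a BGR input image; the file names are placeholders rather than paths used in the thesis.

```python
# Hedged sketch: use the V (intensity) channel of HSV as the grayscale image.
import cv2

bgr = cv2.imread("vehicle.jpg")                  # OpenCV reads images in BGR order
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)       # split into hue, saturation, value
_, _, value = cv2.split(hsv)
gray = value                                     # matrix of intensity values described above
cv2.imwrite("gray_from_hsv.jpg", gray)
```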

2.2.2 Noise reduction with a Gaussian filter

❖ Noise

Noise is basically understood as small particles distributed over the image. Noise can distort details in an image, resulting in low image quality. In practice there are many kinds of noise, but they are usually divided into three types: additive noise, multiplicative noise, and impulse noise. Noise usually corresponds to high frequencies, and the theoretical basis of a filter is that only signals of certain frequencies can pass through, so low-pass or band-pass filters are commonly used.

❖ Gaussian filter

The Gaussian filter is a low-pass filter used to reduce noise (high-frequency components) and to blur regions of an image. The filter is implemented as an odd-sized symmetric kernel (the digital image processing version of a matrix) that is passed over each pixel of the region of interest to get the desired effect. The kernel does not react harshly to drastic color changes (edges), since pixels toward the center of the kernel contribute more weight to the final value than those at the periphery. A Gaussian filter can be thought of as an approximation of the (mathematical) Gaussian function. In this section, we describe how to use a Gaussian filter to reduce noise in images using the Python programming language.

When applying a Gaussian filter to an image, we first determine the size of the kernel/matrix that will be used to smooth the image. The dimensions are usually odd, so that the result can be centered on the middle pixel. In addition, the kernels are symmetric and therefore have the same number of rows and columns. The values inside the kernel are calculated using the two-dimensional Gaussian function:

G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

Using this function, a Gaussian kernel of any size can be computed by feeding it the appropriate coordinates, for example a 3 × 3 two-dimensional Gaussian kernel approximation with standard deviation σ = 1.
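
As an illustration of that formula (not code from the thesis), a normalized kernel of any odd size can be generated with NumPy; the 3 × 3, σ = 1 case matches the example mentioned above.

```python
# Hedged sketch: build a normalized 2-D Gaussian kernel from the formula above.
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    ax = np.arange(size) - (size - 1) / 2.0                  # coordinates centered on the middle cell
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return kernel / kernel.sum()                             # normalize so the weights sum to 1

print(gaussian_kernel(3, 1.0))                               # 3x3 approximation with sigma = 1
```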

We use the PIL (Python Image Library) method filter() to pass the entire image through a predefined Gaussian kernel. First, import Image and ImageFilter (to use the filter() module) from the PIL library. Then we create an image object by opening the image at the user-defined IMAGE_PATH. We then filter the image through the filter() function, providing ImageFilter.GaussianBlur (predefined in the ImageFilter module) as its argument; here a blur corresponding to a 5×5 kernel is used. Finally, we render the image.
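
A short sketch of the PIL workflow just described; IMAGE_PATH is the user-defined placeholder from the text, and the radius value is an assumption (PIL's GaussianBlur is parameterized by a radius rather than an explicit 5×5 kernel).

```python
# Hedged sketch of the PIL-based blurring step described above.
from PIL import Image, ImageFilter

IMAGE_PATH = "plate.jpg"                                   # user-defined input path (placeholder)
img = Image.open(IMAGE_PATH)
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))   # radius ~2 roughly matches a 5x5 kernel
blurred.save("plate_blurred.jpg")
```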

Figure 2.2: Gaussian filter matrix [7]

Assume the image is one-dimensional. The pixel at the center has the greatest weight, and pixels farther from the center have decreasing weights as their distance from the center increases. Thus, the closer a point is to the center, the more it contributes to the central point's value.

Figure 2.3: Result using Gauss filter (original image and the image after blurring and noise reduction)

2.2.3 Binary with dynamic threshold

❖ Binary Image

An image in which the value of each pixel is represented by only two values: 0 (black) and 255 (white).

❖ Binary


It is the process of converting a grayscale image into a binary image.

- Let I(x, y) be the luminous intensity value at a pixel.

- Let INP(x, y) be the intensity of the corresponding pixel in the binary image.

- Here 0 < x < image width and 0 < y < image height.

To convert a grayscale image to a binary image, we compare each pixel's luminance value with a binary threshold T:

- If I(x, y) ≤ T then INP(x, y) = 0

- If I(x, y) > T then INP(x, y) = 255

❖ Binary with dynamic threshold

It is difficult to binarize an image with a single global threshold, because the appropriate threshold level has to be calculated and chosen manually for each different image. Binarization with a dynamic threshold computes a threshold suited to each image; a second advantage is that it handles images with areas that are too bright or too dark, which would lose detail if a global threshold were used.

The main idea follows these 3 steps:

- Divide the image into many different regions (windows)

- Use an algorithm to find a matching T value for each window

- Apply binarization to each window with its appropriate threshold T

There are many methods to find T. Here we use an algorithm supported by the OpenCV library, ADAPTIVE_THRESH_GAUSSIAN_C, i.e. the threshold T(x, y) at the point under consideration is the Gaussian-weighted average of the surrounding values minus a constant C.
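
A minimal sketch of that call, assuming a grayscale input; the block size of 11 and the constant C = 2 are illustrative values, not necessarily the ones used in the thesis.

```python
# Hedged sketch: adaptive binarization with a Gaussian-weighted dynamic threshold.
import cv2

gray = cv2.imread("plate_gray.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder grayscale input
binary = cv2.adaptiveThreshold(
    gray,
    255,                                  # value assigned to pixels above the local threshold
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,       # T(x, y) = Gaussian-weighted neighborhood mean minus C
    cv2.THRESH_BINARY,
    11,                                   # neighborhood (window) size, must be odd
    2,                                    # constant C subtracted from the weighted mean
)
cv2.imwrite("plate_binary.jpg", binary)
```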

2.2.4 Canny Edge Detection

Images often contain components such as smooth areas, corners/edges, and noise. Edges carry important features and often belong to objects in the image. There are many different edge detection algorithms, such as the Sobel operator, the Prewitt operator, and zero crossing, but here we choose the Canny algorithm because it is less affected by noise and is able to detect weak edges. This method follows 4 main steps:


❖ Noise reduction

Blur the image and reduce noise using a 5×5 Gaussian filter; the 5×5 size usually works well for the Canny algorithm.

❖ Eliminate points that are not maxima

In this step, a 3×3 window is slid over the pixels of the gradient image in turn. During the filtering process, we check whether the gradient magnitude of the central pixel is the maximum compared to the gradients of the surrounding pixels. If it is the maximum, we keep that pixel; if it is not a local maximum, we set its gradient magnitude to zero. We only compare the central pixel with its 2 neighboring pixels along the gradient direction. For example, if the gradient direction is 0 degrees, we compare the center pixel with its left and right neighbors; if the gradient direction is 45 degrees, we compare it with the 2 neighbors at the upper-right and lower-left of the center pixel.

❖ Filter threshold

Threshold filtering: we consider the positive pixels in the binary mask resulting from the previous step. If the gradient value exceeds the max_val threshold, the pixel is definitely an edge. Pixels with gradient magnitude less than the min_val threshold are discarded. Pixels that fall between the two thresholds are checked for adjacency to pixels that are "definitely edge": if adjacent, we keep them; if not adjacent to any edge pixel, we remove them. After this step, additional post-processing can be applied to remove noise (i.e. isolated or short edge segments) if desired.
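
In practice the steps above are wrapped inside OpenCV's Canny implementation; the sketch below (assumed, with illustrative min_val/max_val thresholds) shows the blur-then-detect sequence.

```python
# Hedged sketch: Gaussian blur followed by Canny edge detection with illustrative thresholds.
import cv2

gray = cv2.imread("plate_gray.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder grayscale input
blurred = cv2.GaussianBlur(gray, (5, 5), 0)                 # 5x5 blur, sigma derived from kernel size
edges = cv2.Canny(blurred, 100, 200)                        # min_val = 100, max_val = 200 (assumptions)
cv2.imwrite("plate_edges.jpg", edges)
```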

❖ Result

After using Canny edge detection, we have extracted the edges of the number plate, but there are still too many redundant details in the image. From here, we draw contours and apply the characteristic features of the license plate to filter out the correct number plate.

Figure 2.4: Photo after detecting the Canny border

2.2.5 Filter number plate with contours

❖ Contours

A contour can be understood as a set of points forming a closed curve around an object, frequently used to establish the location and properties of objects. Contours come in handy in shape analysis, finding the size of the object of interest, and object detection. Suzuki's contour tracing approach is used by the OpenCV package.

❖ Suzuki's Tracing Algorithm

This is the algorithm used by the OpenCV library. In addition to determining the boundary of an object like the two methods above, Suzuki's tracing method is also capable of distinguishing the outer boundary (outer) and the inner boundary (hole) of an object.


Figure 2.5: Draw Contour with OpenCV

In the image, the pink lines are the contour lines surrounding the objects. Because there are too many contours around objects other than the number plate, we apply the plate's characteristic features (its height-to-width ratio and its area in the image) to filter out the correct number plate.
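
A hedged sketch of that filtering step, assuming OpenCV 4.x; the aspect-ratio and area limits are illustrative guesses, not the thresholds used in the thesis.

```python
# Hedged sketch: keep contours whose bounding boxes look like a license plate.
import cv2

edges = cv2.imread("plate_edges.jpg", cv2.IMREAD_GRAYSCALE)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

candidates = []
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    aspect = w / float(h)                      # width-to-height ratio of the bounding box
    area = cv2.contourArea(cnt)
    if 1.0 < aspect < 2.0 and area > 1000:     # plate-like shape and a reasonably large region
        candidates.append((x, y, w, h))

print("plate candidates:", candidates)
```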

2.3 YOLO

2.3.1 About Yolo

You Only Look Once (YOLO) is a CNN model for object detection that stands out for being much faster than more traditional methods; it even performs well on IoT devices like the Raspberry Pi. Pre-YOLO object detection techniques used classifiers to perform detection, whereas YOLO uses an end-to-end neural network that predicts bounding boxes and class probabilities at the same time.

By employing a fundamentally different approach to object detection, YOLO achieves cutting-edge results that far outperform those of conventional real-time object detection systems.

2.3.2 How yolo works

The YOLO algorithm divides the image into a grid of N cells, each an equal-sized S×S region. Each cell is responsible for locating and localizing the objects it contains. Accordingly, each cell predicts class labels, object probabilities, and the coordinates of B bounding boxes relative to its own cell coordinates.

This technique produces many duplicate predictions, because several cells may predict the same object with different bounding boxes, but it considerably reduces computation, since detection and recognition are both performed by the grid cells in a single pass over the image.
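
As a rough illustration of how such a detector can be run (not the thesis code), OpenCV's DNN module can load Darknet YOLOv4 weights; the configuration/weights paths, the 416×416 input size, and the thresholds below are assumptions.

```python
# Hedged sketch: running a YOLOv4 detector through OpenCV's DNN module.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4.cfg", "yolov4.weights")   # placeholder model files
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

frame = cv2.imread("gate_camera.jpg")                              # placeholder camera frame
class_ids, scores, boxes = model.detect(frame, confThreshold=0.5, nmsThreshold=0.4)
for class_id, score, box in zip(class_ids, scores, boxes):
    print(class_id, score, box)                                    # class index, confidence, [x, y, w, h]
```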
