OVERVIEW
INTRODUCTION
Computer vision is an interdisciplinary domain focused on enabling computers to achieve a high-level understanding of digital images and videos. It aims to automate tasks traditionally performed by the human visual system, enhancing efficiency and accuracy in a wide range of applications.
Computer vision and machine vision are closely related fields used in applications ranging from industrial machine vision systems inspecting bottles on production lines to advanced AI research. These technologies integrate automated image analysis for tasks such as automated inspection and robot guidance in industrial settings. High levels of automation are essential in many modern projects, exemplified by autonomous vehicles that use computer vision for navigation and obstacle detection, including submersibles, land-based vehicles, and unmanned aerial vehicles (UAVs).
Autonomous vehicles primarily rely on Global Navigation Satellite Systems (GNSS), such as GPS, for self-location; however, GNSS struggles in indoor environments and offers limited precision of a few meters without orientation data. While combining GPS with a compass can improve accuracy to 1-2 meters in distance and about 10 degrees in orientation, this level of precision is insufficient for robot navigation and self-driving cars. To achieve higher accuracy, visual localization techniques, which determine location through images, are employed. Various methods, including probabilistic visual self-localization algorithms and visual odometry, have been developed. Additionally, visual markers with known positions are utilized, allowing relative position and orientation to be estimated by solving the Perspective-n-Point (PnP) problem.
Aruco fiducial markers are used in this project because they are simple, fast to process, and highly reliable to detect, and they are directly supported by the OpenCV library.
This report details our proposed project for a marker-based autonomous robot designed to simulate operation in a warehouse environment, aimed at reducing working time and human labor. To overcome the limitations of static cameras, we equipped a quad-wheel robot with a 5MP wide-angle camera for flexible visual input collection, which is essential for its navigation system. The camera frames are processed by an embedded computer running Python and OpenCV for marker detection and pose estimation. Each Aruco marker in the environment provides crucial feedback, enabling the robot to locate itself accurately. The robot's performance is assessed based on its self-localization capabilities in various scenarios that closely mimic real-life conditions.
OBJECTIVES
This project focuses on creating a marker-based autonomous robot that utilizes visual localization techniques. The robot's operational area is defined by Aruco markers and the field's boundaries, ensuring that it returns to its starting point if it exits the designated zone or fails to detect any markers.
The primary objective of this autonomous robot project is to navigate the robot to its intended destination within the working environment. Additionally, it features real-time control and display of operational data, ensuring users can easily interact through a user-friendly interface. All relevant information, including detected markers and any travel errors, is transmitted to the user's mobile device, allowing continuous updates on the robot's status and its relationship with the markers.
SCOPES
This vision-based project utilizes the Aruco marker detection method to simulate an indoor environment with adequate lighting. The operational area is confined to 0.75 m² (75 cm x 100 cm), and the robot functions effectively only under optimal lighting conditions and in the absence of obstacles. There are a total of 12 designated destinations within this workspace, and static markers are employed to assess the tracking capabilities of the final product. Users can input commands via a controlling website to direct the robot to any of the 12 available locations. Upon reaching its destination, the robot returns to the starting point, ready for the next command.
OUTLINES
This report is divided into five chapters, as follows:
Chapter 1: Overview: The need for automation, especially autonomous carriers, in modern life, and a review of the project.
Chapter 2: Literature Review: Basic concepts of computer vision and marker detection; controlling and communication methods of the project.
Chapter 3: Design and Implementation: Specific requirements of the project; detailed description of the hardware components and software algorithms.
Chapter 4: Results: Real working results of the robot are analyzed and evaluated.
Chapter 5: Conclusion and Future Work: Advantages and disadvantages of the final product; conclusions and future improvements are given.
LITERATURE REVIEW
COMPUTER VISION
Computer vision refers to the capability of computers to process and interpret visual input, enabling them to execute specific tasks as directed by users. It relies on data sets that are stored, manipulated, and retrieved based on their features. As a foundational technology for artificial intelligence, computer vision significantly impacts various applications, including self-driving cars, robotics, and photo correction software.
Robot navigation, also known as robot localization, is the process by which a robot determines its position and orientation within a defined frame of reference. This frame of reference consists of a coordinate system defined by reference points, which are identified by numerical values and conventional markers. Robot localization encompasses self-localization, path planning, and the creation and interpretation of maps.
Figure 2.1: Introduction to Python and OpenCV
In this project, we utilize OpenCV, a powerful open-source library for computer vision, machine learning, and image processing, to process the visual frames captured by the camera. OpenCV is essential for real-time operations in modern systems, enabling the identification of objects, faces, and even human handwriting. By leveraging vector spaces and mathematical operations, OpenCV effectively analyzes image patterns and their features, making it a vital tool when integrated with various other libraries.
Python, along with NumPy, effectively processes the OpenCV array structure, making it an excellent choice for analysis Additionally, its power, versatility, and user-friendly nature make Python ideal for projects involving Raspberry Pi embedded computers.
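As a brief illustration of this pairing, the sketch below reads one camera frame with OpenCV and treats it as a NumPy array; the camera index and the printed statistics are illustrative assumptions, not part of the project code.

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)                       # open the default camera (index 0 assumed)
    ret, frame = cap.read()                         # frame is a NumPy ndarray (height x width x 3, BGR)
    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        print(frame.shape, gray.dtype, np.mean(gray))   # ordinary NumPy operations apply directly
    cap.release()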
Visual localization is a technique that enables robots and self-driving cars to determine their location using images, and it plays a crucial role in augmented reality applications that interact with the physical environment. This method allows precise estimation of the camera pose, achieving accuracy within a few centimeters and degrees in both indoor and outdoor settings. For self-driving vehicles, the detailed visual data from camera images facilitates the establishment of correspondences between real-time query images and a world representation built from a collection of reference images.
The pipeline of visual localization includes the following steps:
Map loading: import the pre-built 3-D map that includes world point positions and the corresponding 3-D to 2-D relationships between map points and key frames. In addition, the feature descriptors linked to the 3-D map points are loaded for each key frame.
Global initialization involves extracting features from the initial image frame and matching them with corresponding features from all 3-D map points. Once 3-D to 2-D correspondences are established, the camera pose of the first frame is estimated in the working environment's coordinate system by solving the Perspective-n-Point (PnP) problem. This pose is then refined using motion-only bundle adjustment. The key frame that shares the highest number of visible 3-D map points with the initial frame is designated as the reference key frame.
In tracking, after localizing the initial frame, features from subsequent frames are matched with those in the reference key frame that have established 3-D world points. The camera pose is estimated and refined as in the global initialization step, with additional refinement possible by tracking features linked to nearby key frames.
Figure 2.2: Visual localization flow of an expected area
The localization process of a robot begins with the collection of visual data through various sensors and cameras. To execute specific tasks effectively, vision-based robots must accurately determine their location within a static physical environment. This involves analyzing the input data to extract key features, which helps create a map of the robot's surroundings and establish a path. By comparing the edges of the captured images with the expected map, the robot can pinpoint its precise location.
DEFINITION OF ARUCO MARKERS
Aruco markers are synthetic square fiducial markers featuring a prominent black border and an inner binary matrix that encodes their unique identifier (id). The black border enables rapid detection within images, making Aruco markers effective for various applications in computer vision and robotics.
The codification enables identification and the implementation of error detection and correction techniques, with the marker size determining the internal matrix dimensions; for example, a 4x4 marker consists of 16 bits. These markers are governed by a dictionary that defines the rules for calculating and validating the marker identifier, as well as for applying error correction. Additionally, Aruco markers are effective for estimating the camera pose.
Figure 2.3: Examples of different Aruco markers
Aruco markers offer a significant advantage due to their robust, quick, and straightforward detection. The underlying algorithm is mathematically optimized to maximize the spacing between markers, minimizing the risk of misidentification even if some bits are not accurately recognized. Additionally, the algorithm intentionally avoids markers that are predominantly black or white, preventing confusion with other objects in images. Examples of Aruco markers can be seen in Figure 2.3.
To create an effective working environment for a marker-based robot, it is essential to print markers that will be strategically placed within the operational area. One popular option is the Aruco marker, which can be generated easily, either programmatically (by coding) or with external tools, to facilitate robot navigation and task execution.
Using an appropriate tool, there are three main steps to generate an Aruco marker:
1. Choose the Aruco marker dictionary
2. Choose the marker identifier (id)
3. Give the desired size of the marker
The Aruco marker is quickly generated upon completing these steps, and users can save the output in either SVG or PDF format. Once printed and placed in the operational area, each marker delivers essential information for effective robot control.
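The same three steps can also be carried out programmatically with the OpenCV ArUco module. The sketch below is only an illustration under assumed parameters (dictionary, marker id, and pixel size); depending on the OpenCV 4.x version, the drawing function is cv2.aruco.drawMarker (older releases) or cv2.aruco.generateImageMarker (4.7 and later).

    import cv2

    aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)  # step 1: choose the dictionary
    marker_id = 7                                                          # step 2: choose the marker id (example value)
    side_pixels = 200                                                      # step 3: give the desired size in pixels

    marker_img = cv2.aruco.drawMarker(aruco_dict, marker_id, side_pixels)  # generateImageMarker(...) on OpenCV >= 4.7
    cv2.imwrite("marker_7.png", marker_img)                                # save the marker for printing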
The Aruco detection process involves five key steps: camera frame acquisition, adaptive thresholding, square detection, marker validation, and pose estimation. Initially, camera frames are converted to grayscale and processed through adaptive thresholding to generate binary images. The square detection algorithm is then applied to these binary images, and the candidate marker patterns are validated. If the patterns are confirmed, the final step is estimating the pose of the detected markers.
Figure 2.4: Aruco detection flow by OpenCV
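In OpenCV this whole flow is wrapped in a single detection call. The short sketch below illustrates it for a stored frame, with the file name, dictionary, and parameter object as assumptions (in OpenCV 4.7+ the parameters are created with cv2.aruco.DetectorParameters() and an ArucoDetector class is preferred).

    import cv2

    aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    params = cv2.aruco.DetectorParameters_create()        # thresholding and validation settings live here

    frame = cv2.imread("frame.png")                       # placeholder input frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)        # grayscale conversion before thresholding
    corners, ids, rejected = cv2.aruco.detectMarkers(gray, aruco_dict, parameters=params)
    if ids is not None:
        print("Detected marker ids:", ids.flatten())      # validated markers only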
Thresholding is an image segmentation technique that converts grayscale images into binary images by applying a global threshold value to all pixels. However, this reliance on a single threshold makes the method vulnerable to variations in lighting conditions, which can significantly degrade the quality of the resulting binary images.
Adaptive thresholding is a technique that applies varying thresholds to different regions of an image, depending on the local pixel values. This method is especially beneficial for images with uneven lighting conditions.
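A minimal comparison of the two approaches is sketched below; the block size and constant passed to the adaptive method are example values, not the settings used by the robot.

    import cv2

    gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)                 # placeholder grayscale frame
    _, global_bin = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)     # one threshold for the whole image
    adaptive_bin = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                         cv2.THRESH_BINARY, 11, 2)       # blockSize=11, C=2 (example values)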
Figure 2.5: Comparison of thresholding methods
Otsu's binarization is a global thresholding technique available in the OpenCV module that automatically determines a single threshold value for an image. The algorithm searches for the threshold that minimizes the weighted within-class variance, optimizing the separation between foreground and background.
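In OpenCV, Otsu's method is selected by adding the THRESH_OTSU flag, as in the hedged sketch below; the input file name is a placeholder, and the threshold argument passed in is ignored because Otsu computes its own.

    import cv2

    gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)    # placeholder grayscale frame
    otsu_t, otsu_bin = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    print("Threshold chosen by Otsu:", otsu_t)              # the value minimizing within-class variance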
To detect Aruco markers effectively, it is essential to perform camera calibration and a perspective transformation. While there are various ways to implement the perspective transformation, the core concept revolves around the perspective transformation formula, which enables the calculation of the joint rotation-translation matrix, also known as the extrinsic parameter matrix.
The symbol M' represents points in the working environment's coordinate system, which are measured in the first case and estimated in the latter; the formula transforms them into pixel coordinates. This coordinate system uses real-world units, such as meters, to describe points on an object, typically a pattern of Aruco markers, which are designed in black and white to provide high contrast for easier recognition. The scaling factor, denoted s, is applied to the pixel coordinate vector m', while A denotes the camera's intrinsic matrix. Applying formula (2.4) to the camera frames allows this conversion to be carried out.
The focal lengths of the camera, denoted fx and fy, correspond to the distance between the pinhole and the image plane; larger focal values make objects appear larger. The principal point (cx, cy) is the intersection of the optical axis and the image plane, ideally located at the center of the image plane. The joint rotation-translation matrix captures the camera's orientation (yaw, pitch, and roll about the x, y, and z axes) together with its position, defining its location and orientation in the world. The orientation can be written compactly as a rotation vector or expanded into a rotation matrix whose columns represent each direction as a vector. The result is a pixel coordinate expressed as [u, v, 1], and this formula must be applied to every working environment coordinate.
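For reference, the standard pinhole projection that uses the symbols described above (s, m', A, and the extrinsic matrix [R|t] applied to M') can be written as follows; this is the conventional formulation and is assumed to correspond to the report's formulas (2.1)-(2.4):

    s\, m' = A\,[R \mid t]\, M'
    \quad\Longleftrightarrow\quad
    s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
    =
    \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
    \begin{bmatrix}
    r_{11} & r_{12} & r_{13} & t_1 \\
    r_{21} & r_{22} & r_{23} & t_2 \\
    r_{31} & r_{32} & r_{33} & t_3
    \end{bmatrix}
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}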
Perspective-n-Point (PnP) uses formulas (2.1) and (2.2) to derive the extrinsic matrix, comprising the rotation and translation vectors, under the assumption that the camera matrix and distortion coefficients are known in advance. This process can be executed efficiently with an OpenCV function call.
The "solvePnP" function utilizes Levenberg-Marquardt optimization to address nonlinear problems through the least squares method Its primary objective is to minimize reprojection error, defined as the total of squared differences between the detected corners and their re-projected counterparts.
ROBOT’S WEB COMMUNICATION
In this project, a web framework is used to develop and display the robot's website.
A web framework is a software library designed for developing web applications that operate within an internet browser. These web applications handle user requests and generate responses using HTTP communication, as depicted in Figure 2.6.
We use the Bottle library to develop a monitoring and control website for the robot, as it is a lightweight and efficient WSGI micro web framework for Python that requires only the Python Standard Library. The user interface functions are created with HTML and in-browser JavaScript. The robot's web page is accessed from the user's device, whose browser sends HTTP requests and renders the responses as a viewable page, so all interaction takes place through the user's browser.
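A minimal Bottle sketch of such a control endpoint is shown below; the route names, port, and state dictionary are illustrative assumptions rather than the project's actual code.

    from bottle import Bottle, run

    app = Bottle()
    robot_state = {"mode": "manual", "status": "idle"}      # assumed in-memory state shared with the UI

    @app.route('/')
    def index():
        # Render a very small status page; the real UI would serve the full HTML/JavaScript interface.
        return "<h1>Robot Control</h1><p>Status: %s</p>" % robot_state["status"]

    @app.route('/move/<direction>')
    def move(direction):
        # On the real robot this handler would drive the GPIO pins; here we only record the command.
        robot_state["status"] = "moving " + direction
        return robot_state                                  # Bottle serializes dicts to JSON

    run(app, host='0.0.0.0', port=8080)                     # reachable from any device on the same network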
Figure 2.6: Communication between Client and Server
DESIGN
ROBOT REQUIREMENTS
This project aims to develop a robot that uses marker-based visual localization to navigate within a designated workspace and reach its destination autonomously. The robot operates in both manual and automatic modes, with the latter allowing it to move to user-defined locations using a Pi camera for marker detection. The Aruco marker detection algorithm processes the visual input to calculate the distance to the markers. Users can control the robot through a user-friendly HTML interface, enabling functions such as moving forward, backward, or turning. The interface also displays the robot's working modes and marker information for effective monitoring and control. In automatic mode, the robot is programmed to proceed straight from the entrance of the workspace and make a right turn toward the specified destination, returning to the entrance upon completion. Although designed for warehouse transportation, the prototype does not include a container, as payload capacity is not a focus of this project.
The robot features a compact base frame measuring 25.5 cm x 16 cm x 10 cm, housing essential components such as a Raspberry Pi model 4B, an L298N motor driver module, a servo-mounted camera, two pairs of DC motors, and dual power supplies. To enhance its visual coverage, the camera is mounted on an SG90 servo motor, allowing 180° rotation. The motor driver manages the four DC motors, achieving a movement speed of up to 6 cm/s without any load. Powered by two battery packs, the robot can operate continuously for three hours. Its operational area spans 0.75 m², with dimensions of 1 m by 0.75 m, divided into twelve distinct locations defined by the distance from the robot to the markers.
BLOCK DIAGRAM
Figure 3.1 illustrates the interconnection of the hardware components in this project, highlighting five distinct blocks that comprise the robot's hardware implementation. Each block serves a specific function and is strategically positioned on the car frame.
The central processing block is an embedded computer responsible for gathering data from the camera block, processing this information, and sending control signals to the motor block to effectively manage the robot's operations.
The motor control block consists of a motor driver module and two pairs of DC motors, enabling speed adjustment and directional control of the vehicle. This module facilitates four movement options: forward, backward, left turn, and right turn.
The camera block gathers the visual input from the working environment. A compatible camera module is connected to the MCU block, which processes all the input data.
The power supply block includes external power supplies that deliver stable and appropriate power to both the MCU block and the driver module block, ensuring reliable performance. These power supplies are compact and flexible, making them well suited for integration on a vehicle.
Figure 3.1: Robot’s hardware block connections
The Raspberry Pi 4 embedded computer serves as the central processing unit, managing all input and output data. It analyzes the visual input from the camera module and processes this information to control the vehicle's movements. The motor driver module uses DC signals to regulate the speed and direction of the robot by powering the DC motors. Additionally, the power supply block consists of two separate external power sources, providing the appropriate voltage and current for the MCU and the motor driver modules.
DETAILED BLOCK DESIGNS
The camera module enables the robot to detect markers in its environment, facilitating accurate positioning and movement. For optimal marker detection, a wide field of view and sharp image quality are essential. The Raspberry Pi OV5647 5MP camera module, compatible with all Raspberry Pi models, captures high-resolution images and HD video with a 62.2° angle of view. It operates at up to 15 frames per second and offers programmable features such as exposure control, white balance, and noise reduction. Additionally, its built-in compression engine enhances processing capabilities, while an internal anti-shake engine improves image quality. To increase the camera's field of view, it is mounted on an SG90 180° servo, which provides 1.6 kg/cm of torque and can be easily controlled via PWM, making it a lightweight and efficient solution for robotic applications.
Figure 3.2: Connecting the camera module with the Raspberry Pi
3.3.2 Motor Driver Module
Figure 3.3: L298N Motor Driver Pin Diagram
Our robot, featuring two front and two back wheels, requires a motor driver IC for stable control of its DC motors. The L298N driver module is an ideal option, known for its high-current motor control capabilities, making it suitable for prolonged operation. This module includes an L298 motor driver IC and a 78M05 regulator, ensuring efficient performance.
The L298 integrated circuit (IC) controls two DC motors independently, or four DC motors when wired in pairs, with adjustable speed and direction, featuring two internal H-bridge circuits, each made up of four transistors for power amplification. The IC can deliver a maximum output current of 2 A per channel, meeting the requirements of the motor block design. The L298N module's pin configuration is divided into three sections: power supply, input signals, and output signals, as illustrated in Figure 3.3. The ENA and ENB pins function as PWM inputs to regulate the robot's speed, while pins IN1, IN2, IN3, and IN4 carry the control signals to the output pins, which power the DC motors.
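A hedged sketch of how the Raspberry Pi can drive these pins from Python is shown below; the GPIO numbers follow the pin assignment described later in this chapter (ENA/ENB on GPIO 6 and 26, IN pins on GPIO 12, 13, 21, and 20), and the PWM frequency and duty cycle are example values.

    import RPi.GPIO as GPIO

    ENA, ENB = 6, 26                     # enable pins: PWM speed control
    IN1, IN2, IN3, IN4 = 12, 13, 21, 20  # direction control pins (assumed wiring)

    GPIO.setmode(GPIO.BCM)
    GPIO.setup([ENA, ENB, IN1, IN2, IN3, IN4], GPIO.OUT)

    pwm_a = GPIO.PWM(ENA, 1000)          # 1 kHz PWM on each enable pin
    pwm_b = GPIO.PWM(ENB, 1000)
    pwm_a.start(60)                      # 60% duty cycle as an example speed
    pwm_b.start(60)

    def forward():
        # Drive both motor pairs in the forward direction.
        GPIO.output([IN1, IN3], GPIO.HIGH)
        GPIO.output([IN2, IN4], GPIO.LOW)

    def stop():
        GPIO.output([IN1, IN2, IN3, IN4], GPIO.LOW)

    forward()
    # ... move for a while, then:
    stop()
    GPIO.cleanup()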
The Raspberry Pi 4 Model B, released by the Raspberry Pi Foundation in June 2019, features a high-performance quad-core Broadcom BCM2711 Cortex-A72 processor clocked at 1.5 GHz, offering enhanced computing speed while consuming 20% less power than its predecessor. This 64-bit ARM Cortex processor includes an integrated VideoCore VI graphics processor supporting OpenGL ES 3.0 for improved image processing and data generation. The board is equipped with up to 8 GB of RAM, facilitating efficient computational processing, data collection, and motor control. Additionally, a 64 GB memory card serves as external storage for the Raspbian operating system and the associated programs.
The pin connections between Raspberry Pi 4 and other hardware blocks are shown in Figure 3.4:
Figure 3.4: Communication between Raspberry Pi 4 and other components
The Raspberry Pi 4 uses a PWM pin to control the SG90 servo motor, enhancing the camera's capturing range. Pins 6 and 26 are linked to the enable pins of the L298N driver module, allowing PWM-based speed control of the DC motors. GPIO pins 12, 13, 21, and 20 serve as output control signals connected to the motor driver. Our project employs a 5V-3A power bank, which matches the recommended power supply for the Raspberry Pi 4. Additionally, a series connection of lithium batteries providing 12V DC powers the L298N module, ensuring a stable voltage for the DC motors.
The power supply is a crucial element of the robot, providing energy to the motor driver module, the five motors, and the central control unit. In this project, we use rechargeable batteries as the primary power source because their compact design does not impede the robot's movement, unlike a wired power supply.
The Raspberry Pi computer and the L298N module, along with five motors (a servo and two pairs of DC motors), require adequate voltage and current from the power supply. Table 3.1 details the current consumption of each component of the robot.
Devices | Input voltage | Current consumption
L298N motor driver | 12 V | Logic current: 36 mA; drive current: 2 A for each pair of motors
The robot is equipped with two separately rechargeable battery sources to ensure sufficient power for its central controller and the other components. To power the four DC motors and the L298N module, the primary power source consists of three li-ion batteries connected in series, producing a 12.6 V supply with a capacity of 4.2 Ah. The Raspberry Pi is powered by a battery bank with automatic voltage regulation circuitry, serving as the secondary power source.
SOFTWARE DESIGN
The robot operates in two primary modes: manual control and auto mode, as illustrated in Figure 3.6, which outlines its main working algorithm. Upon start-up, the robot loads its working program, defaulting to manual mode while automatically connecting to a user interface accessible from any device on the same network. This user interface allows users to switch easily between auto and manual modes, while also providing real-time information about the robot's travel process for convenient monitoring.
In manual control mode, users operate the robot via an HTML-based user interface. The Raspberry Pi computer serves HTTP requests at a configured server address, giving access to the Robot Control website.
The user interface (UI) features four directional buttons that allow users to manually control the robot's movement: forward, backward, left, and right. Pressing a button sends a signal to the robot to move in the selected direction, and holding the button enables continuous movement. Releasing the button causes the robot to stop immediately. Note that pressing multiple control buttons simultaneously is not permitted.
In auto mode, users choose predefined coordinates from the user interface, which are transmitted to the robot for mapping. The robot then scans for recognizable markers to determine its own location accurately.
Using the marker pose estimation function from OpenCV, the robot calculates the distance to its destination along both the vertical and horizontal axes, converting this information into suitable X and Y coordinates within the working area. As a result, the robot autonomously navigates to the specified location. Upon reaching its destination and returning to the entrance, it triggers an announcement that is communicated to the user interface. Figure 3.8 depicts the algorithm used for the robot's pathfinding.
Upon receiving a user-provided coordinate, the robot promptly seeks the corresponding marker. Its position within the working area is established by detecting the designated markers. If no markers are detected, the robot rotates its camera-carrying servo to continue searching for any defined marker.
The robot navigates toward its target by identifying a suitable marker, retrieving its ID for coordinate feedback, and measuring the distance to assess its proximity to the desired location.
Figure 3.8: Robot’s auto mode flowchart
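A hedged sketch of this distance-feedback step is given below: it takes the translation vector that the OpenCV marker pose estimation returns for a detected marker and decides whether to stop, turn, or keep moving. The tolerance, the axis interpretation, and the example values are assumptions, not the project's exact logic.

    import numpy as np

    STOP_TOLERANCE_M = 0.05          # stop when within 5 cm of the target distance (assumption)

    def navigation_step(tvec, target_forward_m):
        """tvec: 3-vector (metres, camera frame) from the ArUco pose estimation."""
        x_offset = float(tvec[0])    # left/right offset to the marker
        forward = float(tvec[2])     # distance along the optical axis
        if abs(forward - target_forward_m) < STOP_TOLERANCE_M:
            return "stop"
        return "turn" if abs(x_offset) > STOP_TOLERANCE_M else "forward"

    # Example: marker seen 0.60 m ahead and 0.02 m to the left, target distance 0.25 m.
    print(navigation_step(np.array([-0.02, 0.0, 0.60]), 0.25))   # -> "forward"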
The servo's timing control is demonstrated in Figure 3.7: a minimum pulse of 1 millisecond at 50 Hz positions the servo arm at 0 degrees. Sending a pulse of 1.5 milliseconds at the same frequency rotates the arm to 90 degrees, while a pulse of 2 milliseconds achieves a full rotation to 180 degrees. For finer adjustments, pulses ranging from 1 ms to 1.5 ms can be used to control the servo's movement more precisely.
To control the servo with the RPi.GPIO Python module, the PWM pin is set as an output and a duty cycle between 2% and 10% is applied to set the servo's rotation angle. A brief delay allows the servo to complete its movement, and to prevent glitches the duty cycle is then set to 0, stopping the pulse train. This servo control algorithm is represented in Figure 3.10.
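The sketch below follows this description; the PWM pin number and the settling delay are assumptions, while the 50 Hz frequency and the 2-10% duty-cycle mapping come from the description above.

    import time
    import RPi.GPIO as GPIO

    SERVO_PIN = 18                    # assumed PWM-capable GPIO pin

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(SERVO_PIN, GPIO.OUT)
    pwm = GPIO.PWM(SERVO_PIN, 50)     # 50 Hz, i.e. a 20 ms pulse period
    pwm.start(0)

    def set_angle(angle_deg):
        duty = 2.0 + (angle_deg / 180.0) * 8.0   # map 0-180 degrees onto a 2-10% duty cycle
        pwm.ChangeDutyCycle(duty)
        time.sleep(0.5)                          # give the servo time to reach the angle
        pwm.ChangeDutyCycle(0)                   # stop the pulse to prevent glitches

    set_angle(90)                     # point the camera straight ahead
    pwm.stop()
    GPIO.cleanup()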
RESULTS
HARDWARE RESULTS
The robot's hardware configuration features a power supply system that includes a power bank for the Raspberry Pi and a 3-cell battery pack for the L298N module, which powers the four DC motors and the SG90 servo. Additionally, the camera module is mounted on the servo motor, enhancing the robot's functionality.
Figure 4.1: Hardware layout of the robot from side and top view
The final prototype of our robot measures 25 cm in length and 16 cm in width, allowing it to fit within each 25 cm by 25 cm cell of the working field. Its mounted camera stands 7 cm tall, matching the height of the Aruco markers.
Figure 4.2: Robot size compared to the working area
Figure 4.2 illustrates the size comparison between the robot and its operational area, showcasing a hardware configuration designed to mimic a warehouse robot, although it currently lacks any installed containers.
The user interface for our robot, as shown in Figure 4.3, features four manual control buttons that allow movement in four directions: forward, backward, left, and right. The default setting is manual mode, which can be used at any time during operation. To switch to auto mode, the user simply inputs a number between 1 and 12 in the command field and presses the submit button. This action sends the destination to the robot, where it is converted into coordinates.
Figure 4.3: The waiting state of the robot web UI
Once the robot is powered on, users can access both modes on the controlling website. In auto mode, users receive a notification prompting them to enter a desired location. Upon submitting a number between 1 and 12, a new information window appears, and the robot's status changes from idle to moving, as shown in Figure 4.4. Users are notified when the robot reaches its destination, at which point they are expected to provide a new input destination.
Figure 4.4: Real-time current states of the robot
To test the robot's functionality, our team designed a working area board measuring 100 cm by 75 cm, divided into twelve 25 cm by 25 cm cells. Each cell is labeled with a number from 1 to 12, serving as the input coordinate for user interaction.
Figure 4.5: Working area of the robot
When a user inputs a number between 1 and 12 on the web interface, the robot translates this number into the corresponding destination and marker ID, as detailed in Table 4.1. Upon reaching the calculated x- or y-coordinate offset, the robot halts. These offset values are derived from the robot's Aruco marker pose estimation function, which measures the distance between the robot and the designated marker ID.
Table 4.1: Converting input number into coordinate table
Input number | Marker ID on Y-coordinate | [X, Y] coordinate
WORKING RESULTS
4.2.1 Moving to a specific location from the entrance
In this system, users input a destination using a number from 1 to 12, corresponding to the marked cells in the interface. The robot navigates to the cell matching the user's input number and identifies the corresponding marker ID to reach the designated location. Each time a new destination is entered, a notification is triggered to inform the user.
Figure 4.6: Sending data to Google Firebase
The Raspbian OS terminal displays the detected marker ID and the distance measurement, showcasing the robot's operation in auto mode as it navigates from the entrance to designated destinations "8" and "12". Specifically, the robot successfully travels to location number 7, makes a right turn, and arrives at cell 8, as illustrated in Figures 4.7 and 4.8.
In the initial test scenario, the robot receives an input of "8," aligning with the markers identified as "1" and "3." It navigates to the row designated by ID 1, moves left, and halts at the cell marked with the number "8," corresponding to ID 2. The robot's status is immediately updated on the user website.
Figure 4.7: Robot moves forward from the entrance
Figure 4.8: Robots turns right to reach the destination 8
Finally, we test with input number "10". Figure 4.9 demonstrates the robot's behavior.
Figure 4.9: Robot travels to location 10
The robot first rotates the servo until it detects the marker with ID "1," at which point it moves forward. However, as the distance value displayed on the terminal indicated that the robot was nearing the destination, it stopped at marker ID 1 but failed to turn right, instead rotating the servo because it had lost sight of the marker. To ensure proper navigation, we adjusted marker ID 1 for better detection. Once the robot confirmed its position, a notification was sent to indicate its approach to the destination.
Table 4.2 illustrates the precision of the robot's operation, based on the number of trials versus the number of successful moves to selected destinations. In our accuracy test, we conducted twenty trials for each test case and measured the average time taken for the robot to navigate from the entrance to the destination. The number of tests is the total number of trials conducted for each case, while the number of successes is the number of times the robot reached the designated location. The accuracy rate was calculated by dividing the number of successes by the number of tests. This evaluation assessed the robot's ability to maneuver to locations 1, 4, 10, and 12, which are situated at the rear of the working area, as well as to locations 5 and 8, located at the center.
Our robot demonstrated a high accuracy of 90% by successfully stopping at cell number "1" eighteen out of twenty times, and it achieved the same accuracy for cell number "4." This success is attributed to the robot's ability to quickly identify the marker with ID "1" within its camera range. However, when tested at position "5," the accuracy decreased to 75%, as the robot occasionally turned at incorrect angles, causing deviations in its path. A similar decline in accuracy was observed at location "8." The lowest accuracy was recorded at position "12," where the robot stopped correctly only 10 out of 20 times, or 50%. This was primarily because the robot approached too close to the marker, leaving the marker's edges outside the camera's view and preventing identification. Errors in turning angles also contributed to the low accuracy at this location. In contrast, location "10" had slightly better accuracy than location "12," as it did not suffer from turning angle errors.
The average time taken per trial is calculated to assess the robot's performance across the test locations. Locations 1 and 4 recorded the shortest times, as the robot only needed to move straight ahead. Conversely, location 12 took the longest time due to its distance from the starting point and the additional time the robot required to orient itself after turning. Overall, the duration of the robot's expedition varied between 10 seconds and 5 minutes. In conclusion, the robot achieved an average accuracy of 75.843% across all test cases and took approximately 0.856 minutes on average to finish its expedition.
Table 4.2: Travelling accuracy of robot
No. | Destination | Number of tests | Number of successes | Accuracy | Time taken per test (minutes)
Table 4.3 evaluates three types of autonomous robots: line follower, marker-based visual localization, and GNSS-based robots, focusing on path defining, stability, and scalability. The line follower robot offers high accuracy but requires a pre-drawn path, making it less flexible and scalable. In contrast, both the marker-based and GNSS-based robots use algorithms for path definition, allowing them to navigate any area; however, the GNSS-based robot is ineffective indoors due to GPS limitations. The marker-based robot's accuracy diminishes toward the rear of the field, where it struggles to detect a marker when part of it is obscured. While the GNSS robot's range can be expanded by modifying its pathfinding algorithm, it remains constrained to areas with GPS coverage; similarly, the marker-based robot's operational range is limited by its camera's capabilities. Ultimately, the line follower robot excels in stability but lacks flexibility, whereas the GNSS-based and marker-based robots offer better adaptability in path defining and scalability, albeit with their respective operational constraints.
Table 4.3: Comparison of marker-based visual localization with other navigation methods
Criterion | Line follower robot | Marker-based visual localization robot | GNSS-based robot
Path defining | Must have a specific line to follow; paths are drawn on the field. | Paths are defined by algorithm as desired. | Paths are defined by algorithm as desired.
Stability | High accuracy on any path, but it cannot reach a location where there is no line to follow. | Can reach any location in the working area; accuracy at the rear of the working field is low. | Can reach any location in the working field; cannot work indoors.
Scalability | Re-draw the path on the field and adjust the algorithm in the firmware. | Adjust the location of the reference markers and the algorithm in the firmware; limited by the camera's working range. | Adjust the algorithm in the firmware; can be extended anywhere the robot can receive a GPS signal.
CONCLUSION AND FUTURE WORK
CONCLUSION
The "Marker-based visual localization robot" has been successfully designed and tested, demonstrating high accuracy in navigating various locations within its operational field The robot's hardware is engineered to perform effectively for up to three continuous hours, utilizing a Raspberry Pi embedded computer for rapid visual input processing to guide it to its destination In warehouse settings, the robot efficiently returns to the entrance after each delivery Additionally, its user-friendly interface allows seamless switching between manual and auto modes, and users can control the robot from multiple smart devices connected to the same internet network.
Our robot has several limitations that affect its performance. It does not automatically correct its position when it strays outside the designated working range, and travelling shocks that disturb the camera's view can lead to miscalculations. Additionally, the robot can only navigate along predetermined paths within the warehouse-like environment, restricting its flexibility. Lastly, it struggles to detect markers when it gets too close, as part of the marker may fall outside the camera's view.
FUTURE WORK
This thesis acknowledges certain limitations in its current design, particularly regarding the robot's precision, which could be improved with algorithms such as PID control for more accurate navigation. Unlike existing warehouse robots, our prototype currently lacks features such as obstacle recognition and the ability to handle large objects, primarily due to constraints in time and technology. However, by combining the robot's foundational elements with more robust hardware, we can address these design flaws and incorporate additional resources to develop advanced functionalities that meet practical needs.
Source code: https://drive.google.com/drive/folders/1VwtdlSqnlUkoHEk0DrXwjKVBp0nG50-
Firebase Database: https://arucorobot-default-rtdb.asia-southeast1.firebasedatabase.app/