MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
GRADUATION THESIS
MAJOR: MECHATRONICS ENGINEERING TECHNOLOGY
INSTRUCTOR: TRUONG DINH NHON HUYNH
INTRODUCTION TO GRADUATION THESIS
The Urgency of This Graduation Thesis
Waiting staff, including waiters, waitresses, and servers, play a crucial role in restaurants, diners, bars, and private residences by fulfilling clients' needs through food and drink service. They follow the manager's policies and perform various tasks such as taking orders, running food, cleaning plates and silverware, helping to bus tables, entertaining customers, restocking supplies, and handling billing.
The COVID-19 pandemic has created significant staffing challenges for restaurants, leading to a widespread labor shortage that results in longer wait times and increased customer dissatisfaction.
In light of labor shortages and health concerns in crowded spaces, many restaurant owners are turning to robotic servers as an innovative solution. These machines reduce the need for human interaction while efficiently addressing staffing challenges. The response to this technology has been overwhelmingly positive, highlighting its potential to transform the dining experience.
When considering robotic systems for supply handling, it's essential to differentiate between Autonomous Mobile Robots (AMRs) and Automated Guided Vehicles (AGVs). AGVs navigate using fixed infrastructure, such as magnetic tapes or wires, making them ideal for stable environments with repetitive tasks. In contrast, AMRs utilize advanced technologies like LiDAR and sensors, allowing for autonomous navigation in dynamic settings without predefined paths. This flexibility makes AMRs more cost-effective in the long run, as they can be reprogrammed for various tasks and can work alongside humans in shared spaces. Although AGVs may have lower initial costs, their adaptability and collaboration capabilities are somewhat limited compared to AMRs. Ultimately, the choice between AMRs and AGVs depends on the specific application requirements and environmental conditions.
This project focuses on researching, designing, and implementing an Autonomous Mobile Robot (AMR) prototype for food service in restaurants. The AMRs will alleviate the physical strain on waitstaff by efficiently transporting food and dishes to customer tables, taking orders, and managing bill payments. By carrying more dishes than human waiters, AMRs enhance operational efficiency, reduce pressure on workers, and contribute to a smarter, more systematic restaurant environment. Additionally, the integration of AMRs can lead to cost savings and allow staff to dedicate more time to other essential tasks.
Figure 1.2: A robot assists a waitress in serving a large amount of food
To effectively address AMR construction, it is essential to analyze and apply key knowledge in the field. Selecting the most cost-effective building methods and techniques tailored for restaurant design will ensure optimal implementation.
Reasons, Significance, and Particularity of This Graduation Thesis
There are many reasons to apply robots in restaurant service, including the following:
Implementing robots in customer service roles enhances order processing and delivery speed, leading to improved efficiency and increased customer satisfaction. This technological integration results in a more streamlined service experience for users.
Implementing automation technology allows organizations to reduce labor expenses by operating with a smaller workforce, thereby enhancing operational efficiency. This strategic approach not only improves cost-effectiveness but also contributes to the financial sustainability of the company.
Automation enhances consistency and customer satisfaction by ensuring a uniform approach to tasks, minimizing errors and inconsistencies in service delivery. This reliability allows customers to anticipate a dependable experience with the product or service, ultimately resulting in higher levels of customer satisfaction.
Robots equipped with advanced navigation and sensing capabilities can swiftly adapt to changing dining environments. This flexibility ensures seamless operations and efficient service, even in dynamic settings where layouts and setups frequently change.
Robots excel in long or continuous operations without requiring breaks, making them ideal for industries that need uninterrupted production or service. This reliability enhances productivity and minimizes downtime, providing significant advantages for businesses.
Serving robots significantly boost operational efficiency and minimize labor requirements while offering customers a unique and engaging experience. Their adoption highlights a rising trend in utilizing robotics to optimize restaurant operations, enhance service speed, and foster an innovative, technology-driven dining atmosphere.
The integration of robots in restaurant service significantly boosts productivity by automating repetitive tasks, allowing human staff to focus on more complex duties. This leads to improved order accuracy, faster service, and enhanced customer satisfaction. Robots offer scalability without the corresponding increase in labor costs, making them ideal for fluctuating demand. Although initial installation costs may be high, the long-term advantages include increased efficiency and a unique customer experience.
Adoption also requires significant consideration of maintenance, system integration, and potential job displacement.
Objective of This Graduation Thesis
This project focuses on researching, designing, and implementing an Autonomous Mobile Robot (AMR) system specifically for restaurant environments, where it will function as waiting staff. The AMR will autonomously navigate the restaurant, taking orders and delivering food to enhance the dining experience for customers. The primary goal is to integrate AMR technology to improve operational efficiency, reduce manual labor, and promote innovation in automated restaurant services.
Subjects and Scope of Graduate Thesis Research
The project's objectives encompass a comprehensive array of tasks aimed at the development and optimization of an Autonomous Mobile Robot (AMR). The overarching goals are outlined as follows:
• Design, manufacture, and assemble the mechanical chassis of the AMR
• Design the electrical system as well as the control system of the AMR
• Model the motor response and design a controller that regulates the speed and position of the DC servo motor
• Develop a control system that regulates the position of the AMR in the restaurant
• Create a graphical user interface with a basic database for order management
• Create a speech-to-speech user interface
The successful realization of an efficient and technologically advanced Autonomous Mobile Robot (AMR) relies on the integration of mechanical, electrical, and software components. These interconnected objectives foster a holistic approach, ensuring that the project meets its overarching goals through an iterative and systematic process.
Novelty of The Graduation Thesis
We have achieved significant innovations and advancements in our graduation thesis topic compared to previous initiatives. This project stands out due to our extensive research and ongoing creativity, which collectively reflect our vision, commitment to excellence, and dedication to innovation and sustainable development.
Here are the main advances and breakthroughs that this project provides:
• Applying large language models to provide conversational capabilities, information provision, and resolution of customer inquiries for restaurant patrons
• Enhancing the conversational capabilities of large language models through an integrated augmented query system, which allows the robot to efficiently search for and provide specific information related to the restaurant
• Developing a restaurant data management system in conjunction with the robot, so that restaurant staff can easily update the essential information the robot accesses during customer interactions. This system streamlines the management of menus, orders, and related data, enhancing overall operational efficiency and customer service
Theoretical Basis of This Study
• Knowledge of mechanical manufacturing, structural mechanics, and material properties from the following courses: Engineering Mechanics, Mechanical Engineering Drawing, Tolerances and Measuring Technology, Manufacturing Technology, and Mechanics of Materials
• Knowledge of automatic control systems from the following courses: Automatic Control, Drive Servo System, Process Control, etc
• Knowledge of microcontrollers, sensors, and electrical actuators from the following courses: Electrical and Electronic Engineering, Digital Techniques, Microcontroller, Sensors and Actuators, etc
• Knowledge of Robot Operating System (ROS), which was taught in Embedded System
• Knowledge of wireless communications, web servers, and databases, and of machine learning and artificial intelligence, from the following courses: Internet of Things, Machine Learning and Artificial Intelligence
Research Methods
• Research and consult research papers on service robots, autonomous robot control algorithms, and related topics available on the internet
• Study ROS (Robot Operating System) and pathfinding algorithms from available literature or similar projects
• Seek input from supervising professors, colleagues, and others undertaking similar projects
• Investigate robot movement methods, operational spatial zones, and terrains
• Utilize geometric parameters to ideate and draft aesthetically pleasing and space-appropriate robot exteriors using 3D modeling software
• Explore natural language processing (NLP) algorithms and optimization methods for these NLP models.
Comprehensive Overview of Available Related Products
1.8.1: Pepper Robot from Softbank Robotics
In today's market, the Pepper robot stands out as a leading service robot, capturing attention with its humanoid design and advanced natural interaction skills. This technology has gained significant trust among diverse restaurants worldwide, making it a focal point of interest in the hospitality industry.
Figure 1.3: Pepper Robot serving dishes
Pepper's advanced ability to recognize voices and faces enhances the welcoming atmosphere for clients, fostering a pleasant dining experience. Beyond its visual appeal, Pepper offers diners detailed meal information, including ingredient compositions, and suggests options tailored to individual tastes. Additionally, Pepper effectively answers questions, provides valuable insights, and guides guests in making the most of the restaurant's offerings. This robot moves throughout the restaurant, actively engaging with patrons and delivering personalized service.
1.8.2: Servi Robot from Bear Robotics
The Servi Robot, known for its smooth and agile movement, efficiently delivers food from the kitchen to diners with speed and accuracy. Equipped with artificial intelligence, this robot can identify and navigate around obstacles, significantly improving safety during its deliveries.
The Servi Robot excels in physical navigation and boasts advanced technological integration, connecting with Point of Sale (POS) systems and restaurant management applications. This capability automates ordering and payment processes, reducing customer wait times and enhancing operational efficiency for staff. Ultimately, the combination of rapid movement and sophisticated technology makes the Servi Robot a crucial asset in improving service quality and operational fluidity in the restaurant industry.
PuduBot2 is a restaurant service robot developed by Pudu Robotics. It is distinguished by its compact and adaptable design.
This robotic system showcases exceptional flexibility and security in navigating restaurant environments. Powered by a combination of artificial intelligence and advanced sensors, it efficiently performs various tasks, including delivering food and drinks from the kitchen to tables, providing customer service, and guiding patrons to their designated seating areas.
PuduBot2 enhances operational efficiency while offering innovative interactive experiences for customers. Its stable performance and adaptability to changing environments make PuduBot2 a strong candidate for improving service quality and enriching the overall dining experience in the competitive restaurant industry.
Conclusion on Characteristics of The Available Service Robots
1.9.1: Characteristics and Their Functional Features
The robot operates with two driving wheels and utilizes a differential steering mechanism for agile movement and rotation. Equipped with a 2D lidar sensor, it gathers environmental data through laser beams, enabling accurate distance measurement from obstacles.
The system processes real-time signals and analyzes lidar data to create an accurate digital map. Upon detecting static and dynamic barriers, it uses established algorithms to determine the optimal path for the robot to reach its destination. Additionally, a control screen mounted on the robot enhances operational efficiency by displaying menu items, table positions, client counts, and shop income.
The robot's lower body frame is constructed from durable metal bars and components, firmly connected through assembly joints, screws, and welding. Its upper body features two tubular steel bars that are reinforced and attached to the lower body with assembly joints or screws, facilitating easy disassembly and transport. A metal rod further secures these bars together, enhancing stability.
The robot's driving system features two active wheels linked to motors through a gearbox, facilitating both movement and steering control. These wheels are integrated into a shock-absorbing system, while omnidirectional wheels are incorporated to evenly distribute and minimize the downward load from the robot's weight and its cargo.
Furthermore, the robot includes additional components such as a touchscreen for customer interaction, trays for delivering food items, and an outer shell made from plastic.
THEORETICAL BASIS
Robot Design Requirements
Table 2.1: Specification Requirements Table for Robot Design
No. | Parameter | Specification
6 | Number of wheels | 2 (active), 4 (passive)
9 | Human-machine interface (HMI) screen | 7 inches
10 | Mapping and path planning | Robot Operating System 2 (ROS2)
First, the structural study will include determining the dimensions, drive system, frame structure, and wheels (drive wheels and steering wheels).
Selecting the right drive chain for a robot hinges on accurately determining its overall dimensions, which encompass length, width, and height. These measurements are essential for selecting an appropriate transmission system. Additionally, the steering mechanism should provide the necessary flexibility, enabling the robot to navigate narrow spaces and pivot effectively.
For optimal precision and performance in robotic systems, it is crucial to choose an appropriate transmission system that effectively connects the motor shaft to the wheel shaft. This selection must prioritize stability and accuracy in the transmission of rotational speed and torque, taking into account essential factors such as gear ratio, durability, and overall transmission efficiency. Additionally, when selecting both the steering mechanism and transmission system, it is important to maintain a balance with the robot's overall dimensions to ensure seamless operation.
To ensure high precision in control, it is essential to prevent wheel slip by focusing on the suspension system design, which helps maintain stable wheel contact with the surface. Key factors include the load-bearing capacity of the wheels, the structure of the wheel frames, and the materials used for wheel contact surfaces. Additionally, the robot's design plays a vital role not only in aesthetics but also in enhancing movement flexibility, adaptability to various working environments, and interaction capabilities.
2.1.3: Selection of Materials for the Frame and Case of the Robot
When selecting materials for a robot frame, it's essential to consider factors such as application requirements, weight, durability, stiffness, load-bearing capacity, and cost. Common materials utilized in robot frame construction include aluminum, steel, carbon fiber, and plastics, each offering unique advantages that cater to specific robotic needs.
Extruded aluminum is a favored material for robot frames due to its lightweight nature, which minimizes the robot's overall weight. Its ease of machining, high strength, stiffness, and excellent corrosion resistance make it ideal for applications that don't require heavy load-bearing. These attributes also contribute to its aesthetic appeal, making aluminum particularly suitable for compact and mobile robots in industrial and service applications.
(Source: Internet) Figure 2.1: Aluminum frame
Using thin sheet steel of appropriate thickness for robot bases offers numerous advantages, including high precision and uniformity for accurate positioning and fixation with other components. Its durability and load-bearing capacity provide resistance to deformation, protecting the robot from strong impacts and enhancing stability and reliability, which ultimately increases its lifespan. However, steel is heavier and more challenging to machine than aluminum, so precise machining processes are essential to ensure a robust and stable structure.
(Source: Internet) Figure 2.2: C45 plate
3D printing technology enables the creation of unique robot models and shells with complex geometries using lightweight plastics, which significantly reduces the overall weight of the robot. While plastics like ABS, PLA, and PETG may lack the strength and stiffness of metals, they provide adequate durability and load-bearing capacity for general robotic applications. It is crucial to identify load-bearing parts and accurately determine mounting positions, reinforcing them with additional materials and stiffening ribs to enhance the shell's robustness and durability. Additionally, plastic offers high aesthetic appeal and lower machining costs compared to metals and composites, making it a favorable choice for specific applications that do not require extreme strength or resilience against harsh environmental conditions.
(Source: Internet) Figure 2.3: 3D Printing Technology
Servo Automated Control System
2.2.1: Overview of Servo Automated Control System
Servo control systems dynamically adjust a servo motor's position, speed, and acceleration based on real-time data. The process starts when the controller receives a command signal indicating the desired position or speed. This command is then compared to the actual position or speed, which is measured by feedback devices such as encoders or resolvers.
The controller calculates the error by determining the difference between the desired and actual values, which is essential for identifying the necessary corrective actions to achieve the target outcome. It then generates a corrective signal that adjusts the voltage and current supplied to the servo motor, enabling precise movement to the desired position or maintaining the specified speed.
A closed-loop control system, characterized by its ongoing cycle of instruction, feedback, error calculation, and correction, ensures high precision and responsiveness. This system is ideal for applications that demand accurate and repeatable movements, including CNC machines, robots, and automated manufacturing processes. The ability of servo control systems to implement real-time adjustments facilitates precise control, highlighting their significance in modern automation and precision engineering.
2.2.2: Theoretical Basis of Automatic Control System
To effectively control motors, it is essential to first establish the system's transfer function, which mathematically represents the connection between input voltage and motor output speed. Understanding this relationship allows for the design of a controller that precisely modifies the input voltage to attain the desired motor speed.
Figure 2.4: The Equivalent Circuit of a DC Servo Motor
The electrical system can be defined by the following formula:
The mechanical system can also be represented by the following formula:
• u: voltage input [V]; ω: angular velocity [rad/s]
• Ke: the back-EMF constant [V/(rad/s)]; Km: torque constant [Nm/A]
• JM: moment of inertia of the motor shaft [kg·m²]
• TM: torque of motor [Nm]
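The two formulas referred to above are not reproduced in this text. As a hedged reconstruction, the standard DC motor model that is consistent with the listed symbols can be sketched as follows. The armature resistance $R$, armature inductance $L$, armature current $i$, and viscous friction coefficient $b$ are assumed symbols that do not appear in the list above, and the thesis may use a different form (for example, without the friction term).

```latex
\begin{align}
u(t) &= R\, i(t) + L\, \frac{di(t)}{dt} + K_e\, \omega(t)
  && \text{(electrical system)} \\
T_M(t) &= K_m\, i(t) = J_M\, \frac{d\omega(t)}{dt} + b\, \omega(t)
  && \text{(mechanical system)}
\end{align}
```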
The block diagram of a DC servo motor can be represented as in the following figure:
Figure 2.5: Block diagram of a DC Servo Motor
From the given block diagram, we obtain the transfer function of the motor.
We can then approximate eq. (4) by a first-order transfer function.
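The transfer function and its approximation are not reproduced in this text. The sketch below shows one common way to arrive at a first-order model, assuming the standard DC motor equations with armature resistance $R$, armature inductance $L$, and viscous friction coefficient $b$ (assumed symbols, not taken from the source). The approximation neglects the electrical time constant, that is, it sets $L \approx 0$; the symbols $K$ (process gain) and $\tau_p$ (process time constant) are chosen to match the notation used later for PI tuning.

```latex
\begin{align}
G(s) = \frac{\Omega(s)}{U(s)}
  &= \frac{K_m}{(L s + R)(J_M s + b) + K_e K_m} \\
  &\approx \frac{K}{\tau_p s + 1},
  \qquad
  K = \frac{K_m}{R b + K_e K_m},
  \quad
  \tau_p = \frac{R J_M}{R b + K_e K_m}
\end{align}
```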
2.2.3: A Method to Define PI Parameters
A PI controller tuned with the Internal Model Control (IMC) method is a suitable choice for regulating motor speed, because the speed response of a DC motor can be approximated as a first-order system.
The mathematical equation of the PID controller:
During implementation of the PI controller, the input variable, known as the error, is obtained by sampling the output of the plant at a fixed sample rate and comparing it with the setpoint. The PI algorithm is computed at the same sample rate, once at each step k.
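The sampled PI computation described above can be sketched as follows. This is a minimal illustration, not the thesis's firmware: the class name `DiscretePI`, the output limits, and the simple "hold the integral while saturated" anti-windup rule are all assumptions introduced here.

```python
class DiscretePI:
    """Positional PI controller evaluated once per sample period (step k)."""

    def __init__(self, kp, ki, dt, out_min=-255.0, out_max=255.0):
        self.kp = kp            # proportional gain
        self.ki = ki            # integral gain
        self.dt = dt            # sample period in seconds
        self.out_min = out_min  # hypothetical actuator limits (e.g. PWM range)
        self.out_max = out_max
        self.integral = 0.0     # running sum of error * dt

    def update(self, setpoint, measurement):
        # e[k] = r[k] - y[k]
        error = setpoint - measurement
        candidate = self.integral + error * self.dt
        output = self.kp * error + self.ki * candidate
        if self.out_min <= output <= self.out_max:
            # Accept the new integral only while the output is unsaturated.
            self.integral = candidate
            return output
        # Saturated: keep the previous integral (anti-windup) and clamp.
        return max(self.out_min, min(self.out_max, output))
```

A call such as `DiscretePI(kp=2.0, ki=1.0, dt=0.1).update(1.0, 0.0)` would be made once per sample, with the returned value applied to the motor driver.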
Figure 2.6: General response of Servo motor and how to determine PI
Using the graph of DC servo response with the value in time
2.2.4: PI controller of DC Servo Motor
To determine the PI parameters for a motor, it is essential to identify the maximum speed attainable at 100% PWM and the time required to reach this peak speed. With these values in hand, we can use a specific formula to calculate the desired parameters.
Choose τ_c such that τ_p > τ_c > 0; then we have:
• ∆CV : The maximum speed the motor can achieve
• ∆MV : The resolution of PWM
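The tuning formulas themselves are not reproduced in this text. The sketch below uses the commonly cited IMC rule for a first-order process without dead time, which is an assumption about what the thesis applies: process gain K = ΔCV/ΔMV, controller gain Kc = τ_p/(K·τ_c), and integral time Ti = τ_p. The function name and the example numbers are hypothetical.

```python
def imc_pi_gains(delta_cv, delta_mv, tau_p, tau_c):
    """Return (kc, ki) for a first-order process using the IMC rule.

    delta_cv: maximum speed the motor can reach (change in controlled variable)
    delta_mv: PWM resolution (change in manipulated variable)
    tau_p:    process time constant in seconds
    tau_c:    desired closed-loop time constant, with tau_p > tau_c > 0
    """
    if not (0.0 < tau_c < tau_p):
        raise ValueError("tau_c must satisfy tau_p > tau_c > 0")
    if delta_mv == 0:
        raise ValueError("delta_mv must be non-zero")
    process_gain = delta_cv / delta_mv      # K = dCV / dMV
    kc = tau_p / (process_gain * tau_c)     # proportional gain
    ti = tau_p                              # integral time
    ki = kc / ti                            # integral gain
    return kc, ki
```

A smaller τ_c gives a faster but more aggressive response, which is why the condition τ_p > τ_c > 0 bounds the choice.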
Robot Kinematics
2.3.1: Position and Velocity – The State Variables
For mobile robots, the key controlled joint variable is the wheel rotation angle. Forward kinematics involves utilizing these wheel rotation angles, along with the robot's orientation angle obtained from sensors, to accurately determine its position.
A mobile robot generally features 2 active wheels and 4 passive caster wheels, making its kinematics dependent on the rotation angles of the active wheels. Unlike robotic arms, where the end-effector serves as the operational point, a mobile robot's operational point is its body. Consequently, identifying the robot's position is synonymous with determining the position of its center of mass on a map, and determining its velocity is akin to determining the velocity of its center of mass.
To determine the position of a moving object, we utilize the dead reckoning technique, which combines previously known positions with estimates of speed and direction over time. This method, inspired by biological processes where animals update their positional awareness, is particularly effective for calculating both position and orientation in robotics. Inertial navigation systems leverage dead reckoning to provide orientation data, which is similarly integrated into our robot kinematic model.
Figure 2.7: The kinematic model of robot
To ensure accurate kinematic modeling, the robot's initial orientation must align with the x-axis of the global coordinate system. Any deviation from this alignment will render the derived equations invalid due to incorrect projected angles. The orientation angle should decrease into negative values during clockwise rotation and increase into positive values during counterclockwise rotation.
Performing the projection on the figure, we obtain the following equations:
• x_k: the position of the robot along the x-axis at time k
• y_k: the position of the robot along the y-axis at time k
• θ_k: the orientation of the robot at time k
• Δθ_R(L): angular change per unit time of the right (left) wheel
• Δd_R(L): displacement of the right (left) wheel
• Δθ: angular change of the robot
• R: radius of the right (left) wheel
• N_R(L): number of encoder pulses read from the right (left) wheel
• Re_R(L): encoder resolution of the right (left) wheel
From these, the position state can be summarized as follows:
The state variable for velocity can also be written as follows: [𝑣
To determine the unknowns Δd and Δθ, we will utilize the wheel encoder to measure the rolling distance of the robot's center of mass and the BNO085 sensor to ascertain the angle θ. Subsequently, we will substitute these values into the equations for x and y to compute the coordinates accurately.
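The position update described above can be sketched as a single dead-reckoning step. The projection equations are not reproduced in this text, so the form used here (center displacement as the mean of the two wheel displacements, projected with the IMU heading) is a common formulation and an assumption about the thesis's exact equations. The function name is hypothetical.

```python
import math


def dead_reckoning_step(x, y, delta_d_right, delta_d_left, theta_imu):
    """Advance the (x, y) position by one sample.

    delta_d_right / delta_d_left: wheel displacements from the encoders [m]
    theta_imu: heading from the IMU [rad], positive counterclockwise,
               measured from the global x-axis
    """
    # Displacement of the robot's center between the two wheels.
    delta_d = 0.5 * (delta_d_right + delta_d_left)
    # Project the displacement onto the global axes.
    x_next = x + delta_d * math.cos(theta_imu)
    y_next = y + delta_d * math.sin(theta_imu)
    return x_next, y_next
```

Taking the heading from the IMU rather than from the difference of wheel displacements means that wheel slip affects only the travelled distance, not the estimated orientation.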
2.3.2: Moving Path of Robot’s Center of Mass
The encoder plays a crucial role in measuring the speed and distance traveled by the wheel, while also providing precise information on the angle of rotation.
To accurately calculate wheel rotation, it is essential to obtain the encoder's resolution from the manufacturer and the pulse count per unit time. The resolution for both wheels is represented as Re_R(L). Understanding the relationship between ticks and pulses, specifically how many ticks correspond to a single pulse, is crucial for determining the total rotation of the wheel.
In this example, the motor features an encoder with a resolution of 2000 pulses per revolution (PPR). With a total gear ratio of 70:1 provided by two gearboxes, the encoder must complete 70 rotations for the wheel to achieve one full revolution, which gives 2000 × 70 = 140000 pulses per wheel revolution.
$$Re_{R(L)} = \frac{2\pi}{140000} = 0.0000448799 \ (\text{rad/pulse})$$
The angular displacement of each wheel can be calculated as follows:
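The conversion from pulse counts to wheel rotation can be sketched as below, using the 2000 PPR encoder and 70:1 gear ratio stated above. The function names and the wheel radius passed by the caller are hypothetical; the thesis's own formula for the angular displacement is not reproduced in this text.

```python
import math

PULSES_PER_MOTOR_REV = 2000   # encoder resolution (PPR), from the text
GEAR_RATIO = 70               # total reduction of the two gearboxes, from the text

# Wheel rotation per encoder pulse: 2*pi / (2000 * 70) rad/pulse.
RAD_PER_PULSE = 2.0 * math.pi / (PULSES_PER_MOTOR_REV * GEAR_RATIO)


def wheel_angle(pulse_count):
    """Angular displacement of the wheel [rad] for a signed pulse count."""
    return pulse_count * RAD_PER_PULSE


def wheel_distance(pulse_count, wheel_radius):
    """Rolling distance [m] of a wheel with the given radius [m]."""
    return wheel_angle(pulse_count) * wheel_radius
```

For instance, 140000 pulses correspond to one full wheel revolution, so `wheel_distance(140000, r)` equals the wheel circumference 2πr.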
The BNO085 is a versatile Inertial Measurement Unit (IMU) commonly utilized for motion measurement and orientation tasks. This 9-axis sensor integrates a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer, enabling precise detection of angular rotation and spatial orientation in three dimensions.
The BNO085 significantly enhances robot motion control by providing crucial data such as linear acceleration, angular velocity, and, notably, orientation data. The BNO085 outputs absolute orientation data in the form of quaternions. Quaternions are the default method for representing orientations and rotations in ROS2. A quaternion is represented as follows:
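For a planar robot, only the heading (yaw) is needed from the quaternion. The sketch below assumes a unit quaternion q = w + xi + yj + zk with the (x, y, z, w) component order used by ROS 2 `geometry_msgs/Quaternion`; the function name is hypothetical, and the thesis's own quaternion formula is not reproduced in this text.

```python
import math


def yaw_from_quaternion(x, y, z, w):
    """Return the rotation about the z-axis [rad] of a unit quaternion.

    The result lies in (-pi, pi] and is positive for counterclockwise
    rotation when viewed from above.
    """
    siny_cosp = 2.0 * (w * z + x * y)
    cosy_cosp = 1.0 - 2.0 * (y * y + z * z)
    return math.atan2(siny_cosp, cosy_cosp)
```

The identity quaternion (0, 0, 0, 1) gives a yaw of zero, and a pure rotation of 90° about the z-axis, (0, 0, sin 45°, cos 45°), gives π/2.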
Figure 2.8: The linear velocity of each wheel relative to any center of rotation
In a robot where the wheels maintain constant contact with the ground without slipping, the vehicle rotates around a point known as the Instantaneous Center of Rotation (ICR). Both wheels rotate about the ICR with the same angular velocity (ω), which relates the ground contact speeds of the left wheel (vₗ) and the right wheel (vᵣ) to their distances from the ICR.
Solving these two equations (2.12a) and (2.12b), we have:
From (2.13a), (2.13b) and using the equation for the angular velocity we have:
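Equations (2.12) and (2.13) are not reproduced in this text. The sketch below shows the standard differential-drive result that such a derivation typically yields, under the assumption that the ICR lies on the wheel axis and that `track_width` (a hypothetical name) is the distance between the two drive wheels. Positive ω is counterclockwise.

```python
import math


def body_velocity(v_left, v_right, track_width):
    """Linear velocity v [m/s] and angular velocity omega [rad/s] of the body."""
    v = 0.5 * (v_right + v_left)
    omega = (v_right - v_left) / track_width
    return v, omega


def icr_radius(v_left, v_right, track_width):
    """Signed distance from the robot center to the ICR [m].

    Returns math.inf for straight-line motion, where no finite ICR exists.
    """
    if v_right == v_left:
        return math.inf
    return 0.5 * track_width * (v_right + v_left) / (v_right - v_left)
```

Equal wheel speeds give pure translation, opposite wheel speeds give rotation in place (radius zero), and stopping one wheel makes the robot pivot about that wheel.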
To accurately ascertain a robot's position in real space, it is essential to precisely identify the component coordinate systems and their interrelations. Utilizing matrix calculations between these systems enables us to determine the robot's position in real time.
Figure 2.9: The transformation tree of the robot system
• map: the global (fixed) coordinate system
• odom: the coordinate system with its origin at the robot's initial position
• base_footprint: the coordinate system on the robot's projection plane
• base_link: the coordinate system with its origin at the robot's center of mass
• caster_1, caster_2, caster_3, caster_4: the coordinate systems of the caster wheels
• wheel_right_link, wheel_left_link: the coordinate systems attached to the wheels
• base_scan: the coordinate system with its origin at the center of the laser
• imu_link: the coordinate system attached to the IMU sensor
TF2 will assist us in performing these complex transformation calculations accurately and reliably. All we must do is ensure that all parameters match the actual robot.
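To illustrate the kind of calculation that is chained along the transformation tree, the following planar sketch composes two poses, for example map → odom and odom → base_link, to obtain map → base_link. It is not the TF2 API: TF2 works in 3D with quaternions and time stamps, and the function name and the (x, y, θ) tuple representation here are assumptions for illustration only.

```python
import math


def compose(parent_to_mid, mid_to_child):
    """Compose two planar poses given as (x, y, theta) tuples.

    parent_to_mid: pose of the middle frame expressed in the parent frame
    mid_to_child:  pose of the child frame expressed in the middle frame
    Returns the pose of the child frame expressed in the parent frame.
    """
    x1, y1, t1 = parent_to_mid
    x2, y2, t2 = mid_to_child
    # Rotate the child offset into the parent frame, then translate.
    x = x1 + x2 * math.cos(t1) - y2 * math.sin(t1)
    y = y1 + x2 * math.sin(t1) + y2 * math.cos(t1)
    return x, y, t1 + t2
```

For example, if odom sits at (1, 2) in map and is rotated by 90°, a robot that is 1 m ahead along the odom x-axis ends up at (1, 3) in map.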
ROS2 – The Robot Operating System
Robot Operating System 2 (ROS2) is an open-source platform for robot development and control, created by Open Robotics. As the successor to ROS, it enhances scalability, flexibility, and performance, making it a significant advancement in robotics software.
ROS 2 provides a modular approach to building distributed software systems for robots and embedded systems. This platform supports multiple programming languages, primarily C++ and Python through its official client libraries, with community-maintained bindings for other languages such as Java, enabling developers to leverage a rich set of tools and libraries to build complex applications.
ROS 2 is known for its key strengths, including compatibility with real-time and embedded environments, robust support for networking protocols such as DDS (Data Distribution Service) for communication between system nodes, and its extensibility, which facilitates the development of cross-platform and multi-robot applications.
In summary, ROS 2 is a powerful platform for robot development and automation systems, providing essential tools for building, testing, and deploying complex robot applications across various domains.
The Jetson Nano will serve as the central hub of the system, functioning as a server that collects data from peripheral devices. Its responsibilities include processing topics, executing Simultaneous Localization and Mapping (SLAM), and overseeing the navigation stack. This embedded computer will utilize USB ports to receive laser data from Lidar and communicate with microcontroller boards, which will control and transmit data regarding the motor and IMU back to the Jetson Nano.
The system utilizes serial communication as its chosen protocol due to its significant advantages. It offers fast real-time data transmission between microcontroller boards and embedded computers, ensuring efficient communication. Additionally, its flexibility allows for integration with various sensors and actuators, such as motors and IMUs. Furthermore, the straightforward implementation of serial communication simplifies the development process, minimizing potential errors. Overall, adopting serial communication supports reliable and efficient data transfer, which is vital for the system's smooth operation.
In ROS 2, a node serves as a fundamental processing unit that communicates across the network by transmitting and receiving messages. Nodes that send messages are referred to as publishers, while those that receive them are called subscribers. The publish-subscribe communication model at the heart of ROS 2 fosters an extensible and flexible architecture, ensuring effective connectivity among various system components.
A publisher is a node that sends out messages. For instance, a sensor node may publish environmental data, like temperature or distance readings.
A subscriber, on the other hand, is a node that receives messages. To decide how to move a robot, for instance, a control node may subscribe to the sensor data.
Topics, also known as buses for message exchange, facilitate communication between nodes in a system. In this publish-subscribe model, publishers send messages to a topic while subscribers receive them, enabling decoupled communication. This mechanism enhances system flexibility and scalability by allowing nodes to operate independently without direct knowledge of each other.
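The decoupling provided by topics can be illustrated with a toy in-process message bus. This is deliberately not the ROS 2 client API: real ROS 2 nodes exchange typed messages over DDS, and the `TopicBus` class, its methods, and the example message are all hypothetical names used only to show that a publisher needs to know the topic name, not the subscribers.

```python
from collections import defaultdict


class TopicBus:
    """Minimal publish-subscribe bus keyed by topic name."""

    def __init__(self):
        # Maps a topic name to the list of subscriber callbacks.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for each message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber of the topic."""
        for callback in self._subscribers[topic]:
            callback(message)


# A "control node" subscribes to scan data without knowing who publishes it.
bus = TopicBus()
received = []
bus.subscribe("/scan", received.append)

# A "sensor node" publishes without knowing who is listening.
bus.publish("/scan", {"range_m": 1.2})
```

Adding a second subscriber to `/scan` would require no change to the publishing side, which is the property that makes the architecture extensible.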
SLAM Toolbox is a ROS 2 package for Simultaneous Localization and Mapping (SLAM). It provides tools for 2D SLAM that build a map while tracking the robot's position in real time. With support for both lifelong and online SLAM, it covers applications ranging from basic map generation to long-term autonomous navigation.
SLAM is essential in robotics because it allows a robot to map and understand its environment autonomously. Using sensors such as Lidar, the robot gathers data to estimate its position and navigate through real-world spaces.
The SLAM Toolbox mapping function relies on two key inputs: the /tf topic and the /scan topic. The /tf topic carries the transformations between coordinate frames, from which the robot's motion and pose are calculated. The /scan topic carries Lidar data from the environment, supplying the distances and directions needed for map-building.
SLAM mapping integrates these inputs to generate a map of the environment, which is crucial for determining the robot's location and direction of movement. By combining feature points from the Lidar with the pose transformations from the /tf topic, the process produces a detailed map together with an estimate of the robot's position within it.
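To make the role of the two inputs concrete, the sketch below projects one laser scan into grid cells using a known robot pose. It is a simplified illustration rather than the SLAM Toolbox implementation: the function name `scan_to_cells`, the 12 m range limit, and the sample values are assumptions, and the 0.05 m cell size is the default map resolution quoted in this thesis.

```python
import math
from typing import Iterable, Set, Tuple

RESOLUTION = 0.05  # metres per cell (default ROS map resolution)


def scan_to_cells(pose: Tuple[float, float, float],
                  ranges: Iterable[float],
                  angle_min: float,
                  angle_increment: float,
                  max_range: float = 12.0) -> Set[Tuple[int, int]]:
    """Return the (x, y) grid cells hit by the valid beams of one scan.

    pose plays the role of the /tf data: robot x, y (m) and yaw (rad) in the map frame.
    ranges plays the role of the /scan data: one distance per beam.
    """
    x, y, yaw = pose
    cells: Set[Tuple[int, int]] = set()
    for i, r in enumerate(ranges):
        if not (0.0 < r < max_range):  # skips zero, inf, NaN and too-distant readings
            continue
        beam_angle = yaw + angle_min + i * angle_increment
        hit_x = x + r * math.cos(beam_angle)
        hit_y = y + r * math.sin(beam_angle)
        cells.add((math.floor(hit_x / RESOLUTION), math.floor(hit_y / RESOLUTION)))
    return cells


# One valid beam straight ahead and one invalid (infinite) beam.
occupied = scan_to_cells((0.52, 0.52, 0.0), [1.02, float("inf")], 0.0, math.pi / 2)
```

A full SLAM system additionally corrects the pose by matching successive scans; here the pose is simply taken as given.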
The robot uses SLAM mapping in ROS 2 to localize itself and map its environment autonomously. This capability lets the robot interact with its surroundings and perform more complex tasks based on the generated map.
Adaptive Monte Carlo Localization (AMCL) is a ROS 2 package for robot localization that determines the robot's position within a previously built map. It is based on particle filter localization.
In particle filtering, each particle represents one hypothesis of the robot's position and orientation, and the initial particles are sampled randomly. As the robot moves, the particles are updated according to the robot's motion and sensor readings through recursive Bayesian estimation. Techniques such as Kalman filters and particle filters approximate the underlying Bayesian equations.
AMCL thus provides a probabilistic estimate of the robot's pose: it compares each hypothesis with the laser scan data and eliminates those with low probability, which improves localization accuracy.
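The predict, update, and resample cycle of a particle filter can be sketched in one dimension. This is plain Monte Carlo localization with a fixed particle count, not the adaptive variant used by AMCL, and every number (a corridor with a single wall at 10 m, the noise levels, 500 particles) is an assumption chosen for illustration.

```python
import math
import random

WALL_X = 10.0        # assumed known map: one wall at x = 10 m
SENSOR_SIGMA = 0.2   # assumed range-sensor noise (m)
MOTION_SIGMA = 0.05  # assumed odometry noise per step (m)


def likelihood(expected_range: float, measured_range: float) -> float:
    """Unnormalised Gaussian likelihood of a range reading."""
    return math.exp(-0.5 * ((measured_range - expected_range) / SENSOR_SIGMA) ** 2)


def localize(true_start: float, step: float, n_steps: int,
             n_particles: int = 500, seed: int = 0) -> float:
    rng = random.Random(seed)
    # Initial particles are sampled randomly over the whole corridor.
    particles = [rng.uniform(0.0, WALL_X) for _ in range(n_particles)]
    true_x = true_start
    for _ in range(n_steps):
        true_x += step
        measured = WALL_X - true_x  # noise-free reading keeps the demo simple
        # Predict: move every particle by the commanded step plus noise.
        particles = [p + step + rng.gauss(0.0, MOTION_SIGMA) for p in particles]
        # Update: weight each particle by how well it explains the measurement.
        weights = [likelihood(WALL_X - p, measured) for p in particles]
        # Resample: low-probability hypotheses are dropped, likely ones duplicated.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return sum(particles) / n_particles


estimate = localize(true_start=2.0, step=1.0, n_steps=5)  # true final position is 7.0 m
```

After a few cycles the particle cloud collapses around the true position, which is the behavior the text describes as eliminating positions with low probability.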
Path-Finding Algorithms
A map in ROS is an occupancy grid built from pixels (cells), with a default resolution of 0.05 m/pixel, so the path-finding algorithm operates on grid cells. Consider an example with a 5x5 grid map.
Starting at point Q, indicated by a green border, the goal is to reach point N, marked with a red border. The black blocks represent occupied cells, that is, obstacles such as walls or objects that impede movement.
The algorithm's primary objective is to find the shortest distance from the starting node to every other node, always extending the shortest known paths through neighboring cells. Each node's cost is denoted g_cost and shown in the bottom right corner of its cell; the starting point Q is initially assigned a g_cost of 0. As the algorithm progresses, these values are updated to hold the shortest distance from the starting point. The starting point Q is also placed in an open list, whose nodes are highlighted in orange.
Figure 2.14: Start point on the grid map
Step 1: Select the current node. The search begins by selecting the node with the lowest g_cost from the open list; we call this node the current node. At this point the open list contains only node Q, so Q becomes the current node. A black border indicates the current node.
Step 2: Find the neighbors of the current node. Next, we select the neighboring nodes of the current node. In this example, only the cells directly above, below, left, and right of the current node are considered. The valid neighbors are shown by arrows pointing from the current node (here L, P, and V, which are not obstacles).
Step 3: Update distance values and store parent nodes. After identifying the neighbors, we update the g_cost and set the parent node of each neighbor. In this example, Q is the parent node and L, P, and V are its neighbors. We then add these neighbors to the open list. Since the step cost is 1, each neighbor receives a g_cost of 1, as shown in the diagram below.
After evaluating the neighbors of node Q, we classify Q as a visited node, meaning that it has listed its neighbors and computed the cost to each. Visited nodes are kept in a collection called the closed list, highlighted in yellow. We then remove Q from the open list.
After the first iteration, the open list contains nodes L, P, and V. The next iteration repeats the same process: we select a current node, and since the g_cost of all three nodes equals 1, any of them can be chosen. We then identify the neighbors of the selected current node.
After updating the costs of its neighbors, we remove the current node from the open list and mark it as visited by changing its color from orange to yellow. This iterative process continues until the destination node is reached, at which point the search ends.
Figure 2.18: Loop of actions till reaching the point N
Once the destination point is reached, we backtrack through the parent nodes. Storing the parent of each node is what makes it possible to redraw the shortest path.
Figure 2.19: Sequentially list the parent nodes
For point N as the destination: N has parent node S, S has parent node X, X has parent node W, W has parent node V, and finally V has parent node Q (the starting point).
Reversing this chain of parents gives the path from the start to the goal: Q, V, W, X, S, N.
Figure 2.20: Completed finding the shortest path
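The three steps and the final backtracking correspond to Dijkstra's algorithm with a unit step cost, and they can be sketched as follows. The row-major A to Y labeling of the 5x5 grid and the obstacle layout (cells H, M, and R occupied) are assumptions, chosen so that the result matches the parent chain listed above; the actual obstacles in the figures may differ.

```python
import heapq
from typing import Dict, Iterator, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, col)

# 0 = free cell, 1 = occupied cell (assumed layout).
GRID = [
    [0, 0, 0, 0, 0],  # A B C D E
    [0, 0, 1, 0, 0],  # F G H I J
    [0, 0, 1, 0, 0],  # K L M N O
    [0, 0, 1, 0, 0],  # P Q R S T
    [0, 0, 0, 0, 0],  # U V W X Y
]


def neighbors(cell: Cell) -> Iterator[Cell]:
    """Step 2: free cells directly above, below, left, and right."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
            yield (nr, nc)


def dijkstra(start: Cell, goal: Cell) -> Optional[List[Cell]]:
    g_cost: Dict[Cell, int] = {start: 0}
    parent: Dict[Cell, Cell] = {}
    open_list = [(0, start)]
    closed = set()
    while open_list:
        # Step 1: the current node is the open node with the lowest g_cost.
        cost, current = heapq.heappop(open_list)
        if current in closed:
            continue
        closed.add(current)
        if current == goal:
            # Backtrack through the parents, then reverse.
            path = [current]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1]
        # Step 3: update g_cost and parent of each neighbor (step cost is 1).
        for nxt in neighbors(current):
            new_cost = cost + 1
            if new_cost < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_cost
                parent[nxt] = current
                heapq.heappush(open_list, (new_cost, nxt))
    return None  # goal unreachable


def label(cell: Cell) -> str:
    return chr(ord("A") + cell[0] * 5 + cell[1])


path = dijkstra((3, 1), (2, 3))           # Q is (3, 1), N is (2, 3)
path_labels = [label(c) for c in path]
```

A binary heap stands in for the open list, and the `closed` set plays the role of the closed list described in the text.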
In the A* algorithm, a heuristic is an estimate used to assess how promising a state is during pathfinding. It estimates the remaining distance from the current state to the goal state, giving an approximation rather than an exact result. This estimate is usually based on available information and prior experience, and it guides the algorithm towards an efficient solution.
The heuristic lets A* prioritize states that are likely to be cheaper and nearer to the goal, so it typically explores fewer states than Dijkstra's algorithm and searches faster. The heuristic alone, however, does not guarantee the shortest path: A* is optimal only when the heuristic is admissible, that is, when it never overestimates the true remaining cost.
The accuracy of the heuristic is therefore crucial, as it strongly influences both the performance and the quality of the pathfinding results. The heuristic is only an estimate and may differ from the actual optimal value.
In the A* algorithm, g(n) denotes the actual cost from the start point to node n, while h(n) is the estimated cost from node n to the goal. In each iteration, A* expands the node in the open list with the smallest value of f(n) = g(n) + h(n).
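A minimal sketch of A* on a 4-connected grid is given below, using the Manhattan distance as h(n), which never overestimates on such a grid. The grid layout is an assumption made for illustration. Passing a zero heuristic turns the same routine into Dijkstra's algorithm, which makes the effect of the heuristic on the number of expanded nodes visible.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # (row, col)

# 0 = free cell, 1 = occupied cell (assumed layout).
GRID = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]


def manhattan(a: Cell, b: Cell) -> int:
    """h(n): admissible estimate of the remaining cost on a 4-connected grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def zero(a: Cell, b: Cell) -> int:
    """h(n) = 0 reduces A* to Dijkstra's algorithm."""
    return 0


def search(start: Cell, goal: Cell,
           h: Callable[[Cell, Cell], int]) -> Tuple[List[Cell], int]:
    """Return (path, number of expanded nodes)."""
    g_cost: Dict[Cell, int] = {start: 0}
    parent: Dict[Cell, Cell] = {}
    open_list = [(h(start, goal), start)]  # entries are (f, cell)
    closed = set()
    while open_list:
        _, current = heapq.heappop(open_list)  # smallest f(n) = g(n) + h(n)
        if current in closed:
            continue
        closed.add(current)
        if current == goal:
            path = [current]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1], len(closed)
        r, c = current
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < len(GRID) and 0 <= nc < len(GRID[0])) or GRID[nr][nc]:
                continue
            new_g = g_cost[current] + 1
            if new_g < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = new_g
                parent[(nr, nc)] = current
                heapq.heappush(open_list, (new_g + h((nr, nc), goal), (nr, nc)))
    return [], len(closed)


astar_path, astar_expanded = search((3, 1), (2, 3), manhattan)
dijkstra_path, dijkstra_expanded = search((3, 1), (2, 3), zero)
```

Both calls return the same shortest path, while the heuristic version never expands more nodes than the zero-heuristic version.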
The DWB controller is the successor to the base_local_planner and DWA controllers of ROS 1.
The dwb_planner package provides a controller that drives a mobile base in the plane. Using a map, it creates a kinematic trajectory that takes the robot from a start pose to a goal. Along the way, the planner builds a value function, represented as a grid map, that encodes the cost of traversing each grid cell. The controller then uses this value function to determine the dx, dy, and dtheta velocities to send to the robot.
The basic idea of the Dynamic Window Approach (DWA) algorithm is as follows:
1. Discretely sample in the robot's control space (dx, dy, dtheta).
2. For each sampled velocity, perform forward simulation from the robot's current state to predict what would happen if the sampled velocity were applied for some (short) period.
3. Evaluate (score) each trajectory resulting from the forward simulation, using a metric that incorporates characteristics such as proximity to obstacles, proximity to the goal, proximity to the global path, and speed. Discard illegal trajectories (those that collide with obstacles).
4. Pick the highest-scoring trajectory and send the associated velocity to the mobile base.
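The four steps can be sketched for a differential-drive robot, for which dy is zero and the control space reduces to a linear velocity v and an angular velocity w. This is a simplified illustration, not the DWB code: the velocity samples, the 1 s simulation horizon, the robot radius, and the scoring weights are assumptions, and the global-path term of the metric is left out for brevity.

```python
import math
from typing import List, Optional, Sequence, Tuple

V_SAMPLES = (0.0, 0.1, 0.2, 0.3, 0.4, 0.5)   # assumed linear velocities (m/s)
W_SAMPLES = (-1.0, -0.5, 0.0, 0.5, 1.0)      # assumed angular velocities (rad/s)
SIM_TIME, DT = 1.0, 0.1                      # assumed horizon and time step (s)
ROBOT_RADIUS = 0.15                          # assumed robot radius (m)

Point = Tuple[float, float]


def simulate(v: float, w: float) -> List[Point]:
    """Step 2: forward-simulate from the robot frame origin (x = y = theta = 0)."""
    x = y = theta = 0.0
    trajectory = []
    for _ in range(int(round(SIM_TIME / DT))):
        x += v * math.cos(theta) * DT
        y += v * math.sin(theta) * DT
        theta += w * DT
        trajectory.append((x, y))
    return trajectory


def collides(trajectory: Sequence[Point], obstacles: Sequence[Point]) -> bool:
    return any(math.hypot(x - ox, y - oy) < ROBOT_RADIUS
               for x, y in trajectory for ox, oy in obstacles)


def score(trajectory: Sequence[Point], v: float, goal: Point) -> float:
    """Step 3: closer to the goal is better, with a small bonus for speed."""
    end_x, end_y = trajectory[-1]
    return -math.hypot(goal[0] - end_x, goal[1] - end_y) + 0.1 * v


def best_command(goal: Point, obstacles: Sequence[Point]) -> Optional[Tuple[float, float]]:
    best, best_score = None, float("-inf")
    for v in V_SAMPLES:                      # Step 1: sample the control space
        for w in W_SAMPLES:
            trajectory = simulate(v, w)
            if collides(trajectory, obstacles):
                continue                     # Step 3: discard illegal trajectories
            s = score(trajectory, v, goal)
            if s > best_score:
                best, best_score = (v, w), s
    return best                              # Step 4: highest-scoring velocity


free_cmd = best_command(goal=(2.0, 0.0), obstacles=[])
blocked_cmd = best_command(goal=(2.0, 0.0), obstacles=[(0.3, 0.0)])
```

With a free path the sketch drives straight at full speed; with an obstacle directly ahead, the straight full-speed trajectory is discarded and another sample is chosen.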
Tokenization
Tokenization in Natural Language Processing (NLP) and machine learning breaks text down into smaller units called tokens, which can range from individual characters to full sentences. This step helps computers process human language by simplifying it into manageable components.
The primary purpose of tokenization is to represent text in a machine-readable format while preserving the context the text refers to. By transforming text into tokens, computers can recognize patterns in it more easily. Pattern recognition is crucial for enabling robots to understand and respond to human input effectively. For instance, when a computer encounters the term "working," it analyzes the word as a set of components to derive its meaning.
Tokenization can be grouped into three types: word tokenization, character tokenization, and sub-word tokenization. Each type has advantages and limitations, as well as various implementation techniques.
Word tokenization is a widely used method that divides text into individual words. It is particularly effective in languages whose text is delimited by whitespace, such as Vietnamese, as illustrated in the accompanying figure.
Character tokenization involves breaking content into individual characters, making it particularly beneficial for languages with unclear word boundaries and for tasks that demand precise analysis, such as spelling correction.
Sub-word tokenization strikes a balance between word and character tokenization by breaking text into units that are larger than a single character but smaller than a whole word. For instance, "Chatbots" can be tokenized into "Chat" and "bots," while "Phát" may be divided into "Ph" and "át." This technique is especially useful in languages where meaning is derived from smaller components, and for handling out-of-vocabulary terms in natural language processing tasks.
Rule-based tokenization divides input text into tokens using a defined set of rules. These rules can consider elements such as whitespace, punctuation, regular expressions, and language-specific constraints.
Whitespace Tokenization is one rule-based method. It simply splits a text into tokens wherever whitespace occurs.
Figure 2.25: An illustration of Whitespace Tokenization
Regular Expression Tokenization splits input text according to specific patterns defined by regular expressions. This technique is particularly useful for identifying particular types of information within text, such as email addresses, phone numbers, order numbers, and currency values.
Figure 2.26: An illustration of Regular Expression Tokenization
The next method is Punctuation Tokenization, which splits the text at punctuation characters such as the period, comma, and semicolon.
Figure 2.27: An illustration of Punctuation Tokenization
The final method, Language-specific Tokenization, splits input text into tokens using language-specific criteria. For example, certain languages, such as German, allow words to be concatenated without spaces. As a result, language-specific rules are required to separate the input text into meaningful tokens.
Figure 2.28: An illustration of Language-Specific Tokenization
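The first three rule-based methods can be sketched with the Python standard library alone. The function names, sample sentences, and the currency pattern are illustrative assumptions; language-specific tokenization is omitted because it requires per-language rules or dictionaries.

```python
import re
from typing import List


def whitespace_tokenize(text: str) -> List[str]:
    """Split wherever whitespace occurs."""
    return text.split()


def punctuation_tokenize(text: str) -> List[str]:
    """Split at runs of common punctuation characters and trim each piece."""
    pieces = re.split(r"[.,;:!?]+", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def regex_tokenize(text: str, pattern: str) -> List[str]:
    """Return every substring that matches the given pattern."""
    return re.findall(pattern, text)


words = whitespace_tokenize("Robot phục vụ món ăn")
clauses = punctuation_tokenize("Hello, robot. Bring water; thanks")
# Assumed pattern for currency values such as $12.50 or $3.
prices = regex_tokenize("Order #1024 costs $12.50 and $3", r"\$\d+(?:\.\d{2})?")
```

Each function applies exactly one rule, which makes the trade-off visible: the whitespace rule is the simplest, while the regular-expression rule extracts only the targeted pieces of information.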
Statistical tokenization trains models on large text datasets to predict token boundaries from word and character frequency patterns. Common algorithms used for this purpose include Maximum Entropy and Conditional Random Fields. This approach is highly adaptable, allowing it to tokenize complex material across various writing styles and languages.
For example, the sentence "Tôi cần ăn ngon" can be segmented into the tokens "tôi," "cần," "ăn," and "ngon," all of which occur frequently in Vietnamese, so a model trained on a large Vietnamese corpus learns to place boundaries between them.
Sub-word Tokenization Algorithms are text analysis techniques that break text down into smaller units called sub-words instead of whole words. The three primary methods of sub-word tokenization are Byte Pair Encoding (BPE), SentencePiece, and Neural Network Tokenization, each offering its own advantages for natural language processing tasks.
Byte-Pair Encoding (BPE) is a text processing technique that repeatedly merges the most frequent pair of adjacent symbols in a corpus. It starts with single characters as tokens and progressively combines frequently occurring pairs to create sub-word units. This method is particularly beneficial for languages with complex morphological structures.
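The merge loop of BPE can be sketched as follows. The toy word frequencies are an assumption made for illustration, and the sketch learns merges only; applying the learned merges to new text is not shown.

```python
from collections import Counter
from typing import Dict, List, Tuple

Symbols = Tuple[str, ...]


def count_pairs(corpus: Dict[Symbols, int]) -> Counter:
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs: Counter = Counter()
    for symbols, freq in corpus.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(corpus: Dict[Symbols, int], pair: Tuple[str, str]) -> Dict[Symbols, int]:
    """Replace every occurrence of the pair with one merged symbol."""
    merged: Dict[Symbols, int] = {}
    for symbols, freq in corpus.items():
        out: List[str] = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs: Dict[str, int], num_merges: int):
    # Start with single characters as tokens.
    corpus = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(corpus)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        corpus = merge_pair(corpus, best)
    return merges, corpus


# Toy corpus (assumed): word -> frequency.
merges, corpus = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=2)
```

After two merges the frequent ending "est" has become a single sub-word unit, so "newest" is represented as n, e, w, est.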
SentencePiece extends sub-word tokenization by operating directly on raw sentences, treating whitespace as an ordinary symbol so that no language-specific pre-tokenization is required. This makes the method flexible and language-independent, which is especially useful for tasks such as machine translation and speech recognition.
Neural Network Tokenization is a deep learning approach in which neural networks learn to predict token boundaries by analyzing textual patterns. Using Recurrent Neural Networks (RNNs) and Transformer-based models, this technique captures the context and relationships between words, making it particularly suitable for languages with intricate syntax.
2.6.4: Tokenization in Modern NLP models
Tokenization is essential in natural language processing (NLP), where it addresses the challenges posed by diverse languages and extensive vocabularies. It plays a crucial role in the functionality and performance of advanced models such as BERT and GPT, which are at the forefront of NLP technology. These models perform well across many language tasks, and their effectiveness depends in large part on the tokenization methods that prepare human language for them.
Text Embedding
Text embedding refers to the mathematical representation of text data within a continuous vector space, where each text element (a word, phrase, sentence, or document) is assigned a vector of real values. These vectors are designed so that their geometric relationships, such as lengths and angles, correspond to the semantic relationships among the text components.
Text embedding encompasses three essential aspects: Dimensional Reduction, Semantic Representation, and Contextual Information. It transforms high-dimensional text data into a lower-dimensional continuous vector space, which eases data management and improves computational efficiency. Its primary aim is to capture the semantic meaning of text by placing words or phrases with similar meanings close together in the vector space. For example, in an effective embedding space, the vectors for "Sách vở" (books) and "Đại học" (university) are nearer to each other than those for "Sách vở" and "Đường cao tốc" (highway). Additionally, modern techniques such as BERT and ELMo provide contextualized embeddings, in which a word's representation varies with its context within a phrase, giving a better handle on polysemous words.
2.7.2: Mathematical Foundation of Text Embedding
Word2Vec: Developed by Google, Word2Vec is a pioneering model in word embeddings. It uses neural networks to learn word associations from a large corpus of text and represents these words as vectors in a continuous space.
GloVe (Global Vectors for Word Representation): GloVe is a prominent word embedding technique developed at Stanford University. Unlike Word2Vec, which focuses on local context, GloVe analyzes word co-occurrences throughout the entire corpus to capture global statistical information. This allows GloVe to build word representations that reflect corpus-wide relationships between words.
FastText: Developed by Facebook's AI Research lab, FastText extends the Word2Vec model by incorporating sub-word units such as prefixes and suffixes, enabling it to handle out-of-vocabulary words.
BERT (Bidirectional Encoder Representations from Transformers): Developed by Google AI, BERT represents a breakthrough in context-dependent embeddings. It uses a transformer architecture to consider the context of a word in both directions (left and right of the word).
ELMo (Embeddings from Language Models): ELMo, developed by the Allen Institute for AI, offers deep, contextualized word representations. It uses bidirectional LSTMs (Long Short-Term Memory networks) trained with a language-modeling objective to create embeddings that consider the entire sentence context.
2.7.3: Distance and Similarity between Vectors
To determine the most closely related paragraphs from their embedding vectors, similarity measures such as Cosine Similarity, Euclidean Distance, Manhattan Distance, and Jaccard Similarity can be used.
Cosine similarity is a metric that measures the similarity between two non-zero vectors in an inner product space, defined as the cosine of the angle between them. It is calculated by dividing the dot product of the vectors by the product of their magnitudes. Cosine similarity therefore depends only on the angle between the vectors, not on their lengths. The resulting value ranges from -1 to 1, where 1 indicates vectors pointing in the same direction, 0 indicates orthogonal vectors, and -1 indicates opposite vectors. When all vector components are non-negative, the cosine similarity is restricted to the range [0, 1].
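The cosine of the angle between two non-zero vectors can be derived from the Euclidean dot product formula:
\[ \mathbf{A} \cdot \mathbf{B} = \|\mathbf{A}\| \, \|\mathbf{B}\| \cos\theta \]
where \(\mathbf{A} \cdot \mathbf{B}\) is the dot product of the two vectors, and \(\|\mathbf{A}\|\) and \(\|\mathbf{B}\|\) are their magnitudes.
Given two n-dimensional feature vectors A and B, the cosine similarity, cos(θ), is expressed in terms of the dot product and the magnitudes as:
\[ \cos\theta = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \, \|\mathbf{B}\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}} \]
where \(A_i\) and \(B_i\) are the i-th components of the embedding vectors A and B.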
The resulting similarity ranges from -1, meaning exactly opposite, to 1, meaning exactly the same direction, with 0 indicating orthogonality or decorrelation and in-between values indicating intermediate similarity or dissimilarity.
Cosine similarity offers several advantages over other similarity measures. It is scale-invariant, so it is unaffected by the magnitude of the vectors being compared. It is computationally cheap, which suits large-scale machine learning tasks. Because vector length is normalized away, it is also less sensitive to data points whose magnitude differs greatly from the majority. Finally, it works well with high-dimensional data, which makes it valuable in applications such as image processing and text categorization.
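The definition translates directly into a few lines of code. This is a minimal standard-library sketch; the function name and the sample vectors are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the magnitudes."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # scaled copy
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

The first pair differs only in length, which illustrates the scale invariance mentioned above: the similarity is 1 up to floating-point rounding.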
In Euclidean space, the distance between two points is the length of the line segment that connects them, known as the Euclidean distance. It can be calculated from the Cartesian coordinates of the points and is sometimes called the Pythagorean distance because it follows from the Pythagorean theorem.
The distance between two objects that are not points is defined as the shortest distance between pairs of points from each object. Various formulas exist for such distances, including the distance between a point and a line. In advanced mathematics, the concept of distance has been generalized to abstract metric spaces and non-Euclidean distances. In certain statistical and optimization contexts, the square of the Euclidean distance is used instead of the distance itself.
In the two-dimensional Euclidean plane, let point A have coordinates \((x_1, y_1)\) and point B have coordinates \((x_2, y_2)\). The distance between these two points is then:
\[ d(A, B) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \]
Figure 2.32: Euclidean Distance in 2D plane
In n dimensions, point A has coordinates \((a_1, a_2, a_3, \ldots, a_n)\) and point B has coordinates \((b_1, b_2, b_3, \ldots, b_n)\). The distance between these two points is then:
\[ d(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \]
The Euclidean distance can also be written more compactly as the Euclidean norm of the vector difference:
\[ d(A, B) = \|\mathbf{A} - \mathbf{B}\| \]
Figure 2.33: Euclid Distance in 3D space
Taxicab geometry, or Manhattan geometry, defines the distance between two points as the sum of the absolute differences of their Cartesian coordinates. The resulting metric is known as taxicab distance, Manhattan distance, or city block distance.
In the two-dimensional plane, the Manhattan distance between two points \(A(x_1, y_1)\) and \(B(x_2, y_2)\) is the sum of the absolute differences of their coordinates:
\[ d_1(A, B) = |x_1 - x_2| + |y_1 - y_2| \]
Figure 2.34: Manhattan Distance on a 2D Surface
In an n-dimensional space, the Manhattan distance between two points \(A(a_1, a_2, \ldots, a_n)\) and \(B(b_1, b_2, \ldots, b_n)\) is calculated by summing the lengths of the line segment's projections onto each coordinate axis:
\[ d_1(A, B) = \sum_{i=1}^{n} |a_i - b_i| \]
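Both distances can be computed with a few lines of standard-library Python. The function names and the sample points are illustrative.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Length of the straight line segment between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of the absolute coordinate differences between a and b."""
    return sum(abs(x - y) for x, y in zip(a, b))


straight = euclidean_distance((0.0, 0.0), (3.0, 4.0))   # hypotenuse of a 3-4-5 triangle
blocks = manhattan_distance((0.0, 0.0), (3.0, 4.0))     # 3 along x plus 4 along y
```

For the same pair of points, the Manhattan distance is never smaller than the Euclidean distance, since it follows the axes instead of the direct line.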
Figure 2.35: Comparison between Euclidean Distance and Manhattan Distance
K-Nearest Neighbors Algorithm
The K-Nearest Neighbors (KNN) method is an intuitive machine learning technique widely used for classification and regression tasks. It is a supervised learning algorithm that works on the principle of proximity. This section covers the operating principle of KNN, its mathematical foundations, its implementation steps, and its advantages and disadvantages.
KNN operates on the principle that similar data points in the feature space often belong to the same category. To classify or predict a new data point, the method identifies the K closest data points in the training set. It then uses the information from these nearest neighbors to assign a label or predict a value for the new data point.
Figure 2.36: An illustration of KNN algorithm
The first step in the KNN algorithm is choosing an appropriate value for K, which determines the number of nearest neighbors to evaluate. This hyperparameter significantly affects the model's performance. A low K value can lead to overfitting and increased sensitivity to noise, while a high K value may result in underfitting and a strong bias, making the model overly generic.
The second step is distance calculation, in which a distance metric measures the similarity between data points. The most commonly used metric is Euclidean distance, but alternatives such as Manhattan distance, Minkowski distance, and Hamming distance (for categorical data) are also available. The choice of metric depends on the nature of the data and the specific context. Minkowski distance generalizes the Euclidean and Manhattan distances through a hyperparameter \(p\): \( d_p(A, B) = \left( \sum_{i=1}^{n} |a_i - b_i|^p \right)^{1/p} \). When \(p = 1\) it equals the Manhattan distance, and when \(p = 2\) it equals the Euclidean distance.
After calculating the distances between the query vector and all other vectors in the dataset, KNN sorts these distances in ascending order and selects the K nearest neighbors of the query instance.
In the final step, the model determines the output for the query instance according to the task type, classification or regression. For classification, KNN takes a majority vote among the K nearest neighbors and assigns the class that occurs most often. For regression, it calculates the average of the target values of the K nearest neighbors and uses this average as the predicted value.
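The four steps (choose K, compute distances, sort and select, then vote or average) can be sketched as follows. The training points and the labels "drink" and "food" are made-up values for illustration only.

```python
from collections import Counter
from typing import Any, List, Sequence, Tuple

Sample = Tuple[Sequence[float], Any]  # (feature vector, label or target value)


def minkowski(a: Sequence[float], b: Sequence[float], p: float = 2) -> float:
    """p = 1 gives the Manhattan distance, p = 2 the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def k_nearest(train: List[Sample], query: Sequence[float], k: int, p: float = 2) -> List[Sample]:
    """Sort the training samples by distance to the query and keep the first k."""
    return sorted(train, key=lambda sample: minkowski(sample[0], query, p))[:k]


def knn_classify(train: List[Sample], query: Sequence[float], k: int = 3, p: float = 2) -> Any:
    """Majority vote among the k nearest neighbors."""
    labels = [label for _, label in k_nearest(train, query, k, p)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train: List[Sample], query: Sequence[float], k: int = 3, p: float = 2) -> float:
    """Average of the target values of the k nearest neighbors."""
    targets = [target for _, target in k_nearest(train, query, k, p)]
    return sum(targets) / len(targets)


# Hypothetical training data.
labelled = [((1.0, 1.0), "drink"), ((1.2, 0.8), "drink"), ((0.9, 1.1), "drink"),
            ((5.0, 5.0), "food"), ((5.2, 4.9), "food")]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

predicted_label = knn_classify(labelled, (1.1, 1.0), k=3)
predicted_value = knn_regress(numeric, (2.0,), k=3)
```

Note that no model is fitted beforehand: all the work happens at query time, which is the source of both the simplicity and the cost discussed in the next subsection.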
2.8.3: Cons and Pros of KNN algorithm
Pros: KNN is easy to understand and apply. Unlike many other algorithms, KNN does not require a separate training phase, which makes deployment quicker. It is suitable for both classification and regression workloads.
Cons: Calculating the distances to all training data points can be time-consuming, particularly with large datasets. Storing the entire training dataset can also be costly in terms of memory. Furthermore, irrelevant or duplicate features may degrade performance.
DESIGN AND CONSTRUCTION
Mechanical Design
The AMR is specifically designed for restaurant use, focusing on the efficient transportation of light food items, with a maximum payload capacity of 15 kg.
Autonomous Mobile Robots (AMRs) designed for restaurant service require a compact chassis to navigate efficiently in tight spaces. The wheel configuration is vital for maintaining traction and preventing slippage on slippery surfaces such as granite, since slippage causes errors in the vehicle's estimated position.
To enhance operational flexibility, the robot is powered by a rechargeable battery, which is recharged when the robot returns to a pre-arranged charging point.
To keep the design compact enough for flexible movement in an environment with limited space, the AMR must have a width not exceeding 500 mm and a length not exceeding 700 mm.
The touchscreen display will be located on the top food storage shelf, which requires the vehicle's height to remain under 1400 mm. To allow easy access for seated diners, the lowest food storage shelf must be at least 500 mm high, and the highest shelf should not exceed 1000 mm.
The vehicle will be powered by a rechargeable battery system designed to support at least two working shifts daily, ensuring that the Autonomous Mobile Robot (AMR) can operate for a minimum of 6 hours each day.
A well-designed menu interface is crucial for customer ordering, while a distinct interface is necessary for kitchen staff to manage incoming orders efficiently. Both interfaces should be backed by a robust database system that tracks current orders and provides access to historical order data.
1. Robot's Graphical User Interface: gives customers the ability to interact with the robot via a touch screen.
2. Gives customers the ability to interact with the robot via speech.
3. Restaurant's Database: supports the restaurant in managing received orders and order history, the descriptions of dishes on the menu, and information about the restaurant.
4. Restaurant's Web Server: gives staff the ability to interact with the database, manage orders, and edit the menu.
3.1.2: Materials Constituting the Robot Frame
When designing a robot frame, the choice of materials is influenced by factors such as application requirements, weight, strength, stiffness, load-bearing capacity, and cost. Commonly used materials for constructing robot frames include aluminum extrusions and sheet steel.
Aluminum extrusions are an affordable and popular choice for robot frames because they are lightweight, which helps minimize the overall weight of the robot. The metal is easy to machine and provides high strength and stiffness relative to its weight, making it suitable for applications that do not require heavy load-bearing capabilities. Aluminum's corrosion resistance and appearance also contribute to its widespread use in compact and mobile robots, particularly in industrial and service applications.
Sheet steel is well suited to robot bases because of its precision and uniformity, which enable accurate positioning and secure component attachment. Its durability, load-bearing capacity, and resistance to deformation improve the robot's stability, reliability, and lifespan. However, steel requires additional corrosion protection to prevent oxidation in humid environments, and its greater weight compared to aluminum, along with the need for precise machining, can make manufacturing more difficult.
The robot's outer shell will be produced by 3D printing, which allows uniquely designed and intricately shaped parts. This fabrication method not only supports complex designs but also yields a lightweight shell, reducing the robot's overall weight.
Differential drive systems are commonly used in mobile robots, including autonomous vehicles and automated assembly robots. The system has two independently driven wheels, each powered by its own motor, positioned on either side of the robot. By adjusting the rotational speed of each wheel independently, the robot controls its speed and direction, which allows versatile movement and rotation in place.
Figure 3.4: An illustration of differential drive systems
The differential drive system is advantageous because of its simplicity: it involves fewer components than other steering mechanisms, which makes design and maintenance easier. It enables movement in multiple directions (forward, backward, turning, and pivoting) simply by adjusting the wheel speeds. The system handles tight spaces and complex maneuvers well. It is also often cost-effective to implement and maintain, particularly in applications where maneuverability is essential.
Overall, the differential drive mechanism provides a practical solution for achieving versatile movement in mobile robots, suitable for a wide range of applications from robotics competitions to industrial automation.
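The relationship between the two wheel speeds and the motion of the robot body can be sketched with the standard differential-drive kinematic equations, v = r(w_right + w_left)/2 and omega = r(w_right - w_left)/b. The wheel radius and wheel separation used below are placeholder values, not the dimensions of the robot in this thesis.

```python
from typing import Tuple

WHEEL_RADIUS = 0.05      # r in metres (placeholder value)
WHEEL_SEPARATION = 0.40  # b in metres (placeholder value)


def forward_kinematics(w_left: float, w_right: float) -> Tuple[float, float]:
    """Wheel angular speeds (rad/s) -> body linear speed v (m/s) and yaw rate omega (rad/s)."""
    v = WHEEL_RADIUS * (w_right + w_left) / 2.0
    omega = WHEEL_RADIUS * (w_right - w_left) / WHEEL_SEPARATION
    return v, omega


def inverse_kinematics(v: float, omega: float) -> Tuple[float, float]:
    """Desired body motion -> wheel angular speeds (w_left, w_right) in rad/s."""
    w_left = (v - omega * WHEEL_SEPARATION / 2.0) / WHEEL_RADIUS
    w_right = (v + omega * WHEEL_SEPARATION / 2.0) / WHEEL_RADIUS
    return w_left, w_right


straight = forward_kinematics(10.0, 10.0)   # equal speeds: drives straight
pivot = forward_kinematics(-10.0, 10.0)     # opposite speeds: pivots in place
wheels = inverse_kinematics(0.3, 0.5)       # wheel speeds for a gentle left turn
```

Equal wheel speeds give pure translation and opposite speeds give pure rotation, which is the pivoting capability described above.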
There are many ways to arrange the driving (motorized) wheels and the driven wheels, as listed in the following table.
Table 3.3: Arrangement of Propulsion System
No. | Wheels | Arrangement | Typical example
1 | 2 | One steering wheel in the front, one traction wheel in the rear |
3 | 3 | Two-wheel centered differential drive with a third point of contact |
4 | 3 | Two independently driven wheels in the rear, one omnidirectional wheel in the front |
5 | 3 | Two connected traction wheels in the rear, one steered free wheel in the front |
6 | 3 | Two free wheels in the rear, one steered traction wheel in the front |
7 | 3 | Swedish wheels arranged in a triangle; omnidirectional movement is possible |
8 | 4 | Two motorized wheels in the rear, two steered wheels in the front | Cars with rear-wheel drive
9 | 4 | Two motorized wheels in the front, two free wheels in the rear | Cars with front-wheel drive
10 | 4 | Four steered and motorized wheels | Four-wheel steering; Hyperion (CMU)
11 | 4 | Two traction wheels in the rear (or front), two omnidirectional wheels in the front (or rear) |
13 | 4 | Two-wheel differential drive with two additional points of contact |
14 | 4 | Four motorized and steered castor wheels |
15 | 6 | Two motorized and steered wheels aligned in the center, one omnidirectional wheel at each corner |
16 | 6 | Two traction wheels in the center, one omnidirectional wheel at each corner |
After evaluating the robot's tasks and work environment, we chose option 16 for the propulsion system of the Autonomous Mobile Robot (AMR). This design has two central driving wheels and one omnidirectional wheel at each corner, which distributes the payload evenly across all wheels. The arrangement reduces the load on the individual motors, improving operational efficiency and lowering energy consumption.
3.1.3.3: Transmission Between Driving Motors and Wheels
When choosing a transmission system to connect the motor shaft to the working shaft (wheel), three primary options are available: belt drive, chain drive, and gear drive. Each option has its own advantages and disadvantages, which are summarized in the following table.
Table 3.4: Types of Transmission System and Their Pros and Cons
Belt drive
Advantages:
• Ability to transmit rotational motion between two shafts distant from each other
• High damping capability of impact load
• Lower cost (about one-half of the toothed gear's price)
Disadvantages:
• Slippage of 1%–3% between the belt and the pulleys (with the exception of timing belts provided with teeth) and, consequently, an inaccurate gear ratio
• Increased loads on the shafts and bearings (because of pre-tension of the belts)
• Lower practicably transmissible power, usually up to 50 kW, though in rare cases hundreds or even thousands of kW are transmitted
• Relatively low service life of the belts
• Greater distance required between the shafts
• Greater diameters of the pulleys as compared with gear diameters, therefore greater weight and dimensions
Chain drive
Advantages:
• Can be used for both long and short distances
• Does not require initial tension
• Several shafts can be driven from a single chain
• Takes up less space than a belt drive
• Very high efficiency (up to 96%)
Disadvantages:
• Noisy and can cause vibrations
• Cannot be used where slip is the system requirement
• Has less load capacity compared with gear drives
Gear drive
Advantages:
• Gears can be used for small to large powers
• The drive is positive and gives exact velocity ratio
• Maintenance is inexpensive; if properly lubricated, these drives offer the longest service life in comparison to other drives
• Occupies less space, hence compact design
Disadvantages:
• Suitable for small center distances only
• Costly due to special tools required for manufacturing
• Can cause vibrations due to error in manufacturing
• Requires proper lubrication for satisfactory working
After evaluating the advantages and disadvantages of the various transmission systems, it is evident that the AMR demands accurate transmission, smooth operation, and infrequent periodic maintenance. Consequently, a gear transmission system was chosen to transmit power from the motor shaft to the load shaft (wheel).
Stepping (stepper) motors, permanent-magnet (PM) brush-type DC servomotors, and PM brushless DC servomotors are the most common motors used in motion control systems. These three categories can be contrasted as shown in the following table.
Attributes Stepping PM brush PM brushless
Torque High (falls off with speed)
None Position and velocity control
Communication, position, and velocity control
Cleanliness Good Brush dust Good
For service robots, precise speed control and effective obstacle avoidance are essential. Fast motor response is critical, allowing the motor to react quickly to control signals and adjust power during heavy loads or sudden accelerations. This requires the motor to regulate torque to maintain stability and position. Feedback components such as encoders and rotational sensors provide data on the motor's position and speed, enabling the robot's control system to make accurate adjustments.
In conclusion, the PM brush DC servo motor is the best option to drive the robot due to its low price, its precision, and its load capacity.
Calculate and Select DC Motor with Appropriate Parameters
The power on the working shaft can be estimated with the following relation:
\[ P_t = \eta \times P_{ct} \]
𝑃 𝑡 : Power on the working shaft
𝑃 𝑐𝑡 : Required power on the motor shaft
𝜂 : Overall efficiency of the transmission
The diagram above illustrates that the driving wheel experiences three external forces: the gravitational force (𝐹𝑔) due to the object's weight, the frictional force (𝐹𝑓𝑟𝑖𝑐𝑡𝑖𝑜𝑛) generated during movement, and the pulling force (𝐹𝑝𝑢𝑙𝑙) exerted by the motor. Since the Autonomous Mobile Robot (AMR) is designed to travel at a steady speed, the acceleration (𝑎) is taken as zero.
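Applying Newton's second law of motion, with \(\vec{N}\) the normal reaction of the floor:
\[ \vec{F}_{pull} + \vec{F}_{friction} + \vec{F}_{g} + \vec{N} = m\vec{a} \]
Considering the forces on a 2D plane Oxy, with Ox along the slope and Oy normal to it, we have:
\[ Ox:\quad F_{pull} - F_{friction} - m g \sin\alpha = m a \]
\[ Oy:\quad N - m g \cos\alpha = 0 \]
As mentioned earlier, with the acceleration 𝑎 = 0 and \(F_{friction} = \mu N\), the pulling force can be estimated as:
\[ F_{pull} = m g\,(\mu\cos\alpha + \sin\alpha) = 55 \times 9.81 \times (0.05 \times \cos 3^\circ + \sin 3^\circ) = 55.18\ \text{(N)} \]
The power of the pulling force can be estimated as:
\[ P = F_{pull} \times v \]
where:
𝑚 = 55 𝑘𝑔 is the maximum load that acts on the driving wheel (𝑚 𝐴𝑀𝑅 = 40 𝑘𝑔, 𝑚 𝑙𝑜𝑎𝑑 = 15 𝑘𝑔)
𝑣 = 0.4 𝑚/𝑠 is the maximum movement speed of the robot
𝛼 = 3° is the slope that the robot can run over
𝜇 = 0.05 is the coefficient of friction
The maximum power on the working shaft is:
\[ P_t = F_{pull} \times v = 55.18 \times 0.4 = 22.07\ \text{(W)} \]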
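Given that the diameter of the driving wheel is 150 mm, the radius of the wheel is 𝑟 = 0.075 𝑚. The rotating speed of the working shaft can be estimated as:
\[ n = \frac{v}{2\pi r} = \frac{0.4}{2 \times 0.075 \times \pi} = 0.849\ \text{(RPS)} = 50.94\ \text{(RPM)} \]
Torque on the working shaft:
\[ T = F_{pull} \times r = 55.18 \times 0.075 \approx 4.14\ \text{(N·m)} \]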
In a planetary gear transmission system using rolling bearings, the efficiency coefficients of the gear pair and the rolling bearings are 𝜂 𝑏𝑟 = 0.97 and 𝜂 𝑜𝑙 = 0.995, respectively. With a transmission ratio of 𝑢 = 5, the overall transmission efficiency is 𝜂 = 𝜂 𝑏𝑟 × 𝜂 𝑜𝑙 = 0.97 × 0.995 = 0.965. Consequently, the power required at the motor shaft is:
\[ P_{ct} = \frac{P_t}{\eta} = \frac{22.07}{0.965} = 22.87\ \text{(W)} \]
The required rotating speed on the motor shaft is:
\[ n_{ct} = u \times n = 5 \times 50.94 = 254.7\ \text{(RPM)} \]
Based on these two requirements:
We can choose GP36-3650 DC Servo motor
• Included planetary gearbox with 𝑢 𝑔𝑒𝑎𝑟2 = 14 → 𝑛 𝑔𝑒𝑎𝑟2 = 250 (rpm) < 𝑛 𝑐𝑡
• Output shaft diameter of included planetary gearbox 8mm
Figure 3.8: GP36-3650 DC Servo Motor
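The sizing steps above (pulling force, shaft power, wheel speed, and the motor-side requirements) can be reproduced with a short script. This is a minimal sketch that only re-evaluates the formulas with the values stated in the text; the function and key names are illustrative and not part of the thesis software.

```python
import math


def size_drive(m=55.0, mu=0.05, slope_deg=3.0, v=0.4, r=0.075,
               eta=0.97 * 0.995, u=5, g=9.81):
    """Drive requirements for steady motion (a = 0) up a slope.

    Defaults are the values quoted in the text: 55 kg total mass,
    friction coefficient 0.05, 3 degree slope, 0.4 m/s, 75 mm wheel radius.
    """
    alpha = math.radians(slope_deg)
    f_pull = m * g * (mu * math.cos(alpha) + math.sin(alpha))  # N
    p_wheel = f_pull * v                                       # W, working shaft
    n_wheel = v / (2 * math.pi * r) * 60                       # rpm, working shaft
    return {
        "f_pull_N": f_pull,
        "p_wheel_W": p_wheel,
        "n_wheel_rpm": n_wheel,
        "p_motor_W": p_wheel / eta,   # power required at the motor shaft
        "n_motor_rpm": u * n_wheel,   # speed required ahead of the u:1 stage
    }


def wheel_speed_from_gearbox(n_in_rpm=250.0, u=5, r=0.075):
    """Linear speed (m/s) when a shaft turning at n_in_rpm feeds the u:1 stage."""
    n_wheel = n_in_rpm / u
    return n_wheel / 60 * 2 * math.pi * r


if __name__ == "__main__":
    for name, value in size_drive().items():
        print(f"{name}: {value:.2f}")
    print(f"v_max_m_per_s: {wheel_speed_from_gearbox():.3f}")
```

Evaluated at full precision, the wheel speed comes out marginally below the rounded 50.94 RPM used in the text because 0.849 RPS is itself a rounded value. With the 250 rpm gearbox output feeding the 5:1 stage, the resulting top speed is about 0.39 m/s, slightly under the 0.4 m/s design target.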
Let I: output shaft of planetary gearbox which included with DC servo motor
Let II: loading shaft that attached to the wheel
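\[ u_{real} = u_{gear1} \times u_{gear2} = 14 \times 5 = 70 \]
Rotational speed of every shaft:
\[ n_{I} = 250\ \text{(rpm)}, \qquad n_{II} = \frac{n_{I}}{5} = 50\ \text{(rpm)} \]
Power on every shaft:
Torque on shaft II:
\[ T_{II} = \frac{\ldots}{50} = 6171.21\ \text{(N.mm)} \]
Maximum velocity of AMR: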
For the gear transmission system, the PX57N005 planetary gearbox is selected due to its specifications aligning with the requirements:
Modelling with SolidWorks
Figure 3.11: Profile projection and orthographic projection of AMR’s base
Figure 3.12: Overview of modelled AMR
Constructing Mechanical Basis
Electrical System
A well-organized electrical system is essential for the efficient operation of a robot, ensuring stable performance and minimizing the risk of component failure. With an adequate power supply, both actuators and sensors can operate at their best, contributing to the overall functionality of the robotic system.
The compact black box houses all electrical components and power sources, ensuring adequate voltage and power for microcontrollers and sensors while providing a stable power source for motor operation.
The GS GT7A-H battery is a maintenance-free lead-acid battery designed for motorcycles and small vehicles. It offers consistent power output and a long service life. Its compact design suits space-restricted applications, and its robust construction can handle the demands of daily use.
The compact GS GT7A-H battery fits into the robot's electrical compartment. Its 7 Ah capacity allows it to power both motors for extended periods.
The battery provides continuous power over extended periods without the need for frequent recharging, which makes it a good fit for applications that demand a consistent and long-lasting energy source.
The Waveshare UPS Module for Jetson Nano serves as an Uninterruptible Power Supply, ensuring continuous operation by supplying power from both a main source and a backup battery. This setup prevents sudden power loss, allowing the Jetson Nano to keep running while the robot is in continuous motion. By using four batteries, the module keeps the Jetson Nano powered when the main supply is interrupted.
The module features a real-time monitoring screen that shows the remaining battery power as well as the Jetson Nano's CPU, GPU, and RAM usage. This allows continuous performance monitoring and effective management of power consumption.
Table 3.7: Specifications of Waveshare UPS module
The Jetson Nano, developed by NVIDIA, is an affordable and powerful single-board computer tailored for AI and embedded computing applications. Its compact design delivers high-performance computing, making it well suited to AI projects, robotics, and educational uses.
In this project, the Jetson Nano is used to run ROS 2 and execute AI tasks for the robot. Serving as the link between the computer server and the motor microcontrollers, the Jetson Nano coordinates the robot's operation. Its capacity to process AI algorithms and handle real-time communication is vital for the robot's autonomous functionality and decision-making abilities.
Figure 3.17: NVIDIA Jetson Nano Model B01
Table 3.8: Specifications of Jetson Nano B01
Model NVIDIA Jetson Nano Developer Kit B01
CPU Quad-core ARM A57 @ 1.43 GHz
Memory 4 GB 64-bit LPDDR4 25.6 GB/s
Camera 2x MIPI CSI-2 DPHY lanes
Display HDMI and display port
Mechanical 69 mm x 45 mm, 260-pin edge connector
STM32 is a series of microcontrollers manufactured by STMicroelectronics, known for its flexibility, high performance, and diverse features that cater to a range of applications in electronics, particularly robotics and embedded systems.
In this project, we use the STM32F401 microcontroller to control the motors, while the STM32F103 reads values from the BNO085 sensor. Both microcontrollers communicate with the Jetson Nano through a USB CDC (Communications Device Class) virtual serial port.
The STM32F103C8T6 Blue Pill ARM Cortex-M3 development kit is a popular choice for ARM research, thanks to its low cost and simple programming through the Blue Pill bootloader. It is also known for its build quality and durability.
• Power Supply: 5VDC via Micro USB port, converted to 3.3VDC through a power IC for the main microcontroller
• Integrated 32KHz crystal oscillator for RTC applications
• Full GPIO and communication interfaces: CAN, I2C, SPI, UART, USB
• Integrated status LED, PC13 LED, and Reset button
The STM32F401CCU6 development kit features the STM32F401CCU6 microcontroller from the F4 series, delivering higher performance than the F1 series at a competitive price. The kit offers faster processing and hardware floating-point calculation, and its IO pins are easily accessible for integration and development.
• Power supply can be provided by a coin cell battery
• On-board 32.768KHz and 25MHz crystals
• V+ / V- can connect external ADC reference voltage, default to power supply voltage
• Pin B2 can be configured for booting or normal GPIO operation
3.5.4: BTS7960 43A High-power DC Motor Driver
The BTS7960 43A High-Power Motor Driver is capable of driving a single DC motor with a maximum current capacity of 43 A. To ensure safe operation and connection to the microcontroller, the circuit can be enhanced with an additional signal level conversion buffer IC, the 74HC244.
Specifications of BTS7960 Motor Driver
• Automatic Shutdown on Low Voltage: The circuit automatically shuts down to prevent motor control at low voltages. It resumes operation when the voltage exceeds 5.5 V
• Overheat Protection: Integrated thermal sensor provides protection against overheating, cutting off the output when excessive heat is detected
The RPLIDAR A1M8 360° Laser Range Scanner, produced by SLAMTEC, is a distance sensor suited to applications such as distance measurement, obstacle detection, and mapping in vehicles, autonomous robots, and security systems. It is known for its stability and accuracy.
The RPLIDAR A1M8 uses UART communication for integration with microcontrollers, embedded systems, and PCs through a USB-UART converter. It offers a scanning range of 0.15 to 12 meters, a rotation speed of 5.5 Hz, and a sampling frequency of up to 8000 points per second.
• Model: RPLIDAR A1M8 360° Laser Range Scanner
• Motor Driver: Brushed DC Motor
• Output: UART Serial (3.3 voltage level)
• Range Resolution:
o ≤1% of the range (≤12m)
o ≤2% of the range (12~16m)
• Accuracy:
o 2% of the range (3~5m)
o 2.5% of the range (5~12m)
PI controller
To achieve stable motor operation at the desired speed, it is essential to implement a suitable PID controller for each motor. Initially, running the motor at maximum PWM allows us to gather crucial parameters, including the motor's maximum speed and the time required to reach that speed.
To obtain the necessary data, we plot the speed graph.
(Professor Chuong's chart plotting app)
The graph reveals the motor's maximum speed and the time required to reach it. By combining this data with the formulas presented in Chapter 2, we can compile a parameter table.
Another point to note is that, since the initial setup specifies that the PWM operates in a range from 0 to 1000, we set ∆MV = 1000 for both motors.
Table 3.9: The PI parameters for two motors
Based on the calculations of the parameters and the theory of DC motor control outlined in Chapter 2, the team has derived the transfer function of the motor.
The delay time, denoted as 𝜃, is the interval between the application of a control signal and the first observable response from the system. Specifically, if the system's input is changed at time t = 0, 𝜃 is the duration before any reaction is detected. In our case, the delay is minimal, approaching zero, so we set 𝜃 to 0.
Our PID tuning uses the IMC (Internal Model Control) approach, which is designed for FOPDT (First-Order Plus Dead Time) models. The FOPDT model offers a compact representation of the system dynamics from which the controller parameters are derived.
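For reference, the general FOPDT form that this kind of tuning assumes can be written as follows. The symbols K (process gain), τ (time constant) and ∆PV (change in measured speed for a PWM step ∆MV) are the conventional names and are assumptions here, since the Chapter 2 formulas are not reproduced in this section.

```latex
G(s) = \frac{K\, e^{-\theta s}}{\tau s + 1},
\qquad K = \frac{\Delta PV}{\Delta MV}
\qquad\Longrightarrow\qquad
G(s)\Big|_{\theta = 0} = \frac{K}{\tau s + 1}
```

With 𝜃 set to 0, the model reduces to a pure first-order lag, so only the gain and the time constant need to be read from the step-response graph.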
Figure 3.25: IMC-Based PID Controller
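The sketch below illustrates one common IMC tuning rule for a PI controller on an FOPDT model, followed by a small closed-loop simulation. It is a hedged illustration, not the team's firmware: the rule Kc = τ / (K(λ + θ)), Ti = τ is the textbook IMC-PI form, and the numeric values (K, τ, λ) are hypothetical placeholders rather than the parameters of Table 3.9.

```python
def imc_pi_gains(K, tau, theta, lam):
    """Textbook IMC-PI rule for an FOPDT model K*exp(-theta*s)/(tau*s + 1).

    lam is the desired closed-loop time constant (a tuning choice).
    Returns (Kp, Ki) for u = Kp*e + Ki*integral(e).
    """
    kc = tau / (K * (lam + theta))
    ti = tau
    return kc, kc / ti


def simulate_step(K, tau, kp, ki, setpoint, t_end, dt=1e-4):
    """Euler simulation of a first-order plant (theta = 0) under PI control."""
    y = 0.0
    integral = 0.0
    for _ in range(int(round(t_end / dt))):
        error = setpoint - y
        integral += error * dt
        u = kp * error + ki * integral   # controller output (e.g. PWM counts)
        y += dt * (K * u - y) / tau      # plant: tau*dy/dt + y = K*u
    return y


if __name__ == "__main__":
    # Hypothetical plant: 3.5 rpm per PWM count, 0.1 s time constant.
    K, tau, lam = 3.5, 0.1, 0.05
    kp, ki = imc_pi_gains(K, tau, theta=0.0, lam=lam)
    print("Kp, Ki:", kp, ki)
    print("speed after 10*lam:", simulate_step(K, tau, kp, ki, 1000.0, 10 * lam))
```

With θ = 0 this rule cancels the plant pole, so the closed loop behaves like a first-order system with time constant λ; a smaller λ gives a faster but more aggressive response.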
With all the parameters calculated earlier, we substitute them into the program, run a test, and redraw the motor speed response chart.
(Professor Chuong's chart plotting app)
Figure 3.26: No-load speed response
We then assess the motor's speed response under load while the robot executes its tasks, which produces the following chart.
(Professor Chuong's chart plotting app)
Although the load affects the smoothness of the speed, the response remains quick and comparable to operation without load. The controlled speed exhibits several spikes, but the speed error is small and occurs mainly at 500 rpm, which is below the robot's minimum operating speed of 1000 rpm.
Mapping
To enable effective navigation, the next step is to create a map of the robot's environment. As discussed in Chapter 2, we use SLAM Toolbox for mapping, which provides the necessary input for Nav2. After the mapping process is complete, the map is updated to accurately represent the actual space.
Nav2 and AMCL
After creating a static map with SLAM Toolbox, the next step is to use Adaptive Monte Carlo Localization (AMCL) to determine the robot's position within the environment. AMCL compares the robot's sensor data against the map, allowing continuous updates and refinements of its estimated location. By combining the static map with AMCL, we can use Nav2 for robot localization and navigation.
Nav2 incorporates AMCL for tracking the robot's pose and employs Dijkstra's algorithm for path planning. Dijkstra's algorithm typically expands more nodes than A*, and therefore needs more computation, but it guarantees a lowest-cost path on the map without depending on a heuristic. This makes it a reasonable choice where path optimality matters more than planning time, and it allows the robot to navigate accurately within complex mapped environments.
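To make the planning step concrete, the following is a minimal sketch of Dijkstra's algorithm on a small occupancy grid. It is a simplified stand-in, not Nav2's planner: it assumes a 4-connected grid with unit step cost, whereas a real costmap carries graded cell costs. The function name and grid are illustrative.

```python
import heapq


def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on a grid where 0 = free and 1 = obstacle.

    Returns the list of (row, col) cells from start to goal, or None
    when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    demo = [
        [0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0],
    ]
    print(dijkstra_grid(demo, (0, 0), (3, 0)))
```

Replacing the unit step cost with the cost of the destination cell would bring the sketch closer to planning on a costmap, where cells near obstacles are more expensive to traverse.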
Retrieval-Augmented Generation
The rise of large language models (LLMs) such as ChatGPT, Llama 2, Qwen, and Mistral has brought attention to the issue of hallucinations, where these models produce fluent and seemingly logical sentences that are, in fact, inaccurate. The RAG (Retrieval-Augmented Generation) technique was developed to improve the reliability of generated content.
Extracting specific information from proprietary documentation, such as technical manuals or user guides, can be extremely challenging for businesses. Using large language models to sift through extensive and complex documents often feels like searching for a needle in a haystack.
Retrieval-Augmented Generation (RAG) is an approach designed to tackle information overload in large language models (LLMs). By indexing each paragraph of a document, RAG retrieves the most relevant sections in response to a query. This selective provision of information both shortens the input for the LLM and improves the quality of the generated responses.
Text embedding transforms paragraphs into vectors, preserving their essential characteristics. These embeddings are vital for algorithms to understand meaning and context, helping address the challenges posed by the vast diversity and complexity of human language, including nuances like irony, sarcasm, and context-dependent meanings.
OpenAI's Ada 002 has been a popular choice for text embeddings due to its ease of use and early availability. However, recent evaluations on the MTEB leaderboards indicate that Ada is no longer the top option for embedding text, as newer models have overtaken it.
Figure 3.30: Top 10 text embedding model on MTEB leaderboards
The Ada-V2 Text embedding model is a suitable choice for this study due to its support for multiple languages, including Vietnamese. The model transforms Vietnamese text into vectors that preserve its characteristics, enabling efficient retrieval and analysis within the system.
An important note is that the Ada-V2 Text embedding model returns a vector with 1536 dimensions, as illustrated in the example below.
Figure 3.31: An embedding vector generated from text
3.9.2: Pinecone Vector Database – The Retriever
The retriever extracts relevant information that is then supplied to the large language model (LLM). By integrating external knowledge sources, retrieval-augmented generation (RAG) extends the knowledge available to the LLM. Depending on the semantic requirements, the retriever can use various tools such as vector databases, graph databases, or traditional SQL databases.
There are many vector database tools that can be integrated into a RAG system. The image below describes the types and compares the advantages of some popular vector databases.
Figure 3.32: Overview of Vector Databases
In this project, we use Pinecone's vector database to store and retrieve content as embedding vectors. This approach lets us query relevant information quickly and accelerates the querying process compared to traditional databases.
After using the Ada-V2 Text Embedding model to embed text paragraphs, we obtain 1536-dimensional embedding vectors. To manage these vectors, it is essential to establish a vector database. Furthermore, we use cosine similarity as the metric for querying. This vector database serves as a repository for all embedded vectors.
The dataset provides comprehensive information about the restaurant, including detailed dish descriptions, nutritional information for each dish, pricing, contact details, and the history of the establishment. This data allows the language model to retrieve relevant restaurant information efficiently.
Figure 3.34: First look of the vector database
Figure 3.35: Chunks of text and their embedding vector
Converting text paragraphs into vectors enables us to query and select paragraphs with similar content effectively. Our vector database uses the KNN algorithm together with the cosine similarity formula to measure the closeness between the query vector and the stored embedding vectors. It then returns the K vectors that are closest to the query vector.
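The retrieval step described above can be sketched as a brute-force K-nearest-neighbour search with cosine similarity. This is only an illustration of the idea: a hosted vector database such as Pinecone performs the search internally and at scale, and the three-dimensional toy vectors and function names below are assumptions standing in for the 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=3):
    """Return the k (score, text) pairs most similar to the query.

    store is a list of (text, vector) pairs held in memory.
    """
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional vectors; real embeddings would have 1536 dimensions.
    store = [
        ("roast duck", [1.0, 0.0, 0.0]),
        ("iced tea", [0.0, 1.0, 0.0]),
        ("duck noodle soup", [1.0, 1.0, 0.0]),
    ]
    for score, text in top_k([1.0, 0.0, 0.0], store, k=2):
        print(f"{score:.3f}  {text}")
```

Because cosine similarity depends only on the angle between vectors, paragraphs of different lengths remain comparable as long as their embeddings point in similar directions.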
Below is an experimental screenshot showing the query “món thịt vịt” ("duck dishes") retrieving 3 text paragraphs with related content from the vector database.
Generative Pre-trained Transformers (GPT) are neural network models built on the transformer architecture and represent a major advance in artificial intelligence (AI). These models power generative AI applications such as ChatGPT, allowing the creation of human-like language and other content. Organizations across various sectors use GPT models for applications like Q&A bots, text summarization, content generation, and enhanced search.
The image below showcases the ranking of large language models, highlighting that OpenAI's GPT models significantly outperform other LLMs.
This study focuses on utilizing the GPT-3.5 Turbo Large Language Model as the foundation for a Retrieval-Augmented Generation system, chosen for its cost-effectiveness and capability to support multiple languages, including Vietnamese.
Retrieval-Augmented Generation Pipeline can be illustrated in the following figure:
Figure 3.39: Overview of RAG pipeline
To create a vector database for a restaurant, begin by adding detailed information to an SQL database, including dish introductions, ingredients, and nutritional values. Optionally, include additional restaurant details such as name, address, style, and development history. Next, use the Ada-V2 Text Embedding model to convert these text chunks into 1536-dimensional embedding vectors, which are stored in the vector database.
Figure 3.40: Add new data into vector database
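The order of operations in the pipeline (embed the question, retrieve the closest chunks, build a prompt, generate an answer) can be sketched as below. All names are hypothetical: `embed`, `retrieve`, and `generate` are placeholders for the embedding model, the vector database query, and the LLM call, and the prompt wording is an assumption rather than the template used in this project.

```python
def build_prompt(question, chunks):
    """Place the retrieved chunks ahead of the question as context."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def rag_answer(question, embed, retrieve, generate, k=3):
    """Run one retrieval-augmented generation round.

    embed(text) -> vector, retrieve(vector, k) -> list of text chunks,
    generate(prompt) -> answer string. All three are injected so the
    sketch stays independent of any particular service.
    """
    query_vector = embed(question)
    chunks = retrieve(query_vector, k)
    return generate(build_prompt(question, chunks))


if __name__ == "__main__":
    # Stand-in components so the sketch runs without external services.
    def fake_embed(text):
        return [float(len(text))]

    def fake_retrieve(vector, k):
        return ["Roast duck is served with rice."][:k]

    def fake_generate(prompt):
        return prompt

    print(rag_answer("Which duck dishes are available?",
                     fake_embed, fake_retrieve, fake_generate))
```

Keeping the three stages behind plain callables makes it straightforward to swap the embedding model, the vector store, or the LLM without touching the pipeline itself.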
Speech-to-Speech Conversation
The RAG model takes a string as input and generates a corresponding output string. To use it in conversation, we only need to convert the spoken input into text and then transform the output text back into speech.
SpeechRecognition is a free, open-source Python library for speech recognition that supports multiple engines and APIs. It simplifies the conversion of spoken language into text. Key features include support for various audio formats and integration with popular speech recognition services.
• Multiple Recognizers: Supports multiple speech recognition engines and APIs, including Google Web Speech API, Wit.ai, Microsoft Bing Voice Recognition, IBM Speech to Text, etc
• Multiple Languages: Supports a wide range of languages and accents, which includes Vietnamese
• Microphone Input: Can capture audio from a microphone
• Audio File Input: Can recognize speech from audio files in various formats (WAV, AIFF, FLAC, etc.)
• Customization: Offers various parameters to customize the recognition process
• Error Handling: Provides robust error handling to deal with network issues, recognition failures, and more
To improve speech recognition and text conversion, it is essential to filter and adjust the input audio. The following waveform diagrams compare the directly recorded audio file with the audio file after noise filtering.
Figure 3.46: Waveform diagram of recorded audio
Figure 3.47: Waveform diagram of recorded audio after noise filtering
3.10.3: Text-to-speech
gTTS, which stands for Google Text-to-Speech, is a Python library and CLI tool that interfaces with the Google Translate text-to-speech API. It allows easy conversion of text to speech and supports multiple languages and accents.
• Multiple Languages: Supports a wide range of languages and accents, which includes Vietnamese
• Easy to Use: Simple and intuitive API
• Flexible Input: Can convert text from strings, files, or URLs to speech
• Output Options: Save the speech to an audio file or play it directly
3.10.4: Flowchart of Speech-to-speech System
Figure 3.48: Speech-to-speech Flowchart
Transmit Orders
The Message Queuing Telemetry Transport Protocol (MQTT) is a standardized messaging protocol designed for efficient machine-to-machine communication. It is particularly suited for smart sensors, wearables, and other Internet of Things (IoT) devices that need to exchange data over networks with limited bandwidth. With MQTT, these devices can set up data transfers easily, enabling communication between devices and the cloud.
The MQTT protocol has become a standard for IoT data transport because it offers the following advantages:
• Lightweight and efficient: MQTT implementation on IoT devices uses few resources, making it suitable for tiny microcontrollers. A simple MQTT control message, for example, can consist of only two data bytes. MQTT message headers are also minimal, allowing you to optimize network bandwidth.
• Scalable: MQTT requires minimal code and consumes little power during operation. Its design supports communication with millions of IoT devices, making it suitable for large-scale implementations.
• Secure: MQTT enables developers to encrypt messages and authenticate devices and users with current protocols such as OAuth, TLS 1.3, customer-managed certificates, and more.
• Well-supported: Several languages, including Python, provide considerable support for MQTT protocol implementation. As a result, developers can integrate it rapidly into any form of application.
Figure 3.49: Transfer orders from bot to server
This study uses the MQTT protocol to transfer customer-selected orders from the robot's touch screen to the restaurant server, where the data is stored in a SQL database. This system enables restaurant staff to manage order information for each table.
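A minimal sketch of the order hand-off is shown below. It covers only the payload encoding on the robot side and the storage step on the server side; the topic name, JSON fields, and table schema are hypothetical, and `sqlite3` is used as a stand-in for the restaurant's SQL database so the example stays self-contained. In a deployment, the encoded payload would be published with an MQTT client library and `store_order` would be called from the server's message callback.

```python
import json
import sqlite3

ORDER_TOPIC = "restaurant/orders"  # hypothetical topic name


def encode_order(table_id, items):
    """Robot side: serialize an order as a UTF-8 JSON payload for MQTT."""
    order = {"table": table_id, "items": items}
    return json.dumps(order, ensure_ascii=False).encode("utf-8")


def store_order(conn, payload):
    """Server side: decode a received payload and insert one row per dish.

    Returns the number of rows written.
    """
    order = json.loads(payload.decode("utf-8"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(table_id INTEGER, dish TEXT, quantity INTEGER)"
    )
    rows = [(order["table"], item["dish"], item["quantity"])
            for item in order["items"]]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    payload = encode_order(5, [{"dish": "Vịt quay", "quantity": 2},
                               {"dish": "Trà đá", "quantity": 1}])
    conn = sqlite3.connect(":memory:")
    print(store_order(conn, payload), "rows stored for topic", ORDER_TOPIC)
    print(conn.execute("SELECT * FROM orders").fetchall())
```

Using parameterized inserts keeps dish names containing quotes or Vietnamese diacritics safe to store, and one row per dish makes per-table queries straightforward for the staff interface.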
Graphical User Interface on Touch Screen
After the user says the keyword "Hey Siri", the service robot starts recording for 5 seconds. The service robot then responds to the user's question.
Figure 3.51: Service Robot is Listening
Figure 3.52: Service Robot is Speaking
The service robot features a menu interface that allows users to add or remove items from their cart. Once selections are made, users can confirm their choices by pressing the green button, or cancel and return to the screensaver by pressing the red button.
Once the customer finalizes their selections and approves the created order, they are directed to a confirmation screen. Here, they can review the items included in their order before deciding to confirm or cancel it.
Server and Website for Restaurants
3.13.1: About HTML, CSS, PHP, MySQL and JAVASCRIPT
HTML, or "Hypertext Markup Language," is a fundamental markup language for creating and structuring web pages on the Internet. Originating from the early development of the World Wide Web by Tim Berners-Lee and his team at CERN, HTML allows web developers to use tags and markup elements to describe various types of content, such as text, images, links, and multimedia.
An HTML document is composed of various tags and text, where each tag defines a specific section, like a heading, paragraph, or image. Tags may include attributes that offer extra details about elements, such as an image's size or a link's purpose.
HTML is a markup language, distinct from programming languages like JavaScript or Python, as it primarily describes the structure and content of a web page rather than managing complex behaviors or functionalities.
HTML serves as the foundation for creating web pages, but for more intricate websites and dynamic functionality, developers typically add CSS (Cascading Style Sheets) for layout control and JavaScript for interactivity. Since its introduction in 1991, HTML has undergone significant updates and standard revisions, solidifying its role as a core platform for contemporary web applications.
Markup Language: HTML is a markup language, not a programming language. It uses tags to mark elements in a document to indicate how they are displayed or behave on a web page.
Structure and Semantics: HTML provides a hierarchical structure for content. Structural elements (for example headings, paragraphs, lists, and sectioning containers) organize the layout and also provide semantic meaning, enabling browsers, search engines, and accessibility tools to understand the function of each component.
Multimedia: HTML allows embedding multimedia content such as images, audio, video, and interactive content using <canvas> and <svg>.
Hyperlinks: Hyperlinks are fundamental in HTML. They are created with the <a> element, which allows navigation between different web pages and resources on the internet.
Forms and Input: HTML provides form elements such as <form>, <input>, <textarea>, <select>, and <button> to collect user input and interact with the web.
Integration with CSS and JavaScript: Although HTML defines the structure and content of a web page, it works together with CSS for styling and JavaScript for interactivity.
CSS, or "Cascading Style Sheets," is the language used to control the presentation of HTML documents on the web. It governs the appearance of HTML elements, including text, color, width, height, spacing, and overall layout. CSS improves the user experience and supports the creation of visually appealing and effective interfaces.
Styling HTML elements: CSS allows web developers to define styles such as color, font, spacing, and layout for HTML elements.
Separation of content and presentation: CSS keeps HTML structure apart from styling, which simplifies the maintenance and updating of web pages. This approach also improves accessibility and allows content to be displayed in different ways across devices and screen sizes.
Selectors and declarations: CSS uses selectors to identify HTML elements and declarations to define their styling. Selectors can target elements by type (such as p, div, h1), class, ID, attributes, or their relationships with other elements, including parent, child, and sibling connections.
Cascading and specificity: Cascading refers to the hierarchical application of styles, where styles can be inherited from parent elements or overridden by more specific rules. CSS uses specificity to resolve conflicts when multiple rules target the same element, ensuring the correct style is applied.
Box model: The CSS box model defines the dimensions and spacing of HTML elements through properties such as width, height, padding, border, and margin. Understanding these properties is necessary for controlling the layout and appearance of elements on a web page.
CSS enables developers to utilize media queries to implement varying styles tailored to specific device or viewport features, including screen width, orientation (landscape or portrait), and device type (screen or print).
CSS offers advanced layout modules like Flexbox and Grid Layouts, enabling developers to create responsive and intricate designs with precise control over element placement and alignment.
Animations and transitions: CSS supports animations and transitions to create interactive effects such as fading, sliding, and transforms (e.g., scaling, rotation) without relying on JavaScript.
MySQL is pronounced "MY-ES-KYOO-EL" [maɪˌɛsˌkjuːˈɛl], although many people say "my sequel." Developed by the Swedish company MySQL AB in 1994, MySQL was acquired by Sun Microsystems in 2008. Oracle Corporation then purchased Sun Microsystems in 2010, which placed MySQL under Oracle's ownership.
MySQL is an open-source relational database management system (RDBMS) that functions on a client-server model, enabling users to create and manage databases while handling the relationships between them.
MySQL organizes data in tables composed of rows and columns. It uses Structured Query Language (SQL), the industry-standard language for database management and access.
Analyze trends and user behavior
Software users today expect intuitive, navigable, and modern-looking interfaces. This shift in user behavior presents a significant challenge for developers, who must continuously adapt to emerging trends and ensure that their software both meets user expectations and improves productivity in the working environment.
This research examines the restaurant environment with a particular emphasis on Point of Sale (POS) systems, which are essential for managing orders, table reservations, payments, and revenue data. POS machines are gaining popularity among small and medium-sized businesses, as they support efficient order management, accurate payment processing, and effective product oversight.
When implementing a POS (Point of Sale) system in a restaurant, it is crucial to consider and optimize several key factors to enhance efficiency and improve user experience.
− Payment options: A modern POS system should offer a variety of payment options, including cash, credit cards, debit cards, e-wallets, and other online payment methods. This diversity gives customers convenience and flexibility and lets businesses cater to different payment preferences.
− Product and inventory management: The POS system must allow effective management of product information, including inventory quantities, prices, and related information. Accurate inventory management helps prevent shortages and other stock-keeping problems.
− Order management: The POS system needs to support order management from the moment a customer places an order until payment and delivery. This feature makes tracking and processing orders easy and fast.
− Customer management: Effective customer management enhances relationships and boosts loyalty by recording and analyzing customer information. A POS system offers insight into purchase history, incentives, and contact details, enabling employees to deliver better service.
− Reporting and analytics: A robust POS system must offer comprehensive reporting and analytics, with insights on revenue, top-selling products, customer demographics, and other key business metrics. This functionality is essential for assessing business performance and formulating growth strategies.
− Information security: POS systems must implement stringent security measures to safeguard both customers' personal data and business information. Compliance with established security standards and the use of encryption are essential to mitigate the risk of cyber-attacks.
− Interface design: A user-friendly interface is crucial for attracting and retaining users, as it shortens employee training and reduces operational errors. Investing in interface design not only streamlines daily tasks but also improves the user experience through features such as dish filters, menu suggestions, and instant customer feedback, all of which boost customer satisfaction and loyalty.
Model and deploy HTML on MySQL server
To optimize robot products for restaurant environments, software must be tailored to manage the robot's operational status and integrate with the restaurant's management system. An effective solution oversees the robot's performance while also offering payment processing, invoicing, and revenue tracking, which streamlines restaurant operations and improves the overall customer experience.
The software will feature a modern, user-friendly interface designed for simplicity. Using HTML as the main tool for interface development provides an intuitive experience while adhering to modern web standards for compatibility and user satisfaction.
MySQL was selected as the database for its simplicity, reliability, and efficient data management. It will store critical information such as robot status and payment and invoice details, and it will support revenue statistics, enabling restaurants to manage their operations more effectively and systematically.
HTML is essential for developing user interfaces and supports a user-friendly experience. Its flexibility and compatibility across devices and browsers allow users to access and use the software from any location.
Integrating HTML, MySQL, and PHP lowers software development and maintenance costs while creating a flexible system that is easy to expand. This approach allows the product to adapt to evolving business and technological needs within the culinary and service sectors.
The dishes are grouped by category, and each category is shown on its own page.
Figure 3.57: Dishes are classified on separate pages
After the customer chooses an item and its quantity, the dish is added to the shopping cart.
Users can modify the quantity of dishes or remove items directly from the shopping cart interface. This section shows the quantity and total price for each dish, as well as the overall total quantity of dishes in the cart.
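The cart behaviour described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation (which is built with HTML and PHP): the `Cart` and `CartLine` classes, the dish names, and the prices are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CartLine:
    name: str
    unit_price: int  # price in the smallest currency unit, to avoid float rounding
    quantity: int

    @property
    def line_total(self) -> int:
        return self.unit_price * self.quantity


@dataclass
class Cart:
    lines: dict = field(default_factory=dict)  # dish name -> CartLine

    def add(self, name: str, unit_price: int, quantity: int = 1) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if name in self.lines:
            self.lines[name].quantity += quantity
        else:
            self.lines[name] = CartLine(name, unit_price, quantity)

    def set_quantity(self, name: str, quantity: int) -> None:
        # Setting the quantity to zero (or less) removes the dish from the cart.
        if quantity <= 0:
            self.lines.pop(name, None)
        else:
            self.lines[name].quantity = quantity

    def remove(self, name: str) -> None:
        self.lines.pop(name, None)

    @property
    def total_quantity(self) -> int:
        return sum(line.quantity for line in self.lines.values())

    @property
    def total_price(self) -> int:
        return sum(line.line_total for line in self.lines.values())


cart = Cart()
cart.add("Pho", 45000, 2)
cart.add("Iced tea", 10000, 3)
cart.set_quantity("Iced tea", 1)
print(cart.total_quantity, cart.total_price)  # 3 100000
```

Keeping prices as integers avoids rounding drift when many line totals are summed.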
After finalizing the order, the customer clicks the check order button to proceed to the payment step.
At the payment interface, users choose whether to dine in or take away, select a payment method, apply a discount code if they have one, and pick a vacant table. They then press the button to complete the payment.
Upon payment completion, essential details such as the menu ID, total price, dish names and quantities, table number, discount code (if applicable), payment method, and usage instructions will be recorded and stored in the MySQL database.
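A record of this kind could be stored roughly as shown below. The sketch uses Python's built-in `sqlite3` module purely as a stand-in for the MySQL server so that it runs without a database installation. The table layout, column names, and the `save_order` helper are assumptions made for illustration, not the schema used in the thesis.

```python
import json
import sqlite3

# Hypothetical schema: one row per paid order, with the dishes kept as a JSON list.
SCHEMA = """
CREATE TABLE orders (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    table_number   INTEGER,
    dine_in        INTEGER NOT NULL,
    payment_method TEXT NOT NULL,
    discount_code  TEXT,
    total_price    INTEGER NOT NULL,
    items_json     TEXT NOT NULL
)
"""


def save_order(conn, *, items, total_price, table_number, dine_in,
               payment_method, discount_code=None):
    """Insert one order with a parameterized query and return its new id."""
    cur = conn.execute(
        "INSERT INTO orders (table_number, dine_in, payment_method, "
        "discount_code, total_price, items_json) VALUES (?, ?, ?, ?, ?, ?)",
        (table_number, int(dine_in), payment_method, discount_code,
         total_price, json.dumps(items)),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
order_id = save_order(
    conn,
    items=[{"name": "Pho", "quantity": 2}],
    total_price=90000,
    table_number=5,
    dine_in=True,
    payment_method="MoMo",
)
row = conn.execute(
    "SELECT table_number, payment_method, total_price FROM orders WHERE id = ?",
    (order_id,),
).fetchone()
print(row)  # (5, 'MoMo', 90000)
```

Parameterized queries keep user-supplied values such as discount codes out of the SQL text. The same pattern applies with a MySQL driver, although the placeholder syntax differs between drivers.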
When customers opt for the MoMo payment method, a QR code containing the restaurant's payment number is generated. Customers can then scan the code and pay directly from their banking applications.
In the admin interface, owners can add dishes through the Add Product form.
Owners can also update dish information or delete dishes with the Update and Delete buttons on each dish's form.
When the owner presses the Update button, the page for editing dish information opens.
Figure 3.62: Admin update product interface
After the owner clicks Update, the product information stored in the MySQL database is replaced with the new information.
The dining table information page allows users to effortlessly view the dishes available, including their quantity and serving status, as well as the chosen payment method.
Summary and overview of the system
The system has three main functions: the control system, the web server, and speech-to-speech response. Each can be divided into sub-blocks as follows:
Customers and staff interact with the website through a web server, which stores their data in an SQL database. Customers can place orders or make inquiries using voice commands, and the user-friendly interface keeps interaction with the system smooth. In addition, customer queries are converted into vectors through an embedding process and stored in a vector database.
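The embedding and vector lookup step can be illustrated with a toy, in-memory example. Everything here is a stand-in: `embed` is a bag-of-words count rather than a learned embedding model, `VectorStore` is a hypothetical class rather than a hosted vector database, and the stored sentences are invented.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: sparse bag-of-words counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory store: keeps (text, vector) pairs and ranks by similarity."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, top_k=1):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


store = VectorStore()
store.add("Pho bo is beef noodle soup and costs 45000 VND")
store.add("The restaurant opens at 9 am and closes at 10 pm")
store.add("Iced tea is free with every main dish")
best = store.search("what time does the restaurant open")
print(best[0])
```

A real deployment would replace `embed` with a neural embedding model, so that "open" and "opens" land close together even without shared tokens, but the nearest-neighbour ranking step has the same shape.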
The system employs a large language model together with speech-to-text and text-to-speech conversion, enabling two-way communication between customers and the platform. It also integrates the Robot Operating System (ROS) and an MQTT broker to manage and transmit data between components.
The system comprises components including a robot and a server, and uses sensors such as an IMU and encoders. A PID controller is employed to control the DC motors and accurately track the robot's position.
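A discrete PID loop of the kind mentioned above can be sketched as follows. The gains, the output limits, and the first-order motor model used to exercise the controller are all hypothetical. They were chosen only so that the example converges and are not the values tuned for the robot.

```python
class PID:
    """Discrete PID controller with output clamping and a simple anti-windup rule."""

    def __init__(self, kp, ki, kd, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self._integral = 0.0
        self._prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        self._integral += error * dt
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        self._prev_error = error
        raw = self.kp * error + self.ki * self._integral + self.kd * derivative
        clamped = min(max(raw, self.out_min), self.out_max)
        if clamped != raw:
            # Output is saturated: undo this step's integration to limit windup.
            self._integral -= error * dt
        return clamped


def simulate(setpoint=0.26, seconds=5.0, dt=0.01, tau=0.2):
    """Drive a toy first-order motor model (time constant tau) toward the setpoint."""
    pid = PID(kp=2.0, ki=5.0, kd=0.0, out_min=-1.0, out_max=1.0)
    speed = 0.0
    for _ in range(round(seconds / dt)):
        command = pid.update(setpoint, speed, dt)
        speed += (command - speed) * dt / tau  # hypothetical motor response
    return speed


final_speed = simulate()
print(round(final_speed, 3))
```

On a real robot, `measured` would come from the wheel encoders and `command` would be mapped to a PWM duty cycle. The integral term is what removes the steady-state error that a purely proportional controller would leave.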
The NAV2 system enables the robot to perform tasks with precision; on receiving a command, the robot first notifies the customer and waits for confirmation. This integration of technologies provides automated and efficient service, with tasks carried out through voice command recognition and robot control.
EXPERIMENTAL EVALUATION
Evaluate Mechanical Design
A service robot has been completed. It moves autonomously, has three trays for carrying dishes, and comes with a touch screen that displays emoticons and menu information.
Figure 4.1: First glance of service robot
Most mechanical components are manually assembled and adjusted, leading to low precision and causing mild shaking during movement, especially noticeable on the upper screen, which is the farthest from the robot's center of gravity.
The robot has a load capacity of up to 60 kg while maintaining smooth movement and stability during operation. Its transmission reliably delivers the required rotation, balancing speed and torque. In addition, the active two-wheel suspension system keeps the wheels from slipping during use.
The robot operates smoothly and does not produce any loud noise. It easily overcomes obstacles in its path, such as minor ledges.
The final service robot met all of the design requirements outlined in the preceding section, as shown in the following table.
Table 4.1: Comparison between requirements and real parameters
Ord Characteristics Requirements Parameters Unit
Evaluate Electrical System
The robot's electrical system functions well. The electrical equipment is distributed, organized, and securely installed.
The robot's 5V battery can power the control system for around 4 hours. The 12V battery allows the two DC motors to run continuously for approximately 7 hours.
Sensor devices such as the Lidar and MPU collect signals and transmit them to the Jetson Nano and the microcontroller. The robot processes this data to create a digital map of its environment. However, misalignment of the Lidar limits its scanning angle, so obstacles located above or to the sides may fall within its blind spot.
Figure 4.2: Electrical System inside the black box
Evaluate Control System
We assess the robot's performance by measuring its delivery speed and accuracy. The evaluation focuses on the robot's ability to transport items quickly to their designated locations while navigating precisely, without errors or delays. Our goal is to ensure that the robot operates efficiently and reliably across various conditions.
To ensure stability and prevent spills during transit, we set the robot's linear velocity to 0.26 m/s. This speed balances efficiency and safety, allowing the robot to operate quickly while minimizing the risk of accidents. Holding the output at this threshold supports smooth and reliable operation, even in complex environments.
Figure 4.3: Linear velocity response chart of the robot
The chart shows a discrepancy between the robot's actual velocity (red line) and the desired speed of 0.26 m/s commanded by ROS2 (blue line). Although the target speed is held at 0.26 m/s, the robot does not reach it, which contradicts earlier calculations suggesting stable operation at this speed. The issue may stem from a programming error that causes a mismatch between the velocity measured by the microcontroller and the commands sent from ROS2.
To prevent jerking caused by the motors' rapid response, a linear interpolation function has been implemented to smooth velocity changes. The robot therefore accelerates gradually over a brief period, avoiding abrupt velocity shifts that could spill food.
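One way to realize such a ramp is shown below. The ramp duration and update period are hypothetical, since the text only states that the speed changes over a brief period, and the function names are chosen for illustration.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def ramp_profile(v_start, v_target, ramp_time, dt):
    """Velocity setpoints moving linearly from v_start to v_target over ramp_time seconds."""
    steps = max(1, round(ramp_time / dt))
    return [lerp(v_start, v_target, i / steps) for i in range(1, steps + 1)]


# Ramp from standstill to the 0.26 m/s cruise speed in 1 s, updating every 0.1 s.
profile = ramp_profile(0.0, 0.26, ramp_time=1.0, dt=0.1)
print(len(profile), profile[-1])  # 10 0.26
```

Each setpoint differs from the previous one by the same small amount, so the commanded acceleration stays constant during the ramp instead of jumping to the target in a single step.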
We evaluate the precision of the robot's position coordinates (x, y) and its orientation angle at the destination. The team configured NAV2 with a tolerance of 0.25 meters for both the x and y coordinates, giving the robot more flexibility in where it stops, particularly when customers obstruct its path.
For this test run, the team set the goal coordinates to x = 1.58 and y = 1.10, with the quaternion component w = 0.997574.
Figure 4.4: The robot’s position along the x-axis
Figure 4.5: The robot’s position along the y-axis
The charts show the robot's trajectory along the x and y coordinates, ending at a final position of (1.4688, 0.89057). The deviations recorded for these coordinates are 0.1165 and 0.21411, both of which fall within the pre-established tolerance of 0.25.
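The tolerance check can be reproduced with a short script. The function name is hypothetical, and the goal is taken as the rounded values quoted in the text (1.58, 1.10). With those rounded inputs the computed deviations come out slightly smaller than the reported 0.1165 and 0.21411, presumably because the goal actually commanded carried more decimal places. The planar distance is included as a stricter variant of the per-axis check.

```python
import math


def check_goal_tolerance(goal, actual, tol):
    """Return per-axis deviations, planar distance, and whether each is within tol."""
    dx = abs(goal[0] - actual[0])
    dy = abs(goal[1] - actual[1])
    dist = math.hypot(dx, dy)
    return {
        "dx": dx,
        "dy": dy,
        "distance": dist,
        "per_axis_ok": dx <= tol and dy <= tol,
        "distance_ok": dist <= tol,
    }


result = check_goal_tolerance(goal=(1.58, 1.10), actual=(1.4688, 0.89057), tol=0.25)
print({k: round(v, 4) if isinstance(v, float) else v for k, v in result.items()})
```

Under these assumptions both checks pass, with the planar distance a little under the 0.25 m limit.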
The team uses the quaternion component w to specify the robot's orientation at the destination. The value measured at the destination is 0.972281, a deviation of 0.025293 from the target value of 0.997574.
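Because w is the cosine of half the rotation angle, a difference in w does not translate linearly into a heading difference. The sketch below converts both w values to a yaw magnitude, assuming the robot rotates only about the vertical axis, so that the quaternion is (0, 0, sin(yaw/2), cos(yaw/2)). The function name is hypothetical, and w alone does not reveal the sign of the yaw.

```python
import math


def yaw_magnitude_from_w(w):
    """|yaw| in radians for a pure rotation about the vertical axis, given only w."""
    w = max(-1.0, min(1.0, w))        # guard against rounding just outside [-1, 1]
    return 2.0 * math.acos(abs(w))    # q and -q describe the same rotation


target_deg = math.degrees(yaw_magnitude_from_w(0.997574))
measured_deg = math.degrees(yaw_magnitude_from_w(0.972281))
print(round(target_deg, 1), round(measured_deg, 1))
```

Under this assumption the target and measured headings correspond to roughly 8 and 27 degrees respectively, so the heading error is best judged in angular terms, for example against the yaw tolerance configured in the navigation stack, rather than from the difference in w alone.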
The robot meets the required accuracy for position coordinates (x, y) and orientation angle, making it suitable for restaurant environments where ultra-high precision is not critical. However, it suffers from calculation errors that affect its velocity performance, and these require further review and correction.
Figure 4.7: The robot performs mapping and delivers items to customers.
Evaluate Speech-to-speech conversation
While RAG models ground their responses in previously stored information, they respond more slowly than a plain LLM. The delay comes mainly from the RAG architecture's need to query relevant data from a vector database, which adds processing time.
The following bar chart shows the time taken to generate a response for 10 different customer queries.
Figure 4.8: Comparison of response time between RAG and plain LLM
A larger vector database can result in slightly slower queries, which may increase response times. Network latency also affects query speed, because the Pinecone vector database runs on remote servers rather than on the local machine.
Figure 4.9: Response time affected by number of data point in vector database
In the context of restaurants, the balance between response time and the quality of responses is often deemed acceptable, particularly when the volume of information is manageable This trade-off is generally considered reasonable in such scenarios.
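A comparison like the one in Figure 4.8 can be collected with a small timing harness. In the sketch below, `plain_llm` and `rag_llm` are hypothetical stand-ins that only sleep to imitate generation and retrieval delays. In a real measurement they would call the language model and the vector database.

```python
import statistics
import time


def time_responses(respond, queries):
    """Measure the wall-clock latency of respond(query) for each query, in seconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        respond(query)
        latencies.append(time.perf_counter() - start)
    return latencies


def plain_llm(query):
    time.sleep(0.002)   # simulated generation time (hypothetical)
    return "answer"


def rag_llm(query):
    time.sleep(0.02)    # simulated round-trip to a remote vector database (hypothetical)
    return plain_llm(query)


queries = [f"question {i}" for i in range(10)]
plain_times = time_responses(plain_llm, queries)
rag_times = time_responses(rag_llm, queries)
print(f"plain mean: {statistics.mean(plain_times):.4f} s")
print(f"RAG mean:   {statistics.mean(rag_times):.4f} s")
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, high-resolution clock, which matters when the individual latencies are short.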
Evaluate Database and Webserver
The database system enables seamless order management for customers and restaurant employees via their mobile devices, thanks to its user-friendly interface With minimal training required, users can quickly learn to navigate and utilize its features effectively The interface's efficient performance minimizes errors and enhances user satisfaction, ensuring a smooth experience whether customers are placing orders or staff are updating menus and checking order statuses This efficiency not only elevates the dining experience for patrons but also streamlines staff workflows, allowing them to focus on providing excellent service.
CONCLUSION AND DEVELOPMENTAL DIRECTION
Conclusion on This Graduation Thesis
Our team has successfully completed the project "Research and Manufacture of Service Robots for Restaurants." The service robot autonomously delivers meals and beverages to customers, enhancing the dining experience. It allows patrons to order directly from a screen and can respond to inquiries based on menu details. The robot also comes with a management system for restaurant staff that supports order identification, history tracking, and editing of the information used to answer guests accurately.
The project has significantly advanced our understanding of collecting data about obstacles in the environment and of applying path planning algorithms and natural language processing models efficiently. Team members gained valuable insight into the challenges of scientific research, and the project points to considerable potential in the service industry. This work opens opportunities for deploying autonomous robots across various sectors, including restaurants, hotels, retail, healthcare, and education. The team believes that autonomous service robots will continue to evolve, driven by ongoing technological advances and innovation.
This graduation thesis introduces several advances compared with previous projects. It reflects our commitment to continuous research, creativity, and a vision for sustainable development. Key highlights include:
• Applying large language models to provide conversational capabilities, information, and answers to inquiries for restaurant patrons.
• Enhancing the conversational capabilities of large language models through an integrated retrieval-augmented query system, which allows the robot to search for and provide restaurant-specific information.
• Building a restaurant data management system alongside the robot, with the main goal of helping restaurant staff edit the restaurant-specific information that the robot accesses and searches when responding to customer requests. The system also assists restaurant staff in managing the menu, orders, and other related information.
Limitations of This Study
Although the main goals have been met, this project still has limitations that must be addressed:
• The hand-machined mechanical construction produces slight vibrations during movement, which can pose challenges when delivering food, especially liquid dishes such as noodles, vermicelli, and pho.
• Due to the low chassis design, the robot can only operate on flat surfaces that are not too slippery.
• The LLM uses an improved query method, but the robot's responses still sometimes lose focus or contain inaccurate information.
Possible Developmental Direction
First and foremost, to solve the inadequacies mentioned in the preceding paragraph, our team has made the following suggestions:
• Install sensors in the elevator shaft to detect impediments above
• Adjust and strengthen the chassis to decrease vibrations while moving
Additionally, we have some ideas for developing and investigating new potential in this topic:
• Upgrade hardware to suit the new development directions
• Use image processing technology to recognize customer emotions, which helps to improve the conversation experience
• Improve staff management capabilities by including the retrieval and administration of customer loyalty information