ABSTRACT
This dissertation presents a new grasping method in which a 6-DOF industrial robot autonomously grasps a stationary, randomly positioned rectangular object using a combination of stereo vision and image-based visual servoing with a fuzzy controller (IBVSFC). First, OpenCV and a color filter algorithm are used to extract the specific color features of the object. Then, the 3D coordinates of the object to be grasped are derived by the stereo vision algorithm, and these coordinates are used to guide the robotic arm to the approximate location of the object using inverse kinematics. Finally, IBVSFC precisely adjusts the pose of the end-effector to coincide with that of the object to make a successful grasp. The accuracy and robustness of the system and the algorithm were tested and proven effective in real scenarios involving a 6-DOF industrial robot. Although the application in this dissertation is limited to grasping a simple cube, the same methodology can easily be applied to objects with other geometric shapes.
ACKNOWLEDGEMENTS

Finally, my greatest thanks go to my family: my father, my mother, and my younger brother, who have unconditionally supported me during my stay here.
TABLE OF CONTENTS
Abstract
Acknowledgements
Table of contents
List of Figures
List of Tables
Chapter 1 INTRODUCTION
1.1 Overview
1.2 Motivation
1.3 Contribution
1.4 Dissertation structure
Chapter 2 SYSTEM CONFIGURATION
2.1 Denso Robot
2.2 Vision system
Chapter 3 IMAGE PROCESSING
3.1 Color filter
3.2 Stereo vision
Chapter 4 FORWARD AND INVERSE KINEMATICS
4.1 Direct kinematics
4.2 Inverse kinematics
Chapter 5 CONTROL ALGORITHM
5.1 Position based control
5.2.1 Classical image based visual control
5.2.2 PID control
5.2.3 Fuzzy image based visual control
Chapter 6 EXPERIMENT AND CONCLUSION
6.1 Experiments
6.2 Stacking Cubes
Chapter 7 CONCLUSION AND FUTURE WORK
7.1 Conclusion
7.2 Contributions
7.3 Future works
References
Appendix A
Appendix B
LIST OF FIGURES
Fig. 1 VS-6556G Denso robot arm [13]
Fig. 2 VS-6556G robot with the attached grasping component
Fig. 3 The 3rd camera attached to the end-effector
Fig. 4 The stereo vision system
Fig. 5 Color feature filter
Fig. 6 Volume filter
Fig. 7 MATLAB toolbox for camera calibration
Fig. 8 3D geometrical model [16]
Fig. 9 Camera and system calibration
Fig. 10 Link coordinate system and its parameters
Fig. 11 Diagram of the VS-6556G used in the dissertation
Fig. 12 Procedure to solve the inverse kinematics for the VS-6556G robot used in the dissertation
Fig. 13 (from left to right): VM1 monocular eye-in-hand, VM2 monocular stand-alone, VM3 binocular eye-in-hand, VM4 binocular stand-alone and VM5 redundant camera system [6]
Fig. 14 The control system using the stereo vision
Fig. 15 Orientation of a rectangular cube in 2D
Fig. 16 Orientation of a rectangular cube viewed in the 3rd camera
Fig. 17 Image based visual control scheme
Fig. 18 The membership function of the position error
Fig. 19 The membership function of the change rate of the position error
Fig. 20 The membership function of the output signal for the fuzzy controller
Fig. 21 … mechanism
Fig. 22 Experiment apparatus
Fig. 23 Multitask working flow chart
Fig. 24 The grasping task performed by the Denso Robot: (Phase 1) initial position, (Phase 2) approaching the target using stereo vision, (Phase 3) image-based visual servoing, (Phase 4) grasping the object
Fig. 25 The overall performance result of the gripper in the X coordinate
Fig. 26 The overall error result of the gripper and the rectangular cube in the X coordinate
Fig. 27 The overall performance result of the gripper in the Y coordinate
Fig. 28 The overall error result of the gripper and the rectangular cube in the Y coordinate
Fig. 29 The overall performance result of the gripper in the Z coordinate
Fig. 30 The overall error result of the gripper and the rectangular cube in the Z coordinate
Fig. 31 The error angle result between the camera and the rectangular cube
Fig. 32 The error result between the center of the camera and the center of the rectangular cube in the X direction
Fig. 33 The error result between the center of the camera and the center of the rectangular cube in the Y direction
Fig. 34 Experimental setup for stacking three cubes
Fig. 35 The grasping task performed to grasp the blue cube by the Denso Robot: (Phase 1) initial position, (Phase 2) approaching the target using stereo vision, (Phase 3) image-based visual servoing, (Phase 4) grasping the object
Fig. 36 …
Fig. 37 … (Phase 3) image-based visual servoing, (Phase 4) grasping the object
Fig. 38 The stacking task performed to stack the pink cube on the blue one by the Denso Robot: (Phase 1) initial position, (Phase 2) approaching the target using stereo vision, (Phase 3) image-based visual servoing, (Phase 4) grasping the object
LIST OF TABLES
Table 1 Specification of the vision system
Table 2 DH table of the VS-6556G
Table 3 Rule table of the fuzzy controller
CHAPTER 1
INTRODUCTION

1.1 Overview

… This application is only suitable for grasping objects that can accommodate large errors due to manual manipulation.
Stereo vision can be used to reduce the load on the human operator by facilitating automatic grasping in industrial applications. Stereo vision involves two integrated cameras that can observe the objects ahead with depth information. There have been many studies of stereo visual servoing techniques and position-based visual servoing, such as Kumar et al. [2]. In that study, stereo vision was used to track the current position and orientation of the end-effector of a manipulator and those of the object to grasp. Then, a self-organizing map was used to control the robot and to reduce the error between the end-effector and the object. Nasrabadi et al. [3] used stereo vision to reconstruct a 3D object, and then used a matching technique to calculate the orientation of the object. Bohg et al. [4] used stereo vision to reconstruct the 3D shape of an object and then determined the orientation of the object to be grasped using the shape context with a non-linear classifier. From [2-4], we can see that these grasping tasks relied purely on the stereo vision system. However, stereo vision is prone to errors from camera calibration, image processing, and noise in the surrounding environment.
Such an open-loop approach also assumes that the environment remains static after the robot has started to move [5-6], which often leads to poor positioning accuracy. To improve the success rate, stereo visual servoing can be applied to automatically grasp the object, to execute the task with an acceptably large error, and to grasp an object with a fully known 3D structure.
To overcome the difficulties associated with stereo vision, Shirai et al. [7] and Peter et al. [8] proved that the precision of robot positioning can be increased by using a visual feedback loop. Since then, many researchers have focused on using eye-in-hand visual servoing to perform grasping tasks. William et al. [9] presented the design methodology for a Cartesian position-based visual servo for robots with a single camera mounted at the end-effector. First, the 3D pose of the object was estimated. Then, based on the 3D pose estimate, the robotic arm was commanded to approach the object by reducing the relative error between the pose of the object and the pose of the end-effector using a feedback controller. Kosmopoulos et al. [10] used a camera attached to the end-effector to calculate the image Jacobian matrix without the need for depth estimation. He then used this matrix as the feedback to control the robot and reduce the error between the camera and the object orientation. The advantages of eye-in-hand visual servoing [9-10] include increased precision and reduced sensitivity to noise. Unfortunately, a camera moving with the end-effector cannot always constantly observe the object to grasp. When that happens, some basic information about the object cannot be acquired, and the visual servoing algorithm cannot be executed. Moreover, using vision to estimate the pose or the Jacobian matrix can lead to calculation errors due to interpolation, prediction methods, and noise in the surrounding environment.
To address this viewing problem, Chesi et al. [11] presented an approach to keep the observed object in the field of view by switching among position-based control strategies and backward motion. This algorithm is mainly affected by the prediction and the available estimate of the intrinsic parameters of the camera. Sebastian et al. [12] noted that the image processing described in [11] is sensitive to the calibration, which often accumulates errors from system nonlinearities, disturbances, or noise in the surrounding environment. To apply this algorithm in a different setup environment, recalibration is required. This task has high computational costs, and in some cases it is impossible to perform [12]. Thus, this algorithm is not robust against calibration errors.
In summary, each of these approaches alone has weaknesses with regard to this application. Therefore, combining a stereo vision technique and image-based visual servoing using a fuzzy controller (IBVSFC) is an intuitive way to achieve a high success rate, high accuracy, and robustness against calibration errors or weak calibrations. This combination has the advantages of both approaches: it can function in the presence of calibration errors, observe the object at all times, extract the basic features of the object to grasp using stereo vision, and achieve more precise, less noise-sensitive movements using IBVSFC. The method consists of first using OpenCV and a color filter algorithm to extract the specific color features of the object, and then determining the 3D coordinates of the object to grasp using a stereo vision algorithm. Based on the position of the object obtained by stereo vision, the robotic arm is guided to a location adjacent to the object using inverse kinematics. Then, IBVSFC precisely adjusts the pose of the end-effector to match the pose of the object. The task is considered successful when the pose of the end-effector and that of the object coincide.
1.3 Contribution
This thesis successfully developed a system that combines the stereo vision technique and IBVSFC to enable the VS-6556G robot to perform autonomous grasping tasks. The system can automatically recognize the object to grasp and successfully guide the VS-6556G robot to approach the object and precisely grasp it. With the same experimental setup, the grasping technique can easily be duplicated in different environments, even with weak re-calibrations or calibration errors of the vision system. This advantage saves precious time and computational cost in calibration when the system is applied in various environments. The discussion above demonstrates the effectiveness of our approach in terms of a high success rate, high accuracy, and robustness under calibration errors.
Moreover, by using image-based visual servoing with a fuzzy controller instead of the interaction matrix or Jacobian matrix of the classical image-based visual control loop [9-11], this dissertation's controller uses less feature information of the object (one point instead of the four points used in [9-11], which cannot always be obtained by image processing) while still making the system converge quickly and stably.
Although the application in this dissertation is only demonstrated by grasping a simple cube, the same methodology can easily be extended to grasping objects with other geometric shapes. In addition, the dissertation also demonstrated one of the possible extensions of the method by stacking cubes.

1.4 Dissertation Structure
The remaining portions of this dissertation are organized as follows. Chapter 2 gives an overview of the system configuration used in this dissertation. Chapter 3 presents the color filter and the stereo vision algorithm used to recognize and localize the object. Chapter 4 discusses the mathematical derivations of the direct and inverse kinematics for the 6-DOF VS-6556G robot. Chapter 5 presents the image-based visual servoing using the fuzzy controller (IBVSFC). The experimental results are presented in Chapter 6, which also emphasizes this thesis's original contributions. Finally, Chapter 7 concludes and discusses some possibilities for extending and applying the techniques developed in this dissertation in the future.
CHAPTER 2
SYSTEM CONFIGURATION
In this chapter, we discuss the configuration of the system, including a description of the hardware of the VS-6556G robot and the vision system used in the experiments. This chapter gives the reader a basic overview of the principal operation of the whole system.
2.1 Denso Robot

Fig. 1 VS-6556G Denso robot arm [13]
Fig. 2 VS-6556G robot with the attached grasping component
In this dissertation, the Denso robot is controlled through an RS232 signal transmitted from the computer. The RS232 signal carries the angle and the velocity of each of the six joints. The software used for programming and for calculating the positions of the object and the robot arm is written in Visual C++ 2008 with OpenCV.
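To illustrate this communication path, below is a minimal sketch of opening a serial port on Windows and sending one joint command. The port name, baud rate, and ASCII command format are assumptions made for illustration; the actual Denso command protocol is not specified here and must come from the controller manual.

```cpp
// Minimal sketch of sending joint commands over RS232 on Windows.
// Assumptions: port "COM1", 9600 baud, and a hypothetical ASCII format
// "J <a1> ... <a6> V <vel>\r\n" -- not the real Denso protocol.
#include <windows.h>
#include <cstdio>

int main() {
    HANDLE port = CreateFileA("COM1", GENERIC_READ | GENERIC_WRITE,
                              0, nullptr, OPEN_EXISTING, 0, nullptr);
    if (port == INVALID_HANDLE_VALUE) { std::printf("cannot open COM1\n"); return 1; }

    DCB dcb = {0};
    dcb.DCBlength = sizeof(dcb);
    GetCommState(port, &dcb);
    dcb.BaudRate = CBR_9600;   // assumed baud rate
    dcb.ByteSize = 8;
    dcb.Parity   = NOPARITY;
    dcb.StopBits = ONESTOPBIT;
    SetCommState(port, &dcb);

    // Six joint angles in degrees plus a velocity, formatted as text.
    double joints[6] = {0.0, -30.0, 90.0, 0.0, 60.0, 0.0};
    double velocity  = 10.0;
    char cmd[128];
    int n = std::snprintf(cmd, sizeof(cmd),
                          "J %.2f %.2f %.2f %.2f %.2f %.2f V %.2f\r\n",
                          joints[0], joints[1], joints[2],
                          joints[3], joints[4], joints[5], velocity);
    DWORD written = 0;
    WriteFile(port, cmd, static_cast<DWORD>(n), &written, nullptr);
    CloseHandle(port);
    return 0;
}
```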
2.2 Vision system
The vision system is an essential device that helps the robot localize and recognize the object. The vision system used in this dissertation calculates the position of the target to grasp and the position of the robot relative to the target; moreover, it serves as the input signal for the controller designed in this dissertation. The vision system has two modules. The first is a camera attached to the end-effector, as shown in Fig. 3. The second module is the stereo vision system, consisting of two cameras, which is used for recognizing the object and calculating the position of the object relative to the end-effector of the robot arm, as shown in Fig. 4. All the cameras used in the dissertation are Logitech cameras. These cameras can automatically adjust for dim light or backlight to produce the best images. The specification of the cameras is shown in Table 1.
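As a sketch of how these two modules could be brought up with OpenCV, the snippet below opens the stereo pair and the eye-in-hand camera. The device indices 0, 1, and 2 are assumptions that depend on how the operating system enumerates the cameras.

```cpp
// Minimal sketch: opening the two stereo cameras and the end-effector
// camera with OpenCV. Device indices 0, 1, 2 are assumptions.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture leftCam(0), rightCam(1), handCam(2);
    if (!leftCam.isOpened() || !rightCam.isOpened() || !handCam.isOpened())
        return 1;

    cv::Mat left, right, hand;
    while (true) {
        leftCam >> left;      // stereo pair for 3D localization
        rightCam >> right;
        handCam >> hand;      // eye-in-hand view for visual servoing
        if (left.empty() || right.empty() || hand.empty()) break;

        cv::imshow("left", left);
        cv::imshow("right", right);
        cv::imshow("hand", hand);
        if (cv::waitKey(30) == 27) break;  // Esc to quit
    }
    return 0;
}
```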
Fig. 3 The 3rd camera attached to the end-effector
Fig. 4 The stereo vision system
Table 1 Specification of the vision system
Frame rate: up to 30 frames per second
Video capture: up to 1280x1024 pixels
Still image capture: 5 megapixels
CHAPTER 3
IMAGE PROCESSING
In this chapter, we describe our method of recognizing the object using the color filter and calculating the position of the object using stereo vision. We also present the camera calibration method used for the image processing.
3.1 Color Filter
Filtering is one of the most useful techniques for recognizing or localizing an object in a scene. The purpose of color feature filtering is to find an object based on its color. After filtering, only the pixels with the defined color of the object remain in the image frame. Based on that isolated information, the position or the coordinates of the center of the object can easily be calculated. The color of an object is a combination of 3 basic colors, red, green, and blue, which form the RGB color model. Changing any of the values of the R, G, or B components will form a new color. Despite rapid increases in the speed and power of computers, image processing in RGB mode is computationally expensive. Therefore, for real-time applications it is better to work in a color space in which the brightness does not affect the color components. The YUV color model is used for digital video encoding. The YUV color space is derived from the RGB space according to Iraji et al. [14]:

$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (1)$$
Fig. 5 Color feature filter
Fig. 6 Volume filter
After applying the volume filter, the centroid of the object to grasp is calculated. This centroid position of the object in the image space is the input for the stereo vision algorithm, which calculates the 3D coordinates of the object in the real-world coordinate frame.
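As an illustration of this processing chain, below is a minimal OpenCV sketch. The YUV threshold range is an example assumption (the real values depend on the object color and lighting), and a morphological opening is used as a simple stand-in for the volume filter; the centroid is then computed from image moments.

```cpp
// Minimal sketch of the color filter and centroid computation.
// The YUV threshold values are example assumptions, not the thesis values.
#include <opencv2/opencv.hpp>
#include <cstdio>

int main() {
    cv::VideoCapture cap(0);
    cv::Mat frame, yuv, mask;
    while (cap.read(frame)) {
        cv::cvtColor(frame, yuv, cv::COLOR_BGR2YUV);

        // Keep only pixels inside the assumed color range of the object.
        cv::inRange(yuv, cv::Scalar(0, 100, 140), cv::Scalar(255, 130, 200), mask);

        // Morphological opening as a simple stand-in for the volume filter:
        // it removes noise blobs that are too small to be the object.
        cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
        cv::morphologyEx(mask, mask, cv::MORPH_OPEN, kernel);

        // Centroid from image moments; this (u, v) feeds the stereo algorithm.
        cv::Moments m = cv::moments(mask, true);
        if (m.m00 > 0) {
            double u = m.m10 / m.m00;
            double v = m.m01 / m.m00;
            std::printf("centroid: (%.1f, %.1f)\n", u, v);
        }
        if (cv::waitKey(30) == 27) break;
    }
    return 0;
}
```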
3.2 Stereo vision
The camera model can be written in homogeneous coordinates as

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_1 & 0 & C_x \\ 0 & f_2 & C_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X/Z \\ Y/Z \\ 1 \end{bmatrix} \qquad (2)$$

where $u$ and $v$ are the point coordinates in the image space, $f_1$ and $f_2$ are the focal lengths of the lens along the X and Y axes, and $C_x$ and $C_y$ are the coordinates of the principal point of the camera. Camera calibration is one of the important tasks in any image processing pipeline, especially for calculating the position of an object using stereo vision. Eliminating the lens distortion improves the accuracy of the image processing and the position calculations.

In our stereo system, we use two webcams. Both cameras are calibrated with the camera calibration toolbox for MATLAB [15]. First, 20 images are taken in sequence.
Fig. 7 MATLAB toolbox for camera calibration

Calibration results (with uncertainties):
Focal length: fc = [664.05834 664.40064] ± [3.31388 3.27413]
Principal point: cc = [316.07464 262.41063] ± [3.41007 3.88220]
Skew: alpha_c = [0.00000] ± [0.00000] => angle of pixel axes = 90 degrees
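These numbers can be carried into OpenCV directly. Below is a minimal sketch of building the camera matrix from the reported fc and cc values and undistorting an image; the distortion coefficients are placeholders, since the toolbox's distortion vector kc is not reproduced above, and the input file name is hypothetical.

```cpp
// Minimal sketch: using the calibration results above in OpenCV.
// The distortion coefficients are placeholders; the real kc vector
// from the MATLAB toolbox output should be used instead.
#include <opencv2/opencv.hpp>

int main() {
    // Camera matrix assembled from the reported fc and cc values.
    cv::Mat K = (cv::Mat_<double>(3, 3) <<
        664.05834, 0.0,       316.07464,
        0.0,       664.40064, 262.41063,
        0.0,       0.0,       1.0);

    // Placeholder distortion coefficients [k1 k2 p1 p2 k3].
    cv::Mat dist = (cv::Mat_<double>(1, 5) << 0.0, 0.0, 0.0, 0.0, 0.0);

    cv::Mat raw = cv::imread("left_view.png");   // hypothetical input image
    if (raw.empty()) return 1;

    cv::Mat rectified;
    cv::undistort(raw, rectified, K, dist);      // remove lens distortion
    cv::imwrite("left_view_undistorted.png", rectified);
    return 0;
}
```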
Stereo vision extracts a stereo image pair as the projection of the same object onto two different camera positions. The key to stereo vision is determining the location of the object from the pair of images. The difference between two corresponding points is called the disparity, and the disparity is used by the vision system to calculate the depth of an object.
Fig. 8 shows the 3D geometric model, where $w = (X, Y, Z)$ is a point in three-dimensional space, $(x_1, y_1)$ and $(x_2, y_2)$ are the projections of $w$ onto the two cameras with focal length $f$, and $b$ is the width of the baseline. The disparity of the two corresponding points is $d = x_1 - x_2$. According to the principle of similar triangles, the depth of point $w$ can be calculated as

$$Z = \frac{f \, b}{d} \qquad (3)$$
Fig. 8 3D geometrical model [16]
Combining the results of the calibration process with Eq. 3, the X and Y coordinates of point $w$ are calculated as

$$X = \frac{Z\,(x_1 - C_x)}{f_x}, \qquad Y = \frac{Z\,(y_1 - C_y)}{f_y} \qquad (4)$$
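A minimal sketch of Eq. 3 and Eq. 4 in code follows, assuming rectified images and reusing the calibrated intrinsics reported above; the baseline value is a placeholder.

```cpp
// Minimal sketch of depth-from-disparity (Eq. 3) and back-projection (Eq. 4).
// Assumes rectified images; the baseline b is a placeholder value in meters.
#include <cstdio>

struct Point3 { double X, Y, Z; };

Point3 triangulate(double x1, double y1,   // centroid in the left image (pixels)
                   double x2,              // corresponding x in the right image
                   double fx, double fy,   // focal lengths from calibration
                   double cx, double cy,   // principal point from calibration
                   double b)               // stereo baseline (meters)
{
    double d = x1 - x2;                    // disparity (pixels)
    double Z = fx * b / d;                 // Eq. 3: depth from disparity
    double X = Z * (x1 - cx) / fx;         // Eq. 4: back-project to 3D
    double Y = Z * (y1 - cy) / fy;
    return {X, Y, Z};
}

int main() {
    // Example: centroids found by the color filter in the two views.
    Point3 w = triangulate(350.0, 280.0, 310.0,
                           664.05834, 664.40064,   // fc from calibration
                           316.07464, 262.41063,   // cc from calibration
                           0.1);                   // assumed 10 cm baseline
    std::printf("object at X=%.3f Y=%.3f Z=%.3f m\n", w.X, w.Y, w.Z);
    return 0;
}
```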
To relate the camera frame to the robot frame, known robot positions are compared with the calculated position from the camera; a linear equation is then derived to calculate the position of the target, as shown in Fig. 9.
Fig. 9 Camera and system calibration
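One plausible reading of this system calibration, sketched below under that assumption, is a least-squares fit of a linear map from camera-frame points to robot-frame points; the sample point pairs are invented for illustration.

```cpp
// Minimal sketch: fitting a linear mapping p_robot = [p_camera 1] * A
// by least squares, as one plausible form of the system calibration.
// The sample point pairs below are invented for illustration.
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Homogeneous camera-frame points (one row per sample: X, Y, Z, 1).
    cv::Mat cam = (cv::Mat_<double>(4, 4) <<
        0.10, 0.00, 0.50, 1.0,
        0.20, 0.05, 0.55, 1.0,
        0.05, 0.10, 0.60, 1.0,
        0.15, 0.15, 0.45, 1.0);

    // Corresponding robot-frame points (one row per sample: X, Y, Z).
    cv::Mat rob = (cv::Mat_<double>(4, 3) <<
        0.42, 0.11, 0.08,
        0.47, 0.21, 0.08,
        0.52, 0.06, 0.13,
        0.37, 0.16, 0.18);

    // Solve cam * A = rob for the 4x3 mapping A in the least-squares sense.
    cv::Mat A;
    cv::solve(cam, rob, A, cv::DECOMP_SVD);
    std::cout << "mapping A =\n" << A << std::endl;

    // Map a new camera-frame point into the robot frame.
    cv::Mat p = (cv::Mat_<double>(1, 4) << 0.12, 0.07, 0.52, 1.0);
    cv::Mat q = p * A;
    std::cout << "robot-frame point = " << q << std::endl;
    return 0;
}
```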
After the position of the object to grasp is calculated, it is used as the parameter commanded to the robot controller to make the robot grasp the object. These parameters …
CHAPTER 4
FORWARD AND INVERSE KINEMATICS
Kinematics describes the motion of an object or a system. The direction and the position of an object play an important role in kinematics, and kinematics is necessary for calculating the position of the end-effector of a robot arm or a manipulator. In other words, robot kinematics refers to the motion of robotic arms without consideration of forces and moments [17]. The relative motion associated with each joint can be controlled by an actuator such that the end-effector can be positioned with a particular orientation anywhere inside its workspace.
4.1 Forward kinematics
The Denavit-Hartenberg (DH) table is developed to solve the forward and inverse kinematics problems. The link coordinate system and its parameters, which provide the entries of the DH table, are described in Fig. 10.
Fig. 10 Link coordinate system and its parameters
The notations used in Fig. 10 are described as follows:
$a_i$: offset distance between two adjacent joint axes
$d_i$: translation distance between two incident normals of a joint axis
$\alpha_i$: twist angle between two adjacent joint axes
$\theta_i$: joint angle between two incident normals of a joint axis
For link $i$ in Fig. 10, the transformation matrix is:

$$A_i = \begin{bmatrix}
\cos\theta_i & -\sin\theta_i\cos\alpha_i & \sin\theta_i\sin\alpha_i & a_i\cos\theta_i \\
\sin\theta_i & \cos\theta_i\cos\alpha_i & -\cos\theta_i\sin\alpha_i & a_i\sin\theta_i \\
0 & \sin\alpha_i & \cos\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix} \qquad (5)$$
The DH parameters of the VS-6556G robot used in this dissertation are listed in Table 2.
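A minimal sketch of how Eq. 5 chains into the forward kinematics follows; the DH parameter values in the example are placeholders, since Table 2 is not reproduced here.

```cpp
// Minimal sketch of forward kinematics by chaining DH transforms (Eq. 5).
// The DH parameter values below are placeholders, not the Table 2 entries.
#include <array>
#include <cmath>
#include <cstdio>

using Mat4 = std::array<std::array<double, 4>, 4>;
const double PI = 3.14159265358979;

// Standard DH link transform A_i(theta, d, a, alpha).
Mat4 dhTransform(double th, double d, double a, double al) {
    return {{{std::cos(th), -std::sin(th) * std::cos(al),  std::sin(th) * std::sin(al), a * std::cos(th)},
             {std::sin(th),  std::cos(th) * std::cos(al), -std::cos(th) * std::sin(al), a * std::sin(th)},
             {0.0,           std::sin(al),                 std::cos(al),                d},
             {0.0,           0.0,                          0.0,                         1.0}}};
}

Mat4 multiply(const Mat4& A, const Mat4& B) {
    Mat4 C{};  // zero-initialized accumulator
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

int main() {
    // Placeholder DH rows {theta, d, a, alpha} for six joints.
    double dh[6][4] = {
        {0.0, 0.28, 0.08, -PI / 2}, {0.0, 0.0, 0.21, 0.0},
        {0.0, 0.0,  0.03, -PI / 2}, {0.0, 0.25, 0.0,  PI / 2},
        {0.0, 0.0,  0.0,  -PI / 2}, {0.0, 0.07, 0.0,  0.0}};

    Mat4 T = dhTransform(dh[0][0], dh[0][1], dh[0][2], dh[0][3]);
    for (int i = 1; i < 6; ++i)
        T = multiply(T, dhTransform(dh[i][0], dh[i][1], dh[i][2], dh[i][3]));

    // End-effector position is the last column of the chained transform.
    std::printf("end-effector: x=%.3f y=%.3f z=%.3f\n", T[0][3], T[1][3], T[2][3]);
    return 0;
}
```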
Fig. 11 Diagram of the VS-6556G used in the dissertation
4.2 Inverse kinematics

Fig. 12 Procedure to solve the inverse kinematics for the VS-6556G robot used in the dissertation

According to the DH table and Fig. 12, the joint angles are derived as in Eq. 7 to Eq. 12, following the procedure shown in Fig. 12.
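Since Eq. 7 to Eq. 12 did not survive extraction here, the following is only a generic sketch of the usual first steps of closed-form inverse kinematics for a 6-DOF arm with a spherical wrist, the family to which robots like the VS-6556G belong: locate the wrist center from the desired end-effector pose, then obtain the base rotation. The link dimension d6 and the example pose are placeholders, and this is not the thesis's own derivation.

```cpp
// Generic sketch of the first steps of closed-form inverse kinematics for a
// 6-DOF arm with a spherical wrist. This is NOT the thesis's Eq. 7-12, which
// did not survive extraction; d6 and the example pose are placeholders.
#include <cmath>
#include <cstdio>

int main() {
    // Desired end-effector position p and approach axis (3rd column of the
    // desired rotation matrix R); example values for illustration.
    double p[3]  = {0.40, 0.10, 0.30};
    double az[3] = {0.0, 0.0, -1.0};   // tool pointing straight down
    double d6    = 0.07;               // placeholder wrist-to-flange offset

    // Step 1: wrist center, offset back from the tool tip along the approach axis.
    double wc[3] = {p[0] - d6 * az[0],
                    p[1] - d6 * az[1],
                    p[2] - d6 * az[2]};

    // Step 2: base joint angle from the wrist center's projection on the XY plane.
    double theta1 = std::atan2(wc[1], wc[0]);

    // Remaining joints: theta2/theta3 from the planar geometry of links 2-3,
    // and theta4-theta6 from the residual wrist rotation (robot-specific algebra).
    std::printf("wrist center = (%.3f, %.3f, %.3f), theta1 = %.3f rad\n",
                wc[0], wc[1], wc[2], theta1);
    return 0;
}
```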