
HANDWRITTEN NUMBER RECOGNITION AND ITS APPLICATION AT DANANG UNIVERSITY OF TECHNOLOGY

Authors: Duong Thi Kim Cuc, Dinh Quang Huy, Tran Hoang An, Nguyen Van Trong

Da Nang University of Technology, Center of Excellence, ECE08

Advisors: Hoang Le Uyen Thuc, M.S.; Pham Van Tuan, Ph.D.

Da Nang University of Technology, Department of Electronics and Telecommunications

ABSTRACT

This paper presents the results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques. The tested databases are MNIST [1] and a collection of digit samples handwritten by teachers at Da Nang University of Technology. For feature extraction, two features are chosen: Hu’s seven moments and image averaging (resizing each image to one with fewer pixels for easier comparison). These features are paired with two classifiers: a Neural Network classifier and the Euclidean Distance. So far, with the matching dictionary collected at Da Nang University of Technology, the combination of the image averaging feature and the Euclidean Distance gives the best accuracy (more than 93%), which can be improved further with a more comprehensive database.

1 Introduction

One of the most tedious tasks that teachers at Da Nang University of Technology face is manually entering exam grades into computers. This project aims to spare them from copying the grades by hand by presenting a method for importing grades into computers automatically. The method employs a well-known pattern recognition procedure called optical character recognition (OCR).

The performance of character recognition largely depends on the feature extraction approach and the classification/learning scheme. Various approaches have been proposed for feature extraction in character recognition. Hu’s seven moments have been extensively employed as invariant global features of images in pattern recognition. Averaging is a rather simple process that represents a square block of pixels by a single pixel, so that an image is expressed by a smaller image.

An artificial neural network (ANN) consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. When the network structure is appropriately designed and the training sample size is large, neural networks can give high classification accuracy on unseen test data. OCR using template matching recognizes a character by comparing its image with stored sample images. We implement a template-matching technique that minimizes the Euclidean Distance between the pattern to be recognized and the sample patterns provided by the dictionary.

2 Main process


2.1 Proposed System

The scanned image is first preprocessed to give the normalized image. To ease the classification process, the normalized image is represented by a set of features for comparison. Finally, the recognized result is converted from the JPEG file to an xls file.

Figure 1 Proposed System Overview

2.2 Preprocessing

There are four main steps in this stage. First, the scanned RGB image is converted to a gray-scale image using the formula Intensity = 0.2989*red + 0.5870*green + 0.1140*blue [2]. Second, the image is thresholded to obtain a binary image. The threshold level, chosen to be 70 in this case, depends on the quality of the scanned image and the background noise. The output binary image has a value of 1 (white) for every pixel whose luminance in the input image is greater than 70, and 0 (black) for all other pixels. Third, the morphological “opening” operation [3] is applied to smooth the digit and eliminate small noise regions. Finally, normalization is used to regulate the size, position and shape of the image so that the differences between samples of one class are reduced. The key idea behind normalization is bilinear interpolation [4]. These steps are depicted in Figure 2.
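
The first two preprocessing steps can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the image is held as plain Python lists of (r, g, b) tuples in the range 0 to 255, and the function names `rgb_to_gray` and `threshold` are hypothetical. The weights and the threshold level of 70 are the ones quoted in the text.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples (0-255) to rows of gray intensities."""
    return [
        [0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def threshold(gray_image, level=70):
    """Return a binary image: 1 (white) where intensity > level, else 0 (black)."""
    return [[1 if value > level else 0 for value in row] for row in gray_image]


# Tiny 2x2 example image with made-up pixel values.
rgb = [
    [(255, 255, 255), (10, 10, 10)],
    [(200, 30, 30), (0, 0, 255)],
]
gray = rgb_to_gray(rgb)
binary = threshold(gray, level=70)
print(binary)
```

Because the comparison is strict, a pixel whose intensity equals the level exactly is mapped to black, matching the "greater than 70" wording above.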

Figure 2 Preprocessing Steps (a) RGB Image (b) Gray-scale Image (c) Binary Image (d) Image after Morphology (e) Normalized Image
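
As a rough illustration of the bilinear interpolation behind the normalization step, the sketch below resamples a small image to a new size. It is only an assumption about how size normalization could be done: the function name `resize_bilinear` and the corner-aligned coordinate mapping are choices made here, and the position and shape regulation mentioned in the text is not covered.

```python
def resize_bilinear(image, new_h, new_w):
    """Resize a 2-D list of intensities to new_h x new_w by bilinear interpolation."""
    old_h, old_w = len(image), len(image[0])
    out = []
    for i in range(new_h):
        # Map the output row to a fractional source row (corners aligned).
        y = i * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        dy = y - y0
        row = []
        for j in range(new_w):
            x = j * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - dx) * image[y0][x0] + dx * image[y0][x1]
            bottom = (1 - dx) * image[y1][x0] + dx * image[y1][x1]
            row.append((1 - dy) * top + dy * bottom)
        out.append(row)
    return out


small = [[0.0, 1.0],
         [1.0, 0.0]]
resized = resize_bilinear(small, 3, 3)
for line in resized:
    print(line)
```

Each output pixel is a weighted mean of its four nearest source pixels, so the centre of the 3x3 result takes the value 0.5.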

2.3 Feature extraction

The features used in our experiment are Hu’s seven moments and image averaging.

2.3.1 Hu’s seven moments (SM)

An essential issue in the field of pattern analysis is the recognition of objects and characters regardless of their position, size and orientation, as illustrated in Figure 1. The idea of using moments in shape recognition gained prominence when Hu (1962) [5] derived a set of moment invariants using the theory of algebraic invariants. The two-dimensional (p + q)th-order moments of an image with density function f(x, y) are defined in terms of Riemann integrals as

$$m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x, y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots \tag{1}$$


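The central moments are defined as:

$$\mu_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q f(x, y)\, dx\, dy \tag{2}$$

where the centroid is obtained from the zeroth- and first-order moments $m_{00}$, $m_{10}$ and $m_{01}$ of f(x, y):

$$\bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}} \tag{3}$$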

In particular, Hu (1962) defines seven values, computed by normalizing central moments through order three, that are invariant to object scale, position, and orientation.
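
The moment definitions can be made concrete with a short sketch. It assumes a discrete image stored as a 2-D list, so the integrals become sums over pixel coordinates, and it computes only the first two of the seven invariants (the standard forms φ1 = η20 + η02 and φ2 = (η20 − η02)² + 4η11², where η denotes the scale-normalized central moments). All function names are hypothetical, and this is not the code used in the experiments.

```python
def raw_moment(image, p, q):
    """Discrete raw moment: sum over pixels of x^p * y^q * f(x, y)."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def central_moment(image, p, q):
    """Central moment: the same sum taken about the image centroid."""
    m00 = raw_moment(image, 0, 0)
    x_bar = raw_moment(image, 1, 0) / m00
    y_bar = raw_moment(image, 0, 1) / m00
    return sum(
        ((x - x_bar) ** p) * ((y - y_bar) ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def normalized_moment(image, p, q):
    """Scale-normalized central moment: mu_pq / mu_00^(1 + (p + q) / 2)."""
    mu00 = central_moment(image, 0, 0)
    return central_moment(image, p, q) / mu00 ** (1 + (p + q) / 2)


def hu_first_two(image):
    """Return the first two Hu invariants (phi1, phi2) of the image."""
    eta20 = normalized_moment(image, 2, 0)
    eta02 = normalized_moment(image, 0, 2)
    eta11 = normalized_moment(image, 1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


# An L-shaped stroke and the same stroke shifted by one pixel in x and y.
shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]
print(hu_first_two(shape))
print(hu_first_two(shifted))
```

Both calls print the same pair of values up to floating-point rounding, which is the position invariance described above; rotating the stroke by 90 degrees leaves the two values unchanged as well.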

2.3.2 Image Averaging

Since the matrix representations of the ten digits from 0 to 9 are very different, it is reasonable to recognize them by checking each ‘pseudo pixel’, which is represented by a 4x4 block in a particular digit image. Each 4x4 block has its own average value and can be considered a single ‘pixel’. By choosing 4x4 blocks, we reduce the complexity of the recognition process while still maintaining the shape of the image. Figure 3 shows images of the digit zero before and after averaging.

Figure 3 Example of Image Averaging (a) Initial Image (b) Average Image
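
The averaging feature is simple enough to sketch directly. The example below assumes the image is a 2-D list whose height and width are multiples of the block size; the function name `block_average` is hypothetical, and the 4x4 block size is the one stated in the text.

```python
def block_average(image, block=4):
    """Replace each block x block tile of the image by its mean value."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    averaged = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        averaged.append(row)
    return averaged


# 8x8 binary example: left half white (1), right half black (0).
image = [[1, 1, 1, 1, 0, 0, 0, 0] for _ in range(8)]
features = block_average(image)
print(features)
```

An 8x8 input shrinks to a 2x2 grid of ‘pseudo pixels’, so the number of values to compare drops by a factor of 16 while the coarse layout of white and black regions is kept.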

2.4 Classification algorithm

2.4.1 Artificial neural network

Artificial neural networks [6], which are inspired by studies of biological nervous systems, are composed of many simple nonlinear computational elements (neurons or nodes) connected by links with variable weights. The inherent parallelism of these networks allows rapid pursuit of many hypotheses in parallel, resulting in high computation rates. Moreover, they provide a greater degree of robustness or fault tolerance than conventional computers because of the many processing nodes, each of which is responsible for a small portion of the task. Damage to a few nodes or links thus does not impair overall performance significantly. Neural networks can perform different tasks, one of which is supervised classification. This is a decision-making process that requires the net to identify the class or category that best represents an input pattern. It is assumed that the net has already adapted to the classes it is expected to recognize through a learning process using labeled training prototypes from each category.

Figure 4 General structure of a neural network [6]
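
To make the classification step concrete, the sketch below runs a forward pass through a tiny fully connected network. Everything specific here is an assumption for illustration: the paper does not state the architecture, so the single hidden layer, the sigmoid activation, the layer sizes and the hand-picked weights are all hypothetical, and the learning process that would normally set the weights is omitted.

```python
import math


def sigmoid(z):
    """Nonlinear activation applied by each neuron."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: each neuron outputs sigmoid(w . x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def classify(features, hidden_w, hidden_b, output_w, output_b):
    """Return the index of the output neuron with the largest activation."""
    hidden = layer_forward(features, hidden_w, hidden_b)
    outputs = layer_forward(hidden, output_w, output_b)
    return max(range(len(outputs)), key=outputs.__getitem__)


# Hand-picked weights for a 2-input, 2-hidden, 2-class toy network.
hidden_w = [[5.0, -5.0], [-5.0, 5.0]]
hidden_b = [0.0, 0.0]
output_w = [[5.0, -5.0], [-5.0, 5.0]]
output_b = [0.0, 0.0]

print(classify([1.0, 0.0], hidden_w, hidden_b, output_w, output_b))
print(classify([0.0, 1.0], hidden_w, hidden_b, output_w, output_b))
```

For digit recognition, the input would be a feature vector (Hu’s moments or the averaged pixels) and the output layer would have ten neurons, one per digit.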

2.4.2 Template matching using Euclidean Distance

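The Euclidean Distance classifier [7] assigns a testing sample to the class with the smallest ‘distance’, or error, between the sample and the entries of a dictionary that is built up in advance. For a testing feature vector x and a dictionary entry t, each with k components, the distance is

$$D(x, t) = \sqrt{\sum_{j=1}^{k} (x_j - t_j)^2}$$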
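
Template matching with the Euclidean Distance amounts to a nearest-neighbour search over the dictionary. The sketch below assumes the dictionary is a list of (label, feature vector) pairs; the names `euclidean_distance` and `classify_nearest` and the example vectors are made up for illustration.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_nearest(sample, dictionary):
    """Return the label of the dictionary entry closest to the sample."""
    best_label, _ = min(
        ((label, euclidean_distance(sample, template))
         for label, template in dictionary),
        key=lambda pair: pair[1],
    )
    return best_label


# Toy dictionary with one 4-component template per class.
dictionary = [
    (0, [1.0, 0.0, 0.0, 1.0]),
    (1, [0.0, 1.0, 1.0, 0.0]),
]
sample = [0.9, 0.1, 0.2, 0.8]
predicted = classify_nearest(sample, dictionary)
print(predicted)
```

Adding more templates per digit to the dictionary requires no retraining, which is consistent with the abstract's remark that a more comprehensive database could improve accuracy.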

3 Experimental results

Two different features and two classifiers result in four experiments: Hu’s SM with Neural Network, Hu’s SM with Euclidean Distance, Image Averaging with Neural Network, and Image Averaging with Euclidean Distance.

Table 1: Error rates for data from MNIST (over 1000 samples), by feature (Hu’s seven moments, Image averaging) and classifier (Neural Network, Euclidean Distance)

Table 2: Error rates for data from DUT (over 90 samples), by feature (Hu’s seven moments, Image averaging) and classifier (Neural Network, Euclidean Distance)

Figure 6 shows the actual result from the Graphical User Interface (GUI).

Figure 6 The experimental results of DUT and MNIST databases (a) DUT database

Figure 7 introduces the GUI. The user types the name of the JPEG file and the corresponding number of students, and then clicks the Convert button to get an xls file containing the extracted scores. To view the xls file, the user clicks the “Open to view xls file” button.

Figure 7 Graphical User Interface

4 Conclusion

The experimental results indicate that Image Averaging with Euclidean Distance gives more stable and smaller errors than the combination of Neural Network and SM, while the best performance is obtained using the Neural Network classifier. Based on these statistics, we decided to implement Image Averaging and Euclidean Distance in the final Number Recognition Product. In the future, this program will be upgraded to recognize scores written as decimal numbers (such as 9.5 or 10). In addition, a system that recognizes scores written in words will be added to check the result.

REFERENCES

[1] Yann LeCun, Corinna Cortes, The MNIST Database of Handwritten Digits, http://yann.lecun.com/exdb/mnist/

[2] http://www.mathworks.com/help/toolbox/images/ref/rgb2gray.html

[3] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, second edition (2001), pp. 528-532

[4] http://en.wikipedia.org/wiki/Bilinear_interpolation

[5] Ming-Kuei Hu, “Visual Pattern Recognition by Moment Invariants” (1962)

[6] V. Venugopal, W. Baets, “Neural Networks and Statistical Techniques in Marketing Research: A Conceptual Comparison” (1994), Vol. 12, Iss. 7, pp. 30-38

[7] Cheng-Lin Liu, “Handwritten Digit Recognition: Benchmarking of State-of-the-Art Techniques”, Elsevier Ltd (2002)
