
F. Li et al. (Eds.): ICWL 2008, LNCS 5145, pp. 344–355, 2008.

© Springer-Verlag Berlin Heidelberg 2008

Attributed Relational Graph Matching

Zhihui Hu1,2,3, Howard Leung2,3, and Yun Xu1,2

1 Department of Computer Science and Technology, University of Science & Technology of China, Hefei, China

2 Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute, Suzhou, China

3 Department of Computer Science, City University of Hong Kong, Hong Kong S.A.R.

kittyhu@mail.ustc.edu.cn, howard@cityu.edu.hk, xuyun@ustc.edu.cn

Abstract. Due to the complex shapes and various writing styles of Chinese characters, it is a challenge to automatically detect the errors in people's handwriting. In this paper, we use an attributed relational graph to represent a Chinese character. To model the spatial relationships between the strokes in a Chinese character, a refined interval relationship that considers more granular levels is proposed. A novel interval neighborhood graph is also proposed to compute the distances among the refined interval relationships. Error-tolerant graph matching is used to locate stroke production errors, stroke sequence errors, and spatial relationship errors. We also propose a pruning strategy to speed up the graph matching. Experimental results show that our proposed method outperforms existing approaches in terms of accuracy as well as its ability to handle more kinds of handwriting errors in less computational time.

Keywords: Chinese handwriting error detection, attributed relational graph, stroke spatial relationship error, error-tolerant graph matching.

1 Introduction

A Chinese character is an ideogram composed of many strokes. Correct handwriting should follow the correct position, proportion and order of each stroke. Law et al. [1] identify the following handwriting errors that children often make: 1) stroke production errors, which include missing, extra, broken, and concatenated strokes; 2) stroke sequence errors. In addition, there are other handwriting errors such as spatial relationship errors, which result from problems in the relative length or position of strokes. When a student makes a handwriting mistake, he/she often does not even realize it. It is thus essential for the student to receive feedback about his/her handwriting in order to correct any mistakes.

Traditionally, the teacher helps students find their handwriting errors in class; however, the teacher's available time for each student is limited. As a result, we are motivated to build a Chinese handwriting education system that assists the teacher by providing feedback when the teacher is absent. In this system, a student first writes a Chinese character by following a template character from the teacher; the system then automatically checks the handwriting and gives feedback to indicate whether and where there are any errors.

The existing handwriting education systems can be divided into two categories. The first is the view-only system: students can see how a Chinese character should be written, but they cannot practice handwriting through the system [2, 3]. The other category allows students to practice handwriting and gives feedback to indicate whether there are errors in their handwriting. These systems can be further divided into four main streams. The first focuses on locating production errors [4, 5, 6]. The second can only evaluate stroke sequence errors [7]. The third can detect spatial relationship errors among strokes [8]. The last combines the previous types: in [9], the system can find both stroke production and sequence errors, but it does not consider spatial relationship errors. As a result, we are motivated to explore a method that can identify stroke sequence, production and spatial relationship errors at the same time.

In this paper, we propose a method that, given an input online Chinese handwriting sample, can identify not only stroke production and sequence errors but also spatial relationship errors between strokes. This is achieved by attributed relational graph (ARG) matching. The attributed relational graph is a powerful tool for representing the relational structure of a pattern. It has been used in 2D recognition [10, 11] as well as in Chinese handwriting education [6]. In our application, a Chinese character is represented by a complete ARG. The nodes in the ARG describe the strokes of the character and the edges denote the relations between any two strokes.

As the relations between the strokes of Chinese characters are rather complex, we propose to extend the existing interval relationship to refine its granularity. The optimal detailed matching between the two ARGs is the mapping between corresponding strokes. To find this detailed matching, error-tolerant graph matching [13, 14] is used with the graph edit operations: deletion, insertion, substitution, merging and splitting of nodes and edges. The A* algorithm is applied to perform the state-space search of such a graph matching. The resulting operations reflect the graph distortions, and the edge operations in particular show the spatial relationships between strokes. However, the computational complexity of graph matching cannot be ignored, so we propose a pruning strategy to reduce the matching time.

The main contributions of this paper are as follows: 1) we propose an algorithm that can analyze an input online Chinese handwriting sample and determine stroke production errors, stroke sequence errors and stroke spatial relationship errors at the same time; 2) we define a refined interval relationship to model the spatial relationship between strokes and extend the interval neighborhood graph to obtain distance measures for the refined interval relationships; 3) we propose a pruning strategy to reduce the state-space search time when applying error-tolerant graph matching.

The remainder of this paper is organized as follows. In Section 2, the proposed ARG matching method incorporating the spatial relationships is described. Experiments and results are discussed in Section 3. Conclusions and future work are provided in Section 4.


2 Our Proposed Method

2.1 Overview

The flowchart of our method is illustrated in Figure 1. First, the sample handwriting entered by the student and the template character that the student should follow are both represented as ARGs. Then error-tolerant graph matching is applied to the two ARGs to find the stroke production and sequence errors in the sample handwriting. Afterwards, post-processing detects the stroke relationship errors. Finally, feedback that locates all the errors is provided to the student.

[Figure 1: sample handwriting and template handwriting → representation → character matching → post processing → feedback]

Fig. 1. Flowchart of our method

2.2 Spatial Relationship in Chinese Character

A Chinese character consists of many strokes that form a structure unique to that character. The spatial relationship between strokes is an important factor in determining whether a student's Chinese handwriting is correct. In object recognition, the spatial relationship between objects has been studied. Allen first described 13 interval relationships in [15], and the spatial relationships between objects have been described in [16, 17]. Nevertheless, these interval relationships are not sufficient to fully describe the spatial relationship between strokes. This is illustrated by the example in Figure 2. The strokes in Figure 2(a), (b) and (c) all have the same 'during' (d) relation as defined in Allen's interval relationships, meaning that the interval of stroke a lies within the interval of stroke b. However, only Figure 2(b) shows the standard handwriting of this character. The handwriting in Figure 2(a) and (c) is non-standard because stroke a is too long in Figure 2(a) and too short in Figure 2(c).

Fig. 2. Example of spatial relationships in a Chinese character: (a) non-standard handwriting, (b) standard handwriting, (c) non-standard handwriting
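The limitation illustrated by Figure 2 can be reproduced with a small sketch that classifies two one-dimensional intervals into one of Allen's 13 relations. The function name `allen_relation`, the symbol strings, and the numeric intervals are illustrative assumptions rather than anything taken from the paper; each interval is assumed to satisfy start < end.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation of interval a with respect to interval b.

    Symbols follow Allen's usual notation: '<' before, 'm' meets,
    'o' overlaps, 's' starts, 'd' during, 'f' finishes, '=' equals,
    with '>' and an 'i' suffix marking the inverse relations.
    """
    a0, a1 = a
    b0, b1 = b
    if a1 < b0:
        return "<"
    if b1 < a0:
        return ">"
    if a1 == b0:
        return "m"
    if b1 == a0:
        return "mi"
    if a0 == b0 and a1 == b1:
        return "="
    if a0 == b0:
        return "s" if a1 < b1 else "si"
    if a1 == b1:
        return "f" if a0 > b0 else "fi"
    if b0 < a0 and a1 < b1:
        return "d"
    if a0 < b0 and b1 < a1:
        return "di"
    return "o" if a0 < b0 else "oi"


# Hypothetical extents: stroke b spans (0, 10); stroke a is too long,
# standard, and too short, mirroring the three cases of Figure 2.
stroke_b = (0.0, 10.0)
cases = {"too long": (1.0, 9.0), "standard": (3.0, 7.0), "too short": (4.5, 5.5)}
relations = {name: allen_relation(a, stroke_b) for name, a in cases.items()}
print(relations)  # every case maps to 'd'
```

All three hypothetical cases yield the same symbol, which is why the plain interval relationship cannot separate standard from non-standard handwriting in this example.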

As Figure 2 illustrates, the relationship between strokes involves not only their topological relationship but also the relative distance between them. A more granular definition of the interval relationship can distinguish among the three cases in Figure 2. In particular, we propose to refine each interval relationship into three levels (f, m, l) by considering the distance information. The refined interval relationships of the strokes in Figure 2(a), (b) and (c) become 'dl', 'dm' and 'df' respectively. The same three distance-based levels can also be applied to the other existing interval relationships. The resulting refined relationships are summarized in Figure 3.

[Figure 3: table with columns Relation, Symbol, Symbol for inverse, Example]

Fig. 3. Refined interval relationships with more granular levels

2.3 Complete ARG Representation of Chinese Character

The ARG was first described in [10] to represent the structural information of a pattern as $g = (V, E, \alpha, \beta)$. In our application, the set of nodes $V$ describes the strokes of the Chinese character, and the set of edges $E$ describes the relationships between any two strokes as defined in Figure 3. The ARG representation is given as follows.

Nodes in the ARG. Each node stores the x and y coordinates of a stroke. The node labeling function $\alpha: V \to L_V$ returns n data points for each stroke [6].

Edges in the ARG. Each edge stores the relation between the two nodes (strokes) connected by this edge. The edge labeling function $\beta: E \to L_E$ returns $(\mu, \lambda)$, where $\mu$ and $\lambda$ are the refined interval relationships along the x-axis and y-axis respectively.

As an example, a Chinese character and its stroke spatial relationships are shown in Figure 4(a). The ARG representation of this character is shown in Figure 4(b). The strokes a, b and c in the character are represented by the nodes a, b and c in the ARG. The term $r_{s_1 s_2}$ denotes the relationship between strokes $s_1$ and $s_2$, where $s_1, s_2 \in \{a, b, c\}$ and $s_1 \neq s_2$.

[Figure 4: (a) a Chinese character; (b) the corresponding ARG, with the edge labels listed below]

$r_{ac}$: (df, mi), $r_{ca}$: (dif, m)

$r_{ab}$: (df, dif), $r_{ba}$: (dif, df)

$r_{bc}$: (dm, >m), $r_{cb}$: (dim, <m)

Fig. 4. ARG representation of a Chinese character

In this example, $r_{ac}$ is denoted by (df, mi), $r_{ab}$ by (df, dif), and $r_{bc}$ by (dm, >m). Note that $r_{ca}$ is formed simply by taking the inverse of each component of the relationship used to represent $r_{ac}$, and is denoted by (dif, m).
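The inverse rule for edge labels can be sketched as follows, using the three edge labels listed for Figure 4. This is only an illustration: the `Rel` class, the `INVERSE_BASE` table (built from Allen's conventional symbols, since Figure 3 is not reproduced here) and the helper names are assumptions. The sketch stores only one direction of each stroke pair and derives the other by inverting both components while keeping the distance level.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed inverse table based on Allen's conventional symbols.
INVERSE_BASE = {
    "<": ">", ">": "<", "m": "mi", "mi": "m", "o": "oi", "oi": "o",
    "d": "di", "di": "d", "s": "si", "si": "s", "f": "fi", "fi": "f",
    "=": "=",
}


@dataclass(frozen=True)
class Rel:
    """A refined interval relationship: base relation plus optional level."""
    base: str                    # e.g. 'd', 'di', 'mi', '>'
    level: Optional[str] = None  # 'f', 'm', 'l', or None when no level is given

    def inverse(self) -> "Rel":
        # Inverting swaps the base relation and keeps the distance level.
        return Rel(INVERSE_BASE[self.base], self.level)

    def __str__(self) -> str:
        return self.base + (self.level or "")


EdgeLabel = Tuple[Rel, Rel]  # (relation along x, relation along y)


def complete_edges(half: Dict[Tuple[str, str], EdgeLabel]) -> Dict[Tuple[str, str], EdgeLabel]:
    """Add the reverse edge (s2, s1) for every given edge (s1, s2)."""
    full = dict(half)
    for (s1, s2), (mu, lam) in half.items():
        full[(s2, s1)] = (mu.inverse(), lam.inverse())
    return full


# Edge labels r_ac, r_ab and r_bc as listed for Figure 4.
edges = complete_edges({
    ("a", "c"): (Rel("d", "f"), Rel("mi")),
    ("a", "b"): (Rel("d", "f"), Rel("di", "f")),
    ("b", "c"): (Rel("d", "m"), Rel(">", "m")),
})
for (s1, s2), (mu, lam) in sorted(edges.items()):
    print(f"r_{s1}{s2}: ({mu}, {lam})")
```

Under these assumptions the derived labels for the reverse edges agree with the ones listed for Figure 4, for instance (dif, m) for the edge from c to a.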

2.4 Error-Tolerant Graph Matching

As illustrated in Figure 1, the input (sample) handwriting is represented as an ARG $g_1 = (V_1, E_1, \alpha_1, \beta_1)$ and the template handwriting is represented as another ARG $g_2 = (V_2, E_2, \alpha_2, \beta_2)$. To decide whether the two ARGs differ, we find an error-tolerant graph matching from $g_1$ to $g_2$, which is a transformation denoted by the function f [13, 14]. This function f consists of edit operations performed on both nodes and edges. The node operations have been defined by the authors in [6]: node substitution, merging, splitting, deletion and insertion. We extend the work in [6] by adding edge operations defined as follows: 1) edge substitution, implying that both nodes sharing this edge are correct; 2) edge deletion, implying that one or both of the nodes sharing this edge is an extra or broken stroke; 3) edge insertion, implying that one or both of the nodes sharing this edge is a missing or concatenated stroke.
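One way to read the three edge operations is that they follow from a node mapping between the sample and the template. The sketch below derives them for the simplified case of extra and missing strokes only; merging and splitting (concatenated and broken strokes) are left out. The function name `edge_operations`, the mapping format, and the use of unordered stroke pairs are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]
EdgeOp = Tuple[str, Optional[Pair], Optional[Pair]]  # (kind, sample edge, template edge)


def edge_operations(
    sample_nodes: Sequence[str],
    template_nodes: Sequence[str],
    mapping: Dict[str, Optional[str]],
) -> List[EdgeOp]:
    """Derive edge edit operations from a node mapping.

    mapping sends each sample stroke to its template stroke, or to None
    when the sample stroke is treated as an extra stroke.
    """
    ops: List[EdgeOp] = []
    # Sample edges: substituted if both endpoints are matched, deleted otherwise.
    for s1, s2 in combinations(sample_nodes, 2):
        t1, t2 = mapping.get(s1), mapping.get(s2)
        if t1 is not None and t2 is not None:
            ops.append(("substitute", (s1, s2), (t1, t2)))
        else:
            ops.append(("delete", (s1, s2), None))
    # Template edges touching an unmatched (missing) stroke are inserted.
    matched = {t for t in mapping.values() if t is not None}
    for t1, t2 in combinations(template_nodes, 2):
        if t1 not in matched or t2 not in matched:
            ops.append(("insert", None, (t1, t2)))
    return ops


# Hypothetical case: the sample has an extra stroke x and lacks stroke c.
ops = edge_operations(
    sample_nodes=["a", "b", "x"],
    template_nodes=["a", "b", "c"],
    mapping={"a": "a", "b": "b", "x": None},
)
for op in ops:
    print(op)
```

In this hypothetical case the edge between a and b is substituted, the two edges touching the extra stroke x are deleted, and the two template edges touching the missing stroke c are inserted.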

Edge substitution. The cost of an edge substitution is the matching cost between an edge in the sample character and an edge in the template. We use $R^t$ to denote the set of edges in the template and $R^s$ to denote the set of edges in the sample. Note that an edge represents the spatial relationship between two strokes in a handwriting. The i-th template edge $R^t_i$ can be denoted by $(\mu^t_i, \lambda^t_i)$ and the j-th sample edge $R^s_j$ by $(\mu^s_j, \lambda^s_j)$. The dissimilarity between them is defined as $D(R^t_i, R^s_j)$, which is derived from the idea of the interval neighborhood graph [16]. Two interval relationships are neighbors if they can be transformed into one another by continuous deformation (shortening, lengthening, and moving) [17].

We construct a new interval neighborhood graph in Figure 5, which considers our proposed refined relationship with three levels (f, m, l) in each relationship defined in Figure 3. Note that the three levels of the same interval relationship are close to each other in the refined interval neighborhood graph, since they can be transformed from one to another by shortening or lengthening the distance between the two strokes.
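A distance derived from a neighborhood graph can be sketched as a shortest-path length between two relationship symbols. The example below is a rough illustration under several assumptions: the graph fragment contains only the three levels of the 'during' relation, chained as df, dm, dl (Figure 5 is not reproduced here, so the real graph may differ), and the per-axis distances are simply added to obtain an edge dissimilarity, which the text above does not specify.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(pairs: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected neighborhood graph from neighbor pairs."""
    graph: Graph = {}
    for u, v in pairs:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def neighborhood_distance(graph: Graph, start: str, goal: str) -> int:
    """Number of neighbor steps between two relations (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"no path between {start!r} and {goal!r}")


def edge_dissimilarity(graph: Graph, template_edge: Tuple[str, str],
                       sample_edge: Tuple[str, str]) -> int:
    """Assumed combination: sum of the x-axis and y-axis graph distances."""
    (mu_t, lam_t), (mu_s, lam_s) = template_edge, sample_edge
    return (neighborhood_distance(graph, mu_t, mu_s)
            + neighborhood_distance(graph, lam_t, lam_s))


# Hypothetical fragment: only the three levels of the 'during' relation.
FRAGMENT = build_graph([("df", "dm"), ("dm", "dl")])
print(neighborhood_distance(FRAGMENT, "df", "dl"))               # 2
print(edge_dissimilarity(FRAGMENT, ("dm", "dm"), ("dl", "df")))  # 2
```

With a full neighborhood graph supplied to `build_graph`, the same search would give distances between any pair of refined relationships; weighting the steps differently would require a weighted shortest-path search instead.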
