HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
MASTER THESIS
Representation learning for Knowledge Graph using
Deep Learning methods
TONG VAN VINH
Vinh.TV202705M@sis.hust.edu.vn
School of Information and Communication Technology
Supervisor: Assoc. Prof. Huynh Quyet Thang
Supervisor’s signature
Institution: School of Information and Communication Technology
January 12, 2022
Graduation Thesis Assignment
Name: Tong Van Vinh
Phone: +84354095052
Email: Vinh.TV202705M@sis.hust.edu.vn; vinhbachkhoait@gmail.com
Class: 20BKHDL-E
Affiliation: Hanoi University of Science and Technology
I, Tong Van Vinh, hereby warrant that the work and presentation in this thesis were performed by myself under the supervision of Assoc. Prof. Huynh Quyet Thang. All the results presented in this thesis are truthful and are not copied from any other works. All references in this thesis, including images, tables, figures, and quotes, are clearly and fully documented in the bibliography. I will take full responsibility for even one copy that violates school regulations.

Student
Signature and Name
ACKNOWLEDGMENTS

I would like to acknowledge and give my warmest thanks to my supervisor, Assoc. Prof. Huynh Quyet Thang, who inspired me a lot in my research career path. I also thank Mr. Huynh Thanh Trung, Dr. Nguyen Quoc Viet Hung, and Dr. Nguyen Thanh Tam for supporting me in giving birth to my brainchild and challenging myself by submitting it to top-tier conferences. I would also like to thank my committee members for letting my defense be an enjoyable moment and for their thoughtful comments and suggestions.

I would also like to give special thanks to my girlfriend Thu Hue and my family as a whole for their mental support during my thesis writing process. Nothing can touch my love for you. Moreover, in the absence of my friends, Tien Thanh, Trong Tuan, Hong Ngoc, Hieu Tran, Minh Tam, Quang Huy, Quang Thang, and Ngo The Huan, I could hardly have melted away all the tension from my work. Thanks for always accompanying me through ups and downs.

Finally, this work was funded by Vingroup and supported by the Vingroup Innovation Foundation (VINIF) under project code VINIF.2020.ThS.BK.07. I enormously appreciate all the financial support from Vingroup, which allowed me to stay focused on my research without worrying about my financial burden.
ABSTRACT

Knowledge graphs (KGs) have received significant attention in recent years. Gaining more profound insight into the structure of knowledge graphs allows us to tackle many challenging tasks, such as knowledge graph alignment, knowledge graph completion, and question answering. Recently, deep learning methods using the representation of knowledge graph entities (nodes) and relations (edges) in vector space have gained traction from the research community because of their flexibility and prospective performance. The best way to evaluate a representation learning method is to use that representation to solve real-world tasks. In terms of knowledge graphs, we can rank methods by their performance on tasks such as knowledge graph completion (KGC) or knowledge graph alignment (KGA). However, many research challenges still exist, such as enhancing the accuracy or simultaneously solving multiple tasks.

With such motivation, in the scope of our Master work, we address three groups of crucial challenges in knowledge graph representation, namely (i) challenges in enhancing KGC performance, (ii) challenges in enhancing KGA performance, and (iii) challenges in enhancing both KGC and KGA simultaneously. For the first class of challenges, we develop a model named NoGE, which takes advantage of not only the power of Graph Neural Networks (GNNs) but also the expressive power of the quaternion vector space and the co-occurrence statistics of elements in KGs to achieve state-of-the-art performance on the KGC task. Moving to the second challenge group, we propose EMGCN, a special GNN architecture designed to exploit different types of information to improve the final alignment results. Finally, we propose IKAMI, the first multitask-learning model, to solve the two tasks simultaneously. Our proposed techniques improve upon the state of the art for different tasks and thus cover an extensive range of applications.
Student
Signature and Name
TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Knowledge Graphs (KGs)
1.2 Knowledge graph completion and knowledge graph alignment
1.2.1 Knowledge graph completion
1.2.2 Knowledge graph alignment
1.2.3 The relation between completion and alignment
1.3 Research challenges
1.3.1 Handling knowledge graph completion challenges
1.3.2 Handling knowledge graph alignment challenges
1.3.3 Handling the challenges of solving the two tasks simultaneously
1.4 Thesis methodology
1.5 Contributions and Thesis Outline
1.6 Selected Publications

CHAPTER 2 BACKGROUND
2.1 Graph Convolutional Networks (GCNs)
2.2 Knowledge Graph Completion background
2.2.1 Incomplete knowledge graphs
2.2.2 Knowledge graph completion models
2.3 Knowledge Graph Alignment background
2.3.1 Previous approaches
2.3.2 Alignment constraints
2.3.3 Incomplete knowledge graph alignment

CHAPTER 3 ENHANCING KNOWLEDGE GRAPH COMPLETION PERFORMANCE
3.1 Introduction
3.2 Dual quaternion background
3.3 NoGE
3.4 Experimental Results
3.4.1 Experiment setup
3.4.2 Main results

CHAPTER 4 ENHANCING KNOWLEDGE GRAPH ALIGNMENT PERFORMANCE
4.1 Introduction
4.2 Overview of the Proposed Approach
4.2.1 Motivation
4.2.2 The entity alignment framework
4.3 Relation-aware Multi-order Embedding
4.3.1 GCN-based embedding model
4.3.2 Loss function
4.4 Alignment Instantiation
4.4.1 Single-order alignment matrices
4.4.2 Multi-order alignment matrix
4.4.3 Attribute Alignment
4.4.4 Putting It All Together
4.5 Empirical evaluation
4.5.1 Experimental setup
4.5.2 End-to-end comparison
4.5.3 Efficiency Test
4.5.4 Ablation Test
4.5.5 Hyperparameter sensitivity
4.5.6 Robustness to constraint violations

CHAPTER 5 MULTITASK LEARNING FOR KNOWLEDGE GRAPH COMPLETION AND KNOWLEDGE GRAPH ALIGNMENT
5.1 Introduction
5.2 Incomplete Knowledge Graph Alignment
5.2.1 Challenges
5.2.2 Outline of the Alignment Process
5.3 Feature channel models
5.3.1 Pre-processing
5.3.2 Transitivity-based channel
5.3.3 Proximity-based channel
5.4 The complete alignment process
5.4.1 Alignment instantiation
5.4.2 Missing triples recovery
5.4.3 Link-augmented training process
5.5 Evaluation
5.5.1 Experimental Setup
5.5.2 End-to-end comparison
5.5.3 Robustness to KG incompleteness
5.5.4 Saving of labelling effort
5.5.5 Qualitative evidence

CHAPTER 6 CONCLUSION
LIST OF FIGURES

1.1 An illustration of a knowledge graph
1.2 An example of knowledge graph completion
1.3 An example of knowledge graph entity alignment
1.4 Aligning incomplete KGs across domains
1.5 Encoder-Decoder architecture for GNN-based models
2.1 CNN and GCN comparison [37]
3.1 An illustration of our proposed NoGE
4.1 Overview of the EMGCN framework
4.2 Computation time
4.3 Different supervision percentages
4.4 #GCN-layers
4.5 Embedding dimension
4.6 Robustness to violations of entity consistency
4.7 Robustness to violations of relation consistency
5.1 Framework Overview
5.2 Running time (in log scale) on different datasets
5.3 Saving of labelling effort for entity alignment on the D-W-V1 test set
5.4 Robustness of graph alignment models against noise on the EN-DE-V2 test set
5.5 Attention visualisation (EN-FR-V1 dataset). The model pays less attention to noisy relations
5.6 KGC performance comparison between TransE and IKAMI during training
LIST OF TABLES

3.1 Statistics of the experimental datasets
3.2 Experimental results on the CoDEx test sets
3.3 Ablation results on the validation sets
4.1 Statistics of real-world datasets
4.2 End-to-end comparison
4.3 Ablation Test
4.4 Different weighting schemes of GCN layers
4.5 Effects of similarity matrix coefficients
5.1 Summary of notation used
5.2 Dataset statistics for KG alignment
5.3 End-to-end KG alignment performance (bold: winner, underline: first runner-up)
5.4 Ablation study
5.5 Knowledge Graph Completion performance
5.6 Correctly aligned relations in EN↔FR KGs
CHAPTER 1 INTRODUCTION

At the end of this chapter, we describe our contributions and the thesis outline, followed by a list of selected publications.
1.1 Knowledge Graphs (KGs)
Figure 1.1: An illustration of a knowledge graph
Knowledge graphs (KGs) are knowledge bases, but they use graph-structured data to encode information. They present facts about real-world entities in the form of triples ⟨head entity, relation, tail entity⟩ [1], [2]. For instance, Figure 1.1 illustrates a knowledge graph that contains many triples, such as (DA VINCI, painted, MONA LISA). In this example, DA VINCI, painted, and MONA LISA are a head entity, a relation, and a tail entity, respectively. Each triple in a knowledge graph can be considered a fact, and a knowledge graph is thus a set of valid triples. Valid triples represent true facts (Melbourne, city of, Australia), while invalid triples represent false facts (Melbourne, city of, Vietnam). Corrupted triples are facts that are not available in the current knowledge graph. A corrupted triple can be a valid triple or an invalid one [3].
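To make these definitions concrete, the following minimal Python sketch (our own illustration, not code from the thesis) stores a KG as a set of triples and generates corrupted triples by replacing the tail entity:

# A KG stored as a set of (head, relation, tail) triples, mirroring the
# definitions above. The entity and relation names are from Figure 1.1.
knowledge_graph = {
    ("DA VINCI", "painted", "MONA LISA"),
    ("Melbourne", "city of", "Australia"),
}

def is_known_valid(triple, kg):
    # A triple already present in the KG is a known valid fact.
    return triple in kg

def corrupt_tail(triple, entities):
    # Corrupted triples are absent from the current KG; note that a
    # corrupted triple may still be valid in the real world.
    h, r, t = triple
    return [(h, r, e) for e in entities if e != t]

entities = {"DA VINCI", "MONA LISA", "Melbourne", "Australia", "Vietnam"}
print(is_known_valid(("Melbourne", "city of", "Australia"), knowledge_graph))  # True
print(corrupt_tail(("Melbourne", "city of", "Australia"), entities))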
In recent years, knowledge graphs have been used in natural language processing, intelligent question-answering systems, intelligent recommendation systems, etc. With big data and deep learning, knowledge graphs have become one of the core driving forces for the development of artificial intelligence.
1.2 Knowledge graph completion and knowledge graph alignment
1.2.1 Knowledge graph completion
Figure 1.2: An example of knowledge graph completion
It is a fact that large knowledge graphs, even at the scale of billions of triples, are still incomplete, i.e., missing a lot of valid triples [3]. Therefore, many research works have focused on inferring missing triples in KGs, a task called knowledge graph completion (KGC). By completing knowledge graphs, we can enrich current knowledge bases and thus improve the performance of their applications. The KGC task is sometimes referred to as a link prediction task. Intuitively, given two out of three elements of a missing triple, the task is to predict the third element. For example, it can be seen from Figure 1.2 that given the current knowledge graph and two available components (head entity Ronald Colman and relation Job) of a missing triple, we are asked to predict the tail entity. This task corresponds to answering the question "What is Ronald Colman's job?". Recently, extensive studies have been done on learning low-dimensional representations of entities and relations for missing link prediction [4]. These methods have been demonstrated to be scalable and effective. Their general intuition is to model and infer the connectivity patterns in knowledge graphs according to the observed knowledge facts. For example, some relations are symmetric (e.g., marriage) while others are antisymmetric (e.g., affiliation); some relations are 1-to-1 relationships (e.g., is capital of) or Many-to-Many (e.g., is author of); and some relations may be composed from others (e.g., my mother's husband is my father). It is critical to find ways to model and infer these patterns [5]. A robust knowledge graph completion model should have enough expressive power to represent all these relation types.

Indeed, many existing architectures have been trying to model one or a few of the above relation patterns [3], [6]. Few models have been proved to be fully expressive, which means they can successfully model all these patterns [5], [7]. However, these models often face the same challenge of over-fitting because of their large number of trainable parameters.
In addition to conventional KG embedding models such as TransE [3], DistMult [8], ComplEx [9], and ConvKB [6], recent approaches have adapted graph neural networks (GNNs) for knowledge graph completion [10], [11], [12]. In general, vanilla GNNs are modified and utilized as an encoder module to update vector representations for entities and relations; these vector representations are then fed into a decoder module that adopts a score function (e.g., as employed in TransE, DistMult, and ConvE) to return the triple scores. The model is trained so that valid triples have higher scores than invalid ones.
1.2.2 Knowledge graph alignment
Figure 1.3: An example of knowledge graph entity alignment
Popular knowledge graphs (e.g., DBpedia, YAGO, and BabelNet) are often multilingual, in which each language domain has a separate version [13]. To encourage knowledge fusion between different domains, knowledge graph alignment (KGA), the task of identifying entities in cross-lingual KGs that refer to the same real-world object, has received significant interest from both industry and academia [14], [15]. The alignment result can be used for further data enrichment applications such as repairing inconsistencies, filling knowledge gaps, and building cross-lingual KBs [16]–[18].
Given two knowledge graphs (of different domains or languages), the KGA task is to find the correspondence of entities across the two knowledge graphs. For example, Figure 1.3 illustrates two knowledge graphs with their entities in different colors (blue and orange). We aim to infer all the alignment information (red dashed lines) from the current information (the KGs' structures, entity names, and attributes). The problem of entity alignment for cross-lingual KGs has been studied intensively with the emergence of graph embedding techniques [19], [20]. Given two monolingual KGs, these techniques first learn low-dimensional vectors representing the entities of each KG, and the corresponding entities are then discovered based on their vector similarities. The first-generation methods of this paradigm, including MTransE [21], JAPE [22], ITransE [23], and BootEA [24], learn the embeddings under the assumption that if two entities have a relation, the distance between their respective embeddings equals the embedding of their relation. Avoiding this strict assumption [25], the second generation of embedding techniques, such as GCN-Align [26], RDGCN [25], MUGNN [27], KG-matching [28], and NAEA [29], employ graph neural networks, which encode the structural relationship based on neighbourhood information [30].
1.2.3 The relation between completion and alignment
Figure 1.4: Aligning incomplete KGs across domains
Existing knowledge graph alignment techniques often assume that the input KGs are nearly identical (isomorphic), which is not true in practice [31]–[33]. There is usually a considerable gap between the levels of completeness of different monolingual KGs [2], especially between the English domain and other languages [34]. For example, in the DBP15K dataset, the ratio of relational triples in the Chinese, Japanese, or French KGs over the English KG is only around 80% [35]. Figure 1.4 gives an example of incomplete KGs, in which the neighborhoods of the two entities referring to the actor Ronald Colman in the English and French KGs are inconsistent (his occupation is missing in the English KG while his place of birth (lieuNaissance) is missing in the French KG). Such inconsistencies easily lead to different representations of the corresponding entities, especially for GNNs, where the noise is accumulated over neural layers [36].
On the other hand, to the best of our knowledge, no knowledge graph completion method utilizes external knowledge. We may have multiple knowledge graphs with different amounts of facts; thus, we can design a model that transfers knowledge from one to another. As a result, we can leverage this mechanism to better enrich our knowledge bases instead of just using one knowledge graph to complete itself. For example, in Figure 1.4, suppose we have already successfully aligned the same-color entities. Then we can easily find the missing relation between the entities Ronald Colman and Actor in the left-hand side graph based on the relation (metier) that connects their corresponding entities in the right-hand side of the figure. Similarly, we can also find the missing relation between the entities Ronald Colman and Surrey in the right-hand side graph based on the corresponding relation in the left-hand side graph.

So the two tasks, KGA and KGC, are related to each other. Intuitively, solving one task can help us better tackle the other. However, designing an architecture to solve these tasks simultaneously is not trivial. This thesis will introduce a multitask-learning method for simultaneously tackling the two tasks.
1.3 Research challenges
As mentioned in Section 1.2, KGC is a challenging task because of the variety of relation types. Moreover, how to balance expressive power against the risk of over-fitting is still an open question. On the other hand, KGA is also a complex problem due to its NP-hard nature and the fast expansion of networks in today's complex applications. This section briefly discusses some existing challenges of KGA and KGC.
1.3.1 Handling knowledge graph completion challenges
The expressive power of a representation learning model is of paramount importance in KGC because of KGs' various relation types. However, there is a trade-off between expressive power and over-fitting risk.
Figure 1.5: Encoder-Decoder architecture for GNN-based models
Recently, to overcome the problem of over-fitting and to better capture the neighborhood relationships between entities and relations, many models have made use of Graph Neural Networks (GNNs). In general, vanilla GNNs are modified and utilized as an encoder module to update vector representations for entities and relations; then, these vector representations are fed into a decoder module that adopts a score function (e.g., as employed in TransE, DistMult, and ConvE) to return the triple scores (as illustrated in Figure 1.5). Note that the expressive power of this type of architecture depends on the expressiveness of its decoder (or scorer). However, designing an encoder module that fits the input of the decoder module is not a trivial task and thus can be considered another challenge.
Another challenge when designing KGC models is how to model the co-occurrence between elements in KGs. Entities and relations forming facts frequently co-occur in news articles, texts, and documents, e.g., "Melbourne" frequently co-occurs with "Australia". Constructing a model capable of capturing this relationship is also considered one of the main challenges of KGC.
1.3.2 Handling knowledge graph alignment challenges
Most KGA challenges can be linked to general graph alignment (GA) challenges, such as handling scalability or improving the model's accuracy. However, there are some challenges specific to KGA, notably how to successfully take the multi-hop structural information around entities into consideration when solving the alignment task. Furthermore, existing models have not fully utilized the attribute information of entities (e.g., the age attribute of a person, a country's population) due to the high levels of inconsistency and linguistic differences. For example, GCN-Align considers only the attribute types and ignores their values. Another challenge is designing a model that can adapt to noise, e.g., when one KG possesses more entities than another. Attribute noise is also common, e.g., when an entity in the source KG has more attributes than its counterpart in the target KG, or when they are stored differently.
1.3.3 Handling the challenges of solving the two tasks simultaneously
Existing techniques often assume that the input KGs are nearly identical (isomorphic), which is not true in practice [31]–[33]. There is usually a considerable gap between the levels of completeness of different monolingual KGs [2], especially between the English domain and other languages [34]. For example, in the DBP15K dataset, the ratio of relational triples in the Chinese, Japanese, or French KGs over the English KG is only around 80% [35]. Figure 1.4 gives an example of incomplete KGs, in which the neighborhoods of the two entities referring to the actor Ronald Colman in the English and French KGs are inconsistent (his occupation is missing in the English KG while his place of birth is missing in the French KG). Such inconsistencies easily lead to different representations of the corresponding entities, especially for GNNs, where the noise is accumulated over neural layers [36]. On the other hand, how to automatically use other knowledge bases to complete a knowledge graph remains a hard question. Finally, solving the two problems simultaneously is appealing, but it is not a trivial task.
1.4 Thesis methodology
The theme of this thesis is to find deep learning methods that can produce expressive representations for knowledge graph entities and relations, so that they can outperform current state-of-the-art models in two well-known tasks, namely knowledge graph alignment and knowledge graph completion, by addressing the above challenges. To this end, the proposed methods should adapt to various application settings, save computation power and memory while guaranteeing that the inference step runs within a reasonable time, and produce results of high accuracy. We follow a top-down approach, where we focus on tackling the tasks for various types of datasets with different complexity levels. We attempt to overcome the mentioned challenges for each network type by first analyzing the framework's requirements, then designing an embedding-based model and its components to satisfy those needs. For each proposed framework of each network type, we validate its effectiveness through extensive experiments with both synthetic and real-world datasets against state-of-the-art baselines. We also demonstrate the scalability and robustness of the proposed models against different adversarial conditions (e.g., structural noise).
1.5 Contributions and Thesis Outline
In addressing the above research questions, this thesis makes the following contributions:

Enhancing knowledge graph completion performance. In Chapter 3, we solve the problem of knowledge graph completion on three new challenging KGs. Given an incomplete knowledge graph with many missing valid triples, we propose a knowledge graph completion framework that can produce high-quality results. In particular:
• We propose a new effective GNN-based KG representation learning model, named NoGE, to integrate co-occurrence among entities and relations in the encoder module for knowledge graph completion.

• We also propose a novel form of GNNs, named Dual Quaternion Graph Neural Network (DualQGNN), as the encoder module, which allows the representations of KG entities and relations to be expressive.

• We conduct extensive experiments to compare our NoGE with other strong GNN-based baselines and show that NoGE outperforms these baselines as well as other up-to-date KG embedding models, obtaining state-of-the-art results on three new and difficult benchmark datasets.
Enhancing knowledge graph alignment performance. In Chapter 4, we solve the problem of knowledge graph alignment on large-scale KGs. Given the two associated KGs, we propose an architecture that embeds the entities of the KGs into a low-dimensional vector space and then aligns corresponding entities across the KGs. The contributions of this solution are as follows:
• We propose a framework called EMGCN for unsupervised KG entity alignment with no prior knowledge. Since this framework is grounded in the late-fusion mechanism, rich KG information (e.g., relational triples, attribute types, attribute values) can be integrated regardless of the modality. This allows us to be the first in the literature to successfully use attribute values.

• We design a GCN-based model that exploits the rare characteristics of GCNs, including multi-order encoding and permutation immunity, to simultaneously integrate different relation-related consistency constraints. We also tailor the loss function to enforce joint and consistent learning of the embeddings of the two KGs to support their alignment and avoid having to reconcile their embedding spaces.

• We conduct experiments on real-world and synthetic KG datasets to evaluate our scheme. The results show that our framework outperforms other baselines and is also robust to various types of adversarial conditions.
Enhancing knowledge graph completion and knowledge graph alignment performance at the same time. Chapter 5 solves the two mentioned tasks simultaneously by proposing a multi-task learning model. We argue this is the first architecture to solve two research questions related to KGs simultaneously. In particular, the contributions of this innovation are as follows:
• We address the problem of aligning incomplete KGs using external knowledge bases and propose a framework called Incomplete Knowledge graphs Aligner via MultI-channel Feature Exchange (IKAMI). The model exploits multiple representations to capture the multi-channel nature of KGs (e.g., relational type, entity name, structural information). This is the first attempt to address entity alignment and knowledge completion at the same time, and we argue that this collaboration benefits both tasks, especially the alignment performance.

• We design a joint training schedule for the two embedding models so that the holistic objectives of the embeddings can support each other well. Then, the similarity matrix for each channel is calculated and fused by a weighted sum to return the final result.

• We conduct experiments on real-world and synthetic KG datasets to evaluate our scheme. The results show that our framework outperforms other baselines in the entity alignment task and the knowledge completion task by up to 15.2% and 3.5%, respectively.
The remainder of this thesis is organised as follows. Chapter 2 presents a survey of the literature related to the research challenges addressed in this thesis. Chapters 3, 4, and 5 address the research challenges above. Chapter 6 concludes the thesis and discusses future work.
1.6 Selected Publications
This thesis is based on the following research papers:
• Dai Quoc Nguyen*, Vinh Tong*, Dinh Phung, Dat Quoc Nguyen. "Two-view Graph Neural Networks for Knowledge Graph Completion". In the 15th ACM International Conference on Web Search and Data Mining, 2022. Accepted (WSDM - rank A*).

• Nguyen Thanh Tam, Huynh Thanh Trung, Hongzhi Yin, Tong Van Vinh, Darnbi Sakong, Bolong Zheng, Nguyen Quoc Viet Hung. "Entity Alignment for Knowledge Graphs with Multi-order Convolutional Networks". In IEEE Transactions on Knowledge and Data Engineering, 2021. Accepted (TKDE - rank Q1).

• Vinh Tong, Huynh Thanh Trung, Nguyen Thanh Tam, Nguyen Quoc Viet Hung, Huynh Quyet Thang. "IKAMI: Multi-channel Feature Exchange for Aligning Incomplete Knowledge Graphs from Different Domains". Submitted to the 48th International Conference on Very Large Data Bases, 2022. Under review (VLDB - rank A*).
CHAPTER 2 BACKGROUND
2.1 Graph Convolutional Networks (GCNs)
Figure 2.1: CNN and GCN comparison [37]
Convolutional Neural Networks (CNNs) have long been used as a great tool for capturing features of images (or grid-structured data in general). As illustrated in Figure 2.1, a CNN (left-hand side) operates over a grid structure of pixels where each pixel has exactly 8 neighboring pixels (or 3 or 5 if the pixel is at a corner or edge of the image). On the other hand, on the right-hand side, we can see that there is no fixed structure (number of neighbors) around each node in the graph. Thus, designing a graph convolution is not as straightforward as in a CNN. Indeed, a GCN can be considered a generalized version of a CNN, since a CNN's 2D structure is equivalent to a special graph.
To get the hidden representation of a centre node, one simple solution in a GCN is to take the average of the node's own features along with those of its neighbor nodes [37]. Suppose we have a homogeneous graph G = (V, A, X), where V is the set of nodes; A ∈ {0, 1}^{|V|×|V|} is the adjacency matrix, where A_{u,v} = 1 means there is an edge connecting node u to node v of the graph and A_{u,v} = 0 otherwise; and X ∈ R^{|V|×d} is the attribute matrix, where X_v is the initial attribute vector of node v. A GCN learns multi-layer representations for nodes in the graph:

$$H^{k+1} = \sigma\left(\tilde{L} H^k W^k\right)$$

where W^k is the k-th layer trainable parameter of the model and σ is an activation function, such as ReLU(.) or Sigmoid(.); H^k is the k-th layer representation of the graph nodes (H^0 = X); and L̃ is the normalized graph Laplacian matrix [37], which is computed as follows:

$$\tilde{L} = D^{-\frac{1}{2}} (A + I) D^{-\frac{1}{2}}$$

where I is the identity matrix and D is the degree matrix (a diagonal matrix where D_{v,v} equals the degree of node v). We then have the formula for updating the representation h_v of a node v:

$$h_v^{k+1} = \sigma\left(\sum_{u \in N(v) \cup \{v\}} \frac{1}{\sqrt{D_{u,u} D_{v,v}}}\, W^k h_u^k\right)$$

where N(v) denotes the set of neighbors of node v.
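The following NumPy sketch (our own illustration; the names and toy data are ours, and a real implementation would use sparse matrices and learned weights) implements one such GCN layer:

import numpy as np

def gcn_layer(A, H, W):
    # One GCN layer: H_{k+1} = ReLU(L_tilde @ H_k @ W_k),
    # with L_tilde = D^{-1/2} (A + I) D^{-1/2} as defined above.
    A_tilde = A + np.eye(A.shape[0])          # add self-loops
    deg = A_tilde.sum(axis=1)                 # node degrees (self-loop counted)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^{-1/2}
    L_tilde = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    return np.maximum(0.0, L_tilde @ H @ W)   # ReLU activation

# Toy 3-node path graph: nodes 0-1 and 1-2 are connected.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)    # initial node attributes (H^0 = X)
W0 = np.random.randn(4, 2)   # first-layer weights
print(gcn_layer(A, X, W0).shape)  # (3, 2)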
2.2 Knowledge Graph Completion background
2.2.1 Incomplete knowledge graphs
A KG is often denoted as KG = (V, R, E), where V is the set of entities, R is the set of relations, and E is the set of triples. A triple ⟨h, r, t⟩ ∈ E is the atomic unit of a KG, which depicts a relation r between a head h (an entity) and a tail t (an attribute or another entity). We represent incomplete knowledge graphs by extending the KG notation as i-KG = (V, R, E, Ē), where Ē is the set of missing triples in the i-KG. For brevity's sake, we use i-KG and KG interchangeably in this thesis. Many KGs carry attribute information along with structural information. For example, it can be seen from Figure 1.3 that each entity has some extra information, such as Duke University having "type: University", "location: North Carolina", and "founded: 1838". To model this information, some works introduce additional elements to the definition of KGs, namely attribute triples (e.g., ⟨Duke University, Founded, 1838⟩). Formally, a KG with additional attribute and value information can be represented as i-KG = (V, R, E, Ē, A, V, E_A), where A, V, and E_A are the sets of attributes, values, and attribute triples, respectively.
2.2.2 Knowledge graph completion models
Given an incomplete knowledge graph KG = (V, R, E, Ē, A, V, E_A), where Ē is unrevealed, the knowledge graph completion (KGC) task aims to discover all the missing triples {⟨h, r, t⟩ ∈ Ē | ⟨h, r, t⟩ ∉ E}.
For each triple ⟨h, r, t⟩, an embedding model defines a score function f(h, r, t) measuring its plausibility. The goal is to choose f such that the score f(h, r, t) of a correct triple (h, r, t) is higher than the score f(h̄, r̄, t̄) of an incorrect triple (h̄, r̄, t̄). For example, TransE [3] defines the score function f_TransE(h, r, t) = −||h_h + h_r − h_t||, where h, r, and t are represented by the low-dimensional vectors h_h, h_r, and h_t, respectively. As (Melbourne, city of, Australia) is a correct triple while (Melbourne, city of, Vietnam) is an incorrect one, we would have: −||h_Melbourne + h_city of − h_Australia|| > −||h_Melbourne + h_city of − h_Vietnam||. Shallow embedding models are often distinguished from each other by their score functions; we explore some of them in the following.
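As a toy illustration of such a score function, the following Python sketch (our own, with hand-picked embeddings) shows TransE assigning the valid triple a higher score than the invalid one:

import numpy as np

def transe_score(h_vec, r_vec, t_vec):
    # TransE plausibility score: -||h + r - t|| (higher = more plausible).
    return -np.linalg.norm(h_vec + r_vec - t_vec)

emb = {  # toy 2-dimensional embeddings
    "Melbourne": np.array([1.0, 0.0]),
    "city of":   np.array([0.0, 1.0]),
    "Australia": np.array([1.0, 1.0]),
    "Vietnam":   np.array([-1.0, -1.0]),
}
valid = transe_score(emb["Melbourne"], emb["city of"], emb["Australia"])
invalid = transe_score(emb["Melbourne"], emb["city of"], emb["Vietnam"])
assert valid > invalid  # the correct triple scores higher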
a, Translation-based models
The first model in this category is TransE [3], which is inspired by models such as Word2vec and the Skip-gram model [38], where relationships between words often correspond to translations in the latent feature space. In particular, TransE returns a low-dimensional representation for each entity and relation in the KG and ensures that each relation type corresponds to a translation operation from the head entity vector to the tail entity vector, i.e., h_h + h_r = h_t. However, this limits TransE to modeling only 1-to-1 relationships such as "is capital of", where a head entity is linked to at most one tail entity given a relation type. Thus, it fails to model "Many-to-1", "1-to-Many", or "Many-to-Many" relationships.
On the other hand, TransH [39] handles these problems of TransE by associating each relation with a relation-specific hyperplane and uses a projection vector to project entity vectors onto that hyperplane. TransD [40] and TransR/CTransR [41] extend TransH by using two projection vectors and a matrix, respectively, to project entity vectors into a relation-specific space. STransE [42] and TranSparse [43] can be viewed as direct extensions of TransR, where head and tail entities are associated with their own projection matrices. Unlike STransE, TranSparse [43] uses adaptive sparse matrices, whose sparsity degrees are defined based on the number of entities linked by relations. ITransF [44] can be considered a generalization of STransE, which allows the sharing of statistical regularities between relation projection matrices and alleviates data sparsity issues.
b, Neural network-based models
The neural tensor network (NTN) [45] model uses a bilinear tensor operator to represent each relation, while ProjE can be viewed as a simplified version of NTN. ConvE [4] and ConvKB [6] are based on convolutional neural networks. ConvE uses a convolution layer directly over the 2D reshaping of head entity and relation embeddings, while ConvKB applies a convolution layer over the embedding triples. HypER [46] simplifies ConvE by using a hypernetwork to produce 1D convolutional filters for each relation, then extracts relation-specific features from head entity embeddings.
c, Complex vector-based models
Instead of embedding entities and relations in the real-valued vector space, ComplEx [9] extends embedding models to the complex vector space. ComplEx-N3 [47] extends ComplEx with a weighted nuclear 3-norm. Also in the complex vector space, RotatE [5] defines each relation as a rotation from the head entity to the tail entity. QuatE [48] represents entities by quaternion embeddings (i.e., hypercomplex-valued embeddings) and models relations as rotations in the quaternion space by employing the Hamilton and quaternion inner products.
d, Graph neural network-based models
Currently, there is an increasing trend of using graph neural networks (GNNs) as an efficient tool to obtain graph representations that capture not only the graph structure but also node attributes. GNNs were originally designed for general undirected graphs; however, several works have adapted those architectures to fit the multi-relational nature of KGs. Generally, they design their own graph neural network architectures as encoders to return entity and relation embeddings. These embeddings are then pushed forward to a decoder module, which can be any of those mentioned above, to return triple scores. Formally, a GNN architecture is a multi-layer neural network where each layer produces embeddings of graph components by:

$$h_p^{k+1} = f_e\left(\sum_{(q,r) \in N(p)} m_{qr}^k\right)$$
where N(p) = {(q, r) | (⟨p, r, q⟩ ∈ E) ∨ (⟨q, r, p⟩ ∈ E)} is the neighbor set of entity p; m_{qr}^k ∈ R^d denotes the message passed from neighbor entity q to entity p through relation r; and f_e is a linear transformation followed by an activation function. The main difference between GNN models lies in their messages m_{qr}^k. Regarding the GNN-based KG embedding approaches, R-GCN [10] modifies GCNs to introduce a specific message:

$$m_{qr}^k = \frac{1}{c_{p,r}} W_r^k h_q^k$$

where W_r^k is a relation-specific weight matrix and c_{p,r} is a normalization constant.
Recently, CompGCN [12] customizes GCNs to consider composition operations: its message combines the entity and relation embeddings through a composition operator φ (e.g., subtraction, multiplication, or circular-correlation), and the relation embeddings themselves are updated at each layer:

$$h_r^{(k+1)} = W^k h_r^k$$
CompGCN then applies ConvE [4] as the decoder module. This model is actually the first architecture that allows relations to have their own embeddings at each GNN layer.
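To make the message-passing formulation concrete, the following NumPy sketch (our own simplification; the actual R-GCN additionally uses basis decomposition and per-relation normalization constants) implements an R-GCN-style layer with relation-specific weight matrices:

import numpy as np

def rgcn_layer(triples, H, W_rel, W_self):
    # Each entity aggregates messages transformed by the weight matrix
    # W_rel[r] of the incoming relation, plus a self-loop term.
    H_new = H @ W_self                    # self-loop contribution
    counts = np.ones(H.shape[0])          # simple per-entity normalization
    for h, r, t in triples:               # message from head h to tail t
        H_new[t] += H[h] @ W_rel[r]
        counts[t] += 1
    return np.maximum(0.0, H_new / counts[:, None])  # ReLU

num_entities, num_relations, d = 4, 2, 8
H = np.random.randn(num_entities, d)
W_rel = np.random.randn(num_relations, d, d)
W_self = np.random.randn(d, d)
triples = [(0, 0, 1), (1, 1, 2), (3, 0, 2)]  # (head, relation, tail) ids
print(rgcn_layer(triples, H, W_rel, W_self).shape)  # (4, 8)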
Note that R-GCN and CompGCN do not consider co-occurrence among entities and relations in the encoder module. This limitation also exists in other GNN-based models such as SACN [11].
2.3 Knowledge Graph Alignment background
2.3.1 Previous approaches
In recent years, entity alignment models based on representation learning have rapidly received widespread attention from academia and industry. These methods use low-dimensional vector representations of the entities in KGs to calculate the similarity between entities across KGs and thereby find equivalent entity pairs. They can be divided into two main categories, namely semantic matching-based models and graph neural network-based models.
a, Semantic matching-based models
All models in this class try to encode semantic information about entities into their low-dimensional vector representations. Inspired by TransE, MTransE [21] uses TransE to learn the vector representation of each single knowledge graph and then learns a linear transformation to map them into the same vector space. The model then uses the cosine similarity metric to align entity pairs. IPTransE [23] restricts pre-aligned equivalent entities to have close vector representations and then uses PTransE [49] to iteratively learn and align different KGs in a unified vector space. BootEA [24] tries to solve a more challenging problem: it considers only a small number of supervised entity pairs and then continuously selects possible entity pairs for training through an iterative method.
Another representation method is to integrate a variety of knowledge to enrich entity semantics. JAPE [22] uses TransE to represent entities and uses Skip-gram to learn attribute representations. Based on the assumption that entities with similar attributes have a greater probability of being equivalent, it uses the similarity between attributes to enhance the semantics of entities. KDCoE [50] uses a GRU to encode the description information of entities and performs collaborative training with representation learning based on relational triples to improve the alignment performance.
b, Graph Neural Network-based models
Many models have successfully adapted GNNs to solve the KGA problem. GCN-Align [51] uses a Graph Convolutional Network (GCN) to learn the vector representation of entities and, at the same time, allows two GCNs encoding different KGs to encode both relational triples and attribute triples into the representation of entities. MuGNN [27] pays attention to the sparsity of the KG, uses AMIE+ to automatically infer and complete the missing triples in the KG, constructs a denser graph, and aligns entities with a small number of pre-aligned pairs. RDGCN [25] introduces the concept of dual graphs when constructing the relationship graph between entities and enhances the discrimination of different entity network structures through the restriction of dual graphs.
2.3.2 Alignment constraints
Existing entity alignment methods for cross-lingual KGs focus on three types of constraints, namely entity consistency, relation consistency, and attribute consistency.
Trang 27a, Entity consistency
Since corresponding entities reflect the same real-world entity (e.g., a person or concept), their names should be equivalent. In Figure 1.3 (showing part of YAGO), the terms 'universite de Duke' in the French KG and 'Duke University' in the English KG both represent a university in North Carolina. Recent works have used Google Translate to check whether the names of corresponding entities in different languages have the same meaning by translating them into English [25].
b, Relation consistency
This requirement (a.k.a. the homophily rule) states that if two nodes n_1 and n_2 are closely related in one network in a structural manner (e.g., being neighbours), then their corresponding nodes n_1′ and n_2′ also have a close relation in the counterpart KG [26]. In Figure 1.3, the two entities Bill Gates and Melinda Gates are connected in the English KG; under relation consistency, their two corresponding entity nodes in the French KG are also connected. Mathematically, if p and q have a relation triple ⟨p, r, q⟩ in the source KG, and (p, p′) and (q, q′) are anchor links, then p′ and q′ also have a relation triple ⟨p′, r, q′⟩ in the target KG. Note that relation triples in KGs are directional, and this direction should also be respected.
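This rule can be checked mechanically. The following Python sketch (our own illustration, with hypothetical entity names) counts the source triples whose counterpart triple, under a given set of anchor links, is missing in the target KG:

def relation_consistency_violations(triples_s, triples_t, anchors):
    # anchors: dict mapping a source entity to its aligned target entity.
    # A triple <p, r, q> is consistent if <p', r, q'> exists in the target
    # KG with the same relation and the same direction.
    violations = 0
    for p, r, q in triples_s:
        if p in anchors and q in anchors:
            if (anchors[p], r, anchors[q]) not in triples_t:
                violations += 1
    return violations

triples_s = {("Bill Gates", "spouse", "Melinda Gates")}
triples_t = {("Bill Gates (fr)", "spouse", "Melinda Gates (fr)")}
anchors = {"Bill Gates": "Bill Gates (fr)", "Melinda Gates": "Melinda Gates (fr)"}
print(relation_consistency_violations(triples_s, triples_t, anchors))  # 0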
c, Attribute consistency
This requirement states that corresponding entities should have equivalent attributes and equivalent attribute values [52]. For example, the entity Bill Gates has an attribute triple ⟨Bill Gates, DOB, 28-10-1955⟩ in the English KG, and its counterpart is ⟨Bill Gates, Né à, 28-10-1955⟩ in the French KG. Formally, if (p, p′) is an anchor link and (p, a, v) ∈ E_s^A is an attribute triple, then there exists (p′, a′, v′) ∈ E_t^A such that a and a′ are equivalent and v and v′ are equivalent.
However, many conventional entity alignment models struggle to address all three types of consistency requirements simultaneously, since attribute imbalance (a difference in the number of attributes) between two KGs and modality inconsistency are frequently observed in real-world datasets. While there are a few notable exceptions, they often ignore the value in an attribute triple [26].
2.3.3 Incomplete knowledge graph alignment
By generalising the problem setting in related works, i-KG alignment aims to find all of the corresponding entities of two given i-KGs. Without loss of generality, we select one i-KG as the source graph and the other as the target graph, and denote them as KG_s and KG_t respectively. Note that E_s ∪ Ē_s = E_t ∪ Ē_t, which represents the complete triple facts. Then, for each entity p in the source graph KG_s, we aim to recognise its counterpart p′ (if any) in the target knowledge graph KG_t. The corresponding entities (p, p′) are also often denoted as anchor links, and existing alignment frameworks often require supervision data in the form of a set of pre-aligned anchor links, denoted by L.
Since corresponding entities reflect the same real-world entity (e.g., a person or concept), existing alignment techniques often rely on consistencies, which state that the corresponding entities should maintain similar characteristics across different KGs [26]. Entity consistency states that entities referring to the same object should exist in both KGs and have equivalent names. Relation consistency (a.k.a. the homophily rule) declares that entities should maintain their relationship characteristics (existence, type, direction). While KG alignment and completion have been studied for decades [33], [53], there is little work on jointly solving these problems together. However, doing so is indeed beneficial: missing triples ⟨h, r, t⟩ ∈ Ē in one KG can be recovered by cross-checking another KG via the alignment, which, in turn, can be boosted by the recovered links. To the best of our knowledge, this work is a first attempt to solve the joint optimization of KG alignment and completion.
CHAPTER 3 ENHANCING KNOWLEDGE GRAPH COMPLETION PERFORMANCE
3.1 Introduction
In addition to conventional KG embedding models such as TransE [54], DistMult [8], ComplEx [9], ConvE [4], ConvKB [6], and TuckER [7], recent approaches have adapted graph neural networks (GNNs) for knowledge graph completion [10]–[12], [55]. In general, vanilla GNNs are modified and utilized as an encoder module to update vector representations for entities and relations; these vector representations are then fed into a decoder module that adopts a score function (e.g., as employed in TransE, DistMult, and ConvE) to return the triple scores. Those GNN-based models, however, are still outperformed by other conventional models on some benchmark datasets [56]. To boost model performance, our motivation comes from the fact that entities and relations forming facts often co-occur frequently in news articles, texts, and documents, e.g., "Melbourne" frequently co-occurs with "Australia".
We thus propose a new effective GNN-based KG embedding model, named NoGE, to integrate co-occurrence among entities and relations in the encoder module for knowledge graph completion (our first contribution). NoGE differs from other existing GNN-based KG embedding models in two important aspects: (i) given a knowledge graph, NoGE builds a single graph that contains entities and relations as individual nodes; (ii) NoGE counts the co-occurrences of entities and relations to compute the weights of edges among nodes, resulting in a new weighted adjacency matrix. Consequently, NoGE can leverage vanilla GNNs directly on the single graph of entity and relation nodes associated with the new weighted adjacency matrix. As our second contribution, NoGE also proposes a novel form of GNNs, named Dual Quaternion Graph Neural Networks (DualQGNN), as the encoder module. NoGE then employs a score function, e.g., QuatE [57], as the decoder module to return the triple scores. As our final contribution, we conduct extensive experiments to compare our NoGE with other strong GNN-based baselines and show that NoGE outperforms these baselines as well as other up-to-date KG embedding models, obtaining state-of-the-art results on three new and difficult benchmark datasets, CoDEx-S, CoDEx-M, and CoDEx-L [58], for knowledge graph completion.
3.2 Dual quaternion background
A background on quaternions can be found in recent works [57]. We briefly provide a background on dual quaternions [59]. A dual quaternion h ∈ ℍ_d is given in the form h = q + εp, where q and p are quaternions ∈ ℍ, and ε is the dual unit with ε² = 0.
Conjugate. The conjugate h* of a dual quaternion h is defined as: h* = q* + εp*.
Addition. The addition of two dual quaternions h_1 = q_1 + εp_1 and h_2 = q_2 + εp_2 is defined as: h_1 + h_2 = (q_1 + q_2) + ε(p_1 + p_2).
Dual quaternion multiplication. The dual quaternion multiplication ⊗_d of two dual quaternions h_1 and h_2 is defined as:

$$h_1 \otimes_d h_2 = (q_1 \otimes q_2) + \epsilon\,(q_1 \otimes p_2 + p_1 \otimes q_2)$$

where ⊗ denotes the Hamilton product between two quaternions.
Norm. The norm ||h|| of a dual quaternion h is a dual number, which is usually defined as:

$$\|h\| = \sqrt{h \otimes_d h^*} = \sqrt{\|q\|^2 + 2\epsilon\,(q \bullet p)} = \|q\| + \epsilon\,\frac{q \bullet p}{\|q\|}$$
Unit dual quaternion. A dual quaternion h is unit if h ⊗_d h* = 1, with ||q||² = 1 and q • p = 0.
Matrix-vector multiplication. The dual quaternion multiplication ⊗_d of a dual quaternion matrix W^DQ = W_q^Q + εW_p^Q and a dual quaternion vector h^DQ = q^Q + εp^Q follows the multiplication rule above:

$$W^{DQ} \otimes_d h^{DQ} = (W_q^Q \otimes q^Q) + \epsilon\left(W_q^Q \otimes p^Q + W_p^Q \otimes q^Q\right)$$

where the superscripts DQ and Q denote the dual quaternion space ℍ_d and the quaternion space ℍ, respectively.
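These operations are straightforward to implement. A minimal NumPy sketch (our own, representing a quaternion as a [w, x, y, z] array and a dual quaternion as a (q, p) pair meaning q + εp):

import numpy as np

def hamilton(q1, q2):
    # Hamilton product of two quaternions [w, x, y, z].
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def dq_mul(h1, h2):
    # Dual quaternion multiplication: the eps^2 = 0 rule removes the
    # p1 (x) p2 term, matching the definition above.
    q1, p1 = h1
    q2, p2 = h2
    return (hamilton(q1, q2), hamilton(q1, p2) + hamilton(p1, q2))

identity = np.array([1.0, 0.0, 0.0, 0.0])
p = np.array([0.0, 1.0, 2.0, 3.0])
real, dual = dq_mul((identity, p), (identity, p))
print(real, dual)  # real part stays the identity; dual part is doubled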
3.3 NoGE

Figure 3.1: An illustration of our proposed NoGE
To enhance the efficiency of the encoder module, our motivation comes from the fact that valid entities and relations co-occur frequently in KGs, e.g., "Melbourne" frequently co-occurs with "city of". Given a knowledge graph KG, NoGE builds a single graph G that contains entities and relations as nodes, following the Levi graph transformation [60], as illustrated in Figure 3.1. The total number of nodes in G is the sum of the numbers of entities and relations, i.e., |V| = |E| + |R|. NoGE then builds edges among nodes based on the co-occurrence of entities and relations within the triples of KG. Formally, NoGE computes the weights of edges between nodes p and q to create a new weighted adjacency matrix A.
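A minimal sketch of this construction (our own illustration; NoGE's exact weighting scheme may differ, this only demonstrates the co-occurrence counting idea):

from collections import Counter

def cooccurrence_weights(triples):
    # Entities AND relations become nodes; edge weights count how often
    # two elements co-occur within the same triple.
    weights = Counter()
    for h, r, t in triples:
        for a, b in [(h, r), (r, t), (h, t)]:
            weights[(a, b)] += 1
            weights[(b, a)] += 1
    return weights

triples = [("Melbourne", "city of", "Australia"),
           ("Sydney", "city of", "Australia")]
adj = cooccurrence_weights(triples)
print(adj[("city of", "Australia")])  # 2: co-occur in both triples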
Dual quaternions have several advantages in modeling rotations and translations, and in efficiently representing rigid transformations [64]. Therefore, we introduce Dual Quaternion Graph Neural Networks (DualQGNN) and then utilize our DualQGNN as the encoder module in NoGE as:
$$h_p^{k+1,DQ} = f\left(\sum_{q \in N_p \cup \{p\}} a_{p,q}\left(W^{k,DQ} \otimes_d h_q^{k,DQ}\right)\right)$$

where a_{p,q} is the (p, q) entry of D^{-1/2} Ã D^{-1/2}, wherein Ã = A + I and D is the diagonal node degree matrix of Ã.
NoGE obtains the dual quaternion vector representations of entities and relations from the last DualQGNN layer of the encoder module. For each obtained dual quaternion representation, NoGE concatenates its two quaternion coefficients to produce a final quaternion representation. These final quaternion representations of entities and relations are then fed to QuatE [57], employed as the decoder module, to compute the score of (h, r, t). In QuatE's form, this score is the Hamilton product of the head embedding with the normalized relation embedding, followed by a quaternion inner product with the tail embedding:

$$f(h, r, t) = \left(v_h \otimes v_r^{\triangleleft}\right) \bullet v_t$$
We then apply the Adam optimizer [65] to train our proposed NoGE by minimizing the binary cross-entropy loss function [4]:

$$\mathcal{L} = -\sum_{(h,r,t)} \Big( l_{(h,r,t)} \log p_{(h,r,t)} + \big(1 - l_{(h,r,t)}\big) \log\big(1 - p_{(h,r,t)}\big) \Big)$$

where p_{(h,r,t)} = sigmoid(f(h, r, t)) and l_{(h,r,t)} is the label, equal to 1 for a valid triple and 0 for an invalid one.
3.4 Experimental Results
3.4.1 Experiment setup
We evaluate our proposed NoGE on the knowledge graph completion task, i.e., link prediction [54], which aims to predict a missing entity given a relation and another entity, e.g., predicting a head entity h given (?, r, t) or predicting a tail entity t given (h, r, ?). The results are computed by ranking the scores produced by the score function f on the triples in the test set.
Table 3.1: Statistics of the experimental datasets

Dataset | #Entities | #Relations | #Train | #Valid | #Test
CoDEx-M | 17,050 | 51 | 185,584 | 10,310 | 10,311
CoDEx-L | 77,951 | 69 | 551,193 | 30,622 | 30,622
b, Evaluation protocol
Following [54], for each valid test triple (h, r, t), we replace either h or t by each of all other entities to create a set of corrupted triples. We also use the "Filtered" setting protocol [54]. We rank the valid test triple together with the corrupted triples in descending order of their scores and report the mean reciprocal rank (MRR) and Hits@10 (the proportion of valid triples ranked in the top 10 predictions). The final scores on the test set are reported for the model that obtains the highest MRR on the validation set.
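For reference, the following Python sketch (our own illustration; score_fn stands for any trained scoring model) computes the filtered rank of a test triple and the two reported metrics:

import numpy as np

def filtered_rank(score_fn, test_triple, all_entities, known_triples):
    # Corrupted triples that are themselves valid (i.e., appear in the
    # train/valid/test sets) are filtered out before ranking.
    h, r, t = test_triple
    true_score = score_fn(h, r, t)
    rank = 1
    for e in all_entities:
        if e != t and (h, r, e) not in known_triples:
            if score_fn(h, r, e) > true_score:
                rank += 1
    return rank

def mrr_and_hits10(ranks):
    ranks = np.asarray(ranks, dtype=float)
    return (1.0 / ranks).mean(), (ranks <= 10).mean()

print(mrr_and_hits10([1, 3, 12, 2, 7]))  # (MRR, Hits@10) for dummy ranks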
c, Training protocol
We set the same dimension value for both the embedding size and the hidden size of the DualQGNN hidden layers, wherein we vary the dimension value in {32, 64, 128}. We fix the batch size to 1024. We employ tanh as the nonlinear activation function f_e. We use the Adam optimizer [65] to train our NoGE model for up to 3,000 epochs on CoDEx-S and CoDEx-M, and 1,500 epochs on CoDEx-L. We use a grid search to choose the number of hidden layers ∈ {1, 2, 3} and the Adam initial learning rate ∈ {1e−4, 5e−4, 1e−3, 5e−3}. To select the best checkpoint, we evaluate the MRR after each training epoch on the validation set.
Baselines' training protocol. For the other baseline models, we apply the same evaluation protocol. The training protocol is the same w.r.t. the optimizer, the hidden layers, the initial learning rate values, and the number of training epochs. In addition, we use the model-specific configuration for each baseline as follows:
• QuatE [57]: We set the batch size to 1024 and vary the embedding dimension in {64, 128, 256, 512}.
• Regarding the GNN-based baselines – R-GCN [10], CompGCN [12], SACN [11], and our NoGE variants with QGNN and GCN – we also set the same dimension value for both the embedding size and the hidden size, wherein we vary the dimension value in {64, 128, 256, 512}.
• Our NoGE variant with QGNN: This is a variant of our proposed method that utilizes QGNN [55] as the encoder module.

• Our NoGE variant with GCN: This is a variant of our proposed method that utilizes GCN [61] as the encoder module.
• CompGCN: We consider a CompGCN variant that sets ConvE [4] as its decoder module, circular-correlation as its composition operator, the kernel size to 7, and the number of output channels to 200, producing the best results as reported in the original implementation.
• SACN: For its decoder Conv-TransE, we set the kernel size to 5 and the number of output channels to 200, as used in the original implementation.
3.4.2 Main results
In Table 3.2, we report the results obtained for NoGE and other strong baselines, including QuatE [57], R-GCN [10], SACN [11], and CompGCN [12], on the CoDEx datasets.

Table 3.2: Experimental results on the CoDEx test sets
… is outperformed by ConvE on CoDEx-S and CoDEx-M. CompGCN also does not perform better than ComplEx and TuckER on the CoDEx datasets. Similarly, QuatE, utilized as our NoGE's decoder module, also produces lower results than ComplEx, ConvE, and TuckER.
When compared with QuatE and the three other GNN-based baselines, our NoGE achieves substantial improvements on the CoDEx datasets. For example, NoGE gains absolute Hits@10 improvements of 2.9%, 2.7%, and 2.2% over CompGCN on CoDEx-S, CoDEx-M, and CoDEx-L, respectively. In general, our NoGE outperforms up-to-date embedding models and can be considered the best model on the CoDEx datasets. In particular, NoGE yields new state-of-the-art Hits@10 and MRR scores on the three datasets, except for the second-best MRR on CoDEx-S.
Ablation analysis. We compute and report ablation results for three variants of NoGE in Table 3.3. In general, the results degrade when using either QGNN or GCN as the encoder module, showing the advantage of our proposed DualQGNN. The scores also degrade when not using the new weighted adjacency matrix A. Besides, our NoGE variants with QGNN and GCN also substantially outperform the three other GNN-based baselines R-GCN, SACN, and CompGCN, thus clearly showing the effectiveness of integrating our matrix A into GNNs for KG completion.

Table 3.3: Ablation results on the validation sets
CHAPTER 4 ENHANCING KNOWLEDGE GRAPH ALIGNMENT PERFORMANCE
4.1 Introduction
The problem of entity alignment for cross-lingual KGs has been studied intensively with the emergence of graph embedding techniques [19], [20]. Given two monolingual KGs, these techniques first learn low-dimensional vectors representing the entities of each KG, and the corresponding entities are then discovered based on their vector similarities. The first-generation methods of this paradigm, including MTransE [21], JAPE [22], ITransE [23], and BootEA [24], learn the embeddings under the assumption that if two entities have a relation, the distance between their respective embeddings equals the embedding of their relation. Avoiding this strict assumption [25], the second generation of embedding techniques, such as GCN-Align [26], RDGCN [25], MUGNN [27], KG-matching [28], and NAEA [29], employ graph neural networks, which encode the structural relationship based on neighbourhood information [30].
However, we argue that the above approaches overload the embedding model with unrelated objectives. On the one hand, the entity embeddings must encode the syntactic information (e.g., neighborhood, topology, degree) of each KG. At the same time, they also need to reflect the semantic alignment of entities across KGs. Some techniques, such as JAPE [22], use pre-aligned entities to remedy this issue by increasing the influence of negative samples in the loss function. Furthermore, existing models have not fully utilized the attribute information of entities (e.g., the age attribute of a person, a country's population) due to the high levels of inconsistency and linguistic differences. For example, GCN-Align considers only the types of attributes and ignores their values [26].
This chapter meets the above requirements via a unified and adaptive entity alignment model for cross-lingual KGs. In essence, our idea is to fully leverage the richness of a KG by simultaneously comparing the relational and attributional information of the entities to be aligned. The fusion of these types of information helps them complete each other and mitigates the high levels of consistency violation of each kind. To efficiently extract the relational data, we propose to use the multi-layer characteristics of graph convolutional networks (GCNs) [30] to model the relational correlation at different orders without the need for supervision data (e.g., pre-aligned entities). In terms of attributional information, we adopt the advances of machine translation (e.g., Google Translate) to efficiently reconcile the information in different languages and avoid human involvement. More specifically, we summarise our contributions as follows:
• We propose a framework called Entity Alignment with Multi-order Graph Convolutional Networks (EMGCN) for KG entity alignment with no prior knowledge. Since this framework is grounded in the late-fusion mechanism, rich KG information (e.g., relational triples, attribute types, attribute values) can be integrated regardless of the modality.
• We design a GCN-based model that exploits the rare characteristics of GCNs, including multi-order encoding and permutation immunity, to simultaneously integrate different relation-related consistency constraints. We also tailor the loss function to enforce joint and consistent learning of the embeddings of the two KGs to support their alignment and avoid having to reconcile their embedding spaces.
• We conduct experiments on real-world and synthetic KG datasets to evaluate our scheme. The results show that our framework outperforms other baselines and is also robust to various types of adversarial conditions.
eval-4.2 Overview of the Proposed Approach
Figure 4.1: Overview of EMGCN framework
4.2.1 Motivation
A KG entity alignment framework should satisfy the following requirements:
1. Consistency: Entity consistency, relational consistency, and attribute consistency should be respected, since these constraints guide the model to find precise anchor links w.r.t. the specific characteristics of KGs (e.g., name equivalence, directional relations). False positives may adversely affect the performance of downstream tasks.

2. Adaptivity: While the consistency constraints form the backbone assumption of alignment techniques, these constraints are sometimes violated in real-world datasets, e.g., when one KG possesses more entities than another. Attribute noise is also common, e.g., when an entity in the source KG has more attributes than its counterpart in the target KG or when their attribute values are stored in different formats.
Several challenges need to be addressed to satisfy these requirements. Firstly, the source and target KGs often show some inherent differences in the form of consistency violations (noise) [66], [67]. The proposed model should be immune to node permutation and robust to structural and attribute noise. Secondly, linguistic challenges arise when entity and attribute names must be unified in the same language for direct comparison. The use of a translation engine such as Google Translate is only a temporary solution, however, since translations without context are not always accurate [25]. Thirdly, a high noise level often exists in attribute information, for example, when corresponding entities do not have equivalent attributes or their values have different formats.
4.2.2 The entity alignment framework
The structure of our framework is presented in Figure 4.1. First, we forward the relational network extracted from the input KGs to a designed multi-order GCN-based model to embed the KG entities in low-dimensional vector spaces. The relational correlation of two entities is captured based on the distance between their embeddings. We then retrieve the relation alignment using the learned embeddings and retrieve the attribute alignment using a strategy based on machine translation. Finally, the alignments from the different views are combined to give the overall result. Three important functionalities must be considered here:
a, Multi-order relational-aware embedding
To integrate the relational information of entities into the framework, we adopt a GCN-based model to learn relation-aware embeddings for the entities. The model consists of several layers, each encoding the network topology at multiple orders. To train the model, we optimize a loss function equivalent to minimizing the violations of the consistency constraints. Only the relational information of the KGs is used in this step, since attribute information often has a high level of noise and may degrade the quality of the embeddings. The details of this process are explained in Section 4.3.
b, Relational Alignment
In this step, we compute the alignment output using the relational information of the embeddings from all GCN layers. In more detail, we construct a single-order alignment matrix at each layer and then apply a weighted-sum combination to the matrices to obtain the final relational alignment output, where the weights represent the importance of the layers, as sketched below. Before calculating the single-order matrices, we perform a tuning pre-process to decay the impact of noise via an iterative process. The details of this process are described in Section 4.4.
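A minimal sketch of this fusion step (our own illustration; the per-layer weights are hyperparameters, cf. Table 4.4):

import numpy as np

def fuse_alignment_matrices(single_order_mats, layer_weights):
    # Weighted sum of the single-order alignment matrices, one per
    # GCN layer; the weights encode the importance of each order.
    S = np.zeros_like(single_order_mats[0])
    for w, S_k in zip(layer_weights, single_order_mats):
        S += w * S_k
    return S

S1, S2 = np.random.rand(3, 3), np.random.rand(3, 3)  # toy similarity matrices
S = fuse_alignment_matrices([S1, S2], layer_weights=[0.5, 0.5])
print(S.argmax(axis=1))  # best target candidate for each source entity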
c, Attribute Alignment
Candidate anchor links are further solidified based on the attribute information of the entities (i.e., the attributes and their values). A dictionary of corresponding attributes is built to compute the attribute-based similarity. A value-based similarity is then calculated using a Jaccard measure, as illustrated in the sketch below. These similarities are combined with the relation similarity to produce the final alignment matrix. The details are given in Section 4.4.
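A sketch of this attribute- and value-based similarity (our own illustration, with hypothetical attribute dictionaries; values are compared as token sets with a Jaccard measure):

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def attribute_similarity(attrs_s, attrs_t, attr_dict):
    # attr_dict maps (translated) source attribute names to target ones;
    # attrs_*: dict mapping an attribute name to a set of value tokens.
    mapped = {attr_dict.get(a, a): v for a, v in attrs_s.items()}
    shared = set(mapped) & set(attrs_t)
    if not shared:
        return 0.0
    return sum(jaccard(mapped[a], attrs_t[a]) for a in shared) / len(shared)

attrs_en = {"DOB": {"28-10-1955"}, "Gender": {"male"}}
attrs_fr = {"DOB": {"28-10-1955"}, "Gender": {"male"}}  # after translation
print(attribute_similarity(attrs_en, attrs_fr, attr_dict={}))  # 1.0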
4.3 Relation-aware Multi-order Embedding
In this section, we describe our GCN-based model, which learns the relation-aware representation for entities while guaranteeing the consistency constraints.

4.3.1 GCN-based embedding model
We employ a GCN to learn the representations of the entities. Our GCN-based model consists of k layers, and each hidden feature layer simultaneously encodes the topological and attributional information using a message-passing scheme in which the hidden features in the current layer are constructed from the hidden features in the previous layers [68]. Based on the general definition of a one-layer GNN