Lecture Notes in Computer Science 10987
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
More information about this series at http://www.springer.com/series/7409
Yi Cai • Yoshiharu Ishikawa
Jianliang Xu (Eds.)
Web and Big Data
Second International Joint Conference, APWeb-WAIM 2018, Macau, China, July 23–25, 2018
Proceedings, Part I
Lecture Notes in Computer Science
ISBN 978-3-319-96889-6 ISBN 978-3-319-96890-2 (eBook)
https://doi.org/10.1007/978-3-319-96890-2
Library of Congress Control Number: 2018948814
LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI
© Springer International Publishing AG, part of Springer Nature 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This volume (LNCS 10987) and its companion volume (LNCS 10988) contain the proceedings of the second Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, called APWeb-WAIM. This joint conference aims to attract participants from different scientific communities as well as from industry, and not merely from the Asia-Pacific region, but also from other continents. The objective is to enable the sharing and exchange of ideas, experiences, and results in the areas of the World Wide Web and big data, thus covering Web technologies, database systems, information management, software engineering, and big data.

The second APWeb-WAIM conference was held in Macau during July 23–25, 2018. As an Asia-Pacific flagship conference focusing on research, development, and applications in relation to Web information management, APWeb-WAIM builds on the successes of APWeb and WAIM: APWeb was previously held in Beijing (1998), Hong Kong (1999), Xi'an (2000), Changsha (2001), Xi'an (2003), Hangzhou (2004), Shanghai (2005), Harbin (2006), Huangshan (2007), Shenyang (2008), Suzhou (2009), Busan (2010), Beijing (2011), Kunming (2012), Sydney (2013), Changsha (2014), Guangzhou (2015), and Suzhou (2016); and WAIM was held in Shanghai (2000), Xi'an (2001), Beijing (2002), Chengdu (2003), Dalian (2004), Hangzhou (2005), Hong Kong (2006), Huangshan (2007), Zhangjiajie (2008), Suzhou (2009), Jiuzhaigou (2010), Wuhan (2011), Harbin (2012), Beidaihe (2013), Macau (2014), Qingdao (2015), and Nanchang (2016). The first joint APWeb-WAIM conference was held in Beijing (2017). With the fast development of Web-related technologies, we expect that APWeb-WAIM will become an increasingly popular forum that brings together outstanding researchers and developers in the field of the Web and big data from around the world.

The high-quality program documented in these proceedings would not have been possible without the authors who chose APWeb-WAIM for disseminating their findings. Out of 168 submissions, the conference accepted 39 regular research papers (23.21%), 31 short research papers, and six demonstrations. The contributed papers address a wide range of topics, such as text analysis, graph data processing, social networks, recommender systems, information retrieval, data streams, knowledge graphs, data mining and applications, query processing, machine learning, database and Web applications, big data, and blockchain. The technical program also included keynotes by Prof. Xuemin Lin (The University of New South Wales, Australia), Prof. Lei Chen (The Hong Kong University of Science and Technology, Hong Kong SAR, China), and Prof. Ninghui Li (Purdue University, USA), as well as industrial invited talks by Dr. Zhao Cao (Huawei Blockchain) and Jun Yan (YiDu Cloud). We are grateful to these distinguished scientists for their invaluable contributions to the conference program.

As a joint conference, teamwork was particularly important for the success of APWeb-WAIM. We are deeply thankful to the Program Committee members and the external reviewers for lending their time and expertise to the conference. Special thanks go to the local Organizing Committee led by Prof. Zhiguo Gong. Thanks also go to the workshop co-chairs (Leong Hou U and Haoran Xie), demo co-chairs (Zhixu Li, Zhifeng Bao, and Lisi Chen), industry co-chair (Wenyin Liu), tutorial co-chair (Jian Yang), panel chair (Kamal Karlapalem), local arrangements chair (Derek Fai Wong), and publicity co-chairs (An Liu, Feifei Li, Wen-Chih Peng, and Ladjel Bellatreche). Their efforts were essential to the success of the conference. Last but not least, we wish to express our gratitude to the treasurer (Andrew Shibo Jiang) and the Webmaster (William Sio) for all the hard work, and to our sponsors who generously supported the smooth running of the conference. We hope you enjoy the exciting program of APWeb-WAIM 2018 as documented in these proceedings.
Jianliang Xu
Yoshiharu Ishikawa
Organization

Zhiguo Gong University of Macau, SAR China
Qing Li City University of Hong Kong, SAR China
Kam-fai Wong Chinese University of Hong Kong, SAR China
Workshop Co-chairs

Leong Hou U University of Macau, SAR China
Haoran Xie Education University of Hong Kong, SAR China
Demo Co-chairs
Lisi Chen Wollongong University, Australia
Publicity Co-chairs

Wen-Chih Peng National Taiwan University, China
Ladjel Bellatreche ISAE-ENSMA, Poitiers, France
Treasurers
Leong Hou U University of Macau, SAR China
Andrew Shibo Jiang Macau Convention and Exhibition Association, SAR China
Local Arrangements Chair
Derek Fai Wong University of Macau, SAR China
Webmaster
William Sio University of Macau, SAR China
Senior Program Committee
Byron Choi Hong Kong Baptist University, SAR China
Christian Jensen Aalborg University, Denmark
Demetrios Zeinalipour-Yazti University of Cyprus, Cyprus
Guoliang Li Tsinghua University, China
K. Selçuk Candan Arizona State University, USA
Kyuseok Shim Seoul National University, South Korea
Makoto Onizuka Osaka University, Japan
Reynold Cheng The University of Hong Kong, SAR China
Toshiyuki Amagasa University of Tsukuba, Japan
Wang-Chien Lee Pennsylvania State University, USA
Wen-Chih Peng National Chiao Tung University, Taiwan
Wook-Shin Han Pohang University of Science and Technology, South Korea
Xiaokui Xiao National University of Singapore, Singapore
Ying Zhang University of Technology Sydney, Australia
Program Committee
Alex Thomo University of Victoria, Canada
Baoning Niu Taiyuan University of Technology, China
Bo Tang Southern University of Science and Technology, China
Zouhaier Brahmia University of Sfax, Tunisia
Carson Leung University of Manitoba, Canada
Cheng Long Queen’s University Belfast, UK
Chih-Chien Hung Tamkang University, China
Chih-Hua Tai National Taipei University, China
Cuiping Li Renmin University of China, China
Daniele Riboni University of Cagliari, Italy
Defu Lian Big Data Research Center, University of Electronic Science and Technology of China, China
Dejing Dou University of Oregon, USA
Dimitris Sacharidis Technische Universität Wien, Austria
Ganzhao Yuan Sun Yat-sen University, China
Giovanna Guerrini Università di Genova, Italy
Guanfeng Liu The University of Queensland, Australia
Guoqiong Liao Jiangxi University of Finance and Economics, China
Guanling Lee National Dong Hwa University, China
Haibo Hu Hong Kong Polytechnic University, SAR China
Hailong Sun Beihang University, China
Han Su University of Southern California, USA
Haoran Xie The Education University of Hong Kong, SAR China
Hiroaki Ohshima University of Hyogo, Japan
Hong Chen Renmin University of China, China
Hongyan Liu Tsinghua University, China
Hongzhi Wang Harbin Institute of Technology, China
Hongzhi Yin The University of Queensland, Australia
Hua Wang Victoria University, Australia
Ilaria Bartolini University of Bologna, Italy
James Cheng Chinese University of Hong Kong, SAR China
Jeffrey Xu Yu Chinese University of Hong Kong, SAR China
Jiajun Liu Renmin University of China, China
Jialong Han Nanyang Technological University, Singapore
Jianbin Huang Xidian University, China
Jian Yin Sun Yat-sen University, China
Jiannan Wang Simon Fraser University, Canada
Jianting Zhang City College of New York, USA
Jianxin Li Beihang University, China
Jianzhong Qi University of Melbourne, Australia
Jinchuan Chen Renmin University of China, China
Junhu Wang Griffith University, Australia
Kai Zheng University of Electronic Science and Technology of China, China
Karine Zeitouni Université de Versailles Saint-Quentin, France
Leong Hou U University of Macau, SAR China
Lianghuai Yang Zhejiang University of Technology, China
Lisi Chen Wollongong University, Australia
Maria Damiani University of Milan, Italy
Markus Endres University of Augsburg, Germany
Mihai Lupu Vienna University of Technology, Austria
Mirco Nanni ISTI-CNR Pisa, Italy
Mizuho Iwaihara Waseda University, Japan
Peiquan Jin University of Science and Technology of China, China
Qin Lu University of Technology Sydney, Australia
Ralf Hartmut Güting Fernuniversität in Hagen, Germany
Raymond Chi-Wing Wong Hong Kong University of Science and Technology, SAR China
Ronghua Li Shenzhen University, China
Rui Zhang University of Melbourne, Australia
Sanghyun Park Yonsei University, South Korea
Sanjay Madria Missouri University of Science and Technology, USA
Shaoxu Song Tsinghua University, China
Shengli Wu Jiangsu University, China
Shimin Chen Chinese Academy of Sciences, China
Shuo Shang King Abdullah University of Science and Technology, Saudi Arabia
Takahiro Hara Osaka University, Japan
Tieyun Qian Wuhan University, China
Tingjian Ge University of Massachusetts, Lowell, USA
Tom Z. J. Fu Advanced Digital Sciences Center, Singapore
Tru Cao Ho Chi Minh City University of Technology, Vietnam
Vincent Oria New Jersey Institute of Technology, USA
Wee Ng Institute for Infocomm Research, Singapore
Wei Wang University of New South Wales, Australia
Weining Qian East China Normal University, China
Weiwei Sun Fudan University, China
Wolf-Tilo Balke Technische Universität Braunschweig, Germany
Wookey Lee Inha University, South Korea
Xiang Zhao National University of Defence Technology, China
Xiang Lian Kent State University, USA
Xiangliang Zhang King Abdullah University of Science and Technology, Saudi Arabia
Xiangmin Zhou RMIT University, Australia
Xiaochun Yang Northeast University, China
Xiaofeng He East China Normal University, China
Xiaohui (Daniel) Tao The University of Southern Queensland, Australia
Xiaoyong Du Renmin University of China, China
Xike Xie University of Science and Technology of China, China
Xin Cao The University of New South Wales, Australia
Xin Huang Hong Kong Baptist University, SAR China
Xingquan Zhu Florida Atlantic University, USA
Xuan Zhou Renmin University of China, China
Yanghua Xiao Fudan University, China
Yanghui Rao Sun Yat-sen University, China
Yang-Sae Moon Kangwon National University, South Korea
Yaokai Feng Kyushu University, Japan
Yi Cai South China University of Technology, China
Yijie Wang National University of Defense Technology, China
Yingxia Shao Peking University, China
Yongxin Tong Beihang University, China
Yuan Fang Institute for Infocomm Research, Singapore
Yunjun Gao Zhejiang University, China
Zakaria Maamar Zayed University, United Arab Emirates
Zhaonian Zou Harbin Institute of Technology, China
Zhiwei Zhang Hong Kong Baptist University, SAR China
Keynotes
Graph Processing: Applications, Challenges, and Advances

Xuemin Lin
School of Computer Science and Engineering, University of New South Wales, Sydney
lxue@cse.unsw.edu.au

Abstract. Graph data are key parts of Big Data and widely used for modelling complex structured data with a broad spectrum of applications. Over the last decade, tremendous research efforts have been devoted to many fundamental problems in managing and analyzing graph data. In this talk, I will cover various applications, challenges, and recent advances. We will also look to the future of the area.
Differential Privacy in the Local Setting

Ninghui Li
Purdue University, USA

Abstract. … of the art of LDP. We survey recent developments for LDP, and discuss protocols for estimating frequencies of different values under LDP, and for computing marginals when each user has multiple attributes. Finally, we discuss limitations and open problems of LDP.
Big Data, AI, and HI, What is the Next?

Lei Chen
Department of Computer Science and Engineering, Hong Kong University of Science and Technology
leichen@cse.ust.hk

Abstract. Recently, AI has become quite popular and attractive, not only to academia but also to industry. The successful stories of AI on AlphaGo and Texas hold 'em games raise significant public interest in AI. Meanwhile, human intelligence is turning out to be more sophisticated, and Big Data technology is everywhere to improve our life quality. The question we all want to ask is "what is the next?" In this talk, I will discuss DHA, a new computing paradigm, which combines big Data, Human intelligence, and AI. First, I will briefly explain the motivation of DHA. Then I will present some challenges and possible solutions to build this new paradigm.
Similarity Calculations of Academic Articles Using Topic Events and Domain Knowledge  45
Ming Liu, Bo Lang, and Zepeng Gu

Sentiment Classification via Supplementary Information Modeling  54
Zenan Xu, Yetao Fu, Xingming Chen, Yanghui Rao, Haoran Xie, Fu Lee Wang, and Yang Peng

Training Set Similarity Based Parameter Selection for Statistical Machine Translation  63
Xuewen Shi, Heyan Huang, Ping Jian, and Yi-Kun Tang

Matrix Factorization Meets Social Network Embedding for Rating Prediction  121
Menghao Zhang, Binbin Hu, Chuan Shi, Bin Wu, and Bai Wang

An Estimation Framework of Node Contribution Based on Diffusion Information  130
Zhijian Zhang, Ling Liu, Kun Yue, and Weiyi Liu

Multivariate Time Series Clustering via Multi-relational Community Detection in Networks  138
Guowang Du, Lihua Zhou, Lizhen Wang, and Hongmei Chen
Recommender Systems
NSPD: An N-stage Purchase Decision Model for E-commerce Recommendation  149
Cairong Yan, Yan Huang, Qinglong Zhang, and Yan Wan

Social Image Recommendation Based on Path Relevance  165
Zhang Chuanyan, Hong Xiaoguang, and Peng Zhaohui

Representation Learning with Depth and Breadth for Recommendation Using Multi-view Data  181
Xiaotian Han, Chuan Shi, Lei Zheng, Philip S. Yu, Jianxin Li,

UIContextListRank: A Listwise Recommendation Model with Social Contextual Information  207
Zhenhua Huang, Chang Yu, Jiujun Cheng, and Zhixiao Wang

… and Gang Wang

LIDH: An Efficient Filtering Method for Approximate k Nearest Neighbor Queries Based on Local Intrinsic Dimension  268
Yang Song, Yu Gu, and Ge Yu

Query Performance Prediction and Classification for Information Search Systems  277
Zhongmin Zhang, Jiawei Chen, and Shengli Wu
Aggregate Query Processing on Incomplete Data  286
Anzhen Zhang, Jinbao Wang, Jianzhong Li, and Hong Gao
Machine Learning
Travel Time Forecasting with Combination of Spatial-Temporal and Time Shifting Correlation in CNN-LSTM Neural Network  297
Wenjing Wei, Xiaoyi Jia, Yang Liu, and Xiaohui Yu

DMDP2: A Dynamic Multi-source Based Default Probability Prediction Framework  312
Yi Zhao, Yong Huang, and Yanyan Shen

Brain Disease Diagnosis Using Deep Learning Features from Longitudinal MR Images  327
Linlin Gao, Haiwei Pan, Fujun Liu, Xiaoqin Xie, Zhiqiang Zhang, Jinming Han, and the Alzheimer's Disease Neuroimaging Initiative

Attention-Based Recurrent Neural Network for Sequence Labeling  340
Bofang Li, Tao Liu, Zhe Zhao, and Xiaoyong Du

Haze Forecasting via Deep LSTM  349
Fan Feng, Jikai Wu, Wei Sun, Yushuang Wu, HuaKang Li, and Xingguo Chen

Importance-Weighted Distance Aware Stocks Trend Prediction  357
Zherong Zhang, Wenge Rong, Yuanxin Ouyang, and Zhang Xiong
Knowledge Graph
Jointly Modeling Structural and Textual Representation for Knowledge Graph Completion in Zero-Shot Scenario  369
Jianhui Ding, Shiheng Ma, Weijia Jia, and Minyi Guo

Neural Typing Entities in Chinese-Pedia  385
Yongjian You, Shaohua Zhang, Jiong Lou, Xinsong Zhang, and Weijia Jia

Knowledge Graph Embedding by Learning to Connect Entity with Relation  400
Zichao Huang, Bo Li, and Jian Yin

StarMR: An Efficient Star-Decomposition Based Query Processor for SPARQL Basic Graph Patterns Using MapReduce  415
Qiang Xu, Xin Wang, Jianxin Li, Ying Gan, Lele Chai, and Junhu Wang

DAVE: Extracting Domain Attributes and Values from Text Corpus  431
Yongxin Shen, Zhixu Li, Wenling Zhang, An Liu, and Xiaofang Zhou
PRSPR: An Adaptive Framework for Massive RDF Stream Reasoning  440
Guozheng Rao, Bo Zhao, Xiaowang Zhang, Zhiyong Feng, and Guohui Xiao
Demo Papers
TSRS: Trip Service Recommended System Based on Summarized Co-location Patterns  451
Peizhong Yang, Tao Zhang, and Lizhen Wang

DFCPM: A Dominant Feature Co-location Pattern Miner  456
Yuan Fang, Lizhen Wang, Teng Hu, and Xiaoxuan Wang

CUTE: Querying Knowledge Graphs by Tabular Examples  461
Zichen Wang, Tian Li, Yingxia Shao, and Bin Cui

ALTAS: An Intelligent Text Analysis System Based on Knowledge Graphs  466
Xiaoli Wang, Chuchu Gao, Jiangjiang Cao, Kunhui Lin, Wenyuan Du, and Zixiang Yang

SPARQLVis: An Interactive Visualization Tool for Knowledge Graphs  471
Chaozhou Yang, Xin Wang, Qiang Xu, and Weixi Li

PBR: A Personalized Book Resource Recommendation System  475
Yajie Zhu, Feng Xiong, Qing Xie, Lin Li, and Yongjian Liu
Author Index 481
Contents – Part II
Database and Web Applications
Fuzzy Searching Encryption with Complex Wild-Cards Queries on Encrypted Database  3
He Chen, Xiuxia Tian, and Cheqing Jin

Towards Privacy-Preserving Travel-Time-First Task Assignment in Spatial Crowdsourcing  19
Jian Li, An Liu, Weiqi Wang, Zhixu Li, Guanfeng Liu, Lei Zhao, and Kai Zheng

Plover: Parallel In-Memory Database Logging on Scalable Storage Devices  35
Huan Zhou, Jinwei Guo, Ouya Pei, Weining Qian, Xuan Zhou, and Aoying Zhou

Inferring Regular Expressions with Interleaving from XML Data  44
Xiaolan Zhang, Yeting Li, Fei Tian, Fanlin Cui, Chunmei Dong, and Haiming Chen

Efficient Query Reverse Engineering for Joins and OLAP-Style Aggregations  53
Wei Chit Tan

DCA: The Advanced Privacy-Enhancing Schemes for Location-Based Services  63
Jiaxun Hua, Yu Liu, Yibin Shen, Xiuxia Tian, and Cheqing Jin
Data Streams
Discussion on Fast and Accurate Sketches for Skewed Data Streams: A Case Study  75
Shuhao Sun and Dagang Li

Matching Consecutive Subpatterns over Streaming Time Series  90
Rong Kang, Chen Wang, Peng Wang, Yuting Ding, and Jianmin Wang

A Data Services Composition Approach for Continuous Query on Data Streams  106
Guiling Wang, Xiaojiang Zuo, Marc Hesenius, Yao Xu, Yanbo Han, and Volker Gruhn

Discovering Multiple Time Lags of Temporal Dependencies from Fluctuating Events  121
Wentao Wang, Chunqiu Zeng, and Tao Li

A Combined Model for Time Series Prediction in Financial Markets  138
Hongbo Sun, Chenkai Guo, Jing Xu, Jingwen Zhu, and Chao Zhang
Data Mining and Application
Location Prediction in Social Networks  151
Rong Liu, Guanglin Cong, Bolong Zheng, Kai Zheng, and Han Su

Efficient Longest Streak Discovery in Multidimensional Sequence Data  166
Wentao Wang, Bo Tang, and Min Zhu

Map Matching Algorithms: An Experimental Evaluation  182
Na Ta, Jiuqi Wang, and Guoliang Li

Predicting Passenger's Public Transportation Travel Route Using Smart Card Data  199
Chen Yang, Wei Chen, Bolong Zheng, Tieke He, Kai Zheng, and Han Su

Detecting Taxi Speeding from Sparse and Low-Sampled Trajectory Data  214
Xibo Zhou, Qiong Luo, Dian Zhang, and Lionel M. Ni

Cloned Vehicle Behavior Analysis Framework  223
Minxi Li, Jiali Mao, Xiaodong Qi, Peisen Yuan, and Cheqing Jin

An Event Correlation Based Approach to Predictive Maintenance  232
Meiling Zhu, Chen Liu, and Yanbo Han

Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases  248
Zhaoan Dong, Ju Fan, Jiaheng Lu, Xiaoyong Du, and Tok Wang Ling

Improving Clinical Named Entity Recognition with Global Neural Attention  264
Guohai Xu, Chengyu Wang, and Xiaofeng He

Exploiting Implicit Social Relationship for Point-of-Interest Recommendation  280
Haifeng Zhu, Pengpeng Zhao, Zhixu Li, Jiajie Xu, Lei Zhao, and Victor S. Sheng

Spatial Co-location Pattern Mining Based on Density Peaks Clustering and Fuzzy Theory  298
Yuan Fang, Lizhen Wang, and Teng Hu
A Tensor-Based Method for Geosensor Data Forecasting  306
Lihua Zhou, Guowang Du, Qing Xiao, and Lizhen Wang

Keyphrase Extraction Based on Optimized Random Walks on Multiple Word Relations  359
Wenyan Chen, Zheng Liu, Wei Shi, and Jeffrey Xu Yu

Answering Range-Based Reverse kNN Queries  368
Zhefan Zhong, Xin Lin, Liang He, and Yan Yang
Big Data and Blockchain
EarnCache: Self-adaptive Incremental Caching for Big Data Applications  379
Yifeng Luo, Junshi Guo, and Shuigeng Zhou

Storage and Recreation Trade-Off for Multi-version Data Management  394
Yin Zhang, Huiping Liu, Cheqing Jin, and Ye Guo

Decentralized Data Integrity Verification Model in Untrusted Environment  410
Kun Hao, Junchang Xin, Zhiqiong Wang, Zhuochen Jiang, and Guoren Wang

Enabling Concurrency on Smart Contracts Using Multiversion Ordering  425
An Zhang and Kunlong Zhang

ElasticChain: Support Very Large Blockchain by Reducing Data Redundancy  440
Dayu Jia, Junchang Xin, Zhiqiong Wang, Wei Guo, and Guoren Wang

A MapReduce-Based Approach for Mining Embedded Patterns from Large Tree Data  455
Wen Zhao and Xiaoying Wu
Author Index 463
Text Analysis
Abstractive Summarization with the Aid of Extractive Summarization
Yangbin Chen(B), Yun Ma, Xudong Mao, and Qing Li
City University of Hong Kong, Hong Kong SAR, China
{robinchen2-c,yunma3-c,xdmao2-c}@my.cityu.edu.hk,
qing.li@cityu.edu.hk
Abstract. Currently the abstractive method and the extractive method are the two main approaches for automatic document summarization. To fully integrate the relatedness and advantages of both approaches, we propose in this paper a general framework for abstractive summarization which incorporates extractive summarization as an auxiliary task. In particular, our framework is composed of a shared hierarchical document encoder, an attention-based decoder for abstractive summarization, and an extractor for sentence-level extractive summarization. Learning these two tasks jointly with the shared encoder allows us to better capture the semantics in the document. Moreover, we constrain the attention learned in the abstractive task by the salience estimated in the extractive task to strengthen their consistency. Experiments on the CNN/DailyMail dataset demonstrate that both the auxiliary task and the attention constraint contribute to improving the performance significantly, and our model is comparable to the state-of-the-art abstractive models.
Keywords: Abstractive document summarization · Sequence-to-sequence · Joint learning
1 Introduction
Automatic document summarization has been studied for decades. The target of document summarization is to generate a shorter passage from the document in a grammatically and logically coherent way, meanwhile preserving the important information. There are two main approaches for document summarization: extractive summarization and abstractive summarization. The extractive method first extracts salient sentences or phrases from the source document and then groups them to produce a summary without changing the source text. Graph-based ranking models [1,2] and feature-based classification models [3,4] are typical models for extractive summarization. However, the extractive method unavoidably includes secondary or redundant information and is far from the way humans write summaries [5].
The abstractive method, in contrast, produces generalized summaries, conveying information in a concise way, and eliminating the limitations to the original words and sentences of the document. This task is more challenging since it needs advanced language generation and compression techniques. Discourse structures [6,7] and semantics [8,9] are most commonly used by researchers for generating abstractive summaries.

Recently, the Recurrent Neural Network (RNN)-based sequence-to-sequence model with attention mechanism has been applied to abstractive summarization, due to its great success in machine translation [22,27,30]. However, there are still some challenges. First, the RNN-based models have difficulties in capturing long-term dependencies, making summarization for long documents much tougher. Second, different from machine translation, which has strong correspondence between the source and target words, an abstractive summary corresponds to only a small part of the source document, making its attention difficult to be learned.

We adopt hierarchical approaches for the long-term dependency problem, which have been used in many tasks such as machine translation and document classification [10,11]. But few of them have been applied to abstractive summarization tasks. In particular, we encode the input document in a hierarchical way from word level to sentence level. There are two advantages. First, it captures both the local and global semantic representations, resulting in better feature learning. Second, it improves the training efficiency because the time complexity of the RNN-based model can be reduced by splitting the long document into short sentences.
The attention mechanism is widely used in sequence-to-sequence tasks [13,27]. However, for abstractive summarization, it is difficult to learn the attention since only a small part of the source document is important to the summary. In this paper, we propose two methods to learn a better attention distribution. First, we use a hierarchical attention mechanism, which means that the attention is applied at both word and sentence levels. Similar to the hierarchical approach in encoding, the advantage of using hierarchical attention is to capture both the local and global semantic representations. Second, we use the salience scores of the auxiliary task (i.e., the extractive summarization) to constrain the sentence-level attention.
In this paper, we present a novel technique for abstractive summarization which incorporates extractive summarization as an auxiliary task. Our framework consists of three parts: a shared document encoder, a hierarchical attention-based decoder, and an extractor. As Fig. 1 shows, we encode the document in a hierarchical way (Fig. 1 (1) and (2)) in order to address the long-term dependency problem. Then the learned document representations are shared by the extractor (Fig. 1 (3)) and the hierarchical attention-based decoder (Fig. 1 (5)). The extractor and the decoder are jointly trained, which can capture better semantics of the document. Furthermore, as both the sentence salience scores in the extractor and the sentence-level attention in the decoder indicate the importance of source sentences, we constrain the learned attention (Fig. 1 (4)) with the extracted sentence salience in order to strengthen their consistency.

Fig. 1. General framework of our proposed model with 5 components: (1) word-level encoder encodes the sentences word-by-word independently, (2) sentence-level encoder encodes the document sentence-by-sentence, (3) sentence extractor makes binary classification for each sentence, (4) hierarchical attention calculates the word-level and sentence-level context vectors for decoding steps, (5) decoder decodes the output word sequence sequentially with a beam-search algorithm.
We have conducted experiments on a news corpus, the CNN/DailyMail dataset [16]. The results demonstrate that adding the auxiliary extractive task and constraining the attention are both useful to improve the performance of the abstractive task, and our proposed joint model is comparable to the state-of-the-art abstractive models.
2 Neural Summarization Model
In this section we describe the framework of our proposed model, which consists of five components. As illustrated in Fig. 1, the hierarchical document encoder, which includes both the word-level and the sentence-level encoders, reads the input word sequences and generates shared document representations. On one hand, the shared representations are fed into the sentence extractor, which is a sequence labeling model, to calculate salience scores. On the other hand, the representations are used to generate abstractive summaries by a GRU-based language model, with the hierarchical attention including the sentence-level attention and word-level attention. Finally, the two tasks are jointly trained.
We encode the document in a hierarchical way. In particular, the word sequences are first encoded by a bidirectional GRU network in parallel, and a sequence of sentence-level vector representations called sentence embeddings is generated. Then the sentence embeddings are fed into another bidirectional GRU network to get the document representations. Such an architecture has two advantages. First, it can reduce the negative effects during the training process caused by the long-term dependency problem, so that the document can be represented from both local and global aspects. Second, it helps improve the training efficiency, as the time complexity of an RNN-based model increases with the sequence length.
Formally, let $V$ denote the vocabulary which contains $D$ tokens, and each token is embedded as a $d$-dimensional vector. Given an input document $X$ containing $m$ sentences $\{X_i, i \in 1, \ldots, m\}$, let $n_i$ denote the number of words in $X_i$.
Word-level Encoder reads a sentence word-by-word until the end, using a bidirectional GRU network as in the following equations:

$$\overrightarrow{h}^w_{i,j} = \mathrm{GRU}(x_{i,j}, \overrightarrow{h}^w_{i,j-1}), \qquad \overleftarrow{h}^w_{i,j} = \mathrm{GRU}(x_{i,j}, \overleftarrow{h}^w_{i,j+1})$$

where $x_{i,j}$ represents the embedding vector of the $j$th word in the $i$th sentence. $h^w_{i,j}$ is a concatenated vector of the forward hidden state $\overrightarrow{h}^w_{i,j}$ and the backward hidden state $\overleftarrow{h}^w_{i,j}$. $H$ is the size of the hidden state.
Furthermore, the $i$th sentence is represented by a non-linear transformation of the word-level hidden states, where $s_i$ is the sentence embedding and $W$, $b$ are learnable parameters.
Sentence-level Encoder reads a document sentence-by-sentence until the end, using another bidirectional GRU network:

$$\overrightarrow{h}^s_i = \mathrm{GRU}(s_i, \overrightarrow{h}^s_{i-1}), \qquad \overleftarrow{h}^s_i = \mathrm{GRU}(s_i, \overleftarrow{h}^s_{i+1})$$

The concatenated vectors $h^s_i = [\overrightarrow{h}^s_i; \overleftarrow{h}^s_i]$ are the document representations shared by the two tasks, which will be introduced next.
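The following is a minimal PyTorch sketch of this hierarchical encoder (the paper's own implementation is in TensorFlow). The exact non-linear pooling from word states to the sentence embedding $s_i$ is not reproduced in this text, so mean pooling followed by a tanh layer is an assumption; dimensions follow the hyper-parameters reported later (embedding size 300, hidden size 200).

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Word-level BiGRU -> sentence embeddings -> sentence-level BiGRU."""

    def __init__(self, vocab_size=50000, emb_dim=300, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.sent_gru = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        # Assumed pooling: mean over word states, then a non-linear projection to s_i
        self.pool = nn.Sequential(nn.Linear(2 * hidden, 2 * hidden), nn.Tanh())

    def forward(self, doc):
        # doc: (num_sentences, max_words) word ids of a single document
        word_states, _ = self.word_gru(self.embed(doc))        # (m, n_i, 2H), h^w_{i,j}
        sent_emb = self.pool(word_states.mean(dim=1))          # (m, 2H), s_i
        sent_states, _ = self.sent_gru(sent_emb.unsqueeze(0))  # (1, m, 2H), h^s_i
        return word_states, sent_states.squeeze(0)

# Example: a document with 3 sentences of up to 5 words each.
encoder = HierarchicalEncoder()
doc = torch.randint(0, 50000, (3, 5))
word_states, sent_states = encoder(doc)
print(word_states.shape, sent_states.shape)  # torch.Size([3, 5, 400]) torch.Size([3, 400])
```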
The sentence extractor can be viewed as a sequential binary classifier. We use a logistic function to calculate a score between 0 and 1, which is an indicator of whether or not to keep the sentence in the final summary. The score can also be considered as the salience of a sentence in the document. Let $p_i$ denote the score and $q_i \in \{0, 1\}$ denote the result of whether or not to keep the sentence. In particular, $p_i$ is calculated as

$$p_i = \sigma(W_{extr} h^s_i + b_{extr})$$

where $W_{extr}$ is the weight and $b_{extr}$ is the bias, which can be learned.
The sentence extractor generates a sequence of probabilities indicating the importance of the sentences. As a result, the extractive summary is created by selecting sentences with a probability larger than a given threshold $\tau$. We set $\tau = 0.5$ in our experiment. We choose the cross entropy as the extractive loss.
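A minimal sketch of this extractor head: a logistic score per sentence, thresholding at τ = 0.5 for the extractive summary, and a cross-entropy loss against the binary labels. The layer shape is an assumption based on the shared sentence representations above.

```python
import torch
import torch.nn as nn

class SentenceExtractor(nn.Module):
    """Scores each sentence representation h^s_i with a logistic unit."""

    def __init__(self, sent_dim=400):
        super().__init__()
        self.linear = nn.Linear(sent_dim, 1)   # W_extr, b_extr

    def forward(self, sent_states):
        # sent_states: (m, sent_dim) shared document representations
        return torch.sigmoid(self.linear(sent_states)).squeeze(-1)  # p_i in (0, 1)

extractor = SentenceExtractor()
sent_states = torch.randn(6, 400)                # 6 sentences
labels = torch.tensor([1., 0., 1., 0., 0., 0.])  # q_i: keep sentence or not
scores = extractor(sent_states)

extract_loss = nn.functional.binary_cross_entropy(scores, labels)  # extractive cross-entropy loss
summary_ids = (scores > 0.5).nonzero(as_tuple=True)[0]             # sentences kept with tau = 0.5
print(extract_loss.item(), summary_ids.tolist())
```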
… $c^s_t$ is the sentence-level context vector and $c^w_t$ is the word-level context vector at decoding time step $t$. Specifically, $\alpha_{t,i}$ denotes the attention value on the $i$th sentence and $\beta_{t,i,j}$ denotes the attention value on the $j$th word of the $i$th sentence.
The input of the GRU-based language model at decoding time step $t$ contains three vectors: the word embedding of the previously generated word $\hat{y}_{t-1}$, the sentence-level context vector of the previous time step $c^s_{t-1}$, and the word-level context vector of the previous time step $c^w_{t-1}$. They are transformed by a linear function and fed into the language model as follows:
$$\tilde{h}_t = \mathrm{GRU}(\tilde{h}_{t-1}, f_{in}(\hat{y}_{t-1}, c^s_{t-1}, c^w_{t-1}))$$
where $\tilde{h}_t$ is the hidden state of decoding time step $t$, and $f_{in}$ is the linear transformation function with $W_{dec}$ as the weight and $b_{dec}$ as the bias.
The hidden states of the language model are used to generate the output word sequence. The conditional probability distribution over the vocabulary at the $t$th time step is

$$P(\hat{y}_t \mid \hat{y}_1, \ldots, \hat{y}_{t-1}, x) = g(f_{out}(\tilde{h}_t, c^s_t, c^w_t))$$

where $g$ is the softmax function and $f_{out}$ is a linear function with $W_{soft}$ and $b_{soft}$ as learnable parameters.
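The sketch below shows one decoding step as described above, assuming concrete layer shapes for $f_{in}$ and $f_{out}$; it is illustrative only and not the authors' TensorFlow code.

```python
import torch
import torch.nn as nn

class DecoderStep(nn.Module):
    def __init__(self, vocab_size=50000, emb_dim=300, hidden=200, ctx_dim=400):
        super().__init__()
        self.f_in = nn.Linear(emb_dim + 2 * ctx_dim, hidden)      # merges y_{t-1}, c^s_{t-1}, c^w_{t-1}
        self.cell = nn.GRUCell(hidden, hidden)
        self.f_out = nn.Linear(hidden + 2 * ctx_dim, vocab_size)  # uses h~_t, c^s_t, c^w_t

    def forward(self, h_prev, y_prev_emb, cs_prev, cw_prev, cs_t, cw_t):
        x = self.f_in(torch.cat([y_prev_emb, cs_prev, cw_prev], dim=-1))
        h_t = self.cell(x, h_prev)                                 # h~_t
        logits = self.f_out(torch.cat([h_t, cs_t, cw_t], dim=-1))
        return h_t, torch.log_softmax(logits, dim=-1)              # log P(y_t | y_<t, x)

step = DecoderStep()
h = torch.zeros(1, 200)
y_emb, cs, cw = torch.randn(1, 300), torch.randn(1, 400), torch.randn(1, 400)
h, log_probs = step(h, y_emb, cs, cw, cs, cw)
print(log_probs.shape)  # torch.Size([1, 50000])
```

The decoder's negative log-likelihood loss is then simply the negated log probability of each gold summary word at each step.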
The negative log likelihood loss is applied as the loss of the decoder.

The sentence-level attention is normalized over the sentences of the document:

$$\alpha_{t,i} = \frac{e^s_{t,i}}{\sum_{k=1}^{m} e^s_{t,k}}$$

where $e^s_{t,i}$ is an attention score computed from the decoder state and the sentence representation $h^s_i$, and $V_s$, $W^{dec}_1$, $W^s_1$ and $b^s_1$ are learnable parameters.
The word-level attention indicates the salience distribution over the source words. As the hierarchical encoder reads the input sentences independently, our model has two distinctions. First, the word-level attention is calculated within a sentence. Second, we multiply the word-level attention by the sentence-level attention of the sentence which the word belongs to. The word-level attention calculation is shown below:

$$\beta_{t,i,j} = \alpha_{t,i} \cdot \frac{e^w_{t,i,j}}{\sum_{l=1}^{n_i} e^w_{t,i,l}}$$

where $V_w$, $W^{dec}_2$, $W^w_2$ and $b^w_2$ are learnable parameters for the word-level attention calculation.
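A small NumPy sketch of this two-level attention: α is a softmax over sentence scores, word scores are normalized within their sentence, and β is the product of the two. Random scores stand in for the learned terms $e^s$ and $e^w$, whose exact scoring functions are not reproduced here.

```python
import numpy as np

def hierarchical_attention(sent_scores, word_scores):
    """sent_scores: (m,) raw e^s_{t,i}; word_scores: (m, n) raw e^w_{t,i,j}."""
    # Sentence-level attention: softmax over the m sentences
    alpha = np.exp(sent_scores) / np.exp(sent_scores).sum()
    # Word-level attention: softmax within each sentence, scaled by its sentence's alpha
    word_exp = np.exp(word_scores)
    beta = alpha[:, None] * word_exp / word_exp.sum(axis=1, keepdims=True)
    return alpha, beta

rng = np.random.default_rng(0)
alpha, beta = hierarchical_attention(rng.normal(size=4), rng.normal(size=(4, 6)))
print(alpha.sum(), beta.sum())  # both ~1.0: beta is a distribution over all source words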
The abstractive summary of a long document can be viewed as a new expression of the most salient sentences of the document, so that a well-learned sentence extractor and a well-learned attention distribution should both be able to detect the important sentences of the source document. Motivated by this, we design a constraint on the sentence-level attention, which is an L2 loss between the attention and the extracted sentence salience.

The parameters are trained to minimize the joint loss function. In the inference stage, we use the beam search algorithm to select the word which approximately maximizes the conditional probability [17,18,28].
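To make the joint training concrete, the sketch below combines the decoder loss, the extractive loss, and an L2 constraint between the sentence-level attention and the extractor's salience scores. The weighting scheme uses λ and γ as in the hyper-parameters reported later (λ = 100, γ = 0.5), but the exact combination formula is an assumption, since the paper's joint loss equation is not reproduced in this text.

```python
import torch

def joint_loss(dec_nll, extract_loss, alpha, salience, lam=100.0, gamma=0.5):
    """
    dec_nll:      negative log-likelihood loss of the abstractive decoder
    extract_loss: cross-entropy loss of the sentence extractor
    alpha:        (T, m) sentence-level attention at each decoding step
    salience:     (m,) extractor scores p_i, detached so they act as a target
    """
    # L2 constraint pulling the (averaged) sentence attention toward the salience scores
    attn_constraint = ((alpha.mean(dim=0) - salience.detach()) ** 2).sum()
    return dec_nll + gamma * extract_loss + lam * attn_constraint

loss = joint_loss(torch.tensor(3.2), torch.tensor(0.7),
                  torch.softmax(torch.randn(10, 6), dim=-1),
                  torch.sigmoid(torch.randn(6)))
print(loss.item())
```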
Table 1. The statistics of the CNN/DailyMail dataset. S.S.N indicates the average number of sentences in the source document. S.S.L indicates the average length of the sentences in the source document. T.S.L indicates the average length of the sentences in the target summary.

Dataset         Train    Valid   Test    S.S.N  S.S.L  T.S.L
CNN/DailyMail   277,554  13,367  11,443  26.9   27.3   53.8
In our implementation, we set the vocabulary size $D$ to 50K and the word embedding size $d$ to 300. The word embeddings have not been pretrained, as the training corpus is large enough to train them from scratch. We cut off the documents at a maximum of 35 sentences and truncate the sentences to a maximum of 50 words. We also truncate the target summaries to a maximum of 100 words. The word-level encoder and the sentence-level encoder each correspond to a layer of bidirectional GRU, and the decoder is a layer of unidirectional GRU. All three networks have the hidden size $H$ of 200. For the loss function, $\lambda$ is set to 100 and $\gamma$ is set to 0.5. During the training process, we use the Adagrad optimizer [31] with a learning rate of 0.15 and an initial accumulator value of 0.1. The mini-batch size is 16. We implement the model in TensorFlow and train it using a GTX-1080Ti GPU. The beam search size for decoding is 5. We use ROUGE scores [20] to evaluate the summarization models.
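For reference, these hyper-parameters can be collected into a single configuration object, e.g. as below; the field names are arbitrary, only the values are taken from the text above.

```python
from dataclasses import dataclass

@dataclass
class Config:
    vocab_size: int = 50_000     # D
    emb_dim: int = 300           # d, embeddings trained from scratch
    max_sents: int = 35          # document truncated to 35 sentences
    max_words: int = 50          # each sentence truncated to 50 words
    max_summary_len: int = 100   # target summaries truncated to 100 words
    hidden: int = 200            # H, for both encoders and the decoder
    lam: float = 100.0           # lambda in the loss function
    gamma: float = 0.5           # gamma in the loss function
    lr: float = 0.15             # Adagrad learning rate
    init_accum: float = 0.1      # Adagrad initial accumulator value
    batch_size: int = 16
    beam_size: int = 5

print(Config())
```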
4 Experimental Results
We compare the full-length Rouge F1 scores on the entire CNN/DailyMail test set. We use the fundamental sequence-to-sequence attentional model and the words-lvt2k-hieratt model [13] as baselines. The results are shown in Table 2.
Table 2. Performance comparison of various abstractive models on the entire CNN/DailyMail test set using full-length F1 variants of Rouge.

Model                 Rouge-1  Rouge-2  Rouge-L
words-lvt2k-hieratt   35.4     13.3     32.6
From Table 2, we can see that our model performs the best in Rouge-1, Rouge-2 and Rouge-L. Compared to the vanilla sequence-to-sequence attentional model, our proposed model performs considerably better. And compared to the hierarchical model, our model performs better in Rouge-L, which is due to the incorporation of the auxiliary task.
To verify the effectiveness of our proposed model, we conduct an ablation study by removing the corresponding parts, i.e., the auxiliary extractive task, the attention constraint, and the combination of them, in order to make a comparison among their contributions.

Table 3. Performance comparison of removing the components of our proposed model on the entire CNN/DailyMail test set using full-length F1 variants of Rouge.

… a more important role in our framework.
We list some examples of the generated summaries for a source document (news) in Fig. 2. The source document contains 24 sentences with 660 words in total. Figure 2 presents three summaries: a golden summary, which is the news highlight written by the reporter, the summary generated by our proposed model, and the summary generated by the sequence-to-sequence attentional model.
From the figure we can see that all system-generated summaries are copied words from the source document, because the highlights written by reporters used for training are usually partly copied from the source. However, different models have different characteristics. As illustrated in Fig. 2, all the four summaries are able to catch several key sentences from the document. The fundamental seq2seq+attn model misses some words like pronouns, which leads to grammatical mistakes in the generated summary. Our model without the auxiliary extractive task is able to detect more salient content, but the concatenated sentences have some grammatical mistakes and redundant words. Our model without the attention constraint generates fluent sentences which are very similar to the source sentences, but it focuses on just a small part of the source document. The summary generated by our proposed full model is most similar to the golden summary: it covers as much information and keeps correct grammar. It does not just copy sentences but uses segmentation. Moreover, it changes the order of the source sentences while keeping the logical coherence.
Our model has advantages from three aspects. First, summaries generated by our model contain as much important information and perform well grammatically. In practice, it depends on users' preference between the information coverage and condensibility to make a suitable balance. Compared to the low-recall abstractive methods, our model is able to cover more information. And compared to the extractive methods, the generated summaries are more coherent logically. Second, the time complexity of our approach is much less than the baselines due to the hierarchical structures, and our model is trained more quickly compared to those baselines. Third, as our key contribution is to improve the performance of the main task by incorporating an auxiliary task, in this experiment we just use normal GRU-based encoders and decoder for simplicity. More novel designs for the decoder, such as a hierarchical decoder, can also be applied and incorporated into our model.

Fig. 2. An example of summaries for a piece of news. From top to bottom, the first is the source document, which is the raw news content. The second is the golden summary, which is used as the ground truth. The third is the summary generated by our proposed model. The last is the summary generated by the vanilla sequence-to-sequence attentional model.
5 Related Work
The neural attentional abstractive summarization model was first applied in sentence compression [12], where the input sequence is encoded by a convolutional network and the output sequence is decoded by a standard feedforward Neural Network Language Model (NNLM). Chopra et al. [13] and Lopyrev [23] switched to RNN-type models as the encoder, and did experiments on various values of hyper-parameters. To address the out-of-vocabulary problem, Gu et al. [24], Cao et al. [26] and See et al. [25] presented the copy mechanism, which adds a selection operation between the hidden state and the output layer at each decoding time step so as to decide whether to generate a new word from the vocabulary or copy the word directly from the source sentence.

The sequence-to-sequence model with attention mechanism [13] achieves competitive performance for sentence compression, but document summarization is still a challenge for it. Some researchers use a hierarchical encoder to address the long-term dependency problem, yet most of the works are for extractive summarization tasks. Nallapati et al. [29] fed the input word embeddings extended with new features to the word-level bidirectional GRU network and generated sequential labels from the sentence-level representations. Cheng and Lapata [16] presented a sentence extraction and word extraction model, encoding the sentences independently using Convolutional Neural Networks and decoding a binary sequence for sentence extraction as well as a word sequence for word extraction. Nallapati et al. [21] proposed a hierarchical attention with a hierarchical encoder, in which the word-level attention represents a probability distribution over the entire document.

Most previous works consider extractive summarization and abstractive summarization as two independent tasks. The extractive task has the advantage of preserving the original information, and the abstractive task has the advantage of generating coherent sentences. It is thus reasonable and feasible to combine these two tasks. Tan et al. [14], as the first attempt to combine the two, tried to use the extracted sentence scores to calculate the attention for the abstractive decoder. But their proposed model, which uses an unsupervised graph-based model to rank the sentences, is of high computation cost and takes a long time to train.
6 Conclusion
In this work we have presented a sequence-to-sequence model with a hierarchical document encoder and hierarchical attention for abstractive summarization, and incorporated extractive summarization as an auxiliary task. We jointly train the two tasks by sharing the same document encoder. The auxiliary task and the attention constraint contribute to improving the performance of the main task. Experiments on the CNN/DailyMail dataset show that our proposed framework is comparable to the state-of-the-art abstractive models. In the future, we will try to reduce the labels of the auxiliary task and incorporate semi-supervised and unsupervised methods.
Acknowledgements. This research has been supported by an innovative technology fund (project no. GHP/036/17SZ) from the Innovation and Technology Commission of Hong Kong, and a donated research project (project no. 9220089) at City University of Hong Kong.
References

5. Yao, J., Wan, X., Xiao, J.: Recent advances in document summarization. Knowl. Inf. Syst. 53, 297–336 (2017). https://doi.org/10.1007/s10115-017-1042-4
6. Cheung, J.C.K., Penn, G.: Unsupervised sentence enhancement for automatic summarization. In: EMNLP, Doha, pp. 775–786 (2014)
7. Gerani, S., Mehdad, Y., Carenini, G., Ng, R.T., Nejat, B.: Abstractive summarization of product reviews using discourse structure. In: EMNLP, Doha, pp. 1602–1613 (2014)
8. Fang, Y., Zhu, H., Muszynska, E., Kuhnle, A., Teufel, S.H.: A proposition-based abstractive summarizer. In: COLING, Osaka, pp. 567–578 (2016)
9. Liu, F., Flanigan, J., Thomson, S., Sadeh, N., Smith, N.A.: Toward abstractive summarization using semantic representations. In: NAACL-HLT, Denver, pp. 1077–1086 (2015)
10. Li, J., Luong, M.T., Jurafsky, D.: A hierarchical neural autoencoder for paragraphs and documents. arXiv preprint arXiv:1506.01057 (2015)
11. Yang, Z., Yang, D., Dyer, C., He, X., Smola, A.J., Hovy, E.H.: Hierarchical attention networks for document classification. In: NAACL-HLT, San Diego, pp. 1480–
22. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
23. Lopyrev, K.: Generating news headlines with recurrent neural networks. arXiv preprint arXiv:1512.01712 (2015)
24. Gu, J., Lu, Z., Li, H., Li, V.O.: Incorporating copying mechanism in sequence-to-sequence learning. arXiv preprint arXiv:1603.06393 (2016)
25. See, A., Liu, P.J., Manning, C.D.: Get to the point: summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368 (2017)
26. Cao, Z., Luo, C., Li, W., Li, S.: Joint copying and restricted generation for paraphrase. In: AAAI, San Francisco, pp. 3152–3158 (2017)
27. Luong, M.T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025 (2015)
28. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: NIPS, Montreal, pp. 3104–3112 (2014)
29. Nallapati, R., Zhai, F., Zhou, B.: SummaRuNNer: a recurrent neural network based sequence model for extractive summarization of documents. In: AAAI, San Francisco, pp. 3075–3081 (2017)
30. Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., et al.: Google's neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)
31. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. JMLR 12, 2121–2159 (2011)
Rank-Integrated Topic Modeling: A General Framework
Zhen Zhang, Ruixuan Li, Yuhua Li(B), and Xiwu Gu
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
{zenzang,rxli,idcliyuhua,guxiwu}@hust.edu.cn
Abstract. Rank-integrated topic models, which incorporate link structures into topic modeling through topical ranking, have shown promising performance compared to other link-combined topic models. However, existing work on rank-integrated topic modeling treats ranking as the document distribution for a topic, and therefore cannot integrate topical ranking with the LDA model, which is one of the most popular topic models. In this paper, we introduce a new method to integrate topical ranking with topic modeling and propose a general framework for topic modeling of documents with link structures. By interpreting the normalized topical ranking score vectors as topic distributions for documents, we fuse ranking into topic modeling in a general framework. Under this general framework, we construct two rank-integrated PLSA models and two rank-integrated LDA models, and present the corresponding learning algorithms. We apply our models on four real datasets and compare them with baseline topic models and the state-of-the-art link-combined topic models in generalization performance, document classification, document clustering and topic interpretability. Experiments show that all rank-integrated topic models perform better than baseline models, and rank-integrated LDA models outperform all the compared models.
Keywords: Normalized topical ranking · Topic distribution · Rank-integrated topic modeling framework
1 Introduction
With the rapid development of online information systems, document networks, i.e., information networks associated with text information, are becoming pervasive in our digital library. For example, research papers are linked together via citations, web pages are connected by hyperlinks, and Tweets are connected via social relationships. To better mine values from documents with link structures, we study the problem of building topic models of document networks.

The most popular topic models include PLSA (Probabilistic Latent Semantic Analysis) [1] and LDA (Latent Dirichlet Allocation) [2]. Traditional topic models assume documents are independent of each other, and links among them are not considered in the modeling process.
Intuitively, linked documents should have similar semantic information, which can be utilized in topic modeling.

To take advantage of link structures in document networks, several topic models have been proposed. One line of this work is to build unified generative models for both texts and links, such as iTopic [3] and RTM [4], and the other line is to add regularization into topic modeling, such as a graph-based regularizer [5] or a rank-based regularizer [6]. As a state-of-the-art link-combined topic model, LIMTopic [7] incorporates link structures into topic modeling through topical ranking. However, LIMTopic treats topical ranking as the document distribution for a topic, which means that topical ranking can only be combined with the symmetric PLSA model. Therefore LIMTopic cannot be combined with the popular LDA model. To solve this problem, we normalize topical ranking vectors along the topic dimension and treat them as topic distributions for documents. Link structures are then fused with text information by iteratively performing topical ranking and topic modeling in a mutually enhanced framework.
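A minimal sketch of this core idea, assuming the topical ranking scores are available as a non-negative document-by-topic matrix: normalizing each row along the topic dimension turns a document's ranking vector γ_d into a topic distribution θ_d that can be fed back into the topic model. This is an illustrative NumPy sketch, not the authors' code.

```python
import numpy as np

def ranking_to_topic_distribution(gamma):
    """gamma: (D, K) non-negative topical ranking scores gamma_d over K topics."""
    # Normalize each document's ranking vector along the topic dimension so that
    # it can be interpreted as theta_d, a topic distribution for document d.
    return gamma / gamma.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
gamma = rng.random((5, 3))            # ranking scores for 5 documents, 3 topics
theta = ranking_to_topic_distribution(gamma)
print(theta.sum(axis=1))              # each row sums to 1
```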
In this paper, we propose a general framework for rank-integrated topic modeling, which can be integrated with both PLSA and LDA models. The main contributions of this paper are summarized as follows.

– A novel approach to integrate topical ranking with topic modeling is proposed, upon which we build a general rank-integrated topic modeling framework for document networks.
– Under this general framework, we construct two rank-integrated PLSA models, namely RankPLSA and HITSPLSA, and two rank-integrated LDA models, i.e., RankLDA and HITSLDA.
– Extensive experiments on three publication datasets and one Twitter dataset demonstrate that rank-integrated topic models perform better than baseline models. Moreover, rank-integrated LDA models consistently perform better than all the compared models.
2 Related Work
Topic modeling algorithms are unsupervised machine learning methods that lyze words of documents to discover themes that run through the corpus anddistributions on these themes for each document PLSA [1] and LDA [2] are twomost well known topic models However, both PLSA and LDA treat documents
ana-in a given corpus as ana-independent to each other Sana-ince their presence, variouskinds of models have been proposed by incorporating contextual informationinto topic modeling, such as time [8] and links [3,4,7,9] Several recent worksintroduce embeddings into topic modeling to improve topic interpretability [10]
Trang 4018 Z Zhang et al.
or reduce computation complexity [11] To better cope with word sparsity, manyshort text-based topic models have been proposed [12] Topic models have alsobeen explored in other research domains, such as recommender system [13] Themost similar work to ours is the LIMTopic framework [7] The distinguished fea-ture of our work is that we treat topical ranking vectors as topic distributions ofdocuments while LIMTopic treats them as document distributions of topics Ourmethod is arguably more flexible and can construct both rank-integrated PLSAand LDA model under a unified framework while LIMTopic can only work withsymmetric PLSA model
Our work is also closely related to ranking technology PageRank and HITS(Hyperlink-Induced Topic Search) are two most popular link based ranking algo-rithms Topical link analysis [14] extends basic PageRank and HITS by comput-ing a score vector for each page to distinguish the contribution from differenttopics Yao et al [15] extend pair-wise ranking models with probabilistic topicmodels and propose a collaborative topic ranking model to alleviate data spar-sity problem in recommender system Ding et al [16] take a topic modelingapproach for preferences ranking by assuming that the preferences of each userare generated from a probabilistic mixture of a few latent global rankings thatare shared across the user population Both of Yao’s and Ding’s models focus onemploying topic modeling to solve ranking problem, while our work incorporateslink structures into topic modeling through ranking
D All the documents in the corpus
D, V, K The number of documents, unique words, topics in the corpus
d Document index in the corpus
N d The number of words in document d
μ The probability of generating specific documents
θ d The topic distribution of document d, expressed by a multinomial
distribution of topics
γ d The topical ranking vector of document d
w dn The nth word in document d, w dn ∈ {1, 2, V }
z dn The topic assignment of word w dn , z dn ∈ {1, 2, K}
β k The multinomial distribution over words specific to topic k
α, η Dirichlet priors to multinomial distribution θ , β