

Commenced Publication in 1973

Founding and Former Series Editors:

Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen


Yi Cai · Yoshiharu Ishikawa · Jianliang Xu (Eds.)

Web and Big Data

Second International Joint Conference, APWeb-WAIM 2018, Macau, China, July 23–25, 2018

Proceedings, Part II


ISSN 0302-9743 ISSN 1611-3349 (electronic)

Lecture Notes in Computer Science

ISBN 978-3-319-96892-6 ISBN 978-3-319-96893-3 (eBook)

https://doi.org/10.1007/978-3-319-96893-3

Library of Congress Control Number: 2018948814

LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI

© Springer International Publishing AG, part of Springer Nature 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.

The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland


This volume (LNCS 10987) and its companion volume (LNCS 10988) contain the proceedings of the second Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, called APWeb-WAIM. This joint conference aims to attract participants from different scientific communities as well as from industry, not merely from the Asia-Pacific region but also from other continents. The objective is to enable the sharing and exchange of ideas, experiences, and results in the areas of the World Wide Web and big data, thus covering Web technologies, database systems, information management, software engineering, and big data. The second APWeb-WAIM conference was held in Macau during July 23–25, 2018.

As an Asia-Pacific flagship conference focusing on research, development, and applications in relation to Web information management, APWeb-WAIM builds on the successes of APWeb and WAIM: APWeb was previously held in Beijing (1998), Hong Kong (1999), Xi'an (2000), Changsha (2001), Xi'an (2003), Hangzhou (2004), Shanghai (2005), Harbin (2006), Huangshan (2007), Shenyang (2008), Suzhou (2009), Busan (2010), Beijing (2011), Kunming (2012), Sydney (2013), Changsha (2014), Guangzhou (2015), and Suzhou (2016); and WAIM was held in Shanghai (2000), Xi'an (2001), Beijing (2002), Chengdu (2003), Dalian (2004), Hangzhou (2005), Hong Kong (2006), Huangshan (2007), Zhangjiajie (2008), Suzhou (2009), Jiuzhaigou (2010), Wuhan (2011), Harbin (2012), Beidaihe (2013), Macau (2014), Qingdao (2015), and Nanchang (2016). The first joint APWeb-WAIM conference was held in Beijing (2017). With the fast development of Web-related technologies, we expect that APWeb-WAIM will become an increasingly popular forum that brings together outstanding researchers and developers in the field of the Web and big data from around the world.

The high-quality program documented in these proceedings would not have been possible without the authors who chose APWeb-WAIM for disseminating their findings. Out of 168 submissions, the conference accepted 39 regular papers (23.21%), 31 short research papers, and six demonstrations. The contributed papers address a wide range of topics, such as text analysis, graph data processing, social networks, recommender systems, information retrieval, data streams, knowledge graphs, data mining and application, query processing, machine learning, database and Web applications, big data, and blockchain. The technical program also included keynotes by Prof. Xuemin Lin (The University of New South Wales, Australia), Prof. Lei Chen (The Hong Kong University of Science and Technology, Hong Kong SAR, China), and Prof. Ninghui Li (Purdue University, USA), as well as industrial invited talks by Dr. Zhao Cao (Huawei Blockchain) and Jun Yan (YiDu Cloud). We are grateful to these distinguished scientists for their invaluable contributions to the conference program.

As a joint conference, teamwork was particularly important for the success of APWeb-WAIM. We are deeply thankful to the Program Committee members and the external reviewers for lending their time and expertise to the conference. Special thanks go to the local Organizing Committee led by Prof. Zhiguo Gong.


Thanks also go to the workshop co-chairs (Leong Hou U and Haoran Xie), demo co-chairs (Zhixu Li, Zhifeng Bao, and Lisi Chen), industry co-chair (Wenyin Liu), tutorial co-chair (Jian Yang), panel chair (Kamal Karlapalem), local arrangements chair (Derek Fai Wong), and publicity co-chairs (An Liu, Feifei Li, Wen-Chih Peng, and Ladjel Bellatreche). Their efforts were essential to the success of the conference. Last but not least, we wish to express our gratitude to the treasurer (Andrew Shibo Jiang) and the Webmaster (William Sio) for all the hard work, and to our sponsors who generously supported the smooth running of the conference. We hope you enjoy the exciting program of APWeb-WAIM 2018 as documented in these proceedings.

Jianliang Xu
Yoshiharu Ishikawa


Organizing Committee

Honorary Chair

General Co-chairs

Program Co-chairs

Yoshiharu Ishikawa Nagoya University, Japan

Workshop Chairs

Demo Co-chairs


Publicity Co-chairs

Wen-Chih Peng National Taiwan University, China

Ladjel Bellatreche ISAE-ENSMA, Poitiers, France

Treasurers

Andrew Shibo Jiang Macau Convention and Exhibition Association, SAR China

Local Arrangements Chair

Derek Fai Wong

Webmaster

William Sio

Senior Program Committee

Christian Jensen Aalborg University, Denmark

Demetrios Zeinalipour-Yazti University of Cyprus, Cyprus

K. Selçuk Candan Arizona State University, USA

Kyuseok Shim Seoul National University, South Korea

Toshiyuki Amagasa University of Tsukuba, Japan

Wang-Chien Lee Pennsylvania State University, USA

Wen-Chih Peng National Chiao Tung University, Taiwan

Wook-Shin Han Pohang University of Science and Technology, South Korea
Xiaokui Xiao National University of Singapore, Singapore

Program Committee

Zouhaier Brahmia University of Sfax, Tunisia


Chih-Chien Hung Tamkang University, China

Daniele Riboni University of Cagliari, Italy

Defu Lian Big Data Research Center, University of Electronic Science and Technology of China, China

Dimitris Sacharidis Technische Universität Wien, Austria

Giovanna Guerrini Università di Genova, Italy

Guanfeng Liu The University of Queensland, Australia

Guoqiong Liao Jiangxi University of Finance and Economics, China

Hiroaki Ohshima University of Hyogo, Japan

Hongzhi Wang Harbin Institute of Technology, China

Ilaria Bartolini University of Bologna, Italy

Jeffrey Xu Yu Chinese University of Hong Kong, SAR China

Karine Zeitouni Université de Versailles Saint-Quentin, France

Lianghuai Yang Zhejiang University of Technology, China


Lisi Chen Wollongong University, Australia

Peiquan Jin University of Science and Technology of China, China

Ralf Hartmut Güting Fernuniversität in Hagen, Germany

Raymond Chi-Wing Wong Hong Kong University of Science and Technology, SAR China

Sanjay Madria Missouri University of Science and Technology, USA

Shuo Shang King Abdullah University of Science and Technology, Saudi Arabia

Tom Z. J. Fu Advanced Digital Sciences Center, Singapore

Vincent Oria New Jersey Institute of Technology, USA

Wolf-Tilo Balke Technische Universität Braunschweig, Germany

Xiang Zhao National University of Defence Technology, China

Xiangliang Zhang King Abdullah University of Science and Technology, Saudi Arabia

Xiaohui (Daniel) Tao The University of Southern Queensland, Australia


Xin Cao The University of New South Wales, Australia

Yang-Sae Moon Kangwon National University, South Korea

Yijie Wang National University of Defense Technology, China

Zakaria Maamar Zayed University, United Arab Emirates

Zhaonian Zou Harbin Institute of Technology, China


Keynotes


and Advances

Xuemin Lin

School of Computer Science and Engineering,

University of New South Wales, Sydney
lxue@cse.unsw.edu.au

Abstract. Graph data are key parts of Big Data and are widely used for modelling complex structured data with a broad spectrum of applications. Over the last decade, tremendous research efforts have been devoted to many fundamental problems in managing and analyzing graph data. In this talk, I will cover various applications, challenges, and recent advances. We will also look to the future of the area.


… of the art of LDP. We survey recent developments for LDP, and discuss protocols for estimating frequencies of different values under LDP and for computing marginals when each user has multiple attributes. Finally, we discuss limitations and open problems of LDP.

Lei Chen

Department of Computer Science and Engineering, Hong Kong University of Science and Technology
leichen@cse.ust.hk

Abstract. Recently, AI has become quite popular and attractive, not only to academia but also to industry. The successful stories of AI on AlphaGo and Texas hold 'em games raise significant public interest in AI. Meanwhile, human intelligence is turning out to be more sophisticated, and Big Data technology is everywhere to improve our life quality. The question we all want to ask is "what is next?" In this talk, I will discuss DHA, a new computing paradigm which combines big Data, Human intelligence, and AI. First I will briefly explain the motivation of DHA. Then I will present some challenges and possible solutions to build this new paradigm.


Contents – Part II

Database and Web Applications

Fuzzy Searching Encryption with Complex Wild-Cards Queries on Encrypted Database
He Chen, Xiuxia Tian, and Cheqing Jin

Towards Privacy-Preserving Travel-Time-First Task Assignment in Spatial Crowdsourcing
Jian Li, An Liu, Weiqi Wang, Zhixu Li, Guanfeng Liu, Lei Zhao, and Kai Zheng

Plover: Parallel In-Memory Database Logging on Scalable Storage Devices
Huan Zhou, Jinwei Guo, Ouya Pei, Weining Qian, Xuan Zhou, and Aoying Zhou

Inferring Regular Expressions with Interleaving from XML Data
Xiaolan Zhang, Yeting Li, Fei Tian, Fanlin Cui, Chunmei Dong, and Haiming Chen

Efficient Query Reverse Engineering for Joins and OLAP-Style Aggregations
Wei Chit Tan

DCA: The Advanced Privacy-Enhancing Schemes for Location-Based Services
Jiaxun Hua, Yu Liu, Yibin Shen, Xiuxia Tian, and Cheqing Jin

Data Streams

Discussion on Fast and Accurate Sketches for Skewed Data Streams: A Case Study
Shuhao Sun and Dagang Li

Matching Consecutive Subpatterns over Streaming Time Series
Rong Kang, Chen Wang, Peng Wang, Yuting Ding, and Jianmin Wang

A Data Services Composition Approach for Continuous Query on Data Streams
Guiling Wang, Xiaojiang Zuo, Marc Hesenius, Yao Xu, Yanbo Han, and Volker Gruhn

Discovering Multiple Time Lags of Temporal Dependencies from Fluctuating Events
Wentao Wang, Chunqiu Zeng, and Tao Li

A Combined Model for Time Series Prediction in Financial Markets
Hongbo Sun, Chenkai Guo, Jing Xu, Jingwen Zhu, and Chao Zhang

Data Mining and Application

Location Prediction in Social Networks
Rong Liu, Guanglin Cong, Bolong Zheng, Kai Zheng, and Han Su

Efficient Longest Streak Discovery in Multidimensional Sequence Data
Wentao Wang, Bo Tang, and Min Zhu

Map Matching Algorithms: An Experimental Evaluation
Na Ta, Jiuqi Wang, and Guoliang Li

Predicting Passenger's Public Transportation Travel Route Using Smart Card Data
Chen Yang, Wei Chen, Bolong Zheng, Tieke He, Kai Zheng, and Han Su

Detecting Taxi Speeding from Sparse and Low-Sampled Trajectory Data
Xibo Zhou, Qiong Luo, Dian Zhang, and Lionel M. Ni

Cloned Vehicle Behavior Analysis Framework
Minxi Li, Jiali Mao, Xiaodong Qi, Peisen Yuan, and Cheqing Jin

An Event Correlation Based Approach to Predictive Maintenance
Meiling Zhu, Chen Liu, and Yanbo Han

Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases
Zhaoan Dong, Ju Fan, Jiaheng Lu, Xiaoyong Du, and Tok Wang Ling

Improving Clinical Named Entity Recognition with Global Neural Attention
Guohai Xu, Chengyu Wang, and Xiaofeng He

Exploiting Implicit Social Relationship for Point-of-Interest Recommendation
Haifeng Zhu, Pengpeng Zhao, Zhixu Li, Jiajie Xu, Lei Zhao, and Victor S. Sheng

Spatial Co-location Pattern Mining Based on Density Peaks Clustering and Fuzzy Theory
Yuan Fang, Lizhen Wang, and Teng Hu

A Tensor-Based Method for Geosensor Data Forecasting
Lihua Zhou, Guowang Du, Qing Xiao, and Lizhen Wang

Keyphrase Extraction Based on Optimized Random Walks on Multiple Word Relations
Wenyan Chen, Zheng Liu, Wei Shi, and Jeffrey Xu Yu

Answering Range-Based Reverse kNN Queries
Zhefan Zhong, Xin Lin, Liang He, and Yan Yang

Big Data and Blockchain

EarnCache: Self-adaptive Incremental Caching for Big Data Applications
Yifeng Luo, Junshi Guo, and Shuigeng Zhou

Storage and Recreation Trade-Off for Multi-version Data Management
Yin Zhang, Huiping Liu, Cheqing Jin, and Ye Guo

Decentralized Data Integrity Verification Model in Untrusted Environment
Kun Hao, Junchang Xin, Zhiqiong Wang, Zhuochen Jiang, and Guoren Wang

Enabling Concurrency on Smart Contracts Using Multiversion Ordering
An Zhang and Kunlong Zhang

ElasticChain: Support Very Large Blockchain by Reducing Data Redundancy
Dayu Jia, Junchang Xin, Zhiqiong Wang, Wei Guo, and Guoren Wang

A MapReduce-Based Approach for Mining Embedded Patterns from Large Tree Data
Wen Zhao and Xiaoying Wu

Author Index

Contents – Part I

Similarity Calculations of Academic Articles Using Topic Events and Domain Knowledge
Ming Liu, Bo Lang, and Zepeng Gu

Sentiment Classification via Supplementary Information Modeling
Zenan Xu, Yetao Fu, Xingming Chen, Yanghui Rao, Haoran Xie, Fu Lee Wang, and Yang Peng

Training Set Similarity Based Parameter Selection for Statistical Machine Translation
Xuewen Shi, Heyan Huang, Ping Jian, and Yi-Kun Tang

Matrix Factorization Meets Social Network Embedding for Rating Prediction
Menghao Zhang, Binbin Hu, Chuan Shi, Bin Wu, and Bai Wang

An Estimation Framework of Node Contribution Based on Diffusion Information
Zhijian Zhang, Ling Liu, Kun Yue, and Weiyi Liu

Multivariate Time Series Clustering via Multi-relational Community Detection in Networks
Guowang Du, Lihua Zhou, Lizhen Wang, and Hongmei Chen

Recommender Systems

NSPD: An N-stage Purchase Decision Model for E-commerce Recommendation
Cairong Yan, Yan Huang, Qinglong Zhang, and Yan Wan

Social Image Recommendation Based on Path Relevance
Zhang Chuanyan, Hong Xiaoguang, and Peng Zhaohui

Representation Learning with Depth and Breadth for Recommendation Using Multi-view Data
Xiaotian Han, Chuan Shi, Lei Zheng, Philip S. Yu, Jianxin Li, and Gang Wang

UIContextListRank: A Listwise Recommendation Model with Social Contextual Information
Zhenhua Huang, Chang Yu, Jiujun Cheng, and Zhixiao Wang

LIDH: An Efficient Filtering Method for Approximate k Nearest Neighbor Queries Based on Local Intrinsic Dimension
Yang Song, Yu Gu, and Ge Yu

Query Performance Prediction and Classification for Information Search Systems
Zhongmin Zhang, Jiawei Chen, and Shengli Wu

Aggregate Query Processing on Incomplete Data
Anzhen Zhang, Jinbao Wang, Jianzhong Li, and Hong Gao

Machine Learning

Travel Time Forecasting with Combination of Spatial-Temporal and Time Shifting Correlation in CNN-LSTM Neural Network
Wenjing Wei, Xiaoyi Jia, Yang Liu, and Xiaohui Yu

DMDP2: A Dynamic Multi-source Based Default Probability Prediction Framework
Yi Zhao, Yong Huang, and Yanyan Shen

Brain Disease Diagnosis Using Deep Learning Features from Longitudinal MR Images
Linlin Gao, Haiwei Pan, Fujun Liu, Xiaoqin Xie, Zhiqiang Zhang, Jinming Han, and the Alzheimer's Disease Neuroimaging Initiative

Attention-Based Recurrent Neural Network for Sequence Labeling
Bofang Li, Tao Liu, Zhe Zhao, and Xiaoyong Du

Haze Forecasting via Deep LSTM
Fan Feng, Jikai Wu, Wei Sun, Yushuang Wu, HuaKang Li, and Xingguo Chen

Importance-Weighted Distance Aware Stocks Trend Prediction
Zherong Zhang, Wenge Rong, Yuanxin Ouyang, and Zhang Xiong

Knowledge Graph

Jointly Modeling Structural and Textual Representation for Knowledge Graph Completion in Zero-Shot Scenario
Jianhui Ding, Shiheng Ma, Weijia Jia, and Minyi Guo

Neural Typing Entities in Chinese-Pedia
Yongjian You, Shaohua Zhang, Jiong Lou, Xinsong Zhang, and Weijia Jia

Knowledge Graph Embedding by Learning to Connect Entity with Relation
Zichao Huang, Bo Li, and Jian Yin

StarMR: An Efficient Star-Decomposition Based Query Processor for SPARQL Basic Graph Patterns Using MapReduce
Qiang Xu, Xin Wang, Jianxin Li, Ying Gan, Lele Chai, and Junhu Wang

DAVE: Extracting Domain Attributes and Values from Text Corpus
Yongxin Shen, Zhixu Li, Wenling Zhang, An Liu, and Xiaofang Zhou

PRSPR: An Adaptive Framework for Massive RDF Stream Reasoning
Guozheng Rao, Bo Zhao, Xiaowang Zhang, Zhiyong Feng, and Guohui Xiao

Demo Papers

TSRS: Trip Service Recommended System Based on Summarized Co-location Patterns
Peizhong Yang, Tao Zhang, and Lizhen Wang

DFCPM: A Dominant Feature Co-location Pattern Miner
Yuan Fang, Lizhen Wang, Teng Hu, and Xiaoxuan Wang

CUTE: Querying Knowledge Graphs by Tabular Examples
Zichen Wang, Tian Li, Yingxia Shao, and Bin Cui

ALTAS: An Intelligent Text Analysis System Based on Knowledge Graphs
Xiaoli Wang, Chuchu Gao, Jiangjiang Cao, Kunhui Lin, Wenyuan Du, and Zixiang Yang

SPARQLVis: An Interactive Visualization Tool for Knowledge Graphs
Chaozhou Yang, Xin Wang, Qiang Xu, and Weixi Li

PBR: A Personalized Book Resource Recommendation System
Yajie Zhu, Feng Xiong, Qing Xie, Lin Li, and Yongjian Liu

Author Index


Database and Web Applications


Fuzzy Searching Encryption with Complex Wild-Cards Queries on Encrypted Database

He Chen1, Xiuxia Tian2(B), and Cheqing Jin1

1 School of Data Science and Engineering, East China Normal University, Shanghai, China
watch ch@163.com, cqjin@dase.ecnu.edu.cn
2 College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China
xxtian@fudan.edu.cn

Abstract. Achieving fuzzy searching encryption (FSE) can greatly enrich the basic functions over cipher-texts, especially on encrypted databases (like CryptDB). However, most proposed schemes are based on centralized inverted indexes, which cannot handle complicated queries with wild-cards. In this paper, we present a well-designed FSE schema that uses Locality-Sensitive-Hashing and Bloom-Filter algorithms to generate two types of auxiliary columns respectively. Furthermore, an adaptive rewriting method is described to satisfy queries with wild-cards, such as percent and underscore. Besides, security-enhanced improvements are provided to avoid extra message leakage. The extensive experiments show the effectiveness and feasibility of our work.

CryptDB

1 Introduction

Cloud database is a prevalent paradigm for data outsourcing. In consideration of data security and commercial privacy, both individuals and enterprises prefer outsourcing their data in encrypted form. CryptDB [21] is a typical outsourced encrypted database (OEDB) which supports executing SQL statements on cipher-texts. Its transparency essentially relies on the design of splitting attributions and rewriting queries on proxy middle-ware. Under this proxy-based encrypted framework, several auxiliary columns are extended with different encryptions, and query semantics are preserved through modifying or appending SQL statements.

Supported by the National Key Research and Development Program of China (No. 2016YFB1000905), NSFC (Nos. 61772327, 61532021, U1501252, U1401256 and 61402180), and the Project of Shanghai Science and Technology Committee Grant (No. 15110500700).

© Springer International Publishing AG, part of Springer Nature 2018
Y. Cai et al. (Eds.): APWeb-WAIM 2018, LNCS 10988, pp. 3–18, 2018.


To enrich the basic functions on cipher-texts, searchable symmetric encryption (SSE) was proposed for keyword searching with encrypted inverted indexes [4,13,22,24], and then dynamic SSE (DSSE) achieved alterations on various centralized indexes to enhance applicability [2,11,12,14]. Besides, studies about exact searching with boolean expressions have been extended in this field to increase accuracy [3,10]. Furthermore, research on similar searching among documents or words is widely discussed through introducing locality sensitive hashing algorithms [1,7-9,15,18,19,23,25-27]. However, these proposed schemes are not applicable to the OEDB scenario because of their centralized index design, and they cannot handle complex fuzzy searching with wild-cards.

Fig. 1. The client-proxy-database framework synthesizes various encryptions together: the determined encryption (DET) preserves the symmetric character for en/decryption, the order-preserving encryption (OPE) persists order among numeric values, the fuzzy searching encryption (FSE) handles queries on text, and the homomorphic encryption (HOM) achieves aggregation computing.

Therefore, it is meaningful and necessary to achieve fuzzy searching encryption over outsourced encrypted databases. As shown in Fig. 1, the specific framework accomplishes transparency and homomorphism by rewriting SQL statements on auxiliary columns. In this paper, we focus on resolving the functionality of 'like' queries with wild-cards ('%' and '_'). Our contributions are summarized as follows:

– We propose a fuzzy searching encryption with complex wild-cards queries on encrypted database, which extends extra functionality for the client-proxy-database framework like CryptDB.

– We present an adaptive rewriting method to handle different query cases on two types of auxiliary columns. The former column works for similar searching by locality sensitive hashing, and the latter multiple columns work for maximum substring matching by designed bloom-filter vectors.

– We evaluate the efficiency, correctness rate, and space overhead by adjusting the parameters in the auxiliary columns. Besides, security-enhanced improvements are provided to avoid extra message leakage. The extensive experiments also indicate the effectiveness and feasibility of our work.

The rest of the paper is organized as follows. Section 2 discusses the related work and Sect. 3 introduces some basic concepts and definitions. Section 4 describes our schema, including the initialization of auxiliary columns, adaptive rewriting of queries, and security-enhanced improvements. Section 5 presents the experiments, and a brief conclusion is given in Sect. 6.


2 Related Work

In recent years, many proposed schemes have been attempting to achieve fuzzy searching encryption with the help of similarity [1,8,9,15,23,25-27]. Most of them introduce locality sensitive hashing (LSH) to map similar items together and a bloom-filter to change the method of measuring. Wang et al.'s work [23] was one of the first to present fuzzy searching. They encode every word in each file into the same large bloom-filter space as a vector and evaluate the similarity of target queries by computing the inner product for top-k results among vectors. Kuzu et al.'s work [15] generates similar feature vectors by embedding keyword strings into the Euclidean space, which approximately preserves the relative edit distance. Fu et al.'s work [8] proposes an efficient multi-keyword fuzzy ranked search schema which is suitable for common spelling mistakes. It benefits from counting uni-grams among keywords and transvection sorting to obtain ranked candidates. Wang et al.'s work [26] generates a high-dimensional feature vector by LSH to support large-scale similarity search over encrypted feature-rich multimedia data. It stores encrypted inverted file identifier vectors as indexes while mapping similar objects into the same or neighboring keyword-buckets by LSH based on Euclidean distance. In contrast to sparse vectors from bi-gram mapping, their work eliminates the sparsity and promotes the correctness as well. However, there are many problems in existing schemes, including the insufficient metric conversion, the coarse-grained similarity comparison, the extreme dependency on assistant programs, and the neglect of wild-card queries.

Meanwhile, the proposal of CryptDB [21] has attracted world-wide attention because it provides a practical way to combine various attribution-preserving encryptions over an encrypted database. Many analogous researches [5,16,17,20] study its security definitions, feasible frameworks, extensible functions and optimizations. Chen et al. [5] consider these encrypted databases as a client-proxy-database framework and present a symmetric column for en/decryption and auxiliary columns for supporting executions. This framework helps execute SQL statements directly over cipher-texts through appending auxiliary columns with different encryptions. It also benefits from the transparency of en/decryption processes and combines various functional encryptions together. Therefore, it is meaningful to achieve efficient fuzzy searching with complex wild-cards queries on proxy-based encrypted database.

A N-gram. In the fields of computational linguistics and probability, the n-gram method is proposed for measurement by generating a contiguous sequence of items from given strings. Essentially, it converts texts to fragment sets for vectorization while preserving some connotative connections. As shown in Table 1, various n-gram methods are utilized to preserve different implicit inner relations from the origin strings.

Trang 28

Table 1. Various n-gram forms in our scheme

N-gram method        | Value                       | Description
String               | secure                      | The original keyword
Counting uni-gram [8]| s1, e1, c1, u1, r1, e2      | Preserves repetitions
Bi-gram              | #s, se, ec, cu, ur, re, e#  | Preserves adjacent letters
Tri-gram             | sec, ecu, cur, ure          | Preserves triple adjacent letters
Prefix and suffix    | @s, e@                      | Beginning and ending of sentence

In general, bi-gram is the most common converting method, which maintains the connotative information between adjacent letters. However, each change of a single letter doubly influences the bi-gram result and causes a reduction of matching probability. The counting uni-gram preserves repetitions and benefits letter-confused comparison cases, such as misspelling a letter, missing or adding a letter, and reversing the order of two letters. However, it reduces the degree of constraint along with increasing false positives. The tri-gram is a stricter method which only suits specific scenes like existence judgment. The prefix and suffix preserve the beginning and ending of data to meet edge-searching.
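Taking the word 'secure' from Table 1, the fragment generation can be sketched as follows. This is a minimal Python illustration of the table's variants, not the authors' implementation; the '#' and '@' boundary markers follow the table's examples.

from collections import Counter

def counting_unigram(word):
    # 'secure' -> ['s1', 'e1', 'c1', 'u1', 'r1', 'e2']: repeated letters are numbered
    seen = Counter()
    out = []
    for ch in word:
        seen[ch] += 1
        out.append(f"{ch}{seen[ch]}")
    return out

def bigram(word):
    # '#' marks the word boundary: 'secure' -> ['#s', 'se', 'ec', 'cu', 'ur', 're', 'e#']
    padded = f"#{word}#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]

def trigram(word):
    # 'secure' -> ['sec', 'ecu', 'cur', 'ure']
    return [word[i:i + 3] for i in range(len(word) - 2)]

def prefix_suffix(words):
    # '@' marks the beginning of the first word and the ending of the last one
    return [f"@{words[0][0]}", f"{words[-1][-1]}@"]

print(counting_unigram("secure"))  # ['s1', 'e1', 'c1', 'u1', 'r1', 'e2']
print(bigram("secure"))            # ['#s', 'se', 'ec', 'cu', 'ur', 're', 'e#']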

B Bloom-Filter. The Bloom-filter is a compact structure reflecting whether specific elements exist in a prepared union. In our schema, we introduce this algorithm to judge the existence of maximized substring fragments and to represent the sparse vector through decimal numbers in separated columns. Given a word-fragments set S = {e1, ..., e_{#e}}, a bloom-filter maps each element e_i into the same l-bit sparse array by k independent hash functions. A positive answer is provided only if all bits of the matched positions are true.
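As a rough sketch of how such an l-bit filter can be packed into l/32 integer columns (our own illustration; the concrete hash functions are not specified here, so the salted-digest choice below is an assumption):

import hashlib

L_BITS = 128   # filter length l, kept small for illustration
K_HASH = 3     # number of independent hash functions k

def positions(fragment, l=L_BITS, k=K_HASH):
    # Derive k bit positions from salted digests (an assumed stand-in
    # for the scheme's k independent hash functions).
    return [int(hashlib.md5(f"{i}:{fragment}".encode()).hexdigest(), 16) % l
            for i in range(k)]

def build_filter(fragments, l=L_BITS):
    bits = 0
    for frag in fragments:
        for p in positions(frag):
            bits |= 1 << p
    # Split the l-bit vector into 32-bit chunks, one per c-BF column.
    return [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(l // 32)]

def covers(columns, fragments):
    # Positive answer only if every matched position is set, mirroring
    # the '&' test rewritten onto the c-BF columns.
    target = build_filter(fragments)
    return all((c & t) == t for c, t in zip(columns, target))

frags_row = ["#s","su","ub","bs","st","tr","ri","in","ng","g#"]  # bi-grams of "substring"
frags_query = ["#s","st","tr","ri","in","ng","g#"]               # bi-grams of "string"
row = build_filter(frags_row)
print(covers(row, frags_query))  # True: the fragments of "string" are all covered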

C Locality Sensitive Hashing. The locality sensitive hashing (LSH) algorithm helps reduce the dimension of high-dimensional data. In our schema, we introduce this algorithm to map similar items together with high probability. The specific manifestation of the algorithm differs under different measurement standards; however, there is no available method for levenshtein distance over text, so a common practice is converting texts to fragment sets with n-gram methods.

Definition 1 (Locality sensitive hashing). Given a distance metric function dist, a family H of hash functions is (r1, r2, p1, p2)-sensitive if for any s, t ∈ {0, 1}^d and any h ∈ H:
– if dist(s, t) ≤ r1, then Pr[h(s) = h(t)] ≥ p1;
– if dist(s, t) ≥ r2, then Pr[h(s) = h(t)] ≤ p2.

For nearest neighbor searching, p1 > p2 and r1 < r2 are needed. Practically, feasible permutations are generated through surjective hashing functions with our security parameter λ, and the minhash algorithm maps the fragment sets of every separated word, which achieves similar searching.
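Because similarity here is jaccard similarity between fragment sets, minhash is the natural LSH family. The following is our own compact sketch; seeded random permutations stand in for the λ-keyed surjective hashing functions mentioned above.

import random

M_DIM = 4  # LSH dimension m: number of minhash features per word

def make_permutations(universe_size, m=M_DIM, seed=42):
    # Stand-ins for permutations derived from the security parameter.
    rng = random.Random(seed)
    perms = []
    for _ in range(m):
        p = list(range(universe_size))
        rng.shuffle(p)
        perms.append(p)
    return perms

def minhash(ipa, perms):
    # ipa: the set of sparse-vector positions hit by a word's fragments.
    # Two words share a feature with probability equal to the jaccard
    # similarity of their IPAs, which is what makes similar words collide.
    return [min(perm[pos] for pos in ipa) for perm in perms]

perms = make_permutations(universe_size=64)
token_a = minhash({1, 2, 4, 7, 10}, perms)
token_b = minhash({1, 2, 4, 7, 11}, perms)  # a near-duplicate word
print(token_a, token_b)  # most of the m features typically coincide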

Trang 29

3.2 Functional Model

Let D = (d_1, ..., d_{#D}) be the sensitive row data (each line contains some words respectively, as d_i = {w^i_j}, j = 1, ..., |d_i|), and let C = {c_det, c_lsh, c_bf} be the corresponding cipher-texts. Two types of indexing methods are enforced: the first achieves similar searching among words through dimension reduction with locality sensitive hashing (let m be the dimension of LSH, n be the tolerance, and L^n_m represent its conversion); the second achieves maximum substring matching through bit operations with a bloom-filter (let l be the length of the vector space, k be the amount of hashing functions, and B^k_l represent its conversion). We consider the LSH token set T_i = L^n_m(d_i) and the bloom-filter vector V_i = B^k_l(d_i) for each line.

Definition 2 (Fuzzy searching encryption). A proxy-based encrypted database implements fully fuzzy searching with rewriting SQL statements through the following polynomial-time algorithms:

– (K_det, L^n_m, B^k_l): given the dimension m of LSH and tolerance n, and the vector length l of BF and hash amount k, it outputs a primary key K_det for determining encryption, together with the conversions L^n_m and B^k_l that produce the ciphers T_i for similar searching and the ciphers V_i for maximum substring matching respectively.

– Given the expression from a 'like' clause, the adaptive rewriting method helps generate representing elements out of different considerations with the wild-cards condition. The determined cipher-texts are returned in the next step over the encrypted database, and K_det helps decryption.

As shown in the definition of fuzzy searching encryption, we mainly emphasize transformation processes like building, indexing, and executing. Other functional methods, such as updating and deleting, achieve the dynamicity of our schema. It is applicable to outsourced encrypted databases through rewriting SQL statements including 'create', 'insert', 'select' and so on.

Our security definition follows the widely-accepted security frameworks in this field [6,12,15,22]. In fuzzy query over encrypted database, the overall security relies on the cryptographic assurance of indexes and trapdoors. In our schema, we store the extra functional ciphers as indexes and rewrite queries as trapdoors. The security guarantee means there is no additional information leaked other than the functional results of the fuzzy query.


4 Proposed Fuzzy Searching Encryption

The multiple-attributions-splitting design in cloud database synthesizes various encryptions to preserve query semantics. As shown in Table 2, two types of auxiliary columns (c-LSH and c-BF) are appended on the cloud database along with a symmetrical determined column (DET).

Table 2. Storage pattern of multiple functional columns in database

DET                      | c-LSH                        | c-BF (l/32 columns)
0x1234 ("I love apple")  | 19030024, 01000409, 00020412 | 1077036627, 1957741388
0x5678 ("I love coconut")| 19030024, 01000409, 06000700 | 1626500087, 1687169793

This schema aims at handling queries with wild-cards on cipher-texts, so the appended columns store different functional ciphers with various encryptions: determination (DET) of data for equality, locality sensitive hashing (LSH) of word fragments for similar searching, and a bloom-filter (BF) among lines for maximum substring matching.

A c-LSH. The c-LSH column, which stores the locality sensitive hashing values of each sentence, represents a message digest after dimensionality reduction. It helps map similar items together with probability equal to the jaccard similarity between their inverted position arrays (IPA for short).

During the transforming process, n-gram methods such as bi-gram and counting uni-gram are utilized for dividing texts into fragments and finally into sparse vectors (IPAs). As shown in Fig. 2, the transforming process maps every row to separate signature collections step by step. This process changes the measurement from levenshtein distance on texts to jaccard similarity on IPAs, so that the particular minhash algorithm can reduce the dimensions of the numeric features for each subject (word). Finally, each word is converted to a linked sequence as a token, and the c-LSH column stores the token set, joined with commas, to represent the data of the whole line.

Fig. 2. A sample with the bi-gram method (counting uni-gram as well) to show the transforming process: (1) split the sentence in a line into multiple words; (2) transport a word to fragments with n-gram and build the inverted position array, e.g. IPA(love) = {1, 2, 4, 7, 10, 11, 12, 14, 17, 20}; (3) execute dimension reduction with LSH and get m features; (4) link the features to a token for each word; (5) combine the tokens in a line with commas.

B c-BF. The multiple auxiliary c-BF columns, which represent macroscopic bloom-filter spaces for each row, are implemented on several 'bigint' (32-bit) columns. The database returns the DET ciphers where all c-BF columns cover the target sequences through the native bit arithmetic operation '&'. Briefly, these columns are proposed for maximum substring matching, which is a supplement to the c-LSH column above.

During the mapping process, we respectively generate vectors for each row through bloom-filter hashing with the following n-gram methods: bi-gram, tri-gram, prefix and suffix. These auxiliary columns are designed for substring matching, so the implicit information needs to be maximally persisted from the origin strings. Through matching fragments between the target bit vector and the stored separated 'bigint' numbers, we can obtain all matched rows, as shown in Fig. 3.

Fig. 3. The maximum substring matching over c-BF vectors, which are stored in multiple 'bigint' auxiliary columns separately. After mapping fragments from the whole sentence (e.g. 'efficient substring search') to vectors, queries such as 'string' (IPA(string) = {2, 6, 58, ...}) execute with bit matching.

To meet application scenarios of inextensible cloud databases, we accomplish the operations completely through rewriting SQL statements with the native bit arithmetic operation over the multiple auxiliary columns, such as select m_det from t where m_bf0 & 1 = 1 and m_bf1 & 3 = 3. We experiment on the connection between the length of the sparse vector and the correct rate of maximum substring matching in Sect. 5.
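For illustration, the proxy-side rewriting of a substring query into such a statement might be sketched as follows (our own assumption-laden sketch: build_filter packs the query fragments into the same l-bit layout as the stored rows, and the m_bf0, m_bf1, ... names mirror the example statement above):

def rewrite_substring_query(fragments, build_filter, table="t"):
    # Pack the query fragments into the row filter layout, then demand
    # that every 32-bit chunk be covered via 'column & mask = mask'.
    chunks = build_filter(fragments)
    conditions = [f"m_bf{i} & {mask} = {mask}"
                  for i, mask in enumerate(chunks) if mask != 0]
    return f"select m_det from {table} where " + " and ".join(conditions)

# rewrite_substring_query(["#s", "st", "tr", "ri", "in", "ng", "g#"], build_filter)
# -> "select m_det from t where m_bf0 & <mask> = <mask> and ..."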

Trang 32

4.2 Adaptive Rewriting Method over Queries with Wild-Cards

In SQL, wild-card characters are used in 'like' expressions: the percent sign '%' matches zero or more characters and the underscore '_' matches a single character. Usually the former symbol is a coarse-grained comparable delimiter, and the latter can be tolerated by locality sensitive hashing in slightly different cases. So we construct an adaptive rewriting method over queries with wild-cards, as shown in Fig. 4.

We consider three basic cases according to the number of percent signs, to meet indivisible string fragments. Furthermore, in every basic case we also divide three sub-cases according to the number of underscores, to benefit from the different auxiliary columns. Besides, each query text is considered as both a whole word and a substring, while the experiment exhibits the optimal selection (a dispatch sketch follows the three cases below).

Fig. 4. Adaptive rewriting method over queries with wild-cards. We consider the percent sign as a coarse-grained separator, and a few underscores can be tolerated according to similarity.

Firstly, the double percent signs case means that the user attempts to find rows which contain the given string. Because the LSH function can tolerate small differences naturally, the sub-case with no underscore accomplishes similar searching among whole words. We achieve the one-underscore sub-case with the part matching method; this trick helps adjust the fineness of similar searching, as shown in Fig. 5. The multiple-underscores sub-case is achieved by maximum substring matching on the c-BF columns with the bloom-filter.

Secondly, the single percent sign case needs to consider the prefix and suffix. This type of query reflects more detailed information, and we match them all as substrings with the maximum degree of constraint through the various n-gram forms on the c-BF columns. Meanwhile, the prefix and suffix help preserve the beginning and ending information of whole sentences in a row. During the splitting process, every fragment with an underscore is abandoned, and the rest are mapped to the sparse bloom filter space represented by the IPA.


Thirdly, in the last no percent sign case, the user might already have most of the target information and attempt to match specific patterns with underscores. Besides the no-underscore sub-case, which can be treated as a determining equality operation, the maximum substring matching on the c-BF columns meets the rest of the sub-cases' requirements.
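The case analysis above can be condensed into a small dispatcher (our own rendering of Fig. 4; the strategy strings are paraphrases of the prose, not the paper's code):

def dispatch(pattern):
    # Count the wild-cards in the 'like' pattern and pick the auxiliary
    # column strategy described in Sect. 4.2.
    percents = pattern.count("%")
    underscores = pattern.count("_")
    if percents >= 2:                        # '%keyword%' style
        if underscores == 0:
            return "c-LSH: similar searching among whole words"
        if underscores == 1:
            return "c-LSH: part matching with tolerance n"
        return "c-BF: maximum substring matching"
    if percents == 1:                        # 'keyword%' or '%keyword' style
        return "c-BF: substring matching plus prefix/suffix fragments"
    if underscores == 0:                     # no wild-card at all
        return "DET: determining equality"
    return "c-BF: maximum substring matching"

print(dispatch("%am_rica%"))  # c-LSH: part matching with tolerance n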

Additionally, the tolerance parameter n is proposed as a flexible handler under the dimension m of the locality sensitive hashing auxiliary column. Briefly, every feature of a word is set as a fixed-length number, which is filled with zeros in the basic scheme. As a linked string of all m features, the token can be converted to different variants where some feature parts are replaced with underscores. We joint every possible case together for database searching with the keyword 'or' through a so-called bubble function, as shown in Fig. 5.

Fig. 5. The part matching method represents the adjustable fineness in the c-LSH column with the tolerance parameter n and the LSH dimension m. For instance, let m = 4, n = 3 and the target feature set be {1, 2, 3, 4}; the candidate set is then {_234, 1_34, 12_4, 123_} by this method.

The adaptive rewriting method helps generate trapdoor queries to meet the wild-cards fuzzy searching encryption in the database, through similar searching on the c-LSH column and maximum substring matching on the c-BF columns.
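A small sketch of the candidate generation behind Fig. 5 (the 'bubble' name comes from the paper, but the body below is our assumption):

from itertools import combinations

def bubble(features, n, wid=1):
    # features: the m minhash features of one word, zero-padded to width wid.
    # Keep any n of the m features and blank the others with '_', so the
    # rewritten query tolerates m - n differing features.
    m = len(features)
    padded = [str(f).zfill(wid) for f in features]
    variants = []
    for keep in combinations(range(m), n):
        parts = [padded[i] if i in keep else "_" * wid for i in range(m)]
        variants.append("".join(parts))
    return variants

def rewrite_like(features, n, wid=1):
    patterns = bubble(features, n, wid)
    return " or ".join(f"m_lsh like '%{p}%'" for p in patterns)

print(bubble([1, 2, 3, 4], n=3))   # ['123_', '12_4', '1_34', '_234']
print(rewrite_like([1, 2, 3, 4], 3))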

The security of our schema relies on three parts. The symmetric cryptography algorithm guarantees the security of the determining column, and the divided bloom-filter vectors are presented by unidentifiable hashing ciphers. However, the content of the c-LSH column might leak some extra information, such as the sizes and sequences of plain-texts. We present three improvements to enhance security, and an integrated algorithm, as follows.

A Linking Features Without Padding

In the basic scheme, we pad each feature with zeros up to the upper limit wid, which benefits the selecting process. To enhance security, we cancel the zero padding before linking features into a token. Meanwhile, the part matching method is also changed to an analogous bi-gram form. For instance, a security-enhanced part matching query is select m_det where m_lsh like '%ab%' or m_lsh like '%bc%', where a, b, c are multiple features of a word. We discuss the validity with experiments.

B Modifying Sequences of Tokens

Each line of the c-LSH auxiliary column stores a token set for the whole sentence. Therefore, the sequence of tokens might exhibit the relevancy among words to a malicious attacker. To overcome this leakage, we modify the sequences randomly by hashing permutations. Additionally, we implement the permutation function with P: y = a * x + b mod c, where a is relatively prime to c. This improvement protects the relation between invisible words and specific tokens. Since the matching only demands existence rather than order, this sequence modification helps for security protection.

C Appending Confusing Tokens

The token sets in a row leak the number of words. Appending tokens is a practical way to improve security, but what kind of token content should be added is the target of our discussion. The first way is appending repeated tokens from the set itself; it is simple and effective, but it only improves security to a limited extent. The second way is appending a few random tokens: because of sparsity and randomization, a few random tokens might not change the matching results. The third way is appending tokens combined from separated features among the token set; this way also influences the matching precision and increases the proportion of false positives. Actually, these ways help greatly enhance security despite the disturbances.

D Integrated Security Enhanced Algorithm

We present an integrated algorithm for security enhancement which combines all the above implementations. As shown in Algorithm 1, the algorithm transforms the token set in each row into a security-enhanced one. It helps prevent information leakage from the c-LSH column.

Algorithm 1. Security Enhanced Improvements

Input: token_m[#word_i], the m-dimensional token set of line i; wid, the width of a feature with zero padding; amount, the lower bound for appending tokens
Output: an optimal security enhanced set e_token[amount]

1:  Let t represent a token; each t can be split into m features by wid;
2:  Generate a permutation function F: y = a * x + b mod c, where c = amount and (a, c) = 1;
3:  Let c = 0 be the count for the permutation;
4:  foreach t in token_m[#word_i] do
5:      Generate a temporary string et;
11:     Generate a temporary string et; for int k = 0; k < m; k++ do
12:         Get a feature token_m[random()].substr(k * wid, (k + 1) * wid);
13:         Remove the zero prefix and link it to et;
15: return e_token[amount];
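Because several steps of the listing did not survive extraction, the following end-to-end sketch is speculative: it combines the three improvements as the prose describes them (features linked without zero padding, positions permuted by y = a * x + b mod c, and confusing tokens appended up to amount), with the lost steps filled in by assumption.

import math
import random

def enhance(tokens, m, wid, amount, a=2, b=3):
    # tokens: zero-padded token strings of one row, each of length m * wid;
    # requires len(tokens) <= amount and (a, amount) relatively prime.
    assert math.gcd(a, amount) == 1
    out = [None] * amount
    for count, t in enumerate(tokens):
        feats = [t[k * wid:(k + 1) * wid].lstrip("0") or "0" for k in range(m)]
        et = "".join(feats)                  # link features without padding
        out[(a * count + b) % amount] = et   # place tokens by permuted position
    # Fill the remaining slots with confusing tokens assembled from random
    # features of real tokens, hiding the true word count of the row.
    for i in range(amount):
        if out[i] is None:
            src = random.choice(tokens)
            feats = [src[k * wid:(k + 1) * wid].lstrip("0") or "0" for k in range(m)]
            out[i] = "".join(random.choice(feats) for _ in range(m))
    return out

print(enhance(["19030024", "01000409", "00020412"], m=4, wid=2, amount=5))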


5 Performance Evaluation

In this section, we evaluate the performance of our work. Firstly, we discuss the effect of different n-gram methods on matching accuracy in the c-LSH column. Secondly, we discuss the effect of the bloom-filter length on the collision degree and space usage of maximum substring matching. Thirdly, we discuss the performance of the adaptive rewriting method. Finally, we compare execution efficiency and space occupancy among the proposed schemes. The proposed scheme is implemented on a Core i5-4460 3.20 GHz PC with 16 GB memory, and the used datasets include 2000 TOEFL words, the leaked user data of CSDN, and the Reuters news.

Manifestations of Different N-gram Methods on c-LSH Column. Utilizing bi-gram and counting uni-gram, we achieve similar searching on the c-LSH column by introducing the minhash algorithm based on the jaccard distance of fragment sets. Intuitively, every change of a character greatly influences the corresponding fragment union under the bi-gram method, so we introduce the counting uni-gram method to balance this excessive relativity. In this experiment, we evaluate the performance of these two n-gram methods and the combined one respectively.

The dataset we used is a set of 2000 TOEFL words, and we construct three variants of it to reveal the efficiency of LSH-based similar searching under different n-gram methods. The ways of getting variants include appending a letter in the middle or on one side of every word, such as 'word' into 'words', 'wosrd', 'sword'. We calculate the average matched rows to reflect the searching results.

As shown in Fig. 6, we choose m = 4, 6, 8 to reveal the matched numbers through the part matching method with n. The accuracy rate has a big promotion when n is larger than half of m. Besides, the combined method performs well when m ≥ 6. The variation trend of accuracy reflects that the amount of false positives reduces while the correct items remain unchanged.

The Bloom-Filter Length on c-BF Columns. In the second experiment, we evaluate collision accuracy and space occupancy under the impacts of the bloom-filter length and the hashing function amount on the c-BF columns when executing maximum substring matching. In detail, we attempt to find an appropriate setting for the number of hashing functions and the vector length of our bloom-filter structures.

The dataset we used is a leaked accounts set from CSDN, one of the most famous technical forum websites in China, containing user names, passwords and e-mails. To guarantee effectiveness and avoid collisions, we change the vector length and keep the sparsity at several degrees, such as half, quarter, one-sixth and one-eighth. Meanwhile, different amounts of hashing functions in the bloom-filter influence accuracy and collision.


Fig. 6. The matching size under the tolerance parameter n and the LSH dimension m over variants of the TOEFL words set. The first three graphs show the performance of different n-gram methods with part-matching respectively. The last graph shows the LSH dimension with only complete-matching, when m = n.

Fig. 7. The experiments show the performance of maximum substring searching under different bloom-filter lengths l and different hashing amounts k. We utilize fifty thousand rows of leaked CSDN account data and set several degrees of sparsity for the bloom-filter vector, while each row contains 50 characters. The left graph shows the matching sizes of the substring '163.com' on l/32 auxiliary columns when we build indexes under different lengths of bloom-filter, and the right graph represents 'qq.com'.


Fig. 8. The experiment shows the performance of the adaptive rewriting method under different combinations, and reveals the most qualified modes for each fuzzy searching case. Some expressions are used, such as '%america%', '%am_rica%', '%am_ri_a%'.

Because the bloom-filter length l corresponds to the amount of c-BF columns, this experiment discusses the relations between matching accuracy and space occupancy under different amounts of bloom-filter hashing functions. As shown in Fig. 7, the matching size drops rapidly in the first place and then gets stable when the sparsity is close to one-sixth.

The Performance of Adaptive Rewriting Method. This experiment aims at verifying the effectiveness of the adaptive rewriting method. After the auxiliary columns store values as indexes, the 'like' clauses with wild-cards are analyzed by the adaptive rewriting method and rewritten to trapdoors. In this experiment, we consider the content of an expression as a word or a substring for comparison, and execute different types of queries with the basic and security-enhanced schemes respectively.

The dataset we used is the Reuters-21578 news of 1987 [28]. In this experiment, we mainly discuss the double '%' cases, because the other single '%' and no '%' cases carry out analogous steps; the only difference is that those cases additionally consider the prefix and suffix.

As shown in Fig. 8, we compare the matched sizes under different combinations. We also execute the original SQL statements on an extra stored plain-text column for contrast; it helps find the best combination modes under the various wild-card cases. We accomplish this experiment with the sparsity of the c-BF columns being one-sixth and the dimension of the c-LSH column being six. The graph shows that 'W and S' fits the double '%' no '_' and double '%' one '_' cases, while 'S' fits the double '%' few '_' case. Besides, we discuss the performance of the LSH-based security-enhanced method, and the graph confirms its feasibility.

Performance Comparison Among Proposed Schemes. In this section, we compare the efficiency of the proposed schemes for inserting and selecting data. In general, the inserting process involves generating indexing values in the auxiliary columns, and the selecting process involves decrypting the determined cipher-texts.


Fig. 9. This experiment shows the execution efficiency among the proposed schemes.

As shown in Fig. 9, our schema verifies this point and performs well compared to normal JDBC, Crypt-jdbc and CryptDB.

6 Conclusion

This paper investigates the problem of fuzzy searching encryption with complex wild-cards queries on proxy-based encrypted database, then gives a practical schema with two types of auxiliary columns and rewriting of SQL statements. Besides, security-enhanced implementations and extensive experiments show the effectiveness. In future, the serialization and compression of functional cipher-texts will be studied to reduce space overhead.

References

1. Boldyreva, A., Chenette, N.: Efficient fuzzy search on encrypted data. In: Cid, C., Rechberger, C. (eds.) FSE 2014. LNCS, vol. 8540, pp. 613–633. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46706-0_31

2. Cash, D., Jaeger, J., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.-C., Steiner, M.: Dynamic searchable encryption in very-large databases: data structures and implementation. In: Network and Distributed System Security Symposium (2014)

3. Cash, D., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.-C., Steiner, M.: Highly-scalable searchable symmetric encryption with support for boolean queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 353–373. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_20

4. Chase, M., Kamara, S.: Structured encryption and controlled disclosure. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 577–594. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17373-8_33

5. Chen, H., Tian, X., Yuan, P., Jin, C.: Crypt-JDBC model: optimization of onion encryption algorithm. J. Front. Comput. Sci. Technol. 11(8), 1246–1257 (2017)

6. Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. In: ACM Conference on Computer and Communications Security, pp. 79–88 (2006)

7. Fan, K., Yin, J., Wang, J., Li, H., Yang, Y.: Multi-keyword fuzzy and sortable ciphertext retrieval scheme for big data. In: 2017 IEEE Global Communications Conference, GLOBECOM 2017, pp. 1–6. IEEE (2017)

8. Fu, Z., Wu, X., Guan, C., Sun, X., Ren, K.: Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement. IEEE Trans. Inf. Forensics Secur. 11(12), 2706–2716 (2016)

9. Hahn, F., Kerschbaum, F.: Searchable encryption with secure and efficient updates. In: ACM SIGSAC Conference on Computer and Communications Security, pp. 310–320 (2014)

10. Jho, N.S., Chang, K.Y., Hong, D., Seo, C.: Symmetric searchable encryption with efficient range query using multi-layered linked chains. J. Supercomput. 72(11), 1–14 (2016)

11. Kamara, S., Papamanthou, C.: Parallel and dynamic searchable symmetric encryption. In: Sadeghi, A.-R. (ed.) FC 2013. LNCS, vol. 7859, pp. 258–274. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39884-1_22

12. Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: ACM Conference on Computer and Communications Security, pp. 965–976 (2012)

13. Kurosawa, K., Ohtaki, Y.: UC-secure searchable symmetric encryption. In: Keromytis, A.D. (ed.) FC 2012. LNCS, vol. 7397, pp. 285–298. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32946-3_21

14. Kurosawa, K., Sasaki, K., Ohta, K., Yoneyama, K.: UC-secure dynamic searchable symmetric encryption scheme. In: Ogawa, K., Yoshioka, K. (eds.) IWSEC 2016. LNCS, vol. 9836, pp. 73–90. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44524-3_5

15. Kuzu, M., Islam, M.S., Kantarcioglu, M.: Efficient similarity search over encrypted data. In: IEEE International Conference on Data Engineering, pp. 1156–1167 (2012)

16. Lesani, M.: MrCrypt: static analysis for secure cloud computations. ACM SIGPLAN Not. 48(10), 271–286 (2013)

17. Li, J., Liu, Z., Chen, X., Xhafa, F., Tan, X., Wong, D.S.: L-EncDB: a lightweight framework for privacy-preserving data queries in cloud computing. Knowl.-Based Syst. 79, 18–26 (2015)

18. Liu, Z., Li, J., Li, J., Jia, C., Yang, J., Yuan, K.: SQL-based fuzzy query mechanism over encrypted database. Int. J. Data Wareh. Min. (IJDWM) 10(4), 71–87 (2014)

19. Liu, Z., Ma, H., Li, J., Jia, C., Li, J., Yuan, K.: Secure storage and fuzzy query over encrypted databases. In: Lopez, J., Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 439–450. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38631-2_32

20. Popa, R.A., Li, F.H., Zeldovich, N.: An ideal-security protocol for order-preserving encoding. In: IEEE Symposium on Security and Privacy, pp. 463–477 (2013)

21. Popa, R.A., Redfield, C.M.S., Zeldovich, N., Balakrishnan, H.: CryptDB: protecting confidentiality with encrypted query processing. In: ACM Symposium on Operating Systems Principles, SOSP 2011, Cascais, Portugal, October 2011, pp. 85–100 (2011)

22. Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: IEEE Symposium on Security and Privacy, p. 44 (2000)

23. Wang, B., Yu, S., Lou, W., Hou, Y.T.: Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud. In: Proceedings of IEEE INFOCOM (2014)

26. Wang, Q., He, M., Du, M., Chow, S.S.M., Lai, R.W.F., Zou, Q.: Searchable encryption over feature-rich data. IEEE Trans. Dependable Secur. Comput. PP(99), 1 (2016)

27. Wei, X., Zhang, H.: Verifiable multi-keyword fuzzy search over encrypted data in the cloud. In: International Conference on Advanced Materials and Information Technology Processing (2016)

28. Wiki: Reuters. http://www.research.att.com/lewis
