Semantic Web Technologies
Trends and Research in Ontology-based Systems
Copyright © 2006 John Wiley & Sons Ltd, The Atrium, Southern Gate,
Chichester, West Sussex, PO19 8SQ, England
Telephone (+44) 1243 779777
Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wiley.com
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770571.
This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Other Wiley Editorial Offices
John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA
Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA
Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany
John Wiley & Sons Australia Ltd, 42 McDougall Street, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1
Library of Congress Cataloging-in-Publication Data
Davies, J. (N. John)
Semantic Web technologies : trends and research in ontology-based systems / John Davies, Rudi Studer, Paul Warren.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-470-02596-3 (cloth : alk. paper)
ISBN-10: 0-470-02596-4 (cloth : alk. paper)
1. Semantic Web. I. Studer, Rudi. II. Warren, Paul. III. Title: Trends and research in ontology-based systems. IV. Title.
TK5105.88815.D38 2006
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN-13: 978-0-470-02596-3
ISBN-10: 0-470-02596-4
Typeset in 10/11.5 pt Palatino by Thomson Press (India) Ltd, New Delhi, India
Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.
1 Introduction
  1.1 Semantic Web Technologies
  1.2 The Goal of the Semantic Web
  1.3 Ontologies and Ontology Languages
  1.4 Creating and Managing Ontologies
  1.5 Using Ontologies
  1.6 Applications
  1.7 Developing the Semantic Web
2 Knowledge Discovery for Ontology Construction
  2.1 Introduction
  2.2 Knowledge Discovery
  2.3 Ontology Definition
  2.4 Methodology for Semi-automatic Ontology Construction
  2.5 Ontology Learning Scenarios
  2.6 Using Knowledge Discovery for Ontology Learning
    2.6.1 Unsupervised Learning
    2.6.2 Semi-Supervised, Supervised, and Active Learning
    2.6.3 Stream Mining and Web Mining
    2.6.4 Focused Crawling
    2.6.5 Data Visualization
  2.7 Related Work on Ontology Construction
  2.8 Discussion and Conclusion
  Acknowledgments
3 Semantic Annotation and Human Language Technology
  3.1 Introduction
  3.2 Information Extraction: A Brief Introduction
    3.2.1 Five Types of IE
    3.2.2 Entities
    3.2.3 Mentions
    3.2.4 Descriptions
    3.2.5 Relations
    3.2.6 Events
  3.3 Semantic Annotation
    3.3.1 What is Ontology-Based Information Extraction
  3.4 Applying ‘Traditional’ IE in Semantic Web Applications
    3.4.1 AeroDAML
    3.4.2 Amilcare
    3.4.4 S-Cream
    3.4.5 Discussion
  3.5 Ontology-based IE
    3.5.1 Magpie
    3.5.2 Pankow
    3.5.3 SemTag
    3.5.5 KIM Front-ends
  3.6 Deterministic Ontology Authoring using Controlled Language IE
  3.7 Conclusion
4 Ontology Evolution
  4.1 Introduction
  4.2 Ontology Evolution: State-of-the-art
    4.2.1 Change Capturing
    4.2.2 Change Representation
    4.2.3 Semantics of Change
    4.2.4 Change Propagation
    4.2.5 Change Implementation
    4.2.6 Change Validation
  4.3 Logical Architecture
  4.4 Data-driven Ontology Changes
    4.4.1 Incremental Ontology Learning
  4.5 Usage-driven Ontology Changes
    4.5.1 Usage-driven Hierarchy Pruning
  4.6 Conclusion
5 Reasoning With Inconsistent Ontologies: Framework, Prototype, and Experiment
  5.1 Introduction
  5.2 Brief Survey of Approaches to Reasoning with Inconsistency
    5.2.1 Paraconsistent Logics
    5.2.2 Ontology Diagnosis
    5.2.3 Belief Revision
    5.2.4 Synthesis
  5.3 Brief Survey of Causes for Inconsistency in the Semantic Web
    5.3.1 Inconsistency by Mis-representation of Default
    5.3.2 Inconsistency Caused by Polysemy
    5.3.3 Inconsistency through Migration from Another Formalism
    5.3.4 Inconsistency Caused by Multiple Sources
  5.4 Reasoning with Inconsistent Ontologies
    5.4.1 Inconsistency Detection
    5.4.2 Formal Definitions
  5.5 Selection Functions
  5.6 Strategies for Selection Functions
  5.7 Syntactic Relevance-Based Selection Functions
  5.8 Prototype of Pion
    5.8.1 Implementation
    5.8.2 Experiments and Evaluation
    5.8.3 Future Experiments
  5.9 Discussion and Conclusions
  Acknowledgment
6 Ontology Mediation, Merging, and Aligning
  6.1 Introduction
  6.2 Approaches in Ontology Mediation
    6.2.1 Ontology Mismatches
    6.2.2 Ontology Mapping
    6.2.3 Ontology Alignment
    6.2.4 Ontology Merging
  6.3 Mapping and Querying Disparate Knowledge Bases
    6.3.1 Mapping Language
    6.3.2 A (Semi-)Automatic Process for Ontology Alignment
    6.3.3 OntoMap: an Ontology Mapping Tool
  6.4 Summary
7 Ontologies for Knowledge Management
  7.1 Introduction
  7.2 Ontology Usage Scenario
  7.3 Terminology
    7.3.1 Data Qualia
    7.3.2 Sorts of Data
  7.4 Ontologies as RDBMS Schema
  7.5 Topic-ontologies Versus Schema-ontologies
  7.6 Proton Ontology
    7.6.1 Design Rationales
    7.6.2 Basic Structure
    7.6.3 Scope, Coverage, Compliance
    7.6.4 The Architecture of Proton
    7.6.5 Topics in Proton
    7.6.6 Proton Knowledge Management Module
  7.7 Conclusion
8 Semantic Information Access
  8.1 Introduction
  8.2 Knowledge Access and the Semantic Web
    8.2.1 Limitations of Current Search Technology
    8.2.2 Role of Semantic Technology
    8.2.3 Searching XML
    8.2.4 Searching RDF
    8.2.5 Exploiting Domain-specific Knowledge
    8.2.6 Searching for Semantic Web Resources
    8.2.7 Semantic Browsing
  8.3 Natural Language Generation from Ontologies
    8.3.1 Generation from Taxonomies
    8.3.2 Generation of Interactive Information Sheets
    8.3.3 Ontology Verbalisers
    8.3.4 Ontogeneration
    8.3.5 Ontosum and Miakt Summary Generators
  8.4 Device Independence: Information Anywhere
    8.4.1 Issues in Device Independence
    8.4.2 Device Independence Architectures and Technologies
    8.4.3 DIWAF
  8.5 SEKTAgent
  8.6 Concluding Remarks
9 Ontology Engineering Methodologies
  9.1 Introduction
  9.2 The Methodology Focus
    9.2.1 Definition of Methodology for Ontologies
    9.2.2 Methodology
    9.2.3 Documentation
    9.2.4 Evaluation
  9.3 Past and Current Research
    9.3.1 Methodologies
    9.3.2 Ontology Engineering Tools
    9.3.3 Discussion and Open Issues
  9.4 Diligent Methodology
    9.4.1 Process
    9.4.2 Argumentation Support
  9.5 First Lessons Learned
  9.6 Conclusion and Next Steps
10 Semantic Web Services – Approaches and Perspectives
  10.1 Semantic Web Services – A Short Overview
  10.2 The WSMO Approach
    10.2.1 The Conceptual Model – The Web Services Modeling Ontology (WSMO)
    10.2.2 The Language – The Web Service Modeling Language (WSML)
    10.2.3 The Execution Environment – The Web Service Modeling Execution Environment (WSMX)
  10.3 The OWL-S Approach
    10.3.1 OWL-S Service Profiles
    10.3.2 OWL-S Service Models
  10.4 The SWSF Approach
    10.4.1 The Semantic Web Services Ontology (SWSO)
    10.4.2 The Semantic Web Services Language (SWSL)
  10.5 The IRS-III Approach
    10.5.1 Principles Underlying IRS-III
    10.5.2 The IRS-III Architecture
    10.5.3 Extension to WSMO
  10.6 The WSDL-S Approach
    10.6.1 Aims and Principles
    10.6.2 Semantic Annotations
  10.7 Semantic Web Services Grounding: The Link Between SWS and Existing Web Services Standards
    10.7.1 General Grounding Uses and Issues
    10.7.2 Data Grounding
    10.7.3 Behavioural Grounding
  10.8 Conclusions and Outlook
11 Applying Semantic Technology to a Digital Library
  11.1 Introduction
  11.2 Digital Libraries: The State-of-the-art
    11.2.1 Working Libraries
    11.2.2 Challenges
    11.2.3 The Research Environment
  11.3 A Case Study: The BT Digital Library
    11.3.1 The Starting Point
    11.3.2 Enhancing the Library with Semantic Technology
  11.4 The Users’ View
  11.5 Implementing Semantic Technology in a Digital Library
    11.5.1 Ontology Engineering
    11.5.2 BT Digital Library End-user Applications
    11.5.3 The BT Digital Library Architecture
    11.5.4 Deployment View of the BT Digital Library
  11.6 Future Directions
12 Semantic Web: A Legal Case Study
  12.1 Introduction
  12.2 Profile of the Users
  12.3 Ontologies for Legal Knowledge
    12.3.1 Legal Ontologies: State of the Art
    12.3.2 Ontologies of Professional Knowledge: OPJK
    12.3.3 Benefits of Semantic Technology and Methodology
  12.4 Architecture
    12.4.1 Iuriservice Prototype
  12.5 Conclusions
13 A Semantic Service-Oriented Architecture for the Telecommunications Industry
  13.1 Introduction
  13.2 Introduction to Service-oriented Architectures
  13.3 A Semantic Service-orientated Architecture
  13.4 Semantic Mediation
    13.4.1 Data Mediation
    13.4.2 Process Mediation
  13.5 Standards and Ontologies in Telecommunications
    13.5.1 eTOM
    13.5.2 SID
    13.5.3 Adding Semantics
  13.6 Case Study
    13.6.1 Broadband Diagnostics
    13.6.2 The B2B Gateway Architecture
    13.6.3 Semantic B2B Integration Prototype
    13.6.4 Prototype Implementation
  13.7 Conclusion
14 Conclusion and Outlook
  14.1 Management of Networked Ontologies
  14.2 Engineering of Networked Ontologies
  14.3 Contextualizing Ontologies
  14.4 Cross Media Resources
  14.5 Social Semantic Desktop
  14.6 Applications
at its disposal the vast numbers of dedicated personnel needed to store, copy, and distribute books in a totally manual fashion. Gutenberg sought a better way to produce Bibles, and as a result changed fundamentally the control of knowledge in Western society. Within a few years, anyone who owned a printing press could distribute knowledge widely to anyone willing to read it.
In the late twentieth century, Berners-Lee had the goal of providing rapid, electronic access to the online technical reports and other documents created by the world’s high-energy physics laboratories. He sought to make it easier for physicists to access their arcane, distributed literature from a range of research centers scattered about the world. In the process, Berners-Lee laid the foundation for the World Wide Web. In 1989, Berners-Lee could only begin to imagine how his proposal to link technical reports via hypertext might someday change fundamentally essential aspects of human communication and social interaction. It was not his intention to revolutionize communication of information for e-commerce, for geographic reasoning, for government services, or for any of the myriad Web-based applications that we now take for granted. Our society changed irreversibly, however, when Berners-Lee invented HTML and HTTP.
The World Wide Web provides a dazzling array of information services, designed for use by people, and has become an ingrained part of our lives. There is another Web coming, however, where online information will be accessed by intelligent agents that will be able to reason about that information and communicate their conclusions in ways that we can only begin to dream about. This Semantic Web represents the next stage in the evolution of communication of human knowledge. Like Gutenberg, the developers of this new technology have no way of envisioning the ultimate ramifications of their work. They are, however, united by the conviction that creating the ability to capture knowledge in machine understandable form, to publish that knowledge online, to develop agents that can integrate that knowledge and reason about it, and to communicate the results both to people and to other agents, will do nothing short of revolutionize the way people disseminate and utilize information.
The European Union has long maintained a vision for the advent of the "information society," supporting several large consortia of academic and industrial groups dedicated to the development of infrastructure for the Semantic Web. One of these consortia has had the goal of developing Semantically Enabled Knowledge Technologies (SEKT; http://www.sekt-project.com), bringing together fundamental research, work to build novel software components and tools, and demonstration projects that can serve as reference implementations for future developers.
The SEKT project has brought together some of Europe’s leading contributors to the development of knowledge technologies, data-mining systems, and technologies for processing natural language. SEKT researchers have sought to lay the groundwork for scalable, semi-automatic tools for the creation of ontologies that capture the concepts and relationships among concepts that structure application domains; for the population of ontologies with content knowledge; and for the maintenance and evolution of these knowledge resources over time. The use of ontologies (and of procedural middleware and Web services that can operate on ontologies) emerges as the fundamental basis for creating intelligence on the Web, and provides a unifying framework for all the work produced by the SEKT investigators.
This volume presents a review and synopsis of current methods for engineering the Semantic Web while also documenting some of the early achievements of the SEKT project. The chapters of this book provide overviews not only of key aspects of Semantic Web technologies, but also of prototype applications that offer a glimpse of how the Semantic Web will begin to take form in practice. Thus, while many of the chapters deal with specific technologies such as those for Semantic Web services, metadata extraction, ontology alignment, and ontology engineering, the case studies provide examples of how these technologies can come together to solve real-world problems using Semantic Web techniques.
In recent years, many observers have begun to ask hard questions about what the Semantic Web community has achieved and what it can promise. The prospect of Web-based intelligence is so alluring that the scientific community justifiably is seeking clarity regarding the current state of the technology and what functionality is really on the horizon. In this regard, the work of the SEKT consortium provides an excellent perspective on contemporary research on Semantic Web infrastructure and applications. It also offers a glimpse of the kinds of knowledge-based resources that, in a few years’ time, we may begin to take for granted, just as we do current-generation text-based Web browsers and resources.
At this point, there is no way to discern whether the Semantic Web will affect our culture in a way that can ever begin to approximate the changes that have resulted from the invention of print media or of the World Wide Web as we currently know it. Indeed, there is no guarantee that many of the daunting problems facing Semantic Web researchers will be solved anytime soon. If there is anything of which we can be sure, however, it is that even the SEKT researchers cannot imagine all the ways in which future workers will tinker with Semantic Web technologies to engineer, access, manage, and reason with heterogeneous, distributed knowledge stores. Research on the Semantic Web is helping us to appreciate the enormous possibilities of amassing human knowledge online, and there is justifiable excitement and anticipation in thinking about what that achievement might mean someday for nearly every aspect of our society.
Mark A. Musen
Stanford, California, USA
January 2, 2006
1 Introduction
Paul Warren, Rudi Studer and John Davies
1.1 SEMANTIC WEB TECHNOLOGIES
That we need a new approach to managing information is beyond doubt. The technological developments of the last few decades, including the development of the World Wide Web, have provided each of us with access to far more information than we can comprehend or manage effectively. A Gartner study (Morello, 2005) found that ‘the average knowledge worker in a Fortune 1000 company sends and receives 178 messages daily’, whilst an academic study has shown that the volume of information in the public Web tripled between 2000 and 2003 (Lyman et al., 2005). We urgently need techniques to help us make sense of all this; to find what we need to know and filter out the rest; to extract and summarise what is important, and help us understand the relationships between it. Peter Drucker has pointed out that knowledge worker productivity is the biggest challenge facing organisations (Drucker, 1999). This is not surprising when we consider the increasing proportion of knowledge workers in the developing world. Knowledge management has been the focus of considerable attention in recent years, as comprehensively reviewed in (Holsapple, 2002). Tools which can significantly help knowledge workers achieve increased effectiveness will be tremendously valuable in the organisation.
At the same time, integration is a key challenge for IT managers. The costs of integration, both within an organisation and with external trading partners, are a significant component of the IT budget. Charlesworth (2005) points out that information integration is needed to ‘reach a better understanding of the business through its data’, that is to achieve a
Semantic Web Technologies: Trends and Research in Ontology-based Systems
John Davies, Rudi Studer, Paul Warren © 2006 John Wiley & Sons, Ltd