PUBLISHED BY
Microsoft Press
A Division of Microsoft Corporation
One Microsoft Way
Redmond, Washington 98052-6399
Copyright © 2002 by Michael J. Young
All rights reserved. No part of the contents of this book may be reproduced or transmitted in any form
or by any means without the written permission of the publisher.
Library of Congress Cataloging-in-Publication Data
Young, Michael J.
XML Step by Step / Michael J. Young. 2nd ed.
p. cm.
Includes index.
ISBN 0-7356-1465-2
1. XML (Document markup language) I. Title.
QA76.76.H94 Y68 2001
Printed and bound in the United States of America.
1 2 3 4 5 6 7 8 9 QWT 6 5 4 3 2
Distributed in Canada by Penguin Books Canada Limited.
A CIP catalogue record for this book is available from the British Library.
Microsoft Press books are available through booksellers and distributors worldwide. For further information about international editions, contact your local Microsoft Corporation office or contact Microsoft Press International directly at fax (425) 936-7329. Visit our Web site at www.microsoft.com/mspress.
Send comments to mspinput@microsoft.com.
ActiveX, JScript, Microsoft, Microsoft Press, MSDN, Visual Basic, Visual Studio, and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. Other product and company names mentioned herein may be the trademarks of their respective owners.
The example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious. No association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred.
Acquisitions Editor: David J. Clark
Project Editor: Jean Cockburn
Body Part No. X08-24444
Contents
Preface
Introduction
Why Another XML Book? • What You’ll Learn in This Book • XML Step by Step, Internet Explorer, and MSXML • Using the Companion CD • Requirements • How to Contact the Author • Microsoft Press Support Information
Chapter 1 Why XML?
The Need for XML • Displaying XML Documents • SGML, HTML, and XML • The Official Goals of XML • Standard XML Applications • Real-World Uses for XML • XML Applications for Enhancing XML Documents
Chapter 2 Creating and Displaying Your First XML Document
Creating an XML Document • Displaying the XML Document
PART 2 Creating XML Documents
Chapter 3 Creating Well-Formed XML Documents
The Parts of a Well-Formed XML Document • Adding Elements to the Document • Adding Attributes to Elements • Using Namespaces
Chapter 4 Adding Comments, Processing Instructions, and CDATA Sections
Inserting Comments • Using Processing Instructions • Including CDATA Sections
Chapter 5 Creating Valid XML Documents Using Document Type Definitions
The Basic Criteria for a Valid XML Document • The Advantages of Making an XML Document Valid • Adding the Document Type Declaration • Declaring Element Types • Declaring Attributes • Using Namespaces in Valid Documents • Using an External DTD Subset • Converting a Well-Formed Document to a Valid Document
Chapter 6 Defining and Using Entities
Entity Definitions and Classifications • Declaring General Entities • Declaring Parameter Entities • Inserting Entity References • Inserting Character References • Using Predefined Entities • Adding Entities to a Document
Chapter 7 Creating Valid XML Documents Using XML Schemas
XML Schema Basics • Declaring Elements • Declaring an Element with a Simple Type • Declaring Attributes • Creating an XML Schema and an Instance Document
PART 3 Displaying XML Documents on the Web
Chapter 8 Displaying XML Documents Using Basic Cascading Style Sheets
The Basic Steps for Using a Cascading Style Sheet • Cascading in Cascading Style Sheets • Setting the display Property • Setting Font Properties • Setting the font-variant Property • Setting the color Property • Setting Background Properties • Setting Text Spacing and Alignment Properties
Chapter 9 Displaying XML Documents Using Advanced Cascading Style Sheets
Setting Box Properties • Using Pseudo-Elements (Internet Explorer 5.5 through 6.0 Only) • Inserting HTML Elements into XML Documents • Creating and Using a Full-Featured Cascading Style Sheet
Chapter 10 Displaying XML Documents Using Data Binding
The Main Steps • The First Step: Linking the XML Document to the HTML Page • The Second Step: Binding HTML Elements to XML Elements • Using Paging • Using Scripts with the DSO
Chapter 11 Displaying XML Documents Using Document Object Model Scripts
Linking the XML Document to the HTML Page • The Structure of the DOM • Accessing and Displaying XML Document Elements • Accessing and Displaying XML Document Attribute Values • Accessing XML Entities and Notations • Traversing an Entire XML Document • Checking an XML Document for Validity
Chapter 12 Displaying XML Documents Using XML Style Sheets
Using an XSLT Style Sheet: the Basics • Using a Single XSLT Template • Using Multiple Templates • Using Other Select and Match Expressions • Filtering and Sorting XML Data • Accessing XML Attributes • Referencing Namespaces in XSLT • Using Conditional Structures
Appendix Web Addresses for Further Information
General Information on XML • Internet Explorer and MSXML • XML Applications • Namespaces • URIs and URNs • XML Schemas • Cascading Style Sheets (CSS) • Data Binding and the Data Source Object (DSO) • ActiveX Data Objects (ADO) and the ADO recordset Object • HTML and Dynamic HTML (DHTML) • Microsoft JScript • The Document Object Model (DOM) • Extensible Stylesheet Language Transformations (XSLT) and XPath • Author’s Web Site
Index
Preface to the Second Edition
I finished writing the first edition of XML Step by Step around the time of the final snowfall over the southern Rockies in the spring of 2000. Less than a year later, I was once again witnessing the last snowfalls of the season and working on XML Step by Step, this time starting the second edition. I wrote this preface to discuss my goals in writing the second edition, to describe what’s new in this edition, and to explain why Microsoft Press and I decided to create a second edition so soon after the first.
My first goal in writing the second edition was to bring the book up-to-date with the many changes in XML technologies that had occurred since the book was originally published. The current version of the XML specification is still 1.0, as it was when I wrote the first edition. However, since I wrote that edition, the technologies used to display and work with XML have undergone many changes, and even the XML 1.0 specification itself has appeared in a second edition that includes error corrections and clarifications. The following are some of the important updates to the book. (Don’t worry if you haven’t heard of the XML technologies mentioned in this preface. They’re all explained in the book.)
■ To reflect the explosive growth of XML applications, I added 16 more XML applications to the list in Chapter 1. (Even with these additions, the list still represents only a small sampling of the current uses for XML.)
■ I wrote a new chapter (Chapter 7) covering XML schemas as defined by the World Wide Web Consortium (W3C) XML Schema specification, which achieved the status of recommendation in May 2001. An XML schema is used to define the content and structure of a class of XML documents. I also added several sections to Chapter 11 to explain how to check the validity of an XML document using an XML schema.
■ I completely revamped the final chapter in the book, which formerly covered XSL style sheets (based on the W3C’s December 1998 Extensible Stylesheet Language (XSL) Version 1.0 working draft), to cover the newer XSLT style sheets (based on the W3C’s November 1999 XSL Transformations (XSLT) Version 1.0 recommendation). I also covered many more style sheet features than before.
■ I updated the book to cover the XML features of Microsoft Internet Explorer versions 5.0 through 6.0, as well as MSXML 2.0 through 4.0. (MSXML is the software module that provides basic XML services for Internet Explorer. The first edition covered Internet Explorer versions 5.0 through 5.5, and MSXML 2.0 through 2.5.) Also, I recaptured each of the figures that shows a Windows element, such as a message box or the Internet Explorer window, using the Microsoft Windows XP Professional operating system.
My second goal in revising the book was to provide new or expanded coverage of important technologies and techniques that were already available when I wrote the first edition, but that I was unable to include, or to cover fully, because of space limitations. The second edition is about 100 pages longer than the first. The following are some of the important new topics you’ll find in these additional pages:
■ I added coverage to Chapter 3 of the often confusing topic of how white space (sequences of space, tab, or line break characters) is handled in XML documents.
■ I greatly expanded the coverage of the increasingly important topic of namespaces. Namespaces are used to qualify names in XML documents so that naming conflicts can be avoided. Chapter 3 now includes a general discussion of namespaces (in the section “Using Namespaces”), and later chapters (Chapters 5, 8, 10, 11, and 12) now include information on using namespaces with specific XML technologies.
■ I added a sidebar to Chapter 3 covering the new all-inclusive URI Internet addressing scheme (“URIs, URLs, and URNs”).
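To make the namespace idea above concrete, here is a minimal sketch in Python (not a language this book uses; it was chosen only so that the example runs on its own). The document, the `book` and `person` prefixes, and the example.com namespace URIs are all invented for illustration. The sketch shows two elements that share the local name `title` being told apart by the namespace each belongs to.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two vocabularies both define a "title" element.
# The namespace URIs are made-up identifiers, not real resources.
DOC = """\
<inventory xmlns:book="http://example.com/ns/book"
           xmlns:person="http://example.com/ns/person">
  <book:title>Leaves of Grass</book:title>
  <person:title>Dr.</person:title>
</inventory>
"""


def titles_by_namespace(xml_text):
    """Map each namespace URI to the text of its title element."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        # ElementTree expands a qualified name to "{namespace-uri}local-name".
        uri, _, local = child.tag[1:].partition("}")
        if local == "title":
            result[uri] = child.text
    return result


if __name__ == "__main__":
    for uri, text in titles_by_namespace(DOC).items():
        print(uri, "->", text)
```

Without the prefixes, a program reading this document would have no reliable way to tell a book title from a personal title; with them, each `title` carries its vocabulary along with it.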
Introduction
Extensible Markup Language, or XML, is currently the most promising language for storing and exchanging information on the World Wide Web. Although Hypertext Markup Language (HTML) is presently the most common language used to create Web pages, HTML has a limited capacity for storing information. In contrast, because XML allows you to create your own elements, attributes, and document structure, you can use it to describe virtually any kind of information, from a simple recipe to a complex database. And an XML document, in conjunction with a style sheet or a conventional HTML page, can be easily displayed in a Web browser. Because an XML document so effectively organizes and labels the information it contains, the browser can find, extract, sort, filter, arrange, and manipulate that information in highly flexible ways. XML thus provides an ideal solution for handling the rapidly expanding quantity and complexity of information that needs to be delivered on the Web.
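As a rough illustration of that last point, the following sketch uses Python's standard library rather than a Web browser to find, filter, and sort labeled data. The `INVENTORY`, `BOOK`, `TITLE`, and `PRICE` elements and the `Binding` attribute are hypothetical names made up for this example, as are the prices; they stand in for the kind of custom vocabulary XML lets you define.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; element names, attribute names, and prices
# are invented purely for illustration.
DOC = """\
<INVENTORY>
  <BOOK Binding="paperback"><TITLE>Moby-Dick</TITLE><PRICE>9.95</PRICE></BOOK>
  <BOOK Binding="hardcover"><TITLE>The Scarlet Letter</TITLE><PRICE>4.25</PRICE></BOOK>
  <BOOK Binding="paperback"><TITLE>Leaves of Grass</TITLE><PRICE>7.75</PRICE></BOOK>
</INVENTORY>
"""


def titles_sorted_by_price(xml_text, binding=None):
    """Return book titles ordered by ascending price.

    If binding is given, keep only books whose Binding attribute matches.
    """
    root = ET.fromstring(xml_text)
    books = root.findall("BOOK")  # find: every BOOK child of the root
    if binding is not None:
        # filter: keep only the requested binding
        books = [b for b in books if b.get("Binding") == binding]
    # sort: order by the numeric value of each PRICE element
    books.sort(key=lambda b: float(b.findtext("PRICE")))
    # extract: pull out just the TITLE text
    return [b.findtext("TITLE") for b in books]


if __name__ == "__main__":
    print(titles_sorted_by_price(DOC))
    print(titles_sorted_by_price(DOC, binding="paperback"))
```

Because every piece of data is labeled by its element or attribute name, the code never has to guess where a title or price sits in the text.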
Why Another XML Book?
XML can be confusing. XML applications are appearing at an astounding rate, and XML is intimately tied to an ever-increasing number of related standards and technologies used to format, display, process, and enhance XML documents. Many of these related standards and technologies are still in their infant stages, and are rapidly changing and evolving.
Most of the XML books that I have read attempt a comprehensive coverage of these technologies but get a bit lost in the maze. I believe that the typical XML book tries to survey too many XML technologies too superficially, without discriminating between the important and the unimportant, the practical and the impractical, the current and the future.
I wrote XML Step by Step to answer the most fundamental XML questions (what XML is, why it’s needed, and how it can be used) and to teach the most important, practical XML technologies available now.
Although I was quite selective in choosing the topics to include in this book, I cover each of them in depth, and avoid partial solutions. (For example, because I tell you how to define XML attributes in Part 2, in Part 3 I show you how to access these attributes when you display the document.)
I never truly understood XML until I started actually writing and displaying XML documents. Consequently, I gave this book a hands-on approach, including many step-by-step instructions, practical examples, and tutorial exercises. I avoided theoretical and abstract discussions that can be so difficult to understand with a topic like XML.
The book and companion CD are also unique in providing a complete XML learning kit. This kit provides all the information, instruction, and software that you need to learn the practical basics of creating and displaying XML documents. The book also includes a comprehensive set of links to a wealth of XML information on the Web, which you can explore if you want to go beyond the basics.
What You’ll Learn in This Book
Part 1 of this book (Chapters 1 and 2) provides a gentle introduction to XML and prepares you for the detailed information that comes later. Chapter 1 answers the basic questions I mentioned earlier: what XML is, why it’s needed, and how it’s being used to solve real-world problems. Chapter 2 provides a hands-on exercise that gives you a quick overview of the entire process of creating an XML document and displaying it in a Web browser.
Part 2 (Chapters 3 through 7) focuses on the rules and techniques for creating XML documents. Chapters 3 and 4 show you how to create well-formed XML documents, that is, documents that conform to the basic syntactical rules of XML. Chapters 5, 6, and 7 tell you how to create valid XML documents, that is, documents that not only conform to the basic syntactical rules, but also match a specific document structure that you define either in the document itself or in a separate file. The chapters in Part 2 are based primarily on version 1.0 of the official XML specification developed by the World Wide Web Consortium (W3C).
Part 3 (Chapters 8 through 12) teaches you the most important of the current techniques for displaying XML documents in Web browsers. Chapters 8, 9, and 12 explain how to display an XML document by linking a style sheet that provides the browser with formatting and other display instructions. Chapters 8 and 9 cover cascading style sheets. A cascading style sheet (CSS) is a simple type of style sheet that allows you to precisely control the way the document content is formatted, but doesn’t allow you to modify that content. Chapter 12 explains style sheets created with XSLT (Extensible Stylesheet Language Transformations). An XSLT style sheet is a more advanced type of style sheet that allows you not only to format the document content (using CSS properties), but also to select and modify the content, giving you complete control over the displayed output.
Chapters 10 and 11 teach you how to display an XML document by linking the document to a conventional HTML Web page that contains instructions for selecting and presenting the XML data. Chapter 10 explains how to do this using data binding, a straightforward technique that is suitable primarily for simple, symmetrically structured XML documents. Chapter 11 shows you how to display an XML document from an HTML page by writing a script that uses the XML Document Object Model (DOM), a much more flexible technique that allows you to display any type of XML document and any document component.
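To give a feel for what a DOM script does, here is a minimal sketch that walks every node of a small document. It is an assumption-laden stand-in: it uses Python's `xml.dom.minidom` module rather than a script in an HTML page running against Internet Explorer and MSXML, and the `INVENTORY` document and the `walk` helper are invented for illustration. The node properties it relies on (`nodeType`, `nodeName`, `childNodes`, `attributes`) are standard W3C DOM members.

```python
from xml.dom import minidom

# Hypothetical document, invented for this sketch.
DOC = (
    "<INVENTORY>"
    "<BOOK InStock='yes'>"
    "<TITLE>Moby-Dick</TITLE>"
    "<AUTHOR>Herman Melville</AUTHOR>"
    "</BOOK>"
    "</INVENTORY>"
)


def walk(node, depth=0, out=None):
    """Visit a node and its descendants, collecting (depth, name, detail).

    For an element, detail is a dict of its attributes; for a non-empty
    text node, detail is the text itself.
    """
    if out is None:
        out = []
    if node.nodeType == node.ELEMENT_NODE:
        out.append((depth, node.nodeName, dict(node.attributes.items())))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        out.append((depth, "#text", node.data.strip()))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


if __name__ == "__main__":
    document = minidom.parseString(DOC)
    for depth, name, detail in walk(document.documentElement):
        print("  " * depth + name, detail)
```

The same recursive pattern reaches any element, attribute, or text node regardless of how irregular the document is, which is why the DOM approach suits documents that do not fit a simple, symmetric record layout.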
note
Throughout this book, I use the term page to refer to HTML source and the term document to refer to XML source. I chose this convention to help clearly distinguish these two markup languages, which are often used in conjunction.
Part 3 focuses specifically on using the Microsoft Internet Explorer Web browser for displaying XML documents. (You’ll see more details on Internet Explorer in the following section of the Introduction.)
Finally, the Appendix provides the addresses of Web sites containing an abundance of further information on most of the topics covered in this book. I also include all of these addresses in the chapters, each in the appropriate context. You’ll find a copy of the Appendix on the companion CD in the Resource Links folder, under the filename Appendix.htm. (Instructions for installing the companion CD files are given later in the Introduction.) You can visit any of these Web sites by opening Appendix.htm in your Web browser and simply clicking a link, rather than typing the address into the browser.