
DOCUMENT INFORMATION

Basic information

Title: Beginning XML with DOM and Ajax: From Novice to Professional
Author: Sas Jacobs
Category: Guidebook
Year of publication: 2006
Pages: 455
File size: 9.07 MB


Sas Jacobs

Beginning XML with DOM and Ajax

From Novice to Professional


Beginning XML with DOM and Ajax: From Novice to Professional

Copyright © 2006 by Sas Jacobs

All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.

ISBN-13 (pbk): 978-1-59059-676-0

ISBN-10 (pbk): 1-59059-676-5

Printed and bound in the United States of America 9 8 7 6 5 4 3 2 1

Trademarked names may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, we use the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

Lead Editors: Charles Brown, Chris Mills

Technical Reviewer: Allan Kent

Editorial Board: Steve Anglin, Ewan Buckingham, Gary Cornell, Jason Gilmore, Jonathan Gennick, Jonathan Hassell, James Huddleston, Chris Mills, Matthew Moodie, Dominic Shakeshaft, Jim Sumser, Keir Thomas, Matt Wade

Project Manager: Beth Christmas

Copy Edit Manager: Nicole LeClerc

Copy Editor: Nicole Abramowitz

Assistant Production Director: Kari Brooks-Copony

Production Editor: Kelly Winquist

Compositor: Dina Quan

Proofreader: Dan Shaw

Indexer: Brenda Miller

Artist: Kinetic Publishing Services, LLC

Cover Designer: Kurt Krames

Manufacturing Director: Tom Debolski

Distributed to the book trade worldwide by Springer-Verlag New York, Inc., 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax 201-348-4505, e-mail orders-ny@springer-sbm.com, or visit http://www.springeronline.com.

For information on translations, please contact Apress directly at 2560 Ninth Street, Suite 219, Berkeley, CA 94710. Phone 510-549-5930, fax 510-549-5939, e-mail info@apress.com, or visit http://www.apress.com.

The information in this book is distributed on an “as is” basis, without warranty. Although every precaution has been taken in the preparation of this work, neither the author(s) nor Apress shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in this work.

The source code for this book is available to readers at http://www.apress.com in the Source Code section.

Contents at a Glance

About the Author xiii

About the Technical Reviewer xv

Acknowledgments xvii

Introduction xix

CHAPTER 1 Introduction to XML 1

CHAPTER 2 Related XML Recommendations 21

CHAPTER 3 Web Vocabularies 53

CHAPTER 4 Client-Side XML 99

CHAPTER 5 Displaying XML Using CSS 121

CHAPTER 6 Introduction to XSLT 169

CHAPTER 7 Advanced Client-Side XSLT Techniques 191

CHAPTER 8 Scripting in the Browser 225

CHAPTER 9 The Ajax Approach to Browser Scripting 265

CHAPTER 10 Using Flash to Display XML 293

CHAPTER 11 Introduction to Server-Side XML 317

CHAPTER 12 Case Study: Using .NET for an XML Application 349

CHAPTER 13 Case Study: Using PHP for an XML Application 381

INDEX 417

Contents

About the Author xiii

About the Technical Reviewer xv

Acknowledgments xvii

Introduction xix

CHAPTER 1 Introduction to XML 1

What Is XML? 2

A Brief History of XML 2

The Goals of XML 3

Understanding XML Syntax 4

Well-Formed Documents 4

Understanding the Difference Between Tags and Elements 5

Viewing a Complete XML Document 6

Understanding the Structure of an XML Document 7

Naming Rules in XML 8

Understanding the XML Document Prolog 9

Understanding Sections Within the XML Document Element 11

The XML Processing Model 16

XML Processing Types 17

DOM Parsing 17

SAX Parsing 17

Why Have Two Processing Models? 18

Some XML Tools 18

Summary 19

CHAPTER 2 Related XML Recommendations 21

Understanding the Role of XML Namespaces 21

Adding Namespaces to XML Documents 23

Adding Default Namespaces 23


Defining XML Vocabularies 24

The Document Type Definition 25

XML Schema 29

Comparing DTDs and Schemas 36

Other Schema Types 37

XML Vocabularies 37

Displaying XML 38

XML and CSS 39

XSL 39

XPath 44

XPath Expressions 45

Identifying Specific Nodes 46

Including Calculations and Functions 46

XPath Summary 47

Linking with XML 47

Simple Links 48

Extended Links 49

XPointer 50

XML Links Summary 51

Summary 51

CHAPTER 3 Web Vocabularies 53

XHTML 53

Separation of Presentation and Content 54

XHTML Construction Rules 56

XHTML Tools 66

Well-Formed and Valid XHTML Documents 67

XHTML Modularization 72

MathML 73

Presentation MathML 73

Content MathML 76

Scalable Vector Graphics 77

Vector Graphic Shapes 78

Images 80

Text 81

Putting It Together 82

Web Services 86

WSDL 86

SOAP 92


Other Web Vocabularies 96

RSS and News Feeds 96

VoiceXML 97

SMIL 97

Database Output Formats 97

Summary 98

CHAPTER 4 Client-Side XML 99

Why Use Client-Side XML? 99

Working with XML Content Client-Side 100

Styling Content in a Browser 100

Manipulating XML Content in a Browser 101

Working with XML in Flash 102

Examining XML Support in Major Browsers 103

Understanding the W3C DOM 103

Understanding the XML Schema Definition Language 104

Understanding XSLT 104

Microsoft Internet Explorer 104

Mozilla 112

Opera 114

Adobe (Formerly Macromedia) Flash 115

Choosing Between Client and Server 116

Using Client-Side XML 117

Using Server-Side XML 117

Summary 120

CHAPTER 5 Displaying XML Using CSS 121

Introduction to CSS 122

Why CSS? 122

CSS Rules 122

Styling XHTML Documents with CSS 124

Styling XML Documents with CSS 129

Attaching the Stylesheet 130

Selectors 130

Layout of XML with CSS 131

Understanding the W3C Box Model 132

Positioning in CSS 135


Displaying Tabular Data 150

Working with Display Properties 150

Working with Floating Elements 152

Table Row Spans 154

Linking Between Displayed XML Documents 154

XLink in Netscape and Firefox 155

Forcing Links Using the HTML Namespace 157

Adding Images in XML Documents 158

Adding Images with Netscape and Firefox 158

Using CSS to Add an Image 159

Using CSS to Add Content 160

Working with Attribute Content 162

Using Attributes in Selectors 163

Using Attribute Values in Documents 164

Summary 166

CHAPTER 6 Introduction to XSLT 169

Browser Support for XSLT 169

Using XSLT to Create Headers and Footers 170

Understanding XHTML, XSLT, and Namespaces 172

Creating the XSLT Stylesheet 172

Understanding the Stylesheet 174

Transforming the <body> Element 174

Applying the Transformation 175

Adding the Footer 175

Transformation Without Change 175

Creating a Table of Contents 176

Selecting Each Planet with <xsl:for-each> 179

Adding a New Planet 180

Presenting XML with XSLT 181

Moving from XHTML to XML 182

Styling the XML with XSLT 182

Removing Content with XSLT 184

Understanding the Role of XPath in XSLT 185

Including Images 186

Importing Templates 187

Including Templates 188

Tools for XSLT Development 188

Summary 190


CHAPTER 7 Advanced Client-Side XSLT Techniques 191

Sorting Data Within an XML Document 191

Sorting Dynamically with JavaScript 196

Adding Extension Functions (Internet Explorer) 203

Understanding More About Namespaces 205

Adding Extension Functions to the Stylesheet 206

Providing Support for Browsers Other Than IE 209

Working with Named Templates 210

Generating JavaScript with XSLT 213

Understanding XSLT Parameters 215

Understanding White Space and Modes 215

Working Through the onelinehtml Template 217

Finishing Off the Page 218

Generating JavaScript in Mozilla 219

XSLT Tips and Troubleshooting 220

Dealing with White Space 220

Using HTML Entities in XSLT 222

Checking Browser Type 222

Building on What Others Have Done 223

Understanding the Best Uses for XSLT 223

Summary 224

CHAPTER 8 Scripting in the Browser 225

The W3C XML DOM 225

Understanding Key DOM Interfaces 227

Examining Extra Functionality in MSXML 238

Browser Support for the W3C DOM 241

Using the xDOM Wrapper 241

xDOM Caveats 246

Using JavaScript with the DOM 246

Creating DOM Document Objects and Loading XML 247

XSLT Manipulation 251

Extracting Raw XML 253

Manipulating the DOM 253

Putting It into Practice 257

Understanding the Application 257

Examining the Code 258

Dealing with Large XML Documents 262

Summary 264


CHAPTER 9 The Ajax Approach to Browser Scripting 265

Understanding Ajax 266

Explaining the Role of Ajax Components 266

Understanding the XMLHttpRequest Object 267

Putting It Together 276

Username Validation with the XMLHttpRequest Object 276

Contacts Address Book Using an Ajax Approach 279

Using Cross-Browser Libraries 284

Sarissa 285

Other Ajax Frameworks and Toolkits 287

Backbase 287

Bindows 287

Dojo 287

Interactive Website Framework 287

qooxdoo 287

Criticisms of Ajax 288

Providing Visual Cues 288

Updating the Interface 288

Preloading Data 289

Providing Links to State and Enabling the Back Button 289

Ajax Best Practices and Design Principles 289

Minimizing Server Traffic 290

Using Standard Interface Methods 290

Using Wrappers or Libraries 290

Using Ajax Appropriately 290

Summary 290

CHAPTER 10 Using Flash to Display XML 293

The XML Class 294

Loading an XML Document 294

Understanding the XML Class 297

Understanding the XMLNode Class 298

Loading and Displaying XML Content in Flash 301

Updating XML Content in Flash 305

Sending XML Content from Flash 309


Using the XMLConnector Component 310

Loading an XML Document 311

Data Binding 313

Updating XML Content with Data Components 315

Understanding Flash Security 316

Summary 316

CHAPTER 11 Introduction to Server-Side XML 317

Server-Side vs Client-Side XML Processing 317

Server-Side Languages 318

.NET 319

PHP 321

Working Through Simple Examples 323

The XML Document 324

Transforming the XML 324

Adding a New DVD 331

Modifying an Existing DVD 339

Deleting a DVD 346

Summary 348

CHAPTER 12 Case Study: Using .NET for an XML Application 349

Understanding the Application 349

Setting Up the Environment 350

Understanding the Components of the News Application 352

Summary 380

CHAPTER 13 Case Study: Using PHP for an XML Application 381

Understanding the Application 381

Setting Up the Environment 381

Understanding Components of the Weather Portal Application 388

Summary 416

INDEX 417


About the Author

SAS JACOBS is a web developer who set up her own business, Anything Is Possible, in 1994, working in the areas of web development, IT training, and technical writing. The business works with large and small clients building web applications with .NET, Flash, XML, and databases.

Sas has spoken at such conferences as Flashforward, webDU (previously known as MXDU), and FlashKit on topics related to XML and dynamic content in Flash.

In her spare time, Sas is passionate about traveling, photography, running, and enjoying life.

About the Technical Reviewer

ALLAN KENT is a born-and-bred South African and still lives and works in Cape Town. He has been programming in various languages and on diverse platforms for more than 20 years. He is currently the head of technology at Saatchi & Saatchi Cape Town.

Acknowledgments

I want to thank everyone at Apress for their help, support, and advice during the writing of this book. Thanks also to my family, who have provided much support and love throughout the process.

Introduction

This book aims to provide a “one-stop shop” for developers who want to learn how to build Extensible Markup Language (XML) web applications. It explains XML and its role in the web development world. The book also introduces specific XML vocabularies and related XML recommendations.

I wrote the book for web developers at all levels. For those developers unfamiliar with XML applications, the book provides a great starting point and introduces some important client- and server-side techniques. More experienced developers can benefit from exposure to important coding techniques and from understanding the workflow involved in creating XML applications.

The book starts with an explanation of XML and introduces the different components of an XML document. It then shows some related recommendations, including Document Type Definitions (DTDs), XML schema, Cascading Style Sheets (CSS), Extensible Stylesheet Language Transformations (XSLT), XPath, XLink, and XPointer. I cover some common XML vocabularies, such as Extensible HyperText Markup Language (XHTML), Mathematical Markup Language (MathML), and Scalable Vector Graphics (SVG).

The middle section of the book deals with client-side XML applications and shows how to display and transform XML documents with CSS and XSLT. This section also explores how the current web browsers support XML, and it covers how to use JavaScript to work with XML documents. In this section, I also provide an introduction to the Asynchronous JavaScript and XML (Ajax) approach.

The book finishes by examining how to work with XML on the server. It covers two server-side languages: PHP 5 and .NET 2.0. The last chapters of the book deconstruct two XML applications: a News application and a Community Weather Portal application.

The book includes lots of practical examples that developers can incorporate in their daily work. You can download the code samples from the Source Code area of the Apress web site at http://www.apress.com. I hope you find this book an invaluable reference to XML and that, through it, you see the incredible power and flexibility that XML offers to web developers.

CHAPTER 1

Introduction to XML

This chapter introduces you to Extensible Markup Language (XML) and explains some of its basic concepts. It’s an ideal place to start if you’re completely new to XML. The concepts that I introduce here are covered in more detail later in the book.

Web developers familiar with Extensible HyperText Markup Language (XHTML) are often unsure about its relationship with XML; it’s not always clear why they might need to learn about XML as well. Be assured that both technologies are important for developers.

XML is a metalanguage used for writing other languages, called XML vocabularies. XHTML is one of those vocabularies, so when you understand XML, you’ll also understand the rules underpinning XHTML. XHTML is HTML that conforms to XML rules, and you’ll find out more about this shortly.

XHTML has a number of limitations. It’s good at structuring and displaying information in web browsers, but its primary purpose is not to mark up data. XHTML can’t carry out advanced functions such as sorting and filtering content. You can’t create your own tags to describe the contents of an XHTML document. The fixed XHTML tags usually don’t bear any relationship to the type of content that they contain. For example, a paragraph tag is a generic container for any type of content.

XML addresses all of the limitations evident in HTML. It provides more flexibility than XHTML, as it works in concert with other standards that assist with presentation, organization, transformation, and navigation. XML documents are self-describing; their document structures can use descriptive tags to identify the content that they mark up.

I’ll cover these points in more detail within this chapter. I’ll explain more about XML and show why you might want to use it in your work. The chapter will cover:

• A definition and a short history of XML

• A discussion of how to write XML documents

• Information about the processing of XML content

When you finish this chapter, you should have a good understanding of XML and see where you might be able to use it in your work. I’ll start by explaining exactly what XML is and where it fits into the world of web development.

What Is XML?

The first and most important point about XML is that it’s not a language itself. Rather, it’s a metalanguage used for constructing other languages or vocabularies. XML describes the rules for how to create these vocabularies. Each language is likely to be different, but all use tags to mark up content. The choice of tag names and their structures are flexible, and it’s common for groups to agree on standard XML vocabularies so that they can share information.

An example of an XML language is XHTML. XHTML describes a standard set of tags that you must use in a specific way. Each XHTML page contains two sections described by the <head> and <body> tags. Each of those sections can include only certain tags. For example, it’s not possible to include <meta> tags in the <body> section. Web developers around the world share the same standardized approach, and web browsers understand how to render XHTML tags.

XML is a recommendation of the World Wide Web Consortium (W3C), making it a standard that is free to use. The W3C provides a more formal definition of XML in its glossary at http://www.w3.org/TR/DOM-Level-2-Core/glossary.html:

Extensible Markup Language (XML) is an extremely simple dialect of SGML. The goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.

A Brief History of XML

XML came into being in 1998 and is based on Standard Generalized Markup Language (SGML). SGML is an international standard that you can think of as a language for defining other languages that mark up documents. HTML was based on SGML. One of the key points about SGML is that it’s difficult to use. XML aims to be much easier.

XML also owes much of its existence to HTML. HTML focused on the display of content; you couldn’t use it for more advanced features such as sorting and filtering. HTML wasn’t a very precise language, and it wasn’t case-sensitive. It was possible to write incorrect HTML content and still have a browser display the page correctly.

XML addresses many of the shortcomings found in HTML. HTML was then rewritten using the XML language construction rules as XHTML; XHTML 1.0 became a W3C Recommendation in January 2000. The rules for construction of an XHTML document are more precise than those for HTML. The strictness with which these rules are enforced depends on which Document Type Declaration (DOCTYPE) you assign to the XHTML page. I’ll explain more about DOCTYPEs in Chapter 3.

Since 1998, it’s been clear that XML is a very powerful approach to managing information. XML documents allow for the sharing of data. A range of related W3C recommendations address the transformation, display, and navigation within XML documents. You’ll find out more about these recommendations in Chapter 2.

Let’s summarize the key points:

• XML isn’t a language; its rules are used to construct other languages.

• XML creates tag-based languages that mark up content.

• XHTML is one of the languages created with XML; it is a reformulation of HTML.

• XML is based on SGML.

The Goals of XML

After the complexity of SGML, the W3C was very clear about its goals for XML. You can view these goals at http://www.w3.org/TR/REC-xml/#sec-origin-goals:

1. XML shall be straightforwardly usable over the Internet.

2. XML shall support a wide variety of applications.

3. XML shall be compatible with SGML.

4. It shall be easy to write programs which process XML documents.

5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.

6. XML documents should be human-legible and reasonably clear.

7. The XML design should be prepared quickly.

8. The design of XML shall be formal and concise.

9. XML documents shall be easy to create.

10. Terseness in XML markup is of minimal importance.

A few things about these goals are worth noting. First, the W3C wants XML to be straightforward; in fact, several of the goals include the terms “easy” and “clear.”

Second, the W3C has given XML two targets: humans and XML processors. An XML processor or parser is a software package that processes an XML document. Processors can identify the contents of an XML document; read, write, and change an existing document; or create a new one from scratch.

The aim is to open up the market for XML processors by keeping them simple to develop. Stricter construction rules mean that less processing is required. This in turn means that the targets for XML documents can be portable devices, such as mobile phones and PDAs.

By keeping documents human-readable, you can access data more readily, and you can build and debug applications more easily. The use of Unicode allows developers to create XML documents in a variety of languages. Unfortunately, a necessary side effect is that XML documents can be verbose, and describing data using XML can be a longer process than using other methods.

Third, note the term XML document. This term is broader than the traditional view of a physical document. Some XML documents exist in physical form, but others are created as a stream of information following XML construction rules. Examples include web services and calls to databases where the content is returned in XML format.
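
The processor tasks described above (creating a document from scratch, changing it, and reading it back) can be sketched with Python’s standard xml.etree.ElementTree module. This is only an illustration: the choice of Python, the element names, and the sample values are assumptions rather than anything taken from the book. The document here never touches a file, so it is also an example of an XML document that exists only as a stream of text.

```python
import xml.etree.ElementTree as ET

# Create a new document from scratch (element names are illustrative).
library = ET.Element("library")
dvd = ET.SubElement(library, "DVD", id="1")
ET.SubElement(dvd, "title").text = "Example Title"

# Change the existing document: update an attribute value.
dvd.set("id", "2")

# Write the document out as text; nothing is saved to disk.
xml_text = ET.tostring(library, encoding="unicode")

# Read the text back in, as a receiving application would.
reread = ET.fromstring(xml_text)
print(reread.find("DVD").get("id"), reread.findtext("DVD/title"))
```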

Now that you understand what XML is, let’s delve into the rules for constructing XML languages.

Understanding XML Syntax

XML allows you to construct your own tags, so you could mark up an introductory sentence as:

<intro>Here is an introduction to XML.</intro>

In this example, the <intro> tag tells you the purpose of the text that it marks up. One big advantage of XML is that tags can describe their content; that’s why XML languages are often called self-describing.

XML is flexible enough to allow for the creation of many different types of languages to describe data. The only constraint on XML vocabularies is that they be well-formed.

Well-Formed Documents

XML documents are well-formed if they meet the following criteria:

• The document contains one or more elements.

• The document contains a single document element, which may contain other elements.

• Each element closes correctly, and elements nest without overlapping.

• Element names are case-sensitive.

• Attribute values are enclosed in quotation marks, and every attribute must have a value.
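
One way to try out these criteria is to hand small fragments to an XML parser and see which ones it rejects. The following is a minimal sketch using Python’s standard xml.etree.ElementTree module; the helper name is_well_formed and the sample fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative fragments, one per criterion.
samples = {
    "single document element": '<library><DVD id="1"/></library>',
    "two top-level elements": "<DVD/><DVD/>",
    "element not closed": "<library><DVD></library>",
    "case mismatch in tags": "<intro>text</Intro>",
    "unquoted attribute value": "<DVD id=1/>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first fragment is accepted; each of the others breaks one of the criteria in the list.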


I’ll describe all of these criteria throughout this chapter, but it’s worthwhile highlighting some points now. XML languages are case-sensitive; this means that the tag <intro> is not the same as <Intro> or <INTRO>. In XML, these are three different tags. Prior to the days of XHTML, HTML was case-insensitive, so <body> and <BODY> were equivalent tags.

All XML tags need to have an equivalent closing tag written in the same case as the opening tag. So the <intro> tag must have a matching </intro> tag. If no content exists between the opening and closing tags, you can abbreviate the element into a single tag, <intro/>. Again, contrast this with HTML, where it was possible to write a single <p> tag to add a paragraph break.

The order of tags is important in XML. Tags that are opened first must close last:

<chapter><intro>Here is an introduction to XML.</intro></chapter>

HTML pages had no such requirement; browsers would accept tags that closed in a different order from the one in which they opened, which is unacceptable in XML.

HTML also allowed some attributes, such as the nowrap attribute in a <td> tag, to appear without an attribute name and value pair:

<td nowrap>A table cell</td>

This type of tag construction isn’t possible in XML. You must replace it with something like this:

<td nowrap="true">A table cell</td>
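
The two HTML habits discussed above, overlapping tags and attributes without values, can be checked against an XML parser. This is a hedged sketch using Python’s standard library; the helper name parses and the <p>, <b>, and <i> fragment are hypothetical examples, while the <td> fragments come from the text.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Tags closed in the wrong order (tolerated by old browsers, not by XML).
overlapping = "<p><b><i>Bold and italic</b></i></p>"
# The same content with the tags closed in reverse order of opening.
nested = "<p><b><i>Bold and italic</i></b></p>"

# An attribute name with no value, as HTML allowed.
minimized = "<td nowrap>A table cell</td>"
# The XML form, with an explicit quoted value.
expanded = '<td nowrap="true">A table cell</td>'

for text in (overlapping, nested, minimized, expanded):
    print(parses(text), text)
```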

Understanding the Difference Between Tags and Elements

You may have noticed that I’ve used the terms tag and element when talking about XML documents. At first glance, they seem interchangeable, but there’s a difference between the terms.

The term element describes opening and closing tags as well as any content. A tag is one part of an element. Tags start with an opening angle bracket and end with a closing angle bracket. Elements usually contain both an opening and closing tag as well as the content between.

The following line shows a complete element that contains the <intro> tag:

<intro>Here is an introduction to XML.</intro>
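
A parser makes the same distinction. In the sketch below, which assumes Python’s standard xml.etree.ElementTree module, the parsed object represents the whole element, and the tag name and the text content are separate parts of it.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring("<intro>Here is an introduction to XML.</intro>")

tag_name = element.tag    # the name used in the opening and closing tags
content = element.text    # the character data between the tags

# Serializing the element writes both tags and the content between them.
serialized = ET.tostring(element, encoding="unicode")

# An element with no content is written in the abbreviated single-tag form.
empty = ET.tostring(ET.Element("intro"), encoding="unicode")

print(tag_name, "|", content, "|", serialized, "|", empty)
```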

Now that you understand the construction rules, it’s time to look at a complete XML document.

Viewing a Complete XML Document

A complete piece of XML is referred to as a document. It doesn’t matter whether you’re dealing with XML that marks up text, information requested from a server, or records received from a database; all of these are documents.

Each XML document is made up of markup and character data. In general, the character data comprises the text between a start tag and an end tag, and everything else is markup. You can further divide markup into elements, attributes, text, entities, comments, character data (CDATA), and processing instructions.

The document dvd.xml illustrates the different parts of an XML document. You can download it, along with the other resource files, from the Source Code area of the Apress web site (http://www.apress.com). It describes the contents of a small DVD library.

The document includes a comment describing its purpose:

<!-- This XML document describes a DVD library -->

I’ve added this comment as a guide for anyone reading the XML document. As with XHTML, developers normally use comments to add notations.

The document or root element is called <library>. You’ll notice that all elements within the document appear between the opening and closing <library> tags.

The document element contains a number of <DVD> elements, and each <DVD> element contains <title>, <format>, and <genre> elements. The <DVD> element also contains an id attribute. The <title>, <format>, and <genre> elements each contain text.

You can understand the structure and the contents of this document easily by looking at the tag names. It’s obvious, even without the comment, that this document describes a list of DVDs. You can also easily infer the relationship between all of the elements from the document.
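
The structure described above can be sketched as a small stand-in document and read with a parser. The element and attribute names follow the description, but the titles, formats, and genres are invented placeholders and are not the contents of the book’s dvd.xml file. The example assumes Python’s standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Stand-in for dvd.xml: names follow the text, values are invented.
DVD_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML document describes a DVD library -->
<library>
  <DVD id="1">
    <title>Example Title One</title>
    <format>Movie</format>
    <genre>Drama</genre>
  </DVD>
  <DVD id="2">
    <title>Example Title Two</title>
    <format>TV Series</format>
    <genre>Comedy</genre>
  </DVD>
</library>
"""

root = ET.fromstring(DVD_XML.encode("utf-8"))

# Collect the id attribute and the text of each child element.
summary = [
    (dvd.get("id"), dvd.findtext("title"), dvd.findtext("format"), dvd.findtext("genre"))
    for dvd in root.findall("DVD")
]

print(root.tag)
for row in summary:
    print(row)
```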

Understanding the Structure of an XML Document

Each XML document is divided into two parts: the prolog and the document or root element.

The prolog appears at the top of the XML document and contains information about the document. It’s a little like the <head> section of an XHTML document. In the XML document example, the prolog includes an XML declaration and a comment. It can also include other items, such as processing instructions or a Document Type Definition (DTD). You’ll find out more about these later in the “Processing Instructions” and “DTDs and XML Schemas” sections.

Well-formed XML documents must have a single document element that may optionally include other content. Any content within an XML document must appear within the document or root element. In the example XML document, the document element is <library>, and it contains all of the other elements.

You might wonder about the names that I’ve chosen for the elements within the XML document. You’re free to use any name for elements and attributes, providing that they conform to the rules for XML names.

Figure 1-1 shows the structure of an XML document.

Figure 1-1. The structure of an XML document

Naming Rules in XML

Elements, attributes, and some other constructs have names within XML documents. A name is made up of a starting character followed by name characters. Don’t forget that XML names are case-sensitive.

The starting character must be a letter or underscore; it can’t be a number. The name characters that follow can be letters, digits, hyphens, underscores, and periods, but never spaces. A colon is technically allowed, but colons indicate namespaces in XML, so you shouldn’t include them within your own names. You’ll learn more about namespaces in Chapter 2. To be sure that you’re using legal characters, it’s best to restrict yourself to the uppercase and lowercase letters of the Roman alphabet, numbers, hyphens, underscores, and periods.

If you’re authoring your own XML content as opposed to generating it automatically, it’s probably a good idea to adopt a standardized naming convention. You should also use descriptive names.

I prefer to write in CamelCase and start with a lowercase letter, unless the element name is capitalized normally:

<camelCaseElementName>Here is an element name</camelCaseElementName>

I tend to avoid using underscore characters in my names because I think it makes them harder to read.

The use of descriptive names makes it easier for humans to interpret the content. Imagine the difficulty you’d have with this:

<zyxtr>Some content</zyxtr>

Let’s summarize the rules for XML names:

• XML names cannot start with a number or with punctuation other than an underscore.

• XML names cannot include spaces.

• Don’t include a colon in a name unless it indicates a namespace.

• XML names are case-sensitive.
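
The rules above can be approximated with a short pattern check. This is a deliberately simplified sketch in Python: it covers only ASCII letters, digits, underscores, hyphens, and periods, whereas the XML specification also permits many non-ASCII characters, and it rejects colons outright rather than handling namespaces. The function name is_simple_xml_name is hypothetical.

```python
import re

# Simplified, ASCII-only approximation of the XML name rules:
# start with a letter or underscore, then letters, digits, "_", "-" or ".".
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_simple_xml_name(name):
    """Return True if the whole string matches the simplified name pattern."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["camelCaseElementName", "_private", "dvd-title.v2",
                  "2fast", "my name", "-dash", "xsl:template"]:
    print(candidate, "->", is_simple_xml_name(candidate))
```

Because names are case-sensitive, a check like this should never normalize case; intro and Intro are both legal names, but they are different names.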

I’ll describe the contents of an XML document in more detail. I’ll start by showing you the elements that can appear in the prolog.

Understanding the XML Document Prolog

The prolog of an XML document contains metainformation about the document rather than document content. It may contain the XML declaration, processing instructions, comments, and an embedded DTD or schema.

XML Declaration

The XML declaration is optional and provides information about the document, such as the character-encoding type.

If you include the XML declaration, it must appear on the first line of the XML document. Nothing can precede an XML declaration, not even white space. If you accidentally include white space before the declaration, XML processors won’t be able to parse the content of the XML document correctly and will generate an error message.

The XML declaration states the XML version and may also specify the encoding and whether the document is standalone:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
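
The rule that nothing may precede the declaration is easy to confirm with a parser. The sketch below assumes Python’s standard xml.etree.ElementTree module; the <library/> element and the helper name parses are illustrative only.

```python
import xml.etree.ElementTree as ET

declaration = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

with_declaration = declaration + b"<library/>"
leading_space = b" " + with_declaration   # one space before the declaration
no_declaration = b"<library/>"            # the declaration is optional

def parses(data):
    """Return True if the parser accepts the bytes, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

for data in (with_declaration, leading_space, no_declaration):
    print(parses(data), data)
```

Only the version with the leading space is rejected.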

At the time of writing, the current XML version is 1.1. However, many processors don’t recognize this version, so it’s best to stick with a version 1.0 declaration for backward compatibility.

Processing Instructions

The prolog can also include processing instructions (PIs) that pass information about the XML document to other applications. The XML processor doesn’t process PIs, but rather passes them on to the application unchanged.

PIs start with the characters <? and finish with ?>. They usually appear in the prolog, although they can appear in other places within an XML document.

Note: An XML declaration also starts with the characters <?xml. Even though the XML declaration looks similar, it’s worth remembering that it’s quite different from a PI.

The following PI indicates a reference to an XSL stylesheet:

<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>

The first item in a PI is a name, called the PI target. The preceding PI has the name xml-stylesheet. Names that start with xml are reserved for XML-specific PIs. The PI also has the text string type="text/xsl" href="stylesheet.xsl". Although this looks like two attributes, the content isn’t treated that way. You’ll see more examples of stylesheet PIs in Chapters 6 and 7.
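
A DOM parser shows how a PI is handed to the application: as a target plus one undivided text string. The sketch below uses Python’s standard xml.dom.minidom module, which is an assumption made for illustration; the <library/> document element is added only so that the fragment is a complete document.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?><library/>'
)

# The PI sits in the prolog, so it is the first child of the document node.
pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE

target = pi.target   # the PI target (its name)
data = pi.data       # everything after the target, as a single string

print(is_pi, target, data)
```

The parser does not split the data into attributes; interpreting type and href is left to the application that consumes the PI.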

Comments

Comments can appear almost anywhere in an XML document The example XML documentincluded a comment in the prolog, so let’s look at comments with the other prolog contents.XML comments look the same as XHTML comments They begin with the characters

<! and end with >:

<! Here is a comment >

Comments don’t affect the processing of an XML document They’re normally intendedfor human readers If you add a comment, you must be aware of the following rules:

Trang 32

• A comment may not contain the text >.

• A comment may not be included within tag names

• A comment should not hide either the opening or closing tags in an element

• An XML processor isn’t obliged to pass a comment to an application, although most do
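The comment rules can be checked with a parser. The following Python sketch uses xml.dom.minidom from the standard library; the helper names comment_texts and is_well_formed are made up for this illustration. It assumes the XML 1.0 rule that two consecutive hyphens may not appear inside a comment, and it relies on the fact that minidom happens to keep comments, which a processor is not required to do.

```python
from xml.dom import Node
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


def comment_texts(xml_text):
    """Return the text of each comment that is a direct child of the root element."""
    root = parseString(xml_text).documentElement
    return [
        node.data
        for node in root.childNodes
        if node.nodeType == Node.COMMENT_NODE
    ]


def is_well_formed(xml_text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        parseString(xml_text)
    except ExpatError:
        return False
    return True


GOOD = "<library><!-- Here is a comment --></library>"
# Two consecutive hyphens inside the comment body are not allowed.
DOUBLE_HYPHEN = "<library><!-- bad -- comment --></library>"
# A comment may not sit inside a tag.
INSIDE_TAG = "<library <!-- not here --> ></library>"

if __name__ == "__main__":
    print(comment_texts(GOOD))
    print(is_well_formed(DOUBLE_HYPHEN), is_well_formed(INSIDE_TAG))
```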

DTDs and XML Schemas

DTDs and XML schemas provide rules about which elements and attributes can appear within the XML document. In other words, they specify which elements and attributes are valid and which are required or optional.

The prolog can include declarations about the XML document, a reference to an external DTD or schema, or both. I’ll explain more about DTDs and schemas in Chapter 2.

Understanding Sections Within the XML Document Element

The data within an XML document is stored within the document or root element. This element contains all other elements, attributes, text, and CDATA within the document and may also include entities and comments.

Elements

Elements serve many purposes in an XML document They

• Mark up content

• Provide a description of the content they mark up

• Provide information about the order of data and its relative importance

• Show the relationships between data

Elements include a starting and ending tag as well as content. The content can be text, child elements, or both text and elements. The starting tag for an element can also contain attributes. You can position comments inside elements.

In the earlier example, you saw the following structure within the <DVD> element:

The opening <DVD> tag contains an id attribute and includes three other elements: <title>, <format>, and <genre>. Each of these elements contains text.

You saw earlier that it’s necessary to open and close tags in the correct order. It would be wrong to write the following:


• Elements containing only text

• Elements containing only child elements

• Elements containing a mixture of child elements and text, or mixed elements

You’ll see how important it is to distinguish between these different types when I cover XML schemas in Chapter 2.

Elements Containing Only Text

Some elements only contain text content. You’ll recall from the previous example that the <title>, <format>, and <genre> elements contain only text:

<title>Breakfast at Tiffany's</title>

<format>Movie</format>

<genre>Classic</genre>

Elements Containing Other Elements

It’s possible for an element to contain only other elements. The container element is called the parent, while the elements contained inside are the child elements. The <DVD> element is an example of an element that contains child elements:


Mixed Elements

Mixed elements contain both text and child elements. The DVD example doesn’t include any of these types of elements, but the following code block shows a mixed element:

<mixedElement>This element contains both text and child elements

<childElement>This element contains text</childElement>

<emptyElement/>

</mixedElement>
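One way to see what "mixed" means is to list the children that a parser reports for this element. The Python sketch below uses xml.dom.minidom; the function name describe_children is hypothetical, and the sample is written without line breaks so that no whitespace-only text nodes appear in the result.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# The mixed element from the text, on one line to avoid whitespace text nodes.
MIXED = (
    "<mixedElement>This element contains both text and child elements"
    "<childElement>This element contains text</childElement>"
    "<emptyElement/>"
    "</mixedElement>"
)


def describe_children(xml_text):
    """Return a (kind, value) pair for each direct child of the root element."""
    root = parseString(xml_text).documentElement
    children = []
    for node in root.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            children.append(("text", node.data))
        elif node.nodeType == Node.ELEMENT_NODE:
            children.append(("element", node.tagName))
    return children


if __name__ == "__main__":
    for kind, value in describe_children(MIXED):
        print(kind, ":", value)
```

The root element reports one text child followed by two element children, which is exactly the mixture the term describes.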

To summarize, elements have the following requirements:

• Elements must contain starting and ending tags, unless there is no content, in which case you can use the shorthand form (for example, <emptyElement/>)

• The tag names must obey the XML naming rules

• Elements must be nested correctly
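The three requirements above are what a parser enforces when it checks for well-formedness. As a rough sketch, assuming Python's xml.etree.ElementTree and a hypothetical helper named is_well_formed, each rule can be tested by trying to parse a small sample:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Start and end tags present and correctly nested.
NESTED = "<DVD><title>Breakfast at Tiffany's</title></DVD>"
# Shorthand form for an element with no content.
EMPTY = "<emptyElement/>"
# Tags closed in the wrong order.
WRONG_ORDER = "<DVD><title>Breakfast at Tiffany's</DVD></title>"
# Missing end tag.
UNCLOSED = "<DVD><title>Breakfast at Tiffany's</title>"
# A name may not start with a digit.
BAD_NAME = "<1title>Breakfast at Tiffany's</1title>"

if __name__ == "__main__":
    for sample in (NESTED, EMPTY, WRONG_ORDER, UNCLOSED, BAD_NAME):
        print(is_well_formed(sample), sample)
```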

Attributes

Another way to provide information in XML documents is by using attributes within the opening tag of an element. Attributes normally provide additional information about the element that they modify. There is no limit to the number of attributes that can appear inside an element.

In this case, the data Introduction to XML is enclosed in a <p> element. This element tells a web browser to display the information in a separate paragraph. The style attribute provides additional information about how to display the data. Here, you’re telling the browser to center the text.

Two common uses of attributes are to convey formatting information and to indicate the use of a specific format or encoding. For example, you could convey a date as

<Date Format="mmddyyyy">06081955</Date>

or indicate use of an International Organization for Standardization (ISO) date format using

<Date Code="ISO8601">1955-06-08</Date>

When an element contains an attribute, it’s said to be a complex type element. As you’ll see later, this is important when writing XML schema documents.

You can use either a pair of double or single quotes for different attributes within the same element:

<elementName att1="value1" att2='value2'>Here is an element</elementName>


Make sure you don’t include one of each in a single attribute, or the document won’t be well formed.

Caution Be careful when cutting and pasting attributes from a word-processing document into an XML document. Word processors often use smart quotes, which cause an error in an XML document.

You can also write an attribute as a nested child element. For example, you could rewrite the <DVD> element

• An attribute is made up of a name/value pair

• You must enclose the attribute value in single or double quotes

• Attributes cannot contain an XML tag

• Attribute names must follow the XML naming rules
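The quoting rules for attributes are easy to try out. This Python sketch uses xml.etree.ElementTree; the helper names attributes_of and is_well_formed are made up for the example, and the failing samples are variations on the element shown earlier rather than code from the book.

```python
import xml.etree.ElementTree as ET


def attributes_of(xml_text):
    """Parse one element and return its attributes as a plain dict."""
    return dict(ET.fromstring(xml_text).attrib)


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Double and single quotes may be mixed across different attributes.
MIXED_QUOTES = "<elementName att1=\"value1\" att2='value2'>Here is an element</elementName>"
# A single value must open and close with the same quote character.
MISMATCHED = "<elementName att1=\"value1'>Here is an element</elementName>"
# Curly "smart" quotes from a word processor are not valid delimiters.
SMART_QUOTES = "<elementName att1=\u201cvalue1\u201d>Here is an element</elementName>"
# An attribute value cannot contain a tag.
TAG_IN_VALUE = '<elementName att1="<b/>">Here is an element</elementName>'

if __name__ == "__main__":
    print(attributes_of(MIXED_QUOTES))
    for sample in (MISMATCHED, SMART_QUOTES, TAG_IN_VALUE):
        print(is_well_formed(sample), sample)
```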

Text

All text within an XML document is contained inside opening and closing tags. Unless you mark the text as CDATA, it will be treated as if it were XML and processed accordingly. This means an opening angle bracket will be treated as if it were part of an XML tag.

If you want to use reserved characters within text, you must rewrite them as character entities. For example, you can write the left angle bracket < as &lt;. You can also embed the reserved characters within CDATA.


CDATA Sections

CDATA allows you to mark blocks of text so that they’re not processed as XML. As I mentioned before, this is useful for text that contains reserved XML characters:

<title><![CDATA[ Why 9 is < 10 ]]></title>

This CDATA section starts with <![CDATA[ and ends with ]]>. The character data is contained within the opening and closing square brackets. Obviously, the string ]]> can’t appear within a CDATA section.

You can use CDATA sections in XML documents for embedding code, such as JavaScript, and for adding content that doesn’t need processing. For example, an application that reads data from a database and marks it up in XML might embed all content in CDATA sections to avoid the need to process the reserved characters explicitly. I’ll show you an example of using CDATA with JavaScript in Chapter 3.
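As a quick check of how a parser treats a CDATA section, the Python sketch below uses xml.etree.ElementTree. The helper name element_text is an assumption for the example. The section delimiters are <![CDATA[ and ]]>, and the parser returns the enclosed characters as ordinary text without interpreting the left angle bracket.

```python
import xml.etree.ElementTree as ET


def element_text(xml_text):
    """Return the text content of the root element."""
    return ET.fromstring(xml_text).text


# The reserved character < is safe inside the CDATA section.
WITH_CDATA = "<title><![CDATA[ Why 9 is < 10 ]]></title>"
# The same text without CDATA is not well formed.
WITHOUT_CDATA = "<title> Why 9 is < 10 </title>"

if __name__ == "__main__":
    print(repr(element_text(WITH_CDATA)))
    try:
        element_text(WITHOUT_CDATA)
    except ET.ParseError as error:
        print("not well formed:", error)
```

ElementTree does not record that the text came from a CDATA section; it simply delivers the characters. An application that needs to tell the two apart would need a parser that exposes CDATA nodes.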

Entities

Character entities are symbols that represent a single character. In XHTML, character entities are used for special symbols such as an ampersand (&amp;) and a nonbreaking space (&nbsp;).

You can use character entities to replace the reserved characters in XML documents. All tags start with a left angle bracket, so it isn’t possible to include this character in the text within an element:

<expression>10 < 25</expression>

If you try to process this element, the presence of the left angle bracket before the text 25 causes a processing error. Instead, you could replace this symbol with the entity &lt;:

<expression>10 &lt; 25</expression>

You need to consider the following reserved characters:

• <, which indicates the start of a tag name

• &, which indicates the first character of an entity

• xml, which is reserved for referring to parts of the XML language, such as xml-stylesheet

Table 1-1 summarizes the character entities that you need to use.
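XML itself predefines five entities: &lt; for <, &gt; for >, &amp; for &, &apos; for a single quote, and &quot; for a double quote. Rather than replacing these by hand, an application can use a library routine. The following Python sketch uses escape and unescape from xml.sax.saxutils; the wrapper names to_xml_text and from_xml_text are hypothetical. By default these functions handle only &, <, and >, so the two quote characters are passed in explicitly.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > by default; quotes are added here explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARACTERS = {"&quot;": '"', "&apos;": "'"}


def to_xml_text(raw):
    """Replace reserved characters with the predefined XML entities."""
    return escape(raw, QUOTE_ENTITIES)


def from_xml_text(escaped):
    """Turn the predefined XML entities back into literal characters."""
    return unescape(escaped, QUOTE_CHARACTERS)


if __name__ == "__main__":
    escaped = to_xml_text("10 < 25")
    print("<expression>" + escaped + "</expression>")
    print(from_xml_text(escaped))
```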


Sometimes you can’t include a literal character in an XML document, perhaps because the character doesn’t exist on a keyboard or because it’s a graphic character. Instead, you can add these as character references using decimal or hexadecimal Unicode numbers. For example, you can encode the copyright symbol © as &#169; or &#xA9;.

If the reference starts with &# and ends with a semicolon, it’s a character reference. The number between is the Unicode code for the character required. If the code is written as a hexadecimal, then it’s prefixed with the character x.

You can also define your own entities. For example, you could define the reference &copyright; to mean Copyright 2006 Apress. Each time you want to include this text in the XML document, you could use the entity reference &copyright;. This makes the text easier to manage and update.
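The following Python sketch shows both kinds of reference being expanded by a parser. It assumes that the custom entity is declared in an internal DTD subset, which is one common way to define it; the <notice> element and the function name expanded_text are invented for the example. It uses xml.etree.ElementTree from the standard library.

```python
import xml.etree.ElementTree as ET

# The entity is declared in an internal DTD subset (an assumption for this
# sketch); <notice> is a made-up element name.
DOCUMENT = (
    '<!DOCTYPE notice [<!ENTITY copyright "Copyright 2006 Apress">]>'
    "<notice>&copyright; &#169; &#xA9;</notice>"
)


def expanded_text(xml_text):
    """Return the root element's text after the parser expands all references."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(expanded_text(DOCUMENT))
```

The decimal reference &#169; and the hexadecimal reference &#xA9; both expand to the same copyright character, and &copyright; expands to the replacement text from its declaration.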

Let’s move on to look at the processing of XML documents.

The XML Processing Model

The XML recommendation assumes that an XML document will be processed in a particular way. The model indicates that an XML processor passes the content and structure of the XML document to an application. XML processors are usually called XML parsers, as they parse the XML document; see Figure 1-2.

Common XML processors include Microsoft XML Parser (MSXML), Apache Xerces2, and the Oracle XML parser. You can write an application that uses any of these parsers. Some XML parsers are also available as prepackaged software that installs automatically. Extensible Stylesheet Language Transformations (XSLT) processors used to display XML in a web browser fall into this category. MSXML contains both an XML parser and an XSLT processor, and is both an XML processor and an application. It installs automatically with Internet Explorer and other Microsoft software.


XML Processing Types

There are two categories of XML processing: tree-based and event-based. Many XML parsers, including later versions of MSXML, support both models. You’ll often hear tree-based parsers referred to as Document Object Model (DOM) parsers, while event-based parsers are referred to as Simple API for XML (SAX) parsers. Both are named after the specifications they support.

The DOM is a W3C recommendation that provides an application programming interface (API) to an XML document. Any application can use this API to manipulate an XML document, read information, add new nodes, and edit the existing content. You can find out more about this recommendation at http://www.w3.org/TR/REC-DOM-Level-1/.

SAX is not a W3C recommendation, but it does enjoy support from both large and small software companies. A SAX-based parser reads an XML document sequentially, firing off events as it reaches important parts of the document, such as the start or end of an element. You can find out more at http://www.saxproject.org/.

DOM Parsing

Figure 1-3 shows the dvd.xml document that you’ve been working with represented as a tree structure.

Displaying the document in this way reinforces the relationship between the elements, as in a family tree. The <library> element is the parent of the <DVD> element and the grandparent of the <title>, <format>, and <genre> elements. The <DVD> elements are siblings and have the <library> element as a parent or ancestor. The <title>, <format>, and <genre> elements are descendants of the <library> element.

DOM parsing allows access to these elements, their values, and all other parts of an XML document through either a programming language or a scripting language such as JavaScript.
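To illustrate tree-based access, here is a short Python sketch built on xml.dom.minidom. The sample document is a cut-down stand-in for dvd.xml, not the file itself: the element names and text values come from the chapter, but the id value "1" and the helper names first_title and ancestor_names are assumptions.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# A cut-down stand-in for dvd.xml; the id value is an assumption.
LIBRARY = """<library>
  <DVD id="1">
    <title>Breakfast at Tiffany's</title>
    <format>Movie</format>
    <genre>Classic</genre>
  </DVD>
</library>"""


def first_title(xml_text):
    """Parse the document into a tree and return its first <title> element."""
    doc = parseString(xml_text)
    return doc.getElementsByTagName("title")[0]


def ancestor_names(node):
    """Walk up the tree and return the names of the enclosing elements."""
    names = []
    parent = node.parentNode
    while parent is not None and parent.nodeType == Node.ELEMENT_NODE:
        names.append(parent.tagName)
        parent = parent.parentNode
    return names


if __name__ == "__main__":
    title = first_title(LIBRARY)
    print(title.firstChild.data)
    print(ancestor_names(title))
    print(title.parentNode.getAttribute("id"))
```

Because the whole tree is in memory, the code can move from a node to its parent and grandparent in any order, which is the family-tree relationship described above.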

SAX Parsing

A SAX-based parser presents an XML document as a string of events. You must write handlers for each event so that something suitable occurs when the event triggers the handler.

This type of parsing works well with languages that have good event-handling properties. For instance, SAX parsing is used extensively with Java. It’s less suitable for the scripting languages often employed on the web, so I don’t cover it in detail here.
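Although the book does not cover SAX in detail, a small sketch may help show what "a string of events" looks like. The example below uses Python's xml.sax module; the class name EventLogger, the function log_events, and the sample document (including the id value) are assumptions for illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each start and end event in the order the parser fires them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


def log_events(xml_text):
    """Parse the text sequentially and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


SAMPLE = '<library><DVD id="1"><title>Breakfast at Tiffany\'s</title></DVD></library>'

if __name__ == "__main__":
    for event in log_events(SAMPLE):
        print(event)
```

The handler never sees a tree. It only receives one event at a time, so anything it needs later, such as the current nesting depth, has to be stored by the handler itself.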


Why Have Two Processing Models?

Both processing models offer advantages. DOM-based parsing provides full read-write access to an XML document, and you can traverse the document tree to access nodes within the document. It can also validate a document against a DTD or XML schema to determine that the document is valid.

However, DOM-based parsing must read the full XML document into memory, so DOM parsing can be slow and memory-intensive when working with large XML documents. It’s difficult to determine exactly what constitutes a large XML document, because processing time depends on computing power, memory, time available, and whether it’s working in a single-user environment or a multiuser environment such as a web server. As a rule, most systems cope with documents up to tens of megabytes in size, but you need to take care with files above this size.

The SAX-based model, on the other hand, is sequential in operation. Once a node has been processed, it is discarded and cannot be processed again. The whole document isn’t loaded into memory at once, so you can avoid problems associated with processing large XML documents. This method of processing puts the onus on you to store any information from the XML document that might be required later.

SAX is ideal, for example, as an intermediate routing product in a communications system. An incoming XML document is likely to consist of a small routing header and a larger document for delivery to the end point. Using SAX, a routing device can read the routing information and ignore the document, as the document is irrelevant to its delivery. A DOM-based parser, however, must parse the complete document to be able to deliver it to its ultimate destination.
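The routing scenario can be sketched with Python's xml.sax module. Everything about the message format here is hypothetical: the element names <message>, <routing>, <to>, and <payload>, the class and function names, and the destination value are invented for the example. The handler stores the one value it needs and then raises an exception to abandon the parse, which is one simple way to stop a SAX parser early. A real router would more likely feed the parser from a stream rather than from a complete in-memory string.

```python
import xml.sax


class RoutingFound(Exception):
    """Raised by the handler to stop parsing once the header has been read."""


class RoutingHandler(xml.sax.ContentHandler):
    """Capture the text of the hypothetical <to> element, then stop."""

    def __init__(self):
        super().__init__()
        self.destination = None
        self._inside_to = False
        self._parts = []

    def startElement(self, name, attrs):
        if name == "to":
            self._inside_to = True

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._inside_to:
            self._parts.append(content)

    def endElement(self, name):
        if name == "to":
            self.destination = "".join(self._parts)
            raise RoutingFound  # nothing after this point is examined


def read_destination(xml_bytes):
    """Return the routing destination, or None if no <to> element is found."""
    handler = RoutingHandler()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except RoutingFound:
        pass
    return handler.destination


MESSAGE = (
    b"<message><routing><to>endpoint-a</to></routing>"
    b"<payload><order>large document here</order></payload></message>"
)

if __name__ == "__main__":
    print(read_destination(MESSAGE))
```

The payload is never inspected by the handler, which mirrors the point made above: the router needs only the header, so there is no reason to build a tree for the rest of the document.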

In general, XML development tools fall into several categories:

• Extensions to existing programmers’ IDEs

• XML-specific IDEs

• Individual tools

Tools such as Microsoft Visual Studio (http://msdn.microsoft.com/vstudio/) fall into the first category. They have good XML support aimed specifically at developers. At the time of writing, the latest version is Visual Studio 2005, which includes the following features:

• It helps you create and edit XML documents, including checking whether a document


The dedicated XML IDEs tend to cover similar ground and differ in the depth of their support and their user interfaces. Most of these tools have an XML editor, tools for creating DTDs and XML schemas, and support for XSLT development. Several such tools are available, including this small sample of common ones:

• Altova’s XML Suite: http://www.altova.com/suite.html

• TIBCO Software’s suite of XML tools: http://www.tibco.com/software/business_integration/xml_tools.jsp

• DataDirect Technologies’ Stylus Studio: http://www.stylusstudio.com/

Many of the suites mentioned include individual tools that you can use for editing XML documents. These include

• Altova’s XMLSpy: http://www.altova.com/products_ide.html

• Blast Radius’ XMetal: http://www.xmetal.com/index.x?products/xmetal/

• SyncRO Soft’s <oXygen/>: http://www.oxygenxml.com//

There are many other excellent tools available that I haven’t mentioned here. You can find out more by searching the Internet or subscribing to mailing lists such as XML-DEV (http://xml.org/xml/xmldev.shtml).

Summary

In this chapter, you’ve been introduced to some of the basic concepts relating to XML. I’ve covered XML syntax in some detail, and I’ve shown you the benefits that XML provides for web developers. I’ve also shown you some of the tools that you can use to work with XML documents.

In Chapter 2, I’ll show you some of the related XML recommendations. You’ll learn how to work with DTDs and XML schemas. You’ll also find a brief introduction to XSLT, XPath, XLinks, and XPointer.

Date posted: 27/03/2014, 13:34
