Title: Using XML to Exchange Data
Organization: Microsoft Corporation
Subject: XML
Type: Module
Year of publication: 2000
City: Redmond
Pages: 80


Contents

Overview

Using the Document Object Model

Applying XML in N-Tier Applications

Lab 9: Exchanging Data Using XML

Review

Module 9: Using XML to Exchange Data

Complying with all applicable copyright laws is the responsibility of the user. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Microsoft Corporation. If, however, your only means of access is electronic, permission to print one copy is hereby granted.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2000 Microsoft Corporation. All rights reserved.

Microsoft, BackOffice, MS-DOS, Windows, Windows NT, ActiveX, MSDN, PowerPoint, and Visual Basic are either registered trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other countries.

The names of companies, products, people, characters, and/or data mentioned herein are fictitious and are in no way intended to represent any real individual, company, product, or event, unless otherwise noted.

Other product and company names mentioned herein may be the trademarks of their respective owners.


Instructor Notes

This module introduces the Extensible Markup Language (XML). Students will learn how data is represented by using XML and how to use Document Type Definitions (DTDs) and schemas to validate document structure. Students will also learn how to parse XML by using the Document Object Model (DOM). After completing this module, students will be able to:

• Describe the purpose and benefits of XML.

• Describe the structure of a well-formed XML document.

• Describe the purpose of XML Schemas and DTDs.

• Manipulate XML by using the DOM.

In the lab, students will examine code to see how the XML DOM can be used to create XML documents. They will then use the DOM to read an XML purchase order that was generated by the Purchase Order Online application used in the labs for this course.

Materials and Preparation

This section provides you with the required materials and preparation tasks that are needed to teach this module.

Required Materials

To teach this module, you need the following materials:

• Microsoft PowerPoint® file 1907A_09.ppt

• Module 9: Using XML to Exchange Data

• Lab 9: Exchanging Data Using XML

Preparation Tasks

To prepare for this module, you should:

• Read all of the materials for this module.

• Complete the lab.

• Read the instructor notes and the margin notes for this module.

Presentation: 75 minutes

Lab: 60 minutes


Demonstration

This section provides demonstration procedures that will not fit in the margin notes or are not appropriate for the student notes.

Using the Document Object Model

The purpose of this demonstration is to show how the DOM can be used to create, load, browse, and search an XML document. This demonstration uses an XML document containing a booklist. It has an associated schema at

http://localhost/books/booklistschema.xml

This demonstration is divided into four parts. You can demonstrate all parts or selected individual parts. Note, however, that to demonstrate part C or D, the demonstration program first requires that an XML document be loaded. This initial step is achieved by clicking the Load XML Document button (described in part B). Before performing a demonstration, you must also follow the preparation instructions below.

• Part A: Create and Save an XML Document

• Part B: Load and Validate an XML Document

• Part C: Walk Through an XML Document

• Part D: Search an XML Document

Prepare for the demonstration

1. Use Windows Explorer to navigate to the <install folder>\Democode\Mod09\XML folder.

2. Right-click the Books folder and choose Properties.

3. On the Web Sharing tab, click the Share this folder option button.

4. Leave the default alias (Books) and ensure that all Access Permissions check boxes are checked. Then click OK.

5. Click Yes to accept the warning message and then click OK on the Properties dialog box.

6. Open the project XMLDemo.vbp located in <install folder>\Democode\Mod09\XML.

7. Display the cmdCreateXMLDoc_Click procedure and place a breakpoint on the line Set xmlDoc = New MSXML.DOMDocument.

8. Display the cmdLoadXMLDocument_Click procedure and place a breakpoint on the line Set xmlDoc = New MSXML.DOMDocument.

9. Display the cmdWalkXMLDocument_Click procedure and place a breakpoint on the line Debug.Assert Not (xmlDoc Is Nothing).

10. Display the cmdSearchXMLDocument_Click procedure and place a breakpoint on the line Debug.Assert Not (xmlDoc Is Nothing).


Part A: Create and Save an XML Document

1. Run the project.

2. Click the Create XML Document button. Execution will halt at the breakpoint in cmdCreateXMLDoc_Click.

3. Explain that the line with the breakpoint instantiates an MSXML.DOMDocument object, which represents the top node of the XML DOM tree. Press F8 to step over the line with the breakpoint.

4. Step over the next three lines of code, making the following observations:

• A processing instruction containing an XML declaration is created.

• Processing instructions (like all new elements and attributes) must be appended to the appropriate place in the DOM tree. This step is performed by using the appendChild method of the IXMLDOMDocument interface. In this case, the processing instruction is appended directly to the DOMDocument, placing it at the top level of the document.

5. Step over the next three lines of code, making the following observations:

• The DOMDocument interface’s createElement method is used to create the booklist element.

• An xmlns attribute is created by using setAttribute. This attribute is used to associate a schema with an XML document.

• The booklist element is appended to the tree at the top level, making booklist the root element.

6. Explain that the private subroutine AddBook is used to create a book element, together with its associated attributes and child elements (title, author, and price).

7. Press F8 to enter the AddBook subroutine.

8. Step over the code in the AddBook subroutine, making the following observations:

• New elements are created by using the createElement method of the IXMLDOMDocument interface. This interface is obtained via the ownerDocument property of IXMLDOMNode. This code returns the root of the document containing the node.

• Attributes are associated with elements by using the setAttribute method of the element’s IXMLDOMElement interface.

• Each element is appended to the supplied parent node (which in this case is the booklist element) by using the appendChild method.

9. Having returned to the cmdCreateXMLDoc_Click procedure, step over the remaining calls to AddBook.


10. Step into the SaveXMLDocument private subroutine. Step through this routine, making the following observations:

• ADO Record and Stream objects are used to create the output XML document. Using these objects allows the document to be output to a Web folder by using a URL.

• A Stream object is opened from the Record object to represent the contents of the file.

• The WriteText method of the Stream object is called with a string representation of the XML document passed as a parameter (xmlDoc.xml).

11. Press the Continue toolbar button to resume execution of the program.

12. A message box will be displayed confirming that the document has been successfully created. Press OK to dismiss this message box.
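The document-building steps in Part A can be approximated outside Visual Basic. The following is only a sketch in Python's standard xml.dom.minidom module, which follows the same W3C DOM naming (createElement, setAttribute, appendChild, ownerDocument); it is not the demo's MSXML code. The function names add_book and build_booklist and all book data are hypothetical. The sketch skips the explicit processing instruction because minidom writes the XML declaration itself when toxml() is called, and it does not cover the ADO-based save step.

```python
from xml.dom.minidom import Document


def add_book(parent, isbn, book_type, title, author, price):
    """Append a <book> element with its attributes and child elements to parent."""
    doc = parent.ownerDocument  # the document that owns the parent node
    book = doc.createElement("book")
    book.setAttribute("isbn", isbn)
    book.setAttribute("type", book_type)
    for tag, value in (("title", title), ("author", author), ("price", price)):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(value))
        book.appendChild(child)
    parent.appendChild(book)
    return book


def build_booklist():
    """Build a two-book document in memory; all book data is hypothetical."""
    doc = Document()
    booklist = doc.createElement("booklist")
    doc.appendChild(booklist)  # appended at the top level, so it becomes the root
    add_book(booklist, "1-444444-11-0", "paperback",
             "Is Anger the Enemy?", "A. Author", "10.95")
    add_book(booklist, "1-555555-22-0", "hardback",
             "A Second Title", "B. Writer", "19.99")
    return doc


if __name__ == "__main__":
    # toxml() returns the string form, comparable to xmlDoc.xml in the demo.
    print(build_booklist().toxml())
```

As in the AddBook subroutine, the helper reaches the owning document through the parent node rather than taking the document as a separate argument.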

Part B: Load and Validate an XML Document

1. Run the project if it is not already running. Click the Load XML Document button. Execution will halt at the breakpoint within cmdLoadXMLDocument_Click.

2. Step over the line of code that instantiates a new DOMDocument.

3. Explain that by default the Load method of DOMDocument will load a document asynchronously. In this instance, the code sets the async property to False to perform a synchronous load. Point out the pitfalls of starting to use the DOM with a partially loaded tree. Also point out that if an asynchronous load is chosen, the ondataavailable event is fired when the XML document data is available.

4. Press F8 to step over the next line of code.

5. Point out that the validateOnParse property of DOMDocument indicates whether validation should be performed while loading the document. In this case, validation occurs against the schema referenced by the document (by using the xmlns attribute of the booklist root element).

6. Press F8 to step over the next line of code.

7. Step over the call to the DOMDocument’s Load method and point out that a URL is used to locate the XML document.

8. Press the Continue toolbar button to resume program execution.

9. A message box will be displayed confirming that the document has been successfully loaded. Press OK to dismiss this message box.

10. The DOM tree is now fully populated.

Part C: Walk Through an XML Document

1. Click the Walk XML Document button. If this button is disabled, you must first load an XML document by using the Load XML Document button.

2. Execution will halt at the breakpoint in cmdWalkXMLDocument_Click.

3. Step over the remaining lines of code in the subroutine, making the following observations:

• The root IXMLDOMNode variable is set to the document’s root element by using the DOMDocument’s documentElement property.

• The hasChildNodes method of IXMLDOMNode is used to test whether the root element has any child elements.

• The length property of the child nodes’ IXMLDOMNodeList interface is used to ascertain how many direct child elements the booklist element possesses. This number represents the number of books in the book list.

• A For loop is established to process each book element.

• The bookNode IXMLDOMNode variable is set to each successive child node.

• The node type is checked by using the nodeTypeString property. In this case, the node type will always be “element” because only book elements are direct children of the booklist element.

• For each element, the ProcessBookElement private subroutine is called, which outputs the element tag name and text together with the values of the isbn and type attributes. Make sure the Immediate window is visible, because output is sent to this window. Notice that the text property associated with the book element is a concatenation of all the text nodes for all child elements of book.

Part D: Search an XML Document

A set of XSL patterns has been provided in the XSL Pattern combo box. You can repeat these steps for each pattern. The patterns are:

• //author
Returns all author elements in the document.

• //book[@isbn='1-444444-11-0']
Returns the book element with the specified isbn attribute value.

2. Execution will halt at the breakpoint in cmdSearchXMLDocument_Click.

3. Step over the remaining lines of code in this subroutine, making the following observations:

• The XSL pattern is passed as a parameter to the selectNodes method of the IXMLDOMDocument interface.

• selectNodes returns an IXMLDOMNodeList collection that contains matching nodes.

• A For Each construct is established to process each node in the collection.

• The element tag name and text values are output to the Immediate window. Notice that for searches that return book elements, the text property of the book element is a concatenation of all the text nodes for all child elements of book.
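The two patterns can be tried without MSXML. This sketch uses Python's xml.etree.ElementTree, whose findall method accepts a limited XPath subset. It is not selectNodes, and the sample data and the search function are hypothetical. ElementTree requires a path relative to the element being searched, so the patterns are written with a leading dot (.//author rather than //author).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration only.
BOOKLIST_XML = (
    '<booklist>'
    '<book isbn="1-444444-11-0" type="paperback">'
    '<title>Is Anger the Enemy?</title><author>A. Author</author><price>10.95</price>'
    '</book>'
    '<book isbn="1-555555-22-0" type="hardback">'
    '<title>A Second Title</title><author>B. Writer</author><price>19.99</price>'
    '</book>'
    '</booklist>'
)


def search(xml_text, pattern):
    """Return (tag, concatenated text) for every element matching the pattern."""
    root = ET.fromstring(xml_text)
    return [(node.tag, "".join(node.itertext())) for node in root.findall(pattern)]


if __name__ == "__main__":
    print(search(BOOKLIST_XML, ".//author"))
    print(search(BOOKLIST_XML, ".//book[@isbn='1-444444-11-0']"))
```

As in the demonstration, a match on a book element reports the text of all its child elements run together.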


Module Strategy

Use the following strategy to present this module:

• Introduction to XML

Provide an overview of XML. XML defines a generic mechanism for adding tagged information to character data. This extra information can help to convey additional context, or metadata, or it can define the structure of the data contained within the tags (for example, by defining the fields of a purchase order). Although it may initially seem fairly simplistic, the simplicity and flexibility of XML are two of the key features that have helped make it the de facto information exchange mechanism for e-commerce.

Discuss the syntax of XML and how it encompasses other data. Look at how XML can be applied and the types of data it can help to represent. Emphasize the suitability of XML for document interchange (e-business) and explain that document exchange is how XML is used in the lab scenario. The purchase order system produces XML order documents for vendor trading partners. Mention other members of the XML family and which parts of the XML family are implemented in some common Microsoft products.

This section includes a practice in which students use Microsoft Internet Explorer 5 to view an XML document. The practice initially uses Internet Explorer 5 to display the XML data in its raw format and then asks students to associate an XSLT style sheet with the XML document. Internet Explorer 5 is used to view the document again. This time, as Internet Explorer 5 processes the style sheet, the data is displayed in an HTML table.

• Validating XML Documents

Describe the concept of XML validation and explain that it is frequently useful to know whether an XML document conforms to a specific XML grammar. The process of checking that an XML document conforms to a specific XML grammar is called validation. Applications can accept or reject documents based on their validity. The two common mechanisms for defining the XML grammars used when validating XML documents are DTDs and XML Schemas.

Explain why it is useful to validate XML documents. Show students how DTDs and XML Schemas work, and stress the advantages that schemas have over DTDs. Point out that the prime advantage of the DTD is that it is part of the XML 1.0 specification. Discuss schema syntax.

• Using the Document Object Model

Explain that one of the great advantages of using XML rather than a proprietary data format is that there are ready-made parsers, such as the Microsoft XML engine MSXML, to perform much of the difficult work automatically. However, after the parser has processed the XML data, you need some mechanism for accessing it programmatically. The DOM standard defines such a mechanism.

Explain the basics of how to access and manipulate XML data by using the DOM. Discuss searching for name tags and nodes based on particular criteria. Discuss XPath syntax. Finally, describe how to create, trim, and persist XML trees.


• Applying XML in N-Tier Applications

Discuss the different mechanisms for creating and manipulating XML. Refer students to the lab scenario, and discuss some of the ways in which XML might be used in distributed applications like those found in the lab for this module.

Examine some of the potential sources of XML, and explain how XML can be easily sent to a URL for further processing. Look at how XSL Transformations (XSLT) can be used as a powerful tool for converting between XML grammars.

• Best Practices

Summarize the best practices that should be observed when using XML to exchange data.


Overview

• Introduction to XML

• Validating XML Documents

• Using the Document Object Model

• Applying XML in N-Tier Applications

• Lab 9: Exchanging Data Using XML

• Best Practices

• Review

The Extensible Markup Language (XML) defines a flexible data representation that is ideal for data exchange in loosely coupled systems. Because of this advantage, XML is rapidly becoming the markup language of choice for e-commerce and other business-to-business data exchange.

In this module, you will learn some of the uses for XML and its basic syntax. You will learn about different XML grammars and how to validate an XML document against a grammar defined as a Document Type Definition (DTD) or XML Schema. You will also learn how to manipulate XML programmatically by using the Document Object Model (DOM). Finally, you will learn how XML can be used in an n-tier application.

Objectives

After completing this module, you will be able to:

• Describe the purpose and benefits of XML.

• Describe the structure of a well-formed XML document.

• Describe the purpose of XML Schemas and DTDs.

• Manipulate XML by using the Document Object Model.

• Describe how XML can be applied in an n-tier Windows DNA solution.

Introduction to XML

XML defines a generic mechanism for adding tagged information to character data. This extra information can help to convey additional context, or metadata, or it can define the structure of the data contained within the tags (for example, to define the fields of a purchase order). Although it may seem fairly simplistic at first, the simplicity and flexibility of XML are two of the key features that have helped to make it the de facto information exchange mechanism for e-commerce.

In this section, you will learn the syntax of XML and how it encompasses other data. You will look at how XML can be applied and the types of data it can help to represent. You will learn about the other members of the XML family and which parts of the XML family are implemented in some common Microsoft products.

This section includes the following topics:

• What Is XML?

• Benefits of XML

• XML Syntax

• XML Family of Standards

• XML Support in Microsoft Products

• Practice: Viewing an XML Document with Internet Explorer 5


What Is XML?

• Standard for defining data in tagged form

the underlying raw data

• Imposing a structure on the underlying raw data

• Providing extra information, or metadata, about part of the underlying raw data

An XML tag consists of a name enclosed in angle brackets, such as <book>. As you will see later, tags usually have matching end tags; in this case, </book>. The pair of tags defines an XML element within which data can be contained. If the application processing the file or stream containing the tags is XML-aware, it will identify these tags and interpret them appropriately.

A portion of a simple XML document is shown below:


Examining the fragment reveals information about two books surrounded by XML tags. In this case, each book has a title, author, and price. These three pieces of information about the book are contained within a <book> element. Two different <book> elements are defined and they, in turn, are contained within a <booklist> element. This fragment shows how XML can be used to define structure for the underlying data.

This example reveals two other aspects of XML. The first is that, when formatted correctly, it can be human readable. The second aspect is that it is hierarchical in nature. This hierarchy means that it can be formed into tree-like structures for processing. For information about the structure of the Document Object Model (DOM), see Using the Document Object Model in this module.
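To make the structure just described concrete, the sketch below embeds a hypothetical booklist fragment of that shape (two <book> elements, each with a title, author, and price, inside <booklist>) and reads it with Python's xml.dom.minidom. The author names, prices, the second title, and the helper names first_text and summarize are made up for illustration and are not taken from the course materials.

```python
from xml.dom.minidom import parseString

# Hypothetical fragment with the shape described in the text.
BOOKLIST_FRAGMENT = """\
<booklist>
  <book>
    <title>Is Anger the Enemy?</title>
    <author>A. Author</author>
    <price>10.95</price>
  </book>
  <book>
    <title>A Second Title</title>
    <author>B. Writer</author>
    <price>19.99</price>
  </book>
</booklist>
"""


def first_text(element, tag):
    """Return the text of the first descendant element with the given tag name."""
    node = element.getElementsByTagName(tag)[0]
    return node.firstChild.data


def summarize(xml_text):
    """Return one (title, author, price) tuple per <book> element."""
    doc = parseString(xml_text)
    return [(first_text(book, "title"),
             first_text(book, "author"),
             first_text(book, "price"))
            for book in doc.getElementsByTagName("book")]


if __name__ == "__main__":
    for title, author, price in summarize(BOOKLIST_FRAGMENT):
        print(title, "|", author, "|", price)
```

The nesting of the tags is what lets the program ask for the title of a particular book rather than scanning raw text.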

XML is All About Data

If you have written or examined an HTML document, XML may look familiar. However, there are two important aspects in which XML differs from HTML.

HTML tags:

• Convey formatting instructions for a Web browser. These formatting instructions do not convey any information about the data within them.

• Consist of a fixed set of tags defined by the World Wide Web Consortium (W3C). If you include your own tags in an HTML document, the browser will silently ignore them.

XML tags:

• Can be used to define the type and structure of the data contained within tags. This function preserves the original meaning of the data in the document.

• Allow you to define your own tags to create your own grammar or dialect to describe the data in your document. In fact, XML has only a few predefined tags that pertain to document structure.

XML and HTML look similar because they come from a common origin. They are both derived from the Standard Generalized Markup Language (SGML). SGML is used to define the structure and metadata for complex documentation, such as that required to describe the electrical wiring contained in an airliner. SGML was used before the advent of the Internet. The problem with SGML is that it is a complex syntax with many options for providing a high level of flexibility. Unfortunately, this feature can make it difficult to handle. As a result, SGML-aware applications have tended to reflect this difficulty in their complexity and price.

HTML was an attempt to apply SGML principles and provide a small subset of tags for simple documents. The success of the World Wide Web (WWW) is a testament to how important simplicity is when creating a common standard. The simplicity of HTML means that anyone with a copy of Microsoft Notepad can create a Web document.

But in some ways, the simplicity of HTML has also been its undoing. HTML is a good mechanism for conveying information about the formatting of a document in a browser. However, it is not an effective way of representing data.

As the Web becomes the backbone for e-commerce, much of the information transmitted over the Internet is not for human consumption in a Web browser, but rather for the use of applications. These applications need a description of the data being sent, not just instructions on how to display it. This requirement is best satisfied by XML. XML documents retain the structure of the data and can be more easily processed in software. Later in this module, you will see how much easier it is to write a program to process an XML document than it would be to process an HTML document.

XML itself is only one of a set of related standards. The World Wide Web Consortium defines the standard for XML and the other technologies in the XML family. For more information about the XML family of standards, see XML Family of Standards in this module.

Benefits of XML

• Data exchange

  - A cheap alternative to EDI

• Standardization of documents

  - XML grammars being standardized for vertical markets

  - Microsoft’s BizTalk initiative

• Metadata

  - Tools such as Rational Rose can export an OO design model in a format known as XML Metadata Interchange (XMI)

• Structure and interoperability in infrastructure

  - For example, the Simple Object Access Protocol (SOAP)

As it has evolved, XML has found a variety of applications:

• Data exchange

The main area in which XML is being applied is data exchange. The ability to exchange structured data between applications is a key enabler for e-commerce. For many years, the Electronic Data Interchange (EDI) standard governed most data interchange between organizations. This standard acted as a barrier to entry for smaller firms because it was traditionally expensive to implement. The EDI standardization process also limited the speed at which EDI-based systems could respond to changing conditions. With XML representing the data and the Web acting as the transport mechanism, the barrier to entry has lowered considerably. Also, flexibility has increased dramatically because two organizations simply need to agree on a common XML grammar to start exchanging data.

• Standardization of documents

Many professionals in industry, computer science, and the academic world are working to standardize XML grammars for vertical markets such as legal practice, scientific work, and finance. This standardization will allow the interchange of common documents and files that define such things as client records, chemical models, and financial instruments. Although some of these documents may form part of an e-commerce chain, they can be equally well exchanged on a floppy disk. Other initiatives, such as Microsoft’s BizTalk, concentrate on the interchange and interoperability of documents between organizations rather than absolute standards.

• Metadata

For example, XML is being used as metadata in the world of object-oriented software development. Tools such as Rational Rose can now export a model of an object-oriented design in a format known as XML Metadata Interchange (XMI). This format can then be read by other modeling tools or processed by a tool that can convert the class and component descriptions into software.


• Structure and interoperability in infrastructure

For example, the Simple Object Access Protocol (SOAP) defines a mechanism for delivering remote procedure calls as XML-encoded messages in HTTP requests. Using XML obviates the need for specialized parsing code and provides a high degree of extensibility.

This discussion should give you a few ideas about how XML is being applied. For more information about the advantages and uses of XML, see "Proposed Applications and Industry Initiatives" on the Extensible Markup Language page of Robin Cover’s XML/SGML Web pages located at

www.oasis-open.org/cover/xml.html


XML Syntax

• Comments

• Elements

<!-- This is my favorite book -->

Comments can occur anywhere in an XML document. It is a good practice to insert comments in XML documents or code if you expect them to be read by others at any point.

The following example shows an XML element containing simple text:

<title>Is Anger the Enemy?</title>

It is important to note that all characters in an XML document are by default Unicode characters. This rule applies to both XML tags and the data. As a result, XML does not provide equivalence between uppercase and lowercase characters (that is, it is case sensitive). The following example is not correct XML syntax:

<!-- This is invalid XML syntax: -->

<title>Is Anger the Enemy?</TITLE>
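A quick way to confirm the case-sensitivity rule is to hand both forms to a parser. The sketch below uses Python's xml.dom.minidom as a stand-in for any conforming XML parser. The helper name is_well_formed is hypothetical.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False


if __name__ == "__main__":
    print(is_well_formed("<title>Is Anger the Enemy?</title>"))  # matching case
    print(is_well_formed("<title>Is Anger the Enemy?</TITLE>"))  # mismatched case
```

The parser treats title and TITLE as two different names, so the second element has no matching end tag.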

The following example shows an XML element containing a mixture of other elements and text:

<chapter title="Inorganic Chemistry">
    In this chapter, we will discuss inorganic chemistry
    <section>
        Transition Metals
    </section>
    Transition Metals are found in the centre of the periodic table
    <section>
        Group 1 Metals
    </section>
    Group 1 Metals have a single electron
</chapter>

Note that the indentation is only shown for clarity. There is no need for such spacing, or indeed new lines, in your XML document.

It is important that XML tags nest correctly. If tag B is contained within tag A, then there must be an end tag for tag B before the end tag for tag A. The following example shows invalid nesting:

<!-- This is invalid XML syntax: -->
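As a rough illustration of the nesting rule, the sketch below feeds a correctly nested and an incorrectly nested string to Python's expat parser and reports the parser's own error text. The <book> and <title> element names and the function name nesting_error are hypothetical choices for this sketch.

```python
from xml.parsers import expat


def nesting_error(xml_text):
    """Return the parser's error description, or None if the text parses."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


if __name__ == "__main__":
    # </title> comes before </book>, so the tags nest correctly.
    print(nesting_error("<book><title>Is Anger the Enemy?</title></book>"))
    # </book> arrives while <title> is still open, so the parser stops.
    print(nesting_error("<book><title>Is Anger the Enemy?</book></title>"))
```

The parser reports the failure at the first end tag that does not close the most recently opened element.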

The <inStock> element is enough to signify that the book is currently in stock. This type of XML element is called an empty element. XML defines a shorthand notation for empty elements by collapsing the two tags into one. This single tag looks like a start tag, but with a trailing slash as shown in the


Document Structure

The structure of an XML document is defined in the XML specification:

• XML declaration

This element is a special tag that declares that the document contains XML and indicates the version of the XML specification to which it conforms. The XML declaration uses the same syntax as a family of tags called processing instructions.

• Encoding

The XML declaration may also contain an encoding. This element defines the character set used to encode the rest of the file. By default, this set is assumed to be the 8-bit Unicode Transformation Format (UTF-8), which is compatible with 7-bit ASCII.

• Prolog

The XML declaration forms part of the document prolog. This element may contain various things, including a DTD specification for the document. The DTD defines the expected structure of the document. For more information about DTDs, see the Validating XML Documents section later in this module.

• Root element

There should be a single XML element that encloses all of the other XML elements and data in the document. This element is called the root element.

An XML document that conforms to this structure and obeys the rules for attributes, elements, and comments defined previously is called a well-formed XML document. The following example shows a well-formed XML document:
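The sketch below pulls the pieces of the document structure together. It embeds a hypothetical document with an XML declaration, an encoding, a comment, a single root element, and an empty element written in the shorthand form, and then parses it with Python's xml.dom.minidom. The document content and the function name describe are illustrative assumptions, not text from the course.

```python
from xml.dom.minidom import parseString

# Hypothetical well-formed document: declaration with encoding, one root
# element, correctly nested children, and an empty element (<inStock/>).
WELL_FORMED = """<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical booklist used only for illustration -->
<booklist>
  <book isbn="1-444444-11-0">
    <title>Is Anger the Enemy?</title>
    <inStock/>
  </book>
</booklist>
"""


def describe(xml_text):
    """Parse the text and return (root tag name, whether <inStock> has children)."""
    # Encode to bytes so the parser applies the declared encoding itself.
    doc = parseString(xml_text.encode("utf-8"))
    root = doc.documentElement
    in_stock = root.getElementsByTagName("inStock")[0]
    return root.tagName, in_stock.hasChildNodes()


if __name__ == "__main__":
    print(describe(WELL_FORMED))
```

Parsing succeeds only because every rule is met; adding a second top-level element after </booklist>, for example, would make the parser reject the document.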


XML Family of Standards

• Schema

  - Defines a document’s structure; XML Schema will replace DTDs

• Document Object Model (DOM)

  - Standard object model for manipulating XML documents

• Extensible Stylesheet Language (XSL)

  - Transformation and formatting language

  - Evolved into XSL for Formatting and XSL for Transformations (XSLT)

As is the nature of standards, the XML family is constantly evolving to take into account new uses and challenges that present themselves as the standards are applied. Some of the main members of the XML family of standards are described in the following discussion.

Schema

The term schema is somewhat overloaded in the XML environment. In generic terms, a schema describes some form of plan or structure. In computer terms, the term schema is commonly used to describe the structure of the tables and columns in databases. Used in its generic form, an XML schema would define what can and cannot be in a particular XML document. It would describe which elements could contain which other elements, what attributes each element can have, and so forth.

The XML specification already contains a form of schema called a DTD. The DTD can be used to define an XML grammar to which a document must conform. The grammar could define a purchase order, the structure of a book, a financial transaction, or the format of an RPC packet. An XML-aware tool can then use the DTD to ensure that a document conforms to the given grammar. This process is termed validating the document. A document that has been proven to conform to its associated grammar is called a valid XML document.

Unfortunately, the DTD syntax is somewhat limited because compatibility with SGML is required. At the time of writing, the W3C is in the process of defining a replacement for DTDs called XML Schema. The XML Schema standard is based on work by Microsoft and other W3C members and will replace DTDs over time.

For more information about DTDs and XML Schema, see Validating XML Documents in this module.

Document Object Model

To manipulate XML in applications, some form of programmatic interface is required. The W3C defines a standard for manipulating XML documents called the Document Object Model (DOM). The DOM treats an XML document as a tree structure in which each element, attribute, and chunk of text is a node. The DOM provides a set of interfaces that allow a programmer to traverse the tree, access data, add new nodes, and remove unwanted nodes. For more information about the DOM, see Using the Document Object Model in this module.
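The kind of tree editing the DOM allows can be sketched as follows, using Python's xml.dom.minidom as one W3C-style DOM implementation. The replace_price function and the sample <book> element are hypothetical. The sketch locates a node, removes it, and appends a newly created replacement.

```python
from xml.dom.minidom import parseString


def replace_price(xml_text, new_price):
    """Remove the existing <price> node and append a new one with new_price."""
    doc = parseString(xml_text)
    book = doc.documentElement
    old_price = book.getElementsByTagName("price")[0]
    book.removeChild(old_price)            # remove an unwanted node
    price = doc.createElement("price")     # create a new element node
    price.appendChild(doc.createTextNode(new_price))
    book.appendChild(price)                # add it to the tree
    return book.toxml()


if __name__ == "__main__":
    source = "<book><title>Is Anger the Enemy?</title><price>10.95</price></book>"
    print(replace_price(source, "12.50"))
```

Because every element and piece of text is a node, the same handful of methods covers reading, adding, and removing content.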

Extensible Stylesheet Language

The Extensible Stylesheet Language (XSL) standard was originally intended as

a transformation and formatting language to be used alongside XML in a similar way that Cascading Style Sheets (CSS) are used alongside HTML It soon became clear, however, that the transformation and formatting were actually two distinct parts of XML As a result, XSL has now evolved into two standards: XSL for the formatting aspects (sometimes also known as formatting objects) and XSLT for XSL transformations

Most of the interest in the XML community has been in XSLT. By using an XSLT style sheet, an XML document can be transformed into another XML document. The ability to transform one XML grammar into another is a powerful mechanism, whether it is transforming XML into HTML for display or transforming one company’s purchase order definition into another form that is compatible with a different company’s software.

XSLT has its own, XML-based syntax. An XSLT style sheet consists of a set of template rules. Each template rule has a pattern that can be matched to part of the source XML document and a matching output template. Any part of the input XML document that matches a pattern in a template rule will have the associated output template applied to it.

The pattern matching language used in XSLT is defined in a separate standard called XPath. The XPath standard can also be used with Microsoft’s implementation of the DOM to help find particular nodes in the DOM tree. For more information about XPath, XSLT, and XSL, you can go to the W3C Web site at www.w3.org/Style/XSL
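To give a feel for how path expressions locate nodes, the following sketch uses the limited XPath subset supported by Python's xml.etree.ElementTree. It is not Microsoft's DOM implementation, and the book list, ISBN values, and prices are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical book list; the content is illustrative only.
BOOKLIST = """
<booklist>
  <book isbn="0-000-00000-1"><title>First</title><price>10.50</price></book>
  <book isbn="0-000-00000-2"><title>Second</title><price>25.00</price></book>
</booklist>
"""

root = ET.fromstring(BOOKLIST)

# Every <title> that is a child of a <book> directly under the root.
all_titles = [t.text for t in root.findall("./book/title")]

# The <title> of the one <book> whose isbn attribute has a given value.
second = root.find("./book[@isbn='0-000-00000-2']/title").text

# Every <book>, at any depth, that has a <price> child.
with_price = root.findall(".//book[price]")

print(all_titles, second, len(with_price))
```

Each expression describes a route through the tree (steps separated by slashes) with optional conditions in square brackets, which is the same idea that XSLT patterns build on.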

Namespaces

People and organizations are free to define their own XML grammar. A potential problem, however, is that the names of the elements and attributes in these grammars may conflict. For example, consider an XML grammar that defines the structure of a book. Each chapter in the book would have a title. Imagine that the book described family trees. Each tree may be defined according to another XML grammar specifically for family tree description. In this family tree grammar, each member of a family may also have a title. The title for a family member may have limitations placed on it (for example, limited to “Mr.,” “Ms.,” “Mrs.,” “Dr.,” and so on), whereas there would be no such restrictions on the titles of the chapters.

In this example, we need a way to differentiate between the titles used in different XML grammars. Using namespaces solves this problem. A namespace is identified by a URI and is bound to a prefix; the prefix establishes that a particular element or attribute belongs to a specific XML grammar. An example of using namespaces is shown below:

<book xmlns:bookns='urn:com:booknamespace' xmlns:familyns='urn:com:familyns'>
  <chapter>
    <bookns:title>My Family</bookns:title>
    The original Bloggs family can be traced back
    to <familyns:title>Dr.</familyns:title> Jack Bloggs
    at the turn of the 1900's
  </chapter>
</book>

You will encounter namespaces again when you look at XML Schemas in Validating XML Documents later in this module.

For more information about namespaces, you can go to the W3C Web site at www.w3.org/TR/1999/REC-xml-names-19990114
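As a small sketch of how a namespace-aware parser tells the two title elements apart, the book example can be loaded with Python's xml.etree.ElementTree. The short aliases b and f are assumptions made for this example; they show that the prefix is only a local alias and that the namespace URI is what identifies the grammar.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns:bookns='urn:com:booknamespace' xmlns:familyns='urn:com:familyns'>
  <chapter>
    <bookns:title>My Family</bookns:title>
    The original Bloggs family can be traced back
    to <familyns:title>Dr.</familyns:title> Jack Bloggs
    at the turn of the 1900's
  </chapter>
</book>"""

root = ET.fromstring(DOC)

# The aliases here differ from the prefixes in the document on purpose.
ns = {"b": "urn:com:booknamespace", "f": "urn:com:familyns"}

chapter_title = root.find("./chapter/b:title", ns).text
family_title = root.find("./chapter/f:title", ns).text

# Internally, each element name is qualified by its namespace URI.
tags = [child.tag for child in root.find("chapter")]

print(chapter_title, family_title, tags)
```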

XML Support in Microsoft Products

• Internet Explorer

• Microsoft's BizTalk initiative and BizTalk server

• Microsoft Office 2000

• Microsoft SQL Server 2000

• Simple Object Access Protocol (SOAP)

• ActiveX Data Objects (ADO version 2.0 and above)

• Microsoft's BizTalk initiative and BizTalk server. Microsoft’s BizTalk initiative and BizTalk server define an infrastructure for e-commerce and e-business that is based on XML. The BizTalk.org Web site provides a forum for organizations to exchange XML Schemas so that they can communicate with each other. The BizTalk server provides the software required to route XML-based messages between organizations and to transform them between different grammars to smooth their flow between organizations.

• Microsoft Office 2000. Office 2000 allows you to save documents in a Web format that includes XML. The XML is used to define metadata for the document, allowing it to be passed and manipulated in this Web format without losing the original rich structure and formatting that Microsoft Office can provide.

• Microsoft SQL Server™ 2000. SQL Server 2000 provides the capability of submitting a SQL query and receiving the response as an XML document.

• Simple Object Access Protocol (SOAP). SOAP defines a mechanism for using XML and HTTP as a firewall-friendly alternative to COM. For more information about SOAP, go to http://msdn.microsoft.com/xml/general/soapspec-v1.asp

• ActiveX® Data Objects (ADO version 2.0 and above). ADO recordsets can be persisted in an XML format. This allows them to be reconstituted later, or transmitted to environments that are not ADO aware, where their data can still be accessed.

For more information about XML support in Microsoft products, visit the Microsoft Web site at www.microsoft.com

Practice: Viewing an XML Document with Internet Explorer 5

In this practice, you will use Internet Explorer 5 to view an XML document containing a book list. You will then apply an XSLT style sheet to the document and view the document again.

View an XML document

1. Using Windows Explorer, navigate to the <install folder>\Practices\Mod09\XML folder.

2. Double-click the booklist.xml file to launch Internet Explorer. If a connection to the Internet is currently unavailable, the Work Offline dialog box may be displayed. If so, click the Try Again button.

3. The XML book list document will be displayed. Notice the color scheme adopted by Internet Explorer 5, which makes it easy to differentiate between elements, attributes, and text values.

4. Notice that the document begins with an XML processing instruction specifying the XML document version. This specification is part of the document’s prolog and is displayed in blue by Internet Explorer 5.

   Following the prolog is the root element, booklist. The booklist element contains a further set of book elements.

5. Click the minus sign preceding the booklist element, which will cause the entire document to be contracted. Because the booklist element is the root element for this document, only the processing instruction followed by the root element will remain.

6. Click the plus sign preceding the booklist element to expand the document again.

7. Click all of the minus signs preceding the book elements. This step provides a condensed view of the document listing just the book elements within the booklist root element.

8. Close Internet Explorer.

Apply an XSL style sheet

You will now apply an XSLT style sheet to the XML document to transform the XML into HTML.

1. Start a copy of Notepad. You will use this copy to edit the XML document.

   a. From the Start menu, select Run.

   b. In the Open field, type notepad

   c. Click OK.

2. In Notepad, point to the File menu and select Open.

3. In the Open dialog box, select All Files from the Files of type combo box.

4. Navigate to the <install folder>\Practices\Mod09\XML folder and select the

12. Close Internet Explorer.

Validating XML Documents

When processing an XML document, it is frequently useful to know whether it conforms to a specific XML grammar. The process of checking that an XML document conforms to a specific XML grammar is called validation.

Applications can accept or reject documents based on their validity. The two common mechanisms for defining the XML grammars used when validating XML documents are DTDs and XML Schemas.

In this section, you will learn why it is useful to validate XML documents. You will then learn how DTDs and XML Schemas work and the pros and cons of each mechanism. You will also learn how to apply a DTD or XML Schema to an XML document.

This section includes the following topics:

• Conforming to an XML Grammar

• Validating XML Documents with DTDs

• Validating XML Documents with Schemas

Conforming to an XML Grammar

• Two mechanisms for defining an XML grammar

  - Document Type Definition (DTD)
      Part of XML 1.0 standard
      Evolution of the SGML mechanism

  - XML Schemas
      New mechanism to address weaknesses of DTDs:
      Limited types for the content of an element
      Closed content model
      Fixed syntax
      Syntax differs from XML

There are various initiatives to define standard XML grammars for different vertical markets. However, a standard grammar, such as one for purchase order interchange, is only useful if people conform to that standard. For more formal industry standards, such as TCP/IP, there are test suites that can be used to prove whether a particular implementation conforms to the standard. Some standards bodies will provide official branding if a product passes their test suites.

The world of XML document interchange is very dynamic. It would be impossible for every XML document to be submitted to a standards body to ensure that it conformed to a particular XML grammar. For this reason, XML grammars are not defined in long, formal text documents. Rather, they are defined in a way that an XML parser can understand. The parser can then check that a document complies with a particular XML grammar as it processes it. There are two mechanisms for defining an XML grammar.

Mechanism                        Description

Document Type Definition (DTD)   Defined as part of the XML 1.0 standard. DTDs are an evolution of the validation mechanism used with SGML, because one of the requirements on the XML standard was that it should be backward compatible with SGML.

XML Schemas                      Due to some of the limitations of DTDs, XML Schemas have been created as a new mechanism for defining XML grammars. XML Schemas have evolved without the constraints placed on DTDs. At the time of writing, the XML Schema standard was approaching W3C recommendation status (that is, an official standard).

The rest of this section discusses these mechanisms and explains how to apply them.

Validating XML Documents with DTDs

The syntax for DTDs is defined in the XML 1.0 standard. The DTD syntax can be used to define:

• Names of elements that can be contained in the document

• Elements or text that a particular element can or must contain

• The order in which child elements should appear

• Attributes that can be applied to each element

• Required attributes or default values for attributes

DTD Syntax

To define an element in a DTD, use the following syntax:

<!ELEMENT chapter (#PCDATA)>

This syntax establishes that a chapter element can only contain character data. (The term PCDATA stands for Parsed Character Data.) Any attempt to include a <chapter> element containing another element will result in an invalid document.

To build a book from one or more chapters, use the following syntax:

<!ELEMENT book (chapter)+>

Note that the name of the element to be contained (or the #PCDATA token) is always enclosed in parentheses. In this case, the plus (+) operator follows the parenthesized element name. This syntax indicates that the <chapter> element can appear one or more times inside a <book> element. Beware of the implications in this case: a book must have at least one chapter. If it were valid to have a book without any chapters, you would have to apply the asterisk (*) operator instead to indicate zero or more occurrences.

Obviously, books do not consist simply of chapters. A book will have a foreword and possibly one or more appendices. To denote this possibility in DTD syntax, you would use the following:

<!ELEMENT book (foreword, chapter+, appendix*)>

This statement means that a <book> element must contain a <foreword> element followed by one or more <chapter> elements, followed in turn by zero or more <appendix> elements. The comma separator implies a strict ordering of elements. You can use the pipe (|) operator and nested parentheses to indicate a one-of-many selection. For example:

<!ELEMENT book (foreword?, (chapter | appendix)+)>

This example defines a book so that it consists of an optional foreword (zero or one) followed by one or more chapter or appendix elements, in any order. Obviously, the syntax required should reflect the real-world requirements for your data.
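One way to see what the occurrence operators mean is to read a content model as a regular expression over the sequence of child element names. The following sketch does that by hand for the two book declarations. It is only an illustration; the dictionary keys, the function name conforms, and the trick of writing each name with a trailing space are assumptions of the example, not a description of how a validating parser is implemented.

```python
import re

# Each child name is written as "name " (with a trailing space) so that the
# regular expression always matches whole element names.
CONTENT_MODELS = {
    # <!ELEMENT book (foreword, chapter+, appendix*)>
    "strict": r"(foreword )(chapter )+(appendix )*",
    # <!ELEMENT book (foreword?, (chapter | appendix)+)>
    "choice": r"(foreword )?((chapter )|(appendix ))+",
}


def conforms(model, children):
    """Return True if the ordered list of child names fits the content model."""
    sequence = "".join(name + " " for name in children)
    return re.fullmatch(CONTENT_MODELS[model], sequence) is not None


print(conforms("strict", ["foreword", "chapter", "chapter", "appendix"]))
print(conforms("strict", ["foreword", "appendix", "chapter"]))
print(conforms("choice", ["appendix", "chapter"]))
print(conforms("choice", ["foreword"]))
```

The comma becomes plain concatenation, the pipe becomes alternation, and ?, +, and * keep their usual meanings, which is why a missing foreword or an appendix placed before a chapter is rejected by the first model.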

Attributes are defined in a similar way. The following example requires that each <book> element must have an ISBN attribute to be valid:

<!ATTLIST book ISBN CDATA #REQUIRED>

The #REQUIRED token means that the attribute must be present for the element to be valid. The term CDATA means that the attribute’s value consists of character data (that is, a string). Alternatively, the attribute can be selected from a list of values, as shown in the following example:

<!ATTLIST book onBookerPrizeShortList (true | false) #IMPLIED>

This syntax means that the onBookerPrizeShortList attribute can have a value of either true or false. Attribute values are case sensitive, so True and False would not be accepted. The token #IMPLIED means that the attribute does not have to be present. To define a default value, you could replace the #IMPLIED token with the quoted default value, either "true" or "false".
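The attribute rules can be summarized in a short sketch. The dictionary below is a hypothetical in-memory form of the two ATTLIST declarations for the book element, and attribute_problems is an invented helper; a real validating parser reads these rules from the DTD itself.

```python
# Hypothetical in-memory form of the two <!ATTLIST book ...> declarations.
BOOK_ATTRIBUTES = {
    # ISBN CDATA #REQUIRED
    "ISBN": {"allowed": None, "required": True},
    # onBookerPrizeShortList (true | false) #IMPLIED
    "onBookerPrizeShortList": {"allowed": ("true", "false"), "required": False},
}


def attribute_problems(attrs):
    """Return a list of reasons why the attributes of a <book> are invalid."""
    problems = []
    for name, rule in BOOK_ATTRIBUTES.items():
        if name not in attrs:
            if rule["required"]:
                problems.append(f"missing required attribute {name}")
            continue
        if rule["allowed"] is not None and attrs[name] not in rule["allowed"]:
            problems.append(f"{name}={attrs[name]!r} is not one of {rule['allowed']}")
    for name in attrs:
        if name not in BOOK_ATTRIBUTES:
            problems.append(f"undeclared attribute {name}")
    return problems


print(attribute_problems({"ISBN": "0-000-00000-1"}))
print(attribute_problems({"onBookerPrizeShortList": "True"}))
```

The second call reports two problems: the required ISBN attribute is missing, and the value True does not match the lowercase values listed in the declaration.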

Although the tokens CDATA and #PCDATA represent similar entities, they are not interchangeable. The token #PCDATA is only used in element definitions and CDATA is used in attribute definitions.

A DTD can also define entities, which provide shorthand notations for commonly used strings. Entities can also be used to hold the contents of imported files.

Applying a DTD to an XML Document

To apply a DTD to an XML document, you use a Document Type Declaration. The Document Type Declaration uses the tag <!DOCTYPE … > and is part of the prolog of an XML document. The following example illustrates a Document Type Declaration:

or as part of the file in the Document Type Declaration

When a validating XML parser encounters a Document Type Declaration, it will recover the DTD information and apply that DTD to the root element. If the document does not match the DTD, the parser will indicate that the document is invalid.
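As a minimal sketch of where the Document Type Declaration sits in a document, the following example places a small DTD inside the declaration itself and reads it back with Python's xml.dom.minidom. The document is hypothetical and is not taken from this module. Note also that the parser in the Python standard library is non-validating: it records the declaration but does not check the document against it.

```python
from xml.dom import minidom

# Hypothetical document: the DTD is written inside the <!DOCTYPE ...> tag,
# which belongs to the prolog and precedes the root element.
DOC = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (chapter)+>
  <!ELEMENT chapter (#PCDATA)>
]>
<book><chapter>One</chapter></book>"""

doc = minidom.parseString(DOC)

# The name in the declaration identifies the root element the DTD applies to.
doctype_name = doc.doctype.name
root_name = doc.documentElement.tagName

print(doctype_name, root_name)
```

A validating parser would go one step further and reject, for example, a book element with no chapter children.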

Validating XML Documents with Schemas

an element or attribute value must be, for example, an integer or currency value

• Closed content model. DTDs have a closed content model. Consequently, an element can only contain precisely what is defined for it in the DTD. Using an open content model provides more extensibility and flexibility during the evolution of the grammar.

• Fixed syntax. The DTD syntax is fixed; a more extensible syntax is needed.

• Syntax differs from XML. The DTD syntax is completely different from XML syntax. It would be easier if the grammar definition language were itself a grammar of XML.

Because of these limitations, the W3C commissioned a working group to devise a replacement for DTDs termed XML Schema. The standard defining XML Schema has been split into two parts: structures, which define the content model, and data types. At the time of writing, the XML Schema specifications had not reached W3C recommendation status (that is, they had not been accepted as the standard).

Because the XML Schema standard is not yet defined at the time of writing, all schema examples use the schema implementation in Internet Explorer 5. This implementation is based on the XML Data Reduced proposal submitted by Microsoft and the University of Edinburgh in 1998.

The lack of standardization highlights one advantage that DTDs have over schemas, namely, that DTDs will be supported by any parser compliant with XML 1.0, whereas XML Schemas will only be supported by newer parsers.

Schema Syntax

You will recall that an XML Schema is an XML document. The root element of the schema is the <Schema> element. The following example shows a simple schema:

<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="title" content="textOnly"/>
  <ElementType name="author" content="textOnly"/>
  <ElementType name="price" dt:type="float"/>
  <ElementType name="book" content="eltOnly" model="closed">
    <element type="title" />
    <element type="author" />
    <element type="price" />
    <AttributeType name="isbn" dt:type="string"

In the preceding example, the first thing to notice is the use of namespaces. All of the elements used within this definition are defined in the namespace urn:schemas-microsoft-com:xml-data. The acronym “urn” indicates a Uniform Resource Name that uniquely identifies the source of these element definitions.

In addition, the alias “dt” is associated with the namespace urn:schemas-microsoft-com:datatypes. Data types are discussed later in the section.

Note: Uniform Resource Names (URNs) are conceptually similar to the more familiar Uniform Resource Locators (URLs). Both are strings used to uniquely identify a resource, and as such, both are forms of Uniform Resource Identifier (URI). The primary difference between a URL and a URN is that the URN is location independent and is a pure identifier. A URL, on the other hand, uniquely identifies a resource and encodes the protocol used to reach the resource; for example, HTTP or FTP.

Next, the two initial <ElementType> declarations establish that the elements <title> and <author> can only contain text. The next <ElementType> declaration establishes that the element <price> can only contain floating-point values, because its dt:type is float. The fourth <ElementType> declaration establishes that the element <book> contains one of each of the <title>, <author>, and <price> tags. In addition, the <AttributeType> declaration establishes a required attribute of type string called isbn. The <attribute> tag declares that the <book> element has such an attribute. The <book> element can contain no stand-alone text because its content is defined as eltOnly (element only), and it has a closed content model, meaning that it can only contain what is defined in the schema. Under the closed content model, the document will be invalid if you include elements or text that are not in the schema.
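Because a schema is itself an XML document, it can be read with the same tools as any other XML. The sketch below loads a schema like the one described above with Python's xml.etree.ElementTree and applies a very rough version of the closed content model check. Several things here are assumptions: the required="yes" attribute and the <attribute type="isbn"/> line are written to match the description in the text, the helper closed_model_ok is invented, the sample book elements are not namespace-qualified, and the check assumes each declared child appears exactly once and in the declared order. It is not a substitute for the validation that Internet Explorer 5 performs.

```python
import xml.etree.ElementTree as ET

XDR = "urn:schemas-microsoft-com:xml-data"
NS = {"x": XDR}

SCHEMA = """<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="title" content="textOnly"/>
  <ElementType name="author" content="textOnly"/>
  <ElementType name="price" dt:type="float"/>
  <ElementType name="book" content="eltOnly" model="closed">
    <element type="title"/>
    <element type="author"/>
    <element type="price"/>
    <!-- Assumed form of the attribute declaration described in the text. -->
    <AttributeType name="isbn" dt:type="string" required="yes"/>
    <attribute type="isbn"/>
  </ElementType>
</Schema>"""

schema = ET.fromstring(SCHEMA)
book_type = schema.find("x:ElementType[@name='book']", NS)

# Child elements that the schema allows inside <book>, in declared order.
allowed = [e.get("type") for e in book_type.findall("x:element", NS)]
isbn_required = book_type.find("x:AttributeType", NS).get("required")


def closed_model_ok(book):
    """Rough check: exactly the declared children, nothing else."""
    return [child.tag for child in book] == allowed


good = ET.fromstring("<book isbn='1'><title>T</title><author>A</author><price>9.5</price></book>")
bad = ET.fromstring("<book isbn='1'><title>T</title><author>A</author><price>9.5</price><review/></book>")
print(allowed, closed_model_ok(good), closed_model_ok(bad))
```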

This section can only provide information on some of the features of XML Schemas. For more information about XML Schemas, see the XML Schema Reference in the Web Workshop under the Windows 2000 Platform SDK or visit the online Web Workshop Web site: msdn.microsoft.com/library/psdk/xmlsdk/xmls5gkl.htm

XDR Data Types

Data type   Description

boolean     0 or 1, where 0 == "false" and 1 == "true"

char        A string that is one character long

date        The date in a subset of ISO 8601 format, without the time data (for example, 1994-11-05)

dateTime    The date in a subset of ISO 8601 format, with optional time and no optional zone. Fractional seconds can be as precise as nanoseconds; for example, 1988-04-07T18:39:09

float       A real number, with no limit on digits, that can potentially have a leading sign, fractional digits, and an exponent. Punctuation is the same as that in American English. Values range from 1.7976931348623157E+308 to 2.2250738585072014E-308

int         A number, with optional sign, no fractions, and no exponent

uri         A Uniform Resource Identifier (URI); for example, urn:schemas-microsoft-com:Office9

uuid        Hexadecimal digits that represent octets, with optional embedded hyphens that are ignored; for example, 0080C7055A83

For a complete list of supported data types, see the XML Data Types Reference in the Web Workshop in the Windows 2000 Platform SDK or visit the online Web Workshop at msdn.microsoft.com/library/psdk/xmlsdk/xmls1cbp.htm
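To make the descriptions in the table more concrete, the following sketch checks the text form of a value against a few of the data types. The function names are invented, and the checks follow only the wording of the table, so they are an approximation and not the rules applied by the Microsoft parser.

```python
import re
from datetime import date


def is_boolean(text):
    # boolean: 0 or 1
    return text in ("0", "1")


def is_int(text):
    # int: optional sign, no fractions, no exponent
    return re.fullmatch(r"[+-]?[0-9]+", text) is not None


def is_date(text):
    # date: ISO 8601 calendar date without time data, such as 1994-11-05
    match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", text)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:
        return False  # for example, month 13 or day 32
    return True


def is_uuid(text):
    # uuid: hexadecimal digits representing octets; embedded hyphens are ignored
    return re.fullmatch(r"(?:[0-9A-Fa-f]{2})+", text.replace("-", "")) is not None


print(is_boolean("1"), is_int("-42"), is_date("1994-11-05"), is_uuid("0080C7055A83"))
```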

Applying a Schema to an XML Document

To apply a schema to an XML document, use namespace declarations. There is a special namespace prefix, x-schema, that indicates to the parser that the URL following it points to a schema. The schema is applied to any elements and attributes within that namespace. Consider the following example:

<booklist xmlns="x-schema:http://localhost/bookstore/BookListSchema.xml">

One advantage of associating schemas with namespaces is that different schemas can be applied to different parts of the document. A schema can be associated with a namespace alias, and all elements and attributes prefixed with that alias will be validated against the associated schema.
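The link between a namespace and its schema can be sketched as follows. The example assumes that the schema is attached through a default namespace declaration whose value begins with x-schema:, and it only shows how the schema location can be recovered from the namespace name. Python's standard library does not load or apply the schema.

```python
import xml.etree.ElementTree as ET

# Assumed form of the declaration; the <book/> child is illustrative only.
DOC = (
    '<booklist xmlns="x-schema:http://localhost/bookstore/BookListSchema.xml">'
    "<book/>"
    "</booklist>"
)

root = ET.fromstring(DOC)

# ElementTree stores a qualified name as "{namespace}localname".
namespace, _, local_name = root.tag[1:].partition("}")

PREFIX = "x-schema:"
schema_url = namespace[len(PREFIX):] if namespace.startswith(PREFIX) else None

print(local_name, schema_url)
```

Because the child book element inherits the default namespace, it belongs to the same schema as the booklist root; an element in a different namespace could point at a different schema in the same way.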

Date posted: 21/12/2013, 19:15