
XML Programming Bible, Part 3


Table 6-26

ParserFactory Class Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| makeParser() | Creates a new SAX parser using the ‘org.xml.sax.parser’ system property. | SAX 1 |
| makeParser(className) | Creates a new SAX parser object using the class name provided. | SAX 1 |

AttributeListImpl

AttributeListImpl is the SAX helper class of the SAX 1 interface for a list of XML attributes. As with the SAX 1 Parser and DocumentHandler interfaces, the AttributeList interface should not be used for new development. Consequently, the AttributeListImpl class should not be used either. We’ve included it here to help debug and upgrade SAX 1 code to the SAX 2 XMLReader, ContentHandler, and Attributes interfaces. Table 6-27 describes the methods.

Table 6-27

AttributeListImpl Class Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| addAttribute(name, type, value) | Adds an attribute to an attribute list. | |
| getLength() | Returns the number of attributes in the list. | SAX 1 |
| getName(i) | Returns the name of an attribute by index. Attribute indexes start at 0. | SAX 1 |
| getType(i) | Returns the type of an attribute by index. Attribute indexes start at 0. | SAX 1 |
| getType(name) | Returns the type of an attribute by name. | SAX 1 |
| getValue(i) | Returns the value of an attribute by index. Attribute indexes start at 0. | |
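When upgrading SAX 1 code, the index-based calls in Table 6-27 map onto the SAX 2 Attributes interface. The following is a minimal sketch of that SAX 2 equivalent, not code from the book: the class name AttributeDump and the sample document are hypothetical, and it assumes the default JAXP parser that ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists attributes by index using the SAX 2 Attributes interface.
final class AttributeDump {

    /** Returns "qName:type=value" for every attribute seen, in index order. */
    static String dump(String xml) {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Attribute indexes start at 0, as in the SAX 1 AttributeList.
                for (int i = 0; i < atts.getLength(); i++) {
                    if (out.length() > 0) {
                        out.append(' ');
                    }
                    out.append(atts.getQName(i)).append(':')
                       .append(atts.getType(i)).append('=')
                       .append(atts.getValue(i));
                }
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("<book id=\"42\"/>"));
    }
}
```

Attributes that are not declared in a DTD are reported with the type CDATA, which is why the type is printed alongside the name and value.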

SAX extension interfaces

Aside from the SAX core interfaces, there are several extension interfaces that are implemented using the SAX extension API. SAX extensions are optional interfaces for SAX parsers. For example, the MSXML parser supports the DeclHandler and LexicalHandler interfaces, while the Apache Xerces parser classes support all extension interfaces. They can also be implemented independently of the SAX core interfaces. All extensions have been developed using the SAX 2 extensions API, and are not available in SAX 1.

At the beginning of this chapter, you reviewed the SAX extensions at the interface level. Now let’s review the methods that are contained in the extension interfaces.

Note: You may see SAX documentation that refers to “SAX Extensions 1.x.” This refers to the SAX 2 Extensions 1.x API, not SAX 1. There is no SAX extension API for SAX 1.

Attributes2

The Attributes2 interface reports whether an attribute in an XML document was declared in a DTD. It also reports whether the attribute value was specified in the document or supplied by a default value in the DTD. This interface is used mainly for data validation. Table 6-28 describes the methods.

Table 6-28

Attributes2 Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| isDeclared(index) or isDeclared(qName) or isDeclared(uri, localName) | Returns true if the attribute was declared in the DTD. isDeclared accepts an index (starting with 0), a qualified name, or a Namespace URI with a local name. | SAX 2 |
| isSpecified(index) or isSpecified(qName) or isSpecified(uri, localName) | Returns false if the attribute value was supplied by a default in the DTD rather than specified in the document. isSpecified accepts an index (starting with 0), a qualified name, or a Namespace URI with a local name. | |

DeclHandler Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| attributeDecl(eName, aName, type, mode, value) | Reports a DTD attribute type declaration. The reported type is any valid DTD value, such as “CDATA”, “ID”, “IDREF”, “IDREFS”, “NMTOKEN”, “NMTOKENS”, “ENTITY”, or “ENTITIES”, a token group, or a NOTATION reference. | SAX 2 and MSXML |
| elementDecl(name, model) | Reports a DTD element type declaration. The reported model is any valid DTD value, such as “EMPTY”, “ANY”, an order specification, and so on. | SAX 2 and MSXML |
| externalEntityDecl(name, publicId, systemId) | Reports a parsed external entity declaration. | SAX 2 and MSXML |
| internalEntityDecl(name, value) | Reports a parsed internal entity declaration. | SAX 2 and MSXML |
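The DeclHandler callbacks above are only delivered if the handler is registered with the parser. A minimal sketch follows, assuming the JDK's default JAXP parser and the standard SAX 2 property name http://xml.org/sax/properties/declaration-handler; the class name DeclDump and the sample DTD are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records the DTD declarations reported through DeclHandler.
final class DeclDump {

    static List<String> declarations(String xml) {
        List<String> seen = new ArrayList<>();
        // DefaultHandler2 already implements DeclHandler with empty methods.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void elementDecl(String name, String model) {
                seen.add("element " + name + " " + model);
            }

            @Override
            public void attributeDecl(String eName, String aName, String type,
                                      String mode, String value) {
                seen.add("attribute " + eName + "/" + aName + " " + type);
            }

            @Override
            public void internalEntityDecl(String name, String value) {
                seen.add("entity " + name + "=" + value);
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Extension handlers are set as properties, not through a setter method.
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>"
                   + "<!ATTLIST note lang CDATA #IMPLIED>"
                   + "<!ENTITY who \"reader\">]><note>hi</note>";
        declarations(xml).forEach(System.out::println);
    }
}
```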

EntityResolver2 extends the EntityResolver interface by programmatically adding external entity reference subsets. This can be useful for automatically adding predefined DTD references to an XML document for validation while parsing. Table 6-30 describes the methods.

Table 6-30

EntityResolver2 Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| getExternalSubset(name, baseURI) | Returns an external subset for documents without a valid DOCTYPE declaration. | SAX 2 |
| resolveEntity(name, publicId, baseURI, systemId) | Allows applications to map external entities to XML document InputSources, or map an external entity by URI. | SAX 2 |

LexicalHandler

LexicalHandler returns information about lexical events in an XML document. Comments, the start and end of a CDATA section, the start and end of a DTD declaration, and the start and end of an entity can be tracked with LexicalHandler. Table 6-31 describes the methods.

Table 6-31

LexicalHandler Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| comment(char[] ch, start, length) | This event is triggered when the parser encounters a comment anywhere in the document. | |
| endCDATA() | This event is triggered when the parser encounters the end of a CDATA section. | SAX 2 and MSXML |
| endDTD() | This event is triggered when the parser encounters the end of a DTD declaration. | SAX 2 and MSXML |
| endEntity(name) | This event is triggered when the parser encounters the end of an entity. | SAX 2 and MSXML |
| startCDATA() | This event is triggered when the parser encounters the start of a CDATA section. | SAX 2 and MSXML |
| startDTD(name, publicId, systemId) | This event is triggered when the parser encounters the start of a DTD declaration. | SAX 2 and MSXML |
| startEntity(name) | This event is triggered when the parser encounters the beginning of internal or external XML entities. | |
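As with DeclHandler, a LexicalHandler has to be registered through a parser property before these events arrive. The sketch below is illustrative only: it assumes the JDK's default JAXP parser and the standard property name http://xml.org/sax/properties/lexical-handler, and the class name LexicalDump is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records comment and CDATA boundary events.
final class LexicalDump {

    static List<String> events(String xml) {
        List<String> events = new ArrayList<>();
        // DefaultHandler2 implements LexicalHandler with empty methods.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                events.add("comment: " + new String(ch, start, length).trim());
            }

            @Override
            public void startCDATA() {
                events.add("startCDATA");
            }

            @Override
            public void endCDATA() {
                events.add("endCDATA");
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        events("<doc><!-- note --><![CDATA[a < b]]></doc>").forEach(System.out::println);
    }
}
```

The text inside the CDATA section still arrives through the ordinary characters() callback; startCDATA() and endCDATA() only mark its boundaries.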

Locator2

Locator2 extends the Locator interface to return the encoding and the XML version for an XML document. Table 6-32 describes the methods.

Table 6-32

Locator2 Interface Methods

| Method Name | Description |
| --- | --- |
| getXMLVersion() | Returns the entity XML version. |
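Because Locator2 is optional, code should check for it before casting the locator it receives. The following sketch assumes the JDK's default JAXP parser, which is expected to hand a Locator2 to setDocumentLocator(); the class name LocatorInfo is hypothetical, and the exact strings returned depend on the parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reads the XML version and encoding through Locator2.
final class LocatorInfo {

    static String describe(byte[] xmlBytes) {
        String[] result = {"locator does not implement Locator2"};
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Query once the XML declaration has been read, and only if supported.
                if (locator instanceof Locator2) {
                    Locator2 extended = (Locator2) locator;
                    result[0] = "version=" + extended.getXMLVersion()
                              + " encoding=" + extended.getEncoding();
                }
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new ByteArrayInputStream(xmlBytes)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc/>";
        System.out.println(describe(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```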

SAX extension helper classes

The SAX extension helper classes provide the same programmatic access to the SAX extension interfaces that the SAX helpers do to the SAX core interfaces. The optional SAX 2 Extension API interface properties, methods, and object classes have to be implemented to support these classes.

Note: The SAX extension helper classes are only for Java implementations. Currently, MSXML does not support helper classes, though it does support some of the functionality through additional methods in the core interfaces.

Attributes2Impl

The Attributes2Impl helper class is the implementation class of the Attributes2 interface. Attributes2 reports whether an attribute in an XML document was declared in a DTD, and whether its value was specified in the document or supplied by a DTD default. It’s used mainly for data validation. Attributes2Impl extends the interface functionality by letting you add, edit, and delete attributes from lists, as described in Table 6-33.

Table 6-33

Attributes2Impl Class Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| addAttribute(uri, localName, qName, type, value) | Adds an attribute to the end of the attribute list, setting its “specified” flag to true. | SAX 2 |
| isDeclared(index) or isDeclared(qName) or isDeclared(uri, localName) | Returns true if the attribute was declared in the DTD. isDeclared accepts an index (starting with 0), a qualified name, or a Namespace URI with a local name. | SAX 2 |
| isSpecified(index) or isSpecified(qName) or isSpecified(uri, localName) | Returns false if the attribute value was supplied by a default in the DTD rather than specified in the document. isSpecified accepts an index (starting with 0), a qualified name, or a Namespace URI with a local name. | SAX 2 |
| removeAttribute(index) | Removes an attribute from the attribute list. Attribute indexes start at 0. | SAX 2 |
| setAttributes(Attributes atts) | Copies an entire Attributes object into this attribute list. | SAX 2 |
| setDeclared(index, boolean value) | Sets the “declared” flag of a specified attribute. Attribute indexes start at 0. | SAX 2 |
| setSpecified(index, boolean value) | Sets the “specified” flag of a specified attribute. Attribute indexes start at 0. | |
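The methods in Table 6-33 can be exercised without a parser, which is handy in tests or in filters that synthesize events. A small sketch, assuming a current JDK; the class name Attributes2Demo and the attribute values are hypothetical.

```java
import org.xml.sax.ext.Attributes2Impl;

// Hypothetical demo: builds an attribute list by hand and adjusts its flags.
final class Attributes2Demo {

    static Attributes2Impl build() {
        Attributes2Impl atts = new Attributes2Impl();
        // addAttribute() marks each new attribute as "specified".
        atts.addAttribute("", "id", "id", "ID", "n1");
        atts.addAttribute("", "lang", "lang", "CDATA", "en");
        // Pretend "lang" came from a DTD default: declared, but not specified.
        atts.setDeclared(1, true);
        atts.setSpecified(1, false);
        return atts;
    }

    public static void main(String[] args) {
        Attributes2Impl atts = build();
        for (int i = 0; i < atts.getLength(); i++) {
            System.out.println(atts.getQName(i)
                    + " declared=" + atts.isDeclared(i)
                    + " specified=" + atts.isSpecified(i));
        }
        atts.removeAttribute(0); // indexes shift down after a removal
        System.out.println("remaining: " + atts.getQName(0));
    }
}
```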

Table 6-34

DefaultHandler2 Class Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| attributeDecl(eName, aName, type, mode, value) | Reports a DTD attribute type declaration. The reported type is any valid DTD value, such as “CDATA”, “ID”, “IDREF”, “IDREFS”, “NMTOKEN”, “NMTOKENS”, “ENTITY”, or “ENTITIES”, a token group, or a NOTATION reference. Source interface is DeclHandler. | SAX 2 |
| elementDecl(name, model) | Reports a DTD element type declaration. The reported model is any valid DTD value, such as “EMPTY”, “ANY”, an order specification, etc. Source interface is DeclHandler. | SAX 2 |
| externalEntityDecl(name, publicId, systemId) | Reports a parsed external entity declaration. Source interface is DeclHandler. | SAX 2 |
| internalEntityDecl(name, value) | Reports a parsed internal entity declaration. Source interface is DeclHandler. | SAX 2 |
| comment(char[] ch, start, length) | This event is triggered when the parser encounters a comment anywhere in the document. Source interface is LexicalHandler. | SAX 2 |
| startDTD(name, publicId, systemId) | This event is triggered when the parser encounters the start of a DTD declaration. Source interface is LexicalHandler. | |
| endDTD() | This event is triggered when the parser encounters the end of a DTD declaration. Source interface is LexicalHandler. | |
| startCDATA() | This event is triggered when the parser encounters the start of a CDATA section. Source interface is LexicalHandler. | |
| endCDATA() | This event is triggered when the parser encounters the end of a CDATA section. Source interface is LexicalHandler. | |
| startEntity(name) | This event is triggered when the parser encounters the beginning of internal or external XML entities. Source interface is LexicalHandler. | SAX 2 |
| endEntity(name) | This event is triggered when the parser encounters the end of internal or external XML entities. Source interface is LexicalHandler. | SAX 2 |
| getExternalSubset(name, baseURI) | Returns an external subset for documents without a valid DOCTYPE declaration. Source interface is EntityResolver2. | |
| resolveEntity(publicId, systemId) | Allows applications to map an external entity by URI. Source interface is EntityResolver2. | SAX 2 |
| resolveEntity(name, publicId, baseURI, systemId) | Allows applications to map external entities to XML document InputSources, or map an external entity by URI. Source interface is EntityResolver2. | SAX 2 |

Locator2Impl

Locator2Impl is the implementation class for the Locator2 SAX extension interface. Locator2 extends the Locator interface to return the encoding and the XML version for an XML document. Table 6-35 describes the methods.

Table 6-35

Locator2Impl Class Methods

| Method Name | Description |
| --- | --- |
| getEncoding() | Returns the type of character encoding for the entity. |

Table 6-36

IMXAttributes Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| addAttribute(URI, LocalName, QName, Type, Value) | Adds an attribute to the end of an attribute list. | MSXML |
| addAttributeFromIndex(attributes, index) | Adds the attribute specified by an index value to the end of an attribute list. Attribute indexes start with 0. | MSXML |
| clear | Clears the attribute list. | MSXML |
| removeAttribute(index) | Removes an attribute from the attribute list. Attribute indexes start with 0. | MSXML |
| setAttribute(index, URI, localName, QName, type, value) | Sets an attribute in the list. Attribute indexes start with 0. | |
| setLocalName(index, localName) | Sets the local name of a specified attribute. Attribute indexes start with 0. | MSXML |
| setQName(index, QName) | Sets the qualified name (QName) of a specified attribute. Attribute indexes start with 0. | MSXML |
| setType(index, type) | Sets the type of a specified attribute. Attribute indexes start with 0. | MSXML |
| setURI(index, URI) | Sets the namespace URI of a specified attribute. Attribute indexes start with 0. | MSXML |
| setValue(index, value) | Sets the value of a specified attribute. Attribute indexes start with 0. | MSXML |

IMXSchemaDeclHandler

The MSXML IMXSchemaDeclHandler extension interface provides schema information about an element being parsed, including attributes. Table 6-37 describes the methods.

Table 6-37

IMXSchemaDeclHandler Interface Methods

| Method Name | Description | Supported by |
| --- | --- | --- |
| schemaElementDecl | Declares a schema for validation of an element. Assists in MSXML SAX validation when parsing. | MSXML |

IMXWriter

IMXWriter writes parsed XML output to:

✦ An IStream object: A stream object representing a sequence of bytes that can be forwarded to another object such as a file or a screen.

✦ A string (remember, all XML documents are technically strings).

✦ A DOMDocument object: Can be passed to the MSXML DOM parser for further processing. For example, a new XML document could be parsed using SAX for speed, then sent to the DOM parser for DTD validation.

Note: The encoding and version properties of IMXWriter are similar to the getXMLVersion() and getEncoding() methods of the SAX API Locator2 extension interface. Also, one piece of trivia: this is the only SAX interface that has more properties than methods.

Table 6-38 describes the properties.

Table 6-38

IMXWriter Interface Properties

| Property Name | Description | Supported by |
| --- | --- | --- |
| byteOrderMark (boolean) | Controls the writing of the Byte Order Mark (BOM) for encoding, according to XML 1.0. | |
| disableOutputEscaping (boolean) | Sets the flag for the disable-output-escaping attribute of the <xsl:text> and <xsl:value-of> elements. If True, entity reference symbols and other non-XML data are passed without entity resolution. | MSXML |
| encoding (string) | Sets and gets XML document encoding for the written output. | MSXML |
| indent (boolean) | Sets indentation in the output. | MSXML |
| omitXMLDeclaration (boolean) | If true, the output will not include the XML declaration. | MSXML |
| output (variant) | Sets the destination and the type of IMXWriter output. | MSXML |
| standalone (boolean) | Sets the XML declaration standalone attribute to “yes” or “no.” | |

IMXWriter Interface Methods

| Method Name | Description |
| --- | --- |
| flush() | Flushes the object’s internal buffer to its destination (not for DOMDocument output). |

In this chapter, I provided a deep dive into the details of the Simple API for XML (SAX):

✦ A history of SAX

✦ SAX versions and evolution

✦ Understanding differences in W3C and MSXML SAX parser implementations

✦ SAX interfaces, extension interfaces, and helper classes

✦ SAX interface event callback methods

✦ SAX helper classes for implementing SAX 1 to SAX 2 compatibility

✦ Properties and methods for W3C and MSXML SAX interfaces

In the next chapter, we move on to something completely different: Extensible Stylesheet transformations. The chapters will follow the same format as the parsing chapters. Chapter 7 is an introduction to XSL and XSLT, while Chapter 8 provides more information on implementing XSLT and includes working examples.

XSLT Concepts

Chapters 1, 2, and 3 showed you what XML was all about, how to develop XML documents, and how to make sure that XML document structures are enforced using data validation. Chapters 4, 5, and 6 showed you some of the things you can do with XML documents, namely parsing them for conversion to other types of data.

This chapter will discuss the syntax, structure, and theory of Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT), with some basic examples for illustration.

Chapter 8 will show you XML and XSLT in real-world examples and tips for writing XSL stylesheets for XML documents.

Chapter 9 will extend those examples to show you how to use XSL: Formatting Objects (XSL:FO) with XML documents.

All of the XML document and stylesheet examples contained in this chapter can be downloaded from the xmlprogrammersbible.com Website, in the Downloads section

In this chapter:

✦ An introduction to XSL stylesheet elements

✦ Useful XPath and XSLT functions for stylesheet developers

✦ Extending XSLT with the help of EXSLT.org

Introducing the XSL Transformation Recommendation

XSL stands for Extensible Stylesheet Language. The XSL Transformations (XSLT) Recommendation describes the process of applying an XSL stylesheet to an XML document using a transformation engine, and also specifies the XSL language covered in this chapter. XSLT is based on DSSSL (Document Style Semantics and Specification Language), which was originally developed to define SGML document output formatting. XSLT 1.0 became a W3C Recommendation in 1999, and the full specification is available for review at http://www.w3.org/TR/xslt.

The XSLT Recommendation should not be confused with the confusingly named Extensible Stylesheet Language (XSL) Version 1.0 Recommendation, which achieved W3C Recommendation status on 15 October 2001. This recommendation has more to do with XSL: Formatting Objects (XSL:FO) than XSL Transformations (XSLT). You can view the Extensible Stylesheet Language (XSL) Version 1.0 Recommendation at http://www.w3.org/TR/xsl/. Chapter 9 covers XSL: Formatting Objects, including most of the W3C Extensible Stylesheet Language 1.0 Recommendation.

Another W3C Recommendation that affects XSLT is the XML Path Language (XPath). XPath is a tree-based representation model of an XML document that is used in XSLT to describe elements, attributes, text data, and relative positions in an XML document. The full recommendation document can be seen at http://www.w3.org/TR/xpath.

Version 2.0 of XSLT and XPath are currently in the Recommendation process, and are expected to become W3C Recommendations sometime in late 2003. The current documents and their status can be reviewed at http://www.w3.org/TR/xslt20req and http://www.w3.org/TR/xpath20req.

Stylesheet structure and syntax are defined in the W3C XSLT Recommendation document, and transformation engines are based on these definitions. Transformation engines support a variety of programming languages, usually based on the language that they are developed in. At time of writing, there is no comprehensive list of XSLT engines available, but the Open Directory Project provides a good overview at http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/Style_Sheets/XSL/Implementations/. Despite a multitude of XSLT engines supporting a multitude of languages, mainstream XSLT engines are split into two platform camps: Java and Microsoft.

One of the first Java transformation engines was the LotusXSL engine, which IBM donated to the Apache Software Foundation, where it became the Xalan transformation engine. Since then, Apache has developed Xalan Version 2, which implements a pluggable interface into Xalan 1 and 2, as well as integrated SAX and DOM parsers. Both of the Java versions of Xalan implement the W3C XSLT and XPath Recommendations. You can find more information on Xalan at http://xml.apache.org/xalan-j/index.html.

Microsoft support for XML 1.0 and a reduced implementation of the W3C XSLT Recommendation began with MS Internet Explorer 5, which also supported the Document Object Model (DOM), XML Namespaces, and beta support for XML Schemas. XML and XSL functionality was extended in later browser versions and separated from the browser into the MSXML parser, more recently renamed the Microsoft XML Core Services. MSXML is for use in client applications, via Web browsers and Microsoft server products, and is a core component of the .NET platform.

How an XSL Transformation Works

Developers create code that identifies an XML source, an XSL stylesheet, and a transformation output method and destination to a transformation engine, which is usually described as an XSL processor. Instructions from source code to the XSL processor perform a transformation using the predefined components. The XSL processor reads the source XML document and performs a transformation of the XML attributes, elements, and text values based on instructions in the XSL stylesheet.

XSLT stylesheets are well-formed XML documents that conform to W3C standards for syntax. Output format is specified in the XSL document as well, and can be HTML, text, or XML.
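In Java, the three pieces described above (XML source, XSL stylesheet, output destination) map onto the JAXP javax.xml.transform API. The sketch below is an illustration rather than the book's code: the class name SimpleTransform, the sample stylesheet, and the greeting document are hypothetical, and it uses whatever XSLT 1.0 processor the JDK provides by default.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies a stylesheet to a document and returns the output.
final class SimpleTransform {

    // A tiny stylesheet that declares text as its output method.
    static final String SAMPLE_XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>Hello, <xsl:value-of select='/greeting/@to'/>!</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xsl) {
        try {
            // 1. Identify the stylesheet to the processor.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            // 2. Identify the XML source and the output destination, then transform.
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<greeting to='world'/>", SAMPLE_XSL));
    }
}
```

The same Transformer call works with file or network sources; only the StreamSource and StreamResult arguments change.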

XSL for attributes and elements

XSL directives and functions combined with XPath functions make up the vocabulary for XSL stylesheet transformations. All of the directives and functions will be explained a little later in this chapter. Before I get into the full list of directives and functions, let’s step through a very basic transformation using very basic source, output, and stylesheet formats. Listing 7-1 shows the very simple XML document that is based on the first XML document examined in Chapter 1. The document has a root element and a few nested elements, a few attributes, and a few text data values.

Listing 7-1: A Very Simple XML Document

✦ href: Must be a valid URI.

✦ title: Used for distinguishing between more than one xml-stylesheet processing instruction in the same XML document.

✦ media: A list of values as defined in the W3C HTML Recommendation Version 4.0 and higher. Used in addition to or instead of the title attribute.

✦ charset: Used to specify a separate encoding for a stylesheet. For example, the XML document may be UTF-8, and the XSL stylesheet could be ISO-8859-1. Theoretically, the XSLT processor should know how to handle the charset differences.

✦ alternate: For use when more than one xml-stylesheet processing instruction is in the same XML document. If the attribute value is no, the stylesheet should be used first. All other stylesheets should have an alternate attribute value of yes.
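A SAX parser reports the xml-stylesheet processing instruction as a target plus one undivided data string, so the attributes listed above have to be split out by the application. A minimal sketch, assuming well-formed pseudo-attributes without escaped quotes; the class name StylesheetPiReader and the file name style.xsl are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: extracts href, type, alternate, etc. from the first
// xml-stylesheet processing instruction in a document.
final class StylesheetPiReader {

    // name="value" or name='value'
    private static final Pattern PSEUDO_ATTR =
            Pattern.compile("([\\w-]+)\\s*=\\s*([\"'])(.*?)\\2");

    static Map<String, String> read(String xml) {
        Map<String, String> attrs = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                if (!"xml-stylesheet".equals(target) || !attrs.isEmpty()) {
                    return; // not a stylesheet reference, or one was already captured
                }
                Matcher m = PSEUDO_ATTR.matcher(data);
                while (m.find()) {
                    attrs.put(m.group(1), m.group(3));
                }
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return attrs;
    }

    public static void main(String[] args) {
        String xml = "<?xml-stylesheet type=\"text/xsl\" href=\"style.xsl\" alternate=\"no\"?><doc/>";
        System.out.println(read(xml));
    }
}
```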

There are three ways that transformations happen:

✦ Referencing the XSL explicitly: As illustrated in the reference code earlier, and in Listing 7-1, a reference to a stylesheet can be explicitly declared using the xml-stylesheet processing instruction. This is useful when automatic client-side XSLT transformations are necessary and the client software, usually a Web browser, is W3C XSLT compliant. Explicit referencing is most commonly used for separation of data in XML documents from display characteristics in XSL stylesheets. The XML is usually transformed to HTML on a server or in a browser client before the HTML is displayed to a user.

✦ Referencing the stylesheet programmatically: Programs can declare the XML source, the XSL stylesheet, and the output destination, then invoke an XSLT processor to perform the transformation. This is the technique used on servers to separate XML document data from XSL stylesheet HTML display characteristics in XML-based Websites, where one stylesheet controls the display of many XML documents. It is also the way that most XML-to-XML and XML-to-text transformations occur in XML applications.

✦ Embedding XML into an XSL stylesheet: XML data can also be embedded into an XSL document. This is not recommended for the same reasons that embedded DTDs are not recommended. This is only mentioned here in case a developer comes across this technique in a legacy system. Embedded stylesheets represent a maintenance nightmare if the transformation or the source data should ever need to be altered, and defeat the purpose of transformations. In most cases, the transformed document can be substituted for the XML data and stylesheet combination document.

Next is the remainder of the XML document, which consists of a single-value rootelement element:

Nested under the “firstelement” element is the level1 element, which contains an attribute called children. The element name is used to describe the nesting level in the XML document, and the attribute is used to describe how many more levels of nesting are contained under the level1 element, in this case, no more nested levels (0). The phrase This is level 1 of the nested elements represents a textual data value for the level1 element that the text is nested in.

The secondelement element is a variation of the firstelement element. Let’s compare the firstelement and secondelement elements to get a better sense of the structure of the document:

Last but not least, to finish the XML document, the rootelement tag is closed:

</rootelement>

Listing 7-2 shows a stylesheet that transforms attributes in Listing 7-1 to elements by matching a pattern and applying a template to items in the source XML document that transforms them into a new format in the destination XML document.

Listing 7-2: A Very Simple XSL Stylesheet
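The attribute-to-element technique can be sketched in a few lines. The stylesheet embedded below is an illustrative stand-in written for this example, not a reproduction of Listing 7-2; the class name AttrToElement and the sample item document are hypothetical, and the code runs on the JDK's default XSLT 1.0 processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: rewrites every attribute as a child element of the same name.
final class AttrToElement {

    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        // Copy each element, then process its attributes and children.
        + "<xsl:template match='*'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        // Turn each attribute into an element named after the attribute.
        + "<xsl:template match='@*'>"
        + "<xsl:element name='{name()}'><xsl:value-of select='.'/></xsl:element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String convert(String xml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)))
                    .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<item id=\"7\">text</item>"));
    }
}
```

Text nodes are not matched by either template, so they fall through to the built-in rule that copies their value to the output.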


The XSL stylesheet starts with an optional XML declaration and an attribute that sets the encoding style for the XSL stylesheet. Encoding style for the transformation output is handled separately:

... a good reason for not using stylesheet. For XSLT 1.0, the W3C Recommendation requires the version attribute on the root element whether stylesheet or transform is used as the element name, so include version="1.0" in either case. Until XSLT 2.0 becomes an official W3C Recommendation, 1.0 is the value to use.

There is one other Namespace declaration that developers may see in legacy applications and older stylesheets:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

This Namespace declaration was used in older stylesheets to maintain compatibility with Microsoft IE 5.0 browsers, which supported an older version of the W3C Recommendation. This Namespace should not be used unless compatibility with 5.0 browsers needs to be maintained.

XSLT Elements

The stylesheet element is used to specify the root element of W3C stylesheets.

XSLT vocabularies are mostly made up of elements that describe template instructions or types of data that XSLT processors use during transformations. Table 7-1 describes the full listing of XSL elements available to stylesheet developers.

Table 7-1

W3C XSLT Elements

| Element | Description |
| --- | --- |
| stylesheet | Defines a root element of a stylesheet. Can be used interchangeably with transform, but most stylesheets use stylesheet as a de facto standard. |
| transform | Defines a root element of a stylesheet. Should only be used to replace stylesheet as the root element of a stylesheet, and only if there is a good reason not to use stylesheet. |
| output | Defines the format of the output document. The html, xml, and text output methods are predefined. If the output method is xml, output is well-formed XML; html formats the output as HTML; and text is any character data, including RTF and PDF files. If no output method is specified, the XSLT processor usually checks to see if the document is HTML, based on the html node at the top of the output document tree, and defaults to xml if no other determination can be made. Must be a child of the stylesheet element. Several optional attributes can also be used to define the output version, the encoding type, whether to include an XML declaration, the standalone attribute, a doctype, output document indentation, and a media type. |
| namespace-alias | Replaces a source document Namespace with a new Namespace in the output node tree. Must be a child of the stylesheet element. |
| preserve-space | Defines whitespace preservation for elements. Must be a child of the stylesheet element. |
| strip-space | Defines whitespace removal for elements. Must be a child of the stylesheet element. |
| key | Adds key values to each node in the result of an XPath expression. Must be defined as a child of the stylesheet element. For use with the key function in XPath expressions (functions are defined in Table 7-4). |
| import | Imports an external stylesheet into the current stylesheet. If there are conflicts between the current stylesheet and the imported stylesheet, the current stylesheet takes precedence. Must be defined as a child of the stylesheet element. |
| apply-imports | Follows the apply-templates rules but overrides a stylesheet template with the template from an imported stylesheet. Normally, the current stylesheet takes precedence over the imported stylesheet. |
| include | Includes an external stylesheet in the current stylesheet. If there are conflicts between the current stylesheet and the included stylesheet, it’s up to the XSLT processor to decide precedence. Must be defined as a child of the stylesheet element. |
| template | Applies rules in a match or select action. Optional attributes can be used for specifying a node-set by match, template name, processing priority for this template in case of conflicts in the stylesheet, and an optional QName for a subset of nodes in a node-set. |
| apply-templates | Applies templates to all children of the current node, or a specified node-set using the optional select attribute. Parameters can be passed using the with-param element. |
| call-template | Calls a template by name. Parameters can be passed using the with-param element. Results can be assigned to a variable. |
| param | Defines a parameter and a default value in a stylesheet template. A global parameter can be defined as a child of the stylesheet element. |
| with-param | Passes a parameter value to a template when call-template or apply-templates is used. |
| variable | Defines a variable in a template or a stylesheet. A global variable can be defined as a child of the stylesheet element. |
| copy | Copies the current node and any related Namespace only. Output matches the current node (element, attribute, text, processing instruction, comment, or Namespace). |
| copy-of | Copies the current node, Namespaces, descendant nodes, and attributes. Scope can be controlled with a select attribute. |
| if | Conditionally applies a template if the test attribute expression evaluates to true. |
| choose | Makes a choice based on multiple options. Used with when and otherwise. |
| when | An action for choose elements. |
| otherwise | A default action for choose elements. Must be the last child of a choose element. |
| for-each | Iteratively processes each node in a node-set defined by an XPath expression. |
| sort | Defines a sort key used by apply-templates to order a node-set and by for-each to specify the order of iterative processing of a node-set. |
| element | Adds an element to the output node tree. Names, Namespaces, and attributes can be added with the name, namespace, and use-attribute-sets attributes. |
| attribute | Adds an attribute to the output node tree. Must be a child of an element. |
| attribute-set | Defines a named list of attributes that can be added to elements in the output node tree. Must be a child of the stylesheet element. |
| text | Adds text to the output node tree. |
| value-of | Retrieves the string value of a node and writes it to the output node tree. |
| decimal-format | Specifies the format of numeric characters and symbols when converting to strings. Used with the format-number function only, not with the number element. (Functions are defined in Table 7-4.) |
| number | Adds a sequential number to the nodes of a node-set, based on the value attribute. Can also define the number format for the current node in the output node tree. |
| fallback | Defines alternatives for instructions that the current XSL processor does not support. |
| message | Adds a message to the output node tree. This element can also optionally stop processing on a stylesheet with the terminate attribute. Mostly used by developers for debugging stylesheets and XSLT processors. |
| processing-instruction | Adds a processing instruction to the output node tree. |
| comment | Adds a comment to the output node tree. |

All of the elements in Table 7-1 should be prefixed by xsl: and follow the format


The other XSLT 1.0 output options are text or HTML, or a valid prefixed QName that can be resolved into a URI. For more complete documentation on this element, please refer to the XSLT element listings in Table 7-1.

Next, the stylesheet goes hunting for all the attributes in the XML document using the template element and the match attribute:

<xsl:template match="@*">

The match attribute is available with the template and key elements, and is used to match the pattern specified by the match attribute value. When an XSLT processor is invoked, the source XML document is parsed into a set of nodes in a tree, starting with the root element in the document. XSLT uses pattern matching to look through the document node tree and retrieve nodes that match the patterns specified. The @* attribute value is an XPath pattern that matches every attribute node in the source XML document: @ is the abbreviation for the attribute axis, and * is a wildcard that matches any name.

XSL and XPath

The match attribute is one of several XSLT pattern-matching attributes that are used to find nodes in an XML source document. The match attribute is used to match a pattern in an XML document, for example, to detect the root element, or an attribute in the second element under the root element. Pattern matching is facilitated through XPath expressions, which express the parsed nodes of an XML document in tree hierarchy references. XPath follows a syntax that closely mirrors file system paths but in the context of an XML document. XPath tree representations break XML documents down into a series of connected root, element, text, attribute, Namespace, processing instruction, and comment nodes.

Imagine that the XSLT processor parses a document and places each of the elements in the document into a directory on a file system, defining attributes, Namespaces, and text data in each directory with special identifiers. The new file system starts with the root directory (/), and each descendant element can be found in a subdirectory under the root. XPath doesn't work exactly like this, but on the surface it appears to, and the directory metaphor is a good point of reference for starting to understand how XPath really does work. Table 7-2 shows the basic location operators for XPath expressions.
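The directory metaphor can be tried out with any XPath-aware library. What follows is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module, which implements only a subset of XPath 1.0, and a shortened, hypothetical stand-in for the quote document. It illustrates path-style addressing only and is not one of the book's XSLT examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the chapter's sample document.
DOC = """<quotedoc>
  <quotelist author="Shakespeare, William">
    <quote source="Macbeth">Out, out, brief candle!</quote>
    <quote source="Hamlet">To be, or not to be.</quote>
  </quotelist>
</quotedoc>"""

root = ET.fromstring(DOC)  # root is the <quotedoc> element

# Like a relative directory path: <quote> children of <quotelist> children.
quotes = root.findall("quotelist/quote")

# "//" searches at any depth below the starting element.
all_quotes = root.findall(".//quote")

# A predicate narrows the match, here by attribute value.
macbeth = root.findall("quotelist/quote[@source='Macbeth']")

print([q.text for q in quotes])
print(macbeth[0].get("source"))
```

Both path forms find the same two elements here because every quote sits directly under quotelist; in a deeper document the two would differ.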


Table 7-2

XPath Location Operators

Operator Description

The location operators are actually abbreviations of commonly used XPath node axes. Node axes are expressions that relate to the current node and radiate out from that node in different directions, to locate parents, ancestors, children, descendants, and siblings, in relation to the current node. Table 7-3 lists and describes the XPath node axes.

Table 7-3

XPath Node Axes

Axis                Description
ancestor            Ancestors, excluding the current node.
ancestor-or-self    The current node and all ancestors.
attribute           The attributes of the current node.
child               Children of the current node.
descendant          Descendants, excluding the current node.
descendant-or-self  The current node and all descendants.
following           All nodes after the current node in document order, excluding the current node's descendants and any attribute or Namespace nodes.
following-sibling   All siblings of the current node that come after it in document order.
namespace           All Namespace nodes of the current node.
parent              The parent of the current node.
preceding           All nodes before the current node in document order, excluding the current node's ancestors and any attribute or Namespace nodes.
preceding-sibling   All siblings of the current node that come before it in document order.
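To make a few of these axes concrete, here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module and a small hypothetical document. ElementTree has no axis syntax of its own, so each axis is approximated with ordinary tree navigation. The sibling lists below are in document order, whereas XPath treats preceding-sibling as a reverse axis.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the quote texts are placeholders.
DOC = """<quotedoc>
  <quotelist author="Shakespeare, William">
    <quote source="Macbeth">first</quote>
    <quote source="Hamlet">second</quote>
    <quote source="Macbeth">third</quote>
  </quotelist>
</quotedoc>"""

root = ET.fromstring(DOC)
quotelist = root.find("quotelist")
second = quotelist[1]  # treat the middle <quote> as the current node

# child axis: direct element children only.
child_axis = [e.tag for e in root]

# descendant axis: everything below the node, in document order.
# iter() yields the node itself first, so that entry is dropped.
descendant_axis = [e.tag for e in root.iter()][1:]

# parent axis: elements hold no parent pointer in ElementTree,
# so the parent is reached with a ".." step from an ancestor.
parent_axis = root.find(".//quote/..")

# sibling axes: derived from the parent's list of children.
siblings = list(quotelist)
index = siblings.index(second)
preceding_sibling = [e.text for e in siblings[:index]]
following_sibling = [e.text for e in siblings[index + 1:]]

print(child_axis, descendant_axis, parent_axis.tag)
print(preceding_sibling, following_sibling)
```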

XPath axes, attributes, and namespaces

XPath axes treat attributes and Namespaces differently than they treat elements, text values, processing instructions, and comments, depending on the axis and the current node. This is because attributes and Namespaces in the document are not part of the hierarchy of elements, text values, processing instructions, and comments, but are located separately in the node tree.

✦ Attributes are only available from element nodes or the root node, not from other attribute and namespace nodes.

✦ The child, descendant, following, following-sibling, preceding, and preceding-sibling axes do not contain attributes or Namespaces. The child, descendant, following-sibling, and preceding-sibling axes are empty if the current node is an attribute or a Namespace node.

✦ Attributes of the current node can be accessed using the attribute axis or the attribute identifier (@), as long as the current node is an element node.
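The separation between attributes and the element hierarchy shows up in most XML tree APIs, not only in XPath. As a rough illustration, the sketch below uses Python's standard xml.etree.ElementTree module and a hypothetical quote element with an invented emph child. It is an analogy for the rules above, not an XPath evaluation.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: two attributes, some text, and one child element.
quote = ET.fromstring(
    '<quote source="Macbeth" author="Shakespeare, William">'
    "Out, out, <emph>brief</emph> candle!</quote>"
)

# Walking the element hierarchy finds the child element only.
child_tags = [child.tag for child in quote]

# Attributes are reached through a separate mapping, much as XPath
# needs the attribute axis (or @) to reach them.
attribute_names = sorted(quote.attrib)

# An attribute value is a plain string here, with nothing to descend
# into, which mirrors the empty child axis of an attribute node.
source_value = quote.get("source")

print(child_tags, attribute_names, source_value)
```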

The next few lines in our example stylesheet create a new element based on the name of the current node in the XML document tree. The current node is set to an attribute in the XML document, based on the previous line in the XSL stylesheet (xsl:template match="@*"). However, XPath has limitations on what can be accessed if the current node is an attribute or Namespace. To get around this limitation, the XPath name() function is used to pass the name of the current attribute node to the new element declaration. The XPath location operator representing the self node (.) is used to pass the value of the attribute into the value of the new element using the select attribute of the value-of element. The new element is then finished with a hard-coded closing tag, and the template is finished with the template closing tag:

  <xsl:element name="{name()}">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
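For readers who want to see the effect of this template outside an XSLT processor, the following is a minimal Python sketch of the same idea, assuming the standard library's xml.etree.ElementTree module. The function name attributes_to_elements and the sample quote element are hypothetical, and the handling of text and nested elements is an assumption of this sketch, since the rest of the book's stylesheet is not shown here.

```python
import xml.etree.ElementTree as ET

def attributes_to_elements(source):
    """Return a copy of source in which every attribute becomes a child element."""
    result = ET.Element(source.tag)
    result.text = source.text
    # Counterpart of <xsl:element name="{name()}"> wrapping
    # <xsl:value-of select="."/> for each attribute node.
    for name, value in source.attrib.items():
        ET.SubElement(result, name).text = value
    # Recurse so that attributes on nested elements are converted too.
    for child in source:
        converted_child = attributes_to_elements(child)
        converted_child.tail = child.tail
        result.append(converted_child)
    return result

quote = ET.fromstring(
    '<quote source="Macbeth" author="Shakespeare, William">Out, out!</quote>'
)
converted = attributes_to_elements(quote)
print(ET.tostring(converted, encoding="unicode"))
```

The source element is left untouched; the function builds a new tree, which is also how an XSLT transformation treats its input.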


The name() function is one of many functions that can be used in stylesheets. Unlike other types of XML, XPath supports five types of data, even though the data itself remains text:

✦ boolean objects: True or false values.

✦ numbers: Any numeric value.

✦ string: Any string.

✦ node-set: A set of nodes selected by an XPath expression or series of expressions.

✦ external object: A set of nodes returned by an XSLT extension function other than an XPath or XSLT expression. Support for external objects depends on the XSLT processor support for extensions.
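Conversions between these types follow rules that differ from those of most programming languages. The sketch below emulates two XPath 1.0 conversions in Python under stated assumptions: node-sets are modelled as plain lists, the helper names xpath_boolean and xpath_number are hypothetical, and external objects are not covered.

```python
import math
import re

# XPath 1.0 number syntax: optional minus sign, digits with an optional
# fraction, optional surrounding whitespace. No exponents are allowed.
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*")

def xpath_boolean(value):
    """Emulate boolean(): numbers, strings, and node-sets (lists)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # Zero and NaN are false; every other number is true.
        return value != 0 and not math.isnan(value)
    # Strings and node-sets are true when they are not empty.
    return len(value) > 0

def xpath_number(value):
    """Emulate number() for booleans, numbers, and strings."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Anything that is not a plain decimal number becomes NaN.
        return float(value) if _XPATH_NUMBER.fullmatch(value) else math.nan
    raise TypeError("node-sets are not handled in this sketch")

print(xpath_boolean("false"), xpath_number(" 12.5 "), xpath_number("1e3"))
```

Note that the string "false" converts to true, because any non-empty string is true.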

There are also several functions related to each data type that can be used in XSL stylesheets. Table 7-4 describes the functions supported for each data type.

Table 7-4

Functions by Data Type

Function   Description

Boolean Functions
boolean()  Converts an expression to the Boolean data type value and returns true or false.
true()     Boolean true.
false()    Boolean false.
not()      Reverses Boolean true or false: not(true expression) = false, not(false expression) = true.

Number Functions
number()   Converts an expression to a numeric data type value.
round()    Rounds a value up or down to the nearest integer: round(98.49) = 98, round(98.5) = 99.
floor()    Rounds a value down to the nearest integer: floor(98.9) = 98.
ceiling()  Rounds a value up to the nearest integer: ceiling(98.4) = 99.
sum()      Sums the numeric values in a node-set.
count()    Counts the nodes in a node-set.

String Functions
string()               Converts an expression to a string data type value.
format-number()        Converts a numeric expression to a string data type value, using the decimal-format element values as a guide if the decimal-format element is present in a stylesheet.
concat()               Converts two or more expressions to a concatenated string data type value.
string-length()        Counts the characters in a string data type value.
contains()             Checks for a substring in a string. Returns Boolean true or false.
starts-with()          Checks for a substring at the beginning of a string. Returns Boolean true or false.
translate()            Replaces individual characters in a specified string data type value: each character listed in the second argument is replaced by the character at the same position in the third argument, or removed if there is none.
substring()            Retrieves a substring in a specified string data type value, starting at a numeric character position (the first character is position 1) and optionally ending at a specified numeric length after the starting point.
substring-after()      Retrieves all characters in a specified string data type value that occur after the first occurrence of a specified substring.
substring-before()     Retrieves all characters in a specified string data type value that occur before the first occurrence of a specified substring.
normalize-space()      Replaces any tab, newline, and carriage return characters in a string data type value with spaces, collapses each run of spaces into a single space, then removes any leading or trailing spaces from the new string.

Node Set Functions
current()              The current node in a single-node node-set.
position()             The position of the current node in a node-set.
key()                  A node-set defined by the key element.
name()                 The name of the selected node.
local-name()           The name of a node without a prefix, if a prefix exists.
namespace-uri()        The full URI of a node prefix, if a prefix exists.
unparsed-entity-uri()  The URI of an unparsed entity via a reference to the source document DTD, based on the entity name.
id()                   A node-set with nodes that match the id value.


generate-id()          A unique string for a selected node in a node-set. The string is a valid XML name.
lang()                 A Boolean true or false depending on whether the xml:lang attribute for the selected node matches the language identifier provided in an argument.
last()                 The position of the last node in a node-set.
document()             Builds a node tree from an external XML document when provided with a valid document URI.

External Object Functions (Note: These functions may also apply to other data types.)
system-property()      Returns information about the processing environment. Useful when building multi-version and multi-platform stylesheets in conjunction with the fallback element.
element-available()    A Boolean true or false based on whether an XSLT instruction or extension element is supported by the XSLT processor.
function-available()   A Boolean true or false based on whether a function is supported by the XSLT processor.
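Several of the number and string functions in Table 7-4 behave differently from their nearest equivalents in general-purpose languages. The following Python sketch emulates a handful of them according to the XPath 1.0 rules as I understand them. The helper names are hypothetical, and NaN and infinite arguments are deliberately not handled.

```python
import math
import re

def xpath_round(x):
    """round(): nearest integer; a tie goes toward positive infinity."""
    return math.floor(x + 0.5)

def substring(s, start, length=None):
    """substring(): positions are 1-based; start and length are rounded."""
    first = xpath_round(start)
    end = None if length is None else first + xpath_round(length)
    return "".join(
        ch for pos, ch in enumerate(s, start=1)
        if pos >= first and (end is None or pos < end)
    )

def substring_before(s, marker):
    """substring-before(): text before the first occurrence of marker."""
    if marker == "":
        return ""
    head, found, _tail = s.partition(marker)
    return head if found else ""

def substring_after(s, marker):
    """substring-after(): text after the first occurrence of marker."""
    if marker == "":
        return s
    _head, found, tail = s.partition(marker)
    return tail if found else ""

def translate(s, source_chars, target_chars):
    """translate(): character-by-character replacement or removal."""
    table = {}
    for i, ch in enumerate(source_chars):
        if ch not in table:  # the first occurrence of a character wins
            table[ch] = target_chars[i] if i < len(target_chars) else ""
    return "".join(table.get(ch, ch) for ch in s)

def normalize_space(s):
    """normalize-space(): collapse whitespace runs, then trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

print(xpath_round(98.5), round(98.5))  # XPath-style rounding vs. Python's round()
print(substring("12345", 2, 3), translate("bar", "abc", "ABC"))
```

The first print call is worth running: Python's built-in round() resolves ties to the nearest even integer, so it disagrees with the table's round(98.5) example.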

The next segment of the sample stylesheet uses the wildcard to create a template from all child nodes in the document. The copy element is used to copy the contents of the current XML document and apply the predefined templates related to the attribute match (@*) and the current template match (*) while copying by using the select attribute of the apply-templates element. After that, the XSL stylesheet is closed by the stylesheet closing tag.


Listing 7-3: The transformation output document

XSLT Extensions with EXSLT.org

As mentioned earlier in this chapter, the W3C XSLT stylesheet Recommendation will probably be updated from Version 1.0 to Version 2.0 in late 2003. In the meantime, the 1999 1.0 Recommendation has been showing its age. The 1.0 specification does, however, leave room for extensions to existing stylesheet structure and syntax via the external-object data type and the extension-element-prefixes attribute in the stylesheet and transform elements, and the element-available and function-available functions. Many XSLT processors now support external extensions, and a good source of extensions can be found at EXSLT.org. Most extensions take the form of code that acts as add-in modules to existing XSLT processors and support functions that can be used as if they were part of the W3C Recommendation, once the modules are installed. EXSLT.org provides several free-distribution modules, plus setup instructions and function documentation. Developers are also welcome to contribute to the group with their own extensions.


In this chapter, I provided an introduction to XSL and a theoretical overview of XSLT, XSL stylesheet elements, structure, and syntax, XPath axes, functions, and data types, and a few XSLT-specific functions.

✦ All about EXSLT.org

In the next chapter, you'll put all the lessons you have learned so far about XSLT Transformations to use in examples that transform XML to text and HTML. We'll also cover changing the format of XML documents using transformation.


XSL Transformations

In the last chapter, you were introduced to the theory of XSLT, XSL stylesheets, and XPath expressions. In this chapter, you'll apply that theory to real-world examples that will show you how to use XSLT elements, functions, and XPath expressions to transform XML documents to other formats of XML, text, and HTML. The next chapter will extend the HTML examples in this chapter even further by using XSL:FO in our transformations.

All of the XML document and stylesheet examples contained in this chapter can be downloaded from the xmlprogrammingbible.com Website, in the Downloads section.

To Begin

All of the examples in this chapter use the same source XML file, which is the sample XML document I have used in previous chapters. This example starts with a list of selected quotes from William Shakespeare, then goes on to list three books that contain the quotes that are available for purchase from Amazon.com, and a Spanish translation of Macbeth, Romeo and Juliet, Hamlet, and other volumes that are available from http://www.elcorteingles.es. Amazon.com provides a service that returns XML documents based on a URL query, and the Amazon element is based on this format.

The elcorteingles.com book listing format and the quote listing, as well as other parts of the document, are used to illustrate several features of XSLT stylesheet transformations.

I convert the source document into XML, delimited text, and HTML to show you some advanced XSLT tips and tricks.


Listing 8-1 shows the XML document, named AmazonMacbethSpanish.xml, which I will refer back to in the next few examples.

Listing 8-1: The Contents of AmazonMacbethSpanish.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<quotedoc>
  <quotelist author="Shakespeare, William" quotes="4">
    <quote source="Macbeth" author="Shakespeare, William">When the hurlyburly's done, / When the battle's lost and won.</quote>
    <quote source="Macbeth" author="Shakespeare, William">Out, damned spot! out, I say! One; two; why, then 'tis time to do't ; Hell is murky! Fie, my lord, fie! a soldier, and afeard? What need we fear who knows it, when none can call our power to account? Yet who would have thought the old man to have had so much blood in him?</quote>
    <quote source="Macbeth" author="Shakespeare, William">To-morrow, and to-morrow, and to-morrow, creeps in this petty pace from day to day, to the last syllable of recorded time; and all our yesterdays have lighted fools the way to dusty death. Out, out, brief candle! Life's but a walking shadow; a poor player, that struts and frets his hour upon the stage, and then is heard no more: it is a tale told by an idiot, full of sound and fury, signifying nothing.</quote>


<tagged_url>http://www.amazon.com:80/exec/obidos/redirect?tag=associateid&amp;benztechnonogies=9441


A simple technique using xsl:copy-of

One of the simplest ways to start using XSL is to use the xsl:copy-of element to create a new XML document using a subset of a larger XML document. Listing 8-2 shows the contents of the XMLtoQuotes.xsl stylesheet. This stylesheet creates a new XML document containing just the quotes from the sample XML document in Listing 8-1.

Listing 8-2: The Code for the XMLtoQuotes.xsl Stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">


Walking through the transformation, I declare the XSL stylesheet as an XML document, and then declare an xsl: Namespace for the XSL elements in the stylesheet. Next, I specify the output method for the stylesheet as xml, and also specify the encoding for the output as ISO-8859-1, the same as the origin document. Note that the output encoding differs from the stylesheet encoding. This is a good illustration of the fact that the source XML document, the XSL stylesheet, and the transformation output can all be different encoding types if needed. However, it's worth pointing out that XSLT processors are only required to support UTF-8 and UTF-16 encoding, so support for other encodings varies by processor. I also set the indent attribute to "yes". The indent attribute is one of the optional and vague attributes that must be recognized but do not necessarily need to be supported in an XSLT processor. If the indent attribute is set to "yes", the XSLT processor is supposed to perform rudimentary formatting on the XSLT output.
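The independence of content and output encoding is easy to observe outside XSLT as well. The following minimal sketch uses Python's standard xml.etree.ElementTree module rather than an XSLT processor, with a hypothetical one-word element, to serialize the same node tree in two encodings. It illustrates the general point about encodings and says nothing about how any particular XSLT processor handles the encoding attribute of the output element.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose text contains one non-ASCII character.
quote = ET.Element("quote", {"source": "Macbeth"})
quote.text = "coraz\u00f3n"  # the Spanish word for "heart"

# The same node tree written out in two different encodings.
utf8_bytes = ET.tostring(quote, encoding="UTF-8")
latin1_bytes = ET.tostring(quote, encoding="ISO-8859-1")

# The accented character is stored as different bytes in each case.
print(utf8_bytes)
print(latin1_bytes)

# Both serializations parse back to the same text.
same_text = ET.fromstring(utf8_bytes).text == ET.fromstring(latin1_bytes).text
print(same_text)
```

The ISO-8859-1 output carries an XML declaration naming its encoding, which is what lets a parser read it back correctly.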

Stylesheet:

<?xml version="1.0" encoding=

Output XML document result:

<?xml version="1.0"

quotelist element in the source document, which is a child of the quotedoc root element, using the select attribute of the apply-templates element (select="/quotedoc/quotelist/*"):

Stylesheet:

<xsl:template match="/">
  <transformedquotes>
    <xsl:apply-templates select="/quotedoc/quotelist/*">

Output XML document result:

<transformedquotes>


The only template in the stylesheet is called as a result of the apply-templates element. The template is applied to all XML data in the node-set via the match="*" attribute of the template element. In this case, the node-set contains all the child elements of the /quotedoc/quotelist element. The xsl:copy-of element makes a copy of all the nodes in a node-set without exception, including namespaces, attributes, and so on. The select attribute could limit the copy-of element to a specific scope, for example all of the attributes in the node-set, but in this case the select just passes the whole node-set to the transformation output document by using the XPath current node operator (.):

Stylesheet:

<xsl:template match="*">
  <xsl:copy-of select="."/>
</xsl:template>

Output XML document result:

<quote source="Macbeth" author="Shakespeare, William">When the hurlyburly's done, / When the battle's lost and won.</quote>
<quote source="Macbeth" author="Shakespeare, William">Out, damned spot! out, I say! One; two; why, then 'tis time to do't ; Hell is murky! Fie, my lord, fie! a soldier, and afeard? What need we fear who knows it, when none can call our power to account? Yet who would have thought the old man to have had so much blood in him?</quote>
<quote source="Macbeth" author="Shakespeare, William">Is this a dagger which I see before me, the handle toward my hand? Come, let me clutch thee: I have thee not, and yet I see thee still. Art thou not, fatal vision, sensible to feeling as to sight? or art thou but a dagger of the mind, a false creation, proceeding from the heat-oppressed brain?</quote>
<quote source="Macbeth" author="Shakespeare, William">To-morrow, and to-morrow, and to-morrow, creeps in this petty pace from day to day, to the last syllable of recorded time; and all our yesterdays have lighted fools the way to dusty death. Out, out, brief candle! Life's but a walking shadow; a poor player, that struts and frets his hour upon the stage, and then is heard no more: it is a tale told by an idiot, full of sound and fury, signifying nothing.</quote>

Once the template is finished, control is passed back to the template that called the copy-of template, and the hard-coded transformedquotes closing tag is added to the XSLT output. Next, the template and the stylesheet closing tags finish the XSLT process.

Stylesheet:

    </xsl:apply-templates>
  </transformedquotes>
</xsl:template>
</xsl:stylesheet>

Output XML document result:

</transformedquotes>
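The overall effect of the stylesheet (wrap the children of quotelist in a new transformedquotes root, copying each one whole) can be approximated in a few lines of ordinary code. The sketch below is a Python analogue using the standard xml.etree.ElementTree and copy modules, not an XSLT run. The shortened source document, the catalog element, and the copy_quotes function name are hypothetical stand-ins.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for AmazonMacbethSpanish.xml.
SOURCE = """<quotedoc>
  <quotelist author="Shakespeare, William" quotes="2">
    <quote source="Macbeth" author="Shakespeare, William">Out, out, brief candle!</quote>
    <quote source="Macbeth" author="Shakespeare, William">Is this a dagger which I see before me?</quote>
  </quotelist>
  <catalog>not copied</catalog>
</quotedoc>"""

def copy_quotes(source_root):
    """Build a <transformedquotes> tree holding deep copies of the quotes."""
    result = ET.Element("transformedquotes")
    # Counterpart of select="/quotedoc/quotelist/*" followed by
    # <xsl:copy-of select="."/>: each node is copied with its
    # attributes and all of its content.
    for node in source_root.findall("quotelist/*"):
        result.append(copy.deepcopy(node))
    return result

source_root = ET.fromstring(SOURCE)
transformed = copy_quotes(source_root)
print(ET.tostring(transformed, encoding="unicode"))
```

Deep copies are used so that the new tree shares nothing with the source, in the same way that a transformation leaves its input document unchanged.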

Listing 8-3 shows the final XSLT transformation output in its entirety.

Listing 8-3: The XSLT Output Document
