The syntax for XML paths is similar to file paths. XML paths start from the root of the document and list elements along the way. Elements are separated by the “/” character.
The root of the document is “/”. The root is a node that sits before the top-level element. It represents the document as a whole.
The following four paths match, respectively, the title of the article (<title>XML Style Sheets</title>), the keywords of the article, the top-most article element, and all sections in the article. Note that the last path matches several elements in the source tree.
/article/title
/article/keywords
/article
/article/section
TIP
Note that “/” points to the immediate children of a node. Therefore, /article/title selects the main title of the article (XML Style Sheets), not all the titles below the article element. It won’t select the section titles.
To select all the descendants of a node, use the “//” sequence. /article//title selects all the titles in the article: the main title and the section titles.
In the style sheet, most paths don’t start at the root. XSL has the notion of a current element. Paths in the match attribute can be relative to the current element.
Again, this is similar to the file system. Double-clicking the accessories folder in the c:\program files folder moves to the c:\program files\accessories folder, not to c:\accessories.
If the current element is an article, then title matches /article/title, but if the current element is a section, title matches one of the /article/section/title elements.
To match any element, use the wildcard character “*”. The path /article/* matches any direct descendant of article, such as title, keywords, and so on.
It is possible to combine paths in a match with the “|” character; for example, title | p matches title or p elements.
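As a rough sketch of how these kinds of paths can appear in match attributes, the style sheet below uses an absolute path, a relative path, a combined path, and the wildcard. The HTML elements it generates are illustrative only, and the namespace and version attribute follow the XSLT 1.0 Recommendation, which is an assumption: the draft used elsewhere in this chapter may differ.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- absolute path: only the main title of the article -->
  <xsl:template match="/article/title">
    <H1><xsl:apply-templates/></H1>
  </xsl:template>

  <!-- relative path: a title directly below a section -->
  <xsl:template match="section/title">
    <H2><xsl:apply-templates/></H2>
  </xsl:template>

  <!-- combined path: abstract or keywords elements, no output -->
  <xsl:template match="abstract | keywords"/>

  <!-- wildcard: any element not matched by a more specific template -->
  <xsl:template match="*">
    <DIV><xsl:apply-templates/></DIV>
  </xsl:template>

</xsl:stylesheet>
```

The wildcard is used on its own here rather than as /article/* so that it never competes with /article/title for the same node.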
It matches <url protocol="mailto">bmarchal@pineapplesoft.com</url>, which has a protocol attribute with the value “mailto”, but it does not match <url>http://www.w3.org/Style</url>. The more generic url path would match the latter element.
url[@protocol] matches url elements that have a protocol attribute, no matter what its value is. It would match <url protocol="http">www.w3.org/Style</url> but it would not match <url>http://www.w3.org/Style</url>.
Matching Text and Functions
Functions restrict paths to specific elements. The following two paths are identical and select the text of the title of the second section in the document (Styling).
/article/section[position()=2]/title/text()
/article/section[2]/title/text()
Most functions can also take a path as an argument. For example, count(//title) returns the number of title elements in the document. Table 5.1 lists some of the most common functions.
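To see these functions at work, a small style sheet along the following lines could print the values of the paths just discussed. It is only a sketch: it assumes a processor that follows the XSLT 1.0 Recommendation (including xsl:output with the text method), and the labels it writes are made up for the illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- plain text output, so that only the computed values are written -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- title of the second section, long form of the predicate -->
    <xsl:text>long form: </xsl:text>
    <xsl:value-of select="/article/section[position()=2]/title/text()"/>
    <!-- same selection, abbreviated predicate -->
    <xsl:text>; short form: </xsl:text>
    <xsl:value-of select="/article/section[2]/title/text()"/>
    <!-- a function that takes a path as its argument -->
    <xsl:text>; number of titles: </xsl:text>
    <xsl:value-of select="count(//title)"/>
  </xsl:template>

</xsl:stylesheet>
```

Both forms of the predicate should print the same text, since [2] is shorthand for [position()=2].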
Table 5.1: Most common XSL functions
last()          returns the position of the last node in the current node set
starts-with()   returns true if the first argument starts with the second
</xsl:functions>
CAUTION
Be aware that this element was still very much in flux in the draft we used to prepare this chapter.
Deeper in the Tree
After loading the style sheet, the XSL processor loads the source document. Next, it walks through the source document from root to leaf nodes. At each step it attempts to match the current node against a template.
If there is a match, the processor generates the nodes in the resulting tree. When it encounters xsl:apply-templates, it moves to the children of the current node and repeats the process; that is, it attempts to match them against a template.
In other words, xsl:apply-templates is a recursive call to the style sheet.
A recursive approach is natural for manipulating trees. You might have recognized a depth-first search algorithm. Figure 5.6 illustrates how it works.
Figure 5.6: Walking down the input tree
Following the Processor
Let’s follow the XSL processor for the first few templates in the style sheet. After loading the style sheet and the source document, the processor positions itself at the root of the source document. It looks for a template that matches the root and immediately finds one.
When it encounters xsl:apply-templates, the processor moves to the first child of the current node. The first child of the root is the top-level element, that is, the article element.
The style sheet defines no template for the article, but the processor can match the article against a built-in template. Built-in templates are not defined in the style sheet. They are predefined by the processor.
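For reference, the built-in template for elements and the root behaves as if the style sheet contained the rule below. This is how the XSLT 1.0 Recommendation describes it; the draft used for this chapter may have worded it differently, so treat it as a sketch rather than the processor's actual code.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- equivalent of the built-in rule: generate nothing for the node
       itself, just continue with its children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Because the rule only calls xsl:apply-templates, an element without a template of its own leaves no trace in the resulting tree, but its children are still processed.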
The first and only child of title is a text node. The style sheet has no rule to match text but there is another built-in template that copies the text in the resulting tree.
<xsl:template match="text()">
  <xsl:value-of select="."/>
</xsl:template>
The title’s text has no children so the processor cannot go to the next level. It backtracks to the article element and moves to the next child: the date element. This element matches the last template.
<xsl:template match="abstract | date | keywords | copyright"/>
This template generates no output in the resulting tree and stops processing for the current element.
The processor backtracks again to article and processes its other children: copyright, abstract, keywords, and section. Copyright, abstract, and keywords match the same rule as date and generate no output in the resulting tree.
The section element, however, matches the default template and the processor moves to its children, title and p elements. The processor continues to match rules with nodes until it has exhausted all the nodes in the original document.
Creating Nodes in the Resulting Tree
Sometimes it is useful to compute the value or the name of new nodes. The following template creates an HTML anchor element that points to the URL. The anchor has two attributes. The first one, TARGET, is specified directly in the template. However, the processor computes the second attribute, HREF, when it applies the rule.
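A template of this kind might look like the sketch below. It assumes that the URL is the text of the url element and that the link text repeats the URL; the namespace and version attribute are those of the XSLT 1.0 Recommendation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="url">
    <!-- TARGET is written literally in the template -->
    <A TARGET="_blank">
      <!-- HREF is computed from the source when the rule is applied -->
      <xsl:attribute name="HREF">
        <xsl:value-of select="."/>
      </xsl:attribute>
      <!-- link text: the content of the url element -->
      <xsl:value-of select="."/>
    </A>
  </xsl:template>

</xsl:stylesheet>
```

Note that xsl:attribute comes first inside the anchor: attributes have to be created before any content is added to the element.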
Table 5.2 lists other XSL elements that compute nodes in the resulting tree.
Table 5.2: XSL elements to create new objects
xsl:processing-instruction creates a processing instruction
inserting a variable
alternatives
Priority
There are rules to prioritize templates. Without going into too many details, templates with more specific paths take precedence over less specific templates. In the following example, the first template has a higher priority than the second template because it matches an element with a specific attribute.
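A pair of templates in that situation could be sketched as follows. The url element and its protocol attribute come from the article document; the bodies of the templates are assumptions made for the illustration. Under the XSLT 1.0 Recommendation, a path with a predicate has a default priority of 0.5 whereas a bare element name has a default priority of 0, which is why the first template wins for mailto links.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- more specific: only url elements whose protocol is "mailto" -->
  <xsl:template match="url[@protocol='mailto']">
    <A>
      <xsl:attribute name="HREF">mailto:<xsl:value-of select="."/></xsl:attribute>
      <xsl:value-of select="."/>
    </A>
  </xsl:template>

  <!-- less specific: every other url element -->
  <xsl:template match="url">
    <A TARGET="_blank">
      <xsl:attribute name="HREF"><xsl:value-of select="."/></xsl:attribute>
      <xsl:value-of select="."/>
    </A>
  </xsl:template>

</xsl:stylesheet>
```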
Supporting a Different Medium
Recall that my original problem is to provide both an HTML and a text version of the document. We have seen how to automatically create an HTML version of the document; now it’s time to look at the text version.
Text Conversion
CAUTION
Text conversion stretches the concept of XML-to-XML conversion; therefore, you have to be careful in writing the style sheet.
Listing 5.4 is the text style sheet. It is very similar to the previous style sheet except that it inserts only text nodes, no XML elements, in the resulting tree.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform/">
=== XML Style Sheets ===
Send comments and suggestions to <bmarchal@pineapplesoft.com>.
*** Styling ***
Style sheets are inherited from SGML, an XML ancestor. Style sheets originated
➥in publishing and document management applications. XSL is XML’s standard style
➥sheet, see [http://www.w3.org/Style].
*** How XSL Works ***
An XSL style sheet is a set of rules where each rule specifies how to format
➥certain elements in the document. To continue the example from the previous
➥section, the style sheets have rules for title, paragraphs and keywords.
With XSL, these rules are powerful enough not only to format the document
➥but also to reorganize it, e.g. by moving the title to the front page or
➥extracting the list of keywords. This can lead to exciting applications of XSL
➥outside the realm of traditional publishing. For example, XSL can be used to
➥convert documents between the company-specific markup and a standard one.
*** The Added Flexibility of Style Sheets ***
Style sheets are separated from documents. Therefore one document can have more
➥than one style sheet and, conversely, one style sheet can be shared amongst
➥several documents.
This means that a document can be rendered differently depending on the media or
➥the audience. For example, a “managerial” style sheet may present a summary
➥view of a document that highlights key elements but a “clerical” style sheet
➥may display more detailed information.
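Text output of this kind can be produced by templates that insert only text nodes. The following is a minimal sketch, not the chapter's listing: it assumes a processor that supports xsl:output with the text method as defined in the XSLT 1.0 Recommendation, and the exact line breaks it produces may differ from the output above.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- write text only: no tags, no escaping of < and > -->
  <xsl:output method="text"/>

  <!-- main title, framed with equal signs -->
  <xsl:template match="article/title">
    <xsl:text>=== </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text> ===&#10;</xsl:text>
  </xsl:template>

  <!-- section titles, framed with asterisks -->
  <xsl:template match="section/title">
    <xsl:text>*** </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text> ***&#10;</xsl:text>
  </xsl:template>

  <!-- paragraphs: content followed by a line break -->
  <xsl:template match="p">
    <xsl:apply-templates/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- email addresses between angle brackets -->
  <xsl:template match="url[@protocol='mailto']">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&gt;</xsl:text>
  </xsl:template>

  <!-- other URLs between square brackets -->
  <xsl:template match="url">
    <xsl:text>[</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>]</xsl:text>
  </xsl:template>

  <!-- elements that generate no output -->
  <xsl:template match="abstract | date | keywords | copyright"/>

</xsl:stylesheet>
```

Literal text is wrapped in xsl:text so that the indentation of the style sheet itself does not leak into the result.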
Customized Views
Currently, most people access the Web through a browser on a Windows PC. Some people use Macintoshes, others use UNIX workstations. This will change in the future as more people turn to specialized devices. Already WebTV has achieved some success with a browser in a TV set.
Mobile phones and PDAs, such as the popular PalmPilot, will be increasingly used for Web browsing. Ever tried surfing on a PalmPilot? It works surprisingly well but, on the small screen, many Web sites are not readable enough.
One solution to address the specific needs of smaller devices might be to use XHTML, a simplified, XML-based version of HTML. XHTML is based on HTML but it has an XML syntax (as opposed to an SGML syntax). It is also designed to be modular, as it is expected that smaller devices will implement only a subset of the recommendation.
According to the W3C, these new platforms might account for up to 75% of Web viewing by the year 2002. What can you do about it? Will you have to maintain several versions of your Web site: one for existing Web browsers and one for each new device with its own subset?
XSL to the rescue! It will be easy to manage the diversity of browsers and platforms by maintaining the document source in XML and by converting to the appropriate XHTML subset with XSLT. In essence, this is how I manage the e-zine. Figure 5.7 illustrates how this works.
Figure 5.7: Maintain one XML document and convert it to the appropriate
markup language.
Where to Apply the Style Sheet
So far, we have converted the XML documents before publishing them. The client never sees XML; it manipulates only HTML.
Today, this is the realistic option because few users have an XML-enabled browser such as Internet Explorer 5.0 or a beta version of Mozilla 5.0 (Mozilla is the open source version of Netscape Communicator). Furthermore, the XSL recommendation is not final yet so implementations of XSL processors are not always compatible with one another.
Yet, if your users have XML-enabled browsers, it is possible to send them raw XML documents and style sheets. The browser dynamically applies the style sheets and renders the documents. Figure 5.8 contrasts the two options.
Figure 5.8: Style sheets on the server or on the client
Internet Explorer 5.0
CAUTION
Because XSL is still in draft, browser implementations are not compatible. The material in this section works with Internet Explorer 5.0, which implements an early draft of XSL and is not compatible with the current draft, much less with the future recommendation.
The processing instruction xml-stylesheet associates a style sheet with the current document. It takes two parameters, an href to the style sheet and the type of the style sheet (text/xsl, in this case).
<?xml-stylesheet href="simple-ie5.xsl" type="text/xsl"?>
Listing 5.6 is the XML document with the appropriate processing instruction for Internet Explorer 5.0.
Listing 5.6: The XML Document Prepared for Internet Explorer 5.0
<?xml version="1.0"?>
<?xml-stylesheet href="simple-ie5.xsl" type="text/xsl"?>
<article fname="19990101_xsl">
<title>XML Style Sheets</title>
<date>January 1999</date>
<copyright>1999, Benoît Marchal</copyright>
<abstract>Style sheets add flexibility to document viewing.</abstract>
<keywords>XML, XSL, style sheet, publishing, web</keywords>
<section>
<p>Send comments and suggestions to <url protocol="mailto">bmarchal@pineapplesoft.com</url>.</p>
</section>
<section>
<title>Styling</title>
<p>Style sheets are inherited from SGML, an XML ancestor. Style sheets
➥originated in publishing and document management applications. XSL is XML’s
➥standard style sheet, see <url>http://www.w3.org/Style</url>.</p>
</section>
<section>
<title>How XSL Works</title>
<p>An XSL style sheet is a set of rules where each rule specifies how to format
➥certain elements in the document. To continue the example from the previous
➥section, the style sheets have rules for title, paragraphs and keywords.</p>
<p>With XSL, these rules are powerful enough not only to format the document
➥but also to reorganize it, e.g. by moving the title to the front page or
➥extracting the list of keywords. This can lead to exciting applications of XSL
➥outside the realm of traditional publishing. For example, XSL can be used to
➥convert documents between the company-specific markup and a standard one.</p>
</section>
<section>
<title>The Added Flexibility of Style Sheets</title>
<p>Style sheets are separated from documents. Therefore one document can have
➥more than one style sheet and, conversely, one style sheet can be shared
➥amongst several documents.</p>
<p>This means that a document can be rendered differently depending on the media
➥or the audience. For example, a “managerial” style sheet may present a summary
➥view of a document that highlights key elements but a “clerical” style sheet
➥may display more detailed information.</p>
</section>
</article>
Furthermore, the style sheet must be adapted to the older version of XSL that Internet Explorer supports. Listing 5.7 is the adapted style sheet. Figure 5.9 shows the result in Internet Explorer.
Listing 5.7: XSLT Style Sheet for Internet Explorer 5.0
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"
Figure 5.9: Internet Explorer 5.0 renders XML.
Changes to the Style Sheet
The style sheet has been adapted in two places. First, the XSL namespace points to an earlier version of XSL.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"
Advanced XSLT
XSLT is a powerful transformation mechanism. So far, we have only used a subset of it. Our resulting document follows a structure that is close to the original document. Elements might have been added or removed from the tree but they are not reorganized.
Yet, it is often useful to reorganize the source document completely. For example, we might want to create a table of contents at the beginning of the document.
This is possible with the xsl:value-of element. xsl:value-of inserts arbitrary elements from the source tree anywhere in the resulting tree.
Listing 5.8 is a more sophisticated style sheet that, among other things, creates a table of contents.
Listing 5.8: A More Powerful XSLT Style Sheet
You can use LotusXSL to apply this style sheet. It generates the HTML document in Listing 5.9. Figure 5.10 shows the result in a browser.
Listing 5.9: The Resulting HTML Document
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>XML Style Sheets ( January 1999 )</TITLE>
<META name="keywords" content="XML, XSL, style sheet, publishing, web">
<LI><A href="#N-634270327">How XSL Works</A></LI>
<LI><A href="#N-653406839">The Added Flexibility of Style Sheets</A></LI>
</UL>
<P>Send comments and suggestions to <A href="mailto:bmarchal@pineapplesoft.com">bmarchal@pineapplesoft.com</A>.</P>
<P><I><A name="N-614609527">Styling</A></I></P>
<P>Style sheets are inherited from SGML, an XML ancestor. Style sheets
➥originated in publishing and document management applications. XSL is XML’s
➥standard style sheet, see <A target="_blank"
➥href="http://www.w3.org/Style">http://www.w3.org/Style</A>.</P><P><I><A name=
➥"N-634270327">How XSL Works</A></I></P>
<P>An XSL style sheet is a set of rules where each rule specifies how to format
➥certain elements in the document. To continue the example from the previous
➥section, the style sheets have rules for title, paragraphs and keywords.</P>
<P>With XSL, these rules are powerful enough not only to format the document
➥but also to reorganize it, e.g. by moving the title to the front page or
➥extracting the list of keywords. This can lead to exciting applications of XSL
➥outside the realm of traditional publishing. For example, XSL can be used to
➥convert documents between the company-specific markup and a standard one.</P>
<P><I><A name="N-653406839">The Added Flexibility of Style Sheets</A></I></P>
<P>Style sheets are separated from documents. Therefore one document can have
➥more than one style sheet and, conversely, one style sheet can be shared
➥amongst several documents.</P>
<P>This means that a document can be rendered differently depending on the media
➥or the audience. For example, a “managerial” style sheet may present a summary
➥view of a document that highlights key elements but a “clerical” style sheet
➥may display more detailed information.</P>
<P>Copyright ©1999, Benoît Marchal</P></BODY></HTML>
Figure 5.10: The resulting HTML document in a browser
Declaring HTML Entities in a Style Sheet
This style sheet has an internal DTD to declare the copy entity, an HTML entity. HTML has many entities that XML does not recognize.
<!DOCTYPE xsl:stylesheet [
<!ENTITY copy "©">
]>
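Once the entity is declared, it can be used anywhere in the style sheet. The sketch below shows one way a copyright line could be generated with it. The entity is declared here with a character reference (&#169;), which is equivalent to the literal character, and the template body is an assumption made for the illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY copy "&#169;">
]>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- &copy; is expanded by the XML parser before the XSL
       processor sees the template -->
  <xsl:template match="copyright">
    <P>Copyright &copy;<xsl:value-of select="."/></P>
  </xsl:template>

</xsl:stylesheet>
```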
Reorganizing the Source Tree
The list of keywords must appear in an HTML META element. The following example extracts the keywords from the source tree with the xsl:value-of element.
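A template that does this could be sketched as follows, assuming the keywords element is a direct child of article as in the sample document. The attribute names match the HTML result shown earlier; the rest of the template is illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <HTML>
      <HEAD>
        <TITLE><xsl:value-of select="article/title"/></TITLE>
        <!-- the keywords move from the body of the source document
             into an attribute in the head of the result -->
        <META name="keywords">
          <xsl:attribute name="content">
            <xsl:value-of select="article/keywords"/>
          </xsl:attribute>
        </META>
      </HEAD>
      <BODY>
        <xsl:apply-templates/>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- keywords were already used above, so generate nothing here -->
  <xsl:template match="keywords"/>

</xsl:stylesheet>
```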
Calling a Template
When the same styling instructions are used at different places, group them in a named template. For example, titles appear in the HTML title and in the body of the document.
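As an illustration, the title formatting could be grouped in a named template and called twice, roughly as below. The template name "title" and the H1 element are assumptions; xsl:call-template is the XSLT 1.0 element for invoking a template by name.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- named template: written once, called from several places -->
  <xsl:template name="title">
    <xsl:value-of select="/article/title"/>
    <xsl:text> ( </xsl:text>
    <xsl:value-of select="/article/date"/>
    <xsl:text> )</xsl:text>
  </xsl:template>

  <xsl:template match="/">
    <HTML>
      <HEAD>
        <!-- first use: the HTML title -->
        <TITLE><xsl:call-template name="title"/></TITLE>
      </HEAD>
      <BODY>
        <!-- second use: the heading in the body -->
        <H1><xsl:call-template name="title"/></H1>
      </BODY>
    </HTML>
  </xsl:template>

</xsl:stylesheet>
```

The named template uses absolute paths so that it returns the same text wherever it is called from.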
xsl:include must be a direct child of xsl:stylesheet; it cannot appear in xsl:template, for example.
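The placement rule can be pictured with the short sketch below. The file name common.xsl is hypothetical and stands for any style sheet that holds shared templates.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- allowed: xsl:include is a direct child of xsl:stylesheet -->
  <xsl:include href="common.xsl"/>

  <xsl:template match="/">
    <!-- xsl:include would be an error at this point,
         inside xsl:template -->
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```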
Repetitions
Sometimes a path points to several elements. For example, article/section/title points to the three section titles. To loop over the elements, use xsl:for-each. The following rule builds a table of contents with section titles:
<LI><A href="#N-634270327">How XSL Works</A></LI>
<LI><A href="#N-653406839">The Added Flexibility of Style Sheets</A></LI>
xsl:for-each has a select attribute so it needs a fully qualified path. However, within the loop, the current element is the selection, which xsl:value-of retrieves through the “.” path.
The template also introduces the generate-id() function. The function returns a unique identifier for the current node.
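Put together, a loop that builds such a table of contents might look like the following sketch. It relies on attribute value templates (the curly braces) from the XSLT 1.0 Recommendation, which is an assumption about the processor; the surrounding HTML is illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <UL>
      <!-- one list item per section title -->
      <xsl:for-each select="article/section/title">
        <!-- inside the loop the current node is the title, so
             generate-id() identifies it and "." returns its text -->
        <LI><A href="#{generate-id()}"><xsl:value-of select="."/></A></LI>
      </xsl:for-each>
    </UL>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- the same title node yields the same identifier here,
       so each link in the list points at its anchor -->
  <xsl:template match="section/title">
    <P><I><A name="{generate-id()}"><xsl:value-of select="."/></A></I></P>
  </xsl:template>

</xsl:stylesheet>
```

The identifiers are only guaranteed to be consistent within one run of the processor, which is all the table of contents needs.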
Using XSLT to Extract Information
As the various examples in this chapter illustrate, XSLT is a very powerful and flexible mechanism that serves many purposes.
Indeed, XSLT is not limited to styling. It also can be used to extract information from XML documents.
Imagine I need to generate an index of articles. The XSLT solution is a two-step process. In the first step, a style sheet extracts useful information from the documents. Extracting information can be thought of as transforming a large XML document into a smaller one.
The first step creates as many extract documents as there are originals. The next step is to merge all the extracts in one listing. It is then a simple matter to convert the listing to HTML. Figure 5.11 illustrates the process.
Figure 5.11: How using XSL can extract information from XML documents
A NEW STANDARD
The W3C is working on a new standard, XQL, the XML Query Language, that will offer a better solution to this problem. XQL can query multiple documents stored in an XML database.
XQL will use paths similar or identical to XSLT so it will be familiar. Because it works across several documents, XQL is really designed for XML databases. XML databases store documents in binary format to provide faster access.
To experiment with XQL without buying a database, you can download the GMD-IPSI XQL engine from xml.darmstadt.gmd.de/xml. The engine is written in Java but it has a command-line interface.
Listing 5.10 is a style sheet to extract the URL, the title, the abstract, and the date of an article document.
Listing 5.10: Style Sheet to Extract Data
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform/">
➥for %%0 in (%files%) do %xslprocessor% -in %%0.xml -out %%0.xtr
➥-xsl extract.xsl
copy opening.tag index.xml
for %%0 in (%files%) do copy index.xml /a + %%0.xtr
➥/a index.xml /a
copy index.xml + closing.tag index.xml
➥offering from Sun. Jini extends Java towards distributed computing in innovative
➥ways. In particular, Jini builds on the concept of “spontaneous
What’s Next
In this chapter, you learned how to use the transformation half of XSL. The next chapter is dedicated to styling XML directly, no conversion required, with CSS and XSLFO.
The combination of XSLT and CSS gives you total control over how your document is displayed.