Contents
Overview
Lesson: Overview of XML Parsing
Lesson: Parsing XML Using XmlTextReader
Lesson: Creating a Custom Reader
Review
Module 2: Parsing XML
Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
©2002 Microsoft Corporation. All rights reserved.
Microsoft, MS-DOS, Windows, Windows NT, Win32, Active Directory, ActiveX, BizTalk, IntelliSense, JScript, Microsoft Press, MSDN, PowerPoint, SQL Server, Visual Basic, Visual C#, and Visual Studio are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Instructor Notes
After completing this module, students will be able to:
Create a Stream object from an XML file
Build a mutable string by using the StringBuilder object
Handle errors in the form of XML
Parse XML as text by using the XmlTextReader object
Create a custom XmlReader object
Required materials
To teach this module, you need the following materials:
Microsoft® PowerPoint® file 2663A_02.ppt
2663A_02_Code.htm
Preparation tasks
To prepare to effectively teach this module:
Read the following Microsoft .NET Framework Class Library topics:
• XmlReader Class
• XmlTextReader
• StringBuilder Class
Read all of the materials for this module
Complete the practices and the lab
Practice delivering the demonstrations
Hyperlinked Code Examples
In this module, some of the Microsoft PowerPoint® slides provide hyperlinks that open a code samples page in the Web browser. The code samples page provides a way to show and discuss code samples when there is not enough space for the code on the PowerPoint slide. It also allows students to copy code samples directly from the browser window and paste them into a development environment. All of the linked code samples for this module are in a single .htm file.
To open a code sample, click the appropriate hyperlink on the slide. To navigate between code samples in a particular language, use the table of contents provided at the top of the code page. Each hyperlink opens a separate instance of the Web browser, so it is a good practice to click the Back button in Microsoft Internet Explorer after viewing a code sample. This will close the browser window and return you to the PowerPoint presentation.
How to Teach This Module
This section contains information that will help you to teach this module.
Lesson: Overview of XML Parsing
This section describes the instructional methods for teaching each topic in this lesson.
This topic introduces the module by defining the technical problem of parsing XML. Most students will already understand what parsing is and why they would do it.
This topic introduces XmlReader by comparing it with the Simple application programming interface (API) for XML, or SAX, which many students are already familiar with. Many students should also already be aware of the two models of XML parsing, the push model and the pull model. This topic compares SAX, as an example of the push model, to the Microsoft .NET Framework XmlReader class, as an example of the pull model of XML processing. As the lesson progresses, students who have previous experience writing a SAX application might be able to help you point out the advantages of XmlReader.
Briefly cover the major features of the XmlReader class. Students might ask about the technique of using XmlValidatingReader with a ValidationEventHandler, which is covered in the next module.
How to Read Streams
We cover reading XML from streams early, because it is a basic skill. Be prepared to provide a definition of a stream.
How to Build Strings
Another basic skill is creating and appending parsed XML by using a StringBuilder object. StringBuilder is preferred over the String object for repeated concatenation, because a String is immutable and every concatenation allocates a new String object. StringBuilder lets you append content to the same object without creating a new one each time.
Lesson: Parsing XML Using XmlTextReader
This section describes the instructional methods for teaching each topic in this lesson.
This demonstration consists of showing typical usage of three functions of a Microsoft Visual Studio® .NET add-in that was custom-built for this course. To prepare for this demonstration, you should perform the demonstration steps as they are written and prepare to explain what the add-in does.
Do not walk through the code during the demonstrations. There are separate code examinations in which you will do just that.
For more information about the add-in, see Appendix A, “The XML Tools Add-In.”
Show how to instantiate a new XmlTextReader.
Discuss the Read() method.
How to Determine the Current Node Type
Discuss the NodeType property.
How to Read the
Discuss how to use the Name, Value, and Attributes properties to read the
To change the display options
1. On the Tools menu, click Options.
2. Click the Text Editor folder, and then click the HTML/XML folder.
3. Select the Word wrap and Line numbers options.
While in the Code window, pressing CTRL+R twice will toggle word wrap on and off.
4. Click the Environment folder, and then click the Fonts and Colors folder.
5. Change the font used for the Text Editor and the Text Output Tool Windows to Lucida Console 14 pt.
6. Click OK.
7. Close and restart Visual Studio .NET for the changes to take effect.
Lesson: Creating a Custom Reader
This section describes the instructional methods for teaching each topic in this lesson.
Be prepared to provide one or two anecdotes that illustrate the need for a custom reader.
Discuss the types of XmlReader you can inherit from and the mechanics of overriding the Read() method.
Be prepared to explain how the Read() method exposes the attribute as an element node type by using the Name and Value properties of the reader.
Overview
Overview of XML Parsing
Parsing XML Using XmlTextReader
Creating a Custom Reader
***************************** ILLEGAL FOR NON - TRAINER USE ******************************
Introduction
This module discusses how to parse Extensible Markup Language (XML) data from a file, string, or stream by using the XmlTextReader class. The XmlNodeReader object is not covered in this module, but it works in a similar way to the XmlTextReader object.
Both the XmlTextReader and XmlNodeReader objects inherit from XmlReader. If these descendant objects do not provide the needed functionality, you can create a custom reader object that inherits from XmlReader.
Objectives
After completing this module, you will be able to use the Microsoft® .NET Framework to:
Create a Stream object from an XML file
Build a mutable string by using the StringBuilder object
Handle errors in the form of XML
Parse XML as text by using the XmlTextReader object
Create a custom XmlReader object
Lesson: Overview of XML Parsing
Introduction to XML Parsing
XML Parsing Models
Parsing XML with the XmlReader Class
How to Read Streams
How to Build Strings from Parsed XML
Introduction
The XmlReader base class and the objects that inherit from it are a powerful set of tools for parsing XML. This lesson discusses how to use the XmlReader and supporting classes to parse XML in a variety of contexts.
Lesson objectives
After completing this lesson, you will be able to:
Read XML from a File object
Read XML from a Stream object
Store XML in a StringBuilder object
Introduction to XML Parsing
Parsing and reading XML mean the same thing
Parse XML to find content and to use node information
Create a list by node type
Sort nodes by namespace identifier
List all of the child elements in an XML source
Find a node by relative position
Find the last node to signal when to stop parsing
Introduction
What does it mean to parse XML? Parsing refers to the process of reading XML and then performing some action based on the information read.
When you parse XML, you often filter the data in an attempt to locate a particular data value or range of values. At other times, you might be more interested in the node information that the parser finds. The term node, when used in this context, refers to a node as defined by the World Wide Web Consortium (W3C) XML Information Set Recommendation, available at http://www.w3.org/TR/xml-infoset
Find particular content
Parsing XML allows you to query an XML source to find a particular data value. For example, suppose that you must build an application that can query a local store of XML-based human resources data. Parsing the XML should allow you to find a particular value, such as the record that is associated with an employee number that is equal to “12345.”
Parsing also allows you to filter an XML source to find a set of related information. For example, you might want to filter a personnel listing to find those employees whose hire date falls within the current month.
Make use of node information
Parsing allows you to use the node information in an XML source, such as the node type or node value. The following are useful tasks that you can accomplish by using node information made available by parsing:
Use node information to create a list by node type
Sort nodes by namespace identifier
List all of the child elements in an XML source
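The first of these tasks can be sketched in a few lines of C#. This is an illustrative sketch only: the class name NodeTypeCounter, the method CountByNodeType, and the sample XML are assumptions made for this example, and the code uses generic collections and using blocks from later versions of the .NET Framework than the one this course targets. It reads XML from a string one node at a time and tallies the nodes by node type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of the course code.
public static class NodeTypeCounter
{
    // Walks the XML once and counts how many nodes of each type are read.
    public static Dictionary<XmlNodeType, int> CountByNodeType(string xml)
    {
        var counts = new Dictionary<XmlNodeType, int>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                int current;
                // current stays 0 when the node type has not been seen yet
                counts.TryGetValue(reader.NodeType, out current);
                counts[reader.NodeType] = current + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        // Sample data invented for this sketch
        string xml = "<employees><employee id=\"12345\"><name>Ann</name></employee></employees>";
        foreach (KeyValuePair<XmlNodeType, int> pair in CountByNodeType(xml))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

Note that Read() does not visit attributes as separate nodes, so the id attribute in the sample does not appear in the counts.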
[Slide diagram: XML parsing models]
Push model: the SAX XML reader pushes unfiltered XML to the calling application.
Pull model: the application generates calls to XmlReader that pull specific XML. The XmlReader class (XmlTextReader, XmlNodeReader, XmlValidatingReader) pulls the specified XML and implements error handling. The application processes nodes, handles errors, and monitors the state of the reader.
Handlers shown in the diagram: Content Handler, Error Handler, Node Handler
Introduction
XML processors are based on the push model or the pull model of XML processing. The push model is typified by a processor that uses the Simple application programming interface (API) for XML, referred to as SAX. The pull model is typified by how the .NET Framework XML reader classes process XML.
What is the push model of XML processing?
The push model of XML processing means that the parser “pushes” to the application an unfiltered, steady stream of parsed XML nodes. SAX is an example of a parser that does this. SAX pushes unfiltered XML nodes in response to a request by an application.
You must write applications that consume unfiltered XML nodes to filter relevant node information and content. The push model assumes that there is perfectly formed XML. If the SAX processor finds an XML error, it immediately stops processing and then sends an exception to the calling application. You should write any application that uses the push model of XML processing to handle a variety of XML errors.
SAX is not supported by the .NET Framework, but you can use existing SAX tools, such as the Microsoft XML Parser (MSXML), in your .NET-based programs.
What is the pull model of XML processing?
The pull model of XML processing means that the parser pulls from the XML source only those nodes that it is instructed to pull by a calling application. XmlReader, a .NET Framework class, is an example of a parser that pulls a filtered set of XML nodes in response to a request by an application.
XmlReader objects read the XML one node at a time and only send notification to the application in response to some predefined criteria. Similar to the SAX processor, if an XmlReader object finds an XML error, it sends an exception to the calling application. Unlike the SAX processor, an XmlReader can continue processing after an error is found, but only for validation errors that an XmlValidatingReader reports through a ValidationEventHandler. A well-formedness error stops processing.
There are two main advantages of using the XmlReader pull model over the push model as it is implemented by SAX. First, it is easier to code applications that use the XmlReader. XmlReader pull-processing is typically implemented by using looping structures, whereas push models use routines that handle state. Looping structures are easier to write than routines that handle state. Although contextual state management is still a challenge with the pull model, managing the context is easier to code by using consumer-driven procedural techniques.
Second, applications that use XmlReader can potentially perform better because they require less processing power and memory than applications that rely on SAX. Applications that use XmlReader can take advantage of client hints to make more efficient use of character buffers; for example, by avoiding needless string copies. Consumers can also selectively process elements; for example, by skipping elements of no interest and by not expanding entities. With a push model, everything must be passed through the application, because the reader has no way of knowing what is important.
If you still prefer to use a push model, you can layer a set of push-style interfaces on top of the XmlReader pull model, but the reverse is not yet true. A sample SAX2 implementation layered over an XmlReader may ship with the .NET Framework software development kit (SDK).
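The looping structure and selective processing described for the pull model can be sketched as follows. This is a minimal C# sketch under stated assumptions: the class PullModelSketch, the method ReadTitles, and the element names title and notes are invented for illustration, and the code uses generics and using blocks from later versions of the .NET Framework. The application drives the loop, asks the reader to skip subtrees it does not care about, and pulls only the text it wants.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical example class; element names are assumptions for this sketch.
public static class PullModelSketch
{
    // Returns the text of every <title> element, ignoring anything inside <notes>.
    public static List<string> ReadTitles(string xml)
    {
        var titles = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // The consumer, not the parser, decides when to advance.
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "notes")
                {
                    // Skip the whole subtree; the reader lands on the node after </notes>.
                    reader.Skip();
                }
                else if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // Reads a text-only element and moves past its end tag.
                    titles.Add(reader.ReadElementString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return titles;
    }

    public static void Main()
    {
        string xml = "<books><book><title>A</title><notes><title>hidden</title></notes></book>"
                   + "<book><title>B</title></book></books>";
        foreach (string title in ReadTitles(xml))
        {
            Console.WriteLine(title);
        }
    }
}
```

Because Skip() and ReadElementString() both leave the reader positioned on the following node, the loop calls Read() only in the remaining case. A push-model handler would instead receive every node, including those inside notes, and would have to track state to ignore them.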
Summary of pull model benefits
The following table summarizes the primary benefits of the pull model.
Benefit Description
complex state machines. The pull model client simplifies state management by means of a natural, top-down procedural refinement.
input streams. This task is extremely complicated in the push model.
reverse is not true.
parser writes a string object to its own buffer. Then, the parser pushes the string object to the client buffer. In the pull model, the string is read into the parser buffer one time only.
attributes, processing instructions, and white space. The pull model client can skip items, processing only those items that are of interest to the application. This allows for extremely efficient applications.
Parsing XML with the XmlReader Class
What is XmlReader?
An abstract base class
Extends to these XML readers: XmlTextReader, XmlNodeReader, and XmlValidatingReader
Can be used to create customized readers
Non-cached, forward-only, read-only access
Allows you to pull only those nodes that interest you
The XmlReader class is an abstract base class that provides non-cached, forward-only, read-only access to XML sources, including streams, files, and Uniform Resource Locators (URLs). It implements the namespace requirements outlined in the Namespaces in XML Recommendation provided by the W3C, located at http://www.w3.org/TR/REC-xml-names/
XmlReader class objects can quickly read data from XML sources without placing high demands on system resources such as memory and CPU time.
Because XmlReader is an abstract base class, you can derive your own type of reader from it or use one of the classes that extend it. The XmlReader class has three implementations that extend the base class and vary in their design to support different scenario needs.
The following table describes the implementations of the XmlReader class.
Class    Description
XmlTextReader    Reads character streams. This is a forward-only reader that has methods that return data on content and node types.
XmlNodeReader    Provides a parser over an XML Document Object Model (DOM) API.
XmlValidatingReader    Provides a fully compliant validating or non-validating XML parser with Document Type Definition (DTD), XML Schema Definition language (XSD) schema, or XML-Data Reduced (XDR) schema support. This class takes an XmlTextReader and layers validation services
By using the XmlReader class members, you can develop a solution that can respond conditionally to node information in the XML source.
XmlReader class objects read XML by stepping through it one node at a time. As each node is read, the program can perform actions based on the qualities of that node. Such qualities include the type of the node, its attributes and data, and other node information.
As an additional benefit, XmlReader class objects determine whether the XML is well-formed. If the XML contains an error, XmlReader objects throw an exception of the type XmlException, and the processing stops.
To continue processing after a validation error occurs, you must use an XmlValidatingReader with a ValidationEventHandler instead.
For a complete description of the members of the XmlReader class, see XmlReader Members in the Additional Reading folder.
To use an XmlReader object or any of its derived classes in your application, you must provide a reference to the .NET Framework System.Xml
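The well-formedness check described above can be sketched as a small C# helper. This is an illustrative sketch only: the class WellFormednessCheck and the method FindFirstError are hypothetical names, and the code assumes a version of the .NET Framework in which the reader can be placed in a using block. It reads to the end of the input and reports the first XmlException, using the LineNumber and LinePosition properties of the exception.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of the course code.
public static class WellFormednessCheck
{
    // Returns null when the XML is well-formed; otherwise returns a
    // description of the first error that the reader reports.
    public static string FindFirstError(string xml)
    {
        try
        {
            using (var reader = new XmlTextReader(new StringReader(xml)))
            {
                // Read every node; the reader throws as soon as it finds an error.
                while (reader.Read()) { }
            }
            return null;
        }
        catch (XmlException ex)
        {
            return "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
        }
    }

    public static void Main()
    {
        // The second document has a <b> start tag that is never closed.
        Console.WriteLine(FindFirstError("<a><b></b></a>") ?? "well-formed");
        Console.WriteLine(FindFirstError("<a><b></a>") ?? "well-formed");
    }
}
```

Processing stops at the first error, so a caller that needs to report several problems in one pass would need a different approach.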
How to Read Streams
A stream is an abstraction of bytes drawn from any number of sources
A stream may be created from a file, URL, or another stream
Use a StreamReader to read a stream
object = new Stream( file | string | stream )
Introduction
Your application can use the classes contained in the System.IO namespace to read XML data from a stream or from a file. The terms file and stream convey a particular meaning within the .NET Framework.
What is a file?
The term file is used here in the ordinary sense: an ordered and named collection of a particular sequence of bytes having persistent storage. When you program an application to read XML from a file, you must consider directory paths, disk storage, and file and directory names.
System.IO classes for reading files
To simplify the job of programming an application to read files, you can use the .NET Framework file and directory input and output classes. The following table describes the file and directory System.IO classes.
System.IO class    Description
File    Provides static methods to create, copy, move, and open files. Aids in the creation of FileStream objects. The FileInfo class provides instance methods.
Directory    Provides static methods to create, move, and enumerate directories and subdirectories. The DirectoryInfo class provides instance methods.
TextReader    Represents a reader that can read a sequential series of characters. TextReader is designed for character input, whereas the Stream class is designed for byte input and output.
StreamReader    Implements a TextReader that reads characters from a byte stream in a particular encoding. StreamReader is designed for character input in a particular encoding, whereas the Stream class is designed for byte input and output.
What is a stream?
A stream is an abstraction of a sequence of bytes. The bytes themselves can originate from any number of sources, such as a file, an input/output device, an interprocess communication pipe, or a Transmission Control Protocol/Internet Protocol (TCP/IP) socket. Examples of streams include network, memory, and tape streams.
The Stream class and its derived classes provide a generic view of a sequence of bytes. Using a stream simplifies the job of programming read operations of XML that might originate from various operating systems and devices.
Streams involve the following fundamental operations:
Streams can be read from. Reading is the transfer of data from a stream into a data structure, such as an array of bytes.
Streams can be written to. Writing is the transfer of data from a data structure into a stream.
Streams can support seeking. Seeking is the querying and modifying of the current position within a stream.
Depending on the underlying data source or repository, streams might support only some of these capabilities.
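The three fundamental stream operations can be sketched with a MemoryStream, which supports all of them. This is a minimal C# sketch: the class StreamOperations and the method WriteSeekRead are hypothetical names chosen for this example, and UTF-8 is an assumed encoding. The code writes bytes to the stream, checks CanSeek before rewinding, and then reads the bytes back into an array.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical example class; not part of the course code.
public static class StreamOperations
{
    // Writes the text to an in-memory stream, rewinds, and reads it back.
    public static string WriteSeekRead(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (var stream = new MemoryStream())
        {
            // Writing: transfer data from a byte array into the stream.
            stream.Write(bytes, 0, bytes.Length);

            // Seeking: not every stream supports it, so check first.
            if (stream.CanSeek)
            {
                stream.Position = 0;
            }

            // Reading: transfer data from the stream into a byte array.
            // Read may return fewer bytes than requested, so loop until done.
            var buffer = new byte[bytes.Length];
            int total = 0;
            while (total < buffer.Length)
            {
                int n = stream.Read(buffer, total, buffer.Length - total);
                if (n == 0) break; // end of stream
                total += n;
            }
            return Encoding.UTF8.GetString(buffer, 0, total);
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteSeekRead("<books/>"));
    }
}
```

A network stream, by contrast, typically reports CanSeek as false, which is why the sketch checks the property instead of assuming that seeking is available.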
In this example, a StreamReader object is created by using the File class. The following code example, provided in both the Microsoft Visual Basic® and C# languages, reads an entire text file line by line.
All code samples assume that any required namespaces are imported at the top of the code file. For example, to use the classes within the System.IO namespace, the following statement is required:
    ' Visual Basic
    Imports System.IO

    // C#
    using System.IO;

    ' Visual Basic
    Dim BooksFilename As String = "c:\books.txt"
    If File.Exists(BooksFilename) Then
        Dim BooksReader As StreamReader = _
            File.OpenText(BooksFilename)
        Dim CurrentLine As String = BooksReader.ReadLine()
        ' ReadLine returns Nothing at the end of the file
        While Not CurrentLine Is Nothing
            ' process line
            CurrentLine = BooksReader.ReadLine()
        End While
        BooksReader.Close()
    End If

    // C#
    string BooksFilename = @"c:\books.txt";
    if (File.Exists(BooksFilename)) {
        StreamReader BooksReader = File.OpenText(BooksFilename);
        String CurrentLine = BooksReader.ReadLine();
        // ReadLine returns null at the end of the file
        while (CurrentLine != null) {
            // process line
            CurrentLine = BooksReader.ReadLine();
        }
        BooksReader.Close();
    }
For more information, search the .NET Framework Class Library for the keywords Stream Class.
When using the classes in the System.IO namespace, you must satisfy the operating system security requirements, such as access control lists (ACLs), for access to be allowed. This requirement is in addition to any FileIOPermission
How to Build Strings from Parsed XML
The String object is immutable
Do NOT use when concatenating in a loop
The StringBuilder object is mutable
To build a string with the StringBuilder class, use the Append() method inside a loop
Use the ToString() method to retrieve the string
Introduction
It is typical for an application that reads XML to build strings to hold filtered data. A reader object is normally placed inside a looping structure. In such a case, each time the loop iterates, the reader object reads another node or set of nodes and then copies the data into a string object. For a large XML file, the loop might iterate thousands of times and build a result composed of tens of thousands of XML nodes.
What is the StringBuilder class?
When you want to modify a string without creating a new object, consider using the System.Text.StringBuilder class instead of the String class. For example, using the StringBuilder class can boost performance when concatenating many strings together in a loop.
The System.Text.StringBuilder class represents a mutable string of characters. This means that you can modify the contents of a StringBuilder object. The value is said to be mutable because it can be modified after it has been created, by appending, removing, replacing, or inserting characters.
Note
At first glance, you might decide to try the String class as the object type to concatenate XML fragments that originate from an XML reader. However, this would be a mistake, because the String class is designed to represent an immutable series of characters. This means that you cannot simply append new characters to a String object each time a reader iterates through a looping structure. Each append creates a new instance of the String object and can easily result in highly expensive XML source processing. However, in the case of reading a file into a stream, an appropriate first step is to load the file into a String object. The stream can then load the XML from the String object.
Example
The following example initializes a new instance of the StringBuilder class by using the specified string, and then creates a string containing the 12 Times Table by using a for loop:
    ' Visual Basic
    Dim sb As New StringBuilder("12 Times Table:")
    Dim i As Integer
    For i = 1 To 12
        sb.Append(vbCrLf & i & " x 12 = " & i * 12)
    Next
    MessageBox.Show(sb.ToString())

    ' Do NOT use the String class, for example
    Dim s As String = "12 Times Table:"
    Dim i As Integer
    For i = 1 To 12
        s += vbCrLf & i & " x 12 = " & i * 12
    Next
    MessageBox.Show(s)

    // C#
    StringBuilder sb = new StringBuilder("12 Times Table:");
    for (int i = 1; i <= 12; i++) {
        sb.Append("\n" + i + " x 12 = " + i * 12);
    }
    MessageBox.Show(sb.ToString());

    // Do NOT use the String class, for example
    string s = "12 Times Table:";
    for (int i = 1; i <= 12; i++) {
        s += "\n" + i + " x 12 = " + i * 12;
    }
    MessageBox.Show(s);
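The same Append() technique applies when the loop is driven by an XML reader. The following C# sketch is illustrative only: the class TextCollector, the method CollectText, the comma separator, and the sample XML are assumptions made for this example, and the code uses using blocks from later versions of the .NET Framework. It appends the value of each text node to a single StringBuilder and calls ToString() once at the end.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical example class; not part of the course code.
public static class TextCollector
{
    // Joins the values of all text nodes in the XML, separated by ", ".
    public static string CollectText(string xml)
    {
        var sb = new StringBuilder();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    if (sb.Length > 0)
                    {
                        sb.Append(", ");
                    }
                    // Appends to the same StringBuilder on every iteration;
                    // no intermediate String objects are built up.
                    sb.Append(reader.Value);
                }
            }
        }
        // Convert to an immutable String once, after the loop.
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(CollectText("<books><title>A</title><title>B</title></books>"));
    }
}
```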
Lesson: Parsing XML Using XmlTextReader
Demonstration: Parsing XML
How to Create an XmlTextReader Object
How to Navigate Nodes
How to Determine the Current Node Type
How to Read the Contents of a Node
How to Handle White Space
How to Handle XML Errors While Parsing
Code Examination: Parsing XML
Practice: Reading XML Content and Nodes
Introduction
The node information in an XML source is an important resource that you can use in applications that process XML. You can use node information not only to find particular content, but also as a very useful basis for the logic that controls program flow. In this lesson, you will learn how to find and use XML node information in your applications.
Lesson objectives
After completing this lesson, you will be able to:
Navigate through XML nodes by using the Read() methods
Determine the current node type and extract information about the current node
Read the attributes of an element type of node
Handle white space in an XML document
Implement XML error handling while parsing
Introduction
In this demonstration, you will see the parsing and filtering functionality of the XML Tools add-in. Compiled release versions of the add-in written in both Microsoft Visual C#™ and Microsoft Visual Basic® languages are available in the following folders:
install_folder\Democode\Addins\XmlToolsAddinCS\XmlToolsAddinCSSetup\Release\
install_folder\Democode\Addins\XmlToolsAddinVB\XmlToolsAddinVBSetup\Release\
Demonstration
To install the add-in
1. Double-click the setup.exe file in one of the folders above.
2. Follow the instructions in the wizard.
For detailed installation instructions, see Appendix A.
To parse a sample XML file that is open in the editor
1. In Microsoft Visual Studio® .NET, open the files named books.xml and employee.xml. These are located in the folder
</employees>
3. On the XML Tools toolbar, click Parse.
4. Notice that the Output window opens, showing detailed information about the employee.xml file. Each node in the XML file appears as a row in the details table, and a count of the number of each type of node appears in the summary table.
To parse another sample XML file
1. Click Solution Explorer to make it active.
2. On the XML Tools toolbar, click Parse. Because no XML file is active, a dialog box appears prompting the user to choose one of the open files.
3. In the Parse dialog box, click the file named books.xml, and then click OK. The Output window opens showing detailed information about the file.
4. Use the Output window to verify the answers provided to the following questions:
a. What is the Depth of the Text node with a value of Benjamin?
To parse a sample XML file in the internal browser
1. On the View menu, click Web Browser, and then click Show Browser (or press Ctrl+Alt+R).
2. Make sure that Set web links to internally or externally opened is set to internal.
3. Enter the following URL in the Web toolbar:
http://localhost/2663/Democode/Mod02/books.xml
4. On the XML Tools toolbar, click Parse. This demonstrates that the add-in can parse any XML-compliant file that is accessible on the Internet.
To filter by specifying a child element value
1. On the XML Tools toolbar, click Filter.
2. If the add-in prompts you to select a file, click books.xml, and then click OK.
3. In the Filter dialog box, enter the following options, and then click OK.
Option Value
4. Notice that the Output window shows the one book that matches the filter.
To filter by specifying an attribute value
1. On the XML Tools toolbar, click Filter.
2. If the add-in prompts you to select a file, click books.xml, and then click OK.
3. In the Filter dialog box, enter the following options, and then click OK.
Option Value
4. Notice that the Output window shows the two books that match the filter.
To convert the active file and save the result to a file
1. On the XML Tools toolbar, click Convert.
2. If the add-in prompts you to select a file, click books.xml, and then click OK. The Output window shows books.xml with all of its attributes converted to elements.
3. Click the Output window to make it active, and then on the File menu, click Save Output As.
4. Save the output as BooksAsElements.xml in the folder install_folder\Democode\Mod02\
How to Create an XmlTextReader Object
XmlTextReader BooksReader = new XmlTextReader(@"c:\books.xml");
Input types shown on the slide: Stream, String, TextReader, URL
Introduction
The XmlTextReader class is an implementation of XmlReader and provides a high-performance parser. It enforces the rule that XML must be well-formed. It is neither a validating nor a non-validating parser, because it does not have DTD or schema information. It can read text in blocks or read characters from a stream.
XmlTextReader constructor
The XmlTextReader can read data from different inputs.
XmlTextReader BooksReader = new XmlTextReader(@"c:\books.xml");
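In addition to the constructor that takes a file path or URL, XmlTextReader has constructors that accept a TextReader or a Stream. The following C# sketch shows both. The class ReaderInputs, its two methods, and the sample XML are hypothetical names and data invented for this example, UTF-8 is an assumed encoding for the stream case, and the code uses using blocks from later versions of the .NET Framework. Each method returns the name of the root element to show that the reader was created successfully.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical example class; not part of the course code.
public static class ReaderInputs
{
    // Creates the reader over a TextReader (here, a StringReader).
    public static string RootNameFromTextReader(string xml)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // Moves past the XML declaration to the first content node.
            reader.MoveToContent();
            return reader.Name;
        }
    }

    // Creates the reader over a Stream (here, a MemoryStream of UTF-8 bytes).
    public static string RootNameFromStream(string xml)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        using (var reader = new XmlTextReader(stream))
        {
            reader.MoveToContent();
            return reader.Name;
        }
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\"?><books><book/></books>";
        Console.WriteLine(RootNameFromTextReader(xml));
        Console.WriteLine(RootNameFromStream(xml));
    }
}
```

Reading from a TextReader or Stream keeps the parsing code independent of where the XML comes from, so the same loop can serve files, network responses, and in-memory strings.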