Professional ASP.NET 1.0 Special Edition - P20


or DTD

However, things are different when using the .NET System.Xml classes. Loading a combined schema or DTD and the XML data content (that is, an inline schema), or an XML document that references an external schema or DTD, into any of the XML "storage" objects such as XmlDocument, XmlDataDocument, and XPathDocument does not automatically validate that document. And there is no property that we can set to make it do this.

Instead, we load the document via an XmlTextReader object to which we have attached an XmlValidatingReader. The Load method of the XmlDocument and XmlDataDocument objects can accept an XmlValidatingReader as the single parameter instead of a file path and name. Meanwhile, the constructor for the XPathDocument object can accept an XmlValidatingReader as the single parameter.

So all we have to do is set up our XmlValidatingReader and XmlTextReader combination, and then pass this to the Load method or the constructor function (depending on which document object we're creating). The document will then be validated as it is loaded:

'create XmlTextReader, load XML document and create Validator
objXTReader = New XmlTextReader(strXMLPath)
Dim objValidator As New XmlValidatingReader(objXTReader)
objValidator.ValidationType = ValidationType.Schema

'use the validator/reader combination to create XPathDocument object
Dim objXPathDoc As New XPathDocument(objValidator)

'or, alternatively, use the validator/reader combination to create
'an XmlDocument object (a reader can be read through only once, so
'each document needs its own validator/reader combination)
Dim objXmlDoc As New XmlDocument()
objXmlDoc.Load(objValidator)

The XmlValidatingReader can also be used to validate XML held in a String. So we can validate XML that's already loaded into an object or application by simply extracting it as a String object (using the GetXml method with a DataSet object, or the OuterXml property to get a document fragment, for example) and applying the XmlValidatingReader to this.
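As a minimal sketch of validating XML held in a String, the code below wraps the string in a StringReader and an XmlTextReader before attaching the validator. The module name, the IsValidXml helper, and the tiny schema are assumptions made for illustration and are not part of the example pages. It also relies on the behavior that, with no event handler attached, the first validation error surfaces as an XmlSchemaException. Note that XmlValidatingReader and XmlSchemaCollection are marked obsolete in later versions of .NET, so a current compiler will emit warnings for them.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module ValidateStringSketch

  'hypothetical helper: True if strXml is valid against the schema in strXsd
  Function IsValidXml(strXml As String, strXsd As String) As Boolean

    'load the schema text into a collection (no target namespace)
    Dim objSchemaCol As New XmlSchemaCollection()
    objSchemaCol.Add("", New XmlTextReader(New StringReader(strXsd)))

    'wrap the XML string in a reader, then attach the validator
    Dim objXTReader As New XmlTextReader(New StringReader(strXml))
    Dim objValidator As New XmlValidatingReader(objXTReader)
    objValidator.ValidationType = ValidationType.Schema
    objValidator.Schemas.Add(objSchemaCol)

    Try
      'no event handler is attached here, so the first validation
      'error is thrown as an exception instead of raising an event
      While objValidator.Read()
      End While
      Return True
    Catch objError As XmlSchemaException
      Return False
    Finally
      'closing the validator also closes the underlying reader
      objValidator.Close()
    End Try

  End Function

  Sub Main()
    Dim strXsd As String = _
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" & _
      "<xs:element name='Title' type='xs:string'/></xs:schema>"
    Console.WriteLine(IsValidXml("<Title>ASP.NET</Title>", strXsd))
    Console.WriteLine(IsValidXml("<Title><Extra/></Title>", strXsd))
  End Sub

End Module
```

A document that is not well-formed would still raise an XmlException from Read, which this sketch deliberately leaves uncaught so that parser errors are not mistaken for validation failures.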


Validating XML in a DataSet Object

Like the XML document objects, a DataSet does not automatically validate XML that you provide for the ReadXml method against any schema that is already in place within the DataSet or which is inline with the XML (that is, in the same document as the XML data content). In a DataSet, the schema is used solely to provide information about the intended structure of the data. It's not used for actual validation at all.

When we load the schema, the DataSet uses it as a specification for the table names, column names, data types, etc. Then, when we load the XML data content, it arranges the data in the appropriate tables and columns as new data rows.

If a value or element is encountered that doesn't match the schema, it is ignored and that particular column in the current data row is left empty.

This makes sense, because the DataSet is designed to work with structured relational data, and so any superfluous content in the source file cannot be part of the correct data model. So, you should think of schemas in a DataSet as being a way to specify the data structure (rather than inferring the structure from the data, as happens if no schema is present). Don't think of this as a way of validating the data.
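The effect can be sketched as follows. Instead of loading an XSD file, this sketch fixes the structure by adding the table and columns in code, which stands in for "a schema already in place", and it passes XmlReadMode.IgnoreSchema explicitly so that the behavior does not depend on the default read mode. The module, helper, table, and column names are assumptions for illustration only.

```vbnet
Imports System
Imports System.Data
Imports System.IO

Module DataSetSchemaSketch

  'hypothetical helper: loads strXml into a DataSet whose structure
  'is fixed in advance, and returns the filled "Books" table
  Function LoadBooks(strXml As String) As DataTable

    Dim objDataSet As New DataSet("BookList")
    Dim objTable As DataTable = objDataSet.Tables.Add("Books")
    objTable.Columns.Add("ISBN", GetType(String))
    objTable.Columns.Add("Title", GetType(String))

    'map the XML onto the existing structure; content that does
    'not fit the structure is discarded rather than reported
    objDataSet.ReadXml(New StringReader(strXml), XmlReadMode.IgnoreSchema)
    Return objTable

  End Function

  Sub Main()
    Dim strXml As String = _
      "<BookList><Books><ISBN>1861003234</ISBN>" & _
      "<Unexpected>not in the structure</Unexpected></Books></BookList>"
    Dim objTable As DataTable = LoadBooks(strXml)
    Console.WriteLine("Rows: " & objTable.Rows.Count.ToString())
    Console.WriteLine("Columns: " & objTable.Columns.Count.ToString())
    Console.WriteLine("Title empty: " & objTable.Rows(0).IsNull("Title").ToString())
  End Sub

End Module
```

No exception is raised for the unexpected element or the missing title, which is why the structure information in a DataSet should not be relied on as validation.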

A Document Validation Example

We've provided an example page 'Validating XML documents with an XmlValidatingReader object' (validating-xml.aspx) that demonstrates how you can validate an XML document. When first opened it displays a list of source documents that you can use in a drop-down list, and it performs validation against the selected document. As you can see from the screenshot, it reports no validation errors in a valid document:

Note that you must run the page in a browser on the web server itself to be able to open the XML document and schema using the physical paths in the hyperlinks in the page.

However, if you select the "well-formed but invalid" document, it reports a series of validation errors:


In this case the XML document contains an extra child element within one of the <Books> elements, which is not permitted in the schema that we're using to validate it (you can view the document and the schema using the hyperlinks we provide in the page):

The code that performs the validation is shown in the next listings. We start by creating the paths to the schema and XML document. In this example, the document name comes from the drop-down list named selXMLFile that is defined earlier in the page - the filename itself is the value attribute of the selected item:

'create physical path to sample files (in same folder as ASPX page)
Dim strCurrentPath As String = Request.PhysicalPath
Dim strXMLPath As String = Left(strCurrentPath, _
    InStrRev(strCurrentPath, "\")) & selXMLFile.SelectedItem.Value
Dim strSchemaPath As String = Left(strCurrentPath, _
    InStrRev(strCurrentPath, "\")) & "booklist-schema.xsd"

We then declare a variable to hold the number of validation errors we find. This is followed by code to create an XmlTextReader object, specifying the XML document as the source. We also display a hyperlink to this document:

'variable to count number of validation errors found
Dim intValidErrors As Integer = 0

'create the new XmlTextReader object and load the XML document
objXTReader = New XmlTextReader(strXMLPath)
outXMLDoc.innerHTML = "Loaded file: <a href=""" & strXMLPath _
    & """>" & strXMLPath & "</a><br />"

Creating the XmlValidatingReader and Specifying the Schema

The next step is to create our XmlValidatingReader object with the XmlTextReader as the source, and specify the validation type to suit our schema (we could, of course, have used Auto to automatically validate against any type of schema or DTD):

'create an XmlValidatingReader for this XmlTextReader
Dim objValidator As New XmlValidatingReader(objXTReader)

'set the validation type to use an XSD schema
objValidator.ValidationType = ValidationType.Schema

Our schema is in a separate document and there is no link or reference to it in the XML document, so we need to specify which schema we want to use. We create a new XmlSchemaCollection, and add our schema to it using the Add method of the XmlSchemaCollection. Then we add this collection to the Schemas property of the XmlValidatingReader (the property itself is read-only, so we use its Add method), and display a link to the schema:

'create a new XmlSchemaCollection
Dim objSchemaCol As New XmlSchemaCollection()

'add the booklist-schema.xsd schema to it
objSchemaCol.Add("", strSchemaPath)

'add the schema collection to the XmlValidatingReader
objValidator.Schemas.Add(objSchemaCol)
outXMLDoc.innerHTML += "Validating against: <a href=""" _
    & strSchemaPath & """>" & strSchemaPath & "</a>"

Specifying the Validation Event Handler

The XmlValidatingReader will raise an event whenever it encounters a validation error in the document, as the XmlTextReader reads it from our disk file. If we don't handle this event ourselves, the validation error is raised as an exception instead. In our case, this would be caught by the Try...Catch construct we include in our example page.

However, it's often better to handle the validation events separately from other (usually fatal) errors such as the XML file not actually existing on disk. To specify our own event handler for the ValidationEventHandler event in Visual Basic we use the AddHandler statement, and pass it the event we want to handle and a pointer to our handler routine (which is named ValidationError in this example):

'add the event handler for any validation errors found

AddHandler objValidator.ValidationEventHandler, AddressOf ValidationError

In C#, we can add the validation event handler using the following syntax:

objValidator.ValidationEventHandler += new ValidationEventHandler(ValidationError);

Reading the Document and Catching Parser Errors

We are now ready to read the XML document from the disk file. In our case, we're only reading through to check for validation errors. In an application, you would have code here to perform whatever tasks you need against the XML, or alternatively use the XmlValidatingReader as the source for the Load method of an XmlDocument or XmlDataDocument object, or in the constructor for an XPathDocument object:

Try

  'read through the whole XML document, one node at a time
  While objValidator.Read()
    'use each node here as required - we only check for errors
  End While

  'display count of errors found
  outXMLDoc.innerHTML += "Validation complete " & intValidErrors _
      & " error(s) found"

Catch objError As Exception

  'will occur if there is a read error or the document cannot be parsed
  outXMLDoc.innerHTML += "Read/Parser error: " & objError.Message

Finally

  'must remember to always close the XmlTextReader after use
  objXTReader.Close()

End Try

That's all we need to do to validate the document. The remaining part of the code in this page is the event handler that we specified for the ValidationEventHandler event. We'll look at this next.

The Validation Event Handler

The XmlValidatingReader raises the ValidationEventHandler event whenever a validation error is discovered in the XML document, and we've specified that our event handler named ValidationError will be called when this event is raised. This event handler receives the usual reference to the object that raised the event, plus a ValidationEventArgs object containing information about the event.

In the event handler, we first increment our error counter, and then check what kind of error it is by using the Severity property of the ValidationEventArgs object. We display a message describing the error, and the line number and character position if available (although these are generally included in the error message anyway):

Public Sub ValidationError(objSender As Object, _
                           objArgs As ValidationEventArgs)

  'event handler called when a validation error is found
  intValidErrors += 1   'increment count of errors

  'check the severity of the error
  Dim strSeverity As String
  If objArgs.Severity = XmlSeverityType.Error Then strSeverity = "Error"
  If objArgs.Severity = XmlSeverityType.Warning Then strSeverity = "Warning"

  'display a message
  outXMLDoc.innerHTML += "Validation error: " & objArgs.Message _
      & "<br /> Severity level: '" & strSeverity & "'<br />"

  If objXTReader.LineNumber > 0 Then
    outXMLDoc.innerHTML += "Line: " & objXTReader.LineNumber _
        & ", character: " & objXTReader.LinePosition & "<br />"
  End If

End Sub
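The whole sequence (schema collection, validator, handler, read loop) can be condensed into a console sketch like the one below. It validates strings rather than disk files so that it is self-contained; the module name, the CountValidationErrors helper, and the sample schema are assumptions for illustration, not code from the example page.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module ValidationHandlerSketch

  'module-level counter, updated by the event handler
  Private intValidErrors As Integer = 0

  'handler with the signature that ValidationEventHandler expects
  Sub ValidationError(objSender As Object, objArgs As ValidationEventArgs)
    intValidErrors += 1
    Dim strSeverity As String = "Warning"
    If objArgs.Severity = XmlSeverityType.Error Then strSeverity = "Error"
    Console.WriteLine(strSeverity & ": " & objArgs.Message)
  End Sub

  'hypothetical helper: returns the number of validation events raised
  Function CountValidationErrors(strXml As String, strXsd As String) As Integer

    intValidErrors = 0
    Dim objSchemaCol As New XmlSchemaCollection()
    objSchemaCol.Add("", New XmlTextReader(New StringReader(strXsd)))

    Dim objValidator As New XmlValidatingReader( _
        New XmlTextReader(New StringReader(strXml)))
    objValidator.ValidationType = ValidationType.Schema
    objValidator.Schemas.Add(objSchemaCol)
    AddHandler objValidator.ValidationEventHandler, AddressOf ValidationError

    Try
      'with a handler attached, reading carries on after each
      'validation error instead of stopping at the first one
      While objValidator.Read()
      End While
    Finally
      objValidator.Close()
    End Try
    Return intValidErrors

  End Function

  Sub Main()
    Dim strXsd As String = _
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" & _
      "<xs:element name='Title' type='xs:string'/></xs:schema>"
    Console.WriteLine(CountValidationErrors("<Title>ASP.NET</Title>", strXsd))
    Console.WriteLine(CountValidationErrors("<Title><Extra/></Title>", strXsd))
  End Sub

End Module
```

Keeping the counter at module level mirrors the example page, where the handler and the main routine share the error count.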

We saw the validation error messages in the previous screenshot using a well-formed but invalid document. We've also provided an XML document that is not well-formed so that you can see the parser error that is raised in this case and trapped by our Try...Catch construct. This also prevents the remainder of the document from being read:


In this case, as you can verify if you try to open the XML document using the hyperlink, there is an illegal closing tag for one of the <Books> elements:


We've spent a lot of time looking at how we can read and write XML documents, access them in a range of ways, and validate the content against a schema or DTD. However, we haven't looked at how we can edit XML documents, or how we create new ones. The example page for this section, 'Creating and Editing the Content of XML Documents' (edit-xml.aspx) fills out these gaps in our coverage.

The example page loads an XML document named bookdetails.xml and demonstrates four different techniques we can use for editing and creating documents:

- Selecting a node, extracting the content, and deleting that node from the document

- Creating a new empty document and adding a declaration and comment to it

- Importing (that is, copying) a node from the original document into the new document

- Selecting, editing and inserting new nodes and content into the original document

The next screenshot shows the page when you run it. You can see the four stages of the process, though the second and third are combined into one section of the output in the page:

Note that you must run the page in a browser on the web server itself to be able to open the XML documents using the physical paths in the hyperlinks in the page.

The Code for this Example Page

The page contains the customary <div> elements to display the results and messages, and details of any errors that we encounter. It also creates the paths to the existing and new documents, and displays a hyperlink to the existing document. This is identical to the previous example, and we aren't repeating the code here. Instead, we start with the part that loads the existing document into a new XmlDocument object:

Dim objXMLDoc As New XmlDocument()

Try
  objXMLDoc.Load(strXMLPath)
Catch objError As Exception
  outError.innerHTML = "Error while accessing document.<br />" _
      & objError.Message & "<br />" & objError.Source
  Exit Sub   ' and stop execution
End Try

Selecting, Displaying, and Deleting a Node

To select a specific node in our document we can use an XPath expression. In our example the expression is descendant::Book[ISBN="1861003234"], which, when the current node is the root element of the document, selects the <Book> node with the specified value for its <ISBN> child node.

We use this expression in the SelectSingleNode method, and it returns a reference to the node we want. To display this node and its content, we just have to reference its OuterXml property:

'specify XPath expression to select a book element
Dim strXPath As String = "descendant::Book[ISBN=" & Chr(34) _
    & "1861003234" & Chr(34) & "]"

'get a reference to the matching <Book> node
Dim objNode As XmlNode
objNode = objXMLDoc.SelectSingleNode(strXPath)

'display node and content using the OuterXml property
outResult1.InnerHtml = "XPath expression '<b>" & strXPath _
    & "</b>' returned:<br />" _
    & Server.HtmlEncode(objNode.OuterXml) & "<br />"

If we only want the content of the node, we can use the InnerXml property, and if we only want the text values of all the nodes concatenated together we can use the InnerText property.
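To make the difference between the three properties concrete, here is a small console sketch. The sample XML and the module name are made up for illustration.

```vbnet
Imports System
Imports System.Xml

Module NodeContentSketch

  Sub Main()
    Dim objDoc As New XmlDocument()
    objDoc.LoadXml("<Book><ISBN>1861003234</ISBN><Title>ASP.NET</Title></Book>")
    Dim objNode As XmlNode = objDoc.DocumentElement

    'the node itself plus all its content:
    '<Book><ISBN>1861003234</ISBN><Title>ASP.NET</Title></Book>
    Console.WriteLine(objNode.OuterXml)

    'only the markup inside the node:
    '<ISBN>1861003234</ISBN><Title>ASP.NET</Title>
    Console.WriteLine(objNode.InnerXml)

    'only the text values, concatenated: 1861003234ASP.NET
    Console.WriteLine(objNode.InnerText)
  End Sub

End Module
```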

To delete the node from the document, we call the RemoveChild method of the parent node (the root of the document, which is returned by the DocumentElement property of the document object), and pass it a reference to the node to be deleted:

'delete this node using RemoveChild method from document element
objXMLDoc.DocumentElement.RemoveChild(objNode)
outResult1.InnerHtml += "Removed node from document.<br />"
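The listing above assumes that the XPath expression finds a node and that the node is a direct child of the root element. A slightly more defensive variant, sketched here with a hypothetical RemoveBook helper and made-up sample data, checks for Nothing and asks the node's own parent to remove it.

```vbnet
Imports System
Imports System.Xml

Module RemoveNodeSketch

  'hypothetical helper: removes the <Book> with the given ISBN and
  'returns True if a matching node was found
  Function RemoveBook(objDoc As XmlDocument, strISBN As String) As Boolean

    Dim strXPath As String = "descendant::Book[ISBN='" & strISBN & "']"
    Dim objNode As XmlNode = objDoc.SelectSingleNode(strXPath)

    'SelectSingleNode returns Nothing when there is no match
    If objNode Is Nothing Then Return False

    'use the node's actual parent, wherever it sits in the tree
    objNode.ParentNode.RemoveChild(objNode)
    Return True

  End Function

  Sub Main()
    Dim objDoc As New XmlDocument()
    objDoc.LoadXml("<BookList>" & _
      "<Book><ISBN>1861003234</ISBN></Book>" & _
      "<Book><ISBN>1861003382</ISBN></Book></BookList>")
    Console.WriteLine(RemoveBook(objDoc, "1861003234"))
    Console.WriteLine(RemoveBook(objDoc, "0000000000"))
    Console.WriteLine(objDoc.OuterXml)
  End Sub

End Module
```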

Creating a New Document and Adding Nodes

We create a new empty XML document, simply by instantiating an XmlDocument (or XmlDataDocument) object. Then we can create nodes and insert them into this document using code like the following. Here, we're creating a new XML declaration (the <?xml version="1.0"?> line) and inserting it into the new document with the InsertBefore method:

'create new empty XmlDocument object
Dim objNewDoc As New XmlDocument()

'create a new XmlDeclaration object
Dim objDeclare As XmlDeclaration
objDeclare = objNewDoc.CreateXmlDeclaration("1.0", Nothing, Nothing)

'and add it as the first node in the new document
objDeclare = objNewDoc.InsertBefore(objDeclare, objNewDoc.DocumentElement)

The second and third parameters of the CreateXmlDeclaration method specify the encoding used in the document and the standalone value (which states whether the document depends on external markup declarations, such as an external DTD). We set both to Nothing, so we'll get neither of these optional attributes in our XML declaration. An XML parser will then fall back on its defaults when it loads the document: it treats the encoding as UTF-8 unless a byte-order mark indicates otherwise, and it assumes a standalone value of "no" if the document references external markup declarations.

When we create the new node, we get a reference to it back from the CreateXmlDeclaration method, and we use this as the first parameter to the InsertBefore method. The second parameter is a reference to the node that we want to insert before, and in this case we specify the DocumentElement property. Notice that DocumentElement does not yet refer to a root element, as the document doesn't have one, so it returns Nothing. When the reference node is Nothing, InsertBefore adds the new node at the end of the list of child nodes, which in an empty document makes it the first node.
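A short console sketch can confirm what DocumentElement holds in an empty document and where the declaration ends up. The module name is an assumption; the calls are the same ones used in the listing above.

```vbnet
Imports System
Imports System.Xml

Module InsertBeforeSketch

  Sub Main()
    Dim objNewDoc As New XmlDocument()

    'an empty document has no root element yet
    Console.WriteLine(objNewDoc.DocumentElement Is Nothing)

    Dim objDeclare As XmlDeclaration = _
        objNewDoc.CreateXmlDeclaration("1.0", Nothing, Nothing)

    'the reference node is Nothing, so the declaration is simply
    'added to the end of the (empty) list of child nodes
    objNewDoc.InsertBefore(objDeclare, objNewDoc.DocumentElement)

    Console.WriteLine(objNewDoc.FirstChild Is objDeclare)
    Console.WriteLine(objNewDoc.OuterXml)
  End Sub

End Module
```

Calling AppendChild would have the same effect here; InsertBefore only matters once the document already contains other nodes.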

Next we create a new comment node, and insert this into the new document after the XML declaration:

'create a new XmlComment object
Dim objComment As XmlComment
objComment = objNewDoc.CreateComment("New document created " & Now())

'and add it as the second node in the new document
objComment = objNewDoc.InsertAfter(objComment, objDeclare)

Importing Nodes into the New Document

To get some content into the new document we just created, our example page imports a node from the existing document we loaded from disk at the start of the page. We again use an XPath expression with the SelectSingleNode method to get a reference to the <Book> element we want to import:

strXPath = "descendant::Book[ISBN=" & Chr(34) & "1861003382" & Chr(34) & "]"

objNode = objXMLDoc.SelectSingleNode(strXPath)

Now we declare an XmlNode variable to hold the imported node, and call the ImportNode method of the new document to copy the node from the original document. The second parameter to the ImportNode method specifies if we want a "deep" copy - in other words, if we want to import all the descendants of the node as well as the node itself:

'create a variable to hold the imported node object
Dim objImportedNode As XmlNode

'import node and all children into new document as unattached fragment
objImportedNode = objNewDoc.ImportNode(objNode, True)

ImportNode gives us a copy of the node that belongs to the new document, but we still have to insert it into the tree - it is only an unattached fragment at the moment. We use the InsertAfter method as before, passing the reference we've already got to the imported node and the reference we created earlier to our comment node, so that the imported node becomes the root element of the new document:

'insert new unattached node into document after the comment node
objNewDoc.InsertAfter(objImportedNode, objComment)

'display the contents of the new document
outResult2.InnerHtml = "Created new XML document and inserted " _
    & "into it the node selected by<br />" _
    & "the XPath expression '" & strXPath & "'<br />" _
    & "Content of new document is:<br />" _
    & Server.HtmlEncode(objNewDoc.OuterXml)

We finish this section (in the code above) by displaying the contents of the new document. We've got a reference to the XmlDocument object that contains it, so we just query the OuterXml property to get the complete content. You can see the new document displayed in the example page shown previously.
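The create, import, and attach steps can be gathered into one self-contained sketch. The CopyToNewDocument helper, the comment text, and the sample data are assumptions for illustration. The import step is needed because a node created by one XmlDocument cannot be inserted directly into another; attempting that raises an exception.

```vbnet
Imports System
Imports System.Xml

Module ImportNodeSketch

  'hypothetical helper: builds a new document whose root element
  'is a deep copy of objSource
  Function CopyToNewDocument(objSource As XmlNode) As XmlDocument

    Dim objNewDoc As New XmlDocument()

    'declaration first, then a comment after it
    Dim objDeclare As XmlNode = _
        objNewDoc.CreateXmlDeclaration("1.0", Nothing, Nothing)
    objNewDoc.AppendChild(objDeclare)
    Dim objComment As XmlNode = objNewDoc.CreateComment("copied node")
    objNewDoc.InsertAfter(objComment, objDeclare)

    'the copy is owned by objNewDoc but not yet part of its tree
    Dim objImported As XmlNode = objNewDoc.ImportNode(objSource, True)

    'attaching it after the comment makes it the root element
    objNewDoc.InsertAfter(objImported, objComment)
    Return objNewDoc

  End Function

  Sub Main()
    Dim objSource As New XmlDocument()
    objSource.LoadXml( _
      "<BookList><Book><ISBN>1861003382</ISBN></Book></BookList>")
    Dim objNode As XmlNode = objSource.SelectSingleNode("descendant::Book")
    Console.WriteLine(CopyToNewDocument(objNode).OuterXml)
  End Sub

End Module
```

The source document is left unchanged, because ImportNode copies the node rather than moving it.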

Inserting and Updating Nodes in a Document

The final part of our example page edits some values in the original document. This time we need an XPath expression that will match more than one node, and so we use the SelectNodes method of the document to return an XmlNodeList object containing references to all the matching nodes (in our example all the <ISBN> nodes). Then we can display the number of matches found:

strXPath = "descendant::ISBN"

'get a reference to the matching nodes as a collection
Dim colNodeList As XmlNodeList
colNodeList = objXMLDoc.SelectNodes(strXPath)

'display the number of matches found
outResult3.InnerHtml = "Found " & colNodeList.Count _
    & " nodes matching the " _
    & "XPath expression '" & strXPath & "'<br />" _
    & "Editing and inserting new content<br />"

Our plan is to add an attribute to all of the <ISBN> elements, and replace the text content (value) of these elements with two new elements that contain the information in a different form. After declaring some variables that we'll need, we iterate through the collection of <ISBN> nodes using a For Each loop:

Dim strNodeValue, strNewValue, strShortCode As String

'create a variable to hold an XmlAttribute object
Dim objAttr As XmlAttribute

'iterate through all the nodes found
For Each objNode In colNodeList

Within the loop, we first create a new attribute named "formatting" and set the value to "hyphens" (all our <ISBN> nodes will have the same value for this attribute). Then we can add this attribute to the <ISBN> element node by calling the SetAttributeNode method. However, there is a minor hitch - the members of an XmlNodeList are XmlNode objects, which don't have a SetAttributeNode method. We get round this in Visual Basic by casting the object to an XmlElement object using the CType (convert type) function:

  'create an XmlAttribute named 'formatting'
  objAttr = objXMLDoc.CreateAttribute("formatting")

  'set the value of the XmlAttribute to 'hyphens'
  objAttr.Value = "hyphens"

  'and add it to this ISBN element - have to cast the object
  'to an XmlElement as XmlNode doesn't have this method
  CType(objNode, XmlElement).SetAttributeNode(objAttr)

To change the content of the <ISBN> elements, we just have to set the InnerXml property. This is much easier than using the InsertBefore and InsertAfter methods we demonstrated previously, and provides a valid alternative when the content we want to insert is available as a string (recall that we had references to the element node and its new content node when we used InsertBefore previously).

Our code extracts the existing ISBN value, creates the new "short code" from it, formats the existing ISBN with hyphens, and then creates a string containing the new content for the element. The final step is to insert these values into the <ISBN> node by setting its InnerXml property, before going round to do the next one:

  'get text value of this ISBN element
  strNodeValue = objNode.InnerText

  'create short and long strings to replace content
  strShortCode = Right(strNodeValue, 4)
  strNewValue = Left(strNodeValue, 1) & "-" _
      & Mid(strNodeValue, 2, 6) & "-" _
      & Mid(strNodeValue, 8, 2) & "-" _
      & Right(strNodeValue, 1)

  'insert into element by setting the InnerXml property
  objNode.InnerXml = "<LongCode>" & strNewValue _
      & "</LongCode><ShortCode>" _
      & strShortCode & "</ShortCode>"

Next
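For reference, here is a self-contained sketch of the same editing loop. It is a variation rather than the page's own code: the FormatIsbn and ReformatIsbnNodes helpers are hypothetical, it uses the simpler SetAttribute(name, value) method of XmlElement instead of creating an XmlAttribute object, it reads the Count of the node list before editing so that the list is fully built first, and it skips values that are not ten characters long.

```vbnet
Imports System
Imports System.Xml

Module EditIsbnSketch

  'hypothetical helper: formats a 10-character ISBN as 1-861003-23-4
  Function FormatIsbn(strValue As String) As String
    Return strValue.Substring(0, 1) & "-" & strValue.Substring(1, 6) _
        & "-" & strValue.Substring(7, 2) & "-" & strValue.Substring(9, 1)
  End Function

  'hypothetical helper: rewrites every <ISBN> element in the document
  Sub ReformatIsbnNodes(objDoc As XmlDocument)

    Dim colNodes As XmlNodeList = objDoc.SelectNodes("descendant::ISBN")

    For intIndex As Integer = 0 To colNodes.Count - 1
      Dim objNode As XmlNode = colNodes.Item(intIndex)
      Dim strValue As String = objNode.InnerText

      'skip values that the fixed-position split cannot handle
      If strValue.Length <> 10 Then Continue For

      'cast to XmlElement to reach the attribute methods
      Dim objElement As XmlElement = DirectCast(objNode, XmlElement)
      objElement.SetAttribute("formatting", "hyphens")

      'replace the text content with two new child elements
      objElement.InnerXml = "<LongCode>" & FormatIsbn(strValue) _
          & "</LongCode><ShortCode>" & strValue.Substring(6) _
          & "</ShortCode>"
    Next

  End Sub

  Sub Main()
    Dim objDoc As New XmlDocument()
    objDoc.LoadXml( _
      "<BookList><Book><ISBN>1861003234</ISBN></Book></BookList>")
    ReformatIsbnNodes(objDoc)
    Console.WriteLine(objDoc.OuterXml)
  End Sub

End Module
```

Setting InnerXml parses the string as markup, so any text inserted this way must already be well-formed and escaped.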


We end the page by writing the complete edited XML document to a disk file and displaying a hyperlink to it so that you can view it:

'write the updated document to a disk file
objXMLDoc.Save(strNewPath)

'display a link to view the updated document
outResult3.InnerHTML += "Saved updated document: <a href=""" _
    & strNewPath & """>" & strNewPath & "</a>"

Viewing the Results

If you open both documents, the original and the edited version, you can see the effects of our editing process. The first contains the <Book> node with the <ISBN> value 1861003234, while it is not present in the second one (they are in order by ISBN code). You can also see the updated <ISBN> elements in the second document:

In this example, we've demonstrated several techniques for working with an XML document using the System.Xml classes provided in .NET. Some of the techniques use the XML DOM methods as defined by W3C, and some are specific "extensions" available with the XmlDocument (and other) objects. In general, these extensions make common tasks a lot easier, for example the ability to access the InnerText, InnerXml, and OuterXml of a node makes it remarkably easy to edit or insert content and markup.

We have by no means covered all the possibilities for accessing XML documents, as you'll see if you examine the list of properties, methods, and events for each of the relevant objects in the SDK. However, by now, you should have a flavor for what is possible, and how easy it is to achieve.

Using XSL and XSLT Transformations

To finish this chapter, we need to come back to a topic that we first looked at in the data management introduction chapter.

Date posted: 03/07/2014, 07:20
