CHAPTER 47 Using XML in SQL Server 2008
<ScrapReason>12
<!-- Comment: Name = Thermoform temperature too high -->
<?ModDatePI 1998-06-01T00:00:00?>
<WorkOrders WorkOrderIds="72370 72273 70875 69474 69173 68573 65970 60472
56975 56875 55275 53771 50370 47670 45773 42071 41975 39372 36673
36671 32872 32775 32770 31073 29370 27771 24174 22673 22670 17674
16073 13073 10274 9071 7771 4972 2573" />
</ScrapReason>
</ScrappedWorkOrders>
Let’s review the selected columns in Listing 47.11. The first is aliased with the asterisk (*)
character. This character tells SQL Server to generate the data for that column inline, as
text. (Using the text() node test would do the same in this case.)
Next, the comment() node test is specified for Name, telling the XML generator to output its
value in a comment. For clarity’s sake, we added a little syntactic sugar in this statement
by prepending the text 'Comment: Name = ' to the value produced inside the comment.
Next, the processing-instruction() node test is specified to output each value of
ModifiedDate to a new processing instruction called ModDatePI.
Finally, the fourth column is produced as a list of WorkOrderId values, using the
data() column name in a nested FOR XML PATH statement. data() tells SQL Server to generate a
space-delimited list of atomic column values, one value for each row in the result set.
Note that the nested query is merely used to generate a list of WorkOrderId values. The
empty string is given for the PATH keyword, telling the XML engine not to generate a
default row element at all, so the nested query returns a bare list of values with no XML
markup. You can extract and test the statement to see this in action.
The nested query applies the same WHERE clause as its parent to filter WorkOrderId values
where the value of ScrapReasonId is 12. This keeps the nested data relevant to
the outer query.
The resulting list of values is grafted to the XML of the outer statement, using the column
alias 'WorkOrders/@WorkOrderIds'.
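The effect of data() and the empty PATH('') argument is easier to see in isolation. The following is a minimal sketch rather than the chapter's listing: the @WorkOrder table variable and its four rows are made up for illustration, so it runs without AdventureWorks.

```sql
-- Hypothetical stand-in for Production.WorkOrder (illustrative rows only).
DECLARE @WorkOrder TABLE (WorkOrderId int, ScrapReasonId int);
INSERT @WorkOrder VALUES (72370, 12), (72273, 12), (70875, 12), (100, 5);

SELECT
    12 AS '*',                          -- inlined as a text node
    (
        -- Naming the column data() makes each value atomic; PATH('')
        -- suppresses the row element, leaving a space-delimited list.
        SELECT WorkOrderId AS 'data()'
        FROM @WorkOrder
        WHERE ScrapReasonId = 12
        ORDER BY WorkOrderId DESC
        FOR XML PATH('')
    ) AS 'WorkOrders/@WorkOrderIds'     -- grafted on as an attribute value
FOR XML PATH('ScrapReason');
```

Run as written, this should return a single ScrapReason element whose WorkOrders child carries the three matching IDs in one attribute. Running the inner SELECT on its own shows the bare list.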
FOR XML and the xml Data Type
By default, the results of any FOR XML query (in any of the four modes) are streamed to output
as a one-column/one-row dataset with a column named
XML_F52E2B61-18A1-11d1-B105-00805F49916B of type nvarchar(max). (In SQL Server 2000, this
was a stream of XML split into multiple varchar(8000) rows.)
One of the biggest limitations of SQL Server 2000’s XML production was the inability to
save the results of a FOR XML query to a variable or store it in a column directly without
using some middleware code to first save the XML as a string and then insert it back into
an ntext or nvarchar column and then select it out again.
Today, SQL Server 2008 natively supports column storage of XML, using the xml data
type. Be sure to read the section “Using the xml Data Type,” later in this chapter, for a
complete overview.
You can easily convert FOR XML results to instances of xml by using the TYPE directive with
all four modes (RAW, AUTO, EXPLICIT, and PATH). Listing 47.12 demonstrates the use of FOR
XML PATH with the TYPE directive.
LISTING 47.12 Using FOR XML PATH, TYPE to Create an Instance of the xml Data Type
SELECT *
FROM Production.WorkOrder WorkOrder
WHERE ScrapReasonId = 12
AND WorkOrderId = 72370
FOR XML RAW('WorkOrder'), ELEMENTS XSINIL, ROOT('WorkOrders'), TYPE
go
<WorkOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<WorkOrder>
<WorkOrderID>72370</WorkOrderID>
<ProductID>329</ProductID>
<OrderQty>48</OrderQty>
<StockedQty>47</StockedQty>
<ScrappedQty>1</ScrappedQty>
<StartDate>2008-07-01T00:00:00</StartDate>
<EndDate>2008-07-11T00:00:00</EndDate>
<DueDate>2008-07-12T00:00:00</DueDate>
<ScrapReasonID>12</ScrapReasonID>
<ModifiedDate>2008-07-11T00:00:00</ModifiedDate>
</WorkOrder>
</WorkOrders>
Notice that in contrast to the preceding FOR XML examples, in this example, the query
window in SQL Server Management Studio (SSMS) no longer displays the lengthy XML
column UUID in the results frame, nor on the window tab. The results have been cast to a
single instance of the xml data type, ready to be assigned to variables of type xml, used in
subsequent queries, inserted into xml columns, or returned to the client.
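As a minimal sketch of what "ready for use in variables" means in practice, the following assigns the result of a FOR XML ... TYPE query to an xml variable and then reads a value back out with the value() method. The literal values stand in for a real table row and are assumptions made for illustration.

```sql
DECLARE @WorkOrderXml xml;

-- Because of TYPE, the subquery yields an xml instance rather than a
-- string, so it can be assigned directly to the variable.
SET @WorkOrderXml =
(
    SELECT 72370 AS WorkOrderID, 12 AS ScrapReasonID
    FOR XML RAW('WorkOrder'), ELEMENTS, ROOT('WorkOrders'), TYPE
);

-- The variable can now be queried with the xml data type methods.
SELECT @WorkOrderXml.value('(/WorkOrders/WorkOrder/WorkOrderID)[1]', 'int')
       AS WorkOrderID;
```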
The five xml data type methods (value(), exist(), nodes(), query(), and modify(),
discussed later in this chapter, in the section “The Built-in xml Data Type Methods”) can
be intermixed with relational queries by using all FOR XML modes. This makes it even
easier to shape your XML exactly the way you want.
Listing 47.13 demonstrates how you can nest XQuery queries inside regular FOR XML
T-SQL to produce XML documents built from both relational and XML sources.
LISTING 47.13 Bridging the Gap Between Relational and XML Data by Using FOR XML PATH
and the xml Data Type
SELECT
FirstName,
LastName,
E.JobTitle,
Resume.query(
'declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
//ns:Education
'
) '*'
FROM HumanResources.Employee E
JOIN Person.Person C on E.BusinessEntityID = C.BusinessEntityID
JOIN HumanResources.JobCandidate J on J.BusinessEntityID = E.BusinessEntityID
WHERE J.JobCandidateId = 8
FOR XML PATH('AWorthyJobCandidate'), TYPE
go
<AWorthyJobCandidate>
<FirstName>Peng</FirstName>
<LastName>Wu</LastName>
<JobTitle>Quality Assurance Supervisor</JobTitle>
<ns:Education
xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume">
<ns:Edu.Level> </ns:Edu.Level>
<ns:Edu.StartDate>1986-09-15Z</ns:Edu.StartDate>
<ns:Edu.EndDate>1990-05-15Z</ns:Edu.EndDate>
<ns:Edu.Degree>Bachelor of Science</ns:Edu.Degree>
<ns:Edu.Major> </ns:Edu.Major>
<ns:Edu.Minor />
<ns:Edu.GPA>3.3</ns:Edu.GPA>
<ns:Edu.GPAScale>4</ns:Edu.GPAScale>
<ns:Edu.School>Western University</ns:Edu.School>
<ns:Edu.Location>
<ns:Location>
<ns:Loc.CountryRegion>US </ns:Loc.CountryRegion>
<ns:Loc.State>WA </ns:Loc.State>
<ns:Loc.City>Seattle</ns:Loc.City>
</ns:Location>
</ns:Edu.Location>
</ns:Education>
</AWorthyJobCandidate>
In this example, the asterisk (*) is used as a column alias for the results of the nested
query (on HumanResources.JobCandidate.Resume), telling SQL Server to simply output the
XML inline with the other nodes.
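The same inlining can be tried without the AdventureWorks tables. In this minimal sketch, the @Resume variable and its content are hypothetical stand-ins for the Resume column:

```sql
-- Hypothetical XML value standing in for JobCandidate.Resume.
DECLARE @Resume xml =
    '<Education><Degree>Bachelor of Science</Degree></Education>';

SELECT
    'Peng' AS FirstName,
    -- Aliasing the xml result as * places the nodes directly under the
    -- row element instead of wrapping them in a named element.
    @Resume.query('/Education') AS '*'
FOR XML PATH('AWorthyJobCandidate'), TYPE;
```

Replacing the * alias with an ordinary name, such as Resume, should wrap the Education element in an element of that name instead.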
XML As Relational Data: Using OPENXML
This section covers what might be called the inverse of FOR XML: OPENXML. You use
OPENXML in T-SQL queries to read XML data and shred (or decompose) it into relational
result sets. OPENXML is part of the SELECT statement, and you use it to generate a table
from an XML source.
The first step required in this process is a call to the system stored procedure
sp_xml_preparedocument. sp_xml_preparedocument creates an in-memory representation
of any XML document tree for use in querying. It takes the following parameters:
An integer output parameter for storing a handle to the document tree.
The XML input data.
An optional XML namespace declaration, used in subsequent OPENXML queries.
sp_xml_preparedocument is able to convert the following data types into internal XML
objects: text, ntext, varchar, nvarchar, single-quoted literal strings, and untyped XML
(data from an xml column having no associated schema collection). This is its syntax:
sp_xml_preparedocument integer_variable OUTPUT[, xmltext ][, xpath_namespaces ]
And here is an example of OPENXML in use:
DECLARE @XmlDoc XML, @iXml int
SET @XmlDoc = '
<ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
<ex:foo>hello</ex:foo>
<ex:bar>sql!</ex:bar>
</ex:ExampleDoc>'
EXEC sp_xml_preparedocument
@iXml OUTPUT,
@XmlDoc,
'<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>'
SELECT id, parentid, nodetype, localname, prefix
FROM OPENXML(@iXml, '/ex:ExampleDoc/ex:foo')
-- WITH (foo varchar(10) '/ex:ExampleDoc/ex:foo')
EXEC sp_xml_removedocument @iXml
go
id  parentid  nodetype  localname  prefix
--  --------  --------  ---------  ------
3   0         1         foo        ex
5   3         3         #text      NULL
Notice in the example that the WITH predicate has been commented out. This is to
illustrate in the query results what is known as an edge table: the XML document in its
relational form. Edge is a term taken from graph theory. It refers to what you might visualize
as a depth line between two nodes.
If the edge table looks familiar, the reason is probably that it bears a resemblance to the
universal table that must be created for EXPLICIT mode. As with the universal table, the
edge table follows the adjacency list model for its hierarchical relationships. The node
types of the input XML are marked in the nodetype column (1 = element, 2 = attribute,
3 = text). Namespaces are stored in namespaceuri, and the data of each node is stored in the
text column.
If you uncomment the WITH predicate and change the select list from the edge table
columns to SELECT foo, you get back a one-row/one-column table with a column called foo
that has the varchar(10) value hello. This shows that the WITH predicate instructs OPENXML
how to decompose the nodes to columns by using XPath syntax.
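For reference, here is a sketch of that variant as one batch. It reuses the document from the example above. The one change is an assumption made for brevity: the column pattern is written as '.' (the current row node) rather than repeating the absolute path.

```sql
DECLARE @XmlDoc xml, @iXml int;
SET @XmlDoc = '
<ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
  <ex:foo>hello</ex:foo>
  <ex:bar>sql!</ex:bar>
</ex:ExampleDoc>';

EXEC sp_xml_preparedocument
    @iXml OUTPUT,
    @XmlDoc,
    '<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>';

-- With the WITH predicate active, OPENXML returns the declared columns
-- instead of the edge table.
SELECT foo
FROM OPENXML(@iXml, '/ex:ExampleDoc/ex:foo')
WITH (foo varchar(10) '.');

EXEC sp_xml_removedocument @iXml;
```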
The syntax for OPENXML (including the WITH predicate) is as follows:
OPENXML(integer_document_handle_variable int, rowpattern nvarchar,[flags byte])
[WITH (SchemaDeclaration | TableName)]
Let’s match this syntax with the values in the example:
The first parameter is the local variable @iXml, which acts as a handle to the internal
XML representation.
The next parameter is a row pattern in XPath syntax that tells OPENXML how to select
nodes into rows. OPENXML generates one row in the result set for each node that
matches this row pattern. This is similar to the .NET XmlDocument object’s
SelectNodes() method, insofar as every node matching rowpattern returns a row
in the rowset.
The result set’s columns are then defined, using matching nodes as the context and
the XPath in the column definitions of the WITH predicate to find the values relative
to the node.
The flags parameter is a combinable byte value that controls how the selected XML
nodes are to be decomposed. The following values are possible:
0: Uses attribute-centric decomposition. In this case, each attribute in the
source XML is decomposed into a column. This is the default.
1: Uses attribute-centric decomposition. May be combined with flag 2 (that is,
the value 3 may be specified). Combining flags 1 and 2 tells the rowset
generator how to deal with the values in the XML not yet accounted for in the
forward parse of the XML document from nodes into rows. In other words,
attribute-centric decomposition takes place before element-centric
decomposition. This point is important because without the combinability of the flags,
only one or the other decomposition will happen, and (lacking a WITH
predicate that captures all the nodes) some nodes would not make it into the rowset.
2: Uses element-centric decomposition. Combinable with flag 1 (that is,
specify 3).
8: Tells the rowset generator how to deal with text data in the metaproperties
(not covered in this chapter). Can be combined with flags 1, 2, or both.
Note that the column generation determined by the flags 0, 1, and 2 can be overridden
by the XPath expressions in the lines of the WITH predicate. For example, if the 1
flag is specified to map a particular attribute to a column, but in the line of the WITH
predicate for that same column, the XPath maps the value from an XML element, the WITH
predicate takes precedence. In most cases, it’s best to just set the value of flags to 3,
unless you care to ignore attributes or elements for some reason.
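The difference between the flag values is easiest to see on a document that mixes attributes and child elements. The following is a minimal sketch; the Orders document and its Id and Qty names are invented for illustration.

```sql
DECLARE @iXml int;

-- Hypothetical input: Id is an attribute, Qty is a child element.
EXEC sp_xml_preparedocument @iXml OUTPUT,
    '<Orders>
       <Order Id="1"><Qty>5</Qty></Order>
       <Order Id="2"><Qty>7</Qty></Order>
     </Orders>';

-- Flag 1 (attribute-centric only): Id is populated, Qty comes back NULL.
SELECT Id, Qty
FROM OPENXML(@iXml, '/Orders/Order', 1)
WITH (Id int, Qty int);

-- Flag 3 (1 + 2): attributes are mapped first, then elements fill the
-- columns not yet accounted for, so both Id and Qty are populated.
SELECT Id, Qty
FROM OPENXML(@iXml, '/Orders/Order', 3)
WITH (Id int, Qty int);

EXEC sp_xml_removedocument @iXml;
```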
The syntax of the WITH predicate tells the rowset generator which column names and data
types to use when mapping the XML to rows. If the structure of the input XML matches
the schema of a particular table in your database, the name of that table may be specified.
An example of this case occurs when the input XML has been produced from an existing
table, using FOR XML. The values in the FOR XML-produced document have been updated,
and the new values need to make it back into the table. The following code example
illustrates this common scenario:
DECLARE @JobCandidateXmlDoc XML, @iXml int
SET @JobCandidateXmlDoc = '
<JobCandidateUpdate>
<ModifiedDate>
10/5/2008 12:34PM
</ModifiedDate>
</JobCandidateUpdate>'
EXEC sp_xml_preparedocument
@iXml OUTPUT,
@JobCandidateXmlDoc,
'<JobCandidateUpdate
xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume"/>';
UPDATE HumanResources.JobCandidate
SET ModifiedDate = OXML.ModifiedDate
FROM
(
SELECT *
FROM OPENXML(@iXml, '/JobCandidateUpdate', 2)
WITH HumanResources.JobCandidate
) AS OXML
WHERE JobCandidateId = 8
EXEC sp_xml_removedocument @iXml
go
(1 row(s) affected)
If a table name is not specified, you need to specify a comma-separated list of lines, using
the following syntax:
column_name datatype 'XPath'
The following list explains each part of the preceding syntax:
are to be mapped to the XML-produced column
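A schema declaration of this form might look like the following sketch. The document, the column names, and the Region attribute are all hypothetical. Each XPath is evaluated relative to the node matched by the row pattern, so an attribute, a child element, and a value from the parent node can all be mapped in one WITH predicate.

```sql
DECLARE @iXml int;

-- Hypothetical input document for illustration.
EXEC sp_xml_preparedocument @iXml OUTPUT,
    '<Orders Region="West">
       <Order Id="1"><Qty>5</Qty></Order>
     </Orders>';

SELECT OrderId, Quantity, Region
FROM OPENXML(@iXml, '/Orders/Order')
WITH (
    OrderId  int         '@Id',        -- attribute of the row node
    Quantity int         'Qty',        -- child element of the row node
    Region   varchar(10) '../@Region'  -- attribute of the parent node
);

EXEC sp_xml_removedocument @iXml;
```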
When you’re done reading out the XML, it’s important to free the memory used to hold
the internal XML document. You accomplish this by calling the system stored procedure
sp_xml_removedocument, as in the following example:
EXEC sp_xml_removedocument @iXml
The xml data type is a real problem solver for those who use both XML and SQL Server on
a daily basis. Relational columns and XML data can be stored side by side in the same
table, in an implementation that plays to the strengths of both. With SQL Server’s
powerful XML storage, validation, querying, and indexing capabilities, it’s bound to cause quite
a stir in the field of XML content management and beyond.
Some of the benefits of storing XML on the database tier can be realized immediately.
Building middleware using the .NET Framework to manage XML stored in columns
is a far more robust solution than depending on the filesystem;
plus, it’s a lot easier to access the content from anywhere.
SQL Server inherently provides to stored XML the traditional DBMS benefits of backup
and restoration, replication and failover, query optimization, granular locking, indexing,
and content validation. The xml data type can be used with local variable declarations, as
the output of user-defined functions, as input parameters to stored procedures and
functions, and much more. XML instances containing up to 128 levels of nesting can be stored
in xml columns; deeper instances cannot be inserted, nor may existing instances be made
to increase beyond this depth via the modify() data type method.
xml columns can also be used to store code files such as XSLT, XSD, XHTML, and any
other well-formed content. These files can then be retrieved by user-defined functions
written in managed code hosted by SQL Server. (See Chapter 53, “SQL Server 2008
Reporting Services,” for a full review of SQL Server–managed hosting.)
NOTE
In some cases, it’s still a perfectly valid scenario to store XML on the filesystem or in
[n]varchar(max), [n]text, or [n]varbinary(max) columns. In a few cases this
usage is actually recommended. The following summary details some possible XML
usage scenarios and makes suggestions for each.
XML data is stored in an internal binary format and can be up to 2GB in size.
Before we dig into the many uses of the xmldata type, it’s worthwhile to consider some of
the different ways you can leverage your institution’s XML with SQL Server:
XML can be used solely as a temporary output format produced from relational data,
using FOR XML. This applies in scenarios in which the relational tables hold the
real-time data and XML is produced only for read-only application uses, as in the display
of dynamic web pages. In this scenario, the XML really just provides a
DBMS-independent, easy-to-transform view of the data.
XML can be stored in relational (nvarchar and so on) columns, as done previously.
This might be the best option when your XML is sometimes not well formed or
when the learning curve to XQuery is too high for an application-delivery time
frame. This is also a valuable option when the byte-for-byte exactness of the XML
must be preserved.
Note that the latter is a necessary option in some institutions because typed XML
(that is, xml data type columns associated with a schema collection) storage
disregards extra whitespace characters, namespace prefixes, attribute order, and the XML
declaration to make way for query optimizations. This scenario also leverages fast
data retrieval because, as far as SQL Server is concerned, XML is never brought into
the mix (it’s all relational). The data can still be converted to the xml data type,
using the methods described earlier, and applications can use OPENXML to read it as
well. To read XML into SQL Server from server-side accessible files, you call the
T-SQL OPENROWSET function.
The XML can be stored as untyped XML, that is, XML stored in an xml data type
column lacking an associated schema collection. This provides the benefits of
querying the XML using the data type methods (discussed later in the section “The
Built-in xml Data Type Methods”) and provides server-side checks for well-formed XML.
This scenario also allows for the possibility that XML adhering to any (or no)
schemas may reside in the column. A schema collection could be added later to
provide validation on the existing data (although a few intermediate editing steps may
be necessary if any documents fail to validate).
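The contrast between the second and third scenarios can be sketched in a few lines. The variable names and sample strings below are assumptions made for illustration. A string column or variable accepts anything, whereas converting to untyped xml triggers the server-side well-formedness check:

```sql
-- Scenario 2: XML held as a string; no checks are applied on assignment.
DECLARE @Stored nvarchar(max) = N'<doc><item>1</item></doc>';
DECLARE @Broken nvarchar(max) = N'<doc><item>1</doc>';   -- mismatched tags

-- Scenario 3: converting to untyped xml parses the content, after which
-- the xml data type methods become available.
DECLARE @Untyped xml = CONVERT(xml, @Stored);
SELECT @Untyped.value('(/doc/item)[1]', 'int') AS Item;

-- The malformed string is expected to fail the conversion.
BEGIN TRY
    SET @Untyped = CONVERT(xml, @Broken);
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS ParseError;
END CATCH;
```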
Safely armed with an understanding of some of the different options and uses, let’s plunge
into our discussion of xml
Defining and Using xml Columns
You can add columns of type xml to any table by using a familiar Data Definition
Language (DDL) syntax, with a few new twists. Much like their relational counterparts,
xml columns, parameters, and variables may contain null or non-null values.
The following snippet shows the DDL used to create the table
HumanResources.JobCandidate from AdventureWorks2008. The column you are concerned
with is Resume:
CREATE TABLE [HumanResources].[JobCandidate](
[JobCandidateID] [int] IDENTITY(1,1) NOT NULL,
[EmployeeID] [int] NULL,
[Resume] [xml](CONTENT [HumanResources].[HRResumeSchemaCollection]) NULL,
[ModifiedDate] [datetime] NOT NULL
CONSTRAINT [DF_JobCandidate_ModifiedDate] DEFAULT (getdate()),
CONSTRAINT [PK_JobCandidate_JobCandidateID] PRIMARY KEY CLUSTERED
(
[JobCandidateID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
When you are defining objects of type xml, either of two facets may be applied:
CONTENT: This facet specifies that well-formed XML documents as well as fragments
may be inserted into the xml column or variable. (CONTENT is the default and may be
omitted from the definition.)
Fragments may have more than one top-level node (as is produced, by default, using
FOR XML), and elements may be mixed with text-only nodes.
DOCUMENT: This facet specifies that only well-formed, valid XML conforming to a
specified schema collection may be stored. Updates to the column must also result
in schema-valid, well-formed XML.
XML schema collections can be associated with xml variables, parameters, or columns. The
name of the schema collection is specified directly after the chosen facet, as is done in
JobCandidate.Resume.
The following code example defines a typed xml local variable that allows only valid
Resume data to be stored in it:
DECLARE @ValidWellFormed xml (DOCUMENT HumanResources.HRResumeSchemaCollection)
Trying to insert the following well-formed but invalid document throws an error that says
the first (and only) ThisBlowsUp element in the document is not declared in any of the
schemas in HRResumeSchemaCollection:
SELECT @ValidWellFormed = '<ThisBlowsUp/>'
go
XML Validation: Declaration not found for element 'ThisBlowsUp'.
Location: /*:ThisBlowsUp[1]
When you change the facet to CONTENT (the default) and remove the schema association,
the following is possible:
DECLARE @WellFormed xml
SELECT @WellFormed = '<ThisWorks/>'
go
Command(s) completed successfully.
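The two facets can also be compared while keeping a schema association in place. The sketch below creates a throwaway schema collection; dbo.DemoSchemaCollection and its single root element are assumptions made for illustration and are not part of AdventureWorks. It then tries the same two-element fragment against each facet.

```sql
-- Hypothetical schema collection declaring a single string element.
CREATE XML SCHEMA COLLECTION dbo.DemoSchemaCollection AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="root" type="xs:string"/>
  </xs:schema>';
GO
DECLARE @TwoRoots nvarchar(100) = N'<root>a</root><root>b</root>';

-- CONTENT: a fragment with two top-level elements is accepted.
DECLARE @Content xml (CONTENT dbo.DemoSchemaCollection) = @TwoRoots;
SELECT @Content AS AcceptedFragment;

-- DOCUMENT: the same fragment is expected to be rejected, because a
-- document may have only one top-level element.
DECLARE @Document xml (DOCUMENT dbo.DemoSchemaCollection);
BEGIN TRY
    SET @Document = @TwoRoots;
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS ValidationError;
END CATCH;
GO
-- Clean up the demo collection.
DROP XML SCHEMA COLLECTION dbo.DemoSchemaCollection;
```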
When defining xml columns, you can specify defaults and constraints just as you do with
relational columns. Consider the following example:
CREATE TABLE XmlExample
(
XmlColumn xml NOT NULL DEFAULT CONVERT(xml,'<root/>',0)
)
This example creates an xml column called XmlColumn that starts out having an empty root
node. Notice how the string '<root/>' is converted to the xml type. This is actually not
necessary because conversions from literal strings and from varchar to xml are implicit.
The next example adds a table-level constraint to XmlColumn to make sure the root node
always exists. It depends on a scalar-valued user-defined function to do its validation work:
CREATE FUNCTION dbo.fn_XmlColumnNotNull
(
@XmlColumnValue xml
)
RETURNS bit
AS
BEGIN
RETURN @XmlColumnValue.exist('/root')
END
GO
CREATE TABLE XmlExample
(
XmlColumn xml NOT NULL DEFAULT CONVERT(xml,'<root/>',0)
)
GO
ALTER TABLE XmlExample WITH CHECK
ADD CONSTRAINT CK_XmlExample_HasRoot
CHECK (dbo.fn_XmlColumnNotNull(XmlColumn) = 1)
The following statement thus fails:
INSERT XmlExample SELECT '<foo/>'