XML Step by Step- P6 doc

note

You can use a namespace prefix to qualify the name of the element in which the namespace is declared, even though the prefix is used before it’s declared. In the example document, if you declared the cd namespace within a TITLE element (rather than within the COLLECTION element), you could still apply that prefix to the element name:

<cd:TITLE xmlns:cd="http://www.mjyOnline.com/cds">
   Violin Concerto in D
</cd:TITLE>
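As a rough illustration of the note above, the following sketch feeds that TITLE element to Python's standard xml.etree.ElementTree parser. The choice of Python and of this parser is an assumption made for illustration only; the book itself works with Internet Explorer. ElementTree reports a qualified name as {namespace-name}local-name.

```python
import xml.etree.ElementTree as ET

# The cd prefix is declared in the same start-tag that uses it.
xml_text = (
    '<cd:TITLE xmlns:cd="http://www.mjyOnline.com/cds">'
    'Violin Concerto in D'
    '</cd:TITLE>'
)

title = ET.fromstring(xml_text)

# ElementTree expands the prefix into the namespace name it stands for.
print(title.tag)
print(title.text)
```

The parser accepts the element, and the tag it reports carries the namespace name rather than the cd prefix, which is only a local shorthand.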

As an alternative to creating a namespace prefix and using it to explicitly qualify individual names, you can declare a default namespace within an element, which will apply to the element in which it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element. Listing 3-5 shows the XML document from Listing 3-4 but with the book namespace (http://www.mjyOnline.com/books) declared as a default namespace, so that it doesn’t have to be explicitly applied to each of the book-related elements. (You’ll find a copy of this listing on the companion CD under the filename Collection Default.xml.)

Collection Default.xml

<?xml version="1.0"?>
<!-- File Name: Collection Default.xml -->
<COLLECTION
   xmlns="http://www.mjyOnline.com/books"
   xmlns:cd="http://www.mjyOnline.com/cds">
   <ITEM Status="in">
      <TITLE>The Adventures of Huckleberry Finn</TITLE>
      <AUTHOR>Mark Twain</AUTHOR>
      <PRICE>$5.49</PRICE>
   </ITEM>
   <cd:ITEM>
      <cd:TITLE>Violin Concerto in D</cd:TITLE>
      <cd:COMPOSER>Beethoven</cd:COMPOSER>
      <cd:PRICE>$14.95</cd:PRICE>
   </cd:ITEM>
   <ITEM Status="out">
      <TITLE>Leaves of Grass</TITLE>
      <AUTHOR>Walt Whitman</AUTHOR>
      <PRICE>$7.75</PRICE>
   </ITEM>
   <cd:ITEM>
      <cd:TITLE>Violin Concertos Numbers 1, 2, and 3</cd:TITLE>
      <cd:COMPOSER>Mozart</cd:COMPOSER>
      <cd:PRICE>$16.49</cd:PRICE>
   </cd:ITEM>
   <ITEM Status="out">
      <TITLE>The Legend of Sleepy Hollow</TITLE>
      <AUTHOR>Washington Irving</AUTHOR>
      <PRICE>$2.95</PRICE>
   </ITEM>
   <ITEM Status="in">
      <TITLE>The Marble Faun</TITLE>
      <AUTHOR>Nathaniel Hawthorne</AUTHOR>
      <PRICE>$10.95</PRICE>
   </ITEM>
</COLLECTION>

Listing 3-5.

You declare a default namespace by assigning the namespace name to the reserved xmlns attribute. In the example document in Listing 3-5, this is done in the COLLECTION element start-tag:

<COLLECTION
   xmlns="http://www.mjyOnline.com/books"
   xmlns:cd="http://www.mjyOnline.com/cds">

As a result, the COLLECTION element and all nested elements within it that don’t have prefixes (namely, the book-related elements) belong to the namespace named http://www.mjyOnline.com/books. The CD-related elements all have the cd prefix, which explicitly assigns them to the cd namespace rather than the default namespace.
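The effect of the default namespace can be checked with a short script. This is a minimal sketch that assumes Python's xml.etree.ElementTree parser and uses a shortened copy of the Collection Default.xml markup; the helper names local_name and namespace_of are hypothetical and are not part of the book.

```python
import xml.etree.ElementTree as ET

# Shortened version of the Collection Default.xml listing.
doc = """<COLLECTION
    xmlns="http://www.mjyOnline.com/books"
    xmlns:cd="http://www.mjyOnline.com/cds">
  <ITEM Status="in">
    <TITLE>The Adventures of Huckleberry Finn</TITLE>
  </ITEM>
  <cd:ITEM>
    <cd:TITLE>Violin Concerto in D</cd:TITLE>
  </cd:ITEM>
</COLLECTION>"""


def local_name(element):
    """Return the element name without its {namespace} part."""
    return element.tag.split("}")[-1]


def namespace_of(element):
    """Return the namespace name of an element, or None if it has none."""
    tag = element.tag
    return tag[1:].split("}")[0] if tag.startswith("{") else None


root = ET.fromstring(doc)

# One (name, namespace) pair per element, in document order.
pairs = [(local_name(el), namespace_of(el)) for el in root.iter()]
for name, namespace in pairs:
    print(name, "->", namespace)

# Unprefixed attributes are not placed in the default namespace.
print(root[0].attrib)
```

The unprefixed COLLECTION, ITEM, and TITLE elements report the books namespace, while the cd-prefixed elements report the cds namespace. The Status attribute keeps a plain name because a default namespace declaration applies to element names, not to attribute names.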

You can override the default namespace within a nested element by assigning a different value to xmlns within that element. For instance, in the example document in Listing 3-5, if you defined an ITEM element for a CD as follows, the ITEM element and all elements within it would not belong to a namespace. (If you assign an empty string to xmlns, all nonprefixed elements within the scope of the assignment are considered not to belong to a namespace.)
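The empty-string override can be sketched as follows. The ITEM markup below is a hypothetical example written for this illustration (the book's own example is not reproduced in this excerpt), and Python's xml.etree.ElementTree parser is again an assumed stand-in for an XML processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the first ITEM cancels the default namespace
# by assigning an empty string to xmlns.
doc = """<COLLECTION xmlns="http://www.mjyOnline.com/books">
  <ITEM xmlns="">
    <TITLE>Violin Concerto in D</TITLE>
  </ITEM>
  <ITEM>
    <TITLE>Leaves of Grass</TITLE>
  </ITEM>
</COLLECTION>"""

root = ET.fromstring(doc)

# Element names in document order; a leading {...} part means the
# element belongs to that namespace.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

The first ITEM and its TITLE are reported with plain names, meaning they belong to no namespace. The second ITEM falls outside the scope of the override and still belongs to the books namespace.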

Chapter 3 Creating Well-Formed XML Documents 77

■ When you create an XSLT style sheet, as described in Chapter 12,

you use a standard set of elements that belong to the namespace

named http://www.w3.org/1999/XSL/Transform.

note

For more information on using namespaces in XML, see the topic “Using Namespaces in Documents” in the Microsoft XML SDK 4.0 help file, or the same topic in the XML SDK documentation provided by the MSDN (Microsoft

Developer Network) Library on the Web at http://msdn.microsoft.com/library.

You’ll find the official W3C XML namespace specification on the Web at

http://www.w3.org/TR/REC-xml-names/

Characters, Encoding, and Languages

The characters you can enter into an XML document are tab, carriage-return, line feed, and any of the legal characters belonging to the Unicode character set (or the equivalent ISO/IEC 10646 character set), which includes characters for all the world’s written languages. (For more information on these character sets and the specific characters you can use in XML, see the section “2.2 Characters” in the XML specification at http://www.w3.org/TR/REC-xml.)

An XML file can represent, or encode, the Unicode characters in different ways. For example, if the file uses the encoding scheme known as UTF-8, it represents a capital A as the number 65 stored in 8 bits (41 in hexadecimal). However, if it uses the encoding scheme known as UTF-16, it represents a capital A as the number 65 stored in 16 bits (0041 in hexadecimal).
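The byte values mentioned above can be reproduced with Python's built-in string encoding. This is only an illustration; the variable names are arbitrary, and the big-endian form of UTF-16 is chosen so that the bytes appear in the same order as the hexadecimal number in the text.

```python
# Encode a capital A in both schemes and show the bytes in hexadecimal.
utf8_a = "A".encode("utf-8")
utf16_a = "A".encode("utf-16-be")  # big-endian, without a byte order mark

print(utf8_a.hex())   # one byte
print(utf16_a.hex())  # two bytes

# A non-ASCII character such as n-tilde needs more than one byte in UTF-8.
utf8_n_tilde = "\u00f1".encode("utf-8")
print(utf8_n_tilde.hex())
```

Characters outside the ASCII range take two or more bytes in UTF-8, which is why the choice of encoding starts to matter once a document contains them.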

If you save your XML document in a plain text format using Notepad or another text or programming editor, and if you use only the standard ASCII characters (characters numbered 1 through 127 in the Unicode character set, which are the common characters you can directly enter using an English language keyboard), it’s unlikely that you’ll have to worry about encoding. That’s because an XML processor will assume that the file uses the UTF-8 encoding scheme, and in a plain text file ASCII characters (and only ASCII characters) are normally encoded in conformance with the UTF-8 scheme.

Suppose, however, that you want to be able to type characters that aren’t

in the ASCII set directly into your element character data or your attribute values, such as the á and ñ in the following element:

<AUTHOR>Vicente Blasco Ibáñez</AUTHOR>

In this case, you must do two things:

1. Make sure that the XML file is encoded using a scheme that the XML processor can understand. All conforming XML processors must be able to handle UTF-8 and UTF-16 encoded files, so try to use one of these schemes. Some XML processors, however, support additional encoding schemes you can use.

To create your XML document, you must use a word processor or other program that can create text files in which all characters are uniformly encoded in a supported scheme. For example, you can create a UTF-8 encoded XML document by opening or creating it in Microsoft Word 2002, and then saving the file by choosing the Save As command from the File menu, selecting Plain Text (*.txt) in the Save As Type drop-down list in the Save As dialog box, clicking the Save button, and then in the File Conversion dialog box selecting the Unicode (UTF-8) encoding scheme. (In Word 2000, you need to select Encoded Text (*.txt) in the Save As Type drop-down list rather than Plain Text (*.txt).)

The Microsoft Notepad editor supplied with some versions of Windows also lets you select the encoding scheme when you save a file.

2. If your XML document is encoded in a scheme other than UTF-8 or UTF-16, you must specify the name of the scheme by including an encoding declaration in the XML declaration, immediately following the version information. For example, the following encoding declaration indicates that the file is encoded using the ISO-8859-1 scheme:

   <?xml version="1.0" encoding="ISO-8859-1" ?>

(If you also include a standalone document declaration, as described in the sidebar “The standalone Document Declaration” on page 159, it must go after the encoding declaration.) If the XML processor can’t handle the specified encoding scheme, it will generate a fatal error. Also, if your XML document references an external DTD subset (described in Chapter 5) or an external parsed entity (described in Chapter 6), and if the file containing the subset or entity uses an encoding scheme other than UTF-8 or UTF-16, you must include a text declaration at the very beginning of the file. A text declaration is similar to an XML declaration, except that the version information is optional, the encoding declaration is mandatory, and it can’t include a standalone document declaration. Here’s an example:

   <?xml version="1.0" encoding="ISO-8859-1" ?>

(In an external parsed entity, the text declaration is not part of the entity’s replacement text that gets inserted by an entity reference.)
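The two steps above can be tried out in a few lines. The sketch below assumes Python's xml.etree.ElementTree parser as the XML processor and builds the ISO-8859-1 bytes in memory rather than saving a file from a word processor; both choices are made only for illustration.

```python
import xml.etree.ElementTree as ET

body = "<AUTHOR>Vicente Blasco Ib\u00e1\u00f1ez</AUTHOR>"

# Step 1: encode the document uniformly in one scheme (here ISO-8859-1).
# Step 2: name that scheme in the encoding declaration.
declared = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n' + body
).encode("iso-8859-1")

author = ET.fromstring(declared)
print(author.text)

# The same bytes without an encoding declaration are treated as UTF-8,
# and the ISO-8859-1 bytes for the accented letters are not valid UTF-8.
undeclared = body.encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    undeclared_parsed = True
except ET.ParseError as error:
    undeclared_parsed = False
    print("Parse failed:", error)
```

With the declaration in place the accented characters survive intact; without it the parser rejects the document, which is the fatal error the processor is required to report for input it cannot decode.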

You can also insert non-ASCII characters into any XML document, regardless of its encoding, by using character references as discussed in “Inserting Character References” on page 153.
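As a brief sketch of that alternative, the document below contains only ASCII characters and spells the two accented letters as decimal character references (225 for á and 241 for ñ). Python's xml.etree.ElementTree parser is assumed here purely for demonstration.

```python
import xml.etree.ElementTree as ET

# Only ASCII characters appear in the document text itself.
ascii_only = "<AUTHOR>Vicente Blasco Ib&#225;&#241;ez</AUTHOR>"

author_from_refs = ET.fromstring(ascii_only)
print(author_from_refs.text)
```

The parser replaces each reference with the character it names, so the application sees the same text as it would from a document that contained the characters directly.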

The XML specification’s support for the Unicode character set allows you to freely include characters belonging to any written language. It might also be important to tell the application that handles your document the specific language used for the text in a particular element. For example, the application might need to know the language of the text in order to display it properly on the screen or to check its spelling. XML reserves an attribute named xml:lang for this purpose. (The xml: indicates that this attribute belongs to the xml namespace. Because this namespace is predefined, you don’t have to declare it. See “Using Namespaces” on page 69.) To specify the language of the text in a particular element (the text in the element’s character data as well as its attribute values), include an xml:lang attribute specification in the element’s start-tag, assigning it an identifier for the language, as in the following example elements:

<!-- This element contains U.S. English text: -->
<TITLE xml:lang="en-US">The Color Purple</TITLE>

<!-- This element contains British English text: -->
<TITLE xml:lang="en-GB">Colours I Have Known</TITLE>

<!-- This element contains generic English text: -->
<TITLE xml:lang="en">The XML Story</TITLE>

<!-- This element contains German text: -->
<TITLE xml:lang="de">Der Richter und Sein Henker</TITLE>

For a description of the official language identifiers you can assign to xml:lang, see the section “2.12 Language Identification” in the XML specification at http://www.w3.org/TR/REC-xml. The xml:lang attribute specification applies to the element in which it occurs and to any nested elements, unless it is overridden by another xml:lang attribute specification in a nested element. To indicate the language of the text throughout your entire document, just include xml:lang in the document element.

The xml:lang attribute doesn’t affect the behavior of the XML processor. The processor merely passes the attribute specification on to the application, which can use the value as appropriate. The XML specification doesn’t say how the xml:lang setting must be used.
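Because the processor only passes xml:lang along, working out which language applies to a given element is left to the application. The sketch below shows one way an application might do that, assuming Python's xml.etree.ElementTree parser; the COLLECTION wrapper and the function name effective_languages are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the predefined xml: prefix by its namespace name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

doc = """<COLLECTION xml:lang="en">
  <TITLE>The XML Story</TITLE>
  <TITLE xml:lang="de">Der Richter und Sein Henker</TITLE>
</COLLECTION>"""


def effective_languages(element, inherited=None):
    """Yield (text, language) pairs; a nested xml:lang overrides the outer one."""
    language = element.get(XML_LANG, inherited)
    if element.text and element.text.strip():
        yield element.text.strip(), language
    for child in element:
        yield from effective_languages(child, language)


root = ET.fromstring(doc)
languages = list(effective_languages(root))
for text, language in languages:
    print(language, ":", text)
```

The first TITLE inherits en from the document element, while the second overrides it with de. The parser itself treats both elements identically.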

When you get to Chapters 5 and 7 on creating valid documents, keep in mind that in a valid document the xml:lang attribute must be defined just like any other attribute. (This will make sense when you read those chapters.) For instance, in a DTD you could define this attribute as in the following example attribute-list declaration:

<!ATTLIST TITLE xml:lang NMTOKEN #REQUIRED>

Chapter 4: Adding Comments, Processing Instructions, and CDATA Sections

In this chapter, you’ll learn how to add three types of XML markup to your documents: comments, processing instructions, and CDATA sections. While these items aren’t required in a well-formed (or valid) XML document, they can be useful. You can use comments to make your document more understandable when read by humans. You can use processing instructions to modify the way an application handles or displays your document. And you can use CDATA sections to include almost any combination of characters within an element’s character data.

Inserting Comments

As you learned in Chapter 1, the sixth goal in the XML specification is that “XML documents should be human-legible and reasonably clear.” Well-placed and meaningful comments can greatly enhance the human readability and clarity of an XML document, just as comments can make program source code such as C or BASIC much more understandable. The XML processor ignores comment text, although it may pass the text on to the application.

Chapter 4 Adding Comments, Processing Instructions, and CDATA Sections 83

And you can place them within an element’s content:

<?xml version="1.0"?>
<DOCELEMENT>
<!-- This comment is part of the content of the root element -->
This is a very simple XML document.
</DOCELEMENT>

Here’s an example of a comment that’s illegal because it’s placed within markup:

<?xml version="1.0"?>
<DOCELEMENT <!-- This is an ILLEGAL comment! --> >
This is a very simple XML document.
</DOCELEMENT>
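Both comment placements can be tested against a parser. The following sketch assumes Python 3.8 or later and its xml.etree.ElementTree module, whose TreeBuilder accepts an insert_comments option. It shows a processor dropping a comment by default, passing it on when asked to, and rejecting a comment placed inside a tag.

```python
import xml.etree.ElementTree as ET

legal = (
    "<DOCELEMENT>"
    "<!-- This comment is part of the content of the root element -->"
    "This is a very simple XML document."
    "</DOCELEMENT>"
)
illegal = (
    "<DOCELEMENT <!-- This is an ILLEGAL comment! --> >"
    "This is a very simple XML document."
    "</DOCELEMENT>"
)

# By default this parser discards comments; only the character data remains.
without_comments = ET.fromstring(legal)
print(repr(without_comments.text))

# Asking the tree builder to keep comments passes them to the application.
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
with_comments = ET.fromstring(legal, parser=keeping_parser)
comment = with_comments[0]
print(repr(comment.text))

# A comment inside markup makes the document ill-formed.
try:
    ET.fromstring(illegal)
    illegal_parsed = True
except ET.ParseError as error:
    illegal_parsed = False
    print("Parse failed:", error)
```

Whether comments reach the application is therefore a choice made by the processor and its configuration, while the placement rules are enforced in every case.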

You can, however, place a comment within a document type definition (DTD), even though a DTD is part of markup, provided that it’s not within a markup declaration in the DTD. You’ll learn all about DTDs and how to place comments within them in Chapter 5.

Using Processing Instructions

For the most part an XML document doesn’t include information on how the data is to be formatted or processed. However, the XML specification does provide a form of markup known as a processing instruction that lets you pass information to the application that isn’t part of the document’s data. The XML processor itself doesn’t act on processing instructions, but merely hands the text to the application, which can use the information as appropriate.

note

Recall from Chapter 2 that the XML processor is the software module that reads and stores the contents of an XML document The application is a separate software module that obtains the document’s contents from the processor and then manipulates and displays these contents When you display XML in Internet Explorer, the browser provides both the XML processor and at least the front end of the application (If you write a script to manipulate and display an XML document, you are supplying part of the application yourself.)

The Form of a Processing Instruction

A processing instruction has the following general form:

<?target instruction ?>

Here, target is the name of the application to which the instruction is directed. Note that you can’t insert white space (that is, space, tab, carriage-return, or line feed characters) between the first question mark (?) in the processing instruction and target. Any name is allowable, provided it follows these rules:

■ The name must begin with a letter or underscore (_), followed by zero or more letters, digits, periods (.), hyphens (-), or underscores.

■ The target name xml, in any combination of uppercase or lowercase letters, is reserved. (As you’ve seen, you use xml in lowercase letters for the document’s XML declaration, which is a special type of processing instruction.) To avert possible conflicts with current or future reserved target names, you should also avoid beginning a target name with xml (in any combination of cases), although the Internet Explorer parser doesn’t prohibit the use of such names.

And instruction is the information passed to the application. It can consist of any sequence of characters, except the character pair ?> (which is reserved for terminating the processing instruction).
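The split between target and instruction is visible when a parser hands a processing instruction to the application. This is a minimal sketch assuming Python's xml.dom.minidom module; the ScriptA target and its instruction text are sample values only, and INVENTORY is used simply as a convenient document element.

```python
from xml.dom import minidom

document = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?ScriptA Category="books"?>'
    '<INVENTORY/>'
)

# Collect the processing instructions that sit at the top level of the document.
instructions = [
    node for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

first = instructions[0]
print("target:", first.target)
print("instruction:", first.data)
```

The parser does not interpret the instruction text; it reports it as an opaque string. Although Category="books" looks like an attribute, it is up to the application to read it that way.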

How You Can Use Processing Instructions

The particular processing instructions that will be recognized depend upon the application that will be handling your XML document. If you’re using Internet Explorer to display and work with your XML documents (as described throughout this book), you’ll find two main uses for processing instructions:

■ You can use standard, reserved processing instructions to tell Internet Explorer how to handle or display the document. An example you’ll see in this book is the processing instruction that tells Internet Explorer to display the document using a particular style sheet. For instance, the following processing instruction tells Internet Explorer to use the cascading style sheet (CSS) located in the file Inventory01.css:

<?xml-stylesheet type="text/css" href="Inventory01.css"?>

</BOOK>

</INVENTORY>

<!-- And here’s one following the document element: -->
<?ScriptA Category="books" Style="formal" ?>

Here’s an example of a processing instruction illegally placed within markup:

<!-- The following element contains an ILLEGAL
     processing instruction: -->
<BOOK <?ScriptA emphasize="yes" ?> >

<TITLE>Leaves of Grass</TITLE>

<AUTHOR>Walt Whitman</AUTHOR>

<BINDING>hardcover</BINDING>

<PAGES>462</PAGES>

<PRICE>$7.75</PRICE>

</BOOK>

You can, however, place a processing instruction within a document type definition (DTD), even though a DTD is part of markup, provided that it’s not within a markup declaration in the DTD. You’ll learn all about DTDs and how to place processing instructions within them in Chapter 5.

Including CDATA Sections

As you learned in Chapter 3, you can’t directly insert a left angle bracket (<) or an ampersand (&) as part of an element’s character data, because the XML parser would interpret either of these characters as the start of markup. One way to get around this restriction is to use a character reference (&#60; representing < or &#38; representing &) or a predefined general entity reference (&lt; representing < or &amp; representing &). You’ll learn about character and predefined general entity references in Chapter 6. However, if you need to insert many < or & characters, using these references is awkward and makes the data difficult for humans to read. In this case, it’s easier to place the text containing the restricted characters inside a CDATA section.
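The two approaches produce the same character data, which a short comparison makes clear. This sketch assumes Python's xml.etree.ElementTree parser; the CODE element name and its content are made up for the example, and the CDATA section uses the standard <![CDATA[ and ]]> delimiters.

```python
import xml.etree.ElementTree as ET

# The same text written three ways: entity references, character
# references, and a CDATA section.
with_entities = ET.fromstring(
    "<CODE>if (a &lt; b &amp;&amp; c &lt; d)</CODE>"
)
with_char_refs = ET.fromstring(
    "<CODE>if (a &#60; b &#38;&#38; c &#60; d)</CODE>"
)
with_cdata = ET.fromstring(
    "<CODE><![CDATA[if (a < b && c < d)]]></CODE>"
)

print(with_entities.text)
print(with_char_refs.text)
print(with_cdata.text)
```

All three elements deliver identical text to the application. The CDATA version is simply easier to read and write in the source document.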

The Form of a CDATA Section

A CDATA section begins with the characters <![CDATA[ and ends with the characters ]]>. Between these two delimiting character groups, you can type any characters except ]]>. You can freely include the often forbidden < and & characters. You can’t include ]]> because these characters would be interpreted as
