XML Step by Step, Part 7


The Advantages of Making an XML Document Valid

Creating a valid XML document might seem to be a lot of unnecessary bother: You must first fully define the document’s content and structure in a DTD or XML schema and then create the document itself, following all the DTD or schema specifications. It might seem much easier to just immediately add whatever elements and attributes you need, as you did in the examples of well-formed documents in previous chapters.

If, however, you want to make sure that your document conforms to a specific structure or set of standards, providing a DTD or XML schema that describes the structure or standards allows an XML processor to check whether your document is in conformance. In other words, a DTD or XML schema provides a standard blueprint to the processor so that in checking the validity of the document, it can enforce the desired structure and guarantee that your document meets the required standards. If any part of the document doesn’t conform to the DTD or XML schema specification, the processor can display an error message so that you can edit the document and make it conform.

Making an XML document valid also fosters consistency within that document. For example, a DTD or XML schema can force you to always use the same element type for describing a given piece of information (for instance, to always enter a book title using a TITLE element rather than a NAME element); it can ensure that you always assign a designated value to an attribute (for instance, hardcover rather than hardback); and it can catch misspellings or typos in element or attribute names (for instance, typing PHILUM rather than PHYLUM for an element name).
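As a minimal sketch of these consistency checks, the following document uses a DTD that declares a TITLE element (and no NAME element) and restricts an attribute to a fixed list of values. The Binding attribute name and the paperback value are assumptions made for illustration, and the enumerated attribute type shown here is not described in this section:

```xml
<?xml version="1.0"?>
<!DOCTYPE BOOK
[
<!-- Only TITLE is declared, so a NAME element or a misspelled
     element name would be reported as a validity error. -->
<!ELEMENT BOOK (TITLE)>
<!ELEMENT TITLE (#PCDATA)>
<!-- Binding is a hypothetical attribute; only the listed values are allowed. -->
<!ATTLIST BOOK Binding (hardcover | paperback) #REQUIRED>
]
>
<BOOK Binding="hardcover">
   <TITLE>The Scarlet Letter</TITLE>
</BOOK>
```

With this DTD, a validating processor would be expected to report an error if Binding were set to hardback or if the title were entered in a NAME element.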

Making XML documents valid is especially useful for ensuring uniformity among a group of similar documents. In fact, the XML standard defines a DTD as “a grammar for a class of documents.” Consider, for example, a Web publishing company that needs all its editors to create XML documents that conform to a common structure. Creating a single DTD or XML schema and using it for all documents can ensure that these documents uniformly comply with the required structure, and that editors don’t add arbitrary new elements, place information in the wrong order, assign the wrong data types to attributes, and so on. Of course, the document must be run through a processor that checks its validity.

Including a DTD or XML schema and checking validity is especially important if the documents are going to be processed by custom software (such as a Web page script) that expects a particular document content and structure. If all users of the software use a common appropriate DTD or XML schema for their XML documents, and if the documents are checked for validity, the users can be sure that their documents will be recognized by the processing software. For example, if a group of mathematicians are creating mathematical documents that will be displayed using a particular program, they could all include in their documents a common DTD that defines the required structure, elements, attributes, and other features.

In fact, most of the “real-world” XML applications listed at the end of Chapter 1, such as MathML, consist of a standard DTD or XML schema that all users of the application use with their XML documents, so that checking the documents for validity ensures that they conform to the application’s structure and will be recognized by any software designed for that application.

note

The Microsoft Internet Explorer processor will check a document for validity only if the document contains a document type declaration and you open the document through an HTML Web page (using the techniques you’ll learn in Chapters 10 and 11), or if you use an XML schema as explained in Chapter 11.

If you open an XML document, with or without a style sheet, directly in Internet Explorer (as you have done so far in this book and will do in Chapters 8, 9, and 12), the processor will check the entire document, including any document type declaration it contains, for well-formedness and will display a fatal error message for any infraction it encounters. However, the Internet Explorer processor will not check the document for validity, even if it contains a document type declaration.

To test a document with a DTD or XML schema for validity and to see messages for any well-formedness or validity errors the document contains, you can use one of the validity checking scripts (contained in HTML Web pages) that are given in “Checking an XML Document for Validity” on page 396. (These scripts are also provided on the companion CD.) You might want to read the instructions in that section now so that you can begin checking the validity of the XML documents you create.


Adding the Document Type Declaration

A document type declaration is a block of XML markup that you add to the prolog of a valid XML document. It can go anywhere within the prolog, outside of other markup, following the XML declaration. (Recall that if you include the XML declaration, it must be at the very beginning of the document.)

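[Figure: An XML document divided into its prolog and its document element, with callouts marking two positions within the prolog where the document type declaration can go.]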

A document type declaration defines the content and structure of the document. If you open a document without a document type declaration (or XML schema) in Internet Explorer, the Internet Explorer processor will merely check that the document is well-formed. If, however, you open a document with a document type declaration in Internet Explorer, the processor will, under certain circumstances, check the document for validity as well as for well-formedness, and your document must therefore conform to all declarations within the document type declaration. (See the note at the end of the previous section for a description of the circumstances under which Internet Explorer checks for validity.) You won’t, for example, be able to include any elements or attributes in the document that you haven’t declared in the document type declaration. And every element and attribute that you do include must match the specifications (such as the allowable content of an element or the permissible type of an attribute value) expressed in the corresponding declaration.
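To make the distinction concrete, here is a sketch of a document that is well-formed but not valid. The element type declarations are the ones used for the BOOK example later in this chapter; the PRICE element and its content are hypothetical and were added only to trigger a validity error:

```xml
<?xml version="1.0"?>
<!DOCTYPE BOOK
[
<!ELEMENT BOOK (TITLE, AUTHOR)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
]
>
<BOOK>
   <TITLE>The Scarlet Letter</TITLE>
   <AUTHOR>Nathaniel Hawthorne</AUTHOR>
   <!-- PRICE is a hypothetical element that is not declared in the DTD,
        so a validating processor would report a validity error here. -->
   <PRICE>12.75</PRICE>
</BOOK>
```

Every tag in this document is properly nested and closed, so it passes a well-formedness check. The problem would surface only when the document is checked for validity.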


Well-Formedness and Validity Constraints

Well-formedness constraints are a set of rules given in the XML specification that you must follow (in addition to the rules specified in the formal XML grammar) to create a well-formed document. Because an XML document must be well-formed, any violation of a well-formedness constraint or any other failure to achieve well-formedness is considered a fatal error. When the XML processor encounters a fatal error, it must stop normal processing of the document and not attempt to recover.

Validity constraints are a further set of rules in the XML specification that you must follow if you’ve chosen to create a valid document by defining a DTD. (They don’t apply if you’ve chosen to create a valid document using an XML schema.) Because validity is optional for an XML document, a violation of a validity constraint is considered only an error, as opposed to a fatal error. When a validating XML processor (that is, one that checks documents for validity) encounters an error, it can simply report the problem and attempt to recover from it. Validity constraints consist of specific rules for creating a proper document type declaration with its DTD, and for creating a document that conforms to the specifications within your DTD.

Declaring Element Types

In a valid XML document created using a DTD, you must explicitly declare the type of every element that you use in the document in an element type declaration within the DTD. An element type declaration indicates the name of the element type and the allowable content of the element (often specifying the order in which child elements can occur). Taken together, the element type declarations in the DTD map out the entire content and logical structure of the document. That is, the element type declarations indicate the element types that the document contains, the order of the elements, and the contents of these elements.

The Form of an Element Type Declaration

An element type declaration has the following general form:

<!ELEMENT Name contentspec>


Here, Name is the name of the element type being declared. (To review the rules for legal element names, see “The Anatomy of an Element” on page 53.) And contentspec is the content specification, which defines what the element can contain. The next section describes the different types of content specifications you can use.

The following is a declaration of an element type named TITLE, which is permitted to contain only character data (no child elements would be allowed):

<!ELEMENT TITLE (#PCDATA)>

And here’s a declaration for an element type named GENERAL, which can contain any type of content:

<!ELEMENT GENERAL ANY>

As a final example, here’s a complete XML document with two element types. The declaration of the COLLECTION element type indicates that it can contain one or more CD elements, and the declaration of the CD element type specifies that it can contain only character data. Notice that the document conforms to these declarations and is therefore valid:

<?xml version="1.0"?>
<!DOCTYPE COLLECTION
[
<!ELEMENT COLLECTION (CD)+>
<!ELEMENT CD (#PCDATA)>
<!-- You can also insert a comment in a DTD -->
]
>
<COLLECTION>
   <CD>Mozart Violin Concertos 1, 2, and 3</CD>
   <CD>Telemann Trumpet Concertos</CD>
   <CD>Handel Concerti Grossi Op 3</CD>
</COLLECTION>

note

You can declare a particular element type only once in a given document. For general information on redeclaring items in the DTD, see the sidebar “Redeclarations in a DTD” on page 148.


The Element’s Content Specification

You can specify the content of an element (that is, fill in the contentspec part of the element type declaration) in four different ways:

■ EMPTY content. You use the EMPTY keyword to indicate that the element must be empty, that is, that it cannot have content. Here’s an example:

<!ELEMENT IMAGE EMPTY>

The following would be valid IMAGE elements you could enter into your document:

<IMAGE></IMAGE>
<IMAGE />

■ ANY content. You use the ANY keyword to indicate that the element can have any legal content. That is, an element of this type can contain zero or more child elements of any declared type, in any order or number of repetitions, with or without interspersed character data. This is the most lax content specification, and creates an element type without content constraints. Here’s an example of a declaration:

<!ELEMENT MISC ANY>

■ Element content. With this type of content specification, the element can contain child elements of the indicated types, but can’t directly contain character data. I’ll describe this option in the next section.

■ Mixed content. With this type of content specification, the element can contain any quantity of character data. Also, if one or more child element types are specified in the declaration, the character data can be interspersed with any number of these child elements, in any order. I’ll describe this option later in this chapter.
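The four kinds of content specification can be seen side by side in one small document. The following is only a sketch: the ARTICLE, HEADLINE, BODY, and EMPH element names and all of the text content are hypothetical, while IMAGE and MISC reuse the declarations shown above:

```xml
<?xml version="1.0"?>
<!DOCTYPE ARTICLE
[
<!-- Element content: child elements only, in the order given -->
<!ELEMENT ARTICLE (HEADLINE, IMAGE, BODY, MISC)>
<!-- Mixed content consisting of character data only -->
<!ELEMENT HEADLINE (#PCDATA)>
<!-- EMPTY content: the element cannot have content -->
<!ELEMENT IMAGE EMPTY>
<!-- Mixed content: character data interspersed with EMPH elements -->
<!ELEMENT BODY (#PCDATA | EMPH)*>
<!ELEMENT EMPH (#PCDATA)>
<!-- ANY content: character data and any declared element types -->
<!ELEMENT MISC ANY>
]
>
<ARTICLE>
   <HEADLINE>Sample headline</HEADLINE>
   <IMAGE />
   <BODY>Plain text with <EMPH>emphasized</EMPH> words.</BODY>
   <MISC>Free-form text and an <EMPH>EMPH</EMPH> element.</MISC>
</ARTICLE>
```

Note that even under ANY, the child elements that appear must still belong to element types declared somewhere in the DTD, which is why MISC can hold EMPH here.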

Specifying Element Content

If an element has element content, it can directly contain only the specified child elements. The element cannot contain character data, except for white space characters used to separate the child elements and enhance readability (for example, you can display each child element on a separate line and indent them using space or tab characters). As always, the processor must pass the white space characters on to the application, but the application will typically ignore them. (For more details, and to learn about an exception, see the sidebar “White Space in Elements” on page 56.)

Consider the following example XML document, which describes a single book:

<?xml version="1.0"?>
<!DOCTYPE BOOK
[
<!ELEMENT BOOK (TITLE, AUTHOR)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
]
>
<BOOK>
   <TITLE>The Scarlet Letter</TITLE>
   <AUTHOR>Nathaniel Hawthorne</AUTHOR>
</BOOK>

In this document, the BOOK element type is declared to have element content. The (TITLE, AUTHOR) following the element name in the declaration is known as the content model. A content model indicates the allowed types of child elements and their order. In this example, the content model indicates that a BOOK element must have exactly one TITLE child element followed by exactly one AUTHOR child element.

A content model can have either of the following two basic forms:

■ Sequence. A sequence form of content model indicates that the element must contain a specific sequence of child element types. You separate the names of the child element types with commas. For example, the following DTD indicates that a MOUNTAIN document element must have one NAME child element, followed by one HEIGHT child element, followed by one STATE child element:

<!DOCTYPE MOUNTAIN
[
<!ELEMENT MOUNTAIN (NAME, HEIGHT, STATE)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT HEIGHT (#PCDATA)>
<!ELEMENT STATE (#PCDATA)>
]
>


Hence, the following document element would be valid:

<MOUNTAIN>
   <NAME>Wheeler</NAME>
   <HEIGHT>13161</HEIGHT>
   <STATE>New Mexico</STATE>
</MOUNTAIN>

The following document element, however, would be invalid because the order of the child element types isn’t as declared:

<MOUNTAIN> <!-- Invalid element! -->
   <STATE>New Mexico</STATE>
   <NAME>Wheeler</NAME>
   <HEIGHT>13161</HEIGHT>
</MOUNTAIN>

Omitting a child element type or including the same child element type more than once would also be invalid. As you can see, this is a very rigid form of declaration.

■ Choice. A choice form of content model indicates that the element can have any one of a series of possible child element types, which are separated using | characters. For example, the following DTD specifies that a FILM element can contain one STAR child element, or one NARRATOR child element, or one INSTRUCTOR child element:

<!DOCTYPE FILM
[
<!ELEMENT FILM (STAR | NARRATOR | INSTRUCTOR)>
<!ELEMENT STAR (#PCDATA)>
<!ELEMENT NARRATOR (#PCDATA)>
<!ELEMENT INSTRUCTOR (#PCDATA)>
]
>

Hence, the following document element would be valid:

<FILM>
   <STAR>Robert Redford</STAR>
</FILM>
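The two forms can also be nested by enclosing one content model in parentheses inside another, as the FILM declaration used later in this chapter does. The following is a sketch of a complete document built around that declaration; the film title and narrator name are placeholders, not taken from the text:

```xml
<?xml version="1.0"?>
<!DOCTYPE FILM
[
<!-- A sequence whose second member is a choice: one TITLE,
     followed by exactly one STAR, NARRATOR, or INSTRUCTOR. -->
<!ELEMENT FILM (TITLE, (STAR | NARRATOR | INSTRUCTOR))>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT STAR (#PCDATA)>
<!ELEMENT NARRATOR (#PCDATA)>
<!ELEMENT INSTRUCTOR (#PCDATA)>
]
>
<FILM>
   <TITLE>Sample Documentary</TITLE>
   <NARRATOR>Sample Narrator</NARRATOR>
</FILM>
```

Under this declaration, a FILM element that contained both a STAR and a NARRATOR, or that placed the TITLE last, would be invalid.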


<!ELEMENT TITLE (#PCDATA | SUBTITLE)*>
<!ELEMENT SUBTITLE (#PCDATA)>

The following are valid TITLE elements, conforming to this declaration:

<TITLE>Moby-Dick <SUBTITLE>Or, The Whale</SUBTITLE></TITLE>

<TITLE><SUBTITLE>Or, The Whale</SUBTITLE> Moby-Dick</TITLE>

<TITLE>Moby-Dick</TITLE>

<TITLE>
   <SUBTITLE>Or, The Whale</SUBTITLE>
   <SUBTITLE>Another Subtitle</SUBTITLE>
</TITLE>

<TITLE></TITLE>
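To try any of these elements in a validity checker, the two declarations need to sit inside a document type declaration. Here is a minimal sketch of a complete document, assuming TITLE serves as the document element (the declarations above do not say where TITLE appears in a larger document):

```xml
<?xml version="1.0"?>
<!DOCTYPE TITLE
[
<!-- Mixed content: #PCDATA is listed first, and the group ends with *. -->
<!ELEMENT TITLE (#PCDATA | SUBTITLE)*>
<!ELEMENT SUBTITLE (#PCDATA)>
]
>
<TITLE>Moby-Dick <SUBTITLE>Or, The Whale</SUBTITLE></TITLE>
```

Any of the other TITLE elements listed above could be substituted as the document element, and the document would remain valid.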

Declaring Attributes

In a valid XML document, you must also explicitly declare all attributes that you intend to use with the document’s elements. You define all the attributes associated with a particular element by using a type of DTD markup declaration known as an attribute-list declaration. This declaration does the following:

■ It defines the names of the attributes associated with the element. In a valid document, you can include in an element start-tag only those attributes defined for that element.

■ It specifies the data type of each attribute.

■ It specifies for each attribute whether that attribute is required. If the attribute isn’t required, the attribute-list declaration also indicates what the processor should do if the attribute is omitted. (The declaration might, for example, provide a default attribute value that the processor will pass to the application.)

note

You can declare elements and attributes in any order in a DTD. For example, you can place the attribute-list declaration for a particular element before the element type declaration for that element.
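As a brief sketch of this ordering freedom, the following variation on the COLLECTION document from earlier in this chapter places an attribute-list declaration ahead of the element type declarations. The Composer attribute is hypothetical and was added only for illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE COLLECTION
[
<!-- The attribute-list declaration for CD comes first... -->
<!ATTLIST CD Composer CDATA #REQUIRED>
<!-- ...and the element type declarations follow. -->
<!ELEMENT COLLECTION (CD)+>
<!ELEMENT CD (#PCDATA)>
]
>
<COLLECTION>
   <CD Composer="Mozart">Mozart Violin Concertos 1, 2, and 3</CD>
</COLLECTION>
```

Although the order is not significant to the processor, keeping each attribute-list declaration next to its element type declaration tends to make a longer DTD easier to read.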


The Form of an Attribute-List Declaration

An attribute-list declaration has the following general form:

<!ATTLIST Name AttDefs>

Here, Name is the type name of the element associated with the attribute or attributes. AttDefs is a series of one or more attribute definitions, each of which defines one attribute. (The order of the attribute definitions in the attribute-list declaration isn’t significant. You can always include the attribute specifications in an element start-tag in any order.)

An attribute definition has the following form:

Name AttType DefaultDecl

Here, Name is the name of the attribute. (To review the rules for legal attribute names, see “Rules for Creating Attributes” on page 63.) AttType is the attribute type, which is the kind of value that can be assigned to the attribute. (I’ll describe the attribute type in the next section.) And DefaultDecl is the default declaration, which indicates whether the attribute is required and provides other information. (I’ll describe the default declaration later in this chapter.)

Say, for example, that you’ve declared an element type named FILM like this:

<!ELEMENT FILM (TITLE, (STAR | NARRATOR | INSTRUCTOR))>

Here’s an example of an attribute-list declaration that declares two attributes, named Class and Year, for FILM elements:

<!ATTLIST FILM Class CDATA "fictional" Year CDATA #REQUIRED>

Here are the different parts of this declaration:

[Figure: An attribute-list declaration. Callouts label the name of the associated element (FILM), the first attribute definition (attribute name Class, attribute type CDATA, default declaration "fictional"), and the second attribute definition (attribute name Year, attribute type CDATA, default declaration #REQUIRED).]

You can assign to the Class attribute any legal quoted string (the CDATA keyword); if you omit the attribute from a particular element, it will automatically be assigned the default value fictional. You can assign to the Year attribute any legal quoted string; this attribute, however, must be assigned a value in every FILM element (the #REQUIRED keyword), and it therefore doesn’t have a default value.
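The following sketch puts the two FILM declarations into a complete document so that the effect of the default declarations can be checked. The declarations for TITLE, STAR, NARRATOR, and INSTRUCTOR are assumed to be character-data-only, and the film title and year are illustrative values:

```xml
<?xml version="1.0"?>
<!DOCTYPE FILM
[
<!ELEMENT FILM (TITLE, (STAR | NARRATOR | INSTRUCTOR))>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT STAR (#PCDATA)>
<!ELEMENT NARRATOR (#PCDATA)>
<!ELEMENT INSTRUCTOR (#PCDATA)>
<!ATTLIST FILM Class CDATA "fictional" Year CDATA #REQUIRED>
]
>
<!-- Class is omitted here, so the processor supplies the default value
     "fictional". Year is required and must always be given a value. -->
<FILM Year="1973">
   <TITLE>The Sting</TITLE>
   <STAR>Robert Redford</STAR>
</FILM>
```

Writing the start-tag as <FILM Class="documentary" Year="1973"> would override the default, whereas leaving out Year would be a validity error.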

Date posted: 03/07/2014, 07:20