XML Step by Step
Relative URLs in XML documents work just like relative URLs in HTML pages. For more details on exactly how they work, see “Using an External DTD Subset Only” on page 121.
The entity file contains the entity’s replacement text, which can include only items that can legally be inserted into an element (character data, nested elements, and so on, as described in “Types of Content in an Element” on page 54). As you’ll learn later in this chapter, you can ultimately insert a general external parsed entity only within an element’s content, and not within an attribute’s value.
note
In a general external parsed entity file, you can optionally include a text declaration in addition to the entity’s replacement text. The text declaration must come at the very beginning of the file. For information, see the sidebar “Characters, Encoding, and Languages” on page 77.
As an example, the following DTD defines the external file Topics.xml as a general external parsed entity:
<!DOCTYPE ARTICLE
[
<!ELEMENT ARTICLE (TITLEPAGE, INTRODUCTION, SECTION*)>
<!ELEMENT TITLEPAGE (#PCDATA)>
<!ELEMENT INTRODUCTION ANY>
<!ELEMENT SECTION (#PCDATA)>
<!ELEMENT HEADING (#PCDATA)>
<!ENTITY topics SYSTEM "Topics.xml">
]
>
Here are the contents of the Topics.xml file:
<HEADING>Topics</HEADING>
The Need for XML
The Official Goals of XML
Standard XML Applications
Real-World Uses for XML
Chapter 6: Defining and Using Entities
This particular external entity file contains two of the items that you can include in an XML element: a nested element and a block of character data. Its contents can be validly inserted within an INTRODUCTION element (which can have any type of content), as shown in this example:
<INTRODUCTION>
Here’s what this article covers:
&topics;
</INTRODUCTION>
The XML processor will replace the entity reference (&topics;) with the replacement text from the external entity file, and process the text just as if you had typed it into the document at the position of the reference, like this:
<INTRODUCTION>
Here’s what this article covers:
<HEADING>Topics</HEADING>
The Need for XML
The Official Goals of XML
Standard XML Applications
Real-World Uses for XML
</INTRODUCTION>
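As the earlier note mentions, an external parsed entity file can begin with a text declaration. Here is a minimal sketch of how Topics.xml might look with one added. The UTF-8 encoding label is only an assumption for illustration; use whatever encoding the file is actually saved in.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<HEADING>Topics</HEADING>
The Need for XML
The Official Goals of XML
Standard XML Applications
Real-World Uses for XML
```

The text declaration tells the processor how to read the file; it is not part of the replacement text, so the expanded INTRODUCTION element would look the same as before.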
Declaring a General External Unparsed Entity
A declaration for a general external unparsed entity has this form:
<!ENTITY EntityName SYSTEM SystemLiteral NDATA NotationName>
Here, EntityName is the name of the entity. You can select any name, provided that you follow the general entity naming rules given in “Declaring a General Internal Parsed Entity” earlier in this chapter.
SystemLiteral is a system identifier that describes the location of the file containing the entity data. It works the same way as the system identifier for describing the location of a general external parsed entity, which I explained in the previous section.
note
The keyword NDATA indicates that the entity file contains unparsed data. This keyword derives from SGML, where it stands for notation data.
NotationName is the name of a notation declared in the DTD. The notation describes the format of the data contained in the entity file or gives the location of a program that can process that data. I’ll explain notation declarations in the next section.
The general external unparsed entity file can contain any type of text or nontext data. It should, of course, conform to the format description provided by the specified notation.
For example, the DTD in the following XML document defines the file Faun.gif (which contains an image of a book cover) as a general external unparsed entity named faun. The name of this entity’s notation is GIF, which is defined to point to the location of a program that can display a graphics file in the GIF format (ShowGif.exe). The DTD also defines an empty element named COVERIMAGE, and an ENTITY type attribute for that element named Source:
<?xml version="1.0"?>
<!DOCTYPE BOOK
[
<!ELEMENT BOOK (TITLE, AUTHOR, COVERIMAGE)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT COVERIMAGE EMPTY>
<!ATTLIST COVERIMAGE Source ENTITY #REQUIRED>
<!NOTATION GIF SYSTEM "ShowGif.exe">
<!ENTITY faun SYSTEM "Faun.gif" NDATA GIF>
]
>
<BOOK>
<TITLE>The Marble Faun</TITLE>
<AUTHOR>Nathaniel Hawthorne</AUTHOR>
<COVERIMAGE Source="faun" />
</BOOK>
In the document element, the Source attribute of the COVERIMAGE element is assigned the name of the external entity that contains the graphics data for the cover image to be displayed. Because Source has the ENTITY type, you can assign it the name of a general external unparsed entity. In fact, the only way you can use this type of entity is to assign its name to an ENTITY or ENTITIES type attribute.
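The ENTITIES type works the same way, except that the attribute value is a list of entity names separated by white space. The following is only a sketch of that variation: the COVERIMAGES element, the Sources attribute, the two entity names, and the two GIF file names are all hypothetical and chosen for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE BOOK
   [
   <!ELEMENT BOOK (TITLE, AUTHOR, COVERIMAGES)>
   <!ELEMENT TITLE (#PCDATA)>
   <!ELEMENT AUTHOR (#PCDATA)>
   <!ELEMENT COVERIMAGES EMPTY>
   <!-- ENTITIES accepts one or more unparsed entity names -->
   <!ATTLIST COVERIMAGES Sources ENTITIES #REQUIRED>
   <!NOTATION GIF SYSTEM "ShowGif.exe">
   <!ENTITY faun_front SYSTEM "FaunFront.gif" NDATA GIF>
   <!ENTITY faun_back SYSTEM "FaunBack.gif" NDATA GIF>
   ]
>
<BOOK>
   <TITLE>The Marble Faun</TITLE>
   <AUTHOR>Nathaniel Hawthorne</AUTHOR>
   <COVERIMAGES Sources="faun_front faun_back" />
</BOOK>
```

Each name in the list must match a general external unparsed entity declared in the DTD, just as with a single ENTITY type attribute.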
note
Unlike an external parsed entity file, a general external unparsed entity file is not accessed directly by the XML processor. Rather, the processor merely provides the entity name, system identifier, and notation name to the application. Likewise, the processor doesn’t access a location or program indicated by a notation, but only passes the notation name and system identifier to the application. In fact, the Internet Explorer XML processor doesn’t even check whether a general external unparsed entity file, or the target of a notation, exists. The application can do what it wants with the entity and notation information. For example, it might run the program associated with the notation and have it display the data in the entity file. In Chapter 11, you’ll learn how to write Web page scripts that access entities and notations.
Declaring a Notation
A notation describes a particular data format. It does this by providing the address of a description of the format, the address of a program that can handle data in that format, or a simple format description. You can use a notation to describe the format of a general external unparsed entity (as you saw in the previous section), or you can assign a notation to an attribute that has the NOTATION enumerated type (as described in “Specifying an Enumerated Type” in Chapter 5).
A notation has the following general form:
<!NOTATION NotationName SYSTEM SystemLiteral>
Here, NotationName is the notation name. You can choose any name you want, provided that it begins with a letter or underscore (_), followed by zero or more letters, digits, periods (.), hyphens (-), or underscores. You should normally choose a meaningful name that indicates the format. For example, if you define a notation to describe the bitmap format, you might name it BMP. (However, the XML specification states that names beginning with the letters xml, in any combination of uppercase or lowercase letters, are “reserved for standardization.” Although Internet Explorer doesn’t enforce this restriction, it’s better not to begin names with xml to avoid future problems.)
SystemLiteral is a system identifier that can be delimited using either single quotes (') or double quotes ("), and can contain any characters except the quotation character used to delimit it. You can include in the system identifier any format description that would be meaningful to the application that is going to display or handle the XML document. (Remember that the XML processor
Declaring Parameter Entities
You declare a parameter entity using a form of markup declaration similar to that used for general entities. In the following sections, you’ll learn how to declare both types of parameter entities.
Declaring a Parameter Internal Parsed Entity
A declaration for a parameter internal parsed entity has the following
general form:
<!ENTITY % EntityName EntityValue>
Here, EntityName is the name of the entity. You can select any name, provided that you follow these rules:
■ The name must begin with a letter or underscore (_), followed by zero or more letters, digits, periods (.), hyphens (-), or underscores.
■ The XML specification states that names beginning with the letters xml (in any combination of uppercase or lowercase letters) are “reserved for standardization.” Although Internet Explorer doesn’t enforce this restriction, it’s better not to begin names with xml to avoid future problems.
■ Remember that case is significant in all text within markup, including entity names. Thus, an entity named Spot is a different entity than one named spot.
EntityValue is the value of the entity. The value you assign a parameter internal entity is a series of characters delimited with quotes, known as a quoted string or literal. You can assign any literal value to a parameter internal entity, provided that you observe these rules:
■ The string can be delimited using either single quotes (') or double quotes (").
■ The string cannot contain the same quotation character used to delimit it.
■ The string cannot include an ampersand (&) except to begin a character or general entity reference. Nor can it include the percent sign (%) (for an exception, see the sidebar “An Additional Location for Parameter Entity References” on page 151).
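The rules above can be sketched with a small document. This is an illustration only: the entity name title_decl is hypothetical, and the example assumes the simple case in which the entity’s value is one complete markup declaration and the reference appears between declarations in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE BOOK
   [
   <!-- The quoted literal holds one complete element type declaration -->
   <!ENTITY % title_decl "<!ELEMENT TITLE (#PCDATA)>">
   <!ELEMENT BOOK (TITLE)>
   <!-- The reference sits between declarations, not inside one -->
   %title_decl;
   ]
>
<BOOK>
   <TITLE>The Marble Faun</TITLE>
</BOOK>
```

The literal is delimited with double quotes and contains neither a double quote, an ampersand, nor a percent sign, so it satisfies all three rules. Note also that the entity is declared before the line that references it.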
The entity file contains the entity’s replacement text, which must consist of complete markup declarations of the types allowed in a DTD: specifically, element type declarations, attribute-list declarations, entity declarations, notation declarations, processing instructions, or comments. (I described these types of markup declarations in “Creating the Document Type Definition” in Chapter 5.) You can also include parameter entity references between markup declarations, and you can include IGNORE and INCLUDE sections. I described IGNORE and INCLUDE sections in “Conditionally Ignoring Sections of an External DTD Subset” in Chapter 5. (For exceptions to the guidelines given in this paragraph, see the sidebar “An Additional Location for Parameter Entity References” on page 151.)
note
In a parameter external entity file, you can optionally include a text declaration in addition to the entity’s replacement text. The text declaration must come at the very beginning of the file. For information, see the sidebar “Characters, Encoding, and Languages” on page 77.
You can use parameter external entities to store groups of related declarations. Say, for example, that your business sells books, CDs, posters, and other items. You could place the declarations for each type of item in a separate file. This would allow you to combine these groups of declarations in various ways. For instance, you might want to create an XML document that describes only your inventory of books and CDs. To do this, you could include your book and CD declarations in the document’s DTD by using parameter external entities, as shown in this example XML document:
<?xml version="1.0"?>
<!DOCTYPE INVENTORY
[
<!ELEMENT INVENTORY (BOOK | CD)*>
<!ENTITY % book_decls SYSTEM "Book.dtd">
<!ENTITY % cd_decls SYSTEM "CD.dtd">
%book_decls;
%cd_decls;
]
>
<INVENTORY>
<BOOK>
<BOOKTITLE>The Marble Faun</BOOKTITLE>
<AUTHOR>Nathaniel Hawthorne</AUTHOR>
<PAGES>473</PAGES>
</BOOK>
<CD>
<CDTITLE>Concerti Grossi Opus 3</CDTITLE>
<COMPOSER>Handel</COMPOSER>
<LENGTH>72 minutes</LENGTH>
</CD>
<BOOK>
<BOOKTITLE>Leaves of Grass</BOOKTITLE>
<AUTHOR>Walt Whitman</AUTHOR>
<PAGES>462</PAGES>
</BOOK>
<!-- additional items -->
</INVENTORY>
Here are the contents of the Book.dtd entity file:
<!ELEMENT BOOK (BOOKTITLE, AUTHOR, PAGES)>
<!ELEMENT BOOKTITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT PAGES (#PCDATA)>
And here are the contents of the CD.dtd entity file:
<!ELEMENT CD (CDTITLE, COMPOSER, LENGTH)>
<!ELEMENT CDTITLE (#PCDATA)>
<!ELEMENT COMPOSER (#PCDATA)>
<!ELEMENT LENGTH (#PCDATA)>
Notice that a parameter external entity works much like an external DTD subset. Parameter external entities, however, are more flexible: they allow you to include several external declaration files and to include them in any order. (Recall that an external DTD subset is always processed after the entire internal DTD subset has been processed.)
Entity type: General external unparsed
Form of entity reference (EntityName is the name of the entity): EntAttr='EntityName', where EntAttr is an ENTITY or ENTITIES type attribute
Places where you can insert an entity reference (example):
■ You can’t insert a reference to this type of entity, but you can identify the entity by assigning its name to an attribute that has the ENTITY or ENTITIES type (see “Declaring a General External Unparsed Entity”)

Entity type: Parameter internal
Form of entity reference: %EntityName;
Places where you can insert an entity reference (example):
■ In a DTD where markup declarations can occur, not within markup declarations (for an exception, see the sidebar “An Additional Location for Parameter Entity References” following this table) (see “Declaring a Parameter Internal Parsed Entity”)

Entity type: Parameter external
Form of entity reference: %EntityName;
Places where you can insert an entity reference (example):
■ In a DTD where markup declarations can occur, not within markup declarations (for an exception, see the sidebar “An Additional Location for Parameter Entity References” following this table) (see “Declaring a Parameter External Parsed Entity”)

Entity type: Character reference
Form of entity reference: &#n; or &#xh;, where n is the numeric code for the character in decimal, and h is the numeric code in hexadecimal
Places where you can insert an entity reference (example):
■ In an element’s content (see “Inserting Character References”)
■ In an attribute value (the default value in an attribute definition, or the assigned value in an element start-tag) (see “Inserting Character References”)
■ In the literal value of an internal entity declaration (see “Inserting Character References”)
An Additional Location for Parameter Entity References
In this chapter, I’ve stated that you can insert a parameter entity reference only where markup declarations can occur in a DTD (not within markup declarations), and therefore a parameter entity must contain one or more complete markup declarations of the types allowed in a DTD. This is a safe rule that you can use in any situation and that will let you work with parameter entities without undue complexity.
The XML specification, however, does allow you to insert a reference to an internal or external parameter entity within markup declarations, as well as between markup declarations, provided that the markup declarations occur in an external DTD subset or in a parameter external parsed entity file and not in an internal DTD subset. The permissible content of an entity depends upon where you are going to insert it. If you insert an entity reference within a markup declaration, the entity can of course contain a legal fragment of a markup declaration rather than a complete markup declaration. You can insert a parameter entity reference in most places within markup (including within the literal value of an internal entity declaration).
The ability to insert parameter entity references within markup declarations makes parameter internal entities much more useful than implied by the example I gave earlier in the chapter in “Declaring a Parameter Internal Parsed Entity.” You could, for example, store a complex attribute definition in a parameter internal entity and then assign that attribute to an entire group of elements by simply inserting the entity reference into each element’s attribute-list declaration. (This would save typing, reduce the size of the document, and make it easier to modify the attribute definition.)
However, the guidelines for including references to parameter entities within markup declarations are complex. The XML specification includes more than a dozen distinct rules describing where parameter entities can be inserted in markup declarations, what they can contain, and how they must nest with the surrounding markup declaration content (hence my decision to omit the details from this chapter). But if you want to explore this territory, you’ll find complete information in sections 2, 3, and 4 of the XML specification at http://www.w3.org/TR/REC-xml.
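The shared attribute definition described in this sidebar might look something like the following sketch of an external DTD subset. Treat it as an illustration under stated assumptions rather than a complete treatment of the rules: the file name Inventory.dtd, the entity name stock_att, and the InStock attribute are all hypothetical.

```xml
<!-- Inventory.dtd: a hypothetical external DTD subset -->

<!-- The entity holds a fragment of a declaration: one attribute definition -->
<!ENTITY % stock_att "InStock (yes | no) 'yes'">

<!ELEMENT BOOK (#PCDATA)>
<!ATTLIST BOOK %stock_att;>

<!ELEMENT CD (#PCDATA)>
<!ATTLIST CD %stock_att;>
```

Because the references appear inside the ATTLIST declarations, these lines are legal only in an external DTD subset or a parameter external parsed entity file, not in an internal DTD subset. Changing the one entity declaration changes the attribute definition for both BOOK and CD.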
Entity Reference Example 1
The following XML document declares two general internal parsed entities, am and en. The document uses a reference to am to assign a default value to the Nationality attribute, and it uses a reference to en to assign a value to the Nationality attribute in the AUTHOR element. An advantage of using an entity here is that you could change the value throughout the entire document (assuming it had many elements) by simply editing the entity declaration (for example, changing the value of en from “English” to “British”).
<?xml version="1.0"?>
<!DOCTYPE INVENTORY
[
<!ENTITY am "American">
<!ENTITY en "English">
<!ELEMENT INVENTORY (BOOK*)>
<!ELEMENT BOOK (TITLE, AUTHOR)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ATTLIST AUTHOR Nationality CDATA "&am;">
]
>
<INVENTORY>
<BOOK>
<TITLE>David Copperfield</TITLE>
<AUTHOR Nationality="&en;">Charles Dickens</AUTHOR>
</BOOK>
<!-- other elements -->
</INVENTORY>
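To make the effect of the two references concrete, here is a sketch of what the application effectively sees once the processor has replaced them. The second BOOK element is hypothetical; it is included only to show the default value taking effect when the attribute is omitted.

```xml
<INVENTORY>
   <BOOK>
      <TITLE>David Copperfield</TITLE>
      <!-- &en; has been replaced with its value -->
      <AUTHOR Nationality="English">Charles Dickens</AUTHOR>
   </BOOK>
   <BOOK>
      <TITLE>The Marble Faun</TITLE>
      <!-- No Nationality was typed in the source document; the default
           declared with &am; supplies the value shown here -->
      <AUTHOR Nationality="American">Nathaniel Hawthorne</AUTHOR>
   </BOOK>
</INVENTORY>
```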
Entity Reference Example 2
The following DTD defines a general internal parsed entity (int_entity) and a general external parsed entity (ext_entity). It then defines another general internal parsed entity (combo_entity) and inserts both previous entities into the combo_entity value.
<!DOCTYPE INVENTORY
[
<!ENTITY int_entity "internal entity value">