Which One Should You Use?
With so many DTDs to choose from, it may seem daunting to choose the best one. Here are some guidelines to help you.
Transitional or strict
If you are learning markup for the first time, there is no reason to learn legacy HTML practices or use deprecated attributes, so you’re well on the way to compliance with one of the Strict DTD versions.
However, if you inherit a site that has already been heavily marked up using deprecated elements and attributes, and you don’t have time or resources to rewrite the source, then a Transitional DTD may be the appropriate choice.
HTML or XHTML
Whether to use HTML or XHTML is a more subtle issue. XHTML offers a number of benefits, some of which leverage the power of XML:
• It is future-proof, which means that it will be compatible with the web technologies and browsers that are on the horizon. XHTML is the way of the future, but because it is backward compatible, you can start using it right away.
• Its stricter syntax requirements make it easier for screen readers and other assistive devices to handle.
• Stricter markup rules, such as closing all elements, make style sheet application cleaner and more predictable.
• Many mobile devices such as cell phones and PDAs are adopting XHTML as the authoring standard, so your pages will work better on those devices.
• It can be combined with other XML languages in a single document.
• As an XML language, it can be parsed and used by any XML software. You can take information and data from XML applications and port it into XHTML more easily. To use the proper term, XML data can be easily transformed into XHTML.
While it is true that the future of web markup will be based on XHTML, HTML is certainly not dead. It remains a viable option, and is universally supported by current browsers. If none of the benefits listed above sound like a compelling reason to take on XHTML, HTML is still okay.
However, because you are learning this stuff for the first time, and because the differences between XHTML and HTML are really quite minor, you might as well learn to write in the stricter XHTML syntax right off the bat; then you’ll be one step ahead of the game. Writing well-formed XHTML is even easier if you are using a web authoring tool such as Adobe (Macromedia) Dreamweaver or Microsoft Expression Web because you can configure it to write code in XHTML automatically. Just be sure you have the latest version of the software so it is up to speed with the latest requirements.
What the pros do
For professional-caliber web site production, most web developers follow the XHTML 1.0 Strict DTD. Doing so makes sure that the markup is semantic and does not use any of the deprecated and presentational elements and attributes (style sheets are used instead, as is the proper practice). It also has all of the benefits of XHTML that were just listed. This isn’t to say that you have to make all of your web sites XHTML Strict too, but I thought you might like to know.
Validating Your Documents
The other thing that professional web developers do is validate their markup. What does that mean? To validate a document is to check your markup to make sure that you have abided by all the rules of whatever DTD you are using. Documents that are error-free are said to be valid. It is strongly recommended that you validate your documents, especially for professional sites. Valid documents are more consistent on a variety of browsers, they display more quickly, and are more accessible.
Right now, browsers don’t require documents to be valid (in other words, they’ll do their best to display them, errors and all), but any time you stray from the standard you introduce unpredictability in the way the page is displayed or handled by alternative devices. Furthermore, one day there will be strict XHTML browsers that will require valid and well-formed documents.
So how do you make sure your document is valid? You could check it yourself or ask a friend, but humans make mistakes, and you aren’t really expected to memorize every minute rule in the specifications. Instead, you use a validator, software that checks your source against the DTD you specify. These are some of the things validators check for:
• The inclusion of a DOCTYPE declaration. Without it the validator doesn’t know which version of HTML or XHTML to validate against.
• An indication of the character encoding for the document (character encoding is covered in the next section).
• The inclusion of required rules and attributes.
• Non-standard elements.
• Mismatched tags.
• Nesting errors.
• DTD rule violations.
• Typos, and other minor errors.
Validation Tools
Developers use a number of helpful tools for checking and correcting errors in (X)HTML documents. These are a few of the most popular.
HTML Tidy
HTML Tidy, by Dave Raggett, checks (X)HTML documents for errors and corrects them. There is an online version available at infohound.net/tidy. Find out about downloadable versions of HTML Tidy at www.w3.org/People/Raggett/tidy and tidy.sourceforge.net.
Firebug
Firebug is a popular plug-in to the Firefox browser that debugs (X)HTML, CSS, and JavaScript, among many other features. It is available as a free download at addons.mozilla.org/firefox/1843.
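To make the kinds of checks validators perform a little more concrete, here is a minimal Python sketch of three of them: a missing DOCTYPE, a missing title element, and img elements without an alt attribute. It is only an illustration built on Python’s standard html.parser module, not a description of how the W3C validator works. The names QuickChecks and quick_check are made up for this example, and a real validator checks the whole document against the DTD rather than a handful of hand-picked rules.

```python
from html.parser import HTMLParser


class QuickChecks(HTMLParser):
    """Collects a few facts that a validator would also look at (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.has_title = False
        self.images_missing_alt = 0

    def handle_decl(self, decl):
        # Called with the text inside <! ... >, e.g. 'DOCTYPE html PUBLIC ...'
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; <img ... /> ends up here too.
        if tag == "title":
            self.has_title = True
        if tag == "img" and "alt" not in dict(attrs):
            self.images_missing_alt += 1


def quick_check(source):
    """Return a list of human-readable problems found in the markup."""
    parser = QuickChecks()
    parser.feed(source)
    parser.close()

    problems = []
    if not parser.has_doctype:
        problems.append("no DOCTYPE declaration")
    if not parser.has_title:
        problems.append("no title element")
    if parser.images_missing_alt:
        problems.append(f"{parser.images_missing_alt} img element(s) without alt")
    return problems
```

Calling quick_check on the source of a page returns an empty list when none of these three problems is present. Everything else a DTD requires is outside the scope of this sketch.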
The W3C offers a free online validator at validator.w3.org. Figure 10-2 shows the W3C Markup Validation Service as it appeared as of this writing (they are known to make tweaks and improvements). There are three options for checking a page: enter the URL of a page on the Web, upload a file from your computer, or just paste the source into a text area on the page. The best way to get a feel for how the validation process works is to try it yourself. Give it a go in Exercise 10-2.
Figure 10-2 The W3C’s Markup Validation Service.
In this exercise, you’ll validate some documents using the W3C validation service. The documents in this exercise are provided for you online at www.learningwebdesign.com/materials.
Start by validating the document blackgoose.html (it should look familiar because it was the basis of the examples in Chapter 4, Creating a Simple Page). I’ve included a DOCTYPE declaration that instructs the validator to validate the document against the HTML 4.01 Strict DTD. I’ve also purposefully introduced a few errors to the document. Knowing what the errors are in advance will give you a better feel for how the validator finds and reports errors.
The required title element is missing in the head element.
The img element is missing the required alt attribute Note also that the img element uses HTML syntax, that is, it does not have a trailing slash.
The p elements are not closed.
OK, let’s get validating!
Make sure you have a copy of blackgoose.html on your hard drive. Open a browser and go to validator.w3.org. We’ll use the “Validate by File Upload” option. Select “Browse” and navigate to blackgoose.html. Once you’ve selected it, click the “Check” button on the validator page.
The validator immediately hands back the results (Figure 10-3). It should come as no surprise that “this page is not HTML 4.01 Strict.” There are apparently three things that prevent it from being so.
First, although it is not listed as an error, it complains that it could not find the Character Encoding. We’ll talk about character encodings in the next section, so let’s not worry about that one for now.
The first real error listed is that the head element is “not finished.” If you look at the source, you can see that there is indeed a closing </head> tag there, so that isn’t the issue. The problem here (as hinted in the second paragraph under the error listing) is that the element is missing required content. In this case, it’s the missing title element that is generating the error. This is a good example of the fact that validation error messages can be a bit cryptic, but at least it points you to the line of code that is amiss so you can start troubleshooting.
The second error, as expected, is the missing alt attribute in the img element.
Try adding a title element and alt attribute, save the file, and validate it again. This time it should “tentatively” pass. It still doesn’t like that missing character encoding, but we know we can take care of that later.
Figure 10-3 The error report generated by the W3C validator.
Now let’s validate another document, x-blackgoose.html. It is identical to the blackgoose.html that we just validated, except it specifies the XHTML 1.0 Strict DTD in its DOCTYPE declaration. This will give us a chance to see how the rules for XHTML differ from HTML.
Go to validator.w3.org and upload x-blackgoose.html. It is not valid, of course, and there is still that character encoding problem. But look at the new list of errors: there are 14 compared to only two when we validated it as HTML. I’m not going to show them all in a figure, but I will call your attention to a few key issues.
First, look at Error 3, “Line 27 column 30: end tag for “img” omitted.” The problem here is that although img is an empty element, it must be terminated with a closing slash in XHTML (<img />).
Now look at Error 5, which states something to the effect that you aren’t allowed to use an h2 in this context. It’s saying that because it thinks you are trying to put an h2 inside an unclosed img element, which doesn’t make any sense.
The lesson here is that one error early in the document can generate a whole list of errors down the line. It is a good idea to make obvious changes, and then revalidate to see the impact of the correction.
Try fixing the img element by adding the alt attribute and the trailing slash, then reupload the document and check it. There will still be a long list of errors, but now they are related to the p elements not being closed. Continue fixing errors until you get the document to be “tentatively” validated.
Now it’s time to take care of the character encoding so we can get a true sense of
accomplishment by having our documents validate completely.
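The cascade of errors in the XHTML version comes from the fact that XHTML is XML, so every element has to be closed. As a rough illustration, the following Python sketch feeds markup fragments to the XML parser in Python’s standard library. The helper name is_well_formed and the sample fragments are invented for this example. The check only covers well-formedness and does not validate against a DTD the way the W3C service does.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        # Raised for unclosed elements, mismatched tags, and similar problems.
        return False
    return True


# HTML-style empty element: the parser treats img as still open,
# so the h2 that follows looks like it is nested inside it.
html_style = '<div><img src="goose.gif" alt="goose"><h2>Menu</h2></div>'

# XHTML-style empty element with the trailing slash.
xhtml_style = '<div><img src="goose.gif" alt="goose" /><h2>Menu</h2></div>'

# Unclosed paragraphs are fine in HTML 4.01 but not in XHTML.
unclosed_paragraphs = "<div><p>First<p>Second</div>"
```

Only xhtml_style passes is_well_formed. The other two fail for the same kinds of reasons the validator reports for the unclosed img and p elements.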
Character Encoding
Before I show you how to specify the character encoding for your documents,
I think it is useful to know what a character encoding is.
Because the Web is worldwide, there are hundreds of written languages with a staggering number of unique character shapes that may need to be displayed on a web page. These include not only the various alphabets (Western, Hebrew, Arabic, and so on), but also ideographs (characters that indicate a whole word or concept) for languages such as Chinese, Japanese, and Korean.
Various sets of characters have been standardized for use on computers and over networks. For example, the set of 256 characters most commonly used in Western languages has been standardized and named Latin-1 (or ISO 8859-1, to use its formal identifier). Latin-1 was the character encoding used for HTML 2.0 and 3.0, and you can still use it for documents today.
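If you would like to see what “a set of 256 characters” means in practice, here is a small Python sketch. It relies only on the latin-1 codec that ships with Python. The function names latin1_byte and in_latin1 are made up for this illustration.

```python
def latin1_byte(character):
    """Return the single byte value that Latin-1 assigns to a character."""
    encoded = character.encode("latin-1")  # raises UnicodeEncodeError if not in the set
    return encoded[0]


def in_latin1(character):
    """Return True if the character is one of the 256 Latin-1 characters."""
    try:
        character.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


# Every one of the 256 possible byte values maps to exactly one character.
all_characters = bytes(range(256)).decode("latin-1")
```

Here latin1_byte("A") is 65 and latin1_byte("\u00e9") (an e with an acute accent) is 233, while in_latin1("\u4e2d") (a Chinese ideograph) is False because that character is simply not in the set.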
Unicode
The big kahuna of character sets, however, is Unicode (ISO/IEC 10646), which includes the characters for most known languages of the world. There are tens of thousands of characters in Unicode, and room in the specification for roughly a million. The Unicode character set may be encoded (converted to ones and zeros) several ways, the most popular being the UTF-8 encoding. You may also see UTF-16 or UTF-32, which use different numbers of bytes to describe characters.
UTF-8 is the recommended encoding for all HTML 4.01, XHTML, and XML documents. You may remember seeing “Falling back to UTF-8” in the error message in the validator results. Now you know that it was just assuming you wanted to use the default character encoding of Unicode for your document type.
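The phrase “different numbers of bytes” is easier to picture with a quick experiment. The Python sketch below counts how many bytes one character takes in each encoding. The function name byte_lengths is invented for this example, and it uses the little-endian codec variants (utf-16-le and utf-32-le) only so that Python does not add a byte order mark to the count.

```python
def byte_lengths(character):
    """Return the number of bytes one character needs in each Unicode encoding."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(character.encode(name)) for name in encodings}


# A plain ASCII letter, an accented Latin letter, and a Chinese ideograph.
samples = ["A", "\u00e9", "\u4e2d"]

if __name__ == "__main__":
    for sample in samples:
        print(sample, byte_lengths(sample))
```

In UTF-8 the letter A takes one byte, the accented e takes two, and the ideograph takes three. UTF-32 spends four bytes on each of them, and UTF-16 spends two on each of these particular characters.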
Specifying the character encoding
There are several ways to associate a character encoding with a document. One way is to ask your server administrator to configure the server to include the character encoding in the HTTP header, a chunk of information that a server attaches to every web document before returning it to the browser. However, because this information can be separated from the document content, the W3C also recommends that you include the character encoding in the document itself.
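For the curious, the HTTP header in question is named Content-Type, and the encoding travels in it as a charset parameter. Here is a minimal Python sketch of pulling that parameter out of a header value. It borrows the header parser from Python’s standard email package, which is one convenient choice rather than the only one, and the function name charset_from_content_type is made up for this example.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset named in a Content-Type header value, or None."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None if it is absent.
    return message.get_content_charset()
```

Given the value text/html; charset=ISO-8859-1, the function returns iso-8859-1. Given plain text/html, it returns None, which is the situation where a declaration inside the document itself earns its keep.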
Note
Other specialized character encodings include ISO 8859-5 (Cyrillic), ISO 8859-6 (Arabic), ISO 8859-7 (Greek), ISO 8859-8 (Hebrew), and three Japanese encodings (ISO-2022-JP, SHIFT_JIS, and EUC-JP).
XML Declarations
The character encoding for XML documents should be provided in an XML declaration. XML declarations indicate the version of XML used in the document and may also include the character encoding.
The following is an example of the XML declaration that the W3C recommends for XHTML documents. It must appear before the DOCTYPE declaration.
<?xml version="1.0" encoding="utf-8"?>
XML declarations are not required for all XML documents, but the W3C encourages authors to include them in XHTML documents. They are required when the character encoding is something other than the defaults UTF-8 or UTF-16.
Unfortunately, despite the W3C’s encouragement, XML declarations are usually omitted because they are problematic for current HTML browsers.
In HTML 4.01 and XHTML 1.0 documents, character encoding is indicated using a meta element (see the sidebar, XML Declarations, for the method for XML documents). The meta element is an empty element that provides information about the document, such as its creation date, author, copyright information, and, as we’ll focus on in this section, the character encoding and the type of file.
The meta element goes in the head of the document, as shown in this XHTML
example (note the trailing slash in the empty meta element):
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
<title>Sample document</title>
</head>
The http-equiv attribute identifies that this meta element is providing information about the content type of the document.
The content attribute provides the details of the content type in a two-part value. The first part says that this is an HTML text file (in technical terms, it identifies its media type as text/html). But wait, didn’t we just say that this is in an XHTML document? That’s fine. XHTML 1.0 documents can masquerade as HTML text documents for reasons of backward compatibility. (XHTML 1.1 documents, however, should be identified with an XML media type such as application/xhtml+xml, and unfortunately, browsers don’t support that well quite yet.)
Finally, we get to the second part that specifies the character encoding for this document as utf-8.
For another look, here is a meta element for an (X)HTML document that uses the Latin-1 character encoding (try it out yourself in Exercise 10-3):
<meta http-equiv="content-type" content="text/html;charset=ISO-8859-1">
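If you want to double-check which encoding a page declares before uploading it to the validator, the lookup can be sketched in a few lines of Python. This is only an illustration under some assumptions: it uses the standard html.parser and email modules, the names MetaCharsetFinder and declared_charset are invented here, and it looks only at the http-equiv form of the meta element shown above.

```python
from email.message import Message
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Remembers the charset from the first content-type meta element it sees."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Both <meta ...> and <meta ... /> are reported here.
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        content = attributes.get("content")
        if http_equiv == "content-type" and content:
            # Treat the content value like an HTTP Content-Type header.
            header = Message()
            header["Content-Type"] = content
            self.charset = header.get_content_charset()


def declared_charset(source):
    """Return the lowercased charset declared in the markup, or None."""
    finder = MetaCharsetFinder()
    finder.feed(source)
    finder.close()
    return finder.charset
```

For the two meta elements shown in this section, declared_charset returns utf-8 and iso-8859-1 respectively. A document with no such meta element yields None.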
Putting It All Together
Okay! We’ve covered a lot of ground in this chapter in the effort to kick your documents up a notch into true standards compliance. We looked at the various versions of HTML and XHTML and what makes them different, how to specify which version (DTD) you used to write your document and how browsers use that information, how to validate your document, and how to specify its character encoding and media type.
Note
Information provided by the http-equiv attribute is processed by the browser as though it had received it in an HTTP header. Thus, it is an HTTP EQUIValent.
In the earlier exercise, you should have fixed all of the errors in blackgoose.html (the HTML document) and x-blackgoose.html (the XHTML version of the same content). The meta element with the character encoding should be all that stands between you and the thrill of validating against the Strict DTDs.
Try adding the meta element as shown in the previous section to both documents, and reupload them in the validator.
HINT: Be sure that the meta element has a trailing slash in the XHTML document, and be sure to omit the space and slash in the HTML version.
What it boils down to is that the minimal document structure for standards-compliant documents has a few more elements than the basic skeleton we created back in Chapter 4. The following examples show the minimal markup for HTML 4.01 Strict and XHTML 1.0 Strict documents (you can adapt these by changing the DTD in the DOCTYPE declaration and the character encoding). The good news is that, if you’ve read the chapter, now you understand exactly what the extra markup means.
HTML 4.01 strict
This is the minimal document structure for HTML 4.01 Strict documents as recommended by the W3C:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>An HTML 4.01 Strict document</title>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
</head>
<body>
<p> The document content goes here </p>
</body>
</html>
XHTML 1.0 strict
This is the minimal document structure for XHTML 1.0 Strict documents as recommended by the W3C:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>An XHTML 1.0 Strict document</title>
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
</head>
<body>
<p> The document content goes here </p>
</body>
</html>
Note that this example omits the XML declaration (see the XML Declarations sidebar earlier in this chapter), because it is problematic for current browsers as of this writing.
Note
These document templates are also available for download at www.learningwebdesign.com/materials/.
Test Yourself
This chapter was information-packed. Are you ready to see how much you absorbed?
1. Who fought in the infamous Browser Wars of the 1990s?
2. What is the difference between Transitional and Strict HTML 4.01?
3. How are HTML 4.01 Strict and XHTML 1.0 Strict the same? How are they different?
4. Name four significant syntax requirements in XHTML.
5. Look at these valid markup examples and determine whether each is HTML or XHTML:
<IMG SRC="panda.jpg" ALT="panda eating leaves">
<img src="orchid.jpg" alt="orchid" width=100 height=150 >
<img src="flipflop.gif" alt="closeup of foot in sandal" />
6. What extra attributes must be applied to the html element in XHTML documents?
7. How do you get a standards-compliant browser to display your page in Standards Mode?
8. Name two advantages that XHTML offers over HTML.
9. What is ISO 8859-1?
IN THIS PART
Chapter 11
Cascading Style Sheets Orientation
Chapter 12
Formatting Text (Plus More Selectors)
Chapter 13
Colors and Backgrounds (Plus Even More Selectors and External Style Sheets)
Chapter 14
Thinking Inside the Box (Padding, Borders, and Margins)
Chapter 15
Floating and Positioning
Chapter 16
Page Layout with CSS
Chapter 17
CSS Techniques
CSS FOR