Appendix E: Character Encodings
In Appendix D, I discussed how computers store information, and how a character-encoding scheme is a table that translates between characters and the way they are stored in the computer.
The most common character set (or character encoding) in use on computers is ASCII (the American Standard Code for Information Interchange); it is probably the most widely used character set for encoding text electronically. You can expect all computers browsing the Web to understand ASCII.
Character Set   Description
ASCII           American Standard Code for Information Interchange, which is used on most computers
The problem with ASCII is that it supports only the upper- and lowercase Latin alphabet, the numbers 0-9, and some extra characters: a total of 128 characters in all. Here are the printable characters of ASCII (the other characters are things such as line feeds and carriage-return characters):
! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~
However, many languages use either accented Latin characters or completely different alphabets. ASCII does not address these characters, so you need to learn about character encodings if you want to use any non-ASCII characters.
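The limit described above can be seen in a few lines of Python. This is only a minimal sketch: the sample strings are hypothetical and chosen for illustration, and Python's built-in "ascii" codec stands in for any ASCII-only program.

```python
# Hypothetical sample strings, chosen only for illustration.
plain = "Hello"
accented = "caf\u00e9"  # "cafe" with an acute accent on the final e

# Every ASCII character has a code below 128.
plain_is_ascii = all(ord(ch) < 128 for ch in plain)
accented_is_ascii = all(ord(ch) < 128 for ch in accented)

# Encoding non-ASCII text as ASCII raises an error rather than guessing.
try:
    accented.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(plain_is_ascii, accented_is_ascii, ascii_failed)
```

The accented string fails because its last character has a code above 127, which ASCII has no slot for.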
Character encodings are also particularly important if you want to use symbols, as these cannot be guaranteed to transfer properly between different encodings (from some dashes to some quotation mark characters). If you do not indicate the character encoding the document is written in, some of the special characters might not display.
The International Organization for Standardization (ISO) created a range of character sets to deal with different national characters. ISO-8859-1 is commonly used in Western versions of authoring tools such as Macromedia Dreamweaver, as well as applications such as Windows Notepad.
Character Set   Description
ISO-8859-1      Latin alphabet part 1. Covering North America, Western Europe, Latin America, the Caribbean, Canada, Africa
ISO-8859-2      Latin alphabet part 2. Covering Eastern Europe including Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbocroatian, Slovak, Slovenian, Upper Sorbian, and Lower Sorbian
ISO-8859-3      Latin alphabet part 3. Covering SE Europe, Esperanto, Maltese, Turkish, and miscellaneous others
ISO-8859-4      Latin alphabet part 4. Covering Scandinavia/Baltics (and others not in ISO-8859-1)
ISO-8859-5      Latin/Cyrillic alphabet part 5
ISO-8859-6      Latin/Arabic alphabet part 6
ISO-8859-7      Latin/Greek alphabet part 7
ISO-8859-8      Latin/Hebrew alphabet part 8
ISO-8859-9      Latin 5 alphabet part 9 (same as ISO-8859-1 except Turkish characters replace Icelandic ones)
ISO-8859-10     Latin 6. Lappish, Nordic, and Eskimo
ISO-8859-15     The same as ISO-8859-1 but with more characters added
ISO-8859-16     Latin 10. Covering SE Europe: Albanian, Croatian, Hungarian, Polish, Romanian, and Slovenian, plus can be used in French, German, Italian, and Irish Gaelic
ISO-2022-JP     Latin/Japanese alphabet part 1
ISO-2022-JP-2   Latin/Japanese alphabet part 2
ISO-2022-KR     Latin/Korean alphabet part 1
It is helpful to note that the first 128 characters of ISO-8859-1 match those of ASCII, so you can safely use those characters as you would in ASCII.
The Unicode Consortium was then set up to devise a way to show all characters of different languages, rather than have these different incompatible character codes for different languages.
Therefore, if you want to create documents that use characters from multiple character sets, you will be able to do so using the single Unicode character encodings. Furthermore, users should be able to view documents written in different character sets, provided their processor (and fonts) support the Unicode standards, no matter what platform they are on or which country they are in. By having a single character encoding, you can reduce software development costs because the programs do not need to be designed to support multiple character encodings.
One problem with Unicode is that a lot of older programs were written to support only 8-bit character sets (limiting them to 256 characters), which is nowhere near the number required for all languages.
Unicode therefore specifies encodings that can deal with a string in special ways so as to make enough space for the huge character set it encompasses. These are known as UTF-8, UTF-16, and UTF-32.
Character Set   Description
UTF-8           A Unicode Transformation Format that comes in 8-bit units. That is, it comes in bytes. A character in UTF-8 can be from 1 to 4 bytes long, making UTF-8 variable width.
UTF-16          A Unicode Transformation Format that comes in 16-bit units. That is, it comes in shorts. It can be 1 or 2 shorts long, making UTF-16 variable width.
UTF-32          A Unicode Transformation Format that comes in 32-bit units. That is, it comes in longs. It is a fixed-width format and is always 1 "long" in length.
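To make the widths in the table concrete, the following sketch counts how many units each encoding needs for a few characters. The sample characters and the helper name unit_counts are assumptions made for this illustration; the big-endian codec names are used only so that no byte-order mark is added to the counts.

```python
# Hypothetical sample characters: a Latin letter, an accented letter,
# the euro sign, and a musical symbol from outside the 16-bit range.
samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "euro": "\u20ac",
    "g-clef": "\U0001d11e",
}


def unit_counts(ch):
    """Return (UTF-8 bytes, UTF-16 shorts, UTF-32 longs) for one character."""
    utf8_units = len(ch.encode("utf-8"))            # 8-bit units
    utf16_units = len(ch.encode("utf-16-be")) // 2  # 16-bit units
    utf32_units = len(ch.encode("utf-32-be")) // 4  # 32-bit units
    return utf8_units, utf16_units, utf32_units


for name, ch in samples.items():
    print(name, unit_counts(ch))
```

UTF-8 grows from 1 to 4 bytes across these samples, UTF-16 needs a second short only for the last one, and UTF-32 always uses exactly one long.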
The first 256 characters of Unicode character sets correspond to the 256 characters of ISO-8859-1.
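This correspondence can be checked directly. The sketch below assumes Python's built-in "iso-8859-1" codec and decodes every possible byte value, then compares each resulting character's Unicode number with the byte it came from.

```python
# Decode all 256 possible byte values as ISO-8859-1.
latin1_text = bytes(range(256)).decode("iso-8859-1")

# Each decoded character's Unicode number equals the original byte value.
codes_match = all(ord(ch) == value for value, ch in enumerate(latin1_text))

# For example, byte 163 is the pound sign in both numbering schemes.
pound = bytes([163]).decode("iso-8859-1")

print(codes_match, pound)
```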
By default, HTML 4 processors should support UTF-8, and XML processors are supposed to support UTF-8 and UTF-16; therefore, all XHTML-compliant processors should also support UTF-16 (as XHTML is an application of XML).
For more information on internationalization and different character sets and encodings, see www.i18nguy.com/
Appendix F: Special Characters
Some characters are reserved in XHTML; for example, you cannot use the greater-than and less-than signs (angle brackets) within your text because the browser could mistake them for markup. XHTML processors must support the five special characters listed in the table that follows.
Symbol  Description  Entity Name  Number Code
&  Ampersand  &amp;  &#38;
<  Less-than sign  &lt;  &#60;
>  Greater-than sign  &gt;  &#62;
"  Double quote  &quot;  &#34;
Non-breaking space  &nbsp;  &#160;
To write an element and attribute into your page so that the code is shown to the user rather than being processed by the browser (for example, as <div id="character">), you would write:
&lt;div id=&quot;character&quot;&gt;
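If the markup you want to display is generated by a program, the same escaping can be done automatically. The sketch below uses the escape function from Python's standard html module; the variable names are only illustrative.

```python
import html

# The markup we want the reader to see, not the browser to process.
markup = '<div id="character">'

# quote=True also converts the double quotes into &quot;.
escaped = html.escape(markup, quote=True)

print(escaped)
```

The escaped string matches the hand-written version shown above.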
There is also a long list of special characters that HTML 4.0-aware processors should support. In order for these to appear in your document, you can use either the numerical code or the entity name. For example, to insert a copyright symbol you can use either of the following:
&copy; 2008
&#169; 2008
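Both forms refer to the same character, which can be confirmed with Python's standard html module. This is a small sketch rather than part of the original appendix; the variable names are assumptions.

```python
import html
from html.entities import name2codepoint

# Decoding either form gives the same text.
by_name = html.unescape("&copy; 2008")
by_number = html.unescape("&#169; 2008")

# The entity name "copy" maps to number 169.
copy_number = name2codepoint["copy"]

# Going the other way: write a non-ASCII character as a numeric reference.
as_reference = "\u00a9".encode("ascii", "xmlcharrefreplace").decode("ascii")

print(by_name, by_number, copy_number, as_reference)
```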
The special characters have been split into the following sections:
❑ Character Entity References for ISO 8859-1 Characters
❑ Character Entity References for Symbols, Mathematical Symbols, and Greek Letters
❑ Character Entity References for Markup-Significant and Internationalization Characters
They are taken from the W3C website at www.w3.org/TR/REC-html40/sgml/entities.html
Character Entity References for ISO 8859-1 Characters
Symbol  Description  Entity Name  Number Code
No-break space = non-breaking space  &nbsp;  &#160;
¡  Inverted exclamation mark  &iexcl;  &#161;
¢  Cent sign  &cent;  &#162;
£  Pound sign  &pound;  &#163;
¤  Currency sign  &curren;  &#164;
¥  Yen sign = yuan sign  &yen;  &#165;
¦  Broken bar = broken vertical bar  &brvbar;  &#166;
§  Section sign  &sect;  &#167;
¨  Diaeresis = spacing diaeresis  &uml;  &#168;
©  Copyright sign  &copy;  &#169;
ª  Feminine ordinal indicator  &ordf;  &#170;
«  Left-pointing double angle quotation mark = left pointing guillemet  &laquo;  &#171;
¬  Not sign  &not;  &#172;
Soft hyphen = discretionary hyphen  &shy;  &#173;
®  Registered sign = registered trademark sign  &reg;  &#174;
¯  Macron = spacing macron = overline = APL overbar  &macr;  &#175;
°  Degree sign  &deg;  &#176;
±  Plus-minus sign = plus-or-minus sign  &plusmn;  &#177;
²  Superscript two = superscript digit two = squared  &sup2;  &#178;
³  Superscript three = superscript digit three = cubed  &sup3;  &#179;
´  Acute accent = spacing acute  &acute;  &#180;
µ  Micro sign  &micro;  &#181;
¶  Pilcrow sign = paragraph sign  &para;  &#182;
·  Middle dot = Georgian comma = Greek middle dot  &middot;  &#183;
¸  Cedilla = spacing cedilla  &cedil;  &#184;
¹  Superscript one = superscript digit one  &sup1;  &#185;
º  Masculine ordinal indicator  &ordm;  &#186;
»  Right-pointing double angle quotation mark = right pointing guillemet  &raquo;  &#187;
¼  Vulgar fraction one-quarter = fraction one-quarter  &frac14;  &#188;
½  Vulgar fraction one-half = fraction one-half  &frac12;  &#189;
¾  Vulgar fraction three quarters = fraction three quarters  &frac34;  &#190;
¿  Inverted question mark = turned question mark  &iquest;  &#191;
À  Latin capital letter A with grave = Latin capital letter A grave  &Agrave;  &#192;
Á  Latin capital letter A with acute  &Aacute;  &#193;
Â  Latin capital letter A with circumflex  &Acirc;  &#194;
Ã  Latin capital letter A with tilde  &Atilde;  &#195;
Ä  Latin capital letter A with diaeresis  &Auml;  &#196;
Å  Latin capital letter A with ring above = Latin capital letter A ring  &Aring;  &#197;
Æ  Latin capital letter AE = Latin capital ligature AE  &AElig;  &#198;
Ç  Latin capital letter C with cedilla  &Ccedil;  &#199;
È  Latin capital letter E with grave  &Egrave;  &#200;
É  Latin capital letter E with acute  &Eacute;  &#201;
Ê  Latin capital letter E with circumflex  &Ecirc;  &#202;
Ë  Latin capital letter E with diaeresis  &Euml;  &#203;
Ì  Latin capital letter I with grave  &Igrave;  &#204;
Í  Latin capital letter I with acute  &Iacute;  &#205;
Î  Latin capital letter I with circumflex  &Icirc;  &#206;
Ï  Latin capital letter I with diaeresis  &Iuml;  &#207;
Ð  Latin capital letter ETH  &ETH;  &#208;
Ñ  Latin capital letter N with tilde  &Ntilde;  &#209;
Ò  Latin capital letter O with grave  &Ograve;  &#210;
Ó  Latin capital letter O with acute  &Oacute;  &#211;
Ô  Latin capital letter O with circumflex  &Ocirc;  &#212;
Õ  Latin capital letter O with tilde  &Otilde;  &#213;
Ö  Latin capital letter O with diaeresis  &Ouml;  &#214;
×  Multiplication sign  &times;  &#215;
Ø  Latin capital letter O with stroke = Latin capital letter O slash  &Oslash;  &#216;
Ù  Latin capital letter U with grave  &Ugrave;  &#217;
Ú  Latin capital letter U with acute  &Uacute;  &#218;
Û  Latin capital letter U with circumflex  &Ucirc;  &#219;
Ü  Latin capital letter U with diaeresis  &Uuml;  &#220;
Ý  Latin capital letter Y with acute  &Yacute;  &#221;
Þ  Latin capital letter THORN  &THORN;  &#222;
ß  Latin small letter sharp s = ess-zed  &szlig;  &#223;
à  Latin small letter a with grave = Latin small letter a grave  &agrave;  &#224;
á  Latin small letter a with acute  &aacute;  &#225;
â  Latin small letter a with circumflex  &acirc;  &#226;
ã  Latin small letter a with tilde  &atilde;  &#227;
ä  Latin small letter a with diaeresis  &auml;  &#228;
å  Latin small letter a with ring above = Latin small letter a ring  &aring;  &#229;
æ  Latin small letter ae = Latin small ligature ae  &aelig;  &#230;
ç  Latin small letter c with cedilla  &ccedil;  &#231;
è  Latin small letter e with grave  &egrave;  &#232;
é  Latin small letter e with acute  &eacute;  &#233;
ê  Latin small letter e with circumflex  &ecirc;  &#234;
ë  Latin small letter e with diaeresis  &euml;  &#235;
ì  Latin small letter i with grave  &igrave;  &#236;
í  Latin small letter i with acute  &iacute;  &#237;
î  Latin small letter i with circumflex  &icirc;  &#238;
ï  Latin small letter i with diaeresis  &iuml;  &#239;
ð  Latin small letter eth  &eth;  &#240;
ñ  Latin small letter n with tilde  &ntilde;  &#241;
ò  Latin small letter o with grave  &ograve;  &#242;
ó  Latin small letter o with acute  &oacute;  &#243;
ô  Latin small letter o with circumflex  &ocirc;  &#244;
õ  Latin small letter o with tilde  &otilde;  &#245;
ö  Latin small letter o with diaeresis  &ouml;  &#246;
÷  Division sign  &divide;  &#247;
ø  Latin small letter o with stroke = Latin small letter o slash  &oslash;  &#248;
ù  Latin small letter u with grave  &ugrave;  &#249;
ú  Latin small letter u with acute  &uacute;  &#250;
û  Latin small letter u with circumflex  &ucirc;  &#251;
ü  Latin small letter u with diaeresis  &uuml;  &#252;
ý  Latin small letter y with acute  &yacute;  &#253;
þ  Latin small letter thorn  &thorn;  &#254;
ÿ  Latin small letter y with diaeresis  &yuml;  &#255;
Character Entity References for Symbols, Mathematical Symbols, and Greek Letters
φ  Greek small letter phi  &phi;  &#966;
χ  Greek small letter chi  &chi;  &#967;
ψ  Greek small letter psi  &psi;  &#968;
ω  Greek small letter omega  &omega;  &#969;
ϑ  Greek small letter theta symbol  &thetasym;  &#977;
ϒ  Greek upsilon with hook symbol  &upsih;  &#978;
ϖ  Greek pi symbol  &piv;  &#982;
•  Bullet = black small circle  &bull;  &#8226;
…  Horizontal ellipsis = three dot leader  &hellip;  &#8230;
′  Prime = minutes = feet  &prime;  &#8242;
″  Double prime = seconds = inches  &Prime;  &#8243;
‾  Overline = spacing overscore  &oline;  &#8254;
⁄  Fraction slash  &frasl;  &#8260;
℘  Script capital P = power set = Weierstrass p  &weierp;  &#8472;
ℑ  Blackletter capital I = imaginary part  &image;  &#8465;
ℜ  Blackletter capital R = real part symbol  &real;  &#8476;
™  Trademark sign  &trade;  &#8482;
←  Left arrow  &larr;  &#8592;
↑  Up arrow  &uarr;  &#8593;
→  Right arrow  &rarr;  &#8594;
↓  Down arrow  &darr;  &#8595;
↔  Left-right arrow  &harr;  &#8596;
↵  Down arrow with corner leftward = carriage return  &crarr;  &#8629;
⇐  Left double arrow  &lArr;  &#8656;
⇑  Up double arrow  &uArr;  &#8657;
⇒  Right double arrow  &rArr;  &#8658;
⇓  Down double arrow  &dArr;  &#8659;
⇔  Left-right double arrow  &hArr;  &#8660;
∀  For all  &forall;  &#8704;
∂  Partial differential  &part;  &#8706;
∃  There exists  &exist;  &#8707;
∅  Empty set = null set = diameter  &empty;  &#8709;
∇  Nabla = backward difference  &nabla;  &#8711;
∈  Element of  &isin;  &#8712;
∉  Not an element of  &notin;  &#8713;
∋  Contains as member  &ni;  &#8715;
∏  n-ary product = product sign  &prod;  &#8719;
∑  n-ary summation  &sum;  &#8721;
−  Minus sign  &minus;  &#8722;
∗  Asterisk operator  &lowast;  &#8727;
√  Square root = radical sign  &radic;  &#8730;
∝  Proportional to  &prop;  &#8733;
∞  Infinity  &infin;  &#8734;
∧  Logical and = wedge  &and;  &#8743;
∨  Logical or = vee  &or;  &#8744;
∩  Intersection = cap  &cap;  &#8745;
∪  Union = cup  &cup;  &#8746;
∴  Therefore  &there4;  &#8756;
∼  Tilde operator = varies with = similar to  &sim;  &#8764;
≅  Approximately equal to  &cong;  &#8773;
≈  Almost equal to = asymptotic to  &asymp;  &#8776;
≠  Not equal to  &ne;  &#8800;
≡  Identical to  &equiv;  &#8801;
≤  Less than or equal to  &le;  &#8804;
≥  Greater than or equal to  &ge;  &#8805;
⊂  Subset of  &sub;  &#8834;
⊃  Superset of  &sup;  &#8835;
⊄  Not a subset of  &nsub;  &#8836;
⊆  Subset of or equal to  &sube;  &#8838;
⊇  Superset of or equal to  &supe;  &#8839;
⊕  Circled plus = direct sum  &oplus;  &#8853;
⊗  Circled times = vector product  &otimes;  &#8855;
⊥  Up tack = orthogonal to = perpendicular  &perp;  &#8869;
⋅  Dot operator  &sdot;  &#8901;
⌈  Left ceiling = apl upstile  &lceil;  &#8968;
⌉  Right ceiling  &rceil;  &#8969;
⌊  Left floor = apl downstile  &lfloor;  &#8970;
⌋  Right floor  &rfloor;  &#8971;
〈  Left-pointing angle bracket = bra  &lang;  &#9001;
〉  Right-pointing angle bracket = ket  &rang;  &#9002;
♠  Black spade suit  &spades;  &#9824;
♣  Black club suit = shamrock  &clubs;  &#9827;
♥  Black heart suit = valentine  &hearts;  &#9829;
♦  Black diamond suit  &diams;  &#9830;
Character Entity References for Markup-Significant and Internationalization Characters
"  Quotation mark = APL quote  &quot;  &#34;
&  Ampersand  &amp;  &#38;
<  Less-than sign  &lt;  &#60;
>  Greater-than sign  &gt;  &#62;
Œ  Latin capital ligature OE  &OElig;  &#338;
œ  Latin small ligature oe  &oelig;  &#339;
Š  Latin capital letter S with caron  &Scaron;  &#352;
š  Latin small letter s with caron  &scaron;  &#353;
Ÿ  Latin capital letter Y with diaeresis  &Yuml;  &#376;
ˆ  Modifier letter circumflex accent  &circ;  &#710;
˜  Small tilde  &tilde;  &#732;
En space  &ensp;  &#8194;
Em space  &emsp;  &#8195;
Thin space  &thinsp;  &#8201;
Zero width non-joiner  &zwnj;  &#8204;
Zero width joiner  &zwj;  &#8205;
Left-to-right mark  &lrm;  &#8206;
Right-to-left mark  &rlm;  &#8207;
–  En dash  &ndash;  &#8211;
—  Em dash  &mdash;  &#8212;
‘  Left single quotation mark  &lsquo;  &#8216;
’  Right single quotation mark  &rsquo;  &#8217;
‚  Single low-9 quotation mark  &sbquo;  &#8218;
“  Left double quotation mark  &ldquo;  &#8220;
”  Right double quotation mark  &rdquo;  &#8221;
„  Double low-9 quotation mark  &bdquo;  &#8222;
†  Dagger  &dagger;  &#8224;
‡  Double dagger  &Dagger;  &#8225;
‰  Per mille sign  &permil;  &#8240;
‹  Single left-pointing angle quotation mark (proposed, but not yet standardized)  &lsaquo;  &#8249;
›  Single right-pointing angle quotation mark (proposed, but not yet standardized)  &rsaquo;  &#8250;
€  Euro sign  &euro;  &#8364;
Appendix G: Language Codes
The following table shows the two-letter ISO 639 language codes that are used to declare the language of a document in the lang and xml:lang attributes. It covers many of the world’s major languages.
Language  ISO Code
Abkhazian  AB
Afan (Oromo)  OM
Afar  AA
Afrikaans  AF
Albanian  SQ
Amharic  AM
Arabic  AR
Armenian  HY
Assamese  AS
Aymara  AY
Azerbaijani  AZ
Bashkir  BA
Basque  EU
Bengali; Bangla  BN
Bhutani  DZ
Bihari  BH
Bislama  BI
Breton  BR
Bulgarian  BG
Burmese  MY
Byelorussian  BE
Cambodian  KM
Catalan  CA
Chinese  ZH
Corsican  CO
Croatian  HR
Czech  CS
Danish  DA
Dutch  NL
English  EN
Esperanto  EO
Estonian  ET
Faroese  FO
Fiji  FJ
Finnish  FI
French  FR
Frisian  FY
Galician  GL
Georgian  KA
German  DE
Greek  EL
Greenlandic  KL
Guarani  GN
Gujarati  GU
Hausa  HA
Hebrew  HE
Hindi  HI
Hungarian  HU
Icelandic  IS
Indonesian  ID
Interlingua  IA
Interlingue  IE
Inuktitut  IU
Inupiak  IK
Irish  GA
Italian  IT
Japanese  JA
Javanese  JV
Kannada  KN
Kashmiri  KS
Kazakh  KK
Kinyarwanda  RW
Kirghiz  KY
Korean  KO
Kurdish  KU
Kirundi  RN
Laothian  LO
Latin  LA
Latvian; Lettish  LV
Lingala  LN
Lithuanian  LT
Macedonian  MK
Malagasy  MG
Malay  MS
Malayalam  ML
Maltese  MT
Maori  MI
Marathi  MR
Moldavian  MO
Mongolian  MN
Nauru  NA
Nepali  NE
Norwegian  NO
Occitan  OC
Oriya  OR
Pashto; Pushto  PS
Persian (Farsi)  FA
Polish  PL
Portuguese  PT
Punjabi  PA
Quechua  QU
Rhaeto-Romance  RM
Romanian  RO
Russian  RU
Samoan  SM
Sangho  SG
Sanskrit  SA
Scots Gaelic  GD
Serbian  SR
Serbo-Croatian  SH
Sesotho  ST
Setswana  TN
Shona  SN
Sindhi  SD
Singhalese  SI
Siswati  SS
Slovak  SK
Slovenian  SL
Somali  SO
Spanish  ES
Sundanese  SU
Swahili  SW
Swedish  SV
Tagalog  TL
Tajik  TG
Tamil  TA
Tatar  TT
Telugu  TE
Thai  TH
Tibetan  BO
Tigrinya  TI
Tonga  TO
Tsonga  TS
Turkish  TR
Turkmen  TK
Twi  TW
Uigur  UG
Ukrainian  UK
Urdu  UR
Uzbek  UZ
Vietnamese  VI
Volapuk  VO
Welsh  CY
Wolof  WO
Xhosa  XH
Yiddish  YI
Yoruba  YO
Zhuang  ZA
Zulu  ZU
Appendix H: MIME Media Types
You have seen the type attribute used throughout this book on a number of elements, the value of which is a MIME media type.
MIME (Multipurpose Internet Mail Extensions) media types were originally devised so that e-mails could include information other than plain text. MIME media types indicate the following things:
❑ How the parts of a message, such as text and attachments, are combined into the message
❑ The way in which each part of the message is specified
❑ The way the items are encoded for transmission so that even software that was designed to work only with ASCII text can process the message
As you have seen, however, MIME types are not just for use with e-mail; they were adopted by web servers as a way to tell web browsers what type of material was being sent to them so that they could cope with that kind of file correctly.
MIME content types consist of two parts:
❑ A main type
❑ A sub-type
The main type is separated from the sub-type by a forward slash character, for example, text/html for HTML.
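The two-part structure can be sketched in a few lines of Python. The function name split_mime_type is hypothetical and exists only for this illustration; it simply splits the string at the forward slash and rejects values that lack either part.

```python
def split_mime_type(mime_type):
    """Split a MIME type such as "text/html" into (main type, sub-type)."""
    main_type, separator, sub_type = mime_type.strip().partition("/")
    if not separator or not main_type or not sub_type:
        raise ValueError("expected main/sub, got %r" % mime_type)
    # MIME type names are not case-sensitive, so normalize to lowercase.
    return main_type.lower(), sub_type.lower()


print(split_mime_type("text/html"))
print(split_mime_type("image/tiff"))
```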
This appendix is organized by the main types:
❑ text
❑ image
❑ multipart
text/html for HTML files
text/rtf for text files using rich text formatting
MIME types are officially supposed to be assigned and listed by the Internet Assigned Numbers Authority (IANA).
Many of the popular MIME types in this list (all those that begin with "x-") are not assigned by the IANA and do not have official status. (Having said that, I should mention that some of these are very popular and browsers support them, such as audio/x-mp3. You can see the list of official MIME types at www.iana.org/assignments/media-types/.)
Those preceded with vnd are vendor-specific.
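The two prefixes just described can be told apart mechanically. The following is a minimal sketch; the function name classify_subtype and its three labels are assumptions made for illustration, not official IANA categories.

```python
def classify_subtype(sub_type):
    """Label a MIME sub-type by the prefixes described in this appendix."""
    lowered = sub_type.lower()
    if lowered.startswith("x-"):
        return "unofficial"        # not assigned by the IANA
    if lowered.startswith("vnd."):
        return "vendor-specific"
    return "other"


for example in ("x-mp3", "vnd.wap.wml", "html"):
    print(example, "->", classify_subtype(example))
```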
The most popular MIME types are listed in this appendix in a bold typeface to help you find them.
text
Note that, when specifying the MIME type of a content-type field (for example in a <meta> element), you can also indicate the character set for the text being used. For example:
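As a hedged illustration of such a field, the sketch below reads a content-type value with Python's standard email package. The value text/html; charset=ISO-8859-1 is an assumed example, not one taken from this appendix.

```python
from email.message import Message

# An assumed content-type value carrying a character set parameter.
field = Message()
field["Content-Type"] = "text/html; charset=ISO-8859-1"

media_type = field.get_content_type()   # main type and sub-type
charset = field.get_content_charset()   # character set, lowercased

print(media_type, charset)
```

The character set travels as a parameter after the semicolon, so the MIME type itself stays text/html.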
rtf
sgml
t140 tab-separated-values uri-list
vnd.abc vnd.curl vnd.DMClientScript vnd.fly
vnd.fmi.flexstor vnd.in3d.3dml vnd.in3d.spot vnd.IPTC.NewsML
vnd.IPTC.NITF vnd.latex-z vnd.motorola.reflex vnd.ms-mediapackage vnd.net2phone.commcenter.command vnd.sun.j2me.app-descriptor
vnd.wap.si vnd.wap.sl vnd.wap.wml vnd.wap.wmlscript
xml xml-external-parsed-entity
image
bmp cgm g3fax
tiff
tiff-fx vnd.cns.inf2 vnd.djvu vnd.dwg vnd.dxf vnd.fastbidsheet vnd.fpx
vnd.fst vnd.fujixerox.edmics-mmr vnd.fujixerox.edmics-rlc vnd.globalgraphics.pgb vnd.microsoft.icon
audio
G726-32 G726-40 G728 G729 G729D G729E GSM GSM-EFR L8 L16 L20 L24 LPC
MPA MP4A-LATM
mpa-robust
mpeg mpeg4-generic
parityfec PCMA PCMU prs.sid QCELP RED SMV SMV0
SMV-QCP telephone-event tone
VDVI vnd.3gpp.iufp vnd.cisco.nse vnd.cns.anp1 vnd.cns.inf1 vnd.digital-winds vnd.everad.plj vnd.lucent.voice vnd.nokia.mobile-xmf vnd.nortel.vbk vnd.nuera.ecelp4800 vnd.nuera.ecelp7470 vnd.nuera.ecelp9600 vnd.octel.sbc vnd.qcelp (deprecated, use audio/qcelp) vnd.rhetorex.32kadpcm
vnd.sealedmedia.softseal.mpeg vnd.vmx.cvsd
video
quicktime
SMPTE292M vnd.fvt vnd.motorola.video vnd.motorola.videop vnd.mpegurl
vnd.nokia.interleaved-multimedia vnd.objectvideo
vnd.sealed.mpeg1 vnd.sealed.mpeg4 vnd.sealed.swf vnd.sealedmedia.softseal.mov vnd.vivo
message
s-http sip sipfrag
model
vnd.gs-gdl
vnd.gtw vnd.mts vnd.parasolid.transmit.binary vnd.parasolid.transmit.text vnd.vtu
vrml
application
activemessage andrew-inset applefile atomicmail batch-SMTP beep+xml cals-1840 cnrp+xml commonground cpl+xml
cybercash dca-rft dec-dx dicom dvcs EDI-Consent
EDI-X12 EDIFACT eshop font-tdpfr http hyperstudio iges index index.cmd index.obj index.response index.vnd iotp ipp isup mac-binhex40
rtf sdp set-payment set-payment-initiation set-registration set-registration-initiation sgml
sgml-open-catalog sieve
slate timestamp-query timestamp-reply tve-trigger vemmi vnd.3gpp.pic-bw-large vnd.3gpp.pic-bw-small vnd.3gpp.pic-bw-var vnd.3gpp.sms vnd.3M.Post-it-Notes vnd.accpac.simply.aso vnd.accpac.simply.imp vnd.acucobol
vnd.acucorp vnd.adobe.xfdf vnd.aether.imp vnd.amiga.ami vnd.anser-web-certificate-issue-initiation vnd.anser-web-funds-transfer-initiation vnd.audiograph
vnd.blueice.multipass vnd.bmi
vnd.businessobjects vnd.canon-cpdl vnd.canon-lips vnd.cinderella vnd.claymore vnd.commerce-battelle vnd.commonspace vnd.contact.cmsg vnd.cosmocaller vnd.criticaltools.wbs+xml vnd.ctc-posml
vnd.cups-postscript vnd.cups-raster vnd.cups-raw vnd.curl vnd.cybank vnd.data-vision.rdz vnd.dna
vnd.dpgraph
vnd.dreamfactory vnd.dxr
vnd.ecdis-update vnd.ecowin.chart vnd.ecowin.filerequest vnd.ecowin.fileupdate vnd.ecowin.series vnd.ecowin.seriesrequest vnd.ecowin.seriesupdate vnd.enliven
vnd.epson.esf vnd.epson.msf vnd.epson.quickanime vnd.epson.salt vnd.epson.ssf vnd.ericsson.quickcall vnd.eudora.data vnd.fdf
vnd.ffsns vnd.fints vnd.FloGraphIt vnd.framemaker vnd.fsc.weblaunch vnd.fujitsu.oasys vnd.fujitsu.oasys2 vnd.fujitsu.oasys3 vnd.fujitsu.oasysgp vnd.fujitsu.oasysprs
vnd.groove-tool-message
vnd.groove-tool-template
vnd.japannet-directory-service vnd.japannet-jpnstore-wakeup vnd.japannet-payment-wakeup vnd.japannet-registration vnd.japannet-registration-wakeup vnd.japannet-setstore-wakeup vnd.japannet-verification vnd.japannet-verification-wakeup vnd.jisp
vnd.kde.karbon vnd.kde.kchart vnd.kde.kformula vnd.kde.kivio vnd.kde.kontour vnd.kde.kpresenter vnd.kde.kspread vnd.kde.kword vnd.kenameaapp vnd.kidspiration
vnd.koan vnd.liberty-request+xml vnd.llamagraphics.life-balance.desktop vnd.llamagraphics.life-balance.exchange+xml vnd.lotus-1-2-3 vnd.lotus-approach vnd.lotus-freelance vnd.lotus-notes vnd.lotus-organizer vnd.lotus-screencam vnd.lotus-wordpro vnd.mcd
vnd.mediastation.cdkey vnd.meridian-slingshot vnd.micrografx.flo vnd.micrografx.igx vnd.mif
vnd.minisoft-hp3000-save vnd.mitsubishi.misty-guard.trustweb vnd.Mobius.DAF
vnd.Mobius.DIS vnd.Mobius.MBK vnd.Mobius.MQY vnd.Mobius.MSL vnd.Mobius.PLC vnd.Mobius.TXF vnd.mophun.application
vnd.mophun.certificate vnd.sss-ntf
vnd.street-stream vnd.svd
vnd.swiftview-ics vnd.triscape.mxs vnd.trueapp vnd.truedoc vnd.ufdl vnd.uiq.theme vnd.uplanet.alert vnd.uplanet.alert-wbxml vnd.uplanet.bearer-choice vnd.uplanet.bearer-choice-wbxml vnd.uplanet.cacheop
vnd.uplanet.cacheop-wbxml vnd.uplanet.channel
vnd.uplanet.channel-wbxml vnd.uplanet.list
vnd.uplanet.list-wbxml vnd.uplanet.listcmd vnd.uplanet.listcmd-wbxml vnd.uplanet.signal
vnd.vcx vnd.vectorworks vnd.vidsoft.vidconference vnd.visio
Appendix I: Deprecated and Browser-Specific Markup
As the versions of HTML and XHTML have developed, quite a lot of markup has been deprecated, which is the W3C’s way of alerting web developers that it is likely to be removed from future versions of HTML and XHTML and that web-page authors should stop using it (although there is an acknowledgment that some people may still need to use it for a while). Where markup is deprecated, there is usually an acceptable alternative way to achieve the same goal (in many cases using CSS).
You can still use quite a lot of the deprecated markup that you meet in this appendix when using the Transitional XHTML DOCTYPE, but Strict XHTML has already removed most of the elements and attributes that affect presentation of elements.
I have included the details of these elements and attributes in this book, despite the fact that the markup is deprecated or out of date, because you are likely to come across it in other people’s code, and on very rare occasions you might need to resort to using some of this markup in order to get a specific job done.
In addition to deprecated markup, I will introduce some of the browser-specific markup that you may come across. This is markup that browser manufacturers added to their browsers to allow users to do more things than they could in competing browsers. These elements and attributes never made it into the HTML recommendations, and are therefore referred to as browser-specific markup.
This appendix covers the following:
❑ Elements and attributes that have been deprecated in recent versions of HTML and XHTML
❑ Specification of font appearances without using CSS
❑ Control of backgrounds without using CSS
❑ Control of presentations of links, lists, and tables without using CSS
❑ Elements and attributes that control the formatting of a document
❑ Elements, attributes, and styles that Microsoft added to IE (but that are not supported by other browser manufacturers)
Before you look at any of this markup, however, here’s a quick word on why a good part of this appendix is deprecated markup.
Why Deprecated Markup Exists
In the introduction to this book, I explained how XHTML 1.0 was created after HTML had reached version 4.01. The elements and attributes are virtually identical, but the syntax of XHTML is much stricter (for example, you must use lowercase letters in tag names, attributes must be enclosed in double quotes, and so on).
Up to that point, with each version of HTML, new elements and attributes were added and old ones removed. These changes have been necessary because web-page authors have wanted to create increasingly complicated pages, and also because there has been an increasing drive to separate the content of web pages from the rules that describe how the page should be displayed.
In older versions of HTML, before CSS was introduced, HTML contained markup that could be used to control the presentation of a web page (such as the <font> element that would control the font used in a document, or the bgcolor attribute that would set the background color of a page).
When CSS was introduced to style web pages, all of the HTML markup that had previously controlled how a page would appear could be removed. (This is a big source of deprecated markup.)
When your web pages just focus on the content (the words themselves), its structure (the headings and paragraphs), and its meaning (using elements that indicate their contents are an address or a quote), you end up with much simpler documents. You can also present the same document in different ways, which is particularly helpful considering that there are an increasing number of different devices being used to access the Web (from mobile phones to game consoles), all of which have different-sized screens and abilities, which may need styling in different ways.
Older Pages Break Many Rules
You should be aware that a lot of the pages you see on the Web probably break a lot of the rules you have learned in this book so far. You will see element and attribute names in upper- and lowercase, you will see missing quotation marks on attribute values, even attributes without values, and you will see elements that do not have closing tags. You will see pages without DOCTYPE declarations and pages littered with deprecated markup. Keep in mind, however, that many of the pages that break the rules you have learned might have been written when the rules were not as strict, and at the time of writing the code may have been perfectly acceptable. Indeed, the fact that web browsers would try to show pages even if the markup contained errors (and that they would skip over tags that they did not understand) significantly helped the adoption of HTML, because it let people who did not program develop web pages far more easily than they could in languages that showed complex error messages when they encountered something they didn’t understand.
It wasn’t just humans who wrote code that might be frowned upon these days. The early versions of authoring tools such as Microsoft FrontPage and Macromedia Dreamweaver sometimes generated code that had strange capitalization or missing quotation marks, and featured attributes without values. This does not make it okay to follow their lead; the first versions of these programs were written before XHTML came along with its stricter rules.
Having said all this, it is also worth noting that when HTML 5 comes out (which is unlikely to be before 2011), it will probably relax some of the rules imposed by XHTML (for example, it is likely to allow authors to mix upper- and lowercase again, and they might not need to close all elements). Even so, I think that there will still be advantages to learning code using the stricter XHTML syntax. For example, many of the tools written to work with XML can also be used with XHTML. These might not work if you have written HTML 5 pages that do not adhere to the stricter XHTML syntax.
Even if a page with bad or deprecated markup renders fine in your browser, it’s still wise to avoid this markup because your pages are less likely to appear as you intended on the increasing number of devices being used on the Web.
Fonts
In this section, you learn about several elements (and their attributes) that affect the appearance of text and fonts, all of which have been deprecated.
The < font > Element
The <font> element was introduced in HTML 3.2 and deprecated in HTML 4.0. It allows you to indicate the typeface, size, and color of the font the browser should display between the opening <font> and closing
</font> tags. You can probably still find many sites that are littered with <font> tags, one for each time the style of text changes on the page.
The following table shows the three attributes the <font> element relies upon:
Attribute  Purpose                                   Values
face       Specify the typeface that should be used  Name of the typeface to use (can include more than one name in order of preference)
size       Specify the size of the font              A number between 1 and 7, where 1 is the smallest font size and 7 is the largest
color      Specify the color of the font             A color name or hex value (see Appendix D)
The following is an example of how the <font> element would have been used (ai_eg01.html). You
can see that there are three occurrences of the <font> element:
<p>This is the browser's default font.</p>
<font face="arial, verdana, sans-serif" size="2">
<h1>Example of the &lt;font&gt; Element</h1>
<p><font size="4" color="darkgray">Here is some size 4 writing
in the color called darkgray. The typeface is determined by the
previous &lt;font&gt; element that contains this paragraph.</font></p>
<p><font face="courier" size="2" color="#000000">Now here is a courier
font, size 2, in black.</font></p>
</font>
As you can see from Figure I-1, all the writing within a <font> element follows the rules laid down in
the attributes on its opening <font> tag. The first paragraph is in the browser’s default
font (which is probably a size 3 Times family font in black). The first <font> element appears directly
after this paragraph and contains the rest of the page, therefore acting like a default setting for the rest of
the page, which should appear in an Arial typeface.
The name of the Arial typeface is followed by the typeface Verdana; this is supposed to be
a second choice if Arial is not available. Then, if Verdana is not available, the browser’s default sans-serif
font should be used:
<font face="arial, verdana, sans-serif" size="2">
This <font> element also indicates that the default size of the text in the rest of the document should be
size 2. Note that this <font> element does not override the size of the <h1> element, but it does affect
the typeface used: the heading is written in Arial.
While this <font> element is acting as a default for most of the page, if you want a particular part of the page to have any other font properties, you can indicate so in another <font> element.
You can see in the second paragraph that the color and the size of the font are changed to dark gray and size 4:
<p><font size="4" color="darkgray">Here is some size 4 darkgray writing</font></p>
The third paragraph then uses a different typeface, a smaller size, and black:
<p><font face="courier" size="2" color="#000000">Now here is a courier font, size 2, in black</font></p>
Note that you may have to use <font> elements inside <td> and <th> elements, as the styles specified outside tables are not inherited by the text inside cells. Figure I-2 shows you the different font sizes from 1 to 7 (ai_eg02.html).
Figure I-2
Font sizes can change slightly from browser to browser, so you cannot rely on them to be exactly the same number of pixels tall or wide in a layout.
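A page showing the seven sizes could be written along the following lines. This is only a minimal sketch of the idea, not the actual contents of ai_eg02.html, and the text inside each element is made up:

```html
<!-- One <font> element per size, from smallest (1) to largest (7) -->
<font size="1">Font size 1</font><br />
<font size="2">Font size 2</font><br />
<font size="3">Font size 3</font><br />
<font size="4">Font size 4</font><br />
<font size="5">Font size 5</font><br />
<font size="6">Font size 6</font><br />
<font size="7">Font size 7</font><br />
```

Viewing a fragment like this in more than one browser is a quick way to see how much the rendered sizes differ.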
The preferred method with CSS would be to use the font-family, font-size, and color properties
on the element containing the text that you wanted to style. You learned about these CSS properties in Chapter 7.
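As a rough sketch of how the earlier <font> example might be rewritten using those three properties, consider the following page. The class names content, note, and code are made up for this illustration, and the size keywords small and large are only approximate counterparts to font sizes 2 and 4, so the result may not match the original pixel for pixel:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Styling Text with CSS</title>
  <style type="text/css">
    /* Typeface for everything in the wrapper, in order of preference */
    .content { font-family: arial, verdana, sans-serif; }
    /* Roughly equivalent to <font size="4" color="darkgray"> */
    .content p.note { font-size: large; color: darkgray; }
    /* Roughly equivalent to <font face="courier" size="2" color="#000000"> */
    .content p.code { font-family: courier, monospace; font-size: small; color: #000000; }
  </style>
</head>
<body>
  <p>This is the browser's default font.</p>
  <div class="content">
    <h1>Example of Styling Text with CSS</h1>
    <p class="note">Here is some large writing in the color called darkgray.</p>
    <p class="code">Now here is a courier font, small, in black.</p>
  </div>
</body>
</html>
```

Because the font-size rules are applied to the paragraphs rather than to the wrapper, the heading keeps its default size while still picking up the Arial typeface, which mirrors the behavior of the <font> version.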
The text Attribute
The text attribute is used on the <body> element to indicate the default color for text in the document;
it was deprecated in HTML 4. Its value should be either a color name or a hex color. For example (ai_eg03.html):
<body text="#999999">
This text should be in a different color than the next bit <font color="#000000">which is black</font>, and now back to gray.
</body>
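A CSS version of the same page might look something like the sketch below, which sets the color property on the <body> element and overrides it for one phrase. The class name highlight is made up for the example:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Default Text Color with CSS</title>
  <style type="text/css">
    /* Replaces text="#999999" on the <body> element */
    body { color: #999999; }
    /* Replaces <font color="#000000"> around the black phrase */
    .highlight { color: #000000; }
  </style>
</head>
<body>
  <p>This text should be in a different color than the next bit
    <span class="highlight">which is black</span>, and now back to gray.</p>
</body>
</html>
```

The text is wrapped in a <p> element here because Strict XHTML does not allow text to sit directly inside the <body> element.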