XML for beginners
eXtensible Markup Language
Introduction and Motivation
Dr. Praveen Madiraju (modified from Dr. Sagiv’s slides)
XML vs HTML
• HTML is a HyperText Markup Language
– Designed for a specific application, namely, presenting and linking hypertext
Main Features of XML
• No fixed set of tags
– New tags can be added for new applications
• An agreed-upon set of tags can be used in many applications
– Namespaces facilitate uniform and coherent descriptions of data
• For example, a namespace for address books determines whether to use <tel> or <phone>
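As a minimal sketch of how a namespace pins down tag names, the Python snippet below parses a tiny address book with the standard xml.etree.ElementTree module. The namespace URI http://example.org/addressbook, the ab prefix, and the element names are assumptions made up for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; any URI the applications agree on would do.
AB = "http://example.org/addressbook"

doc = (
    f'<ab:book xmlns:ab="{AB}">'
    "<ab:entry><ab:name>Lisa</ab:name><ab:phone>555-0100</ab:phone></ab:entry>"
    "</ab:book>"
)

root = ET.fromstring(doc)

# ElementTree expands a prefixed tag to "{namespace-uri}local-name",
# so <phone> in this namespace cannot be confused with another vocabulary's <phone>.
phones = [p.text for p in root.iter(f"{{{AB}}}phone")]

print(root.tag)
print(phones)
```

Two applications that agree on the same namespace URI agree on the tag vocabulary, regardless of which prefix each document uses.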
Main Features of XML (cont’d)
• XML has the concept of a schema
– DTD and the more expressive XML Schema
• XML is a data model
– Similar to the semistructured data model
• XML supports internationalization (Unicode) and platform independence (an XML file is just a character file)
XML is the Standard for
XML is not Alone
• XML Schemas strengthen the data-modeling capabilities of XML (in comparison to XML with only DTDs)
• XPath is a language for accessing parts of XML documents
• XLink and XPointer support cross-references
• XSLT is a language for transforming XML documents into other XML documents (including XHTML, for displaying XML files)
– Limited styling of XML can be done with CSS alone
• XQuery is a language for querying XML documents
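To give a feel for XPath-style access, here is a small Python sketch using xml.etree.ElementTree, which supports only a limited subset of XPath. The <bib> document, its titles, and its year attributes are placeholder data invented for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder bibliography; the element names and values are illustrative only.
doc = """<bib>
  <book year="1990"><title>First Title</title></book>
  <book year="2001"><title>Second Title</title></book>
</bib>"""

root = ET.fromstring(doc)

# Path expression: every <title> that is a child of a <book> under the root.
titles = [t.text for t in root.findall("./book/title")]

# Predicate on an attribute: only books whose year attribute equals "2001".
recent = [b.find("title").text for b in root.findall("./book[@year='2001']")]

print(titles)
print(recent)
```

Full XPath, XSLT, and XQuery need a dedicated processor; this sketch only shows the idea of addressing parts of a document by path.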
The Two Facets of XML
• Some XML files are just text documents with tags that denote their structure and include some metadata (e.g., an attribute that gives the name of the person who did the proofreading)
– See an example on the next slide
– XML is a subset of SGML (Standard Generalized Markup Language)
• Other XML documents are similar to database files (e.g., an address book)
XML can Describe the Structure of a Document
XML Syntax
W3Schools Resources on XML Syntax
The Structure of XML
• XML consists of tags and text
• Tags come in pairs: <date> </date>
• Tags must be properly nested
– good
<date> <day> </day> </date>
– bad
<date> <day> </date> </day>
(You can’t do <i> <b> </i> </b> in HTML either)
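The good and bad nestings above can be checked mechanically. The following Python sketch feeds both fragments to the standard xml.etree.ElementTree parser; the helper name parses is made up for this example.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<date> <day> </day> </date>"
bad = "<date> <day> </date> </day>"  # </date> closes before </day>

print(parses(good))
print(parses(bad))
```

An XML parser rejects the second fragment outright, whereas HTML browsers have traditionally tried to recover from such markup.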
<name> Lisa Simpson </name>
<mother idref="marge"/>
<father idref="homer"/>
An attribute consists of a name and a value
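A short Python sketch of reading attributes as name/value pairs with xml.etree.ElementTree follows. The enclosing <person id="lisa"> element is an assumption added here so that the fragment has a single root; it is not shown on the slide.

```python
import xml.etree.ElementTree as ET

# <person id="lisa"> is a hypothetical wrapper so the fragment forms one document.
doc = """<person id="lisa">
  <name> Lisa Simpson </name>
  <mother idref="marge"/>
  <father idref="homer"/>
</person>"""

person = ET.fromstring(doc)

# Each element exposes its attributes as a mapping from name to value.
person_id = person.get("id")
parents = {child.tag: child.get("idref")
           for child in person if "idref" in child.attrib}

print(person_id)
print(parents)
```

Attribute values are always strings, and an empty element such as <mother idref="marge"/> carries all of its information in its attributes.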
XML Text
• XML has only one “basic” type: text
• It is bounded by tags, e.g.,
<title> The Big Sleep </title>
<year> 1935 </year> (1935 is still text)
• XML text is called PCDATA (parsed character data)
• Text is Unicode; any character can be written as a numeric character reference, e.g., &#x0152; for Œ
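The point that everything is text can be seen by parsing a small document in Python. This is a sketch with xml.etree.ElementTree; the <movie> wrapper and the <note> element are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# <movie> and <note> are hypothetical elements wrapped around the slide's example.
doc = ("<movie><title> The Big Sleep </title>"
       "<year> 1935 </year><note>&#x0152;</note></movie>")

movie = ET.fromstring(doc)

year_text = movie.find("year").text  # still a string, spaces included
year = int(year_text)                # the application must convert it itself
ligature = movie.find("note").text   # the parser resolves the character reference

print(type(year_text).__name__, repr(year_text))
print(year)
print(ligature)
```

Any typing beyond text (numbers, dates) has to come from the application or from a schema layered on top of the document.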
XML Structure
• Nested tags can be used to express various structures, e.g., a tuple
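As a sketch of the tuple idea, the Python snippet below treats the children of one element as the fields of a record. The <person> element, its field names, and its values are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: each child element plays the role of one tuple field.
doc = ("<person>"
       "<name>Lisa Simpson</name>"
       "<tel>555-0100</tel>"
       "<email>lisa@example.org</email>"
       "</person>")

person = ET.fromstring(doc)

# Field names come from the tags, field values from the text.
record = {child.tag: child.text for child in person}
as_tuple = tuple(record.values())

print(record)
print(as_tuple)
```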
XML Structure (cont’d)
• We can represent a list by using the same tag repeatedly:
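A minimal Python sketch of a list encoded by a repeated tag, again with xml.etree.ElementTree. The <addresses> and <person> elements and their contents are placeholders, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical address list: the repeated <person> tag forms the list,
# and the repeated <tel> tag forms a nested list inside one person.
doc = """<addresses>
  <person><name>Lisa</name><tel>555-0100</tel><tel>555-0101</tel></person>
  <person><name>Bart</name><tel>555-0102</tel></person>
</addresses>"""

root = ET.fromstring(doc)

# findall returns the repeated elements in document order.
names = [p.find("name").text for p in root.findall("person")]
lisa_tels = [t.text for t in root.findall("person")[0].findall("tel")]

print(names)
print(lisa_tels)
```

Document order is preserved, which is what makes repetition usable as a list rather than a set.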
The segment of an XML document between an opening tag and its corresponding closing tag is called an element
The Header Tag
• <?xml version="1.0" standalone="yes"?> (standalone is either "yes" or "no")
Processing Instructions
<?xml version="1.0"?>
<?xml-stylesheet href="doc.xsl" type="text/xsl"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc>Hello, world!<!-- Comment 1 --></doc>
<?pi-without-data?>
<!-- Comment 2 -->
<!-- Comment 3 -->
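To see how a parser classifies these pieces, here is a Python sketch using the standard xml.dom.minidom module, which keeps processing instructions and comments as separate node types. It assumes the comments are written with the usual <!-- ... --> delimiters, and it relies on the parser not fetching the external doc.dtd.

```python
from xml.dom import minidom, Node

doc = """<?xml version="1.0"?>
<?xml-stylesheet href="doc.xsl" type="text/xsl"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc>Hello, world!<!-- Comment 1 --></doc>
<?pi-without-data?>
<!-- Comment 2 -->
<!-- Comment 3 -->"""

dom = minidom.parseString(doc)

# Top-level nodes of the document, outside the root element.
pi_targets = [n.target for n in dom.childNodes
              if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE]
top_comments = [n.data.strip() for n in dom.childNodes
                if n.nodeType == Node.COMMENT_NODE]
root_name = dom.documentElement.tagName

print(pi_targets)
print(top_comments)
print(root_name)
```

The XML declaration on the first line is not reported as a processing instruction, even though it uses the same <? ... ?> delimiters.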
<DESCRIPTION>
Enter the member by the name on his or her papers. Use the NAME tag. The NAME tag has two attributes. Common (all in lowercase, please!) is the dog's call name. Breed (also in all lowercase) is the dog's breed. Please see the breed reference guide for acceptable breeds. Your entry should look something like this:
</DESCRIPTION>
<EXAMPLE>
<![CDATA[ <NAME common="freddy" breed="springer-spaniel">Sir Fredrick of Ledyard's End</NAME> ]]>
</EXAMPLE>
We want to see the text as is, even though it includes tags
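The effect of a CDATA section can be checked with a short Python sketch using xml.etree.ElementTree. It assumes the example is wrapped in a complete <EXAMPLE> element and that the breed attribute is quoted as breed="springer-spaniel".

```python
import xml.etree.ElementTree as ET

doc = ('<EXAMPLE><![CDATA[ <NAME common="freddy" breed="springer-spaniel">'
       "Sir Fredrick of Ledyard's End</NAME> ]]></EXAMPLE>")

example = ET.fromstring(doc)

# The markup inside CDATA is not parsed: <EXAMPLE> has no child elements,
# and the would-be <NAME> tag arrives as plain character data.
child_count = len(example)
text = example.text.strip()

# When written back out, the same text is escaped instead of wrapped in CDATA.
serialized = ET.tostring(example, encoding="unicode")

print(child_count)
print(text)
print(serialized)
```

CDATA is only an input convenience; after parsing, the content is ordinary text, indistinguishable from text written with &lt; escapes.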
A Complete XML Document
http://www.mscs.mu.edu/~praveen/Teaching/fa05/AdvDb/Lectures/bib.xml
Well-Formed XML Documents
• An XML document (with or without a DTD) is well-formed if
– Tags are syntactically correct
– Every start tag has a matching end tag
– Tags are properly nested
– There is a single root element
– A start tag does not have two occurrences of the same attribute
• An XML document must be well-formed
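The conditions above are what any conforming parser enforces. As a rough sketch, the Python function below reports the parser's complaint for a few deliberately broken inputs; the function name well_formedness_error and the sample strings are made up for this example. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)

# Hypothetical inputs, one per rule from the list above.
samples = {
    "well-formed": "<a><b x='1'/></a>",
    "missing end tag": "<a><b></a>",
    "two root elements": "<a/><b/>",
    "duplicate attribute": "<a x='1' x='2'/>",
}

results = {label: well_formedness_error(text) for label, text in samples.items()}
for label, error in results.items():
    print(f"{label}: {error or 'ok'}")
```

The exact wording of the messages depends on the underlying parser, so code should test for an error rather than match on its text.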