if(args.length < 2){
   System.out.println("java com.psol.xbe.BestDeal filename delivery");
   return;
}
ComparingMachine comparingMachine =
   new ComparingMachine(Integer.parseInt(args[1]));
SAX2Internal sax2Internal =
   new SAX2Internal(comparingMachine);
try{
   Parser parser = ParserFactory.makeParser(PARSER_NAME);
   parser.setDocumentHandler(sax2Internal);
   parser.parse(args[0]);
}catch(SAXException e){
   Exception x = e.getException();
   if(null != x)
      throw x;
   else
      throw e;
" days");
}}
Listing 8.4: continued
* This class receives events from the SAX2Internal adapter
* and does the comparison required
* This class holds the “business logic.”
*/
class ComparingMachine{
targetDelivery = td;
}
251 Maintaining the State
* called by SAX2Internal when it has found the product name
* @param name the product name
* called by SAX2Internal when it has found a price
* @param vendor vendor’s name
* @param price price proposal
* @param delivery delivery time proposal
*/
public void compare(String vendor,double price,int delivery){
   if(delivery <= targetDelivery){
      if(bestPrice > price){
         bestPrice = price;
         vendorName = vendor;
         proposedDelivery = delivery;
      }
   }
}
/**
* property accessor: vendor’s name
* @return the vendor with the cheapest offer so far
* property accessor: best price
* @return the best price so far
* property accessor: proposed delivery
* @return the proposed delivery time
* property accessor: product name
* @return the product name
/**
* SAX event handler to adapt from the SAX interface to
* whatever the application uses internally
*/
class SAX2Internal extends HandlerBase{
/**
* state constants
*/
final protected int START = 0,
                    PRODUCT = 1,
                    PRODUCT_NAME = 2,
                    VENDOR = 3,
                    VENDOR_NAME = 4,
                    VENDOR_PRICE = 5;
* @param attributes element’s attributes
{
case START:
   if(name.equals("product"))
      state = PRODUCT;
   break;
case PRODUCT:
   if(name.equals("name")){
      state = PRODUCT_NAME;
      currentElement = new LeafElement(name,attributes);
   }
   if(name.equals("vendor"))
      state = VENDOR;
   break;
case VENDOR:
   if(name.equals("name")){
      state = VENDOR_NAME;
      currentElement = new LeafElement(name,attributes);
   }
   if(name.equals("price")){
      state = VENDOR_PRICE;
      currentElement = new LeafElement(name,attributes);
   }
   break;
}}
* content of the element
* @param chars document’s characters
* @param start first character in the content
* @param length number of characters in the content
*/
public void characters(char[] chars,int start,int length){
   switch(state){
      case PRODUCT_NAME:
      case VENDOR_NAME:
      case VENDOR_PRICE:
         currentElement.append(chars,start,length);
         break;
   }
}
case PRODUCT_NAME:
   if(name.equals("name")){
      state = PRODUCT;
      comparingMachine.setProductName(
         currentElement.getText());
   }
   break;
case VENDOR:
   if(name.equals("vendor"))
      state = PRODUCT;
   break;
case VENDOR_NAME:
   if(name.equals("name")){
      state = VENDOR;
      currentVendor = currentElement;
   }
   break;
case VENDOR_PRICE:
   if(name.equals("price")){
      state = VENDOR;
      double price = toDouble(currentElement.getText());
      Dictionary attributes = currentElement.getAttributes();
      String stringDelivery = (String)attributes.get("delivery");
      int delivery = Integer.parseInt(stringDelivery);
      comparingMachine.compare(currentVendor.getText(),
                               price,delivery);
   }
   break;
}}
/**
* helper method: turn a string into a double
* @param string number as a string
* @return the number as a double, or 0.0 if it cannot convert
   return stringDouble.doubleValue();
else
   return 0.0;
}}
/*
* helper class: used to store a leaf element content
*/
class LeafElement{
* creates a new element
* @param n element’s name
* @param al element’s attributes
*/
public LeafElement(String n,AttributeList al){
   name = n;
   attributes = new Hashtable();
   for(int i = 0;i < al.getLength();i++)
      attributes.put(al.getName(i),al.getValue(i));
/**
* append to the current text
* @param chars array of characters
* @param start where to start in chars
* @param length how many characters to consider in chars
You compile and run this application just like the “Cheapest” application introduced previously. The results depend on the urgency of the delivery. You will notice that this program takes two parameters: the filename and the longest acceptable delivery delay, in days.
java -classpath c:\xml4j\xml4j.jar;classes com.psol.xbe.BestDeal product.xml 60
returns
The best deal is proposed by XMLi
a XML Training at 699.0 delivered in 45 days
whereas
java -classpath c:\xml4j\xml4j.jar;classes com.psol.xbe.BestDeal product.xml 3
returns
The best deal is proposed by Emailaholic
a XML Training at 1999.0 delivered in 2 days
A Layered Architecture
Listing 8.4 is the most complex application you have seen so far. That is to be expected: the SAX parser is very low level, so the application has to take over a lot of the work.
The application is organized around two classes: SAX2Internal and ComparingMachine. SAX2Internal manages the interface with the SAX parser. It manages the state and groups several elements in a coherent way. For that purpose, it uses the LeafElement class as temporary storage.
ComparingMachine has the logic to perform price comparison. It also maintains information in a structure that is optimized for the application, not XML. The architecture for this application is illustrated in Figure 8.7.
SAX2Internal handles several events from DocumentHandler. It registers the startElement, endElement, and characters events.
Figure 8.7: The architecture for the application
When processing these events, SAX2Internal needs to know where it is in the document tree. When handling a characters event, for example, it needs to know whether the text is attached to a name or to a price element. It also needs to know whether the name is the product name or the vendor name.
States
A SAX parser, unlike a DOM parser, does not provide context information. Therefore, the application has to track its location within the document. First, you have to identify all possible states and determine how to transition from one state to the next. It’s easy to derive states from the document structure in Figure 8.6.
The application will first encounter a product tag. The first state should therefore be “within a product element.” From there, the application will reach a product name. The second state is therefore “within a name element in the product element.”
The next element has to be a vendor, so the third state is “within a vendor element in the product element.” The fourth state is “within a name element in a vendor element in a product element” because a name follows the vendor.
The name is followed by a price element and the corresponding state is “within a price element in a vendor element in a product element.”
Afterward, the parser will encounter either a price element or another vendor element. You already have states for these two cases.
It’s easier to visualize this concept on a graph with states and state transitions, such as the one shown in Figure 8.8. Note that there are two different states related to two different name elements, depending on whether you are dealing with product/name or product/vendor/name.
Figure 8.8: State transition diagram
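The states and transitions described above can also be written as two small functions that are easy to trace by hand. The following is only a sketch: the StateSketch class, the enum, and the "+tag"/"-tag" event notation are illustrative assumptions and are not part of Listing 8.4, which stores the state in an int field instead.

```java
public class StateSketch {
    // Same six states as the listing, expressed as an enum for readability.
    enum State { START, PRODUCT, PRODUCT_NAME, VENDOR, VENDOR_NAME, VENDOR_PRICE }

    /** Transition taken when an opening tag is seen. */
    static State open(State s, String tag) {
        switch (s) {
            case START:
                return tag.equals("product") ? State.PRODUCT : s;
            case PRODUCT:
                if (tag.equals("name")) return State.PRODUCT_NAME;
                if (tag.equals("vendor")) return State.VENDOR;
                return s;
            case VENDOR:
                if (tag.equals("name")) return State.VENDOR_NAME;
                if (tag.equals("price")) return State.VENDOR_PRICE;
                return s;
            default:
                return s; // unknown elements leave the state unchanged
        }
    }

    /** Transition taken when a closing tag is seen. */
    static State close(State s, String tag) {
        switch (s) {
            case PRODUCT_NAME: return tag.equals("name") ? State.PRODUCT : s;
            case VENDOR_NAME:  return tag.equals("name") ? State.VENDOR : s;
            case VENDOR_PRICE: return tag.equals("price") ? State.VENDOR : s;
            case VENDOR:       return tag.equals("vendor") ? State.PRODUCT : s;
            default:           return s;
        }
    }

    /** Replays events written as "+tag" (open) or "-tag" (close). */
    static State replay(String... events) {
        State s = State.START;
        for (String e : events) {
            String tag = e.substring(1);
            s = e.charAt(0) == '+' ? open(s, tag) : close(s, tag);
        }
        return s;
    }

    public static void main(String[] args) {
        // The same tag name leads to two different states depending on context.
        System.out.println(replay("+product", "+name"));
        System.out.println(replay("+product", "+name", "-name", "+vendor", "+name"));
    }
}
```

The two calls in main show why a single "name" state would not be enough: the first ends in PRODUCT_NAME, the second in VENDOR_NAME.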
In the example, the state variable is the current state:
case START:
   if(name.equals("product"))
      state = PRODUCT;
   break;
case PRODUCT:
   if(name.equals("name")){
      state = PRODUCT_NAME;
      currentElement = new LeafElement(name,attributes);
   }
   if(name.equals("vendor"))
      state = VENDOR;
   break;
case VENDOR:
   if(name.equals("name")){
      state = VENDOR_NAME;
      currentElement = new LeafElement(name,attributes);
   }
   if(name.equals("price")){
      state = VENDOR_PRICE;
      currentElement = new LeafElement(name,attributes);
   }
   break;
}}
SAX2Internal creates instances of LeafElement to temporarily store the content of the name and price elements. At any time, SAX2Internal maintains a small subset of the tree in memory. Note that, unlike DOM, it never builds the complete tree but builds only small subsets. It also discards the subsets as the application consumes them.
CAUTION
The values in AttributeList are available only during the startElement event. Consequently, LeafElement copies them to a Dictionary.
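The copy step can be tried in isolation. The sketch below uses the SAX 1 classes org.xml.sax.AttributeList and org.xml.sax.helpers.AttributeListImpl, which are deprecated but still ship with the JDK. The AttributeCopySketch class and its method names are illustrative assumptions, and the call to clear() merely stands in for a parser reusing its attribute list after the event returns.

```java
import java.util.Hashtable;
import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

@SuppressWarnings("deprecation") // SAX 1 types, as used by the listing
public class AttributeCopySketch {
    /** Copies every name/value pair so it outlives the startElement call. */
    static Hashtable<String, String> copy(AttributeList al) {
        Hashtable<String, String> saved = new Hashtable<>();
        for (int i = 0; i < al.getLength(); i++)
            saved.put(al.getName(i), al.getValue(i));
        return saved;
    }

    /** Returns the copied "delivery" value after the source list is emptied. */
    static String demo() {
        AttributeListImpl al = new AttributeListImpl();
        al.addAttribute("delivery", "CDATA", "45");
        Hashtable<String, String> saved = copy(al);
        al.clear(); // simulates the parser recycling the list
        return saved.get("delivery");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Hashtable is a subclass of Dictionary, so the copy can be handed out as a Dictionary as the listing does.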
2. The characters event is used to record the content of an element. It makes sense to record text only in the name and price elements, so the event handler uses the state.
switch(state) {
case PRODUCT_NAME:
   if(name.equals("name")){
      state = PRODUCT;
      comparingMachine.setProductName(
         currentElement.getText());
   }
   break;
case VENDOR:
   if(name.equals("vendor"))
      state = PRODUCT;
   break;
case VENDOR_NAME:
   if(name.equals("name")){
      state = VENDOR;
      currentVendor = currentElement;
   }
   break;
case VENDOR_PRICE:
   if(name.equals("price")){
      state = VENDOR;
      double price = toDouble(currentElement.getText());
      Dictionary attributes = currentElement.getAttributes();
      String stringDelivery = (String)attributes.get("delivery");
      int delivery = Integer.parseInt(stringDelivery);
      comparingMachine.compare(currentVendor.getText(),
                               price,delivery);
   }
   break;
}
Lessons Learned
Listing 8.4 is typical for a SAX application. The SAX event handler packages the data in the format most appropriate for the application. It might have to build a partial tree (in this case, using LeafElement) in the process. The application logic (in ComparingMachine) is totally unaware of XML. As far as it is concerned, the data could be coming from a database or a comma-delimited file.
Because of this clean-cut separation between the application logic and the parsing, it is a good idea to adopt a layered approach and use a separate class for the event handler.
The example also clearly illustrates that SAX is more efficient than DOM but requires more work from the programmer. With a SAX parser, the programmer has to explicitly manage states and transitions between states. With DOM, the state was implicit in the recursive walk of the tree.
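To see the two layers side by side in a form that runs on a current JDK, here is a compact sketch. It is not Listing 8.4: it swaps in the SAX 2 DefaultHandler and the JAXP SAXParserFactory for HandlerBase and ParserFactory, and the class names (LayeredSketch, Deal, Adapter) and the inline SAMPLE document are illustrative assumptions. The sample reuses the two offers printed in the output earlier; the real product.xml may contain more.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LayeredSketch {
    /** Business logic layer: no XML types appear here. */
    static final class Deal {
        final int target;
        double price = Double.MAX_VALUE;
        String vendor;
        Deal(int target) { this.target = target; }
        void compare(String v, double p, int delivery) {
            if (delivery <= target && p < price) { price = p; vendor = v; }
        }
    }

    /** Parsing layer: tracks the state and forwards complete offers. */
    static final class Adapter extends DefaultHandler {
        static final int START = 0, PRODUCT = 1, PRODUCT_NAME = 2,
                         VENDOR = 3, VENDOR_NAME = 4, VENDOR_PRICE = 5;
        private final Deal deal;
        private final StringBuilder text = new StringBuilder();
        private int state = START;
        private String vendor;
        private int delivery;
        Adapter(Deal deal) { this.deal = deal; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0); // every leaf starts with an empty buffer
            if (state == START && qName.equals("product")) state = PRODUCT;
            else if (state == PRODUCT && qName.equals("name")) state = PRODUCT_NAME;
            else if (state == PRODUCT && qName.equals("vendor")) state = VENDOR;
            else if (state == VENDOR && qName.equals("name")) state = VENDOR_NAME;
            else if (state == VENDOR && qName.equals("price")) {
                state = VENDOR_PRICE;
                delivery = Integer.parseInt(atts.getValue("delivery"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so append rather than assign.
            if (state == VENDOR_NAME || state == VENDOR_PRICE) text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (state == PRODUCT_NAME && qName.equals("name")) state = PRODUCT;
            else if (state == VENDOR_NAME && qName.equals("name")) {
                vendor = text.toString().trim();
                state = VENDOR;
            } else if (state == VENDOR_PRICE && qName.equals("price")) {
                deal.compare(vendor, Double.parseDouble(text.toString().trim()), delivery);
                state = VENDOR;
            } else if (state == VENDOR && qName.equals("vendor")) state = PRODUCT;
        }
    }

    // Hypothetical sample document, shaped after the elements the listing reads.
    static final String SAMPLE =
        "<product><name>XML Training</name>"
      + "<vendor><name>XMLi</name><price delivery=\"45\">699.00</price></vendor>"
      + "<vendor><name>Emailaholic</name><price delivery=\"2\">1999.00</price></vendor>"
      + "</product>";

    static String bestVendor(int targetDelivery) {
        Deal deal = new Deal(targetDelivery);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(SAMPLE)), new Adapter(deal));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return deal.vendor;
    }

    public static void main(String[] args) {
        System.out.println(bestVendor(60));
        System.out.println(bestVendor(3));
    }
}
```

Deal could be fed from a database query or a delimited file without any change, which is the point of keeping the event handler in its own class.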
Flexibility
XML is a very flexible standard. However, in practice, XML applications are only as flexible as you, the programmer, make them. In this section, we will look at some tips to ensure your applications exploit XML flexibility.
Build for Flexibility
This application puts very few constraints on the structure of the underlying document. It simply ignores new elements. For example, it would accept the following vendor element:
<vendor>
<name>Playfield Training</name>
<contact>John Doe</contact>
Enforce a Structure
It’s not difficult to enforce a specific structure. The following code snippet checks the structure and throws a SAXException if a vendor element contains anything but name or price elements.
case VENDOR:
   if(name.equals("name")){
      state = VENDOR_NAME;
      currentElement = new LeafElement(name,attributes);
   }
   else if(name.equals("price")){
      state = VENDOR_PRICE;
      currentElement = new LeafElement(name,attributes);
   }
   else
      throw new SAXException("<name> or <price> expected");
This chapter looks at the mirror problem: how to write XML documents from an application. The mirror component for the parser is called a generator. Whereas the parser reads XML documents, the generator writes them.
In this chapter, you learn how to write documents:
• through DOM, which is ideal for modifying XML documents.
• through your own generator, which is more efficient.
The Parser Mirror
In practice, some parsers integrate a generator. They can read and write XML documents. Consequently, the term parser is often used to refer to the combination of the parser and the generator.
There are two schools of thought when it comes to generators:
• The first school argues that you need packaged generators for the same reason you need packaged parsers: to shield the programmer from the XML syntax.
• The other school argues that writing XML documents is simple and can easily be done with ad hoc code.
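As a rough idea of what “ad hoc code” can look like for the second school, here is a minimal sketch of a writer that handles only tags and text. The TinyXmlWriter class and its two methods are illustrative assumptions, not code from this chapter, and the sketch is in Java rather than the JavaScript used by the chapter's listings. It deliberately ignores attributes, namespaces, and encodings.

```java
public class TinyXmlWriter {
    /** Replaces the characters that cannot appear literally in element text. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Writes one leaf element: opening tag, escaped text, closing tag. */
    static String element(String name, String text) {
        return "<" + name + ">" + escape(text) + "</" + name + ">";
    }

    public static void main(String[] args) {
        System.out.println(element("name", "R&D <basics>"));
    }
}
```

The only part that is easy to get wrong is the escaping, which is why even a hand-written generator should route all text through one function like escape().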
As usual, I’m a pragmatist and I choose one option or the other depending on the needs of the application at hand. In general, however, it is dramatically easier to generate XML documents than to read them. This is because you control what you write but the author controls what you read.
Indeed, when reading a document, you may have to deal not only with tags but also with entities, exotic character sets, and notations, not to mention errors and DTD validation.
However, when writing the document, you decide. If your applications don’t need entities, don’t use them. If you are happy with ASCII, stick to it. Most applications need few of the features of XML besides the tagging mechanism.
Therefore, although a typical XML parser is a thousand lines of code, a simple but effective generator can be written in a dozen lines.
This chapter looks at both approaches. You’ll start by using a DOM parser to generate XML documents and then you’ll see how to write your own generator. Finally, you will see how to support different DTDs.
The techniques are illustrated with JavaScript but port easily to Java.
Modifying a Document with DOM
In Chapter 7, “The Parser and DOM,” you saw how DOM parsers read documents. That is only one half of DOM. The other half is writing XML documents. DOM objects have methods to support creating or modifying XML documents.
Listing 9.1 is the XML price list used in Chapter 7.
✔ The example in the section “A DOM Application” in Chapter 7 (page 199) converted the prices into Euros and printed the result.
With small changes to the original application, you can record the new prices in the original document.
Listing 9.1: XML Price List
<!-- need one character in the text area -->
<TEXTAREA NAME="output" ROWS="22" COLS="70" READONLY>
Listing 9.2 is familiar from Chapter 7. Listing 9.3 is the JavaScript file, conversion.js, where the real action takes place. Figure 9.1 shows the result in a browser.
Listing 9.3: JavaScript Code
function convert(form,xmldocument){
   var fname = form.fname.value,
       output = form.output,
       rate = form.rate.value;
   output.value = "";
   var document = parse(fname,xmldocument),
       topLevel = document.documentElement;
   walkNode(topLevel,document,rate);
   addHeader(document,rate);
   output.value = document.xml;
}

function parse(uri,xmldocument){
   xmldocument.async = false;
   xmldocument.load(uri);
   if(xmldocument.parseError.errorCode != 0)
      alert(xmldocument.parseError.reason);
   return xmldocument;
}

function walkNode(node,document,rate){
   if(node.nodeType == 1){
      if(node.nodeName == "product")
         walkProduct(node,document,rate);
      else{
         var children,i;
         children = node.childNodes;
         for(i = 0;i < children.length;i++)
            walkNode(children.item(i),document,rate);
      }
   }
}

function walkProduct(node,document,rate){
   if(node.nodeType == 1 && node.nodeName == "product"){
      var price,children,i;
      }
      // append the new child after looping to avoid infinite loop
      var element = document.createElement("price"),
          text = document.createTextNode(getText(price) * rate);

function addHeader(document,rate){
   var comment = document.createComment(
          "Rate used for this conversion: " + rate),
       stylesheet = document.createProcessingInstruction(
          "xml-stylesheet",
          "href=\"prices.css\" type=\"text/css\""),
       topLevel = document.documentElement;
   document.insertBefore(comment,topLevel);
   document.insertBefore(stylesheet,comment);
}

function getText(node){
   return node.firstChild.data;
}
Figure 9.1: Result in a browser
This example displays the XML document in a form. The section “Doing Something with the XML Documents” explains how to save it.
Inserting Nodes
1. Most of Listing 9.3 is familiar. It walks through the price list and converts prices from American dollars to Euros. The novelty is that it inserts a new price element in the price list with the price in Euros. It also adds a currency attribute to every price element.
function walkProduct(node,document,rate){
   if(node.nodeType == 1 && node.nodeName == "product"){
      var price,children,i;
      }
      // append the new child after looping to avoid infinite loop
      var element = document.createElement("price"),
          text = document.createTextNode(getText(price) * rate);
The DOM Document object has methods to create elements, comments, text nodes, processing instructions, and so on. The walkProduct() function uses both createElement() and createTextNode().
The DOM Node object has methods for adding and removing objects from the document tree. Because most DOM objects are derived from Node, they inherit these methods. The walkProduct() function uses appendChild() to insert the new nodes.
Finally, Element has a setAttribute() method that creates new attributes.
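The same four DOM calls exist under slightly different spellings in Java's org.w3c.dom package, so the insertion step can be sketched outside the browser. In the sketch below, the DomInsertSketch class, the addEuroPrice helper, the "eur" attribute value, and the rate of 0.5 are all illustrative assumptions chosen for the example; they do not come from Listing 9.3.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomInsertSketch {
    /** Builds <price currency="eur">converted</price> and appends it to product. */
    static Element addEuroPrice(Document doc, Element product, double dollars, double rate) {
        Element price = doc.createElement("price");           // Document creates the node
        price.setAttribute("currency", "eur");                // Element sets the attribute
        price.appendChild(doc.createTextNode(String.valueOf(dollars * rate)));
        product.appendChild(price);                           // Node inserts it in the tree
        return price;
    }

    /** Creates an empty product element and adds one converted price to it. */
    static Element demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                               .newDocumentBuilder().newDocument();
            Element product = doc.createElement("product");
            doc.appendChild(product);
            return addEuroPrice(doc, product, 100.0, 0.5);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element price = demo();
        System.out.println(price.getAttribute("currency") + " " + price.getTextContent());
    }
}
```

Note that a node created by createElement() belongs to the document but is not part of the tree until appendChild() or insertBefore() places it there.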
CAUTION
Don’t add children to a node while looping through them, or you will create an infinite loop.
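The loop never ends because the list returned by childNodes is live: each appended child makes the list one item longer. Listing 9.3 avoids this by appending after the loop. An alternative, sketched here in Java under the assumption that a one-off copy of the list is acceptable, is to take a snapshot first and iterate over that. The SnapshotLoopSketch class and its methods are illustrative and not part of the chapter's code.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SnapshotLoopSketch {
    /** Appends a copy of every price child; returns the final child count. */
    static int duplicatePrices(Element product) {
        Document doc = product.getOwnerDocument();
        NodeList live = product.getChildNodes();
        // Freeze the current children; the live list would keep growing.
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++)
            snapshot.add(live.item(i));
        for (Node n : snapshot) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals("price")) {
                Element copy = doc.createElement("price");
                copy.setTextContent(n.getTextContent());
                product.appendChild(copy); // safe: we are not iterating over 'live'
            }
        }
        return product.getChildNodes().getLength();
    }

    /** Builds a product with two price children and duplicates them. */
    static int demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                               .newDocumentBuilder().newDocument();
            Element product = doc.createElement("product");
            doc.appendChild(product);
            for (String value : new String[] {"699.0", "1999.0"}) {
                Element price = doc.createElement("price");
                price.setTextContent(value);
                product.appendChild(price);
            }
            return duplicatePrices(product);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```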