Rails for Java Developers, Part 9



XML is intended to be human-readable and self-describing. XML is human-readable because it is a text format, and it is self-describing because data is described by elements such as <user>, <username>, and <homepage> in the preceding example. Another option for representing usernames and home pages would be XML attributes:

The attribute syntax is obviously more terse. It also implies semantic differences. Attributes are unordered, while elements are ordered. Attributes are also limited in the values they may contain: some characters are illegal, and attributes cannot contain nested data (elements, on the other hand, can nest arbitrarily deep).
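The difference between the two styles shows up in how a parser hands the data back. The following is a minimal sketch using REXML; the two document strings are assumptions modeled on the element and attribute forms the text describes, not the book's original listings.

```ruby
require 'rexml/document'

# Assumed element form: each value is a nested child element.
ELEMENT_FORM = '<user><username>stu</username>' \
               '<homepage>http://blogs.relevancellc.com</homepage></user>'

# Assumed attribute form: the same values as attributes on one element.
ATTRIBUTE_FORM = '<user username="stu" homepage="http://blogs.relevancellc.com"/>'

# Child elements are reached through the element tree.
def username_from_elements(xml)
  REXML::Document.new(xml).root.elements['username'].text
end

# Attributes are reached through a name/value lookup on the element.
def username_from_attributes(xml)
  REXML::Document.new(xml).root.attributes['username']
end

puts username_from_elements(ELEMENT_FORM)
puts username_from_attributes(ATTRIBUTE_FORM)
```

Both calls print stu. Only the element form could grow further nested structure beneath username.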

There is one last wrinkle to consider with this simple XML document. What happens when it travels in the wide world and encounters other elements named <user>? To prevent confusion, XML allows namespaces. These serve the same role as Java packages or Ruby modules, but the syntax is different:

<rel:user xmlns:rel="http://www.relevancellc.com/sample"
          username="stu"
          homepage="http://blogs.relevancellc.com">
</rel:user>

The namespace is http://www.relevancellc.com/sample. That would be a lot to type in front of an element name, so xmlns:rel establishes rel as a prefix. Reading the previous document, an XML wonk would say that <user> is in the http://www.relevancellc.com/sample namespace.

YAML is a response to the complexity of XML (YAML stands for YAML Ain’t Markup Language). YAML has many things in common with XML. Most important, both YAML and XML can be used to represent and serialize complex, nested data structures. What special advantages does YAML offer?

The YAML criticism of XML boils down to a single sentence: XML has two concepts too many.

• There is no need for two different forms of nested data. Elements are enough.

• There is no need for a distinct namespace concept; scoping is sufficient for namespacing.

To see why attributes and namespaces are superfluous in YAML, here

are three YAML variants of the same configuration file:


Download code/rails_xt/samples/why_yaml.rb

user:
  username: stu
  homepage: http://blogs.relevancellc.com

As you can see, YAML uses indentation for nesting. This is more terse than XML’s approach, which requires a closing tag.

The second XML example used attributes to shorten the document to a single line. Here’s the one-line YAML version:

user: {username: stu, homepage: http://blogs.relevancellc.com}

The one-line syntax introduces {} as delimiters, but there is no semantic distinction in the actual data. Name/value data, called a simple mapping in YAML, is identical in the multiline and one-line documents.

Here’s a YAML “namespace”:

http://www.relevancellc.com/sample:
  user: {username: stu, homepage: http://blogs.relevancellc.com}

There is no special namespace construct in YAML, because scope provides a sufficient mechanism. In the previous document, user belongs to http://www.relevancellc.com/sample. Replacing the words “belongs to” with “is in the namespace” is a matter of taste.
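One way to check that the three variants carry the same data is to load each and compare the results. This is a sketch rather than the book's own listing: the constant names are made up for illustration, and the URL is quoted inside the {} form as a conservative choice, since some YAML parsers are stricter about colons there.

```ruby
require 'yaml'

# Variant 1: nesting by indentation.
MULTILINE = <<~YAML
  user:
    username: stu
    homepage: http://blogs.relevancellc.com
YAML

# Variant 2: the one-line form, with the URL quoted to be safe.
ONE_LINE = 'user: {username: stu, homepage: "http://blogs.relevancellc.com"}'

# Variant 3: the same mapping scoped under a URL-like key.
NAMESPACED = <<~YAML
  http://www.relevancellc.com/sample:
    user: {username: stu, homepage: "http://blogs.relevancellc.com"}
YAML

multiline  = YAML.load(MULTILINE)
one_line   = YAML.load(ONE_LINE)
namespaced = YAML.load(NAMESPACED)

# The multiline and one-line documents load to equal hashes.
puts multiline == one_line

# The "namespace" is just one more level of ordinary nesting.
puts namespaced['http://www.relevancellc.com/sample'] == one_line
```

Both comparisons print true: the namespace is nothing more than an outer key that has to be looked up first.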

It is easy to convert from YAML to a Ruby object:

irb(main):001:0> require 'yaml'

=> true

irb(main):002:0> YAML.load("{username: stu}")

=> {"username"=>"stu"}

Or from a Ruby object to YAML:

irb(main):003:0> YAML.dump 'username'=>'stu'

=> "--- \nusername: stu"

The leading --- \n is a YAML document separator. This is optional, and we won’t be using it in Rails configuration files. See the sidebar on the next page for pointers to YAML’s constructs not covered here.

Items in a YAML sequence are prefixed with '- ':

- one

- two

- three


Data Formats: More Complexity

For Rails configuration, you may never need YAML knowledge beyond this chapter. But, if you delve into YAML as a general-purpose data language, you will discover quite a bit more complexity. Here are a few areas of complexity, with XML’s approach to the same issues included for comparison:

Complexity       YAML Approach          XML Approach
Whitespace       Annoying rules         Annoying rules
Repeated data    Aliases and anchors    Entities, SOAP section 5

If you are making architectural decisions about data formats, you will want to understand these issues. For YAML, a good place to start is the YAML Cookbook.∗

∗ http://yaml4r.sourceforge.net/cookbook/

There is also a one-line syntax for sequences, which from a Ruby perspective could hardly be more convenient. A single-line YAML sequence is also a legal Ruby array:

irb(main):015:0> YAML.load("[1, 2, 3]")

=> [1, 2, 3]

irb(main):016:0> YAML.dump [1,2,3]

=> "--- \n- 1\n- 2\n- 3"

Beware the significant whitespace, though! If you leave it out, you will be in for a rude surprise:

irb(main):018:0> YAML.load("[1,2,3]")

=> [123]

Without the whitespace after each comma, the elements all got compacted together. YAML is persnickety about whitespace, out of deference to the tradition that markup languages must have counterintuitive whitespace rules. With YAML there are two things to remember:

• Any time you see a single whitespace character that makes the format prettier, the whitespace is probably significant to YAML. That’s YAML’s way of encouraging beauty in the world.

• Tabs are illegal. Turn them off in your editor.
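The [123] result shown earlier reflects the YAML parser that shipped with Ruby when this was written. As a hedged sketch, assuming a current Ruby whose yaml library is backed by Psych, the rules can be probed directly. The helper name tab_indented_yaml_rejected? is made up for this example.

```ruby
require 'yaml'

# With a space after each comma, as recommended in the text.
p YAML.load("[1, 2, 3]")   # => [1, 2, 3]

# Without the spaces: Psych still sees three separate elements.
p YAML.load("[1,2,3]")     # => [1, 2, 3]

# Returns true if a tab used for indentation makes the parser give up.
def tab_indented_yaml_rejected?
  YAML.load("user:\n\tusername: stu")
  false
rescue Psych::SyntaxError
  true
end

p tab_indented_yaml_rejected?   # => true
```

Even where a newer parser is more forgiving about commas, the advice about tabs still holds, so keeping the spaces and banning tabs remains the safe habit.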


If you are running inside the Rails environment, YAML is even easier. The YAML library is automatically imported, and all objects get a

In many situations, YAML’s syntax for serialization looks very much like the literal syntax for creating hashes or arrays in some (hypothetical) scripting language. This is no accident. YAML’s similarity to script syntax makes YAML easier to read, write, and parse. Why not take this similarity to its logical limit and create a data format that is also valid source code in some language? JSON does exactly that.

9.4 JSON and Rails

The JavaScript Object Notation (JSON) is a lightweight data-interchange format developed by Douglas Crockford. JSON has several relevant advantages for a web programmer. JSON is a subset of legal JavaScript code, which means that JSON can be evaluated in any JavaScript-enabled web browser. Here are a few examples of JSON.

First, an array:

authors = ['Stu', 'Justin']

And here is a collection of name/value pairs:

prices = {lemonade: 0.50, cookie: 0.75}

Unless you are severely sleep deprived, you are probably saying “This looks almost exactly like YAML.” Right. JSON is a legal subset of JavaScript and also a legal subset of YAML (almost). JSON is much simpler than even YAML; don’t expect to find anything like YAML’s anchors and aliases. In fact, the entire JSON format is documented in one short web page at http://www.json.org.

JSON is useful as a data format for web services that will be consumed by a JavaScript-enabled client and is particularly popular for Ajax applications.


Rails extends Ruby’s core classes to provide a to_json method:
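The original listing is not reproduced here, so the following is only a sketch. It assumes Ruby's standard json library rather than Rails itself; that library also adds to_json to core classes, and it serves here as a stand-in for the Rails extension.

```ruby
require 'json'

authors = ['Stu', 'Justin']
prices  = {:lemonade => 0.50, :cookie => 0.75}

# Arrays and hashes serialize to JSON text.
puts authors.to_json   # ["Stu","Justin"]
puts prices.to_json    # {"lemonade":0.5,"cookie":0.75}

# JSON.parse goes the other way. Note that keys come back as strings.
p JSON.parse(prices.to_json)
```

The generated text uses double-quoted strings and quoted keys, which is the strict form described at json.org.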

If you need to convert from JSON into Ruby objects, you can parse them as YAML, as described in Section 9.3, YAML and XML Compared, on page 261. There are some corner cases where you need to be careful that your YAML is legal JSON; see _why’s blog post [4] for details.

JSON and YAML are great for green-field projects, but many developers are committed to an existing XML architecture. Since XML does not look like program source code, converting between XML and programming language structures is an interesting challenge.

It is to this challenge, XML parsing, that we turn next.

9.5 XML Parsing

To use XML from an application, you need to process an XML document, converting it into some kind of runtime object model. This process is called XML parsing. Both Java and Ruby provide several different parsing APIs.

Ruby’s standard library includes REXML, an XML parser that was originally based on a Java implementation called Electric XML. REXML is feature-rich and includes XPath 1.0 support plus tree, stream, SAX2, pull, and lightweight APIs. This section presents several examples using REXML to read and write XML.

Rails programs also have another choice for writing XML. Builder is a special-purpose library for writing XML and is covered in Section 9.7, Creating XML with Builder, on page 276.

[4] http://redhanded.hobix.com/inspect/jsonCloserToYamlButNoCigarThanksAlotWhitespace.html


The next several examples will parse this simple Ant build file:

Download code/Rake/simple_ant/build.xml

</target>

</target>

</target>

</project>

Each example will demonstrate a different approach to a simple task: extracting a Target object with name and depends properties.

Push Parsing

First, we’ll look at a Java SAX (Simple API for XML) implementation. SAX parsers are “push” parsers; you provide a callback object, and the parser pushes the data through various callback methods on that object:

Download code/java_xt/src/xml/SAXDemo.java

throws ParserConfigurationException, SAXException, IOException {

SAXParserFactory f = SAXParserFactory.newInstance();

SAXParser sp = f.newSAXParser();

sp.parse(file, new DefaultHandler() {

String qname, Attributes attributes)


An REXML SAX approach looks like this:

Download code/rails_xt/samples/xml/sax_demo.rb

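require 'rexml/parsers/sax2parser'

def get_targets(file)
  targets = []
  parser = REXML::Parsers::SAX2Parser.new(file)
  # The block receives the namespace URI, local name, qualified name,
  # and attributes of each <target> start tag.
  parser.listen(:start_element, %w{target}) do |u,l,q,atts|
    targets << {:name=>atts['name'], :depends=>atts['depends']}
  end
  parser.parse
  targets
end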

Even though they are implementing the same API, the Ruby and Java approaches have two significant differences. Where the Java implementation uses a factory, the Ruby implementation instantiates the parser directly. And where the Java version uses an anonymous inner class, the Ruby version uses a block.

These language issues are discussed in the Joe Asks on page 272 and in Section 3.9, Functions, on page 92, respectively. These differences will recur with the other XML parsers as well, but we won’t bring them up again.

There is also a smaller difference. The Ruby version takes advantage of one of Ruby’s many shortcut notations. The %w shortcut provides a simple syntax for creating an array of words. For example:

irb(main):001:0> %w{these are words}

=> ["these", "are", "words"]

The %w syntax makes it convenient for Ruby’s start_element to take a second argument, the elements in which we are interested. Instead of listening for all elements, the Ruby version looks only for the <target> element that we care about:

Download code/rails_xt/samples/xml/sax_demo.rb

parser.listen(:start_element, %w{target}) do |u,l,q,atts|
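To see the filtered listener at work end to end, here is a small self-contained sketch. The build file contents are a hypothetical stand-in, because the book's build.xml is not reproduced in full above, and the names SAX_SAMPLE_BUILD and sax_targets are invented for this example.

```ruby
require 'rexml/parsers/sax2parser'

# Hypothetical Ant-style build file, used only for illustration.
SAX_SAMPLE_BUILD = <<~XML
  <project name="simple-ant" default="compile">
    <target name="clean"/>
    <target name="prepare" depends="clean"/>
    <target name="compile" depends="prepare"/>
  </project>
XML

def sax_targets(xml)
  targets = []
  parser = REXML::Parsers::SAX2Parser.new(xml)
  # Only start tags named "target" reach this block; <project> is skipped.
  parser.listen(:start_element, %w{target}) do |uri, local, qname, atts|
    targets << {:name => atts['name'], :depends => atts['depends']}
  end
  parser.parse
  targets
end

sax_targets(SAX_SAMPLE_BUILD).each { |t| p t }
```

A target without a depends attribute simply yields nil for :depends, since the attributes arrive as an ordinary hash.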

Pull Parsing

A pull parser is the opposite of a push parser. Instead of implementing a callback API, you explicitly walk forward through an XML document. As you visit each node, you can call accessor methods to get more information about that node.


In Java, the pull parser is called the Streaming API for XML (StAX). StAX is not part of the J2SE, but you can download it from the Java Community Process website.[5] Here is a StAX implementation of getTarget( ):

Download code/java_xt/src/xml/StAXDemo.java

- XMLInputFactory xif= XMLInputFactory.newInstance();

- XMLStreamReader xsr = xif.createXMLStreamReader( new FileInputStream(f));

- for ( int event = xsr.next();

Unlike the SAX example, the StAX version explicitly iterates over the document by calling next( ) (line 6). Then, we detect whether we care about the parser event in question by comparing the event value to one or more well-known constants (line 9).

Here’s the REXML pull version of get_targets( ):

5 if event.start_element? and event[0] == 'target'

- targets << {:name=>event[1][ 'name' ], :depends=>event[1][ 'depends' ]}


As with the StAX example, the REXML version explicitly iterates over the document nodes. Of course, the REXML version takes advantage of Ruby’s each( ) (line 4). Where StAX provided an event number and well-known constants to compare with, the REXML version provides an actual event object, with boolean accessors such as start_element? for the different event types (line 5).
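Only two lines of the pull listing survive above, so here is a complete sketch built around them. It is a reconstruction under assumptions, not the book's file: the sample build file is hypothetical, and the method is named pull_targets to keep it distinct from the original.

```ruby
require 'rexml/parsers/pullparser'

# Hypothetical Ant-style build file, used only for illustration.
PULL_SAMPLE_BUILD = <<~XML
  <project name="simple-ant" default="compile">
    <target name="clean"/>
    <target name="prepare" depends="clean"/>
    <target name="compile" depends="prepare"/>
  </project>
XML

def pull_targets(xml)
  targets = []
  parser = REXML::Parsers::PullParser.new(xml)
  # We drive the iteration; the parser hands back one event at a time.
  parser.each do |event|
    # For a start tag, event[0] is the tag name and event[1] the attributes.
    if event.start_element? and event[0] == 'target'
      targets << {:name => event[1]['name'], :depends => event[1]['depends']}
    end
  end
  targets
end

pull_targets(PULL_SAMPLE_BUILD).each { |t| p t }
```

Text and end-tag events pass through the same loop and are ignored by the start_element? check.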

Despite their API differences, push and pull parsers have a lot in common. They both move in one direction, forward through the document. This can be efficient if you can process nodes one at a time, without needing content or state from elsewhere in the document. If you need random access to document nodes, you will probably want to use a tree parser, discussed next.

Tree Parsing

Tree parsers represent an XML document as a tree in memory, typically loading in the entire document. Tree parsers allow more powerful navigation than push parsers, because you have random access to the entire document. On the other hand, tree parsers tend to be more expensive and may be overkill for simple operations.

Tree parser APIs come in two flavors: the DOM and everything else. The Document Object Model (DOM) is a W3C specification and aspires to be programming language neutral. Many programming languages also offer a tree parsing API that takes better advantage of specific language features. Here is the build.xml example implemented with Java’s built-in DOM API:

- Document doc = db.parse(file);

5 NodeList nl = doc.getElementsByTagName( "target" );

- Target[] targets = new Target[nl.getLength()];

- for ( int n=0; n<nl.getLength(); n++) {

- Element e = (Element) nl.item(n);


The Java version finds targets with getElementsByTagName( ) in line 5. The value returned is a NodeList, which is a DOM-specific class. Since the DOM is language-neutral, it does not support Java’s iterators, and looping over the nodes requires a for loop (line 7).

Next, using REXML’s tree API, here is the code:

Download code/rails_xt/samples/xml/dom_demo.rb

- Document.new(file).elements.each( "//target" ) do |e|

- targets << {:name=>e.attributes[ "name" ],

5 :depends=>e.attributes[ "depends" ]}

REXML does not adhere to the DOM. Instead, the elements( ) method returns an object that supports XPath. In XPath, the expression //target matches all elements named target. Building atop XPath, iteration can then be performed in normal Ruby style with each( ) (line 3).
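The tree listing above is also incomplete, so the following sketch fills in the surrounding method under stated assumptions: the sample build file is hypothetical, and tree_targets is a name chosen for this example.

```ruby
require 'rexml/document'

# Hypothetical Ant-style build file, used only for illustration.
TREE_SAMPLE_BUILD = <<~XML
  <project name="simple-ant" default="compile">
    <target name="clean"/>
    <target name="prepare" depends="clean"/>
    <target name="compile" depends="prepare"/>
  </project>
XML

def tree_targets(xml)
  targets = []
  # Document.new loads the whole tree; elements.each accepts an XPath.
  REXML::Document.new(xml).elements.each("//target") do |e|
    targets << {:name => e.attributes["name"],
                :depends => e.attributes["depends"]}
  end
  targets
end

tree_targets(TREE_SAMPLE_BUILD).each { |t| p t }
```

Because the whole document is in memory, the block could just as easily look at e.parent or sibling elements, which the push and pull versions cannot do.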

Of course, Java supports XPath too, as you will see in the following section.

XPath

XML documents have a hierarchical structure, much like the file system on a computer. File systems have a standard notation for addressing specific files. For example, path/to/foo refers to the file foo, in the to directory, in the path directory. Better yet, shell programs use wildcards to address multiple files at once: path/* refers to all files contained in the path directory.

The XML Path Language (XPath) brings path addressing to XML. XPath is a W3C Recommendation for addressing parts of an XML document (see http://www.w3.org/TR/xpath.html).

The previous section showed a trivial XPath example, using //target to select all <target> elements. Our purpose here is to show how to access the XPath API using Java and Ruby, not to learn the XPath language itself. Nevertheless, we feel compelled to pick a slightly more interesting example.


Joe Asks...

Why Are the Java XML Examples So Verbose?

The Ruby XML examples are so tight that you have to expect there’s a catch. Are the Ruby XML APIs missing something important?

What the Java versions have, and the Ruby versions lack utterly, is abstract factories. Many Java APIs expose their key objects via abstract factories. Instead of saying new Document, we say DocumentBuilderFactory.someFactoryMethod(). The purpose of factory methods in this context is to keep our options open. If we want to switch implementations later, to a different parser, we can reconfigure the factory without changing a line of code. On the other hand, calling new limits your options. Saying new Foo() gives you a Foo, period. You can’t change your mind and get a subclass of Foo or a mock object for testing.

The Ruby language is designed so that abstract factories are generally unnecessary, for three reasons:

• In Ruby, the new method can return anything you want. Most important, new can return instances of a different class, so choosing new now does not limit your options.

• Ruby objects are duck-typed (see Section 3.7, Duck Typing, on page 89). Since objects are defined by what they can do, rather than what they are named, it is easier to change your mind and have one kind of object stand in for another.

• Ruby classes are open. Choosing Foo now doesn’t limit your options later, because you can always reopen Foo and tweak its behavior.
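The three points can be sketched in a few lines. The classes here, Parser and StrictParser, are hypothetical and exist only for illustration; they are not part of REXML or Rails.

```ruby
# A hypothetical class that happens to respond to parse.
class StrictParser
  def parse(text)
    "strict: #{text}"
  end
end

class Parser
  # Point 1: new is an ordinary method, so it can return another class's
  # instance. Callers keep writing Parser.new.
  def self.new(*args)
    StrictParser.new(*args)
  end
end

parser = Parser.new
puts parser.class            # StrictParser

# Point 2: duck typing. Callers only care that the object responds to parse.
puts parser.parse("data")    # strict: data

# Point 3: classes are open, so behavior can be changed after the fact.
class StrictParser
  def parse(text)
    "reopened: #{text}"
  end
end

puts parser.parse("data")    # reopened: data
```

None of the calling code had to change when the implementation behind Parser.new did, which is the job an abstract factory performs in Java.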

In Java, having to choose between abstract factories and new undermines agility. A central agile theme is “Build what you need now, in a way that can easily evolve to what you discover you need next week.” For every new class, we have to make a Big Up-Front Decision (BUFD, often also BFUD): “Will it need pluggable implementations later?” If yes, use a factory. If no, call new. The more BUFDs a language avoids, the easier it is to be agile.

In Java’s defense, you can avoid the dilemma posed by abstract factories in several ways. You can skip factories and use delegation behind the scenes to select alternate implementations. A great example is JDOM (http://www.jdom.org), which is much easier to use than the J2SE APIs. With Aspect-Oriented Programming (AOP), you can unmake past decisions by weaving in new decisions. With Dependency Injection (DI), you can pull configuration choices out of your code entirely. Pointers to more reading on all this are in the references section at the end of the chapter.


The following Java program finds the name of all <target> elements whose depends attribute is prepare:

String[] results = new String[nl.getLength()];

- for ( int n=0; n<nl.getLength(); n++) {

15 results[n] = nl.item(n).getNodeValue();

Java’s XPath support builds on top of its DOM support, so most of this code should look familiar. Starting on line 4 you will see several lines of factory code to create the relevant DOM and XPath objects. The actual business of the method is conducted on line 10 when the XPath expression is evaluated. The results are in the form of a NodeList, so the iteration beginning on line 13 is nothing new either.

Ruby’s XPath code also builds on top of the tree API you have already

That’s it. Just one line of code. The XPath API in Ruby is all business, no boilerplate. In fact, the syntax can be made even tighter, as shown in the sidebar on the next page.
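For readers without the original listing at hand, here is a hedged sketch of what such a one-liner can look like, wrapped in a method so it can be run. The build file is hypothetical, and the method name prepare_dependents is invented for this example.

```ruby
require 'rexml/document'

# Hypothetical Ant-style build file, used only for illustration.
XPATH_SAMPLE_BUILD = <<~XML
  <project name="simple-ant" default="compile">
    <target name="clean"/>
    <target name="prepare" depends="clean"/>
    <target name="compile" depends="prepare"/>
  </project>
XML

# Names of all targets whose depends attribute is exactly "prepare".
def prepare_dependents(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//target[@depends='prepare']/@name").map { |attr| attr.value }
end

p prepare_dependents(XPATH_SAMPLE_BUILD)   # => ["compile"]
```

XPath.match returns attribute objects, so the block is there only to pull the string value out of each one.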


The Symbol#to_proc Trick

You may be thinking that this Ruby XPath example is a bit too

The Rails team thought so and provided another syntax to be

used when invoking blocks:

XPath.match(Document.new(file),
            "//target[@depends='prepare']/@name").map(&:value)

The new syntax &:value takes advantage of Ruby’s alternate syntax for passing blocks, by passing an explicit Proc object. (A Proc is a block instantiated as a class so you can manipulate it in normal Ruby ways.) Of course, :value is not a Proc; it’s a Symbol! Rails finesses this by defining an implicit conversion from

The Symbol#to_proc trick is interesting because it demonstrates an important facet of Ruby. The Ruby language encourages modifications to its syntax. Framework designers such as the Rails team do not have to accept Ruby “as is.” They can bend the language to meet their needs.
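The idea behind the conversion can be sketched without touching the Symbol class at all. The helper below, symbol_to_proc_sketch, is a made-up name and only approximates what such a conversion does: it builds a Proc that sends the named message to whatever object it is given.

```ruby
# Hypothetical helper: returns a Proc that calls the method named by sym
# on its first argument, passing along any remaining arguments.
def symbol_to_proc_sketch(sym)
  Proc.new { |obj, *args| obj.send(sym, *args) }
end

words = %w{these are words}

# Passing the hand-built Proc with & turns it into the block for map.
p words.map(&symbol_to_proc_sketch(:upcase))

# The shorthand form produces the same result.
p words.map(&:upcase)
```

Both lines print the same upcased array. The & operator asks its operand for a Proc, which is the hook that makes the shorthand possible.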
