JANUARY 2005 VOLUME IV - ISSUE 1
The Magazine For PHP Professionals
Tips & Tricks
Javascript Remote Scripting with PHP
EDITORIAL
A new year is upon us, and quite a few interesting things have already happened. We just published our first book, for example. The Zend PHP Certification Practice Test Book, which I co-wrote with John Coggeshall, has just been unleashed on the PHP community with (if I may unleash some personal pride) extremely good results. In a separate, but far more important, piece of news, PHP was named “language of the year 2004” by a site that tracks language usage in the development community. PHP 5 continues to plow along quickly and efficiently, with a new point release due soon that will introduce some much-anticipated new functionality.
However you look at it, 2005 is poised to be a marquee year for PHP. There is so much going on that I can hardly keep my head around it and continue my daily activity here at php|a headquarters (although I must say that the two weeks of vacation I took around Christmas were very helpful in getting my head wrapped around doing absolutely nothing. What a pleasant change of pace…). On the other hand, 2004 was, in many ways, a marquee year for PHP as well. This just highlights the positive direction that the language is taking, shaped in many ways by the vast amount of work that everyone in the community (even those who can just be found complaining on the mailing lists) has put into defining its goals and needs.
Here at php|a, there are three important news items that I want to share with you this month.
First of all, John W. Holmes, who has taken care of our Tips & Tricks column since the very first issue, is leaving us. John is a Captain in the U.S. Army, and his “day job” is keeping him way too busy to deal with such a demanding column on a monthly basis. I know you expected me to say this, but I really, really enjoyed working with John. He has the (unfortunately rare) ability to be technically accurate and linguistically clear; his writings were always as pleasant to read as they were to edit. Luckily, the T&T column is far from over, but more about that will have to wait for another editorial. Thank you, John, and Godspeed.
Just to stay on the editorial front, we have a new column starting this month, titled Test Pattern and penned by Marcus Baker. In his column, Marcus will be dealing with the issue of proper software design as applied to PHP development, from patterns, to tiering, to testing. The goal of this column is to challenge you, dear readers, not just to write more efficient, but also more beautiful code: to make every single one of your applications a little work of art that is well-thought-out, properly designed and executed flawlessly. Marcus has an awesome job ahead of him, but, then again, he is an awesome fellow, so I’m sure that you’ll enjoy his writings.
Finally, you may have heard about some recent security issues that have struck both PHP and PHP-based applications, notably the popular forum software phpBB. The reaction to these issues has been less than stellar, in my humble opinion,
php|architect
Volume IV - Issue 1 January, 2005
Publisher
Marco Tabini
Editorial Team
Arbi Arzoumani Peter MacIntyre Eddie Peloke
Graphics & Layout
php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.
Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards to use of the information contained herein or in all associated material.
Contact Information:
General mailbox: info@phparch.com
Editorial: editors@phparch.com
Subscriptions: subs@phparch.com
Sales & advertising: sales@phparch.com
Technical support: support@phparch.com
Copyright © 2003-2004 Marco Tabini & Associates, Inc — All Rights Reserved
NEW STUFF
What’s New!
php|architect launches php|tropics 2005
Ever wonder what it's like to learn PHP in paradise? Well, this year we've decided to give you a chance to find out!
We're proud to announce php|tropics 2005, a new conference that will take place between May 11-15 at the Moon Palace Resort in Cancun, Mexico. The Moon Palace is an all-inclusive (yes, we said all inclusive!) resort with over 100 acres of ground and 3,000 ft of private beach, as well as excellent state-of-the-art meeting facilities.
As always, we've planned an in-depth set of tracks for you, combined with a generous amount of downtime for your enjoyment (and your family's, if you can take them along with you).
We even have a very special early-bird fee in effect for a limited time only.
For more information, go to http://www.phparch.com/tropics
Zend Technologies Unveils Integrated Software Platform
Zend has announced the unveiling of Zend Platform 1.1.
Zend Technologies, Inc., creator and ongoing innovator of PHP, products and services supporting the development, deployment and management of PHP-based applications, today unveiled Zend Platform 1.1. The newest member in the Zend family of products is the first integrated software platform that supports the reliability, scalability and interoperability requirements of business critical PHP applications. The product was developed based on direct feedback from hundreds of Zend customers currently using Zend products to develop and manage corporate applications, and is currently in use at Zend customer sites. Zend Platform adds a wide range of new functionality that speeds time to production and improves end user satisfaction by increasing the overall performance of enterprise applications. Zend Platform 1.1 is available immediately.
"As PHP matures and evolves, the need for an integrated solution for building and deploying business critical applications becomes more relevant," said Pamela Roussos, vice president of marketing at Zend. "Zend Platform is the first comprehensive lifecycle management solution for PHP users and is the only next generation infrastructure product that directly supports the development and deployment of business critical enterprise PHP applications. The feedback from customers was critical in our development of this solution, and directly addresses the needs in our user community."
For more information visit: http://www.zend.com
The Zend PHP Certification Practice Test Book is now available!
We're happy to announce that, after many months of hard work, the Zend PHP Certification Practice Test Book, written by John Coggeshall and Marco Tabini, is now available for sale from our website and most book sellers worldwide!
The book provides 200 questions designed as a learning and practice tool for the Zend PHP Certification exam. Each question has been written and edited by four members of the Zend Education Board, the very same group who prepared the exam. The questions, which cover every topic in the exam, come with a detailed answer that explains not only the correct choice, but also the question's intention, pitfalls and the best strategy for tackling similar topics during the exam.
For more information, visit http://www.phparch.com/cert/mock_testing.php
The development team has also released a new website for better information and communication purposes.
New is the possibility to download struts4php in the current version as a PEAR package under
http://www.struts4php.org/pear/struts4php-current.tgz
For more information visit: http://www.struts4php.org
Check out some of the hottest new releases from PEAR.
PEAR 1.3.4
PEAR 1.3.4 fixes a serious problem caused by a bug in all versions of PHP that caused multiple registration of the shutdown function of PEAR.php, makes the pear help listing more useful by putting the how-to-use info at the bottom of the listing, and includes several bug fixes.
Net_Monitor 0.0.7
A unified interface for checking the availability of services on external servers and sending meaningful alerts through a variety of media if a service becomes unavailable.
I18Nv2 0.10.0
This package provides basic support to localize your application, like locale based formatting of dates, numbers and currencies.
Besides that, it attempts to provide an OS independent way to setlocale() and aims to provide language, country and currency names translated into many languages.
Net_FTP 1.3.0RC2
Net_FTP allows you to communicate with FTP servers in a more comfortable way than the native FTP functions of PHP do. The class implements everything natively supported by PHP and additionally features like recursive up- and downloading, directory creation and chmodding. It also implements an observer pattern to allow, for example, the view of a progress bar.
PHP_Fork 0.2.0
PHP_Fork class. Wrapper around the pcntl_fork() stuff with an API set like the Java language.
Practical usage is done by extending this class, and re-defining the run() method.
[see basic example]
This way PHP developers can enclose logic into a class that extends PHP_Fork, then execute the start() method that forks a child process. Communication with the forked process is ensured by using a Shared Memory Segment; by using a user-defined signal and this shared memory, developers can access child process methods that return a serializable variable.
The shared variable space can be accessed with the two methods:
• void setVariable($name, $value)
• mixed getVariable($name)
$name must be a valid PHP variable name;
$value must be a variable or a serializable object.
Resources (db connections, streams, etc.) cannot be serialized and so they’re not correctly handled.
Requires a PHP build with --enable-cli --with-pcntl --enable-shmop
Only runs on *NIX systems, because Windows lacks the pcntl extension.
PHP 4.3.10 and 5.0.3 Released
The PHP Development Team would like to announce the immediate release of PHP 4.3.10 and PHP 5.0.3. These are maintenance releases that, in addition to non-critical bug fixes, address several very serious security issues. All users of PHP are strongly encouraged to upgrade to one of these releases as soon as possible.
For changes since PHP 4.3.9, please consult the PHP 4 ChangeLog. For changes since PHP 5.0.2, please consult the PHP 5 ChangeLog. For more information, visit http://www.php.net
MySQL Query Browser 1.1.5
MySQL.com announces the latest release of the MySQL Query Browser. MySQL.com claims: “MySQL Query Browser is the easiest visual tool for creating, executing, and optimizing SQL queries for your MySQL Database Server. The MySQL Query Browser gives you a complete set of drag-and-drop tools to visually build, analyze and manage your queries.”
For more information or to download, visit: http://www.mysql.com/products/query-browser
PHP awarded programming language of 2004
A new post on php.net announces PHP being awarded programming language of the year.
PHP has been awarded the Programming Language of 2004, according to the TIOBE Programming Community Index. This index uses information collected from the popular search engines, and is based on the world-wide availability of skilled engineers, courses and third party vendors. Congratulations to us all!
For more information visit: www.php.net
eAccelerator 0.9.1
eAccelerator announces their latest release, 0.9.1.
What is it? The eAccelerator sourceforge page describes it as: “a further development from mmcache PHP Accelerator & Encoder. It increases [the] performance of PHP scripts by caching them in compiled state, so that the overhead of compiling is almost completely eliminated.”
Get more information from http://eaccelerator.sourceforge.net/Home
Looking for a new PHP Extension? Check out some of the latest offerings from PECL.
WinBinder 0.34.117
WinBinder is a new extension that allows PHP programmers to build native Windows applications. It wraps the Windows API in a lightweight, easy-to-use library so that program creation is quick and straightforward.
mailparse 2.1
Mailparse is an extension for parsing and working with email messages.
It can deal with rfc822 and rfc2045 (MIME) compliant messages.
newt 0.3
PHP-NEWT - PHP language extension for the RedHat Newt library, a terminal-based window and widget library for writing applications with a user friendly interface. Once this extension is enabled in PHP it will provide the use of Newt widgets, such as windows, buttons, checkboxes, radiobuttons, labels, editboxes, scrolls, textareas, scales, etc.
Use of this extension is very similar to the original Newt API of the C programming language.
ssh2 0.4.1
Provides bindings to the functions of libssh2 which implements the SSH2 protocol.
libssh2 is available from http://www.sourceforge.net/projects/libssh2
phpMyFAQ 1.5.0 Alpha 2
phpmyfaq.de announces “The second alpha version of phpMyFAQ 1.5.0 is available. This version is PHP5 compatible and introduces a faster template engine. LDAP support is now a selectable option and the traditional Chinese and Japanese language files were updated. Beside some code improvements we fixed a lot of bugs. Do not use this version in production systems, but test this version and report bugs!” Get the latest info from phpmyfaq.de
with the blame bucket being passed around several hands rather than focusing everyone’s energy on ensuring not only that bugs would be fixed, but that people would be properly informed and made aware of them.
For my part, I’ve decided that we should help with what we know how to do best: informing people. On January 1st (talk about a New Year resolution!), we started a new mailing list dedicated exclusively to PHP security. You can read more about it in this month’s exit(0) column (which you’ll find at the end of the magazine), so I won’t bore you here by duplicating the details. I simply hope that the mailing list will help everyone keep security more in check; PHP has become so popular that we can no longer afford to hide beneath the folds of the Web and hope that no-one will find out about any of our weaknesses. They will, and we must be ready to deal with the consequences.
Until next month, happy readings!
Editorial
Hello Goodbye
Continued
Getting started
Before we get started with the actual coding, we need to get an overview of the format we are working with. OpenOffice.org documents are XML files stored inside ZIP archives. In the list below, you can see the directory structure of the files inside an OpenOffice.org document. When you unzip this document, you will simply get some plain XML files and directories.
Of course, in order to generate an OpenOffice.org document, we need to perform this process in reverse and create a ZIP file from the XML files that we create. In this article, I will show you exactly how you can use PHP to produce all the different files required to form a valid OpenOffice.org Writer document.
The files shown in the example above are the bare minimum required to make up a document that OpenOffice.org will recognize as valid. If you, for example, have embedded an image inside your document, then this is also stored as a separate file in a folder named Pictures (not shown above).
The content.xml file contains the actual text in your document. Headers, paragraphs, lists and tables are recorded in an XML format for content that is well documented, unlike the formats used by competitors like Microsoft Word. This is the file on which this article will focus for most of the time.
The styles.xml file contains the definition of the fonts, colors, sizes and other stylistic elements used by the document. To draw a parallel with HTML, this would be the equivalent of a CSS file, while content.xml would be the equivalent of the HTML document to which the stylesheet applies.
The meta.xml file contains “meta” information about the document. In this file, you can, for example, find the name of the document, its keywords and statistics like the total number of words that it contains. You can also store meta information, like Dublin Core, in meta.xml.
As you might know, OpenOffice.org is getting more and more users. This article will show you how you can generate documents from PHP that the Writer component of OpenOffice.org can read. It’s a follow up to the author’s previous article, which appeared in the October 2004 issue of php|architect and dealt with extracting information from OpenOffice documents.
FEATURE
Generating OpenOffice.org Documents with PHP
Dublin Core is a meta data standard that defines a generic set of attributes used to describe a specific piece of information.
The settings.xml file is actually intended for the OpenOffice.org editor itself. It’s used to store GUI settings and, since this has nothing to do with content, we are not going to look into how we can store arbitrary settings into this file.
The manifest.xml file is a simple XML file that contains a reference to each file that makes up the document. The type of the document is stored in the mimetype file, which contains a single line with the MIME denomination of the document’s type, such as, for example, “application/vnd.sun.xml.writer” for a Writer file.
Getting the Content Right
Since the most important part of our document is (obviously) its content, I will start by showing you how we can generate a minimal content.xml file. As you can see in Listing 1, after the document type definitions we get to the main element, office:document-content, which, in turn, contains optional elements like office:script, office:font-decls and office:automatic-styles. In our example, they are empty.
The office:body element contains the document’s contents themselves, but, before we get into its details, we should also mention the text:sequence-decls element, which is used for numbering items in the document and defining in which order different items are numbered. In our sample document, I have just supplied the default order.
Immediately after the sequence declaration, we insert our actual content in the office:body element. In this minimal document, I have just added a small paragraph that displays the text “Hello World!”
Paragraphs and Inline Styles
The most common element that is usually added to a document is a simple paragraph. In the code snippet below, you can see the basic syntax of a minimal paragraph:
<text:p text:style-name='Standard'>Hello World!</text:p>
Notice that the namespace prefix text is used. This is used for all textual elements in the document, which makes it very simple to extract all textual information from it. The paragraph is also styled with the Standard style. The definition of this style can be found in the styles.xml file.
If you need to add whitespace in your text, you can use the text:s element. This is the spacing element: you define how many characters this space takes up in the text:c attribute. See the example below for a simple space definition which takes up 4 characters:
<text:s text:c="4"/>
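The paragraph and spacing rules above can be sketched as a pair of small PHP helpers. This is only an illustration: the function names ooEscapeText() and ooParagraph() are hypothetical and not part of the article, and the sketch assumes that one ordinary space followed by a text:s element for the rest of a run is acceptable, as the article's own "Some spaces:" example suggests.

```php
<?php
// Hypothetical helper: escapes text for use inside XML and replaces every
// run of two or more spaces with one literal space followed by a text:s
// element that covers the remaining spaces of the run.
function ooEscapeText($text)
{
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return preg_replace_callback('/ {2,}/', function ($match) {
        $extra = strlen($match[0]) - 1; // spaces beyond the first one
        return ' <text:s text:c="' . $extra . '"/>';
    }, $escaped);
}

// Hypothetical helper: wraps text in a text:p element with the given style.
function ooParagraph($text, $style = 'Standard')
{
    $styleAttr = htmlspecialchars($style, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return '<text:p text:style-name="' . $styleAttr . '">'
         . ooEscapeText($text)
         . '</text:p>';
}

echo ooParagraph('Hello World!'), "\n";
```

Escaping the text before inserting the spacing elements keeps characters such as & and < from breaking the generated XML.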
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-content PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
[...]
<text:sequence-decl text:display-outline-level='0' text:name='Illustration'/>
<text:sequence-decl text:display-outline-level='0' text:name='Table'/>
<text:sequence-decl text:display-outline-level='0' text:name='Text'/>
<text:sequence-decl text:display-outline-level='0' text:name='Drawing'/>
[...]
Listing 1
Inside paragraphs, you normally have formatting elements like bold, italic and underline. In OpenOffice.org, these styles do not have any matching tags; a text:span tag to which different styles are attached is used instead. All unique spans are given a style name with the text:style-name attribute. The definition of these styles is actually found in the content.xml file, under the office:automatic-styles element. To make some text bold, we therefore first need to create a definition of the style, which in this example is named T1.
Every style that is part of the automatic styles group is defined with the style:style element and given a unique name. Its properties are set using the style:properties element, that is, the same way as all styles are defined in styles.xml. In our style, we simply defined the fo:font-weight to be bold. This is the basic definition of bold text.
Once we have created a style definition for it, we can use the T1 style to mark bold text as such. We simply add a span element around the text we want to affect and set the style name to T1. Below, you see an example of how we mark text as being bold in our content:
<text:p text:style-name='Standard'>Here are <text:span text:style-name="T1">some bold text.</text:span></text:p>
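As a rough sketch of how the two pieces fit together, the helpers below generate an automatic style definition and a span that refers to it. The function names are hypothetical, and the style:family="text" attribute is an assumption drawn from the OpenOffice.org 1.x format rather than from the article, which only names style:style, style:properties and fo:font-weight.

```php
<?php
// Hypothetical helper: builds an automatic text style such as T1.
// $properties maps attribute names (for example fo:font-weight) to values.
// The style:family="text" attribute is an assumption, not from the article.
function ooTextStyle($name, array $properties)
{
    $attributes = '';
    foreach ($properties as $key => $value) {
        $attributes .= ' ' . $key . '="'
            . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8') . '"';
    }

    return '<style:style style:name="' . $name . '" style:family="text">'
         . '<style:properties' . $attributes . '/>'
         . '</style:style>';
}

// Hypothetical helper: wraps text in a span that uses the named style.
function ooSpan($text, $styleName)
{
    return '<text:span text:style-name="' . $styleName . '">'
         . htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8')
         . '</text:span>';
}

// The style goes under office:automatic-styles, the span inside a paragraph.
echo ooTextStyle('T1', array('fo:font-weight' => 'bold')), "\n";
echo ooSpan('some bold text.', 'T1'), "\n";
```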
Headers
Headers are important parts of any document, as they are used to structure content. They are defined at the same level as paragraphs using the text:h element. As with all other text elements, you define the style of the element separately. In addition, you need to define the level of the header using the text:level attribute. Below is an example of a level-1 header element. To create a header with another level, you simply change the level and style attributes.
<text:h text:style-name="Heading 1" text:level="1">A header</text:h>
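A header of any level could be generated with a helper along these lines. The name ooHeading() is hypothetical, and the sketch assumes that the styles for deeper levels follow the same "Heading N" naming as the level-1 example; the article itself only shows "Heading 1".

```php
<?php
// Hypothetical helper: builds a text:h element for the given level.
// Assumes the matching style is named "Heading <level>", which the article
// only confirms for level 1.
function ooHeading($text, $level = 1)
{
    $level = max(1, (int) $level); // levels start at 1

    return '<text:h text:style-name="Heading ' . $level . '"'
         . ' text:level="' . $level . '">'
         . htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8')
         . '</text:h>';
}

echo ooHeading('A header', 1), "\n";
```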
Images
Of course, any nice-looking document contains images. The OpenOffice.org document format is a collection of files, and images are no exception. You need to put any image you want displayed in your documents in a subdirectory called Pictures. This image, in turn, needs to be referenced in the content.xml file. You can place images inside paragraphs using the draw:image element. The XML code below shows a typical image
[...] is displayed in the document. We also need to supply the path to the image, which is relative to the root of your document package; this is done with the xlink:href attribute.
In order to calculate the size of the image, we need to translate pixels into inches. To do so, we must find the size of the image in pixels, which can be done (for example) using the getimagesize() standard PHP function. We also need the DPI (dots-per-inch) setting for the image. For example, 75 is a common setting for low-resolution images. If you use a high quality printer or publisher, images usually need to be at least 300 DPI. Once we know how many dots (which in our case is the same as pixels) we have, we can easily calculate the size of the image in inches. The formula for this is:
Size in inches = pixel width/DPI
For example, a 300-pixel-wide image printed at 75 DPI will be 300 / 75 = 4 inches wide. The same calculation is used for the height. In PHP, the size calculation can be done as shown below.
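// getimagesize() returns the width at index 0 and the height at index 1
$fileName = "/path/to/myimage.jpg";
$sizeArray = getimagesize( $fileName );
// Convert pixels to inches, assuming a 75 DPI image
$width = $sizeArray[0] / 75;
$height = $sizeArray[1] / 75;
Note that this is just one example. You can also use the EXIF extension to retrieve the number of DPI.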
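The same arithmetic can be wrapped in reusable functions so that the DPI value is not hard-coded. This is a sketch only; pixelsToInches() and imageSizeInInches() are hypothetical names, and the default of 75 DPI simply mirrors the example above.

```php
<?php
// Hypothetical helper: converts a pixel dimension to inches at a given DPI.
function pixelsToInches($pixels, $dpi = 75)
{
    if ($dpi <= 0) {
        throw new InvalidArgumentException('DPI must be greater than zero');
    }

    return $pixels / $dpi;
}

// Hypothetical helper: returns array(width, height) in inches for an image.
function imageSizeInInches($fileName, $dpi = 75)
{
    $size = getimagesize($fileName);
    if ($size === false) {
        throw new RuntimeException('Could not read image size: ' . $fileName);
    }

    return array(
        pixelsToInches($size[0], $dpi),
        pixelsToInches($size[1], $dpi),
    );
}

// A 300-pixel-wide image at 75 DPI is 4 inches wide.
echo pixelsToInches(300, 75), "\n";
```

Checking the return value of getimagesize() matters in practice, since it returns false for missing or unreadable files.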
Lists
You may also want to have some lists inside your document. These are placed at the same level as paragraphs and headers in the content XML file. You can have both unordered and ordered lists using the text:unordered-list and text:ordered-list elements respectively. The list contains one or more text:list-item elements, which, in turn, enclose the content for each list item: normally, just a paragraph with some text inside, but (at least in theory) as complex as you need it to be. The XML snippet below shows an unordered list containing two elements. You can see that lists, like all other block elements, are also styled with the text:style-name attribute. OpenOffice.org normally uses L1, L2, and so on for naming list styles, but that’s just an arbitrary convention you can choose to ignore in favour of your own flavour if you like.
<text:p text:style-name="P1">Item text</text:p>
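An unordered list built from the elements named above could be generated as follows. The helper name ooUnorderedList() is hypothetical, and the L1 and P1 style names are just the conventions mentioned in the text; the exact layout of the article's own snippet is not known, so treat this as one plausible arrangement.

```php
<?php
// Hypothetical helper: builds a text:unordered-list with one text:list-item
// per entry, each holding a single paragraph.
function ooUnorderedList(array $items, $listStyle = 'L1', $paragraphStyle = 'P1')
{
    $xml = '<text:unordered-list text:style-name="' . $listStyle . '">';
    foreach ($items as $item) {
        $xml .= '<text:list-item>'
              . '<text:p text:style-name="' . $paragraphStyle . '">'
              . htmlspecialchars($item, ENT_QUOTES | ENT_XML1, 'UTF-8')
              . '</text:p>'
              . '</text:list-item>';
    }

    return $xml . '</text:unordered-list>';
}

echo ooUnorderedList(array('Item text', 'Another item')), "\n";
```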
Writing the XML file
We have now looked at how we can build the content.xml file, which is the most important portion of our document. In fact, if we do not care about what styles are used, the only thing we have to worry about is the content file, and we can use just a set of standard templates for the remainder of the document components.
In my previous article about OpenOffice.org and PHP, I showed how you can use PHP to parse the XML files using a standard PHP DOM XML parser and XSLT transformations. We can also use a DOM XML library to generate our XML documents, but, in this article, I will use a much simpler approach: we’ll just write the XML text directly to a file.
In my production code, I also do not use a DOM library to generate the XML, since doing so would make the code quite a bit slower. My code, in fact, is as simple as the one shown in the snippets below. Essentially, I place the XML code making up our content.xml file in a variable and then write it to disk. In the code snippet below, for example, I named the variable $contentXML:
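A minimal sketch of the manual open, write and close sequence described here might look like this. The helper name writeXmlFile() and the error handling are illustrative assumptions; only the idea of holding the XML in a variable such as $contentXML and writing it to disk comes from the text.

```php
<?php
// Hypothetical helper: writes an XML string to a file the manual way
// (open, write, close) and returns the number of bytes written.
function writeXmlFile($path, $xml)
{
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        throw new RuntimeException('Could not open ' . $path . ' for writing');
    }

    $written = fwrite($handle, $xml);
    fclose($handle);

    if ($written === false || $written < strlen($xml)) {
        throw new RuntimeException('Could not write all data to ' . $path);
    }

    return $written;
}

// Usage (path and contents are placeholders):
// $contentXML = '...';
// writeXmlFile('path/to/content.xml', $contentXML);
```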
You can see a complete (but still simple) XML document for content.xml in Listing 2.
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-content PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
[...]
<text:sequence-decl text:display-outline-level='0' text:name='Illustration'/>
<text:sequence-decl text:display-outline-level='0' text:name='Table'/>
<text:sequence-decl text:display-outline-level='0' text:name='Text'/>
<text:sequence-decl text:display-outline-level='0' text:name='Drawing'/>
</text:sequence-decls>
<text:h text:style-name="Heading 1" text:level="1">A header</text:h>
<text:p text:style-name='Standard'>Hello World!</text:p>
<text:p text:style-name='Standard'>Some spaces: <text:s text:c="4"/></text:p>
<text:p text:style-name='Standard'>Here are <text:span text:style-name="T1">some bold text.</text:span></text:p>
[...]
Listing 2
If you are using PHP 5, you can actually save quite a few lines of code by using the file_put_contents() function instead of opening and writing to the file manually. An example of this approach is shown in the code snippet below. This little script performs the same operation as the first example: first, it opens the file; then writes the contents to it, and finally closes it. It just does so with only two lines of code. We do not need to make it any harder than this, and we’ll use the same technique to generate the other XML files that are part of our document as well.
$contentXML = "<?xml version='1.0' ";
file_put_contents( "path/to/content.xml", $contentXML );
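Since the same technique is reused for every XML file in the package, it can be handy to write them all in one pass and to check the return value of file_put_contents(), which is false on failure. The helper below is a sketch; the name ooWriteFiles() and the file contents in the usage note are placeholders.

```php
<?php
// Hypothetical helper: writes several files into one directory and returns
// the total number of bytes written. $files maps file names to contents.
function ooWriteFiles($directory, array $files)
{
    $total = 0;
    foreach ($files as $name => $contents) {
        $bytes = file_put_contents($directory . '/' . $name, $contents);
        if ($bytes === false) {
            throw new RuntimeException('Could not write ' . $name);
        }
        $total += $bytes;
    }

    return $total;
}

// Usage (variables are placeholders for XML strings built elsewhere):
// ooWriteFiles('path/to/files', array(
//     'content.xml' => $contentXML,
//     'styles.xml'  => $stylesXML,
// ));
```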
Styling our Document
If you want to introduce your own custom design elements in your document, you’ll need to work with the styles.xml file. As mentioned earlier, this is where you define fonts, colors, sizes and positions, and so on.
Let’s use headers as an example. In Listing 3, you can see a minimal styles.xml file containing the style definition for “Header 1.” Our style definitions are placed directly under the office:styles element.
A style is defined with the element style:style. Each style needs a unique name, which is set by using the style:name attribute. The style:family attribute, on the other hand, defines the type of style used. For headers, we use paragraph styles, since headers are rendered basically in the same way as a paragraph. Every style has a “parent” style, which is used as a base to set the underlying “defaults” of the style. The parent style is defined with the style:parent-style-name attribute. Finally, the last attribute in our style definition is style:type, which, in our case, is just “text”.
The actual properties of the style, such as font size and color, are defined in the style:properties element, which is a child of style:style. In our example, we have defined the font size with the fo:font-size attribute, the font weight as bold with the fo:font-weight attribute and, finally, the color with the fo:color attribute. The color is defined as a hex triplet, the same way as you define colors in CSS and HTML, which is something you’re probably familiar with.
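A paragraph style of the kind described above could be produced with a helper such as the one below. The function name is hypothetical, and the concrete values in the usage line (parent style "Standard", 18pt, the colour #000080) are illustrative assumptions, since the article's Listing 3 is not available here.

```php
<?php
// Hypothetical helper: builds a paragraph style for styles.xml.
// $properties maps fo:* attribute names to their values.
function ooParagraphStyle($name, $parentName, array $properties)
{
    $attributes = '';
    foreach ($properties as $key => $value) {
        $attributes .= ' ' . $key . '="'
            . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8') . '"';
    }

    return '<style:style style:name="' . $name . '"'
         . ' style:family="paragraph"'
         . ' style:parent-style-name="' . $parentName . '">'
         . '<style:properties' . $attributes . '/>'
         . '</style:style>';
}

// Illustrative values only; they are not taken from the article.
echo ooParagraphStyle('Heading 1', 'Standard', array(
    'fo:font-size'   => '18pt',
    'fo:font-weight' => 'bold',
    'fo:color'       => '#000080',
)), "\n";
```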
The Manifest
The manifest.xml file contains information about all the files that make up the document. This is an XML file with two basic elements. The main node, called manifest:manifest, contains one element for each file. The specific files are then defined using the manifest:file-entry element. The MIME type of each file is declared in the manifest:media-type attribute, and its path with manifest:full-path. Listing 4 contains the manifest.xml file for our basic document with one image.
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-styles PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
[...]
Listing 3
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE manifest:manifest PUBLIC '-//OpenOffice.org//DTD Manifest 1.0//EN' 'Manifest.dtd'>
<manifest:manifest xmlns:manifest='http://openoffice.org/2001/manifest'>
<manifest:file-entry manifest:media-type='application/vnd.sun.xml.writer' manifest:full-path='/'/>
<manifest:file-entry manifest:media-type='' manifest:full-path='Pictures/'/>
<manifest:file-entry manifest:media-type='image/jpeg' manifest:full-path='Pictures/myimage.jpg'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='content.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='styles.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='meta.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='settings.xml'/>
</manifest:manifest>
Listing 4
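Because the manifest is just one file-entry per file, it lends itself to being generated from an array. The sketch below follows the structure of Listing 4; the function name ooManifest() is hypothetical, and the caller is assumed to supply the correct media type for each path.

```php
<?php
// Hypothetical helper: builds manifest.xml from a map of
// full path => media type, following the structure of Listing 4.
function ooManifest(array $files)
{
    $xml  = "<?xml version='1.0' encoding='UTF-8'?>\n";
    $xml .= "<!DOCTYPE manifest:manifest PUBLIC "
          . "'-//OpenOffice.org//DTD Manifest 1.0//EN' 'Manifest.dtd'>\n";
    $xml .= "<manifest:manifest "
          . "xmlns:manifest='http://openoffice.org/2001/manifest'>\n";

    foreach ($files as $path => $mediaType) {
        $xml .= "<manifest:file-entry manifest:media-type='"
              . htmlspecialchars($mediaType, ENT_QUOTES | ENT_XML1, 'UTF-8')
              . "' manifest:full-path='"
              . htmlspecialchars($path, ENT_QUOTES | ENT_XML1, 'UTF-8')
              . "'/>\n";
    }

    return $xml . "</manifest:manifest>\n";
}

echo ooManifest(array(
    '/'                    => 'application/vnd.sun.xml.writer',
    'Pictures/'            => '',
    'Pictures/myimage.jpg' => 'image/jpeg',
    'content.xml'          => 'text/xml',
));
```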
Meta information
If we want to add meta information to our document, we can store it in meta.xml. At a minimum, information like document generator, creation date, number of edits and document statistics is stored here. However, you can store much more data in this file if you like. For example, if you need to comply with the Dublin Core meta data standard, you can store that information in this file. You can see an example in Listing 5.
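As a rough illustration, a very small meta.xml could be assembled like this. The DOCTYPE line comes from Listing 5, but the element names (meta:generator, dc:title, meta:creation-date) and the namespace URIs are written from recollection of the OpenOffice.org 1.x format and should be treated as assumptions to verify against a real document; ooMeta() is a hypothetical name.

```php
<?php
// Hypothetical helper: builds a minimal meta.xml. The element names and
// namespace URIs below are assumptions based on the OpenOffice.org 1.x
// format and are not taken from the article.
function ooMeta($title, $generator, $creationTimestamp)
{
    $created = gmdate('Y-m-d\TH:i:s', $creationTimestamp);
    $escape  = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };

    return "<?xml version='1.0' encoding='UTF-8'?>\n"
         . "<!DOCTYPE office:document-meta PUBLIC "
         . "'-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>\n"
         . "<office:document-meta"
         . " xmlns:office='http://openoffice.org/2000/office'"
         . " xmlns:meta='http://openoffice.org/2000/meta'"
         . " xmlns:dc='http://purl.org/dc/elements/1.1/'>\n"
         . "<office:meta>\n"
         . "<meta:generator>" . $escape($generator) . "</meta:generator>\n"
         . "<dc:title>" . $escape($title) . "</dc:title>\n"
         . "<meta:creation-date>" . $created . "</meta:creation-date>\n"
         . "</office:meta>\n"
         . "</office:document-meta>\n";
}

echo ooMeta('My document', 'My PHP generator', time());
```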
Settings
The settings.xml file is only used by the OpenOffice.org writer editor to remember which toolbars were open and which GUI settings the user had enabled when a document was last being edited. Since we won’t be doing any editing, we are not going to look much into this document other than to generate a very minimally valid version of it. Listing 6 shows the minimal XML document for settings.xml; you can just hard-code this value in a variable and reuse it as needed without worrying about it anymore.
Packing the Final Document
We have now looked at what kind of XML is used in all the different files that make up an OpenOffice.org document. The content.xml and styles.xml files, in that order, are by far the most important portions of the document. I also showed you how you can simply manipulate your XML in PHP without the aid of any specialized library, so, if you have stored all the XML files in one directory and done the same with your picture, we are ready to package our document. In fact, all that’s left to do is to ZIP down all the files in this folder and rename the resulting file so that its extension is sxw. Voila!
The simplest way of creating the ZIP archive is to run the zip command found on most operating systems directly from PHP. This can be done by using the exec() function in PHP to instantiate and execute the command directly from the directory where you have stored the document files. The code snippet below shows a simple way of changing to the directory where you have stored the files and ZIP the latter down into an OpenOffice.org writer file:
chdir( “path/to/files” );
exec( “zip -r /oo_document_name.sxw *” );
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-meta PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
Listing 5
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-settings PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
Listing 6
If you are not dynamically changing the contents of the meta.xml and mimetype files, there is no reason to actually generate them from PHP. In fact, even manifest.xml does not really need to be correct: I have created documents that did not have images properly defined in the manifest and still worked fine. Of course, you should generate these files correctly every time in order to comply with the document standard.
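If shelling out to the zip command is not an option, the packaging step can also be sketched with PHP's ZipArchive class. This is only an illustration under stated assumptions: it needs a PHP build with the zip extension, the function name package_sxw() and both of its arguments are hypothetical, and the target file is assumed to live outside the source directory.

```php
<?php
// Sketch only: package_sxw() is an illustrative name, not part of the
// article's code or of OpenOffice.org. Requires the PHP zip extension.
function package_sxw($sourceDir, $targetFile)
{
    $zip = new ZipArchive();
    if ($zip->open($targetFile, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        return false;
    }

    $sourceDir = rtrim($sourceDir, '/\\');

    // Walk the directory tree and add every file under its relative path.
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($sourceDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue;
        }
        // Strip the source directory prefix and normalise separators.
        $localName = substr($file->getPathname(), strlen($sourceDir) + 1);
        $localName = str_replace('\\', '/', $localName);
        $zip->addFile($file->getPathname(), $localName);
    }

    // close() writes the archive to disk and returns true on success.
    return $zip->close();
}
```

A call such as package_sxw("path/to/files", "oo_document_name.sxw") would then stand in for the chdir() and exec() pair, with the same caveat that the XML files must already be in place.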
On Not Re-inventing the Wheel
If you want to get started quickly with OpenOffice.org Writer document generation, you do not necessarily have to write all the code yourself. There are already libraries that can take care of the dirty work for you. For example, the eZ publish content management system has a library for generating OpenOffice.org Writer documents. This library is part of the public extension libraries in the product. PHPDocWriter is another library that enables you to generate OpenOffice.org Writer documents easily in PHP. There are probably other libraries out there as well, so you can decide for yourself whether you want to write your own generator or use an existing one.
Use Cases
At this point, you should have a pretty good idea of how you can generate an OpenOffice.org Writer document with PHP. As you can see, this is much more work than simply typing up the document in OpenOffice.org, except, of course, for those cases where you can’t. A typical use case for this technology is a CMS that needs to export its content to a standard document format, thus lowering the investment of time and money required to manipulate and reuse it. This is the same method I used when writing the content export functionality from eZ publish to OpenOffice.org.
The Future
The OpenOffice.org XML format will change in the future. Currently, members of OASIS (Organization for the Advancement of Structured Information Standards) are working together on creating an open XML standard for office applications based on the existing OpenOffice.org XML format. Their goal is to create an even more generic content format that can be used by different applications; even CMS systems are covered by their mission statement. Since this new standard is based on the OpenOffice.org XML file format, you shouldn’t need massive changes to support it when it is released. Some people even speculate that this format could be adopted by Microsoft Word so that, in the future, you will be able to save your office application data in the same format regardless of which application you use. The OASIS Open Office XML format is currently available as a draft that has been approved by the OASIS Open Office Technical Committee.
Final Words
We have now looked at how we can generate an OpenOffice.org Writer document containing the most common formatting features used in word processing. This article, of course, only scratches the surface of what you can store in a document by using the OpenOffice.org XML file format, but, hopefully, it gives you some ideas on how you can harness the power of this open-source office suite in your own application. Generating OpenOffice.org spreadsheets and presentation documents is done in much the same way and, by reading up on the specifications of each XML file format, you should be able to create your own in no time.
To Discuss this article:
http://forums.phparch.com/192
Bård Farstad is one of the three co-founders of eZ systems. He has been working professionally with CMS development since 1999 and is the author of many general-purpose libraries, such as an XML parser, a SOAP library (client/server) and an XML-RPC library (client/server). He is also one of the main developers of the eZ publish CMS. In his spare time, he likes to play with his daughter and play the guitar; he also enjoys “aquascaping,” the art of decorating aquaria. You can reach Bård via e-mail at bf@ez.no.
Resources
OpenOffice.org homepage: http://openoffice.org
OpenOffice.org XML file format: http://xml.openoffice.org
eZ publish homepage: http://ez.no
eZ publish public extension repository: http://zev.ez.no/svn/extensions/
PHP DocWriter: http://phpdocwriter.sourceforge.net/
Oasis Open Office XML Format: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
FEATURE
Where in the World was that Photo Taken?
Ever wonder where a photo was taken? We’ve all seen exotic or interesting photos, but really had no way of pinpointing their location. Now, with the marriage of GPS and digital photography, we can know the exact location where our favorite photos were taken.
Sitting in his comfy fortieth-story office, he effortlessly presses a button, sending a tracking robot into the sky from the roof of the huge skyscraper. The robot swoops down towards the ground, stopping just feet away from the top of its target, a late-model truck hauling some sort of cargo. The robot follows the vehicle and takes a picture every few seconds; it then sends the photos to the fortieth-story office using its Wi-Fi antenna. When the photos are received, a dot is placed on a map on a computer monitor in the office showing where they were taken, tracking the target all the way to its destination.
Of course, this story is fiction, except for the images that were taken. The images mentioned in the story were encoded with the longitude and latitude received from GPS satellites. With the advancement of digital photography and GPS technology, we now have access to these types of images.
I remember when I first saw a GPS device. It was about six years ago; I was in an archeology class in college, and the only one who could touch the device was the teacher because it was so expensive. The device we used in that class was very basic and only told the direction, longitude and latitude. We had to use it in the shade because the LCD display was so dim. But it was cool to be able to use the device, because not many people had them at the time. These days, GPS devices are far cheaper, much more sophisticated and wrapped in much smaller packages, and you can actually see what’s on the LCD display right in the sunshine. Because the space that a GPS device needs to function has been reduced, they’re ending up in very unusual but useful places.
More and more cell phones hit the market every year, and every manufacturer is adding new technology to their continually shrinking devices. One of the more recent developments is the integration of GPS into cell phones. Besides just being able to tell where you are at any given second, this new feature opens many doors for navigation and tracking with the phone. It definitely gives a new meaning to the words “big brother,” though.
Along with GPS has come the inclusion of a camera in the phone. Sure, digital cameras have been in phones for a while, much longer than GPS, but with the new feature a melding of the two has occurred. If your phone is equipped with GPS and also has a digital camera built into it, you should be able to send images via email, or download them to your computer later, that have the latitude, longitude, direction and even the speed at which the photo was taken attached to them. The information is not visible on the photo itself; it gets embedded into the image’s headers when the photo is taken. Since the information is in the photos, it is just a matter of retrieving it from the image, which you will then be able to plot on a map to show where the photo was taken, just like in my fictional short story above.
When I first heard about the GPS data included with images from cell phones, I figured it was sent as a line in an email, totally separate from the image. Some early adopters of the technology did go this route, but for the most part today the images have the GPS information placed into the EXIF tags located in the JPEG image that is sent.
What is EXIF?
Here’s a definition of EXIF:
Exchangeable image file format, a JPEG-encoded file
format for digital cameras that has similar tags to
TIFF.
Most digital cameras on the market, from professional models to cell-phone cameras, use this standard to include tags in the headers that provide information about the photo. Some are so detailed that they’ll give you the type of camera, the camera settings, the resolution of the picture, the time and date it was taken and a lot more. Having these tags embedded in the photo lets the photographer or viewer know how, and even where, the photo was taken; the tags can be viewed at any time without having to go to a separate file.
The standard was created in 1995 by the JEITA (Japan Electronics and Information Technology Industries Association) to standardize not only the way the image was recorded, but also the attributes of the photo. Since there is a standard, many programs that are not tied to any particular camera or brand can view these headers and use the information for other applications. You can even install and configure PHP with the ability to read EXIF headers directly; in a little while, though, we’ll be using a third-party library instead.
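For readers curious about the built-in route, PHP's exif extension provides exif_read_data(). The sketch below is not part of the article's code: the helper names exif_rational_to_float() and read_gps_section() are made up for illustration, and it assumes a PHP build with EXIF support and an image that actually carries a GPS section.

```php
<?php
// Hypothetical helper: EXIF stores GPS numbers as rationals such as
// "5543/100", so turn one of those strings into a float.
function exif_rational_to_float($rational)
{
    $parts = explode('/', (string) $rational);
    $numerator = (float) $parts[0];
    $denominator = isset($parts[1]) ? (float) $parts[1] : 1.0;
    return $denominator == 0.0 ? 0.0 : $numerator / $denominator;
}

// Hypothetical helper: returns the GPS section of the EXIF data as an
// array, or null if the extension, the file or the section is missing.
function read_gps_section($path)
{
    if (!function_exists('exif_read_data') || !is_file($path)) {
        return null;
    }
    // The third argument groups the tags into per-section arrays.
    $exif = @exif_read_data($path, null, true);
    if ($exif === false || !isset($exif['GPS'])) {
        return null;
    }
    return $exif['GPS'];
}
```

With a tagged photo, each entry of the returned GPSLatitude array could be passed through exif_rational_to_float() before any further conversion.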
In this article, we’ll see how to pull the longitude and latitude out of a photo’s EXIF headers using a library called Exifer, written by Jake Olefsky. We will then format the GPS information into a usable format and plot the location of the photo onto a world map.
The sample image that we are using (Figure 1) actually came from a professional digital camera. The photographer used a program to add to the EXIF headers the GPS information from a handheld GPS device, which recorded his location and the time throughout his day of photography. Whether you get a photo from a cell phone with GPS or from a professional digital camera that had the GPS info added later on, the longitude and latitude will be placed into the EXIF headers in the same format.
Let’s Get Plotting
Before you get started, you will need to grab the latest version of Exifer (I used version 1.4 for this example) at http://www.jakeo.com/software/exif/. This is the library we will use to get the longitude and latitude information. I chose this route because it was the simplest to get up and running and doesn’t require any installation or configuration of PHP apart from placing the files onto your web server.
We will have to use some sort of library to parse through and decode these EXIF tags. For the most part, the tags are not in any human-readable format. If you open the image in a text editor, you will see just a bunch of gibberish, not really anything you will be able to work with.
Figure 1
Copyright © FreeFoto (freefoto.com), Ian Britton. (A photograph of a communications tower, but this photo is able to communicate other things besides just the image.)
The first listing (Listing 1) includes the exif.php file from the Exifer library so that we can pull the EXIF information from our image. The $path variable indicates where on our server the image is located. We then set the $verbose setting to zero. If verbose is set to one, the result will include the image’s raw output as well, which, in our case, would just end up as extra overhead in the array. The last line of the listing grabs the EXIF tags from the image, reads through them and places them into an array from which we can then easily pull out any of the information.
The Exifer library also includes the ability to view a list of all available information and attributes for a photo. You can run the index file that is located with the Exifer package; it basically reads the array created by the exif.php file and outputs it to the screen.
Listing 2 pulls the longitude and latitude out of the photo’s EXIF array. Most photos that carry this information format it as hh mm ss (hours, that is degrees of arc, then minutes and seconds), and it looks something like (34 55.43 23.44) when retrieved from the array. With our example image, the format is just hh mm.
$lats = $result['GPS']['Latitude'];
$lons = $result['GPS']['Longitude'];
Listing 2
Listing 3 breaks the longitude and latitude into arrays of hours, minutes and seconds.
$lat_ar = explode(" ", $lats);
$lon_ar = explode(" ", $lons);
Listing 3
Because we get the longitude and latitude in this format, our coordinates will need to be converted into a single number with a decimal point, something like (35.445845). Listing 4 takes each element of the array and converts each set of numbers to give us two usable numbers that we can plot on our world map. The conversion method is as follows:
First, the seconds are divided by 60 and combined with the minutes. The combined minutes are then divided by 60. Finally, the hours are added to the minutes which were combined with the seconds:
hh+mm
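The conversion steps above can be sketched as a single function. This is a minimal illustration rather than the article's Listing 4: the name dms_to_decimal() is hypothetical, and it assumes the value arrives as a plain string of space-separated numbers such as "34 55 23.44", with no hemisphere letter or degree symbol. Southern latitudes and western longitudes would still need their sign applied afterwards.

```php
<?php
// Hypothetical helper: convert "hh mm ss" (or just "hh mm") into a
// single decimal number of degrees.
function dms_to_decimal($coordinate)
{
    // Split on whitespace: "34 55 23.44" becomes array("34", "55", "23.44").
    $parts = preg_split('/\s+/', trim($coordinate));

    $hours   = isset($parts[0]) ? (float) $parts[0] : 0.0;
    $minutes = isset($parts[1]) ? (float) $parts[1] : 0.0;
    $seconds = isset($parts[2]) ? (float) $parts[2] : 0.0;

    // Seconds are divided by 60 and combined with the minutes...
    $minutes += $seconds / 60;

    // ...and the combined minutes are divided by 60 and added to the hours.
    return $hours + $minutes / 60;
}
```

For instance, dms_to_decimal("35 30 0") gives 35.5, and a value in the shorter hh mm form, such as "35 30", gives the same result because the missing seconds default to zero.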
In Listing 5, I have included my home town’s longitude and latitude to plot on the map. This also serves as a way to check the accuracy of the plotting. If the dot on the map is not where it should be, you will know something is wrong with the code or the map. You can, of course, replace this information with your own area’s longitude and latitude. Most of the map services on the internet should allow you to translate your address into longitude and latitude coordinates. I have found that http://www.maporama.com covers most countries around the world and gives you the longitude and latitude in an easy-to-read format along the left side of the site.
Now we come to actually plotting the locations on a map. There are many different ways to do so but, for this example, we will use one of the easiest and most basic methods by taking advantage of what is called a cylindrical projection map. You can use the same map I used, or you can find your own cylindrical projection map. The map can be larger or smaller and it will not affect this code at all. Most cylindrical projection maps will work, as long as they do not have any type of border around them. Even the simplest border could throw off the calculations in later listings. I have tested this program using the abstract maps from http://flatplanet.sourceforge.net/maps/ and it works just fine, even though the maps on this site are much bigger than our example map.
Also, in Listing 5 we start to create our map. I use the imagecolorresolve function to define my colors because I found it to be more accurate than imagecolorallocate with the map I have chosen. One color, $red1, will be used for the dot that locates my home town, while another, $red2, will be used to locate where the photo was taken when we plot the two on the map.
$im = imagecreatefromjpeg("map.jpg");
$red1 = imagecolorresolve($im, 255, 0, 0);
$red2 = imagecolorresolve($im, 255, 0, 0);
Listing 5
Listing 6 grabs the dimensions of the image so that we can use them in the math that plots the locations.
$width = imagesx($im);
$height = imagesy($im);
Listing 6
Listing 7 is the method for getting an x and y position to plot on the map using our previously converted longitude and latitude coordinates from our photo and the height and width of the map. It then draws a red square on the map image at the location where the photo was taken. Again, this is probably one of the simplest ways of plotting a location onto a map, and definitely not the only or the best one.
Listing 8 uses the same method and creates a slightly darker square to show my home town as a reference point to check the accuracy of the plotting.
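One common way to turn decimal coordinates into pixel positions is shown below. It is a sketch under assumptions, not necessarily the method of Listings 7 and 8: the function name plot_position() is hypothetical, and the map is assumed to be an equirectangular (plate carrée) image that covers the whole globe with no border, with longitude running from -180 at the left edge to 180 at the right and latitude from 90 at the top to -90 at the bottom.

```php
<?php
// Hypothetical helper: map a decimal longitude/latitude pair onto pixel
// coordinates of a borderless, whole-world equirectangular map image.
function plot_position($longitude, $latitude, $width, $height)
{
    // Longitude -180..180 spreads evenly across the image width.
    $x = ($longitude + 180) * ($width / 360);

    // Latitude 90..-90 spreads evenly down the image height
    // (pixel rows grow downwards, so north is at the top).
    $y = (90 - $latitude) * ($height / 180);

    return array((int) round($x), (int) round($y));
}
```

Given the $width and $height of the map, the returned pair could be handed to GD's imagefilledrectangle() to draw a small square around that point, for example from ($x - 2, $y - 2) to ($x + 2, $y + 2).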
Listing 9 displays the newly created map in your browser. If everything goes to plan, you should see the same result as Figure 2: a world map with the location where the image was taken marked in the United Kingdom. If you have replaced my home town’s coordinates with yours, you should see another dot locating your area; otherwise the dot should be around the Mexico border. You may want to check the accuracy of the plotting against another mapping system like http://geoengine.nima.mil/muse-cgi-bin/rast_roam.cgi. This site has a lot of useful world maps, as well as black-and-white aerial photo maps taken from satellites that you can plot coordinates onto and also download in a high- or low-resolution format.
There are other ways of plotting maps and I would definitely suggest looking into some of them to show where the photos were taken. As you can see from the map generated by our program, the results we obtain only provide a general idea of where the image was taken; if the map could zoom in, the user would be able to see exactly where it was taken. Many maps that you can find on the internet are available in vector and raster versions and will sometimes include additional geographical information, such as streets and highways; using them will definitely add more detail. I know that the Tiger mapping system (http://tiger.census.gov/cgi-bin/mapbrowse-tbl) created by the US Census Bureau allows you to create maps with very detailed information, but its data is limited to the US. Using a more detailed map means your user will be better able to pinpoint where the image was taken and how to get there. You may also want to check out the article titled “Webmapping with MapServer,” written by Rodrigo Becke Cabral and published in the July 2004 issue of php|architect.
And there you have a program whose capabilities are similar to those mentioned in my short story above (minus the cool flying robot). Now you can take a road trip with your GPS-enabled cell phone, take pictures of all the cool roadside monuments and then make a map showing where you’ve been and what you’ve seen.
Examples
There are several sites out there that are dedicated to mapping where photos are taken. Most of these sites or projects are strictly for entertainment and experimentation purposes, and very few have been applied to the professional marketplace. One of the best examples of using this technology is http://www.geosnapper.com. This site is dedicated to photos that have GPS data. It plots all the uploaded photos onto a world map; you merely scroll over to the section of the world you would like to see and click on an image. It is a very good and entertaining use of the technology.
Another good example is the WWMX (World Wide Media Exchange), which can be found at http://www.wwmx.org. This site is a little different because you download an application that works with the GPS logs that come from a GPS device (basically a file that usually gives a GPS coordinate every minute). It then matches the timestamp of each photo you have taken with the corresponding time in the GPS log and selects the appropriate location.
With this information on hand, it creates a map and web page on which the location where every photo was taken is plotted. You can then take the map and web page and upload them to your own website. The users of this system are primarily taking advantage of it for travel logs and for showing others the path they took and what they saw along the way.
Since cell phones were married to digital cameras, it has become more and more popular to create blogs using images taken with your phone. Now, with the integration of GPS, we are seeing a new level added to the standard blog. Several of the moblog (mobile blog) sites also allow you to click an image and locate where that photo was taken.
Here is a more professional use of this technology (or of a form of it, at any rate). At http://arcweb.esri.com/sc/album/index.html, you are able to view and locate images from all over the world. This system actually translates the street address of the photographer into latitude and longitude. Even though it works slightly differently from ours, it is still creating a map with the location of the photo.
This, however, is probably my favorite use of GPS-tagged images: http://www.downgoesthesystem.com/devzone/exiftest/final/. Users email the images to the site from their cell phones, and the site reads the EXIF headers and plots the images onto the map, which is limited to Tokyo. The map allows the user to zoom in, see where the photo was taken and view the photo by clicking on the location marker. I probably like this one the most because they are actually using tagged cell phone images and not adding the data to the photo at a later time.
Final Thoughts
As you can see, there are quite a few people and companies developing or toying around with this type of technology in some form or other. A handful of these sites have yet to tap into the full features that EXIF tags provide. Having to input the GPS data separately is redundant and could possibly be less accurate, especially if you do have a cell phone or device that records this information into the photo itself.
It seems to me that GPS is becoming more and more popular and will, I assume, be packaged with more and more electronic products in the future. Today, we see GPS technology placed in delivery trucks to track drivers around town or across the state. Or you can have a device installed in your car that tracks it if it is stolen. Having GPS in our lives helps us in so many ways. It is definitely helpful when lost on a highway, or even out on an uncharted trail. Being able to pinpoint your location, or at least know the general area where you are, is a great benefit.
Hopefully, this program, as well as the idea of using GPS-embedded photos, has sparked your own imagination. I know I was quite intrigued by the possibilities of the technology and the uses that it might hold from the very first time I saw it. The melding of GPS and digital images in cell phones is fairly new and its usefulness has not been fully explored. It will be very interesting to see where it goes in the future.
To Discuss this article:
http://forums.phparch.com/193
Ron is the technical director/senior programmer for Conveyor Group (http://www.conveyorgroup.com), a Southern California-based web development firm. His responsibilities include technology development, programming, IT and network management, strategic research, server systems management (webmaster), and leading website projects.
TEST PATTERN
Shedding a Tier
The four-layer, or four-tier, architecture is an enterprise development classic. The trouble is that, for small projects (or big simple ones), it is complete overkill. What happens when we try to simplify this layering?
Layering is essential. The only way our rather feeble brains can cope with software development at all is by a process of divide and conquer. This is because bugs are easy to fix once you find them, but finding them is the problem. If we can make a part of our code completely unaware of the other parts, we know for sure that any errors in it are local. Layering is the grandest expression of divide-and-conquer: it divides our entire application into a very few pieces and declares that each one can only be influenced by itself and, at most, one other. In particular, each layer can only see the next one down.
This is easy to understand and works well. It’s no surprise, then, that this technique has been applied to complex enterprise applications and that there are lots of layered systems to choose from. It also means, unfortunately, that terminology has suffered. Layers are sometimes called logical tiers, or just tiers. You also see texts where “tiering” or layering is described as the separation of hardware, that is, the use of multiple servers. Faced with this confusion and the need to fit an explanation into a single article, I am going to have to punt. My preferred solution in this arena is four layers, so I’ll take as my starting point the one used by Eric Evans in “Domain Driven Design” (published by Addison Wesley).
Then we’ll prod and poke it.
The Four-Layer Architecture
As you can see in Figure 1, the layers in our model are presentation, application, domain and infrastructure. If you are not used to UML, the tabbed boxes are packages; they are, basically, big dollops of code. The arrows show visibility, so the application layer is blissfully unaware of the presentation layer, for example. To demonstrate the way the layers work, I am going to use the very trivial example of a contact manager. Firstly, let’s see what the presentation code would look like for the single task of e-mailing someone:
<body>Message sent to <?= $_GET['name'] ?></body>
</html>
The method and style of interaction, or, in this case, the lack thereof, is what makes up the presentation layer. If you can imagine changing the way the application is used (for example, switching to a GUI or a web services API), then anything that would change must go into this layer. That’s actually a lot of stuff: it naturally includes JavaScript, CSS, form parameters and the HTML, but it also includes sessions and maintaining authorization. After all, these will be different for, say, a desktop application compared with a web one.
The presentation layer is allowed to interrogate the application one, here represented by the Community class. Let’s look at that next:
class Community {
function mail($name, $title, $message) {
$finder = new PersonFinder();
I don’t have the space to build a complete four-layer application at this point, so I am going to have to illustrate the ideas with code fragments from now on.
The application layer is the glue that binds all of the components together. It’s all about actions written in a language that the business stakeholders would understand. The domain objects contain the more innate business rules. An example of domain knowledge is how the e-mail is sent. The application layer knows nothing of this process: extra headers, formatting, and so on. It just kicks off the domain code.
What makes something an application object and what makes it a domain object is subtle. The distinction comes about because applications change more frequently, often in response to what users want from the business. Knowledge of the business domain itself is acquired more slowly and with a lot more effort. In fact, so much effort goes into this process of discussion between the developers and stakeholders that it has its own name: “knowledge crunching.” By contrast, the application code should tell a simple story of what is going on. In our example, this boils down to finding a person, getting a contact point from them and finally sending the message. The grammar just then is English, but the grammar of our code snippet is PHP syntax.
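To make that three-step story concrete, here is a self-contained sketch of how an application-layer method might read. Only Community, mail(), PersonFinder and Person come from the article; ContactPoint, findByName(), getContactPoint(), send() and the in-memory outbox are hypothetical stand-ins, and the domain classes are reduced to stubs so that the example runs on its own.

```php
<?php
// Domain layer stub (hypothetical): one way of reaching a person.
class ContactPoint
{
    public static $outbox = array(); // stands in for a real mail call
    private $address;

    function __construct($address)
    {
        $this->address = $address;
    }

    function send($title, $message)
    {
        self::$outbox[] = array($this->address, $title, $message);
    }
}

// Domain layer stub: a real Person would look its contact points up.
class Person
{
    private $name;

    function __construct($name)
    {
        $this->name = $name;
    }

    function getContactPoint($medium)
    {
        return new ContactPoint(strtolower($this->name) . '@example.com');
    }
}

// Domain layer stub (method name is an assumption).
class PersonFinder
{
    function findByName($name)
    {
        return new Person($name);
    }
}

// Application layer: find a person, get a contact point, send the message.
class Community
{
    function mail($name, $title, $message)
    {
        $finder = new PersonFinder();
        $person = $finder->findByName($name);
        $contact = $person->getContactPoint('email');
        $contact->send($title, $message);
    }
}
```

The point of the sketch is the shape of Community::mail(): it reads like the English sentence it implements and knows nothing about headers, formatting or storage.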
In our example, I am choosing the Community class to be part of the application layer, but I would expect classes like Person to be used in several applications within an organization. Because of this, I think it’s safe to assume that Person is a domain layer object. Let’s look at a domain object next:
class Person {
} } }
There are business rules, even here in this trivial example. Ordering by preference means that we are taking the first of a possible list of contact points. Because there are other ways to contact our people, we had to specify a medium, in this case e-mail. Unlike the application layer example, we have some clutter caused by the database access. We’ll take a broom to this in a little while.
As we descend to the lowest level, infrastructure, we start to get to the nitty-gritty. The code the domain object is using is stuff that could be common to any organisation: library code, if you will. Here is some infrastructure code:
class Connection {
} } return new ResultSet(result);
}
If you are like me, then you have written this type of code a lot of times. More likely, you have had the good sense to use one of the many free libraries instead.
You are probably thinking that all of these little classes and files could be replaced with a single top-level script that would be a whole lot simpler. That would be a good point. For such a simple task, I would have a hard time disagreeing with you. The four-layer architecture only really comes into play once the job starts to get complicated. For smaller projects, we can simplify to taste, so let’s look at some shortcuts.
Merging Application and Presentation
The blending of the layers can be seen in its most naive
form like so:
All we have done is take the code in the old application class and paste it straight into our top-level script. It’s the simplest way to combine the layers and, in fact, you often see this approach hidden behind a template. All a template engine really does is separate the visual formatting from the remainder of the application; it does not, by itself, separate all of the presentation logic. We still have the $_GET array in our code, for instance. This choice is great for separating the HTML so that it can be edited by graphic designers. It doesn’t manage to free you of the navigation and form handling. However, this is usually fine if you are just building web applications and are changing only the look and feel.
On the positive side, this approach is often good enough for standalone applications. It is also well understood, especially within the PHP community, and is a quick way to turn an HTML mock-up into a working system. The downside is that it will be hard to integrate into other applications and much harder to test. Because the application code here lives in scripts, it will have to be tested by looking at web pages, altogether a rather coarse approach. It works well for a small project, but this is about as much of a hole as I like to dig before I get nervous. The warning signs are tricky bugs with things like security, and also excessive duplication of code across the top-level scripts.
Merging Application and Domain layers
Because of the very slight difference between these two layers, it is common to merge them into a single one. That becomes one of the classic three-layer architectures. There is no difference in the code; it is just that the Community class is declared to be in the domain layer.
You can actually merge the application and domain layers with few ill effects, but with one caution. The symptom to watch out for is domain layer objects that are difficult to test because of configuration. Because domain objects are the part of the business rules you would like to reuse across the organisation, you don’t want them tied to a specific server. This may happen because they have fixed paths for files, or perhaps resources such as database passwords need to be global. This is a sign of future trouble. These kinds of decisions most definitely belong in the application layer and there is a big win in passing all of this into the domain objects as parameters. I would split them into two camps if you have a lot of server-specific configuration. It’s no fun searching hard drives for missing files.
Figure 1
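As a minimal sketch of passing configuration into a domain object as parameters, consider the fragment below. The constructor argument, the membersFile() method, the file name and the $config array are all hypothetical, chosen only to illustrate the idea; they are not taken from the article's Community class.

```php
<?php
// Hypothetical domain class: it receives server-specific settings
// as parameters instead of reading globals or fixed paths.
class Community
{
    private $dataDirectory;

    public function __construct($dataDirectory)
    {
        // Normalise the path so callers may pass it with or without a slash.
        $this->dataDirectory = rtrim($dataDirectory, '/');
    }

    public function membersFile()
    {
        return $this->dataDirectory . '/members.csv';
    }
}

// Application layer: the only place that knows about this server.
$config = array('data_directory' => '/var/data/community/');
$community = new Community($config['data_directory']);
echo $community->membersFile(), "\n";
```

A test can now hand the same class a temporary directory without touching any global state.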
Purifying the Domain Layer
Objects representing the business domain will probably have to be saved to a database, a process called persistence. The so-called ActiveRecord pattern is the simplest way to make objects persistent. If the infrastructure layer is very primitive, then the domain objects have to do a lot of work communicating with the database. The ActiveRecord pattern is really no pattern at all: the domain object will handle all of this work itself, although you may be able to factor some of it out with inheritance. The earlier Person class is an ActiveRecord. Although it has some help from the infrastructure classes, the metaphor is still one of database rows. This extra translation effort to go from a tabular database view to an object view is called object/relational impedance, and that’s not so nice when mixed in with your business code (you may want to read up on Rick Morris’ articles on this topic that appeared in the August 2004 and November 2004 issues of php|a). Now, a full discussion of persistence patterns is a book in itself (e.g. Nock), but pushing out the database code comes down to two basic ideas: external mappers and internal accessors.
The DataAccessor pattern, or DataAccessObject or DAO, wraps all of the database code into a single object that the domain object can call. For example, when
This separation is invisible to the outside world. The domain object, here our earlier Person, is in charge of creating and using the accessor. Note that the accessor just deals with database data and only has getters and setters. The data coming back could be other objects or arrays of data. It doesn’t have to correspond to a single row on the database and this can do a lot to clean up the domain layer code: it’s the very simple approach of delegating to an internal object to do all of the dirty work.
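The delegation described above might be sketched as follows. This is only an illustration under stated assumptions: the PersonAccessor class, its method names and the in-memory array that stands in for a real database are all hypothetical.

```php
<?php
// Hypothetical accessor: only getters and setters over stored data.
// An in-memory array stands in for the real database here.
class PersonAccessor
{
    private static $rows = array(
        7 => array('name' => 'Fred', 'email' => 'fred@example.com'),
    );
    private $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

    public function getName()
    {
        return self::$rows[$this->id]['name'];
    }

    public function setName($name)
    {
        self::$rows[$this->id]['name'] = $name;
    }
}

// Domain object: creates and uses the accessor internally,
// so callers never see that an accessor exists.
class Person
{
    private $accessor;

    public function __construct($id)
    {
        $this->accessor = new PersonAccessor($id);
    }

    public function greeting()
    {
        return 'Hello, ' . $this->accessor->getName();
    }

    public function rename($name)
    {
        $this->accessor->setName($name);
    }
}

$person = new Person(7);
$person->rename('Frederick');
echo $person->greeting(), "\n";
```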
The opposite approach is the DataMapper pattern. With this method, we gut the domain objects of all of the database code and use another separate class to do
PHP is, at last, starting to see some libraries emerge to ease the workload for saving objects. In order of sophistication, they include PEAR::DB_DataObject, Propel and MetaStorage.
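For contrast with the accessor approach, here is a rough sketch of the DataMapper idea, where the domain object carries no database code and a separate class does the storing. It is not the API of any of the libraries named above; the PersonMapper class, its save() and find() methods and the array used in place of a database table are assumptions made for illustration.

```php
<?php
// Domain object: no database code at all.
class Person
{
    public $name;
    public $email;

    public function __construct($name, $email)
    {
        $this->name = $name;
        $this->email = $email;
    }
}

// Hypothetical mapper: the only class that knows how a Person is stored.
// An array of rows stands in for a database table.
class PersonMapper
{
    private $rows = array();

    public function save($id, Person $person)
    {
        $this->rows[$id] = array(
            'name'  => $person->name,
            'email' => $person->email,
        );
    }

    public function find($id)
    {
        if (!isset($this->rows[$id])) {
            return null;
        }
        $row = $this->rows[$id];
        return new Person($row['name'], $row['email']);
    }
}

$mapper = new PersonMapper();
$mapper->save(1, new Person('Fred', 'fred@example.com'));
$copy = $mapper->find(1);
echo $copy->email, "\n";
```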
Removing the Domain Layer
So much has revolved around the business domain layer up to now that it may seem rather strange that it could be removed. What does an application look like without any business logic? Well, you can still use the database operations of creating, reading, updating and destroying, or CRUD for short, and you will also get mainly tabular data back. The end result is just a simple reporting application, but these are common in the PHP world.
Although limited in applicability, there is a way to make applications of this type spectacularly quick to write. Apart from dispatching queries, they only have to deal with a single type of object, namely the set of results returned from a query. The class is usually named RecordSet or some such similar name and, for memory efficiency, it is usually implemented as some type of iterator; that is, you read a row at a time. As there is only one type of object to display, it is easy to build a library of display components, usually called widgets or controls, to work with it. These can range from simple drop-down list widgets right up to elaborate editable table widgets.
To show what this looks like, imagine we are going to display a table of people. Here is a possible code fragment for the presentation layer:
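A self-contained sketch of such a fragment could look like the following. The RecordSet and TableWidget classes here are small in-memory stand-ins written for this example; only the names next() and paint() follow the article, and everything else (constructor arguments, the sample row) is assumed.

```php
<?php
// Hypothetical in-memory RecordSet: next() returns one row at a time,
// or false when the rows run out.
class RecordSet
{
    private $rows;
    private $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    public function next()
    {
        if ($this->position >= count($this->rows)) {
            return false;
        }
        return $this->rows[$this->position++];
    }
}

// Hypothetical widget: paints any RecordSet-like object as an HTML table.
class TableWidget
{
    private $records;

    public function __construct($records)
    {
        $this->records = $records;
    }

    public function paint()
    {
        $html = "<table>\n";
        while (($row = $this->records->next()) !== false) {
            // Escape every cell before it goes into the page.
            $cells = array_map('htmlspecialchars', $row);
            $html .= '<tr><td>' . implode('</td><td>', $cells) . "</td></tr>\n";
        }
        return $html . "</table>\n";
    }
}

$people = new RecordSet(array(
    array('name' => 'Fred', 'email' => 'fred@example.com'),
));
$widget = new TableWidget($people);
print $widget->paint();
```

The widget only ever calls next(), which is what makes it possible to hand it something other than a database result.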
Looks easy, and it is, but that’s only because we are playing to this system’s strength by simply displaying tables. What happens if we have to do some calculations on the columns or add other external data into the output? Because the row data is only actually loaded on demand, we have two options.
The first option is to pull all of the data out, perform our calculations and then create a new RecordSet from scratch. We then pass that back instead. That’s OK for small amounts of data, but messy. Notice that the RecordSet is a key abstraction here. Because these objects don’t have to come from a database, we are free to build or intercept them and so squeeze in an additional logical layer. Only in so doing can we justify calling this another form of three-layer architecture.
The other option is to run our code as the rows are fetched. As the widgets are going to do all of the work, we have to modify the next() call. We could inherit from the RecordSet, but preferable is wrapping it in a class that looks identical. This just passes the calls to the real RecordSet underneath. This trick is called the Decorator pattern, or, in this context, usually a filter. The code then looks like this:
<?php
// Fetch the rows, wrap them in the filter, then hand the filter to the widget.
$community = new Community();
$people = $community->findByCategory('friends');
$filter = new WithEmailsAsLinks($people);
$widget = new TableWidget($filter);
?><html>
<head><title>My Friends</title></head>
<?php print $widget->paint() ?>
</html>
By writing our WithEmailsAsLinks filter, we can later intercept the next() call to manipulate the rows as they pass through. I am imagining that the “email” field would be converted to an HTML anchor tag on each next() call. For big lumps of tabular data, this is a common technique. It has the added benefit that the same filters can be used again and again over an application. They are difficult to work with for tricky domain logic, though, and ridiculous overkill if you mostly fetch a single row or object at a time.
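One way such a filter might be written is sketched below. The article does not show the body of WithEmailsAsLinks, so this implementation is an assumption, as is the small in-memory RecordSet that it wraps.

```php
<?php
// Hypothetical in-memory RecordSet, as a stand-in for a database result.
class RecordSet
{
    private $rows;
    private $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    public function next()
    {
        if ($this->position >= count($this->rows)) {
            return false;
        }
        return $this->rows[$this->position++];
    }
}

// Decorator: offers the same next() call as a RecordSet, passes it
// through to the wrapped object and rewrites the "email" field.
class WithEmailsAsLinks
{
    private $records;

    public function __construct($records)
    {
        $this->records = $records;
    }

    public function next()
    {
        $row = $this->records->next();
        if ($row === false) {
            return false;
        }
        $email = htmlspecialchars($row['email']);
        $row['email'] = '<a href="mailto:' . $email . '">' . $email . '</a>';
        return $row;
    }
}

$filter = new WithEmailsAsLinks(new RecordSet(array(
    array('name' => 'Fred', 'email' => 'fred@example.com'),
)));
while (($row = $filter->next()) !== false) {
    echo $row['name'], ': ', $row['email'], "\n";
}
```

Because the filter produces HTML itself, a widget that consumes it should not escape the "email" column a second time.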
Faced with this constraining style, one solution is to move all of the complex business logic into the database as stored procedures or triggers and use PHP as a presentation tier only. This then becomes a database-driven application. If all of your information is stored on a relational database, the application does not change too much and the skills are available, then this is also a tried and tested option.
Most of the decisions so far have been easy to back out of, but the decision to go the RecordSet route rather than the domain layer route is more of a fork in the road. If you will be dealing with mostly tabular data and the bulk of your system is database-driven, then the RecordSet model is probably the way to go. If, on the other hand, you are frequently working with single complicated items or managing information from more than just databases, go with the domain model. In going that route, the simplest first split will be straight down the middle, namely get the domain layer away from the presentation.
That should give our brains a fighting chance.
Further Reading
The essential enterprise patterns book is “Patterns of Enterprise Application Architecture” by Martin Fowler (Addison-Wesley). Most of the patterns described here come from that book. In the same vein, but limited to persistence mechanisms, is “Data Access Patterns” by Clifton Nock (also Addison-Wesley).
The persistence libraries mentioned are: PEAR::DB_DataObject at pear.php.net/package/DB_DataObject, Propel at propel.phpdb.org/wiki/ and MetaStorage at www.meta-language.net/metastorage.html
To Discuss this article:
http://forums.phparch.com/194
Marcus Baker is a senior software developer at Wordtracker and part time web development consultant. His website is at http://www.lastcraft.com/. Marcus is also a co-founder of the PHPLondon organization.
Transliteration with PHP
There are a couple of different methods of converting characters to other characters. Transliteration is the process of converting a specific character to different characters or groups of characters.
Examples of transliteration are the converting of the Norwegian “å” to “aa” (ligature normalization), “ç” to “c” (diacritical removal), “ÿ” to “Ÿ” (changing case), “Ю” to “YU” (Cyrillic to Latin transliteration) and “©” to “(c)” (special decomposition). For each of those conversions, special filters can be used and the order of filters is important too. For example, you will want to run a ligature normalization filter before the diacritical removal filter so that “å” does not become “a”, but “aa” like Norwegian people would expect. As you can imagine, the definition of some of those filters can be pretty large, especially the Han to Pin Yin transliteration because of the great number of Chinese characters.
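To see why the order of filters matters, here is a small sketch in plain PHP. The two mapping tables are tiny illustrative assumptions, not the data used by any real transliteration filter, and the apply_filters() helper is hypothetical. The file is assumed to be saved as UTF-8.

```php
<?php
// Tiny illustrative mapping tables (assumptions, not real filter data).
$ligatures    = array('å' => 'aa', 'æ' => 'ae');
$diacriticals = array('å' => 'a', 'ç' => 'c', 'é' => 'e');

// Apply each mapping table to the text, in the order given.
function apply_filters($text, array $filters)
{
    foreach ($filters as $map) {
        $text = strtr($text, $map);
    }
    return $text;
}

$word = 'blåbær';
// Ligatures first: "å" is expanded before the diacritical table sees it.
echo apply_filters($word, array($ligatures, $diacriticals)), "\n"; // blaabaer
// Diacriticals first: "å" has already been flattened to "a".
echo apply_filters($word, array($diacriticals, $ligatures)), "\n"; // blabaer
```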
Transliteration from one script to another will most likely never be one hundred percent accurate, as the way characters are transliterated to the Latin script is sometimes affected by country, but most often just by the person who does the transliteration. Therefore, transliteration can only achieve an approximation of a script when we transliterate texts.
Why Is This Needed at All?
You might be wondering why one would need a method or an extension to transliterate characters from one character set to another one, but there are a couple of situations where this is really useful. One example is a content management system where you would want to create a URL path out of the title of a document. A first method would be to simply conduct the following steps:
• Convert the title of the document to lowercase characters
• Replace all characters not in the range of a-z0-9 with an underscore
• Remove underscores at the beginning and ending of the generated title
• Remove multiple underscores in a row
As an example, the title: “42: The answer to life, and everything.” would first become: “42: the answer to life, and everything.”, then “42__the_answer_to_life__and_everything_” and, finally, “42_the_answer_to_life_and_everything,” which is a suitable name to use in a URL. This algorithm works fine for English text, but, if the title of the document had contained the word “français,” for example, the final result would have been “fran_ais,” which is no longer representative of the document’s title. For different scripts, such as Cyrillic or Japanese Katakana, this is obviously not going to be useful at all. In eZ publish, URLs are not the only things that need some form of “mangling” in order to create a usable string. For example, other items include identifiers for attributes (fields) in a content object, searching and generated package names, and so on. Each of those cases might need different rules for creating a usable string as representation for the items. For example, if you are normalizing a string for a search engine, you might want to retain spaces, while they should be removed if you are preparing a string for use in a URL. Other uses for strings might not even allow underscores at all.
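The four-step first method described earlier in this section can be sketched in a few lines of plain PHP. The function name naive_url_name() is made up for this example.

```php
<?php
// Naive URL-name generator following the four steps described above.
function naive_url_name($title)
{
    $name = strtolower($title);                       // 1. lowercase
    $name = preg_replace('/[^a-z0-9]/', '_', $name);  // 2. everything else becomes "_"
    $name = trim($name, '_');                         // 3. strip leading/trailing "_"
    return preg_replace('/_+/', '_', $name);          // 4. collapse runs of "_"
}

echo naive_url_name('42: The answer to life, and everything.'), "\n";
echo naive_url_name('français'), "\n";
```

The second call shows the weakness: the two bytes of the UTF-8 “ç” are simply replaced, leaving “fran_ais”.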
The Translit Extension
One possibility is to implement these filters with PHP code, though this is not very fast. It is what you had to do before the translit extension existed. The translit extension makes it possible to apply filters on strings of text to perform different transliteration rules. The extension provides two functions only: transliterate_filters_get and transliterate. The first function returns an array with all available filters, while the second one provides the functionality needed to apply transliteration filters to strings.
Installing the translit extension can be done by simply running:
pear install http://pecl.php.net/get/translit
This will install the latest version of the transliteration extension, which, at the time of this writing, was beta version 0.5. In order for this to work, you do need a correct build environment for PECL extensions; this includes “fitting” versions of the autotools: autoconf 2.13, libtool 1.4.3, and automake 1.4-p6 or similar. Newer versions might also work, but they can throw quite a few warnings. If this is the case, you should downgrade your autotools to the versions I just mentioned. You can also get some information from the PHP manual by visiting this URL:
http://php.net/manual/en/install.pecl.php
Another installation dependency of the translit extension is the iconv extension, which you either need to have compiled into PHP (Unix) or loaded in by specifying it in your php.ini file with an extension= line before the translit extension.
When the extension is installed and enabled in php.ini, you can use the transliterate_filters_get function to see if everything is working:
This should return all supported filters.
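A minimal check along those lines might look like this. The function_exists() guard is an addition for this sketch, so that the script also runs (and says so) on a system where the translit extension is not loaded; the list of filters printed depends on the installed version.

```php
<?php
// transliterate_filters_get() is provided by the translit extension.
// Fall back to an empty list when the extension is not available.
$filters = function_exists('transliterate_filters_get')
    ? transliterate_filters_get()
    : array();

if (count($filters) > 0) {
    print_r($filters);
} else {
    echo "No filters found: is the translit extension loaded?\n";
}
```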
Character Sets and Unicode
The extension needs to deal with a lot of different character sets (e.g.: Latin, Greek, Cyrillic, and so on). Because none of the normal 8-bit character sets, or the Chinese Big5, are compatible with each other, the extension uses Unicode characters to perform its transformations on.
To implement efficient filters, the transliteration extension does not use UTF-8 encoding internally, as doing so would require too much overhead when parsing the string each time; instead, it uses UCS-2, which always stores one Unicode character as two bytes. This makes it possible to perform integer arithmetic on the characters, allowing for very fast filtering. In return, this means that you need to convert your data if you want to transform strings encoded in character sets other than UCS-2. Fortunately, the transliterate function allows you to specify the input and output character sets for the transliteration. This is where the iconv extension, on which the transliteration extension depends, comes into play.
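The difference between the two encodings can be seen with PHP's own iconv() function. This sketch only illustrates the fixed two-byte width of UCS-2; it makes no claim about how the extension performs the conversion internally, and the choice of the big-endian variant (UCS-2BE) is an assumption made so that the bytes can be unpacked predictably.

```php
<?php
// "Vær" written with explicit UTF-8 bytes: "æ" takes two bytes in UTF-8.
$utf8 = "V\xC3\xA6r";
$ucs2 = iconv('UTF-8', 'UCS-2BE', $utf8);

echo strlen($utf8), "\n"; // 4 bytes for 3 characters
echo strlen($ucs2), "\n"; // 6 bytes: always two per character

// Each character is now one 16-bit integer, so integer arithmetic is easy.
$codes = array_values(unpack('n*', $ucs2));
echo implode(' ', $codes), "\n"; // 86 230 114
```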
In Listing 1, for example, we execute the normalize_ligature filter on the string “Vær så god.” resulting in “Vaer saa god.” As you can see, the function is easy to use: the first parameter is simply the string that you want to execute a filter on, while the second parameter is an array containing the filters that you want to execute. The third and fourth parameters are the character set of the incoming data and outgoing data respectively.
The second parameter contains an array of filters, which means you can execute multiple filters with the same function call and be sure that the order in which they are executed is preserved, based on the contents of the array you pass.
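A call of the kind just described might be sketched as follows. The four-parameter form and the normalize_ligature filter name come from the text above; the function_exists() guard and the $result variable are additions for this sketch, so that it also runs where the translit extension is not installed.

```php
<?php
// "Vær så god." written with explicit UTF-8 bytes.
$string = "V\xC3\xA6r s\xC3\xA5 god.";

$result = null;
if (function_exists('transliterate')) {
    // Parameters: input string, filters (applied in array order),
    // input character set, output character set.
    $result = transliterate($string, array('normalize_ligature'), 'utf-8', 'utf-8');
}

echo $result === null ? "translit extension is not loaded" : $result, "\n";
```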
Transliteration Filters - Latin
Currently, the transliteration extension provides support for different groups of scripts, and each of those scripts has different filters.
For the Latin script, the group of filters consists of diacritical_remove, lowercase_latin, normalize_ligature and uppercase_latin. Not only do those four filters deal with the Basic Latin and Latin-1 Supplement Unicode blocks; they also support Latin Extended-A and Latin Extended-B. This means that the diacritical_remove filter will be able to remove diacritical signs for all Latin and Latin-like characters available in Unicode. It will therefore correctly convert all characters in the string “ ” to uppercase and then remove all the diacritical signs from it. Listing 2 illustrates this.
A couple of additional filters are required to be able to generate URL-safe names from article titles. For this, we need to follow these steps:
• Expand all ligatures (the normalize_ligature filter, “å” to “aa”)
• Remove all remaining diacritical signs (the diacritical_remove filter, “é” to “e”)
• Convert the string to lowercase (the latin_lowercase filter, “FoO” to “foo”)
• Normalize all punctuation so that the remove_punctuation filter can remove it (the compact_underscores filter, “_foo 42 ” to “foo_42”)
It is fairly trivial again to execute all those filters on the string from which we want to create a fitting URL-safe name, as you can see in Listing 3. In line 2, we define our string “Politiet: Økseangrepet var planlagt \n”, and in lines 3 to 7 our filter array. Line 8 then executes all the filters on the string, treating incoming data as UTF-8, but producing 7-bit ASCII data as output. In our case, we want 7-bit ASCII as output because this filters out all other scripts. Line 9 instructs vim, my editor of choice, to treat the file as UTF-8 data.
Greek
The technique above works very well for Latin-based languages, but as soon as a different writing system is used, it will fail miserably. Imagine a Greek string like “ ”. If we execute the same filters as for Latin strings, the result will be “ ”, which is, of course, completely useless. Hence, we have to transliterate this Greek text to the Latin script first, which is what the greek_transliterate filter does. In Listing 4, you can see that by prepending the greek_transliterate filter, the output of the transliteration process is something we can use as part of a URL, although it might not be a 100% correct transliteration.
There are two more filters for the Greek script: one, greek_uppercase, will convert all lowercase letters to uppercase, while the other, greek_lowercase, will convert all uppercase letters to lowercase. Listing 5 shows both filters in action.
Cyrillic
The same ideas apply to the Cyrillic script as for the Greek script. Unlike the Greek script, which is currently only used in Greece, the Cyrillic script is used for several languages. There are indeed some differences in preferences in those countries on how to transliterate Cyrillic into Latin. Therefore, the transliteration extension does not only contain a generic cyrillic_transliterate filter, but also a cyrillic_transliterate_bulgarian filter that changes some of the transliterations for specific letters. In the future, extra filters for even more languages may be added, but for now there is only one, specific to the Bulgarian language.
In Listing 6, you can see the different filters for the Cyrillic script in action; notice how the “ ” is transliterated differently in Bulgarian compared to the default transliteration rules for the Cyrillic script. Naturally, there are more characters that are different than only this one.
Hebrew
This is a very interesting one, as it has no concept of upper and lower case characters. Accordingly, there is only one filter related to the Hebrew script: hebrew_transliterate.
Listing 7 shows the filter in action. From the “mobrq_yskn_t_mdynot” One other interesting feature about Hebrew is that most writings don’t seem to use vowels at all.
Asian Scripts
This is where all the real fun begins, at least for someone who doesn’t know anything about Asian languages. Asian languages usually don’t conform to the “we use letters to represent text” rule. For example, Chinese uses ideograms, while Japanese uses those in addition to two other scripts, and Korean uses combined letters as characters. Another point is that Chinese doesn’t use any spaces between words, which makes it really hard to come up with a sensible Romanization strategy.
For now, the transliteration extension implements a few filters related to CJK (Chinese, Japanese, Korean) scripts. The first one, hangul_to_jamo, converts the combined Korean syllables (Hangul) back into letters (Jamo). The Unicode character set supports both the combined syllables as well as the separate letters that form the combined syllables. This is done in a very algorithmic way, fortunately. The second Korean related filter, jamo_transliterate, Romanizes the separate Jamo
char-
<?php
// Greek filters. The Greek sample text belongs inside this heredoc.
$string = <<<END
END;
echo $string, "\n\n";
echo transliterate($string, array('greek_uppercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('greek_lowercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('greek_transliterate'), 'utf-8', 'utf-8'), "\n\n";
<?php
// Cyrillic filters. $string holds the Cyrillic sample text, defined with a heredoc as above.
echo transliterate($string, array('cyrillic_uppercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_lowercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_transliterate'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_transliterate_bulgarian'), 'utf-8', 'utf-8'), "\n\n";