
DOCUMENT INFORMATION

Title: Hello, Goodbye

Authors: Bård Farstad, Marcus Baker, Ron Goff, Peter B. MacIntyre, Derick Rethans, Chris Shiflett, Chirag Ahmedabadi, John W. Holmes, Rami Kayyali, Marco Tabini

Institution: PHP Architect

Subject: PHP

Type: Article

Year of publication: 2005

Pages: 67

File size: 3.42 MB


JANUARY 2005 VOLUME IV - ISSUE 1

The Magazine For PHP Professionals

Tips & Tricks: Javascript Remote Scripting with PHP

EDITORIAL

A new year is upon us, and quite a few interesting things have already happened. We just published our first book, for example. The Zend PHP Certification Practice Test Book, which I co-wrote with John Coggeshall, has just been unleashed on the PHP community with (if I may unleash some personal pride) extremely good results. In a separate (but far more important) piece of news, PHP was named “language of the year 2004” by a site that tracks language usage in the development community. PHP 5 continues to plow along quickly and efficiently, with a new point release scheduled for release soon that will introduce some much-anticipated new functionality.

However you look at it, 2005 is poised to be a marquee year for PHP. There is so much going on that I can hardly keep my head around it and continue my daily activity here at php|a headquarters (although I must say that the two weeks of vacation I took around Christmas were very helpful in getting my head wrapped around doing absolutely nothing. What a pleasant change of pace…). On the other hand, 2004 was, in many ways, a marquee year for PHP as well. This just highlights the positive direction that the language is taking, shaped in many ways by the vast amount of work that everyone in the community (even those who can just be found complaining on the mailing lists) has put into defining its goals and needs.

Here at php|a, there are three important news items that I want to share with you this month.

First of all, John W. Holmes, who has taken care of our Tips & Tricks column since the very first issue, is leaving us. John is a Captain in the U.S. Army, and his “day job” is keeping him way too busy to deal with such a demanding column on a monthly basis. I know you expected me to say this, but I really, really enjoyed working with John. He has the (unfortunately rare) ability to be technically accurate and linguistically clear: his writings were always as pleasant to read as they were to edit. Luckily, the T&T column is far from over, but more about that will have to wait for another editorial. Thank you, John, and Godspeed.

Just to stay on the editorial front, we have a new column starting this month, titled Test Pattern and penned by Marcus Baker. In his column, Marcus will be dealing with the issue of proper software design as applied to PHP development, from patterns, to tiering, to testing. The goal of this column is to challenge you, dear readers, not just to write more efficient, but also more beautiful code: to make every single one of your applications a little work of art that is well thought out, properly designed and executed flawlessly. Marcus has an awesome job ahead of him, but, then again, he is an awesome fellow, so I’m sure that you’ll enjoy his writings.

Finally, you may have heard about some recent security issues that have struck both PHP and PHP-based applications, notably the popular forum software phpBB. The reaction to these issues has been less than stellar, in my humble opinion,

php|architect

Volume IV - Issue 1 January, 2005

Publisher

Marco Tabini

Editorial Team

Arbi Arzoumani Peter MacIntyre Eddie Peloke

Graphics & Layout

php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.

Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.

Contact Information:

General mailbox: info@phparch.com

Editorial: editors@phparch.com

Subscriptions: subs@phparch.com

Sales & advertising: sales@phparch.com

Technical support: support@phparch.com

Copyright © 2003-2004 Marco Tabini & Associates, Inc — All Rights Reserved

NEW STUFF

What’s New!

php|architect launches php|tropics 2005

Ever wonder what it's like to learn PHP in paradise? Well, this year we've decided to give you a chance to find out!

We're proud to announce php|tropics 2005, a new conference that will take place between May 11-15 at the Moon Palace Resort in Cancun, Mexico. The Moon Palace is an all-inclusive (yes, we said all-inclusive!) resort with over 100 acres of ground and 3,000 ft of private beach, as well as excellent state-of-the-art meeting facilities.

As always, we've planned an in-depth set of tracks for you, combined with a generous amount of downtime for your enjoyment (and your family's, if you can take them along with you).

We even have a very special early-bird fee in effect for a limited time only.

For more information, go to http://www.phparch.com/tropics

Zend Technologies Unveils Integrated Software Platform

Zend has announced the unveiling of Zend Platform 1.1.

Zend Technologies, Inc., creator and ongoing innovator of PHP, products and services supporting the development, deployment and management of PHP-based applications, today unveiled Zend Platform 1.1. The newest member in the Zend family of products is the first integrated software platform that supports the reliability, scalability and interoperability requirements of business critical PHP applications. The product was developed based on direct feedback from hundreds of Zend customers currently using Zend products to develop and manage corporate applications, and is currently in use at Zend customer sites. Zend Platform adds a wide range of new functionality that speeds time to production and improves end user satisfaction by increasing the overall performance of enterprise applications. Zend Platform 1.1 is available immediately.

"As PHP matures and evolves, the need for an integrated solution for building and deploying business critical applications becomes more relevant," said Pamela Roussos, vice president of marketing at Zend. "Zend Platform is the first comprehensive lifecycle management solution for PHP users and is the only next generation infrastructure product that directly supports the development and deployment of business critical enterprise PHP applications. The feedback from customers was critical in our development of this solution, and directly addresses the needs in our user community."

For more information visit: http://www.zend.com

The Zend PHP Certification Practice Test Book is now available!

We're happy to announce that, after many months of hard work, the Zend PHP Certification Practice Test Book, written by John Coggeshall and Marco Tabini, is now available for sale from our website and most book sellers worldwide!

The book provides 200 questions designed as a learning and practice tool for the Zend PHP Certification exam. Each question has been written and edited by four members of the Zend Education Board, the very same group who prepared the exam. The questions, which cover every topic in the exam, come with a detailed answer that explains not only the correct choice, but also the question's intention, pitfalls and the best strategy for tackling similar topics during the exam.

For more information, visit http://www.phparch.com/cert/mock_testing.php

The development team has also released a new website for better information and communication purposes.

It is now also possible to download struts4php in the current version as a PEAR package from http://www.struts4php.org/pear/struts4php-current.tgz

For more information visit: http://www.struts4php.org

Check out some of the hottest new releases from PEAR.

PEAR 1.3.4

PEAR 1.3.4 fixes a serious problem caused by a bug in all versions of PHP that caused multiple registration of the shutdown function of PEAR.php, makes the pear help listing more useful by putting the how-to-use info at the bottom of the listing, and includes several bug fixes.

Net_Monitor 0.0.7

A unified interface for checking the availability of services on external servers and sending meaningful alerts through a variety of media if a service becomes unavailable.

I18Nv2 0.10.0

This package provides basic support to localize your application, like locale based formatting of dates, numbers and currencies.

Besides that, it attempts to provide an OS independent way to setlocale() and aims to provide language, country and currency names translated into many languages.

Net_FTP 1.3.0RC2

Net_FTP allows you to communicate with FTP servers in a more comfortable way than the native FTP functions of PHP do. The class implements everything natively supported by PHP and additionally features like recursive up- and downloading, directory creation and chmodding. It also implements an observer pattern to allow, for example, the display of a progress bar.

PHP_Fork 0.2.0

The PHP_Fork class is a wrapper around the pcntl_fork() functionality with an API modeled on the Java language.

Practical usage is done by extending this class and redefining the run() method.

This way PHP developers can enclose logic into a class that extends PHP_Fork, then execute the start() method that forks a child process. Communication with the forked process is ensured by using a shared memory segment; by using a user-defined signal and this shared memory, developers can access child process methods that return a serializable variable.

The shared variable space can be accessed with the two methods:

• void setVariable($name, $value)

• mixed getVariable($name)

$name must be a valid PHP variable name;

$value must be a variable or a serializable object.

Resources (db connections, streams, etc.) cannot be serialized and so they’re not correctly handled.

Requires PHP built with --enable-cli --with-pcntl --enable-shmop.

Only runs on *NIX systems, because Windows lacks the pcntl extension.

PHP 4.3.10 and 5.0.3 Released

The PHP Development Team would like to announce the immediate release of PHP 4.3.10 and PHP 5.0.3. These are maintenance releases that, in addition to non-critical bug fixes, address several very serious security issues. All users of PHP are strongly encouraged to upgrade to one of these releases as soon as possible.

For changes since PHP 4.3.9, please consult the PHP 4 ChangeLog. For changes since PHP 5.0.2, please consult the PHP 5 ChangeLog. For more information, visit http://www.php.net

MySQL Query Browser 1.1.5

MySQL.com announces the latest release of the MySQL Query Browser. MySQL.com claims: “MySQL Query Browser is the easiest visual tool for creating, executing, and optimizing SQL queries for your MySQL Database Server. The MySQL Query Browser gives you a complete set of drag-and-drop tools to visually build, analyze and manage your queries.”

For more information or to download, visit: http://www.mysql.com/products/query-browser

PHP awarded programming language of 2004

A new post on php.net announces PHP being awarded programming language of the year.

PHP has been awarded the Programming Language of 2004, according to the TIOBE Programming Community Index. This index uses information collected from the popular search engines, and is based on the world-wide availability of skilled engineers, courses and third party vendors. Congratulations to us all!

For more information visit: www.php.net

eAccelerator 0.9.1

eAccelerator announces their latest release, 0.9.1.

What is it? The eAccelerator SourceForge page describes it as “a further development from mmcache PHP Accelerator & Encoder. It increases [the] performance of PHP scripts by caching them in compiled state, so that the overhead of compiling is almost completely eliminated.”

Get more information from http://eaccelerator.sourceforge.net/Home

Looking for a new PHP Extension? Check out some of the latest offerings from PECL.

WinBinder 0.34.117

WinBinder is a new extension that allows PHP programmers to build native Windows applications. It wraps the Windows API in a lightweight, easy-to-use library so that program creation is quick and straightforward.

mailparse 2.1

Mailparse is an extension for parsing and working with email messages. It can deal with rfc822 and rfc2045 (MIME) compliant messages.

newt 0.3

PHP-NEWT is a PHP language extension for the RedHat Newt library, a terminal-based window and widget library for writing applications with a user friendly interface. Once this extension is enabled in PHP, it will provide the use of Newt widgets, such as windows, buttons, checkboxes, radiobuttons, labels, editboxes, scrolls, textareas, scales, etc.

Use of this extension is very similar to the original Newt API of the C programming language.

ssh2 0.4.1

Provides bindings to the functions of libssh2, which implements the SSH2 protocol.

libssh2 is available from http://www.sourceforge.net/projects/libssh2

phpMyFAQ 1.5.0 Alpha 2

phpmyfaq.de announces: “The second alpha version of phpMyFAQ 1.5.0 is available. This version is PHP5 compatible and introduces a faster template engine. LDAP support is now a selectable option and the traditional Chinese and Japanese language files were updated. Besides some code improvements we fixed a lot of bugs. Do not use this version in production systems, but test this version and report bugs!” Get the latest info from phpmyfaq.de

Editorial: Hello, Goodbye (continued)

with the blame bucket being passed around several hands rather than focusing everyone’s energy on ensuring not only that bugs would be fixed, but that people would be properly informed and made aware of them.

For my part, I’ve decided that we should help with what we know how to do best: informing people. On January 1st (talk about a New Year resolution!), we started a new mailing list dedicated exclusively to PHP security. You can read more about it in this month’s exit(0) column (which you’ll find at the end of the magazine), so I won’t bore you here by duplicating the details. I simply hope that the mailing list will help everyone keep security more in check; PHP has become so popular that we can no longer afford to hide beneath the folds of the Web and hope that no-one will find out about any of our weaknesses. They will, and we must be ready to deal with the consequences.

Until next month, happy readings!

Getting started

Before we get started with the actual coding, we need to get an overview of the format we are working with. OpenOffice.org documents are XML files stored inside ZIP archives, laid out in a simple directory structure. When you unzip such a document, you will simply get some plain XML files and directories.

Of course, in order to generate an OpenOffice.org document, we need to perform this process in reverse and create a ZIP file from the XML files that we create. In this article, I will show you exactly how you can use PHP to produce all the different files required to form a valid OpenOffice.org Writer document.

The files covered in this article (content.xml, styles.xml, meta.xml, settings.xml, manifest.xml and mimetype) are the bare minimum required to make up a document that OpenOffice.org will recognize as valid. If you, for example, have embedded an image inside your document, then this is also stored as a separate file in a folder named Pictures.

The content.xml file contains the actual text in your document. Headers, paragraphs, lists and tables are recorded in an XML format for content that is well documented, unlike the formats used by competitors like Microsoft Word. This is the file on which this article will focus for most of the time.

The styles.xml file contains the definition of the fonts, colors, sizes and other stylistic elements used by the document. To draw a parallel with HTML, this would be the equivalent of a CSS file, while content.xml would be the equivalent of the HTML document to which the stylesheet applies.

meta.xml contains “meta” information about the document. In this file, you can, for example, find the name of the document, its keywords and statistics like the total number of words that it contains. You can also store meta information, like Dublin Core, in meta.xml.

As you might know, OpenOffice.org is getting more and more users. This article will show you how you can generate documents from PHP that the Writer component of OpenOffice.org can read. It’s a follow-up to the author’s previous article, which appeared in the October 2004 issue of php|architect and dealt with extracting information from OpenOffice documents.

FEATURE

Generating OpenOffice.org Documents with PHP

Dublin Core is a meta data standard that defines a generic set of attributes used to describe a specific piece of information.

settings.xml is actually intended for the OpenOffice.org editor itself. It’s used to store GUI settings and, since this has nothing to do with content, we are not going to look into how we can store arbitrary settings into this file.

manifest.xml is a simple XML file that contains a reference to each file that makes up the document. The type of the document is stored in the mimetype file, which contains a single line with the MIME denomination of the document’s type, such as, for example, “application/vnd.sun.xml.writer” for a Writer file.
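As a minimal sketch, the mimetype file can be produced with a single write. The helper name writeMimetypeFile() and the choice to omit a trailing newline are assumptions made for illustration (the article only says the file holds a single line), and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: writes the one-line "mimetype" file for a Writer document.
function writeMimetypeFile(string $directory): string
{
    $path = rtrim($directory, '/') . '/mimetype';

    // The file holds only the MIME type named in the article.
    // Omitting a trailing newline is an assumption of this sketch.
    if (file_put_contents($path, 'application/vnd.sun.xml.writer') === false) {
        throw new RuntimeException("Could not write $path");
    }

    return $path;
}
```

The function returns the path it wrote, so the caller can add it to the list of files to package later.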

Getting the Content Right

Since the most important part of our document is (obviously) its content, I will start by showing you how we can generate a minimal content.xml file. As you can see in Listing 1, after the document type definitions we get to the main element, office:document-content, which, in turn, contains optional elements like office:script, office:font-decls and office:automatic-styles. In our example, they are empty.

The office:body element contains the document’s contents themselves, but, before we get into its details, we should also mention the text:sequence-decls element, which is used for numbering items in the document and defining in which order different items are numbered. In our sample document, I have just supplied the default order.

Immediately after the sequence declaration, we insert our actual content in the office:body element. In this minimal document, I have just added a small paragraph that displays the text “Hello World!”

Paragraphs and Inline Styles

The most common element that is usually added to a document is a simple paragraph. In the code snippet below, you can see the basic syntax of a minimal paragraph:

<text:p text:style-name='Standard'>Hello World!</text:p>

Notice that the namespace prefix text is used. This is used for all textual elements in the document, which makes it very simple to extract all textual information from it. The paragraph is also styled with the Standard style. The definition of this style can be found in the styles.xml file.
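When the paragraph text comes from user input or a database, it has to be escaped before it is placed between the tags, or a stray ampersand will make the file invalid. The following is a small sketch of how that could look; ooParagraph() is a hypothetical helper name, not something from the article, and ENT_XML1 requires PHP 5.4 or later.

```php
<?php
// Hypothetical helper: wraps plain text in a text:p element.
function ooParagraph(string $text, string $style = 'Standard'): string
{
    // Escape &, <, > and quotes so arbitrary text cannot break the XML.
    $safeText  = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
    $safeStyle = htmlspecialchars($style, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return "<text:p text:style-name='{$safeStyle}'>{$safeText}</text:p>";
}

echo ooParagraph('Hello World!'), "\n";
echo ooParagraph('Fish & chips < 5'), "\n";
```

The first call reproduces the paragraph shown above; the second shows the ampersand and the less-than sign being turned into entities.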

If you need to add whitespace in your text, you can use the text:s element. This is the spacing element: you define how many characters this space takes up in the text:c attribute. See the example below for a simple space definition which takes up 4 characters:

<text:s text:c="4"/>
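One possible way to apply this automatically is to replace every run of spaces in a string with a text:s element. The sketch below is a simplification under stated assumptions: ooPreserveSpaces() is a hypothetical name, single spaces are left untouched, and each run of two or more spaces is collapsed into one element carrying the run length.

```php
<?php
// Hypothetical helper: keeps runs of spaces intact by emitting text:s elements.
function ooPreserveSpaces(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // A single space is left alone; a run of N >= 2 spaces becomes one
    // text:s element whose text:c attribute carries the count N.
    return preg_replace_callback(
        '/ {2,}/',
        function (array $match): string {
            return '<text:s text:c="' . strlen($match[0]) . '"/>';
        },
        $escaped
    );
}
```

The result is meant to be placed inside a text:p element, in the same way as the four-character example above.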

Inside paragraphs, you normally have formatting elements like bold, italic and underline. In OpenOffice.org, these styles do not have any matching tags; a text:span tag to which different styles are attached is used instead. All unique spans are given a style name with the text:style-name attribute. The definition of these styles is actually found in the content.xml file, under the office:automatic-styles element.

Listing 1

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-content PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
...
<text:sequence-decl text:display-outline-level='0' text:name='Illustration'/>
<text:sequence-decl text:display-outline-level='0' text:name='Table'/>
<text:sequence-decl text:display-outline-level='0' text:name='Text'/>
<text:sequence-decl text:display-outline-level='0' text:name='Drawing'/>
...

To make some text bold, we therefore first need to create a definition of the style.

Every style that is part of the automatic styles group is defined with the style:style element and given a unique name. Its properties are set using the style:properties element, that is, the same way as all styles are defined in styles.xml. In our style, we simply defined the fo:font-weight to be bold. This is the basic definition of bold text.

Once we have created a style definition for it, we can use the T1 style to mark bold text as such. We simply add a span element around the text we want to affect and set the style name to T1. Below, you see an example of how we mark text as being bold in our content:

<text:p text:style-name='Standard'>Here are <text:span text:style-name="T1">some bold text.</text:span></text:p>
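To tie the two halves together, here is a hedged sketch that produces both the style definition and the span that refers to it. The element and attribute names style:style, style:properties, fo:font-weight and text:span come from the text above; the helper names and the style:family='text' attribute are assumptions of this example, since the article's own style listing is not reproduced here.

```php
<?php
// Hypothetical helper: an automatic text style that only sets the font weight.
// $name is assumed to be a simple identifier such as "T1".
function ooBoldStyleDefinition(string $name): string
{
    // style:family='text' is an assumption of this sketch.
    return "<style:style style:name='{$name}' style:family='text'>"
         . "<style:properties fo:font-weight='bold'/>"
         . "</style:style>";
}

// Hypothetical helper: wraps text in a span that refers to a named style.
function ooSpan(string $text, string $styleName): string
{
    $safeText = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return "<text:span text:style-name='{$styleName}'>{$safeText}</text:span>";
}

// The definition belongs under office:automatic-styles, the paragraph in the body.
$styleXML   = ooBoldStyleDefinition('T1');
$contentXML = "<text:p text:style-name='Standard'>Here are "
            . ooSpan('some bold text.', 'T1')
            . "</text:p>";
```

Keeping the definition and the span in separate helpers mirrors the format itself, where the style lives in one place and is only referenced by name from the text.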

Headers

Headers are important parts of any document, as they are used to structure content. They are defined at the same level as paragraphs using the text:h element. As with all other text elements, you define the style of the element separately. In addition, you need to define the level of the header using the text:level attribute. Here’s an example of a level-1 header element. To create a header with another level, you simply change the level and style attributes.

<text:h text:style-name="Heading 1" text:level="1">A header</text:h>
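Since the level and the style name change together, a small helper can keep them in sync. This is only a sketch: ooHeading() is a hypothetical name, and it assumes that the style for level N is called "Heading N", following the pattern of the example above.

```php
<?php
// Hypothetical helper: builds a text:h element for the given level.
function ooHeading(string $text, int $level = 1): string
{
    if ($level < 1) {
        throw new InvalidArgumentException('Header level must be 1 or higher');
    }

    $safeText = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // Assumes the style for level N is named "Heading N".
    return "<text:h text:style-name='Heading {$level}' text:level='{$level}'>"
         . $safeText
         . "</text:h>";
}
```

Calling ooHeading('A header') yields the level-1 header from the example, while ooHeading('Details', 2) switches both attributes to the second level.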

Images

Of course, any nice-looking document contains images. The OpenOffice.org document format is a collection of files, and images are no exception. You need to put any image you want displayed in your documents in a subdirectory called Pictures. This image, in turn, needs to be referenced in the content.xml file. You can place images inside paragraphs using the draw:image element, which defines the size at which the image is displayed in the document. We also need to supply the path to the image, which is relative to the root of your document package; this is done with the xlink:href attribute.

In order to calculate the size of the image, we need to translate pixels into inches. To do so, we must find the size of the image in pixels, which can be done (for example) using the getimagesize() standard PHP function. We also need the DPI (dots-per-inch) setting for the image. For example, 75 is a common setting for low-resolution images. If you use a high quality printer or publisher, images usually need to be at least 300 DPI. Once we know how many dots (which in our case is the same as pixels) we have, we can easily calculate the size of the image in inches. The formula for this is:

Size in inches = pixel width / DPI

For example, a 300-pixel-wide image printed at 75 DPI will be 300 / 75 = 4 inches wide. The same calculation is used for the height. In PHP, the size calculation can be done as shown below.

// Assumes a 75 DPI image; getimagesize() returns the width and height in pixels.
$fileName = "/path/to/myimage.jpg";
$sizeArray = getimagesize( $fileName );
$width = $sizeArray[0] / 75;   // width in inches
$height = $sizeArray[1] / 75;  // height in inches

Note that this is just one example. You can also use the EXIF extension to retrieve the number of DPI.
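The same arithmetic can be wrapped in a reusable function so that the DPI value is a parameter rather than a hard-coded 75. The function names below are hypothetical and only meant as a sketch of the calculation described above.

```php
<?php
// Hypothetical helper: converts a pixel dimension to inches for a given DPI.
function pixelsToInches(int $pixels, int $dpi = 75): float
{
    if ($dpi <= 0) {
        throw new InvalidArgumentException('DPI must be a positive number');
    }

    return $pixels / $dpi;
}

// Hypothetical helper: returns array(width, height) in inches for an image file.
function imageSizeInInches(string $fileName, int $dpi = 75): array
{
    $size = getimagesize($fileName);
    if ($size === false) {
        throw new RuntimeException("Could not read image size of $fileName");
    }

    return array(pixelsToInches($size[0], $dpi), pixelsToInches($size[1], $dpi));
}
```

With these helpers, pixelsToInches(300, 75) gives 4 inches, matching the worked example, and the same image at 300 DPI would come out at 1 inch.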

Lists

You may also want to have some lists inside your document. These are placed at the same level as paragraphs and headers in the content XML file. You can have both unordered and ordered lists using the text:unordered-list and text:ordered-list elements respectively. The list contains one or more text:list-item elements, which, in turn, enclose the content for each list item: normally just a paragraph with some text inside, but (at least in theory) as complex as you need it to be. Lists, like all other block elements, are also styled with the text:style-name attribute. OpenOffice.org normally uses L1, L2, and so on for naming list styles, but that’s just an arbitrary convention you can choose to ignore in favour of your own flavour if you like. Each list item holds a paragraph such as this one:

<text:p text:style-name="P1">Item text</text:p>
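The nesting described above (list, then list items, then a paragraph per item) can be sketched as a loop over an array of strings. ooList() and its parameter names are hypothetical; the element names and the L1 and P1 style names are the ones mentioned in the text.

```php
<?php
// Hypothetical helper: builds an unordered or ordered list from plain strings.
function ooList(array $items, bool $ordered = false, string $listStyle = 'L1', string $itemStyle = 'P1'): string
{
    $tag = $ordered ? 'text:ordered-list' : 'text:unordered-list';
    $xml = "<{$tag} text:style-name='{$listStyle}'>";

    foreach ($items as $item) {
        // Each item is a text:list-item wrapping a single paragraph.
        $safe = htmlspecialchars((string) $item, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $xml .= "<text:list-item>"
              . "<text:p text:style-name='{$itemStyle}'>{$safe}</text:p>"
              . "</text:list-item>";
    }

    return $xml . "</{$tag}>";
}
```

For instance, ooList(array('First', 'Second')) produces an unordered list with two items, and passing true as the second argument switches to an ordered list.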

Writing the XML file

We have now looked at how we can build the content.xml file, which is the most important portion of our document. In fact, if we do not care about what styles are used, the only thing we have to worry about is the content file, and we can use just a set of standard templates for the remainder of the document components.

In my previous article about OpenOffice.org and PHP, I showed how you can use PHP to parse the XML files using a standard PHP DOM XML parser and XSLT transformations. We can also use a DOM XML library to generate our XML documents, but, in this article, I will use a much simpler approach: we’ll just write the XML text directly to a file.

In my production code, I also do not use a DOM library to generate the XML, since doing so would make the code quite a bit slower. My code, in fact, is as simple as the snippets in this article. Essentially, I place the XML code making up our content.xml file in a variable, which I named $contentXML, and then write it to disk.
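The manual approach of opening, writing and closing the file could look roughly like the sketch below. This is not the author's original snippet: writeXmlFile() is a hypothetical helper, the XML string is a shortened placeholder, and the temporary directory stands in for wherever the document files are collected.

```php
<?php
// Hypothetical helper: writes an XML string to disk with fopen/fwrite/fclose.
function writeXmlFile(string $path, string $xml): void
{
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path for writing");
    }

    $written = fwrite($handle, $xml);
    fclose($handle);

    // fwrite() returns the number of bytes written, or false on failure.
    if ($written !== strlen($xml)) {
        throw new RuntimeException("Could not write all of the data to $path");
    }
}

// Placeholder content; a real document would hold the full content.xml markup.
$contentXML = "<?xml version='1.0' encoding='UTF-8'?>\n<!-- document content goes here -->\n";
$target     = sys_get_temp_dir() . '/content.xml';

writeXmlFile($target, $contentXML);
```

Checking the return values costs a few lines, but it avoids packaging a truncated file if the disk is full or the directory is not writable.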

You can see a complete (but still simple) XML document for content.xml in Listing 2.

Listing 2

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-content PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
...
<text:sequence-decl text:display-outline-level='0' text:name='Illustration'/>
<text:sequence-decl text:display-outline-level='0' text:name='Table'/>
<text:sequence-decl text:display-outline-level='0' text:name='Text'/>
<text:sequence-decl text:display-outline-level='0' text:name='Drawing'/>
</text:sequence-decls>
<text:h text:style-name="Heading 1" text:level="1">A header</text:h>
<text:p text:style-name='Standard'>Hello World!</text:p>
<text:p text:style-name='Standard'>Some spaces: <text:s text:c="4"/></text:p>
<text:p text:style-name='Standard'>Here are <text:span text:style-name="T1">some bold text.</text:span></text:p>
...

If you are using PHP 5, you can actually save quite a few lines of code by using the file_put_contents() function instead of opening and writing to the file manually. An example of this approach is shown in the code snippet below. This little script performs the same operation as the manual approach: first, it opens the file; then it writes the contents to it, and finally closes it. It just does so with only two lines of code. We do not need to make it any harder than this, and we’ll use the same technique to generate the other XML files that are part of our document as well.

$contentXML = "<?xml version='1.0' ";
file_put_contents( "path/to/content.xml", $contentXML );

Styling our Document

If you want to introduce your own custom design elements in your document, you’ll need to work with the styles.xml file. As mentioned earlier, this is where you define fonts, colors, sizes and positions, and so on.

Let’s use headers as an example. In Listing 3, you can see a minimal styles.xml file containing the style definition for “Heading 1.” Our style definitions are placed directly under the office:styles element.

A style is defined with the element style:style. Each style needs a unique name, which is set by using the style:name attribute. The style:family attribute, on the other hand, defines the type of style used. For headers, we use paragraph styles, since headers are rendered basically in the same way as a paragraph. Every style has a “parent” style, which is used as a base to set the underlying “defaults” of the style. The parent style is defined with the style:parent-style-name attribute. Finally, the last attribute in our style definition is style:type, which, in our case, is just “text”.

The actual properties of the style, such as font size and color, are defined in the style:properties element, which is a child of style:style. In our example, we have defined the font size with the fo:font-size attribute, the font weight as bold with the fo:font-weight attribute and, finally, the color with the fo:color attribute. The color is defined as a hex triplet, the same way as you define colors in CSS and HTML, which is something you’re probably familiar with.
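As an illustration of how such a definition could be assembled in PHP, the sketch below builds one paragraph style from a name, a parent and a list of properties. It is not a copy of Listing 3: ooParagraphStyle() is a hypothetical helper, the parent name "Heading" and the property values are made-up examples, and only the attributes whose names are given in the text are emitted.

```php
<?php
// Hypothetical helper: builds one paragraph style definition for styles.xml.
// $name and $parent are assumed to be plain style names without quotes.
function ooParagraphStyle(string $name, string $parent, array $properties): string
{
    $attributes = '';
    foreach ($properties as $attribute => $value) {
        $safe = htmlspecialchars((string) $value, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $attributes .= " {$attribute}='{$safe}'";
    }

    return "<style:style style:name='{$name}' style:family='paragraph'"
         . " style:parent-style-name='{$parent}'>"
         . "<style:properties{$attributes}/>"
         . "</style:style>";
}

// Illustrative values only: an 18pt, bold, dark blue heading.
$headingStyle = ooParagraphStyle('Heading 1', 'Heading', array(
    'fo:font-size'   => '18pt',
    'fo:font-weight' => 'bold',
    'fo:color'       => '#000080',
));
```

Passing the properties as an array keeps the helper generic, so the same function can describe other paragraph styles by changing only the data.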

The Manifest

The manifest.xml file contains information about all the files that make up the document. This is an XML file with two basic elements. The main node, called manifest:manifest, contains one element for each file. The specific files are then defined using the manifest:file-entry element. The MIME type of each file is declared in the manifest:media-type attribute, and its path with manifest:full-path. Listing 4 contains the manifest.xml file for our basic document with one image.

Listing 3

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE office:document-styles PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>
...

Listing 4

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE manifest:manifest PUBLIC '-//OpenOffice.org//DTD Manifest 1.0//EN' 'Manifest.dtd'>
<manifest:manifest xmlns:manifest='http://openoffice.org/2001/manifest'>
<manifest:file-entry manifest:media-type='application/vnd.sun.xml.writer' manifest:full-path='/'/>
<manifest:file-entry manifest:media-type='' manifest:full-path='Pictures/'/>
<manifest:file-entry manifest:media-type="image/jpeg" manifest:full-path="Pictures/myimage.jpg"/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='content.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='styles.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='meta.xml'/>
<manifest:file-entry manifest:media-type='text/xml' manifest:full-path='settings.xml'/>
</manifest:manifest>
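Because the manifest is just one entry per file, it lends itself to being generated from an array. The sketch below follows the structure of Listing 4; ooManifest() is a hypothetical helper name and the file list is the one from the listing, with the picture declared as a JPEG.

```php
<?php
// Hypothetical helper: builds manifest.xml from a map of path => media type.
function ooManifest(array $files): string
{
    $xml  = "<?xml version='1.0' encoding='UTF-8'?>\n";
    $xml .= "<!DOCTYPE manifest:manifest PUBLIC '-//OpenOffice.org//DTD Manifest 1.0//EN' 'Manifest.dtd'>\n";
    $xml .= "<manifest:manifest xmlns:manifest='http://openoffice.org/2001/manifest'>\n";

    foreach ($files as $path => $mediaType) {
        $safePath = htmlspecialchars((string) $path, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $safeType = htmlspecialchars((string) $mediaType, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $xml .= "<manifest:file-entry manifest:media-type='{$safeType}'"
              . " manifest:full-path='{$safePath}'/>\n";
    }

    return $xml . "</manifest:manifest>\n";
}

$manifestXML = ooManifest(array(
    '/'                    => 'application/vnd.sun.xml.writer',
    'Pictures/'            => '',
    'Pictures/myimage.jpg' => 'image/jpeg',
    'content.xml'          => 'text/xml',
    'styles.xml'           => 'text/xml',
    'meta.xml'             => 'text/xml',
    'settings.xml'         => 'text/xml',
));
```

Adding another picture then only means adding one more key to the array, which keeps the manifest in step with the files that are actually packaged.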

Meta information

If we want to add meta information to our document, we can store it in meta.xml. At a minimum, information like document generator, creation date, number of edits and document statistics is stored here. However, you can store much more data in this file if you like. For example, if you need to comply with the Dublin Core meta data standard, you can store that information in this file. You can see an example in Listing 5.

Settings

The settings.xml file is only used by the OpenOffice.org Writer editor to remember which toolbars were open and which GUI settings the user had enabled when a document was last being edited. Since we won’t be doing any editing, we are not going to look much into this document other than to generate a very minimally valid version of it. Listing 6 shows the minimal XML document for settings.xml; you can just hard-code this value in a variable and reuse it as needed without worrying about it anymore.

Packing the Final Document

We have now looked at the kind of XML used in all the different files that make up an OpenOffice.org document. The content.xml and styles.xml files, in that order, are by far the most important portions of the document. I also showed you how you can manipulate your XML in PHP without the aid of any specialized library, so, if you have stored all the XML files in one directory and done the same with your picture, we are ready to package our document. In fact, all that's left to do is to ZIP all the files in this folder and rename the resulting file so that its extension is sxw. Voila!

The simplest way of creating the ZIP archive is to run the zip command found on most operating systems directly from PHP. This can be done by using PHP's exec() function to execute the command directly from the directory where you have stored the document files. The code snippet below shows a simple way of changing to the directory where you have stored the files and zipping them into an OpenOffice.org writer file:

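// Change to the directory that holds the document files
chdir("path/to/files");

// Zip everything in it, recursively, into the .sxw file
exec("zip -r /oo_document_name.sxw *");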
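On systems where the zip command is not available, PHP's ZipArchive class (part of the zip extension, which is newer than this article) can do the same job. The following is only a sketch under that assumption; packSxw() is a hypothetical name, and the paths in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: packs every file under $sourceDir into $target.
// Requires the zip extension (ZipArchive).
function packSxw(string $sourceDir, string $target): bool
{
    $zip = new ZipArchive();
    if ($zip->open($target, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        return false;
    }

    $sourceDir = rtrim($sourceDir, '/\\');
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($sourceDir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        // Store each path relative to the document root, with forward slashes.
        $relative = substr($file->getPathname(), strlen($sourceDir) + 1);
        $zip->addFile($file->getPathname(), str_replace('\\', '/', $relative));
    }

    return $zip->close();
}

// Usage (placeholder paths):
// packSxw('path/to/files', 'oo_document_name.sxw');
```

Unlike the exec() call, this version does not depend on the current working directory, so no chdir() is needed.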

If you are not dynamically changing the contents of the meta.xml and mimetype files, there is no reason to actually generate them from PHP. In fact, even manifest.xml does not really need to be correct: I have created documents that did not have images properly defined in the manifest and still worked fine. Of course, you should still generate these files correctly every time in order to comply with the document standard.

<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE office:document-meta PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>

<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE office:document-settings PUBLIC '-//OpenOffice.org//DTD OfficeDocument 1.0//EN' 'office.dtd'>

On Not Re-inventing the Wheel

If you want to get started quickly with OpenOffice.org writer document generation, you do not necessarily have to write all the code yourself. There are already libraries that can take care of the dirty work for you. For example, the eZ publish content management system has a library for generating OpenOffice.org writer documents. This library is part of the public extension libraries in the product. PHPDocWriter is another library that enables you to generate OpenOffice.org writer documents easily in PHP. There are probably other libraries out there as well, so you can decide for yourself whether to write your own generator or use an existing one.

Use Cases

At this point, you should have a pretty good idea of how you can generate an OpenOffice.org writer document with PHP. As you can see, this is much more work than simply typing up the document in OpenOffice.org, except, of course, for those cases where you can't. A typical use case for this technology is a CMS that needs to export its content to a standard document format, thus lowering the investment of time and money required to manipulate and reuse it. This is the same method I used when writing the content export functionality from eZ publish to OpenOffice.org.

The Future

The OpenOffice.org XML format will change in the future. Currently, members of OASIS (Organization for the Advancement of Structured Information Standards) are working together on creating an open XML standard for office applications based on the existing OpenOffice.org XML format. Their goal is to create an even more generic content format that can be used by different applications; even CMS systems are covered by their mission statement. Since this new standard is based on the OpenOffice.org XML file format, you shouldn't need massive changes to support it when it is released. Some people even speculate that this format could be adopted by Microsoft Word so that, in the future, you will be able to save your office application data in the same format regardless of which application you use. The OASIS Open Office XML format is currently available as a draft that has been approved by the OASIS Open Office Technical Committee.

Final Words

We have now looked at how we can generate an OpenOffice.org writer document containing the most common formatting features used in word processing. This article, of course, only scratches the surface of what you can store in a document by using the OpenOffice.org XML file format, but, hopefully, it gives you some ideas on how you can harness the power of this open-source office suite in your own application. Generating OpenOffice.org spreadsheets and presentation documents is done in much the same way and, by reading up on the specifications of each XML file format, you should be able to create your own in no time.

Generating OpenOffice.org Documents with PHP

To Discuss this article:

http://forums.phparch.com/192

Bård Farstad is one of the three co-founders of eZ systems. He has been working professionally with CMS development since 1999 and is the author of many general-purpose libraries, such as an XML parser, a SOAP library (client/server) and an XML-RPC library (client/server). He is also one of the main developers of the eZ publish CMS. In his spare time, he likes to play with his daughter and play the guitar; he also enjoys "aquascaping," the art of decorating aquaria. You can reach Bård via e-mail at bf@ez.no.

Resources

OpenOffice.org homepage: http://openoffice.org

OpenOffice.org XML file format: http://xml.openoffice.org

eZ publish homepage: http://ez.no

eZ publish public extension repository: http://zev.ez.no/svn/extensions/

PHP DocWriter: http://phpdocwriter.sourceforge.net/

OASIS Open Office XML Format: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office

FEATURE

Sitting in his comfy fortieth-story office, he effortlessly presses a button, sending a tracking robot into the sky from the roof of the huge skyscraper. The robot swoops down towards the ground, stopping just feet away from the top of its target, a late-model truck hauling some sort of cargo. The robot follows the vehicle and takes a picture every few seconds; it then sends the photos to the fortieth-story office using its Wi-Fi antenna. When the photos are received, a dot is placed on a map on a computer monitor in the office, showing where they were taken and tracking the target all the way to its destination.

Of course, this story is fiction, except for the images that were taken. The images mentioned in the story were encoded with the longitude and latitude received from GPS satellites. With the advancement of digital photography and GPS technology, we now have access to these types of images.

I remember when I first saw a GPS device. It was about six years ago: I was in an archeology class in college, and the only one who could touch the device was the teacher because it was so expensive. The device we used in that class was very basic and only told the direction and the longitude and latitude. We had to use it in the shade because the LCD display was so dim. But it was cool to be able to use the device, because not many people had them at the time. These days, GPS devices are almost as cheap as free, much more sophisticated and wrapped in much smaller packages, and you can actually see what's on the LCD display right in the sunshine. Because the space that a GPS device needs to function has been reduced, they're ending up in very unusual but useful places.

More and more cell phones hit the market every year, and every manufacturer is adding new technology to their continually shrinking devices. One of the more recent developments is the integration of GPS into cell phones. Besides just being able to tell where you are at any given second, this new feature opens many doors for navigation and tracking with the phone. It definitely gives a new meaning to the words "big brother," though.

Along with GPS has come the inclusion of a camera in the phone. Sure, digital cameras have been in phones for a while, much longer than GPS, but with the new feature a melding of the two has occurred. If your phone is equipped with GPS and also has a digital camera built into it, you should be able to send images via email, or even download them to your computer later, that have the latitude, longitude, direction and even the speed at which the photo was taken attached to them. The information is not visible on the photo itself; it is embedded into the image's headers when the photo is taken. Since the information is in the photos, it is just a matter of retrieving it from the image. You will then be able to plot it on a map to show where the photo was taken, just like in my fictional short story above.

Ever wonder where a photo was taken? We've all seen exotic or interesting photos, but really had no way of pinpointing their location. Now, with the marriage of GPS and digital photography, we can know the exact location of where our favorite photos were taken.

When I first heard about the GPS data included with images from cell phones, I figured it was sent as a line in an email, totally separate from the image. Some early adopters of the technology did go this route, but, for the most part, today's images actually have the GPS information placed into the EXIF tags located in the JPEG image that is sent.

What is EXIF?

Here's a definition of EXIF:

Exchangeable image file format, a JPEG-encoded file format for digital cameras that has similar tags to TIFF.

Most digital cameras on the market, from professional to cell-phone quality, use this standard to include tags in the headers that provide information about the photo. Some are so detailed that they'll give you the type of camera, the camera settings, the resolution of the picture, the time and date it was taken and a lot more. Having these tags embedded into the photo lets the photographer or viewer know how and even where the photo was taken, and the tags can be viewed at any time without having to go to a separate file.

The standard was actually created in 1995 by JEIDA (Japan Electronic Industries Development Association), a predecessor of today's JEITA (Japan Electronics and Information Technology Industries Association), to standardize not only the way the image was recorded, but also the attributes of the photo. Since there is a standard, many programs that are not related to any type of camera or brand can view these headers and use the information for other applications. You can even install and configure PHP with the ability to view EXIF headers directly without using a third-party tool, which, incidentally, is what we'll be doing in a little while.
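For readers whose PHP build includes the exif extension, the built-in route mentioned above can be sketched as follows. This is an illustration only and not the approach the rest of the article takes: readGpsFields() and exifRationalToFloat() are hypothetical helper names, and the sketch assumes the GPS values arrive as rational strings such as "5543/100", which is how exif_read_data() normally reports them.

```php
<?php
// Sketch only: uses PHP's bundled exif extension rather than Exifer.

// EXIF stores each GPS field as a rational such as "5543/100".
function exifRationalToFloat(string $rational): float
{
    $parts = explode('/', $rational);
    $numerator   = (float) $parts[0];
    $denominator = isset($parts[1]) ? (float) $parts[1] : 1.0;

    return $denominator == 0.0 ? 0.0 : $numerator / $denominator;
}

// Hypothetical helper: returns the GPS fields as floats, or null when the
// extension, the file or the GPS tags are missing.
function readGpsFields(string $path): ?array
{
    if (!function_exists('exif_read_data') || !is_file($path)) {
        return null;
    }

    $exif = @exif_read_data($path);
    if ($exif === false || !isset($exif['GPSLatitude'], $exif['GPSLongitude'])) {
        return null;
    }

    return [
        'latitude'  => array_map('exifRationalToFloat', (array) $exif['GPSLatitude']),
        'longitude' => array_map('exifRationalToFloat', (array) $exif['GPSLongitude']),
        'latRef'    => $exif['GPSLatitudeRef'] ?? null,   // "N" or "S"
        'lonRef'    => $exif['GPSLongitudeRef'] ?? null,  // "E" or "W"
    ];
}
```

The reference fields matter later on: a southern latitude or western longitude has to be turned into a negative number before it is plotted.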

In this article, we'll see how to pull out the longitude and latitude from a photo's EXIF headers using a library called Exifer, written by Jake Olefsky. We will then format the GPS information into a usable format and plot the location of the photo onto a world map.

The sample image that we are using (Figure 1) actually came from a professional digital camera. The photographer used a program to include in the EXIF headers GPS information from a handheld GPS device, which recorded the location and time of the photographer throughout his day of photography. Whether you get a photo from a cell phone with GPS or from a professional digital camera that had the GPS info added later on, the longitude and latitude will be placed into the EXIF headers in the same format.

Let’s Get Plotting

Before you get started, you will need to grab the latest version of Exifer (I used version 1.4 for this example) at http://www.jakeo.com/software/exif/. This is the library we will use to get the longitude and latitude information. I chose this route because it was the simplest to get up and running and doesn't require any installation or configuration of PHP apart from placing the files onto your web server.

We will have to use some sort of library to parse through and decode these EXIF tags. For the most part, the tags are not in any human-readable format. If you open the image in some sort of text editor, you will see just a bunch of gibberish, not really anything you will be able to work with.

Figure 1

Copyright © FreeFoto (freefoto.com), Ian Britton. (A photograph of a communications tower, but this photo is able to communicate other things besides just the image.)

The first listing (Listing 1) includes the exif.php file from the Exifer library so that we can pull the EXIF information from our image. The $path variable indicates where on our server the image is located. We then set $verbose to zero. If verbose is set to one, the result will include the image's raw output as well, which, in our case, would just end up as extra overhead in the array. The last line of the listing actually grabs the EXIF tags from the image, reads through them and places them into an array from which we can then easily pull out any of the information.
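The steps just described can be sketched as a small wrapper. It is an approximation, not Listing 1 itself: it assumes that Exifer's exif.php defines a function called read_exif_data_raw($path, $verbose), as Exifer 1.x is understood to do, and loadExifWithExifer() is a hypothetical name introduced here.

```php
<?php
// Hypothetical wrapper around Exifer. It assumes the library's exif.php
// defines read_exif_data_raw($path, $verbose).
function loadExifWithExifer(string $libraryFile, string $imagePath, int $verbose = 0): ?array
{
    if (!is_file($libraryFile) || !is_file($imagePath)) {
        return null; // library or image not found
    }

    require_once $libraryFile;

    if (!function_exists('read_exif_data_raw')) {
        return null; // not the Exifer version this sketch expects
    }

    // $verbose = 0 leaves the raw tag data out of the resulting array.
    $result = read_exif_data_raw($imagePath, $verbose);

    return is_array($result) ? $result : null;
}

// Usage (placeholder paths):
// $result = loadExifWithExifer('exif.php', 'images/photo.jpg');
```

Returning null for every failure keeps the calling script simple: it only has to check for one value before reading the GPS entries.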

Included in the Exifer library is the ability to view a list of all available information and attributes for a photo. You can run the index file that comes with the Exifer package; it basically reads the array created by the exif.php file and outputs it to the screen.

Listing 2 actually pulls the longitude and latitude out of the photo. In most photos that carry this information, each coordinate is formatted as hh mm ss (the listings call the first field hours, although strictly it holds degrees, followed by minutes and seconds) and looks something like (34 55.43 23.44) when retrieved from the array. With our example image, the format is just hh mm.

Listing 3 actually breaks the longitude and latitude into an array of hours, minutes and seconds. Because we get the longitude and latitude in this format, our coordinates need to be converted into a single number with a decimal point, something like (35.445845). Listing 4 takes each element of the array and converts each set of numbers to give us two usable numbers that we can plot on our world map. The conversion method is as follows:

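First, the seconds are divided by 60 and added to the minutes. The combined minutes are then divided by 60 in turn and added to the hours, so that the final value is the hours together with the minutes which were combined with the seconds:

hh + (mm + ss/60) / 60

$lats = $result['GPS']['Latitude'];
$lons = $result['GPS']['Longitude'];

Listing 2

$lat_ar = explode(" ", $lats);
$lon_ar = explode(" ", $lons);

Listing 3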
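The conversion described above can be sketched as a single function. This is not the article's Listing 4; dmsToDecimal() is a hypothetical name, and the sketch assumes the space-separated "hh mm ss" strings shown earlier, with any missing field treated as zero.

```php
<?php
// Sketch of the conversion step: "hh mm ss" (or just "hh mm") to a decimal.
function dmsToDecimal(string $coordinate): float
{
    // Split on any run of whitespace; missing fields default to zero.
    $parts = preg_split('/\s+/', trim($coordinate));

    $hours   = isset($parts[0]) ? (float) $parts[0] : 0.0;
    $minutes = isset($parts[1]) ? (float) $parts[1] : 0.0;
    $seconds = isset($parts[2]) ? (float) $parts[2] : 0.0;

    // Seconds fold into minutes, minutes fold into hours.
    return $hours + ($minutes + $seconds / 60) / 60;
}

$latitude  = dmsToDecimal('34 55.43 23.44');
$longitude = dmsToDecimal('51 30');
```

Note that this sketch ignores the hemisphere: coordinates south of the equator or west of Greenwich would still need their sign flipped before plotting.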

In Listing 5, I have included my home town's longitude and latitude to plot on the map. This also serves as a way to check the accuracy of the plotting. If the dot on the map is not where it should be, you will know something is wrong with the code or the map. You can, of course, replace this information with your own area's longitude and latitude. Most of the map services on the Internet should allow you to translate your address into longitude and latitude coordinates. I have found that http://www.maporama.com covers most countries around the world and gives you the longitude and latitude in an easy-to-read format along the left side of its site.

Now we come to actually plotting the locations on a map. There are many different ways to do so, but, for this example, we will use one of the easiest and most basic methods by taking advantage of what is called a cylindrical projection map. You can use the same map I used, or you can find your own cylindrical projection map. The map can be larger or smaller and it will not affect this code at all. Most cylindrical projection maps will work, as long as they do not have any type of border around them. Even the simplest border could throw off the calculations in later listings. I have tested this program using the abstract maps from http://flatplanet.sourceforge.net/maps/ and it works just fine, even though the maps on this site are much bigger than our example map.

Also, in Listing 5 we start to create our map. I use the imagecolorresolve function to define my colors because I found that it is more accurate than imagecolorallocate with the map I have chosen. One color, $red1, will be used for the dot that marks my home town, while another, $red2, will be used to mark where the photo was taken when we plot the two on the map.

$im = imagecreatefromjpeg("map.jpg");

// Marker colors: $red1 for the home town, $red2 for the photo location
$red1 = imagecolorresolve($im, 255, 0, 0);
$red2 = imagecolorresolve($im, 255, 0, 0);

Listing 5

$width = imagesx($im);
$height = imagesy($im);

Listing 6

Listing 6 grabs the dimensions of the image so that we can use them in the math that plots the locations.

Listing 7 is the method for getting an x and y position to plot on the map using the previously converted longitude and latitude coordinates from our photo and the height and width of the map. It then draws a red square on the map image at the location where the photo was taken. Again, this is probably one of the simplest ways of plotting a location onto a map, and definitely not the only or the best one.
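The math behind that step can be sketched as follows. It is an assumption-laden illustration rather than Listing 7: it takes the map to be a borderless equirectangular projection (left edge at -180 degrees, right edge at +180, top at +90, bottom at -90), expects signed decimal coordinates, and uses the hypothetical names lonLatToPixel() and plotMarker().

```php
<?php
// Assumes a borderless equirectangular world map:
// x runs from -180 (left) to +180 (right), y from +90 (top) to -90 (bottom).
function lonLatToPixel(float $lon, float $lat, int $width, int $height): array
{
    $x = ($lon + 180) * ($width / 360);
    $y = (90 - $lat) * ($height / 180);

    return [(int) round($x), (int) round($y)];
}

// Draws a small filled square centred on ($x, $y). Requires the GD extension.
function plotMarker($im, int $x, int $y, int $color, int $half = 2): void
{
    imagefilledrectangle($im, $x - $half, $y - $half, $x + $half, $y + $half, $color);
}

// Usage with the variables from Listings 5 and 6:
// list($x, $y) = lonLatToPixel($longitude, $latitude, $width, $height);
// plotMarker($im, $x, $y, $red2);
```

Because the pixel position is a fixed fraction of the image size, the same code works for a larger or smaller map, which matches the earlier remark that the map's size does not matter.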

Listing 8 uses the same method and creates a slightly darker square to show my home town as a reference point to check the accuracy of the plotting.

Listing 9 now displays the newly created map in your browser. If everything goes to plan, you should see the same results as Figure 2: a world map with the location where the image was taken marked in the United Kingdom. If you have replaced my home town's coordinates with yours, you should see another dot locating your area; otherwise, the dot should be around the Mexico border. You may want to check the accuracy of the plotting against another mapping system like http://geoengine.nima.mil/muse-cgi-bin/rast_roam.cgi. This site actually has a lot of useful world maps, as well as black-and-white aerial photo maps taken from satellites, that you can plot coordinates onto and also download in a high- or low-resolution format.
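Sending the finished image to the browser takes only a content-type header and a call to imagejpeg(). The sketch below is an illustration under assumptions, not Listing 9: outputMap() is a hypothetical name, and the optional $file parameter was added here so that the map can also be saved to disk.

```php
<?php
// Sends the finished map to the browser, or writes it to $file when one is
// given. Requires the GD extension with JPEG support.
function outputMap($im, ?string $file = null): bool
{
    if ($file === null) {
        // The header must be sent before any other output.
        header('Content-Type: image/jpeg');
        return imagejpeg($im);
    }

    return imagejpeg($im, $file);
}

// Usage with the image from Listing 5:
// outputMap($im);                  // display in the browser
// outputMap($im, 'plotted.jpg');   // or keep a copy on disk
```

Saving to a file is handy while debugging, since a stray warning printed before the header would otherwise corrupt the image in the browser.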

There are other ways of plotting maps, and I would definitely suggest looking into some of them to show where the photos were taken. As you can see from the map generated by our program, the results we obtain only provide a general idea of where the image was taken; if the map could be zoomed in, the user would be able to see exactly where it was taken. Many maps that you can find on the Internet are available in vector and raster versions and sometimes include additional geographical information, such as streets and highways; using them will definitely add more detail. I know that the Tiger mapping system (http://tiger.census.gov/cgi-bin/mapbrowse-tbl) created by the US Census Bureau allows you to create maps with very detailed information, but its data is limited to the US. Using a more detailed map means your user will be better able to pinpoint where the image was taken and how to get there. You may also want to check out the article titled "Webmapping with MapServer," written by Rodrigo Becke Cabral and published in the July 2004 issue of php|architect.

And there you have a program whose capabilities are similar to those mentioned in my short story above (minus the cool flying robot). Now you can take a road trip with your GPS-enabled cell phone, take pictures of all the cool roadside monuments, and then make a map showing where you've been and what you've seen.

Examples

There are several sites out there that are dedicated to mapping out where photos are taken. Most of these sites or projects are strictly for entertainment and experimentation purposes, and very few have been applied to the professional marketplace. One of the best examples of using this technology is http://www.geosnapper.com. This site is dedicated to photos that have GPS data. It plots all the uploaded photos onto a world map, and you merely scroll over to the section of the world you would like to see and click on an image: a very good and entertaining use of the technology.

Another good example is the WWMX (World Wide Media Exchange), which can be found at http://www.wwmx.org. This site is a little different because you actually download an application that works with the GPS logs that come from a GPS device (basically a file that usually gives a GPS coordinate every minute). It then matches the timestamp of each photo you have taken with the corresponding time in the GPS log and selects the appropriate location.

With this information on hand, it creates a map and web page on which the location where every photo was taken is plotted. You can then take the map and web page and upload them to your own website. The users of this system are primarily taking advantage of it for travel logs and for showing others the path they took and what they saw along the way.

Since cell phones were married to digital cameras, it has become more and more popular to create blogs using images taken with your phone. Now, with the integration of GPS, we are seeing a new level added to the standard blog. Several of the moblog (mobile blog) sites also allow you to click an image and locate where that photo was taken.

Here is a more professional use of this technology (or of a form of it, at any rate). At http://arcweb.esri.com/sc/album/index.html, you are able to view and locate images from all over the world. This system actually translates the street address of the photographer into latitude and longitude. Even though it works slightly differently from ours, it still creates a map with the location of the photo.

This, however, is probably my favorite use of GPS-tagged images: http://www.downgoesthesystem.com/devzone/exiftest/final/. Users actually email the images to the site from their cell phones, and the site reads the EXIF headers and plots the images onto the map, which is limited to Tokyo. The map allows the user to zoom in and see where the photo was taken and also to see the photo by clicking on the location marker. I probably like this one the most because it actually uses tagged cell phone images rather than adding the data to the photo at a later time.

Final Thoughts

As you can see, there are quite a few people and companies developing or toying around with this type of technology in some form or other. A handful of these sites have yet to tap into the full features that EXIF tags provide. Having to input the GPS data separately is redundant and could possibly be less accurate, especially if you have a cell phone or device that records this information into the photo itself.

It seems to me that GPS is becoming more and more popular and will, I assume, be packaged with more and more electronic products in the future. Today, we see GPS technology placed into delivery trucks to track drivers around town or across the state. Or you can have a device installed in your car that tracks it if it is stolen. Having GPS in our lives helps us in so many ways. It is definitely helpful when you are lost on a highway, or even out on an uncharted trail. Being able to pinpoint your location, or at least know the general area where you are, is a great benefit.

Hopefully, this program, as well as the idea of using GPS-embedded photos, has sparked your own imagination. I know I was quite intrigued by the possibilities of the technology and the uses that it might hold from the very first time I saw it. The melding of GPS and digital images in cell phones is fairly new, and its usefulness has not been fully explored. It will be very interesting to see where it goes in the future.

To Discuss this article:

http://forums.phparch.com/193

Ron is the technical director/senior programmer for Conveyor Group (http://www.conveyorgroup.com), a Southern California-based web development firm. His responsibilities include technology development, programming, IT and network management, strategic research, server systems management (webmaster), and website project leadership.


Where in the World was that Photo Taken?

TEST PATTERN

Layering is essential. The only way our rather feeble brains can cope with software development at all is by a process of divide and conquer. This is because bugs are easy to fix once you find them; finding them is the problem. If we can make a part of our code completely unaware of the other parts, we know for sure that any errors in it are local. Layering is the grandest expression of divide-and-conquer: it divides our entire application into a very few pieces and declares that each one can only be influenced by itself and, at most, one other. In particular, each layer can only see the next one down.

This is easy to understand and works well. It's no surprise, then, that this technique has been applied to complex enterprise applications and that there are lots of layered systems to choose from. It also means, unfortunately, that terminology has suffered. Layers are sometimes called logical tiers, or just tiers. You also see texts where "tiering" or layering is described as the separation of hardware, that is, the use of multiple servers. Faced with this confusion and the need to fit an explanation into a single article, I am going to have to punt. My preferred solution in this arena is four layers, so I'll take as my starting point the one used by Eric Evans in "Domain Driven Design" (published by Addison Wesley).

Then we'll prod and poke it.

The Four-Layer Architecture

The four-layer, or four-tier, architecture is an enterprise development classic. The trouble is that, for small projects (or big simple ones), it is complete overkill. What happens when we try to simplify this layering?

As you can see in Figure 1, the layers in our model are presentation, application, domain and infrastructure. If you are not used to UML, the tabbed boxes are packages: they are, basically, big dollops of code. The arrows show visibility, so that the application layer is blissfully unaware of the presentation layer, for example. To demonstrate the way the layers work, I am going to use the very trivial example of a contact manager. Firstly, let's see what the presentation code would look like for the single task of e-mailing someone:

January 2005 PHP Architect www.phparch.com 25

Shedding a Tier

<body>Message sent to <?= htmlspecialchars($_GET['name']) ?></body>
</html>

The method and style of interaction, or, in this case, the lack thereof, is what makes up the presentation layer. If you can imagine changing the way the application is used (for example, switching to a GUI or a web services API), then anything that would change must go into this layer. That's actually a lot of stuff: it naturally includes JavaScript, CSS, form parameters and the HTML, but it also includes sessions and maintaining authorization. After all, these will be different for, say, a desktop application compared with a web one.

The presentation layer is allowed to interrogate the application one, here represented by the Community class. Let's look at that next:

class Community {

function mail($name, $title, $message) {

$finder = new PersonFinder();

I don't have the space to build a complete four-layer application at this point, so I am going to have to illustrate the ideas with code fragments from now on.

The application layer is the glue that binds all of the components together. It's all about actions written in a language that the business stakeholders would understand. The domain objects contain the more innate business rules. An example of domain knowledge is how the e-mail is sent. The application layer knows nothing of this process, with its extra headers, formatting, and so on. It just kicks off the domain code.
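To make that division of labour concrete, here is a small runnable sketch of an application-layer class delegating to domain objects. It is an illustration only: the method names findByName(), getContactPoint() and send() are assumptions, the domain classes are in-memory stand-ins that send no real mail, and the finder is passed in through the constructor rather than created inside mail().

```php
<?php
// In-memory stand-ins for the domain layer, just enough to run the sketch.
class ContactPoint
{
    public $sent = [];

    function send($title, $message)
    {
        // A real contact point would know about headers, formatting, etc.
        $this->sent[] = [$title, $message];
        return true;
    }
}

class Person
{
    private $contact;

    function __construct(ContactPoint $contact) { $this->contact = $contact; }

    function getContactPoint($media) { return $this->contact; }
}

class PersonFinder
{
    private $people;

    function __construct(array $people) { $this->people = $people; }

    function findByName($name) { return $this->people[$name] ?? null; }
}

// Application layer: tells the story and delegates the details.
class Community
{
    private $finder;

    function __construct(PersonFinder $finder) { $this->finder = $finder; }

    function mail($name, $title, $message)
    {
        $person = $this->finder->findByName($name);
        if ($person === null) {
            return false;
        }
        return $person->getContactPoint('email')->send($title, $message);
    }
}
```

The mail() method reads almost like the business description: find the person, get a contact point, send the message. Everything about how that happens stays below it.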

What makes something an application object and what makes it a domain object is subtle. The distinction comes about because applications change more frequently, often in response to what users want from the business. The knowledge of the business domain itself is acquired more slowly and with a lot more effort. In fact, so much effort goes into this process of discussion between the developers and stakeholders that it has its own name: "knowledge crunching." By contrast, the application code should tell a simple story of what is going on. In our example, this boils down to finding a person, getting a contact point from them and finally sending the message. The grammar of that last sentence is English, but the grammar of our code snippet is PHP syntax.

In our example, I am choosing the Community class to be part of the application layer, but I would expect classes like Person to be used in several applications within an organization. Because of this, I think it's safe to assume that Person is a domain layer object. Let's look at a domain object next:

class Person {

} } }

There are business rules, even here in this trivial example. Ordering by preference means that we are taking the first of a possible list of contact points. Because there are other ways to contact our people, we had to specify a medium, in this case e-mail. Unlike the application layer example, we have some clutter caused by the database access. We'll take a broom to this in a little while.
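The preference rule described above can be sketched without any database at all. This is an assumption-heavy illustration, not the article's Person class: contact points are passed in as plain arrays, the array keys are invented for the example, and a lower preference number is taken to mean "more preferred".

```php
<?php
// Domain-layer sketch. The contact points are supplied as plain arrays so
// that the example runs on its own, with no database access.
class Person
{
    private $contactPoints;

    // Each contact point: ['media' => ..., 'address' => ..., 'preference' => int]
    function __construct(array $contactPoints)
    {
        $this->contactPoints = $contactPoints;
    }

    // Business rule: of the contact points for this medium, take the most
    // preferred one (lowest preference number), or null if there is none.
    function getContactPoint($media)
    {
        $matches = array_filter($this->contactPoints, function ($point) use ($media) {
            return $point['media'] === $media;
        });

        if (!$matches) {
            return null;
        }

        usort($matches, function ($a, $b) {
            return $a['preference'] <=> $b['preference'];
        });

        return $matches[0];
    }
}
```

With the data access removed, the rule itself is only a filter and a sort, which is the kind of clutter-free domain code the later sections aim for.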

As we descend to the lowest infrastructure level, we start to get to the nitty-gritty. The code the domain object is using is stuff that could be common to any organisation: library code, if you will. Here is some infrastructure code:

class Connection {

} } return new ResultSet(result);

}


If you are like me, then you have written this type of code a lot of times. More likely, you have had the good sense to use one of the many free libraries instead.

You are probably thinking that all of these little classes and files could be replaced with a single top-level script that would be a whole lot simpler. That would be a good point. For such a simple task, I would have a hard time disagreeing with you. The four-layer architecture only really comes into play once the job starts to get complicated. For smaller projects, we can simplify to taste, so let's look at some shortcuts.

Merging Application and Presentation

The blending of the layers can be seen in its most naive

form like so:

All we have done is take the code in the old application class and paste it straight into our top-level script. It's the simplest way to combine the layers and, in fact, you often see this approach hidden behind a template. All a template engine really does is separate the visual formatting from the remainder of the application; it does not, by itself, separate all of the presentation logic. We still have the $_GET array in our code, for instance. This choice is great for separating the HTML so that it can be edited by graphic designers. It doesn't manage to free you of the navigation and form handling. However, this is usually fine if you are just building web applications and are changing only the look and feel.

On the positive side, this approach is often good

enough for standalone applications It is also well

understood, especially within the PHP community, and

is a quick way to turn an HTML mock up into a

work-ing system The downside is that it will be hard to

inte-grate into other applications and much harder to test

Because the application code here lives in scripts, it will

have to be tested by looking at web pages—altogether

a rather coarse approach It works well for a small ect, but this is about as much of a hole as I like to digbefore I get nervous The warning signs are tricky bugswith things like security and also excessive duplication

proj-of code across the top level scripts

Merging Application and Domain layers

Because of the very slight difference between these two layers, it is common to merge them into a single one. That becomes one of the classic three-layer architectures. There is no difference in the code; it is just that the Community class is declared to be in the domain layer.

You can actually merge the application and domain layers with few ill effects, but with one caution. The symptom to watch out for is domain layer objects that are difficult to test because of configuration. Because domain objects are the part of the business rules you would like to reuse across the organisation, you don't want them tied to a specific server. This may happen because they have fixed paths for files, or perhaps resources such as database passwords need to be global. This is a sign of future trouble. These kinds of decisions most definitely belong in the application layer and there is a big win in passing all of this into the domain objects as parameters.

Figure 1

I would split them into two camps if you have a lot of server-specific configuration. It's no fun searching hard drives for missing files.
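As a rough illustration of passing server-specific decisions in as parameters, here is a sketch in which the application layer owns the settings and hands them to a domain object. The PersonFinder class, its method and the setting values are all hypothetical, chosen only to show the direction of the dependency.

```php
<?php
// Domain layer: knows nothing about this server. Everything that
// varies between machines arrives through the constructor.
class PersonFinder   // hypothetical domain class
{
    private $settings;

    public function __construct(array $settings)
    {
        $this->settings = $settings;
    }

    public function describeSource(): string
    {
        return $this->settings['dsn'] . ' as ' . $this->settings['user'];
    }
}

// Application layer: the one place that holds server-specific decisions.
$settings = [
    'dsn'  => 'mysql:host=localhost;dbname=community', // assumed values
    'user' => 'webuser',
];
$finder = new PersonFinder($settings);
echo $finder->describeSource(), "\n";
```

A test can now build the same object with throwaway settings instead of relying on files or globals that exist only on one machine.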

Purifying the Domain Layer

Objects representing the business domain will probably have to be saved to a database, a process called persistence. The so-called ActiveRecord pattern is the simplest way to make objects persistent. If the infrastructure layer is very primitive, then the domain objects have to do a lot of work communicating with the database. The ActiveRecord pattern is really no pattern at all: the domain object will handle all of this work itself, although you may be able to factor some of it out with inheritance. The earlier Person class is an ActiveRecord. Although it has some help from the infrastructure classes, the metaphor is still one of database rows. This extra translation effort to go from a tabular database view to an object view is called object/relational impedance, and that's not so nice when mixed in with your business code (you may want to read up on Rick Morris' articles on this topic that appeared in the August 2004 and November 2004 issues of php|a). Now, a full discussion of persistence patterns is a book in itself (e.g. Nock), but pushing out the database code comes down to two basic ideas: external mappers and internal accessors.

The DataAccessor pattern, or DataAccessObject or DAO, wraps all of the database code into a single object that the domain object can call. For example, when

This separation is invisible to the outside world. The domain object, here our earlier Person, is in charge of creating and using the accessor. Note that the accessor just deals with database data and only has getters and setters. The data coming back could be other objects or arrays of data. It doesn't have to correspond to a single row on the database and this can do a lot to clean up the domain layer code: it's the very simple approach of delegating to an internal object to do all of the dirty work.
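The delegation just described might look something like the following sketch. PersonAccessor and its methods are hypothetical names, and an array stands in for the database so that the example runs anywhere; a real accessor would hold the SQL.

```php
<?php
// Accessor: holds all storage code and exposes only getters and setters.
// An array stands in for the database table in this sketch.
class PersonAccessor   // hypothetical name
{
    private $rows = [
        1 => ['name' => 'Fred', 'email' => 'fred@example.com'],
    ];

    public function getEmail(int $id): ?string
    {
        return $this->rows[$id]['email'] ?? null;
    }

    public function setEmail(int $id, string $email): void
    {
        $this->rows[$id]['email'] = $email;
    }
}

// Domain object: creates and uses the accessor privately, so callers
// never see that the work is delegated.
class Person
{
    private $id;
    private $accessor;

    public function __construct(int $id)
    {
        $this->id = $id;
        $this->accessor = new PersonAccessor();
    }

    public function getEmail(): ?string
    {
        return $this->accessor->getEmail($this->id);
    }

    public function changeEmail(string $email): void
    {
        // The business rule stays in the domain object, not the accessor.
        if (strpos($email, '@') === false) {
            throw new InvalidArgumentException('Not an e-mail address');
        }
        $this->accessor->setEmail($this->id, $email);
    }
}

$fred = new Person(1);
$fred->changeEmail('fred@example.org');
echo $fred->getEmail(), "\n";
```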

The opposite approach is the DataMapper pattern. With this method, we gut the domain objects of all of the database code and use another separate class to do

PHP is, at last, starting to see some libraries emerge to ease the workload for saving objects. In order of sophistication, they include PEAR::DB_DataObject, Propel and MetaStorage.
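To make the DataMapper idea concrete without reaching for any of these libraries, here is a hand-rolled sketch. It is not how DB_DataObject, Propel or MetaStorage work internally; PersonMapper and its methods are hypothetical, and an array again stands in for the database table.

```php
<?php
// Domain object: plain data and behaviour, no storage code at all.
class Person
{
    public $name;
    public $email;

    public function __construct(string $name, string $email)
    {
        $this->name = $name;
        $this->email = $email;
    }
}

// Mapper: a separate class that moves Person objects to and from
// storage. An array stands in for the database table here.
class PersonMapper   // hypothetical name
{
    private $table = [];

    public function save(Person $person): int
    {
        $this->table[] = ['name' => $person->name, 'email' => $person->email];
        return count($this->table) - 1;    // position doubles as the row id
    }

    public function find(int $id): ?Person
    {
        if (!isset($this->table[$id])) {
            return null;
        }
        $row = $this->table[$id];
        return new Person($row['name'], $row['email']);
    }
}

$mapper = new PersonMapper();
$id = $mapper->save(new Person('Fred', 'fred@example.com'));
echo $mapper->find($id)->email, "\n";
```

The domain class can now be tested with no storage at all, at the price of a second class to maintain for every persistent type.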

Removing the Domain Layer

So much has revolved around the business domain layer up to now that it may seem rather strange that it could be removed. What does an application look like without any business logic? Well, you can still use the database operations of creating, reading, updating and destroying, or CRUD for short, and you will also get mainly tabular data back. The end result is just a simple reporting application, but these are common in the PHP world.

Although limited in applicability, there is a way to make applications of this type spectacularly quick to write. Apart from dispatching queries, they only have to deal with a single type of object, namely the set of results returned from a query. The class is usually named RecordSet or some such similar name and, for memory efficiency, it is usually implemented as some type of iterator; that is, you read a row at a time. As there is only one type of object to display, it is easy to build a library of display components, usually called widgets or controls, to work with it. These can range from simple drop-down list widgets right up to elaborate editable table widgets.

To show what this looks like, imagine we are going to display a table of people. Here is a possible code fragment for the presentation layer:
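A sketch along these lines, with the supporting classes filled in so that it runs on its own. The RecordSet and TableWidget bodies are hypothetical stand-ins; only the names, the next() call and the paint() call are taken from this article, and in a real application the rows would come from a query rather than an array.

```php
<?php
// Hypothetical array-backed RecordSet: rows are handed out one at a
// time through next(), which returns false when the set is exhausted.
class RecordSet
{
    private $rows;
    private $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function next()
    {
        if (!isset($this->rows[$this->position])) {
            return false;
        }
        return $this->rows[$this->position++];
    }
}

// Display component that renders any object with a next() method.
class TableWidget
{
    private $records;

    public function __construct($records)
    {
        $this->records = $records;
    }

    public function paint(): string
    {
        $html = "<table>\n";
        while (($row = $this->records->next()) !== false) {
            $cells = array_map('htmlspecialchars', $row);
            $html .= '<tr><td>' . implode('</td><td>', $cells) . "</td></tr>\n";
        }
        return $html . "</table>\n";
    }
}

$people = new RecordSet([
    ['name' => 'Fred', 'email' => 'fred@example.com'],
    ['name' => 'Wilma', 'email' => 'wilma@example.com'],
]);
$widget = new TableWidget($people);
print $widget->paint();
```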


Looks easy, and it is, but that's only because we are playing to this system's strength by simply displaying tables. What happens if we have to do some calculations on the columns or add other external data into the output? Because the row data is only actually loaded on demand, we have two options.

The first option is to pull all of the data out, perform our calculations and then create a new RecordSet from scratch. We then pass that back instead. That's OK for small amounts of data, but messy. Notice that the RecordSet is a key abstraction here. Because these objects don't have to come from a database, we are free to build or intercept them and so squeeze in an additional logical layer. Only in so doing can we justify calling this another form of three-layer architecture.

The other option is to run our code as the rows are fetched. As the widgets are going to do all of the work, we have to modify the next() call. We could inherit from the RecordSet, but it is preferable to wrap it in a class that looks identical. This just passes the calls to the real RecordSet underneath. This trick is called the Decorator pattern, or, in this context, usually a filter. The code then looks like this:

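<?php
// The Community, WithEmailsAsLinks and TableWidget classes are assumed
// to be loaded already
$community = new Community();
$people = $community->findByCategory('friends');
// Wrap the record set so that e-mail fields come out as links
$filter = new WithEmailsAsLinks($people);
$widget = new TableWidget($filter);
?><html>
<head><title>My Friends</title></head>
<?php print $widget->paint() ?>
</html>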

By writing our WithEmailsAsLinks filter, we can later intercept the next() call to manipulate the rows as they pass through. I am imagining that the “email” field would be converted to an HTML anchor tag on each next() call. For big lumps of tabular data, this is a common technique. It has the added benefit that the same filters can be used again and again over an application. They are difficult to work with for tricky domain logic, though, and ridiculous overkill if you mostly fetch a single row or object at a time.
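One way the WithEmailsAsLinks filter could be written is sketched below. The article only names the class, so the body is a guess at its shape, and ListRecordSet is a hypothetical array-backed stand-in for whatever the real RecordSet is.

```php
<?php
// Minimal stand-in for the real RecordSet (hypothetical, array-backed).
class ListRecordSet
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function next()
    {
        $row = current($this->rows);
        next($this->rows);
        return $row;            // false once the rows run out
    }
}

// Decorator: looks like a RecordSet from the outside, passes each call
// to the wrapped one and rewrites the "email" field on the way through.
class WithEmailsAsLinks
{
    private $inner;

    public function __construct($inner)
    {
        $this->inner = $inner;
    }

    public function next()
    {
        $row = $this->inner->next();
        if ($row === false) {
            return false;
        }
        $email = htmlspecialchars($row['email']);
        $row['email'] = '<a href="mailto:' . $email . '">' . $email . '</a>';
        return $row;
    }
}

$filter = new WithEmailsAsLinks(new ListRecordSet([
    ['name' => 'Fred', 'email' => 'fred@example.com'],
]));
$row = $filter->next();
echo $row['email'], "\n";
```

Because the filter exposes the same next() method as the thing it wraps, filters can be stacked, and a widget cannot tell whether it is reading from a filter or from the record set itself. Note that a widget which escapes cell contents would have to treat the already-converted field as markup.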

Faced with this constraining style, one solution is to move all of the complex business logic into the database as stored procedures or triggers and use PHP as a presentation tier only. This then becomes a database-driven application. If all of your information is stored on a relational database, the application does not change too much and the skills are available, then this is also a tried and tested option.

Most of the decisions so far have been easy to back out of, but the decision to go the RecordSet route rather than the domain layer route is more of a fork in the road. If you will be dealing with mostly tabular data and the bulk of your system is database-driven, then the RecordSet model is probably the way to go. If, on the other hand, you are frequently working with single complicated items or managing information from more than just databases, go with the domain model. In going that route, the simplest first split will be straight down the middle, namely get the domain layer away from the presentation.

That should give our brains a fighting chance.

Further Reading

The essential enterprise patterns book is “Patterns of Enterprise Application Architecture” by Martin Fowler (Addison-Wesley). Most of the patterns described here come from that book. In the same vein, but limited to persistence mechanisms, is “Data Access Patterns” by Clifton Nock (also Addison-Wesley).

The persistence libraries mentioned are: PEAR::DB_DataObject at pear.php.net/package/DB_DataObject, Propel at propel.phpdb.org/wiki/ and MetaStorage at www.meta-language.net/metastorage.html

To Discuss this article:

http://forums.phparch.com/194

Marcus Baker is a senior software developer at Wordtracker and part time web development consultant. His website is at http://www.lastcraft.com/. Marcus is also a co-founder of the PHPLondon organization.


There are a couple of different methods of converting characters to other characters. Transliteration is the process of converting a specific character to different characters or groups of characters.

Examples of transliteration are the converting of the Norwegian “å” to “aa” (ligature normalization), “ç” to “c” (diacritical removal), “ÿ” to “Ÿ” (changing case), “Ю” to “YU” (Cyrillic to Latin transliteration) and “©” to “(c)” (special decomposition). For each of those conversions, special filters can be used and the order of filters is important too. For example, you will want to run a ligature normalization filter before the diacritical removal filter so that “å” does not become “a”, but “aa” like Norwegian people would expect. As you can imagine, the definition of some of those filters can be pretty large, especially the Han to Pin Yin transliteration because of the great number of Chinese characters.
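Why the order of filters matters can be seen with two toy filters written as plain replacement tables. These tables are illustrative stand-ins containing only a few characters; they are not the definitions any real filter uses, and applyFilters() is a hypothetical helper.

```php
<?php
// Two toy filters as replacement tables (illustrative only). The source
// file is assumed to be saved as UTF-8.
$normalizeLigature = ['å' => 'aa', 'æ' => 'ae'];
$removeDiacritics  = ['å' => 'a', 'ç' => 'c', 'é' => 'e'];

// Applies each table in turn; every filter sees the previous output.
function applyFilters(string $text, array $filters): string
{
    foreach ($filters as $table) {
        $text = strtr($text, $table);
    }
    return $text;
}

echo applyFilters('så', [$normalizeLigature, $removeDiacritics]), "\n"; // saa
echo applyFilters('så', [$removeDiacritics, $normalizeLigature]), "\n"; // sa
```

Once the diacritical filter has turned “å” into “a”, there is nothing left for the ligature filter to expand, so the second ordering loses information.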

Transliteration from one script to another will most likely never be one hundred percent accurate, as the way characters are transliterated to the Latin script is sometimes affected by country, but most often just by the person who does the transliteration. Therefore, transliteration can only achieve an approximation of a script when we transliterate texts.

Why Is This Needed at All?

You might be wondering why one would need a method or an extension to transliterate characters from one character set to another one, but there are a couple of situations where this is really useful. One example is a content management system where you would want to create a URL path out of the title of a document. A first method would be to simply conduct the following steps:

• Convert the title of the document to lowercase characters
• Replace all characters not in the ranges a-z and 0-9 with an underscore
• Remove underscores at the beginning and ending of the generated title
• Remove multiple underscores in a row
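The four steps translate almost word for word into PHP. The function name naiveUrlName() is made up for this sketch, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Naive URL-name generation, following the four steps literally.
function naiveUrlName(string $title): string
{
    $name = strtolower($title);                       // 1. lowercase
    $name = preg_replace('/[^a-z0-9]/', '_', $name);  // 2. mask everything else
    $name = trim($name, '_');                         // 3. trim both ends
    return preg_replace('/_+/', '_', $name);          // 4. collapse runs
}

echo naiveUrlName('42: The answer to life, and everything.'), "\n";
// 42_the_answer_to_life_and_everything
echo naiveUrlName('français'), "\n";
// fran_ais
```

The second call shows the weakness: step 2 works on bytes, so any character outside plain ASCII is simply masked out.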

As an example, the title: “42: The answer to life, and everything.” would first become: “42: the answer to life, and everything.”, then “42__the_answer_to_life__and_everything_” and, finally, “42_the_answer_to_life_and_everything,” which is a suitable name to use in a URL. This algorithm works fine for English text, but, if the title of the document had contained the word “français,” for example, the final result would have been “fran_ais,” which is no longer representative of the document's title. For different scripts, such as Cyrillic or Japanese Katakana, this is obviously not going to be useful at all. In eZ publish, URLs are not the only things that need some form of “mangling” in order to create a usable string. For example, other items include identifiers for attributes (fields) in a content object, searching and generated package names, and so on. Each of those cases might need different rules for creating a usable string as representation for the items. For example, if you are normalizing a string for a search engine, you might want to retain spaces, while they should be removed if you are preparing a string for use in a URL. Other uses for strings might not even allow underscores at all.

The Translit Extension

One possibility is to implement these filters with PHP code, though this is not very fast. It is what you had to do before the translit extension existed. The translit extension makes it possible to apply filters on strings of text to perform different transliteration rules. The extension provides two functions only: transliterate_filters_get and transliterate. The first function returns an array with all available filters, while the second one provides the functionality needed to apply transliteration filters to strings.

Installing the translit extension can be done by simply running:

pear install http://pecl.php.net/get/translit

This will install the latest version of the transliteration extension, which, at the time of this writing, was beta version 0.5. In order for this to work, you do need a correct build environment for PECL extensions; this includes “fitting” versions of the autotools: autoconf 2.13, libtool 1.4.3, and automake 1.4-p6 or similar. Newer versions might also work, but they can throw quite a few warnings. If this is the case, you should downgrade your autotools to the versions I just mentioned. You can also get some information from the PHP manual by visiting this URL:

http://php/manual/en/install.pecl.php

Another installation dependency of the translit extension is the iconv extension, which you either need to have compiled into PHP (Unix) or loaded in by specifying it in your php.ini file with an extension= line before the translit extension.

When the extension is installed and enabled in php.ini, you can use the transliterate_filters_get function to see if everything is working:
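A small check of this kind could look like the sketch below. The wrapper function availableTranslitFilters() is hypothetical; it exists only so that the script still runs, and says so, on a PHP build where the extension is not loaded.

```php
<?php
// Returns the filter names reported by the translit extension, or null
// when the extension is not loaded in this PHP build.
function availableTranslitFilters(): ?array   // hypothetical helper
{
    if (!function_exists('transliterate_filters_get')) {
        return null;
    }
    return transliterate_filters_get();
}

$filters = availableTranslitFilters();
if ($filters === null) {
    echo "translit extension is not loaded\n";
} else {
    print_r($filters);
}
```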

This should return all supported filters.

Character Sets and Unicode

The extension needs to deal with a lot of different character sets (e.g.: Latin, Greek, Cyrillic, and so on). Because none of the normal 8-bit character sets, or the Chinese Big5, are compatible with each other, the extension uses Unicode characters to perform its transformations on.

To implement efficient filters, the transliteration extension does not use UTF-8 encoding internally, as doing so would require too much overhead when parsing the string each time; instead, it uses UCS-2, which always stores one Unicode character as two bytes. This makes it possible to perform integer arithmetic on the characters, allowing for very fast filtering. In return, this means that you need to convert your data if you want to transform strings encoded in character sets other than UCS-2. Fortunately, the transliterate function allows you to specify the input and output character sets for the transliteration. This is where the iconv extension, on which the transliteration extension depends, comes into play.
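The difference between the two encodings is easy to see with iconv itself. This sketch assumes the source file is saved as UTF-8 and picks big-endian UCS-2 for readability; which byte order the extension uses internally is not stated here.

```php
<?php
// Compare the same text in UTF-8 and in big-endian UCS-2.
$utf8 = 'Vær så god.';                       // this file is saved as UTF-8
$ucs2 = iconv('UTF-8', 'UCS-2BE', $utf8);

echo strlen($utf8), "\n";   // 13: "æ" and "å" take two bytes each
echo strlen($ucs2), "\n";   // 22: eleven characters, two bytes apiece

// Every character is now one 16-bit integer, so a filter can look
// characters up and compare them with plain arithmetic.
$codes = array_values(unpack('n*', $ucs2));
printf("U+%04X\n", $codes[1]);               // U+00E6, the letter "æ"
```

With a fixed width, the n-th character always starts at byte 2n, whereas in UTF-8 the string has to be scanned from the start to find it.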

In Listing 1, for example, we execute the normalize_ligature filter on the string “Vær så god.” resulting in “Vaer saa god.” As you can see, the function is easy to use: the first parameter is simply the string that you want to execute a filter on, while the second parameter is an array containing the filters that you want to execute. The third and fourth parameters are the character set of the incoming data and outgoing data respectively.

The second parameter contains an array of filters, which means you can execute multiple filters with the same function call and be sure that the order in which they are executed is preserved, based on the contents of the array you pass.
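A call with the four parameters just described might be sketched as follows. The wrapper applyTranslit() is a hypothetical helper added so that the script degrades gracefully when the extension is missing; the filter name and the charset arguments are the ones discussed above.

```php
<?php
// Runs the given translit filters over a UTF-8 string; returns null when
// the extension is not available so the script still runs without it.
function applyTranslit(string $text, array $filters): ?string   // hypothetical helper
{
    if (!function_exists('transliterate')) {
        return null;
    }
    // Arguments: input string, ordered filter list, input charset, output charset.
    return transliterate($text, $filters, 'utf-8', 'utf-8');
}

$result = applyTranslit('Vær så god.', ['normalize_ligature']);
echo $result ?? 'translit extension is not loaded', "\n";
```

Adding more filter names to the array runs them in the order given, one after another, in the same call.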


Transliteration Filters - Latin

Currently, the transliteration extension provides support for different groups of scripts, and each of those scripts has different filters.

For the Latin script, the group of filters consists of diacritical_remove, latin_lowercase, normalize_ligature and latin_uppercase. Not only do those four filters deal with the Basic Latin and Latin-1 Supplement Unicode blocks, they also support Latin Extended-A and Latin Extended-B. This means that the diacritical_remove filter will be able to remove diacritical signs for all Latin and Latin-like characters available in Unicode. It will therefore correctly convert all characters in the string “ ” to uppercase and then remove all the diacritical signs from it. Listing 2 illustrates this.

A couple of additional filters are required to be able to generate URL-safe names from article titles. For this, we need to follow these steps:

• Expand all ligatures (the normalize_ligature filter, “å” to “aa”)
• Remove all remaining diacritical signs (the diacritical_remove filter, “é” to “e”)
• Convert the string to lowercase (the latin_lowercase filter, “FoO” to “foo”)
• Normalize all punctuation so that the remove_punctuation filter can remove it (the compact_underscores filter, “_foo 42 ” to “foo_42”)

It is fairly trivial again to execute all those filters on the string from which we want to create a fitting URL-safe name, as you can see in Listing 3. In line 2, we define our string “Politiet: Økseangrepet var planlagt \n”, and in lines 3 to 7 our filter array. Line 8 then executes all the filters on the string, treating incoming data as UTF-8, but producing 7-bit ASCII data as output. In our case, we want 7-bit ASCII as output because this filters out all other scripts. Line 9 instructs vim, my editor of choice, to treat the file as UTF-8 data.

Greek

The technique above works very well for Latin-based languages, but as soon as a different script is used, it will fail miserably. Imagine a Greek string like “ ”. If we execute the same filters as for Latin strings, the result will be “ ”, which is, of course, completely useless. Hence, we have to transliterate this Greek text to the Latin script first, which is what the greek_transliterate filter does. In Listing 4, you can see that by prepending the greek_transliterate filter, the output of the transliteration process is something we can use as part of a URL, although it might not be a 100% correct transliteration.

There are two more filters for the Greek script: one, greek_uppercase, will convert all lowercase letters to uppercase, while the other, greek_lowercase, will convert all uppercase letters to lowercase. Listing 5 shows both filters in action.

Cyrillic

The same ideas apply to the Cyrillic script as for the Greek script. Unlike the Greek script, which is currently only used in Greece, the Cyrillic script is used for several languages. There are indeed some differences in preferences in those countries on how to transliterate Cyrillic into Latin. Therefore, the transliteration extension does not only contain a generic cyrillic_transliterate filter, but also a cyrillic_transliterate_bulgarian filter that changes some of the transliterations for specific letters. In the future, extra filters for even more languages may be added, but for now there is only one, specific to the Bulgarian language.

In Listing 6, you can see the different filters for the Cyrillic script in action; notice how the “ ” is transliterated differently in Bulgarian compared to the default transliteration rules for the Cyrillic script. Naturally, there are more characters that are different than only this one.

Hebrew

This is a very interesting one, as it has no concept of upper and lower case characters. Accordingly, there is only one filter related to the Hebrew script: hebrew_transliterate.

Listing 7 shows the filter in action. From the “mobrq_yskn_t_mdynot”. One other interesting feature about Hebrew is that most writings don't seem to use vowels at all.

Asian Scripts

This is where all the real fun begins, at least for someone who doesn't know anything about Asian languages. Asian languages usually don't conform to the “we use letters to represent text” rule. For example, Chinese uses ideograms, while Japanese uses those in addition to two other scripts, and Korean uses combined letters as characters. Another point is that Chinese doesn't use any spaces between words, which makes it really hard to come up with a sensible Romanization strategy.

For now, the transliteration extension implements a few filters related to CJK (Chinese, Japanese, Korean) scripts. The first one, hangul_to_jamo, converts the combined Korean syllables (Hangul) back into letters (Jamo). The Unicode character set supports both the combined syllables as well as the separate letters that form the combined syllables. This is done in a very algorithmic way, fortunately. The second Korean related filter, jamo_transliterate, Romanizes the separate Jamo characters

Listing 5

<?php
$string = <<<END
END;
echo $string, "\n\n";
echo transliterate($string, array('greek_uppercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('greek_lowercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('greek_transliterate'), 'utf-8', 'utf-8'), "\n\n";

Listing 6

echo transliterate($string, array('cyrillic_uppercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_lowercase'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_transliterate'), 'utf-8', 'utf-8'), "\n\n";
echo transliterate($string, array('cyrillic_transliterate_bulgarian'), 'utf-8', 'utf-8'), "\n\n";
