Profiling PHP

DOCUMENT INFORMATION

Title: Profiling PHP
Authors: Andrei Zmievski, James Cox, Wez Furlong, Stuart Herbert, Peter James, George Schlossnagle, Ilia Alshanetsky, John Coggeshall, Jason Sweat
Institution: PHP Architect
Subject: PHP Programming
Type: Magazine
Year of publication: 2004
City: Toronto
Pages: 70
File size: 3.13 MB

Contents


FEBRUARY 2004 VOLUME III - ISSUE 2

See inside for details

Get Ready For

Caching Techniques for the PHP Developer

Offline News Management with PHP-GTK

EXtending PHP

Handling PHP Arrays from C

The Need for Speed

Writing More efficient PHP scripts

Visit us at www.phparch.com/cruise for more details.

March 1st - March 5th, 2004

Andrei Zmievski - Andrei's Regex Clinic, James Cox - XML for the Masses,

Wez Furlong - Extending PHP, Stuart Herbert - Safe and Advanced Error Handling

We’ve got you covered,

from port to sockets.

Port Canaveral • Coco Cay • Nassau

Plus: Stream socket programming, debugging techniques, writing high-performance code,

data mining, PHP 101, safe and advanced error handling in PHP5, programming smarty,

and much, much more!

In partnership with Zend Technologies

Zend Studio 3.0 is the official PHP IDE of

The Need For Speed

Optimizing your PHP Applications


Graphics & Layout

Chris Shiflett, Morgan Tocker

php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.

Contact Information:

General mailbox: info@phparch.com

Editorial: editors@phparch.com

Subscriptions: subs@phparch.com

Sales & advertising: sales@phparch.com

Technical support: support@phparch.com

Copyright © 2003-2004 Marco Tabini & Associates, Inc.

— All Rights Reserved

As I write this, I'm sitting in my office (about forty degrees Celsius warmer than outside and, therefore, a much better place to work in than the local park), suffering from an awful cold and sitting by a collection of (clean) tissues discreetly stashed on my desk, ready for use. As you can expect, I'm not particularly happy about either fact (make that three facts: the cold outside, the cold in my body, and the fact that I'm sitting in an office when I could really be somewhere else far away from anything that even remotely resembles a computer). Incidentally, with php|cruise coming at the beginning of March, I should hopefully be able to get rid of at least two problems, and I'm still working on finding a way to avoid computers during that trip.

But I ramble, a clear sign that the cold medicine is wearing off. Let me instead tell you something about this month's issue. With the popularity that PHP enjoys nowadays comes the fact that it is used as the backbone of more and more high-traffic sites. A simple consequence of this is that an increasing number of developers are "hitting the wall" and finally feeling the limits of what the "let's just do it in PHP" approach can do. Building a website is always a high-wire balance of budgeting, respecting deadlines and writing the best code possible, but there's nothing quite as bad as finding out that the way you've done things is incapable of meeting the demands of your website. And, by the time you realize that you have a problem, it's usually too late to think about a solution short of calling your travel agent and inquiring about that non-extradition country you heard of.

Therefore, this month we dedicate a fair amount of room to the performance management of PHP applications. George Schlossnagle's article, based on an excerpt from his latest book, published by SAMS, talks about profiling, a concept that I have very rarely seen associated with PHP applications. Profiling takes the guesswork out of understanding where the bottlenecks in your application are, allowing you to focus on finding the best possible resolution.

The problem with profiling is that it only allows you to identify the problems and not solve them. Luckily, Ilia Alshanetsky and Bruno Pedro offer two other excellent articles on improving the performance of PHP without affecting the code itself (if you can, why not avoid the risk of introducing even more bugs?). While Ilia focuses on ways to make the PHP interpreter itself run faster, Bruno examines the topic of caching, both at the network and script level.

This month we also start a new column, Security Corner, written by Chris Shiflett. The daily number of security advisories, patches, break-ins and source-code thefts that we see reported in the media every day has

Continued on page 8

PHP 4.3.5RC1 has been released for testing. This is the first release candidate and should have a very low number of problems and/or bugs. Nevertheless, please download and test it as much as possible on real-life applications to uncover any remaining issues. The list of changes can be found in the NEWS file.

For more information visit: http://qa.php.net/

PHP Community Logo Contest

Following Chris Shiflett's recent announcement of the PHP Community Site, he is holding a contest to find a logo that embodies the spirit of the PHP community. Everyone is welcome to participate, and you can submit as many entries as you like. Please send all entries to logos@phpcommunity.org, and include the name with which you want to be credited.

The contest ends 29 Feb 2004, and php|architect is offering a free PDF subscription to the winner. For updated news about the contest, as well as a chance to view the current entries, visit:

http://www.phpcommunity.org/logos/

Good luck to all who enter!

ZEND Studio 3.0.2

Zend has announced the release of the Zend Studio 3.0.2 client. What's new? Zend.com lists some of the bugfixes as:

• ZDE didn't load when using a new keymap config from an older version

• Save As Project didn't always work

• Server Center activator tried to open the wrong URL

• js files were not opened with JavaScript highlighting

• Shift-Delete and Shift-Backspace didn't work properly

• Find&Replace was very slow under Linux

• Add Comment sometimes erroneously commented out a line that wasn't selected

• Added configurable limit for the number of displayed syntax errors

There have also been improvements to the debugger, code completion, code analyzer, IE toolbar, and some Mac OSX changes.

Get more information from Zend.com

MySQL Administrator

MySQL.org announces: MySQL Administrator is a powerful new visual administration console that makes it significantly easier to administer your MySQL servers and gives you better visibility into how your databases are operating. MySQL Administrator integrates database management and maintenance into a single, seamless environment, with a clear and intuitive graphical user interface. Now you can easily perform all the command line operations visually, including configuring servers, administering users, dynamically monitoring database health, and more.

Get more information from:

http://www.mysql.com/products/administrator/index.html

Check out some of the hottest new releases from PEAR.

DB 1.6.0 RC4

DB is a database abstraction layer providing:

• an OO-style query API

• a DSN (data source name) format for specifying database servers

• prepare/execute (bind) emulation for databases that don't support it natively

• a result object for each query response

• compatibility with PHP 4 and PHP 5

• much more…

DB layers itself on top of PHP's existing database extensions. The currently supported extensions are: dbase, fbsql, interbase, informix, msql, mssql, mysql, mysqli, oci8, odbc, pgsql, sqlite and sybase (DB style interfaces to LDAP servers and MS ADO (using COM) are also available from a separate package).

System_ProcWatch 0.4

With this package, you can monitor running processes based upon an XML configuration file, XML string, INI file or an array where you define patterns, conditions and actions.

Net_IMAP 0.7

Provides an implementation of the IMAP4Rev1 protocol using PEAR's Net_Socket and the optional Auth_SASL class.

XML_Beautifier 1.1

XML_Beautifier will add indentation and line breaks to your XML files, replace all entities, format your comments and make your document easier to read. You can influence the way your document is beautified with several options.

Looking for a new PHP Extension? Check out some of the latest offerings from PECL.

opendirectory 0.2.2

Open Directory is a directory service architecture whose programming interface provides a centralized way for applications and services to retrieve information stored in directories. The Open Directory architecture consists of the DirectoryServices daemon, which receives Open Directory client API calls and sends them to the appropriate Open Directory plug-in.

statgrab 0.1

libstatgrab is a library that provides a common interface for retrieving a variety of system statistics on a number of *NIX like systems. This extension allows you to call the functions made available by the libstatgrab library.

Sasl 0.1.0

SASL is the Simple Authentication and Security Layer (as defined by RFC 2222). It provides a system for adding pluggable authentication support to connection-based protocols. The SASL Extension for PHP makes the Cyrus SASL library functions available to PHP. It aims to provide a 1-to-1 wrapper around the SASL library to provide the greatest amount of implementation flexibility. To that end, it is possible to build both a client-side and server-side SASL implementation entirely in PHP.

SQLite 1.0.2

SQLite is a C library that implements an embeddable SQL database engine. Programs that link with the SQLite library can have SQL database access without running a separate RDBMS process. This extension allows you to access SQLite databases from within PHP. Windows binary available from:

http://snaps.php.net/win32/PECL_STABLE/php_sqlite.dll

PHPWeather 2.2.1

PHP Weather announces the release of version 2.2.1. PHP Weather makes it easy to show the current weather on your webpage. All you need is a local airport that produces the special weather reports called METARs. The reports are updated once or twice an hour.

Get more information from:

http://sourceforge.net/projects/phpweather/

PHPEclipse Debugger

PHP Eclipse adds PHP support to the Eclipse IDE Framework. This snapshot introduces the first version of the PHPEclipse debugger plugin.

For more information visit:

http://www.phpeclipse.de

MySQL and Zend Working Together

From Zend and MySQL: these two have joined forces to strengthen open source Web development.

MySQL AB, developer of the world's most popular open source database, and Zend Technologies, designers of the PHP Web scripting engine, today announced a partnership to simplify and improve productivity in developing and deploying Web applications with open source technologies. Through the alliance, the companies are improving compatibility and integration between the MySQL database and Zend's PHP products to make it easier for businesses to use complete open source solutions, such as the popular LAMP (Linux, Apache, MySQL and PHP) software stack.

As part of the partnership, MySQL AB and Zend are offering partner products to their respective customers, enabling easier product procurement and deployment for Web application infrastructures. The companies will also commit development resources to design product integration and compatibility modules for both vendors' platforms.

For more information visit: www.zend.com

SAXY 0.3

SAXY is a Simple API for XML (SAX) XML parser for PHP 4. It is lightweight, fast, and modeled on the methods of the Expat parser for compatibility. The primary goal of SAXY is to provide PHP developers with an alternative to Expat that is written purely in PHP. Since SAXY is not an extension, it should run on any Web hosting platform with PHP 4 and above installed.

This release allows CDATASection tags to be preserved, rather than converted to Text Nodes.

For more information visit:

Editorial: Continued from page 5

In his article on offline news management, Morgan Tocker writes about how PHP-GTK, that most hidden of PHP gems, can be used to improve content management by providing a proper GUI application that doesn't require you to completely rewrite all your code.

Finally, last but not least, Wez Furlong picks up where his article from last month left off and delves into the deep bowels of the Zend Engine to show you how a PHP extension written in C can manipulate PHP arrays. It's not quite as easy as from a script, but close enough once you know what you're doing.

Well, that's it for this month. By the time I write my next editorial, I plan to be either boasting about my suntan or complaining about sunburn. Either way, you can expect me to report on our adventure on the high seas. Until then, happy reading!

Despite the fact that it sounds like some mysterious Italian pasta, Gnokii is really just a project aimed at developing tools and drivers for Nokia mobile phones, that is, software that makes it possible to control a Nokia phone physically connected to your server via a serial port. Gnokii works like the Nokia Data Suite, which is shipped with more advanced models from Nokia: you can use it to send SMS messages, edit contacts and so on, pretty much everything you normally do with your thumb on the phone's keypad.

Gnokii itself is composed of many tools, including a set of GUI applications that facilitate the remote operation of the telephone; we are really only interested in a small subset of these tools called smsd, or SMS daemon, which provides an interface for rapid access to the phone's SMS capabilities. With the SMS daemon up and running, we can use PHP to interact with the phone, send and receive SMS messages and, of course, build whatever logic we need based on the content of the messages that we receive and send. In short, my goal with this article is to show you how to configure software and hardware so that you can get the same kind of service as you would normally obtain from a big company selling mobile services like SMS gateways, but at a fraction of the price.

Major Components of the Final Application

The final application that we will create throughout this article is a simple SMS server that awaits a message from a user and acts on its contents. It is made up of three major components:

• A Nokia cell phone, which must be connected properly to the server

• The smsd application from the Gnokii package, which must, of course, be compiled and configured correctly

• The PHP scripts that provide the actual server functionality

The flow of the application will be as follows:

• The user sends an SMS message to the server

• The smsd daemon picks it up and automatically puts it into its database

• Our server scans smsd's database periodically for new messages

• When a new message arrives, its contents are examined and the server acts on them, for example by replying to the user with another message

Code Directory: sms-gnokii

REQUIREMENTS

SMS, shorthand for Short Message Service, is the standard used by cellular phone networks worldwide to allow their customers to exchange small text messages using their handsets. Despite its limitations, SMS is very popular with cell phone users, and it has rapidly become a widely-used bridge between the Internet and mobile users.

Hardware needed

When it comes to cellular communications, the bad thing about hardware is that it often costs a lot of money, but the goal of this project is precisely to provide a low-cost alternative, so the expenses associated with it should be quite reasonable. What you'll need in terms of hardware is a Nokia phone and a serial cable to hook it up to your server. I will, of course, expect that you already have a server and that it is capable of running the Gnokii tools and PHP. In my environment, I have used a Nokia 3310, which is quite new but not very expensive, and works perfectly for my needs.

There are no "official" connection cables available for the 3310, but a company from the UK called Cellsavers (http://www.cellsavers.co.uk) has come up with a very ingenious serial cable with a connector that you can fit behind the battery on the phone. For those who don't know, there are 4 metal pins that are probably used by Nokia to install software and perform other programming on the phone, and those nice folks at Cellsavers managed to figure out how to use them to control the phone through a serial port. There might be other companies supplying the same type of product, but I have not seen any around.

Another important note about the hardware is that you will need to get a battery charger for the phone. One often comes with the package, and you can plug it in and leave the phone on forever without having to worry about the batteries.

Installing Gnokii and smsd

Before starting to install Gnokii and smsd, make sure you have MySQL installed and working properly on your server. Installing Gnokii is quite straightforward: it involves little more than the usual configure, make, make install steps. However, there are some configuration options that I find important.

The first might be a matter of taste, but I like to place everything belonging to Gnokii in /usr/local/Gnokii. Therefore, I will use --prefix=/usr/local/Gnokii when invoking configure. Next, the --without-x configuration switch indicates that we will not need to use the xgnokii GUI application to send SMS messages and manage the phone. If you want to take a look at the graphical tools, you can of course skip this parameter, but on a Unix server where you normally do not have Xwindows installed you'll get a whole lot of errors if you do so. The last parameter is --enable-security, which turns on a lot of security-related features in the package, like the ability to change the PIN number. I find them useful, so I usually turn them on.

The resulting configure line will be as follows:

./configure --prefix=/usr/local/Gnokii --without-x --enable-security

10 $CONFIG['keywords_directory'] = './keywords/';
11 $CONFIG['default_email'] = 'eric@persson.tm';
12 $CONFIG['database_username'] = 'root';
13 $CONFIG['database_password'] = '';
14 $CONFIG['database_hostname'] = 'localhost';
15 $CONFIG['database_database'] = 'sms';
16
17
18 /*

37 $keywords = array();
38
39 $dh = opendir($CONFIG['keywords_directory']);
40 if( $dh ){
41 while( $filename = readdir($dh) ){
42 if( ereg('^([a-z0-9_]*).php$', $filename, $match) ){
43 $keywords[$match[1]] = $CONFIG['keywords_directory'] . $filename;
44 echo date('Y-m-d H:i:s') . ': ' . $match[1] . chr(10);

47
48 if( sizeof($keywords) == 0 )
49 return_error('Keyword directory was empty.');

Once you've downloaded the Gnokii tarball from http://www.Gnokii.org (the latest version at the time of this writing is Gnokii-0.5.5), you can decompress it and start the compilation process:

# gzip -dc Gnokii-0.5.5.tar.gz | tar xof -

[global]
port = /dev/ttyS1
model = 3310
initlength = default
connection = serial
bindir = /usr/local/Gnokii/sbin/

Make sure that you have connected your phone to the correct serial port as you specified in the configuration. Also, check the model of your phone and enter it accordingly. The initlength variable controls the number of characters sent to the phone during initialization; you don't normally want to change this setting. Unless you have problems with the connection, I suggest that you use the default value (at least initially).

The connection variable should be set to serial, since we'll be connecting to the phone using the serial port. In case you're wondering, it's possible to configure it to use an infrared connection instead.

Now, it's time to test it all and see if everything works fine. A good starting point here is to try and send out an SMS message using Gnokii:

81
82 include_once($phpfile);
83 if( function_exists($keyword) )
84 $keyword($message, $sender);

mistakes as the one

101 if( strpos($message, ' ') > 0 )
102 $match_part = substr($message, 0, strpos($message, '

108 if( isset($keywords[$match_part]) ){
109 include_once($keywords[$match_part]);
110 if( function_exists($match_part) )
111 $match_part($message, $sender);

130 return sprintf('%1.3fs', (($end_seconds - $start_seconds) + $end_fraction - $start_fraction));
131
132 }
133
134
135 /* Connect to the mysql database */
136 $connection = mysql_connect($CONFIG['database_hostname'], $CONFIG['database_username'], $CONFIG['database_password']);

142 /* Select the database that contains the smsd tables */
143 mysql_select_db($CONFIG['database_database'], $connection) or return_error('Could not select database');

163 match_message($sms['text'], $sms['number']);

Clearly, you will need to replace the xxxxxxxxx above with a real, working phone number on which you can check that the message arrives (you could, in fact, use the same number as the cell phone you're using to send the message). If you don't receive the message, or if you get an error, you may want to step back and look at the configuration and build procedure once again, just to make sure that you haven't missed anything.

The next step consists of configuring smsd so that we can send messages out onto the network programmatically. It's obviously important to have Gnokii working first, since smsd relies on the same runtime configuration libraries. The smsd source code is located in the /smsd/ folder under the directory where you unpacked the Gnokii tarball.

Smsd can work either with a database or with a filesystem but, for the purposes of this article, we will only focus on configuring it to use MySQL. The daemon is not compiled by default when you compile Gnokii, so that will have to be our next step. You will need to manually edit the Makefile and change every instance of the path to the MySQL installation in the DB Modules section. Next, you can build the executables:

# make

# make libmysql.so

# make install

Setting Up the smsd Database

Since we want to use smsd with MySQL, we need to create a database for it to use. For simplicity's sake, we'll call it sms and grant a new MySQL user with login sms and password sms access to it. Naturally, if you move into a production environment where security is a concern, you may want to use a more secure username/password combination. Keep in mind that anyone who can access your sms database can insert rows into the outbox and therefore send messages from the connected phone. On a larger system, the possibility for abuse is certainly there, and therefore security is worth at least some consideration.

In the smsd directory of your tarball, you will also find a SQL file called sms.tables.mysql.sql that contains the table definitions needed to run the daemon. All you need to do is import these into your database and you are all set to go. There is also a file for those who prefer PostgreSQL, but we will focus on MySQL here.

Installing daemontools

The daemontools package is a collection of tools that can be used to monitor and manage UNIX-based services. Its installation procedure is quite straightforward, since there aren't too many options or configuration directives. The only thing to keep in mind is that some differences in newer versions of glibc (2.3.1 and above) may require you to patch the daemontools source before you try to compile it. The patch you need is called the "errno-patch" and fixes an incompatible declaration of the errno variable made in the source. I've seen some people claim that this problem is caused by bad programming practices, but the error really only started popping up when changes were made to glibc, so I'm not too sure as to how true that is. Whatever the real reason, if you encounter this problem, simply patch the source and you'll be just fine. If you need to download the patch, you can get it from http://www.qmail.org/moni.csi.hu/pub/glibc-2.3.1/. Then, follow the daemontools installation instructions, which you can find at http://cr.yp.to/daemontools/install.html.

If you're not familiar with patching software, this is done by downloading the software, extracting it, and then using the patch program to effect the actual changes in the source code. More information about the errno patching process and daemontools can be found at http://www.qmail.org/moni.csi.hu/pub/glibc-2.3.1/INSTRUCTIONS but, generally speaking, you can get away with something like this:

# tar zxvf daemontools-0.76.tar.gz

# cd admin/daemontools-0.76

# patch -p1 < /path/to/daemontools-0.76.errno.patch

On to Some PHP

Our main PHP script will be as small and efficient as possible, since it will be running as a daemon on our server all the time. Its main task will be to check if there are new messages in the SMS inbox table and, if so, match them against the possible keywords that we have created, so that the appropriate action can be taken. We'll call this script smsparse.php.

9 $return_message = 'Hello to you my friend!';
10 /* Output a log message */
"' . $message . '" from "' . $sender . '" with "' . $return_message . '"' . chr(10);
12
13 /* Send the reply to the sender */
14 mysql_query('INSERT INTO outbox SET number="' . $sender . '", text="' . $return_message . '", processed_date="0", insertdate=now(), error=0,

First of all, let's decide how we're going to structure our application. Since our main goal is to respond to certain keywords, we'll start by creating a few "keyword scripts", which are really nothing more than standalone PHP files stored in a subdirectory called keywords. For example, if we wanted to define a keyword called hello, our directory structure would look like this:

./keywords/

./keywords/hello.php

./smsparse.php

As you can see, each keyword has its own PHP file. We simply use the keyword as the filename for the script that contains the actions associated with it in order to simplify the entire process.

Let's now have a look at smsparse.php, which you can see in Listing 1. At the beginning of the script (in the read_keywords function), we read through the contents of the keywords directory. Each file found in the directory is matched against the ereg() pattern on line 41 and, if that operation is successful, a new item with the keyword as a key and the file's path as the value is added to the array that the function returns at the end of its execution.
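The directory scan described above can be sketched as follows. This is a minimal illustration and not the article's listing: it assumes a modern PHP version, so preg_match() stands in for ereg() (which was removed in PHP 7), and the function takes the directory as a parameter instead of reading it from $CONFIG.

```php
<?php
// Sketch only: preg_match() replaces the article's ereg(), and the
// directory is passed in as a parameter (an assumption of this example).
function read_keywords(string $directory): array
{
    $keywords = [];
    $dh = opendir($directory);
    if ($dh === false) {
        return $keywords;
    }
    // Compare with false explicitly so that a file named "0" does not end the loop.
    while (($filename = readdir($dh)) !== false) {
        // Accept only names made of lowercase letters, digits and underscores,
        // followed by a literal ".php"; the captured part becomes the keyword.
        if (preg_match('/^([a-z0-9_]+)\.php$/', $filename, $match)) {
            $keywords[$match[1]] = rtrim($directory, '/') . '/' . $filename;
        }
    }
    closedir($dh);
    return $keywords;
}
```

Calling read_keywords('./keywords') on the directory layout shown earlier would yield an array with a single hello entry pointing at ./keywords/hello.php.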

As you can see, on line 148 we sort the array in a descending fashion based on the length of each key. We do this so that longer keywords are checked for first and we don't end up in a situation where a word like "eat" is matched instead of "Seattle" because it is shorter.
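One way to get this ordering is uksort() with a comparison on key length. The sketch below is an assumption about how such a sort could be written, since line 148 of the listing is not reproduced here; the function name is hypothetical.

```php
<?php
// Sketch only: orders the keyword array so that longer keys come first.
// sort_keywords_by_length() is a hypothetical helper, not from the listing.
function sort_keywords_by_length(array $keywords): array
{
    uksort($keywords, function ($a, $b) {
        // Descending order: the longer key sorts before the shorter one.
        return strlen((string) $b) <=> strlen((string) $a);
    });
    return $keywords;
}
```

With keys eat, seattle and hi, the result lists seattle first, then eat, then hi, so a message about Seattle is tested against the longer keyword before the shorter one.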

The main portion of the application works by executing a loop indefinitely. At every iteration, we check if a new message has arrived in the inbox and, if that is the case, match the message against the active keywords that we have identified at the beginning of the script, and finally sleep for 1 second before the next cycle.
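The shape of that loop can be sketched without a database by passing in two callables. Everything here is illustrative: poll_inbox(), $fetchNew and $handle are hypothetical names, and the optional iteration limit exists only so that the sketch can be exercised in a test (the article's loop runs forever). The text and number fields mirror the $sms['text'] and $sms['number'] keys used in Listing 1.

```php
<?php
// Sketch only: a polling loop in the spirit of the one described above.
// $fetchNew returns an array of new messages, each with 'text' and 'number';
// $handle is called once per message. Both are assumptions of this example.
function poll_inbox(callable $fetchNew, callable $handle, ?int $maxIterations = null, int $sleepSeconds = 1): int
{
    $handled = 0;
    // With $maxIterations left as null the loop never ends, as in the article.
    for ($i = 0; $maxIterations === null || $i < $maxIterations; $i++) {
        foreach ($fetchNew() as $sms) {
            $handle($sms['text'], $sms['number']);
            $handled++;
        }
        // Pause between cycles so the script does not spin on the database.
        sleep($sleepSeconds);
    }
    return $handled;
}
```

In the real script, $fetchNew would query the smsd inbox table for unprocessed rows and $handle would be the keyword matcher.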

For the actual matching process, I have written two alternatives that use different approaches. The first one, match_message(), is the most fault tolerant but also the slowest one. The second one, match_message_fast(), is not as tolerant but will save some CPU resources by using a faster algorithm. The difference probably won't be dramatic, but on a heavily loaded server or with a large list of keywords it may well have an impact on the overall performance of the system.

In match_message() (lines 74-92), the message is first cleaned of unwanted characters such as non-alphanumeric values and spaces, and converted to lowercase. Next, the function cycles through all the keywords and performs an ereg() match against the "clean" version of the message. If a match occurs, the PHP file corresponding to the keyword is included and executed.

The match_message_fast() function, on the other hand, works by taking the first word in the message and converting it to lowercase. The word is then used to perform a search in the keyword array and, if a match is found, the appropriate PHP file is included and executed.
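The difference between the two strategies can be sketched as below. These are not the article's functions: to stay self-contained they return the matched keyword (or null) instead of including a file, the names find_keyword() and find_keyword_fast() are hypothetical, and the tolerant version assumes the keyword may appear anywhere in the cleaned message, which is what the "eat" versus "Seattle" example suggests.

```php
<?php
// Sketch only: tolerant matching. Strips everything except letters and
// digits, lowercases the result, then scans the keywords in array order.
function find_keyword(string $message, array $keywords): ?string
{
    $clean = strtolower(preg_replace('/[^A-Za-z0-9]/', '', $message));
    foreach (array_keys($keywords) as $keyword) {
        // Assumption: a keyword matches if it occurs anywhere in the message.
        if (strpos($clean, (string) $keyword) !== false) {
            return (string) $keyword;
        }
    }
    return null;
}

// Sketch only: fast matching. Takes the first word, lowercases it and
// performs a single array lookup instead of looping over every keyword.
function find_keyword_fast(string $message, array $keywords): ?string
{
    $space = strpos($message, ' ');
    $first = strtolower($space === false ? $message : substr($message, 0, $space));
    return isset($keywords[$first]) ? $first : null;
}
```

The tolerant version copes with stray punctuation or leading spaces at the cost of one substring search per keyword; the fast version does constant work per message but misses a keyword that is preceded by any other character.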

Writing Keyword Scripts

Since keyword scripts are an idea I came up with specifically for this article, it's probably a good idea to discuss them a little. Essentially, a keyword script simply contains code that determines what happens when a keyword is matched. To make it possible for multiple scripts to coexist, the actual functionality is stored in a function that has the same name as the keyword that corresponds to a particular script.

Let's assume, for example, that we want to match the word "hello" at the beginning of a message and reply with an SMS of our own. In this case, we'd have to write a PHP script, called hello.php, similar to the one shown in Listing 2. As you can see, the file contains a function, called hello(), that accepts the incoming message and the sender's phone number as arguments.

Sending a reply to the sender through SMS is a simple process: all we need to do is add a row to the outbox table of the sms database. The SMS daemon will periodically poll the database for new outgoing messages and send them automatically.
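A keyword script along these lines might look like the sketch below on a current PHP version. It departs from Listing 2 in ways that should be read as assumptions: the mysql_* functions used in the article no longer exist in PHP 7 and later, so PDO with a prepared statement is used instead; the connection is passed in as a third argument, which the article's hello($message, $sender) signature does not have; and only the number and text columns are filled in, on the assumption that the remaining outbox columns have suitable defaults.

```php
<?php
// Sketch only: a keyword handler that queues a reply in the outbox table.
// The $db parameter and the reduced column list are assumptions of this
// example; the article's version uses mysql_query() and a global connection.
function hello(string $message, string $sender, PDO $db): void
{
    $return_message = 'Hello to you my friend!';

    // A prepared statement keeps the sender's number and the reply text
    // out of the SQL string itself.
    $stmt = $db->prepare('INSERT INTO outbox (number, text) VALUES (?, ?)');
    $stmt->execute([$sender, $return_message]);
}
```

Binding the values matters here because the sender's number and, in other handlers, parts of the incoming message come from outside the server, and anyone who can get a row into the outbox can make the phone send a message.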

Trang 15

Your Own PHP Daemon: Using daemontools

The last step in our quest consists of setting up our PHP script to run as a daemon. You could, in theory, simply run the script and detach it from the console, but if you're running a proper server, a more robust configuration is required, and this is where the daemontools package comes into play.

The configuration of daemontools is a bit complicated compared to the other packages we have seen in this article because it involves a relatively large number of files and directories. However, once one realizes that there is method to the madness, it's not quite so bad. Given the amount of space allotted for this article, I will leave it up to you to get daemontools up and running; the documentation is very clear and there are plenty of resources for this purpose on the Net.

When daemontools is installed it creates a directory called /service. This will contain information on all the various services that daemontools is running; a program called supervise monitors the /service directory and takes care of starting and keeping the services running as needed. Compared to "normal daemons", which are started at boot time, daemontools services are started by supervise and, if any of them is killed or dies unexpectedly, supervise itself takes care of restarting them automatically.

Therefore, daemontools is an excellent solution if you want your services to be running all the time and be monitored for failures of any kind. However, not all services are suitable to run with this package: they have to behave in a certain manner that makes it possible for supervise to interact with them in an automated fashion.

Luckily, most applications can be modified so that they can be compatible with supervise, and our smsparser script is no exception. First of all, we must ensure that the script can be run without having to explicitly invoke the PHP interpreter. Under a UNIX shell, this is done by introducing a "shebang", that is, a special command at the beginning of the file that tells the shell interpreter which application the script should be piped through in order for it to be executed.

Let's start by figuring out where PHP is installed:

# whereis php

On my machine, a RedHat 8 server, the command outputs the following:

php: /usr/local/bin/php /usr/local/lib/php /usr/local/lib/php.ini

This means that I have the PHP interpreter's binary installed in /usr/local/bin/php.

It's now time to create a service directory for our service. We'll start by creating a "service" directory for smsparse in the /usr/local/smsparse/ directory, where I will assume that you have stored the smsparse.php script and its underlying directory structure with all the keyword scripts. We will call the directory supervise-smsparse:

That's it! If we now create a symlink from the /service directory to our newly created folder, supervise will automatically take care of starting and monitoring our server:

er, we will create a subdirectory to house the execution files for smsd:

# mkdir -p /usr/local/Gnokii/supervise-smsd/

Next, we'll write a new run file:

#!/bin/sh
exec /usr/local/Gnokii/bin/smsd -u sms -p sms -d sms -m mysql

Finally, to start the smsd run file, we link the supervise-smsd directory into /service with:

# ln -s /usr/local/Gnokii/supervise-smsd/ /service/supervise-smsd/

If you now check your process list, you should see your smsparse and smsd processes listed (that is, if you have done everything right):

we can diagnose any problems properly should anything go wrong. As part of the daemontools package, you will find a small program, called multilog, that is capable of logging the output of a service directly to a set of automatically-rotated logfiles. This means that, if we set up our service settings properly, we won't even need to write any special code for the purpose of creating activity logs!

To enable the logging functionality, start out by creating a log directory in supervise-smsparse:

# mkdir -p /usr/local/smsparse/supervise-smsparse/log

The logging process acts much like a normal process running under supervise. It needs its own directory and run file; therefore, we need to create a special run file at /usr/local/smsparse/supervise-smsparse/log/run that contains the following commands:

#!/bin/sh
exec multilog t ./main

Multilog supports a wide range of arguments, which, in turn, make it possible to create very complex logging rules. Our command line above, however, is quite simple and really just means "add a timestamp on each line, and store the logfiles in ./main". The t argument adds a timestamp that represents the number of Temps Atomique International (TAI) seconds since 1970-01-01 00:00:10 TAI. As you might remember from Listing 1, we prepend a date('Y-m-d H:i:s') string before each line is output and, therefore, we will actually have double timestamps in the log file (naturally, you can modify the script to omit its timestamp, or change the multilog invocation to do the same).

We don't need to link the log directory directly from /service. The supervise program will execute the run file it contains automatically for us. However, you must restart supervise to make it aware of the new log directory. You can, once again, use the svc program to send a TERM signal to the service:

# svc -t /service/supervise-smsparse/

A new look at the process list (see Figure 1) will show you that smsparse has been started again, together with the logging process. This means that our services are now managed by supervise and will run indefinitely, all the while providing us with a nice logfile, which we can monitor by using the tail utility:

This example shows that smsparser was started correctly, and 2 keywords were found, hello and success. As you can see, the TAI timestamp at the beginning of each line is a bit cryptic, but it can be translated into a human-readable form by piping the tail output through tai64nlocal like this:

# tail /service/supervise-smsparse/log/main/current | tai64nlocal
2004-01-07 15:57:27.380601500 2004-01-07 15:55:46: Starting sms parser
2004-01-07 15:57:27.380605500 2004-01-07 15:55:46: hello
2004-01-07 15:57:27.380607500 2004-01-07 15:55:46: success
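If you would rather decode the raw timestamps from within PHP, the conversion can be sketched as below. This is a rough illustration, not part of daemontools: it assumes a 64-bit PHP build, ignores leap seconds, formats the result in UTC rather than local time, and the function names are hypothetical.

```php
<?php
// A TAI64N label is "@", 16 hex digits of seconds and 8 hex digits of
// nanoseconds. The seconds are offset by 2^62, plus the 10 seconds that
// separate the label's reference point from the Unix epoch.
const TAI64_BASE = 4611686018427387904; // 2^62, needs 64-bit integers
const TAI_UNIX_OFFSET = 10;

// Return [unix seconds, nanoseconds] for a label such as "@4000...".
function tai64n_decode(string $label): array
{
    if (!preg_match('/^@([0-9a-f]{16})([0-9a-f]{8})$/', $label, $m)) {
        throw new InvalidArgumentException("Not a TAI64N label: $label");
    }
    $seconds = hexdec($m[1]) - TAI64_BASE - TAI_UNIX_OFFSET;
    $nanos   = hexdec($m[2]);
    return [$seconds, $nanos];
}

// Format a label in the same layout as the output above, but in UTC.
function tai64n_format(string $label): string
{
    [$seconds, $nanos] = tai64n_decode($label);
    return gmdate('Y-m-d H:i:s', $seconds) . sprintf('.%09d', $nanos);
}
```

For day-to-day use, piping through tai64nlocal remains the simpler option; a helper like this is mainly useful if a PHP script needs to read the log files itself.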

Conclusion

The easiest way to test your new Gnokii setup is to grab another cell phone and send an SMS message containing the word "Hello" to your Gnokii phone. If all goes well, smsparse will pick it up and reply back with the message we entered in the hello keyword script.

As you have probably realized by now, it's not that hard to set up a mobile service through which you can exchange information with your users by utilizing SMS. Even if you're not in the business of running SMS gateways, you could use it for a variety of other activities. For example, you can use it to provide "fun" services, like interactive voting, or a useful server monitoring interface for your internal network. The list of possibilities is very long, and my clients have shown great interest in using SMS as a complement to other services.

If you're worried about scalability, this solution may not be for you, as it will have trouble handling a very large number of messages on a daily basis. However, it is so inexpensive that it could well be a good starting point for a more serious implementation. The good news is that you'll be able to stay with Gnokii even if your needs grow, as newer versions of the package are slated to support multiple phones.

To Discuss this article:

http://forums.phparch.com/126

When Eric's not out skiing or hiking, he's working as a freelance developer on various projects. His current focus is finishing his education in open-air alpine environments.


Welcome to the world of PHP-GTK. Why introduce GTK to a largely web-based language? Well, convenience and portability come to mind, for example. Sometimes it's not feasible to write a Java Swing interface when you've invested so much time in your PHP classes, as you need to rewrite large portions of code. While it could be done, you'd have to fork your code in two projects, and use two different languages. That's not something you can easily convince many clients to do.

Content management, a very common task for most websites these days, represents a typical example of an activity that is often performed directly through the web but that could really be best served by a "true" GUI-based client application. In most circumstances, creating a separate application is an expensive proposition, due to the duplication of code involved, the additional expertise needed and the difficulty of using a language that will run properly on a wide variety of platforms. In this article, we'll tackle porting an existing HTML-based news manager to PHP-GTK, and you'll see how easy it is to make the jump from Web to GUI with this powerful, if often neglected, platform.

In creating our project, we'll start with a data abstraction layer and a traditional HTML interface that we'll ditch later on. This article gets a little complex, so as a prerequisite please install PHP-GTK, and create a table in MySQL with the schema shown in Listing 1. An SQL dump with a few sample rows of data can be found in the files for this article; it's always great to have some sample data to work with.

The Data Abstraction Layer

As a general rule, I create a data abstraction layer for every complex project I work on. Some people swear by this approach, others swear at it. My personal praise goes to abstraction layers because I can do things like automatically change the modified date of a record without remembering to do it in each instance of SQL code. An abstraction layer can also validate data and check the credentials of the person trying to perform changes in a multi-user situation.

Consider the code in Listing 2, which represents a simple data abstraction for a news item. Once you have this example up and running, you can test creating a row in the database with the code from Listing 3. As you can see, once the abstraction layer is established, we don't even have to worry about embedding SQL statements in our code.
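The benefit described above, maintaining the modified date in one place, can be illustrated with a stripped-down sketch. It is not Listing 2: the class name NewsSketch, the in-memory array standing in for the MySQL table, and the explicit $now argument (used here so that the behavior is predictable) are all assumptions.

```php
<?php
// Sketch of an abstraction layer: every write goes through set_property(),
// so the modified timestamp is maintained in exactly one place.
class NewsSketch
{
    private array $row;

    public function __construct(int $id, int $now)
    {
        // A real layer would INSERT or SELECT here; we just keep an array.
        $this->row = ['id' => $id, 'created' => $now, 'modified' => $now];
    }

    public function set_property(string $property, $value, int $now): void
    {
        // Central validation hook: reject fields the record does not have.
        $allowed = ['author', 'subject', 'story', 'visible'];
        if (!in_array($property, $allowed, true)) {
            throw new InvalidArgumentException("Unknown property: $property");
        }
        $this->row[$property] = $value;
        $this->row['modified'] = $now; // updated automatically on every write
    }

    public function get_property(string $property)
    {
        return $this->row[$property] ?? null;
    }
}

$news = new NewsSketch(1, 1000);
$news->set_property('author', 'Morgan Tocker', 1060);
```

Because callers never touch the storage directly, swapping the array for real queries later would not change any of the code that uses the class.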

Offline Content Management with PHP-GTK

by Morgan Tocker

REQUIREMENTS
PHP: 4.1+ (4.3 or greater recommended)
OS: Windows, Linux
Applications: PHP-GTK, MySQL
Code: http://code.phparch.com/20/4
Code Directory: gtk-cms

Over the years, I have had the opportunity to work on a few content management systems for websites of varying complexity. While each CMS is a little different from the others, I can't help but think that sometimes I find myself performing the same hacks and workarounds over and over just to get around the limitations of HTML. The desired output of the majority of our PHP work must be web based, but management of the content doesn't have to be.

+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| id       | int(11)      |      | PRI | NULL    | auto_increment |
| author   | varchar(64)  |      | MUL |         |                |
| story    | text         |      |     |         |                |
| created  | int(10)      | YES  |     | NULL    |                |
| modified | int(10)      | YES  |     | NULL    |                |
| subject  | varchar(255) | YES  |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+

Listing 1


An HTML-based News Manager

Listings 4 through 6 provide the basis for a very simple news management system based entirely on the web. Listing 4 (index.php) is the home page of the system, which creates a list of all the news available in the database. Listing 5 (edit.php) provides the necessary interface for editing the news items and Listing 6 (save.php) takes care of saving our changes to the database.

Although this example works well, there are a few problems with it. First of all, we have no data integrity. For example, the author "Morgan Tocker" is probably the same as the author "Morgan J Tocker" and "M Tocker". But if I wanted to compile a list of authors (SELECT distinct(author) FROM news WHERE visible = '1';), it might well contain each of the three individual authors that were just mentioned, since we are allowing each user to enter his or her name every time a news item is created or edited.

Another problem is the handling of whitespace in the author's name. 'This ' does not equal 'this' and 'this ' does not equal ' this'. Got it? Don't laugh, it happens. In an eternal struggle to keep data clean, we can use trim() to zap off the unwanted whitespace, or use an HTML <select> to solve the typos in our first example. This would work, but it comes with another limitation: we couldn't easily add more authors to the list. You could add a field called "other author", or write a bit of JavaScript with an item called "Other" on the list, whereby an onchange() event would prompt the user for the name of the new author, and then recreate the list dynamically.
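The trim() clean-up mentioned above can be sketched as follows. The helper names are hypothetical, and collapsing internal runs of whitespace goes slightly beyond what trim() alone does; note that this only addresses stray whitespace, not variants such as "Morgan J Tocker" versus "M Tocker".

```php
<?php
// Normalize an author name before storing or comparing it:
// trim the ends and collapse internal runs of whitespace.
function normalize_author(string $name): string
{
    return preg_replace('/\s+/', ' ', trim($name));
}

// Build a list of distinct authors from raw, possibly messy input.
function distinct_authors(array $rawNames): array
{
    $clean = array_map('normalize_author', $rawNames);
    return array_values(array_unique($clean));
}
```

A list built this way could feed either an HTML select box or, as discussed next, a combo box.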

What I'd actually like to see here, however, is a combo field. A combo box is neither a textfield nor a select box; it's actually both of them at the same time, and it's a blessing (or a curse if you prefer) to all modern operating systems that someone left it out of the HTML 4.0 specification.

var $id;       // primary key auto_increment
var $author;   // author
var $subject;  // subject of the news article
var $created;  // date the article was published
var $modified; // modified date
var $story;    // body of the news
var $visible;  // bool ? is the record visible
...
foreach (mysql_fetch_array($result) as $field => $value)
    $this->$field = $value;
...
} else {
...
mysql_query("INSERT into article (created, modified) VALUES (UNIX_TIMESTAMP(), UNIX_TIMESTAMP())");
...
mysql_query("UPDATE article SET $property = '$value', modified = UNIX_TIMESTAMP() WHERE id = '" . $this->id . "'");
$this->$property = $value;

$news->set_property('author', 'Morgan Tocker');
$news->set_property('subject', 'An article by Morgan');
$news->set_property('visible', '1');
$news->set_property('story', 'This is the body of my message');

Getting Your Feet Wet With GTK

Since the kind of functionality that we want cannot be provided by a web browser (at least not without a massive amount of custom work), we'll have to turn elsewhere, and that's where PHP-GTK comes into play. Our PHP-GTK application actually provides a "true" GUI to our news management system, and works on a different machine from that of the webserver.

The core of the application is shown in Listing 7. As you can see, the PHP-GTK version of the news manager is a bit more complex than the plain-HTML one, although the length of the script is quite deceptive, since the functionality of the three scripts that made up the previous application has now been incorporated into a single one.

At the core, however, the application is extremely simple. Essentially, we create a set of GTK objects, and connect them to various handlers, which, in turn, are automatically called by the system when a specific event takes place, such as, for example, the user clicking on a button. Figure 1 shows you the application running on a Linux system.

The PHP-GTK application requires a copy of data.php, which was our Listing 2, so, if you update your class library, be sure to copy it over to your PHP-GTK application. Naturally, this is a great aspect of writing all your applications with the same language, since you're able to happily recycle your code as many times as you want, and you can run it on a variety of platforms.

There is a configuration option in our data.php which chooses the MySQL server to connect to. In the web server's case, it's probably localhost. In the case of the PHP-GTK application, however, you will probably be connecting to the database remotely and, therefore, you should enter the IP or hostname of your server.

Now that the application is running, notice how the combo box used for the author's name makes the application easier to use. Rather than having to build additional pages or cumbersome JavaScript-based solutions, we can rely on the combo box to allow the user to either choose an existing author or create a new one through a single control.

Figure 1

Remembering Data

I'm an Apple Cocoa programmer, and Cocoa applications feature a concept called "defaults". A default is basically the PHP equivalent to a session that never expires. It's a variable that you can set, and it will remain available to you indefinitely, even if you shut down the application and launch it again.

Defaults can be really handy for settings and preferences, although they are not quite as easy to implement in a PHP-GTK application as they are in Cocoa. Luckily, I've written a PHP script to store this data, so you won't have to. It creates a file called $SCRIPT_NAME.session, where it stores default information. When you first install (or execute) the application, be sure to create this file in advance with the proper permissions, so that no error will be output even if the user under which the script is running does not have write access to the folder where the defaults file resides.

To tap into the features of defaults, you'll need to add the following line to the beginning of your file:

<?php
include_once 'session.php';
?>
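The idea behind such a script can be sketched without the session machinery. The following is only an illustration of persisting an array between launches, not the contents of session.php: the function names, the JSON encoding and the temporary file location are assumptions.

```php
<?php
// Load defaults from a file at startup; a missing, unreadable or
// malformed file simply yields an empty set of defaults.
function defaults_load(string $file): array
{
    if (!is_readable($file)) {
        return [];
    }
    $data = json_decode(file_get_contents($file), true);
    return is_array($data) ? $data : [];
}

// Write the defaults back, for example when the application shuts down.
function defaults_save(string $file, array $defaults): bool
{
    return file_put_contents($file, json_encode($defaults)) !== false;
}

$file = sys_get_temp_dir() . '/example.session'; // hypothetical location
$defaults = defaults_load($file);
$defaults['last_author'] = 'Morgan Tocker';
defaults_save($file, $defaults);
```

On the next launch, defaults_load() returns the same array, which is the "session that never expires" behavior described above.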

Creating a default is the same as creating a session. The GTK application can store data in the $_SESSION super global, and the same data will be available on relaunch. The following is an example:

Making the GTK-APP work offline

Now that we have a GUI-based application that doesn't require a browser and a web server to run, the next step would be to make it independent of the database as well, so that you can use it as a completely "offline" application that can be run even when no connectivity

<h2>Edit record <?php echo $news->id ?></h2>
<form method="POST" action="save.php">
<INPUT type="hidden" name='id' value='<?php echo $news->id ?>'>

header("Location: index.php");
?>

Listing 6

We're 90% there already. All we really have to do is build a proper system of caching and check to make sure no changes have occurred since our last update. There are two generally accepted ways of performing this last operation:

• Checking if the data has changed from the data we grabbed
• Checking to see if the timestamp or the last-modified date is more recent than the timestamp from when we grabbed the record

For our application, I am going to select the second of these choices, given that it's easier to compare timestamps than it is to compare content, particularly if there's a lot of it. However, keep in mind that timestamps are always going to be based on the local machine's clock and, without the database acting as a broker to determine absolute time, it's possible that your content will de-synchronize, thus causing unwanted inconsistencies. Here's how we'll be performing our up-to-date checks:

<?php
$database_copy = new news($id, true);
if ($news->modified <= $database_copy->modified) {
    // Provide a warning - our copy is out of date
} else {
    // you may update safely
}
?>

Caching Content

Since we cannot store the information in the database, we need a means to cache our information until we can synchronize it. Given that they provide a persistent offline storage mechanism, defaults seem to be the perfect choice here.

We are going to cache each of the objects for later retrieval by adding an update_cache() method to our data.php class, which you can see in Listing 9. For example, to check if we have a cache for record ID 6, we can see if it's an object:

<?php
if (is_object($_SESSION['record']['6'])) {
    // we have cache for 6.
}
?>

To make the synchronization process faster, we could also only accept cached data that is less than 72 hours old as good without making the roundtrip to the database to check whether it has changed.

<?php
if (is_object($_SESSION['record']['6']) &&
    (time() < $_SESSION['record']['6']->modified + (3600*72))) {
    // we have recent cache for 6
}
?>

In this case, however, you really want to make sure that your time is properly synchronized with the MySQL server; you may choose to get your current time by executing a SELECT UNIX_TIMESTAMP() on the database server.
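One way to keep the clock question explicit is to pass the current time into the freshness check instead of calling time() inside it. The sketch below assumes a hypothetical helper, has_recent_cache(), and a plain array of cached objects in place of the defaults store; the value for $now could equally come from the database server.

```php
<?php
const MAX_CACHE_AGE = 3600 * 72; // 72 hours, in seconds

// $cache maps record IDs to cached objects carrying a ->modified timestamp.
// $now should come from the same clock that stamps the records.
function has_recent_cache(array $cache, $id, int $now): bool
{
    return isset($cache[$id])
        && is_object($cache[$id])
        && $now < $cache[$id]->modified + MAX_CACHE_AGE;
}

// Example data: one cached record, last modified at timestamp 1000000.
$record = new stdClass();
$record->modified = 1000000;
$cache = ['6' => $record];
```

Using isset() before is_object() also avoids a notice when no cache entry exists for the requested ID.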

Before we write the data back to the database, we will have to check to see that no changes have occurred while the application was working offline. If there were changes, we will need to display a proper warning, for example by showing a dialogue box.

session_set_save_handler('session_open', 'session_close', 'session_read', 'session_write', 'session_destroy',

$tmp = new news($id);
$_SESSION['record'][$id] = $tmp;

Where to go from here

In order for the application to be more versatile, you may want to integrate it with the equivalent of an "Outbox", where changes to content are written to, but no updates take place straight away. The outbox will just be another array of records saved in your defaults, very similar to a cache but organized in a different way that makes it easier to catch and revise updates before they take place.

A good news management system could work similarly to the way most mail clients work, with the potential to work both online and offline depending on whether a connection to the database is available.

Once this mechanism is in place, you can take advantage of the application's layout to add more functionality, such as workflow management. For example, if your environment calls for the approval of news items before they are published, you could manage the entire flow of operations through a series of "drop boxes" where each item is deposited by users with the proper credentials.

Another possible improvement would be to include the possibility of marking certain changes or new news items as "drafts", so that you can save them (without publishing them on to the database) and work on them later.

Finally, the editing method is very basic and would be much more effective, particularly for non-technical users, if it were based on a more advanced interface. Interestingly enough, PHP-GTK also supports Scintilla, a very advanced open-source component that plugs into GTK to provide extended editing capabilities (once you download it from http://www.scintilla.org/, you can compile it into your version of PHP-GTK with ./configure --enable-scintilla --enable-gtkhtm). By working a Scintilla component into your system, you could make the editing process much easier for your users.

Tips for Writing Applications with PHP-GTK

To Discuss this article:

http://forums.phparch.com/127

Morgan Tocker is a freelance developer living and working in Brisbane, Australia. His consultancy business, www.icedotblue.com, is responsible for all sorts of php hacks.

Error Checking

The lifespan of your typical PHP-GTK application is usually longer than that of its web-based counterparts. It will have to keep running for several hours, with functions being called over and over again. For a GTK application, you may find that you will want to manage your error handling, and check the integrity of your variables frequently. While you should be doing this with web-based applications, too, there is less of an opportunity for laziness in GTK.

For example, I had a problem with an earlier version of PHP-GTK where the incorrect data seemed to be returned intermittently, and my application crashed and burned. In going through it with a fine-tooth comb I checked the integrity of data at a few points and, if it didn't return the expected results, I either tried again or produced a 'nicer' error.

In summary, it's a good idea to check that an item is still an array/object/integer (or whatever it was supposed to be) and that it is not empty/null. Personally, I look forward to the release of PHP 5 and exception handling, when GTK & PHP can be taken to the next level and it will become easier to tackle these issues.

Portability, Recycling, and Reusing

Another good idea is to try and store the important parts of your code nested in function calls, as opposed to using the traditional linear approach. Keeping in mind the way callbacks work, you will find it easier to work with both a web-based and a GTK version of the same application if they both use OOP techniques. Finally, try to separate your code from your desired output, so that you can create a file like data.php and share it between the two without the need to branch your code.


In the last issue, we talked a little about the Zend Engine internals and how they relate to writing an extension, about how to create an extension skeleton using the ext_skel tool, how to write extension functions and access their parameters (using a scanf() style function), how to return simple types (like strings and integers) and how to build up a PHP array. We covered a fair amount of ground, but there are still plenty more things to learn about PHP extension writing.

In this issue, we're going to look at arrays again and see how it is possible to build multi-dimensional arrays and how to traverse the elements of, or look up a particular value from, an array.

Multi-Dimensional Arrays

As we saw last time, PHP arrays are implemented using hash-tables. This approach allows indexing the array using a string or integer key to fetch its values. Since a hash-table is not a native C type, fetching its values is not quite as simple as with native C arrays. On top of that, the Zend Engine has no built-in support for multi-dimensional arrays; they are simply implemented by storing another array in the appropriate slot of the hash-table. This can be a difficult or daunting prospect for the budding extension author, especially considering the state of the internals documentation, even though it is actually quite simple to implement.

For our first example, let's create a two dimensional array where the first dimension contains a list of first names and the second dimension a list of surnames. If you're not sure what I mean, Listing 1 contains the PHP script equivalent for the C code in Listing 2. The contents of Listing 1 should be self-explanatory, so let's take a look through Listing 2 now, line by line.

Lines 1 through 5 declare a C-style 2D array. The two sets of square brackets tell the compiler that it has two dimensions; the first dimension has 3 slots, while the second dimension has 2 slots. These correspond to the 3 sets of first and last names that we are going to use to initialize our PHP array. Lines 7 and 8 are comments describing the prototype for the function. Hopefully you will recall that these comments, although they have no effect on the code itself, are an important coding convention that helps to remind you how the function is intended to be used. Line 9 uses the PHP_FUNCTION macro to declare the actual PHP function.

Lines 11 and 12 declare some temporary variables: i will represent the person whose name we are adding, and j will indicate if we are looking at their first or last name. The tmparray variable, as its name implies, will act as temporary storage for the array we create for each person. Line 14 initializes the PHP function's return_value as an array, and then we begin a loop on line 16 which will step through each person in our names array, using the variable i as the counter. For each person, we allocate a PHP variable using the MAKE_STD_ZVAL() macro, we set it up as an array (lines 17 and 18), and then we step through each of their names and add them as string elements to our temporary array (lines 19 to 21). Having prepared our "person" array, we need to add it to our "people" array, the return value for the function (line 24).

REQUIREMENTS
Code: http://code.phparch.com/20/2
Code Directory: extensions

As we saw last time, writing PHP extensions in C isn't quite as difficult as you might think. In this issue, we're going to dive into the hash API and use it to traverse arrays and fetch values from them.

The code should be fairly simple to follow, although you might be wondering about two things in particular. The first thing you might ask is whether you should (or should not) worry about freeing the temporary array value. In this case you should not free it: we "gave" it to the Zend Engine when we used add_next_index_zval(), and the engine will take care of freeing it at the appropriate time. If we were to free it ourselves, we would cause a crash some time later in the script that would be difficult to track down.

The other question you might be asking is whether we need to return something from the function. The answer is no: the C function prototype is declared as a void function, so it has nothing to return in the usual sense. Instead, PHP passes us a return_value variable that we populate; it is this variable that will be passed back into your PHP script when the function returns. Since the first thing we are doing is setting up the return_value, we don't need to do anything special after the loops that populate it and, therefore, we simply "fall out" of the bottom of the function.

As you can see, building a multi-dimensional array is not that hard. Although my example is quite succinct, the same principle can be used to build PHP arrays with any number of dimensions: you simply create a new intermediate array to hold the contents of the dimension you want to add, and then add it. You're not limited to strings for the values either; you can use any valid zval value (integers, real numbers, strings, resources and boolean values, or even resources if you want to).
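The same "build an intermediate array, then add it" pattern can be mirrored in plain PHP, which may help when reading the C version. This sketch is not Listing 1 or Listing 2: the function name build_people() is hypothetical, and the variable names are chosen only to echo the roles of tmparray and return_value.

```php
<?php
// Mirror of the C loop in plain PHP: build one temporary array per person,
// then append it to the outer "people" array.
function build_people(array $names): array
{
    $people = [];                   // plays the role of return_value
    foreach ($names as $person) {
        $tmparray = [];             // fresh intermediate array
        foreach ($person as $name) {
            $tmparray[] = $name;    // append each name as a string element
        }
        $people[] = $tmparray;      // hand the inner array to the outer one
    }
    return $people;
}

$people = build_people([
    ['Rasmus', 'Lerdorf'],
    ['Zeev', 'Suraski'],
    ['Andi', 'Gutmans'],
]);
```

Adding a third dimension would follow the same shape: one more intermediate array, built inside the inner loop and appended to its parent.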

Now that you have mastered returning multi-dimension arrays, how about looking at working with multidimensional arrays that have been passed into your function?

Getting Stuff Out of Arrays

There are two things that you will typically want to do with an array that has been passed to your function: either you want to look up a specific keyed value and do something with it, or you want to step through all values and do something with each of them. We'll deal with the first of these now.

So far, we've used some really convenient macros to add items to arrays; these macros insulate us from the not-so-pretty guts of the hash table implementation. However, we've now reached a point where we must step beyond these macros, because there are no macros for fetching an item from an array.

Before we delve in, it's worth thinking for a minute about how you use arrays in your PHP scripts. Imagine that you have a PHP script that accepts a couple of $_GET parameters, name and age, and displays them on some kind of e-card. Let's also pretend that the age parameter is optional: the e-card will happily display something good regardless of whether the age parameter is passed or not. PHP (being the nice flexible thing that it is), will allow you to access the age parameter using $_GET['age'] syntax, even if it is not there (the value returned to your script will be NULL in that case and, at worst, the interpreter will print out a warning message to indicate that the element does not exist). If you are slightly more strict with your code, you might first want to check that the age value is present by using isset() and then take a different course of action.
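In script form, the stricter approach might look like the sketch below. The function name ecard_text() and the wording of the card are invented for illustration, and the parameters are passed in as an array rather than read from $_GET so that the example is self-contained.

```php
<?php
// Build the e-card text; "age" is optional and only used when present.
function ecard_text(array $params): string
{
    if (!isset($params['name'])) {
        return 'Hello!';            // nothing to personalize
    }
    $text = 'Hello, ' . $params['name'] . '!';
    if (isset($params['age'])) {
        // Take the "age supplied" branch only when the key really exists.
        $text .= ' (age ' . (int) $params['age'] . ')';
    }
    return $text;
}
```

The extension code discussed next has to make the same kind of presence check, only through the hash API instead of isset().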

This is a simple validation of input parameters and, while PHP allows you to be a lazy script coder, it doesn't allow you to be a lazy extension author: you must check if an element is present before you access it, since the NULL you get back from the hash API is the kind that causes a crash if you don't handle it properly. With that in mind, take a look at Listing 3, which represents our hypothetical e-card generating function. The idea is that you pass an array of values to the function, and it will pull out the name and age.

Lines 1 to 3 are the usual prototype comments and the PHP_FUNCTION declaration. Next, on line 5, we declare a variable to point to the array passed in as the parameter to the function. This is the same as the way that we declared the temporary array variable in the last example. Line 6 declares two variables to hold the name and age values; they are declared as zval ** because the hash table stores zval * and returns a pointer to its storage address. This allows you to modify the stored value if you wish, but you don't want to do something like that unless you are really confident in your abilities; in my experience, it's better to just stick to using the main API functions.

[Listing residue from the previous example: array("Rasmus", "Lerdorf"), array("Zeev", "Suraski"), array("Andi", "Gutmans"); /* {{{ proto array phpa_2d_array() Returns a 2d array of names */]

The next thing is fetching the array parameter using zend_parse_parameters(). The "a" format code indicates that we want an array value; we are storing it into the variable named array. If the user doesn't supply a single array as the parameter, an appropriate warning message is displayed and our function will return a NULL value (remember that the default return value is NULL, so we don't need to do anything special to get a NULL value here).

Now we're in new territory: zend_hash_find() is the function to use to look up a value by string key. It accepts four parameters: the first is a pointer to a hash table, the second is a pointer to the key string, the third is the length of the key, including the NUL terminator, and the fourth is a pointer to a zval ** that will receive the value if it exists. You can get at the hash table contained in a zval using the Z_ARRVAL_P() macro. Before you use it, you must make sure that the zval really does reference an array value, otherwise you will get garbage results and most likely a crash. In this case, zend_parse_parameters() has already performed the check for us (we told it we wanted an array), so we don't need to do anything further.

The zend_hash_find() function returns SUCCESS if the element exists or FAILURE if it does not. Beware: the values of SUCCESS and FAILURE are such that you must always explicitly compare against the value you want to check; do not assume that SUCCESS will evaluate to TRUE or that FAILURE will evaluate to FALSE. Another potential gotcha is the length of the string key: it must include the NUL terminator for the string. The convention used within PHP is to use the sizeof() operator when you are passing a string that you know at compile time, since sizeof() on a constant string resolves to the string length plus one for the terminator; it's handled at compile time and saves your CPU a few cycles when you call the function from your script. However, if you don't know the string at compile time (perhaps it was passed as a parameter to your function too) you should not use the sizeof() operator, because it will resolve to the size of a string pointer, not the size of the string itself. So, at runtime, you need to call strlen() and add one to the result to arrive at the correct length for a key.
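The two gotchas above can be tried out without the PHP headers. What follows is a minimal sketch, not code from the article's listings: SUCCESS and FAILURE are redefined locally with the values the Zend headers give them (0 and -1), while LITERAL_KEY_LEN, runtime_key_len() and toy_find() are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrored from the Zend headers for illustration: SUCCESS is 0 and
 * FAILURE is -1, so SUCCESS is "false" and FAILURE is "true" in C. */
#define SUCCESS 0
#define FAILURE -1

/* Key length for a string literal known at compile time, NUL included.
 * Only valid for literals and arrays; on a char * it would yield the
 * size of the pointer instead. */
#define LITERAL_KEY_LEN(s) (sizeof(s))

/* Key length for a string only known at runtime, NUL included. */
size_t runtime_key_len(const char *key)
{
    assert(key != NULL);
    return strlen(key) + 1;
}

/* Hypothetical stand-in for a hash lookup: it "finds" only the key
 * "name", and expects the length to include the NUL terminator. */
int toy_find(const char *key, size_t key_len)
{
    assert(key != NULL);
    if (key_len == sizeof("name") && memcmp(key, "name", key_len) == 0) {
        return SUCCESS;
    }
    return FAILURE;
}
```

With these definitions, `if (toy_find("name", LITERAL_KEY_LEN("name")))` would take the branch only when the lookup fails, which is why the comparison has to be written as `== SUCCESS` or `== FAILURE`.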

You might be wondering about the (void **) cast on the last parameter to zend_hash_find(); it is just there to keep the compiler from issuing an incorrect warning. Remember that this function wants to return a pointer to its storage for the element? In C, a generic pointer to something has the type void *, and when you want to return a value by reference in C, you add an extra asterisk, so the type becomes void **. Since we are dealing with data that is already a pointer, we have an extra level of indirection that makes our fourth parameter appear to be a void *** equivalent; this causes the compiler to issue a warning because it looks like we might have made a mistake. In this case we are safe, so we use the cast to hide the warning. Be very careful though: it is still very easy to make mistakes when dealing with all these pointers, even if you are an experienced C coder.

Back to the listing then: we have now managed to fetch the name element from the array that was passed to our function, and now we want to print it out as a string. It is very important to stress that the value we have is a zval and that, beyond that, we don't know anything else about it. If it is a string, we can just print out the string value, but if it has any other type it will need to be converted first, otherwise we risk crashing the engine. The convert_to_string_ex() API call will handle this situation for us in the best possible way: it will do nothing if the value is already a string, otherwise it will convert it to a string by making a copy of the value and converting the copy. The reason for making a copy is that you don't want to change the original value directly, since this would be reflected in the script as a sudden "magical" change in the type of that array element.

[Listing 3 (fragment): /* {{{ proto void phpa_emit_ecard(array fields) Emits a personalized e-card greeting */]

Now that we have the name in a string form, we simply print it out to the output buffer mechanism using the zend_printf() function (it's equivalent to the printf() function you'd call from your PHP scripts, but channels its output through the scripting engine, so that it can be inserted properly in the script's overall output buffer). Note that we are using Z_STRVAL_PP() to access the underlying string value. Earlier in the article we used Z_ARRVAL_P() to get at an array value; you can see that the names and functions of these two macros are similar and reasonably intuitive: the former returns a string value while the latter returns the array value (the underlying hash table). The potentially confusing part of the names is the trailing P or PP. What does that mean? Each P represents a level of pointer indirection, so if you are accessing a zval *, you should use the P version of the macro, but if you are accessing a zval ** you should use the PP version of the macro. There are a whole bunch of related macros that allow you to access the string value, string length, integer value, floating point value and so on. Keep in mind that you should not use these macros unless you know that the zval is of the appropriate type.
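The P and PP naming convention can be illustrated with a small stand-alone sketch. Everything here is hypothetical: toy_zval is a one-field stand-in for the real zval, and the TOY_LVAL macros only imitate the layering of the engine's accessor macros, where each suffix letter adds one dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-field stand-in for the engine's zval. */
typedef struct {
    long lval;
} toy_zval;

/* No suffix: operate on the structure itself.
 * _P:  operate on a pointer to the structure (toy_zval *).
 * _PP: operate on a pointer to that pointer (toy_zval **). */
#define TOY_LVAL(z)      ((z).lval)
#define TOY_LVAL_P(zp)   TOY_LVAL(*(zp))
#define TOY_LVAL_PP(zpp) TOY_LVAL(**(zpp))

/* Writing through a toy_zval ** changes the value that the slot
 * points at, which is what a hash lookup hands back to the caller. */
void toy_store_long(toy_zval **slot, long n)
{
    assert(slot != NULL && *slot != NULL);
    TOY_LVAL_PP(slot) = n;
}
```

Given `toy_zval v`, `toy_zval *p = &v` and `toy_zval **pp = &p`, the expressions `TOY_LVAL(v)`, `TOY_LVAL_P(p)` and `TOY_LVAL_PP(pp)` all name the same field.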

Having now printed the name, we proceed to look up the age. This is done in a similar way to above, but this time we want to print the age as a number, so we use convert_to_long_ex() to ensure that we have an integer value, and Z_LVAL_PP() to access that value. If the age was not found in the array, a more generic message is used instead of an age-specific salutation.

That's it: our function is complete. Or is it? When we print out the name using zend_printf(), we are relying on the string being a regular C-style NUL-terminated string, since that is what the printf() family of functions expect. Since any string in PHP could potentially be a binary string containing NUL bytes (maybe it is a far-eastern multi-byte string), we could end up clipping the string at the wrong point and generating broken output. The fix for this situation is to use the PHPWRITE() macro instead and pass Z_STRVAL_PP(name) and Z_STRLEN_PP(name) as its parameters.

If you want to access an array element using an array index, you can use the zend_hash_index_find() function. It takes 3 parameters: the first is the hash table, the second is the integer value of the key and the third is a pointer to a zval **. In other words, you use it in the same way as you use zend_hash_find(), but instead of passing the string and the string length, you pass the integer value of the key.

Iterating Arrays

Now we know how to pull specific items out of an array, what about doing the equivalent of foreach(), so that we can print a list of names? Before we delve into the C code, let's just refresh our memories about how we can iterate arrays in the PHP script itself. There are three different ways to achieve this. The first and simplest approach, familiar to programmers coming from other languages, is to use an integer counter and step through the elements from 0 to the number-of-elements-minus-one using a for loop. This first approach is fine if your array is only ever indexed by integers, but this doesn't always hold true in PHP. That leads us on to the second method. Arrays have an internal position pointer that you can adjust using the end(), next(), prev(), current(), each() and reset() functions. Using various combinations of these allows you to step through and fetch elements from the array. This method is useful, but since these functions operate on the internal array pointer, anything else that changes that pointer while you are looping over the array will mess up the loop. The final approach is to use the foreach() control structure that was introduced in PHP 4. foreach() works in a similar way to each() and next(), although it has a little more tolerance to things messing with the internal position pointer, since it creates a copy of the array before working on it.

[Listing 4 (fragment): /* {{{ proto void phpa_iterate_array(array array) For each element of the array, print the key and value */]

It should be apparent that touching the internal array pointer while inside a looping control structure is a bad thing, so we want to do something that is more like the traditional for-loop approach, and store the array position in a local variable in our extension function. Of course, we want it to work with string keys as well as integer keys.

Let's look at Listing 4, which demonstrates how to iterate an array and print out the keys and values. Lines 1-3 have the familiar prototype comments and PHP_FUNCTION declaration. Lines 5-10 declare the variables that we will be using: we have a variable to reference the array parameter, another to hold a pointer to the key if it is a string, another for the length of that string, a long to hold the integer value of the key if it is not a string, a zval ** to hold the element value and, lastly, a HashPosition variable that will keep track of where we are in the array (you can think of this as being a bit like the integer index you would use in a traditional for-loop, except that it works with string indices too). Lines 12-15 validate the function parameters to ensure that we receive only a single array.

Now we are ready to begin the actual iteration. The first thing we want to do is initialize our HashPosition variable so that it points to the first element of the array; this is achieved by calling zend_hash_internal_pointer_reset_ex() and passing it the hash table from the array and a pointer to the pos variable. The name of this function is a little misleading: it doesn't touch the internal pointer at all.

We want to keep looping until we run out of elements, so let's use a while-loop and check the return value of the zend_hash_get_current_data_ex() function. This function is similar to zend_hash_index_find(), except that instead of passing an integer index, we are passing our hash position. If the function returns SUCCESS, it will have stored the value of the current array element in our item variable. If there are no more elements, it will return FAILURE instead; we use this fact to break out of the while-loop at the appropriate point.

We also want the key for this element; we can use zend_hash_get_current_key_ex() to get it. This function is a little bit complicated, since it needs to be able to return a string key (and its length) or an integer key, so it requires that you pass suitable variables to receive those values. It's important to stress that requirement: even if you are only interested in integer keys you still need to pass valid pointers for the string and length. The opposite is also true: if you only want strings you still need to supply a variable to hold integer values.

The zend_hash_get_current_key_ex() function returns one of three values: HASH_KEY_IS_STRING indicates that the key is a string key, HASH_KEY_IS_LONG indicates that the key is an integer index and HASH_KEY_NON_EXISTANT indicates that there is no element at the current position. I'm using a switch statement to print the key correctly based on its type. It is worth noting that there is no need to check for HASH_KEY_NON_EXISTANT here, since zend_hash_get_current_data_ex() will have returned FAILURE before we reach this point.
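The shape of this iteration pattern can be sketched without the engine. The code below is a toy model under stated assumptions, not the Zend implementation and not Listing 4: toy_table, toy_entry, toy_position and every toy_ function are hypothetical names that only imitate the roles of the hash table, HashPosition and the _ex functions. The point it shows is that the cursor lives in the caller, so the table itself is never modified while walking it.

```c
#include <assert.h>
#include <stddef.h>

#define SUCCESS 0
#define FAILURE -1

enum toy_key_type { TOY_KEY_IS_STRING = 1, TOY_KEY_IS_LONG, TOY_KEY_NON_EXISTANT };

typedef struct {
    const char *str_key; /* NULL when the entry has an integer key */
    long num_key;
    long value;
} toy_entry;

typedef struct {
    const toy_entry *entries;
    size_t count;
} toy_table;

typedef size_t toy_position; /* caller-owned cursor */

void toy_reset_ex(const toy_table *t, toy_position *pos)
{
    (void)t;
    *pos = 0;
}

int toy_get_current_data_ex(const toy_table *t, long *value, const toy_position *pos)
{
    if (*pos >= t->count) return FAILURE;
    *value = t->entries[*pos].value;
    return SUCCESS;
}

/* Both out-parameters must be valid, whichever key type the caller expects. */
int toy_get_current_key_ex(const toy_table *t, const char **str_key,
                           long *num_key, const toy_position *pos)
{
    assert(str_key != NULL && num_key != NULL);
    if (*pos >= t->count) return TOY_KEY_NON_EXISTANT;
    if (t->entries[*pos].str_key != NULL) {
        *str_key = t->entries[*pos].str_key;
        return TOY_KEY_IS_STRING;
    }
    *num_key = t->entries[*pos].num_key;
    return TOY_KEY_IS_LONG;
}

void toy_move_forward_ex(const toy_table *t, toy_position *pos)
{
    if (*pos < t->count) (*pos)++;
}

/* Walks the table, counts keys by type and returns the sum of the values. */
long toy_sum(const toy_table *t, int *string_keys, int *long_keys)
{
    toy_position pos;
    const char *str_key = NULL;
    long num_key = 0, value = 0, total = 0;

    *string_keys = 0;
    *long_keys = 0;
    toy_reset_ex(t, &pos);
    while (toy_get_current_data_ex(t, &value, &pos) == SUCCESS) {
        switch (toy_get_current_key_ex(t, &str_key, &num_key, &pos)) {
        case TOY_KEY_IS_STRING: (*string_keys)++; break;
        case TOY_KEY_IS_LONG:   (*long_keys)++;   break;
        default: break; /* unreachable: the data call already succeeded */
        }
        total += value;
        toy_move_forward_ex(t, &pos); /* advance before the next pass */
    }
    return total;
}
```

Because the position is a local variable, two such loops can walk the same table at once without disturbing each other.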

The rest of the code inside the loop should be self-explanatory by now, except for the very last line: we need to advance to the next element before continuing with the next iteration of the loop, and we achieve that using zend_hash_move_forward_ex().

Summing Up

By now you should be feeling pretty good at working with arrays in your PHP functions. We've seen how to build up arrays and multi-dimensional arrays, and how to pull values out of an array by string key and by numeric key. We've also seen how to iterate through the contents of an array. All this should give you plenty of ammunition for when you decide to move your PHP code over to C.

To Discuss this article:

http://forums.phparch.com/124

Wez Furlong is the Technical Director of The Brain Room Ltd., where he uses PHP not only for the web, but also as an embedded script engine for Linux and Windows applications and systems. Wez is a Core Developer of PHP, having contributed SQLite, COM/.Net, ActivePHP, mailparse and the Streams API (and more) and is the "King" of PECL, PHP's Extension Community Library. His consulting firm can be reached at http://www.thebrainroom.net.


Before starting on our quest for performance, let me pass along a small word of caution. Making your applications faster is certainly a noble goal but, unfortunately, it will often require a fair bit of time and frequently expose or introduce bugs. It is absolutely critical that you do not begin optimization prematurely, as doing so will virtually guarantee that deadlines will be missed and that the likelihood of ending up with a working program will be slim. Only optimize your applications once the code has been completely written, tested and deemed acceptable, and always set specific performance levels you seek to attain. Without a specific goal, you can just keep on optimizing forever, as there will always be some other tricks and tune-ups you could apply.

Now that we've gotten the standard optimization disclaimer out of the way, let's get to the fun part: doing the actual work. While you can certainly gain significant performance increases from optimizing your PHP code, this is usually the type of optimization you would want to leave till the very end, when all other options are exhausted. Optimizing the actual script can be a fairly drawn out process and there is always a risk of breaking working code. Whenever possible, it is better to optimize things outside of your code that will have a positive impact on the performance of your applications. As you can probably guess, the focus of this article will be optimizations that do not actually require code modification and still make your PHP applications run much faster.

Getting Started

The first step consists of optimizing the PHP executable itself, which will make all the scripts executed by it run faster. This can be done by making your C compiler, such as gcc, work harder when compiling PHP and tune the binary executable it generates for maximum performance. This optimization is performed by specifying several settings to the compiler via the CFLAGS environment variable. This variable, in turn, is used by the configuration script, which then passes these values on to the compiler at build time. It is important to note that while I am mentioning these options only in the context of PHP, these optimization flags are applicable to all parts of the system, and the more efficient the system, the faster it will be able to run everything, including your PHP applications.

Below is an example of a modified PHP build procedure, which leaves room for compile-time tuning:

    export CFLAGS="-O3 -msse -mmmx -march=pentium3 \
        -mcpu=pentium3 -mfpmath=sse -funroll-loops"
    ./configure
    make
    make install

The Need For Speed

Optimizing your PHP Applications

by Ilia Alshanetsky

REQUIREMENTS
PHP: 4.1+
OS: N/A
Applications: Optional: Turck MMCache, APC, PHP Accelerator, Zend Cache

The ever growing popularity of the web is putting a continually growing stress on the software and hardware used to power the common website. This article will help you combat the growing server loads and increase your web serving capacity without resorting to costly hardware

What do these options do? The first one, -O3, indicates what level of optimization the compiler should use. Normally, PHP uses only -O2, which is considered to be "safe", as too much optimization can cause stability issues. However, given the evolution of compilers, -O3 is, in my experience, just as safe and many projects have already adopted it as their default optimization level. The main difference between the two is that -O3 enables function inlining, which allows the compiler to optimize out some functions by replacing function calls with a copy of their code. Another optimization technique that is enabled by -O3 is register renaming, which allows the compiler to take advantage of unused registers for various tasks; this is very handy on modern processors with large numbers of registers that are frequently left unused.

The downside of -O3 is that it makes the generated code nearly impossible to debug, since the register rearrangement creates a situation where a valid backtrace in the event of a crash cannot be generated. However, since you should not encounter crashes in a production environment, this is a fairly acceptable loss in most situations.

In our compilation script above, we have a set of options that tell the compiler, in a fair bit of detail, what processor the server has and what features it supports. This allows the compiler to apply various tricks and optimizations that are specific to a particular CPU (a Pentium III in our case). This is not normally done when producing binaries for distribution, since the goal there is to generate portable code that can run on as many models of CPUs for a particular architecture as possible.

Of course, enabling CPU-specific targeting means that the portability of the generated binary will be limited to a single processor type. For example, code tailored for the Pentium III via the -march and -mcpu switches (such as the one in my example) will not work on older Pentiums and AMD processors. If you are compiling PHP for a server farm that uses all types of CPUs, you may not want to use CPU tailoring options, as they would require you to compile a separate PHP executable for every CPU type.

The other three options, -msse, -mmmx and -mfpmath=sse, indicate that my processor supports these extended instruction sets and tell the compiler it should try to use them to generate more optimal code. SSE and MMX are primarily math-related instruction sets and their usage can significantly accelerate any mathematical operations the underlying C code needs to perform.

The last option I specify, -funroll-loops, tells the compiler that it should unroll any small loops. The effect is a reduction in the number of instructions the processor needs to execute, since there is no more loop. However, the resulting binary will be slightly larger since, instead of a single instance of the code in the loop, you'll now have the code inside the loop repeated as many times as the loop would have run.
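To make the idea of unrolling concrete, here is a hand-written sketch of the transformation; it is an illustration only and not a claim about the code GCC actually emits. The function names are made up for the example, and the unroll factor of four is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one addition, one compare and one branch
 * per element. */
long sum_rolled(const long *v, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* The same work with the body repeated four times per pass, so the
 * loop bookkeeping runs roughly a quarter as often. The second loop
 * picks up the elements left over when n is not a multiple of four. */
long sum_unrolled4(const long *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    assert(v != NULL || n == 0);
    for (; i + 4 <= n; i += 4) {
        total += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    }
    for (; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

Both functions return the same result; the unrolled one trades a larger body for fewer loop-control instructions, which is the size versus speed trade-off described above.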

Configuring PHP Properly

Now that we have set our compiler options, let's review the configuration of PHP itself, as that, too, can have significant impact on performance.

In most cases, PHP is used for serving web pages, usually as an Apache server module. The standard approach is to compile PHP as a shared Apache module that the web server then loads on startup. This is the recommended approach, as it allows for easy PHP upgrades that do not require recompilation of Apache. However, this is most definitely not the most performance-friendly approach.

When generating a dynamically loadable module, the linker will add a series of hooks to allow the module to be loaded, which, among other things, does not allow the compiler to optimize the generated code to the fullest. The end result is that the compiled PHP executable is anywhere between 10% and 25% slower than it would be had it been compiled statically into Apache.

    # PHP configure line
    ./configure --with-apache=/path/to/apache_source

    # Apache configure line
    ./configure --activate-module=src/modules/php4/libphp4.a

The configuration procedure above will compile PHP directly into Apache, making PHP part of the Apache server executable. As you can imagine, this means that upgrades of Apache or PHP will require you to recompile both packages. However, given the infrequent releases of both projects and the relatively quick compilation, the extended build procedure is more than made up for by the performance increase.

You can offset the increase in compilation time caused by static compilation by reducing the number of extensions PHP compiles, which will also increase performance. By default, PHP compiles a number of extensions that you may never use and that, in the end, only increase the size of your PHP binary, causing it to use more memory. Worse yet, some extensions will initialize various buffers and parameters on every request, slowing down the data serving process. You should try to compile only the extensions you need and disable extensions that you do not intend to use.

    ./configure \
        --disable-all \
        --disable-cgi \
        --disable-cli \
        --with-apache=/path/to/apache_source \
        --enable-session \
        --with-pcre-regex \
        --with-pgsql

The example above uses the --disable-all configuration flag to disable, in one go, all extensions that are enabled by default, saving the time needed to find all of the default extensions and disable them. It will also automatically disable any new enabled-by-default extensions that appear in the future, without you having to go through the configuration manually. The --disable-cgi and --disable-cli configuration directives explicitly disable the generation of the CGI and CLI SAPIs, whose compilation is not automatically disabled by the --disable-all flag. Since only the Apache SAPI is needed, there is no need to waste time building binaries that will not be used.

Once all the unneeded SAPIs and extensions have been disabled, the needed extensions are enabled and the compilation process can begin. The end result is a smaller binary, which is especially important for SAPIs such as CGI and CLI, where the startup costs occur on every request. A smaller binary will load that much faster, allowing it to get to code processing quicker. More importantly, unneeded initializations will not be performed, making PHP work faster in all instances, regardless of the underlying SAPI.

Optimizing the INI File

With the PHP configuration and compilation out of the way, it's time to turn to the PHP.INI configuration directives, which can be used to improve the overall performance of your scripts as well.

I'll begin with the register_globals option, which is already off by default as of PHP 4.2.0. However, many people still have it enabled, since their configuration was never updated as they upgraded their versions of PHP. This option makes PHP register a potentially large number of variables based on user and system input, and it makes certain security exploits possible. It is recommended to keep this option off and use the readily available super-globals to access the data passed by the user through POST and GET queries or browser cookies.

You can further optimize the process of creating variables based on user input by changing the variables_order directive. It indicates which sources of client-generated information should be used to populate the super-globals, as well as the order in which they should be considered when building $_REQUEST, which is a cumulative result of the contents of other super-globals. By default, this option has a value of EGPCS, meaning that data from the system environment, the server environment, as well as user GET/POST/COOKIE input is stored. Storage and creation of array elements inside super-globals can take a hefty amount of memory and will have a negative impact on performance, as this process is repeated during every single request. Therefore, you can improve the overall performance of your system by reducing the number of super-globals that are being created. In most situations, this means that you can set the value of variables_order to just GPC, so that only the data passed by the user in the GET/POST queries or through cookies is stored inside super-globals. The effect of this choice is a much faster input parsing procedure and a smaller memory footprint. If you need to use environment or system parameters, you can fetch them individually using the getenv() function instead, which will not cause a consistent performance impact.

Beyond the standard super-globals, PHP also creates special variables that are used to store data that is passed via the command line. In a web environment, your PHP scripts will never be passed arguments in such a manner and, therefore, creating those variables is not necessary. You should disable register_argc_argv, which is the PHP setting responsible for the creation of these variables, to further speed up your scripts. Keep in mind that, if you use the CLI SAPI, you will need to leave this option enabled, otherwise your scripts will not be able to retrieve arguments passed to them via the command line.

When parsing user input, PHP automatically escapes the data to prevent the user from injecting special characters that can potentially result in undefined behavior in certain portions of your scripts. This automation is not always needed, since not all data fetched from the user is used in a manner that gives special characters a chance to cause trouble. It would be better to disable this automation by turning off the magic_quotes_gpc directive and manually escape the data as needed using addslashes(), or whatever escaping function is most appropriate for the situation. For example, in some cases, you need to use special escaping functions that are specifically tailored to secure data in a particular context, such as escapeshellcmd() for command lines and mysql_escape_string() for MySQL queries.

The advantages of doing your own escaping are numerous. First of all, you only escape what you need, thus reducing the amount of time PHP spends parsing user input. You also save memory, as the escaping process will allocate twice as much memory to store an escaped string as it would normally for an unescaped one. Moreover, you also get a better-designed application that does not depend on a particular server configuration and is capable of working securely in an environment where magic_quotes_gpc is disabled.

Beyond variable creation there are a number of other INI settings that are important for optimization purposes. By default, every PHP response carries an X-Powered-By header, which shows what version of PHP you are running. For the purposes of rendering the page, this header is completely useless and, unless the user fetches the headers manually, it will never even be visible. In fact, just about the only people who can make use of this field are those trying to compromise your system and for that purpose need to determine what software is being run on it. It would be prudent, therefore, to disable sending of this header by setting the expose_php setting to off. Not only will this make a potential attacker's job more difficult, but it will also save a little bit of bandwidth and slightly increase performance by not sending useless data over the connection with your client.

Speaking of sending data across the wire to your users, this is another area where proper INI configuration can be of much use. By default, PHP will print the data to the user as soon as your script outputs it, resulting in many write operations, each sending a small bit of data to the socket. This can become quite slow, especially for large pages, since many system calls will need to be performed to write the data and at least some browsers will re-render the page each time a small chunk of data is received, making the user's experience less than pleasant. The alternative is to buffer the data in memory and send it in large chunks, thus reducing the number of writes to the socket and potentially speeding up the rendering time on the client.

Output buffering can be enabled and controlled via the output_buffering option, which allows you to specify how big the memory buffer used to store a script's output should be. Ideally, you would want this buffer to be about the same size as the average page you send to your clients; this way, your average script output can be sent across the wire in one large chunk. At the same time, you should be careful not to create overly large buffers, as each PHP instance will have a buffer of its own and, with many instances running at the same time, this can add up to quite a few megabytes, potentially exhausting all available memory.

[Figure 1]
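The effect of buffering on the number of writes can be modelled in a few lines. This is a toy sketch, not PHP's output layer: toy_out, toy_write() and toy_flush() are hypothetical names, the 16-byte buffer is an arbitrary assumption, and a "flush" simply counts what would be one write to the socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 16 /* illustrative size, far smaller than a real page */

typedef struct {
    char buf[TOY_BUF_SIZE];
    size_t used;    /* bytes currently waiting in the buffer */
    size_t flushes; /* how many "socket writes" were needed */
    size_t total;   /* bytes handed to the socket so far */
} toy_out;

/* Sends whatever is buffered as a single write. */
void toy_flush(toy_out *o)
{
    assert(o != NULL);
    if (o->used > 0) {
        o->flushes++;
        o->total += o->used;
        o->used = 0;
    }
}

/* Appends data to the buffer and flushes only when the buffer fills up. */
void toy_write(toy_out *o, const char *data, size_t len)
{
    assert(o != NULL && data != NULL);
    while (len > 0) {
        size_t room = TOY_BUF_SIZE - o->used;
        size_t n = len < room ? len : room;

        memcpy(o->buf + o->used, data, n);
        o->used += n;
        data += n;
        len -= n;
        if (o->used == TOY_BUF_SIZE) {
            toy_flush(o);
        }
    }
}
```

Ten 4-byte writes followed by a final flush reach the socket as three writes here instead of ten, which is the saving the paragraph above describes; a buffer sized close to the whole page would bring that down to one.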

Another solution that can accelerate the process of sending data to the user is compression. PHP supports a GZIP-compressed output buffer handler that can be used to compress the data sent to the user in a manner that is automatically recognized by most modern browsers. For those users with compatible browsers, compression will reduce the size of the page many times over. The decrease in page size is especially convenient for users with slow connections, for whom this technique can shave off several seconds from the time it takes to load each page. In addition, faster data transmission allows server processes to be freed earlier, which, in turn, makes it possible for your server to handle a greater number of requests in any given timespan. Another pleasant side effect (on a very large scale) is the reduced bandwidth bill; I have seen bandwidth usage cut by as much as 40-50% by simply introducing compression.

Better yet, implementing this feature does not require any code modification and it can be enabled by simply setting output_handler to ob_gzhandler inside the php.ini file. Alternatively, you can enable it for individual virtual hosts inside httpd.conf or specific directories via .htaccess, or even inside individual scripts that output large quantities of text by passing ob_gzhandler to ob_start(). You should, however, keep in mind that compressing the data does require CPU power, and will increase the server load slightly. In most cases, though, the benefits of faster loading pages, minimized bandwidth usage and a reduced number of server processes will outweigh the inevitable slight increase in CPU usage.
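For reference, the server-wide form of this setting is a single php.ini line. The fragment below is a minimal sketch and assumes PHP was built with zlib support.

```ini
; php.ini
; Compress all script output for clients that advertise gzip/deflate support;
; other clients receive the uncompressed page.
output_handler = ob_gzhandler
```

With mod_php, the per-virtual-host or per-directory equivalent in httpd.conf or .htaccess would be the line php_value output_handler ob_gzhandler. The handler should not be combined with zlib.output_compression; use one or the other.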

On occasion, you may find yourself using PHP not only to send data, but also to retrieve it from a remote source (for example, when implementing a network client like an e-mail application that has to retrieve messages from an IMAP server).

In these situations it is important to keep in mind that the Internet is not a local storage medium, and getting data out of it can be quite slow. You probably don't want to spend too much time waiting for the external source to respond to your query, or you may run the risk of bogging down your whole server. To prevent endless waiting, you should use the default_socket_timeout setting, which allows you to define how many seconds PHP should wait before giving up on fetching data from a remote source. This is especially important in a web environment, since while your script is waiting for data its web server instance cannot be used to serve other requests, potentially requiring the creation of additional processes and resulting in an increased server load.
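A minimal sketch of the timeout setting in use follows. The 5-second value and the fetch_remote() helper are assumptions made for illustration; the same setting can equally be placed in php.ini.

```php
<?php
// Give up on unresponsive remote hosts after 5 seconds (an arbitrary example
// value) rather than waiting for PHP's 60-second default.
ini_set('default_socket_timeout', '5');

/**
 * Hypothetical helper: fetch a resource through PHP's stream wrappers and
 * return null on any failure, including a remote host that does not answer
 * within the timeout.
 */
function fetch_remote($url)
{
    // The @ operator suppresses the warning; failure is reported via null.
    $data = @file_get_contents($url);
    return $data === false ? null : $data;
}
```

A caller would then treat a null result as "source unavailable" and fall back to cached or default content instead of keeping the web server process tied up.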

In addition to remote sockets, you are likely to be working with local sockets in the form of database connections. Tuning your connection parameters is a very important step that will prevent connection overload, which may result in a performance drop and refused connections leading to broken pages. I recommend that you use the max_links and max_persistent options that exist for most database interfaces to specify how many connections PHP may keep open at any one time. By default, these options are set to -1 (unlimited), which in most situations is not a good idea, since it could lead to PHP trying to open more connections than your database server can handle. This setting is especially important when using persistent connections, which in an Apache environment will soon result in each child having its own connection open to the database. It is absolutely critical to ensure that there are strict controls to prevent persistent connections from taking up all possible database sockets, thus causing the DB server to refuse all other connections.

In many instances (for example, if you run a shared host), it may be prudent to disable persistent connections altogether via the allow_persistent directive. This will automatically convert all attempts to open persistent connections into regular connections and help prevent a possible overload on your server.
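As an illustration, the connection limits described above could be expressed for the MySQL extension as shown below. The extension choice and the numbers are assumptions; other database extensions use their own prefix (for example pgsql.max_links), and the right values depend on your database server's own connection limit.

```ini
; php.ini (MySQL extension; placeholder values)
; Both limits are counted per PHP process, so multiply by the number of
; web server children to estimate the load on the database server.
mysql.max_links = 10
mysql.max_persistent = 2

; Alternative for shared hosts: disable persistent connections altogether.
; mysql.allow_persistent = Off
```

With these example values and 50 Apache children, the database server would need to accept up to 500 connections in the worst case, which is the figure to compare against its configured maximum.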

PHP's INI settings include several directives that limit the operations that PHP can perform, such as the ability to access and manipulate files and the amount of memory allocated by the interpreter. These settings are quite useful in a shared environment, where you want to keep a tight leash on your users to ensure that they are not abusing the system but, in a dedicated environment where you control a majority (if not all) of the PHP code executed by the interpreter, they only serve to slow down often-used functionality. Thus, for performance reasons it is better not to use the safe_mode, open_basedir and memory_limit directives in dedicated environments; the checks performed by PHP to enforce them are quite expensive and can lead to significant performance losses if enabled.

Beyond the Configuration

Besides optimization tricks and configuration tuning there are several other methodologies that can improve the performance of PHP applications without actually having to dabble in the application's source code.

The first and foremost of these tools is an opcode cache, sometimes referred to as a "PHP compiler", although the term is really misused. Under normal circumstances, before a PHP script can be run it must first be parsed and converted to a series of instructions (opcodes) that the Zend Engine can understand. This is a fairly fast process, but in large scripts with many include files it can take up a significant amount of time. Even in smaller applications, reading the PHP script from disk and parsing it every single time before execution can add up. It is quite wasteful, since for the most part the scripts rarely change between executions and there is really no need to parse the code from scratch every single time.

This is where an opcode cache comes in. Instead of repeated parsing, the generated instructions are stored inside shared memory (or on disk), so that further access to the script does not require reparsing. Additionally, because the opcodes are often stored directly in memory, file system operations are reduced to a simple check to determine whether or not the script has changed since it was cached, thus further improving performance.
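The "has the script changed?" check can be pictured with a toy model in plain PHP. This is only a sketch of the idea: a real opcode cache works inside the engine and keeps compiled opcodes in shared memory, whereas the cached_compile() function, the array used as a cache and the $compile callback below are all hypothetical stand-ins.

```php
<?php
// Toy model of a cache that is validated by file modification time.
// $cache maps a script path to its last known mtime and "compiled" form.
function cached_compile($path, &$cache, $compile)
{
    clearstatcache();              // make sure filemtime() is not stale
    $mtime = filemtime($path);     // the only file system work on a cache hit

    // Reuse the stored result while the file has not been modified.
    if (isset($cache[$path]) && $cache[$path]['mtime'] === $mtime) {
        return $cache[$path]['ops'];
    }

    // Cache miss or changed file: read and "compile" again (the expensive
    // step), then remember the result together with the modification time.
    $ops = call_user_func($compile, file_get_contents($path));
    $cache[$path] = array('mtime' => $mtime, 'ops' => $ops);
    return $ops;
}
```

On repeated calls for an unchanged file, the callback standing in for the parser is never invoked again; only the modification-time lookup remains.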

Most opcode cache implementations (and there are several of them on the market nowadays) go even further and actually optimize the opcodes before storing them. During the traditional compilation process, the PHP parser tries to speed up the opcode generation process and does not always generate optimal instructions for the Zend Engine to execute. With an opcode cache, since the parsing is only done once, it makes sense to spend some time analyzing the generated opcodes and optimizing them so that their execution can be as fast as possible. The end result is that, with an opcode cache in place, you may see your PHP's performance improve by anywhere from 40% to 600%.

As far as opcode caching products go, for the most part all available solutions offer just about the same level of performance, with some minor differences. My current favorite is Turck MMCache (http://turck-mmcache.sourceforge.net/), which was originally developed by Dmitry Stogov. It comes with a particularly efficient opcode caching mechanism and a powerful optimizer that in most cases can allow you to squeeze in a few extra requests per second compared to its competition. This cache also includes a few other features, such as a memory session handler and a content caching mechanism, which can be used to further improve the performance of your PHP applications. Unfortunately, at this time Dmitry is unable to dedicate time to the project and the development of MMCache has stalled. However, a number of volunteers have promised to continue maintaining the project and hopefully will pick up where Dmitry left off.

The Zend Performance Suite (ZPS) is a commercially available PHP acceleration package offered by Zend that also implements an opcode cache and an optimizer as well as content caching capabilities. The big plus of ZPS is that it is designed with both experienced and novice users in mind and provides a very powerful and user-friendly interface to its components. This is especially useful when configuring content caching, which in MMCache, for example, can require a bit of manual labor and testing. However, unlike MMCache, ZPS is not free. Its licensing model starts at about $499 per server, which may put it out of the price range of small site operators.

Aside from ZPS, there is also APC, an Open Source initiative that has made big strides in the past year. Its performance is similar to that of ZPS and MMCache, but the lack of a good optimizer makes it a little slower in certain situations. Given its active development, however, there is little doubt that it will eventually be able to match the capabilities of the other implementations.

I should also mention the IonCube PHP Accelerator, which was one of the original free opcode cache implementations. It still works quite well with the PHP 4.3 series, but has not had any new visible developments in over a year and consequently does not perform as well as MMCache or APC in most situations.

A Hidden Cache

Regardless of whether or not an opcode cache is used, most scripts will still perform a fair number of file system operations. These can become a major bottleneck because, while processor and memory speeds keep increasing, hard-drive speeds remain quite slow. It does not take much to reach the maximum read or write speed of a drive, which is usually just a few dozen megabytes per second.

For ultimate performance, it is best to eliminate all filesystem operations. While this may seem like an impossible goal, a wonderful invention called a "ramdisk" makes it attainable without much effort. A ramdisk is really little more than the emulation of a hard-drive in memory; as far as programs (including your PHP scripts) are concerned, it appears to be just another run-of-the-mill disk partition. However, the data written in a ramdisk is actually stored directly in the system's memory, where data throughput is measured in hundreds of megabytes per second.

Nearly all operating systems support ramdisks. On Linux, the tmpfs filesystem makes it possible to mount one directly over an existing directory, so that applications keep using the same path. It is also very easy to set up; all you need is someone with root access and a few spare minutes:

mount -t tmpfs tmpfs /tmp
mount -t tmpfs tmpfs /home/webroot

The example above mounts ramdisks over two commonly used directories, the temporary directory (frequently used for session storage and other common operations) and the directory where web site files can be found. A freshly mounted tmpfs is empty and hides whatever the directory holds on disk, so the web site files have to be copied into it after mounting. The end result is that virtually all file operations commonly performed by PHP are accelerated through the reduction in file I/O overhead.

Keep in mind, however, that the contents of a tmpfs mount exist only in memory (and swap): the kernel does not synchronize them back to the physical drive, and they are lost on unmount, crash or reboot. Anything that must survive, such as updated site files, has to be copied back to disk explicitly, and session data kept there should be considered disposable. The other downside of this speed-up is that the ramdisk uses your memory and, therefore, placing large directories in it can eat up quite a bit of space that would otherwise be available to your applications. Thus, you need to exercise a bit of caution to ensure that directories mapped to ramdisks do not end up consuming all available memory and force the operating system to use its much slower swap facilities.

And We Didn't Even Touch a Line of Code!

As you have probably realized by now, there are many ways to improve the speed of PHP applications without having to perform potentially dangerous code changes. Equally important is the fact that the changes for the most part require very little time to implement and can result in massive performance improvements. This does not mean that you should abandon the practice of optimizing the code itself, which is, of course, an important tool for making your applications faster. However, when time is of the essence and the pressure is on, it is always good to know a few tricks to make the code run faster without having to tinker with it.

To Discuss this article:

http://forums.phparch.com/128

Ilia Alshanetsky is an active member of the PHP development team and is the current release manager of PHP 4.3.X. Ilia is also the principal developer of FUDforum (http://fud.prohost.org/forum/), an open source bulletin board, and a contributor to several other projects. He can be reached at ilia@prohost.org.

Date posted: 11/12/2013, 02:15
