FEBRUARY 2004 VOLUME III - ISSUE 2
Caching Techniques for the PHP Developer
Offline News Management with PHP-GTK
Extending PHP
Handling PHP Arrays from C
The Need for Speed
Writing More Efficient PHP Scripts
The Need For Speed
Optimizing your PHP Applications
Graphics & Layout
Chris Shiflett, Morgan Tocker
php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes
no responsibilities with regards of use of the information contained herein or in all associated material.
Contact Information:
General mailbox: info@phparch.com
Editorial: editors@phparch.com
Subscriptions: subs@phparch.com
Sales & advertising: sales@phparch.com
Technical support: support@phparch.com
Copyright © 2003-2004 Marco Tabini & Associates, Inc.
— All Rights Reserved
As I write this, I'm sitting in my office, which is about forty degrees Celsius warmer than outside and, therefore, a much better place to work in than the local park, suffering from an awful cold and sitting by a collection of (clean) tissues discreetly stashed on my desk, ready for use. As you can expect, I'm not particularly happy about either fact (make that three facts: the cold outside, the cold in my body, and the fact that I'm sitting in an office when I could really be somewhere else far away from anything that even remotely resembles a computer). Incidentally, with php|cruise coming at the beginning of March, I should hopefully be able to get rid of at least two problems, and I'm still working on finding a way to avoid computers during that trip.
But I ramble, a clear sign that the cold medicine is wearing off. Let me instead tell you something about this month's issue. With the popularity that PHP enjoys nowadays comes the fact that it is used as the backbone of more and more high-traffic sites. A simple consequence of this is that an increasing number of developers are "hitting the wall" and finally feeling the limits of what the "let's just do it in PHP" approach can do. Building a website is always a high-wire balance of budgeting, respecting deadlines and writing the best code possible, but there's nothing quite as bad as finding out that the way you've done things is incapable of meeting the demands of your website. By the time you realize that you have a problem, it's usually too late to think about a solution short of calling your travel agent and inquiring about that non-extradition country you heard of.
Therefore, this month we dedicate a fair amount of room to the performance management of PHP applications. George Schlossnagle's article, based on an excerpt from his latest book, published by SAMS, talks about profiling, a concept that I have very rarely seen associated with PHP applications. Profiling takes the guesswork out of understanding where the bottlenecks in your application are, allowing you to focus on finding the best possible resolution.
The problem with profiling is that it only allows you to identify the problems and not solve them. Luckily, Ilia Alshanetsky and Bruno Pedro offer two other excellent articles on improving the performance of PHP without affecting the code itself (if you can, why not avoid the risk of introducing even more bugs?). While Ilia focuses on ways to make the PHP interpreter itself run faster, Bruno examines the topic of caching, both at the network and script level.
This month we also start a new column, Security Corner, written by Chris Shiflett. The daily number of security advisories, patches, break-ins and source-code thefts that we see reported in the media every day has
Continued on page 8
PHP 4.3.5RC1 has been released for testing. This is the first release candidate and should have a very low number of problems and/or bugs. Nevertheless, please download and test it as much as possible on real-life applications to uncover any remaining issues. The list of changes can be found in the NEWS file.
For more information visit: http://qa.php.net/
PHP Community Logo Contest
Following Chris Shiflett's recent announcement of the PHP Community Site, he is holding a contest to find a logo that embodies the spirit of the PHP community. Everyone is welcome to participate, and you can submit as many entries as you like. Please send all entries to logos@phpcommunity.org and include the name with which you want to be credited.
The contest ends 29 Feb 2004, and php|architect is offering a free PDF subscription to the winner. For updated news about the contest, as well as a chance to view the current entries, visit:
http://www.phpcommunity.org/logos/
Good luck to all who enter!
ZEND Studio 3.0.2
Zend has announced the release of Zend Studio 3.0.2 client. What's new? Zend.com lists some of the bugfixes as:
• ZDE didn't load when using a new keymap config from an older version
• Save As Project didn't always work
• Server Center activator tried to open the wrong URL
• js files were not opened with JavaScript highlighting
• Shift-Delete and Shift-Backspace didn't work properly
• Find&Replace was very slow under Linux
• Add Comment sometimes erroneously commented out a line that wasn't selected
• Added configurable limit for the number of displayed syntax errors
There have also been improvements to the debugger, code completion, code analyzer, IE toolbar, and some Mac OSX changes.
Get more information from Zend.com
MySQL Administrator
MySQL.org announces: MySQL Administrator is a powerful new visual administration console that makes it significantly easier to administer your MySQL servers and gives you better visibility into how your databases are operating. MySQL Administrator integrates database management and maintenance into a single, seamless environment, with a clear and intuitive graphical user interface. Now you can easily perform all the command line operations visually, including configuring servers, administering users, dynamically monitoring database health, and more.
Get more information from:
http://www.mysql.com/products/administrator/index.html
Check out some of the hottest new releases from PEAR.
DB 1.6.0 RC4
DB is a database abstraction layer providing:
• an OO-style query API
• a DSN (data source name) format for specifying database servers
• prepare/execute (bind) emulation for databases that don't support it natively
• a result object for each query response
• Compatible with PHP 4 and PHP 5
• much more…
DB layers itself on top of PHP's existing database extensions. The currently supported extensions are: dbase, fbsql, interbase, informix, msql, mssql, mysql, mysqli, oci8, odbc, pgsql, sqlite and sybase (DB style interfaces to LDAP servers and MS ADO (using COM) are also available from a separate package).
System_ProcWatch 0.4
With this package, you can monitor running processes based upon an XML configuration file, XML string, INI file or an array where you define patterns, conditions and actions.
Net_IMAP 0.7
Provides an implementation of the IMAP4Rev1 protocol using PEAR's Net_Socket and the optional Auth_SASL class.
XML_Beautifier 1.1
XML_Beautifier will add indentation and line breaks to your XML files, replace all entities, format your comments and make your document easier to read. You can influence the way your document is beautified with several options.
Looking for a new PHP Extension? Check out
some of the latest offerings from PECL.
opendirectory 0.2.2
Open Directory is a directory service architecture whose programming interface provides a centralized way for applications and services to retrieve information stored in directories. The Open Directory architecture consists of the DirectoryServices daemon, which receives Open Directory client API calls and sends them to the appropriate Open Directory plug-in.
statgrab 0.1
libstatgrab is a library that provides a common interface for retrieving a variety of system statistics on a number of *NIX-like systems. This extension allows you to call the functions made available by the libstatgrab library.
Sasl 0.1.0
SASL is the Simple Authentication and Security Layer (as defined by RFC 2222). It provides a system for adding pluggable authentication support to connection-based protocols. The SASL Extension for PHP makes the Cyrus SASL library functions available to PHP. It aims to provide a 1-to-1 wrapper around the SASL library to provide the greatest amount of implementation flexibility. To that end, it is possible to build both a client-side and server-side SASL implementation entirely in PHP.
SQLite 1.0.2
SQLite is a C library that implements an embeddable SQL database engine. Programs that link with the SQLite library can have SQL database access without running a separate RDBMS process. This extension allows you to access SQLite databases from within PHP. Windows binary available from:
http://snaps.php.net/win32/PECL_STABLE/php_sqlite.dll
PHPWeather 2.2.1
PHP Weather announces the release of version 2.2.1. PHP Weather makes it easy to show the current weather on your webpage. All you need is a local airport that makes some special weather reports called METARs. The reports are updated once or twice an hour.
Get more information from:
http://sourceforge.net/projects/phpweather/
PHPEclipse Debugger
PHP Eclipse adds PHP support to the Eclipse IDE Framework. This snapshot introduces the first version of the PHPEclipse debugger plugin.
For more information visit:
http://www.phpeclipse.de
MySQL and Zend Working Together
From Zend and MySQL: these two have joined forces to strengthen open source Web development.
MySQL AB, developer of the world's most popular open source database, and Zend Technologies, designers of the PHP Web scripting engine, today announced a partnership to simplify and improve productivity in developing and deploying Web applications with open source technologies. Through the alliance, the companies are improving compatibility and integration between the MySQL database and Zend's PHP products to make it easier for businesses to use complete open source solutions, such as the popular LAMP (Linux, Apache, MySQL and PHP) software stack.
As part of the partnership, MySQL AB and Zend are offering partner products to their respective customers, enabling easier product procurement and deployment for Web application infrastructures. The companies will also commit development resources to design product integration and compatibility modules for both vendors' platforms.
For more information visit: www.zend.com
SAXY 0.3
SAXY is a Simple API for XML (SAX) XML parser for PHP 4. It is lightweight, fast, and modeled on the methods of the Expat parser for compatibility. The primary goal of SAXY is to provide PHP developers with an alternative to Expat that is written purely in PHP. Since SAXY is not an extension, it should run on any Web hosting platform with PHP 4 and above installed.
This release allows CDATASection tags to be preserved, rather than converted to Text Nodes.
For more information visit:
In his article on offline news management, Morgan Tocker writes about how PHP-GTK, that most hidden of PHP gems, can be used to improve content management by providing a proper GUI application that doesn't require you to completely rewrite all your code.
Finally, last but not least, Wez Furlong picks up where his article from last month left off and delves into the deep bowels of the Zend Engine to show you how a PHP extension written in C can manipulate PHP arrays. It's not quite as easy as from a script, but close enough once you know what you're doing.
Well, that's it for this month. By the time I will be writing my next editorial, I plan to be either boasting about my suntan or complaining about sunburn. Either way, you can expect me to report on our adventure on the high seas. Until then, happy reading!
Editorial: Continued from page 5
Despite the fact that it sounds like some mysterious Italian pasta, Gnokii is really just a project aimed at developing tools and drivers for Nokia mobile phones, that is, software that makes it possible to control a Nokia phone physically connected to your server via a serial port. Gnokii works like the Nokia Data Suite, which is shipped with more advanced models from Nokia: you can use it to send SMS messages, edit contacts and so on, which covers pretty much everything you normally do with your thumb on the phone's keypad.
Gnokii itself is composed of many tools, including a set of GUI applications that facilitate the remote operation of the telephone; we are really only interested in a small subset of these tools called smsd, or SMS daemon, which provides an interface for rapid access to the phone's SMS capabilities. With the SMS daemon up and running, we can use PHP to interact with the phone, send and receive SMS messages and, of course, build whatever logic we need based on the content of the messages that we receive and send. In short, my goal with this article is to show you how to configure software and hardware so that you can get the same kind of service as you would normally obtain from a big company selling mobile services like SMS gateways, but at a fraction of the price.
Major Components of the Final Application
The final application that we will create throughout this article is a simple SMS server that awaits a message from a user and acts on its contents. It is made up of three major components:
• A Nokia cell phone, which must be connected properly to the server
• The smsd application from the Gnokii package, which must, of course, be compiled and configured correctly
• The PHP scripts that provide the actual server functionality
The flow of the application will be as follows:
• The user sends an SMS message to the server
• The smsd daemon picks it up and automatically puts it into its database
• Our server scans smsd's database periodically for new messages
• When a new message arrives, its contents are examined and the server acts on them, for example by replying to the user with another message
Code Directory: sms-gnokii
REQUIREMENTS
SMS, shorthand for Short Message Service, is the standard used by cellular phone networks worldwide to allow their customers to exchange small text messages using their handsets. Despite its limitations, SMS is very popular with cell phone users, and it has rapidly become a widely-used bridge between the Internet and mobile users.
Hardware needed
When it comes to cellular communications, the bad thing about hardware is that it often costs a lot of money, but the goal of this project is precisely to provide a low-cost alternative, so the expenses associated with it should be quite reasonable. What you'll need in terms of hardware is a Nokia phone and a serial cable to hook it up to your server. I will, of course, expect that you already have a server and that it is capable of running the Gnokii tools and PHP. In my environment, I have used a Nokia 3310, which is quite new but not very expensive, and works perfectly for my needs.
There are no "official" connection cables available for the 3310, but a company from the UK called Cellsavers (http://www.cellsavers.co.uk) has come up with a very ingenious serial cable with a connector that you can fit behind the battery on the phone. For those who don't know, there are 4 metal pins that are probably used by Nokia to install software and perform other programming on the phone, and those nice folks at Cellsavers managed to figure out how to use them to control the phone through a serial port. There might be other companies supplying the same type of product, but I have not seen any around.
Another important note about the hardware is that you will need to get a battery charger for the phone. One often comes with the package, and you can plug it in and leave the phone on forever without having to worry about the batteries.
Installing Gnokii and smsd
Before starting to install Gnokii and smsd, make sure you have MySQL installed and working properly on your server. Installing Gnokii is quite straightforward: it involves little more than the usual configure, make, make install steps. However, there are some configuration options that I find important.
The first might be a matter of taste, but I like to place everything belonging to Gnokii in /usr/local/Gnokii. Therefore, I will use --prefix=/usr/local/Gnokii when invoking configure. Next, the --without-x configuration switch indicates that we will not need to use the xgnokii GUI application to send SMS messages and manage the phone. If you want to take a look at the graphical tools, you can of course skip this parameter, but on a Unix server where you normally do not have the X Window System installed you'll get a whole lot of errors if you do so. The last parameter is --enable-security, which turns on a lot of security-related features in the package, like the ability to change the PIN number. I find them useful, so I usually turn them on.
The resulting configure line will be as follows:
./configure --prefix=/usr/local/Gnokii --without-x --enable-security
10 $CONFIG['keywords_directory'] = './keywords/';
11 $CONFIG['default_email'] = 'eric@persson.tm';
12 $CONFIG['database_username'] = 'root';
13 $CONFIG['database_password'] = '';
14 $CONFIG['database_hostname'] = 'localhost';
15 $CONFIG['database_database'] = 'sms';
16
17
18 /*
37 $keywords = array();
38
39 $dh = opendir($CONFIG['keywords_directory']);
40 if( $dh ){
41 while( $filename = readdir($dh) ){
42 if( ereg('^([a-z0-9_]*).php$', $filename, $match) ){
43 $keywords[$match[1]] = $CONFIG['keywords_directory'] . $filename;
44 echo date('Y-m-d H:i:s') . ': ' . $match[1] . chr(10);
47
48 if( sizeof($keywords) == 0 )
49 return_error('Keyword directory was empty.');
Once you've downloaded the Gnokii tarball from http://www.Gnokii.org (the latest version at the time of this writing is Gnokii-0.5.5), you can decompress it and start the compilation process:
# gzip -dc Gnokii-0.5.5.tar.gz | tar xof -
[global]
port = /dev/ttyS1
model = 3310
initlength = default
connection = serial
bindir = /usr/local/Gnokii/sbin/
Make sure that you have connected your phone to the correct serial port as you specified in the configuration. Also, check the model of your phone and enter it accordingly. The initlength variable controls the number of characters sent to the phone during initialization; you don't normally want to change this setting, so unless you have problems with the connection, I suggest that you use the default value (at least initially).
The connection variable should be set to serial, since we'll be connecting to the phone using the serial port. In case you're wondering, it's possible to configure it to use an infrared connection instead.
Now, it's time to test it all and see if everything works fine. A good starting point here is to try and send out an SMS message using Gnokii:
81
82 include_once($phpfile);
83 if( function_exists($keyword) )
84 $keyword($message, $sender);
mistakes as the one
101 if( strpos($message, ' ') > 0 )
102 $match_part = substr($message, 0, strpos($message, '
108 if( isset($keywords[$match_part]) ){
109 include_once($keywords[$match_part]);
110 if( function_exists($match_part) )
111 $match_part($message, $sender);
130 return sprintf('%1.3fs', (($end_seconds - $start_seconds) + $end_fraction - $start_fraction));
131
132 }
133
134
135 /* Connect to the mysql database */
136 $connection = mysql_connect($CONFIG['database_hostname'], $CONFIG['database_username'], $CONFIG['database_password']);
142 /* Select the database that contains the smsd tables */
143 mysql_select_db($CONFIG['database_database'], $connection) or return_error('Could not select database');
163 match_message($sms['text'], $sms['number']);
Clearly, you will need to replace the xxxxxxxxx above with a real, working phone number that you can test for the message to arrive (you could, in fact, use the same number as the cell phone you're using to send the message). If you don't receive the message, or if you get an error, you may want to step back and look at the configuration and build procedure once again, just to make sure that you haven't missed anything.
The next step consists of configuring smsd so that we can send messages out onto the network programmatically. It's obviously important to have Gnokii working first, since smsd relies on the same runtime configuration libraries. The smsd source code is located in the /smsd/ folder under the directory where you unpacked the Gnokii tarball.
Smsd can work either with a database or with a filesystem but, for the purposes of this article, we will only focus on configuring it to use MySQL. The daemon is not compiled by default when you compile Gnokii, so that will have to be our next step. You will need to manually edit the Makefile and change every instance of the path to the MySQL installation in the DB Modules section. Next, you can build the executables:
# make
# make libmysql.so
# make install
Setting Up the smsd Database
Since we want to use smsd with MySQL, we need to create a database for it to use. For simplicity's sake, we'll call it sms and grant a new MySQL user with login sms and password sms access to it. Naturally, if you move into a production environment where security is a concern, you may want to use a more secure username/password combination. Keep in mind that anyone who can access your sms database can insert rows into the outbox and therefore send messages from the connected phone. On a larger system, the possibility for abuse is certainly there, and therefore security is worth at least some consideration.
In the smsd directory of your tarball, you will also find a SQL file called sms.tables.mysql.sql that contains the table definitions needed to run the daemon. All you need to do is import these into your database and you are all set to go. There is also a file for those that prefer PostgreSQL, but we will focus on MySQL here.
Installing daemontools
The daemontools package is a collection of tools that can be used to monitor and manage UNIX-based services. Its installation procedure is quite straightforward, since there aren't too many options or configuration directives. The only thing to keep in mind is that some differences in newer versions of glibc (2.3.1 and above) may require you to patch the daemontools source before you try to compile it. The patch you need is called the "errno-patch" and fixes an incompatible declaration of the errno variable made in the source. I've seen some people claim that this problem is caused by bad programming practices, but the error really only started popping up when changes were made to glibc, so I'm not too sure as to how true that is. Whatever the real reason, if you encounter this problem, simply patch the source and you'll be just fine. If you need to download the patch, you can get it from http://www.qmail.org/moni.csi.hu/pub/glibc-2.3.1/. Then, follow the daemontools installation instructions, which you can find at http://cr.yp.to/daemontools/install.html.
If you're not familiar with patching software, this is done by downloading the software, extracting it, and then using the patch program to effect the actual changes in the source code. More information about the errno patching process and daemontools can be found at http://www.qmail.org/moni.csi.hu/pub/glibc-2.3.1/INSTRUCTIONS but, generally speaking, you can get away with something like this:
# tar zxvf daemontools-0.76.tar.gz
# cd admin/daemontools-0.76
# patch -p1 < /path/to/daemontools-0.76.errno.patch
On to Some PHP
Our main PHP script will be as small and efficient as possible, since it will be running as a daemon on our server all the time. Its main task will be to check if there are new messages in the SMS inbox table and, if so,
9 $return_message = 'Hello to you my friend!';
10 /* Output a log message */
"' . $message . '" from "' . $sender . '" with "' . $return_message . '"' . chr(10);
12
13 /* Send the reply to the sender */
14 mysql_query('INSERT INTO outbox SET number="' . $sender . '", text="' . $return_message . '", processed_date="0", insertdate=now(), error=0,
match them against the possible keywords that we have created, so that the appropriate action can be taken. We'll call this script smsparse.php.
First of all, let's decide how we're going to structure our application. Since our main goal is to respond to certain keywords, we'll start by creating a few "keyword scripts", which are really nothing more than standalone PHP files stored in a subdirectory called keywords. For example, if we wanted to define a keyword called hello, our directory structure would look like this:
./keywords/
./keywords/hello.php
./smsparse.php
As you can see, each keyword has its own PHP file. We simply use the keyword as the filename for the script that contains the actions associated with it in order to simplify the entire process.
Let's now have a look at smsparse.php, which you can see in Listing 1. At the beginning of the script (in the read_keywords function), we read through the contents of the keywords directory. Each file found in the directory is matched against the ereg() pattern on line 41 and, if that operation is successful, a new item with the keyword as a key and the file's path as the value is added to the array that the function returns at the end of its execution.
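The directory scan described above can be sketched as follows. This is only an illustration under a few assumptions: the function name read_keywords_sketch is made up for this example, and it uses preg_match() and scandir() rather than the ereg() and readdir() calls of Listing 1, since ereg() no longer exists in current PHP releases.

```php
<?php
// Hypothetical sketch: build a "keyword => script path" map from a directory.
// Not the code from Listing 1; preg_match() stands in for ereg().
function read_keywords_sketch(string $directory): array
{
    $keywords = [];
    $entries = scandir($directory);
    if ($entries === false) {
        // Directory missing or unreadable: return an empty map.
        return $keywords;
    }
    foreach ($entries as $filename) {
        // Accept names such as "hello.php"; the captured group is the keyword.
        if (preg_match('/^([a-z0-9_]+)\.php$/', $filename, $match)) {
            $keywords[$match[1]] = rtrim($directory, '/') . '/' . $filename;
        }
    }
    return $keywords;
}
```

Calling read_keywords_sketch('./keywords') on the directory layout shown earlier would map the key hello to ./keywords/hello.php.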
As you can see, on line 148 we sort the array in a descending fashion based on the length of each key. We do this so that longer keywords are checked for first and we don't end up in a situation where a word like "eat" is matched instead of "Seattle" because it is shorter.
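One way to get that ordering is uksort() with a length comparison. The sketch below is an assumption about how such a sort could look, not the code on line 148 of the listing; the helper name is hypothetical.

```php
<?php
// Hypothetical helper: order the keyword map so that longer keys come first.
function sort_keywords_by_length(array $keywords): array
{
    uksort($keywords, function ($a, $b): int {
        // PHP stores keys like "123" as integers, so cast before measuring.
        $a = (string) $a;
        $b = (string) $b;
        // Descending by length; ties fall back to alphabetical order.
        return strlen($b) <=> strlen($a) ?: strcmp($a, $b);
    });
    return $keywords;
}
```

With the keys eat and seattle, the result lists seattle before eat, so the longer word is tried first.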
The main portion of the application works by executing a loop indefinitely. At every iteration, we check if a new message has arrived in the inbox and, if that is the case, match the message against the active keywords that we have identified at the beginning of the script, and finally sleep for 1 second before the next cycle.
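The shape of that loop can be sketched without a database by passing in the two moving parts as callables. Everything here is hypothetical scaffolding for illustration: run_poll_loop, $fetchNew and $handle are invented names, and the 'text' and 'number' keys are taken from the match_message() call visible in the listing fragment.

```php
<?php
// Hypothetical sketch of the polling loop. $fetchNew stands in for the
// inbox query and $handle for the keyword matcher.
function run_poll_loop(callable $fetchNew, callable $handle, int $maxCycles = 0, int $pauseSeconds = 1): int
{
    $handled = 0;
    // $maxCycles = 0 means "loop forever", as a daemon would.
    for ($cycle = 0; $maxCycles === 0 || $cycle < $maxCycles; $cycle++) {
        foreach ($fetchNew() as $sms) {
            $handle($sms['text'], $sms['number']);
            $handled++;
        }
        // Pause between cycles so the database is not hammered.
        sleep($pauseSeconds);
    }
    return $handled;
}
```

The cycle limit only exists so the loop can be exercised in a test; a real daemon would leave it at zero and let supervise keep the process alive.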
For the actual matching process, I have written two alternatives that use different approaches. The first one, match_message(), is the most fault tolerant but also the slowest one. The second one, match_message_fast(), is not as tolerant but will save some CPU resources by using a faster algorithm. The difference probably won't be dramatic, but on a heavily loaded server or with a large list of keywords it may well have an impact on the overall performance of the system.
In match_message() (lines 74-92), the message is first cleaned of unwanted characters such as non-alphanumeric values and spaces, and converted to lowercase. Next, the function cycles through all the keywords and performs an ereg() match against the "clean" version of the message. If a match occurs, the PHP file corresponding to the keyword is included and executed.
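A rough sketch of the tolerant approach follows. It assumes the keyword may appear anywhere in the cleaned message and returns the matched keyword instead of including a file; both choices, and the function name, are assumptions made for this example rather than details confirmed by the listing.

```php
<?php
// Hypothetical sketch of tolerant matching: clean the message, then look
// for each keyword anywhere inside it. Returns the first keyword found.
function find_keyword_tolerant(string $message, array $keywords): ?string
{
    // Lowercase first, then drop everything that is not a-z or 0-9.
    $clean = (string) preg_replace('/[^a-z0-9]/', '', strtolower($message));
    foreach (array_keys($keywords) as $keyword) {
        // PHP stores keys like "123" as integers, so cast before comparing.
        $keyword = (string) $keyword;
        if ($keyword !== '' && strpos($clean, $keyword) !== false) {
            return $keyword;
        }
    }
    return null;
}
```

Because the first hit wins, the order of the keyword array matters: with eat listed before seattle, a message about Seattle would trigger the wrong script.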
The match_message_fast function, on the other hand, works by taking the first word in the message and converting it to lowercase. The word is then used to perform a search in the keyword array and, if a match is found, the appropriate PHP file is included and executed.
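The fast variant boils down to a single array lookup. The sketch below is an illustration with an invented function name; trimming leading spaces is an addition of mine and may not match what the listing does.

```php
<?php
// Hypothetical sketch of the fast approach: one lookup on the first word.
function find_keyword_fast(string $message, array $keywords): ?string
{
    // Split off the first space-delimited word (leading spaces ignored).
    $parts = explode(' ', ltrim($message), 2);
    $first = strtolower($parts[0]);
    return ($first !== '' && isset($keywords[$first])) ? $first : null;
}
```

Since nothing is cleaned, a message starting with "hello!" would not match the keyword hello here, which is the tolerance that is traded away for speed.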
Writing Keyword Scripts
Since keyword scripts are an idea I came up with specifically for this article, it's probably a good idea to discuss them a little. Essentially, a keyword script simply contains code that determines what happens when a keyword is matched. To make it possible for multiple scripts to coexist, the actual functionality is stored in a function that has the same name as the keyword that corresponds to a particular script.
Let's assume, for example, that we want to match the word "hello" at the beginning of a message and reply with an SMS of our own. In this case, we'd have to write a PHP script, called hello.php, similar to the one shown in Listing 2. As you can see, the file contains a function, called hello(), that accepts the incoming message and the sender's phone number as arguments.
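The "function named after the keyword" convention can be sketched as below. The hello() body and the dispatch_keyword() helper are hypothetical; in particular, checking the keyword against the known keyword map before calling it is my own precaution, because function_exists() alone would also accept PHP's built-in functions.

```php
<?php
// Hypothetical keyword handler, named after the keyword it serves.
function hello(string $message, string $sender): string
{
    return 'Hello to you my friend!';
}

// Hypothetical dispatcher: call the handler only for known keywords.
function dispatch_keyword(array $keywords, string $keyword, string $message, string $sender): ?string
{
    // Refuse anything that is not a registered keyword, so a message can
    // never trigger an arbitrary built-in function.
    if (!isset($keywords[$keyword]) || !function_exists($keyword)) {
        return null;
    }
    // A real server would include_once($keywords[$keyword]) before this call.
    return $keyword($message, $sender);
}
```

For instance, dispatch_keyword(['hello' => './keywords/hello.php'], 'hello', 'hello world', '+15550100') returns the greeting, while an unregistered name returns null.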
Sending a reply to the sender through SMS is a simple process: all we need to do is add a row to the outbox table of the sms database. The SMS daemon will periodically poll the database for new outgoing messages and send them automatically.
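Since the sender's number and the reply text end up inside an SQL statement, a parameterised query is a safer way to build that INSERT than string concatenation. The sketch below only prepares the statement and its parameters; the helper name is invented, the column names are the ones visible in the Listing 2 fragment, and the real outbox table may require further columns that are not shown here.

```php
<?php
// Hypothetical helper: build a parameterised INSERT for the outbox table.
// Only the columns needed for the illustration are set.
function build_outbox_insert(string $number, string $text): array
{
    $sql = 'INSERT INTO outbox (number, text, insertdate) '
         . 'VALUES (:number, :text, NOW())';
    return [$sql, [':number' => $number, ':text' => $text]];
}

// Possible usage with PDO (connection details are assumptions):
// $pdo = new PDO('mysql:host=localhost;dbname=sms', 'sms', 'sms');
// [$sql, $params] = build_outbox_insert('+15550100', 'Hello to you my friend!');
// $pdo->prepare($sql)->execute($params);
```

Keeping the values out of the SQL string means a message containing quotes cannot change the statement that is executed.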
Your Own PHP Daemon: Using daemontools
The last step in our quest consists of setting up our PHP script to run as a daemon. You could, in theory, simply run the script and detach it from the console, but if you're running a proper server, a more robust configuration is required, and this is where the daemontools package comes into play.
The configuration of daemontools is a bit complicated compared to the other packages we have seen in this article because it involves a relatively large number of files and directories. However, once one realizes that there is method to the madness, it's not quite so bad. Given the amount of space allotted for this article, I will leave it up to you to get daemontools up and running; the documentation is very clear and there are plenty of resources for this purpose on the Net.
When daemontools is installed it creates a directory called /service. This will contain information on all the various services that daemontools is running; a program called supervise monitors the /service directory and takes care of starting and keeping the services running as needed. Compared to "normal daemons", which are started at boot time, daemontools services are started by supervise and, if any of them is killed or dies unexpectedly, supervise itself takes care of restarting them automatically.
Therefore, ddaaeemmoonnttoooollss is an excellent solution if you
want your services to be running all the time and be
monitored for failures of any kind However, not all
services are suitable to run with this package—they
have to behave in a certain manner that makes it
pos-sible for ssuuppeerrvviissee to interact with them in an
automat-ed fashion
Luckily, most applications can be modified so that they are compatible with supervise, and our smsparser script is no exception. First of all, we must ensure that the script can be run without having to explicitly invoke the PHP interpreter. Under a UNIX shell, this is done by introducing a "shebang", that is, a special line at the beginning of the file that tells the system which application the script should be passed to in order for it to be executed.
Let's start by figuring out where PHP is installed:
# whereis php
On my machine, a RedHat 8 server, the command outputs the following:
php: /usr/local/bin/php /usr/local/lib/php /usr/local/lib/php.ini
This means that I have the PHP interpreter's binary installed in /usr/local/bin/php.
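As a rough illustration, a script that starts with a shebang pointing at that binary could look like the sketch below. The interpreter path is the one reported by whereis on my machine and will differ on yours; the log_line() helper is a hypothetical name, used here only so that the script does something visible.

```php
#!/usr/local/bin/php
<?php
// Hypothetical helper: prefix each log line with a timestamp.
function log_line($message)
{
    return date('Y-m-d H:i:s') . ': ' . $message . "\n";
}

echo log_line('Starting sms parser');
```

The file also needs to be executable (chmod +x) before it can be started by name. The command-line version of PHP skips the shebang line, so it does not end up in the script's output.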
It's now time to create a service directory for our service. We'll start by creating a "service" directory for smsparse in the /usr/local/smsparse/ directory, where I will assume that you have stored the smsparse.php script and its underlying directory structure with all the keyword scripts. We will call the directory supervise-smsparse:
That's it! If we now create a symlink from the /service directory to our newly created folder, supervise will automatically take care of starting and monitoring our server:
er, we will create a subdirectory to house the execution files for smsd:
# mkdir -p /usr/local/Gnokii/supervise-smsd/
Next, we'll write a new run file:
#!/bin/sh
exec /usr/local/Gnokii/bin/smsd -u sms -p sms -d sms -m mysql
Finally, to start the smsd run file, we link the supervise-smsd directory into /service with:
# ln -s /usr/local/Gnokii/supervise-smsd/ /service/supervise-smsd/
If you now check your process list, you should see your smsparse and smsd processes listed (that is, if you have done everything right):
we can diagnose any problems properly should anything go wrong. As part of the daemontools package, you will find a small program, called multilog, that is capable of logging the output of a service directly to a set of automatically-rotated logfiles. This means that, if we set up our service settings properly, we won't even need to write any special code for the purpose of creating activity logs!
To enable the logging functionality, start out by creating a log directory in supervise-smsparse:
# mkdir -p /usr/local/smsparse/supervise-smsparse/log
The logging process acts much like a normal process running under supervise. It needs its own directory and run file; therefore, we need to create a special run file at /usr/local/smsparse/supervise-smsparse/log/run that contains the following commands:
#!/bin/sh
exec multilog t ./main
Multilog supports a wide range of arguments, which, in turn, make it possible to create very complex logging rules. Our command line above, however, is quite simple and really just means "add a timestamp on each line, and store the logfiles in ./main". The t argument makes multilog prefix each line with a TAI64N timestamp, which represents the number of Temps Atomique International (TAI) seconds since 1970-01-01 00:00:10 TAI. As you might remember from Listing 1, we prepend a date('Y-m-d H:i:s') string before each line is output and, therefore, we will actually have double timestamps in the log file (naturally, you can modify the script to omit its timestamp, or change the multilog invocation to do the same).
We don't need to link the log directory directly from /service. The supervise program will execute the run file it contains automatically for us. However, you must restart supervise to make it aware of the new log directory. You can, once again, use the svc program to send a TERM signal to the service:
# svc -t /service/supervise-smsparse/
A new look at the process list (see Figure 1) will show you that smsparse has been started again, together with the logging process. This means that our services are now managed by supervise and will run indefinitely, all the while providing us with a nice logfile, which we can monitor by using the tail utility:
This example shows that smsparser was started correctly, and 2 keywords were found, hello and success. As you can see, the TAI timestamp at the beginning of each line is a bit cryptic, but it can be translated into a human-readable form by piping the tail output through tai64nlocal like this:
# tail /service/supervise-smsparse/log/main/current | \
  tai64nlocal
2004-01-07 15:57:27.380601500 2004-01-07 15:55:46: Starting sms parser
2004-01-07 15:57:27.380605500 2004-01-07 15:55:46: hello
2004-01-07 15:57:27.380607500 2004-01-07 15:55:46: success
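If you are curious about what hides behind those cryptic labels, the sketch below decodes one by hand. A TAI64N label is an "@" followed by 24 hexadecimal digits: 16 for the seconds and 8 for the nanoseconds. The function name is hypothetical, the code assumes a 64-bit build of PHP, and the offset of 2^62 plus 10 seconds reflects my understanding of the convention daemontools uses, which ignores leap seconds added after 1972.

```php
<?php
// Sketch: decode a TAI64N label such as the ones multilog writes.
// Assumes a 64-bit PHP build; the 10-second constant follows the
// daemontools convention and ignores later leap seconds.
function tai64n_to_unix($label)
{
    if (!preg_match('/^@([0-9a-f]{16})([0-9a-f]{8})$/', $label, $m)) {
        return null; // not a TAI64N label
    }
    $seconds = hexdec($m[1]) - 4611686018427387914; // 2^62 + 10
    $nanoseconds = hexdec($m[2]);
    return array($seconds, $nanoseconds);
}

list($sec, $nsec) = tai64n_to_unix('@400000000000000a00000000');
echo gmdate('Y-m-d H:i:s', $sec), "\n";
```

For everyday use, tai64nlocal remains the more convenient tool; the sketch is only meant to show that the label is an ordinary timestamp in disguise.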
Conclusion
The easiest way to test your new Gnokii setup is to grab another cell phone and send an SMS message containing the word "Hello" to your Gnokii phone. If all goes well, smsparse will pick it up and reply back with the message we entered in the hello keyword script.
As you have probably by now realized, it's not that hard to set up a mobile service through which you can exchange information with your users by utilizing SMS. Even if you're not in the business of running SMS gateways, you could use it for a variety of other activities. For example, you can use it to provide "fun" services, like interactive voting, or a useful server monitoring interface for your internal network. The list of possibilities is very long, and my clients have shown great interest in using SMS as a complement to other services.
If you're worried about scalability, this solution may not be for you, as it will have trouble handling a very large number of messages on a daily basis. However, it is so inexpensive that it could well be a good starting point for a more serious implementation. The good news is that you'll be able to stay with Gnokii even if your needs grow, as newer versions of the package are slated to support multiple phones.
To Discuss this article:
http://forums.phparch.com/126
When Eric's not out skiing or hiking, he's working as a freelance developer on various projects. His current focus is finishing his education in open-air alpine environments.
Welcome to the world of PHP-GTK. Why introduce GTK to a largely web-based language? Well, convenience and portability come to mind, for example. Sometimes it's not feasible to write a Java Swing interface when you've invested so much time in your PHP classes, as you need to rewrite large portions of code. While it could be done, you'd have to fork your code in two projects, and use two different languages. That's not something you can easily convince many clients to do.
Content management, a very common task for most websites these days, represents a typical example of an activity that is often performed directly through the web but that could really be best served by a "true" GUI-based client application. In most circumstances, creating a separate application is an expensive proposition, due to the duplication of code involved, the additional expertise needed and the difficulty of using a language that will run properly on a wide variety of platforms. In this article, we'll tackle porting an existing HTML-based news manager to PHP-GTK, and you'll see how easy it is to make the jump from Web to GUI with this powerful, if often neglected, platform.
In creating our project, we'll start with a data abstraction layer and a traditional HTML interface that we'll ditch later on. This article gets a little complex, so as a prerequisite please install PHP-GTK, and create a table in MySQL with the schema shown in Listing 1. An SQL dump with a few sample rows of data can be found in the files for this article; it's always great to have some sample data to work with.
The Data Abstraction Layer
As a general rule, I create a data abstraction layer for every complex project I work on. Some people swear by this approach, others swear at it. My personal praise goes to abstraction layers because I can do things like automatically change the modified date of a record without remembering to do it in each instance of SQL code. An abstraction layer can also validate data and check the credentials of the person trying to perform changes in a multi-user situation.
Consider the code in Listing 2, which represents a simple data abstraction for a news item. Once you have this example up and running, you can test creating a row in the database with the code from Listing 3. As you can see, once the abstraction layer is established, we don't even have to worry about embedding SQL statements in our code.
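To show why a single write path is so convenient, here is a small in-memory sketch of the idea. It is not the class from Listing 2: the NewsItem name, the $fields array and the explicit $now argument are assumptions made so that the example runs without a database, where the real class would issue an UPDATE instead.

```php
<?php
// Hypothetical in-memory stand-in for a news record; the real class
// would persist each change to the database instead.
class NewsItem
{
    public $fields = array('author' => '', 'subject' => '', 'story' => '',
                           'created' => 0, 'modified' => 0);

    function __construct($now)
    {
        $this->fields['created'] = $now;
        $this->fields['modified'] = $now;
    }

    // Central write path: validates the field name, cleans the value
    // and stamps the modification time in one place.
    function set_property($property, $value, $now)
    {
        if ($property === 'created' || $property === 'modified'
            || !array_key_exists($property, $this->fields)) {
            return false; // unknown or read-only field
        }
        $this->fields[$property] = trim($value);
        $this->fields['modified'] = $now;
        return true;
    }
}

$item = new NewsItem(1000);
$item->set_property('author', ' Morgan Tocker ', 1060);
```

Because every change goes through set_property(), the modified date cannot be forgotten, and field names that do not belong to the record are rejected before they get anywhere near a query.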
Offline Content Management with PHP-GTK
by Morgan Tocker
REQUIREMENTS
PHP: 4.1+ (4.3 or greater recommended)
OS: Windows, Linux
Applications: PHP-GTK, MySQL
Code: http://code.phparch.com/20/4
Code Directory: gtk-cms
Over the years, I have had the opportunity to work on a
few content management systems for websites of varying
complexity While each CMS is a little different from the
others, I can’t help but think that sometimes I find myself
performing the same hacks and workarounds over and
over just to get around the limitations of HTML The
desired output of the majority of our PHP work must be
web based—but management of the content doesn’t
have to be.
+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| id       | int(11)      |      | PRI | NULL    | auto_increment |
| author   | varchar(64)  |      | MUL |         |                |
| story    | text         |      |     |         |                |
| created  | int(10)      | YES  |     | NULL    |                |
| modified | int(10)      | YES  |     | NULL    |                |
| subject  | varchar(255) | YES  |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+
Listing 1
An HTML-based News Manager
Listings 4 through 6 provide the basis for a very simple news management system based entirely on the web. Listing 4 (index.php) is the home page of the system, which creates a list of all the news available in the database. Listing 5 (edit.php) provides the necessary interface for editing the news items and Listing 6 (save.php) takes care of saving our changes to the database.
Although this example works well, there are a few problems with it. First of all, we have no data integrity. For example, the author "Morgan Tocker" is probably the same as the author "Morgan J. Tocker" and "M. Tocker". But if I wanted to compile a list of authors (SELECT distinct(author) FROM news WHERE visible = '1';), it might well contain each of the three individual authors that were just mentioned, since we are allowing each user to enter his or her name every time a news item is created or edited.
Another problem is the handling of whitespace in the author's name. 'This ' does not equal 'this' and 'this ' does not equal ' this'. Got it? Don't laugh: it happens. In an eternal struggle to keep data clean, we can use trim() to zap off the unwanted whitespace, or use an HTML <select> to solve the typos in our first example. This would work, but it comes with another limitation: we couldn't easily add more authors to the list. You could add a field called "other author", or write a bit of JavaScript with an item called "Other" on the list, whereby an onchange() event would prompt the user for the name of the new author, and then recreate the list dynamically.
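The whitespace cleanup mentioned above takes only a few lines. The sketch below uses a hypothetical normalize_author() helper that goes one small step beyond trim() by also collapsing runs of whitespace inside the name; it does nothing about differences in case or about abbreviated names, which is exactly why a list of known authors is still attractive.

```php
<?php
// Hypothetical helper: normalize an author name before saving or comparing it.
function normalize_author($name)
{
    // Strip leading/trailing whitespace, then collapse inner runs of whitespace.
    return preg_replace('/\s+/', ' ', trim($name));
}

$authors = array('Morgan Tocker', ' Morgan Tocker', 'Morgan  Tocker ');
$unique = array_values(array_unique(array_map('normalize_author', $authors)));
// $unique now holds a single entry: 'Morgan Tocker'
```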
What I'd actually like to see here, however, is a combo field. A combo box is neither a text field nor a select box; it's actually both of them at the same time, and it's a blessing (or a curse if you prefer) to all modern operating systems that someone left it out of the HTML 4.0 specification.
    var $id;       // primary key auto_increment
    var $author;   // author
    var $subject;  // subject of the news article
    var $created;  // date the article was published
    var $modified; // modified date
    var $story;    // body of the news
    var $visible;  // bool: is the record visible?
    // ...
            foreach (mysql_fetch_array($result) as $field => $value)
                $this->$field = $value;

        } else {
            mysql_query("INSERT into article (created, modified) VALUES (UNIX_TIMESTAMP(), UNIX_TIMESTAMP())");
    // ...
        mysql_query("UPDATE article SET $property = '$value', modified = UNIX_TIMESTAMP() WHERE id = '" . $this->id . "'");
        $this->$property = $value;
Listing 2
$news->set_property('author', 'Morgan Tocker');
$news->set_property('subject', 'An article by Morgan');
$news->set_property('visible', '1');
$news->set_property('story', 'This is the body of my message');
Listing 3
Getting Your Feet Wet With GTK
Since the kind of functionality that we want cannot be provided by a web browser (at least not without a massive amount of custom work), we'll have to turn elsewhere, and that's where PHP-GTK comes into play. Our PHP-GTK application actually provides a "true" GUI to our news management system, and works on a different machine from that of the webserver.
The core of the application is shown in Listing 7. As you can see, the PHP-GTK version of the news manager is a bit more complex than the plain-HTML one, although the length of the script is quite deceptive, since the functionality of the three scripts that made up the previous application has now been incorporated into a single one.
At the core, however, the application is extremely simple. Essentially, we create a set of GTK objects, and connect them to various handlers, which, in turn, are automatically called by the system when a specific event takes place, such as, for example, the user clicking on a button. Figure 1 shows you the application running on a Linux system.
The PHP-GTK application requires a copy of data.php, which was our Listing 2, so, if you update your class library, be sure to copy it over to your PHP-GTK application. Naturally, this is a great aspect of writing all your applications with the same language, since you're able to happily recycle your code as many times as you want, and you can run it on a variety of platforms.
There is a configuration option in our data.php which chooses the MySQL server to connect to. In the web server's case, it's probably localhost. In the case of the PHP-GTK application, however, you will probably be connecting to the database remotely and, therefore, you should enter the IP or hostname of your server.
Now that the application is running, notice how the combo box used for the author's name makes the application easier to use. Rather than having to build additional pages or cumbersome JavaScript-based solutions, we can rely on the combo box to allow the user to either choose an existing author or create a new one through a single control.
Figure 1
Remembering Data
I'm an Apple Cocoa programmer, and Cocoa applications feature a concept called "defaults". A default is basically the equivalent of a PHP session that never expires. It's a variable that you can set, and will remain available to you indefinitely, even if you shut down the application and launch it again.
Defaults can be really handy for settings and preferences, although they are not quite as easy to implement in a PHP-GTK application as they are in Cocoa. Luckily, I've written a PHP script to store this data, so you won't have to. It creates a file called $SCRIPT_NAME.session, where it stores default information. When you first install (or execute) the application, be sure to create this file in advance with the proper permissions, so that no error will be output even if the user under which the script is running does not have write access to the folder where the defaults file resides.
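The idea of a file-backed default can be sketched in a handful of lines. What follows is not the session.php script that ships with this article, just an illustration of the principle: the load_defaults() and save_defaults() names, the file name and the use of the system's temporary directory are all assumptions.

```php
<?php
// Hypothetical helpers: keep "defaults" in a serialized file between runs.
function load_defaults($file)
{
    if (!is_readable($file)) {
        return array(); // first launch: nothing stored yet
    }
    $data = unserialize(file_get_contents($file));
    return is_array($data) ? $data : array();
}

function save_defaults($file, $defaults)
{
    // Returns true when the payload was written.
    return file_put_contents($file, serialize($defaults)) !== false;
}

$file = sys_get_temp_dir() . '/gtk-cms-example.session'; // assumed location
$defaults = load_defaults($file);
$defaults['last_author'] = 'Morgan Tocker';
save_defaults($file, $defaults);
```

Anything saved this way is still there the next time the application starts, which is all a default really is. Only read such a file if you trust whoever can write to it, since unserialize() should not be fed data from strangers.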
To tap into the features of defaults, you'll need to add the following line to the beginning of your file:
<?php
include_once 'session.php';
?>
Creating a default is the same as creating a session. The GTK application can store data in the $_SESSION superglobal, and the same data will be available on relaunch. The following is an example:
Making the GTK-APP work offline
Now that we have a GUI-based application that doesn't require a browser and a web server to run, the next step would be to make it independent of the database as well, so that you can use it as a completely "offline" application that can be run even when no connectivity is available.
<h2>Edit record <?php echo $news->id ?></h2>
<form method="POST" action="save.php">
<INPUT type="hidden" name='id' value='<?php echo $news->id ?>'>
Listing 5
// ...
header("Location: index.php");

?>
Listing 6
We're 90% there already. All we really have to do is build a proper system of caching and check to make sure no changes have occurred since our last update. There are two generally accepted ways of performing this last operation:
• Checking if the data has changed from the data we grabbed
• Checking to see if the timestamp or the last-modified date is more recent than the timestamp from when we grabbed the record
For our application, I am going to select the second of these choices, given that it's easier to compare timestamps than it is to compare content, particularly if there's a lot of it. However, keep in mind that timestamps are always going to be based on the local machine's clock and, without the database acting as a broker to determine absolute time, it's possible that your content will de-synchronize, thus causing unwanted inconsistencies. Here's how we'll be performing our up-to-date checks:
<?php
$database_copy = new news($id, true);
if ($news->modified < $database_copy->modified) {
    // Provide a warning - our copy is out of date
} else {
    // you may update safely
}
?>
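Stripped of the database access, the comparison boils down to the sketch below. The is_out_of_date() name and the two sample timestamps are made up for illustration; in the application, both values would come from the modified property of the cached and the freshly loaded record.

```php
<?php
// Hypothetical helper: has the server copy changed since we fetched ours?
function is_out_of_date($localModified, $serverModified)
{
    // Equal timestamps mean nobody has touched the record in the meantime.
    return $serverModified > $localModified;
}

$local = 1073490946;   // 'modified' value cached when the record was fetched
$server = 1073491047;  // 'modified' value read from the database just now
$warn = is_out_of_date($local, $server);
```

Keep in mind that UNIX timestamps only have a resolution of one second, so two changes made within the same second cannot be told apart this way.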
Caching Content
Since we cannot store the information in the database, we need a means to cache our information until we can synchronize it. Given that they provide a persistent offline storage mechanism, defaults seem to be the perfect choice here.
We are going to cache each of the objects for later retrieval by adding an update_cache() method to our data.php class, which you can see in Listing 9. For example, to check if we have a cache for record ID 6, we can see if it's an object:
<?php
if (is_object($_SESSION['record']['6'])) {
    // we have cache for 6.
}
?>
To make the synchronization process faster, we could also accept only cached data that is less than 72 hours old as good, without making the round trip to the database to check whether it has changed.
<?php
if (is_object($_SESSION['record']['6']) &&
    (time() < $_SESSION['record']['6']->modified + (3600*72))) {
    // we have recent cache for 6
}
?>
In this case, however, you really want to make sure that your time is properly synchronized with the MySQL server; you may choose to get your current time by executing a SELECT UNIX_TIMESTAMP() on the database server.
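One way to keep the local clock out of the picture is to pass the server's idea of "now" into the freshness check, as in this sketch. The cache_is_fresh() function and its parameters are hypothetical; $serverNow stands for whatever value you obtained from the database server.

```php
<?php
// Hypothetical helper: is a cached record young enough to trust as-is?
// $serverNow is the server's clock, e.g. the result of SELECT UNIX_TIMESTAMP().
function cache_is_fresh($record, $serverNow, $maxAge = 259200) // 72 hours
{
    return is_object($record) && ($serverNow - $record->modified) < $maxAge;
}

$record = new stdClass();
$record->modified = 1000000;
$fresh = cache_is_fresh($record, 1000000 + 3600);       // one hour old
$stale = cache_is_fresh($record, 1000000 + 3600 * 73);  // 73 hours old
```

Fetching the server time once at startup and remembering the difference from the local clock would avoid a round trip for every check, at the cost of trusting that the two clocks do not drift much while the application runs.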
Before we write the data back to the database, we will have to check to see that no changes have occurred while the application was working offline. If there were changes, we will need to display a proper warning, for example by showing a dialogue box.
session_set_save_handler('session_open', 'session_close',
    'session_read', 'session_write', 'session_destroy',
// ...
$tmp = new news($id);
$_SESSION['record'][$id] = $tmp;
Listing 9
Where to go from here
In order for the application to be more versatile, you may want to integrate it with the equivalent of an "Outbox", where changes to content are written to, but no updates take place straight away. The outbox will just be another array of records saved in your defaults, very similar to a cache but organized in a different way that makes it easier to catch and revise updates before they take place.
A good news management system could work similarly to the way most mail clients work, with the potential to work both online and offline depending on whether a connection to the database is available.
Once this mechanism is in place, you can take advantage of the application's layout to add more functionality, such as workflow management. For example, if your environment calls for the approval of news items before they are published, you could manage the entire flow of operations through a series of "drop boxes" where each item is deposited by users with the proper credentials.
Another possible improvement would be to include the possibility of marking certain changes or new news items as "drafts", so that you can save them (without publishing them on to the database) and work on them later.
Finally, the editing method is very basic and would be much more effective, particularly for non-technical users, if it were based on a more advanced interface. Interestingly enough, PHP-GTK also supports Scintilla, a very advanced open-source component that plugs into GTK to provide extended editing capabilities (once you download it from http://www.scintilla.org/, you can compile it into your version of PHP-GTK with ./configure --enable-scintilla --enable-gtkhtml). By working a Scintilla component into your system, you could make the editing process much easier for your users.
Tips for Writing Applications with PHP-GTK
To Discuss this article:
http://forums.phparch.com/127
Morgan Tocker is a freelance developer living and working in Brisbane, Australia. His consultancy business, www.icedotblue.com, is responsible for all sorts of PHP hacks.
Error Checking
The lifespan of your typical PHP-GTK application is usually longer than that of its web-based counterparts. It will have to keep running for several hours, with functions being called over and over again. For a GTK application, you may find that you will want to manage your error handling, and check the integrity of your variables frequently. While you should be doing this with web-based applications, too, there is less of an opportunity for laziness in GTK.
For example, I had a problem with an earlier version of PHP-GTK where the incorrect data seemed to be returned intermittently, and my application crashed and burned. In going through it with a fine-tooth comb I checked the integrity of data at a few points and, if it didn't return the expected results, I either tried again or produced a 'nicer' error.
In summary, it's a good idea to check that an item is still an array/object/integer (or whatever it was supposed to be) and that it is not empty/null. Personally, I look forward to the release of PHP 5 and exception handling, when GTK & PHP can be taken to the next level and it will become easier to tackle these issues.
Portability, Recycling, and Reusing
Another good idea is to try and store the important parts of your code nested in function calls, as opposed to using the traditional linear approach. Keeping in mind the way callbacks work, you will find it easier to work with both a web-based and a GTK version of the same application if they both use OOP techniques. Finally, try to separate your code from your desired output, so that you can create a file like data.php and share it between the two without the need to branch your code.
In the last issue, we talked a little about the Zend Engine internals and how they relate to writing an extension, about how to create an extension skeleton using the ext_skel tool, how to write extension functions and access their parameters (using a scanf() style function), how to return simple types (like strings and integers) and how to build up a PHP array. We covered a fair amount of ground, but there are still plenty more things to learn about PHP extension writing.
In this issue, we're going to look at arrays again and see how it is possible to build multi-dimensional arrays and how to traverse the elements of, or look up a particular value from, an array.
Multi-Dimensional Arrays
As we saw last time, PHP arrays are implemented using hash-tables. This approach allows indexing the array using a string or integer key to fetch its values. Since a hash-table is not a native C type, fetching its values is not quite as simple as with native C arrays. On top of that, the Zend Engine has no built-in support for multi-dimensional arrays; they are simply implemented by storing another array in the appropriate slot of the hash-table. This can be a difficult or daunting prospect for the budding extension author, especially considering the state of the internals documentation, even though it is actually quite simple to implement.
For our first example, let's create a two dimensional array where the first dimension contains a list of first names and the second dimension a list of surnames. If you're not sure what I mean, Listing 1 contains the PHP script equivalent for the C code in Listing 2. The contents of Listing 1 should be self-explanatory, so let's take a look through Listing 2 now, line by line.
Lines 1 through 5 declare a C-style 2D array. The two sets of square brackets tell the compiler that it has two dimensions; the first dimension has 3 slots, while the second dimension has 2 slots. These correspond to the 3 sets of first and last names that we are going to use to initialize our PHP array. Lines 7 and 8 are comments describing the prototype for the function. Hopefully you will recall that these comments, although they have no effect on the code itself, are an important coding convention that helps to remind you how the function is intended to be used. Line 9 uses the PHP_FUNCTION macro to declare the actual PHP function.
Lines 11 and 12 declare some temporary variables: i will represent the person whose name we are adding, and j will indicate if we are looking at their first or last name. The tmparray variable, as its name implies, will act as temporary storage for the array we create for each person. Line 14 initializes the PHP function's return_value as an array, and then we begin a loop on line 16 which will step through each person in our names array, using the variable i as the counter. For each person, we allocate a PHP variable using the MAKE_STD_ZVAL() macro, we set it up as an array (lines 17 and 18), and then we step through each of their names and add them as string elements to our temporary array (lines 19 to 21). Having prepared our "person" array, we need to add it to our "people" array, the return value for the function (line 24).
REQUIREMENTS
Code: http://code.phparch.com/20/2
Code Directory: extensions
As we saw last time, writing PHP extensions in C isn't quite as difficult as you might think. In this issue, we're going to dive into the hash API and use it to traverse arrays and fetch values from them.
The code should be fairly simple to follow, although you might be wondering about two things in particular. The first thing you might ask is whether you should (or should not) worry about freeing the temporary array value. In this case you should not free it: we "gave" it to the Zend Engine when we used add_next_index_zval(), and the engine will take care of freeing it at the appropriate time. If we were to free it ourselves, we would cause a crash some time later in the script that would be difficult to track down.
The other question you might be asking is whether we need to return something from the function. The answer is no: the C function prototype is declared as a void function, so it has nothing to return in the usual sense. Instead, PHP passes us a return_value variable that we populate; it is this variable that will be passed back into your PHP script when the function returns. Since the first thing we are doing is setting up the return_value, we don't need to do anything special after the loops that populate it and, therefore, we simply "fall out" of the bottom of the function.
As you can see, building a multi-dimensional array is not that hard. Although my example is quite succinct, the same principle can be used to build PHP arrays with any number of dimensions: you simply create a new intermediate array to hold the contents of the dimension you want to add, and then add it. You're not limited to strings for the values either; you can use any valid zval value (integers, real numbers, strings and boolean values, or even resources if you want to).
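If the sequence of steps is easier to follow in PHP, the same loop structure can be written in script form as below. This is neither Listing 1 nor Listing 2, just a rough script-level rendering of the walkthrough; the variable names mirror the ones used in the C code, and the names are the three pairs used in the example.

```php
<?php
// The same loop structure in PHP: one temporary array per person,
// appended to the outer array once it has been filled.
$names = array(
    array('Rasmus', 'Lerdorf'),
    array('Zeev', 'Suraski'),
    array('Andi', 'Gutmans'),
);

$people = array();
for ($i = 0; $i < 3; $i++) {
    $tmparray = array();               // the "person" array
    for ($j = 0; $j < 2; $j++) {
        $tmparray[] = $names[$i][$j];  // first name, then surname
    }
    $people[] = $tmparray;             // hand it over to the "people" array
}
```

Adding a third dimension would simply mean one more level of temporary array and one more loop.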
Now that you have mastered returning multi-dimensional arrays, how about looking at working with multi-dimensional arrays that have been passed into your function?
Getting Stuff Out of Arrays
There are two things that you will typically want to do with an array that has been passed to your function: either you want to look up a specific keyed value and do something with it, or you want to step through all values and do something with each of them. We'll deal with the first of these now.
So far, we've used some really convenient macros to add items to arrays; these macros insulate us from the not-so-pretty guts of the hash table implementation. However, we've now reached a point where we must step beyond these macros, because there are no macros for fetching an item from an array.
Before we delve in, it's worth thinking for a minute about how you use arrays in your PHP scripts. Imagine that you have a PHP script that accepts a couple of $_GET parameters, name and age, and displays them on some kind of e-card. Let's also pretend that the age parameter is optional: the e-card will happily display something good regardless of whether the age parameter is passed or not. PHP (being the nice flexible thing that it is), will allow you to access the age parameter using $_GET['age'] syntax, even if it is not there (the value returned to your script will be NULL in that case and, at worst, the interpreter will print out a warning message to indicate that the element does not exist). If you are slightly more strict with your code, you might first want to check that the age value is present by using isset() and then take a different course of action.
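In script form, the stricter approach could look something like the following sketch. The ecard_text() function and the wording of the card are invented for this example; an array is passed in instead of reading $_GET directly so that the snippet can be run from the command line.

```php
<?php
// Hypothetical e-card helper: 'name' is required, 'age' is optional.
function ecard_text($params)
{
    if (!isset($params['name'])) {
        return null; // nothing sensible to display
    }
    $text = 'Happy birthday, ' . $params['name'] . '!';
    if (isset($params['age'])) {
        $text .= ' ' . (int) $params['age'] . ' today!';
    }
    return $text;
}

echo ecard_text(array('name' => 'Wez')), "\n";
```

The extension code we are about to write has to perform the same presence checks, only with considerably less forgiveness if they are left out.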
This is a simple validation of input parameters and, while PHP allows you to be a lazy script coder, it doesn't allow you to be a lazy extension author: you must check if an element is present before you access it, since the NULL you get back from the hash API is the kind that causes a crash if you don't handle it properly. With that in mind, take a look at Listing 3, which represents our hypothetical e-card generating function. The idea is that you pass an array of values to the function, and it will pull out the name and age.
Lines 1 to 3 are the usual prototype comments and the PHP_FUNCTION declaration. Next, we declare a variable to point to the array passed in as the parameter to the function on line 5. This is the same as the way that we declared the temporary array variable from the last example. Line 6 declares two variables to hold the name and age values; they are declared as zval ** because the hash table stores zval * and returns a pointer to its storage address. This allows you to modify its stored value if you wish, but you don't want to do something like that unless you are really confident in your abilities; in my experience, it's better to just stick to using the main API functions.
5 array("Rasmus", "Lerdorf"),
6 array("Zeev", "Suraski"),
7 array("Andi", "Gutmans")
Listing 1
7 /* {{{ proto array phpa_2d_array()
8 Returns a 2d array of names */
Listing 2
The next thing is fetching the array parameter using zend_parse_parameters(). The "a" format code indicates that we want an array value; we are storing it into the variable named array. If the user doesn't supply a single array as the parameter, an appropriate warning message is displayed and our function will return a NULL value (remember that the default return value is NULL, so we don't need to do anything special to get a NULL value here).
Now we're in new territory: zend_hash_find() is the function to use to look up a value by string key. It accepts four parameters: the first is a pointer to a hash table, the second is a pointer to the key string, the third is the length of the key, including the NUL terminator, and the fourth is a pointer to a zval ** that will receive the value if it exists. You can get at the hash table contained in a zval using the Z_ARRVAL_P() macro. Before you use it, you must make sure that the zval really does reference an array value, otherwise you will get garbage results and most likely a crash. In this case, zend_parse_parameters() has already performed the check for us (we told it we wanted an array), so we don't need to do anything further.
The zend_hash_find() function returns SUCCESS if the element exists or FAILURE if it does not. Beware: the values for SUCCESS and FAILURE are such that you must always explicitly compare for the value you want to check. Do not assume that SUCCESS will evaluate to TRUE or that FAILURE will evaluate to FALSE. Another potential gotcha is the length of the string key, which must include the NUL terminator for the string. The convention used within PHP is to use the sizeof() operator when you are passing a string that you know at compile time, since sizeof() on a constant string resolves to the string length plus one for the terminator; it is handled at compile time and saves your CPU a few cycles when you call the function from your script. However, if you don't know the string at compile time (perhaps it was passed as a parameter to your function too), you should not use the sizeof() operator, because it will resolve to the size of a string pointer, not the size of the string itself. So, at runtime, you need to call strlen() and add one to the result to arrive at the correct length for a key.
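The two gotchas above can be tried outside the engine. The following is a minimal standalone sketch, not code from the listing: DEMO_SUCCESS, DEMO_FAILURE, LITERAL_KEY_LEN and the demo_ functions are hypothetical stand-ins, and the status values are assumed to be 0 and -1, as the Zend headers of that era define SUCCESS and FAILURE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for SUCCESS and FAILURE (assumed to be 0 and -1). */
enum { DEMO_SUCCESS = 0, DEMO_FAILURE = -1 };

/* Key known at compile time: sizeof on a string literal counts the NUL. */
#define LITERAL_KEY_LEN(lit) (sizeof(lit))

/* Key only known at run time: strlen() excludes the NUL, so add one. */
static size_t runtime_key_len(const char *key)
{
    assert(key != NULL);
    return strlen(key) + 1;
}

/* The pitfall: sizeof on a pointer yields the pointer size, not the key size. */
static size_t pointer_size_pitfall(const char *key)
{
    return sizeof(key);
}

/* Hypothetical lookup that only "finds" the key "name". */
static int demo_find(const char *key)
{
    return strcmp(key, "name") == 0 ? DEMO_SUCCESS : DEMO_FAILURE;
}
```

With these values, a bare truth test such as if (demo_find("name")) takes the branch only when the lookup fails, which is why the comparison against the status constant should always be spelled out.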
You might be wondering about the (void **) cast on the last parameter to zend_hash_find(). It is just there to keep the compiler from issuing an incorrect warning. Remember that this function wants to return a pointer to its storage for the element? In C, a generic pointer to something has the type void *, and when you want to return a value by reference in C, you add an extra asterisk, so the type becomes void **. Since we are dealing with data that is already a pointer, we have an extra level of indirection that makes our fourth parameter appear to be a void *** equivalent; this causes the compiler to issue a warning because it looks like we might have made a mistake. In this case we are safe, so we use the cast to hide the warning. Be very careful though: it is still very easy to make mistakes when dealing with all these pointers, even if you are an experienced C coder.
Back to the listing then. We have now managed to fetch the name element from the array that was passed to our function, and now we want to print it out as a string. It is very important to stress that the value we have is a zval and that, beyond that, we don't know anything else about it. If it is a string, we can just print out the string value, but if it has any other type it will need to be converted first, otherwise we risk crashing the engine. The convert_to_string_ex() API call will handle this situation for us in the best possible way: it will do nothing if the value is already a string, otherwise it will convert it to a string by making a copy of the value and converting the copy. The reason for making a copy is that you don't want to change the original value directly, since this would be reflected in the script as a sudden "magical" change in the type of that array element.

Listing 3 begins:

    /* {{{ proto void phpa_emit_ecard(array fields)
       Emits a personalized e-card greeting */
Now that we have the name in string form, we simply print it out to the output buffer mechanism using the zend_printf() function (it's equivalent to the printf() function you'd call from your PHP scripts, but channels its output through the scripting engine, so that it can be inserted properly in the script's overall output buffer). Note that we are using Z_STRVAL_PP() to access the underlying string value. Earlier in the article we used Z_ARRVAL_P() to get at an array value. You can see that the names and functions of these two macros are similar and reasonably intuitive: the former returns a string value while the latter returns the array value (the underlying hash table). The potentially confusing part of the names is the trailing P or PP. What does that mean? Each P represents a level of pointer indirection, so if you are accessing a zval *, you should use the P version of the macro, but if you are accessing a zval ** you should use the PP version of the macro. There are a whole bunch of related macros that allow you to access the string value, string length, integer value, floating point value and so on. Keep in mind that you should not use these macros unless you know that the zval is of the appropriate type.
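The P and PP naming convention can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in: demo_val is a toy structure rather than the real zval, and the DEMO_ macros only mimic the naming scheme of the engine's accessors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a zval that can only hold a string. */
typedef struct {
    const char *str;
    size_t len;
} demo_val;

/* One level of indirection: the argument is a demo_val *. */
#define DEMO_STRVAL_P(p)   ((p)->str)
#define DEMO_STRLEN_P(p)   ((p)->len)

/* Two levels: the argument is a demo_val **, so dereference once
 * and reuse the single-P version. */
#define DEMO_STRVAL_PP(pp) DEMO_STRVAL_P(*(pp))
#define DEMO_STRLEN_PP(pp) DEMO_STRLEN_P(*(pp))

/* Reads the length through a pointer-to-pointer, as a hash lookup would hand it out. */
static size_t demo_len_via_pp(demo_val **pp)
{
    assert(pp != NULL && *pp != NULL);
    return DEMO_STRLEN_PP(pp);
}
```

Passing a demo_val ** to a single-P macro (or the reverse) would not compile here, which is a useful safety net; the real macros give the same kind of protection only if the variable types are declared correctly.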
Having now printed the name, we proceed to look up the age. This is done in a similar way to the above, but this time we want to print the age as a number, so we use convert_to_long_ex() to ensure that we have an integer value, and Z_LVAL_PP() to access that value. If the age was not found in the array, a more generic message is used instead of an age-specific salutation.
That's it, our function is complete. Or is it? When we print out the name using zend_printf(), we are relying on the string being a regular C-style NUL-terminated string, since that is what the printf() family of functions expect. Since any string in PHP could potentially be a binary string (maybe it is a far-eastern multi-byte string), we are probably going to end up clipping the string at the wrong point and generating broken output. The fix for this situation is to use the PHPWRITE() macro instead and pass Z_STRVAL_PP(name) and Z_STRLEN_PP(name) as its parameters.
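The clipping problem is easy to reproduce in plain C. This is a standalone sketch using only the standard library; bytes_via_format and bytes_via_write are hypothetical helper names, and fwrite() merely plays the role that a length-aware write such as PHPWRITE() plays inside the engine.

```c
#include <assert.h>
#include <stdio.h>

/* Number of bytes a "%s" conversion would emit: it stops at the first NUL. */
static size_t bytes_via_format(const char *buf)
{
    assert(buf != NULL);
    int n = snprintf(NULL, 0, "%s", buf);
    return n < 0 ? 0 : (size_t)n;
}

/* Length-aware write: emits exactly len bytes, embedded NULs included. */
static size_t bytes_via_write(FILE *out, const char *buf, size_t len)
{
    return fwrite(buf, 1, len, out);
}
```

For a seven-byte buffer holding "abc", a NUL byte, and "def", the format-based path accounts for three bytes only, while the length-aware path writes all seven.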
If you want to access an array element using an array index, you can use the zend_hash_index_find() function. It takes three parameters: the first is the hash table, the second is the integer value of the key and the third is a pointer to a zval **. In other words, you use it in the same way as you use zend_hash_find(), but instead of passing the string and the string length, you pass the integer value of the key.
Iterating Arrays

Now that we know how to pull specific items out of an array, what about doing the equivalent of foreach(), so that we can print a list of names? Before we delve into the C code, let's just refresh our memories about how we can iterate arrays in the PHP script itself. There are three different ways to achieve this. The first and simplest approach, familiar to programmers coming from other languages, is to use an integer counter and step through the elements from 0 to the number of elements minus one using a for loop. This first approach is fine if your array is only ever indexed by integers, but this doesn't always hold true in PHP.

That leads us on to the second method. Arrays have an internal position pointer that you can adjust using the end(), next(), prev(), current(), each() and reset() functions. Using various combinations of these allows you to step through and fetch elements from the array. This method is useful, but since these functions operate on the internal array pointer, anything else that changes that pointer while you are looping will mess up the loop.

The final approach is to use the foreach() control structure that was introduced in PHP 4. foreach() works in a similar way to each() and next(), although it has a little more tolerance to things messing with the internal position pointer, since it creates a copy of the array before working on it.

Listing 4 begins:

    /* {{{ proto void phpa_iterate_array(array array)
       For each element of the array, print the key and value */
It should be apparent that touching the internal array pointer while inside a looping control structure is a bad thing, so we want to do something that is more like the traditional for-loop approach, and store the array position in a local variable in our extension function. Of course, we want it to work with string keys as well as integer keys.
Let's look at Listing 4, which demonstrates how to iterate an array and print out the keys and values. Lines 1-3 have the familiar prototype comments and PHP_FUNCTION declaration. Lines 5-10 declare the variables that we will be using: a variable to reference the array parameter, another to hold a pointer to the key if it is a string, another for the length of that string, a long to hold the integer value of the key if it is not a string, a zval ** to hold the element value and, lastly, a HashPosition variable that will keep track of where we are in the array (you can think of this as being a bit like the integer index you would use in a traditional for-loop, except that it works with string indices too). Lines 12-15 validate the function parameters to ensure that we receive only a single array.
Now we are ready to begin the actual iteration. The first thing we want to do is initialize our HashPosition variable so that it points to the first element of the array. This is achieved by calling zend_hash_internal_pointer_reset_ex() and passing it the hash table from the array and a pointer to the pos variable. The name of this function is a little misleading: when given a position variable, it doesn't touch the internal pointer at all.
We want to keep looping until we run out of elements, so let's use a while-loop and check the return value of the zend_hash_get_current_data_ex() function. This function is similar to zend_hash_index_find(), except that instead of passing an integer index, we are passing our hash position. If the function returns SUCCESS, it will have stored the value of the current array element in our item variable. If there are no more elements, it will return FAILURE instead; we use this fact to break out of the while-loop at the appropriate point.
We also want the key for this element; we can use zend_hash_get_current_key_ex() to get it. This function is a little bit complicated, since it needs to be able to return a string key (and its length) or an integer key, so it requires that you pass suitable variables to receive those values. It's important to stress that requirement: even if you are only interested in integer keys, you still need to pass valid pointers for the string and length. The opposite is also true: if you only want strings, you still need to supply a variable to hold integer values.
The zend_hash_get_current_key_ex() function returns one of three values: HASH_KEY_IS_STRING indicates that the key is a string key, HASH_KEY_IS_LONG indicates that the key is an integer index and HASH_KEY_NON_EXISTANT indicates that there is no element at the current position. I'm using a switch statement to print the key correctly based on its type. It is worth noting that there is no need to check for HASH_KEY_NON_EXISTANT here, since zend_hash_get_current_data_ex() will have returned FAILURE before we reach this point.
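The shape of the loop in this section (reset a caller-owned position, fetch data until FAILURE, ask for the key type, advance) can be sketched without the engine. What follows is a standalone model built on a linked list; every demo_ name is a hypothetical stand-in that only mirrors the roles of the Zend functions discussed here, and the status values are assumed to be 0 and -1.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_SUCCESS = 0, DEMO_FAILURE = -1 };
enum demo_key_type { DEMO_KEY_IS_STRING, DEMO_KEY_IS_LONG, DEMO_KEY_NON_EXISTANT };

typedef struct demo_entry {
    const char *str_key;          /* NULL when the key is an integer */
    long num_key;
    int value;
    struct demo_entry *next;
} demo_entry;

typedef struct { demo_entry *head; } demo_table;
typedef demo_entry *demo_position;   /* cursor owned by the caller */

static void demo_reset_ex(demo_table *t, demo_position *pos) { *pos = t->head; }

static int demo_get_current_data_ex(int **data, demo_position *pos)
{
    if (*pos == NULL) return DEMO_FAILURE;   /* ran out of elements */
    *data = &(*pos)->value;
    return DEMO_SUCCESS;
}

/* Both str and num must be valid pointers, whichever key type comes back. */
static enum demo_key_type demo_get_current_key_ex(const char **str, long *num,
                                                  demo_position *pos)
{
    if (*pos == NULL) return DEMO_KEY_NON_EXISTANT;
    if ((*pos)->str_key != NULL) { *str = (*pos)->str_key; return DEMO_KEY_IS_STRING; }
    *num = (*pos)->num_key;
    return DEMO_KEY_IS_LONG;
}

static void demo_move_forward_ex(demo_position *pos)
{
    if (*pos != NULL) *pos = (*pos)->next;
}

/* Walks the table, counts each kind of key and returns the sum of the values. */
static int demo_sum_values(demo_table *t, int *string_keys, int *long_keys)
{
    demo_position pos;
    int *item;
    const char *str = NULL;
    long num = 0;
    int sum = 0;

    assert(t != NULL);
    *string_keys = *long_keys = 0;
    demo_reset_ex(t, &pos);
    while (demo_get_current_data_ex(&item, &pos) == DEMO_SUCCESS) {
        switch (demo_get_current_key_ex(&str, &num, &pos)) {
        case DEMO_KEY_IS_STRING: (*string_keys)++; break;
        case DEMO_KEY_IS_LONG:   (*long_keys)++;   break;
        default: break;           /* unreachable: data fetch already succeeded */
        }
        sum += *item;
        demo_move_forward_ex(&pos);   /* advance before the next iteration */
    }
    return sum;
}
```

Because the position lives in a local variable, nothing else that walks the same table can disturb this loop, which is the property the _ex functions are meant to provide.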
The rest of the code inside the loop should be self-explanatory by now, except for the very last line: we need to advance to the next element before continuing with the next iteration of the loop, and we achieve that using zend_hash_move_forward_ex().

Summing Up

By now you should be feeling pretty good at working with arrays in your PHP functions. We've seen how to build up arrays and multi-dimensional arrays, and how to pull values out of an array by string key and by numeric key. We've also seen how to iterate through the contents of an array. All this should give you plenty of ammunition for when you decide to move your PHP code over to C.
To Discuss this article:
http://forums.phparch.com/124
Wez Furlong is the Technical Director of The Brain Room Ltd., where he uses PHP not only for the web, but also as an embedded script engine for Linux and Windows applications and systems. Wez is a Core Developer of PHP, having contributed SQLite, COM/.Net, ActivePHP, mailparse and the Streams API (and more), and is the "King" of PECL, PHP's Extension Community Library. His consulting firm can be reached at http://www.thebrainroom.net.
The Need For Speed
Optimizing your PHP Applications
by Ilia Alshanetsky

REQUIREMENTS
PHP: 4.1+
OS: N/A
Applications: Optional: Turck MMCache, APC, PHP Accelerator, Zend Cache

The ever growing popularity of the web is putting a continually growing stress on the software and hardware used to power the common website. This article will help you combat the growing server loads and increase your web serving capacity without resorting to costly hardware.

Before starting on our quest for performance, let me pass along a small word of caution. Making your applications faster is certainly a noble goal but, unfortunately, it will often require a fair bit of time and frequently expose or introduce bugs. It is absolutely critical that you do not begin optimization prematurely, as doing so will virtually guarantee that deadlines will be missed and that the likelihood of ending up with a working program will be slim. Only optimize your applications once the code has been completely written, tested and deemed acceptable, and always set specific performance levels you seek to attain. Without a specific goal, you can just keep on optimizing forever, as there will always be some other tricks and tune-ups you could apply.

Now that we've gotten the standard optimization disclaimer out of the way, let's get to the fun part: doing the actual work. While you can certainly gain significant performance increases from optimizing your PHP code, this is usually the type of optimization you would want to leave till the very end, when all other options are exhausted. Optimizing the actual script can be a fairly drawn out process and there is always a risk of breaking working code. Whenever possible, it is better to optimize things outside of your code that will have a positive impact on the performance of your applications. As you can probably guess, the focus of this article will be optimizations that do not actually require code modification and still make your PHP applications run much faster.

Getting Started

The first step consists of optimizing the PHP executable itself, which will make all the scripts executed by it run faster. This can be done by making your C compiler, such as gcc, work harder when compiling PHP and tune the binary executable it generates for maximum performance. This optimization is performed by specifying several settings to the compiler via the CFLAGS environment variable. This variable, in turn, is used by the configuration script, which then passes these values on to the compiler at build time. It is important to note that while I am mentioning these options only in the context of PHP, these optimization flags are applicable to all parts of the system, and the more efficient the system, the faster it will be able to run everything, including your PHP applications.

Below is an example of a modified PHP building procedure, which leaves room for compile-time tuning.

    export CFLAGS="-O3 -msse -mmmx -march=pentium3 \
    -mcpu=pentium3 -mfpmath=sse -funroll-loops"
    ./configure
    make
    make install
What do these options do? The first one, -O3, indicates what level of optimization the compiler should use. Normally, PHP uses only -O2, which is considered to be "safe", as too much optimization can cause stability issues. However, given the evolution of compilers, -O3 is, in my experience, just as safe and many projects have already adopted it as their default optimization level. The main difference between the two is that -O3 enables function inlining, which allows the compiler to optimize out some functions by replacing function calls with a copy of their code. Another optimization technique that is enabled by -O3 is register renaming, which allows the compiler to take advantage of unused registers for various tasks; this is very handy on modern processors with large numbers of registers that are frequently left unused.
The downside of -O3 is that it makes the generated code nearly impossible to debug, since the register rearrangement creates a situation where a valid backtrace in the event of a crash cannot be generated. However, since you should not encounter crashes in a production environment, this is a fairly acceptable loss in most situations.
In our compilation script above, we have a set of options that tell the compiler in a fair bit of detail what processor the server has and what features it supports. This allows the compiler to apply various tricks and optimizations that are specific to a particular CPU (a Pentium III in our case). This is not normally done when producing binaries for distribution, since the goal there is to generate portable code that can run on as many models of CPUs for a particular architecture as possible.
Of course, enabling CPU-specific targeting means that the portability of the generated binary will be limited to a single processor type. For example, code tailored for the Pentium III via the -march and -mcpu switches (such as the one in my example) will not work on older Pentiums and AMD processors. If you are compiling PHP for a server farm that uses all types of CPUs, you may not want to use CPU tailoring options, as they would require you to compile a separate PHP executable for every CPU type.
The other three options, -msse, -mmmx and -mfpmath=sse, indicate that my processor supports these extended instruction sets and tell the compiler that it should try to use them to generate more optimal code. SSE and MMX are primarily math-related instruction sets and their usage can significantly accelerate any mathematical operations the underlying C code needs to perform.
The last option I specify, -funroll-loops, tells the compiler that it should unroll any small loops. The effect is a reduction in the number of instructions the processor needs to execute, since there is no more loop. However, the resulting binary will be slightly larger since, instead of a single instance of the code in the loop, you'll now have the code inside the loop repeated as many times as the loop would have run.
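To make the trade-off concrete, here is what unrolling looks like when done by hand. This is only an illustrative sketch: sum_rolled and sum_unrolled4 are hypothetical names, the factor of four is an arbitrary assumption, and a compiler given -funroll-loops chooses its own factors and may remove a short fixed-length loop entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: every element costs one addition plus the loop
 * bookkeeping (compare, increment, branch). */
static long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Unrolled by four: the bookkeeping is paid once per four elements,
 * at the price of a larger loop body and a clean-up loop. */
static long sum_unrolled4(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)            /* leftover elements */
        total += v[i];
    return total;
}
```

Both functions compute the same result; the second simply trades code size for fewer branches, which is the same bargain the compiler flag makes automatically.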
Configuring PHP Properly

Now that we have set our compiler options, let's review the configuration of PHP itself, as that, too, can have significant impact on performance.

In most cases, PHP is used for serving web pages, usually as an Apache server module. The standard approach is to compile PHP as a shared Apache module that the web server then loads on startup. This is the recommended approach, as it allows for easy PHP upgrades that do not require recompilation of Apache. However, this is most definitely not the most performance-friendly approach.

When generating a dynamically loadable module, the linker will add a series of hooks to allow the module to be loaded, which, among other things, does not allow the compiler to optimize the generated code to the fullest. The end result is that the compiled PHP executable is anywhere between 10% and 25% slower than it would be had it been compiled statically into Apache.
    # PHP configure line
    ./configure --with-apache=/path/to/apache_source

    # Apache configure line
    ./configure --activate-module=src/modules/php4/libphp4.a
The configuration procedure above will compile PHP directly into Apache, making PHP part of the Apache server executable. As you can imagine, this means that upgrades of Apache or PHP will require you to recompile both packages. However, given the infrequent releases of both projects and the relatively quick compilation, the extended build procedure is more than made up for by the performance increase.

You can offset the increase in compilation time caused by the static compilation by reducing the number of extensions PHP compiles, and that will also increase performance. By default, PHP compiles a number of extensions that you may never use and that, in the end, only increase the size of your PHP binary, causing it to use more memory. Worse yet, some extensions will initialize various buffers and parameters on every request, slowing down the data serving process. You should try to compile only the extensions you need and disable extensions that you do not intend to use.
    ./configure \
      --disable-all \
      --disable-cgi \
      --disable-cli \
      --with-apache=/path/to/apache_source \
      --enable-session \
      --with-pcre-regex \
      --with-pgsql
The example above uses the --disable-all configuration flag to disable, in one go, all extensions that are enabled by default, saving the time needed to find all of the default extensions and disable them individually. It will also automatically disable any newly enabled-by-default extensions that appear in the future, without you having to go through the configuration manually. The --disable-cgi and --disable-cli configuration directives explicitly disable the generation of the CGI and CLI SAPIs, whose compilation is not automatically disabled by the --disable-all flag. Since only the Apache SAPI is needed, there is no need to waste time building binaries that will not be used.
Once all the unneeded SAPIs and extensions have been disabled, the needed extensions are enabled and the compilation process can begin. The end result is a smaller binary, which is especially important for SAPIs such as CGI and CLI, where the startup costs occur on every request. A smaller binary will load that much faster, allowing it to get to code processing quicker. More importantly, unneeded initializations will not be performed, making PHP work faster in all instances, regardless of the underlying SAPI.
Optimizing the INI File
With the PHP configuration and compilation out of the way, it's time to turn to the php.ini configuration directives, which can be used to improve the overall performance of your scripts as well.

I'll begin with the register_globals option, which is already off by default as of PHP 4.2.0. However, many people still have it enabled, since their configuration was never updated as they upgraded their versions of PHP. This option makes PHP register a potentially large number of variables based on user and system input, and it also makes certain security exploits possible. It is recommended to keep this option off and use the readily available super-globals to access the data passed by the user through POST and GET queries or browser cookies.
You can further optimize the process of creating variables based on user input by changing the variables_order directive. It indicates which sources of client-generated information should be used to populate the super-globals, as well as in which order they should be considered when building $_REQUEST, which is a cumulative result of the contents of other super-globals. By default, this option has a value of EGPCS, meaning that data from the system environment, the server environment, as well as user GET/POST/COOKIE input is stored. Storage and creation of array elements inside super-globals can take a hefty amount of memory and will have a negative impact on performance, as this process is repeated during every single request. Therefore, you can improve the overall performance of your system by reducing the number of super-globals that are being created. In most situations, this means that you can set the value of variables_order to just GPC, so that only the data passed by the user in the GET/POST queries or through cookies is stored inside super-globals. The effect of this choice is a much faster input parsing procedure and a smaller memory footprint. If you need to use environment or system parameters, you can fetch them individually using the getenv() function instead, which will not cause a consistent performance impact.
Beyond the standard super-globals, PHP also creates special variables that are used to store data that is passed via the command line. In a web environment, your PHP scripts will never be passed arguments in such a manner and, therefore, creating those variables is not necessary. You should disable register_argc_argv, which is the PHP setting responsible for the creation of these variables, to further speed up your scripts. Keep in mind that, if you use the CLI SAPI, you will need to leave this option enabled, otherwise your scripts will not be able to retrieve arguments passed to them via the command line.
When parsing user input, PHP automatically escapes the data to prevent the user from injecting special characters that can potentially result in undefined behavior in certain portions of your scripts. This automation is not always needed, since not all data fetched from the user is used in a manner that gives special characters a chance to cause trouble. It would be better to disable this automation by turning off the magic_quotes_gpc directive and manually escape the data as needed using addslashes(), or whatever escaping function is most appropriate for the situation. For example, in some cases you need to use special escaping functions that are specifically tailored to secure data in a particular context, such as escapeshellcmd() for command lines and mysql_escape_string() for MySQL queries.
The advantages of doing your own escaping are numerous. First of all, you only escape what you need, thus reducing the amount of time PHP spends parsing user input. You also save memory, as the escaping process will allocate twice as much memory to store an escaped string than it would normally for an unescaped one. Moreover, you also get a better-designed application that does not depend on a particular server configuration and is capable of working securely in an environment where magic_quotes_gpc is disabled.
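Taken together, the input-handling recommendations above could be expressed as a php.ini fragment along these lines. It is a sketch of the settings named in the text rather than a drop-in configuration; check each value against the needs of your own scripts before adopting it.

```ini
; Input handling (values follow the recommendations above)
register_globals   = Off    ; use the super-globals instead
variables_order    = "GPC"  ; skip environment and server data
register_argc_argv = Off    ; leave On for CLI scripts that read arguments
magic_quotes_gpc   = Off    ; escape manually, per context
```

Scripts that relied on automatic escaping or on registered globals will need auditing before these switches are flipped.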
Beyond variable creation, there are a number of other INI settings that are important for optimization purposes. By default, every response PHP sends carries an X-Powered-By header, which shows what version of PHP you are running. For the purposes of rendering the page, this header is completely useless and, unless the user fetches the headers manually, it will never even be visible. In fact, just about the only people who can make use of this field are those trying to compromise your system, who for that purpose need to determine what software is being run on it. It would be prudent, therefore, to disable sending of this header by setting the expose_php setting to off. Not only will this make a potential attacker's job more difficult, but it will also save a little bit of bandwidth and slightly increase performance by not sending useless data over the connection with your client.
Speaking of sending data across the wire to your users, this is another area where proper INI configuration can be of much use. By default, PHP will print the data to the user as soon as your script outputs it, resulting in many write operations, each sending a small bit of data to the socket. This can become quite slow, especially for large pages, since many system calls will need to be performed to write the data and at least some browsers will re-render the page each time a small chunk of data is received, making the user's experience less than pleasant. The alternative is to buffer the data in memory and send it in large chunks, thus reducing the number of writes to the socket and potentially speeding up the rendering time on the client.
Output buffering can be enabled and controlled via the output_buffering option, which allows you to specify how big the memory buffer used to store a script's output should be. Ideally, you would want this buffer to be about the same size as the average page you send to your clients; this way, your average script output can be sent across the wire in one large chunk. At the same time, you should be careful not to create overly large buffers, as each PHP instance will have a buffer of its own and, with many instances running at the same time, this can add up to quite a few megabytes, potentially exhausting all available memory.

Figure 1
Another solution that can accelerate the process of sending data to the user is compression. PHP supports a GZIP-compressed output buffer handler that can be used to compress the data sent to the user in a manner that is automatically recognized by most modern browsers. For those users with compatible browsers, compression will reduce the size of the page many times over. The decrease in page size is especially convenient for users with slow connections, for whom this technique can shave off several seconds from the time it takes to load each page. In addition, faster data transmission allows server processes to be freed earlier, which, in turn, makes it possible for your server to handle a greater number of requests in any given timespan. Another pleasant side effect (on a very large scale) is the reduced bandwidth bill; I have seen bandwidth usage cut by as much as 40-50% by simply introducing compression.
Better yet, implementing this feature does not require any code modification and it can be enabled by simply setting output_handler to ob_gzhandler inside the php.ini file. Alternatively, you can enable it for individual virtual hosts inside httpd.conf or specific directories via .htaccess, or even via ini_set() inside scripts that output large quantities of text. You should, however, keep in mind that compressing the data does require CPU power and will increase the server load slightly. In most cases, though, the benefits of faster loading pages, minimized bandwidth usage and a reduced number of server processes will outweigh the inevitable slight increase in CPU usage.
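As a sketch, the output-related settings discussed so far might look like this in php.ini. The buffer size of 4096 bytes is purely an assumption for illustration; the text recommends sizing it to your own average page.

```ini
; Output handling (sketch; the buffer size is an illustrative assumption)
expose_php       = Off            ; no X-Powered-By header
output_buffering = 4096           ; roughly one average page, in bytes
output_handler   = ob_gzhandler   ; compress output for capable browsers
```

Measuring CPU load and bandwidth before and after enabling compression is the simplest way to confirm that the trade-off works in your favour.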
On occasion, you may find yourself using PHP not only to send data, but also to retrieve it from a remote source (for example, when implementing a network client like an e-mail application that has to retrieve messages from an IMAP server).

In these situations it is important to keep in mind that the Internet is not a local storage medium, and getting data out of it can be quite slow. You probably don't want to spend too much time waiting for the external source to respond to your query, or you may run the risk of bogging down your whole server. To prevent endless waiting, you should use the default_socket_timeout setting, which allows you to define how many seconds PHP should wait before giving up on fetching data from a remote source. This is especially important in a web environment, since while your script is waiting for data its web server instance cannot be used to serve other requests, potentially requiring the creation of additional processes and resulting in an increased server load.
In addition to remote sockets, you are likely to be working with local sockets in the form of database connections. Tuning your connection parameters is a very important step that will prevent connection overload, which may result in a performance drop and refused connections leading to broken pages. I recommend that you use the max_links and max_persistent options that exist for most database interfaces to specify how many connections PHP may keep open at any one time. By default, these options are set to -1 (unlimited), which in most situations is not a good idea, since it could lead to PHP trying to open more connections than your database server can handle. This setting is especially important when using persistent connections, which in an Apache environment will soon result in each child having its own connection open to the database. It is absolutely critical to ensure that there are strict controls to prevent persistent connections from taking up all possible database sockets, thus causing the DB server to refuse all other connections.
In many instances (for example, if you run a shared host), it may be prudent to disable persistent connections altogether via the allow_persistent directive. This will automatically convert all attempts to open persistent connections into regular connections and help prevent a possible overload on your server.
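To see whether a server is still running with unlimited connection settings, a small audit script can help. The sketch below is only an illustration: the unlimited_directives() helper is hypothetical, and the directive names assume the MySQL extension's prefix (mysql.max_links, mysql.max_persistent), so substitute the prefix of the database interface you actually use.

```php
<?php
// Hypothetical helper: given directive names, return the ones that are
// still set to -1 (unlimited). $reader defaults to ini_get() and can be
// swapped out for testing.
function unlimited_directives(array $names, ?callable $reader = null): array
{
    $reader = $reader ?? 'ini_get';
    $flagged = [];
    foreach ($names as $name) {
        $value = $reader($name);
        // ini_get() returns false for directives this build does not know.
        if ($value !== false && (int) $value === -1) {
            $flagged[] = $name;
        }
    }
    return $flagged;
}

// Assumed directive names; other database interfaces use their own prefix.
$check = ['mysql.max_links', 'mysql.max_persistent'];
foreach (unlimited_directives($check) as $name) {
    echo "$name is unlimited; consider an explicit cap in php.ini\n";
}
```

These limits can generally be changed only in php.ini or the server configuration, not from within a script, which is why the sketch reads them rather than setting them.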
PHP's INI settings include several directives that limit the operations that PHP can perform, such as the ability to access and manipulate files and the amount of memory allocated by the interpreter. These settings are quite useful in a shared environment, where you want to keep a tight leash on your users to ensure that they are not abusing the system. In a dedicated environment, however, where you control the majority (if not all) of the PHP code executed by the interpreter, they only serve to slow down often-used functionality. Thus, for performance reasons it is better not to use the safe_mode, open_basedir and memory_limit directives in dedicated environments; the checks performed by PHP to enforce them are quite expensive and can lead to significant performance losses if enabled.
Beyond the Configuration
Besides optimization tricks and configuration tuning, there are several other methodologies that can improve the performance of PHP applications without actually having to dabble in the application's source code.
The first and foremost of these tools is an opcode cache, sometimes referred to as a "PHP compiler", although the term is really a misnomer. Under normal circumstances, before a PHP script can be run it must first be parsed and converted to a series of instructions (opcodes) that the Zend Engine can understand. This is a fairly fast process, but in large scripts with many include files it can take up a significant amount of time. Even in smaller applications, reading the PHP script from disk and parsing it every single time before execution can add up. It is quite wasteful, since for the most part the scripts rarely change between executions and there is really no need to parse the code from scratch every single time.
This is where an opcode cache comes in. Instead of repeated parsing, the generated instructions are stored inside shared memory (or on disk), so that further access to the script does not require reparsing. Additionally, because the opcodes are often stored directly in memory, file system operations are reduced to a simple check to determine whether or not the script has changed since it was cached, thus further improving performance.
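The caching logic described above can be modeled in a few lines of userland PHP. This is a toy sketch only: the ToyCompileCache class is hypothetical, the $compile callback merely stands in for the parser, and the cache lives in a per-process array rather than shared memory. Real opcode caches hook into the engine itself and do not work this way internally.

```php
<?php
// Toy model of an opcode cache. The $compile callback stands in for the
// parser; this is NOT how a real cache hooks into the Zend Engine.
class ToyCompileCache
{
    private $entries = [];   // path => ['mtime' => int, 'code' => mixed]
    private $compile;

    public function __construct(callable $compile)
    {
        $this->compile = $compile;
    }

    public function get(string $path)
    {
        clearstatcache(true, $path);   // make sure we see the current mtime
        $mtime = filemtime($path);

        if (isset($this->entries[$path])
            && $this->entries[$path]['mtime'] === $mtime) {
            return $this->entries[$path]['code'];   // hit: no reparsing
        }

        // Miss, or the file changed: "compile" again and remember the result.
        $code = ($this->compile)(file_get_contents($path));
        $this->entries[$path] = ['mtime' => $mtime, 'code' => $code];
        return $code;
    }
}
```

Note that the only file system work on a cache hit is the modification-time check, which mirrors the "has the script changed?" test mentioned above.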
Most opcode cache implementations (and there are several of them on the market nowadays) go even further and actually optimize the opcodes before storing them. During the traditional compilation process, the PHP parser tries to speed up the opcode generation process and does not always generate optimal instructions for the Zend Engine to execute. With an opcode cache, since the parsing is only done once, it makes sense to spend some time analyzing the generated opcodes and optimizing them so that their execution can be as fast as possible. The end result is that, with an opcode cache in place, you may see your PHP's performance improve by anywhere from 40% to 600%.
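To give a feel for what "optimizing the opcodes" can mean, here is a toy constant-folding pass. It is a sketch under loud assumptions: the instruction set (PUSH, LOAD, ADD, MUL) is made up for illustration and has nothing to do with the real Zend opcodes, and the source does not say which optimizations any particular cache performs.

```php
<?php
// Toy constant folding over a made-up stack instruction set:
// ['PUSH', n], ['LOAD', name], ['ADD'], ['MUL']. These are NOT Zend opcodes.
function fold_constants(array $ops): array
{
    $out = [];
    foreach ($ops as $op) {
        $n = count($out);
        $isArith = $op[0] === 'ADD' || $op[0] === 'MUL';
        if ($isArith && $n >= 2
            && $out[$n - 1][0] === 'PUSH' && $out[$n - 2][0] === 'PUSH') {
            // Both operands are known constants: compute the result once,
            // at "compile" time, instead of on every execution.
            $b = array_pop($out)[1];
            $a = array_pop($out)[1];
            $out[] = ['PUSH', $op[0] === 'ADD' ? $a + $b : $a * $b];
        } else {
            $out[] = $op;
        }
    }
    return array_values($out);
}
```

For instance, the five instructions for (2 + 3) * 4 collapse into a single PUSH of the final value, so the work is paid once when the script is cached rather than on every request.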
As far as opcode caching products go, for the most part all available solutions offer just about the same level of performance, with some minor differences. My current favorite is Turck MMCache (http://turck-mmcache.sourceforge.net/), which was originally developed by Dmitry Stogov. It comes with a particularly efficient opcode caching mechanism and a powerful optimizer that in most cases can allow you to squeeze in a few extra requests per second compared to its competition. This cache also includes a few other features, such as a memory session handler and a content caching mechanism, which can be used to further improve the performance of your PHP applications. Unfortunately, at this time Dmitry is unable to dedicate time to the project and the development of MMCache has stalled. However, a number of volunteers have promised to continue maintaining the project and hopefully will pick up where Dmitry left off.
The Zend Performance Suite (ZPS) is a commercially available PHP acceleration package offered by Zend that also implements an opcode cache and an optimizer as well as content caching capabilities. The big plus of ZPS is that it is designed with both experienced and novice users in mind and provides a very powerful and user-friendly interface to its components. This is especially useful when configuring content caching, which in MMCache, for example, can require a bit of manual labor and testing. However, unlike MMCache, ZPS is not free. Its licensing model starts at about $499 per server, which may put it out of the price range of small site operators.
Aside from ZPS, there is also APC, an Open Source initiative that has made big strides in the past year. Its performance is similar to that of ZPS and MMCache, but the lack of a good optimizer makes it a little slower in certain situations. Given its active development, however, there is little doubt that it will eventually be able to match the capabilities of the other implementations.
I should also mention the IonCube PHP Accelerator, which was one of the original free opcode cache implementations. It still works quite well with the PHP 4.3 series, but has not had any new visible developments in over a year and consequently does not perform as well as MMCache or APC in most situations.
A Hidden Cache
Regardless of whether or not an opcode cache is used, most scripts will still perform a fair number of file system operations. These can become a major bottleneck, because, while processor and memory speeds keep increasing, hard-drive speeds remain quite slow. It does not take much to reach the maximum read or write speed of a drive, which is usually just a few dozen megabytes per second.
For ultimate performance, it is best to eliminate all file system operations. While this may seem like an impossible goal, a wonderful invention called a "ramdisk" makes it attainable without much effort. A ramdisk is really little more than the emulation of a hard drive in memory; as far as programs (including your PHP scripts) are concerned, it appears to be just another run-of-the-mill disk partition. However, the data written to a ramdisk is actually stored directly in the system's memory, where data throughput is measured in hundreds of megabytes per second.
Nearly all operating systems support ramdisks, but Linux actually goes a step further and allows a ramdisk to be bound to a physical drive or directory. This means that, while you get all the benefits of writing and reading data to memory, you also do not risk losing that data in the event of a system crash or reboot, since the kernel will automatically synchronize it back to the physical drive as needed. Incidentally, it's also very easy to turn on this feature; all you need is someone with root access and a few spare minutes:
mount --bind -ttmpfs /tmp /tmp
mount --bind -ttmpfs /home/webroot /home/webroot
The example above binds two commonly used directories: the temporary directory (frequently used for session storage and other common operations) and the directory where web site files can be found. The end result is that virtually all file operations commonly performed by PHP are accelerated through the reduction in file I/O overhead. At the same time, reliability is not sacrificed for the sake of performance, making this an ideal solution even for the most demanding of websites. The only downside of this speed-up is that the ramdisk uses your memory and, therefore, binding large directories can eat up quite a bit of space that would otherwise be available to your applications. Thus, you need to exercise a bit of caution to ensure that directories mapped to ramdisks do not end up consuming all available memory and forcing the operating system to use its much slower swap memory facilities.
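Before mapping a directory into memory, it is worth measuring how much space it actually occupies. The following sketch totals the size of every file under a directory and compares it against a memory budget; both directory_size() and fits_in_ramdisk() are hypothetical helpers, and the budget is whatever amount of RAM you are prepared to give up.

```php
<?php
// Hypothetical helper: total size in bytes of all regular files under $dir.
function directory_size(string $dir): int
{
    $total = 0;
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile()) {
            $total += $file->getSize();
        }
    }
    return $total;
}

// Hypothetical budget check: does the directory fit within $budgetBytes?
function fits_in_ramdisk(string $dir, int $budgetBytes): bool
{
    return directory_size($dir) <= $budgetBytes;
}
```

Keep in mind that directories such as a session store grow over time, so a check like this is a starting point rather than a guarantee.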
And We Didn't Even Touch a Line of Code!
As you've probably realized by now, there are many ways to improve the speed of PHP applications without having to perform potentially dangerous code changes.
Equally important is the fact that these changes for the most part require very little time to implement and can result in massive performance improvements. This does not mean that you should abandon the practice of optimizing the code itself, which is, of course, an important tool for making your applications faster. However, when time is of the essence and the pressure is on, it is always good to know a few tricks to make the code run faster without having to tinker with it.
To Discuss this article:
http://forums.phparch.com/128
Ilia Alshanetsky is an active member of the PHP development team and is the current release manager of PHP 4.3.X. Ilia is also the principal developer of FUDforum (http://fud.prohost.org/forum/), an open source bulletin board, and a contributor to several other projects. He can be reached at ilia@prohost.org.