VOLUME 4 ISSUE 10
www.phparchitect.com
FOCUS ON SECURITY
PROTECT YOUR WORK FROM SQL INJECTION ATTACKS
Ilia Alshanetsky explains with this excerpt from
php|architect’s Guide to PHP Security
ESCAPE OUTPUT
Handling External Data
Is your work vulnerable to
HTTP RESPONSE SPLITTING?
THE CREATOR OF PHP RASMUS LERDORF
ON
OPTIMIZATION WITH
THE ALTERNATIVE
PHP CACHE
An introduction to PHP’s own opcode cache
Applying PHP to Publishing News
by RUBÉN MARTÍNEZ ÁVILA
Graphics & Layout
php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.
Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes
no responsibilities with regards of use of the information contained herein or in all associated material.
php|architect, php|a, the php|architect logo, Marco Tabini & Associates, Inc. and the Mta Logo are trademarks of Marco Tabini & Associates, Inc.
Contact Information:
General mailbox: info@phparch.com
Editorial: editors@phparch.com
Subscriptions: subs@phparch.com
Sales & advertising: sales@phparch.com
Technical support: support@phparch.com
Printed in Canada Copyright © 2003-2005 Marco Tabini & Associates, Inc.
Reading the Table of Contents, flipping through the pages, or simply eyeballing the cover of this issue, you will probably notice a certain theme: security.
As I’m sure you’ve read in Security Corner over the past issues, the problems of poorly architected sites, security-ignorant code, and general carelessness when it comes to externally-supplied data, are rampant in our community. Failure to abide by a few simple rules (never trust external data; filter input; escape output; etc.) has left much of the world wide web in a state of epidemic. The main culprits: remote code execution, SQL Injection and Cross Site Scripting (“XSS”).
I can almost hear some of you thinking “It can’t be THAT bad! How many times do you have to beat this dead horse?” and I wish you were correct. The reality of the situation is that XSS vulnerabilities (if not the other, more severe problems) can be found on all but a few elite sites (relatively speaking, from a pool of billions of web pages, of course).
Still don’t think it’s that bad? Then you should have been at php|works in Toronto, last month. Rasmus (more on him below) gave a keynote talk on PHP Security, and spent a good chunk of his time explaining the wide dispersion of XSS vulnerabilities. To illustrate his point (perfectly, I might add), he asked his audience to shout out the names of their favorite Canadian shopping sites, from which he chose a random site he’d never visited. Within 90 seconds, Rasmus had effectively demonstrated an XSS problem on the site. In fact, even the heavy-hitters are not immune: a friend showed me a simple XSS exploit for Google, as I was writing this editorial. Google!
This is the sort of stuff that keeps me awake at night, and one of the reasons we’re happy to bring you an issue that’s packed full of security-related content. We have the standard Security Corner, with an explanation of HTTP Response Splitting, and how you can avoid problems in this area. We’re also proud to be publishing a chapter from Ilia Alshanetsky’s recently-released book, php|architect’s Guide to PHP Security, which is even more packed full of security content. Ben continues his mini-series on security-related tips, focusing on escaping output, with which you can avoid the dreaded XSS problems on your sites.
Security aside (for a moment), we’re extremely excited to feature an article on APC, the Alternative PHP Cache, written by the creator of PHP, himself, Rasmus Lerdorf. APC has been around for a while, but Rasmus (and his Yahoo! colleagues) have recently put a considerable amount of work into a largely-reworked major release of this extension. There’s finally a stable opcode cache for PHP 5, and from a source we can obviously trust, so we know it’s done right. A special “thanks” goes out to Rasmus for writing the piece.
FOCUS ON
SECURITY
EDITORIAL
PHP 5.0.5 RC1
php.net announces the release of PHP 5.0.5 RC1.
“This version is a maintenance release that contains numerous bug fixes, including security fixes to vulnerabilities found in the XMLRPC package. All users of PHP 5.0 are encouraged to upgrade to this version.”
Some of the changes in PHP 5.0.5:
• Changed ming to support official 0.2a and 0.3 library versions.
• Added PHP_INT_MAX and PHP_INT_SIZE as predefined constants.
• Fixed memory corruption in stristr().
• Many more changes included as well as several bug fixes.
Get your hands on the latest release at
php.net!
MySQL 5.0 Release Candidate
MySQL announces:
“I’m proud and excited to announce the first Release Candidate of MySQL 5.0. This milestone signals that we are nearing what is certainly the most important release in MySQL’s history.
MySQL 5.0 has new functionality that I hope will be welcomed, adopted, and put to productive use by the community of MySQL users: you. On the commercial side, MySQL AB is getting a lot of good vibes from new enterprise customers who are beginning to understand the impact MySQL can have on their IT infrastructure and costs of running mission-critical applications.”
Some of the new ANSI SQL features include:
• Views (both read-only and updatable views)
• Stored Procedures and Stored Functions, using the SQL:2003 syntax, which is also used by IBM’s DB2
• Small bug-fixes in the chatAdvanced example—the error dialog was removed.
• najax.html.importForm (imports an associative array to the corresponding form elements) and najax.html.exportForm (exports form values to an associative array) were added.
• Support for asynchronous call canceling was added.
Check out the latest release at
http://najax.sourceforge.net/dev/
PHPsh 1.0.1
According to the psychogenic homepage, PHPsh provides “Simple, web-based shell access to your server.”
“It can be very annoying when you are restricted to FTP access: how can you find out the full path to a directory, or perform a command line SQL dump when you’re trapped in the limited, chrooted environment provided by an FTP server? PHPsh (PHP shell) allows you to have shell commands run on your behalf by any webserver which serves PHP pages. It solves these issues and more, allowing you to tap into the power of any Unix (Linux, BSD, etc.) server!
PHPsh was designed to allow developers, webmasters and sysadmins a quick and easy remedy to those situations in which it would be so easy to solve a problem or answer a question with shell access but a pointy-haired hosting company thinks shell access is only useful for crackers while simultaneously allowing anyone with FTP access the right to run arbitrary commands through CGI or PHP (doh!).”
http://www.psychogenic.com/en/products/PHPsh.php
php|architect Releases New Design Patterns Book
We’re proud to announce the release of php|architect’s Guide to PHP Design Patterns, the latest release in our Nanobook series.
You have probably heard a lot about Design Patterns, a technique that helps you design rock-solid solutions to practical problems that programmers everywhere encounter in their day-to-day work. Even though there has been a lot of buzz, however, no-one has yet come up with a comprehensive resource on design patterns for PHP developers. Until today.
Author Jason E. Sweat’s book php|architect’s Guide to PHP Design Patterns is the first comprehensive guide to design patterns designed specifically for the PHP developer. This book includes coverage of 16 design patterns with a specific eye to their applications in PHP when building complex web applications, both in PHP 4 and PHP 5 (where appropriate, sample code for both versions of the language is provided).
For more information, visit http://www.phparch.com/shop_product.php?itemid=96.
Looking for a new PHP Extension?
Check out some of the latest offerings from PECL.
expect 0.1
This extension allows you to interact with processes through PTYs, using the expect library.
runkit 0.6
Replace, rename, and remove user defined functions and classes. Define customized superglobal variables for general purpose use. Execute code in restricted environment (sandboxing).
pecl_http 0.14.1
• Building absolute URIs
• RFC compliant HTTP redirects
• RFC compliant HTTP date handling
• Parsing of HTTP headers and messages
• Caching by “Last-Modified” and/or ETag (with ‘on the fly’ option for ETag generation from buffered output)
• Support for sending data/files/streams with (multiple) ranges
• Negotiating user preferred language/
Xdebug 2.0.0beta4
The Xdebug extension helps you debug your scripts by providing valuable debug information, including the following:
• stack and function traces in error messages with:
• full parameter display for user defined functions
• function name, file name and line indications
• support for member functions
• memory allocation
• protection for infinite recursions
Xdebug also provides:
• profiling information for PHP scripts
• script execution analysis
• capabilities to debug your scripts interactively with a debug client
Check out some of the hottest new
releases from PEAR.
Validate_BE 0.1.1
Package contains locale validation for Belgium such as:
• Postal Code
• Bank Account Number
• Structured Bank Transfer message (National transfer from one bank account to another)
• VAT
• National ID
• Identity Card Number (not ready)
• SIS CARD ID (Belgian “sécurité sociale” ID)
HTML_Progress2 2.0.0
This package provides a way to add a fully customizable loading bar into existing XHTML documents. Your browser should be DHTML-compatible.
Features:
• create bar (horizontal, vertical), circle, ellipse and polygon (square, rectangle) progress meters
• allows usage of existing external StyleSheet and/or JavaScript
• all elements’ (progress, cells, labels) HTML properties are customizable
• percentage/labels can be placed around the progress meter
• compliant with CSS/XHTML standards
• integration with template engines is very easy
• implements the Observer design pattern: it is possible to add Listeners
• adds a customizable monitor pattern to display a progress bar; end-user can abort progress at any time
• allows many progress meters on the same page without the use of iframes
• error handling system that supports native PEAR_Error, but also PEAR_ErrorStack, and any other system you might want to plug in.
• PHP 5 ready
Image_Graph 0.7.0
Image_Graph provides a set of classes that create graphs/plots/charts based on (numerical) data. Many different plot types are supported: bar, line, area, step, impulse, scatter, radar, pie, map, candlestick, band, box & whisker and smoothed line, area and radar plots. Graphs are highly customizable, making it possible to get the exact look and feel that is required.
The output is controlled by an Image_Canvas, which facilitates easy delivery to many different output formats: GD (PNG, JPEG, GIF, WBMP), PDF (using PDFLib), Scalable Vector Graphics (SVG), and others.
Image_Graph is compatible with both PHP 4 and PHP 5.
Image_Canvas 0.2.2
A package providing a common interface to image drawing, making image rendering library-independent.
Services_Yahoo 0.1.1
Services_Yahoo provides object-oriented interfaces to the web service capabilities of Yahoo!
HTML_AJAX 0.2.1
Provides PHP and JavaScript libraries for performing AJAX (Communication from JavaScript to your server without reloading the page).
Tips & Tricks
ESCAPE OUTPUT
TIPS & TRICKS
by BEN RAMSEY
Filter Input. Escape Output. You’re hearing an awful lot of this from me lately, and as one person noted, “It’s great that they’re rubbing this topic in.” Indeed. This month’s Tips & Tricks wraps up the recent focus on security with a discussion on escaping output, why it’s important, and how to do it.
CODE DIRECTORY: escape
In the previous three Tips & Tricks columns, I’ve taken time to fully explain why all input should be filtered, and I’ve offered tips on how to filter your data so that the data you work with and save isn’t considered tainted. However, security-conscious programming doesn’t end with filtering data. Sure, now the data conforms to expectations, but it may still contain characters that have special meaning depending on the medium in which your application chooses to display it. That medium may be HTML, SQL, XML, WML, etc.
Thus, we must escape output.
What is output? Output is any data that leaves your application bound for another client or application. The receiving client or application expects the data to be of a specific format (HTML, SQL, etc.), and that format may include characters or other information with special meaning to the receiving client/application. The data being sent, however, might (and probably does) contain special characters that should not be interpreted with any special meaning by the receiving client, so we need to escape them.
Escaping is also sometimes referred to as encoding. In short, it is the process of representing data in a way that it will not be executed or interpreted. For example, HTML will render the following text in a Web browser as bold-faced text because the <strong> tags have special meaning:
<strong>This is bold text.</strong>
But, suppose I want to render the tags in the browser and avoid their interpretation. Then, I need to escape the angle brackets, which have special meaning in HTML. The following illustrates the escaped HTML:
&lt;strong&gt;This is bold text.&lt;/strong&gt;
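The escaped form above can be produced with PHP's built-in htmlspecialchars(). This is a minimal sketch rather than code from the article's code directory; the variable names are placeholders, and the explicit flags and charset are a choice made here, not something the column prescribes.

```php
<?php
// The markup we want to display literally instead of having it rendered.
$text = '<strong>This is bold text.</strong>';

// Convert the characters that are special in HTML into entities.
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
```

A browser receiving the escaped string shows the tags as visible text instead of switching to bold.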
Why Escape?
So, you run a Web-based forum, and you don’t have a problem with users entering the occasional HTML tag. Why should you escape your output?
Here’s why: Suppose this forum allows users to enter HTML tags. That’s fair enough (you may want to allow them to enter bold-faced or italicized text), but then it outputs everything in its raw form. Everything. So, all HTML tags get interpreted by the web browser.
What if a user enters a script that redirects the browser? Any subsequent user who is logged into the forum and visits this page will now be redirected to http://evil-example.org/steal-cookies.php and any cookies set by the forum can be stolen.
Let’s look at another example. Many sites contain login forms, which usually consist of two fields: a username and a password. When a user enters a username and password, the application may enter the values into an SQL statement, as in the following:
$sql = "SELECT * FROM users
        WHERE username = '{$_POST['username']}'
        AND password = '{$_POST['password']}'";
This statement will work just fine as long as a user enters a proper username and password, but suppose a user enters something like “example' OR 1 = 1; --” as the username? The value of 1 will always equal 1, and since the user properly closed the single quote in the statement, the OR clause will be treated as part of the SQL, and everything after the -- will be ignored (at least in most database engines) as a comment. Thus, the user is able to log in without an account.
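To see why the injected value changes the query, it can help to print the statement that the interpolation produces. The sketch below uses a hypothetical $post array in place of $_POST so it can be run from the command line; nothing is sent to a database.

```php
<?php
// Hypothetical request data standing in for $_POST.
$post = [
    'username' => "example' OR 1 = 1; --",
    'password' => 'anything',
];

// Unsafe: the raw values are interpolated straight into the statement.
$sql = "SELECT * FROM users
        WHERE username = '{$post['username']}'
        AND password = '{$post['password']}'";

// Inspect what the database would actually receive.
echo $sql, PHP_EOL;
```

The printed statement shows the attacker's quote closing the username literal early, with the OR clause sitting outside it as live SQL.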
The first step to ensure situations such as these do not occur is to filter all input to ensure that no unexpected characters appear in the data. See the July 2005 through September 2005 issues of php|architect for my full discussion on input filtering.
After filtering, be sure to save the raw data. Do not escape it before storing.
If escaped before storing, then it might be necessary to unescape it at some point in the future. For example, what if the data is escaped for HTML output and stored to a database table only to be retrieved later to output in XML or to PDF, etc.? Then, it must be unescaped to transport to those formats, and possibly escaped again to accommodate the new output medium. This process is bound to introduce more bugs to your code and could likely reduce the quality of the data. Thus, to make the most of your data, it is best to save it raw (after filtering) and escape only when outputting.
Escaping output is not a terribly difficult process. At the least, it may require the addition of a few extra lines of code, or it may require a little more attention to detail. The important thing to keep in mind is the format outputted and the special characters that need to be escaped for that format. For the purposes of this discussion, I will cover escaping for HTML and SQL, since PHP has excellent built-in functions for handling output
Data may leave your application
in many forms.
that attempt to do something similar by removing all but a set of allowed tags, but these are not without their flaws and can potentially introduce some nasty bugs that are too lenient when outputting data. Likewise, strip_tags() offers the option to allow certain tags with the format strip_tags($str, '<p> <a> <b>');, but this is also too lenient: attributes are not stripped from allowed tags, allowing onclick events, etc. to persist in output. Take the following code snippet, for example:
echo strip_tags($str, '<p> <a> <b>');
This code will output the following, complete with the cross-site scripting (XSS) in the onclick attribute:
<p><b>Bold text</b> <a href="#" onclick="alert('XSS');">Link</a></p>
Rather than completely stripping the tags from output, a better alternative may be to escape all the tags, allowing them to render in the output. This is an easy task with htmlspecialchars() and htmlentities(). Both of these functions serve the same purpose: to convert special characters into their equivalent HTML entities. The main difference is that htmlentities() is more exhaustive, choosing to convert all characters with HTML character entity equivalents to their respective HTML entities. Thus, for its exhaustive nature, I will recommend htmlentities() as the better function to use to escape HTML output. For the above $str example, htmlentities() returns the following:
In this case, however, allowing the <b> tags may be preferable, and so we can allow them by first escaping the output and then converting the selected HTML entities back to HTML with str_replace():
$str = htmlentities($str);
$str = str_replace('&lt;b&gt;', '<b>', $str);
$str = str_replace('&lt;/b&gt;', '</b>', $str);
This will ensure that we send only those special characters that we desire to have interpreted to the client. While this is a form of unescaping, which I mentioned earlier is not a desirable process, it is nevertheless a good alternative to using strip_tags() to allow certain tags, as it will ensure that any tags that contain undesirable attributes are not interpreted by the client. In addition, there is no guesswork involved here; I am not using a regular expression that I could potentially get wrong and, thus, introduce a hole in my application. I will always know what a <b> tag looks like after the angle brackets have been converted to their HTML entity equivalents, so it is easy for me to find and convert the tags back to HTML.
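The escape-then-restore approach can be wrapped in a small helper so that more than one tag can be allowed. This is only a sketch: escape_allowing() is a hypothetical name, and the sample $str is an assumption modelled on the output shown earlier, not the article's original listing.

```php
<?php
/**
 * Escape everything, then restore only attribute-free tags from a whitelist.
 * escape_allowing() is a hypothetical helper, not part of PHP or the article.
 */
function escape_allowing(string $html, array $allowedTags): string
{
    $escaped = htmlentities($html, ENT_QUOTES, 'UTF-8');
    foreach ($allowedTags as $tag) {
        // Only the exact, attribute-free forms are converted back to markup.
        $escaped = str_replace(
            ["&lt;{$tag}&gt;", "&lt;/{$tag}&gt;"],
            ["<{$tag}>", "</{$tag}>"],
            $escaped
        );
    }
    return $escaped;
}

// Assumed sample input containing an onclick handler.
$str = '<p><b>Bold text</b> <a href="#" onclick="alert(\'XSS\');">Link</a></p>';
$safe = escape_allowing($str, ['b']);

echo $safe, PHP_EOL;
```

Because a tag carrying attributes never matches the exact escaped form, it stays escaped, which is the property the str_replace() approach relies on.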
Escaping SQL
Similarly, PHP offers excellent built-in functions for escaping SQL statements according to the database engine used. For PostgreSQL, there is pg_escape_string(); for MySQL, mysql_real_escape_string(); and for SQLite, sqlite_escape_string(). If the other native database functions provided in PHP do not offer a similar function, then PHP offers addslashes(), though I would advise that the database’s native escape string function is always a better alternative than addslashes().
Using the SQL example from earlier, we can escape it using mysql_real_escape_string(), as shown in Listing 1, where we first filter it using the filter() function I gave in the August 2005 issue. Thus, if a user enters the value “example' OR 1 = 1; --” as a username, the SQL that is executed will be:
SELECT * FROM users WHERE username = 'example\' OR 1 = 1; --'
AND password = 'password'
The single quotation mark is escaped and no results are returned because this user doesn’t exist. The user can’t gain access to the application.
Some database functions, such as the unified ODBC functions, mysqli, and PDO (in PHP 5.1), use the concept of prepared statements to prepare and properly escape an SQL statement. Listing 2 illustrates a prepared statements example using PDO. The SQL statement that is created will appear much like the one listed above, but PDO offers added functionality through the optional bindParam() parameters to define the type and length of data.
Prepared statements also exist in PEAR::DB and other database abstraction classes, but PDO offers much promise since it is built into the language and, thus, much faster with less overhead.
So, if possible, use prepared statements (with PDO, if possible). If they aren’t available, use the database’s built-in escaping function. If that isn’t available, then fall back on addslashes() as a last resort.
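As a runnable illustration of the prepared-statement advice, here is a sketch that uses PDO against an in-memory SQLite database. The table layout, the sample row, and the find_user() helper are assumptions made for the example (and it needs the pdo_sqlite driver); only the query shape and the use of bindParam() come from the article.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (username TEXT, password TEXT)');
// Plain-text passwords only mirror the article's query; hash them in real code.
$pdo->exec("INSERT INTO users VALUES ('alice', 'secret')");

// find_user() is a hypothetical helper wrapping the prepared statement.
function find_user(PDO $pdo, string $username, string $password): array
{
    $sql = 'SELECT * FROM users
            WHERE username = :username
            AND password = :password';
    $stmt = $pdo->prepare($sql);
    // The optional third and fourth arguments declare the type and length.
    $stmt->bindParam(':username', $username, PDO::PARAM_STR, 32);
    $stmt->bindParam(':password', $password, PDO::PARAM_STR, 32);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$attack = find_user($pdo, "example' OR 1 = 1; --", 'password');
$legit  = find_user($pdo, 'alice', 'secret');

echo count($attack), ' ', count($legit), PHP_EOL;
```

The injected string is passed to the database as a value rather than spliced into the SQL text, so it can only ever be compared against the username column.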
For future installments of Tips & Tricks, I would like to know what tips and tricks you are using. Please send your tip and/or trick to tnt@benramsey.com, and, if I use it, you’ll receive a free digital subscription to php|architect.
A Security-Conscious Mindset
The key to secure programming is having a security-conscious mindset. Filtering input and escaping output is just part of that mindset, but it takes more thought than simply copying code from elsewhere to introduce security to an application. It takes careful planning and diligent testing.
By now, I hope that you are well on your way to being a security-conscious programmer. I have introduced some tools and concepts to help you get started, and it is likely that you have thought of code you’ve already written and how to improve it using these principles.
So, have fun, good luck, and be sure to keep security at the forefront of a project. Security is not a design feature. It is an essential tool.
Listing 1
<?php

$clean = filter($_POST, $post_whitelist);

$username = mysql_real_escape_string($clean['username']);
$password = mysql_real_escape_string($clean['password']);

$sql = "SELECT * FROM users
        WHERE username = '{$username}'
        AND password = '{$password}'";

Listing 2
…
$sql = 'SELECT * FROM users
        WHERE username = :username
        AND password = :password';
…
BEN RAMSEY is a Technology Manager for Hands On Network in Atlanta, Georgia. He is an author, Principal member of the PHP Security Consortium, and Zend Certified Engineer. Ben lives just north of Atlanta with his wife Liz and dog Ashley. You may contact him at ramsey@php.net or read his blog at http://benramsey.com/.
An opcode cache works by intercepting the compile and execute hooks in the Zend engine and then storing the result of the compilation phase in a shared memory cache.
On subsequent requests to the same file, a check is done to see if the opcodes corresponding to the script are in the cache. There is also a check to determine if the file on disk has a modification time that is newer than the timestamp on the opcodes in the cache.
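The lookup just described can be modelled in a few lines of PHP. This is a toy sketch and not how APC is implemented: a plain array stands in for shared memory, a callable stands in for the Zend compiler, and the cache is keyed by path for brevity. fetch_opcodes() is a hypothetical name.

```php
<?php
/**
 * Toy model of an opcode cache lookup with a staleness check.
 * Returns whatever the $compile callable produced for the file.
 */
function fetch_opcodes(array &$cache, string $path, callable $compile)
{
    clearstatcache(true, $path);
    $mtime = filemtime($path);
    if (isset($cache[$path]) && $cache[$path]['mtime'] >= $mtime) {
        return $cache[$path]['opcodes'];   // hit, and the file is not newer
    }
    $opcodes = $compile($path);            // miss or stale entry: recompile
    $cache[$path] = ['mtime' => $mtime, 'opcodes' => $opcodes];
    return $opcodes;
}

// Demo: count how often the stand-in compiler runs.
$cache = [];
$compiles = 0;
$compile = function (string $path) use (&$compiles): string {
    $compiles++;
    return md5_file($path);                // placeholder for real opcodes
};

$file = tempnam(sys_get_temp_dir(), 'apc');
file_put_contents($file, "<?php echo 1;\n");

fetch_opcodes($cache, $file, $compile);    // first request compiles
fetch_opcodes($cache, $file, $compile);    // second request is served from cache
touch($file, time() + 10);                 // simulate a later edit
fetch_opcodes($cache, $file, $compile);    // newer mtime forces a recompile
unlink($file);

echo "compiled {$compiles} times", PHP_EOL;
```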
There are a number of opcode caches available for PHP. They are sometimes referred to as compilers or accelerators, but I find the term, opcode cache, to be the most accurate and descriptive term for what they do. Similar packages to APC that are available are ionCube PHP Accelerator, eAccelerator and Zend Cache. Your choice of cache, I will leave up to you, but at the time of this writing only APC and Zend Cache had full PHP 5.1 support and of those two only APC is open source and available in PECL.
Installing APC
There are a number of things you can configure when you build APC, but you still may be able to install it with a simple “pear install apc” command (an example install session can be seen in Listing 1).
I tend to prefer poking around in any PECL extensions I want to use before I install them, so I install extensions by checking them out from CVS, and compiling using the normal phpize + ./configure + make + make install
TO DISCUSS THIS ARTICLE VISIT:
http://forum.phparch.com/258
Adding an opcode cache to your PHP configuration is the
easiest way to speed up your PHP applications without
changing a single line of your code.
Alternative PHP Cache
ABOUT THE AUTHOR: RASMUS LERDORF is known for having gotten the PHP project off the ground in 1995, the mod_info Apache module, and he can be blamed for the ANSI92 SQL-defying LIMIT clause in mSQL 1.x which has now, at least conceptually, crept into both MySQL and PostgreSQL. Prior to joining Yahoo! as an infrastructure engineer in 2002, he was at a string of companies including Linuxcare, IBM, and Bell Canada working on Internet technologies.
Common Configuration Options
The APC configuration directives that I normally place in my php.ini file can be seen in Listing 3.
This setup gives me a single 64M file-backed, memory-mapped segment, geared for a server with 500 cacheable files. I’ve turned opcode optimization off, because the apc optimizer is quite unhappy at the moment, and set a relatively low opcode cache time-to-live (ttl) of 30 minutes with a higher user cache ttl of 2 hours.
These TTL values are only used in case we start to hit the top of our 64 megabyte segment. If we run out of memory space, APC scans the cache for opcode and user cache entries that haven’t been accessed for the number of seconds denoted in the ttl configuration directive, and removes them. The 500 files hint is just that: a hint. You can easily cache more files than the number you’ve declared, but it is there to help optimize the hashing algorithm. There is no point in having a hashtable that contains 10,000 slots, each using a little bit of memory, if you are never going to have more than 25 files in it. An apc.num_files_hint of 500 actually ends up creating a hash table with 1000 slots. If two files hash to the same slot the second file gets linked to the first. As entries hash to the same slots, the longer this linked list of entries becomes, and to fetch these entries, APC has to walk these linked lists sequentially. Therefore having too few hash slots is also a bad thing. The one slight advantage of having many collisions is that APC does some very lazy garbage collection as it walks the linked lists, but this behavior doesn’t outweigh the drawbacks.
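The cost of too few slots can be shown with a toy chained hash table. Everything below is an illustration under assumptions: the modulo hash, the function names, and the tiny slot counts are invented for the example and are not APC's actual hashing scheme.

```php
<?php
// Assumed hash: the inode number modulo the slot count.
function slot_for(int $inode, int $slots): int
{
    return $inode % $slots;
}

// Append an entry to the chain that lives in its slot.
function insert_entry(array &$table, int $slots, int $inode, string $opcodes): void
{
    $table[slot_for($inode, $slots)][] = ['inode' => $inode, 'opcodes' => $opcodes];
}

// Walk the chain sequentially, counting how many links were visited.
function find_entry(array $table, int $slots, int $inode, int &$steps = 0): ?string
{
    $steps = 0;
    foreach ($table[slot_for($inode, $slots)] ?? [] as $entry) {
        $steps++;
        if ($entry['inode'] === $inode) {
            return $entry['opcodes'];
        }
    }
    return null;
}

$inodes = [1, 5, 9];

// Four slots: all three inodes collide in slot 1 and form one chain.
$small = [];
foreach ($inodes as $inode) {
    insert_entry($small, 4, $inode, "ops{$inode}");
}
find_entry($small, 4, 9, $stepsSmall);

// One thousand slots: each inode gets a slot of its own.
$large = [];
foreach ($inodes as $inode) {
    insert_entry($large, 1000, $inode, "ops{$inode}");
}
find_entry($large, 1000, 9, $stepsLarge);

echo "small table: {$stepsSmall} steps, large table: {$stepsLarge} step", PHP_EOL;
```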
The apc.mmap_file_mask configuration parameter is tricky. Generally, you would just always use a mkstemp mask as I have shown in Listing 3. It is file-backed, but the file is unlinked right after the mmap call, which ensures that the shared memory segment will automatically be cleaned up (removed) when the APC (or APC-hosting) process exits. If, for some reason, you want to force a real anonymous mmap, you can leave it empty. You can specify /dev/zero to mmap from there, if your OS prefers that, or if you use something like /apc.shm.XXXXXX it will use shm_open() instead. On Linux, that path has to be in the root directory, and you must have shmfs enabled (either compiled into the kernel, or loaded as a module).
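For orientation, a php.ini fragment matching the settings described in this section might look like the sketch below. It is a guess assembled from the prose, not a reproduction of Listing 3; in particular the /tmp path in the mask and the use of megabytes for apc.shm_size are assumptions.

```ini
; Hypothetical values reconstructed from the description above.
extension = apc.so
apc.enabled = 1
apc.shm_segments = 1                     ; a single segment
apc.shm_size = 64                        ; assumed to be in megabytes
apc.optimization = 0                     ; opcode optimizer turned off
apc.num_files_hint = 500                 ; a hint, not a hard limit
apc.ttl = 1800                           ; opcode cache ttl: 30 minutes
apc.user_ttl = 7200                      ; user cache ttl: 2 hours
apc.mmap_file_mask = /tmp/apc.XXXXXX     ; mkstemp-style mask, path assumed
```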
You can also prevent APC from caching certain files by using the apc.filters configuration directive. You provide either a single regular expression, or a comma-separated list of regexes that match the full-path filenames you want to exclude from being cached. The main reason you might want to do this is in a scenario where you have files that change extremely rapidly. By this, I mean every second or two. Another circumstance where excluding certain files from the cache might be beneficial is when your system consists of literally hundreds of thousands of files, and you want to force APC to focus on the performance-critical ones and not have the little-used files potentially causing your cache to fill up, which slows down garbage
Listing 1
10:36pm ubuntu:~> pear install apc
…
Zend Module Api No: 20050617
Zend Extension Api No: 220050617
Use mmap instead of shmget (usually a good idea) [yes] :
Use apxs to set compile flags (if using APC with Apache)?
…
install ok: APC 3.0.6

Listing 2
…
Zend Module Api No: 20050617
Zend Extension Api No: 220050617
$ ./configure --enable-apc-mmap \
  --with-php-config=/usr/local/php5/bin/php-config \
  --with-apxs
…
configure: creating ./config.status
config.status: creating config.h
10:45pm ubuntu:/tmp/pecl/apc> make
10:47pm ubuntu:/tmp/pecl/apc> make install
Installing shared extensions: /usr/local/php5/lib/php/extensions/no-debug-non-zts-20050617/
The APC Info Page
In the pecl/apc directory, you will find a script called apc.php (Figure 1). This file, when executed, gives you a nice overview of what is in your cache, and how much of your shared memory segment is being used. It would probably be a good idea to put this script behind .htaccess authentication, if you are going to put it in a web-accessible directory, but it also has its own built-in auth system. Read the first section of the code in apc.php itself for more information.
Uniquely Identifying Files
A file, whether it is the initial script file, or an included file, is identified by its device and inode (the file’s unique position identifier within the filesystem), not its filename.
This method is used, so files can be uniquely identified in a single stat() call. If we were to try to differentiate files by their filename, we would need the fully qualified pathname and that can be extremely expensive to get, since it would involve calling realpath() which, in turn, calls stat() for every component of the path in order to resolve any symbolic links that it might discover. By using the file’s inode, we get it down to a single stat call per file. When PHP and APC are nested within an Apache process, there is no additional stat(), since Apache will have already made this call, and APC inherits the stat structure directly from Apache. This means that, for PHP scripts that don’t include anything, APC eliminates all disk-touching system calls after Apache has handed the request over to PHP. This additional optimization makes for speedy caching.
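From userland PHP, the same device-and-inode identity can be read with a single stat() call. The sketch below is illustrative only: file_identity() is a hypothetical helper, and the hard-link demo assumes a Unix-like filesystem where inode numbers are meaningful.

```php
<?php
// Build an identity string from the device and inode numbers of a file.
function file_identity(string $path): ?string
{
    $st = @stat($path);
    return $st === false ? null : $st['dev'] . ':' . $st['ino'];
}

// Demo: two different names that refer to the same underlying file.
$original = tempnam(sys_get_temp_dir(), 'apc');
$alias = $original . '.link';
link($original, $alias);                 // hard link: second name, same inode

$same = file_identity($original) !== null
    && file_identity($original) === file_identity($alias);

unlink($alias);
unlink($original);

echo $same ? "same file" : "different files", PHP_EOL;
```

Two paths that resolve to the same file share one identity, with no realpath() walk over the path components.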
Updating Files on a Live Web Server
People tend to not pay enough attention to how they update files on their web server.
This is a problem, regardless of the presence of an opcode cache. If you fire up your favourite text editor and edit a PHP script on your live web server, not only is there a good chance that you will break your actual code on the first try, but more importantly, your editor probably does not write the file to the filesystem atomically when you are done. That means that requests for the file you are saving may end up getting a partially written file. File writes tend to be pretty fast, so even on a busy server this should only affect a few requests. However, if you throw an opcode cache into the mix, you could end up caching this partially written file so all subsequent requests will get the same partial set of opcodes from the cache.
In order to reduce the impact of this scenario, APC has an option called file_update_protection. This feature is enabled and set to 2 seconds by default, meaning that files that have been modified within 2 seconds of the request will not be cached. This should prevent any partially written files from polluting the cache.
Employing this feature, however, doesn’t fix the real problem: non-atomic file modification on a live web server. The correct way to address this issue is to only replace files atomically, by writing to a temp file and then renaming the file to its intended destination filename, or by using automated tools such as rsync that correctly handle the details of this maneuver for you. UNIX commands and applications such as cp, tar, vi and emacs often do not create files atomically.
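The write-then-rename approach can be sketched in a few lines of PHP. This is a minimal illustration, not a deployment tool: atomic_write() is a hypothetical helper name, and it assumes the temporary file and the target live on the same filesystem, which is what makes rename() atomic on UNIX systems.

```php
<?php
// Hypothetical helper (not part of PHP or APC): replace $target atomically.
function atomic_write($target, $contents)
{
    // Create the temp file next to the target so that rename() stays on
    // one filesystem and never exposes a half-written file.
    $tmp = tempnam(dirname($target), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false
        || !rename($tmp, $target)) {
        @unlink($tmp); // clean up the leftover temp file
        return false;
    }
    return true;
}
```

Note that tempnam() creates the file with restrictive permissions, so a real deployment script would probably also need a chmod() before the rename so that the web server can read the result.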
Cache Slams
Another often-overlooked issue occurs when files on a very busy server are changed.
Imagine a server whose front page gets hit hundreds of times per second. When you modify that front page file, many requests will see that the cached opcodes are now stale and will attempt to compile and cache the script from disk. APC doesn’t really mind this, as it is smart enough to avoid any sort of race conditions during the compile and cache procedure, so you will never end up with an inconsistent cache. However, each request that tries to cache a script starts allocating memory in the cache at the same time. Once all the small chunks of memory have been allocated and populated correctly, the cache entry gets activated atomically, and any previous entries for the same file get put on a deleted list and are deleted when everyone is done accessing them. This means that modifying files on a busy server can lead to many simultaneous memory allocations, and you could potentially fill up your shared memory segment because of multiple concurrent requests all attempting to cache the same file at approximately the same time.
APC attempts to reduce the negative effects of this situation with a slam_defense option. It can be set to a percentage between 0 and 100 that indicates the likelihood that a request that hits an uncached file will skip trying to cache it. Very much like the file_update_protection setting, this is a mechanism to ease the pain of something that really should be handled differently by the user (the person who deploys the changed file, in this case). You can completely eliminate both the partial update and the cache slam problems by writing to a temporary file first, then loading that temporary file once through your web server (and thus APC) to force it to be cached, and then renaming the file to its final destination. You might expect that the file would be re-cached once its name is changed, but recall that APC uses the device and inode of the file, not its name, to uniquely identify it. When you rename a file, the inode doesn’t change, nor does the modification time.
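The claim about rename() is easy to check from PHP itself. The following sketch is only a demonstration under assumed, made-up file names: it compares the stat() data of a scratch file before and after renaming it within the same directory.

```php
<?php
// Hypothetical demonstration: compare stat() data before and after rename().
function identity_survives_rename($dir)
{
    $old = $dir . '/page.tmp.' . getmypid();
    $new = $dir . '/page.php.' . getmypid();
    file_put_contents($old, "<?php echo 'hello';\n");

    $before = stat($old);
    rename($old, $new);
    clearstatcache();          // drop PHP's cached stat results
    $after = stat($new);
    unlink($new);

    // 'dev' and 'ino' identify the file; 'mtime' is its modification time.
    return $before['dev'] === $after['dev']
        && $before['ino'] === $after['ino']
        && $before['mtime'] === $after['mtime'];
}

var_dump(identity_survives_rename(sys_get_temp_dir()));
```

On a UNIX filesystem the three values should be identical, which is why a cache keyed on device and inode keeps treating the renamed file as the one it already cached.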
Userspace Access to the Cache
There are a couple of ways to make use of the cache from your userspace PHP scripts.
The first way is to poke it for information about what it is doing. The apc.php script that comes with APC is an example of how to use the apc_cache_info() and apc_sma_info() functions. These return an array that contains information about objects stored in the cache and the amount of memory that each of these objects is using. apc_clear_cache() lets you remove all entries from the cache without needing to restart your server. Normally you wouldn’t need to call this function.
The apc_store() and apc_fetch() functions are much more interesting. These allow you to store your own data in the cache. Generally, you will want to use these functions for relatively small amounts of data that are used repeatedly and are expensive to generate. For example, you might have an XML-based configuration file for your application. People have tended to shy away from this in the past, but with the simplexml extension in PHP 5, it is extremely easy to write a parser, and with APC storing the parsed config array, it is also blazingly fast. Take this sample config file:
This should be mostly self-explanatory: load the XML file using simplexml, loop through each section and use the $entry['name'] shortcut for picking the name attribute out of the entry, and make this name the key for each section sub-array. Then, since below each section in our example we just have flat XML with no attributes and no sub-nodes, we can just cast it directly to an array and stick the data directly into our $config array. If you have a completely flat XML config file, you could just cast $xml directly to an array and you are done, but usually configuration files are slightly more complex, and you need to decide how to deal with attributes and what you want your final array to look like. The above three lines give us an array like this:
FIGURE 1
Now we can add apc_store()/apc_fetch() caching to complete our XML-based parsing and caching solution. You may want to add a bit of error checking to make sure that the conf.xml file actually exists, and if you are going to do that, it means a stat() call. You might as well make use of that extra system call and pull in the modification time, using filemtime(). So, our final approach would look like this:
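A minimal sketch of such an approach follows. It is not the article's original listing: the function names load_config(), cache_fetch() and cache_store(), the cache key, and the assumption that each section element holds only flat child elements are all illustrative. It also uses an explicit loop over the children instead of an array cast, so that attribute handling is unambiguous.

```php
<?php
// Thin wrappers so the sketch also runs where the APC extension is absent;
// in that case values live only in a per-request array (no real caching).
function cache_fetch($key)
{
    if (function_exists('apc_fetch')) {
        return apc_fetch($key);
    }
    return isset($GLOBALS['_fallback'][$key]) ? $GLOBALS['_fallback'][$key] : false;
}

function cache_store($key, $value)
{
    if (function_exists('apc_store')) {
        return apc_store($key, $value);
    }
    $GLOBALS['_fallback'][$key] = $value;
    return true;
}

// Hypothetical loader: returns the parsed config, reparsing only when the
// file's modification time differs from the one stored with the cache entry.
function load_config($file)
{
    $mtime = @filemtime($file);      // one stat(): existence check + mtime
    if ($mtime === false) {
        return false;                // config file is missing
    }
    $cached = cache_fetch('config:' . $file);
    if (is_array($cached) && $cached['mtime'] === $mtime) {
        return $cached['data'];
    }
    $xml = simplexml_load_file($file);
    if ($xml === false) {
        return false;                // not well-formed XML
    }
    $config = array();
    foreach ($xml->section as $section) {
        $name = (string) $section['name'];   // the name attribute is the key
        foreach ($section->children() as $key => $value) {
            $config[$name][$key] = (string) $value;
        }
    }
    cache_store('config:' . $file, array('mtime' => $mtime, 'data' => $config));
    return $config;
}
```

Storing the modification time alongside the data is one way to get the reparse-on-change behaviour; a call such as load_config('conf.xml') then costs one stat() and one cache fetch on most requests.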
Now we can change our conf.xml file all we want, and it will be reparsed on the request that immediately follows the change, and cached in shared memory between changes. apc_store() takes a third optional argument, which is the number of seconds to cache the passed data. This makes it easy to use the store/fetch method for caching remote data where you want to fetch a new version every 30 minutes, for example.
Real world Performance Numbers
Let’s look at 4 examples of what you can expect when you add APC to your system.
First, a common photo album application: Gallery (version 1). With no opcode cache, hitting a page of an album in Gallery with 9 photos on it yields just over 9 requests per second. That’s not very fast. Although, looking at it a different way, it is about 800,000 requests/day. Of course, that is just for the HTML for that album page and doesn’t include all of the extra requests needed to fetch each thumbnail and whatever other images are on there. Still, it is probably more than fast enough for your family album. But faster is always better. Adding APC gets us up to 30 requests/second, without changing a single line of code. At these speeds, you do notice a difference. An application that normally attains 30 requests/second feels snappier than one that puts out 10 requests/second. Or turn it around: 33ms to finish a request vs. 110ms.
With a slight tweak, we can bring this up to about 32 requests/second. Not much of an improvement. The low-hanging fruit is usually the configuration information for an application like this. Unfortunately, Gallery stores its config in nested classes that will need to be serialized and unserialized. Improving on this makes Gallery a bit faster, but it is probably not worth the maintenance headache of having locally modified files. It is just a couple of lines in Gallery’s config.php file, though. At the top:
if($tmp = apc_fetch('gallery')) { $gallery=unserialize($tmp);
if(!$config=apc_fetch('config')) include 'config.inc';
And, of course, at the bottom of config.inc you would need to add:
apc_store('config',$config);
Soon, this serialization of objects will be done internally by APC, so it will go a bit faster by eliminating the extra userspace unserialize() call, but it will still be nowhere near as fast as using an array that gets copied directly out of shared memory.
FUDforum-2.6 is a popular bulletin board application. Without APC, viewing a message thread with a couple of messages in it gets me 46 requests/second. Turning on APC brings that up to 160 requests/second. Looking at FUDforum’s config system, it (unfortunately) uses a bunch of global variables in a file called GLOBALS.php. This file also includes a bunch of other things, and it is included from all over, so it isn’t easy to eliminate the include call, nor is it easy to cache the actual config variables. But it can be done at the top of GLOBALS.php.
The main performance problem here is the need to do the extract(). In the end, this slows us down to about 153 requests/second. If there were heavier logic, and perhaps an SQL query or some XML parsing involved in creating the list of variables, then this approach would have helped.
Serendipity, also known as s9y, is an application for people who want to host their own weblogs. I get 10 requests/second on a plain PHP installation, and 37 requests/second after adding APC. Although the configuration system is array-based, there is plenty of logic intertwined, so it is also difficult to cache this information in s9y.
Finally, let’s look at a code snippet written with APC in mind. I recently needed a flexible and fast RSS/Atom feed reader. It uses simplexml and a couple of PHP 5.1 tricks to reduce the RSS or Atom XML data down into an easily cacheable array. The code is a bit long to include here, but fire up a browser and have a look at it: http://lerdorf.com/simple_rss.phps. The inline comments should help make sense of the code. It is basically just a complicated example of the XML-based config file parser we developed earlier, but now we get some numbers.
You will notice there are two levels of caching. It caches the parsed XML to shared memory with apc_store() and it also caches the downloaded raw XML to disk. I tend to do this because I have multiple things reading these various XML files and they sometimes have different ideas of what is interesting in them. This way I can have different parsers that parse the disk-cached XML into their own shared memory slots, but don’t need to hit the backend server for each separate application. On my lerdorf.com server I have http://buzz.progphp.com, http://flickr.progphp.com and http://lerdorf.com itself all wanting to access some of the same XML files in very different ways.
Now, for the numbers: I am using my RSS2 feed from http://toys.lerdorf.com as the sample XML file. Without any caching at all (not even disk-based raw XML caching) I get about 25 requests per second. But that number is very variable, depending on the amount of traffic on the remote server and general network latency issues. It is clear that fetching the entire remote 76kB XML file on every request is not a smart thing to do. Simply caching the XML data between requests brings that number way up to 165 requests per second. Finally, and most dramatically, adding apc_store() and apc_fetch() takes us to 550 requests/second. This brings us to the point where getting a 76kB XML feed into an easily walkable array is basically free, from a performance perspective. That’s less than 2ms per end-to-end request on a rather low-end 1.8GHz Athlon box with IDE drives, running Ubuntu Linux and an untuned default Apache install. Turning off Keepalive, and changing MaxRequestsPerChild from its default 100 to 0 (unlimited), brings that number up to 590 requests per second.
Conclusion: Speed is Good!
Opcode caching plus injecting user caching in the right places in your application can result in dramatic performance gains.
In my RSS example, I went from 25 requests/second to nearly 600. In a full application, there are performance gains to be had all along the way. You need to look at where your data comes from, how often it changes, and how close to the final presentation format you can get it to before it is cached. Applications that were not designed with this in mind from the start can be difficult to retrofit. Keep your designs simple and clean. Do not use objects as datastores, and try to avoid spaghetti include sequences; your applications will be easier to deploy and will run much faster.
SQL Injection
FEATURE
TO DISCUSS THIS ARTICLE VISIT:
http://forum.phparch.com/259
The goal of SQL injection is to insert arbitrary data, most often a database query, into a string that’s eventually executed by the database. The insidious query may attempt any number of actions, from retrieving alternate data, to modifying or removing information from the database.
To demonstrate the problem, consider this excerpt:
// supposed input
$name = "ilia'; DELETE FROM users;";
mysql_query("SELECT * FROM users WHERE name='{$name}'");
The function call is supposed to retrieve a record from the users table where the name column matches the name specified by the user. Under normal circumstances, $name would only contain alphanumeric characters and perhaps spaces, such as the string ilia. But here, by appending an entirely new query to $name, the call to the database turns into disaster: the injected DELETE query removes all records from users.
MySQL Exception
Fortunately, if you use MySQL, the mysql_query() function does not permit query stacking, or executing multiple queries in a single function call. If you try to stack queries, the call fails.
However, other PHP database extensions, such as SQLite and PostgreSQL, happily perform stacked queries, executing all of the queries provided in one string and creating a serious security problem.
SQL injection is a common vulnerability that is the result of lax input validation. In this excerpted chapter from php|architect’s Guide to PHP Security, you will learn how to thwart this type of attack.
by ILIA ALSHANETSKY author of
php|architect’s Guide to PHP Security
Magic Quotes
Given the potential harm that can be caused by SQL injection, PHP’s automatic input escape mechanism, magic_quotes_gpc, provides some rudimentary protection. If enabled, magic_quotes_gpc, or “magic quotes”, adds a backslash in front of single-quotes, double-quotes, and other characters that could be used to break out of a value identifier. But magic quotes is a generic solution that doesn’t include all of the characters that require escaping, and the feature isn’t always enabled. Ultimately, it’s up to you to implement safeguards to protect against SQL injection.
To help, many of the database extensions available for PHP include dedicated, customized escape mechanisms. For example, the MySQL extension for PHP provides the function mysql_real_escape_string() to escape input characters that are special to MySQL.
However, before calling a database’s own escaping mechanism, it’s important to check the state of magic quotes. If magic quotes is enabled, remove any backslashes (\) it may have added; otherwise, the input will be doubly-escaped, effectively corrupting it (because it differs from the input supplied by the user).
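The check-then-escape sequence described above might look like the following sketch. Both function names are hypothetical, and the escape function is passed in by the caller so the example does not depend on any one database extension; with the legacy MySQL extension and an open connection you would pass 'mysql_real_escape_string'.

```php
<?php
// Hypothetical helper: return the value as the user actually typed it.
function raw_input_value($value)
{
    // get_magic_quotes_gpc() was removed in PHP 8, hence the existence check.
    if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) {
        return stripslashes($value);
    }
    return $value;
}

// Hypothetical helper: undo magic quotes first, then apply the escape
// function supplied by the caller.
function escape_for_query($value, $escape)
{
    return call_user_func($escape, raw_input_value($value));
}
```

A call such as escape_for_query($_GET['name'], 'mysql_real_escape_string') then yields a value that has been escaped exactly once, whether or not magic quotes is enabled.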
In addition to securing input, a database-specific escape function prevents data corruption. For example, the escape function provided in the MySQL extension is aware of connection character sets and encodes those (and others) to ensure that data isn’t corrupted by the MySQL storage mechanism and vice versa.
Native escape functions are also invaluable for storing binary data: left “unescaped”, some binary data may conflict with the database’s own storage format, leading to the corruption or loss of a table or the entire database. Some database systems, such as PostgreSQL, offer a dedicated function to encode binary data. Rather than escape problematic characters, the function applies an internal encoding. For instance, PostgreSQL’s pg_escape_bytea() function encodes binary data using octal escape sequences:
// for plain-text data use:
pg_escape_string($regular_strings);
// for binary data use:
pg_escape_bytea($binary_data);
A binary data escaping mechanism should also be used to process multi-byte languages that aren’t supported natively by the database system. (Multi-byte languages such as Japanese use multiple bytes to represent a single character; some of those bytes overlap with the ASCII range normally only used by binary data.)
There’s a disadvantage to encoding binary data: it prevents persisted data from being searched other than by a direct match. This means that a partial match query such as LIKE 'foo%' won’t work, since the encoded value stored in the database won’t necessarily match the initial encoded portion looked for by the query.
For most applications, though, this limitation isn’t a major problem, as partial searches are generally reserved for human readable data and not binary data, such as images and compressed files.
Prepared Statements
While database-specific escape functions are useful, not all databases provide such a feature. In fact, database-specific escape functions are relatively rare. At the moment, only the MySQL, PostgreSQL, SQLite, Sybase, and MaxDB extensions provide them. For other databases, including Oracle, Microsoft SQL Server, and others, an alternate solution is required.
A common technique is to Base64-encode all values passed to the database, thus preventing any special characters from corrupting the underlying store or causing trouble. But Base64-encoding expands data roughly 33 percent, requiring larger columns and more storage space. Furthermore, Base64-encoded data has the same problem as binary encoded data in PostgreSQL: it cannot be searched with LIKE. Clearly a better solution is needed: something that prevents incoming data from affecting the syntax of the query.
Prepared queries (also called prepared statements) solve a great many of the aforementioned risks. Prepared queries are query “templates”: the structure of the query is pre-defined and fixed, and includes placeholders that stand in for real data. The placeholders are typically type-specific (for example, int for integer data and text for strings), which allows the database to interpret the data strictly. For instance, a text placeholder is always interpreted as a literal, avoiding exploits such as the query stacking SQL injection. A mismatch between a placeholder’s type and its incoming datum causes execution errors, adding further validation to the query.
In addition to enhancing query safety, prepared queries improve performance. Each prepared query is parsed and compiled once, but can be re-used over and over. If you need to perform an INSERT en masse, a pre-compiled query can save valuable execution time.
Preparing a query is fairly simple. Here is an example:
pg_query($conn, "PREPARE stmt_name (text) AS "
    . "SELECT * FROM users WHERE name=$1");
// the value is still sent as SQL text here, so escape and quote it
pg_query($conn, "EXECUTE stmt_name ('" . pg_escape_string($name) . "')");
pg_query($conn, "DEALLOCATE stmt_name");
PREPARE stmt_name (text) AS creates a prepared query named stmt_name that expects one text value. Everything following the keyword AS defines the actual query, where $1 is the placeholder for the expected text.
If a prepared statement expects more than one value, list each type in order, separated by a comma, and use $1, $2, and so on for each placeholder, as in PREPARE stmt_example (text, int) AS SELECT * FROM users WHERE name=$1 AND id=$2.
Once compiled with PREPARE, you can run the prepared query with EXECUTE. Specify two arguments: the name of the prepared statement (such as stmt_name) to run and a list of actual values enclosed in parentheses.
Once you’re finished with the prepared statement, dispose of it with DEALLOCATE. Forgetting to jettison prepared queries can cause future PREPARE queries to fail. This is a common error when persistent database connections are used, where a statement can persist across requests. For example, given that there is no way to check if a statement exists or not, a blind attempt to create one anyway will trigger a query error if one is already present.
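To see the effect of a placeholder without a PostgreSQL server at hand, here is a self-contained sketch that swaps in PDO with an in-memory SQLite database for the PREPARE/EXECUTE calls above. It assumes the PDO and pdo_sqlite extensions are loaded; the table, its contents and the find_users() helper are made up for the illustration.

```php
<?php
// Assumes the PDO and pdo_sqlite extensions; table and data are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");
$db->exec("INSERT INTO users (name) VALUES ('ilia')");

// Hypothetical helper: look up users by name through a prepared statement.
function find_users(PDO $db, $name)
{
    // The query structure is fixed; :name is only ever treated as data.
    $stmt = $db->prepare("SELECT id, name FROM users WHERE name = :name");
    $stmt->execute(array(':name' => $name));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$hostile = "ilia'; DELETE FROM users;";
var_dump(count(find_users($db, 'ilia')));    // the legitimate lookup
var_dump(count(find_users($db, $hostile)));  // compared as a literal string
var_dump((int) $db->query("SELECT COUNT(*) FROM users")->fetchColumn());
```

The hostile value is compared against the name column as one literal string, so it matches no rows and the table keeps its single record.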
As nice as prepared queries are, not all databases support them; in those instances, escaping mechanisms should be used.
No Means of Escape
Alas, escape functions do not always guarantee data safety. Certain queries can still permit SQL injection, even after escapes are applied.
Consider the following situation, where a query expects an integer value:
$id = "0; DELETE FROM users";
$id = mysql_real_escape_string($id); // 0; DELETE FROM users
mysql_query("SELECT * FROM users WHERE id={$id}");
When executing integer expressions, it’s not necessary to enclose the value inside single quotes. Consequently, the semicolon character is sufficient to terminate the query and inject an additional query. Since the semicolon doesn’t have any “special” meaning, it’s left as-is by both the database escape function and addslashes().
There are two possible solutions to the problem.
The first requires you to quote all arguments. Since single quotes are always escaped, this technique prevents SQL injection. However, quoting still passes the user input to the database, which is likely to reject the query. Here is an illustrative example:
$id = "0; DELETE FROM users";
$id = pg_escape_string($id); // 0; DELETE FROM users
pg_query($conn, "SELECT * FROM users WHERE id='{$id}'")
    or die(pg_last_error($conn));
// will print: invalid input syntax for integer: "0; DELETE FROM users"
But query failures are easily avoided, especially when validation of the query arguments is so simple. Rather than pass bogus values to the database, use a PHP cast to ensure each datum converts successfully to the desired numeric form.
For example, if an integer is required, cast the incoming datum to an int; if a floating-point number is required, cast to a float.
$id = "123; DELETE FROM users";
$id = (int) $id; // 123
pg_query($conn, "SELECT * FROM users WHERE id={$id}"); // safe
A cast forces PHP to perform a type conversion. If the input is not entirely numeric, only the leading numeric portion is used. If the input doesn’t start with a numeric value, or if the input is only alphabetic and punctuation characters, the result of the cast is 0. On the other hand, if the cast is successful, the input is a valid numeric value and no further escaping is needed.
Numeric casting is not only very effective, it’s also efficient, since a cast is a very fast, function-free operation that also obviates the need to call an escape routine.
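The conversion rules are easy to verify in isolation. In this small sketch, to_int_id() is a hypothetical wrapper around the cast, used only to make the three cases easy to compare.

```php
<?php
// Hypothetical helper: coerce user input to an integer ID.
function to_int_id($input)
{
    return (int) $input;
}

var_dump(to_int_id("123; DELETE FROM users")); // int(123)
var_dump(to_int_id("42"));                     // int(42)
var_dump(to_int_id("abc; DROP TABLE users"));  // int(0)
```

Because unusable input collapses to 0, an application whose IDs start at 1 can treat a result of 0 as "no valid ID supplied" and skip the query altogether.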
The LIKE Quandary
The SQL LIKE operator is extremely valuable: its % and _ (underscore) qualifiers match 0 or more characters and any single character, respectively, allowing for flexible partial and substring matches. However, both LIKE qualifiers are ignored by the database’s own escape functions and PHP’s magic quotes. Consequently, user input incorporated into a LIKE query parameter can subvert the query, complicate the LIKE match, and in many cases, prevent the use of indices, which slows a query substantially. With a few iterations, a compromised LIKE query could launch a Denial of Service attack by overloading the database.
Here’s a simple yet effective attack:
$sub = mysql_real_escape_string("%something"); // still %something
mysql_query("SELECT * FROM messages "
    . "WHERE subject LIKE '{$sub}%'");
The intent of the SELECT above is to find those messages that begin with the user-specified string, $sub. Uncompromised, that SELECT query would be quite fast, because the index for subject facilitates the search. But if $sub is altered to include a leading % qualifier (for example), the query can’t use the index and the query takes far longer to execute. Indeed, the query gets progressively slower as the amount of data in the table grows.
The underscore qualifier presents both a similar and a different problem. A leading underscore in a search pattern, as in _ish, cannot be accelerated by the index, slowing the query. And a trailing underscore may substantially alter the results of the query. To complicate matters further, underscore is a very common character and is frequently found in perfectly valid input.
To address the LIKE quandary, a custom escaping mechanism must convert user-supplied % and _ characters to literals. Use addcslashes(), a function that lets you specify a character range to escape.
$sub = addcslashes(mysql_real_escape_string("%something_"), "%_");
// $sub == \%something\_
mysql_query("SELECT * FROM messages "
    . "WHERE subject LIKE '{$sub}%'");
Here, the input is processed by the database’s prescribed escape function and is then filtered through addcslashes() to escape all occurrences of % and _. addcslashes() works like a custom addslashes(), is fairly efficient, and is a much faster alternative to str_replace() or the equivalent regular expression.
Remember to apply manual filters after the SQL filters to avoid escaping the backslashes; otherwise, the escapes are escaped, rendering the backslashes as literals and causing special characters to re-acquire special meanings.
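The ordering rule can be demonstrated without a database. In the sketch below, addslashes() stands in for the database's escape function (an assumption made purely so the example runs anywhere), and the two helper names are hypothetical.

```php
<?php
// addslashes() stands in for the database's escape function here so the
// sketch runs without a database connection (an assumption of this example).
function like_escape_right($input)
{
    return addcslashes(addslashes($input), "%_");   // SQL escape, then LIKE escape
}

function like_escape_wrong($input)
{
    return addslashes(addcslashes($input, "%_"));   // LIKE escape, then SQL escape
}

echo like_escape_right("50%_off"), "\n";  // 50\%\_off
echo like_escape_wrong("50%_off"), "\n";  // 50\\%\\_off
```

In the wrong order, each backslash added for LIKE is itself escaped, so the database sees a literal backslash followed by a live % or _ qualifier.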
SQL Error Handling
One common way for hackers to spot code vulnerable to SQL injection is by using the developer’s own tools against them. For example, to simplify debugging of failed SQL queries, many developers echo the failed query and the database error to the screen and terminate the script.
for any number of reasons.) Besides being embarrassing, the code may reveal a great deal of information about the application or the site. For instance, the end-user may be able to discern the structure of the table and some of its fields, and may be able to map GET/POST parameters to data to determine how to attempt a better SQL injection attack. In fact, the SQL error may have been caused by an inadvertent SQL injection. Hence, the generated error becomes a literal guideline to devising more tricky queries.
The best way to avoid revealing too much information is to devise a very simple SQL error handler to handle SQL failures:
function sql_failure_handler($query, $error) {
    $msg = htmlspecialchars("Failed Query: {$query}<br>"
        . "SQL Error: {$error}");
    error_log($msg, 3, "/home/site/logs/sql_error_log");
    if (defined('debug')) {
        return $msg;
    }
    return "Requested page is temporarily unavailable, "
        . "please try again later.";
}
mysql_query($query) or die(sql_failure_handler($query, mysql_error()));
The handler function takes the query and error message generated by the database and creates an error string based on that information. The error string is passed through htmlspecialchars() to ensure that none of the characters in the string are rendered as HTML, and the string is appended to a log file.
The next step depends on whether or not the script is working in debug mode. If in debug mode, the error message is returned and is likely displayed on-screen for the developer to read. In production, though, the specific message is replaced with a generic message, which hides the root cause of the problem from the visitor.
Authentication Data Storage
Perhaps the final issue to consider when working with databases is how to store your application’s database credentials: the login and password that grant access to the database. Most applications use a small PHP configuration script to assign a login name and password to variables. This configuration file, more often than not (at least on shared hosts), is left world-readable to provide the web server user access to the file. But world-readable means just that: anyone on the same system or an exploited script can read the file and steal the authentication information stored within. Worse, many applications place this file inside web readable directories and give it a non-PHP extension (.inc is a popular choice). Since .inc is typically not configured to be interpreted as a PHP script, the web browser displays such a file as plain-text for all to see.
One solution to this problem uses the web server’s own facilities, such as .htaccess in Apache, to deny access to certain files. As an example, this directive denies access to all files that end (notice the $) with the string .inc.
<Files ~ "\.inc$">
    Order allow,deny
    Deny from all
</Files>
Alternatively, you can make PHP treat .inc files as scripts or simply change the extension of your configuration files to .php or, better yet, .inc.php, which denotes that the file is an include file.
However, renaming files may not always be the safest option, especially if the configuration files have some code aside from variable initialization in the main scope. The ideal and simplest solution is to simply not keep configuration and non-script files inside web server-accessible directories.
A proper solution must ensure that other users on the system have no way of seeing authentication data. Fortunately, the Apache web server provides just such a mechanism. The Apache configuration file, httpd.conf, can include arbitrary intermediate configuration files during start-up while Apache is still running as root. Since root can read any file, you can place sensitive information in a file in your home directory and change it to mode 0600, so only you and the superuser can read and write the file.
<VirtualHost ilia.ws>
Include /home/ilia/sql.cnf
</VirtualHost>
If you use the Include mechanism, be sure that your file is only loaded for a certain VirtualHost or a certain directory to prevent the data from being available to other hosts on the system.
The content of the configuration file is a series of SetEnv lines, defining all of the authentication parameters necessary to establish a database connection.
SetEnv DB_LOGIN "login"
SetEnv DB_PASSWD "password"
SetEnv DB_DB "my_database"
SetEnv DB_HOST "127.0.0.1"
After Apache starts, these environment variables are accessible to the PHP script via the $_SERVER super-global or the getenv() function if $_SERVER is unavailable.
echo $_SERVER['DB_LOGIN']; // login
echo getenv("DB_LOGIN"); // login
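The two lookups can be folded into one small accessor. This is only a sketch: db_setting() is a hypothetical helper, and the DB_* names are the ones set by the SetEnv lines above.

```php
<?php
// Hypothetical helper: read one setting from the environment Apache
// provides, preferring $_SERVER and falling back to getenv().
function db_setting($name, $default = null)
{
    if (isset($_SERVER[$name])) {
        return $_SERVER[$name];
    }
    $value = getenv($name);
    return ($value === false) ? $default : $value;
}
```

A connection call would then read its arguments from db_setting('DB_HOST'), db_setting('DB_LOGIN') and db_setting('DB_PASSWD') instead of from variables defined in a world-readable include file.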
An even better variant of this trick is to hide the connection parameters altogether, hiding them even from the script that needs them. Use PHP’s ini directives to specify the default authentication information for the database extension. These directives can also be set inside the hidden Apache configuration file.
php_admin_value mysql.default_host "127.0.0.1"
php_admin_value mysql.default_user "login"
php_admin_value mysql.default_password "password"
Now, mysql_connect() works without any arguments, as the missing values are taken from PHP ini settings. The only information remaining exposed would be the name of the database.
Because the application is not aware of the database settings, it consequently cannot disclose them through a bug or a backdoor, unless code injection is possible. In fact, you can enforce that only an ini-based authentication procedure is used by enabling SQL safe mode in PHP via the sql.safe_mode directive. PHP then rejects any database connection attempts that use anything other than ini values for specifying authentication data.
This approach does have one weakness in older versions of PHP: up until PHP 4.3.5, there was a bug in the code that leaked ini settings from one virtual host to another. Under certain conditions, this bug could be triggered by a user, effectively providing other users on the system with a way to see the ini values of other
For example, if a user only requires read-access to the database, don’t permit the user to execute UPDATE or INSERT queries. Or more realistically, limit write access to those tables that are expected to change, perhaps the session table and the user accounts table.
By limiting what a user can do, you can detect, track, and defang many SQL injection attacks. Limiting access at the database level is supplemental: you should use it in addition to all of the database security mechanisms listed in this chapter.
Maintaining Performance
Speed isn’t usually considered a security measure, but subverting your application’s performance is tantamount to any other exploit. As was demonstrated by the LIKE attack, where % was injected to make a query very slow, enough costly iterations against the database could saturate the server and prevent further connections. Unoptimized queries present the same risk: if the attacker spots inefficiencies, your server can be exhausted and rendered useless just the same.
To prevent database overloading, there are a few simple rules to keep in mind.
Only retrieve the data you need and nothing more. Many developers take the “*” shortcut and fetch all columns, which may result in a lot of data, especially when joining multiple tables. More data means more information to retrieve, more memory for the database’s temporary buffer for sorting, more time to transmit the results to PHP, and more memory and time to make the results available to your PHP application. In some cases, with large amounts of data, database sorting must be done within a search file instead of memory, adding to the overall time to process a request. Again, only retrieve the data you need, and name the columns to minimize size further.
To further accelerate a query, try using unbuffered queries that retrieve query results a small portion at a time. However, unbuffered queries must be used carefully: only one result cursor is active at any time, limiting you to work with one query at a time. (And in the case of MySQL, you cannot even perform INSERT, UPDATE, and other queries until all results from the result cursor have been fetched.)
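One way to picture this is a small helper that hands rows to the caller one at a time. The sketch below uses PDO rather than the older mysql_* functions; streamRows() and the demo table are hypothetical names. For the MySQL driver it turns off buffering through the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute, while the demo itself runs on in-memory SQLite so that it works without a server.

```php
<?php
/**
 * Yield rows one at a time instead of loading the whole result set.
 * For the MySQL driver, buffering is switched off first; other drivers
 * are simply iterated row by row.
 */
function streamRows(PDO $pdo, string $sql): Generator
{
    if ($pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql') {
        // PDO_MYSQL attribute: false requests an unbuffered result.
        $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
    }
    $stmt = $pdo->query($sql);
    try {
        while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
            yield $row;
        }
    } finally {
        // Release the cursor so the connection can run other queries again.
        $stmt->closeCursor();
    }
}

// Demo on in-memory SQLite (hypothetical table), so no server is needed.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)');
$pdo->exec("INSERT INTO log (msg) VALUES ('a'), ('b'), ('c')");

$seen = 0;
foreach (streamRows($pdo, 'SELECT id, msg FROM log ORDER BY id') as $row) {
    $seen++; // process one row at a time
}
echo "rows processed: {$seen}\n";
```

The finally block matters with MySQL: until the cursor is closed or fully read, the connection cannot be used for anything else.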
To work with a database, PHP must establish a connection to it, which in some cases can be a rather expensive operation, especially when working with complex systems like Oracle, PostgreSQL, MSSQL, and so on. One trick that speeds up the connection process is to make a database connection persistent, which allows the database handle to remain valid even after the script is terminated. If a connection is persistent, each subsequent connection request from the same web server process reuses the connection rather than recreating it anew.
The code below creates a persistent MySQL database connection via the mysql_pconnect() function, which is syntactically identical to the regular mysql_connect() function.
mysql_pconnect("host", "login", "passwd");
Other databases typically offer a persistent connection variant, some as simple as adding the prefix “p” to the word “connect”.
Anytime PHP tries to establish a persistent connection, it first looks for an existing connection with the same authentication values; if such a connection is available, PHP returns that handle instead of making a new one.
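The same idea can be sketched with PDO, where persistence is requested through the PDO::ATTR_PERSISTENT option instead of a separate function. The helper name openPersistent() and the DSN values are assumptions for illustration; in-memory SQLite stands in for a real server so that the sketch is runnable as it stands.

```php
<?php
/**
 * Open a connection that PHP may reuse across requests served by the same
 * process. PDO::ATTR_PERSISTENT plays the role that mysql_pconnect() plays
 * for the mysql extension.
 */
function openPersistent(string $dsn, ?string $user = null, ?string $password = null): PDO
{
    return new PDO($dsn, $user, $password, [
        PDO::ATTR_PERSISTENT => true,
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    ]);
}

// With MySQL the DSN would look like 'mysql:host=localhost;dbname=app'
// (hypothetical names). In-memory SQLite is used here so the sketch runs anywhere.
$db = openPersistent('sqlite::memory:');
$value = $db->query('SELECT 1')->fetchColumn();
```

Because the handle outlives the script, anything left behind on it (open transactions, session settings) is inherited by whichever request picks the connection up next.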
Words of Caution
Persistent connections are not without drawbacks. For example, in PHP, connection pooling is done on a per-process basis rather than per-web server, giving every web-server process its own connection pool. So, 50 Apache processes result in 50 open database connections. If the database is not configured to allow at least that many connections, further connection requests are rejected, breaking your web pages.
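The arithmetic is simple enough to check before enabling persistent connections. In the sketch below, both function names and the limit of 100 connections are assumptions for illustration; the multiplier for credential sets follows from the fact that PHP only reuses a connection when the authentication values match.

```php
<?php
// Worst case: every web-server process holds one persistent connection
// per distinct set of credentials it has used.
function persistentConnectionsNeeded(int $webProcesses, int $credentialSets = 1): int
{
    return $webProcesses * $credentialSets;
}

// True when the database's connection limit covers the worst case.
function fitsWithinLimit(int $needed, int $maxConnections): bool
{
    return $needed <= $maxConnections;
}

$needed = persistentConnectionsNeeded(50);
$ok = fitsWithinLimit($needed, 100); // 100 is an assumed database connection limit
```

If the application connects with two different logins, the same 50 processes can hold up to 100 connections, which is worth knowing before the limit is reached in production.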
In many cases, the database runs on the same machine as the web server, which allows data transmission to be optimized. Rather than using the slow and bulky TCP/IP, your application can use Unix Domain Sockets (UDS), the second fastest medium for Inter Process Communication (IPC). By switching to UDS, you can significantly improve the data transfer rates between the two servers.
To switch to UDS, change the host parameter of the connection. For example, in MySQL, set the host, followed by a colon and the path to the socket.
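With PDO, the socket is named in the DSN rather than in the host parameter. The sketch below only builds such a DSN; the helper name, socket path and database name are placeholders, and the actual connection is left commented out because it needs a running MySQL server.

```php
<?php
// Build a PDO MySQL DSN that connects through a Unix domain socket
// instead of TCP/IP. The socket path and database name are placeholders.
function mysqlSocketDsn(string $socketPath, string $dbName): string
{
    return sprintf('mysql:unix_socket=%s;dbname=%s', $socketPath, $dbName);
}

$dsn = mysqlSocketDsn('/path/to/mysql.sock', 'app');
// $pdo = new PDO($dsn, 'login', 'passwd'); // needs a running MySQL server
echo $dsn, "\n";
```

The real socket path depends on how the MySQL server is configured, so check the server's settings rather than guessing.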
Query Caching
In some instances, a query is as fast as it can be, yet still takes significant time to execute. If you cannot throw hardware at the problem (which has its limits as well), try to use the query cache. A query cache retains a query’s results for some period of time, short-circuiting the need to recreate the results from scratch each time the same query runs.
Each time there’s a request for a page, the cache is checked; if the cache is empty, if the cache expired the previous results, or if the cache was invalidated (say, by an UPDATE or an INSERT), the query executes. Otherwise, the results saved in the cache are returned, saving time and effort.
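The decision logic described above can be sketched in a few lines of PHP. This is an application-level illustration of the idea, not the database server's own query cache: the class name CachedQueries, its methods and the demo table are all hypothetical, and in-memory SQLite is used so the example runs without a server.

```php
<?php
/**
 * Tiny application-level result cache: entries expire after $ttl seconds
 * and are dropped whenever a write goes through the same object.
 */
class CachedQueries
{
    private PDO $pdo;
    private int $ttl;
    private array $cache = [];   // sql => ['at' => timestamp, 'rows' => array]
    public int $misses = 0;      // how many times the database was actually hit

    public function __construct(PDO $pdo, int $ttl = 60)
    {
        $this->pdo = $pdo;
        $this->ttl = $ttl;
    }

    public function select(string $sql): array
    {
        $hit = $this->cache[$sql] ?? null;
        if ($hit !== null && (time() - $hit['at']) < $this->ttl) {
            return $hit['rows'];              // fresh entry: skip the database
        }
        $this->misses++;                      // empty or expired: run the query
        $rows = $this->pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
        $this->cache[$sql] = ['at' => time(), 'rows' => $rows];
        return $rows;
    }

    public function write(string $sql): int
    {
        $this->cache = [];                    // an INSERT or UPDATE invalidates cached results
        return $this->pdo->exec($sql);
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT)');

$db = new CachedQueries($pdo);
$db->write("INSERT INTO news (title) VALUES ('first')");
$db->select('SELECT title FROM news');   // miss: runs the query
$db->select('SELECT title FROM news');   // hit: served from the cache
$db->write("INSERT INTO news (title) VALUES ('second')");
$rows = $db->select('SELECT title FROM news'); // miss again after the write
```

Clearing the whole cache on every write is deliberately blunt; a finer-grained scheme would track which tables each cached query touches.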
ILIA ALSHANETSKY is the principal of Advanced Internet Designs Inc., which specializes in security auditing, performance analysis and application development. He is the author of FUDforum (http://fudforum.org), a highly popular, Open Source bulletin board, focused on providing the maximum functionality at the highest levels of security and performance. Ilia is a core PHP Developer, an active member of PHP’s QA team, and was the Release Master for the PHP 4.3.x series. He has authored and co-authored a number of extensions, most notably SHMOP, PDO, SQLite and GD, and is responsible for a large number of bug fixes and performance tweaks in the language.
A prolific lecturer and writer, Ilia can be found speaking at international conferences. He is frequently published in print and online magazines on a variety of PHP topics, and is also the author of an upcoming book on PHP security. Ilia can be reached at ilia@ilia.ws.
This article teaches developers to create sites quickly, by concentrating on application-specific code and letting the Seagull PHP framework handle the rest.
A web framework is a necessity when developing a serious website. Programmers should not recreate basic web elements when great tools to help them get the job done already exist. One of these tools, Ruby on Rails, garnered much attention when it was released in July 2004. It simplified Ruby development, separated data from display, and made web development fun.
Various PHP frameworks exist, including a Rails clone called Cake (http://www.cakephp.org), which is still early in development. This article will concentrate on another framework, one called Seagull (http://seagull.phpkitchen.com). It’s fast, secure, has very clean code and doesn’t look half bad, either.
Seagull is a BSD-licensed, object-oriented application
LINKS:
http://seagull.phpkitchen.com http://seagull.phpkitchen.com/apidocs/
http://pear.php.net/package/HTML_Template_Flexy
FLOCKING TO
SEAGULL
built on solid, heavily tested tools, and uses more than a few PEAR libraries for many of its tasks. It is very easy to install using the PEAR installer, and it also offers a web-based installation procedure. It uses good coding practices such as design patterns, database abstraction and separation of content and presentation.
Seagull frees developers from repetitive programming tasks and lets them concentrate on application-specific code. It is completely modular, so new features can easily be added to the system. The developer community also pays considerable attention to maintaining a cleanly structured codebase, observing security guidelines and respecting web standards like XHTML and CSS.
Although it has a very low release number, this framework already offers a good deal of functionality, such as user and permission management, and some ready-to-use modules: Publisher (a lightweight CMS), a contact-us module, a guestbook module, a module for setting up a list of FAQs (Frequently Asked Questions) and even a shopping cart. It also has a front controller that lets you easily create search engine friendly URLs like http://www.example.com/index.php/contactus/action/list/.
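To see how a front controller can make sense of such a URL, consider the sketch below. It splits the part of the path after index.php into a module name followed by name/value pairs. The function parseFriendlyPath() is a hypothetical illustration of the general technique and is not taken from Seagull's own code, which may interpret the segments differently.

```php
<?php
/**
 * Split a path such as "/contactus/action/list/" into a module name and
 * name/value parameters. Illustration only; not Seagull's actual parser.
 */
function parseFriendlyPath(string $pathInfo): array
{
    // Drop the empty segments produced by leading and trailing slashes.
    $parts = array_values(array_filter(explode('/', $pathInfo), 'strlen'));
    $module = array_shift($parts) ?? '';
    $params = [];
    // Remaining segments are read in pairs: name, then value.
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        $params[$parts[$i]] = $parts[$i + 1];
    }
    return ['module' => $module, 'params' => $params];
}

$route = parseFriendlyPath('/contactus/action/list/');
```

In a live request the input would typically come from $_SERVER['PATH_INFO'], and a trailing name without a value is simply ignored by this sketch.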
The project was started in 2001 by Demian Turner, who wanted to create a simple and stable framework, using innovative design patterns for his project. Since October 2003, the project has been hosted on SourceForge (http://www.sourceforge.net/projects/seagull/).
You may be wondering where Seagull got its name: Demian Turner was on a ferry surrounded by some seagulls. As the birds were coasting along with the boat, they twisted their necks to get a better view of the passengers. He found this really interactive (for the birds) and he thought the main focus of a framework should be interactivity. That’s why it’s called Seagull.
If you’ve never used a web framework, you may wonder if the advantages of such a system outweigh the cost of learning how to use it. You may think that since creating sites from scratch has worked, you should continue doing things that way. All we ask is that you follow this tutorial to create a simple site with Seagull. If you find it doesn’t save you time, don’t use it. If you wonder how you ever lived without it, great! If you have used a web framework before, the following tutorial will introduce you to Seagull and give you a handle on its various idiosyncrasies.
This tutorial will walk you through the steps of creating a medium-sized application with Seagull. We will need to install the framework, create a few users, manage permissions, use various modules (like the CMS module for sharing articles) and, last but not least, modify the look and feel so it fits with your corporate identity. Additionally, we will create a new module called “wish list,” in which users will be able to sign up and add/edit/delete items from their wish lists, which will be publicly viewable. This is a simple application, but one practical enough to give you all the tools necessary to create your own site. Let’s get started!
Model-View-Controller
Seagull uses the Model-View-Controller (MVC) pattern. For an introduction to the MVC pattern, see the May 2003 issue of php|architect (https://www.phparch.com/issue.php?mid=9).
Figure 1 shows how MVC is implemented in Seagull.
In detail:
• Root directory: init.php and constants.php
• etc/: basic configuration files, SQL files, etc.
• lib/: libraries (Seagull, PEAR and others) and data files, such as arrays of country names or languages, in lib/data/
• modules/: each module has its own subdirectory
• var/: all temporary data, such as compiled templates, DB_DataObject entities, log files and sessions. This directory must be writable by the webserver.
• www/: the application webroot, which contains the front controller script, themes and JavaScript. Only this directory should be reachable from the web; otherwise, make sure to protect the others with .htaccess files.
Basic Classes
Basic tasks like connecting to a database, sending emails or formatting output are done using the Seagull Base Classes, contained in /lib/SGL/. These classes provide Seagull with its basic functionality and do not need to be completely understood before using Seagull. We advise you to become familiar with these classes when you get the chance, however, as it will give you a greater understanding of the framework itself. For a deeper look at these classes, please visit the API documentation at the project homepage.
System Architecture
The framework consists of:
• base framework: The framework itself is made up of a set of base classes, organized according to the MVC design pattern, that take care of permissions, authentication, sessions, input/output and database abstraction.
• modules: Each generalized area of functionality comes in the form of a module that is associated with manager classes, blocks, or items. You may find your business requirements already implemented in one of these pre-made modules.
• libraries: Most task-specific functionality comes from libraries, which are quite often from PEAR (http://pear.php.net). These libraries can be independently updated when upgrades/improvements are available.
• entities/entity managers: Each object in the application, such as Member, Group, Property, Document, Article, etc., is represented by an entity. You can quickly prototype entities using the tools Seagull provides to create skeleton classes.
Directory Structure
Before starting to use Seagull, let’s have a look at the directories it contains. You can see the complete structure in Figure 2.
FIGURE 1
Templates and Themes
Seagull uses templates and themes to separate data from layout. By default, the PEAR package HTML_Template_Flexy is used. Flexy compiles all HTML templates into PHP scripts that are never edited by the developer. You also won’t need to worry that template files are being parsed every time a request is made.
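The compile-once idea is easy to picture with a toy example. The sketch below turns a template containing {name} placeholders into a PHP script, stores it in a cache directory, and reuses that script until the template changes. Everything here (the function names, the placeholder syntax, the file locations) is a simplified assumption for illustration; HTML_Template_Flexy has its own syntax and API, which are not shown.

```php
<?php
/**
 * Compile a template with {name} placeholders into a PHP script, reusing the
 * compiled file until the source changes. A toy stand-in for the idea only;
 * HTML_Template_Flexy's syntax and API differ.
 */
function compileTemplate(string $source, string $cacheDir): string
{
    $compiled = $cacheDir . '/' . md5($source) . '.php';
    // Recompile only when no compiled copy exists or the template is newer.
    if (!is_file($compiled) || filemtime($compiled) < filemtime($source)) {
        $php = preg_replace(
            '/\{(\w+)\}/',
            '<?php echo htmlspecialchars((string) ($data[\'$1\'] ?? \'\')); ?>',
            file_get_contents($source)
        );
        file_put_contents($compiled, $php);
    }
    return $compiled;
}

function renderTemplate(string $source, array $data, string $cacheDir): string
{
    $compiled = compileTemplate($source, $cacheDir);
    ob_start();
    include $compiled;          // the compiled script reads $data
    return ob_get_clean();
}

$dir = sys_get_temp_dir();
$tpl = $dir . '/wishlist_demo.html';
file_put_contents($tpl, '<h1>{title}</h1>');
$html = renderTemplate($tpl, ['title' => 'Wish list & more'], $dir);
```

On every request after the first, the only work left is including an ordinary PHP file, which is why the parsing cost disappears.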
FIGURE 2
FIGURE 3
By using templates, you can split the jobs of programming and designing between different people. This way, a designer will never have access to the program logic and will be unable to ruin your carefully crafted code.
A theme, in turn, is a collection of directories placed in www/themes/. Each subdirectory contains the HTML templates for the module it represents.
Installation
Installing Seagull is very easy. All you need is a webserver (like Apache or IIS), PHP (version 4.1 or newer; PHP 5 works, too), and a database (e.g. MySQL, PostgreSQL, Oracle) before you can begin.
First, download the most recent version of Seagull from the project homepage, and unpack it into your webroot directory.
Alternatively, you can use the PEAR package manager. This method is the easiest and fastest way to get Seagull up and running, but there are a few requirements:
• You must be running a recent version of PHP (4.3.4 or newer) with the base PEAR packages installed.
• You must set the PEAR data_dir to your webroot, or point it to anywhere on your filesystem and subsequently create a virtual host to expose the www directory. This is done with the -d data_dir=/path/to/data/dir switch. To view your current settings, use pear config-show.
• Your preferred package state must be set to alpha. The current state of the Seagull project is stable, but there is a dependency on the Validate library, which has been alpha for ages now.
So, to install Seagull using the PEAR installer, type the following on the command line (on one line):
pear -d data_dir=/path/to/web/root -d preferred_state=alpha install --onlyreqdeps http://kent.dl.sourceforge.net/sourceforge/seagull/seagull-0.4.5.tgz
Once you have performed the installation with the PEAR package manager, don’t forget to revert your PEAR configuration settings to their original state.
Now, let’s continue the installation process.