
DOCUMENT INFORMATION

Basic information

Title: PDFLib’s Block Tool: Templating PDFs for Maximum Reusability
Author: Ron Goff
School: Nexcess.net
Field: PHP / MySQL
Type: Article
Year of publication: 2005
City: Ann Arbor
Pages: 65
File size: 5.12 MB


Secure your applications against Email Injection
Tips on Output Buffering
KOMODO - reviewed
and much more

FPDI in Detail
Importing existing documents with Free PDF Import

2005 Look Back
Reflecting on last year’s events in the PHP world with PHP guru Derick Rethans

i18n
Internationalize your web application with less PHP code

PDFLib’s Block Tool
Templating PDFs for Maximum Reusability

VOLUME 5 ISSUE 1


CONTENTS

Features

2005 Look Back: Reflecting on last year’s events in the PHP world
by DERICK RETHANS

PDFLib’s Block Tool: Templating PDFs for Maximum Reusability
by RON GOFF

FPDI in Detail: Importing existing documents with Free PDF Import
by JAN SLABON

i18n: Internationalize Your Web applications with less PHP code

Why is it Taking so Long? Lead times and the rationale behind them

WRITE FOR US!

If you want to bring a PHP-related topic to the attention of the professional PHP community, whether it is personal research, company software, or anything else, why not write an article for php|architect? If you would like to contribute, contact us and one of our editors will be happy to help you hone your idea and turn it into a beautiful article for our magazine. Visit www.phparch.com/writeforus.php or contact our editorial team at write@phparch.com and get started!

Download this month’s code at: http://www.phparch.com/code/


php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada.

Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.

php|architect, php|a, the php|architect logo, Marco Tabini & Associates, Inc. and the Mta Logo are trademarks of Marco Tabini & Associates, Inc.

EDITORIAL

PLATFORM DIVERSITY

In the past five (or so) years, especially, the desktop landscape has changed, severely. Desktops have traditionally been dominated by Windows, but alternatives are making their way into both the office and home.

Apple’s hit operating systems in the OS X series, and other chic products (like the iPod), have not only fueled the sales of Macintosh computers, but have opened consumers’ minds to the reality that there are alternatives to Windows. The market is still strongly clutched by Microsoft, but more and more users are making the “switch” to Mac (and to a much lesser extent, alternatives like Linux).

This diversity, while good, can cause portability problems, and as I’ve touched on in past issues, developers can no longer target a single browser, but must become more and more aware of standards and cross-browser/cross-platform compatibility issues.

For the most part, developers seem to have the browser issue under control. I personally never use Internet Explorer for anything but testing (I’m a Firefox fanboy), and it’s very rare that I still run into sites that simply won’t work with FF. Even in cases where it seems I’m out of luck, I can often spoof the User-Agent header, and get a working site. Since Firefox is available on many platforms, it seems that the HTML issue is (mostly) behind us. I say “mostly” because standards-compliance and portability are things that we always need to strive for.

If you’ve tried to distribute a printable, offline-viewable, and well laid out document, in the past, you know that HTML doesn’t cut it. There’s little provision for the features that are necessary to build a professional document (there is hope with CSS, though). This often leaves websites delivering “richer” documents, such as MS Word documents or RTF files.

The distribution of proprietary format documents leads to its own set of problems, primarily: document creation and portability. Have you tried to build a Word document from your non-Windows Web server? It’s not fun. Equally tedious is trying to get that document to render properly in different versions of Word, on different platforms; worse is the rendering in non-Microsoft applications, such as OpenOffice. Enter PDF.

Now, PDF is certainly not new technology. It does, however, seem to be becoming more and more the de facto standard for document distribution. PDF is no stranger to php|architect readers: if you’re not reading this on paper, you’re reading a PDF, and we’ve brought you much PDF-centric content in the past, but we’ve certainly not drained the PDF knowledge pool.

This month, we’re happy to focus on PDF, once again, but this time with a twist: using PHP to modify existing PDFs, through various means.

It’s also our pleasure to be running Derick Rethans’ PHP Lookback, 2005. Marco will touch more on this in exit(0).

On that note, we at php|architect wish you and your business a happy and successful 2006. Here’s to another great year of PHP!


PHP 5.1.2 RC1

Ilia Alshanetsky announces the release of PHP 5.1.2 RC1.

“I’ve just packaged PHP 5.1.2RC1, the first release candidate for the next 5.1 version. A small holiday present for all PHP users, from the PHP developers. This is primarily a bug fixing release with its major points being:

• Many fixes to the strtotime() function, over 10 bugs have been resolved.
• A fair number of fixes to PDO and its drivers.
• New OCI8 that fixes large number of bugs backported from head.
• A final fix for Apache 2 crash when SSI includes are being used.
• A number of crash fixes in extensions and core components.
• XMLwriter & Hash extensions were added and enabled by default.”

Get all the info at http://ilia.ws/archives/97-PHP-5.1.2RC1-Released!.html
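For readers who have not yet tried the two extensions named in the last bullet, here is a minimal sketch of what each one offers. The function names fingerprint() and releaseNoteXml() are made up for this illustration, and the sketch assumes a PHP build in which both extensions (and XMLWriter's object-oriented API) are available.

```php
<?php
// Hash extension: one function, many algorithms (selected by name).
function fingerprint($data)
{
    return hash('sha256', $data);
}

// XMLWriter: build a small document in memory and return it as a string.
// releaseNoteXml() is a hypothetical helper, not part of any library.
function releaseNoteXml($version, array $fixes)
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->startDocument('1.0', 'UTF-8');
    $xml->startElement('release');
    $xml->writeAttribute('version', $version);
    foreach ($fixes as $fix) {
        $xml->writeElement('fix', $fix);
    }
    $xml->endElement();   // closes <release>
    $xml->endDocument();
    return $xml->outputMemory();
}
```

Calling releaseNoteXml('5.1.2RC1', array('strtotime')) returns an XML string with one fix element inside a release element; fingerprint() returns the hexadecimal SHA-256 digest of its argument.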

FUDforum 2.7.4RC1 Released

The FUDforum team has announced the latest release of their open source forum package, version 2.7.4 RC1. Some of the new features include:

• Added subscribed forum filter to message navigator
• Added handling for in-lined attachments in mailing list import
• Added the ability to supply custom signature to message synchronized from the forum back to mailing list or a news group
• Added support for allowing the user to select how many threads they want to see per page
• Much more…

Visit FUDforum.org for all the latest info.

eZ components 1.0 beta2

ez.no is proud to announce the release of eZ components. ez.no announces: “Ez components is an enterprise ready, general purpose PHP platform. As a collection of high quality independent building blocks for PHP application development, ez components will both speed up development and reduce risks. An application can use one or more components effortlessly, as they all adhere to the same naming conventions and follow the same structure. All components are based on PHP 5.1, except for the ones that require the new Unicode support that will be available from PHP 6 on.”

Need to speed up your development? Check out ez.no for more info.

xajax 0.2

xajaxproject.org announces the release of version 0.2. What is it? The site describes it as: “an open source PHP class library that allows you to easily create powerful, web-based, Ajax applications using HTML, CSS, JavaScript, and PHP. Applications developed with xajax can asynchronously call server-side PHP functions and update content without reloading the page.”

To start working with xajax, visit xajaxproject.org.

SQLiteManager 1.2.0RC2

If SQLite is the db of choice for your PHP application, you may be interested in the latest release of SQLiteManager. SQLiteManager.org lists the features as:

• Management of several databases (creation, access or upload)
• Management of the attached databases
• Create, edit and delete tables and indexes
• Insert, edit, delete records in these tables
• Management of views; create views from SELECTs
• Management of triggers
• Management of user defined functions
• Manual request and from file, it is possible to define the format of the requests, sqlite or MySQL; a conversion is done in order to directly import a MySQL database in SQLite
• Importing of records from a formatted text file
• Export of structure and the data
• Choice of several display skins

Check out SQLiteManager.org to start managing your SQLite DB, today.

php|architect Releases New PDFlib Book

We are proud to announce the release of our latest book in the “Nanobooks” series, called Beginning PDF Programming with PHP and PDFlib.

Authored by Ron Goff, this book provides a thorough introduction to the great capabilities provided by the PDFlib library for the creation and manipulation of PDF files.

The book features a foreword by Thomas Merz, the original author of PDFlib and founder of PDFlib GmbH, and tackles topics like PDF file creation, fonts, text, shapes and much more, including PDFlib’s Block Tool, which allows for the manipulation of existing PDF documents.

For more information, visit http://www.phparch.com/pppp


Check out the hottest new releases from PEAR.

Image_Color2 0.1.4

PHP 5 color conversion and basic mixing. Currently supported color models:

• CMYK - Used in printing
• Grayscale - Perceptively weighted grayscale
• Hex - Hex RGB colors, i.e. #abcdef
• HSL - Used in CSS3 to define colors
• HSV - Used by Photoshop and other graphics packages
• Named - RGB value for named colors like black, khaki, etc.
• WebsafeHex - Just like Hex but rounds to websafe colors

Config 1.10.5

The Config package provides methods for configuration manipulation.

• Creates configurations from scratch
• Parses and outputs different formats (XML, PHP, INI, Apache)
• Edits existing configurations
• Converts configurations to other

It provides a common API for all supported RDBMS. The main difference to most other DB abstraction packages is that MDB2 goes much further to ensure portability. Among other things MDB2 features:

• An OO-style query API
• A DSN (data source name) or array format for specifying database servers
• Datatype abstraction and on demand datatype conversion
• Various optional fetch modes to fix portability issues
• Portable error codes
• Sequential and non sequential row fetching as well as bulk fetching
• Ability to make buffered and unbuffered queries
• Ordered array and associative array for the fetched rows
• Prepare/execute (bind) emulation
• Sequence emulation
• Replace emulation
• Limited sub select emulation
• Row limit support
• Transactions support
• Large Object support
• Index/Unique Key/Primary Key support
• Reverse engineering schemas from an existing DB
• SQL function call abstraction
• Full integration into the PEAR Framework
• PHPDoc API documentation

The GDChart extension provides an interface to the bundled gdchart library. This library uses the (bundled) GD library to generate 20 different types of graphs, based on supplied parameters.

The extension provides an OO interface to gdchart, exposing the majority of options via properties and complex (array) options via a series of methods.

To use the current version of the extension, PHP 5.0.0 is required; an older, PHP 4-only version can be downloaded from CVS by checking out the extension with the PECL_4_3 tag.

yaz 1.0.6

This extension implements a Z39.50 client for PHP using the YAZ toolkit.

Fileinfo 1.0.3

This extension allows retrieval of information regarding the vast majority of files. This information may include dimensions, quality, length, etc.

Additionally, it can also be used to retrieve the MIME type for a particular file and, for text files, the proper language encoding.
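As a rough illustration of the MIME type lookup mentioned above, the sketch below uses the fileinfo API as it is bundled with current PHP versions. This is an assumption: the FILEINFO_MIME_TYPE constant may not exist in the 1.0.3 PECL release announced here, and mimeTypeOf() is a made-up helper name.

```php
<?php
// Return the MIME type of a file (for example "text/plain"),
// or false if the lookup fails.
// Assumes the fileinfo extension as bundled with current PHP releases.
function mimeTypeOf($path)
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    return $finfo->file($path);
}
```

Because the detection is based on the file's contents rather than its name, the result does not depend on the file extension.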

pecl_http 0.21.0

It eases handling of HTTP URLs, dates, redirects, headers and messages, provides means for negotiation of clients’ preferred language and charset, as well as a convenient way to send any arbitrary data with caching and resuming capabilities.

It provides powerful request functionality, if built with CURL support. Parallel requests are available for PHP-5 and greater.

PHP-5 classes: HttpUtil, HttpMessage, HttpRequest, HttpRequestPool, HttpDeflateStream, HttpInflateStream
PHP-5.1 classes: HttpResponse

Xdebug 2.0.0beta5

The Xdebug extension helps you debug your scripts by providing a lot of valuable debug information. The debug information that Xdebug can provide includes the following:

• stack and function traces in error messages with:
  • full parameter display for user defined functions
  • function name, file name and line indications
  • support for member functions
• memory allocation
• protection for infinite recursions

Xdebug also provides:

• profiling information for PHP scripts
• script execution analysis
• capabilities to debug your scripts interactively with a debug client


2005 PHP LOOK BACK

by DERICK RETHANS

... on the events of the past year. Derick Rethans, a PHP internals developer, has been publishing a PHP Look Back for a few years, now, and this year, we saw it fitting to publish it, here. Happy 2006!

Welcome to the fourth installment of the PHP Look Back. Just as in previous years, we’ll look back on PHP development discussions, bloopers and accomplishments of the last year. This is not supposed to be a fully objective review of last year; note that the opinions in this article are those of the author, and not of the PHP development team (nor of php|architect).

January

January was a quiet month, with not much going on. After about 8 months[001], we finally added[002] a PIC/non-PIC detection mechanism to the configure script, which will select non-PIC object generation for supported platforms (Linux and FreeBSD). Non-PIC code is about 30% faster, as measured in earlier benchmarks.

A week later, Leonardo[003] was wondering whether we planned on adding type hints for scalar types to PHP. As PHP is a weakly-typed language, this is not something we wanted to add, although we did add support for an “array” type hint, later in the year. With PHP 5.1’s new GOTO execution method (added last August), variable name lookups are cached internally. This caused some problems for Xdebug[004], as it needs some information to find out which variables are used in a specific scope. Andi committed[005] a patch that made Xdebug work properly, again.

Michael started working on his HTTP extension (which generates way too many commit mails ;-) and encountered a problem with a naming clash[006] between PEAR’s HTTP class and his PECL extension. Greg responded[007], and said that this problem will be solved when PEAR 1.4 comes out, with its channel support.

February

Andi started discussions in February by pointing out a date for the first beta of PHP 5.1: March 1st. He declared that “both PDO and Date should be included in the default distribution”[008] and others suggested that XML Reader[009] should be included by default, as well. In reply to Andi, Rasmus mentioned[010] that he would like to see the filter extension included, as well. The discussion about this extension quickly transitioned to data mangling of input request variables, and how they could not be influenced by the script authors, but only by the system administrator. In the end, this discussion gave way to the topic of operator overloading[011], where certain people kept reiterating that operator overloading is a “good thing.[012]”

Andrei tried to stop this discussion by being funny[013], but it didn’t work very well[014]. Around the same time, Wez announced[015] the first beta of PDO (PHP Data Objects). Wez wanted people to test[016] PDO, and of course, over the next couple of months, there were various PDO-related concerns[017] and issues raised.

Another discussion in February was about auto boxing[018] in PHP. Auto boxing is the encapsulation of all primitive types as objects. Naturally, people asked why[019] we would want to have this, and no sound reason was given. In the end, this discussion suggested that phpDocumentor[020] should handle type determining, instead. Having a doc block[021] parsing extension to the reflection API would be nice, although a bit hard.

We also had an often-recurring discussion[022] on why the GPL[023] is a bad idea for PECL[024] extensions.

John added the first version[025] of XMLRPCi to CVS; why he chose this silly name is still unknown. Jani wrote about a problem with overwriting globals[026], an issue that, later in the year, warranted a new PHP release, and Greg introduced[027] PEAR 1.4, with channel support.

Halfway through the month, Marcus[028] mentioned a few things that should go into PHP 5.1; most notably the __toString() fix, which, unfortunately, did not actually make it into the release. Type hinting with “= NULL” did make it in[029], though.
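To make the “= NULL” remark concrete, here is a small sketch of a class type hint with a NULL default. The Logger class and the process() function are invented for this example; newer PHP versions also offer an explicit nullable form (?Logger), which is not shown here.

```php
<?php
// Hypothetical collaborator used only for this illustration.
class Logger
{
    public $lines = array();

    public function log($message)
    {
        $this->lines[] = $message;
    }
}

// The "= NULL" default makes the hinted argument optional:
// callers may pass a Logger, pass NULL, or leave the argument out.
function process($item, Logger $logger = NULL)
{
    if ($logger !== NULL) {
        $logger->log("processing $item");
    }
    return strtoupper($item);
}
```

Without the default, passing NULL to a parameter hinted as Logger is rejected, which is what made optional collaborators awkward to express before this change.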

Martin Sarsale reported[030] an issue with references and segfaults, something which had been annoying us at eZ systems[031] for quite some time, too. This issue got fixed in PHP 4.4, albeit not without a little bickering (more about that later).

March

In March, Ilia proposed[032] a patch that adds a special token that tells PHP’s parser to abort parsing when the token is encountered. This allows us to attach binary data to the end of a PHP script, which is highly useful for one-script installers, such as the one that FUDForum[033] uses.
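The article does not name the token; the sketch below assumes it is what PHP exposes as __halt_compiler(), together with the __COMPILER_HALT_OFFSET__ constant that gives the byte position where the attached data starts. To keep the example self-contained, the “installer” is generated into a temporary file and then included; buildInstallerSource() and runInstaller() are hypothetical helper names.

```php
<?php
// Build the source of a one-file "installer": PHP code, the halt token,
// then arbitrary bytes that the parser never looks at.
function buildInstallerSource($payload)
{
    $code = "<?php\n"
          . "\$fp = fopen(__FILE__, 'rb');\n"
          . "fseek(\$fp, __COMPILER_HALT_OFFSET__);\n"   // jump to the attached data
          . "\$data = stream_get_contents(\$fp);\n"
          . "fclose(\$fp);\n"
          . "return \$data;\n"
          . "__halt_compiler();";                        // parsing stops here
    return $code . $payload;
}

// Write the generated script to a temporary file, run it, and return
// whatever it read from its own tail.
function runInstaller($payload)
{
    $path = tempnam(sys_get_temp_dir(), 'inst');
    file_put_contents($path, buildInstallerSource($payload));
    $data = include $path;
    unlink($path);
    return $data;
}
```

A real installer would unpack the attached archive instead of returning it, but the mechanism is the same: everything after the token is treated as data, even if it happens to contain bytes that look like PHP.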

On the 14th of the month, Zeev released the first RCs[034] of both 5.0.4 and 4.3.11. We also encountered further reference issues[035].

The same guy that mailed tons of “fixes” to the internals list, last June[036], was back with more[037] patches. Andrei, once again, pointed out[038] that it is a good idea to check with an extension’s maintainer before applying patches, and Greg published[039] the package2.xml documentation.

Lukas, once more, pointed out[040] the weird naming scheme that new extensions seem to be getting, and luckily Debian’s PHP packages got rid[041] of some of the insanity that was present in previous[042] releases by not always building in ZTS mode. Unfortunately, their packages still force PIC mode for the libraries.

A user brought up the idea of an upload meter patch[043], again, and although we all seemed to remember[044] that the original patch was rejected[045], no one could find the original thread[046] where this was discussed. Last year’s Look Back discussed this too, and there, the reason was mentioned[047].

In the last week of the month, we had some fuss[048] about “FreeBSD doing stupid things[049]” regarding their naming of auto tools executables[050].

April

April started with a suggestion[051] by Zeev to change the way that __autoload() works, by allowing multiple instances of this magic function. In the end, we didn’t implement this, and as Lukas described[052], “Frameworks should provide __autoload() helper methods, but should never implement the function itself. It’s up to the end user to do this.” (This is exactly how we implemented it for the eZ components[053].)
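The division of labour Lukas describes can be sketched as follows. Everything here is illustrative: MyFramework_Autoload and registerAppAutoloader() are invented names, the underscore-to-directory mapping is just one common convention, and the end-user hook uses spl_autoload_register() with a closure (an assumption that requires a newer PHP than the one discussed here) instead of defining the magic function directly.

```php
<?php
// Framework side: a helper that knows how to find and load a class,
// but does NOT install itself as the autoloader.
class MyFramework_Autoload
{
    // Map "My_Package_Class" to "<baseDir>/My/Package/Class.php".
    public static function pathFor($class, $baseDir)
    {
        return $baseDir . '/' . str_replace('_', '/', $class) . '.php';
    }

    // Load the class file if it exists; report whether it was found.
    public static function load($class, $baseDir)
    {
        $file = self::pathFor($class, $baseDir);
        if (is_file($file)) {
            require_once $file;
            return true;
        }
        return false;
    }
}

// Application side: the end user decides how the helper is hooked in.
function registerAppAutoloader($baseDir)
{
    spl_autoload_register(function ($class) use ($baseDir) {
        MyFramework_Autoload::load($class, $baseDir);
    });
}
```

Because the framework only ships the helper, two libraries that follow this rule can be combined in one application without fighting over a single autoload function.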

Andi wanted to release PHP 5.1 Beta 1[054] really soon, but, as Jani mentioned[055], there were quite a few things that were still not fully ready, and thus the suggestion to call it “Alpha”[056] was made, instead. During this thread, some pet-features[058] were brought up[059].

Kamesh, from the Netware porting team, found another reference issue[060]. Marcus added the File[061] class to his SPL extension, causing a small stir: the new class clashed with any application that already defines its own File class. Although this is a valid point, projects defining a “File” class should know better, and would be wise to prefix their class names. This same issue will pop up later in the year.

A last, somewhat larger, discussion erupted when a question[062] about whether APC could be used as a content cache was posted to the list. Rasmus found it an interesting idea[063], although this functionality can also be accomplished in user space. In the last point of the thread, Rasmus mentioned[064] that APC will soon support PHP 5.

May

May had a slow start, and things only got interesting at the end of the month. The first discussion that came up was Ilia’s removal of dangling commas from enums, something that “was in c language from the first day[065].” Apparently, GCC 4 is “becoming worse and worse[066],” but luckily, we can still just ignore the warnings[067].

After a small private discussion with Dmitry about Marcus’ and my reference fix patch[068], he came to the conclusion that this patch breaks binary compatibility and that this problem warrants a PHP 4.4 release. As this reference problem has been affecting many users, and definitely eZ over the past months, I wrote an email[069] to the list stating that it is “totally irresponsible” not to release a fix for such a grave bug. Zeev[070] also said that “we should probably not fix this at all in the 4.x tree” because of the hassles that accompany “breaking module binary compatibility.” He also seemed to think that the bug can easily be worked around.

Other users were a bit happier[071] that we finally nailed this bug, and Jani replied to Zeev that the magnitude[072] of this bug is pretty high. Rasmus added that he “will be deploying the patch and happily breaking binary compatibility[073]” as soon as the patch is ready. Breaking binary compatibility is only a “burden on the maintainers of these packages” (of the various distributions). Wez thought that “the only logical move forward is a 4.4 branch and release[074].” In the end, the Zeev almighty was “tired of going through the reasons again and again[075]” and noted that “everyone appears to prefer the upsides to the downsides.” This resulted in the creation of the PHP_4_4 branch[076] in the first week of June.

June

Wez added a new patch to our CVS server that allows us to block access[077] to specific branches; with this, we closed the PHP_4_3 branch for good. A week later, I announced 4.4.0RC1[078], which features the reference bug fix.

Andi wrote another PHP 5.1 mail[079], which spawned a nice long discussion on adding goto[080] to PHP, and comparing goto to exceptions. Magnus smartly added[081] that “people are talking about hypothetical messy code because of goto” and that they forget that you don’t have to use a language construct simply because it is available.

The same thread also went into a branch that discussed[082] the ifsetor() language construct. After Andi returned, he decided not to do anything with goto or ifsetor()[083], and that it was now the time to branch, so that we can merge the Unicode support that was developed in parallel by mostly Andrei and Dmitry, although Rasmus was “pretty sure the current discussions will pale in comparison to the chaos that will be created when the Unicode stuff goes into HEAD![084]”
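For context, the need that ifsetor() was meant to address is usually covered with an isset() check and the ternary operator. The sketch below wraps that idiom in a userland function; array_ifsetor() is a made-up name, and it only handles array lookups, which is narrower than a real language construct would be. (Much later PHP versions added a null coalescing operator, ??, for the same purpose.)

```php
<?php
// Userland stand-in for the proposed construct: return $array[$key] when
// it is set (and not NULL), otherwise fall back to $default.
// No notice is raised for a missing key, because isset() is checked first.
function array_ifsetor(array $array, $key, $default = NULL)
{
    return isset($array[$key]) ? $array[$key] : $default;
}
```

A typical use is reading request input with a fallback, for example array_ifsetor($_GET, 'page', '1').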

Johannes wondered when the new date stuff[085] was going in; it was added a week later, just before PHP 5.1 beta 2. Lukas suggested that we add[086] the public keyword to PHP 4.4 for forward compatibility. Rasmus again wondered about “the reasoning for not having var be a synonym for public in PHP 5[087].” Andi mentioned[088] that this “was meant to help people find vars so that they can be explicit about the access modifiers” when moving to PHP 5.

A few days later, Andi read a blog posting[089] which described how PHP 4.4 is breaking backwards compatibility by issuing an E_STRICT in cases where developers abuse return-by-reference. This, however, was not actually the case[090].

Yasuo started a long thread[091] on the allow_url_fopen setting and claimed it was dangerous[092]. The main result of this thread seemed to be that we wanted to split the setting into two different privileges: one that allows remote opening of URLs and one to allow include() on remote URLs. However, this is something we could not yet change.

The last thread of the month was by Andi, writing about the PHP 5.1 release process[093].

July

In July, Jessie suggested[094] a String extension that declares only one class: String. This class is meant to prevent copying of the string’s data for most operations (which is currently done with PHP’s string functions). Most of the other developers were against it, for different reasons: “String is such a generic name for a non-core class[095]” and “the savings gained by this will be more than offset by OO overhead[096],” so we will not let “this get anywhere near the core[097].”

In the same week, I made more changes to the date extension[098] that allow users to more easily select the timezone that they want, instead of having to rely on the TZ environment variable. This is also needed because the TZ environment variable[099] can most likely not be used in a thread safe way, and it is certainly not portable[100]. Also in the same week, I proposed an API for new Date and Timezone functionality[101]. After some pressure[102], I added[103] an OO API, too. Near the end of the month, I committed the implementation of the new date functionality[104]. It was, however, #ifdef-ed out to facilitate discussions at a later date.
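As an illustration of selecting a timezone from the script instead of through the TZ environment variable, the sketch below uses date_default_timezone_set() and date_default_timezone_get(), the functions PHP provides for this. The helper name formatIn() is invented, and the example does not touch the OO API mentioned above.

```php
<?php
// Format a Unix timestamp in an explicitly chosen timezone, then restore
// the previous default so the caller's setting is left untouched.
function formatIn($timezone, $timestamp, $format = 'Y-m-d H:i:s')
{
    $previous = date_default_timezone_get();
    date_default_timezone_set($timezone);
    $text = date($format, $timestamp);
    date_default_timezone_set($previous);
    return $text;
}
```

For example, formatIn('UTC', 0) gives the start of the Unix epoch, and formatIn('Asia/Tokyo', 0) gives the same instant nine hours ahead.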

Jessie came up with Yet Another Namespace Proposal[105], and tried to come up with a solution for all the previous problems we had with the implementation. He also made several patches[106] that added namespaces to PHP.

We had some more fuss[107] about PHP 4.4 breaking BC, where some people didn’t see[108] why we had to implement this fix. Unfortunately, there were some quirks[109] that we still had to sort out.

In this same month, Rasmus released APC 3.0.0[110], which came with PHP 5.1 support and numerous fixes.

August

August started with a discussion on instanceof[111] being “broken,” as it raises a fatal error in the case where the class that is being checked for doesn’t exist. Andi declared “if you’re referencing classes/exceptions in your code that don’t exist, then something is very bogus with your code[112]” and “the only problem is if the class does not exist in your code base, in which case, your application should blow up![113]”

I raised a question about whether the new PHP with Unicode should be called PHP 5.5 or PHP 6.0[114]. Andi (and the majority) wanted to go “with PHP 6 and aim to release it before Perl 6[115].”

After PHP_5_1 was branched, Andrei merged the Unicode branch and gave us some instructions on how to get started with it[116]. He also introduced the general ideas behind the implementation[117].

PHP 5.1 RC1 was finally rolled, about half way through the month, followed by PHP 5.0.5 RC2[118], a week later.

During the development of the eZ components[119], we discovered various things in PHP’s OO model that we wanted to see changed. One of those issues was described in the Property Overloading RFC[120]. Unfortunately, not everybody could be convinced[121], and no changes were made. I will try again though :)

The other issue that we raised was that failed typehints throw a fatal error[122], while that is not strictly necessary. Instead of throwing exceptions[123] in this case, the discussion turned towards adding a new error mode[124] (E_RECOVERABLE[125]) that will be used for non-engine-corrupting fatal errors at the language level; this is exactly the case with failed typehints.

The longest thread of the month was started by Rasmus when he posted his PHP 6[126] wish list, which featured controversial changes such as “removing magic_quotes” and “making identifiers case-sensitive,” to which most developers quickly agreed[127]. Following his initial wish list, the crowd went wild and started suggesting all kinds of weird changes, such as “Radically change all of the operator syntaxes[128],” adding <?php6[129] as a BC breaking mode, and “Named parameters[130].”

Marcus made a list of his own[131] which would later become the first draft of the meeting agenda for a PHP Developers Meeting.

September

In September, Antony committed[132] an upgraded OCI8 extension which fixes a lot of bugs[133]. We also decided to play a bit nicer with version_compare(), regarding naming[134] release candidates.
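Why release candidate naming matters to version_compare() is easiest to see in code. The sketch below relies on the function's documented ordering, in which an “RC” suffix sorts before the final release of the same version; isPreRelease() is a hypothetical helper, and the version strings are only examples.

```php
<?php
// True when $candidate sorts before the final release it leads up to.
function isPreRelease($candidate, $final)
{
    return version_compare($candidate, $final, '<');
}
```

With consistently named candidates, checks such as version_compare(PHP_VERSION, '5.1.0', '>=') behave as expected on an RC build: the candidate is treated as older than the final release.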

Zeev wanted to roll[135] PHP 5.0.5 but there was an issue[136] with the shutdown order. The reference issues returned, too. The first one[137] turned out to be an incorrect merge to the PHP 5.0 branch, where suddenly some of the notices turned into errors[138]. The second one[139] is simply a small change in behaviour, which previously created memory corruption. Rasmus explained the issue a bit more[140], once again.

Ilia tried to implement a clever fix[141] which turned out to be a problem later on. Pierre started a discussion on supporting Unicode in identifiers, something he didn’t want to see. PHP already supports using UTF-8 encoded characters[142] in identifiers, so removing this feature will break BC unnecessarily. Besides breaking BC, many people simply want to use their own language for writing code, as Tex[143] writes.

Zeev made another attempt at PHP 5.1.0 RC2[144] with the latest PEAR being the only thing missing. Marcus brought up the issue of __toString() again, and finally managed to get it into CVS, but unfortunately not in time for PHP 5.1.

Stanislav[146] noticed some problems with detecting time zones, as the new date/time code did not try to attempt detection in favour of the new date.timezone setting[147]. After some discussion, we came up with a solution[148], which was then implemented. It should guess the timezone correctly in most cases, even on Windows. I also added support for an external timezone database[149].

October

In October, I noticed some weird notices[150] with “make install-pear,” without a clue as to why they were showing up. This discussion turned into a “why does PEAR not support PHP 5.1” thread[151]. In the end, Greg managed to nail down the weird notices, though.

I also noticed a commit by Dmitry[152] that ignores “&” when $this is passed. I pointed out that this should not be supported (in PHP 5), as it doesn’t really make sense that people won’t see a warning/notice/error when they’re doing something silly. Dmitry explained[153] that disallowing it would break code, but he also writes that by “using ‘=& $this’, a user can break the $this value,” which is something we definitely should prevent. He suggested[154] we make this an E_STRICT warning, and Andi suggested[155] we escalate this to an E_ERROR in PHP 6, but neither of those things happened.

A week later, Piotr[156] asked for a tarball of our CVS to make it “possible to convert it to Subversion repository so browsing the repositories would be much easier.” We wondered[157] why he needed that, as we already offer our own browser[158].

Matthias[159] said that we “do not want to set off yet another discussion about the changes 4.4 brought,” but that is exactly what he did. Again, there was something wrong with his code, and thus the warning is legitimate.

After resolving the timezone issues last month, we were surprised by a message from Zeev. He simply missed[161] the conclusion in the “lengthly thread.”

As a result of the negative comments on the PHP 4.4.0 release, Lukas, Ilia and I set up a routine[162] for involving some of the better-known projects in the PHP 4[163] and PHP 5[164] release processes. As part of this effort, we send out[165] a mail to all participating projects whenever we have a release candidate to test.

I raised[166] some concern regarding our current Unicode implementation because of maintenance issues. In part of my mail, I also indicated that I wanted “to clean up PHP 6 for real,[167]” after private discussions with Marcus and Ilia. Behind the scenes, we prepared some material to organize a PHP Developers Meeting to discuss the Unicode implementation and the extended “PHP 6 Wishlist.” I also committed[168] a patch that allows typehints for classes to work with = NULL[169].

Another guy raised the issue of “that new isset()-like language construct,[170]” but this ended up going nowhere, as people were suggesting very Perl-like[171] operators. Jani replied to this thread with “How about a good ol’ beating with a large trout?[172]”

On the last day of the month, we released PHP 4.4.1[173], which addresses some of the reference issues we’ve seen in PHP 4.4.0.

November

In November, we prepared to finally release PHP 5.1, and one of the efforts was to make an upgrade guide[174] for people switching to PHP 5.1. Sean noticed[175] a problem with the parameter parsing API’s automatic type conversion. Like Andrei[176], many people think that “passing ‘123abc’ and having it interpreted as 123” is still wrong.

Dmitry implemented[177] support for “= null” as default to array type hinting, something that I did not do[178] on purpose because “= array()” is the logically correct way of doing this. Andi agreed[179] with me on this.

Ilia implemented, in PHP 5.1RC5[180], one of the items that was on the outcome list of the PHP Developers Meeting: adding a notice that warns people that curly braces[181] for addressing a character in a string are now deprecated in favour of the [] operator. Contrary to the current explanation in our manual, {} and [] are exactly the same thing[182], and “having two constructs for the same behaviour is silly and leads to confusing, hard to read code.” The outcome of this discussion was the removal of the notice in PHP 5.1, and the likely conclusion is that the curly-brace syntax is not going to get removed.

Another change that was made in PHP 5.1RC6 was the creation of the “Date” class, which caused quite a stir after the release of PHP 5.1[183]. The reason to introduce it in 5.1 was simply to make sure that no applications were going to break if we introduced the Date class later in the 5.1.x series. Unfortunately a lot of projects, including PEAR, never heard of “prefixing” class names, causing class name clashes. Marcus described the problem as “PEAR ignores coding standards,[184]” but others suggested that we rename the internal class[185] to something silly.

After the PDM I posted[187] the meeting notes[188] to the list. Most of the outcome was well appreciated, except the curly braces idea, which has already been discussed. With these notes, we hope to make PHP 6 a success. The notes also spawned numerous[189] polls[190] on the symbol to use for separating namespaces from class names/function names. We also discussed our version of a goto: labeled[191] breaks[192].

The filter extension[193], which I’ve been developing for quite some time, did not make it into PHP 5.1, although it is a good idea[194] to add it now, with an “experimental” status, so that this wanted extension gets more testing. Perhaps for PHP 5.1.2…

December

December was a quiet month with little action. Ilia proposed[195] a plan for PHP 5.1.2 and released PHP 5.1.2RC1[196], Zeev committed[197] Dmitry’s re-implementation of the FastCGI API, and some user[198] was whining about our “official” IRC channel (which doesn’t exist).

That was it for 2005 (as far as PHP internal development is concerned)! I hope you enjoyed reading this, and have a happy new year. Extra thanks go to Ilia, for being the release master, Dmitry for maintaining the engine, Jani for hunting down bug reports, Andrei for his work on Unicode, Mike for his enormous stream of useless commit messages ;-), and to all others who made PHP happen this year.

DERICK RETHANS provides solutions for Internet related problems. He has contributed in a number of ways to the PHP project, including the mcrypt, date and input-filter extensions, bug fixes, additions and leading the QA team. He now works as project leader for the eZ components project for eZ systems A.S. In his spare time he likes to work on xdebug, watch movies, travel and practice photography. You can reach him at derick@derickrethans.nl.

PDFlib’s Block Tool

by Ron Goff

FEATURE

If you’ve been developing for any length of time, you’ve probably been tasked with generating PDFs at some point. In this article, we’ll discuss the process of combining data from many sources into a single PDF, from installation of the block tool, to creating the blocks in Adobe Acrobat, and then finally working with the blocks via PDFlib.

TO DISCUSS THIS ARTICLE VISIT: http://forum.phparch.com/280

CODE DIRECTORY: pdflib

The PDFlib Block Tool (available for use only with PDFlib Personalization Server, PPS) helps create PDF documents derived from large amounts of variable data. Before the block tool was added, it was a difficult process to place variable data, images, and even other PDFs into precise areas of a PDF that had been designed previously. Now, adding variable data is very simple and helps create great dynamic pieces for just about any application.

Installing the Block Tool

Currently, the block tool plug-in for Adobe Acrobat is only available on the Windows and Macintosh (both Mac OS 9 and Mac OS X) platforms. On either platform, you must also have Version 6 or 7 of Adobe Acrobat Professional or Adobe Acrobat Standard, or the full version of Adobe Acrobat 5. Other versions of Adobe Acrobat (Acrobat Reader and Acrobat Elements) and all other PDF creation tools do not work with the block tool plug-in. (Check the PDFlib web site for an up-to-date list of supported PDF authoring tools.)

Windows OS Installation

If you’re using Windows, you can use the block tool installer provided by PDFlib to get the plug-in installed correctly into your version of Adobe Acrobat 5, 6, or 7. The installer places the correct files into the Acrobat plug-ins folder, which is typically found at C:\Program Files\Adobe\Acrobat 6.0\Acrobat\plug_ins\PDFlib. The Windows version of the block tool is compatible only with PPS version 6.0.1.

Mac OS Installation

You can install the block tool in either Mac OS 9 or OS X. If you own Adobe Acrobat 5, place the files that make up the block tool into the Acrobat plug-in directory, typically located at /Applications/Adobe Acrobat 5.0/Plug-Ins/.

If you’re using Adobe Acrobat version 6 or version 7, save the files that make up the block tool into a new directory and then locate the Acrobat program, which is usually found at /Applications/Adobe Acrobat 6.0 Professional. Using the Finder, click once on the Acrobat application to select it and then choose “File > Get Info” from the menu bar. Locate the triangle next to the words “Plug-ins.” Expand the triangle, select “Add,” and then locate the folder that contains the block tool plug-in files.

Creating Blocks

After you install the block tool, you should see a new menu called “PDFlib Blocks” in Acrobat’s main menubar. You should also see a new icon that resembles [=]; this is the block tool. (See the top of Figure 1.) You use the block tool icon to create regions that you can fill with variable data.

When you click the block tool icon and hover over the PDF, your cursor turns into a crosshair. To create a block, click the mouse and hold it while dragging your cursor. As you drag your cursor, a lightly-outlined box should appear. (See Figure 1.)

When you’re satisfied with the size of the box, release the mouse button. A menu like the one shown in Figure 3 appears. The menu controls all of the properties of the block, including the formatting of the data that will be contained in the block (data that you will add via PDFlib).

FIGURE 1

FIGURE 2

FIGURE 3

The New and Improved Block Tool

If you’ve used previous versions of the block tool, you’ll notice that the new version is much more user friendly. The export and import features have also been updated, making it much quicker to apply blocks from previously formatted PDFs.

There are three types of blocks that can be created:

• The first and default type of block is text. It handles any type of text, whether it’s a single line of text or many lines of text.

• The second type of block is image. As its name implies, an image block is a container for the dynamic placement of images within the PDF.

• The third and last type is PDF, which is able to contain other PDFs.

Each block has general properties (see Figure 2) and type-specific properties. General properties set attributes such as the placement of the block, its background and border colours, and its orientation, to name just a few. Some of the sections that follow describe the type-specific properties.

So what do you do with blocks? As you might have inferred already, you use blocks to mix dynamic content amid static content. A designer can create a PDF, include static text and images, and then place blocks wherever dynamic content should appear. Your application “fills in the blanks,” so to speak, and because blocks retain properties such as typeface, font size, color, kerning, and other settings, the block, once filled, looks exactly like the rest of the document, just as the designer intended.

Using blocks, the application that generates each PDF document need not format anything. However, if you want to customize a block on-the-fly, you can. Pre-defined block attributes can be overridden by your code.

Editing Block Settings

To change a block property, select the block you want to configure and then navigate to find the property you want to change. For example, Figure 3 shows how to edit the textflow property, which can be either true or false (hence, the dropdown menu).

The purpose of most properties is obvious, but be careful with attributes that specify font names. Unless you’re running Acrobat on the same machine as your PDFlib application, it’s likely that the set of fonts on the two machines (say, your desktop and the server, respectively) will differ. Be sure to use the names of fonts that are installed on your server.

Text Flow Settings

If you want a block to flow (automatically wrap and justify) arbitrary amounts of text, set the textflow property to true. Once set to true, an additional button named TextFlow appears next to the existing button labeled Text. Click on TextFlow to examine and set specific variables (such as leading and indents) that control how text flows in the block. All other text attributes, whether for one line of text or a flow of text, remain in the same pane as the textflow property.

Image Settings

By changing the block option to image, you can use PDFlib to place images dynamically in a PDF. There are far fewer options for an image block than for a text block. The options screen for an image block is shown in Figure 5.

The defaultimage attribute names a default image to place if the image specified by PDFlib is unavailable.

The dpi setting, or the number of dots per inch, is used to override the dpi of an image. If this option isn’t set, PDFlib uses the dpi value stored in the image if one is available, or 72 dpi otherwise. If necessary, you can set the horizontal and vertical dpi independently by supplying two values instead of one, first horizontal dpi and then vertical dpi.

The scale property controls the scaling of the image. You can supply one value to scale horizontally and vertically equally, or supply two values, one for the horizontal and another for the vertical scale factor.

PDF Settings

The settings for a PDF block are very similar to the settings for an image block, as shown in Figure 6.

defaultpdf specifies a default PDF to place if the PDF document that PDFlib names cannot be found.

defaultpdfpage specifies which page of the default PDF to place if the default PDF must be used.

scale controls the scaling of the PDF. As with an image, you can specify one value to apply to both axes or you can provide two values, one for horizontal scaling and another for vertical scaling.

Custom Settings

When using any type of block, you can specify custom attributes. Custom attributes do not affect the output when using PDFlib, but can be retrieved by PDFlib for interpretation by your code. Custom attributes are good for passing information to the PDFlib program, or even for just better record keeping.

As an example, say that you want to create a text block that’s limited to ten characters or fewer. Create the text block, add a custom property named length, set it to 10, and then retrieve the value via PDFlib at runtime. Your code can verify the length of a string before filling the block and react accordingly, perhaps truncating the string or asking the user to provide a new value.
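The length check described above might be sketched as follows. This is only an illustration: the limit is passed in as a plain integer, on the assumption that your code has already read the custom length property from the block through PDFlib, and the function name fit_to_block_length() is hypothetical rather than part of PDFlib.

```php
<?php
// Hypothetical helper: enforce a character limit taken from a block's
// custom "length" property. Reading the property itself is left out;
// $limit is assumed to have been retrieved from the block already.
function fit_to_block_length($text, $limit)
{
    $limit = (int) $limit;
    if ($limit <= 0) {
        // No usable limit was defined, so leave the text untouched.
        return $text;
    }
    // strlen() counts bytes, which matches characters for single-byte
    // encodings such as winansi.
    if (strlen($text) <= $limit) {
        return $text;
    }
    // Too long: truncate. An application could just as well reject the
    // value here and ask the user for a shorter one.
    return substr($text, 0, $limit);
}

$name = fit_to_block_length("Bartholomew Higgins", 10);
// $name can now be handed to PDF_fill_textblock() for that block.
```

Truncation is the simplest reaction; whether it is the right one depends on the data, which is why the limit lives in the template rather than in the code.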

The PDFlib Blocks Menu

To make setting up blocks easier, the “PDFlib Blocks” menu has a few handy tools. You can export and import blocks to re-use complex blocks, you can align elements, and more.

FIGURE 7

FIGURE 8

FIGURE 9


Exporting

The “Export” feature is a huge timesaver when dealing with multiple PDFs that require the same types of blocks. Once you’ve finished setting up blocks in a single “master” PDF, you can export those blocks and then import them over and over again into other PDFs. There are several different settings in the “Export” dialog (see Figure 7):

• You can export blocks from all pages of the PDF or from a subset of them.

• You can export blocks to a new PDF or to an existing PDF. Selecting “New File on Disk” creates a blank PDF with the blocks set in the new file. If you want to export blocks to a document that you already have opened in Adobe Acrobat, select “Open Document” and click “Choose” to see a list of all open documents. If you choose “Replace Existing Files”, the block tool will overwrite the target file with blank pages with the blocks in the proper place.

• The next option is “Export Which Blocks?” This section allows you to control which blocks are exported. You can export all blocks (depending on the number of pages you choose in the first section) or just the blocks that you highlight before exporting. You can also choose to delete the blocks that exist on the target PDF.

Importing

You can import blocks from another PDF using the import option in the “PDFlib Blocks” menu. When you choose “Import,” you will be presented with a screen to choose the file that contains the blocks you want to import (Figure 8).

After you choose the appropriate file, you can determine which pages the blocks should be applied to.

Alignment Options

The alignment option in the “PDFlib Blocks” menu allows you to align two blocks.

To align, choose a block. It should turn pink, reflecting that it’s your primary choice. Then choose another block; it should turn blue, indicating that it’s your secondary choice. When you select “Align,” the blue block should align with the pink block. Figure 9 shows two blocks: Block_1, the secondary block, is left-aligned to the primary block, Block_0.

The “Size” alignment option only works when more than one block is selected. You can change all secondary blocks (blue) to be either the same width or height as the primary block (pink).

The “Center” alignment option aligns all selected blocks horizontally, vertically, or both.

Defining Blocks and Detecting Settings

Two other time savers are available in the “PDFlib Blocks” menu: one creates a block from a placed object like an image, and another creates blocks that automatically detect the font settings and color of the text that the block is being created over.

Click on “Click Object to Define Block” and then click on an object such as an image to create a block of the same dimensions in the exact same position.

Or, if you click on “Detect Underlying Font and Color” before you create a block, the block’s font settings are automatically set to match the style and size of the text below the new block. This feature is especially useful when dealing with a lot of text and specific colors. (You may have to adjust the font name to match a font located on the server running PDFlib.)

Using Blocks

As you might imagine, working with blocks from within your code makes placing text, images, and PDFs into a dynamic PDF far simpler than writing code to control the pointer, stroke text line-by-line, and so on. With blocks, formatting is separated from your code, leaving all of the aesthetics to the designer creating the PDF. Better yet, a change to the design of the page doesn’t necessarily require tweaking your code.

Setting up the dynamic PDF document is similar to what’s been shown in prior chapters, except you need to pull in the PDF that contains the blocks. First, specify the basic information:

PDF_set_info($p, "Creator", "block_tool.php");
PDF_set_info($p, "Author", "Ron Goff");
PDF_set_info($p, "Title", "Block Tool");

Next, open the PDF that contains the blocks, load it into memory, and create a new blank page:

$block_file = "block_file.pdf";
$blockcontainer = PDF_open_pdi($p, $block_file, "", 0);
// Standard US Letter page, 8.5 x 11 inches (612 x 792 points)
PDF_begin_page_ext($p, 612, 792, "");

Continuing, call up the actual page that you want to use. In the line of code below, the 1 (numeral one) refers to page one of the PDF that contains the blocks.

$page = PDF_open_pdi_page($p, $blockcontainer, 1, "");

If you want to use another page from the “template” PDF, just specify that page number instead of 1.

Finally, the page with blocks is “copied” to the new page in the new PDF.

PDF_fit_pdi_page($p, $page, 0.0, 0.0, "adjustpage");

The adjustpage option adjusts the size of the new page to match the page size of the template PDF. adjustpage overrides any page settings that have been set previously.

From here, you are ready to use the blocks.

Text Blocks

Whether working with a line of text or a text flow, text is easy to fill in: just specify the name of the block and the text to render and call PDF_fill_textblock().

$block = "Block_1";
$text = "All the pie in the sky wasn't enough to fill my plate";
PDF_fill_textblock($p, $page, $block, $text, "encoding=winansi");

The block name, here Block_1, is the name that was assigned to the block when it was created in the template PDF. (Block names are unique and the default name is Block_#, but a block name can be any string of alphanumeric characters.)

Notice that there are no extra formatting options. Whatever text you “insert” assumes the formatting of the block.

If you want to override a block’s formatting, you can. Where encoding=winansi appears, add the options that you want to override. For example, to override the font size, specify encoding=winansi fontsize=12.

You should also enable font embedding as needed. You can do so by adding embedding=true, as in encoding=winansi embedding=true.
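When several overrides are combined, assembling the option string by hand gets error-prone. A minimal sketch of one way to build it from an array is shown here; build_optlist() is a hypothetical helper, not a PDFlib function, and it assumes simple values without spaces (values that contain spaces need extra quoting that this sketch does not attempt).

```php
<?php
// Hypothetical helper: turn an array of overrides into an option
// string such as "encoding=winansi fontsize=12".
function build_optlist(array $options)
{
    $parts = array();
    foreach ($options as $name => $value) {
        if (is_bool($value)) {
            // Booleans are written out as the words true/false.
            $value = $value ? 'true' : 'false';
        }
        $parts[] = $name . '=' . $value;
    }
    return implode(' ', $parts);
}

$optlist = build_optlist(array(
    'encoding'  => 'winansi',
    'fontsize'  => 12,
    'embedding' => true,
));
// $optlist now holds "encoding=winansi fontsize=12 embedding=true" and
// can be used as the last argument to PDF_fill_textblock().
```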

Image Blocks

The process of placing an image in an image block resembles that of placing the image “manually”: the image is loaded and then placed.

$block4 = "Block_4";
$image_load = "image.jpg";
$image = PDF_load_image($p, "auto", $image_load, "");
PDF_fill_imageblock($p, $page, $block4, $image, "");
PDF_close_image($p, $image);

In this example, the image image.jpg is placed in Block_4 using the function PDF_fill_imageblock().

Form Conversion

You may be familiar with the Adobe Acrobat “Form Tool,” a great way to create fillable areas of your PDF. So, why not just use forms to define variable data placement? Because the form tool is limited: it cannot specify advanced font settings, whereas the block tool has been designed specifically to customize all aspects of your text. However, if you have a PDF that used the form tool to define areas for text, there is an option within the “PDFlib Blocks” menu to convert your pre-made forms into blocks (Figure 5.4).

RON GOFF is the technical director/senior programmer for Conveyor Group (www.conveyorgroup.com), a Southern California-based web development firm. He is the author of several articles for PHP|Architect magazine and other online publications. Ron lives in California with his wife Nadia and 2 children. You can contact him at ron@conveyorgroup.com.

Closing the Page

After you’ve filled all of the appropriate blocks on the open page, you must close that page.

PDF_close_pdi_page($p, $page);

This line closes the imported template page (not the whole document). After it is called, you can start a new page or end the entire document.

Putting It All Together

A complete example using the PDF_fill_textblock() function can be seen in Listing 1.

The PDFlib block tool is easy to use and provides for complex layouts without extensive programming. Using blocks, a designer can assign where dynamic text, images, and even PDFs are to be placed, yielding a much more professional result.
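The individual steps covered in this article can be strung together roughly as follows. Treat this as a sketch rather than the article's listing: it assumes the procedural PDFlib 6 binding with a PPS licence is installed, the helper name render_block_template() is hypothetical, and error handling is reduced to returning false.

```php
<?php
// Sketch only: fill named text blocks on page 1 of a template PDF and
// return the finished document as a string (or false on failure).
function render_block_template($templateFile, array $texts)
{
    $p = PDF_new();

    // An empty file name asks PDFlib to build the document in memory.
    if (PDF_begin_document($p, "", "") == 0) {
        PDF_delete($p);
        return false;
    }
    PDF_set_info($p, "Creator", "block_tool.php");

    // Open the template that contains the blocks.
    $doc = PDF_open_pdi($p, $templateFile, "", 0);
    if ($doc == 0) {
        PDF_delete($p);
        return false;
    }

    // Start an output page and copy template page 1 onto it;
    // adjustpage resizes the output page to match the template.
    PDF_begin_page_ext($p, 612, 792, "");
    $page = PDF_open_pdi_page($p, $doc, 1, "");
    PDF_fit_pdi_page($p, $page, 0.0, 0.0, "adjustpage");

    // $texts maps block names (e.g. "Block_1") to the text to insert.
    foreach ($texts as $block => $text) {
        PDF_fill_textblock($p, $page, $block, $text, "encoding=winansi");
    }

    PDF_close_pdi_page($p, $page);
    PDF_end_page_ext($p, "");
    PDF_close_pdi($p, $doc);
    PDF_end_document($p, "");

    $pdf = PDF_get_buffer($p);
    PDF_delete($p);
    return $pdf;
}
```

A caller would pass something like render_block_template("block_file.pdf", array("Block_1" => "Hello")) and, if the result is not false, send it to the browser with a Content-type: application/pdf header.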

Listing 1 (excerpt)

PDF_set_info($p, "Creator", "block_tool.php");
PDF_set_info($p, "Author", "Ron Goff");
PDF_set_info($p, "Title", "Block Tool");
...
header("Content-type: application/pdf");
header("Content-Length: $len");
header("Content-Disposition: inline; "

FPDI in Detail

FEATURE

by JAN SLABON

TO DISCUSS THIS ARTICLE VISIT:

http://forum.phparch.com/279

PDF documents, or better stated, the PDF format, have reached widespread popularity over the past few years, and this momentum continues. A very strong example of this is a recent ISO standard, which is based on PDF 1.4 and defines a PDF derivative for the long-term preservation of electronic documents. PDF has become a real standard!

In fact, the dynamic generation of PDF documents is an important issue today, and will continue to be so in the future. While it’s quite simple to build PDF documents on desktop PCs, their dynamic generation on a webserver, especially when using a language like PHP, can prove very difficult.

On the Internet, you’ll find several PDF APIs that

will allow you to create PDF documents with PHP Some

“FPDF” stands for “Free.”

Most PHP developers know about the ability to create PDF documents on the fly. When looking at the wide range of PHP classes or APIs, every product has its own advantages and disadvantages: some of them are very expensive and others are free, but don’t offer the same functionality as the expensive ones. The main difference between the free and commercial libraries is the ability to use external documents. PDFlib has supported this through its PDI interface, but the free classes didn’t support external documents until I released FPDI for FPDF, which gives you the same muscle, but for free!

When I was working with FPDF, I was often challenged with a situation where I had to rebuild a whole document programmatically. As you can imagine, this part was very frustrating, tedious, and time consuming. A digital version of your document is sitting right in front of you, and you just cannot use it.

Similarly, I ran into additional problems when dealing with vector based graphics and FPDF. There was no real way to import such things, except by converting them to bitmaps and using the Image() method of FPDF. I’m sure I don’t have to explain the drawbacks to this workaround.

When I found an article in php|architect (Vol 3, Issue 5) where Marco Tabini described how to parse a PDF and update it with some simple content, I got the idea to implement this technique into FPDF, which resulted in a library which was also named with 4 simple chars: FPDI (Free PDF Import).

I released my new library under the Apache Software License 2.0, which allows you to use it in your commercial or non-commercial projects. The project homepage can be found at http://fpdi.setasign.de. The article by Marco is freely available as a monthly sample, at http://www.phparch.com/issuedata/articles/article_110.pdf

In this article, I’ll introduce you to FPDI, explain how it was born, and cover its internal workings. I will assume that you have some knowledge of FPDF, and have a bit of experience with the Portable Document Format itself. If not, just download FPDF, and run the tutorials that Olivier provided in the package. This article will not tell you how to use FPDF, but will delve deeper into the details of the PDF structure and how FPDI extends FPDF, bringing out the ability to import single pages of existing PDF documents, not just modifying existing documents. This feature is not that clear to most people out there.

At this point I could tell you much about the structure of a PDF document, but as I already mentioned, the whole idea is based on another article, where everything you need to know about parsing a PDF is already described. I will cover some details about that issue later in this article.

I want to make it clear why I chose the “import single pages” method, instead of “really modifying/updating” a PDF. To put it simply: “It is much easier.” You can look at a PDF document as a collection of single objects which are linked to each other. Pages, images, font descriptions, and document information are all single objects and can be identified by a unique ID.

The PDF format is more flexible than just assigning objects by simple IDs, though: it allows one to define named relations. For example, these relations can be used to put an image into a content stream of a PDF page. You have to set up a resource dictionary, where you define the name of the image and its real object relation. After this, you can simply refer to the image by using the name you provided in the content stream. As FPDF, like any other PDF generator, uses named relations, which lead to naming conventions, you have to pay attention when updating a PDF.

If you’ve read Marco’s article, you’ll remember that there’s a part in it where he searches for the next available font name. This check has to be built into FPDF before every piece of code where FPDF creates a named relation.

Another disadvantage of updating documents is that you cannot remove single pages, or reuse an existing page in an easy way. The import method, however, allows us to reuse, resize, crop or rotate a page. We can also avoid naming conventions, because every imported page has its own kind of namespace in the new document, as you’ll see below.

The Basics

While I was studying the PDF reference to find a good solution for importing pages, I came across a technique with the spooky name of “form XObjects”. I’m sure that everyone who stumbles upon this term thinks about conventional “forms” like those that we use in HTML, or on paper. In this case, “form” has another meaning: it corresponds to the notation of forms in the PostScript language.

A form XObject can be compared with a kind of layer. It is a self-contained description of any sequence of graphics objects; its whole structure is almost identical to the structure of a single page in a PDF document. The form XObject has its own resource dictionary, where named relations are defined. So, it seemed to be the perfect solution for my problem: if I could create form XObjects, I most certainly would be able to convert pages into them.

But form XObjects have more advantages than simply preparing FPDF for PDF import. For example, they can be reused at any time in a PDF document, where the viewer application can cache the rendered results to optimize the execution. It sounded like a kind of template to me, so I began extending FPDF with this feature, which resulted in a PHP class called fpdf_tpl. This class redirects all output made by FPDF into containers which will be used as form XObjects, so one can reuse any output created with FPDF, at any time.

This class has more to offer than merely preparing FPDF for FPDI, as already stated. You can reuse a template multiple times in a document, whereas it only needs to be written once to the resulting document, which leads to less memory usage and processing time in your script.

templates, approximately 1.2 MB

I hope that the main advantage of fpdf_tpl is now clear Let’s skip ahead and take a deeper look at this class The class uses an array for holding all created templates named $this->tpls where each entry describes

a single template as an array with special keys The main entries in each template array are x, y, w, h and buffer All other entries are just used to save other information, and are prefixed with o_

A new property, with the name of $this->res is used

to assign resources like fonts, images, or other templates,

to the template or the page The assignment of resources

to single pages is left in for testing purposes, and will be removed in the next release of fpdf_tpl

30 static $content = null ;

31 $this -> SetFont ( ‘Arial’ , ’B’ , 10 );

32 $this -> SetFillColor ( 255 , 153 , );

33

34 $this -> Rect ( $this -> lMargin , 28 , $width =

35 $this -> w $this -> rMargin - $this -> lMargin , 3 , ‘F’ );

36 $this -> Rect ( $this -> lMargin , $this -> h 10 ,

48 $content = file_get_contents ( FILE );

49 $this -> SetFont ( ‘Courier’ , ’’ , );

50 $this -> MultiCell ( $width - 3 , 2.5 , $content );

51 }

52

53 // For debugging purpose

54 function pdf ( $orientation = ’P’ , $unit = ’mm’ , $format = ’A4’ )

55 {

56 $this -> _startTime = microtime ();

57 parent :: fpdf_tpl ( $orientation , $unit , $format );

58 }

59

60 // For debugging purpose

61 function Close () {

62 $this -> _endTime = microtime ();

63 $this -> _writingTime = true ;

67 list( $usec , $sec ) = explode ( “ “ , $this -> _startTime );

68 $start = ((float) $usec + (float) $sec );

69 list( $usec , $sec ) = explode ( “ “ , $this -> _endTime );

70 $end = ((float) $usec + (float) $sec );

71 $time = $end - $start ;

77 for( $n = 0 , $c = count ( $this -> pages ); $n < $c ; $n ++)

78 $buffersize += strlen ( $this -> pages [ $n ]);

79 for( $n = 0 , $c = count ( $this -> tpls ); $n < $c ; $n ++)

80 $buffersize += strlen ( $this -> tpls [ $n ][ ‘buffer’ ]);

Examples of its use are: the generation of headers and/or

footers, table headers which could be repeated on every

page, a background grid of large tables, text in front or

behind a template, etc

If you take a look at Listing 1 and Figure 1, you’ll see a sample script that demonstrates the use of templates. You turn templates on and off by setting the $pdf->useTPLs property to true or false; the visual result is the same. This demo has no real meaning, but it shows how much the file size and processing time decrease when you use templates. My tests gave me a processing time of only 0.0766 seconds when using templates, and 3.649 seconds without them! The same was true for the buffer size: with templates it takes up only 14.5 KB, without


So, we’ll only take a look at the tpl key in $this->res. This array is needed to rebuild the form XObject’s resource dictionary with the named relations that are used in the template. To redirect the output made by FPDF, I used a simple flag, $this->intpl, and extended the _out() method. I had to take special care because a form XObject cannot include internal or external links or, more precisely, any kind of annotation.
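The flag-based redirection can be sketched roughly as follows. This is a stand-in class and not the real fpdf_tpl code: it does not extend FPDF, and every name other than intpl, tpls and _out() is hypothetical. It is written for current PHP.

```php
<?php
// Minimal stand-in for the output handling described above.
class OutputSketch
{
    public bool $intpl = false;      // true while a template is being recorded
    public array $tpls = [];         // template entries, each with a 'buffer' key
    public string $pageBuffer = '';  // stand-in for the current page's content
    private int $current = -1;       // index of the template being recorded

    public function beginTemplate(): int
    {
        $this->tpls[] = ['buffer' => ''];
        $this->current = count($this->tpls) - 1;
        $this->intpl = true;
        return $this->current;
    }

    public function endTemplate(): void
    {
        $this->intpl = false;
    }

    // Every drawing operator passes through here, so a single flag check
    // is enough to divert it into the template buffer.
    public function _out(string $s): void
    {
        if ($this->intpl) {
            $this->tpls[$this->current]['buffer'] .= $s . "\n";
        } else {
            $this->pageBuffer .= $s . "\n";
        }
    }
}

$pdf = new OutputSketch();
$pdf->_out('0 0 m');              // written to the page
$id = $pdf->beginTemplate();
$pdf->_out('10 10 100 20 re f');  // written to the template
$pdf->endTemplate();
$pdf->_out('S');                  // written to the page again
```

Because all drawing methods funnel through one output method, none of them needs to know whether a template is currently open.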

FPDF uses a single, global resource dictionary for all pages and creates it within the _putresources() method. I extended this method to make it call _puttemplates(), which creates all necessary template objects. After the objects are created and written, the named relations to them are written to the main resource dictionary. All created templates are usable on every page! Unfortunately, using the global resource dictionary isn’t the best solution, because it introduces problems when interpreting or extracting pages of a document, as you will see later.

With the fpdf_tpl class, I’ve built the basis for FPDI. Now we have to convert the pages of an already existing PDF document, but we have to parse it first to get the desired information.

Parsing the Original Document

I owe a lot of credit to Marco’s article, because the parsing of an existing document was almost completely covered in it.

I adapted all parsing functions into a single class, pdf_parser, and added support for reading streams. Let’s take a quick look at the structure and how the parsing has to be done. The first task of the parser is to read the xref table of the PDF document. This is done by the pdf_parser::pdf_read_xref() method. The xref table is similar to a table of contents: it gives us information about the objects used in the document and their byte-offset positions in the file. At the end of the xref table, we’ll find the file trailer dictionary; its entries lead us to the catalogue dictionary of the file. The catalogue dictionary is the root of all objects in the document’s object hierarchy, and in it we’ll find the reference to the first page tree node of the document’s page tree, which is exactly what we’re searching for: all single pages used in the existing document.

The parser has to follow the whole page tree to get the exact page count and to collect other information on the pages. This is done by read_pages() in the extended class, fpdi_pdf_parser, and results in an array stored in the $this->pages property. The keys of $this->pages are the page numbers, starting at zero, and each entry holds the related page object. After this task is done, we have enough information about the source document for now.
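To make the "table of contents" idea concrete, here is a minimal sketch of reading a classic, uncompressed cross-reference section. parseXref() is a hypothetical helper written for current PHP, not the article's pdf_parser::pdf_read_xref(); it only collects the byte offsets of objects that are in use and stops at the trailer keyword.

```php
<?php
// Sketch: map object numbers to byte offsets from a classic xref section.
function parseXref(string $data, int $xrefPos): array
{
    $lines = preg_split('/\r\n|\r|\n/', substr($data, $xrefPos));
    if (trim((string) array_shift($lines)) !== 'xref') {
        throw new RuntimeException('No xref keyword at the given offset');
    }

    $offsets = [];
    while ($lines) {
        $line = trim((string) array_shift($lines));
        if ($line === 'trailer') {
            break; // the trailer dictionary follows
        }
        // Subsection header: first object number and number of entries.
        if (!preg_match('/^(\d+) (\d+)$/', $line, $m)) {
            continue;
        }
        $first = (int) $m[1];
        $count = (int) $m[2];
        for ($i = 0; $i < $count; $i++) {
            $entry = (string) array_shift($lines);
            // Entry: 10-digit offset, 5-digit generation, n (in use) or f (free).
            if (preg_match('/^(\d{10}) (\d{5}) ([nf])/', $entry, $e) && $e[3] === 'n') {
                $offsets[$first + $i] = (int) $e[1];
            }
        }
    }
    return $offsets;
}

$sample = "xref\n0 3\n"
    . "0000000000 65535 f \n"
    . "0000000017 00000 n \n"
    . "0000000081 00000 n \n"
    . "trailer\n<< /Size 3 /Root 1 0 R >>\n";
$offsets = parseXref($sample, 0);
```

With the offsets in hand, a parser can seek directly to any object, starting with the one the trailer's /Root entry points to.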

While I was implementing this code, I got stuck on some problems, and it took me several days (and nights) to fix them. A big problem for me was determining the line endings in a file. Normally, this task is handled by the PHP configuration directive


distance of the first to the third and the second to the fourth value. This bug has been overlooked for a long time, because it only manifests itself if the MediaBox’s x- or y-value is something other than 0. It’ll be fixed in the future!
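The corrected calculation can be sketched in a few lines. boxToTemplateSize() is a hypothetical helper name and the box values are sample numbers; the point is only that the last two MediaBox values are a second corner, not a width and a height.

```php
<?php
// A box is stored as two corner points: [x1, y1, x2, y2].
function boxToTemplateSize(array $box): array
{
    [$x1, $y1, $x2, $y2] = $box;
    return [
        'x' => $x1,
        'y' => $y1,
        'w' => abs($x2 - $x1), // distance between the first and third value
        'h' => abs($y2 - $y1), // distance between the second and fourth value
    ];
}

$plain  = boxToTemplateSize([0, 0, 612, 792]);
$offset = boxToTemplateSize([10, 20, 622, 812]);
```

Both boxes describe a page of the same size. Taking the third and fourth values directly as w and h would only be correct for the first one, which is why the bug shows up only when x or y is not 0.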

To resolve the MediaBox’s data, the extended parser for FPDI ships with a getPageBox() method. This method is needed because the MediaBox (or any other box) can also be a reference to another PDF object, or its value can be inherited from a parent node in the page tree. The method makes sure that the correct values are resolved. Currently, FPDI supports only PDFs that contain a MediaBox; the PDF specification defines other boxes as well, e.g. a CropBox or a TrimBox. If your PDF uses other boxes instead of a MediaBox, the results of FPDI might not be as expected. Also, if another box is used, you can ignore the bug described in the paragraph above.
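The two cases, a box given as an indirect reference and a box inherited from a parent node, can be sketched like this. The flat object table and the ['ref' => n] notation are simplifications assumed for the example; they are not the parser's real array structure, and findPageBox() is not the actual getPageBox() implementation.

```php
<?php
// Simplified object table: object number => value. ['ref' => n] stands
// for an indirect reference to object n.
$objects = [
    1 => ['Type' => 'Pages', 'MediaBox' => ['ref' => 4]],
    2 => ['Type' => 'Pages', 'Parent' => ['ref' => 1]],
    3 => ['Type' => 'Page',  'Parent' => ['ref' => 2]],
    4 => [0, 0, 612, 792],
];

function resolve(array $objects, $value)
{
    // Follow indirect references until a direct value is reached.
    while (is_array($value) && isset($value['ref'])) {
        $value = $objects[$value['ref']];
    }
    return $value;
}

function findPageBox(array $objects, array $node, string $boxName): ?array
{
    // Walk up the page tree until a node defines the box.
    while (true) {
        if (isset($node[$boxName])) {
            return resolve($objects, $node[$boxName]);
        }
        if (!isset($node['Parent'])) {
            return null; // not defined anywhere on the path to the root
        }
        $node = resolve($objects, $node['Parent']);
    }
}

$mediaBox = findPageBox($objects, $objects[3], 'MediaBox');
$cropBox  = findPageBox($objects, $objects[3], 'CropBox');
```

Here the page itself defines no MediaBox; it is found two levels up and is stored as a reference to a separate object, so both resolution steps are exercised.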

The next task is to fill the buffer of our template with the content stream of the imported page. There’s one important difference between a PDF page and a form XObject: a page can have multiple content streams, while a form XObject can only have one. Because of this, we have to concatenate all content streams of a page into one single stream. To do this, there’s a method called getPageContent() in the extended parser (fpdi_pdf_parser).

All of these resolved streams can be encoded with different filters. The most commonly used filter is FlateDecode, which can be decoded with the zlib functions if they are enabled in the PHP installation. I’ve also written two more decoders, for the LZWDecode and ASCII85Decode filters. With these three filters, FPDI should handle nearly all documents that have encoded page content streams; so far there have been no bug reports related to an absent filter. The decoding of the content streams is done by the rebuildContentStream() method in the extended parser class. After all streams are decoded, they can simply be concatenated into a single one and assigned to the buffer key in the desired template array.
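Decoding and joining can be sketched as below. This is not the rebuildContentStream() code: the function names and the ['filter' => ..., 'data' => ...] layout are assumptions for the example, only FlateDecode is handled (via gzuncompress() from the zlib extension), and the code targets current PHP.

```php
<?php
// Sketch: decode each content stream, then join them into one buffer.
function decodeStream(array $stream): string
{
    $filter = $stream['filter'] ?? null;
    if ($filter === null) {
        return $stream['data']; // stream is stored unencoded
    }
    if ($filter === 'FlateDecode') {
        if (!function_exists('gzuncompress')) {
            throw new RuntimeException('zlib support is not enabled');
        }
        $decoded = gzuncompress($stream['data']);
        if ($decoded === false) {
            throw new RuntimeException('Could not inflate stream');
        }
        return $decoded;
    }
    throw new RuntimeException('Unsupported filter: ' . $filter);
}

function joinContentStreams(array $streams): string
{
    // A newline between parts keeps operators from running together.
    return implode("\n", array_map('decodeStream', $streams));
}

$streams = [
    ['filter' => 'FlateDecode', 'data' => gzcompress('q 1 0 0 1 0 0 cm')],
    ['filter' => null,          'data' => 'BT /F1 12 Tf (Hi) Tj ET Q'],
];
$buffer = joinContentStreams($streams);
```

The separator matters because a page's content may be split between streams at any token boundary, so gluing the parts together without whitespace could merge two operators into one.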

The next step is to resolve the resources that are used in the content streams we want to import. These can be relations to images, fonts or other form XObjects. The resources are normally defined as named relations in the page dictionary, or in one of the parent nodes in the page tree. To resolve them, the extended parser offers a _getPageResources() method, which returns the desired resource data of the page. The method does not resolve the resource’s own data, only information such as its name and the objects it references in the original document. The real import of these resources

auto_detect_line_endings, but as a PDF file can have multiple updates by different programs (on different operating systems), the line endings can be mixed. To overcome this issue, I’ve written a wrapper for fgets() that serves as a fallback if fgets() returns incorrect data. This wrapper function also enables the class to be used with PHP versions older than 4.3, where auto_detect_line_endings was introduced.
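One way to read lines regardless of the mix of endings is sketched below. readLine() is a hypothetical stand-in, not FPDI's actual wrapper; it reads byte by byte, which is simple but slow, and is written for current PHP.

```php
<?php
// Sketch of a line reader that accepts \n, \r\n and a bare \r as line
// endings, whatever mix the file contains.
function readLine($handle)
{
    $line = '';
    while (($char = fgetc($handle)) !== false) {
        if ($char === "\n") {
            return $line;
        }
        if ($char === "\r") {
            // Swallow the \n of a \r\n pair; otherwise step back one byte.
            $next = fgetc($handle);
            if ($next !== false && $next !== "\n") {
                fseek($handle, -1, SEEK_CUR);
            }
            return $line;
        }
        $line .= $char;
    }
    return $line === '' ? false : $line; // false signals end of file
}

$handle = fopen('php://memory', 'w+b');
fwrite($handle, "%PDF-1.4\r\n1 0 obj\r<< >>\nendobj");
rewind($handle);

$lines = [];
while (($line = readLine($handle)) !== false) {
    $lines[] = $line;
}
fclose($handle);
```

The sample data mixes all three conventions in one stream, which is the situation an incrementally updated PDF can produce.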

To make FPDI compatible with PHP versions older than 4.3, I also created wrapper functions for strspn() and strcspn(), so FPDI should run with PHP 4.2+.

During my testing (with hundreds of PDF files), I found several minor bugs in the parsing process. Some are fixed, and some are so rare that they can be ignored for now.

Let’s Convert a Page to a Form XObject

First, we’ll take a deeper look at a page object found in $this->pages of a parser object. A PDF object is represented internally as an array with a specified structure, as Marco defined in his article. For demonstration purposes, we use the demonstration PDF shipped with FPDI:

$pdf =& new fpdi();
$pdf->setSourceFile('classes/pdfdoc.pdf');
echo "<pre>";
print_r($pdf->current_parser->pages[0]);

You can see the output in Listing 2. At first glance it seems very odd, but everything makes sense! Every entry at any level is built as an array with at least the keys 0 and 1, where 0 describes the type of the value in key 1. All other keys are used to define special attributes of that value. The types are defined as constants in pdf_parser.php. For example, the 0 key at the lowest level is 9, which is defined as a PDF object. This object’s value is a dictionary (5), in this case a page dictionary, with tokens that each have their own value types.
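A hand-written miniature of such a structure may help when reading the listing. The four type codes are the ones named in the text; the constant names, the sample values and the unwrap() helper are assumptions made for this sketch, which targets current PHP.

```php
<?php
// Type codes mentioned in the article; the constant names are assumed.
const TYPE_NUMERIC    = 1;
const TYPE_DICTIONARY = 5;
const TYPE_ARRAY      = 6;
const TYPE_OBJECT     = 9;

// Miniature of a parsed page object: key 0 holds the type, key 1 the value.
$page = [
    0 => TYPE_OBJECT,
    1 => [
        0 => TYPE_DICTIONARY,
        1 => [
            '/MediaBox' => [
                0 => TYPE_ARRAY,
                1 => [
                    [TYPE_NUMERIC, 0],
                    [TYPE_NUMERIC, 0],
                    [TYPE_NUMERIC, 612],
                    [TYPE_NUMERIC, 792],
                ],
            ],
        ],
    ],
];

// Strip the type wrappers and return plain PHP values.
function unwrap(array $node)
{
    [$type, $value] = $node;
    switch ($type) {
        case TYPE_OBJECT:
            return unwrap($value);
        case TYPE_DICTIONARY:
        case TYPE_ARRAY:
            return array_map('unwrap', $value);
        default:
            return $value;
    }
}

$plain = unwrap($page);
```

Seen this way, the nesting in the print_r() output is just the same two-key pattern repeated at every level.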

To import a page, FPDI offers a method called ImportPage(), which is close to the BeginTemplate() method of fpdf_tpl. As we’ve seen, the structure of a template entry in $this->tpls contains main entries like x, y, w, h and buffer.

If we take a closer look at Listing 2, we can see a relationship between these entries. /MediaBox is an array (6) of exactly 4 entries, whose value types are numeric (1). The first entry’s value is that of x, the second of y, the third of w and, not surprisingly, the last one of h. This is actually a bug in the current release of FPDI. The last 2 values are also coordinates. The real values for the width and the height have to be calculated by specifying the


A PDF cannot be compared to a file with a structural language like HTML.

<?php
define('FPDF_FONTPATH', 'classes/font/');
require_once('classes/fpdi.php');

$pdf =& new fpdi('L', 'pt');
// load the original document
$pagecount = $pdf->setSourceFile('pdfs/article_110.pdf');
// ...
    // use the imported page
    $size = $pdf->useTemplate($tplidx, $x, $y, 250);
    // draw a border around the used page
    $pdf->Rect($x, $y, $size['w'], $size['h'], 'D');

    // if it's the third page in a row do a
    // page break and reset the x- and y-values

Date posted: 24/01/2014, 14:20
