
Enterprise Integration with Ruby, The Pragmatic Programmers (2006)


DOCUMENT INFORMATION

Basic information

Title: Enterprise Integration with Ruby
Author: Maik Schmidt
Publisher: The Pragmatic Bookshelf
Subject: Enterprise Integration with Ruby
Type: Book
Year of publication: 2006
City: Raleigh, North Carolina
Pages: 311
File size: 2.42 MB


Prepared exclusively for Jacob Hochstetler


Beta Book

Agile publishing for agile developers

The book you’re reading is still under development. As an experiment, we’re releasing this copy well before we normally would. That way you’ll be able to get this content a couple of months before it’s available in finished form, and we’ll get feedback to make the book even better. The idea is that everyone wins!

Be warned. The book has not had a full technical edit, so it will contain errors. It has not been copyedited, so it will be full of typos. And there’s been no effort spent doing layout, so you’ll find bad page breaks, over-long lines, incorrect hyphenations, and all the other ugly things that you wouldn’t expect to see in a finished book. We can’t be held liable if you use this book to try to create a spiffy application and you somehow end up with a strangely shaped farm implement instead. Despite all this, we think you’ll enjoy it!

Throughout this process you’ll be able to download updated PDFs from http://books.pragprog.com/titles/fr_eir/reorder. When the book is finally ready, you’ll get the final version (and subsequent updates) from the same address. In the meantime, we’d appreciate you sending us your feedback on this book at http://books.pragprog.com/titles/fr_eir/errata.

Thank you for taking part in this experiment.

Dave Thomas


Enterprise Integration with Ruby

A Pragmatic Guide

Maik Schmidt

The Pragmatic Bookshelf

Raleigh, North Carolina Dallas, Texas


Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and The Pragmatic Programmers, LLC was aware of a trademark claim, the designations have been printed in initial capital letters or in all capitals. The Pragmatic Starter Kit, The Pragmatic Programmer, Pragmatic Programming, Pragmatic Bookshelf and the linking g device are trademarks of The Pragmatic Programmers, LLC.

Every precaution was taken in the preparation of this book. However, the publisher assumes no responsibility for errors or omissions, or for damages that may result from the use of information (including program listings) contained herein.

Our Pragmatic courses, workshops, and other products can help you and your team create better software and have more fun. For more information, as well as the latest Pragmatic titles, please visit us at

http://www.pragmaticprogrammer.com

Copyright © 2006 The Pragmatic Programmers LLC.

All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher.

Printed in the United States of America.

ISBN 0-9766940-6-9

Printed on acid-free paper with 85% recycled, 30% post-consumer content.

B1.2 printing, January 2006

Version: 2006-1-24

Contents

1 Introduction
1.1 What Is Enterprise Software?
1.2 What Is Enterprise Integration?
1.3 Why Ruby?
1.4 Who Should Read This Book?
1.5 PragBouquet
1.6 Acknowledgments

2 Databases
2.1 The Coupon Application
2.2 Database Interface (DBI)
2.3 Object-Relational Mappers
2.4 Lightweight Directory Access Protocol (LDAP)

3 Processing XML
3.1 A Short XML Reminder
3.2 Generating XML Documents
3.3 Processing XML Documents
3.4 Validating XML Documents
3.5 Are There Alternatives to XML?

4 Low Ceremony Distributed Applications
4.1 “I’d Rather Use a Socket”
4.2 Remote Procedure Calls Using HTTP

5 Distributed Applications with RPC
5.1 Another Day, Another Protocol
5.2 We Will Take No REST, Will We?
5.3 SOAP
5.4 CORBA, RMI, and Friends

6.1 Internationalization and Localization
6.2 Logging
6.3 Creating Daemons and Services
6.4 Build and Deployment Process
6.5 Project Automation with Rake
6.6 Testing Legacy Applications

There are two types of complex systems: those that have grown out of simpler systems and those that do not work.

Unknown

Chapter 1

Introduction

Have you ever worked for a big enterprise? Do you remember your expectations as you walked into work on that first day? Whistling as the sun shone brightly, you might have been thinking “It will be great to work for <company name here>. They will have a professional environment, where coffee is free, where every system has been specified accurately, implemented carefully, and tested thoroughly. Hmmmm, I wonder which database and programming language they use.”

After your fifth cup of free coffee (around 9:07) you came to realize that the real world looks completely different from your expectations. Typical enterprises use dozens, hundreds, and sometimes even thousands of applications, components, services, and databases. Many of them were custom-built in-house or by third parties, some were bought, others are based on Open Source projects, and the origin of a few (usually the most critical ones) is completely unknown. A lot of applications are very old, some are fairly new, and seemingly no two of them were written using the same tools. They run on heterogeneous operating systems and hardware, they use databases and messaging systems from various vendors, and they were written in different programming languages.

The reasons for this are manifold. You can find countless books explaining why the situation is so bad. You can even find books claiming that they help you to prevent such chaos. This book uses another approach. We will not help you to clean up this mess, but we will help you to deal with the problems pragmatically. Instead of complaining that valuable data is spread across different database schemas or across databases from several vendors, we will write code that integrates it. We will take it even a step further and write new applications which aggregate all your existing resources. It doesn’t matter if we have to use relational databases, LDAP repositories, XML files, or web services based on different protocol standards. We will blend data from multiple, disparate databases to create new business knowledge.

Along the way we’ll show you how to solve all the small day-to-day problems. These are the things that occur over and over again, especially when developing enterprise software. We will access relational databases such as Oracle and MySQL and we will work with LDAP repositories. We’ll show you how to do application logging, how to deploy your software, how to automate tedious and error-prone tasks, and how to survive in an international environment. Oh, and as you might have guessed already from the book’s title, we will use Ruby to accomplish all these things.

1.1 What Is Enterprise Software?

In Patterns of Enterprise Application Architecture [?], Martin Fowler writes: “Enterprise applications are about the display, manipulation, and storage of large amounts of often complex data and the support or automation of business processes with that data.”

That’s a concise but nevertheless abstract definition, because every non-trivial piece of software has to store, manipulate, and display data. Video games do nothing else (and modern video games also need huge amounts of data that often can get complex). The key point in the definition above is the second part: that the data in enterprise applications is used for business processes and not for rendering alien space ships.

Unsurprisingly, there are more differences between enterprise applications and other types of software. For example, enterprise applications are often created only for a small user group that is in close contact with the development team, implying the developers know their customers very well. In extreme cases programs are written for only a single person (special report generators for the CEO, for example).

Enterprise software demands a certain set of tools. Large amounts of data, complex or not, have to be stored somehow and somewhere. Often it is stored in relational databases, but it can also be in plain text files or LDAP repositories. In addition, modern enterprise software is often based on distributed architectures consisting of many small to mid-size components that perform specialized tasks and that are connected by some kind of middleware such as CORBA, RMI, SOAP, and XML-RPC.

Obviously, as an enterprise software developer you’re better off if you know how to deal with such technologies. You shouldn’t be troubled by the details of reading from a relational database or accessing an LDAP repository. Mastering skills such as these helps you to concentrate on the fun stuff: the application itself.

1.2 What Is Enterprise Integration?

Enterprise integration is a rather vague term and cannot be defined in a strict mathematical sense. Simply put, it happens whenever you use an existing enterprise resource to achieve some results. If you use an existing database or web service in your application, you’re performing enterprise integration. If you build a new component that is used by other pieces of your existing architecture, you’re doing enterprise integration, too.

Integration needn’t just happen inside a single enterprise. It’s possible, and not too unusual, that the software or data of two different enterprises has to be integrated. If you’re using a payment gateway to bill your customers, for example, you’re effectively integrating enterprise software.

You might ask yourself if every development activity in an enterprise environment is some kind of enterprise integration. There are a few exceptions. Enterprise integration does not happen when you build a completely new piece of software from scratch, for example. In reality this case is rare, but from a theoretical point of view this is the only clear exception.

Enterprise integration often means integration with standard software such as databases, LDAP repositories, message queues, ERM systems, and so on. If you’re using one of these technologies, chances are good that you’re doing some enterprise integration.

1.3 Why Ruby?

Most enterprise software running today was written in languages such as COBOL, C/C++, and Java. Because of its distributed nature, enterprise software often makes it easy to use new tools and programming languages. When you have to create a small standalone application (one that only relies upon an existing database, SOAP service, or LDAP repository) it almost doesn’t seem to matter whether you write it in C++, Java, or Ruby. But if you look into it more deeply, dynamic languages such as Perl, Python, and Ruby have many advantages, especially in enterprise environments:

• They are interpreted and do not need a compile phase, which increases development speed tremendously. After editing your program you can see the results of your changes immediately.

• Enterprise software is about munging data. Dynamic languages are designed to handle data, and include high-level data types such as hashes.

• Memory management is dealt with by the language. This is a great advantage over languages such as C++ where you have to specify the length of each string you read from a database. Dynamic languages prevent waste and result in more concise, more robust, and more secure software.

• Software written in dynamic languages is installed as source code, so you always know exactly which version is currently running on your production system. Gone are the days when you had to guess if a certain binary executable is the right one.
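To make the point about high-level data types a little more tangible, here is a minimal sketch in Ruby. The order records and field names are invented for illustration and are not taken from PragBouquet’s schema.

```ruby
# Hypothetical order records, e.g. as they might come back from a query.
orders = [
  { customer: 'homer', total: 12.50 },
  { customer: 'marge', total: 30.00 },
  { customer: 'homer', total: 7.50 }
]

# Sum the order totals per customer using a hash with a default value of 0.0.
totals = Hash.new(0.0)
orders.each { |order| totals[order[:customer]] += order[:total] }

totals.each { |customer, sum| puts "#{customer}: #{sum}" }
```

No memory has to be reserved and no types have to be declared; the hash simply grows as new customers appear.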

We will show you Ruby’s strengths and how it helps you to accomplish many tasks much faster, more elegantly, and with more fun than with any other programming language available today. But, even more important, we will also tell you about Ruby’s weaknesses. Ruby is comparatively young and although the core of the language is mature and lots of excellent libraries are available, many things are still missing or incomplete.

Although there is no industry standard for enterprise programming with Ruby (as there is with J2EE or .NET), everything you need is readily available. The most important libraries come with every Ruby distribution and the standard distribution has grown rapidly over the last years. All the other stuff can be found in public places such as RubyForge [1] or the Ruby Application Archive [2].

[1] http://www.rubyforge.org

[2] http://raa.ruby-lang.org


1.4 Who Should Read This Book?

This book was written for experienced enterprise developers who know Java, C#, or C++, but don’t know much Ruby (although you should probably have read Programming Ruby [?]). We assume you are familiar with relational databases and have at least an idea what LDAP is. Maybe you do not know RELAX NG, but you understand the concepts of XML and you know what well-formed, SAX2, and DOM mean.

You’ve probably used tools such as object-relational mappers. Maybe you’re familiar with Enterprise Java Beans (EJB), Java Data Objects (JDO) and so on. Maybe you’re fed up with editing configuration files instead of coding. You are looking for better ways to integrate the existing resources in your company and you are looking for better ways to quickly create new and fancy applications based on all the wonderful stuff you already have.

Depending on the tools you’ve used to build your architecture, different choices are available for the integration process. If you’re using message queues you have a lot of freedom and flexibility for integrating your services and software with others. The same holds true for all kinds of web service protocols. It’s slightly different with databases, because they usually do not offer interfaces as clean as message-based systems do. Sometimes you have to access tables directly, sometimes you have to use a set of stored procedures written in a proprietary database programming language.

In this book we do not talk about sophisticated messaging patterns. Instead, we cover the basics. We show you how to use databases, web services, XML files, and all the other legacy stuff you want to combine for building new applications.

1.5 PragBouquet

To make things more interesting and tangible we’ve founded an imaginary company called PragBouquet. It sells flowers from a web shop. Customers from all over the world can order flowers and send them to people living in the United States.

PragBouquet’s business demands a lot of components and services. It depends on several partners, too. Their current infrastructure is shown in Figure 1.1. Customers place orders in the web shop. The shop communicates with the central order system. Because PragBouquet has no billing system, the order system uses an external payment gateway to charge orders. In parallel the production system is informed of new orders and busy florists create wonderful bunches of flowers. Eventually, the floral goods are picked up by a parcel service and are delivered to the happy recipient.

Figure 1.1: PragBouquet Infrastructure

This is only a rough overview. We’ll show single components in more detail when necessary.

1.6 Acknowledgments

First, I’d like to thank Dave Thomas and Andy Hunt for giving me the opportunity to write this book for The Pragmatic Bookshelf. Working with them has been both an honor and a pleasure. I couldn’t imagine better or more professional working conditions.

It would be impossible to write a book about software for enterprise integration without the software itself. The following gentlemen kindly made their ingenious work public for free, and have always responded quickly and accurately to all my questions: Yukihiro “matz” Matsumoto, Will Drewry, arton (the author of Rjb), Sean Russel, Ian Macdonald, Takaaki Tateishi, Thomas Uehlinger, Jim Weirich, Nikolai Lugovoi, Daniel Berger, why the lucky stiff, Minero Aoki, Michael Neumann, Kubo Takehiro, Tomita Masahiro, Matt Mower, David Heinemeier Hansson, Hiroshi Nakamura, John W. Small, Takahashi Masayoshi, Gotou Yuuzou, Yoshida Masato, and Grant McLean.

Please, stand up with me and give my reviewers a round of applause: Frank Tewissen, Matthias “Matze” Klame, Uwe Simon, and Kaan Karaca did an awesome job! Without their corrections and suggestions this book wouldn’t be half as good.

A loud “Thank you very much!!!” goes to all the people who sent errata and suggestions during the beta book process: Lee Grey, Hoang Uong, Ola Bini, Ron Lusk, John Athayde, Blair Zajac, Jim Weirich, Pat Podenski, Gregory Brown, Lachlan Dowding, Sean, Eldon, Stuart Halloway, Raymond Brigleb, Ken Barker, Peter Morelli, Eric-Olivier Lamey, and Jim Kimball.

Perhaps there are authors who write books in isolation under a rock or on a lonesome island. Fortunately, I didn’t and got invaluable support from a lot of wonderful people. I am deeply grateful to my parents (this one’s for you), my sister Yvonne Janka (yet another book you won’t read?), my brother Andrè Schmidt (for relaxing shopping/running tours and even more relaxing evenings with “the boys”), Christian & Agnieszka Rattat (for being true friends when I needed them most), Frank Tewissen (for listening patiently and for advising carefully), Manu (for being “die Manu”! Heja BVB!), AleX Reinartz (I’m looking forward to the next decades), Bettina Hamidian & Corinna Lorscheid (for insightful talks and lots of fun), Katja Wevelsiep (let’s have a coffee tomorrow, OK?), Frank Möcke (for giving me the opportunity to publish texts in my mother tongue), Dr. Andreas Kötz (for your appreciation), and to the “Gleis drei” staff (for providing a perfect proof-reading environment).

Chapter 2

Databases

Database management systems are among the oldest and most widely used applications in information technology, and they are indispensable to enterprises. It’s nearly impossible to do some serious enterprise integration without touching some kind of database directly or indirectly. Various types exist (relational databases, object-oriented databases, directory services, XML databases, and hash databases such as Berkeley DB). They differ mainly in the way data is organized and accessed internally. Under the hood, though, they are all similar: data is stored in some kind of file system and is accessed through a special layer, often over a network. You can find one or more of the different types in every company, but relational databases are by far the most popular ones in use today.

Although it’s often tedious, repetitive, and error-prone work, accessing databases is, in principle, easy. You open a connection, create and execute some statements, read and process some data, and finally free all resources occupied. At least, that’s how the Gods Of Persistence wanted it to be. But real life in our sinful world looks different. Information and business logic is often spread across different schemas and databases. To make things even worse many companies use products from multiple vendors. This happens for various reasons: they want to prevent vendor lock-in, the company is the product of a corporate merger, different departments prefer different tools, and so on.

Unfortunately, PragBouquet is no exception. Its data is stored in both Oracle and MySQL databases. In this chapter we will show you not only how to directly manipulate different types of databases, but also how to access them using more advanced tools such as object-relational mappers and database abstraction layers.


2.1 The Coupon Application

PragBouquet’s business has been doing well, but business can always be better, can’t it? To boost sales, the marketing department wants to send a coupon to every customer who’s used the online store, but hasn’t used it in the last 6 months. People who have asked not to be e-mailed should not get an e-mail.

That does not sound too difficult. PragBouquet already has a mass mailing program that expects a CSV (Comma Separated Values) file containing e-mail addresses, customer names, and a text to be sent. The problem becomes selecting names and e-mail addresses of all customers who did not place an order in the last 6 months, filtering out those who do not want to be e-mailed, and writing the rest to the CSV file.
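As a rough idea of the last step, the following sketch writes recipients to CSV using the `csv` library that ships with current Ruby versions. The column order (e-mail, name, text), the sample recipients, and the coupon text are assumptions made for illustration; the format the mass mailing program really expects is not specified here.

```ruby
require 'csv'

# Hypothetical recipients; in the real application these come from the databases.
recipients = [
  { email: 'homer@example.com', name: 'Homer Simpson' },
  { email: 'seymour@example.com', name: 'Seymour Skinner' }
]
text = 'We miss you, so here is a coupon for your next order.'

# CSV.generate builds the file content as a string and quotes fields
# (such as the text containing a comma) where necessary.
csv_data = CSV.generate do |csv|
  recipients.each { |r| csv << [r[:email], r[:name], text] }
end

puts csv_data
```

Using `CSV.open('coupons.csv', 'w')` with the same block would write to a file instead of a string; the file name is again only an example.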

Instantly you’ve fired up your favorite text editor thinking that this is a great opportunity to strengthen your Ruby skills. Creating CSV files is a breeze and selecting some data sets from a database should not be a problem, either. So you ask your database administrator where you can find the information you need and he takes you down a peg or two. He tells you that for historical reasons (a euphemism for “Nobody knows why”) the information you need is spread across two databases. Customer data and order data are stored in an Oracle database, but the white list containing the e-mail addresses of all customers who want to receive e-mail from PragBouquet is stored in the web shop’s MySQL database. You scribble a bit on your notepad and realize that the system architecture has to look like Figure 2.1.

Exploring The Environment

You decide to start with the Oracle part. Before moving on you want to have a closer look at the structure of the order database. Your database administrator told you that the relevant tables are called customers and orders. He gave you plenty of Microsoft Word documents describing every single table in the order database. Despite this you have a look at the current state of affairs yourself using SQL*Plus, Oracle’s SQL shell:

C:\> sqlplus scott

SQL*Plus: Release 9.2.0.1.0 - Production on Sat Jun 4 16:00:04 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Enter password: XXXX
Connected to:
Personal Oracle9i Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production
SQL> describe customers
SQL> describe orders

Figure 2.1: Coupon Application Workflow

Why didn’t we use a standard product?

You might be asking yourself if it’s a good idea for PragBouquet to have created its own customer and order databases. Wouldn’t it be much easier to buy a solution off the shelf? Customer data is at the core of every enterprise and many processes rely upon it. It’s needed for billing, for statistics, for troubleshooting, and so on. Although many big companies offer software for customer relationship management, it’s never a bad idea to think about building your own customer database. No product will fit your needs better than your own and no product will ever be as flexible as yours.

No big surprises here. Obviously, customers are characterized mainly by their address data and we guess that the tables are connected using column customer_id in table orders.

Determine the Winners

If we’re going to use a Ruby program to extract information from an Oracle database, we’ll need a library that connects our code to the underlying Oracle API. There are currently three Ruby modules for Oracle:

• Oracle by Yoshida Masato

• Ruby/OCI8 by Kubo Takehiro

• Ruby9i by Jim Kain

Storing Addresses—A Plea from the Rest of the World

Even though addresses are critical for many purposes, their data representation is often performed carelessly and without foresight. In particular, aspects of internationalization are often forgotten, because designers and developers normally do not know a lot about the administrative characteristics of their neighbors.

For example, Germany is a federal country divided into 16 states, but to the Germans the different states do not mean a lot. They aren’t part of an address, they do not occur on envelopes, and you do not have to put them into a web form when ordering something from an internet shop. It’s not surprising that German customers get annoyed by web forms insisting on a state. When working in an international environment, it’s better to make the state optional.

There is no international standard for the representation of an address. In Germany, for example, a street address is street name followed by a blank followed by the house number. In Italy, there’s a comma between the street name and the house number. Other countries put the number before the name. It’s nearly impossible to automatically separate street names and house numbers afterwards, because house numbers can contain nearly arbitrary characters.

Another aspect of addresses that is forgotten surprisingly often in this context is that addresses represent geographical objects. Geographical objects have coordinates, locations that are becoming increasingly important as we move into a world using location-based services. If you want to offer location-based services to your customers some day you’ll have to determine the geographical position of their addresses. For many cities it’s possible to locate an object down to the individual house number.

Please, don’t misunderstand me: you should not try to come up with a solution that will work with every possible address format in the world (I think that would probably be impossible), but you should at least have a closer look at the countries you’re potentially working in.
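One way to read the advice in this sidebar is sketched below. The `Address` struct, its field names, the country codes, and the sample data are hypothetical; the German and Italian formats follow the description above, and the United States is used as an assumed example of a country that puts the number before the street name.

```ruby
# Hypothetical address record. Street name and house number are stored
# separately, and :state is optional (it is nil when not given).
Address = Struct.new(:street, :house_number, :city, :country, :state) do
  # Builds the street line according to the country's convention.
  def street_line
    case country
    when 'IT' then "#{street}, #{house_number}" # comma between name and number
    when 'US' then "#{house_number} #{street}"  # number before the name
    else "#{street} #{house_number}"            # e.g. Germany: name, blank, number
    end
  end
end

german = Address.new('Hauptstrasse', '12a', 'Koeln', 'DE')
puts german.street_line
puts german.state.inspect
```

Keeping the house number as a string, not an integer, leaves room for values such as "12a".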

The main difference between these libraries is their support (or lack thereof) for new data types. Gone are the days when you could only store small strings and numbers in your database. Nowadays you can store complete books or MP3 files in CLOB (Character Large Object) or BLOB (Binary Large Object) columns. Major versions of the Oracle Call Interface (OCI) also differ in other areas, such as security, performance, etc.

In this book we’ll use Kubo Takehiro’s Ruby/OCI8 driver: it’s actively maintained, runs on many platforms, and provides a lot of functionality. It comes in two flavors: a low-level and a high-level API. The low-level API directly reflects the Oracle C library and we will not show its usage, as the high-level API is probably more convenient to use.

Let’s dive into Ruby now and see how we can identify the customers who should get a coupon.

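require 'oci8'
connection = OCI8.new('maik', 'maik')
cursor = connection.exec(<<-SQL)
  select a.id, a.name, a.surname, a.email
    from customers a, orders b
   where a.id = b.customer_id
     and b.created < sysdate - 180
SQL
while row = cursor.fetch   # fetch returns nil after the last row
  puts row.join(', ')
end
cursor.close               # free the cursor on the database server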

Database programs look similar in every modern programming language. First we establish a database connection by calling the new( ) method of class OCI8 (connect( ) would have been a much better name, but for the moment we have to live with it). The new( ) method returns a connection object that can be used to communicate with the database server and to create other database objects, such as statements and cursors.

The SQL statement joins the tables customers and orders and returns only those customers whose last order is older than 180 days. The sub-select identifies the most current entry for each customer and makes sure that every customer is returned only once.

As you can see, SQL statements can be executed directly by calling the exec( ) method of an OCI8 connection. For SELECT statements, exec( ) returns a so-called cursor representing a result set on the database server. Clients can move through a result set by calling fetch( ) on the cursor object. After the last row has been read from the cursor, fetch( ) returns nil.

Eventually, we close our cursor to free valuable resources on the database server. Cursors are resources like file handles, and are in limited supply. If you’re a bad citizen and fail to free these resources, Oracle will raise an exception sooner or later.

Admittedly, our example is concise and expressive, but using Ruby’s iterators automatically leads to a more elegant solution with less explicit resource handling:

count = connection.exec(<<-SQL) { |row| puts row[3] }
  select a.id, a.name, a.surname, a.email
    from customers a, orders b
   where a.id = b.customer_id
     and b.created < sysdate - 180
SQL
puts "Found #{count} coupon recipients."
connection.logoff

seymour@example.com
Found 2 coupon recipients.

When exec( ) is called as an iterator (with a code block) it returns the number of rows selected. The code block automatically gets each row fetched as a parameter and you no longer have to close the cursor explicitly. Actually, you don’t even notice that you’re working with a cursor.
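How a block-taking method can hide the fetch loop and the cleanup is sketched below. This is not Ruby/OCI8’s implementation: `FakeCursor` and `each_row` are hypothetical names, and the cursor is an in-memory stand-in so that the example runs without a database.

```ruby
# FakeCursor is a hypothetical stand-in for a database cursor.
class FakeCursor
  attr_reader :closed

  def initialize(rows)
    @rows = rows.dup
    @closed = false
  end

  # Returns the next row, or nil after the last row has been read.
  def fetch
    @rows.shift
  end

  def close
    @closed = true
  end
end

# Yields every row to the block and returns the number of rows.
# The ensure clause closes the cursor even if the block raises.
def each_row(cursor)
  count = 0
  while (row = cursor.fetch)
    yield row
    count += 1
  end
  count
ensure
  cursor.close
end

cursor = FakeCursor.new([[1, 'Homer', 'Simpson', 'homer@example.com'],
                         [2, 'Seymour', 'Skinner', 'seymour@example.com']])
emails = []
count = each_row(cursor) { |row| emails << row[3] }
puts "Found #{count} coupon recipients."
```

The caller only supplies the block; opening the loop, counting, and closing the cursor all happen in one place.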

Enhancing Flexibility

OK, our first example works. We know where to get the data from and we know how to get it, so let’s turn our little script into software. First of all, we have to replace the constant 180 days with something more dynamic. To do this, we could create the string containing the SQL statement on the fly, substituting in the time value, but this approach has some serious drawbacks.

As we already know, the SQL statement gets transferred over the network to the database server whenever we call exec( ). Then it gets parsed, analyzed, optimized, executed, and eventually the result is sent back to the client.

Actually, modern database servers try to optimize a lot. Part of this process is the creation of a so-called query execution plan for every statement they receive. Current Oracle versions even try to compress the result sets before sending them back to the client to decrease bandwidth and processing time. For SQL statements that are executed often this means that we could gain a lot if the statement could be parsed, analyzed, and optimized only once.

Furthermore, building SQL statements on the fly often creates dangerous security holes. What if someone uses a web form to pass us the following string for the number of days?

'180; delete from customers; commit;'

In the worst case the database server will happily execute the malicious statement, giving you an excellent opportunity to check if your backup system is working properly. This common kind of attack is called SQL injection.
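The problem can be reproduced without any database at all. In the following sketch the helper names `unsafe_sql` and `checked_days` and the shortened query are made up for illustration; it shows how naive string interpolation lets the malicious input become part of the SQL text, and how insisting on an integer rejects it. Input validation of this kind complements placeholders but does not replace them.

```ruby
# Naive approach: the user-supplied value is pasted straight into the SQL text.
def unsafe_sql(days)
  "select id from orders where created < sysdate - #{days}"
end

# For a numeric parameter, insist on an integer before using it.
# Integer() raises ArgumentError for anything that is not a plain number.
def checked_days(input)
  Integer(input)
end

malicious = '180; delete from customers; commit;'

# The delete statement is now part of the generated SQL.
puts unsafe_sql(malicious)

begin
  checked_days(malicious)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```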

Fortunately, it is possible to circumvent all these disadvantages by using so-called prepared statements. We transmit a statement template to the server, where it is parsed, analyzed, and optimized. The server then sends back a statement handle. All the dynamic portions of our statement are replaced by placeholders. Whenever we want to execute our statement, we only send the server the handle and the actual values for our placeholders.

Customer = Struct.new(:id, :name, :surname, :email)

  def initialize(connection)
    @find_stmt = connection.parse(<<-SQL)
      select a.id, a.name, a.surname, a.email
        and b.created < sysdate - :days

First of all, we have inserted a placeholder (:days) into the SELECT statement. Then we create a prepared statement by calling parse(sql) on our connection. This method returns a handle identifying our statement on the server.

Calling bind_param( ) in line 17 binds the :days placeholder to its actual value, and in the following line we finally execute the SELECT statement @find_stmt is referring to. The rest is business as usual. Using the CustomerFinder looks like this:

finder = CustomerFinder.new(ora_connection)
customers = finder.find(180)
customers.each { |c| puts c.email }
ora_connection.logoff
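The parse-once, bind-and-execute-often flow described above can be tried without an Oracle server. The following is only a sketch under stated assumptions: StubCursor and StubOracleConnection are hypothetical stand-ins that mimic the parse, bind_param, exec, and fetch calls used in the text, and the SQL text (including its table names) is illustrative rather than the statement from the original listing.

```ruby
Customer = Struct.new(:id, :name, :surname, :email)

# Stand-in for the driver's statement handle so the sketch runs without Oracle.
class StubCursor
  attr_reader :binds

  def initialize(rows)
    @rows = rows
    @binds = {}
    @pending = []
  end

  def bind_param(key, value)
    @binds[key] = value
  end

  def exec
    @pending = @rows.dup
  end

  def fetch
    @pending.shift # nil once all rows are consumed
  end
end

# Stand-in for the connection; counts how often a statement is parsed.
class StubOracleConnection
  attr_reader :parse_count, :cursor

  def initialize(rows)
    @cursor = StubCursor.new(rows)
    @parse_count = 0
  end

  def parse(_sql)
    @parse_count += 1
    @cursor
  end
end

class CustomerFinder
  # Illustrative statement only; table and column names are assumptions.
  FIND_SQL = <<-SQL
    select a.id, a.name, a.surname, a.email
    from customers a, orders b
    where a.id = b.customer_id
    and b.created < sysdate - :days
  SQL

  def initialize(connection)
    @find_stmt = connection.parse(FIND_SQL) # parsed once, reused afterwards
  end

  def find(days)
    @find_stmt.bind_param(':days', days)
    @find_stmt.exec
    customers = []
    while (row = @find_stmt.fetch)
      customers << Customer.new(*row)
    end
    customers
  end
end

connection = StubOracleConnection.new([[1, 'Homer', 'Simpson', 'homer@example.com']])
finder = CustomerFinder.new(connection)
finder.find(180).each { |c| puts c.email }
finder.find(90)
puts connection.parse_count
```

However often find is called, the statement text reaches the connection only once; later calls send just the bound value.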


Respecting Customer Privacy

So far, so good. We can create a list of all customers that should potentially get a coupon, but we still have to sort out those who do not want to receive e-mails from PragBouquet. As we’ve already learned, this information is stored in the web shop’s MySQL database. There we can find a table called whitelist containing a list of all e-mail addresses that we are allowed to use.

MySQL, created by Monty Widenius, is one of the most popular Open Source databases at the moment. It started as a thin wrapper for the mSQL database and has grown over the years into a full-blown transactional database management system. MySQL support in Ruby was made possible by the great work of Tomita Masahiro. He has developed both a C library binding called MySQL/Ruby and a pure Ruby binding called Ruby/MySQL. Thanks to a patch written by Matt Mower, Ruby/MySQL now also works with MySQL version 4.1.1 and later.

In this book we’ll use the pure Ruby implementation (for no special reason). As with our order database, we first examine the webshop database using the MySQL shell:

C:\>mysql webshop
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3 to server version: 4.0.22-nt

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> describe whitelist;

+ -+ -+ -+ -+ -+ -+

+ -+ -+ -+ -+ -+ -+


require 'mysql'

connection = Mysql.new('localhost', '', '', 'webshop')
whitelist = connection.query('select * from whitelist')
whitelist.each_hash { |h| puts h['email'] }
connection.close

Here we have a textbook example of database use: create a connection, execute a query, print its result, and finally close the connection. What more could we say that hasn’t already been expressed in the code? All right, we have some details for you. Calling the query(sql) method returns an object of class Mysql::Result that represents a complete result set. You can read the single rows of a result set using various methods; here we chose each_hash( ). It returns a Hash for every row, where the column names are the hash keys with the data as the corresponding values.

Printing the whole whitelist was not exactly what we wanted. Instead, we have to check whether a certain email address is contained in the whitelist. That means we have to execute a statement such as

select count(*)

from whitelist

where email = 'email@example.com'

and see if it returns 1. Obviously, the email address in the where clause of our statement is variable, and from what we’ve learned in Section 2.1, Enhancing Flexibility, on page 15, you might assume it would be a good idea to use a prepared statement for this purpose. You are absolutely right: it would be a good idea, but unfortunately support for prepared statements in MySQL is a rather new feature. It was introduced in version 4.1 and the current Ruby drivers do not support it.


Obviously, num_rows( ) returns the number of rows in a result set (which is what we wanted to determine). In use, our Whitelist class looks as follows:

whitelist = Whitelist.new(connection)
puts whitelist.contains?('homer@example.com')
puts whitelist.contains?('unknown_address')
connection.close

produces:

true

false

We’ve created our SQL statement using strings. Does it make you feel comfortable? Although the coupon application is an internal project, the e-mail addresses come from an external source and so you should never trust them. In addition, it’s really wasteful to execute an SQL statement for every single e-mail address. So, we will trade some space for time and read all e-mail addresses into a Hash initially.

class Whitelist
  def initialize(connection)
    result = connection.query('select email from whitelist')
    result.each_hash { |h| @whitelist[h['email']] = true }

That’s a really good compromise. Even if we have to read several thousand e-mail addresses into memory, it’s still a low price for the performance and security we get.
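For readers who want to try the Hash-based approach, here is a self-contained sketch. StubResult and StubMysqlConnection are hypothetical stand-ins that only imitate query( ) and each_hash( ) as described in the text, so the example runs without a MySQL server; the body of contains? is an assumption about how the lookup might be written.

```ruby
# Stand-in for a result set: yields one Hash per row, like each_hash.
class StubResult
  def initialize(rows)
    @rows = rows
  end

  def each_hash(&block)
    @rows.each(&block)
  end
end

# Stand-in for the connection; counts queries so we can see the saving.
class StubMysqlConnection
  attr_reader :query_count

  def initialize(emails)
    @emails = emails
    @query_count = 0
  end

  def query(_sql)
    @query_count += 1
    StubResult.new(@emails.map { |email| { 'email' => email } })
  end
end

class Whitelist
  def initialize(connection)
    @whitelist = {}
    result = connection.query('select email from whitelist')
    result.each_hash { |h| @whitelist[h['email']] = true }
  end

  # A Hash lookup: no SQL is built from the untrusted address at all.
  def contains?(email)
    @whitelist.key?(email)
  end
end

connection = StubMysqlConnection.new(['homer@example.com'])
whitelist = Whitelist.new(connection)
puts whitelist.contains?('homer@example.com')
puts whitelist.contains?("x' or '1' = '1")
puts connection.query_count
```

Because the address is only ever used as a Hash key, even a hostile string is just a key that isn’t there, and the database is queried once no matter how many addresses we check.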

Joining Forces

We have everything available now to create the list of our lucky coupon recipients: we can read all potential customers from the Oracle order database and can look them up on the whitelist stored in the MySQL webshop database. Because the mailing program expects data as CSV, we reopen the Customer class and add an appropriate method (see Section 3.5, Comma-Separated Values (CSV), on page 129, to learn more about Ruby’s CSV library).

The following program then prints CSV data to the console so it can be easily redirected to the mass mailing program.

require 'whitelist'

# Read all potential customers
ora_connection = OCI8.new('maik', 'maik')
finder = CustomerFinder.new(ora_connection)
customers = finder.find(180)
ora_connection.logoff

# Sort out customers not in whitelist
mysql_connection = Mysql.new('localhost', '', '', 'webshop')
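The remaining steps, filtering by whitelist and emitting CSV, can be sketched in isolation. This is an illustration under assumptions: the method name to_csv and the sample customers are made up, a plain Hash stands in for the Whitelist object, and the CSV library shown is the one shipped with current Ruby versions.

```ruby
require 'csv'

Customer = Struct.new(:id, :name, :surname, :email)

# Reopen the class to add CSV output; the method name is an assumption.
class Customer
  def to_csv
    # generate_line takes care of quoting fields that contain commas.
    CSV.generate_line([id, name, surname, email])
  end
end

customers = [
  Customer.new(1, 'Homer', 'Simpson', 'homer@example.com'),
  Customer.new(2, 'Ned', 'Flanders', 'ned@example.com')
]

# Stands in for the Whitelist object in this sketch.
allowed = { 'homer@example.com' => true }

customers.select { |c| allowed.key?(c.email) }.each { |c| print c.to_csv }
```

Each line already ends with a newline, which is why the sketch uses print rather than puts.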

That’s it. We could happily move to the next project. But wouldn’t it be interesting to know how many customers actually convert their coupon? To do this, we have to store at least the customer ids of all coupon recipients somewhere. Let’s put them into the order database in a new table called coupon_recipients. This will let us check how many of the customers on this list placed an order after the coupon mailing.


create table coupon_recipients (
  customer_id int not null,
  created timestamp default sysdate
);

For the first time in this chapter we’re going to write data into the database. It’s nearly the same as reading information, but there are a few subtleties we have to take care of.

Here, we’ve used another form of bind variable, numbering them instead of naming them explicitly. It’s more or less a matter of taste whether you bind parameters by name or by number, but you have to be consistent. If you’ve used numbers as placeholders for the parameters in the SQL statement, you have to bind them by number later. That’s especially important for output parameters:

cursor = connection.parse("begin :now := sysdate; end;")
cursor.bind_param(':now', Time.mktime(1972, 9, 30), Date)
puts cursor[':now']

There’s something even more critical hidden in our Recipient class:

connection = OCI8.new('maik', 'maik')


See that we’ve enabled the auto-commit feature of the connection object on line 3. This makes sure that every SQL statement gets committed immediately, saving any changes to the database when the statement is executed. That’s what we’d normally expect to happen.

Oracle is a transactional database: you can group several SQL statements as if they were one. If any of the statements fails, all the statements will be ignored and the database content will not be changed. The current transaction can be committed by executing the COMMIT command or it can be rolled back by calling ROLLBACK. Setting autocommit to true is like calling COMMIT after every single SQL statement. Without it, nothing would ever get written to the database. You wouldn’t even notice it, because from the database’s point of view it’s not an error.
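If the commit rules feel abstract, a toy model may help. The class below is entirely hypothetical and lives in memory only; it is not a driver, and it assumes (as the text describes) that uncommitted work is simply discarded when the session ends.

```ruby
# Toy in-memory model of commit/rollback semantics; not a real driver.
class ToyConnection
  attr_accessor :autocommit
  attr_reader :committed

  def initialize
    @committed = []      # rows that survive the session
    @pending = []        # rows written in the current transaction
    @autocommit = false
  end

  def insert(row)
    @pending << row
    commit if autocommit # behaves like COMMIT after every statement
  end

  def commit
    @committed.concat(@pending)
    @pending.clear
  end

  def rollback
    @pending.clear
  end

  # Assumption for this sketch: uncommitted work is dropped at logoff.
  def logoff
    rollback
    @committed
  end
end

forgetful = ToyConnection.new
forgetful.insert(42)
p forgetful.logoff

careful = ToyConnection.new
careful.autocommit = true
careful.insert(42)
p careful.logoff
```

The first connection ends up with nothing stored and never sees an error, which is exactly the silent failure described above; the second one keeps its row.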

Our final version of the coupon application differs only slightly from our

# Read all potential customers
ora_connection = OCI8.new('maik', 'maik')

# Sort out customers not in whitelist
mysql_connection = Mysql.new('localhost', '', '', 'webshop')


set its autocommit feature to true. We also defer closing the connection until the end of the program, as it’s needed during the whole runtime.

The Fruits of Our Labor

Two weeks ago the coupons were sent to their lucky recipients. Today started like any other: you switched on your PC and went into the kitchen to get a (free) cup of coffee. As you came back to your desk to create yet more extraordinary code, one of the marketing guys was waiting for you. “You’re the techie that sent out the coupons two weeks ago, aren’t you?” he asks. Before you can say a word he proceeds: “Although we worked several weeks on the functional specification of the coupon application, we somehow forgot to define some statistics requirements. Now we’re afraid that we can’t find out how successful our marvelous and groundbreaking coupon idea was. Is there any way you could create some statistics, anyhow?”

Mostly, you’re surprised that something like a functional specification exists; it’s the first you’ve heard of it. But when you recover, you remember the coupon_recipients table and open an SQL*Plus shell:

SQL> select count(*) from coupon_recipients;

  COUNT(*)
----------
      3145

SQL> select count(*) from orders where customer_id in (
  2    select customer_id from coupon_recipients
  3  ) and created > sysdate - 14;

Turning around to the marketing guy you say: “29.16% of the coupon recipients placed an order during the last two weeks. Do you need anything else?” He is obviously impressed: “No, thank you very much! You did an awesome job and I wouldn’t be surprised if you get a corner office soon.” You lean back and take a sip of your coffee. It’s still hot.

Managing Database Resources

So far, our examples have been simple and we didn’t care about performance and optimization. But opening a new database connection is expensive and should not be performed unnecessarily. If you only need a single connection, databases can be represented as a singleton object. A singleton object is available everywhere in your program and can be created only once. Thanks to the Ruby standard library it’s a piece of cake to create a singleton encapsulating our OCI8 driver:

require 'oci8'
require 'singleton'

class Database
  include Singleton
  attr_reader :connection

  def initialize
    @connection = nil
  end

  def connect(usr, pwd, dbname = nil)
    @connection = OCI8.new(usr, pwd, dbname)
    @connection.autocommit = true
    @connection
  end

  def disconnect
    if !@connection.nil?
      @connection.logoff
      @connection = nil
    end
  end
end

Class Database makes a connection to our database available wherever we need it, and we get access to the one and only instance by calling Database.instance( ). At program start we have to call Database.instance.connect(usr, pwd) once, and from then on Database.instance.connection contains our connection.
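To watch the Singleton module from the Managing Database Resources sidebar at work without an Oracle installation, the sketch below swaps the driver for a hypothetical StubDriverConnection; everything except the use of Ruby’s standard singleton library is an assumption made for illustration.

```ruby
require 'singleton'

# Hypothetical stand-in for a driver connection.
class StubDriverConnection
  attr_reader :user, :open

  def initialize(user, _password)
    @user = user
    @open = true
  end

  def logoff
    @open = false
  end
end

class Database
  include Singleton
  attr_reader :connection

  # Connects on the first call; later calls reuse the same connection.
  def connect(usr, pwd)
    @connection ||= StubDriverConnection.new(usr, pwd)
  end

  def disconnect
    return if @connection.nil?

    @connection.logoff
    @connection = nil
  end
end

Database.instance.connect('maik', 'maik')
puts Database.instance.equal?(Database.instance)
puts Database.instance.connection.user

begin
  Database.new # Singleton makes new private
rescue NoMethodError => e
  puts "cannot instantiate directly: #{e.class}"
end

Database.instance.disconnect
```

Unlike the sidebar version, this connect method refuses to open a second connection when one already exists; whether that is desirable depends on how the program handles reconnects.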


2.2 Database Interface (DBI)

It’s a bit annoying that the information we needed for our coupon application is spread across two databases; it might be a good idea to change this situation someday. Anticipating this change, it might be advantageous to make our application more independent of the underlying drivers. As we’ve seen in the previous sections, accessing databases using native drivers in principle differs only slightly from vendor to vendor: you have to obtain a connection, create or prepare statements, execute statements, and eventually retrieve results. Technically, though, there are many subtle (and sometimes not so subtle) differences. Countless attempts have been made to standardize this interface. For example, on the Microsoft Windows platform there are ODBC, OLE DB, and ADO.NET, to name just a few. Java has its JDBC, and dynamic languages such as Perl, Python, and Ruby use an approach called DBI.

All database abstraction layers work in a similar fashion: they define an abstract interface to the database, and a concrete implementation, called a database driver, is implemented for each specific database. For the Ruby DBI library, these drivers are known as DBD modules. These drivers are accessed by your program through a standard interface, so you do not have to remember if the method to get a new connection was called new( ), connect( ), create_connection( ), or whatever. In DBI it’s called connect(driver_url, user=nil, auth=nil, params=nil) for every database supported and it always expects the same parameters in the same order.
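The idea of one connect call that picks a driver from a URL fits in a few lines. The following toy layer is not how Ruby DBI is implemented; MiniDBI, its register method, and the two stub drivers are hypothetical names used only to illustrate the lookup.

```ruby
# Toy abstraction layer illustrating driver lookup by URL.
module MiniDBI
  DRIVERS = {}

  def self.register(name, driver_class)
    DRIVERS[name.downcase] = driver_class
  end

  # Same signature for every database: the driver is chosen from the URL.
  def self.connect(driver_url, user = nil, auth = nil)
    scheme, driver_name, database = driver_url.split(':', 3)
    unless scheme.to_s.casecmp?('dbi') && driver_name
      raise ArgumentError, "bad URL: #{driver_url}"
    end

    driver = DRIVERS.fetch(driver_name.downcase) do
      raise ArgumentError, "no driver for #{driver_name}"
    end
    driver.new(database, user, auth)
  end
end

# Two stub drivers that share one interface.
class StubMysqlDriver
  attr_reader :database

  def initialize(database, _user, _auth)
    @database = database
  end

  def vendor
    'MySQL'
  end
end

class StubOracleDriver
  attr_reader :database

  def initialize(database, _user, _auth)
    @database = database
  end

  def vendor
    'Oracle'
  end
end

MiniDBI.register('Mysql', StubMysqlDriver)
MiniDBI.register('Oracle', StubOracleDriver)

handle = MiniDBI.connect('DBI:Mysql:webshop', '', '')
puts handle.vendor
puts handle.database
```

The calling code never names a driver class; switching databases means changing one string, at least as far as obtaining the connection is concerned.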

Compared to other database abstraction layers, DBI is extremely simple. To use it you only have to know two classes, DatabaseHandle and StatementHandle. A database handle represents a connection to the database, while a statement handle represents an active SQL statement. To examine whether we can benefit from using DBI in our PragBouquet application, we’ll change the Whitelist class to use it.

require 'dbi'

DBI.connect('DBI:Mysql:webshop', '', '') do |conn|
  conn.select_all('select * from whitelist') { |row| p row }
end

7 This list proves the old adage: the good thing about standards is that there’re so

many to choose from.


Because of the block syntax supported by the DBI methods, our demonstration program became extremely compact. In line 3, DBI.connect( ) returns a database handle that gets passed into the block. When the program reaches the end of the block, the connection is closed automatically. Within the block we call select_all( ), which executes a SELECT statement and calls a code block for every row that was returned. Again, we do not have to care about resource management: the statement will be released at the end of the block. The only thing left to do is to integrate the code into the Whitelist class.
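As a rough sketch of what such an integration might look like, the code below rebuilds the Hash-based Whitelist on top of a select_all-style call. StubDatabaseHandle and with_handle are hypothetical; the stub yields plain Hashes, on the assumption that rows can be read by column name, and with_handle merely imitates the block-style resource handling described above.

```ruby
# Hypothetical stand-in for a database handle; only select_all is modeled.
class StubDatabaseHandle
  attr_reader :connected

  def initialize(rows)
    @rows = rows
    @connected = true
  end

  def select_all(_sql)
    @rows.each { |row| yield row }
  end

  def disconnect
    @connected = false
  end
end

class Whitelist
  def initialize(connection)
    @whitelist = {}
    connection.select_all('select email from whitelist') do |row|
      @whitelist[row['email']] = true
    end
  end

  def contains?(email)
    @whitelist.key?(email)
  end
end

# Block-style helper: the handle is released even if the block raises.
def with_handle(handle)
  yield handle
ensure
  handle.disconnect
end

handle = StubDatabaseHandle.new([{ 'email' => 'homer@example.com' }])
whitelist = with_handle(handle) { |h| Whitelist.new(h) }
puts whitelist.contains?('homer@example.com')
puts whitelist.contains?('unknown_address')
puts handle.connected
```

Since all addresses are loaded up front, the handle can be released before the first lookup happens.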

We did not change the interface, and only the connection object has to be instantiated differently to use the Whitelist class:

whitelist = Whitelist.new(connection)
puts whitelist.contains?('homer@example.com')
connection.disconnect

Should we move the whitelist table from MySQL to our Oracle database, we only have to change the string “Mysql” to “Oracle” and the program will still work.

Encouraged by our success, we’ll change the Oracle stuff in our CustomerFinder class to use DBI too.

  def initialize(connection)
    @find_stmt = connection.prepare(<<-SQL)
      select a.id, a.name, a.surname, a.email
        and b.created < sysdate - :days


As with the previous example, we did not have to change a lot. Instead of calling parse( ) on our connection object in line 3, we have to call prepare( ) now. Similarly, exec( ) becomes execute( ) on line 18. We have to pass a DBI connection object now:

finder = CustomerFinder.new(connection)
customers = finder.find(180)
customers.each { |c| puts c.email }
connection.disconnect

Despite all this, the benefits of a database abstraction layer aren’t as big as you might think. It’s convenient to work with DBI when you have to access a database product that you haven’t worked with before, but you shouldn’t assume that you can easily replace your existing database with a completely different one only because you’re using an abstraction layer. Moving from one database to another is one of the most complicated things in developing enterprise software.

Because there are so many proprietary additions to SQL in every vendor’s implementation, writing portable statements is nearly impossible. Often such statements look quite harmless. For example, look at the statement starting on line 3 in our CustomerFinder class. It contains at least three potential problems:

• Not all databases support sub-selects.
• sysdate is specific to Oracle. In MySQL you’d have to use now( ) and in DB2 it’d be current timestamp.
• The syntax of arithmetic expressions for dates (such as sysdate-180) differs from vendor to vendor.
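One modest way to live with the date differences listed above is to keep the vendor-specific fragments in a single lookup table. The sketch below is an illustration only; the constant and method names are made up, and the three expressions are the usual spellings for "a number of days ago" in the respective dialects.

```ruby
# Vendor-specific expressions for "a given number of days ago".
# Integer() rejects anything that is not a whole number.
DAYS_AGO_SQL = {
  oracle: ->(days) { "sysdate - #{Integer(days)}" },
  mysql: ->(days) { "date_sub(now(), interval #{Integer(days)} day)" },
  db2: ->(days) { "current timestamp - #{Integer(days)} days" }
}.freeze

def days_ago_sql(vendor, days)
  DAYS_AGO_SQL.fetch(vendor).call(days)
end

%i[oracle mysql db2].each do |vendor|
  puts "#{vendor}: #{days_ago_sql(vendor, 180)}"
end
```

This keeps the differences in one place, but it does nothing about sub-selects or other dialect features; it merely shows how quickly "portable" SQL turns into a table of special cases.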


Sometimes the problems aren’t directly related to an SQL statement, but are caused by side effects such as auto-generated identifiers, which aren’t available in every database. To support such database-specific functions, the drivers used by DBI allow for some extensions, but if you want to write portable software, it’s certainly not a good idea to use them. For example, to read the last auto-generated identifier from a MySQL database, you call the last_insert_id( ) method. This method is not available for Oracle databases and it’s not easy to simulate the auto-generation feature in Oracle.

A last problem with DBI could be performance: the extra layers and the need to map features can decrease performance significantly. For example, accessing MySQL using the native driver is twice as fast as using the DBI layer.

There are much more important (and tricky) issues that might prevent you from easily changing your database. Consider, for example, C/C++ programs that contain Embedded SQL. Even if you’re lucky and have access to the source code of all programs running in your environment, it still will be a lot of work to adjust them all.

So, if you know up front that you have to support multiple databases, you can gain a lot by using an abstraction layer, but you have to plan for it carefully.

2.3 Object-Relational Mappers

A lot of people working in the software development department of PragBouquet have been thinking about reorganizing the current database landscape for quite a long time. The design of many databases has become a bit messy over the years and it’s a big problem that logic and data are spread across Oracle and MySQL databases. To save license costs, all the Oracle databases should be migrated to a MySQL database in the future and all new stuff should be implemented in the MySQL database right from the beginning.

The first thing that has to be added is an automatic management system for ordering flowers. Today flowers are ordered from a big wholesaler more or less manually by the buying department. The clerks get daily order reports and they can see how many flowers are still in stock. Then they do some simple calculations using a spreadsheet application and place new orders accordingly. It’s your task now to automate this process as far as possible, i.e., to create a database for the flowers in stock and to remove flowers from stock whenever a new bouquet leaves PragBouquet.

Generating Unique Ids

It’s really strange: mankind is talking about going to Mars, but creating artificial primary keys in databases is still a problem in the 21st century, because there’s no standard.

From a design point of view, there are a lot of advantages to creating an artificial unique (numeric) primary key for every table in the database, even if a natural primary key does exist. Numeric values only need a small amount of space and can be indexed efficiently.

Although there’s a need for unique ids in every database, all vendors come up with their own ideas and concepts to generate them. It’s easy to generate them more or less portably by creating a table containing only two columns:

create table sequences (
  value int default 1 not null,
  table_name varchar(64)
);

To create a sequence for our customers table, we insert a new row into the sequences table:

insert into sequences (table_name) values ('customers');

Generating a new sequence value is straightforward, then:

begin
  update sequences set value = value + 1
  where table_name = 'customers';
  select value from sequences
  where table_name = 'customers';
end;

Unfortunately, this solution is not particularly efficient, because it has to be executed in a transaction, which can slow things down a bit. Oh, and did I mention that not all databases support transactions?

Whenever your program relies upon auto-generated identifiers you should encapsulate this process carefully to prevent bad surprises when you have to migrate to another database.
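The advice from the Generating Unique Ids sidebar, to encapsulate id generation, might be sketched as follows. Everything here is hypothetical: SequenceTableIds models the sequences table with an in-memory Hash, a Mutex plays the role of the transaction, and CustomerStore is a made-up caller that only knows the next_id method.

```ruby
# In-memory model of the sequences-table approach.
class SequenceTableIds
  def initialize
    @values = Hash.new(1) # mirrors "value int default 1"
    @lock = Mutex.new     # plays the role of the transaction
  end

  # Equivalent of the update followed by the select.
  def next_id(table_name)
    @lock.synchronize { @values[table_name] += 1 }
  end
end

# A caller that depends only on next_id, not on how ids are produced.
class CustomerStore
  attr_reader :rows

  def initialize(id_source)
    @id_source = id_source
    @rows = {}
  end

  def create(name)
    id = @id_source.next_id('customers')
    @rows[id] = name
    id
  end
end

store = CustomerStore.new(SequenceTableIds.new)
puts store.create('Homer')
puts store.create('Marge')
```

If the database later offers sequences or auto-increment columns, only the object handed to CustomerStore has to change.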

Before opening your text editor you take a day off to think about the new database structure, and after 24 hours of constant thinking you finally have this revolutionary idea: we need a table that represents flowers:

create table flowers (
  id int unsigned not null auto_increment primary key,
  name varchar(64) not null,
  price double not null
);

That should be sufficient for a first version: flowers have a name, a price, and an artificial primary key that is created by the database automatically. The “only” thing left to do is mapping the flowers table to a Flower class and mapping all its columns to the corresponding attributes.

You have read Martin Fowler’s Patterns of Enterprise Application Architecture and you still remember his Active Record pattern and its definition:

“An object that wraps a row in a database table or view, encapsulates

the database access, and adds domain logic on that data.”

Before programming an Active Record for the flowers table, we first encapsulate access to the MySQL database in a singleton:

  def connect(host, usr, pwd, db = nil)
    @connection = Mysql.new(host, usr, pwd, db)

Fine, after calling Database.instance.connect( ) once, we can access the database connection by calling Database.instance.connection( ) from anywhere we want. So, let’s use it to create new flowers:

      insert into flowers (name, price)
      values ('#{name}', #{price.to_f})

  def initialize(id, name, price)
    @id, @name, @price = id, name, price

Virtually planting a rose looks like this:

rose = Flower.create('rose', 1.99)
puts rose

and produces:

A rose (1) costs $1.99.

The first version of the Flower class allows for creating new objects by calling create(name, price). This method inserts a new row into the database, reads the id that has been generated by MySQL, and returns a new Flower object. To make sure that no conflicts happen in the database because of duplicate id values, we have declared the initialize( ) method private. Hence, only methods of the Flower class can create new objects.
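Here is a runnable approximation of the create step, offered as a sketch rather than as the original listing. StubMysql is a hypothetical connection that hands out ids the way an auto_increment column would; the class-level connection accessor replaces the Database singleton for brevity; and the quote-doubling escape helper is an assumption added because the name goes straight into the SQL text. One Ruby detail worth knowing: initialize is always private, so the sketch uses private_class_method :new to get the effect the text describes.

```ruby
# Hypothetical stand-in for the MySQL connection.
class StubMysql
  attr_reader :statements

  def initialize
    @statements = []
    @last_id = 0
  end

  def query(sql)
    @statements << sql.strip
    @last_id += 1 if sql =~ /\A\s*insert/i # mimic auto_increment
    nil
  end

  def insert_id
    @last_id
  end
end

class Flower
  attr_reader :id, :name, :price

  class << self
    attr_accessor :connection # stands in for Database.instance.connection
  end

  def self.create(name, price)
    connection.query(<<-SQL)
      insert into flowers (name, price)
      values ('#{escape(name)}', #{price.to_f})
    SQL
    new(connection.insert_id, name, price)
  end

  # Doubles single quotes; a driver's own escaping routine is preferable.
  def self.escape(text)
    text.to_s.gsub("'", "''")
  end

  def initialize(id, name, price)
    @id, @name, @price = id, name, price
  end

  def to_s
    "A #{name} (#{id}) costs $#{price}."
  end

  # Only class methods such as create may build Flower objects.
  private_class_method :new, :escape
end

Flower.connection = StubMysql.new
rose = Flower.create('rose', 1.99)
puts rose
```

Calling Flower.new from outside the class now raises NoMethodError, so every Flower object carries an id that came from the database.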

For the sake of completeness we add the remaining methods needed to be fully CRUD compliant (CRUD stands for Create, Retrieve, Update, Delete):


Now we can retrieve, update, and delete Flower objects in the database:

It took less than an hour to create the Active Record and it works fine, but despite all this you still think that sometimes life isn’t fair: all your friends are hanging around at the beach having fun and you’re writing tons of boring SQL statements only to read and save Flower objects. Enough is enough, and hence you decide to look for a tool that will do all this tedious stuff for you.


Object-Relational Mappers for Ruby

Because of its dynamic nature, Ruby is a perfect language for creating tools like object-relational mappers: you can easily create classes and methods on the fly, and determining the structure of a database is not a big problem with most database systems, either.

Unsurprisingly, several projects have been initiated to implement an object-relational mapper∗, but ActiveRecord is by far the most popular and most advanced. It’s much more than a simple mapper: it’s fast, it supports nearly every database available, and it is constantly enhanced by a big community.

interesting one, for example.

ActiveRecord Basics

ActiveRecord is an enhanced implementation of Martin Fowler’s Active Record object-relational mapping pattern. ActiveRecord was created by David Heinemeier Hansson because he needed it for the famous Ruby on Rails project. ActiveRecord now supports nearly every database system currently in use (MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, and DB2).

Code always trumps prose, so instead of explaining academic persistence strategies, let’s start by telling ActiveRecord to connect to our database:

These statements load the ActiveRecord gem (see Section 6.4, RubyGems, on page 288 to learn more about RubyGems), then establish a connection to the webshop database running on localhost.


Now we have to map the flowers table to a Ruby class called Flower:

That’s it! All we had to do is derive our class from ActiveRecord::Base. Every instance of class Flower represents a single row of the flowers table. ActiveRecord derives the name of the database table by taking the class name, turning it into lowercase, and pluralizing it. For example, Flower becomes flowers and PragmaticProgrammer becomes pragmatic_programmers.
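The naming rule is easy to imitate for the two examples just given. The helper below is a deliberately naive sketch with a made-up name, table_name_for; ActiveRecord’s own inflection rules cover irregular plurals and many more cases than "append an s".

```ruby
# Naive sketch of the convention: CamelCase to snake_case, then add "s".
def table_name_for(class_name)
  underscored = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{underscored}s"
end

puts table_name_for('Flower')
puts table_name_for('PragmaticProgrammer')
```

A class such as Person shows the limits of the naive version right away, which is one reason why setting the table name explicitly is sometimes necessary.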

If necessary, you can also set the table name explicitly, either because the built-in pluralization rules don’t work for you or because you want to map to an existing table whose name doesn’t meet ActiveRecord’s expectations:

class LegacyTable < ActiveRecord::Base

set_table_name 'xy12aj'

end

All Flower objects automatically have accessors for all the columns of the flowers table, so there’ll be accessors named name( ) and price( ):

flower.name = 'primrose'
flower.price = 0.99

ActiveRecord stores all columns internally in a hash called attributes, but using this knowledge is dangerous, as it links us to ActiveRecord’s implementation. Instead, we should access column values using just the attributes. For example, we could add a to_s( ) method to our class:

    "A #{self.name} (#{self.id}) costs $#{self.price}."

In addition, ActiveRecord creates methods for reading, updating, and deleting rows in the database. To initialize the flowers table with some lovely plants, we can do the following:

].each do |name, price|
  flower = Flower.new(:name => name, :price => price)
  flower.save

Posted: 29/04/2014, 14:42