5. Deconstruct and move as far down the tree as possible lists of projection attributes, creating new projections where needed. This step draws on the properties of the projection operation given in equivalence rules 3, 8.a, 8.b, and 12.
6. Identify those subtrees whose operations can be pipelined, and execute them using pipelining.
In summary, the heuristics listed here reorder an initial query-tree representation in such a way that the operations that reduce the size of intermediate results are applied first; early selection reduces the number of tuples, and early projection reduces the number of attributes. The heuristic transformations also restructure the tree so that the system performs the most restrictive selection and join operations before other similar operations.
Heuristic optimization further maps the heuristically transformed query expression into alternative sequences of operations to produce a set of candidate evaluation plans. An evaluation plan includes not only the relational operations to be performed, but also the indices to be used, the order in which tuples are to be accessed, and the order in which the operations are to be performed. The access-plan-selection phase of a heuristic optimizer chooses the most efficient strategy for each operation.
14.4.4 Structure of Query Optimizers ∗∗
So far, we have described the two basic approaches to choosing an evaluation plan;
as noted, most practical query optimizers combine elements of both approaches. For example, certain query optimizers, such as the System R optimizer, do not consider all join orders, but rather restrict the search to particular kinds of join orders. The System R optimizer considers only those join orders where the right operand of each join is one of the initial relations r1, ..., rn. Such join orders are called left-deep join orders. Left-deep join orders are particularly convenient for pipelined evaluation, since the right operand is a stored relation, and thus only one input to each join is pipelined.
Figure 14.6 illustrates the difference between left-deep join trees and non-left-deep join trees. The time it takes to consider all left-deep join orders is O(n!), which is much less than the time to consider all join orders. With the use of dynamic-programming optimizations, the System R optimizer can find the best join order in time O(n2^n). Contrast this cost with the O(3^n) time required to find the best overall join order. The System R optimizer uses heuristics to push selections and projections down the query tree.
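The contrast between enumerating all left-deep orders and a System-R-style dynamic program over subsets can be sketched in a few lines of Python. The relation sizes and the fixed-selectivity cost model below are invented stand-ins for a real optimizer's statistics, not the book's method:

```python
from itertools import permutations

# Hypothetical relation sizes; the toy cost model charges each join the
# product of its input sizes and shrinks the result by a fixed selectivity.
sizes = {"r1": 1000, "r2": 200, "r3": 50}
SELECTIVITY = 0.01

def left_deep_cost(order):
    """Cost of one left-deep join order under the toy model."""
    result_size = sizes[order[0]]
    total = 0.0
    for rel in order[1:]:
        total += result_size * sizes[rel]              # cost of this join
        result_size = result_size * sizes[rel] * SELECTIVITY
    return total

# Naive search: all n! left-deep orders.
best_naive = min(permutations(sizes), key=left_deep_cost)

# Dynamic programming over subsets: the best plan for a set S extends the
# best plan for S - {r} by joining r last, giving O(n * 2^n) states.
def best_order(rels):
    rels = frozenset(rels)
    memo = {frozenset([r]): ((r,), 0.0, sizes[r]) for r in rels}

    def solve(s):
        if s in memo:
            return memo[s]
        best = None
        for r in s:
            order, cost, size = solve(s - {r})
            c = cost + size * sizes[r]
            sz = size * sizes[r] * SELECTIVITY
            if best is None or c < best[1]:
                best = (order + (r,), c, sz)
        memo[s] = best
        return best

    return solve(rels)[0]

print(best_order(sizes), left_deep_cost(best_order(sizes)))
```

Under this toy model the result size of a join depends only on the set of relations joined, so the optimal-substructure assumption behind the dynamic program holds and it finds the same cost as exhaustive enumeration.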
The cost estimate that we presented for scanning by secondary indices assumed that every tuple access results in an I/O operation. The estimate is likely to be accurate with small buffers; with large buffers, however, the page containing the tuple may already be in the buffer. Some optimizers incorporate a better cost-estimation technique for such scans: They take into account the probability that the page containing the tuple is in the buffer.
550 Chapter 14 Query Optimization
[Figure: two join trees over relations r1 through r5; panel (a) shows a left-deep join tree, panel (b) a non-left-deep join tree.]
Figure 14.6 Left-deep join trees
Query optimization approaches that integrate heuristic selection and the generation of alternative access plans have been adopted in several systems. The approach used in System R and in its successor, the Starburst project, is a hierarchical procedure based on the nested-block concept of SQL. The cost-based optimization techniques described here are used for each block of the query separately.
The heuristic approach in some versions of Oracle works roughly this way: For an n-way join, it considers n evaluation plans. Each plan uses a left-deep join order, starting with a different one of the n relations. The heuristic constructs the join order for each of the n evaluation plans by repeatedly selecting the "best" relation to join next, on the basis of a ranking of the available access paths. Either nested-loop or sort-merge join is chosen for each of the joins, depending on the available access paths. Finally, the heuristic chooses one of the n evaluation plans in a heuristic manner, based on minimizing the number of nested-loop joins that do not have an index available on the inner relation, and on the number of sort-merge joins.
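The n-plan heuristic just described can be caricatured in a short Python sketch. The size-based "ranking" and the toy cost function are illustrative assumptions, not Oracle's actual access-path ranking:

```python
# Hypothetical relation sizes; the "ranking" below (prefer the smallest
# relation next) stands in for a real ranking of access paths.
sizes = {"r1": 1000, "r2": 200, "r3": 50, "r4": 400}

def greedy_left_deep(start):
    """Build one left-deep order starting from a given relation."""
    order = [start]
    remaining = set(sizes) - {start}
    while remaining:
        nxt = min(remaining, key=lambda r: sizes[r])   # "best" next relation
        order.append(nxt)
        remaining.remove(nxt)
    return tuple(order)

def toy_cost(order):
    """Stand-in plan cost: each join costs the product of its input sizes."""
    size, cost = sizes[order[0]], 0
    for r in order[1:]:
        cost += size * sizes[r]
        size = max(1, int(size * sizes[r] * 0.01))     # assumed selectivity
    return cost

plans = [greedy_left_deep(r) for r in sizes]   # one plan per starting relation
best = min(plans, key=toy_cost)                # choose among the n plans
print(best)
```

Only n plans are ever constructed, so the heuristic examines far fewer alternatives than a full dynamic program, at the risk of missing the overall best order.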
The intricacies of SQL introduce a good deal of complexity into query optimizers. In particular, it is hard to translate nested subqueries in SQL into relational algebra. We briefly outline how to handle nested subqueries in Section 14.4.5. For compound SQL queries (using the ∪, ∩, or − operation), the optimizer processes each component separately, and combines the evaluation plans to form the overall evaluation plan.
Even with the use of heuristics, cost-based query optimization imposes a substantial overhead on query processing. However, the added cost of cost-based query optimization is usually more than offset by the saving at query-execution time, which is dominated by slow disk accesses. The difference in execution time between a good plan and a bad one may be huge, making query optimization essential. The achieved saving is magnified in those applications that run on a regular basis, where the query can be optimized once, and the selected query plan can be used on each run. Therefore, most commercial systems include relatively sophisticated optimizers. The bibliographical notes give references to descriptions of the query optimizers of actual database systems.
14.4.5 Optimizing Nested Subqueries ∗∗
SQL conceptually treats nested subqueries in the where clause as functions that take parameters and return either a single value or a set of values (possibly an empty set). The parameters are the variables from the outer-level query that are used in the nested subquery (these variables are called correlation variables). For instance, suppose we have the following query:
select customer-name
from borrower
where exists (select *
              from depositor
              where depositor.customer-name = borrower.customer-name)
Conceptually, the subquery can be viewed as a function that takes a parameter (here, borrower.customer-name) and returns the set of all depositors with the same name.
SQL evaluates the overall query (conceptually) by computing the Cartesian product of the relations in the outer from clause and then testing the predicates in the where clause for each tuple in the product. In the preceding example, the predicate tests whether the result of the subquery evaluation is empty.
This technique for evaluating a query with a nested subquery is called correlated evaluation. Correlated evaluation is not very efficient, since the subquery is separately evaluated for each tuple in the outer-level query. A large number of random disk I/O operations may result.
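A minimal sketch of correlated evaluation for the query above, over in-memory lists with invented sample names: the subquery behaves as a function of the correlation variable and is re-evaluated once per outer tuple.

```python
# Correlated evaluation of the EXISTS query, simulated in memory. Each
# borrower tuple triggers a fresh scan of depositor -- the per-tuple
# re-evaluation the text warns about.
borrower  = [("Adams",), ("Curry",), ("Hayes",)]
depositor = [("Adams",), ("Hayes",), ("Jones",)]

def exists_subquery(name):
    # conceptually a function of the correlation variable
    return any(d[0] == name for d in depositor)

result = [b[0] for b in borrower if exists_subquery(b[0])]
print(result)   # ['Adams', 'Hayes']
```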
SQL optimizers therefore attempt to transform nested subqueries into joins, where possible. Efficient join algorithms help avoid expensive random I/O. Where the transformation is not possible, the optimizer keeps the subqueries as separate expressions, optimizes them separately, and then evaluates them by correlated evaluation.
As an example of transforming a nested subquery into a join, the query in the preceding example can be rewritten as

select customer-name
from borrower, depositor
where depositor.customer-name = borrower.customer-name
(To properly reflect SQL semantics, the number of duplicate derivations should not change because of the rewriting; the rewritten query can be modified to ensure this property, as we will see shortly.)
In the example, the nested subquery was very simple. In general, it may not be possible to directly move the nested subquery relations into the from clause of the outer query. Instead, we create a temporary relation that contains the results of the nested query without the selections using correlation variables from the outer query, and join the temporary table with the outer-level query. For instance, a query of the form
select ...
from L1
where P1 and exists (select *
                     from L2
                     where P2)

where P2 is a conjunction of simpler predicates, can be rewritten as

create table t1 as
    select distinct V
    from L2
    where P2^1

select ...
from L1, t1
where P1 and P2^2

where P2^1 contains predicates in P2 without selections involving correlation variables, and P2^2 reintroduces the selections involving correlation variables (with relations referenced in the predicate appropriately renamed). Here, V contains all attributes that are used in selections with correlation variables in the nested subquery.
In our example, the original query would have been transformed to

create table t1 as
    select distinct customer-name
    from depositor

select customer-name
from borrower, t1
where t1.customer-name = borrower.customer-name
The earlier rewritten query (the join without a temporary relation) can be obtained by simplifying the above transformed query, assuming that the number of duplicates of each tuple does not matter.
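The duplicate-count caveat can be checked on small sample data (names invented): because the temporary relation is made duplicate-free with distinct, the join preserves exactly the multiplicities of the exists version.

```python
# Correlated evaluation vs. the decorrelated join, on multisets of tuples.
borrower  = [("Adams",), ("Curry",), ("Hayes",), ("Hayes",)]
depositor = [("Adams",), ("Adams",), ("Hayes",)]   # note duplicate Adams

# correlated evaluation: one subquery evaluation per borrower tuple
correlated = [b[0] for b in borrower
              if any(d[0] == b[0] for d in depositor)]

# decorrelated: t1 = select distinct customer-name from depositor,
# then join borrower with t1
t1 = set(d[0] for d in depositor)
decorrelated = [b[0] for b in borrower if b[0] in t1]

print(correlated == decorrelated)   # True: duplicates preserved correctly
```

Without the distinct, the duplicate Adams tuple in depositor would multiply the matching borrower tuples, changing the multiset result.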
The process of replacing a nested query by a query with a join (possibly with a temporary relation) is called decorrelation.
Decorrelation is more complicated when the nested subquery uses aggregation, or when the result of the nested subquery is used to test for equality, or when the condition linking the nested subquery to the outer query is not exists, and so on. We do not attempt to give algorithms for the general case, and instead refer you to relevant items in the bibliographical notes.
Optimization of complex nested subqueries is a difficult task, as you can infer from the above discussion, and many optimizers do only a limited amount of decorrelation. It is best to avoid using complex nested subqueries, where possible, since we cannot be sure that the query optimizer will succeed in converting them to a form that can be evaluated efficiently.
14.5 Materialized Views ∗∗
When a view is defined, normally the database stores only the query defining the view. In contrast, a materialized view is a view whose contents are computed and stored. Materialized views constitute redundant data, in that their contents can be inferred from the view definition and the rest of the database contents. However, it is much cheaper in many cases to read the contents of a materialized view than to compute the contents of the view by executing the query defining the view.
Materialized views are important for improving performance in some applications. Consider this view, which gives the total loan amount at each branch:

create view branch-total-loan(branch-name, total-loan) as
    select branch-name, sum(amount)
    from loan
    group by branch-name
Suppose the total loan amount at the branch is required frequently (before making a new loan, for example). Computing the view requires reading every loan tuple pertaining to the branch, and summing up the loan amounts, which can be time-consuming. In contrast, if the view definition of the total loan amount were materialized, the total loan amount could be found by looking up a single tuple in the materialized view.
14.5.1 View Maintenance
A problem with materialized views is that they must be kept up-to-date when the data used in the view definition changes. For instance, if the amount value of a loan is updated, the materialized view would become inconsistent with the underlying data, and must be updated. The task of keeping a materialized view up-to-date with the underlying data is known as view maintenance.
Views can be maintained by manually written code: That is, every piece of code that updates the amount value of a loan can be modified to also update the total loan amount for the corresponding branch.
Another option for maintaining materialized views is to define triggers on insert, delete, and update of each relation in the view definition. The triggers must modify the contents of the materialized view, to take into account the change that caused the trigger to fire. A simplistic way of doing so is to completely recompute the materialized view on every update.
A better option is to modify only the affected parts of the materialized view, which is known as incremental view maintenance. We describe how to perform incremental view maintenance in Section 14.5.2.
Modern database systems provide more direct support for incremental view maintenance. Database system programmers no longer need to define triggers for view maintenance. Instead, once a view is declared to be materialized, the database system computes the contents of the view, and incrementally updates the contents when the underlying data changes.
14.5.2 Incremental View Maintenance
To understand how to incrementally maintain materialized views, we start off by considering individual operations, and then see how to handle a complete expression.
The changes to a relation that can cause a materialized view to become out-of-date are inserts, deletes, and updates. To simplify our description, we replace updates to a tuple by deletion of the tuple followed by insertion of the updated tuple. Thus, we need to consider only inserts and deletes. The changes (inserts and deletes) to a relation or expression are referred to as its differential.
14.5.2.1 Join Operation
Consider the materialized view v = r ⋈ s. Suppose we modify r by inserting a set of tuples denoted by i_r. If the old value of r is denoted by r_old, its new value r_new is r_old ∪ i_r, and the new value of the view, v_new, is given by r_new ⋈ s. We can rewrite r_new ⋈ s as (r_old ∪ i_r) ⋈ s, which we can again rewrite as (r_old ⋈ s) ∪ (i_r ⋈ s). In other words,

v_new = v_old ∪ (i_r ⋈ s)

Thus, to update the materialized view v, we simply need to add the tuples i_r ⋈ s to the old contents of the materialized view. Inserts to s are handled in an exactly symmetric fashion.
Now suppose r is modified by deleting a set of tuples denoted by d_r. Using the same reasoning as above, we get

v_new = v_old − (d_r ⋈ s)

Deletes on s are handled in an exactly symmetric fashion.
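The join-maintenance rules can be sketched over Python sets (set semantics); the schemas r(A, B) and s(B, C) and the data are assumptions for illustration. Only the differential is joined with s, never all of r:

```python
# v = r ⋈ s on the common attribute B, maintained incrementally.
def join(r, s):
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}

r = {(1, 10), (2, 20)}
s = {(10, "x"), (20, "y"), (30, "z")}
v = join(r, s)                       # v_old

i_r = {(3, 30)}                      # tuples inserted into r
v |= join(i_r, s)                    # v_new = v_old ∪ (i_r ⋈ s)
r |= i_r

d_r = {(1, 10)}                      # tuples deleted from r
v -= join(d_r, s)                    # v_new = v_old − (d_r ⋈ s)
r -= d_r

assert v == join(r, s)               # matches recomputation from scratch
```

Under set semantics the deletion rule is exact, because every joined tuple contains its r-tuple in full, so d_r ⋈ s is precisely the set of derivations lost.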
14.5.2.2 Selection and Projection Operations
Consider a view v = σ_θ(r). If we modify r by inserting a set of tuples i_r, the new value of v can be computed as

v_new = v_old ∪ σ_θ(i_r)

Similarly, if r is modified by deleting a set of tuples d_r, the new value of v can be computed as

v_new = v_old − σ_θ(d_r)
Projection is a more difficult operation with which to deal. Consider a materialized view v = Π_A(r). Suppose the relation r is on the schema R = (A, B), and r contains two tuples (a, 2) and (a, 3). Then, Π_A(r) has a single tuple (a). If we delete the tuple (a, 2) from r, we cannot delete the tuple (a) from Π_A(r): If we did so, the result would be an empty relation, whereas in reality Π_A(r) still has a single tuple (a). The reason is that the same tuple (a) is derived in two ways, and deleting one tuple from r removes only one of the ways of deriving (a); the other is still present.
This reason also gives us the intuition for the solution: For each tuple in a projection such as Π_A(r), we will keep a count of how many times it was derived.
When a set of tuples d_r is deleted from r, for each tuple t in d_r we do the following. Let t.A denote the projection of t on the attribute A. We find (t.A) in the materialized view, and decrease the count stored with it by 1. If the count becomes 0, (t.A) is deleted from the materialized view.
Handling insertions is relatively straightforward. When a set of tuples i_r is inserted into r, for each tuple t in i_r we do the following. If (t.A) is already present in the materialized view, we increase the count stored with it by 1. If not, we add (t.A) to the materialized view, with the count set to 1.
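A sketch of count-based projection maintenance on the example above, using a Counter keyed by the projected value:

```python
from collections import Counter

# Π_A(r) maintained with a derivation count per projected tuple.
view = Counter()   # projected value -> number of derivations

def insert(t):
    view[t[0]] += 1          # t.A

def delete(t):
    view[t[0]] -= 1
    if view[t[0]] == 0:
        del view[t[0]]       # last derivation gone: remove from the view

insert(("a", 2)); insert(("a", 3))
delete(("a", 2))
print(dict(view))   # {'a': 1} -- (a) survives, still derived via (a, 3)
```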
14.5.2.3 Aggregation Operations
Aggregation operations proceed somewhat like projections. The aggregate operations in SQL are count, sum, avg, min, and max.
• count: Consider a materialized view v = A G count(B) (r), which computes the count of the attribute B, after grouping r by attribute A.
When a set of tuples i_r is inserted into r, for each tuple t in i_r we do the following. We look for the group t.A in the materialized view. If it is not present, we add (t.A, 1) to the materialized view. If the group t.A is present, we add 1 to the count of the group.
When a set of tuples d_r is deleted from r, for each tuple t in d_r we do the following. We look for the group t.A in the materialized view, and subtract 1 from the count for the group. If the count becomes 0, we delete the tuple for the group t.A from the materialized view.
• sum: Consider a materialized view v = A G sum(B) (r).
When a set of tuples i_r is inserted into r, for each tuple t in i_r we do the following. We look for the group t.A in the materialized view. If it is not present, we add (t.A, t.B) to the materialized view; in addition, we store a count of 1 associated with (t.A, t.B), just as we did for projection. If the group t.A is present, we add the value of t.B to the aggregate value for the group, and add 1 to the count of the group.
When a set of tuples d_r is deleted from r, for each tuple t in d_r we do the following. We look for the group t.A in the materialized view, and subtract t.B from the aggregate value for the group. We also subtract 1 from the count for the group, and if the count becomes 0, we delete the tuple for the group t.A from the materialized view.
Without keeping the extra count value, we would not be able to distinguish a case where the sum for a group is 0 from the case where the last tuple in a group is deleted.
• avg: Consider a materialized view v = A G avg(B) (r).
Directly updating the average on an insert or delete is not possible, since it depends not only on the old average and the tuple being inserted/deleted, but also on the number of tuples in the group.
Instead, to handle the case of avg, we maintain the sum and count aggregate values as described earlier, and compute the average as the sum divided by the count.
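A sketch of sum (and derived avg) maintenance with the per-group count, on invented loan-like data:

```python
# A G sum(B) (r) with the extra count per group; avg = sum / count.
sums, counts = {}, {}

def insert(t):
    a, b = t
    sums[a] = sums.get(a, 0) + b
    counts[a] = counts.get(a, 0) + 1

def delete(t):
    a, b = t
    sums[a] -= b
    counts[a] -= 1
    if counts[a] == 0:            # last tuple of the group: drop the group,
        del sums[a], counts[a]    # even if its sum happens to be 0

def avg(a):
    return sums[a] / counts[a]

insert(("Downtown", 1000)); insert(("Downtown", 1500)); insert(("Redwood", 700))
delete(("Downtown", 1000))
print(sums["Downtown"], avg("Downtown"))   # 1500 1500.0
```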
• min, max: Consider a materialized view v = A G min(B) (r). (The case of max is exactly equivalent.)
Handling insertions on r is straightforward. Maintaining the aggregate values min and max on deletions may be more expensive. For example, if the tuple corresponding to the minimum value for a group is deleted from r, we have to look at the other tuples of r that are in the same group to find the new minimum value.
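A sketch of why deletion is the expensive case for min (group names and values are invented): insertion only compares against the stored minimum, but deleting the current minimum forces a rescan of the group's remaining tuples in r.

```python
r = []                 # base relation, tuples (A, B)
mins = {}              # group -> current min(B)

def insert(t):
    r.append(t)
    a, b = t
    mins[a] = b if a not in mins else min(mins[a], b)   # cheap comparison

def delete(t):
    r.remove(t)
    a, b = t
    rest = [b2 for (a2, b2) in r if a2 == a]
    if not rest:
        del mins[a]
    elif b == mins[a]:
        mins[a] = min(rest)    # the expensive case: rescan the group in r

insert(("g", 5)); insert(("g", 3)); insert(("g", 7))
delete(("g", 3))
print(mins["g"])   # 5
```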
14.5.2.4 Other Operations
The set operation intersection is maintained as follows. Given materialized view v = r ∩ s, when a tuple is inserted in r we check if it is present in s, and if so we add it to v. If a tuple is deleted from r, we delete it from the intersection if it is present. The other set operations, union and set difference, are handled in a similar fashion; we leave details to you.
Outer joins are handled in much the same way as joins, but with some extra work. In the case of deletion from r, we have to handle tuples in s that no longer match any tuple in r. In the case of insertion to r, we have to handle tuples in s that did not match any tuple in r. Again we leave details to you.
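The intersection rule can be sketched directly over Python sets (set semantics, toy data):

```python
# v = r ∩ s maintained incrementally: an insert into r joins the
# intersection only if the tuple is already in s; a delete removes it.
r, s = {1, 2}, {2, 3}
v = r & s                 # v_old = {2}

def insert_r(t):
    r.add(t)
    if t in s:
        v.add(t)

def delete_r(t):
    r.discard(t)
    v.discard(t)          # removes t only if it was in the intersection

insert_r(3)               # 3 is in s, so it joins the intersection
delete_r(2)
assert v == r & s         # matches recomputation
```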
14.5.2.5 Handling Expressions
So far we have seen how to update incrementally the result of a single operation. To handle an entire expression, we can derive expressions for computing the incremental change to the result of each subexpression, starting from the smallest subexpressions.
For example, suppose we wish to incrementally update a materialized view E1 ⋈ E2 when a set of tuples i_r is inserted into relation r. Let us assume r is used in E1 alone. Suppose the set of tuples to be inserted into E1 is given by expression D1. Then the expression D1 ⋈ E2 gives the set of tuples to be inserted into E1 ⋈ E2.
See the bibliographical notes for further details on incremental view maintenance with expressions.
14.5.3 Query Optimization and Materialized Views
Query optimization can be performed by treating materialized views just like regular relations. However, materialized views offer further opportunities for optimization:
• Rewriting queries to use materialized views:
Suppose a materialized view v = r ⋈ s is available, and a user submits a query r ⋈ s ⋈ t. Rewriting the query as v ⋈ t may provide a more efficient query plan than optimizing the query as submitted. Thus, it is the job of the query optimizer to recognize when a materialized view can be used to speed up a query.
• Replacing a use of a materialized view by the view definition:
Suppose a materialized view v = r ⋈ s is available, but without any index on it, and a user submits a query σ_A=10(v). Suppose also that s has an index on the common attribute B, and r has an index on attribute A. The best plan for this query may be to replace v by r ⋈ s, which can lead to the query plan σ_A=10(r) ⋈ s; the selection and join can be performed efficiently by using the indices on r.A and s.B, respectively. In contrast, evaluating the selection directly on v may require a full scan of v, which may be more expensive.
The bibliographical notes give pointers to research showing how to efficiently perform query optimization with materialized views.
Another related optimization problem is that of materialized view selection, namely, "What is the best set of views to materialize?" This decision must be made on the basis of the system workload, which is a sequence of queries and updates that reflects the typical load on the system. One simple criterion would be to select a set of materialized views that minimizes the overall execution time of the workload of queries and updates, including the time taken to maintain the materialized views. Database administrators usually modify this criterion to take into account the importance of different queries and updates: Fast response may be required for some queries and updates, but a slow response may be acceptable for others.
Indices are just like materialized views, in that they too are derived data, can speed up queries, and may slow down updates. Thus, the problem of index selection is closely related to that of materialized view selection, although it is simpler. We examine these issues in more detail in Sections 21.2.5 and 21.2.6.
Some database systems, such as Microsoft SQL Server 7.5, and the Red Brick Data Warehouse from Informix, provide tools to help the database administrator with index and materialized view selection. These tools examine the history of queries and updates, and suggest indices and views to be materialized.
14.6 Summary
• Given a query, there are generally a variety of methods for computing the answer. It is the responsibility of the system to transform the query as entered by the user into an equivalent query that can be computed more efficiently. The process of finding a good strategy for processing a query is called query optimization.
• The evaluation of complex queries involves many accesses to disk. Since the transfer of data from disk is slow relative to the speed of main memory and the CPU of the computer system, it is worthwhile to allocate a considerable amount of processing to choose a method that minimizes disk accesses.
• The strategy that the database system chooses for evaluating an operation depends on the size of each relation and on the distribution of values within columns. So that they can base the strategy choice on reliable information, database systems may store statistics for each relation r. These statistics include:
  - The number of tuples in the relation r.
  - The size of a record (tuple) of relation r in bytes.
  - The number of distinct values that appear in the relation r for a particular attribute.
• These statistics allow us to estimate the sizes of the results of various operations, as well as the cost of executing the operations. Statistical information about relations is particularly useful when several indices are available to assist in the processing of a query. The presence of these structures has a significant influence on the choice of a query-processing strategy.
• Each relational-algebra expression represents a particular sequence of operations. The first step in selecting a query-processing strategy is to find a relational-algebra expression that is equivalent to the given expression and is estimated to cost less to execute.
• There are a number of equivalence rules that we can use to transform an expression into an equivalent one. We use these rules to generate systematically all expressions equivalent to the given query.
• Alternative evaluation plans for each expression can be generated by similar rules, and the cheapest plan across all expressions can be chosen. Several optimization techniques are available to reduce the number of alternative expressions and plans that need to be generated.
• We use heuristics to reduce the number of plans considered, and thereby to reduce the cost of optimization. Heuristic rules for transforming relational-algebra queries include "Perform selection operations as early as possible," "Perform projections early," and "Avoid Cartesian products."
• Materialized views can be used to speed up query processing. Incremental view maintenance is needed to efficiently update materialized views when the underlying relations are modified. The differential of an operation can be computed by means of algebraic expressions involving differentials of the inputs of the operation. Other issues related to materialized views include how to optimize queries by making use of available materialized views, and how to select views to be materialized.
Review Terms
• Distinct value estimation
• Minimal set of equivalence rules
• Left-deep join order
• Deletion
• Updates
• Query optimization with

Exercises
14.1 … a clustering index? Explain your answer.
14.2 Consider the relations r1(A, B, C), r2(C, D, E), and r3(E, F), with primary keys A, C, and E, respectively. Assume that r1 has 1000 tuples, r2 has 1500 tuples, and r3 has 750 tuples. Estimate the size of r1 ⋈ r2 ⋈ r3, and give an efficient strategy for computing the join.
14.3 Consider the relations r1(A, B, C), r2(C, D, E), and r3(E, F) of Exercise 14.2. Assume that there are no primary keys, except the entire schema. Let V(C, r1) be 900, V(C, r2) be 1100, V(E, r2) be 50, and V(E, r3) be 100. Assume that r1 has 1000 tuples, r2 has 1500 tuples, and r3 has 750 tuples. Estimate the size of r1 ⋈ r2 ⋈ r3, and give an efficient strategy for computing the join.
14.4 Suppose that a B+-tree index on branch-city is available on relation branch, and that no other index is available. What would be the best way to handle the following selections that involve negation?
a. σ_¬(branch-city<"Brooklyn")(branch)
b. σ_¬(branch-city="Brooklyn")(branch)
c. σ_¬(branch-city<"Brooklyn" ∨ assets<5000)(branch)
14.5 Suppose that a B+-tree index on (branch-name, branch-city) is available on relation branch. What would be the best way to handle the following selection?
σ_(branch-city<"Brooklyn") ∧ (assets<5000) ∧ (branch-name="Downtown")(branch)
14.6 Show that the following equivalences hold. Explain how you can apply them to improve the efficiency of certain queries:
a. E1 ⋈_θ (E2 − E3) = (E1 ⋈_θ E2) − (E1 ⋈_θ E3)
b. σ_θ(A G F(E)) = A G F(σ_θ(E)), where θ uses only attributes from A.
c. σ_θ(E1 ⟕ E2) = σ_θ(E1) ⟕ E2, where θ uses only attributes from E1.
14.7 Show how to derive the following equivalences by a sequence of transformations using the equivalence rules in Section 14.3.1.

14.8 For each of the following pairs of expressions, give instances of relations that show the expressions are not equivalent.
a. Π_A(R − S) and Π_A(R) − Π_A(S)
b. σ_B<4(A G max(B) (R)) and A G max(B) (σ_B<4(R))
c. In the preceding expressions, if both occurrences of max were replaced by min, would the expressions be equivalent?
d. (R ⟕ S) ⟕ T and R ⟕ (S ⟕ T)
In other words, the natural left outer join is not associative. (Hint: Assume that the schemas of the three relations are R(a, b1), S(a, b2), and T(a, b3), respectively.)
e. σ_θ(E1 ⟕ E2) and E1 ⟕ σ_θ(E2), where θ uses only attributes from E2.
14.9 SQL allows relations with duplicates (Chapter 4).
a. Define versions of the basic relational-algebra operations σ, Π, ×, ⋈, −, ∪, and ∩ that work on relations with duplicates, in a way consistent with SQL.
b. Check which of the equivalence rules 1 through 7.b hold for the multiset version of the relational algebra defined in part a.
14.10 ∗∗ Show that, with n relations, there are (2(n − 1))!/(n − 1)! different join orders.
Hint: A complete binary tree is one where every internal node has exactly two children. Use the fact that the number of different complete binary trees with n leaf nodes is

(1/n) · (2(n − 1) choose n − 1)

If you wish, you can derive the formula for the number of complete binary trees with n nodes from the formula for the number of binary trees with n nodes. The number of binary trees with n nodes is (1/(n + 1)) · (2n choose n); this number is known as the Catalan number.
14.11 ∗∗ Show that the lowest-cost join order can be computed in time O(3^n). Assume that you can store and look up information about a set of relations (such as the optimal join order for the set, and the cost of that join order) in constant time. (If you find this exercise difficult, at least show the looser time bound of O(2^2n).)
14.12 Show that, if only left-deep join trees are considered, as in the System R optimizer, the time taken to find the most efficient join order is around n2^n. Assume that there is only one interesting sort order.
14.13 A set of equivalence rules is said to be complete if, whenever two expressions are equivalent, one can be derived from the other by a sequence of uses of the equivalence rules. Is the set of equivalence rules that we considered in Section 14.3.1 complete? Hint: Consider the equivalence σ_3=5(r) = { }.
14.14 Decorrelation:
a. Write a nested query on the relation account to find, for each branch with name starting with "B", all accounts with the maximum balance at the branch.
b. Rewrite the preceding query, without using a nested subquery; in other words, decorrelate the query.
c. Give a procedure (similar to that described in Section 14.4.5) for decorrelating such queries.

14.15 Describe how to incrementally maintain the results of the following operations,
on both insertions and deletions:
a. Union and set difference
b. Left outer join
14.16 Give an example of an expression defining a materialized view and two situations (sets of statistics for the input relations and the differentials) such that incremental view maintenance is better than recomputation in one situation, and recomputation is better in the other situation.

Bibliographical Notes
The seminal work of Selinger et al. [1979] describes access-path selection in the System R optimizer, which was one of the earliest relational-query optimizers. Graefe and McKenna [1993] describe Volcano, an equivalence-rule-based query optimizer. Query processing in Starburst is described in Haas et al. [1989]. Query optimization in Oracle is briefly outlined in Oracle [1997].
Estimation of statistics of query results, such as result size, is addressed by Ioannidis and Poosala [1995], Poosala et al. [1996], and Ganguly et al. [1996], among others. Nonuniform distributions of values cause problems for estimation of query size and cost. Cost-estimation techniques that use histograms of value distributions have been proposed to tackle the problem. Ioannidis and Christodoulakis [1993], Ioannidis and Poosala [1995], and Poosala et al. [1996] present results in this area.
Exhaustive searching of all query plans is impractical for optimization of joins involving many relations, and techniques based on randomized searching, which do not examine all alternatives, have been proposed. Ioannidis and Wong [1987], Swami and Gupta [1988], and Ioannidis and Kang [1990] present results in this area.
Parametric query-optimization techniques have been proposed by Ioannidis et al. [1992] and Ganguly [1998], to handle query processing when the selectivity of query parameters is not known at optimization time. A set of plans, one for each of several different query selectivities, is computed and stored by the optimizer at compile time. One of these plans is chosen at run time, on the basis of the actual selectivities, avoiding the cost of full optimization at run time.
Klug [1982] was an early work on optimization of relational-algebra expressions with aggregate functions. More recent work in this area includes Yan and Larson [1995] and Chaudhuri and Shim [1994]. Optimization of queries containing outer joins is described in Rosenthal and Reiner [1984], Galindo-Legaria and Rosenthal [1992], and Galindo-Legaria [1994].
The SQL language poses several challenges for query optimization, including the presence of duplicates and nulls, and the semantics of nested subqueries. Extension of relational algebra to duplicates is described in Dayal et al. [1982]. Optimization of nested subqueries is discussed in Kim [1982], Ganski and Wong [1987], Dayal [1987], and, more recently, in Seshadri et al. [1996].
When queries are generated through views, more relations often are joined than is necessary for computation of the query. A collection of techniques for join minimization has been grouped under the name tableau optimization. The notion of a tableau was introduced by Aho et al. [1979b] and Aho et al. [1979a], and was further extended by Sagiv and Yannakakis [1981]. Ullman [1988] and Maier [1983] provide textbook coverage of tableaux.
Sellis [1988] and Roy et al. [2000] describe multiquery optimization, which is the problem of optimizing the execution of several queries as a group. If an entire group of queries is considered, it is possible to discover common subexpressions that can be evaluated once for the entire group. Finkelstein [1982] and Hall [1976] consider optimization of a group of queries and the use of common subexpressions. Dalvi et al. [2001] discuss optimization issues in pipelining with limited buffer space combined with sharing of common subexpressions.

Query optimization can make use of semantic information, such as functional dependencies and other integrity constraints. Semantic query optimization in relational databases is covered by King [1981], Chakravarthy et al. [1990], and, in the context of aggregation, by Sudarshan and Ramakrishnan [1991].
Query-processing and optimization techniques for Datalog, in particular techniques to handle queries on recursive views, are described in Bancilhon and Ramakrishnan [1986], Beeri and Ramakrishnan [1991], Ramakrishnan et al. [1992c], Srivastava et al. [1995], and Mumick et al. [1996]. Query processing and optimization techniques for object-oriented databases are discussed in Maier and Stein [1986], Beech [1988], Bertino and Kim [1989], and Blakeley et al. [1993].

Blakeley et al. [1986], Blakeley et al. [1989], and Griffin and Libkin [1995] describe techniques for maintenance of materialized views. Gupta and Mumick [1995] provides a survey of materialized view maintenance. Optimization of materialized view maintenance plans is described by Vista [1998] and Mistry et al. [2001]. Query optimization in the presence of materialized views is addressed by Larson and Yang [1985], Chaudhuri et al. [1995], Dar et al. [1996], and Roy et al. [2000]. Index selection and materialized view selection are addressed by Ross et al. [1996], Labio et al. [1997], Gupta [1997], Chaudhuri and Narasayya [1997], and Roy et al. [2000].
Transaction Management
The term transaction refers to a collection of operations that form a single logical unit of work. For instance, transfer of money from one account to another is a transaction consisting of two updates, one to each account.
It is important that either all actions of a transaction be executed completely or, in case of some failure, partial effects of a transaction be undone. This property is called atomicity. Further, once a transaction is successfully executed, its effects must persist in the database: a system failure should not result in the database forgetting about a transaction that successfully completed. This property is called durability.

In a database system where multiple transactions are executing concurrently, if updates to shared data are not controlled, there is potential for transactions to see inconsistent intermediate states created by updates of other transactions. Such a situation can result in erroneous updates to data stored in the database. Thus, database systems must provide mechanisms to isolate transactions from the effects of other concurrently executing transactions. This property is called isolation.
Chapter 15 describes the concept of a transaction in detail, including the properties of atomicity, durability, isolation, and other properties provided by the transaction abstraction. In particular, the chapter makes precise the notion of isolation by means of a concept called serializability.

Chapter 16 describes several concurrency-control techniques that help implement the isolation property.

Chapter 17 describes the recovery-management component of a database, which implements the atomicity and durability properties.
Collections of operations that form a single logical unit of work are called transactions. A database system must ensure proper execution of transactions despite failures: either the entire transaction executes, or none of it does. Furthermore, it must manage concurrent execution of transactions in a way that avoids the introduction of inconsistency. In our funds-transfer example, a transaction computing the customer's total money might see the checking-account balance before it is debited by the funds-transfer transaction, but see the savings balance after it is credited. As a result, it would obtain an incorrect result.

This chapter introduces the basic concepts of transaction processing. Details on concurrent transaction processing and recovery from failures are in Chapters 16 and 17, respectively. Further topics in transaction processing are discussed in Chapter 24.
15.1 Transaction Concept
A transaction is a unit of program execution that accesses and possibly updates various data items. Usually, a transaction is initiated by a user program written in a high-level data-manipulation language or programming language (for example, SQL, COBOL, C, C++, or Java), where it is delimited by statements (or function calls) of the form begin transaction and end transaction. The transaction consists of all operations executed between the begin transaction and the end transaction.

To ensure integrity of the data, we require that the database system maintain the following properties of the transactions:
• Atomicity. Either all operations of the transaction are reflected properly in the database, or none are.
• Consistency. Execution of a transaction in isolation (that is, with no other transaction executing concurrently) preserves the consistency of the database.
• Isolation. Even though multiple transactions may execute concurrently, the system guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished. Thus, each transaction is unaware of other transactions executing concurrently in the system.
• Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
These properties are often called the ACID properties; the acronym is derived from the first letter of each of the four properties.
To gain a better understanding of ACID properties and the need for them, consider a simplified banking system consisting of several accounts and a set of transactions that access and update those accounts. For the time being, we assume that the database permanently resides on disk, but that some portion of it is temporarily residing in main memory.
Transactions access data using two operations:

• read(X), which transfers the data item X from the database to a local buffer belonging to the transaction that executed the read operation.

• write(X), which transfers the data item X from the local buffer of the transaction that executed the write back to the database.
In a real database system, the write operation does not necessarily result in the immediate update of the data on the disk; the write operation may be temporarily stored in memory and executed on the disk later. For now, however, we shall assume that the write operation updates the database immediately. We shall return to this subject in Chapter 17.
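The read/write model above can be sketched in a few lines of Python. The class and variable names here (Database, Transaction) are illustrative, not from any real system; a real database operates on disk pages, not an in-memory dict.

```python
# Minimal sketch of the read/write transaction model described above.
# Names (Database, Transaction) are illustrative assumptions.

class Database:
    def __init__(self, items):
        self.items = dict(items)   # stands in for data on disk

class Transaction:
    def __init__(self, db):
        self.db = db
        self.buffer = {}           # transaction-local buffer

    def read(self, x):
        # Transfer item x from the database to the local buffer.
        self.buffer[x] = self.db.items[x]
        return self.buffer[x]

    def write(self, x):
        # Transfer item x from the local buffer back to the database.
        # (A real system may defer the actual disk write; see Chapter 17.)
        self.db.items[x] = self.buffer[x]

db = Database({"A": 1000, "B": 2000})
t = Transaction(db)
t.read("A")
t.buffer["A"] -= 50    # local computation on the buffered copy
t.write("A")
t.read("B")
t.buffer["B"] += 50
t.write("B")
print(db.items)        # {'A': 950, 'B': 2050}
```

Note that the arithmetic happens only on the buffered copies; the database sees changes only through write.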
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be defined as

Ti: read(A);
    A := A - 50;
    write(A);
    read(B);
    B := B + 50;
    write(B).
Let us now consider each of the ACID requirements. (For ease of presentation, we consider them in an order different from the order A-C-I-D.)
• Consistency: The consistency requirement here is that the sum of A and B be unchanged by the execution of the transaction. Without the consistency requirement, money could be created or destroyed by the transaction! It can be verified easily that, if the database is consistent before an execution of the transaction, the database remains consistent after the execution of the transaction.

Ensuring consistency for an individual transaction is the responsibility of the application programmer who codes the transaction. This task may be facilitated by automatic testing of integrity constraints, as we discussed in Chapter 6.
• Atomicity: Suppose that, just before the execution of transaction Ti, the values of accounts A and B are $1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti, a failure occurs that prevents Ti from completing its execution successfully. Examples of such failures include power failures, hardware failures, and software errors. Further, suppose that the failure happened after the write(A) operation but before the write(B) operation. In this case, the values of accounts A and B reflected in the database are $950 and $2000. The system destroyed $50 as a result of this failure. In particular, we note that the sum A + B is no longer preserved.

Thus, because of the failure, the state of the system no longer reflects a real state of the world that the database is supposed to capture. We term such a state an inconsistent state. We must ensure that such inconsistencies are not visible in a database system. Note, however, that the system must at some point be in an inconsistent state. Even if transaction Ti is executed to completion, there exists a point at which the value of account A is $950 and the value of account B is $2000, which is clearly an inconsistent state. This state, however, is eventually replaced by the consistent state where the value of account A is $950 and the value of account B is $2050. Thus, if the transaction never started or was guaranteed to complete, such an inconsistent state would not be visible except during the execution of the transaction. That is the reason for the atomicity requirement: If the atomicity property is present, all actions of the transaction are reflected in the database, or none are.
The basic idea behind ensuring atomicity is this: The database system keeps track (on disk) of the old values of any data on which a transaction performs a write and, if the transaction does not complete its execution, the database system restores the old values to make it appear as though the transaction never executed. We discuss these ideas further in Section 15.2. Ensuring atomicity is the responsibility of the database system itself; specifically, it is handled by a component called the transaction-management component, which we describe in detail in Chapter 17.
• Durability: Once the execution of the transaction completes successfully, and the user who initiated the transaction has been notified that the transfer of funds has taken place, it must be the case that no system failure will result in a loss of data corresponding to this transfer of funds.

The durability property guarantees that, once a transaction completes successfully, all the updates that it carried out on the database persist, even if there is a system failure after the transaction completes execution.

We assume for now that a failure of the computer system may result in loss of data in main memory, but data written to disk are never lost. We can guarantee durability by ensuring that either:
1. The updates carried out by the transaction have been written to disk before the transaction completes.

2. Information about the updates carried out by the transaction and written to disk is sufficient to enable the database to reconstruct the updates when the database system is restarted after the failure.
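Option 2 above can be sketched as a redo log. The model is illustrative and assumes that the simulated "disk" survives a crash while main memory does not; real systems force the log to stable storage before reporting a commit.

```python
# Sketch of durability via a redo log, per option 2 above.
# Illustrative assumption: disk_log survives failures; memory does not.

disk_log = []                    # redo records, assumed to be on disk

def commit(updates):
    # Write redo information to disk before reporting success to the user.
    for item, value in updates.items():
        disk_log.append((item, value))

def recover():
    # After a restart, reconstruct the committed updates from the log.
    db = {"A": 1000, "B": 2000}  # database state on disk before the crash
    for item, value in disk_log:
        db[item] = value
    return db

commit({"A": 950, "B": 2050})    # transaction commits; memory then "crashes"
print(recover())                 # {'A': 950, 'B': 2050}
```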
Ensuring durability is the responsibility of a component of the database system called the recovery-management component. The transaction-management component and the recovery-management component are closely related, and we describe them in Chapter 17.

• Isolation: Even if the consistency and atomicity properties are ensured for each transaction, if several transactions are executed concurrently, their operations may interleave in some undesirable way, resulting in an inconsistent state.

For example, as we saw earlier, the database is temporarily inconsistent while the transaction to transfer funds from A to B is executing, with the deducted total written to A and the increased total yet to be written to B. If a second concurrently running transaction reads A and B at this intermediate point and computes A + B, it will observe an inconsistent value. Furthermore, if this second transaction then performs updates on A and B based on the inconsistent values that it read, the database may be left in an inconsistent state even after both transactions have completed.
A way to avoid the problem of concurrently executing transactions is to execute transactions serially, that is, one after the other. However, concurrent execution of transactions provides significant performance benefits, as we shall see in Section 15.4. Other solutions have therefore been developed; they allow multiple transactions to execute concurrently.

We discuss the problems caused by concurrently executing transactions in Section 15.4. The isolation property of a transaction ensures that the concurrent execution of transactions results in a system state that is equivalent to a state that could have been obtained had these transactions executed one at a time in some order. We shall discuss the principles of isolation further in Section 15.5. Ensuring the isolation property is the responsibility of a component of the database system called the concurrency-control component, which we discuss later, in Chapter 16.
15.2 Transaction State
In the absence of failures, all transactions complete successfully. However, as we noted earlier, a transaction may not always complete its execution successfully. Such a transaction is termed aborted. If we are to ensure the atomicity property, an aborted transaction must have no effect on the state of the database. Thus, any changes that the aborted transaction made to the database must be undone.

A transaction that completes its execution successfully is said to be committed.
A committed transaction that has performed updates transforms the database into a new consistent state, which must persist even if there is a system failure.

Once a transaction has committed, we cannot undo its effects by aborting it. The only way to undo the effects of a committed transaction is to execute a compensating transaction. For instance, if a transaction added $20 to an account, the compensating transaction would subtract $20 from the account. However, it is not always possible to create such a compensating transaction. Therefore, the responsibility of writing and executing a compensating transaction is left to the user, and is not handled by the database system. Chapter 24 includes a discussion of compensating transactions.
We need to be more precise about what we mean by successful completion of a transaction. We therefore establish a simple abstract transaction model. A transaction must be in one of the following states:
• Active, the initial state; the transaction stays in this state while it is executing.

• Partially committed, after the final statement has been executed.

• Failed, after the discovery that normal execution can no longer proceed.

• Aborted, after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction.

• Committed, after successful completion.
The state diagram corresponding to a transaction appears in Figure 15.1. We say that a transaction has committed only if it has entered the committed state. Similarly, we say that a transaction has aborted only if it has entered the aborted state. A transaction is said to have terminated if it has either committed or aborted.
A transaction starts in the active state. When it finishes its final statement, it enters the partially committed state. At this point, the transaction has completed its execution, but it is still possible that it may have to be aborted, since the actual output may still be temporarily residing in main memory, and thus a hardware failure may preclude its successful completion.

The database system then writes out enough information to disk that, even in the event of a failure, the updates performed by the transaction can be re-created when the system restarts after the failure. When the last of this information is written out, the transaction enters the committed state.

As mentioned earlier, we assume for now that failures do not result in loss of data on disk. Chapter 17 discusses techniques to deal with loss of data on disk.
Figure 15.1 State diagram of a transaction. (The figure, not reproduced here, shows transitions among the states active, partially committed, committed, failed, and aborted.)

A transaction enters the failed state after the system determines that the transaction can no longer proceed with its normal execution (for example, because of hardware or logical errors). Such a transaction must be rolled back. Then, it enters the aborted state. At this point, the system has two options:
• It can restart the transaction, but only if the transaction was aborted as a result of some hardware or software error that was not created through the internal logic of the transaction. A restarted transaction is considered to be a new transaction.

• It can kill the transaction. It usually does so because of some internal logical error that can be corrected only by rewriting the application program, or because the input was bad, or because the desired data were not found in the database.

We must be cautious when dealing with observable external writes, such as writes to a terminal or printer. Once such a write has occurred, it cannot be erased, since it may have been seen external to the database system. Most systems allow such writes to take place only after the transaction has entered the committed state. One way to implement such a scheme is for the database system to store any value associated with such external writes temporarily in nonvolatile storage, and to perform the actual writes only after the transaction enters the committed state. If the system should fail after the transaction has entered the committed state, but before it could complete the external writes, the database system will carry out the external writes (using the data in nonvolatile storage) when the system is restarted.
Handling external writes can be more complicated in some situations. For example, suppose the external action is that of dispensing cash at an automated teller machine, and the system fails just before the cash is actually dispensed (we assume that cash can be dispensed atomically). It makes no sense to dispense cash when the system is restarted, since the user may have left the machine. In such a case, a compensating transaction, such as depositing the cash back in the user's account, needs to be executed when the system is restarted.
For certain applications, it may be desirable to allow active transactions to display data to users, particularly for long-duration transactions that run for minutes or hours. Unfortunately, we cannot allow such output of observable data unless we are willing to compromise transaction atomicity. Most current transaction systems ensure atomicity and, therefore, forbid this form of interaction with users. In Chapter 24, we discuss alternative transaction models that support long-duration, interactive transactions.
15.3 Implementation of Atomicity and Durability
The recovery-management component of a database system can support atomicity and durability by a variety of schemes. We first consider a simple, but extremely inefficient, scheme called the shadow copy scheme. This scheme, which is based on making copies of the database, called shadow copies, assumes that only one transaction is active at a time. The scheme also assumes that the database is simply a file on disk. A pointer called db-pointer is maintained on disk; it points to the current copy of the database.
In the shadow-copy scheme, a transaction that wants to update the database first creates a complete copy of the database. All updates are done on the new database copy, leaving the original copy, the shadow copy, untouched. If at any point the transaction has to be aborted, the system merely deletes the new copy. The old copy of the database has not been affected.

If the transaction completes, it is committed as follows. First, the operating system is asked to make sure that all pages of the new copy of the database have been written out to disk. (Unix systems use the flush command for this purpose.) After the operating system has written all the pages to disk, the database system updates the pointer db-pointer to point to the new copy of the database; the new copy then becomes the current copy of the database. The old copy of the database is then deleted. Figure 15.2 depicts the scheme, showing the database state before and after the update.
Figure 15.2 Shadow-copy technique for atomicity and durability. (The figure, not reproduced here, shows db-pointer pointing to the old copy of the database before the update, and pointing to the new copy after the update, with the old copy then to be deleted.)
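The shadow-copy scheme can be sketched with ordinary files. This is a minimal sketch under stated assumptions: the database is one file, db-pointer is a small file naming the current copy, and `os.replace` provides an atomic pointer switch; the function and file names are illustrative, not from any real system.

```python
# Sketch of the shadow-copy scheme described above (illustrative names).
import os

def current_db(pointer="db-pointer"):
    with open(pointer) as f:
        return f.read().strip()

def run_transaction(update, pointer="db-pointer"):
    old = current_db(pointer)
    new = old + ".new"
    with open(old) as f:                  # copy the whole database
        data = f.read()
    with open(new, "w") as f:
        f.write(update(data))             # all updates go to the new copy
        f.flush()
        os.fsync(f.fileno())              # force the new copy to disk
    tmp = pointer + ".tmp"
    with open(tmp, "w") as f:
        f.write(new)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, pointer)              # commit point: atomic pointer update
    os.remove(old)                        # delete the shadow copy

# Usage: a one-file "database" holding the balance of account A.
with open("db.0", "w") as f:
    f.write("A=1000")
with open("db-pointer", "w") as f:
    f.write("db.0")
run_transaction(lambda d: d.replace("A=1000", "A=950"))
with open(current_db()) as f:
    print(f.read())    # A=950
```

Aborting before the pointer switch simply means deleting the `.new` file; the copy named by db-pointer is untouched.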
The transaction is said to have been committed at the point where the updated db-pointer is written to disk.
We now consider how the technique handles transaction and system failures. First, consider transaction failure. If the transaction fails at any time before db-pointer is updated, the old contents of the database are not affected. We can abort the transaction by just deleting the new copy of the database. Once the transaction has been committed, all the updates that it performed are in the database pointed to by db-pointer. Thus, either all updates of the transaction are reflected, or none of the effects are reflected, regardless of transaction failure.
Now consider the issue of system failure. Suppose that the system fails at any time before the updated db-pointer is written to disk. Then, when the system restarts, it will read db-pointer and will thus see the original contents of the database, and none of the effects of the transaction will be visible on the database. Next, suppose that the system fails after db-pointer has been updated on disk. Before the pointer is updated, all updated pages of the new copy of the database were written to disk. Again, we assume that, once a file is written to disk, its contents will not be damaged even if there is a system failure. Therefore, when the system restarts, it will read db-pointer and will thus see the contents of the database after all the updates performed by the transaction.
The implementation actually depends on the write to db-pointer being atomic; that is, either all its bytes are written or none of its bytes are written. If some of the bytes of the pointer were updated by the write, but others were not, the pointer is meaningless, and neither old nor new versions of the database may be found when the system restarts. Luckily, disk systems provide atomic updates to entire blocks, or at least to a disk sector. In other words, the disk system guarantees that it will update db-pointer atomically, as long as we make sure that db-pointer lies entirely in a single sector, which we can ensure by storing db-pointer at the beginning of a block.
Thus, the atomicity and durability properties of transactions are ensured by the shadow-copy implementation of the recovery-management component.
As a simple example of a transaction outside the database domain, consider a text-editing session. An entire text-editing session can be modeled as a transaction. The actions executed by the transaction are reading and updating the file. Saving the file at the end of editing corresponds to a commit of the editing transaction; quitting the editing session without saving the file corresponds to an abort of the editing transaction.

Many text editors use essentially the implementation just described to ensure that an editing session is transactional. A new file is used to store the updated file. At the end of the editing session, if the updated file is to be saved, the text editor uses a file rename command to rename the new file to have the actual file name. The rename, assumed to be implemented as an atomic operation by the underlying file system, deletes the old file as well.
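The editor save-by-rename pattern just described can be sketched as follows. `os.replace` atomically renames the new file over the old one on POSIX file systems; the function and file names are illustrative.

```python
# Sketch of the editor save-by-rename pattern described above.
import os

def save_atomically(path, new_contents):
    tmp = path + ".new"          # the "new file" holding the updated text
    with open(tmp, "w") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())     # make sure the new copy is on disk first
    os.replace(tmp, path)        # commit: atomic rename over the old file

with open("notes.txt", "w") as f:
    f.write("draft")
save_atomically("notes.txt", "final text")
print(open("notes.txt").read())   # final text
```

Quitting without saving corresponds to simply deleting (or never renaming) the `.new` file, leaving the original untouched.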
Unfortunately, this implementation is extremely inefficient in the context of large databases, since executing a single transaction requires copying the entire database. Furthermore, the implementation does not allow transactions to execute concurrently with one another. There are practical ways of implementing atomicity and durability that are much less expensive and more powerful. We study these recovery techniques in Chapter 17.
15.4 Concurrent Executions
Transaction-processing systems usually allow multiple transactions to run concurrently. Allowing multiple transactions to update data concurrently causes several complications with consistency of the data, as we saw earlier. Ensuring consistency in spite of concurrent execution of transactions requires extra work; it is far easier to insist that transactions run serially, that is, one at a time, each starting only after the previous one has completed. However, there are two good reasons for allowing concurrency:
• Improved throughput and resource utilization. A transaction consists of many steps. Some involve I/O activity; others involve CPU activity. The CPU and the disks in a computer system can operate in parallel. Therefore, I/O activity can be done in parallel with processing at the CPU. The parallelism of the CPU and the I/O system can therefore be exploited to run multiple transactions in parallel. While a read or write on behalf of one transaction is in progress on one disk, another transaction can be running in the CPU, while another disk may be executing a read or write on behalf of a third transaction. All of this increases the throughput of the system, that is, the number of transactions executed in a given amount of time. Correspondingly, the processor and disk utilization also increase; in other words, the processor and disk spend less time idle, or not performing any useful work.
• Reduced waiting time. There may be a mix of transactions running on a system, some short and some long. If transactions run serially, a short transaction may have to wait for a preceding long transaction to complete, which can lead to unpredictable delays in running a transaction. If the transactions are operating on different parts of the database, it is better to let them run concurrently, sharing the CPU cycles and disk accesses among them. Concurrent execution reduces the unpredictable delays in running transactions. Moreover, it also reduces the average response time: the average time for a transaction to be completed after it has been submitted.
The motivation for using concurrent execution in a database is essentially the same as the motivation for using multiprogramming in an operating system.

When several transactions run concurrently, database consistency can be destroyed despite the correctness of each individual transaction. In this section, we present the concept of schedules to help identify those executions that are guaranteed to ensure consistency.

The database system must control the interaction among the concurrent transactions to prevent them from destroying the consistency of the database. It does so through a variety of mechanisms called concurrency-control schemes. We study concurrency-control schemes in Chapter 16; for now, we focus on the concept of correct concurrent execution.

Consider again the simplified banking system of Section 15.1, which has several accounts, and a set of transactions that access and update those accounts. Let T1 and T2 be two transactions that transfer funds from one account to another. Transaction T1 transfers $50 from account A to account B. It is defined as

T1: read(A);
    A := A - 50;
    write(A);
    read(B);
    B := B + 50;
    write(B).

Transaction T2 transfers 10 percent of the balance of account A to account B. It is defined as

T2: read(A);
    temp := A * 0.1;
    A := A - temp;
    write(A);
    read(B);
    B := B + temp;
    write(B).
Suppose the current values of accounts A and B are $1000 and $2000, respectively. Suppose also that the two transactions are executed one at a time in the order T1 followed by T2. This execution sequence appears in Figure 15.3. In the figure, the sequence of instruction steps is in chronological order from top to bottom, with instructions of T1 appearing in the left column and instructions of T2 appearing in the right column. The final values of accounts A and B, after the execution in Figure 15.3 takes place, are $855 and $2145, respectively.

T1                      T2
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
                        read(A)
                        temp := A * 0.1
                        A := A - temp
                        write(A)
                        read(B)
                        B := B + temp
                        write(B)

Figure 15.3 Schedule 1: a serial schedule in which T1 is followed by T2.

Thus, the total amount of money in
accounts A and B, that is, the sum A + B, is preserved after the execution of both transactions.
Similarly, if the transactions are executed one at a time in the order T2 followed by T1, then the corresponding execution sequence is that of Figure 15.4. Again, as expected, the sum A + B is preserved, and the final values of accounts A and B are $850 and $2150, respectively.
The execution sequences just described are called schedules. They represent the chronological order in which instructions are executed in the system. Clearly, a schedule for a set of transactions must consist of all instructions of those transactions, and must preserve the order in which the instructions appear in each individual transaction. For example, in transaction T1, the instruction write(A) must appear before the instruction read(B), in any valid schedule. In the following discussion, we shall refer to the first execution sequence (T1 followed by T2) as schedule 1, and to the second execution sequence (T2 followed by T1) as schedule 2.
These schedules are serial: Each serial schedule consists of a sequence of instructions from various transactions, where the instructions belonging to one single transaction appear together in that schedule. Thus, for a set of n transactions, there exist n! different valid serial schedules.
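The counts above, and the much larger number of interleaved schedules discussed next, can be checked with a short sketch. The names and the four-instruction transactions are illustrative; each schedule must preserve the internal order of each transaction's instructions.

```python
# Sketch: count serial schedules and order-preserving interleavings.
from itertools import permutations  # (used implicitly for n! serial orders)
from math import comb, factorial

def interleavings(a, b):
    # All orderings of a + b that preserve the internal order of each list.
    if not a:
        return [list(b)]
    if not b:
        return [list(a)]
    return ([[a[0]] + rest for rest in interleavings(a[1:], b)] +
            [[b[0]] + rest for rest in interleavings(a, b[1:])])

t1 = ["read(A)", "write(A)", "read(B)", "write(B)"]
t2 = ["read(A)", "write(A)", "read(B)", "write(B)"]

# For n transactions there are n! serial schedules.
print(factorial(2))                      # 2 serial schedules for n = 2

# Interleaved schedules are far more numerous: choosing positions for
# t1's steps among all 8 steps gives C(8, 4) = 70 schedules.
print(len(interleavings(t1, t2)))        # 70
print(comb(len(t1) + len(t2), len(t1)))  # 70
```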
When the database system executes several transactions concurrently, the corresponding schedule no longer needs to be serial. If two transactions are running concurrently, the operating system may execute one transaction for a little while, then perform a context switch, execute the second transaction for some time, and then switch back to the first transaction for some time, and so on. With multiple transactions, the CPU time is shared among all the transactions.

T1                      T2
                        read(A)
                        temp := A * 0.1
                        A := A - temp
                        write(A)
                        read(B)
                        B := B + temp
                        write(B)
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)

Figure 15.4 Schedule 2: a serial schedule in which T2 is followed by T1.

T1                      T2
read(A)
A := A - 50
write(A)
                        read(A)
                        temp := A * 0.1
                        A := A - temp
                        write(A)
read(B)
B := B + 50
write(B)
                        read(B)
                        B := B + temp
                        write(B)

Figure 15.5 Schedule 3: a concurrent schedule equivalent to schedule 1.

Several execution sequences are possible, since the various instructions from both transactions may now be interleaved. In general, it is not possible to predict exactly how many instructions of a transaction will be executed before the CPU switches to
another transaction. Thus, the number of possible schedules for a set of n transactions is much larger than n!.
Returning to our previous example, suppose that the two transactions are executed concurrently. One possible schedule appears in Figure 15.5. After this execution takes place, we arrive at the same state as the one in which the transactions are executed serially in the order T1 followed by T2. The sum A + B is indeed preserved.
Not all concurrent executions result in a correct state. To illustrate, consider the schedule of Figure 15.6. After the execution of this schedule, we arrive at a state where the final values of accounts A and B are $950 and $2100, respectively. This final state is an inconsistent state, since we have gained $50 in the process of the concurrent execution. Indeed, the sum A + B is not preserved by the execution of the two transactions.
If control of concurrent execution is left entirely to the operating system, many possible schedules, including ones that leave the database in an inconsistent state, such as the one just described, are possible. It is the job of the database system to ensure that any schedule that gets executed will leave the database in a consistent state. The concurrency-control component of the database system carries out this task.
We can ensure consistency of the database under concurrent execution by making sure that any schedule that is executed has the same effect as a schedule that could have occurred without any concurrent execution. That is, the schedule should, in some sense, be equivalent to a serial schedule. We examine this idea in Section 15.5.
15.5 Serializability
The database system must control concurrent execution of transactions, to ensure that the database state remains consistent. Before we examine how the database
T1                        T2
read(A)
A := A – 50
                          read(A)
                          temp := A * 0.1
                          A := A – temp
                          write(A)
                          read(B)
write(A)
read(B)
B := B + 50
write(B)
                          B := B + temp
                          write(B)

Figure 15.6 Schedule 4 — a concurrent schedule.
system can carry out this task, we must first understand which schedules will ensure consistency, and which schedules will not.
Since transactions are programs, it is computationally difficult to determine exactly what operations a transaction performs and how operations of various transactions interact. For this reason, we shall not interpret the type of operations that a transaction can perform on a data item. Instead, we consider only two operations: read and write. We thus assume that, between a read(Q) instruction and a write(Q) instruction on a data item Q, a transaction may perform an arbitrary sequence of operations on the copy of Q that is residing in the local buffer of the transaction. Thus, the only significant operations of a transaction, from a scheduling point of view, are its read and write instructions. We shall therefore usually show only read and write instructions in schedules, as we do in schedule 3 in Figure 15.7.
In this section, we discuss different forms of schedule equivalence; they lead to the
notions of conflict serializability and view serializability.
T1                        T2
read(A)
write(A)
                          read(A)
                          write(A)
read(B)
write(B)
                          read(B)
                          write(B)

Figure 15.7 Schedule 3 — showing only the read and write instructions.
15.5.1 Conflict Serializability
Let us consider a schedule S in which there are two consecutive instructions Ii and Ij, of transactions Ti and Tj, respectively (i ≠ j). If Ii and Ij refer to different data items, then we can swap Ii and Ij without affecting the results of any instruction in the schedule. However, if Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we are dealing with only read and write instructions, there are four cases that we need to consider:
1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is read by Ti and Tj, regardless of the order.
2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written by Tj. Thus, the order of Ii and Ij matters.
3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the previous case.
4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these instructions does not affect either Ti or Tj. However, the value obtained by the next read(Q) instruction of S is affected, since the result of only the latter of the two write instructions is preserved in the database. If there is no other write(Q) instruction after Ii and Ij in S, then the order of Ii and Ij directly affects the final value of Q in the database state that results from schedule S.
Thus, only in the case where both Ii and Ij are read instructions does the relative order of their execution not matter.
We say that Ii and Ij conflict if they are operations by different transactions on the same data item, and at least one of these instructions is a write operation.
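This definition translates directly into code. A minimal Python sketch (the triple representation of instructions is an assumption made for illustration, not the book's notation):

```python
def conflicts(i1, i2):
    """Two instructions conflict iff they belong to different
    transactions, refer to the same data item, and at least one
    of them is a write. Instructions are represented here as
    (transaction, operation, item) triples."""
    t1, op1, x1 = i1
    t2, op2, x2 = i2
    return t1 != t2 and x1 == x2 and 'write' in (op1, op2)

# The examples discussed for schedule 3:
print(conflicts(('T1', 'write', 'A'), ('T2', 'read', 'A')))  # True
print(conflicts(('T2', 'write', 'A'), ('T1', 'read', 'B')))  # False: different items
```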
To illustrate the concept of conflicting instructions, we consider schedule 3, in Figure 15.7. The write(A) instruction of T1 conflicts with the read(A) instruction of T2. However, the write(A) instruction of T2 does not conflict with the read(B) instruction of T1, because the two instructions access different data items.
Let Ii and Ij be consecutive instructions of a schedule S. If Ii and Ij are instructions of different transactions and Ii and Ij do not conflict, then we can swap the order of Ii and Ij to produce a new schedule S′. We expect S to be equivalent to S′, since all instructions appear in the same order in both schedules except for Ii and Ij, whose order does not matter.
Since the write(A) instruction of T2 in schedule 3 of Figure 15.7 does not conflict with the read(B) instruction of T1, we can swap these instructions to generate an equivalent schedule, schedule 5, in Figure 15.8. Regardless of the initial system state, schedules 3 and 5 both produce the same final system state.
We continue to swap nonconflicting instructions:
• Swap the read(B) instruction of T1 with the read(A) instruction of T2.
• Swap the write(B) instruction of T1 with the write(A) instruction of T2.
• Swap the write(B) instruction of T1 with the read(A) instruction of T2.
T1                        T2
read(A)
write(A)
                          read(A)
read(B)
                          write(A)
write(B)
                          read(B)
                          write(B)

Figure 15.8 Schedule 5 — schedule 3 after swapping of a pair of instructions.
The final result of these swaps, schedule 6 of Figure 15.9, is a serial schedule. Thus, we have shown that schedule 3 is equivalent to a serial schedule. This equivalence implies that, regardless of the initial system state, schedule 3 will produce the same final state as will some serial schedule.
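The swap procedure used above can be mechanized: repeatedly exchange adjacent instructions of different transactions whenever they do not conflict, until the instructions of the desired first transaction all come earlier. A hedged Python sketch (the function and the triple representation are illustrative assumptions, not the book's algorithm):

```python
def conflicts(i1, i2):
    (t1, op1, x1), (t2, op2, x2) = i1, i2
    return t1 != t2 and x1 == x2 and 'write' in (op1, op2)

def try_serialize(schedule, order):
    """Bubble instructions of earlier transactions in `order` ahead
    of later ones by swapping adjacent nonconflicting pairs; returns
    the transformed schedule (serial if the swaps suffice)."""
    rank = {t: k for k, t in enumerate(order)}
    s = list(schedule)
    changed = True
    while changed:
        changed = False
        for i in range(len(s) - 1):
            a, b = s[i], s[i + 1]
            if rank[a[0]] > rank[b[0]] and not conflicts(a, b):
                s[i], s[i + 1] = b, a
                changed = True
    return s

# Schedule 3 (Figure 15.7), reads and writes only:
sched3 = [('T1', 'read', 'A'), ('T1', 'write', 'A'),
          ('T2', 'read', 'A'), ('T2', 'write', 'A'),
          ('T1', 'read', 'B'), ('T1', 'write', 'B'),
          ('T2', 'read', 'B'), ('T2', 'write', 'B')]

result = try_serialize(sched3, ['T1', 'T2'])
# All of T1's instructions now precede T2's, as in schedule 6:
print([t for t, _, _ in result])  # ['T1', 'T1', 'T1', 'T1', 'T2', 'T2', 'T2', 'T2']
```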
If a schedule S can be transformed into a schedule S′ by a series of swaps of nonconflicting instructions, we say that S and S′ are conflict equivalent.
In our previous examples, schedule 1 is not conflict equivalent to schedule 2. However, schedule 1 is conflict equivalent to schedule 3, because the read(B) and write(B) instructions of T1 can be swapped with the read(A) and write(A) instructions of T2.
The concept of conflict equivalence leads to the concept of conflict serializability. We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule. Thus, schedule 3 is conflict serializable, since it is conflict equivalent to the serial schedule 1.
Finally, consider schedule 7 of Figure 15.10; it consists of only the significant operations (that is, the read and write) of transactions T3 and T4. This schedule is not conflict serializable, since it is not equivalent to either the serial schedule <T3,T4> or the serial schedule <T4,T3>.
It is possible to have two schedules that produce the same outcome, but that are not conflict equivalent. For example, consider transaction T5, which transfers $10
T1                        T2
read(A)
write(A)
read(B)
write(B)
                          read(A)
                          write(A)
                          read(B)
                          write(B)

Figure 15.9 Schedule 6 — a serial schedule that is equivalent to schedule 3.
T3                        T4
read(Q)
                          write(Q)
write(Q)

Figure 15.10 Schedule 7.
from account B to account A. Let schedule 8 be as defined in Figure 15.11. We claim that schedule 8 is not conflict equivalent to the serial schedule <T1,T5>, since, in schedule 8, the write(B) instruction of T5 conflicts with the read(B) instruction of T1. Thus, we cannot move all the instructions of T1 before those of T5 by swapping consecutive nonconflicting instructions. However, the final values of accounts A and B after the execution of either schedule 8 or the serial schedule <T1,T5> are the same — $960 and $2040, respectively.
We can see from this example that there are less stringent definitions of schedule equivalence than conflict equivalence. For the system to determine that schedule 8 produces the same outcome as the serial schedule <T1,T5>, it must analyze the computation performed by T1 and T5, rather than just the read and write operations. In general, such analysis is hard to implement and is computationally expensive. However, there are other definitions of schedule equivalence based purely on the read and write operations. We will consider one such definition in the next section.
15.5.2 View Serializability
In this section, we consider a form of equivalence that is less stringent than conflict equivalence, but that, like conflict equivalence, is based on only the read and write operations of transactions.
T1                        T5
read(A)
A := A – 50
write(A)
                          read(B)
                          B := B – 10
                          write(B)
read(B)
B := B + 50
write(B)
                          read(A)
                          A := A + 10
                          write(A)

Figure 15.11 Schedule 8.
Consider two schedules S and S′, where the same set of transactions participates in both schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S′, also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of transaction Ti must, in schedule S′, also read the value of Q that was produced by the same write(Q) operation of transaction Tj.
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S′.
Conditions 1 and 2 ensure that each transaction reads the same values in both schedules and, therefore, performs the same computation. Condition 3, coupled with conditions 1 and 2, ensures that both schedules result in the same final system state.
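The three conditions can be checked mechanically for schedules given as read/write sequences. A simplified Python sketch (illustrative only; it matches each read by a (transaction, item, source-writer) triple, which suffices for the small examples of this chapter but glosses over transactions that read the same item more than once):

```python
def reads_from(schedule):
    """For each read, record which write produced the value read:
    a source of None means the read saw the initial value."""
    last_writer = {}
    sources = set()
    for t, op, x in schedule:
        if op == 'read':
            sources.add((t, x, last_writer.get(x)))
        else:
            last_writer[x] = t
    return sources

def final_writers(schedule):
    """The transaction performing the final write of each item."""
    last = {}
    for t, op, x in schedule:
        if op == 'write':
            last[x] = t
    return last

def view_equivalent(s1, s2):
    # Conditions 1 and 2: every read sees the same source write
    # (or the same initial value); condition 3: same final writers.
    return (reads_from(s1) == reads_from(s2)
            and final_writers(s1) == final_writers(s2))

# Schedule 9 (Figure 15.12) versus the serial schedule <T3, T4, T6>:
sched9  = [('T3', 'read', 'Q'), ('T4', 'write', 'Q'),
           ('T3', 'write', 'Q'), ('T6', 'write', 'Q')]
serial9 = [('T3', 'read', 'Q'), ('T3', 'write', 'Q'),
           ('T4', 'write', 'Q'), ('T6', 'write', 'Q')]
print(view_equivalent(sched9, serial9))   # True
```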
In our previous examples, schedule 1 is not view equivalent to schedule 2, since, in schedule 1, the value of account A read by transaction T2 was produced by T1, whereas this case does not hold in schedule 2. However, schedule 1 is view equivalent to schedule 3, because the values of accounts A and B read by transaction T2 were produced by T1 in both schedules.
The concept of view equivalence leads to the concept of view serializability. We say that a schedule S is view serializable if it is view equivalent to a serial schedule.
As an illustration, suppose that we augment schedule 7 with transaction T6, and obtain schedule 9 in Figure 15.12. Schedule 9 is view serializable. Indeed, it is view equivalent to the serial schedule <T3,T4,T6>, since the one read(Q) instruction reads the initial value of Q in both schedules, and T6 performs the final write of Q in both schedules.
Every conflict-serializable schedule is also view serializable, but there are view-serializable schedules that are not conflict serializable. Indeed, schedule 9 is not conflict serializable, since every pair of consecutive instructions conflicts, and, thus, no swapping of instructions is possible.
Observe that, in schedule 9, transactions T4 and T6 perform write(Q) operations without having performed a read(Q) operation. Writes of this sort are called blind writes. Blind writes appear in any view-serializable schedule that is not conflict serializable.
T3                        T4                        T6
read(Q)
                          write(Q)
write(Q)
                                                    write(Q)

Figure 15.12 Schedule 9 — a view-serializable schedule.
15.6 Recoverability
So far, we have studied what schedules are acceptable from the viewpoint of consistency of the database, assuming implicitly that there are no transaction failures. We now address the effect of transaction failures during concurrent execution.
If a transaction Ti fails, for whatever reason, we need to undo the effect of this transaction to ensure the atomicity property of the transaction. In a system that allows concurrent execution, it is necessary also to ensure that any transaction Tj that is dependent on Ti (that is, Tj has read data written by Ti) is also aborted. To achieve this, we need to place restrictions on the type of schedules permitted in the system.
In the following two subsections, we address the issue of what schedules are acceptable from the viewpoint of recovery from transaction failure. We describe in Chapter 16 how to ensure that only such acceptable schedules are generated.
15.6.1 Recoverable Schedules
Consider schedule 11 in Figure 15.13, in which T9 is a transaction that performs only one instruction: read(A). Suppose that the system allows T9 to commit immediately after executing the read(A) instruction. Thus, T9 commits before T8 does. Now suppose that T8 fails before it commits. Since T9 has read the value of data item A written by T8, we must abort T9 to ensure transaction atomicity. However, T9 has already committed and cannot be aborted. Thus, we have a situation where it is impossible to recover correctly from the failure of T8.
Schedule 11, with the commit happening immediately after the read(A) instruction, is an example of a nonrecoverable schedule, which should not be allowed. Most database systems require that all schedules be recoverable. A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
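This definition suggests a direct check on a completed schedule. A Python sketch (the instruction format, including explicit commit entries with a None item, is an assumption made for illustration):

```python
def recoverable(schedule):
    """Instructions are (transaction, operation, item) triples, with
    operation in {'read', 'write', 'commit'} and item None for commit.
    Recoverable: whenever Tj reads an item last written by Ti, and Tj
    commits, then Ti must have committed before Tj."""
    last_writer = {}
    deps = set()          # (writer, reader) read-from dependencies
    committed = []        # commit order
    for t, op, x in schedule:
        if op == 'write':
            last_writer[x] = t
        elif op == 'read':
            w = last_writer.get(x)
            if w is not None and w != t:
                deps.add((w, t))
        elif op == 'commit':
            committed.append(t)
    pos = {t: k for k, t in enumerate(committed)}
    return all(tj not in pos or (ti in pos and pos[ti] < pos[tj])
               for ti, tj in deps)

# Schedule 11 (Figure 15.13) with T9 committing before T8:
sched11 = [('T8', 'read', 'A'), ('T8', 'write', 'A'),
           ('T9', 'read', 'A'), ('T9', 'commit', None),
           ('T8', 'read', 'B'), ('T8', 'commit', None)]
print(recoverable(sched11))   # False: T9 read A written by T8
```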
15.6.2 Cascadeless Schedules
Even if a schedule is recoverable, to recover correctly from the failure of a transaction Ti, we may have to roll back several transactions. Such situations occur if transactions have read data written by Ti. As an illustration, consider the partial schedule
T8                        T9
read(A)
write(A)
                          read(A)
read(B)

Figure 15.13 Schedule 11.
T10                       T11                       T12
read(A)
read(B)
write(A)
                          read(A)
                          write(A)
                                                    read(A)

Figure 15.14 Schedule 12.
of Figure 15.14. Transaction T10 writes a value of A that is read by transaction T11. Transaction T11 writes a value of A that is read by transaction T12. Suppose that, at this point, T10 fails. T10 must be rolled back. Since T11 is dependent on T10, T11 must be rolled back. Since T12 is dependent on T11, T12 must be rolled back. This phenomenon, in which a single transaction failure leads to a series of transaction rollbacks, is called cascading rollback.
Cascading rollback is undesirable, since it leads to the undoing of a significant amount of work. It is desirable to restrict the schedules to those where cascading rollbacks cannot occur. Such schedules are called cascadeless schedules. Formally, a cascadeless schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj. It is easy to verify that every cascadeless schedule is also recoverable.
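The cascadeless condition is even easier to check, since it can be tested as the schedule unfolds: a read is permitted only if the last writer of the item has already committed. A hedged Python sketch (same illustrative instruction format as above, with explicit commit entries):

```python
def cascadeless(schedule):
    """Cascadeless: whenever a transaction reads an item previously
    written by another transaction, that writer has already committed
    at the time of the read."""
    last_writer = {}
    done = set()          # committed transactions so far
    for t, op, x in schedule:
        if op == 'write':
            last_writer[x] = t
        elif op == 'read':
            w = last_writer.get(x)
            if w is not None and w != t and w not in done:
                return False
        elif op == 'commit':
            done.add(t)
    return True

# Partial schedule 12 (Figure 15.14): T11 reads A written by the
# still-uncommitted T10, so a failure of T10 would cascade.
sched12 = [('T10', 'read', 'A'), ('T10', 'read', 'B'),
           ('T10', 'write', 'A'), ('T11', 'read', 'A'),
           ('T11', 'write', 'A'), ('T12', 'read', 'A')]
print(cascadeless(sched12))   # False
```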
15.7 Implementation of Isolation
So far, we have seen what properties a schedule must have if it is to leave the database in a consistent state and allow transaction failures to be handled in a safe manner. Specifically, schedules that are conflict or view serializable and cascadeless satisfy these requirements.
There are various concurrency-control schemes that we can use to ensure that, even when multiple transactions are executed concurrently, only acceptable schedules are generated, regardless of how the operating system time-shares resources (such as CPU time) among the transactions.
As a trivial example of a concurrency-control scheme, consider this scheme: A transaction acquires a lock on the entire database before it starts and releases the lock after it has committed. While a transaction holds a lock, no other transaction is allowed to acquire the lock, and all must therefore wait for the lock to be released. As a result of the locking policy, only one transaction can execute at a time. Therefore, only serial schedules are generated. These are trivially serializable, and it is easy to verify that they are cascadeless as well.
A concurrency-control scheme such as this one leads to poor performance, since it forces transactions to wait for preceding transactions to finish before they can start. In other words, it provides a poor degree of concurrency. As explained in Section 15.4, concurrent execution has several performance benefits.
The goal of concurrency-control schemes is to provide a high degree of concurrency, while ensuring that all schedules that can be generated are conflict or view serializable, and are cascadeless.
We study a number of concurrency-control schemes in Chapter 16. The schemes have different trade-offs in terms of the amount of concurrency they allow and the amount of overhead that they incur. Some of them allow only conflict-serializable schedules to be generated; others allow certain view-serializable schedules that are not conflict-serializable to be generated.
15.8 Transaction Definition in SQL
A data-manipulation language must include a construct for specifying the set of actions that constitute a transaction.
The SQL standard specifies that a transaction begins implicitly. Transactions are ended by one of these SQL statements:
• Commit work commits the current transaction and begins a new one.
• Rollback work causes the current transaction to abort.
The keyword work is optional in both the statements. If a program terminates without either of these commands, the updates are either committed or rolled back — which of the two happens is not specified by the standard and depends on the implementation.
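The same discipline of an implicitly begun transaction ended by an explicit commit or rollback appears in most database APIs. A minimal sketch using Python's built-in sqlite3 module (illustrative; sqlite3 is one implementation and not the SQL standard itself):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)')
conn.execute("INSERT INTO account VALUES ('A', 1000), ('B', 2000)")
conn.commit()                     # like COMMIT WORK: ends the transaction

try:
    # A new transaction begins implicitly with the first modifying statement.
    conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    conn.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
    conn.commit()                 # make both updates durable together
except sqlite3.Error:
    conn.rollback()               # like ROLLBACK WORK: aborts the transaction

# The sum A + B is preserved whichever branch was taken:
print(conn.execute('SELECT sum(balance) FROM account').fetchone()[0])  # 3000
```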
The standard also specifies that the system must ensure both serializability and freedom from cascading rollback. The definition of serializability used by the standard is that a schedule must have the same effect as would some serial schedule. Thus, conflict and view serializability are both acceptable.
The SQL-92 standard also allows a transaction to specify that it may be executed in a manner that causes it to become nonserializable with respect to other transactions. We study such weaker levels of consistency in Section 16.8.
15.9 Testing for Serializability
When designing concurrency-control schemes, we must show that schedules generated by the scheme are serializable. To do that, we must first understand how to determine, given a particular schedule S, whether the schedule is serializable.
We now present a simple and efficient method for determining conflict serializability of a schedule. Consider a schedule S. We construct a directed graph, called a precedence graph, from S. This graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set of vertices consists of all the transactions participating in the schedule. The set of edges consists of all edges Ti → Tj for which one of three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
Figure 15.15 Precedence graph for (a) schedule 1 and (b) schedule 2.
If an edge Ti → Tj exists in the precedence graph, then, in any serial schedule S′ equivalent to S, Ti must appear before Tj.
For example, the precedence graph for schedule 1 in Figure 15.15a contains the single edge T1 → T2, since all the instructions of T1 are executed before the first instruction of T2 is executed. Similarly, Figure 15.15b shows the precedence graph for schedule 2 with the single edge T2 → T1, since all the instructions of T2 are executed before the first instruction of T1 is executed.
The precedence graph for schedule 4 appears in Figure 15.16. It contains the edge T1 → T2, because T1 executes read(A) before T2 executes write(A). It also contains the edge T2 → T1, because T2 executes read(B) before T1 executes write(B).
If the precedence graph for S has a cycle, then schedule S is not conflict serializable.
If the graph contains no cycles, then the schedule S is conflict serializable.
A serializability order of the transactions can be obtained through topological sorting, which determines a linear order consistent with the partial order of the precedence graph. There are, in general, several possible linear orders that can be obtained through a topological sorting. For example, the graph of Figure 15.17a has the two acceptable linear orderings shown in Figures 15.17b and 15.17c.
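The whole test, building the precedence graph, detecting cycles, and extracting a serializability order by topological sorting, can be sketched in a few lines of Python (an illustration; the quadratic edge construction is chosen for clarity, not efficiency):

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Edge Ti -> Tj whenever an operation of Ti conflicts with a
    later operation of Tj: read/write, write/read, or write/write
    on the same item by different transactions."""
    txns = {t for t, _, _ in schedule}
    edges = set()
    for i, (ti, op1, x1) in enumerate(schedule):
        for tj, op2, x2 in schedule[i + 1:]:
            if ti != tj and x1 == x2 and 'write' in (op1, op2):
                edges.add((ti, tj))
    return txns, edges

def serialize_order(txns, edges):
    """Topological sort of the precedence graph; returns None if the
    graph has a cycle (the schedule is not conflict serializable)."""
    indeg = {t: 0 for t in txns}
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        indeg[b] += 1
    order = []
    ready = sorted(t for t in txns if indeg[t] == 0)
    while ready:
        t = ready.pop()
        order.append(t)
        for u in sorted(succ[t]):
            indeg[u] -= 1
            if indeg[u] == 0:
                ready.append(u)
    return order if len(order) == len(txns) else None

# Schedule 3 (Figure 15.7) is conflict serializable as <T1, T2>:
sched3 = [('T1', 'read', 'A'), ('T1', 'write', 'A'),
          ('T2', 'read', 'A'), ('T2', 'write', 'A'),
          ('T1', 'read', 'B'), ('T1', 'write', 'B'),
          ('T2', 'read', 'B'), ('T2', 'write', 'B')]
print(serialize_order(*precedence_graph(sched3)))   # ['T1', 'T2']

# Schedule 7 (Figure 15.10) has the cycle T3 -> T4 -> T3:
sched7 = [('T3', 'read', 'Q'), ('T4', 'write', 'Q'), ('T3', 'write', 'Q')]
print(serialize_order(*precedence_graph(sched7)))   # None
```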
Thus, to test for conflict serializability, we need to construct the precedence graph and to invoke a cycle-detection algorithm. Cycle-detection algorithms can be found in standard textbooks on algorithms. Cycle-detection algorithms, such as those based on depth-first search, require on the order of n² operations, where n is the number of vertices in the graph (that is, the number of transactions). Thus, we have a practical scheme for determining conflict serializability.
Returning to our previous examples, note that the precedence graphs for schedules 1 and 2 (Figure 15.15) indeed do not contain cycles. The precedence graph for schedule 4 (Figure 15.16), on the other hand, contains a cycle, indicating that this schedule is not conflict serializable.
Testing for view serializability is rather complicated. In fact, it has been shown that the problem of testing for view serializability is itself NP-complete. Thus, almost certainly there exists no efficient algorithm to test for view serializability. See
Figure 15.16 Precedence graph for schedule 4.
Figure 15.17 Illustration of topological sorting.
the bibliographical notes for references on testing for view serializability. However, concurrency-control schemes can still use sufficient conditions for view serializability. That is, if the sufficient conditions are satisfied, the schedule is view serializable, but there may be view-serializable schedules that do not satisfy the sufficient conditions.
15.10 Summary
• A transaction is a unit of program execution that accesses and possibly updates various data items. Understanding the concept of a transaction is critical for understanding and implementing updates of data in a database, in such a way that concurrent executions and failures of various forms do not result in the database becoming inconsistent.
• Transactions are required to have the ACID properties: atomicity, consistency, isolation, and durability.
Atomicity ensures that either all the effects of a transaction are reflected in the database, or none are; a failure cannot leave the database in a state where a transaction is partially executed.
Consistency ensures that, if the database is initially consistent, the execution of the transaction (by itself) leaves the database in a consistent state.
Isolation ensures that concurrently executing transactions are isolated from one another, so that each has the impression that no other transaction is executing concurrently with it.
Durability ensures that, once a transaction has been committed, that transaction's updates do not get lost, even if there is a system failure.
• Concurrent execution of transactions improves throughput of transactions and system utilization, and also reduces waiting time of transactions.
• When several transactions execute concurrently in the database, the consistency of data may no longer be preserved. It is therefore necessary for the system to control the interaction among the concurrent transactions.
Since a transaction is a unit that preserves consistency, a serial execution of transactions guarantees that consistency is preserved.
A schedule captures the key actions of transactions that affect concurrent execution, such as read and write operations, while abstracting away internal details of the execution of the transaction.
We require that any schedule produced by concurrent processing of a set of transactions will have an effect equivalent to a schedule produced when these transactions are run serially in some order.
A system that guarantees this property is said to ensure serializability.
There are several different notions of equivalence leading to the concepts
of conflict serializability and view serializability.
• Serializability of schedules generated by concurrently executing transactions
can be ensured through one of a variety of mechanisms called
concurrency-control schemes.
• Schedules must be recoverable, to make sure that if transaction a sees the effects of transaction b, and b then aborts, then a also gets aborted.
• Schedules should preferably be cascadeless, so that the abort of a transaction does not result in cascading aborts of other transactions. Cascadelessness is ensured by allowing transactions to only read committed data.
• The concurrency-control–management component of the database is responsible for handling the concurrency-control schemes. Chapter 16 describes concurrency-control schemes.
• The recovery-management component of a database is responsible for ensuring the atomicity and durability properties of transactions.
The shadow copy scheme is used for ensuring atomicity and durability in text editors; however, it has extremely high overheads when used for database systems, and, moreover, it does not support concurrent execution. Chapter 17 covers better schemes.
• We can test a given schedule for conflict serializability by constructing a precedence graph for the schedule, and by searching for absence of cycles in the graph. However, there are more efficient concurrency-control schemes for ensuring serializability.

Review Terms
• Transaction
• ACID properties
  Atomicity
  Consistency
  Isolation
  Durability
• Inconsistent state
• Transaction state
  Active
  Partially committed
  Failed
  Aborted
  Committed
  Terminated
• Transaction
  Restart
  Kill
• Observable external writes
• Shadow copy scheme
Exercises

15.1 List the ACID properties. Explain the usefulness of each.
15.2 Suppose that there is a database system that never fails. Is a recovery manager required for this system?
15.3 Consider a file system such as the one on your favorite operating system.
a. What are the steps involved in creation and deletion of files, and in writing data to a file?
b. Explain how the issues of atomicity and durability are relevant to the creation and deletion of files, and to writing data to files.
15.4 Database-system implementers have paid much more attention to the ACID properties than have file-system implementers. Why might this be the case?
15.5 During its execution, a transaction passes through several states, until it finally commits or aborts. List all possible sequences of states through which a transaction may pass. Explain why each state transition may occur.
15.6 Justify the following statement: Concurrent execution of transactions is more important when data must be fetched from (slow) disk or when transactions are long, and is less important when data is in memory and transactions are very short.
15.7 Explain the distinction between the terms serial schedule and serializable schedule.
15.8 Consider the following two transactions: