
Database course report: Query optimization


Page 2

• Introduction to Query Processing
• Translating SQL Queries into Relational Algebra
• Rules for equivalent RAEs
• Using Heuristics in Query Optimization
• Cost-based query optimization
• Summary


Page 4

Introduction to Query Processing

• Query processing: the process by which query results are retrieved from a query written in a high-level language such as SQL (or OQL in an ODBMS).

• Query optimization: the process of choosing a suitable execution strategy for processing a query.

• Two internal representations of a query:

– Query tree
– Query graph

Page 5

Processing a high-level query

1 Query in a high-level language
2 Scanning, parsing, and validating, which produces the intermediate form of the query
3 Query optimizer, which produces the execution plan
4 Query code generator, which produces the code to execute the query
5 Runtime database processor, which produces the result of the query


Page 8

Translating SQL Queries into Relational Algebra

SELECT A1, A2, … FROM R
corresponds to $\pi_{A_1, A_2, \ldots}(R)$

SELECT * FROM R, S
WHERE c
corresponds to $R \bowtie_c S$

Page 9

Translating SQL Queries into Relational Algebra

• Query block: the basic unit that can be translated into the algebraic operators and optimized.

• A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses if these are part of the block.

• Nested queries within a query are identified as separate query blocks.

• Aggregate operators (MAX, MIN, SUM, and COUNT) in SQL must be included in the extended algebra.

Page 10

SELECT LNAME, FNAME


Page 12

Query Trees and Query Graphs

Query tree:

• A tree data structure that corresponds to a relational algebra expression.

• It represents the input relations of the query as leaf nodes of the tree, and the relational algebra operations as internal nodes.

• An execution of the query tree consists of executing an internal node operation whenever its operands are available, and then replacing that internal node by the relation that results from executing the operation.

Page 14

Example: For every project located in 'Stafford', retrieve the project number, the controlling department number, and the department manager's last name, address, and birthdate.

• Relational algebra:

$\pi_{\text{PNUMBER, DNUM, LNAME, ADDRESS, BDATE}}(((\sigma_{\text{PLOCATION='STAFFORD'}}(\text{PROJECT})) \bowtie_{\text{DNUM=DNUMBER}} (\text{DEPARTMENT})) \bowtie_{\text{MGRSSN=SSN}} (\text{EMPLOYEE}))$

• SQL query:

Q2: SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
    FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
    WHERE P.DNUM=D.DNUMBER AND D.MGRSSN=E.SSN AND
          P.PLOCATION='STAFFORD';
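The execution model described for query trees (evaluate a node once its operands are available, then replace it by its result) can be sketched in Python. This is only an illustrative sketch: the class names `Leaf`, `Select`, `Project`, `Join`, the function `execute`, and the sample rows are all assumptions made up for this example, not part of the slides. Projection here keeps duplicates for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Row = Dict[str, object]
Relation = List[Row]


@dataclass
class Leaf:
    """Leaf node: an input relation of the query."""
    name: str
    rows: Relation


@dataclass
class Select:
    """Internal node: selection of the child rows that satisfy predicate."""
    predicate: Callable[[Row], bool]
    child: "Node"


@dataclass
class Project:
    """Internal node: projection of the child onto attrs (duplicates kept)."""
    attrs: List[str]
    child: "Node"


@dataclass
class Join:
    """Internal node: equijoin of left and right on left_attr = right_attr."""
    left_attr: str
    right_attr: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, Select, Project, Join]


def execute(node: Node) -> Relation:
    """Evaluate the operands first, then produce this node's relation."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [r for r in execute(node.child) if node.predicate(r)]
    if isinstance(node, Project):
        return [{a: r[a] for a in node.attrs} for r in execute(node.child)]
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right
                if l[node.left_attr] == r[node.right_attr]]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical sample rows, made up for illustration only.
PROJECT = [
    {"PNUMBER": 10, "PLOCATION": "Stafford", "DNUM": 4},
    {"PNUMBER": 20, "PLOCATION": "Houston", "DNUM": 5},
]
DEPARTMENT = [
    {"DNUMBER": 4, "MGRSSN": "987"},
    {"DNUMBER": 5, "MGRSSN": "333"},
]
EMPLOYEE = [
    {"SSN": "987", "LNAME": "Wallace", "ADDRESS": "Bellaire", "BDATE": "1941-06-20"},
    {"SSN": "333", "LNAME": "Wong", "ADDRESS": "Houston", "BDATE": "1955-12-08"},
]

# Query tree for Q2: the select sits directly above the PROJECT leaf.
q2_tree = Project(
    ["PNUMBER", "DNUM", "LNAME", "ADDRESS", "BDATE"],
    Join("MGRSSN", "SSN",
         Join("DNUM", "DNUMBER",
              Select(lambda r: r["PLOCATION"] == "Stafford",
                     Leaf("PROJECT", PROJECT)),
              Leaf("DEPARTMENT", DEPARTMENT)),
         Leaf("EMPLOYEE", EMPLOYEE)),
)

q2_result = execute(q2_tree)
```

With these sample rows, `q2_result` holds a single row for project 10, managed by Wallace.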


Page 18

Equivalent Relational Expressions

• Two relational algebra expressions are equivalent if they produce the same results (tuples) on the same input relations.

– Their tuples or attributes may nonetheless be ordered differently.

• An equivalence rule says that expressions of two forms are equivalent.

– An expression of the first form can be replaced by one of the second form, or vice versa.

Page 19

Rules for equivalent RAEs

1 Cascade of σ: A conjunctive selection condition can be broken up into a cascade (that is, a sequence) of individual σ operations:

$\sigma_{c_1 \wedge c_2 \wedge \cdots \wedge c_n}(R) \equiv \sigma_{c_1}(\sigma_{c_2}(\cdots(\sigma_{c_n}(R))\cdots))$

Page 20

2 Commutativity of σ: The σ operation is commutative:

$\sigma_{c_1}(\sigma_{c_2}(R)) \equiv \sigma_{c_2}(\sigma_{c_1}(R))$

3 Cascade of π:

$\pi_{L_1}(\pi_{L_2}(\cdots(\pi_{L_n}(R))\cdots)) \equiv \pi_{L_1}(R)$

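Page 21

4 Commuting σ with π: If the selection condition $c$ involves only the attributes $A_1, A_2, \ldots, A_n$ in the projection list, the two operations can be commuted:

$\pi_{A_1, A_2, \ldots, A_n}(\sigma_c(R)) \equiv \sigma_c(\pi_{A_1, A_2, \ldots, A_n}(R))$

5 Commutativity of ⋈ (and ×):

$R \bowtie_\theta S \equiv S \bowtie_\theta R$

$R \times S \equiv S \times R$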

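Page 22

6 Commuting σ with ⋈ (or ×):

If all the attributes in the selection condition $c$ involve only the attributes of $R$:

$\sigma_c(R \bowtie_\theta S) \equiv (\sigma_c(R)) \bowtie_\theta S$

If $c = c_1 \wedge c_2$, where $c_1$ involves only the attributes of $R$ and $c_2$ involves only the attributes of $S$:

$\sigma_{c_1 \wedge c_2}(R \bowtie_\theta S) \equiv (\sigma_{c_1}(R)) \bowtie_\theta (\sigma_{c_2}(S))$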

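Page 23

7 Commuting π with ⋈ (or ×): Let $L = L_1 \cup L_2$, where $L_1$ contains only attributes of $R$ and $L_2$ contains only attributes of $S$.

If the join condition $\theta$ involves only attributes in $L$:

$\pi_L(R \bowtie_\theta S) \equiv (\pi_{L_1}(R)) \bowtie_\theta (\pi_{L_2}(S))$

If $\theta$ also involves attributes $L_3$ of $R$ and $L_4$ of $S$ that are not in $L$, they must be kept for the join and projected away afterwards:

$\pi_L(R \bowtie_\theta S) \equiv \pi_L((\pi_{L_1 \cup L_3}(R)) \bowtie_\theta (\pi_{L_2 \cup L_4}(S)))$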

Page 25

8 Commutativity of set operations: ∪ and ∩ are commutative:

$R \cup S \equiv S \cup R$

$R \cap S \equiv S \cap R$

9 Associativity of ⋈, ×, ∪, and ∩: if $\circ$ stands for any one of these four operations:

$(R \circ S) \circ T \equiv R \circ (S \circ T)$

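Page 26

Example: for the join of R, S, and T, associativity gives $(R \bowtie S) \bowtie T \equiv R \bowtie (S \bowtie T)$.

• If $R \bowtie S$ is better (for example, smaller) than $S \bowtie T$, then execute $R \bowtie S$ first (choose a join order).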

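Page 27

10 Commuting σ with set operations (∪, ∩, and −): if $\circ$ stands for any one of these three operations:

$\sigma_c(R \circ S) \equiv (\sigma_c(R)) \circ (\sigma_c(S))$

11 The π operation commutes with ∪:

$\pi_L(R \cup S) \equiv (\pi_L(R)) \cup (\pi_L(S))$

12 Converting a (σ, ×) sequence into ⋈: if the condition $c$ of a σ that follows a × corresponds to a join condition:

$\sigma_c(R \times S) \equiv R \bowtie_c S$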
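A quick way to build intuition for these rules is to check a few of them on small relations. The sketch below does this for rules 1, 2, 4, 5, 6, and 12. The helper names (`as_set`, `select`, `project`, `cross`, `join`) and the sample relations `R` and `S` are assumptions made up for this illustration; passing on one data set illustrates a rule but does not prove it.

```python
from itertools import product


def as_set(rel):
    """Compare relations as sets of rows, so tuple order does not matter."""
    return {frozenset(row.items()) for row in rel}


def select(pred, rel):
    return [row for row in rel if pred(row)]


def project(attrs, rel):
    """Projection with set semantics: duplicate rows are removed."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def cross(r, s):
    return [{**a, **b} for a, b in product(r, s)]


def join(pred, r, s):
    return [{**a, **b} for a, b in product(r, s) if pred({**a, **b})]


# Made-up sample relations with disjoint attribute names.
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
S = [{"c": 1, "d": "x"}, {"c": 3, "d": "y"}, {"c": 4, "d": "z"}]

c1 = lambda t: t["a"] >= 2          # involves only attributes of R
c2 = lambda t: t["b"] < 30          # involves only attributes of R
theta = lambda t: t["a"] == t["c"]  # join condition

# Rule 1: cascade of sigma.
rule1 = as_set(select(lambda t: c1(t) and c2(t), R)) == as_set(select(c1, select(c2, R)))
# Rule 2: commutativity of sigma.
rule2 = as_set(select(c1, select(c2, R))) == as_set(select(c2, select(c1, R)))
# Rule 4: sigma commutes with pi when the condition uses only projected attributes.
rule4 = as_set(project(["a"], select(c1, R))) == as_set(select(c1, project(["a"], R)))
# Rule 5: commutativity of join.
rule5 = as_set(join(theta, R, S)) == as_set(join(theta, S, R))
# Rule 6: a selection on R's attributes can be pushed below the join.
rule6 = as_set(select(c1, join(theta, R, S))) == as_set(join(theta, select(c1, R), S))
# Rule 12: a selection over a Cartesian product is a join.
rule12 = as_set(select(theta, cross(R, S))) == as_set(join(theta, R, S))
```

Comparing relations as sets of rows mirrors the definition of equivalence: the same tuples, regardless of order.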


Page 29

Using Heuristics in Query Optimization

Two main approaches to choosing an execution plan:

• Reduce the number of operations …

• Estimate the cost of each operation …

Page 30

• Each relational algebra expression (E) is represented by a query tree (Q).


Page 33

Outline of a Heuristic Algebraic Optimization Algorithm

1 Break up any SELECT operations with conjunctive conditions into a cascade of SELECT operations (Rule 1).

2 Move each SELECT operation as far down the query tree as is permitted by the attributes involved in the select condition (Rules 2, 4, 6, and 10).

Page 34

3 Rearrange the leaf nodes of the tree using the following criteria (Rules 5 and 9, concerning commutativity and associativity of binary operations):

– Position the leaf node relations with the most restrictive SELECT operations so that they are executed first in the query tree representation.
– Make sure that the ordering of leaf nodes does not cause CARTESIAN PRODUCT operations.

Page 35

4 Combine a CARTESIAN PRODUCT operation with a subsequent SELECT operation in the tree into a JOIN operation, if the condition represents a join condition (Rule 12).

5 Break down and move lists of projection attributes down the tree as far as possible (Rules 3, 4, 7, and 11).

Page 36

6 Identify subtrees that represent groups of operations that can be executed by a single algorithm.
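To see why steps 1 to 4 of the outline matter, the sketch below evaluates one query in two ways and records the size of every intermediate result. It is a hand-written comparison, not an implementation of the algorithm: the relation contents and the function names `naive_plan` and `heuristic_plan` are assumptions made up for this illustration. The query asks for the names of projects controlled by a department whose manager is named Smith.

```python
from itertools import product

# Made-up sample relations, for illustration only.
PROJECT = [{"Pname": f"P{i}", "Dnum": i % 3 + 1} for i in range(6)]
DEPARTMENT = [{"Dnumber": d, "Mgr_ssn": 100 + d} for d in (1, 2, 3)]
EMPLOYEE = [{"Ssn": 100 + e, "Lname": "Smith" if e == 2 else f"Emp{e}"}
            for e in range(1, 9)]


def naive_plan():
    """Cartesian products first, the whole conjunctive selection last."""
    sizes = []
    pd = [{**p, **d} for p, d in product(PROJECT, DEPARTMENT)]
    sizes.append(len(pd))
    pde = [{**t, **e} for t, e in product(pd, EMPLOYEE)]
    sizes.append(len(pde))
    hits = [t for t in pde
            if t["Dnum"] == t["Dnumber"] and t["Mgr_ssn"] == t["Ssn"]
            and t["Lname"] == "Smith"]
    return sorted({t["Pname"] for t in hits}), sizes


def heuristic_plan():
    """Selection pushed down to EMPLOYEE, products replaced by joins."""
    sizes = []
    # Steps 1-3: split the condition, push Lname='Smith' down, and start
    # from the leaf with the most restrictive selection.
    smiths = [e for e in EMPLOYEE if e["Lname"] == "Smith"]
    sizes.append(len(smiths))
    # Step 4: product followed by a join condition becomes a join.
    ed = [{**e, **d} for e in smiths for d in DEPARTMENT
          if d["Mgr_ssn"] == e["Ssn"]]
    sizes.append(len(ed))
    edp = [{**t, **p} for t in ed for p in PROJECT
           if p["Dnum"] == t["Dnumber"]]
    sizes.append(len(edp))
    return sorted({t["Pname"] for t in edp}), sizes


naive_result, naive_sizes = naive_plan()
fast_result, fast_sizes = heuristic_plan()
```

Both plans return the same project names, but the naive plan materializes 18 and then 144 rows, while the rewritten plan never holds more than 2.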

Page 38

SELECT Pname
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith'

$\pi_{\text{Pname}}(\sigma_{(\text{Dnum=Dnumber}) \wedge (\text{Mgr\_ssn=Ssn}) \wedge (\text{Lname='Smith'})}(P \times D \times E))$

Page 46

Cost-based query optimization

Page 47

The cost of executing a query includes the following components:

• Access cost to secondary storage
• Disk storage cost
• Computation cost
• Memory usage cost
• Communication cost

Page 48

• The cost of an operation depends on the size and other statistics of its inputs.

• Statistics about database relations are stored in the database-system catalog.

• These statistics are used to estimate statistics on the results of various relational operations.

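Page 49

Catalog Information

• $n_r$: number of tuples in a relation r.
• $b_r$: number of blocks containing tuples of r.
• $s_r$: size of a tuple of r.
• $f_r$: blocking factor of r, that is, the number of tuples of r that fit into one block.
• $V(A, r)$: number of distinct values that appear in r for attribute A; same as the size of $\pi_A(r)$.
• If the tuples of r are stored together physically in a file, then:

$b_r = \lceil n_r / f_r \rceil$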

Page 50

Selection Size Estimation

The size estimate of the result of a selection operation depends on the selection predicate:

• single equality predicate
• single comparison predicate
• combinations of predicates

Page 51

Equality selection $\sigma_{A=v}(r)$

• $SC(A, r)$: number of records that will satisfy the selection.

• $\lceil SC(A, r) / f_r \rceil$: number of blocks that these records will occupy.

• E.g. the binary search cost estimate becomes:

$E = \lceil \log_2(b_r) \rceil + \lceil SC(A, r) / f_r \rceil - 1$

• Equality condition on a key attribute: $SC(A, r) = 1$
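A small numeric sketch may make the equality-selection estimate concrete. It assumes the common textbook form of the binary search estimate, $E = \lceil \log_2(b_r) \rceil + \lceil SC(A, r) / f_r \rceil - 1$, and, for non-key attributes, the uniform-distribution assumption $SC(A, r) = n_r / V(A, r)$, which the slide does not state. The function names and the sample statistics are made up for illustration.

```python
import math


def blocks_for(n_tuples: int, f_r: int) -> int:
    """Blocks needed to hold n_tuples when f_r tuples fit in one block."""
    return math.ceil(n_tuples / f_r)


def equality_selection_size(n_r: int, distinct_values: int,
                            is_key: bool = False) -> float:
    """SC(A, r): 1 for a key attribute.

    Assumption (not on the slide): for a non-key attribute, values are
    uniformly distributed, so SC(A, r) = n_r / V(A, r).
    """
    return 1 if is_key else n_r / distinct_values


def binary_search_cost(b_r: int, sc: float, f_r: int) -> int:
    """Block accesses: find the first match, then read the remaining blocks."""
    return math.ceil(math.log2(b_r)) + math.ceil(sc / f_r) - 1


# Hypothetical catalog statistics.
n_r, f_r, v_a = 10_000, 20, 50
b_r = blocks_for(n_r, f_r)                    # 10000 / 20 = 500 blocks
sc = equality_selection_size(n_r, v_a)        # 10000 / 50 = 200 records
cost = binary_search_cost(b_r, sc, f_r)       # 9 + 10 - 1 = 18 block accesses
key_cost = binary_search_cost(b_r, 1, f_r)    # 9 + 1 - 1 = 9 block accesses
```

On a key attribute the second term collapses to a single block, so the cost is just the binary search itself.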

Page 52

Selections Involving Comparisons

• Selections of the form $\sigma_{A \le v}(r)$ (the case of $\sigma_{A \ge v}(r)$ is symmetric).

• Let c denote the estimated number of tuples satisfying the condition.

• If min(A, r) and max(A, r) are available in the catalog:

– $c = 0$ if $v < \min(A, r)$
– $c = n_r \cdot \dfrac{v - \min(A, r)}{\max(A, r) - \min(A, r)}$ otherwise

• In the absence of statistical information, c is assumed to be $n_r / 2$.
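The comparison estimate can be written as a short function. This is a sketch under stated assumptions: the function name `comparison_selection_size` is made up, the formula used is the linear interpolation $c = n_r \cdot (v - \min(A, r)) / (\max(A, r) - \min(A, r))$, and the branch for $v \ge \max(A, r)$, which the slide does not mention, simply returns $n_r$.

```python
def comparison_selection_size(n_r, v, min_a=None, max_a=None):
    """Estimated number of tuples of r that satisfy A <= v."""
    if min_a is None or max_a is None:
        return n_r / 2          # no statistics available in the catalog
    if v < min_a:
        return 0                # no tuple can qualify
    if v >= max_a:
        return n_r              # assumption: every tuple qualifies
    # Values assumed to be spread evenly between min and max.
    return n_r * (v - min_a) / (max_a - min_a)


# Hypothetical statistics: 1000 tuples, A ranges over [10, 110].
below_range = comparison_selection_size(1000, 5, 10, 110)
mid_range = comparison_selection_size(1000, 60, 10, 110)
no_stats = comparison_selection_size(1000, 60)
```

The early return for `v >= max_a` also avoids a division by zero when the minimum and maximum coincide.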

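Page 53

Complex Selections

• The selectivity of a condition $\theta_i$ is the probability that a tuple in the relation r satisfies $\theta_i$. If $s_i$ is the number of satisfying tuples in r, the selectivity of $\theta_i$ is given by $s_i / n_r$.

• Conjunction $\sigma_{\theta_1 \wedge \theta_2 \wedge \cdots \wedge \theta_n}(r)$: assuming the conditions are independent, the estimated number of tuples in the result is

$n_r \cdot \dfrac{s_1 \cdot s_2 \cdots s_n}{n_r^{\,n}}$

Page 54

• Disjunction $\sigma_{\theta_1 \vee \theta_2 \vee \cdots \vee \theta_n}(r)$: the estimated number of tuples in the result is

$n_r \cdot \left(1 - \left(1 - \dfrac{s_1}{n_r}\right) \cdot \left(1 - \dfrac{s_2}{n_r}\right) \cdots \left(1 - \dfrac{s_n}{n_r}\right)\right)$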
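The selectivity-based estimates can be tried out with a few lines of Python. This sketch assumes the usual textbook formulas, which treat the conditions as independent: $n_r \cdot (s_1 s_2 \cdots s_n) / n_r^n$ for a conjunction and $n_r \cdot (1 - \prod_i (1 - s_i / n_r))$ for a disjunction. The function names and sample numbers are made up for illustration.

```python
from math import prod


def conjunction_size(n_r, sizes):
    """Estimated result size when all conditions must hold.

    Multiplies the selectivities s_i / n_r, assuming independence.
    """
    return n_r * prod(s / n_r for s in sizes)


def disjunction_size(n_r, sizes):
    """Estimated result size when at least one condition must hold.

    One minus the probability that a tuple fails every condition,
    again assuming independence.
    """
    return n_r * (1 - prod(1 - s / n_r for s in sizes))


# Hypothetical statistics: 1000 tuples; the two conditions are satisfied
# by 100 and 500 tuples respectively.
both = conjunction_size(1000, [100, 500])    # about 50 tuples
either = disjunction_size(1000, [100, 500])  # about 550 tuples
```

With a single condition, both functions reduce to $s_1$, which is a useful sanity check on the formulas.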

Page 55

Estimation of the Size of Joins

Size of Cartesian product:

• The Cartesian product $r \times s$ contains $n_r \cdot n_s$ tuples.
• Each tuple of $r \times s$ occupies $s_r + s_s$ bytes.

Page 56

Estimation of the Size of Joins

Let r(R) and s(S) be relations:

• If $R \cap S = \emptyset$, then $r \bowtie s$ is the same as $r \times s$.

• If $R \cap S$ is a key for R, then a tuple of s will join with at most one tuple from r.
-> The number of tuples in $r \bowtie s$ is no greater than the number of tuples in s.

Page 57

• If $R \cap S$ is a foreign key in S referencing R, then the number of tuples in $r \bowtie s$ is exactly the same as the number of tuples in s.

Page 58

Estimation of the Size of Joins

• If $R \cap S = \{A\}$ is not a key for R or S: if every tuple t in r produces tuples in $r \bowtie s$, the number of tuples in $r \bowtie s$ is estimated to be

$\dfrac{n_r \cdot n_s}{V(A, s)}$

Page 59

• If the reverse is true, the estimate obtained is

$\dfrac{n_r \cdot n_s}{V(A, r)}$

-> The lower of these two estimates is probably the more accurate one.
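The join-size cases can be collected into one function. This is a sketch only: the function name `estimate_join_size` and its keyword parameters are made up for illustration, and the general case uses the two estimates $n_r \cdot n_s / V(A, s)$ and $n_r \cdot n_s / V(A, r)$, returning the lower one.

```python
def estimate_join_size(n_r, n_s, *, common_attrs=True, key_of_r=False,
                       foreign_key_in_s=False, v_a_r=None, v_a_s=None):
    """Estimated number of tuples in the natural join of r and s."""
    if not common_attrs:
        return n_r * n_s        # no common attributes: Cartesian product
    if foreign_key_in_s:
        return n_s              # every tuple of s matches exactly one of r
    if key_of_r:
        return n_s              # upper bound: at most one match per s tuple
    # General case: take the lower of the two estimates.
    return min(n_r * n_s / v_a_s, n_r * n_s / v_a_r)


# Hypothetical statistics: 5000 tuples in r, 10000 tuples in s.
cartesian = estimate_join_size(5000, 10000, common_attrs=False)
fk_join = estimate_join_size(5000, 10000, foreign_key_in_s=True)
general = estimate_join_size(5000, 10000, v_a_r=5000, v_a_s=2500)
```

In the general case here the two candidate estimates are 20000 and 10000 tuples, so the function returns the lower one.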

Page 60

Summary

Page 61

The main heuristic is to apply first the operations that reduce the size of intermediate results by:

• performing SELECT operations as early as possible to reduce the number of tuples
• performing PROJECT operations as early as possible to reduce the number of attributes

Page 62

• The SELECT and JOIN operations that are most restrictive (that is, result in relations with the fewest tuples or with the smallest absolute size) should be executed before other similar operations.

• This is done by reordering the leaf nodes of the tree among themselves while avoiding Cartesian products.


Posted: 02/11/2014, 15:52
