Distributed Database Management Systems: Lecture 17. The main topics covered in this lecture include: vertical fragmentation (VF), continued, and its information requirements; attribute affinities; replication of key attributes does not violate the disjointness condition;...
Distributed Database Management Systems
In this Lecture
• Continue with VF
– Attribute affinities
Replication of key attributes does not violate the disjointness condition
Vertical Fragmentation Information Requirements
• Basic idea of VF is access efficiency
• Information requirements are application-based
• Attribute affinities: obtained from more
primitive usage data
• (80-20 Rule)
• Attribute usage values: given a set of queries $Q = \{q_1, q_2, \ldots, q_q\}$ that will run on the relation $R[A_1, A_2, \ldots, A_n]$
PROJ(jNo, jName, budget, loc)
JNO=Value
q2: SELECT JNAME, BUDGET FROM PROJ
q3: SELECT JNAME FROM PROJ
• AUM does not represent the query frequency at
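$$\mathrm{aff}(A_i, A_j) = \sum_{k \,:\, q_k \text{ uses both } A_i \text{ and } A_j} \; \sum_{S_l} \mathrm{ref}_l(q_k)\,\mathrm{acc}_l(q_k)$$
where $\mathrm{ref}_l(q_k)$ is the number of accesses to attributes $(A_i, A_j)$ for each execution of $q_k$ at site $S_l$, and…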
$\mathrm{acc}_1(q_1) = 15,\ \mathrm{acc}_2(q_1) = 20,\ \mathrm{acc}_3(q_1) = 10$
$\mathrm{acc}_1(q_2) = 5,\ \mathrm{acc}_2(q_2) = 0,\ \mathrm{acc}_3(q_2) = 0$
$\mathrm{acc}_1(q_3) = 25,\ \mathrm{acc}_2(q_3) = 25,\ \mathrm{acc}_3(q_3) = 25$
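The affinity computation can be sketched in Python. This is only an illustration under stated assumptions: the number of accesses per execution is taken as 1 for every query and site, the attribute sets are read off the q2 and q3 statements as written above, and q1 is left out because its attribute list is not fully shown. The names `usage`, `acc`, `affinity` and `aa` are hypothetical.

```python
from itertools import product

ATTRS = ["JNO", "JNAME", "BUDGET", "LOC"]

# Attributes referenced by each query, read off the SQL text above.
usage = {
    "q2": {"JNAME", "BUDGET"},
    "q3": {"JNAME"},
}

# acc[q][l] = access frequency of query q at site l+1 (values listed above).
acc = {
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
}


def affinity(a_i, a_j, usage, acc, ref=1):
    """Sum ref * acc over all sites, for every query that uses both attributes.

    ref=1 is an assumption: one access per execution at every site.
    """
    return sum(
        ref * freq
        for q, attrs in usage.items()
        if a_i in attrs and a_j in attrs
        for freq in acc[q]
    )


# Attribute affinity values for every pair of attributes.
aa = {(a, b): affinity(a, b, usage, acc) for a, b in product(ATTRS, repeat=2)}

print(aa[("JNAME", "BUDGET")])  # 5  (only q2 uses both: 5 + 0 + 0)
print(aa[("JNAME", "JNAME")])   # 80 (q2 contributes 5, q3 contributes 75)
```

Attributes that no listed query touches, such as JNO and LOC here, end up with affinity 0 to everything.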
Clustering Algorithm
• VF identifies groups of attributes based on the attribute affinity matrix (AA)
• Bond Energy Algorithm (BEA): uses AA; identifies groups of similar items
• Attributes with large affinity values are grouped together, and those with small values are grouped together
• BEA takes the AA as input and generates the clustered affinity matrix (CA)
Global Affinity Measure (AM)
• Affinity Measure is a single value that is calculated on the basis of positions of elements in AA and their surrounding elements
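$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{aff}(A_i, A_j)\,\bigl[\mathrm{aff}(A_i, A_{j-1}) + \mathrm{aff}(A_i, A_{j+1}) + \mathrm{aff}(A_{i-1}, A_j) + \mathrm{aff}(A_{i+1}, A_j)\bigr]$$
where
$$\mathrm{aff}(A_0, A_j) = \mathrm{aff}(A_i, A_0) = \mathrm{aff}(A_{n+1}, A_j) = \mathrm{aff}(A_i, A_{n+1}) = 0$$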
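A minimal Python sketch of the measure, assuming the usual four-neighbour form in which each entry of the matrix is multiplied by the sum of its left, right, upper and lower neighbours, with out-of-range entries taken as 0 per the boundary condition above. The matrix values and the helper names `affinity_measure` and `reorder` are made up for illustration; this is not BEA itself, only the quantity it tries to increase.

```python
def affinity_measure(m):
    """Global affinity measure of a square matrix m (list of lists)."""
    n = len(m)

    def at(i, j):
        # Entries outside the matrix count as 0 (the boundary condition).
        return m[i][j] if 0 <= i < n and 0 <= j < n else 0

    return sum(
        m[i][j] * (at(i, j - 1) + at(i, j + 1) + at(i - 1, j) + at(i + 1, j))
        for i in range(n)
        for j in range(n)
    )


def reorder(m, order):
    """Apply the same permutation to the rows and the columns of m."""
    return [[m[r][c] for c in order] for r in order]


# Hypothetical affinity matrix: attributes 0 and 2 are related, as are 1 and 3,
# but the current ordering keeps related attributes apart.
scattered = [
    [10, 0, 9, 0],
    [0, 10, 0, 9],
    [9, 0, 10, 0],
    [0, 9, 0, 10],
]

# Moving related attributes next to each other forms two dense blocks.
clustered = reorder(scattered, [0, 2, 1, 3])

print(affinity_measure(scattered))  # 0
print(affinity_measure(clustered))  # 1440
```

The same affinity values give a much larger measure once large entries sit next to each other, which is why a clustering step looks for the ordering with the highest AM.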