
Advanced Databases Presentation: Point Access Methods


Group 1:

Lâm Tuấn Anh

Nguyễn Đình Tân Anh

Lê Minh Châu

Point Access Method


1 Spatial Data

2 Main Memory Structure

3 Point Access Methods


Spatial Data

Characteristics of Spatial Data

• Complex structure

• Dynamic

• Spatial databases tend to be large

• There is no standard algebra defined on spatial data

• Many spatial operators are not closed

• Spatial database operators are more expensive than standard relational operators

• There is no total order among spatial objects


Queries in Spatial Data

• Exact Match Query (EMQ)

  • Condition: given an object o’ with spatial extent o’.G in d-dimensional Euclidean space

  • Target: find all objects o with the same spatial extent as o’

  • Query

• Point Query (PQ)

  • Condition: given a point p in d-dimensional Euclidean space

  • Target: find all objects o overlapping p

  • Query


• Enclosure Query (EQ)

  • Condition: given an object o’ with spatial extent o’.G in d-dimensional Euclidean space

  • Target: find all objects o enclosing o’

  • Query
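The three query types above can be sketched as simple filters. This is only an illustration under a strong assumption: every spatial extent o.G is approximated by an axis-aligned box. The `Box` type and all function names are hypothetical and do not come from the slides.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box standing in for a spatial extent o.G (an assumption)."""
    low: Tuple[float, ...]
    high: Tuple[float, ...]


def contains_point(g: Box, p: Tuple[float, ...]) -> bool:
    # p lies inside g if it is within the box bounds in every dimension.
    return all(lo <= x <= hi for lo, x, hi in zip(g.low, p, g.high))


def encloses(outer: Box, inner: Box) -> bool:
    # outer encloses inner if inner's bounds lie within outer's in every dimension.
    return all(ol <= il and ih <= oh
               for ol, il, ih, oh in zip(outer.low, inner.low, inner.high, outer.high))


def exact_match_query(objects: List[Box], q: Box) -> List[Box]:
    # EMQ: all objects with the same spatial extent as q.
    return [o for o in objects if o == q]


def point_query(objects: List[Box], p: Tuple[float, ...]) -> List[Box]:
    # PQ: all objects overlapping the point p.
    return [o for o in objects if contains_point(o, p)]


def enclosure_query(objects: List[Box], q: Box) -> List[Box]:
    # EQ: all objects enclosing q.
    return [o for o in objects if encloses(o, q)]


objects = [Box((0, 0), (4, 4)), Box((1, 1), (2, 2)), Box((3, 3), (6, 6))]
```

A linear scan like this is exactly what an access method tries to avoid; the structures on the following slides exist to prune most objects without inspecting them.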


• Requirements for Multidimensional Access Methods

  • Dynamics

  • Secondary/tertiary storage management

  • Broad range of supported operations

  • Independence of the input data and insertion sequence

Main Memory Structure

[Figure 9: Running example. Legend: pi = ith point, ri = ith polygon, ci = ith centroid, mi = ith minimum bounding box]

[Figure 10: k-d tree construction]

[Figure 11: k-d tree]
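Since the figures did not survive extraction, here is a minimal sketch of a k-d tree for readers who want something concrete. It assumes the tree is built in one pass from a fixed point set by median splits, cycling through the dimensions level by level; the slides may instead build it by successive insertions. The sample points and all names are illustrative.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


class Node:
    def __init__(self, point: Point, axis: int,
                 left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.point = point    # point stored at this node
        self.axis = axis      # dimension used to split at this level
        self.left = left      # subtree with coordinates <= point[axis]
        self.right = right    # subtree with coordinates >= point[axis]


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    if not points:
        return None
    axis = depth % len(points[0])              # cycle through the dimensions
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2                        # the median becomes the split point
    return Node(pts[mid], axis,
                build_kdtree(pts[:mid], depth + 1),
                build_kdtree(pts[mid + 1:], depth + 1))


def contains(node: Optional[Node], target: Point) -> bool:
    # Top-down search; ties on the split coordinate may sit on either side.
    if node is None:
        return False
    if node.point == target:
        return True
    a = node.axis
    if target[a] < node.point[a]:
        return contains(node.left, target)
    if target[a] > node.point[a]:
        return contains(node.right, target)
    return contains(node.left, target) or contains(node.right, target)


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
```

Every node here lives in main memory and is reached by following pointers, which is why such a structure does not map well onto disk pages.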


• Designed for main memory applications where all the data are available without accessing the disk

• Do not take secondary storage management into account explicitly

• In many spatial database applications the amount of data to be managed is notoriously large

Point Access Methods

• Multidimensional Hashing

• Hierarchical Access Method

Multidimensional Hashing

• There is no total order for objects in two- and higher-dimensional space that completely preserves spatial proximity

• Idea: construct hashing functions that preserve proximity at least to some extent

• Goal: objects located close to each other in the original space should be likely to be stored close together on disk

• => this minimizes the number of disk accesses per range query
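One well-known way to obtain a one-dimensional key that preserves proximity to some extent is bit interleaving (Z-ordering). The slide does not name a particular function, so the sketch below is an illustrative choice; it assumes non-negative integer grid coordinates in two dimensions.

```python
def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into a single key.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


# Cells of a 2 x 2 block receive consecutive keys, so they tend to be stored together.
block = sorted(((x, y) for x in range(2) for y in range(2)),
               key=lambda c: z_order(*c))
```

The preservation is only partial: two cells that are neighbors in space can still end up far apart in key order when they lie on opposite sides of a high-level split.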

The Grid File

• Superimposes a d-dimensional orthogonal grid on the universe

• The grid is not necessarily regular; the resulting cells may have different shapes and sizes

• Each cell is associated with one bucket, but a bucket may contain several adjacent cells

• The grid directory, which maps cells to buckets, may grow large and is therefore usually kept on secondary storage

• To guarantee that data items are always found with no more than two disk accesses for exact match queries, the grid itself is kept in main memory, represented by d one-dimensional arrays called scales
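The lookup path described above can be sketched as follows. The scale values, the bucket names, and the in-memory dictionary standing in for the on-disk directory are all made up for illustration.

```python
from bisect import bisect_right

# One scale per dimension, holding the split positions (kept in main memory).
scales = [[10.0, 20.0],   # x axis: 3 intervals
          [5.0]]          # y axis: 2 intervals

# Directory: grid cell -> bucket. A dict stands in for the directory pages on disk.
# Buckets "A" and "B" each cover two adjacent cells.
directory = {
    (0, 0): "A", (1, 0): "A", (2, 0): "B",
    (0, 1): "C", (1, 1): "D", (2, 1): "B",
}


def find_cell(point, scales):
    # Locating the cell uses only the in-memory scales: no disk access.
    return tuple(bisect_right(scale, coord) for scale, coord in zip(scales, point))


def find_bucket(point):
    cell = find_cell(point, scales)
    # In a real grid file this step reads one directory page (first disk access);
    # fetching the bucket's data page would be the second.
    return directory[cell]
```

Because the scales already identify the cell, an exact match query needs one access for the directory page and one for the data page, which is the two-access guarantee mentioned on the slide.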

EXCELL

• Decomposes the universe regularly: all grid cells are of equal size

• Each new split halves all cells and therefore doubles the directory size
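A small sketch of what a regular decomposition implies, assuming a universe of [0, 1) in every dimension and a hypothetical `splits_per_dim` tuple that records how many times each dimension has been halved:

```python
def cell_index(point, splits_per_dim, universe=1.0):
    """Index of the equal-sized cell containing `point` in [0, universe)^d."""
    return tuple(
        min(int(coord / universe * (1 << s)), (1 << s) - 1)
        for coord, s in zip(point, splits_per_dim)
    )


def directory_size(splits_per_dim):
    """Number of directory entries: every split doubles it."""
    size = 1
    for s in splits_per_dim:
        size <<= s
    return size
```

Since all cells have the same size, the cell index follows from arithmetic alone and no scales are needed; the price is that a single overflowing bucket doubles the whole directory.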

The Two-Level Grid File

• Uses a second grid file to manage the grid directory

• First level: the root directory

• Second level: the actual grid directory

• The root directory contains pointers to the directory pages of the lower level, which in turn contain pointers to the data pages

• Splits are often confined to the subdirectory regions without affecting the surroundings too much

• => slower directory growth

• Does not solve the problem of superlinear directory size

The Twin Grid File

• Increases space utilization by introducing a second grid file

• The relationship between the two grid files is not hierarchical but somewhat more balanced

• Both grid files span the whole universe

• The distribution of the data between the two files is performed dynamically

Hierarchical Access Method

• Based on binary or multiway tree structures

• Like hashing methods, stores data in buckets

• Each bucket corresponds to a leaf node of the tree and to a disk page

• Interior nodes of the tree guide the search

• Search: top-down tree traversal

• The methods differ mainly in the characteristics of their regions

• k-d-B-tree

  • Combination of the adaptive k-d-tree and the B-tree

  • Partitions the universe like the adaptive k-d-tree

  • Associates subspaces with tree nodes

  • Interior nodes correspond to intervals (boxes)

  • Regions of nodes on the same level are mutually disjoint

  • Perfectly balanced (like the B-tree)

  • Search is straightforward, as in the k-d-tree

  • Insertion: search for the right bucket; if it overflows, split it and move half of the data to the new bucket

  • Deletion: search and remove; if necessary, merge the node with its siblings
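The insertion step for the k-d-B-tree can be sketched at the bucket level. This is a simplification under stated assumptions: only a single leaf bucket is handled, the capacity `CAPACITY` is a made-up value, and the propagation of splits to interior nodes that a full k-d-B-tree needs is left out.

```python
CAPACITY = 4  # hypothetical number of points that fit on one disk page


def split_bucket(points, axis):
    """Split an overflowing bucket at the median coordinate along `axis`.

    The two resulting regions are disjoint: left holds coordinates strictly
    below the split value, right holds the rest. If many points share the
    median coordinate the halves can be uneven.
    """
    pts = sorted(points, key=lambda p: p[axis])
    split_value = pts[len(pts) // 2][axis]
    left = [p for p in pts if p[axis] < split_value]
    right = [p for p in pts if p[axis] >= split_value]
    return split_value, left, right


def insert(bucket, point, axis=0):
    """Return the bucket(s) that result from inserting `point`."""
    bucket = bucket + [point]
    if len(bucket) <= CAPACITY:
        return [bucket]
    _, left, right = split_bucket(bucket, axis)
    return [left, right]
```

For example, inserting a fifth point into a full bucket yields two buckets holding roughly half of the data each, separated by a hyperplane perpendicular to the chosen axis.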


• LSD tree

  • The directory is organized like an adaptive k-d-tree

  • Adapts better to the data distribution than fixed binary partitioning

  • External balancing property: the heights of the external subtrees differ by at most one

  • Combines two split strategies to accommodate skewed data:

    • Data-dependent: based on the stored data; tries to achieve the most balanced structure (an equal number of data items on both sides of the split)

    • Distribution-dependent: splits at a fixed dimension and position (a known distribution is assumed)
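The difference between the two split strategies shows up clearly on skewed data. The sketch below is illustrative only: it takes the median as the data-dependent split position and the region midpoint as the distribution-dependent one, which corresponds to assuming a uniform distribution. Both function names are hypothetical.

```python
def data_dependent_split(points, axis):
    """Split position derived from the stored data: the median coordinate."""
    coords = sorted(p[axis] for p in points)
    return coords[len(coords) // 2]


def distribution_dependent_split(low, high):
    """Split position fixed in advance, independent of the stored data.

    The midpoint is the natural choice if a uniform distribution is assumed.
    """
    return (low + high) / 2


# Skewed sample: four points crowd near 0, one lies far away.
points = [(0.01, 0.5), (0.02, 0.5), (0.03, 0.5), (0.04, 0.5), (0.9, 0.5)]
median_split = data_dependent_split(points, axis=0)
fixed_split = distribution_dependent_split(0.0, 1.0)
```

With this sample the median split separates the points 2 to 3, while the fixed midpoint leaves four points on one side and one on the other.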

• Buddy tree

  • Dynamic hashing scheme with a tree-structured directory (hybrid)

  • The tree is built by consecutive insertions

  • Cuts the universe into equal parts with iso-oriented hyperplanes

  • Interior nodes: a partition and an interval (the MBB of the points or intervals below the node)

  • Intervals of nodes on the same level are mutually disjoint

  • Leaves hold the data (as in the other trees)

  • Each directory node has at least two entries

    => the tree may not be balanced

  • When a node splits, the MBBs of the two resulting intervals are recomputed to reflect the current situation

    => tries to achieve high selectivity at the directory level

  • Except for the root, exactly one pointer refers to each directory page

    => guarantees linear growth of the directory
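The recomputation of minimum bounding boxes after a split can be sketched as follows. The function names and the sample points are illustrative, and the sketch assumes that both sides of the split are non-empty.

```python
def mbb(points):
    """Minimum bounding box of a non-empty point set, as (low, high) corners."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high


def split_and_shrink(points, axis, position):
    """Split at `position` along `axis`, then shrink each side to its MBB."""
    left = [p for p in points if p[axis] < position]
    right = [p for p in points if p[axis] >= position]
    return (mbb(left), left), (mbb(right), right)


points = [(1, 1), (2, 3), (6, 2), (7, 7)]
(left_box, left_pts), (right_box, right_pts) = split_and_shrink(points, axis=0, position=4)
```

The shrunken boxes leave the empty space between x = 2 and x = 6 uncovered, so a query falling there can be rejected at the directory level without reading a data page.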


• BANG file (Balanced and Nested Grid)

  • A hybrid method

  • Divides the universe into intervals (boxes), similar to a grid

  • Difference: bucket regions may intersect

  • Can form nonrectangular bucket regions by taking the geometric difference of two intervals (nesting)

  • Increased storage utilization: redistributes data between buckets during insertion

  • Uses a balanced search tree to manage the directory


• BANG file (Balanced and Nested Grid)

  • first 3 rectangles: R1: R2, R5, R6

  • then R3 and R4 in R2 and R5

  • Regions are represented by bit interleaving

  • * = universe

  • A point search may require traversing the entire directory in a depth-first manner
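One way to picture the bit-interleaved representation: each region is named by a bit string, the empty string plays the role of the universe (*), and a nested region extends the name of the region that contains it. Under that assumption, a point belongs to the region with the longest name that is a prefix of the point's own interleaved key. The region names and the list-based directory below are illustrative, not taken from the slides.

```python
def interleave(x_bits: str, y_bits: str) -> str:
    """Interleave two equally long bit strings: x0 y0 x1 y1 ..."""
    return "".join(a + b for a, b in zip(x_bits, y_bits))


def find_region(regions, key: str) -> str:
    """Return the most deeply nested region whose name is a prefix of `key`.

    The empty string (the universe) matches every key, so a result always exists.
    This version simply scans all regions.
    """
    matches = [r for r in regions if key.startswith(r)]
    return max(matches, key=len)


# "" is the universe; "0" is nested in it; "0110" is nested in "0".
regions = ["", "0", "0110"]
```

Because the regions are nested, the region "0" effectively covers its box minus the box of "0110", which is how nonrectangular bucket regions arise.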


• hB-tree

  • Related to the k-d-B-tree: uses k-d-trees to organize the space represented by its interior nodes

  • Difference: node splits are based on multiple attributes

  • Regions are not necessarily box-shaped


• BV-tree

  • Attempts to solve the d-dimensional B-tree problem

  • Idea: maintain the major strengths of the B-tree by relaxing the requirements on balancing and space utilization

  • The BV-tree is not balanced

  • Guarantees at least 33% space utilization (versus 50% for the B-tree)


• What is the form of a Point Query in spatial data?

• Which methods belong to multidimensional hashing?

  • Grid File

  • K-d Tree

  • Linear Hashing

  • EXCELL

Posted: 09/02/2016, 13:33
