
Chapter 8 Physical Database Design


Page 1

Chapter 08: Physical Database Design

Page 2

Database Design Process

[Figure: design-process diagram with boxes labeled Conceptual Model, Logical Model, External Model (three boxes), Internal Model, and Physical Design]

Page 3

Physical Database Design

• Many physical database design decisions are implicit in the technology adopted

– Also, organizations may have standards or an “information architecture” that specifies operating systems, DBMS, and data access languages, thus constraining the range of possible physical implementations.

• We will be concerned with some of the possible physical implementation issues

Page 4

Physical Database Design

• The primary goal of physical database design is data processing efficiency

• We will concentrate on choices often available to optimize performance of database services

• Physical Database Design requires information gathered during earlier stages of the design process

Page 5

Physical Design Information

• Information needed for physical file and database design includes:

– Normalized relations plus size estimates for them

– Definitions of each attribute

– Descriptions of where and when data are used

• entered, retrieved, deleted, updated, and how often

– Expectations and requirements for response time, data security, backup, recovery, retention, and integrity

– Descriptions of the technologies used to implement the database

Page 6

Physical Design Decisions

• There are several critical decisions that will affect the integrity and performance of the system

Page 7

Storage Format

• Choosing the storage format of each field (attribute): the DBMS provides a set of data types that can be used for the physical storage of fields in the database

• Data Type (format) is chosen to minimize storage space and maximize data integrity

Page 8

Objectives of data type selection

• Minimize storage space

• Represent all possible values

• Improve data integrity

• Support all data manipulations

• The correct data type should, in minimal space, represent every possible value (but exclude illegal values) for the associated attribute and support the required data manipulations (e.g., numerical or string operations)

Page 9

Access Data Types

• Numeric (1, 2, 4, 8 bytes, fixed or float)

• OLE (limited only by disk space)

• Hyperlinks (up to 64000 chars)

Page 10

Access Numeric types

• Byte

– Stores numbers from 0 to 255 (no fractions); storage size: 1 byte

• Integer

– Stores numbers from –32,768 to 32,767 (no fractions); storage size: 2 bytes

• Long Integer (Default)

– Stores numbers from –2,147,483,648 to 2,147,483,647 (no fractions); storage size: 4 bytes

• Replication ID

– Globally unique identifier (GUID); decimal precision: N/A; storage size: 16 bytes
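
As a rough illustration of the goal "minimal space, every possible value", the sketch below picks the smallest of the three integer sizes listed above that still covers an attribute's range. The function name `smallest_integer_type` and the lookup table are illustrative helpers, not part of Access itself.

```python
# Integer field sizes from the list above: (name, min value, max value, bytes).
ACCESS_INTEGER_TYPES = [
    ("Byte", 0, 255, 1),
    ("Integer", -32_768, 32_767, 2),
    ("Long Integer", -2_147_483_648, 2_147_483_647, 4),
]


def smallest_integer_type(lowest, highest):
    """Return (name, bytes) of the smallest listed type covering [lowest, highest]."""
    for name, type_min, type_max, size in ACCESS_INTEGER_TYPES:
        if type_min <= lowest and highest <= type_max:
            return name, size
    raise ValueError("range does not fit any listed integer type")


print(smallest_integer_type(0, 120))        # e.g. an age in years
print(smallest_integer_type(-500, 20_000))  # e.g. a signed quantity
print(smallest_integer_type(0, 1_000_000))  # e.g. a large counter
```

The types are tried from smallest to largest, so the first match is also the cheapest one in storage.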

Page 11

Designing Physical Records

• A physical record is a group of fields stored in adjacent memory locations and retrieved together as a unit

• Fixed-length and variable-length fields

Page 13

The Memory Hierarchy

Main Memory = Disk Cache

• Only sequential access

• Not for operational data

Processor Cache:

• Access time: 10 ns

• Size: 512 KB

Page 14

Main Memory

• Fastest, most expensive (excluding cache)

• Today: 512MB are common even on PCs

• Many databases could fit in memory

– New industry trend: Main Memory Database

– E.g., TimesTen

• Main issue is volatility

Page 15

– A disk block is also called a disk page or simply a page

• Used with a main memory buffer

Page 16

of a block for the data is 100 bytes.

– What is the blocking factor?
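
The exercise text above is incomplete, so its numbers cannot be reproduced here. As a general sketch, the blocking factor is the number of whole records that fit in one block. The block size, per-block overhead, and record size below are hypothetical values chosen only for illustration, and the function names are likewise illustrative.

```python
def blocking_factor(block_size, record_size, block_overhead=0):
    """Number of whole (unspanned) records that fit in one block."""
    usable = block_size - block_overhead
    return usable // record_size


def blocks_needed(num_records, bfr):
    """Blocks required to store num_records at bfr records per block."""
    return -(-num_records // bfr)  # ceiling division


# Hypothetical figures, not taken from the slide.
bfr = blocking_factor(block_size=4096, record_size=100, block_overhead=96)
print(bfr)
print(blocks_needed(10_000, bfr))
```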

Page 17

The Mechanics of Disk

Mechanical characteristics:

• Rotation speed (5400RPM)

• Number of platters (1-30)

• Number of tracks (<=10000)

• Number of sectors (256/track)

• Number of bytes / sector (2^9 = 512)

• Block size (2^12 = 4096)

[Figure: disk anatomy, with labels Platters, Spindle, Disk head, Arm movement, Arm assembly, Tracks, Sector, Cylinder]

Page 18

Important Disk Access Characteristics

• Block access time = Disk latency + transfer time

• Disk latency = seek time + rotational latency

• Seek time = time for the head to reach the right track

– 10ms – 40ms

• Rotational latency = rotation time to get to the right sector

– Time for one rotation = 10ms

– Average rotation latency = 10ms/2

• Transfer rate = typically 5-10 MB/s (transfer time = block size / transfer rate)

• Disks read/write one block at a time (typically 4kB)
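
The formulas above can be combined into a small calculation. This is a minimal sketch that plugs in the slide's own ranges (seek 10-40 ms, one rotation 10 ms, a 4096-byte block, 5-10 MB/s); the function name and the choice of decimal megabytes are assumptions.

```python
def block_access_time_ms(seek_ms, rotation_ms, block_bytes, transfer_bytes_per_s):
    """Block access time = seek time + average rotational latency + transfer time."""
    rotational_latency_ms = rotation_ms / 2  # on average, half a rotation
    transfer_ms = block_bytes / transfer_bytes_per_s * 1000
    return seek_ms + rotational_latency_ms + transfer_ms


MB = 1_000_000  # assumption: decimal megabytes

best = block_access_time_ms(10, 10, 4096, 10 * MB)
worst = block_access_time_ms(40, 10, 4096, 5 * MB)
print(f"best case:  {best:.2f} ms")
print(f"worst case: {worst:.2f} ms")
```

With these figures the transfer itself takes well under a millisecond, so seek time and rotational latency dominate the cost of reading one block.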

Page 19

Representing Data Elements

• Relational database elements:

CREATE TABLE Product (
    pid INT PRIMARY KEY,
    name CHAR(20),
    description VARCHAR(200),
    maker CHAR(10) REFERENCES Company(name)
);

• A tuple is represented as a record

Page 20

Record Formats: Fixed Length

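• Information about field types is the same for all records in a file; it is stored in the system catalogs.

• Finding the i'th field does not require a scan of the record: its address is the base address plus the lengths of the preceding fields.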

• Note the importance of schema information!

Base address (B)

Address = B+L1+L2
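
The address formula in the figure (Address = B+L1+L2 for the third field) generalizes to any field position. The sketch below assumes hypothetical field lengths L1..L4 and a hypothetical base address; the helper names are illustrative.

```python
from itertools import accumulate


def field_offsets(field_lengths):
    """Offset of each field from the start of a fixed-length record."""
    return [0] + list(accumulate(field_lengths))[:-1]


def field_address(base, field_lengths, i):
    """Address of field i (0-based): base plus the lengths of the fields before it."""
    return base + sum(field_lengths[:i])


# Hypothetical lengths L1..L4 in bytes and base address B.
lengths = [4, 20, 10, 8]
B = 1000

print(field_offsets(lengths))
print(field_address(B, lengths, 2))  # third field: B + L1 + L2
```

Because the lengths come from the schema in the system catalog, the offsets can be computed once per file rather than once per record.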

Page 21

Need the header because:

• The schema may change; for a while, new and old record formats may coexist

• Records from different relations may coexist

[Figure: record layout with a header]

Page 22

Variable Length Records

• Place the fixed fields first: F1, F2

• Then the variable length fields: F3, F4

• Null values take 2 bytes only

• Sometimes they take 0 bytes (when at the end)

[Figure: record layout, with labels header, length, and other header information]
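
One possible byte layout matching these rules can be sketched as follows. Everything about the layout is an assumption made for illustration: a 2-byte length header, two 4-byte integers for F1 and F2, and a table of 2-byte end offsets for the variable-length fields, where offset 0 marks NULL. The optimization of dropping trailing NULLs entirely is not implemented.

```python
import struct

HEADER = struct.Struct("<H")   # total record length (assumed 2 bytes)
FIXED = struct.Struct("<ii")   # F1, F2: two 4-byte integers (assumed)
NULL = 0                       # offset 0 marks a NULL variable-length field


def encode(f1, f2, var_fields):
    """Pack fixed fields first, then an offset table, then variable-length data."""
    pos = HEADER.size + FIXED.size + 2 * len(var_fields)  # start of variable data
    offsets, data = [], b""
    for value in var_fields:
        if value is None:
            offsets.append(NULL)       # a NULL costs only its 2-byte table entry
        else:
            data += value
            pos += len(value)
            offsets.append(pos)        # end offset of this field
    table = struct.pack(f"<{len(offsets)}H", *offsets)
    body = FIXED.pack(f1, f2) + table + data
    return HEADER.pack(HEADER.size + len(body)) + body


def decode(record, num_var):
    """Inverse of encode: returns (f1, f2, [variable-length fields])."""
    (length,) = HEADER.unpack_from(record, 0)
    if length != len(record):
        raise ValueError("record length does not match header")
    f1, f2 = FIXED.unpack_from(record, HEADER.size)
    table_at = HEADER.size + FIXED.size
    offsets = struct.unpack_from(f"<{num_var}H", record, table_at)
    pos = table_at + 2 * num_var
    fields = []
    for end in offsets:
        if end == NULL:
            fields.append(None)
        else:
            fields.append(record[pos:end])
            pos = end
    return f1, f2, fields


rec = encode(7, 42, [b"widget", None])
print(len(rec), decode(rec, 2))
```

Storing end offsets rather than lengths lets a reader jump to any variable-length field after reading only the offset table.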

Page 23

Records With Referencing Fields

Page 24

Storing Records in Blocks

• Blocks have fixed size (typically 4 KB)

[Figure: a block (labeled BLOCK) holding records R1, R2, R3, R4]

Page 25

Spanning Records Across Blocks

• When records are very large

• Or even medium size: saves space in blocks

block

Page 26

• Binary large objects

• Supported by modern database systems

• E.g., images, sounds, etc.

• Storage: attempt to cluster blocks together

Page 27

Modifications: Insertion

• File is unsorted:

– Add the record to the end

• File is sorted:

– Is there space in the right block?

• Yes: we are lucky, store it there

– Is there space in a neighboring block?

• Look 1-2 blocks to the left/right, shift records

– If all else fails, create an overflow block
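
The decision sequence for a sorted file can be sketched in code. This is a simplified model under stated assumptions: blocks are Python lists of keys, the capacity is a tiny hypothetical value, the file has at least one block, and only one neighbor on each side is examined (the slide allows 1-2). The function name `insert_sorted` is illustrative.

```python
from bisect import insort

CAPACITY = 2  # records per block (hypothetical, kept tiny for the example)


def insert_sorted(blocks, overflow, key):
    """Insert key into a sorted file of blocks; return a label for where it went."""
    # The "right" block is the first whose largest key is >= key, else the last.
    i = next((n for n, b in enumerate(blocks) if b and b[-1] >= key),
             len(blocks) - 1)
    if len(blocks[i]) < CAPACITY:
        insort(blocks[i], key)
        return "stored in block"
    if i + 1 < len(blocks) and len(blocks[i + 1]) < CAPACITY:
        insort(blocks[i], key)
        blocks[i + 1].insert(0, blocks[i].pop())   # shift largest record right
        return "shifted right"
    if i > 0 and len(blocks[i - 1]) < CAPACITY:
        insort(blocks[i], key)
        blocks[i - 1].append(blocks[i].pop(0))     # shift smallest record left
        return "shifted left"
    overflow.setdefault(i, []).append(key)         # last resort: overflow block
    return "overflow block"


blocks = [[10, 20], [30, 40], [50]]
overflow = {}
print(insert_sorted(blocks, overflow, 25), blocks)
print(insert_sorted(blocks, overflow, 35), blocks, overflow)
```

Shifting a single boundary record keeps the file in key order, because the largest key of a block never exceeds the smallest key of the block after it.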

Page 29

Modifications: Deletions

• Free space in block, shift records

• May be able to eliminate an overflow block

Page 30

Modifications: Updates

• If the new record is shorter than the previous one, the update is easy

• If it is longer, we need to shift records or create overflow blocks

Page 31

– The cylinder number

– The track number

– The block within the track

– For records: an offset in the block

• sometimes this is in the block’s header

Page 32

Logical Addresses

• Logical address: a string of bytes (10-16)

• More flexible: can move blocks/records around

• But need translation table:

Logical address Physical address
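
A translation table of this kind can be sketched as a simple mapping. The class names, the 10-byte logical address, and the physical address components (cylinder, track, block, offset) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalAddress:
    cylinder: int
    track: int
    block: int
    offset: int  # record offset within the block


class AddressMap:
    """Translation table from logical addresses to physical addresses."""

    def __init__(self):
        self._table = {}

    def register(self, logical, physical):
        self._table[logical] = physical

    def resolve(self, logical):
        return self._table[logical]

    def move(self, logical, new_physical):
        """Relocate a record: only the table entry changes, not the logical address."""
        self._table[logical] = new_physical


amap = AddressMap()
rid = b"rec-000001"  # a 10-byte logical address (hypothetical format)
amap.register(rid, PhysicalAddress(cylinder=3, track=1, block=7, offset=128))
amap.move(rid, PhysicalAddress(cylinder=5, track=0, block=2, offset=0))
print(amap.resolve(rid))
```

Anything that stores the logical address (an index entry, a referencing field) stays valid after the move, which is the flexibility the slide refers to.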

Page 33

Main Memory Address

• When the block is read in main memory, it receives a main memory address

• Buffer manager has another translation table

Page 34

Designing Physical/Internal Model

• Overview

• Terminology

• Access methods

Page 35

Physical Design

• Internal Model/Physical Model

[Figure: diagram with labels User request, External Model, DBMS, Internal Model Access Methods, Operating System Access Methods, Data Base, Interface 1, Interface 2, Interface 3]

Page 36

Physical Design

user presents a query, the DBMS determines which physical DBs are needed to resolve the query

access method to access the data stored in a logical database.

methods and OS access methods access the physical records of the database.

Page 37

Physical File Design

• Physical file - a portion of secondary storage (disk space) allocated for the purpose of storing physical records

• Pointer - a field of data that can be used to locate a related field or record of data

• Access method - an operating system algorithm for storing and locating data in secondary storage

• Page - the amount of data read or written in one disk input or output operation

Page 38

Internal Model Access Methods

• Many types of access methods:

Page 39

Physical Sequential

• Key values of the physical records are in logical sequence

• Main use is for “dump” and “restore”

• Access method may be used for storage as well as retrieval

• Storage Efficiency is near 100%

• Access Efficiency is poor (unless fixed size physical records)

Page 40

Indexed Sequential

• Key values of the physical records are in logical sequence

• Access method may be used for storage and retrieval

• Index of key values is maintained with entries for the highest key values per block(s)

• Access efficiency depends on the levels of index, storage allocated for index, number of database records, and amount of overflow

• Storage efficiency depends on size of index and volatility of database
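
A lookup through an index that holds the highest key per block can be sketched as below. The block contents are a hypothetical arrangement, overflow handling is omitted, and the function name `lookup` is illustrative.

```python
from bisect import bisect_left

# Hypothetical data file: each block holds records in key order.
data_blocks = [
    ["Adams", "Becker", "Dumpling"],
    ["Getta", "Harty", "Mobile"],
    ["Sunoci", "Texaci"],
]

# Sparse index: one entry per block, holding that block's highest key.
index = [block[-1] for block in data_blocks]


def lookup(key):
    """Return (block number, position in block) for key, or None if absent."""
    b = bisect_left(index, key)   # first block whose highest key is >= key
    if b == len(index):
        return None               # key is larger than every indexed key
    block = data_blocks[b]
    if key in block:              # scan within the single block that was read
        return b, block.index(key)
    return None


print(lookup("Harty"))
print(lookup("Carter"))
```

The index is searched first and only one data block is read, which is why access efficiency depends on the index levels rather than on the size of the data file.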

Page 41

Index Sequential

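[Figure: an index with Address / Block Number entries 1, 2, 3 pointing into a data file of three blocks (Block 1, Block 2, Block 3); surviving record keys include Getta, Harty, Mobile, Sunoci, Texaci]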

Page 42

Indexed Sequential: Two Levels

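[Figure: two-level index; recoverable contents listed below]

Index entries (Address, Key Value): (1, 150), (2, 385); (3, 536), (4, 678); (5, 785), (6, 805)

A further Address row lists 7, 8, 9.

Data keys shown: 251, 385, 455, 480, 536, 605, 610, 678, 705, 710, 785, 791, 805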

Page 43

Indexed Random

• Key values of the physical records are not necessarily in logical sequence

• Index may be stored and accessed with Indexed Sequential Access Method

• The index has an entry for every database record, and these entries are in ascending order. The index keys are in logical sequence; the database records are not necessarily in ascending sequence.

• Access method may be used for storage and retrieval

Page 44

Indexed Random

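[Figure: an index whose Address / Block Number entries read 2, 1, 3, 2, 1, pointing to records that are not stored in key order; surviving record keys include Adams, Getta, Dumpling]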

Page 45


Page 46

• Key values of the physical records are not necessarily in logical sequence

• Access Method is better used for retrieval

• An index for every field to be inverted may be built

• Access efficiency depends on number of database records, levels of index, and storage allocated for index
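
Building an index on a non-key field can be sketched as follows. The record addresses 101-106 and the upper-case course codes are assumptions made for the example, as is the function name `build_inverted_index`.

```python
from collections import defaultdict

# Hypothetical student records keyed by record address.
records = {
    101: {"name": "Adams", "course": "CH145"},
    102: {"name": "Becker", "course": "CS201"},
    103: {"name": "Dumpling", "course": "CH145"},
    104: {"name": "Getta", "course": "CH145"},
    105: {"name": "Harty", "course": "CS623"},
    106: {"name": "Mobile", "course": "CS623"},
}


def build_inverted_index(records, field):
    """Map each value of `field` to the addresses of the records holding it."""
    index = defaultdict(list)
    for address in sorted(records):
        index[records[address][field]].append(address)
    return dict(index)


course_index = build_inverted_index(records, "course")
print(course_index["CS623"])

# Retrieval by the inverted field needs no scan of the data file.
print([records[a]["name"] for a in course_index["CH145"]])
```

Each additional inverted field needs its own index, which speeds up retrieval at the cost of extra storage and extra work on every update.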

Page 47

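[Figure: inverted index on Course Number over a data file with Address / Block Number 1, 2, 3; the index entry for CS623 lists addresses 105, 106]

Student name | Course Number
Adams | CH145
Becker | CS201
Dumpling | CH145
Getta | CH145
Harty | CS623
Mobile | CS623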

Page 48

• Key values of the physical records are not necessarily in logical sequence

• There is a one-to-one correspondence between a record key and the physical address of the record

• May be used for storage and retrieval

• Access efficiency always 1

• Storage efficiency depends on density of keys

• No duplicate keys permitted
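
The one-to-one correspondence between key and address can be sketched with one slot per possible key. The class name `DirectFile`, the key range, and the records are hypothetical.

```python
class DirectFile:
    """Direct organization: the key itself determines the record's slot."""

    def __init__(self, min_key, max_key):
        self.min_key = min_key
        self.slots = [None] * (max_key - min_key + 1)  # one slot per possible key

    def address(self, key):
        slot = key - self.min_key
        if not 0 <= slot < len(self.slots):
            raise KeyError(f"key {key} is outside the allocated range")
        return slot

    def store(self, key, record):
        slot = self.address(key)
        if self.slots[slot] is not None:
            raise KeyError(f"duplicate key {key}")     # no duplicate keys permitted
        self.slots[slot] = record

    def fetch(self, key):
        return self.slots[self.address(key)]           # always a single access

    def storage_efficiency(self):
        used = sum(slot is not None for slot in self.slots)
        return used / len(self.slots)


f = DirectFile(1000, 1009)
for key, name in [(1000, "Adams"), (1003, "Dumpling"), (1007, "Harty")]:
    f.store(key, name)
print(f.fetch(1003), f.storage_efficiency())
```

Three keys in a range of ten leave most slots empty, which illustrates why storage efficiency depends on the density of keys.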

Page 49

• May be used for storage and retrieval

• Access efficiency depends on distribution of keys, algorithm for key transformation, and space allocated

• Storage efficiency depends on distribution of keys and algorithm used for key transformation
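
How key distribution and the key transformation affect access cost can be sketched with a small hashed file. The division-remainder transformation, the bucket capacity, and the one-overflow-area-per-bucket scheme are assumptions chosen for the example, and the class name `HashedFile` is illustrative.

```python
class HashedFile:
    """Hashed organization with a fixed number of buckets plus overflow areas."""

    def __init__(self, num_buckets, bucket_capacity):
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]
        self.capacity = bucket_capacity

    def _home(self, key):
        return key % len(self.buckets)  # division-remainder key transformation

    def store(self, key, record):
        home = self._home(key)
        if len(self.buckets[home]) < self.capacity:
            self.buckets[home].append((key, record))
        else:
            self.overflow[home].append((key, record))

    def fetch(self, key):
        """Return (record or None, number of areas read)."""
        home = self._home(key)
        for k, record in self.buckets[home]:
            if k == key:
                return record, 1
        if not self.overflow[home]:
            return None, 1
        for k, record in self.overflow[home]:
            if k == key:
                return record, 2
        return None, 2


h = HashedFile(num_buckets=5, bucket_capacity=2)
for key, rec in [(10, "a"), (15, "b"), (20, "c"), (12, "d")]:
    h.store(key, rec)
print(h.fetch(12))
print(h.fetch(20))
```

Keys 10, 15, and 20 all transform to bucket 0, so the third one lands in overflow and costs a second read; a more even key distribution or more allocated buckets would avoid that.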

Page 50

Comparative Access Methods

Factor | Sequential | Indexed | Hashed
Storage space | … | No wasted space for data but extra space for index | More space needed for addition and deletion of records after initial load
Sequential retrieval | … | Moderately fast | Impractical
Random retrieval | Impractical | Moderately fast | Very fast
Multiple-key retrieval | Possible but needs a full scan | Very fast with multiple indexes | Not possible
Deleting records | Can create wasted space | OK if dynamic | Very easy
Adding records | Requires rewriting file | OK if dynamic | Very easy
Updating records | Usually requires rewriting file | Easy but requires maintenance of indexes | Very easy

Date posted: 12/05/2014, 11:55
