Module 6: Deriving a Physical Data Design


Module 1: Course Overview
Module 4: Deriving a Logical Data Design
Module 5: Normalizing the Logical Data Design
Physical Data Design
Activity 6.1: Translating the Logical Data Design
Implementing Relationships
Data Optimization Techniques
Activity 6.2: Optimizing a Physical Data Design


At the end of this module, you will be able to:

• Derive a physical data design for tables and fields from a logical data design.
• Analyze data-usage characteristics to optimize a physical data design.
• Determine methods for implementing relationships in a physical data design.
• Identify different optimization techniques.
• Determine the proper criteria for optimizing a physical data design.

In this module, you will learn about how data is stored, how to implement a relational model, and how to optimize a database system.

Physical Data Design

This section discusses a few general relational database concepts and the ways in which data is stored.

Describing a Physical Data Model

Flat-File

Flat-file databases were one of the first methods used to store data in an organized format. A flat-file database uses a set of rows and columns to store all information within a single file. There is no relationship between flat-file databases because each database exists without knowledge of any other database. Performance on this type of database is usually very good because of its inherent simplicity. Fast updates and retrievals of flat-file data are achieved by using an indexing method called the indexed sequential access method (ISAM). Legacy mainframe databases, as well as smaller PC-based databases, employ ISAM storage technology.

Hierarchical

Hierarchical databases are extensible and flexible. They have the advantage of being able to store a wide range of information in a variety of formats. This type of database is often used when information storage needs vary greatly. An example of a hierarchical database is Microsoft® Exchange, which is capable of storing varying types of information in a format that facilitates messaging and collaboration applications. (With collaborative applications, many types of information tend to be encapsulated in messages.)

Relational

Relational databases combine the advantages of both flat-file and hierarchical databases by providing good performance and flexibility of storage. The relational model tends to be the most popular for new database development because tables can be linked together with unique values. It is important to understand, however, that the other models are still in use and that developers working in enterprise environments will likely need to interface with one of these other types of databases at some point.

The relational model focuses on storing data, retrieving data, and maintaining data integrity. Data is stored in one or more tables, consisting of columns and rows, within a single file. Related items of data can be retrieved efficiently by using Structured Query Language (SQL), regardless of whether the items are all stored in one table or in many tables. Data integrity is maintained by applying rules and constraints.

Slide Objective
To introduce the different data storage models available for use with a

Identifying Database Tables and Fields

• Each entity becomes a table in the physical model. Each row is an instance of an entity. Each column is an attribute.
• Fields are the attributes of the entity.

Tables are the physical representation of entities. If logical data design is done correctly, entities identified in the logical design stage should map directly to tables within a relational database. Good logical design is extremely important for good database design.

Tables can store a wide variety of data. A table can contain a name, address, picture, voiceprint, movie, Microsoft Word document, and so on. Because of this flexibility, a database can be used not only to store simple text data, but also to store the knowledge base of a business, no matter what form that knowledge takes.

The data in a table is stored in rows, or records. Each record must be unique. Records are manipulated by using the American National Standards Institute (ANSI) standard relational database language, which is referred to as SQL. SQL is an English-like language that abstracts the operations performed on a database into easily readable statements, such as Insert, Update, and Delete. Most relational databases adhere to the ANSI SQL standard, although the version and enhancements used vary from product to product.

Tables can be linked to other tables within the same database file. This capability allows one type of data to be joined to another type and allows data normalization.

The data in each record is stored in columns, or fields, that are specified from the attributes of the table’s defining entity. Each field contains one distinct item of data, such as a customer name, and each field must have a single data type, such as text. This data type is defined based on the kind of data stored in the field. The data types that are allowed for a given field depend on the data types supported by the hosting DBMS. When defining your tables, you should choose data types that will optimize performance, conserve disk space, and allow for growth.
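As a minimal sketch of how an entity becomes a table with typed fields, the following Python example uses the standard-library sqlite3 module as a stand-in for the hosting DBMS. The Customer table and its columns are hypothetical names chosen for illustration, and SQLite's type names are looser than those of most server DBMSs, so treat the declarations as an assumption rather than a recommendation.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical Customer entity: one column per attribute, one data type per column.
conn.execute("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,   -- keeps every record unique
        Name       TEXT    NOT NULL,
        Joined     TEXT    NOT NULL,      -- ISO date string; SQLite has no DATE type
        Balance    REAL    NOT NULL DEFAULT 0
    )
""")

# Insert, Update, and Delete are the statements the text refers to.
conn.execute(
    "INSERT INTO Customer (CustomerID, Name, Joined) VALUES (?, ?, ?)",
    (1, "Ana", "2024-01-15"),
)
conn.execute("UPDATE Customer SET Balance = 25.5 WHERE CustomerID = 1")

# A second record with the same primary key is rejected by the database.
try:
    conn.execute(
        "INSERT INTO Customer (CustomerID, Name, Joined) VALUES (?, ?, ?)",
        (1, "Ben", "2024-02-01"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row = conn.execute(
    "SELECT Name, Joined, Balance FROM Customer WHERE CustomerID = 1"
).fetchone()

conn.execute("DELETE FROM Customer WHERE CustomerID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```

The primary key is what enforces the rule that each record must be unique; the other column declarations only describe the kind of data each field holds.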

Slide Objective
To introduce the role of a table and fields in the physical data design.

Lead-in
Tables and fields are the basic building blocks of physical data design.

Common Data Types

Data type         Description
Date              Date and time data
String            Fixed-length character data
Binary            Fixed-length binary data
Money             Currency values with fixed scale
Integer           Integer (whole number) data
Float             Floating point with fixed-precision data range
(Double) Float    Float value with double-precision data range over regular float data type
(Long) Integer    Integer value with double-precision data range over regular integer data type

Every field within a database must have a data type. The data type allows you, and the database engine itself, to verify that a value entered in a field is valid for the information that the field represents.

Most DBMSs support two major classifications of data types:

• System-supplied data types. Every DBMS contains its own data types. Examples of system-supplied types are Integer, Character, and Binary. Some DBMSs contain variations of these types as well as additional types.
• User-defined data types. Some DBMSs allow you to define your own data types based on the system-supplied types. For example, with SQL Server™ you might define a State data type based on the Character type with a length of 2. Defining this data type would help maintain conformity across all tables that include a State field. In every table, any field of State data type would be consistent and identical.

The slide shows a set of common data types, each of which is a variation on a generic character, number, or binary data type. For example, Float, Money, and Integer all store numeric data, but they store it in different formats and can display it in different formats. Because their data is stored in different formats, different data types consume different amounts of storage space. For example, the double variants of a data type can store a much wider range of values or store a fraction to more decimal places, but they typically use twice as much storage space. During physical design, you need to consider the nuances of each data type to ensure that the most efficient storage solution is obtained.

Details about data types are generally specific to the DBMS engine being used. Refer to the DBMS documentation for specific information.
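The storage trade-off between regular and double variants can be illustrated with a short Python sketch. It assumes the common 32-bit and 64-bit binary encodings, packed with the standard-library struct module; the widths that a particular DBMS actually uses may differ, so check its documentation.

```python
import struct

# Assumed encodings: 32-bit and 64-bit integers and IEEE 754 floats.
# The "<" prefix selects standard sizes, independent of the host platform.
sizes = {
    "Integer": struct.calcsize("<i"),
    "(Long) Integer": struct.calcsize("<q"),
    "Float": struct.calcsize("<f"),
    "(Double) Float": struct.calcsize("<d"),
}

# Doubling the width of an integer widens its range far more than twofold.
largest_integer = 2**31 - 1
largest_long_integer = 2**63 - 1

# Round-trip 0.1 through each float encoding to compare precision.
as_float = struct.unpack("<f", struct.pack("<f", 0.1))[0]
as_double = struct.unpack("<d", struct.pack("<d", 0.1))[0]
float_error = abs(as_float - 0.1)
double_error = abs(as_double - 0.1)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Under these assumptions each double variant costs twice the bytes per value, which adds up quickly in tables with many rows, so the wider type is worth choosing only when the extra range or precision is needed.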

Slide Objective
To provide a generic listing of common data types.

Lead-in
Most system-supplied data types fall within a generic set of types.

Implementing Relationships

• Implementing One-to-One Relationships
• Implementing One-to-Many Relationships
• Implementing Many-to-Many Relationships
• Activity 6.1: Translating the Logical Data Design

In this section

Just as the entities and attributes identified in logical design are represented as tables and columns in physical design, the relationships identified in logical design also have to be represented physically.

In this section, you will learn about the relationships in a physical model and how they are implemented in a host DBMS.

This section takes a look at the various relationships that can exist between database tables and how to implement those relationships.

Implementing One-to-One Relationships

[Slide: options for a mandatory relationship and for an optional relationship between two tables. Diagram: Entity1 (E1_Key PK) and Entity2 (E2_Key PK), each with its own attributes, combined into a single table Entity3 that is keyed on E1_Key and E2_Key.]

Slide Objective
To introduce the issues involved in specifying one-to-one relationships in the

If the relationship specified between two entities is one-to-one, you have several options when designing the physical model. In a one-to-one relationship, an instance of one entity is directly related to the corresponding instance of the other entity. If both entities are required for the relationship, the entities and their relationship can be represented in one of two ways:

• As one table. You can combine the two entities into one table and use the primary keys as a composite key of the combined table. The advantage of combining the entities into one table is that you avoid the need to maintain separate tables, thereby reducing overhead and providing for more efficient utilization of storage space. The disadvantage is that if the relationship changes at some point in the future, reversing this design decision can sometimes be costly.
• As two tables. You can keep each entity in its own table and add the primary key of one entity as a foreign key of the other entity. Often there is an implied parent-child relationship between the entities. In this case, you should add the primary key of the child entity as a foreign key in the parent entity because the parent entity owns the child entity. This arrangement forces the database to allow only unique entries in each key field and helps to ensure that each instance of one entity can relate to only one instance of the other entity.

If the relationship between the entities is optional, meaning that the parent entity can exist without a related instance of the child entity, then you should create a separate table for each entity and then use foreign keys to implement the relationship.
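The two-table option for a one-to-one relationship can be sketched as follows, using Python's standard-library sqlite3 module as a stand-in for the host DBMS. The person and passport tables are hypothetical. Following the text, the parent table carries the child's primary key as a foreign key; the UNIQUE constraint on that column is an assumption about how the "only unique entries" rule would be enforced, and leaving the column nullable models the optional case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE passport (                -- child entity (hypothetical)
        passport_id INTEGER PRIMARY KEY,
        country     TEXT NOT NULL
    );
    CREATE TABLE person (                  -- parent entity (hypothetical)
        person_id   INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        -- UNIQUE keeps the link one-to-one; NULL is allowed, so it is optional.
        passport_id INTEGER UNIQUE REFERENCES passport(passport_id)
    );
""")


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO passport VALUES (1, 'NZ')")
conn.execute("INSERT INTO person VALUES (1, 'Ana', 1)")
conn.execute("INSERT INTO person VALUES (2, 'Ben', NULL)")  # parent without a child

# A second parent pointing at the same child breaks the UNIQUE constraint.
duplicate_link_ok = try_insert("INSERT INTO person VALUES (3, 'Cy', 1)")
# A link to a child that does not exist breaks the foreign key constraint.
dangling_link_ok = try_insert("INSERT INTO person VALUES (4, 'Di', 99)")

person_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

If the relationship were mandatory instead, declaring the foreign key column NOT NULL would be one way to require a related child for every parent.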

Implementing One-To-Many Relationships

[Slide diagram: Entity1 (E1_Key PK) is the parent of Entity2 (E2_Key PK), which carries E1_Key as a foreign key.]

• Use foreign keys to identify relationship between entities
• Enforce relationship with foreign key constraints
• Use unique primary key to differentiate instances of child entity
• Use same foreign key to allow multiple instances of child entity

The physical design of a one-to-many relationship is really an extension of that of a one-to-one relationship. It requires the use of a foreign key in the child entity. This foreign key determines the existence of the relationship. Enforcing the relationship usually involves making sure that each foreign key value identifies a valid instance of the parent entity.

A one-to-many relationship is used frequently in data design because it tends to work well under most circumstances.
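A minimal sketch of a one-to-many relationship follows, again using Python's sqlite3 module in place of the host DBMS. The customer and purchase tables are hypothetical: each purchase has its own primary key, several purchases share the same foreign key value, and the foreign key constraint rejects a purchase whose parent does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE customer (                -- parent entity (hypothetical)
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase (                -- child entity (hypothetical)
        purchase_id INTEGER PRIMARY KEY,   -- differentiates child instances
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
# Two children reuse the same foreign key value to point at one parent.
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?)",
    [(10, 1, "pen"), (11, 1, "ink")],
)

# A child that names a missing parent is refused by the constraint.
try:
    conn.execute("INSERT INTO purchase VALUES (12, 99, 'cap')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

purchases_for_ana = conn.execute(
    "SELECT COUNT(*) FROM purchase WHERE customer_id = 1"
).fetchone()[0]
```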

Slide Objective
one-to-many relationships in the physical design

Lead-in
Implementing a one-to-many relationship is much like implementing a one-to-one relationship.

Implementing Many-To-Many Relationships

• Cannot represent directly in most databases
• Create one or more new tables to maintain relationship

[Slide diagram: Employee (E1_KeyID) and Client (E2_KeyID), each with Attr1, Attr2, and Attr3, linked through a Contracts table that holds E1_KeyID (FK) and E2_KeyID (FK).]

Most relational database systems, including SQL Server, cannot directly represent a many-to-many relationship.

Many DBMSs work around this problem by using a new table to hold information that maintains the relationship between the entities.

In the slide, the Employee and Client entities have a many-to-many relationship. A single Employee can contract with many Clients, and a single Client can have contracts with many Employees. Because this relationship cannot be expressed directly, each entity’s primary key is used as a foreign key in a separate Contracts table. This foreign key pair uniquely identifies the relationship between the Employee table and the Client table.
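The Contracts table from the slide can be sketched as follows with Python's sqlite3 module. The table and key names come from the slide; the Name columns and the sample rows are assumptions added so that the queries return something readable, and the composite primary key is one way to make each foreign key pair unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Employee (E1_KeyID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Client   (E2_KeyID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Contracts (
        E1_KeyID INTEGER NOT NULL REFERENCES Employee(E1_KeyID),
        E2_KeyID INTEGER NOT NULL REFERENCES Client(E2_KeyID),
        PRIMARY KEY (E1_KeyID, E2_KeyID)   -- each pairing is recorded once
    );
""")

conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO Client VALUES (?, ?)", [(1, "Acme"), (2, "Birch")])
# Ana contracts with both clients; Acme has contracts with both employees.
conn.executemany("INSERT INTO Contracts VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])

clients_of_ana = [
    row[0]
    for row in conn.execute(
        "SELECT c.Name FROM Contracts k "
        "JOIN Client c ON c.E2_KeyID = k.E2_KeyID "
        "WHERE k.E1_KeyID = ? ORDER BY c.Name",
        (1,),
    )
]
employees_of_acme = [
    row[0]
    for row in conn.execute(
        "SELECT e.Name FROM Contracts k "
        "JOIN Employee e ON e.E1_KeyID = k.E1_KeyID "
        "WHERE k.E2_KeyID = ? ORDER BY e.Name",
        (1,),
    )
]
```

Reading the relationship in either direction is a join through Contracts, which is why the junction table needs both foreign keys.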

Slide Objective
many-to-many relationships in the physical design

Lead-in
The many-to-many relationship presents a unique set of issues that must be dealt with in the physical design.

Activity 6.1: Translating the Logical Data Design

In this activity, you will evaluate the logical design for part of the solution developed for the Ferguson and Bardell, Inc. case study. From this design, you will determine the tables, columns, data types, and keys that are appropriate for the solution.

After completing this activity, you will be able to:

• Evaluate a logical data design.
• Determine the appropriate data types for columns in a table.
• Produce a physical data design.

Slide Objective
To introduce this activity.

Lead-in
In this activity, you will derive a physical data design from a logical data design.

Data Optimization Techniques

• Goals of Optimization
• Optimizing for Creation of Data
• Optimizing for Retrieval of Data
• Optimizing for Updating Data
• Considerations for Deleting Data
• Activity 6.2: Optimizing a Physical Data Design

In this section

In this section, you will learn about the types of actions typically performed with data in an application and the DBMS used to store the data for the application’s use. You will also learn how data can be optimized to make each action performed with that data as efficient as possible.

Slide Objective
To introduce the topic of data optimization.

Lead-in
In this section, you will learn how data can be optimized for specific uses.

Goals of Optimization

The goal of optimization is to minimize the response time for each query and to maximize the throughput of the entire database server by minimizing network traffic, disk I/O, and processor time. This goal is achieved by understanding the application’s requirements, the logical and physical structure of the data, and the trade-offs between conflicting uses of the database, such as a large number of write-intensive insertions and heavy read-intensive queries.

Performance issues should be considered throughout the development cycle, not at the end when the system is implemented. Many significant performance improvements can be achieved by carefully optimizing the database design from the outset.

Most database systems can be classified as one of three types:

• Read-intensive. A read-intensive system is optimized for queries that do not change data. It is used mainly to gather information that is relatively static.
• Write-intensive. A write-intensive system is optimized for frequent insertion or updating of data. The database tables are designed in such a way that updates and insertions are as efficient as possible.
• Balanced implementation. A balanced implementation is optimized to perform reasonably well under both read and write operations. It is understood that it will not perform as well as a system optimized for one particular action, but it will provide acceptable performance in a wide variety of situations.

Slide Objective
To introduce goals of optimization.

Lead-in
Before you can optimize a system, you have to know what the system needs to accomplish.
