Module 6: Deriving a Physical Data Design
[Module map: Module 1: Course Overview; Module 4: Deriving a Logical Data Design; Module 5: Normalizing the Logical Data Design; Module 6: Deriving a Physical Data Design (Physical Data Design; Activity 6.1: Translating the Logical Data Design; Implementing Relationships; Data Optimization Techniques; Activity 6.2: Optimizing a Physical Data Design)]
At the end of this module, you will be able to:
• Derive a physical data design for tables and fields from a logical data design.
• Analyze data-usage characteristics to optimize a physical data design.
• Determine methods for implementing relationships in a physical data design.
• Identify different optimization techniques.
• Determine the proper criteria for optimizing a physical data design.
In this module, you will learn how data is stored, how to implement a relational model, and how to optimize a database system.
Physical Data Design
This section discusses a few general relational database concepts and the ways in which data is stored.
Describing a Physical Data Model
Flat-File
Flat-file databases were one of the first methods used to store data in an organized format. A flat-file database uses a set of rows and columns to store all information within a single file. There is no relationship between flat-file databases because each database exists without knowledge of any other database. Performance on this type of database is usually very good because of its inherent simplicity. Fast updates and retrievals of flat-file data are achieved by using an indexing method called the indexed sequential access method (ISAM). Legacy mainframe databases, as well as smaller PC-based databases, employ ISAM storage technology.
Hierarchical
Hierarchical databases are extensible and flexible. They have the advantage of being able to store a wide range of information in a variety of formats. This type of database is often used when information storage needs vary greatly. An example of a hierarchical database is Microsoft® Exchange, which is capable of storing varying types of information in a format that facilitates messaging and collaboration applications. (With collaborative applications, many types of information tend to be encapsulated in messages.)
Relational
Relational databases combine the advantages of both flat-file and hierarchical databases by providing good performance and flexibility of storage. The relational model tends to be the most popular for new database development because tables can be linked together with unique values. It is important to understand, however, that the other models are still in use and that developers working in enterprise environments will likely need to interface with one of these other types of databases at some point.
The relational model focuses on storing data, retrieving data, and maintaining data integrity. Data is stored in one or more tables, consisting of columns and rows, within a single file. Related items of data can be retrieved efficiently by using Structured Query Language (SQL), regardless of whether the items are all stored in one table or in many tables. Data integrity is maintained by applying rules and constraints.
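The three concerns of the relational model (storing, retrieving, and maintaining integrity) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module only because it needs no server; the department and employee tables, their columns, and the salary rule are hypothetical and do not come from the course material.

```python
import sqlite3

# Hypothetical schema for illustration only; names and rules are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  REAL CHECK (salary >= 0),              -- a rule applied to the data
        dept_id INTEGER REFERENCES department(dept_id) -- links the two tables
    );
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'Ana', 52000.0, 1)")

# Related items stored in two tables are retrieved with one SQL statement.
rows = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
""").fetchall()

# The CHECK constraint rejects a row that breaks the rule.
try:
    conn.execute("INSERT INTO employee VALUES (11, 'Ben', -1.0, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The same query shape works whether the data lives in one table or several; only the JOIN clause changes.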
Identifying Database Tables and Fields
• Each entity becomes a table in the physical model
  Each row is an instance of an entity
  Each column is an attribute
• Fields are the attributes of the entity
Tables are the physical representation of entities. If logical data design is done correctly, entities identified in the logical design stage should map directly to tables within a relational database. Good logical design is extremely important for good database design.
Tables can store a wide variety of data. A table can contain a name, address, picture, voiceprint, movie, Microsoft Word document, and so on. Because of this flexibility, a database can be used not only to store simple text data, but also to store the knowledge base of a business, no matter what form that knowledge takes.
The data in a table is stored in rows, or records. Each record must be unique. Records are manipulated by using SQL, the American National Standards Institute (ANSI) standard relational database language. SQL is an English-like language that abstracts the operations performed on a database into easily readable statements, such as Insert, Update, and Delete. Most relational databases adhere to the ANSI SQL standard, although the version and enhancements used vary from product to product.
Tables can be linked to other tables within the same database file. This capability allows one type of data to be joined to another type and allows data normalization.
The data in each record is stored in columns, or fields, that are specified from the attributes of the table’s defining entity. Each field contains one distinct item of data, such as a customer name, and each field must have a single data type, such as text. This data type is defined based on the kind of data stored in the field. The data types that are allowed for a given field depend on the data types supported by the hosting DBMS. When defining your tables, you should choose data types that will optimize performance, conserve disk space, and allow for growth.
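As a minimal sketch of an entity mapped to a table, the following uses Python's sqlite3 module with a hypothetical customer table. The column names and type choices are assumptions for illustration; the types available in practice depend on the hosting DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Customer entity mapped to a table; each attribute becomes a typed field.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,  -- keeps each record unique
        name        TEXT NOT NULL,        -- one distinct item of data per field
        joined_on   TEXT,                 -- ISO date text; SQLite has no dedicated date type
        photo       BLOB                  -- binary data such as a picture
    )
""")

# Insert, Update, and Delete are the basic statements for manipulating records.
conn.execute(
    "INSERT INTO customer (customer_id, name, joined_on) VALUES (?, ?, ?)",
    (1, "Contoso Ltd", "2001-05-14"),
)
conn.execute("UPDATE customer SET name = ? WHERE customer_id = ?", ("Contoso Limited", 1))
after_update = conn.execute("SELECT name FROM customer").fetchall()

# A second record with the same key is rejected, so records stay unique.
try:
    conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

conn.execute("DELETE FROM customer WHERE customer_id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```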
Slide Objective: To introduce the role of a table and fields in the physical data design.
Lead-in: Tables and fields are the basic building blocks of physical data design.
Common Data Types
Data type        Description
Date             Date and time data
String           Fixed-length character data
Binary           Fixed-length binary data
Money            Currency values with fixed scale
Integer          Integer (whole number) data
Float            Floating point with fixed-precision data range
(Double) Float   Float value with double-precision data range over regular float data type
(Long) Integer   Integer value with double-precision data range over regular integer data type
Every field within a database must have a data type. The data type allows you, and the database engine itself, to verify that a value entered in a field is valid for the information that the field represents.
Most DBMSs support two major classifications of data types:
• System-supplied data types. Every DBMS contains its own data types. Examples of system-supplied types are Integer, Character, and Binary. Some DBMSs contain variations of these types as well as additional types.
• User-defined data types. Some DBMSs allow you to define your own data types based on the system-supplied types. For example, with SQL Server™ you might define a State data type based on the Character type with a length of 2. Defining this data type would help maintain conformity across all tables that include a State field. In every table, any field of State data type would be consistent and identical.
The slide shows a set of common data types, each of which is a variation on a generic character, number, or binary data type. For example, Float, Money, and Integer all store numeric data, but they store it in different formats and can display it in different formats. Because their data is stored in different formats, different data types consume different amounts of storage space. For example, the double variants of a data type can store a much wider range of values or store a fraction to more decimal places, but they typically use twice as much storage space. During physical design, you need to consider the nuances of each data type to ensure that the most efficient storage solution is obtained.
Details about data types are generally specific to the DBMS engine being used. Refer to the DBMS documentation for specific information.
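The storage cost of the double variants can be checked with a short sketch. It uses Python's struct module and common fixed-width binary encodings (32-bit and 64-bit) as a stand-in; this is an assumption for illustration, because each DBMS defines its own storage formats.

```python
import struct

# Byte sizes of common fixed-width encodings ("<" selects standard sizes).
# A given DBMS may store these types differently; check its documentation.
sizes = {
    "integer (32-bit)": struct.calcsize("<i"),
    "long integer (64-bit)": struct.calcsize("<q"),
    "float (single precision)": struct.calcsize("<f"),
    "double float (double precision)": struct.calcsize("<d"),
}

# Round-trip 0.1 through each float width to compare the precision kept.
as_single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
as_double = struct.unpack("<d", struct.pack("<d", 0.1))[0]
```

In this encoding each double variant takes 8 bytes against 4 for the regular type, and the single-precision round trip loses digits that the double-precision one keeps.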
Slide Objective: To provide a generic listing of common data types.
Lead-in: Most system-supplied data types fall within a generic set of types.
Implementing Relationships
In this section:
• Implementing One-to-One Relationships
• Implementing One-to-Many Relationships
• Implementing Many-to-Many Relationships
• Activity 6.1: Translating the Logical Data Design
Just as the entities and attributes identified in logical design are represented as tables and columns in physical design, the relationships identified in logical design also have to be represented physically.
In this section, you will learn about the relationships in a physical model and how they are implemented in a host DBMS.
This section takes a look at the various relationships that can exist between database tables and how to implement those relationships.
Implementing One-to-One Relationships
• If relationship is mandatory: one table OR two tables
• If relationship is optional: relationship between two tables
[Slide diagram: Entity1 (E1_Key PK) and Entity2 (E2_Key PK), each with its own attributes, are either kept as two tables or combined into Entity3 (E1_Key PK, E2_Key PK; E1_Attr1, E1_Attr2, E1_Attr3, E2_Attr1, E2_Attr2, E2_Attr3)]
If the relationship specified between two entities is one-to-one, you have several options when designing the physical model. In a one-to-one relationship, an instance of one entity is directly related to the corresponding instance of the other entity. If both entities are required for the relationship, the entities and their relationship can be represented in one of two ways:
• As one table. You can combine the two entities into one table and use the primary keys as a composite key of the combined table. The advantage of combining the entities into one table is that you avoid the need to maintain separate tables, thereby reducing overhead and providing for more efficient utilization of storage space. The disadvantage is that if the relationship changes at some point in the future, reversing this design decision can sometimes be costly.
• As two tables. You can keep each entity in its own table and add the primary key of one entity as a foreign key of the other entity. Often there is an implied parent-child relationship between the entities. In this case, you should add the primary key of the child entity as a foreign key in the parent entity because the parent entity owns the child entity. This arrangement forces the database to allow only unique entries in each key field and helps to ensure that each instance of one entity can relate to only one instance of the other entity.
If the relationship between the entities is optional, meaning that the parent entity can exist without a related instance of the child entity, then you should create a separate table for each entity and then use foreign keys to implement the relationship.
Slide Objective: To introduce the issues involved in specifying one-to-one relationships in the physical design.
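The two-table option for a one-to-one relationship can be sketched as follows, using Python's sqlite3 module. The employee (parent) and company_car (child) tables are hypothetical. Following the notes, the child's primary key is carried as a foreign key in the parent; the UNIQUE constraint and the nullable column are assumptions about how the "only one instance" and "optional" rules might be expressed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE company_car (            -- child entity (hypothetical)
        car_id INTEGER PRIMARY KEY,
        plate  TEXT NOT NULL
    );
    CREATE TABLE employee (               -- parent entity (hypothetical)
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        -- UNIQUE keeps the relationship one-to-one; allowing NULL makes it optional.
        car_id INTEGER UNIQUE REFERENCES company_car(car_id)
    );
""")
conn.execute("INSERT INTO company_car VALUES (1, 'ABC-123')")
conn.execute("INSERT INTO employee VALUES (10, 'Ana', 1)")
conn.execute("INSERT INTO employee VALUES (11, 'Ben', NULL)")  # parent without a child

# A second employee cannot be related to the car that is already assigned.
try:
    conn.execute("INSERT INTO employee VALUES (12, 'Cy', 1)")
    second_owner_rejected = False
except sqlite3.IntegrityError:
    second_owner_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

For a mandatory relationship, the same sketch would declare car_id as NOT NULL.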
Implementing One-To-Many Relationships
• Use foreign keys to identify relationship between entities
• Enforce relationship with foreign key constraints
• Use unique primary key to differentiate instances of child entity
• Use same foreign key to allow multiple instances of child entity
[Slide diagram: Entity1 (E1_Key PK) is the parent of Entity2 (E2_Key PK, E1_Key FK); each entity has its own attributes]
The physical design of a one-to-many relationship is really an extension of that of a one-to-one relationship. It requires the use of a foreign key in the child entity. This foreign key determines the existence of the relationship. Enforcing the relationship usually involves making sure that each foreign key value refers to a valid instance of the parent entity.
A one-to-many relationship is used frequently in data design because it tends to work well under most circumstances.
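A minimal sketch of a one-to-many relationship, again with Python's sqlite3 module. The customer (parent) and purchase_order (child) tables are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
    CREATE TABLE customer (               -- parent entity (hypothetical)
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase_order (         -- child entity (hypothetical)
        order_id    INTEGER PRIMARY KEY,  -- unique key differentiates child instances
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       REAL
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'Contoso')")

# The same foreign key value may repeat, giving many children for one parent.
conn.executemany(
    "INSERT INTO purchase_order VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 1, 40.0)],
)
orders_for_customer = conn.execute(
    "SELECT COUNT(*) FROM purchase_order WHERE customer_id = 1"
).fetchone()[0]

# The foreign key constraint rejects a child that points at no valid parent.
try:
    conn.execute("INSERT INTO purchase_order VALUES (102, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```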
Slide Objective: To introduce the issues involved in specifying one-to-many relationships in the physical design.
Lead-in: Implementing a one-to-many relationship is much like implementing a one-to-one relationship.
Implementing Many-To-Many Relationships
• Cannot represent directly in most databases
• Create one or more new tables to maintain relationship
[Slide diagram: Employee (E1_KeyID; Attr1, Attr2, Attr3) and Client (E2_KeyID; Attr1, Attr2, Attr3) are linked through Contracts (E1_KeyID FK, E2_KeyID FK)]
Most relational database systems, including SQL Server, cannot directly represent a many-to-many relationship.
The usual workaround is to use a new table to hold information that maintains the relationship between the entities.
In the slide, the Employee and Client entities have a many-to-many relationship. A single Employee can contract with many Clients, and a single Client can have contracts with many Employees. Because this relationship cannot be expressed directly, each entity’s primary key is used as a foreign key in a separate Contracts table. This foreign key pair uniquely identifies the relationship between the Employee table and the Client table.
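The Employee, Client, and Contracts arrangement from the slide can be sketched with Python's sqlite3 module. The key names follow the slide; the Name columns, the sample rows, and the choice to make the foreign key pair the primary key of Contracts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Employee (E1_KeyID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Client   (E2_KeyID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Contracts (
        E1_KeyID INTEGER NOT NULL REFERENCES Employee(E1_KeyID),
        E2_KeyID INTEGER NOT NULL REFERENCES Client(E2_KeyID),
        PRIMARY KEY (E1_KeyID, E2_KeyID)  -- assumption: the pair identifies one contract
    );
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO Client VALUES (?, ?)", [(1, "Contoso"), (2, "Fabrikam")])
conn.executemany("INSERT INTO Contracts VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])

# One employee contracts with many clients...
clients_of_ana = [row[0] for row in conn.execute("""
    SELECT c.Name
    FROM Contracts AS k
    JOIN Client AS c ON c.E2_KeyID = k.E2_KeyID
    WHERE k.E1_KeyID = 1
    ORDER BY c.Name
""")]

# ...and one client has contracts with many employees.
employees_of_contoso = [row[0] for row in conn.execute("""
    SELECT e.Name
    FROM Contracts AS k
    JOIN Employee AS e ON e.E1_KeyID = k.E1_KeyID
    WHERE k.E2_KeyID = 1
    ORDER BY e.Name
""")]
```

In effect, the Contracts table turns one many-to-many relationship into two one-to-many relationships.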
Slide Objective: To introduce the issues involved in specifying many-to-many relationships in the physical design.
Lead-in: The many-to-many relationship presents a unique set of issues that must be dealt with in the physical design.
Activity 6.1: Translating the Logical Data Design
In this activity, you will evaluate the logical design for part of the solution developed for the Ferguson and Bardell, Inc. case study. From this design, you will determine the tables, columns, data types, and keys that are appropriate for the solution.
After completing this activity, you will be able to:
• Evaluate a logical data design.
• Determine the appropriate data types for columns in a table.
• Produce a physical data design.
Slide Objective: To introduce this activity.
Lead-in: In this activity, you will derive a physical data design from a logical data design.
Data Optimization Techniques
In this section:
• Goals of Optimization
• Optimizing for Creation of Data
• Optimizing for Retrieval of Data
• Optimizing for Updating Data
• Considerations for Deleting Data
• Activity 6.2: Optimizing a Physical Data Design
In this section, you will learn about the types of actions typically performed with data in an application and the DBMS used to store the data for the application’s use. You will also learn how data can be optimized to make each action performed with that data as efficient as possible.
Slide Objective: To introduce the topic of data optimization.
Lead-in: In this section, you will learn how data can be optimized for specific uses.
Goals of Optimization
The goal of optimization is to minimize the response time for each query and to maximize the throughput of the entire database server by minimizing network traffic, disk I/O, and processor time. This goal is achieved by understanding the application’s requirements, the logical and physical structure of the data, and the trade-offs between conflicting uses of the database, such as a large number of write-intensive insertions and heavy read-intensive queries.
Performance issues should be considered throughout the development cycle, not at the end when the system is implemented. Many significant performance improvements can be achieved by carefully optimizing the database design from the outset.
Most database systems can be classified as one of three types:
• Read-intensive. A read-intensive system is optimized for queries that do not change data. It is used mainly to gather information that is relatively static.
• Write-intensive. A write-intensive system is optimized for frequent insertion or updating of data. The database tables are designed in such a way that updates and insertions are as efficient as possible.
• Balanced implementation. A balanced implementation is optimized to perform reasonably well under both read and write operations. It is understood that it will not perform as well as a system optimized for one particular action, but will provide acceptable performance in a wide variety of situations.
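The read/write trade-off can be made concrete with one familiar example. The notes above do not name a specific technique at this point, so the following sketch assumes an index as the read-side optimization; the reading table and its data are hypothetical. An index lets the engine search instead of scanning the whole table, but every later insertion must also maintain the index, which is the cost a write-intensive system tries to avoid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reading (reading_id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO reading (sensor, value) VALUES (?, ?)",
    [(f"s{i % 50}", float(i)) for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query-plan description (the detail column) for a statement."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT value FROM reading WHERE sensor = 's7'"

plan_without_index = plan(query)  # the engine has to scan the table
conn.execute("CREATE INDEX idx_reading_sensor ON reading (sensor)")
plan_with_index = plan(query)     # the engine can now search through the index
```

Printing the two plan strings shows the change from a table scan to an index search; the exact wording varies between SQLite versions.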
Slide Objective: To introduce goals of optimization.
Lead-in: Before you can optimize a system, you have to know what the system needs to accomplish.