Data Design
Course map (figure): Module 1: Course Overview; Module 3: Using a Conceptual Design for Data Requirements; Module 4: Deriving a Logical Data Design; Module 5: Normalizing the Logical Data Design
Module 5 topics: Implementing Entity Relationships; Activity 5.1: Identifying Keys in the Logical Model; Normalization Basics; Activity 5.2: Deriving the Third Normal Form for a Logical Data Model
Module 5: Normalizing the Logical Data Design
After completing this module, you will be able to:
• Use primary and foreign keys to implement relationships between entities.
• Explain the benefits of normalizing entities.
• Normalize a table to third normal form.
Implementing Entity Relationships
• Using Keys
• Primary Keys
• Foreign Keys
• Examples of Primary and Foreign Keys
• Activity 5.1: Identifying Keys in the Logical Model
In this section, you will learn how keys uniquely identify entities within the logical data model. You will also learn how keys are used to implement relationships between entities.
Slide Objective: To introduce the topics in this section.
Lead-in: In this section, you will learn about primary and foreign keys and how to use them to implement relationships.
Using Keys
• Benefits
Keys are identifying values assigned to each instance of an entity within a data model. As you learned in Module 4, “Deriving a Logical Data Design,” entities within a data model represent groupings of information about people, places, objects, or ideas. When you begin to move a logical design to a physical design, you use keys to uniquely identify each instance of an entity within the data model.
Keys also provide the mechanism for tying entities together. In your physical database design, you represent relationships between entities by adding the keys from parent entity tables to child entity tables, so that the entities are bound together by the common key value.
Slide Objective: To introduce the concept of keys within the physical data model.
Lead-in: Keys are the mechanism for implementing relationships.
Primary Keys
• Identifying attributes (existing or new) for entities
• Unique values are necessary
• Types of primary keys
A primary key is a field, or a combination of fields, whose value uniquely identifies each instance of an entity. The primary key value must be unique across all instances of an entity; otherwise, the key value cannot identify any one instance. The uniqueness of the values in a primary key allows each instance of an entity to exist independently.
An entity’s primary key can be an existing attribute within the entity. Any identifying attribute that is guaranteed to be unique across all instances of an entity is a candidate for a primary key. If the uniqueness of the existing attributes cannot be guaranteed, you can create an identifying key, usually a numeric value, and assign it primary key responsibilities.
The following types of primary keys exist:
• Intelligent keys. These keys are created from existing attributes and are related to entities as both attributes and keys.
• Surrogate keys. These keys are identifier values with no specific relation to entities other than to uniquely identify them.
• Composite keys. These keys include more than one attribute per entity. When one attribute within an entity is not enough to uniquely identify each instance of the entity, you can select more than one attribute to form a composite key. For example, each project within a project-tracking database might have a project number, and each project might contain multiple jobs, each with its own identifying job number. You might reuse the same job numbers within each project to denote a specific phase of the project. In this case, the primary key of the project table, which holds one row per job, would be a composite key consisting of both the project number and the job number, because both numbers are necessary to uniquely identify any one job.
Slide Objective: To introduce primary keys and how they are defined within the data model.
Lead-in: Primary keys define uniqueness in the relational
It might help to draw a picture of a project table with a composite key.
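The composite key from the project and job example can be sketched in code. The following is a minimal illustration using Python's standard sqlite3 module; the table name ProjectJob, its columns, and the sample rows are hypothetical and are not part of the course materials.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ProjectJob (
        ProjectNumber INTEGER NOT NULL,
        JobNumber     INTEGER NOT NULL,
        JobDesc       TEXT,
        PRIMARY KEY (ProjectNumber, JobNumber)  -- composite key
    )
    """
)

# Job number 1 recurs in two projects; the pair (project, job) is still unique.
conn.execute("INSERT INTO ProjectJob VALUES (100, 1, 'Planning')")
conn.execute("INSERT INTO ProjectJob VALUES (100, 2, 'Build')")
conn.execute("INSERT INTO ProjectJob VALUES (200, 1, 'Planning')")


def try_insert(project_number, job_number, desc):
    """Return True if the row was stored, False if the key pair already exists."""
    try:
        conn.execute(
            "INSERT INTO ProjectJob VALUES (?, ?, ?)",
            (project_number, job_number, desc),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Reusing an existing (project, job) pair violates the composite primary key.
duplicate_accepted = try_insert(100, 1, "Planning again")
row_count = conn.execute("SELECT COUNT(*) FROM ProjectJob").fetchone()[0]
```

Neither column is unique on its own in this sketch; only the combination is, which is why the database accepts job 1 in both projects but rejects a second job 1 within project 100.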
Foreign Keys
• Values that establish relationships between parent and child entities
• Typically primary keys of parent entities
• Keys that contain the same values as parent entities’ instances
• Possibly members of child entities’ primary keys
Foreign keys are values that exist in child entities and that establish relationships with the corresponding parent entities. A parent entity can find all of its related child instances by searching the child entity for instances with the matching foreign key value.
A foreign key for a child entity is usually the primary key of the parent entity: because the primary key uniquely identifies the parent, each child instance is associated with exactly one instance of the parent.
In a one-to-many relationship, the foreign key cannot serve as the child’s entire primary key, because many instances of the child entity may be related to the same parent instance and therefore share the same foreign key value. In many cases, however, the foreign key can serve as a member of a composite key for the child.
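As a rough sketch of how a foreign key links a child to its parent, the following example uses Python's standard sqlite3 module. The Employee and Timesheet names follow the course case study, but the column layout and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        LastName   TEXT NOT NULL
    );
    CREATE TABLE Timesheet (
        TimesheetID INTEGER PRIMARY KEY,
        EmployeeID  INTEGER NOT NULL REFERENCES Employee(EmployeeID),
        HoursWorked REAL
    );
    INSERT INTO Employee VALUES (1, 'Hypothetical-A'), (2, 'Hypothetical-B');
    INSERT INTO Timesheet VALUES (10, 1, 8.0), (11, 1, 6.5), (12, 2, 7.0);
    """
)


def timesheets_for(employee_id):
    """Find a parent's children by searching the child table for its key."""
    rows = conn.execute(
        "SELECT TimesheetID FROM Timesheet "
        "WHERE EmployeeID = ? ORDER BY TimesheetID",
        (employee_id,),
    )
    return [row[0] for row in rows]


def orphan_rejected():
    """A timesheet that points at a nonexistent employee should be refused."""
    try:
        conn.execute("INSERT INTO Timesheet VALUES (13, 99, 4.0)")
        return False
    except sqlite3.IntegrityError:
        return True
```

Two timesheets share EmployeeID 1, so EmployeeID alone could not be the Timesheet primary key; the separate TimesheetID surrogate key fills that role here.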
Slide Objective: To introduce foreign key concepts.
Lead-in: Foreign keys are used as the links between entities that make relationships.
Also remind them about parent and child entities, as well as the different types of relationships.
Examples of Primary and Foreign Keys
The slide shows a logical data model with primary and foreign keys, as well as entity relationships.
The primary keys identified for each entity are surrogate keys. As shown, each instance of the Employee, Client, Timesheet, and Invoice entities has its own unique identifier. The EmployeeID primary key in the Employee entity is used as a foreign key in both the Client and Timesheet entities. Although the Timesheet entity has its own primary key, the EmployeeID foreign key is responsible for linking the Timesheet and Employee entities together.
Notice the existence dependency between the Timesheet and Employee entities: a timesheet cannot exist unless an employee fills it out. However, the dotted line between the Client and Employee entities indicates that although an employee and a client are related, neither depends on the other for its existence.
Slide Objective: To show how primary and foreign keys might be used in a logical data model.
Lead-in: The following slide shows how primary and foreign keys might be used in a logical data model.
Activity 5.1: Identifying Keys in the Logical Model
In this activity, you will identify primary, foreign, and (if necessary) composite keys in the logical data model for Ferguson and Bardell, Inc.
After completing this activity, you will be able to:
• Identify primary, foreign, and composite keys in a logical data model.
• Select a primary and foreign key type that is appropriate for a given entity.
Slide Objective: To introduce this activity.
Lead-in: In this activity, you will identify keys in the Ferguson and Bardell, Inc. logical data model.
Delivery Tip: Make the students aware that they are focusing on keys in this activity. They are identifying keys to reinforce their knowledge so that they will be able to build a normalized data model.
Normalization Basics
• Normalizing Logical Models
• Creating a First Normal Form Data Model
• First Normal Form Example
• Moving to a Second Normal Form Data Model
• Creating a Third Normal Form Data Model
• Third Normal Form Example
Slide Objective: To explain the purpose of this section and what students will learn.
Lead-in: In this section, you will learn about normalization and its benefits.
Normalizing Logical Models
• Process of eliminating duplicate data and, usually, defining relationships among tables
• Normal forms
• Normalized databases typically include more tables with fewer columns
Normalization is the process of progressively refining a logical model to eliminate duplicate data from a database. Normalization usually involves dividing a database into two or more tables and defining relationships among these tables.
Database theorists have evolved a de facto standard of increasingly restrictive constraints, or normal forms, on the layout of databases. Applying these normal forms results in a normalized database. These de facto standards have generated at least five commonly accepted normal form levels, each progressively more restrictive on data duplication than the preceding one. As discussed later in this module, third normal form is the most common level of normalization because it is a compromise between too little normalization and too much.
Normalized databases typically include more tables with fewer columns.
Normalizing a database provides the following benefits:
• Minimized duplication of information
A normalized database contains less duplicate information. For example, the Ferguson and Bardell, Inc. database needs to track timesheet and invoice information. Storing timesheet information in the Invoice table would eventually cause this table to include a large amount of redundant data, such as employee, job, task, and client information. Normalizing the Ferguson and Bardell, Inc. database creates separate, related tables for timesheets and invoices, thus avoiding duplication.
• Reduced data inconsistencies
Normalization reduces data inconsistencies by maintaining table relationships. For example, if a client’s phone number is stored in multiple tables, or even in multiple records within the same table, and the number changes, the phone number might not be changed in all locations. A client’s phone number stored only once in one table is easier to maintain.
• Faster data modification (insertions, updates, and deletions)
Normalization speeds up the data modification process in a database. For example, removing client names and address information from the Invoice table results in less data to track and manipulate when working with an invoice. The removed data is not lost because it still exists in the Client table. Additionally, reducing duplicated information improves performance during updates because fewer values must be modified in the tables.
Overnormalized data can cause performance problems in certain types of applications and can adversely affect read/write operations, for example when a query must join many tables to reassemble a single record.
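The phone number inconsistency described above can be made concrete with a short sketch. The following Python example uses plain dictionaries rather than a database; the field names and sample values are hypothetical.

```python
from collections import defaultdict

# Denormalized rows: the client's phone number is repeated on every invoice.
# One row was updated after the number changed; the other was missed.
invoices_flat = [
    {"InvoiceID": 1, "ClientName": "Client A", "ClientPhone": "555-0199"},
    {"InvoiceID": 2, "ClientName": "Client A", "ClientPhone": "555-0100"},
    {"InvoiceID": 3, "ClientName": "Client B", "ClientPhone": "555-0142"},
]


def inconsistent_clients(rows):
    """Return the names of clients recorded with more than one phone number."""
    phones = defaultdict(set)
    for row in rows:
        phones[row["ClientName"]].add(row["ClientPhone"])
    return sorted(name for name, seen in phones.items() if len(seen) > 1)


conflicts = inconsistent_clients(invoices_flat)

# Normalized layout: the phone number is stored once, keyed by ClientID,
# and each invoice carries only the ClientID foreign key.
clients = {
    1: {"ClientName": "Client A", "ClientPhone": "555-0199"},
    2: {"ClientName": "Client B", "ClientPhone": "555-0142"},
}
invoices = [
    {"InvoiceID": 1, "ClientID": 1},
    {"InvoiceID": 2, "ClientID": 1},
    {"InvoiceID": 3, "ClientID": 2},
]
phones_by_invoice = {
    inv["InvoiceID"]: clients[inv["ClientID"]]["ClientPhone"] for inv in invoices
}
```

In the normalized layout there is no second copy of the number to fall out of step, so every invoice for the same client resolves to the same phone number.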
Creating a First Normal Form Data Model
• Create two-dimensional tables
• Assign only one value to each cell
• Assign a single meaning to each column
The first step in normalizing a database is to ensure that the tables are in first normal form. To accomplish this step, the tables must adhere to the following criteria:
• Tables must be two-dimensional (with columns and rows).
Each entity specified in the logical data model is transformed into a two-dimensional database table, similar to a spreadsheet.
• Each cell must contain one value.
• Each column must have a single meaning.
For example, a dual-purpose column, such as Order Date/Delivery Date, is not allowed.
Slide Objective: To introduce the process of creating a first normal form data design.
Lead-in: Creating first normal form is the first step in producing a normalized data design.
Delivery Tip: Remind the students that they are not completely out of logical design at this point. They are still moving from logical data design to physical data design. This is a step in that process.
First Normal Form Example
As shown in the slide, the Timesheet entity for the Ferguson and Bardell, Inc. case study was not originally in first normal form. The original Timesheet entity had no unique identifier, and not all of its attributes held a single piece of information. Also, each column in the Timesheet table could have multiple meanings.
For example, the Employee attribute can be divided into two distinct attributes: Employee FirstName and Employee LastName. This logic also applies to the Client and Job attributes, as shown in the slide.
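The split of the Employee attribute can be sketched as follows. This is only an illustration: it assumes that each combined value has the form "First Last", it assigns a hypothetical TimesheetID surrogate key to supply the missing unique identifier, and the sample rows are invented.

```python
def to_first_normal_form(rows):
    """Give each row a unique identifier and split the combined Employee value.

    Assumes every Employee value looks like "First Last" (an illustrative
    simplification; real names need more careful handling).
    """
    result = []
    for timesheet_id, row in enumerate(rows, start=1):
        first_name, last_name = row["Employee"].split(" ", 1)
        result.append(
            {
                "TimesheetID": timesheet_id,  # hypothetical surrogate key
                "EmployeeFirstName": first_name,
                "EmployeeLastName": last_name,
                "Hours": row["Hours"],
            }
        )
    return result


raw_timesheets = [
    {"Employee": "Pat Example", "Hours": 8},
    {"Employee": "Sam Sample", "Hours": 6},
]
timesheets_1nf = to_first_normal_form(raw_timesheets)
```

After the transformation, each cell holds one value and each column has a single meaning, which matches the criteria listed for first normal form.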
Slide Objective: To illustrate the concept of first normal form.
Lead-in: The Ferguson and Bardell, Inc. Timesheet entity was not originally in first normal form.
Moving to a Second Normal Form Data Model
• Eliminate redundant data within an entity
• Move any attribute that depends on only part of a multivalue key to a separate table
• Consolidate information when possible
To move a data design to second normal form, you must look at several instances of an entity and move any redundant data within an entity’s attributes to a separate table.
The slide shows the Timesheet entity within the Ferguson and Bardell, Inc. case study. If a client moves to a different city, the client’s record would have to be updated, as well as every timesheet that referenced that client. The solution is to remove the client information from the Timesheet table and replace it with a ClientID foreign key that corresponds to the ClientID primary key of the Client table. The Timesheet table can then find the client’s name and address via the foreign key relationship. When a client’s address changes, the change is recorded only in the Client table. Similarly, the second normal form would replace the employee information with an EmployeeID foreign key that would link the timesheet to the individual employee who enters information into the timesheet.
The second normal form also eliminates any other duplicate information. The JobName and JobDesc attributes in the slide have been replaced with a single attribute, JobDesc, because the job description would likely contain the job name. This logic also applies to the TaskName and TaskDesc attributes.
1. Eliminate redundant employee and client information by making foreign key references to existing Employee and Client entities.
2. Move additional employee and client information out of the Timesheet table because this information is not needed to describe a timesheet.
3. Consolidate job and task information. This information remains in the Timesheet table because it contains attributes that are needed to describe a timesheet.
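The steps above can be sketched with Python's standard sqlite3 module. The Client and Timesheet names and the ClientID, JobDesc, and TaskDesc attributes come from the case study; the remaining columns and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Client (
        ClientID   INTEGER PRIMARY KEY,
        ClientName TEXT NOT NULL,
        City       TEXT
    );
    -- Client details are replaced by a ClientID foreign key;
    -- consolidated job and task descriptions stay with the timesheet.
    CREATE TABLE Timesheet (
        TimesheetID INTEGER PRIMARY KEY,
        ClientID    INTEGER NOT NULL REFERENCES Client(ClientID),
        JobDesc     TEXT,
        TaskDesc    TEXT
    );
    INSERT INTO Client VALUES (1, 'Client A', 'Oldtown');
    INSERT INTO Timesheet VALUES
        (10, 1, 'Audit', 'Review ledgers'),
        (11, 1, 'Audit', 'Write report');
    """
)

# The client moves: one UPDATE in Client, and no Timesheet row is touched.
conn.execute("UPDATE Client SET City = 'Newtown' WHERE ClientID = 1")

# Each timesheet finds the client's current city through the foreign key.
cities = [
    row[0]
    for row in conn.execute(
        "SELECT c.City FROM Timesheet t "
        "JOIN Client c ON c.ClientID = t.ClientID "
        "ORDER BY t.TimesheetID"
    )
]
```

Because the city is stored only in Client, both timesheets report the new city after a single update.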
Slide Objective: To illustrate the concept of second normal form.
Lead-in: Second normal form is a bridge process that eventually leads to third normal form.
Creating a Third Normal Form Data Model
• Eliminate any columns that do not depend on a key value for their existence
• Generally, move any data not directly related to the entity to another table
• Reduce or eliminate update and deletion anomalies
• Verify that no redundant data remains
Third normal form eliminates any columns that do not depend on a key value for their existence. Any data not directly related to the entity is generally moved to another table. Third normal form is generally the final form that you should implement.
Third normal form helps to avoid update and deletion anomalies because all data can be reached via foreign key values and redundant data within each table no longer exists. This level of normalization makes the database more robust and the design more efficient to maintain.
1. Normalize to first and second normal form, as appropriate.
2. Identify any attributes that do not depend on a key value for their existence.
3. Move these independent attributes into separate tables and identify each with a primary key. Relate that primary key back to the parent entity as a foreign key.
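Step 3 can be sketched in plain Python. The helper extract_table below is hypothetical, as are the JobSite attribute, the JobID surrogate key, and the sample rows; the sketch assumes that JobDesc and JobSite describe a job rather than a timesheet, so they do not depend on TimesheetID.

```python
def extract_table(rows, attrs, key_name):
    """Move `attrs` out of `rows` into a new table with a surrogate key.

    Returns the slimmed rows, each carrying `key_name` as a foreign key,
    and the new table as a dict of surrogate key -> moved attributes.
    """
    new_table = {}      # surrogate key -> moved attribute values
    key_by_value = {}   # distinct attribute values -> surrogate key
    slimmed = []
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value not in key_by_value:
            key = len(key_by_value) + 1
            key_by_value[value] = key
            new_table[key] = dict(zip(attrs, value))
        remaining = {k: v for k, v in row.items() if k not in attrs}
        remaining[key_name] = key_by_value[value]
        slimmed.append(remaining)
    return slimmed, new_table


timesheets = [
    {"TimesheetID": 1, "Hours": 8, "JobDesc": "Audit", "JobSite": "North office"},
    {"TimesheetID": 2, "Hours": 5, "JobDesc": "Audit", "JobSite": "North office"},
    {"TimesheetID": 3, "Hours": 7, "JobDesc": "Tax filing", "JobSite": "South office"},
]
timesheets_3nf, jobs = extract_table(timesheets, ["JobDesc", "JobSite"], "JobID")
```

Each distinct job now appears once in the new table, and the timesheet rows keep only the attributes that depend on TimesheetID plus the JobID foreign key.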
Slide Objective: To illustrate the concepts of third normal form.
Lead-in: Third normal form is the level to which most design teams strive to normalize their data designs.