effort is underway, which includes building integrated application and database systems
to perform basic business functions.
The User Views
UTLA wishes to construct a system to track their academic activities, including
course offerings, instructor qualifications for the courses, course enrollment, and
student grades. The following illustrations show the desired output reports with
sample data (these are the user views that should be normalized).
Student report:
Course report:
Instructor report:
• A qualifications committee must approve instructors before they are permitted to teach a particular course. The qualifications (that is, the courses that the committee has determined the instructor is qualified to teach) are then added to the instructor’s records, as shown in the Instructor report. The list of qualified courses does not imply that the instructor has ever actually taught the course but only that he or she is qualified to do so.
• Based on demand, any course may be offered multiple times, even in the same year and semester. Each offering is called a “section,” as shown in the Section report.
• Students enroll in a particular section of a course and receive a grade for their participation in that course offering. Should they take the course again at a later time, they receive another grade, and both grades are part of their permanent academic record.
• Although the day, time, building, and room for each section are noted in the Section report, this is done merely to facilitate registering students. The scheduling of classrooms is out of scope for this project.
• The day(s) and time(s) attributes on the Section report are merely text descriptions of the meeting schedule. The building of a meeting calendar for sections is out of scope for this project.
As a convenience, here are the attributes rewritten using our relation listing
method, with repeating groups and multivalued attributes enclosed in parentheses:
STUDENT REPORT: # ID, NAME, STREET ADDRESS, CITY, STATE,
ZIP CODE, HOME PHONE
COURSE REPORT: # ID, TITLE, NUMBER OF CREDITS,
(PREREQUISITE COURSES), DESCRIPTION
INSTRUCTOR REPORT: # ID, NAME, STREET ADDRESS, CITY, STATE,
ZIP CODE, HOME PHONE, OFFICE PHONE, (QUALIFIED COURSES)
SECTION REPORT: YEAR, SEMESTER, BUILDING, ROOM, DAYS,
TIMES, INSTRUCTOR ID, INSTRUCTOR NAME, COURSE ID, NUMBER OF CREDITS,
(STUDENT ID, STUDENT NAME, GRADE)
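To make the parenthesized notation concrete, here is a minimal Python sketch of how the repeating group in the Section report could be unrolled into one flat row per student, which is the first step toward first normal form. The dictionary layout, the function name flatten_section, and all sample values are assumptions made for illustration; they do not come from the UTLA reports.

```python
# Hypothetical sample of one Section report entry; every value is invented.
section_report = {
    "year": 2004,
    "semester": "Fall",
    "course_id": "X100",
    "instructor_id": 1,
    # The repeating group (STUDENT ID, STUDENT NAME, GRADE)
    "students": [
        (101, "Student A", "A"),
        (102, "Student B", "B"),
    ],
}


def flatten_section(report):
    """Return one flat row per member of the repeating group."""
    # Attributes outside the repeating group are copied onto every row.
    header = {key: value for key, value in report.items() if key != "students"}
    rows = []
    for student_id, student_name, grade in report["students"]:
        row = dict(header)
        row.update(student_id=student_id, student_name=student_name, grade=grade)
        rows.append(row)
    return rows


rows = flatten_section(section_report)
```

Each resulting row repeats the section-level attributes, which is exactly the redundancy that the later normalization steps remove.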
Author’s Solution
Database design is not an exact science, so there is some latitude for alternative
solutions. However, all must meet the criteria for third normal form. Here are the
normalized relations, with the hash mark (#) denoting primary key attributes:
COURSE: # COURSE ID, TITLE, DESCRIPTION, NUMBER OF CREDITS
INSTRUCTOR: # INSTRUCTOR ID, NAME, HOME ADDRESS STREET,
HOME ADDRESS CITY, HOME ADDRESS STATE, HOME ADDRESS ZIP CODE, HOME PHONE, OFFICE PHONE
COURSE SECTION: # SECTION ID, YEAR, SEMESTER, COURSE ID,
BUILDING, ROOM, MEETING DAY, MEETING TIME, INSTRUCTOR ID
STUDENT: # STUDENT ID, NAME, HOME ADDRESS, CITY, STATE,
ZIP CODE, PHONE
STUDENT SECTION: # STUDENT ID, # SECTION ID, GRADE
COURSE PREREQUISITE: COURSE ID, PREREQUISITE COURSE ID
COURSE INSTRUCTOR QUALIFIED: INSTRUCTOR ID, COURSE ID
A few notes on this particular solution are in order:
• There was no simple natural key for the Course Section relation, so a surrogate key was added.
• The Course Prerequisite relation can be quite confusing. This is the intersection relation for a many-to-many recursive relationship. A course can have many prerequisites, which may be found by joining COURSE ID in the COURSE relation with COURSE ID in the COURSE PREREQUISITE relation. At the same time, any course may be a prerequisite for many other courses. These may be found by joining COURSE ID in the COURSE relation with PREREQUISITE COURSE ID in the COURSE PREREQUISITE relation. This means that there are two relationships between COURSE and COURSE PREREQUISITE: one where COURSE ID is the foreign key and another where PREREQUISITE COURSE ID is the foreign key. Comparing the upcoming illustrations for the COURSE and COURSE_PREREQUISITE tables should help make this point clear.
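The two joins described for the Course Prerequisite relation can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions rather than the book's Access implementation: the table and column names are snake_case renderings of the relation listing, the composite primary key on course_prerequisite is an assumption, and the three sample courses are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    title     TEXT NOT NULL
);
-- Intersection table: both columns are foreign keys to course.
-- The composite primary key is an assumption made for this sketch.
CREATE TABLE course_prerequisite (
    course_id              TEXT NOT NULL REFERENCES course(course_id),
    prerequisite_course_id TEXT NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (course_id, prerequisite_course_id)
);
-- Invented sample data.
INSERT INTO course VALUES ('C1', 'Intro'), ('C2', 'Intermediate'), ('C3', 'Advanced');
INSERT INTO course_prerequisite VALUES ('C2', 'C1'), ('C3', 'C1'), ('C3', 'C2');
""")


def prerequisites_of(course_id):
    """First relationship: join on COURSE ID to list a course's prerequisites."""
    rows = conn.execute(
        """SELECT p.prerequisite_course_id
           FROM course AS c
           JOIN course_prerequisite AS p ON p.course_id = c.course_id
           WHERE c.course_id = ?
           ORDER BY p.prerequisite_course_id""",
        (course_id,),
    ).fetchall()
    return [row[0] for row in rows]


def courses_requiring(course_id):
    """Second relationship: join on PREREQUISITE COURSE ID to list dependents."""
    rows = conn.execute(
        """SELECT p.course_id
           FROM course AS c
           JOIN course_prerequisite AS p ON p.prerequisite_course_id = c.course_id
           WHERE c.course_id = ?
           ORDER BY p.course_id""",
        (course_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this sample data, prerequisites_of('C3') returns C1 and C2, while courses_requiring('C1') returns C2 and C3. The same intersection table answers both questions, depending on which foreign key drives the join.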
To assist you in visualizing how all this works, the following illustrations show each of the tables as implemented in a Microsoft Access database, each loaded with the data from the original user view (report) examples. Figure 6-5 shows the ERD for the solution, using the Microsoft Relationships panel as the presentation medium.
COURSE table:
INSTRUCTOR table:
COURSE_SECTION table:
STUDENT table:
STUDENT_SECTION table:
COURSE_PREREQUISITE table:
COURSE_INSTRUCTOR_QUALIFIED table:
Computer Books Company
The Computer Books Company (CBC) buys books from publishers and sells them
to individuals via mail and telephone orders. They are looking to expand their services by offering online ordering via the Internet, and in doing so, have a compelling need to build a database to hold their business information.
Figure 6-5 ERD (Relationships panel)
The User Views
Throughout these user views, “sale” and “price” are references to the retail sale of a
book to a CBC customer, whereas “purchase” and “cost” are references to the
purchase of books from a publisher (CBC supplier). Each user view is described briefly
with a list of the attributes in the view following each description. Per our
convention, multivalued attributes and repeating groups are enclosed in parentheses.
The Book Catalog lists all the books that CBC has for sale. Each book is uniquely
identified by the International Standard Book Number (ISBN). Although an ISBN
uniquely identifies a book, it is essentially a surrogate key, so there is no way to tell
what edition a particular book is simply by looking at the ISBN. When new editions
come out, CBC typically has leftover stock of prior editions and offers them at a
reduced price. The previous edition code in the Book Catalog is intended to help the
buyer find the prior edition, if there is one. Books are organized by subject, with each
book having only one subject. Any book may have multiple authors. (Although the
catalog shows only author names, keep in mind that people’s names are seldom
unique, and nothing would stop two people with the same name from both writing
books.) Here is the information in the Book Catalog:
BOOK CATALOG: SUBJECT CODE, SUBJECT DESCRIPTION, BOOK TITLE,
BOOK ISBN, BOOK PRICE, PREVIOUS EDITION ISBN, PREVIOUS EDITION PRICE, (BOOK AUTHORS),
PUBLISHER NAME
The Book Inventory Report helps the warehouse manager control the inventory in
the warehouse. The Recommended Quantity is the reorder point, meaning when
on-hand inventory falls below the recommended quantity, it is time to order more books
of that title.
INVENTORY REPORT: BOOK ISBN, BOOK EDITION CODE, COST,
SELLING PRICE, QUANTITY ON HAND, QUANTITY ON ORDER, RECOMMENDED QUANTITY
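The reorder rule can be expressed as a one-line comparison. The sketch below is illustrative only: the InventoryLine class, the needs_reorder function, and the sample quantities are assumptions, and the text does not say whether books already on order should suppress a reorder, so that question is deliberately left out of the check.

```python
from dataclasses import dataclass


@dataclass
class InventoryLine:
    """One row of the Inventory Report (hypothetical field names)."""
    isbn: str
    quantity_on_hand: int
    quantity_on_order: int
    recommended_quantity: int


def needs_reorder(line: InventoryLine) -> bool:
    # Reorder point as described in the text: on-hand inventory has
    # fallen below the recommended quantity.
    return line.quantity_on_hand < line.recommended_quantity


# Invented sample data; the ISBN values are placeholders.
inventory = [
    InventoryLine("ISBN-A", quantity_on_hand=3, quantity_on_order=0, recommended_quantity=10),
    InventoryLine("ISBN-B", quantity_on_hand=10, quantity_on_order=0, recommended_quantity=10),
    InventoryLine("ISBN-C", quantity_on_hand=25, quantity_on_order=5, recommended_quantity=10),
]

to_reorder = [line.isbn for line in inventory if needs_reorder(line)]
```

Only the first title qualifies here, because "falls below" is read as a strict comparison: a title sitting exactly at its recommended quantity is not yet flagged.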
The Customer Book Orders view shows orders placed by CBC customers for
purchases of books:
CUSTOMER BOOK ORDERS: CUSTOMER ID, CUSTOMER NAME,
STREET ADDRESS, CITY, STATE, ZIP CODE, (ISBN, BOOK EDITION CODE, QUANTITY, PRICE), ORDER DATE, TOTAL PRICE
CBC bills customers as books are shipped. An invoice is created for each shipment. (An order can have zero, one, or more invoices, but each invoice belongs to only one order.) The Book Sales Invoice looks like this:
BOOK SALES INVOICE: SALES INVOICE NUMBER, CUSTOMER ID,
CUSTOMER NAME, CUSTOMER STREET ADDRESS, CUSTOMER CITY, CUSTOMER STATE,
CUSTOMER ZIP CODE, (BOOK ISBN, TITLE, EDITION CODE, (BOOK AUTHORS), QUANTITY, PRICE, PUBLISHER NAME),
SHIPPING CHARGES, SALES TAX
The Master Billing Report helps the Collections and Customer Service Departments manage customer accounts. A system for recording customer payments against invoices is out of scope for the current project, but the CBC project sponsors do want to keep a running balance showing what each customer owes CBC. As invoices are generated, a database trigger will be used to add invoice totals to the Balance Due. As payments are received, the CBC staff will manually adjust the Balance Due. The Master Billing Report attributes are as follows:
MASTER BILLING REPORT: CUSTOMER ID, NAME, STREET ADDRESS,
CITY, STATE, ZIP CODE, PHONE, BALANCE DUE
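A trigger of the kind described for the Balance Due might look like the following sketch, written against SQLite through Python's sqlite3 module. Several things here are assumptions made to keep the example short: the simplified sales_invoice table carries the customer ID and a precomputed invoice_total_cents column directly, the trigger name is invented, and amounts are stored as integer cents. How the real invoice total would be derived (line items, shipping charges, sales tax) is not specified in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id       INTEGER PRIMARY KEY,
    name              TEXT NOT NULL,
    balance_due_cents INTEGER NOT NULL DEFAULT 0
);
-- Simplified, hypothetical invoice table for this sketch only.
CREATE TABLE sales_invoice (
    sales_invoice_number INTEGER PRIMARY KEY,
    customer_id          INTEGER NOT NULL REFERENCES customer(customer_id),
    invoice_total_cents  INTEGER NOT NULL
);
-- Each new invoice adds its total to the customer's running balance.
CREATE TRIGGER add_invoice_to_balance
AFTER INSERT ON sales_invoice
BEGIN
    UPDATE customer
    SET balance_due_cents = balance_due_cents + NEW.invoice_total_cents
    WHERE customer_id = NEW.customer_id;
END;
INSERT INTO customer (customer_id, name) VALUES (1, 'Sample Customer');
INSERT INTO sales_invoice VALUES (1001, 1, 5000);
INSERT INTO sales_invoice VALUES (1002, 1, 2500);
""")


def balance_due_cents(customer_id):
    """Read the running balance maintained by the trigger."""
    row = conn.execute(
        "SELECT balance_due_cents FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


balance_after_invoices = balance_due_cents(1)

# A payment is recorded as a manual adjustment, as the text describes.
conn.execute(
    "UPDATE customer SET balance_due_cents = balance_due_cents - ? WHERE customer_id = ?",
    (2000, 1),
)
balance_after_payment = balance_due_cents(1)
```

Because payments are out of scope, nothing ties the manual adjustment back to a specific invoice; the balance is simply a stored running total kept in step by the trigger.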
Each time CBC buys books from a publisher, the publisher sends an invoice to CBC. To assist in managing inventory cost, CBC wishes to store the Purchase Invoice information and report it using this view:
PURCHASE INVOICE: PUBLISHER ID, PUBLISHER NAME,
STREET ADDRESS, CITY, STATE, ZIP CODE, PURCHASE INVOICE NUMBER, INVOICE DATE, (BOOK ISBN, EDITION CODE, TITLE,
QUANTITY, COST EACH, EXTENDED COST), TOTAL COST
Note that Extended Cost is calculated as Cost Each times Quantity.
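As a small worked example of that calculation, the sketch below computes the Extended Cost of each purchase invoice line. The line values are invented, the extended_cost function name is hypothetical, and treating Total Cost as the sum of the extended costs is an assumption, since the view does not define it.

```python
from decimal import Decimal

# Invented purchase invoice lines; the ISBN values are placeholders.
lines = [
    {"isbn": "ISBN-A", "quantity": 10, "cost_each": Decimal("12.50")},
    {"isbn": "ISBN-B", "quantity": 4, "cost_each": Decimal("30.00")},
]


def extended_cost(line):
    """Extended Cost = Cost Each times Quantity."""
    return line["quantity"] * line["cost_each"]


extended = [extended_cost(line) for line in lines]

# Assumption: Total Cost is the sum of the extended costs.
total_cost = sum(extended)
```

Because Extended Cost can always be recomputed from Quantity and Cost Each, it is a derived value and does not need to be stored as its own attribute.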
Author’s Solution
As before, there is some room for alternative solutions, provided all relations are in third normal form. The normalized relations in this solution follow, with primary keys noted with a hash mark (#):
BOOK: # ISBN, BOOK TITLE, SUBJECT CODE, PUBLISHER ID,
EDITION CODE, COST, SELLING PRICE, QUANTITY ON HAND, QUANTITY ON ORDER, RECOMMENDED QUANTITY,
PREVIOUS EDITION ISBN
CUSTOMER ORDER: # CUSTOMER ORDER NUMBER, CUSTOMER ID,
ORDER DATE, CANCEL DATE
CUSTOMER ORDER BOOK: # CUSTOMER ORDER NUMBER, # ISBN,
QUANTITY, BOOK PRICE
SUBJECT: # SUBJECT CODE, DESCRIPTION
AUTHOR: # AUTHOR ID, AUTHOR NAME
BOOK-AUTHOR: # AUTHOR ID, # ISBN
CUSTOMER: # CUSTOMER ID, NAME, STREET ADDRESS, CITY, STATE,
ZIP CODE, PHONE, BALANCE DUE
PUBLISHER: # PUBLISHER ID, NAME, STREET ADDRESS, CITY,
STATE, ZIP CODE, AMOUNT PAYABLE
RECEIVABLE (SHIPPED) ORDER: # SALES INVOICE NUMBER,
CUSTOMER ORDER NUMBER, SALES TAX, SHIPPING CHARGES
RECEIVABLE ORDER BOOK: # SALES INVOICE NUMBER, # ISBN,
QUANTITY
PAYABLE (PURCHASES): # PURCHASE INVOICE NUMBER,
PUBLISHER ID, INVOICE DATE, INVOICE AMOUNT
PAYABLE BOOK: # PURCHASE INVOICE NUMBER, # ISBN, QUANTITY,
COST EACH
Figure 6-6 shows the complete design, implemented in Microsoft Access.
Figure 6-6 CBC ERD (Microsoft Access Relationships panel)
Choose the correct responses to each of the multiple-choice questions. Note that there may be more than one correct response to each question.
1. Normalization:
a. Was developed by Dr. Codd
b. Was first introduced with five normal forms
c. First appeared in 1972
d. Provides a set of rules for each normal form
e. Provides a procedure for converting relations to each normal form
2. The purpose of normalization is
a. To eliminate redundant data
b. To remove certain anomalies from the relations
c. To provide a reason to denormalize the database
d. To optimize data-retrieval performance
e. To optimize data for inserts, updates, and deletes
3. When implemented, a third normal form relation becomes
4. The insert anomaly refers to a situation where:
a. Data must be inserted before it can be deleted
b. Too many inserts cause the table to fill up
c. Data must be deleted before it can be inserted
d. A required insert cannot be done due to an artificial dependency
e. A required insert cannot be done due to duplicate data
5. The delete anomaly refers to a situation where:
a. Data must be deleted before it can be inserted
b. Data must be inserted before it can be deleted
c. Data deletion causes unintentional loss of another entity’s data
d. A required delete cannot be done due to referential constraints
e. A required delete cannot be done due to lack of privileges
6. The update anomaly refers to a situation where:
a. A simple update requires updates to multiple rows of data
b. Data cannot be updated because it does not exist in the database
c. Data cannot be updated due to lack of privileges
d. Data cannot be updated due to an existing unique constraint
e. Data cannot be updated due to an existing referential constraint
7. The roles of unique identifiers in normalization are
a. They are unnecessary
b. They are required once you reach third normal form
c. All normalized forms require designation of a primary key
d. You cannot normalize relations without first choosing a primary key
e. You cannot choose a primary key until relations are normalized
8. Writing sample user views with representative data in them is
a. The only way to successfully normalize the user views
b. A tedious and time-consuming process
c. An effective way to understand the data being normalized
d. Only as good as the examples shown in the sample data
e. A widely used normalization technique
9. Criteria useful in selecting a primary key from among several candidate
keys are
a. Choose the simplest candidate
b. Choose the shortest candidate
c. Choose the candidate most likely to have its value change
d. Choose concatenated keys over single attribute keys
e. Invent a surrogate key if that is the best possible key
10. First normal form resolves anomalies caused by:
14. A foreign key in a normalized relation may be
a. The entire primary key of the relation
b. Part of the primary key of the relation
d. Determinants that are not primary or candidate keys
e. Constraints that are not the result of the definitions of domains and keys
16. Fourth normal form deals with anomalies caused by:
a. Multivalued attributes
b. Transitive dependencies
c. Join dependencies
d. Determinants that are not primary or candidate keys
e. Constraints that are not the result of the definitions of domains and keys
17. Fifth normal form deals with anomalies caused by:
a. Multivalued attributes
b. Transitive dependencies
c. Join dependencies
d. Determinants that are not primary or candidate keys
e. Constraints that are not the result of the definitions of domains and keys
18. Domain key normal form deals with anomalies caused by:
a. Multivalued attributes
b. Transitive dependencies
c. Join dependencies
d. Determinants that are not primary or candidate keys
e. Constraints that are not the result of the definitions of domains and keys
19. Most business systems require that you normalize only as far as:
a. First normal form
b. Second normal form
c. Third normal form
d. Boyce-Codd normal form
e. Fourth normal form
20. Proper handling of multivalued attributes when converting relations to first
normal form usually prevents subsequent problems with:
a. First normal form
b. Second normal form
c. Third normal form
d. Boyce-Codd normal form
e. Fourth normal form
Data and Process Modeling
As you saw in Chapter 5, data and process modeling are major undertakings that are
part of the logical design stage of an application system development project. You
have already seen the rudiments of data modeling when we used entity relationship
diagrams (ERDs) in prior chapters. In this chapter, we will look at ERDs and data
modeling in more detail. Process modeling, on the other hand, is less important to a
database designer because application processes are designed by application
designers and seldom directly involve the database designer. However, because the
database designer must work closely with the application designer in gathering data
requirements and in supplying a database design that will support the processes
being designed, the database designer should be at least familiar with the basic
concepts. It is for this reason that the second part of this chapter includes a high-level
survey of process design concepts and diagramming techniques.
Entity Relationship Modeling
Entity relationship modeling is the process of visually representing entities, attributes, and relationships, producing a diagram called an entity relationship diagram (ERD). The process is iterative in nature because entities are discovered throughout the design process. The chief advantage of ERDs is that they can be understood by nontechnical people while still providing great value to technical people. Done correctly, ERDs are platform independent and can even be used for nonrelational databases if desired.
ERD Formats
Peter Chen developed the original ERD format in 1976. Since then, vendors, computer scientists, and academics have developed many variations, all of them conceptually the same. It is important to understand the most commonly used variations because you are likely to encounter them in active use in IT organizations. Here are the elements common to all ERD formats:
• Entities are represented as rectangles or boxes.
• Relationships are represented as lines.
• Line ends indicate the maximum cardinality of the relationship (that is, one or many).
• Symbols near the line ends indicate the minimum cardinality of the relationship (that is, whether participation in the relationship is mandatory or optional).
Here are the particulars of the Chen format:
• Relationship lines contain a diamond in which is written a word or short phrase that describes the relationship. For example, the relationship between Invoice and Product may be read as “An invoice contains many products.”
• For many-to-many relationships that require an intersection table in an RDBMS, such as the one between Invoice and Product, a rectangle is often drawn around the diamond.
• Maximum cardinality of each relationship is shown using the symbol “1” for “one” or “M” for “many.”
• Minimum cardinality is not shown.
• Attributes, when shown, appear in ellipses, connected to the entity or relationship to which they belong with a line.
In practice, Chen ERDs proved to be cumbersome for complicated data models.
The diamonds take a lot of space for the added value they provide. Also, any ERD
that includes many attributes becomes very difficult to read. Notwithstanding, we
owe Chen a lot for his pioneering work, which laid the foundation for the techniques
that followed.
The Relational Format
Over time, an ERD format known generically as the relational format evolved. It is
in use (or available as an option) by several of the better-known data modeling
software tools, including PowerDesigner from Sybase and ER/Studio from
Embarcadero Technologies, and in popular general drawing tools such as Visio from
Microsoft. Figure 7-2 shows the ERD from Figure 7-1, converted to the relational
format. In this example, the ERD is represented at a physical level, meaning that
physical table names are shown instead of logical entity names, and physical column
names are shown instead of logical attribute names. Also, intersection tables are
shown to resolve many-to-many relationships. As the logical data model is
transformed into a physical database design, it is essential to have a physical ERD that the
project team can use in developing the application system. The beginnings of the
physical model are shown here to help make that point.
Figure 7-1 Acme Industries logical ERD in Chen’s format
Here are the particulars of the relational ERD format:
• Relationship cardinality is shown with an arrowhead on the line end to signify “one” and nothing on the line end to signify “many.” This will seem odd at first, but it aligns nicely with object diagrams, so this format is favored by object-oriented designers and developers.
• Attributes are shown inside the rectangle that represents each entity.
• Unique identifier attributes are shown above a horizontal line within the rectangle and are usually also shown in bold with “PK” (signifying “primary key”) in the margin to the left of the attribute name.
• Attributes that are foreign keys are shown with “FK” and a number in the margin to the left of the attribute name.
The IDEF1X Format
The Computer Systems Laboratory of the National Institute of Standards and Technology published the IDEF1X standard for data modeling in FIPS Publication 184, which was released in December 1993. The standard covers both a method for data modeling and the format for the ERDs produced during the modeling effort. It is widely used and understood across the information technology industry and is a U.S. Federal Government standard. Thanks to its underlying standard, it has few
Figure 7-2 Acme Industries logical ERD, relational format