Foreign Keys
Information about different entities is stored in different tables, so you need a way to navigate between tables. The relational model provides a mechanism called a foreign key to associate tables. A foreign key has these characteristics:
◆ It’s a column (or group of columns) in a table whose values relate to, or reference, values in some other table.
◆ It ensures that rows in one table have corresponding rows in another table.
◆ The table that contains the foreign key is the referencing or child table. The other table is the referenced or parent table.
◆ A foreign key establishes a direct relationship to the parent table’s primary key (or any candidate key), so foreign-key values are restricted to existing parent-key values. This constraint is called referential integrity. A particular row in a table appointments must have an associated row in a table patients, for example, or there would be appointments for patients who don’t exist or can’t be identified. An orphan row is a row in a child table for which no associated parent-table row exists. In a properly designed database, you can’t insert new orphan rows or make orphans out of existing child-table rows.
◆ The values in the foreign key have the same domain as the parent key. Recall from “Tables, Columns, and Rows” earlier in this chapter that a domain defines the set of valid values for a column.
◆ Unlike primary-key values, foreign-key values can be null (empty); see the Tips in this section.
◆ A foreign key can have a different column name than its parent key.
◆ Foreign-key values generally aren’t unique in their own table.
◆ I’ve made a simplification in the first point: In reality, a foreign key can reference the primary key of its own table (rather than only some other table). A table employees with the primary key emp_id can have a foreign key boss_id, for example, that references the column emp_id. This type of table is called self-referencing.
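The self-referencing case can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the emp_name column and the three sample rows are made up, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# boss_id references emp_id in the same table (a self-referencing table).
conn.execute(
    "CREATE TABLE employees ("
    " emp_id TEXT PRIMARY KEY,"
    " emp_name TEXT,"  # hypothetical column, added for readability
    " boss_id TEXT REFERENCES employees (emp_id))"
)

# Hypothetical rows: the top boss has no boss, so boss_id is null.
conn.execute("INSERT INTO employees VALUES ('E01', 'Top boss', NULL)")
conn.execute("INSERT INTO employees VALUES ('E02', 'Report', 'E01')")

# A boss_id that matches no emp_id would create an orphan row.
try:
    conn.execute("INSERT INTO employees VALUES ('E03', 'Orphan', 'E99')")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Joining the table to itself pairs each employee with his or her boss.
pairs = conn.execute(
    "SELECT e.emp_name, b.emp_name"
    " FROM employees e JOIN employees b ON e.boss_id = b.emp_id"
).fetchall()
```

Here orphan_accepted ends up False, and pairs holds the single employee-boss pair.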
Figure 2.9 shows a primary- and foreign-key relationship between two tables.
After a foreign key is defined, your DBMS will enforce referential integrity. You can’t insert the following row into the child table titles, because the pub_id value P05 doesn’t exist in the parent table publishers:
T07 I Blame My Mother P05
You can insert this row only if the foreign key accepts nulls:
T07 I Blame My Mother NULL
This row is legal:
T07 I Blame My Mother P03
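The three inserts above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a definitive implementation: the column types are assumptions, the helper try_insert is hypothetical, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE publishers (pub_id TEXT PRIMARY KEY, pub_name TEXT)")
conn.execute(
    "CREATE TABLE titles ("
    " title_id TEXT PRIMARY KEY,"
    " title_name TEXT,"
    " pub_id TEXT REFERENCES publishers (pub_id))"  # foreign key that accepts nulls
)
conn.executemany(
    "INSERT INTO publishers VALUES (?, ?)",
    [("P01", "Abatis Publishers"), ("P02", "Core Dump Books"),
     ("P03", "Schadenfreude Press"), ("P04", "Tenterhooks Press")],
)

def try_insert(pub_id):
    """Try to insert title T07 with the given pub_id; return True if accepted."""
    try:
        conn.execute(
            "INSERT INTO titles VALUES ('T07', 'I Blame My Mother', ?)", (pub_id,)
        )
    except sqlite3.IntegrityError:
        return False
    conn.execute("DELETE FROM titles WHERE title_id = 'T07'")  # reset for the next try
    return True

# None is stored as SQL NULL.
results = {pub_id: try_insert(pub_id) for pub_id in ("P05", None, "P03")}
```

The DBMS rejects P05 and accepts both NULL and P03, so results maps P05 to False and the other two values to True.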
✔ Tips
■ See also “Specifying a Foreign Key with FOREIGN KEY” in Chapter 11.
■ SQL lets you specify the referential-integrity action that the DBMS takes when you attempt to update or delete a parent-table key value to which foreign-key values point; see the Tips in “Specifying a Foreign Key with FOREIGN KEY” in Chapter 11.
■ Allowing nulls in a foreign-key column complicates enforcement of referential integrity. In practice, nulls in a foreign key often remain null temporarily, pending a real-life decision or discovery; see “Nulls” in Chapter 3.
publishers
pub_id  pub_name
P01     Abatis Publishers
P02     Core Dump Books
P03     Schadenfreude Press
P04     Tenterhooks Press
Primary key: pub_id

titles
title_id  title_name          pub_id
T02       200 Years of Ger…   P03
T03       Ask Your System…    P02
T04       But I Did It Unco…  P04
T05       Exchange of Plat…   P04
T06       How About Never?    P01
Primary key: title_id   Foreign key: pub_id

Figure 2.9 The column pub_id is a foreign key of the table titles that references the column pub_id of publishers.
A relationship is an association established between common columns in two tables. A relationship can be:
◆ One-to-one
◆ One-to-many
◆ Many-to-many
One-to-one
In a one-to-one relationship, each row in table A can have at most one matching row in table B, and each row in table B can have at most one matching row in table A. Even though it’s practicable to store all the information from both tables in only one table, one-to-one relationships usually are used to segregate confidential information for security reasons, speed queries by splitting single monolithic tables, and avoid inserting nulls into tables (see “Nulls” in Chapter 3).
A one-to-one relationship is established
when the primary key of one table also is a
foreign key referencing the primary key of
another table (Figures 2.10 and 2.11).
titles
title_id  title_name
T01       1977!
T02       200 Years of Ger…
T03       Ask Your System…
T04       But I Did It Unco…

royalties
title_id  advance

Figure 2.10 A one-to-one relationship. Each row in titles can have at most one matching row in royalties, and each row in royalties can have at most one matching row in titles. Here, the primary key of royalties also is a foreign key referencing the primary key of titles.

[Diagram: titles (title_id, title_name) connected to royalties (title_id, advance)]

Figure 2.11 This diagram shows an alternative way to depict the one-to-one relationship in Figure 2.10. The connecting line indicates associated columns. The key symbol indicates a primary key.
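A one-to-one relationship of this kind can be sketched with Python's built-in sqlite3 module. The advance amounts and the helper accepted are hypothetical, and the column types are assumptions; the point is only that making title_id both the primary key of royalties and a foreign key to titles limits each title to one royalties row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE titles (title_id TEXT PRIMARY KEY, title_name TEXT)")
# title_id is the primary key of royalties and also a foreign key to titles.
conn.execute(
    "CREATE TABLE royalties ("
    " title_id TEXT PRIMARY KEY REFERENCES titles (title_id),"
    " advance REAL)"
)
conn.execute("INSERT INTO titles VALUES ('T01', '1977!')")
conn.execute("INSERT INTO royalties VALUES ('T01', 1000.0)")  # hypothetical amount

def accepted(sql):
    """Run one INSERT statement; return True if the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A second royalties row for T01 violates the primary key.
second_row_for_t01 = accepted("INSERT INTO royalties VALUES ('T01', 2000.0)")
# A royalties row for a nonexistent title violates the foreign key.
row_without_title = accepted("INSERT INTO royalties VALUES ('T99', 2000.0)")
```

Both attempts are rejected, so each row in royalties matches exactly one row in titles, and each row in titles matches at most one row in royalties.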
One-to-many
In a one-to-many relationship, each row in table A can have many (zero or more) matching rows in table B, but each row in table B has only one matching row in table A. A publisher can publish many books, but each book is published by only one publisher, for example.
One-to-many relationships are established when the primary key of the one table appears as a foreign key in the many table (Figures 2.12 and 2.13).
publishers
pub_id  pub_name
P01     Abatis Publishers
P02     Core Dump Books
P03     Schadenfreude Press
P04     Tenterhooks Press

titles
title_id  title_name          pub_id
T02       200 Years of Ger…   P03
T03       Ask Your System…    P02
T04       But I Did It Unco…  P04
T05       Exchange of Plati…  P04

Figure 2.12 A one-to-many relationship. Each row in publishers can have many matching rows in titles, and each row in titles has only one matching row in publishers. Here, the primary key of publishers (the one table) appears as a foreign key in titles (the many table).

[Diagram: publishers (pub_id, pub_name) connected to titles (title_id, title_name, pub_id)]

Figure 2.13 This diagram shows an alternative way to depict the one-to-many relationship in Figure 2.12. The connecting line’s unadorned end indicates the one table, and the arrow indicates the many table.
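The "many (zero or more)" wording can be checked with a short query, sketched here with Python's built-in sqlite3 module. The pub_id and title_id values are the ones listed for Figure 2.12; the column types and the query itself are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE publishers (pub_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE titles ("
    " title_id TEXT PRIMARY KEY,"
    " pub_id TEXT REFERENCES publishers (pub_id))"
)
conn.executemany(
    "INSERT INTO publishers VALUES (?)", [("P01",), ("P02",), ("P03",), ("P04",)]
)
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("T02", "P03"), ("T03", "P02"), ("T04", "P04"), ("T05", "P04")],
)

# The outer join keeps publishers that have no titles; COUNT skips the nulls.
per_publisher = conn.execute(
    "SELECT p.pub_id, COUNT(t.title_id)"
    " FROM publishers p LEFT JOIN titles t ON t.pub_id = p.pub_id"
    " GROUP BY p.pub_id ORDER BY p.pub_id"
).fetchall()
```

With these rows, P01 has zero matching titles, P02 and P03 have one each, and P04 has two, while every row in titles points at exactly one publisher.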
Many-to-many
In a many-to-many relationship, each row in table A can have many (zero or more) matching rows in table B, and each row in table B can have many matching rows in table A. Each author can write many books, and each book can have many authors, for example.
A many-to-many relationship is established only by creating a third table called a junction table, whose composite primary key is a combination of both tables’ primary keys; each column in the composite key separately is a foreign key. This technique always produces a unique value for each row in the junction table and splits the many-to-many relationship into two separate one-to-many relationships (Figures 2.14 and 2.15).
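A junction table of this kind can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions: the tables are cut down to their key columns, the column types are guesses, and the title and author pairs are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE titles (title_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE authors (au_id TEXT PRIMARY KEY)")
# Junction table: composite primary key; each key column is also a foreign key.
conn.execute(
    "CREATE TABLE title_authors ("
    " title_id TEXT NOT NULL REFERENCES titles (title_id),"
    " au_id TEXT NOT NULL REFERENCES authors (au_id),"
    " PRIMARY KEY (title_id, au_id))"
)
conn.executemany("INSERT INTO titles VALUES (?)", [("T04",), ("T11",)])
conn.executemany("INSERT INTO authors VALUES (?)", [("A03",), ("A04",), ("A06",)])
conn.executemany(
    "INSERT INTO title_authors VALUES (?, ?)",
    [("T04", "A03"), ("T04", "A04"),
     ("T11", "A03"), ("T11", "A04"), ("T11", "A06")],
)

# One title, many authors.
authors_of_t11 = [row[0] for row in conn.execute(
    "SELECT au_id FROM title_authors WHERE title_id = 'T11' ORDER BY au_id")]
# One author, many titles.
titles_by_a03 = [row[0] for row in conn.execute(
    "SELECT title_id FROM title_authors WHERE au_id = 'A03' ORDER BY title_id")]

# The composite primary key rejects a repeated (title_id, au_id) pair.
try:
    conn.execute("INSERT INTO title_authors VALUES ('T04', 'A03')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```

Each side of the relationship is now an ordinary one-to-many lookup against title_authors, and the repeated pair is refused.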
✔ Tips
■ Joins (for performing operations on multiple tables) are covered in Chapter 7.
■ You can establish a many-to-many relationship without creating a third table if you add repeating groups to the tables, but that method violates first normal form; see the next section.
■ A one-to-many relationship also is called a parent–child or master–detail relationship.
titles
title_id  title_name
T01       1977!
T02       200 Years of Ger…
T03       Ask Your System…
T04       But I Did It Unco…
T05       Exchange of Plati…

title_authors
title_id  au_id

authors
au_id  au_fname  au_lname

Figure 2.14 A many-to-many relationship. The junction table title_authors splits the many-to-many relationship between titles and authors into two one-to-many relationships. Each row in titles can have many matching rows in title_authors, as can each row in authors. Here, title_id in title_authors is a foreign key that references the primary key of titles, and au_id in title_authors is a foreign key that references the primary key of authors.

[Diagram: titles, title_authors, authors]
It’s possible to consolidate all information about books (or any entity type) into a single monolithic table, but that table would be loaded with duplicate data; each title (row) would contain redundant author, publisher, and royalty details. Redundancy is the enemy of database users and administrators: It causes databases to grow wildly large, it slows queries, and it’s a maintenance nightmare. (When someone moves, you want to change her address in one place, not thousands of places.)
Redundancies lead to a variety of update anomalies, that is, difficulties with operations that insert, update, and delete rows. Normalization is the process (a series of steps) of modifying tables to reduce redundancy and inconsistency. After each step, the database is in a particular normal form.
The relational model defines three normal
forms, named after famous ordinal numbers:
◆ First normal form (1NF)
◆ Second normal form (2NF)
◆ Third normal form (3NF)
Each normal form is stronger than its predecessors; a database in 3NF also is in 2NF and 1NF. Higher normalization levels tend to increase the number of tables relative to lower levels. Lossless decomposition ensures that table splitting doesn’t cause information loss, and dependency-preserving decomposition ensures that relationships aren’t lost. The matching primary- and foreign-key columns that appear when tables are split are not considered to be redundant data.
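Lossless decomposition can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions: the wide table is hypothetical, built from publisher and title IDs that appear in this chapter's figures, and the rejoin step relies on pub_id being the key of the publishers table.

```python
# One wide (hypothetical) table: the publisher name is repeated on every title row.
wide = [
    ("T02", "P03", "Schadenfreude Press"),
    ("T03", "P02", "Core Dump Books"),
    ("T04", "P04", "Tenterhooks Press"),
    ("T05", "P04", "Tenterhooks Press"),
]

# Split the wide table into two tables that share the column pub_id.
titles = sorted({(title_id, pub_id) for title_id, pub_id, _ in wide})
publishers = sorted({(pub_id, pub_name) for _, pub_id, pub_name in wide})

# Rejoin on pub_id. The dict lookup works because pub_id is the key of publishers.
pub_name_of = dict(publishers)
rejoined = [(title_id, pub_id, pub_name_of[pub_id]) for title_id, pub_id in titles]

# A lossless split gives back exactly the original rows.
is_lossless = sorted(rejoined) == sorted(wide)
```

After the split, each publisher name is stored once, in publishers; the pub_id values that appear in both tables are the matching key columns, not redundant data.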
First normal form
A table in first normal form:
◆ Has columns that contain only atomic values
and
◆ Has no repeating groups
An atomic value, also called a scalar value, is a single value that can’t be subdivided (Figure 2.16). A repeating group is a set of two or more logically related columns (Figure 2.17). To fix these problems, store the data in two related tables (Figure 2.18).
A database that violates 1NF causes problems:
◆ Multiple values in a row–column intersection mean that the combination of table name, column name, and key value is insufficient to address every value in the database.
◆ It’s difficult to retrieve, insert, update, or delete a single value (among many) because you must rely on the order of the values.
◆ Queries are complex (a performance killer).
◆ The problems that further normalization solves become unsolvable.
title_id  title_name                        authors
--------  --------------------------------  -------------
T01       1977!                             A01
T04       But I Did It Unconsciously        A03, A04
T11       Perhaps It's a Glandular Problem  A03, A04, A06
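The fix for the table above can be sketched in plain Python: split each multivalued authors entry into one row per author in a separate child table. The variable names are hypothetical, and the comma-separated format is assumed from the listing.

```python
# Rows as listed above: the authors column holds several values per row.
non_1nf = [
    ("T01", "1977!", "A01"),
    ("T04", "But I Did It Unconsciously", "A03, A04"),
    ("T11", "Perhaps It's a Glandular Problem", "A03, A04, A06"),
]

# Parent table: one row per title, atomic values only.
titles = [(title_id, title_name) for title_id, title_name, _ in non_1nf]

# Child table: one row per (title, author) pair.
title_authors = [
    (title_id, au_id.strip())
    for title_id, _, authors in non_1nf
    for au_id in authors.split(",")
]
```

Every value in title_authors can now be addressed by table name, column name, and key value, with no reliance on the order of values inside a string.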
Second normal form
Before I give the constraints for second normal form, I’ll mention that a 1NF table automatically is in 2NF if:
◆ Its primary key is a single column (that is, the key isn’t composite)
or
◆ All the columns in the table are part of the primary key (simple or composite)
A table in second normal form:
◆ Is in first normal form
and
◆ Has no partial functional dependencies
A table contains a partial functional dependency if some (but not all) of a composite key’s values determine a nonkey column’s value. A 2NF table is fully functionally dependent, meaning that a nonkey column’s value might need to be updated if any column values in the composite key change.
The composite key in the table in Figure 2.19 is title_id and au_id. The nonkey columns are au_order (the order in which authors are listed on the cover of a book with multiple authors) and au_phone (the author’s phone number).
For each nonkey column, ask, “Can I determine a nonkey column value if I know only part of the primary-key value?” A no answer means the nonkey column is fully functionally dependent (good); a yes answer means that it’s partially functionally dependent (bad).
Parent table
title_id  title_name
T01       1977!
T04       But I Did It Unco…
T11       Perhaps It's a Gla…

Child table
title_id  au_id

Figure 2.18 The correct design solution is to move the author information to a new child table that contains one row for each author of a title. The primary key in the parent table is title_id, and the composite key in the child table is title_id and au_id.

title_authors
title_id  au_id  au_order  au_phone

Figure 2.19 au_phone depends on au_id but not title_id, so this table contains a partial functional dependency and isn’t in 2NF.
Atomicity
Atomic values are perceived to be indivisible from the point of view of database users. A date, a telephone number, and a character string, for example, aren’t really intrinsically indivisible because you can decompose the date into a year, month, and day; the phone number into a country code, area code, and subscriber number; and the string into its individual characters. What’s important as far as you’re concerned is that the DBMS provide operators and functions that let you extract and …
For the column au_order, the questions are:
◆ Can I determine au_order if I know only title_id? No, because there might be more than one author for the same title.
◆ Can I determine au_order if I know only au_id? No, because I need to know the particular title too.
Good: au_order is fully functionally dependent and can remain in the table. This dependency is written
{title_id, au_id} ➝ {au_order}
and is read “title_id and au_id determine au_order” or “au_order depends on title_id and au_id.” The determinant is the expression to the left of the arrow.
For the column au_phone, the questions are:
◆ Can I determine au_phone if I know only title_id? No, because there might be more than one author for the same title.
◆ Can I determine au_phone if I know only au_id? Yes! The author’s phone number doesn’t depend upon the title.
Bad: au_phone is partially functionally dependent and must be moved elsewhere (probably to an authors or phone_numbers table) to satisfy 2NF rules.
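The questions above can be asked mechanically of sample data. The sketch below is plain Python; the function determines, the au_order values, and the phone numbers are all hypothetical. Note the limit of the approach: sample rows can show that a dependency does not hold, but a True answer only means that these particular rows don't contradict it.

```python
from collections import defaultdict

COLUMNS = ("title_id", "au_id", "au_order", "au_phone")

# Hypothetical sample rows for title_authors.
rows = [
    ("T04", "A03", 1, "555-0103"),
    ("T04", "A04", 2, "555-0104"),
    ("T11", "A03", 2, "555-0103"),
    ("T11", "A04", 3, "555-0104"),
    ("T11", "A06", 1, "555-0106"),
]

def determines(known, target):
    """Return True if, in the sample rows, equal values in the known columns
    always go with a single value in the target column."""
    seen = defaultdict(set)
    for row in rows:
        record = dict(zip(COLUMNS, row))
        key = tuple(record[column] for column in known)
        seen[key].add(record[target])
    return all(len(values) == 1 for values in seen.values())

answers = {
    "title_id -> au_order": determines(["title_id"], "au_order"),
    "au_id -> au_order": determines(["au_id"], "au_order"),
    "title_id, au_id -> au_order": determines(["title_id", "au_id"], "au_order"),
    "title_id -> au_phone": determines(["title_id"], "au_phone"),
    "au_id -> au_phone": determines(["au_id"], "au_phone"),
}
```

Only the whole key determines au_order, whereas au_id alone determines au_phone, which is the partial functional dependency.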
titles
title_id  price  pub_city  pub_id

Figure 2.20 pub_city depends on pub_id, so this table contains a transitive dependency and isn’t in 3NF.
Third normal form
A table in third normal form:
◆ Is in second normal form
and
◆ Has no transitive dependencies
A table contains a transitive dependency if a nonkey column’s value determines another nonkey column’s value. In 3NF tables, nonkey columns are mutually independent and dependent on only primary-key column(s). 3NF is the next logical step after 2NF.
The primary key in the table in Figure 2.20 is title_id. The nonkey columns are price (the book’s price), pub_city (the city where the book is published), and pub_id (the book’s publisher).
For each nonkey column, ask, “Can I determine a nonkey column value if I know any other nonkey column value?” A no answer means that the column is not transitively dependent (good); a yes answer means that the column whose value you can determine is transitively dependent on the other column (bad).
For the column price, the questions are:
◆ Can I determine pub_id if I know price? No.
◆ Can I determine pub_city if I know price? No.
For the column pub_city, the questions are:
◆ Can I determine price if I know pub_city? No.
◆ Can I determine pub_id if I know pub_city? No, because a city might have many publishers.
For the column pub_id, the questions are:
◆ Can I determine price if I know pub_id? No.
◆ Can I determine pub_city if I know pub_id? Yes! The city where the book is published depends on the publisher.
Bad: pub_city is transitively dependent on pub_id and must be moved elsewhere (probably to a publishers table) to satisfy 3NF rules.
As you can see, it’s not enough to ask, “Can
I determine A if I know B?” to discover a transitive dependency; you also must ask,
“Can I determine B if I know A?”
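One way to carry out the move to a publishers table is sketched below with Python's built-in sqlite3 module. The prices and city names are made up, and the column types are assumptions; the sketch only shows that once pub_city lives in publishers, a publisher's city is stored in a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# pub_city now depends on the key of its own table, publishers.
conn.execute("CREATE TABLE publishers (pub_id TEXT PRIMARY KEY, pub_city TEXT)")
conn.execute(
    "CREATE TABLE titles ("
    " title_id TEXT PRIMARY KEY,"
    " price REAL,"
    " pub_id TEXT REFERENCES publishers (pub_id))"
)
# Hypothetical cities and prices.
conn.executemany(
    "INSERT INTO publishers VALUES (?, ?)", [("P02", "City A"), ("P04", "City B")]
)
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [("T03", 39.95, "P02"), ("T04", 12.99, "P04"), ("T05", 6.95, "P04")],
)

# The publisher moves: one row changes, however many titles it has published.
cursor = conn.execute("UPDATE publishers SET pub_city = 'City C' WHERE pub_id = 'P04'")
rows_changed = cursor.rowcount

# A join still reports the city for every title.
cities = conn.execute(
    "SELECT t.title_id, p.pub_city"
    " FROM titles t JOIN publishers p ON p.pub_id = t.pub_id"
    " ORDER BY t.title_id"
).fetchall()
```

The update touches one row, and both of P04's titles report the new city through the join.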