For some entities, there might be multiple possible primary keys to choose from: employee number,
driver’s license number, national ID number (SSN). In this case, all the potential primary keys are known
as candidate keys. Candidate keys that are not selected as the primary key are then known as alternate
keys. It’s important to document all the candidate keys because later, at the SQL DDL layer, they will
need unique constraints.
At the conceptual diagramming phase, a primary key might be obvious (an employee number, an
automobile VIN, a state or region name), but often there is no clearly recognizable, uniquely
identifying value for each item in reality. That’s OK, as that problem can be solved later during the SQL
DDL layer.
Foreign keys
When two entities (tables) relate to one another, one entity is typically the primary entity and the other
entity is the secondary entity.
The connection between the two entities is made by replicating the primary key from the primary entity
in the secondary entity. The duplicated attributes in the secondary entity are known as a foreign key.
Informally, this type of relationship is sometimes called a parent-child relationship.
Enforcing the foreign key is referred to as referential integrity.
The classic example of a primary key and foreign key relationship is the order and order details
relationship. Each order (primary entity) can have multiple order detail rows (secondary entity). The
order’s primary key is duplicated in the order detail entity, providing the link between the two entities,
as shown in Figure 3-3.
You’ll see several examples of primary keys and foreign keys in the "Data Design Patterns" section later
in this chapter.
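As a rough illustration of the order and order detail relationship, the following sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (OrderHeader, OrderDetail, ItemName) and the sample rows are assumptions made for this example, not the schema shown in Figure 3-3.

```python
import sqlite3

def build_order_db():
    """Create an in-memory database with a primary and a secondary entity."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE OrderHeader (
            OrderID   INTEGER PRIMARY KEY,
            OrderDate TEXT NOT NULL
        );
        CREATE TABLE OrderDetail (
            OrderDetailID INTEGER PRIMARY KEY,
            -- Foreign key: a copy of the parent order's primary key.
            OrderID  INTEGER NOT NULL REFERENCES OrderHeader(OrderID),
            ItemName TEXT NOT NULL
        );
    """)
    return conn

def orphan_detail_rejected(conn):
    """Return True if a detail row pointing at a missing order is refused."""
    try:
        conn.execute(
            "INSERT INTO OrderDetail (OrderID, ItemName) VALUES (999, 'Kite')"
        )
    except sqlite3.IntegrityError:
        return True
    return False

conn = build_order_db()
conn.execute("INSERT INTO OrderHeader (OrderID, OrderDate) VALUES (1, '2008-01-15')")
conn.execute("INSERT INTO OrderDetail (OrderID, ItemName) VALUES (1, 'Kite')")
conn.execute("INSERT INTO OrderDetail (OrderID, ItemName) VALUES (1, 'Line')")
```

Calling orphan_detail_rejected(conn) attempts to insert a detail row for an order that does not exist; with referential integrity enforced, the insert fails and the two valid detail rows remain.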
Cardinality
The cardinality of the relationship describes the number of tuples (rows) on each side of the
relationship. Either side of the relationship may be restricted to allow zero, one, or multiple tuples.
The type of key enforces the restriction of multiple tuples. Primary keys are by definition unique and
enforce the single-tuple restriction, whereas foreign keys permit multiple tuples.
There are several possible cardinality combinations, as shown in Table 3-2. Within this section, each of
the cardinality possibilities is examined in detail.
Optionality
The second property of the relationship is its optionality. The difference between an optional relationship
and a mandatory relationship is critical to the data integrity of the database.
FIGURE 3-3
A one-to-many relationship consists of a primary entity and a secondary entity. The secondary entity’s
foreign key points to the primary entity’s primary key. In this case, the Sales.SalesOrderDetail’s
SalesOrderID is the foreign key that relates to Sales.SalesOrderHeader’s primary key.
TABLE 3-2
Common Relationship Cardinalities
Relationship Type   First Entity’s Key                         Second Entity’s Key
One-to-one          Primary entity–primary key–single tuple    Primary entity–primary key–single tuple
One-to-many         Primary entity–primary key–single tuple    Secondary entity–foreign key–multiple tuples
Many-to-many        Multiple tuples                            Multiple tuples
Some relationships are mandatory, or strong. These secondary tuples (rows) require that the foreign key
point to a primary key. The secondary tuple would be incomplete or meaningless without the primary
entity. For the following examples, it’s critical that the relationship be enforced:
■ An order-line item without an order is meaningless.
■ An order without a customer is invalid.
■ In the Cape Hatteras Adventures database, an event without an associated tour tuple is a
useless event tuple.
Conversely, some relationships are optional, or weak. The secondary tuple can stand alone without the
primary tuple. The object in reality that is represented by the secondary tuple would exist with or
without the primary tuple. For example:
■ A customer is valid with or without a discount code.
■ In the OBXKites sample database, an order may or may not have a priority code. Whether
the order points to a valid tuple in the order priority entity or not, it’s still a valid order.
Some database developers prefer to avoid optional relationships, so they design all relationships as
mandatory and point tuples that wouldn’t need a foreign key value to a surrogate tuple in the primary
table. For example, rather than allow nulls in the discount attribute for customers without discounts, a
"no discount" tuple is inserted into the discount entity and every customer without a discount points
to that tuple.
There are two reasons to avoid surrogate null tuples (pointing to a "no discount" tuple): The design
adds work when work isn’t required (additional inserts and foreign key checks), and it’s easier to locate
a tuple without the relationship by selecting where column is null. The null value is a standard
and useful design element. Ignoring the benefits of nullability only creates additional work for both the
developer and the database.
From a purist’s point of view, a benefit of using the surrogate null tuple is that the "no discount" is
explicit and a null value can then actually mean unknown or missing, rather than "no discount."
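The nullable foreign key approach to an optional relationship can be sketched as follows. This is a minimal example using Python's sqlite3 module in place of SQL Server; the Discount and Customer tables, their columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Discount (
        DiscountCode    TEXT PRIMARY KEY,
        DiscountPercent REAL NOT NULL
    );
    CREATE TABLE Customer (
        CustomerID   INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        -- Nullable foreign key: the relationship is optional.
        DiscountCode TEXT NULL REFERENCES Discount(DiscountCode)
    );
    INSERT INTO Discount VALUES ('GOLD', 10.0);
    INSERT INTO Customer VALUES (1, 'John', 'GOLD');
    INSERT INTO Customer VALUES (2, 'Paul', NULL);
""")

def customers_without_discount(conn):
    """Find the tuples that do not take part in the relationship."""
    rows = conn.execute(
        "SELECT Name FROM Customer WHERE DiscountCode IS NULL ORDER BY Name"
    ).fetchall()
    return [name for (name,) in rows]

def invalid_code_rejected(conn):
    """A non-null value must still point to a real discount tuple."""
    try:
        conn.execute("INSERT INTO Customer VALUES (3, 'Sue', 'BOGUS')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Declaring the same column NOT NULL would turn the relationship into a mandatory one, which is the only change a "no discount" surrogate tuple design would need in this sketch.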
Some rare situations call for a complex optionality based on a condition. Depending on a rule, the
relationship must be enforced, for example:
■ If an organization sometimes sells ad hoc items that are not in the item entity, then the
relationship may, depending on the item, be considered optional. The orderdetail entity can use
two attributes for the item. If the ItemID attribute is used, then it must point to a valid
item entity primary key.
■ However, if the NonStandardItemDescription attribute is used instead, the ItemID
attribute is left null.
■ A check constraint ensures that for each row, either the ItemID or the
NonStandardItemDescription is null, but not both.
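The conditional rule in the list above can be sketched with a check constraint. The example uses Python's sqlite3 module rather than SQL Server, and the table layout, the try_insert helper, and the sample values are assumptions; only the ItemID and NonStandardItemDescription attribute names come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Item (
        ItemID INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL
    );
    CREATE TABLE OrderDetail (
        OrderDetailID INTEGER PRIMARY KEY,
        ItemID INTEGER NULL REFERENCES Item(ItemID),
        NonStandardItemDescription TEXT NULL,
        -- Exactly one of the two item attributes must be supplied.
        CHECK (
            (ItemID IS NOT NULL AND NonStandardItemDescription IS NULL)
            OR
            (ItemID IS NULL AND NonStandardItemDescription IS NOT NULL)
        )
    );
    INSERT INTO Item VALUES (1, 'Delta kite');
""")

def try_insert(conn, item_id, description):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO OrderDetail (ItemID, NonStandardItemDescription) "
            "VALUES (?, ?)",
            (item_id, description),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

A standard item (ItemID only) and an ad hoc item (description only) are both accepted, while a row with both attributes, neither attribute, or an ItemID that does not exist in Item is rejected.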
How the optionality is implemented is up to the SQL DDL layer. The only purpose of the conceptual
design layer is to model the organization’s objects, their relationships, and their business rules.
Data schema diagrams for the sample databases are in Appendix B. The code to create the sample database may be downloaded from www.sqlserverbible.com.
Data-Model Diagramming
Data modelers use several methods to graphically work out their data models. The Chen ER diagramming
method is popular, and Visio Professional includes it and five others. The method I prefer, Information
Engineering E/R Diagramming, is rather simple and works well on a whiteboard, as shown in Figure 3-4.
The cardinality of the relationship is indicated by a single line or by three lines (crow’s feet). If the relationship
is optional, a circle is placed near the foreign key.
FIGURE 3-4
A simple method for diagramming logical schemas.
[Figure: a Primary Table connected to a Secondary Table.]
Another benefit of this simple diagramming method is that it doesn’t require an advanced version of Visio.
Visio is OK as a starting point, but it doesn’t support the modeling life cycle the way a dedicated modeling
tool does. There are several more powerful tools, but it’s really a personal preference.
Data Design Patterns
Design is all about building something new by combining existing concepts or items using patterns. The
same is true for database design. The building blocks are tables, rows, and columns, and the patterns are
one-to-many, many-to-many, and others. This section explains these patterns.
Once the entities (nouns and verbs) are organized, the next step is to determine the relationships
among the objects. Each relationship connects two entities using their primary and foreign keys.
Clients or business analysts should be able to describe the common relationships between the objects
using terms such as includes, has, or contains. For example, a customer may place (has) many orders. An
order may include (contains) many items. An item may be on many orders.
Based on these relationship descriptions, the best data design pattern may be chosen.
One-to-many pattern
By far the most common relationship is a one-to-many relationship; this is the classic parent-child
relationship. Several tuples (rows) in the secondary entity relate to a single tuple in the primary entity. The
relationship is between the primary entity’s primary key and the secondary entity’s foreign key, as
illustrated in the following examples:
■ In the Cape Hatteras Adventures database, each base camp may have several tours that
originate from it. Each tour may originate from only one base camp, so the relationship is
modeled as one base camp relating to multiple tours. The relationship is made between the
BaseCamp’s primary key and the Tour entity’s BaseCampID foreign key, as diagrammed in
Figure 3-5. Each Tour’s foreign key attribute contains a copy of its BaseCamp’s primary key.
FIGURE 3-5
The one-to-many relationship relates zero to many tuples (rows) in the secondary entity to a single
tuple in the primary entity.
[Figure: the Base Camp entity (primary key: Base Camp; rows Ashville and Cape Hatteras) related to the
Tour entity (foreign key: Base Camp; rows Ashville / Appalachian Trail, Ashville / Blue Ridge Parkway Hike,
and Cape Hatteras / Outer Banks Lighthouses).]
■ Each customer may place multiple orders. While each order has its own unique OrderID
primary key, the Order entity also has a foreign key attribute that contains the CustomerID
of the customer who placed the order. The Order entity may have several tuples with the
same CustomerID, which defines the relationship as one-to-many.
■ A non-profit organization has an annual pledge drive. As each donor makes an annual pledge,
the pledges go into a secondary entity that can store any number of years’ worth of pledges,
one tuple per year.
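The base camp and tour example can be sketched in code. The following is an illustrative example using Python's sqlite3 module as a stand-in for SQL Server; the column names, the surrogate integer keys, and the 'New Camp' row (a hypothetical base camp with no tours yet) are assumptions, not the Cape Hatteras Adventures schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE BaseCamp (
        BaseCampID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Tour (
        TourID     INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        -- Each tour carries a copy of exactly one base camp's primary key.
        BaseCampID INTEGER NOT NULL REFERENCES BaseCamp(BaseCampID)
    );
    INSERT INTO BaseCamp VALUES (1, 'Ashville');
    INSERT INTO BaseCamp VALUES (2, 'Cape Hatteras');
    INSERT INTO BaseCamp VALUES (3, 'New Camp');  -- hypothetical, no tours
    INSERT INTO Tour VALUES (1, 'Appalachian Trail', 1);
    INSERT INTO Tour VALUES (2, 'Blue Ridge Parkway Hike', 1);
    INSERT INTO Tour VALUES (3, 'Outer Banks Lighthouses', 2);
""")

def tours_per_base_camp(conn):
    """Count the secondary tuples related to each primary tuple."""
    rows = conn.execute("""
        SELECT b.Name, COUNT(t.TourID)
        FROM BaseCamp AS b
        LEFT JOIN Tour AS t ON t.BaseCampID = b.BaseCampID
        GROUP BY b.BaseCampID, b.Name
        ORDER BY b.Name
    """).fetchall()
    return dict(rows)
```

The left outer join keeps base camps that have no tours, which shows the "zero to many" side of the relationship.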
One-to-one pattern
At the conceptual diagram layer, one-to-one relationships are quite rare. Typically, one-to-one
relationships are used in the SQL DDL or the physical layer to partition the data for some performance or
security reason.
One-to-one relationships connect two entities with primary keys at both entities. Because a primary key
must be unique, each side of the relationship is restricted to one tuple.
For example, an Employee entity can store general information about the employee. However, more
sensitive classified information is stored in a separate entity, as shown in Figure 3-6. While security can
be applied on a per-attribute basis, or a view can project selected attributes, many organizations choose
to model sensitive information as two one-to-one entities.
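One way to sketch the employee partitioning is shown below, using Python's sqlite3 module as a stand-in for SQL Server. The column names and the add_classified helper are assumptions made for illustration; the sample rows follow the values visible in Figure 3-6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Employee_Classified (
        -- The primary key doubles as the foreign key, so each employee
        -- can have at most one classified tuple.
        EmployeeID INTEGER PRIMARY KEY REFERENCES Employee(EmployeeID),
        Classified TEXT NOT NULL
    );
    INSERT INTO Employee VALUES (1, 'John Smith');
    INSERT INTO Employee VALUES (2, 'Mary Jones');
    INSERT INTO Employee VALUES (3, 'Davey Jones');
    INSERT INTO Employee VALUES (4, 'Sue Miller');
    INSERT INTO Employee_Classified VALUES (1, 'Secret Stuff');
    INSERT INTO Employee_Classified VALUES (2, 'Secret Stuff');
""")

def add_classified(conn, employee_id, text):
    """Return True if the classified tuple is accepted, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO Employee_Classified VALUES (?, ?)", (employee_id, text)
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

A first classified tuple for an existing employee is accepted, a second one for the same employee violates the primary key, and one for a nonexistent employee violates the foreign key.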
Many-to-many pattern
In a many-to-many relationship, both sides may relate to multiple tuples (rows) on the other side
of the relationship. The many-to-many relationship is common in reality, as shown in the following
examples:
FIGURE 3-6
This one-to-one relationship partitions employee data, segmenting classified information into a
separate entity.
[Figure: the Employee entity (primary key: EmployeeID; rows John Smith, Mary Jones, Davey Jones, and
Sue Miller) related to the Employee_Classified entity (primary key: EmployeeID; Classified column; rows
John Smith / Secret Stuff and Mary Jones / Secret Stuff).]
■ The classic example is members and groups. A member may belong to multiple groups, and a
group may have multiple members.
■ In the OBXKites sample database, an order may have multiple items, and each item may be
sold on multiple orders.
■ In the Cape Hatteras Adventures sample database, a guide may qualify for several tours,
and each tour may have several qualified guides.
In a conceptual diagram, the many-to-many relationship can be diagrammed by signifying multiple
cardinality at each side of the relationship, as shown in Figure 3-7.
FIGURE 3-7
The many-to-many logical model shows multiple tuples on both ends of the relationship.
Many-to-many relationships are nearly always optional. For example, the many customers-to-many
events relationship is optional because the customer and the tour/event are each valid without the other.
The one-to-one and the one-to-many relationship can typically be constructed from items within an
organization that users can describe and understand. That’s not always the case with many-to-many
relationships.
To implement a many-to-many relationship in SQL DDL, a third table, called an associative table
(sometimes called a junction table), is used, which artificially creates two one-to-many relationships
between the two entities (see Figure 3-8).
Figure 3-9 shows the associative entity with data to illustrate how it has a foreign key to each of the two
many-to-many primary entities. This enables each primary entity to assume a one-to-many relationship
with the other entity.
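An associative table can be sketched as follows. The example uses Python's sqlite3 module as a stand-in for SQL Server; the Customer_mm_Event name comes from Figure 3-9, while the column layout, the composite primary key, the helper functions, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Event (
        EventID INTEGER PRIMARY KEY,
        Name    TEXT NOT NULL
    );
    -- Associative table: one foreign key to each primary entity.
    CREATE TABLE Customer_mm_Event (
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
        EventID    INTEGER NOT NULL REFERENCES Event(EventID),
        PRIMARY KEY (CustomerID, EventID)
    );
    INSERT INTO Customer VALUES (1, 'John');
    INSERT INTO Customer VALUES (2, 'Paul');
    INSERT INTO Event VALUES (1, 'Appalachian Trail');
    INSERT INTO Event VALUES (2, 'Blue Ridge Parkway Hike');
    INSERT INTO Event VALUES (3, 'Outer Banks Lighthouses');
    INSERT INTO Customer_mm_Event VALUES (1, 1);
    INSERT INTO Customer_mm_Event VALUES (1, 2);
    INSERT INTO Customer_mm_Event VALUES (2, 2);
    INSERT INTO Customer_mm_Event VALUES (2, 3);
""")

def events_for_customer(conn, customer_name):
    """Follow the one-customer-to-many-events side of the relationship."""
    rows = conn.execute("""
        SELECT e.Name
        FROM Customer AS c
        JOIN Customer_mm_Event AS ce ON ce.CustomerID = c.CustomerID
        JOIN Event AS e ON e.EventID = ce.EventID
        WHERE c.Name = ?
        ORDER BY e.Name
    """, (customer_name,)).fetchall()
    return [name for (name,) in rows]

def customers_for_event(conn, event_name):
    """Follow the one-event-to-many-customers side of the relationship."""
    rows = conn.execute("""
        SELECT c.Name
        FROM Event AS e
        JOIN Customer_mm_Event AS ce ON ce.EventID = e.EventID
        JOIN Customer AS c ON c.CustomerID = ce.CustomerID
        WHERE e.Name = ?
        ORDER BY c.Name
    """, (event_name,)).fetchall()
    return [name for (name,) in rows]
```

The composite primary key is a design choice in this sketch: it prevents the same customer from being attached to the same event twice.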
FIGURE 3-8
The many-to-many implementation adds an associative table to create artificial one-to-many
relationships for both tables.
FIGURE 3-9
In the associative entity (Customer_mm_Event), each customer can be represented multiple times,
which creates an artificial one-event-to-many-customers relationship. Likewise, each event can be
listed multiple times in the associative entity, creating a one-customer-to-many-events relationship.
[Figure: the customer and event entities joined through the Customer_mm_Event associative entity
(foreign keys: CustomerID and EventID). Sample values include the customer John and the events
Appalachian Trail, Blue Ridge Parkway Hike, and Outer Banks Lighthouses.]
In some cases the subject-matter experts will readily recognize the associative entity:
■ In the case of the many orders to many products example, the associative entity is the order
details entity.
■ A class may have many students and each student may attend many classes. The associative
entity would be recognized as the registration entity.
In other cases an organization might understand that the relationship is a many-to-many relationship,
but there’s no term to describe the relationship. In this case, the associative entity is still required to
resolve the many-to-many relationship; just don’t discuss it with the subject-matter experts.
Typically, additional facts and attributes describe the many-to-many relationship. These attributes belong
in the associative entity. For example:
■ In the case of the many orders to many products example, the associative entity (order
details entity) would include the quantity and sales price attributes.
■ In the members and groups example, the member_groups associative entity might include
the datejoined and status attributes.
When designing attributes for associative entities, it’s critical that every attribute actually
describe only the many-to-many relationship and not one of the primary entities. For example, including
a product name describes the product entity and not the many orders to many products relationship.
Supertype/subtype pattern
One of my favorite design patterns, which I don’t see used often enough, is the supertype/subtype pattern.
It supports generalization, and I use it extensively in my designs. The supertype/subtype pattern is also
perfectly suited to modeling an object-oriented design in a relational database.
The supertype/subtype relationship leverages the one-to-one relationship to connect one supertype entity
with one or more subtype entities. This extends the supertype entity with what appears to be flexible
attributes.
The textbook example is a database that needs to store multiple types of contacts. All contacts have
basic contact data such as name, location, phone number, and so on. Some contacts are customers
with customer attributes (credit limits, loyalty programs, etc.). Some contacts are vendors with
vendor-specific data.
While it’s possible to use separate entities for customers and vendors, an alternative design is to use a
single Contact entity (the supertype) to hold every contact, regardless of their type, and the attributes
common to every type (probably just the name and contact attributes). Separate entities (the subtypes)
hold the attributes unique to customers and vendors. A customer would have a tuple (row) in the
contact and the customer entities. A vendor would have tuples in both the contact and vendor entities. All
three entities share the same primary key, as shown in Figure 3-10.
Sometimes data modelers who use the supertype/subtype pattern add a type attribute in the supertype
entity so it’s easy to quickly determine the type without searching the subtypes. This works well, but it
restricts the tuples to a single subtype.
FIGURE 3-10
The supertype/subtype pattern uses an optional one-to-one relationship that relates a primary key to a
primary key.
[Figure: the Contact supertype (primary key: ContactID; rows John, Paul, Earnest Baked Good, Nulls-R-Us,
and Frank’s General Store), the Customer subtype (primary key: ContactID; customer loyalty data; rows
John / 10 Points and Paul / 3 Points), and the Vendor subtype (primary key: ContactID; vendor status;
rows Earnest Baked Good / Always fresh, Nulls-R-Us / Never know when he’ll show up, and
Frank’s General Store / Dependable).]
Without the type attribute, it’s possible to allow tuples to belong to multiple subtypes. Sometimes this
is referred to as allowing the supertype to have multiple roles. In the contact example, multiple roles
(e.g., a contact who is both an employee and customer) could mean the tuple has data in the supertype
entity (e.g., contact entity) and each role subtype entity (e.g., employee and customer entities).
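The contact example can be sketched in code. The following uses Python's sqlite3 module as a stand-in for SQL Server; the column names and the roles helper are assumptions, the sample rows follow Figure 3-10, and the extra Vendor row for Paul is hypothetical data added only to show a contact with two roles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Supertype: attributes common to every contact.
    CREATE TABLE Contact (
        ContactID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL
    );
    -- Subtypes: each shares the supertype's primary key (one-to-one).
    CREATE TABLE Customer (
        ContactID     INTEGER PRIMARY KEY REFERENCES Contact(ContactID),
        LoyaltyPoints INTEGER NOT NULL
    );
    CREATE TABLE Vendor (
        ContactID INTEGER PRIMARY KEY REFERENCES Contact(ContactID),
        Status    TEXT NOT NULL
    );
    INSERT INTO Contact VALUES (1, 'John');
    INSERT INTO Contact VALUES (2, 'Paul');
    INSERT INTO Contact VALUES (3, 'Earnest Baked Good');
    INSERT INTO Contact VALUES (4, 'Nulls-R-Us');
    INSERT INTO Contact VALUES (5, 'Frank''s General Store');
    INSERT INTO Customer VALUES (1, 10);
    INSERT INTO Customer VALUES (2, 3);
    INSERT INTO Vendor VALUES (3, 'Always fresh');
    INSERT INTO Vendor VALUES (4, 'Never know when he''ll show up');
    INSERT INTO Vendor VALUES (5, 'Dependable');
    -- Hypothetical: Paul is also a vendor, so he has two roles.
    INSERT INTO Vendor VALUES (2, 'New vendor');
""")

def roles(conn, contact_id):
    """List the subtype entities that hold a tuple for this contact."""
    found = []
    # Table names come from this fixed tuple, never from user input.
    for role, table in (("customer", "Customer"), ("vendor", "Vendor")):
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE ContactID = ?", (contact_id,)
        ).fetchone()
        if row is not None:
            found.append(role)
    return found
```

Because this sketch has no type attribute in the supertype, the roles are found by probing each subtype, which is what allows a contact such as Paul to hold more than one role.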
Nordic O/R DBMS
Nordic (New Object/Relational Design) is my open-source experiment to transform SQL Server into an
object-oriented database.
Nordic builds on the supertype/subtype pattern and uses T-SQL code generation to create a T-SQL API
façade that supports classes with multiple inheritance, attribute inheritance, polymorphism, inheritable class
roles, object morphing, and inheritable class-defined workflow state. If you want to play with Nordic, go to
www.CodePlex.com/nordic.
Domain integrity lookup pattern
The domain integrity lookup pattern, informally called the lookup table pattern, is very common in
production databases. This pattern only serves to limit the valid options for an attribute, as illustrated in
Figure 3-11.
FIGURE 3-11
The domain integrity lookup pattern uses a foreign key to ensure that only valid data is entered into
the attribute.
[Figure: the Contact entity (primary key: ContactID; foreign key: RegionID) related to the Region entity
(primary key: RegionID; Region Description). Sample values include the contacts Earnest Baked Good (CO)
and Nulls-R-Us (NY), and the region North Carolina (NC).]
The classic example is the state, or region, lookup entity. Unless the organization regularly deals with
several states as clients, the state lookup entity only serves to ensure that the state attributes in other
entities are entered correctly. Its only purpose is data consistency.
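A lookup table can be sketched as follows, using Python's sqlite3 module as a stand-in for SQL Server. The Region and Contact entity names and the RegionID key follow Figure 3-11; the column layout, the add_contact helper, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Lookup entity: the list of valid region codes.
    CREATE TABLE Region (
        RegionID          TEXT PRIMARY KEY,
        RegionDescription TEXT NOT NULL
    );
    CREATE TABLE Contact (
        ContactID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        -- Only codes present in Region are accepted here.
        RegionID  TEXT NOT NULL REFERENCES Region(RegionID)
    );
    INSERT INTO Region VALUES ('NC', 'North Carolina');
    INSERT INTO Region VALUES ('CO', 'Colorado');
    INSERT INTO Region VALUES ('NY', 'New York');
""")

def add_contact(conn, name, region_id):
    """Return True if the contact is accepted, False if the region is invalid."""
    try:
        conn.execute(
            "INSERT INTO Contact (Name, RegionID) VALUES (?, ?)",
            (name, region_id),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

A contact with a known region code is accepted, while a misspelled or unknown code is rejected by the foreign key, which is the data consistency the pattern is meant to provide.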
Recursive pattern
A recursive relationship pattern (sometimes called a self-referencing, unary, or self-join relationship) is one
that relates back to itself. In reality, these relationships are quite common:
■ An organizational chart represents a person reporting to another person.
■ A bill of materials details how a material is constructed from other materials.
■ Within the Family sample database, a person relates to his or her mother and father.
Chapter 17, "Traversing Hierarchies," deals specifically with modeling and querying
recursive relationships within SQL Server 2008.
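The organizational chart case can be sketched with a self-referencing foreign key. The example uses Python's sqlite3 module as a stand-in for SQL Server; the Person table, its columns, and the names Ann, Bob, Cy, and Dee are all hypothetical. The WITH RECURSIVE syntax shown is SQLite's; other database engines may spell recursive queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Person (
        PersonID  INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        -- Self-referencing foreign key; NULL marks the top of the chart.
        ManagerID INTEGER NULL REFERENCES Person(PersonID)
    );
    INSERT INTO Person VALUES (1, 'Ann', NULL);
    INSERT INTO Person VALUES (2, 'Bob', 1);
    INSERT INTO Person VALUES (3, 'Cy', 1);
    INSERT INTO Person VALUES (4, 'Dee', 2);
""")

def direct_reports(conn, manager_name):
    """Self-join: find the people whose ManagerID points at this person."""
    rows = conn.execute("""
        SELECT r.Name
        FROM Person AS m
        JOIN Person AS r ON r.ManagerID = m.PersonID
        WHERE m.Name = ?
        ORDER BY r.Name
    """, (manager_name,)).fetchall()
    return [name for (name,) in rows]

def chain_of_command(conn, name):
    """Walk up the hierarchy from a person to the top of the chart."""
    rows = conn.execute("""
        WITH RECURSIVE chain(PersonID, Name, ManagerID, Depth) AS (
            SELECT PersonID, Name, ManagerID, 0 FROM Person WHERE Name = ?
            UNION ALL
            SELECT p.PersonID, p.Name, p.ManagerID, c.Depth + 1
            FROM Person AS p
            JOIN chain AS c ON p.PersonID = c.ManagerID
        )
        SELECT Name FROM chain ORDER BY Depth
    """, (name,)).fetchall()
    return [n for (n,) in rows]
```

The self-join answers a single-level question (who reports to Ann), while the recursive query follows the foreign key repeatedly until it reaches a tuple whose ManagerID is null.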