Figure 7.2 Each order in the Orders table refers to a customer from the Customers table.
Schemas
The complete set of the table designs for a database is called the database schema. It is akin to a blueprint for the database. A schema should show the tables along with their columns and the data types of the columns, and indicate the primary key of each table and any foreign keys. A schema does not include any data, but you might want to show sample data with your schema to explain what it is for. The schema can be shown as it is in the diagrams we are using, in entity relationship diagrams (which are not covered in this book), or in a text form, such as
Customers(CustomerID, Name, Address, City)
Orders(OrderID, CustomerID, Amount, Date)
Underlined terms in the schema are primary keys in the relation in which they are underlined. Dotted underlined terms are foreign keys in the relation in which they appear with a dotted underline. Here, the primary keys are CustomerID in Customers and OrderID in Orders, and CustomerID in Orders is a foreign key.
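The text form of the schema maps closely onto SQL table definitions. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column data types are assumptions, because the text form does not state them.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,   -- primary key
    Name       TEXT NOT NULL,         -- data types here are assumed
    Address    TEXT NOT NULL,
    City       TEXT NOT NULL
);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,   -- primary key
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),  -- foreign key
    Amount     REAL NOT NULL,
    Date       TEXT NOT NULL
);
""")

# The schema holds no data yet; list the tables that now exist.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['Customers', 'Orders']
```

The schema describes structure only. Rows are added later with INSERT statements.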
Relationships
Foreign keys represent a relationship between data in two tables. For example, the link from Orders to Customers represents a relationship between a row in the Orders table and a row in the Customers table.
Three basic kinds of relationships exist in a relational database. They are classified according to the number of things on each side of the relationship. Relationships can be either one-to-one, one-to-many, or many-to-many.
A one-to-one relationship means that there is one of each thing in the relationship. For example, if we had put addresses in a separate table from Customers, there would be a one-to-one relationship between them. You could have a foreign key from Addresses to Customers or the other way around (both are not required).
CUSTOMERS
CustomerID  Name             Address             City
1           Julie Smith      25 Oak Street       Airport West
2           Alan Wong        1/47 Haines Avenue  Box Hill
3           Michelle Arthur  357 North Road      Yarraville
ORDERS
OrderID  CustomerID  Amount  Date
In a one-to-many relationship, one row in one table is linked to many rows in another table. In this example, one Customer might place many Orders. In these relationships, the table that contains the many rows will have a foreign key to the table with the one row. Here, we have put the CustomerID into the Orders table to show the relationship.
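A one-to-many link can be sketched as follows, again with sqlite3. The order amounts are invented for illustration, and the tables are trimmed to the columns the example needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
    Amount     REAL
);
""")
conn.execute("INSERT INTO Customers VALUES (1, 'Julie Smith')")
# Invented sample orders: the "many" side carries the foreign key.
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, 1, 10.00), (2, 1, 20.00), (3, 1, 30.00)])

order_count = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE CustomerID = 1").fetchone()[0]
print(order_count)  # 3

# An order that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (4, 99, 5.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```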
In a many-to-many relationship, many rows in one table are linked to many rows in another table. For example, if we had two tables, Books and Authors, you might find that one book had been written by two coauthors, each of whom had written other books, on their own or possibly with other authors. This type of relationship usually gets a table all to itself, so you might have Books, Authors, and Books_Authors. This third table would only contain the keys of the other tables as foreign keys in pairs, to show which authors have been involved with which books.
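The linking table might look like the sketch below. The BookID and AuthorID key columns, the titles, and the author names are all placeholders invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books   (BookID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Authors (AuthorID INTEGER PRIMARY KEY, Name TEXT);
-- The linking table holds only pairs of keys from the other two tables.
CREATE TABLE Books_Authors (
    BookID   INTEGER REFERENCES Books(BookID),
    AuthorID INTEGER REFERENCES Authors(AuthorID),
    PRIMARY KEY (BookID, AuthorID)
);
INSERT INTO Books VALUES (1, 'Book A'), (2, 'Book B');
INSERT INTO Authors VALUES (1, 'Author X'), (2, 'Author Y');
-- Book A has two coauthors; Author X also wrote Book B alone.
INSERT INTO Books_Authors VALUES (1, 1), (1, 2), (2, 1);
""")

coauthors = [r[0] for r in conn.execute("""
    SELECT Authors.Name FROM Authors
    JOIN Books_Authors ON Books_Authors.AuthorID = Authors.AuthorID
    WHERE Books_Authors.BookID = 1 ORDER BY Authors.Name""")]
books_by_x = [r[0] for r in conn.execute("""
    SELECT Books.Title FROM Books
    JOIN Books_Authors ON Books_Authors.BookID = Books.BookID
    WHERE Books_Authors.AuthorID = 1 ORDER BY Books.Title""")]
print(coauthors)   # ['Author X', 'Author Y']
print(books_by_x)  # ['Book A', 'Book B']
```

The same table answers the question in both directions: who wrote a given book, and which books a given author wrote.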
How to Design Your Web Database
Knowing when you need a new table and what the key should be can be something of an art. You can read huge reams of information about entity relationship diagrams and database normalization, which are beyond the scope of this book. Most of the time, however, you can follow a few basic design principles. Let’s consider these in the context of Book-O-Rama.
Think About the Real World Objects You Are Modeling
When you create a database, you are usually modeling real-world items and relationships and storing information about those objects and relationships.
Generally, each class of real-world objects you model will need its own table. Think about it: We want to store the same information about all our customers. If there is a set of data that has the same “shape,” we can easily create a table corresponding to that data.
In the Book-O-Rama example, we want to store information about our customers, the books that we sell, and details of the orders. The customers all have a name and address. The orders have a date, a total amount, and a set of books that were ordered. The books have an ISBN, an author, a title, and a price.
This suggests we need at least three tables in this database: Customers, Orders, and Books. This initial schema is shown in Figure 7.3.
At present, we can’t tell from the model which books were ordered in each order. We will deal with this in a minute.
Avoid Storing Redundant Data
Earlier, we asked the question: “Why not just store Julie Smith’s address in the Orders table?”
If Julie orders from Book-O-Rama on a number of occasions, which we hope she will, we will end up storing her data multiple times. You might end up with an Orders table that looks like the one shown in Figure 7.4.
CUSTOMERS
CustomerID  Name             Address             City
1           Julie Smith      25 Oak Street       Airport West
2           Alan Wong        1/47 Haines Avenue  Box Hill
3           Michelle Arthur  357 North Road      Yarraville
BOOKS
ISBN           Author          Title                               Price
0-672-31687-8  Michael Morgan  Java 2 for Professional Developers  34.99
0-672-31745-1  Thomas Down     Installing Debian GNU/Linux         24.99
0-672-31509-2  Pruitt, et al   Teach Yourself GIMP in 24 Hours     24.99
ORDERS
OrderID  CustomerID  Amount  Date
Figure 7.3 The initial schema consists of Customers, Orders, and Books.
ORDERS
OrderID  CustomerID  Amount  Date  Name         Address        City
...      ...         ...     ...   Julie Smith  28 Oak Street  Airport West
...      ...         ...     ...   Julie Smith  28 Oak Street  Airport West
...      ...         ...     ...   Julie Smith  28 Oak Street  Airport West
...      ...         ...     ...   Julie Smith  28 Oak Street  Airport West
Figure 7.4 A database design that stores redundant data takes up extra space and can cause anomalies in the data.
There are two basic problems with this.
The first is that it’s a waste of space. Why store Julie’s details three times if we only have to store them once?
The second problem is that it can lead to update anomalies, that is, situations where we change the database and end up with inconsistent data. The integrity of the data is violated and we no longer know which data is correct and which incorrect. This generally leads to losing information.
Three kinds of update anomalies need to be avoided: modification, insertion, and deletion anomalies.
If Julie moves to a new house while she has pending orders, we will need to update her address in three places instead of one, doing three times as much work. It is easy to overlook this fact and only change her address in one place, leading to inconsistent data in the database (a very bad thing). These problems are called modification anomalies because they occur when we are trying to modify the database.
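A modification anomaly is easy to reproduce. The sketch below uses a cut-down version of the redundant Orders table; the order IDs and the new address are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Redundant design: customer details are repeated on every order row.
conn.execute("""CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
    Name TEXT, Address TEXT, City TEXT)""")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "Julie Smith", "25 Oak Street", "Airport West"),
    (2, 1, "Julie Smith", "25 Oak Street", "Airport West"),
    (3, 1, "Julie Smith", "25 Oak Street", "Airport West"),
])

# Julie moves, but the update touches only one of her three rows.
conn.execute("UPDATE Orders SET Address = '1 New Street' WHERE OrderID = 1")

addresses = [r[0] for r in conn.execute(
    "SELECT DISTINCT Address FROM Orders WHERE CustomerID = 1 ORDER BY Address")]
print(addresses)  # ['1 New Street', '25 Oak Street']
```

The database now holds two addresses for one customer and offers no way to tell which is current.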
With this design, we need to insert Julie’s details every time we take an order, so each time we must check and make sure that her details are consistent with the existing rows in the table. If we don’t check, we might end up with two rows of conflicting information about Julie. For example, one row might tell us that Julie lives in Airport West, and another might tell us she lives in Airport. This is called an insertion anomaly because it occurs when data is being inserted.
The third kind of anomaly is called a deletion anomaly because it occurs (surprise, surprise) when we are deleting rows from the database. For example, imagine that when an order has been shipped, we delete it from the database. When all Julie’s current orders have been fulfilled, they are all deleted from the Orders table. This means that we no longer have a record of Julie’s address. We can’t send her any special offers, and next time she wants to order something from us, we will have to get her details all over again.
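The deletion anomaly can be shown by comparing the two designs side by side. This is a sketch only: the table name OrdersRedundant is made up for the comparison, and the tables are trimmed to the columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Redundant design: the address lives only on order rows.
CREATE TABLE OrdersRedundant (OrderID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
INSERT INTO OrdersRedundant VALUES (1, 'Julie Smith', '25 Oak Street');

-- Separate design: the address lives in Customers.
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customers VALUES (1, 'Julie Smith', '25 Oak Street');
INSERT INTO Orders VALUES (1, 1);

-- The order ships, so it is deleted in both designs.
DELETE FROM OrdersRedundant;
DELETE FROM Orders;
""")

lost = conn.execute(
    "SELECT Address FROM OrdersRedundant WHERE Name = 'Julie Smith'").fetchone()
kept = conn.execute(
    "SELECT Address FROM Customers WHERE Name = 'Julie Smith'").fetchone()
print(lost)  # None: the address went with the order
print(kept)  # ('25 Oak Street',)
```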
Generally, you want to design your database so that none of these anomalies occur.
Use Atomic Column Values
This means that in each attribute in each row, we store only one thing. For example, we need to know what books make up each order. There are several ways we could do this.
We could add a column to the Orders table that lists all the books that have been ordered, as shown in Figure 7.5.
ORDERS
OrderID  CustomerID  Amount  Date  Books Ordered
...      ...         ...     ...   0-672-31697-8
...      ...         ...     ...   0-672-31745-1, 0-672-31509-2
...      ...         ...     ...   0-672-31697-8
...      ...         ...     ...   0-672-31745-1, 0-672-31509-2, 0-672-31697-8
Figure 7.5 With this design, the Books Ordered attribute in each row has multiple values.
This isn’t a good idea for a few reasons. What we’re really doing is nesting a whole table inside one column: a table that relates orders to books. When you do it this way, it becomes more difficult to answer questions like “How many copies of Java 2 for Professional Developers have been ordered?” The system can no longer just count the matching fields. Instead, it has to parse each attribute value to see if it contains a match anywhere inside it.
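The parsing problem can be seen in a small sketch. The Books Ordered values are the ones from Figure 7.5, while the order IDs and the column name BooksOrdered are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Non-atomic design: one column holds a comma-separated list of ISBNs.
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, BooksOrdered TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [
    (1, "0-672-31697-8"),
    (2, "0-672-31745-1, 0-672-31509-2"),
    (3, "0-672-31697-8"),
    (4, "0-672-31745-1, 0-672-31509-2, 0-672-31697-8"),
])

target = "0-672-31697-8"

# A plain equality test misses the row where the ISBN is buried in a list.
exact = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE BooksOrdered = ?", (target,)).fetchone()[0]

# To get the right answer, every value has to be fetched and parsed.
parsed = sum(
    target in [isbn.strip() for isbn in books.split(",")]
    for (books,) in conn.execute("SELECT BooksOrdered FROM Orders"))
print(exact, parsed)  # 2 3
```

Even the parsed count only says how many orders include the book. The list has no place to record how many copies each order contained.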
Because we’re really creating a table-inside-a-table, we should really just create that new table. This new table is called Order_Items and is shown in Figure 7.6.
Figure 7.6 This design makes it easier to search for particular books that have been ordered.
This table provides a link between the Orders table and the Books table. This type of table is common when there is a many-to-many relationship between two objects. In this case, one order might consist of many books, and each book can be ordered by many people.
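With one row per book per order, the earlier question becomes a single query. In this sketch the ISBNs come from the figures, but the order IDs and quantities are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Order_Items (
    OrderID  INTEGER,
    ISBN     TEXT,
    Quantity INTEGER
);
-- Invented order lines: each row holds exactly one ISBN.
INSERT INTO Order_Items VALUES
    (1, '0-672-31697-8', 1),
    (2, '0-672-31745-1', 2),
    (2, '0-672-31509-2', 1),
    (3, '0-672-31697-8', 1);
""")

# How many copies of one book have been ordered? No string parsing needed.
copies = conn.execute(
    "SELECT SUM(Quantity) FROM Order_Items WHERE ISBN = ?",
    ("0-672-31697-8",)).fetchone()[0]
print(copies)  # 2
```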
Choose Sensible Keys
Make sure that the keys you choose are unique. In this case, we’ve created a special key for customers (CustomerID) and for orders (OrderID) because these real-world objects might not naturally have an identifier that can be guaranteed to be unique. We don’t need to create a unique identifier for books, because this has already been done in the form of an ISBN. For Order_Items, you can add an extra key if you want, but the combination of the two attributes OrderID and ISBN will be unique as long as more than one copy of the same book in an order is treated as one row. For this reason, the table Order_Items has a Quantity column.
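One way to express this is a composite primary key over OrderID and ISBN. The sketch below assumes that approach and uses invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Order_Items (
    OrderID  INTEGER,
    ISBN     TEXT,
    Quantity INTEGER NOT NULL,
    PRIMARY KEY (OrderID, ISBN))""")
conn.execute("INSERT INTO Order_Items VALUES (1, '0-672-31697-8', 1)")

# A second row for the same book in the same order violates the key...
try:
    conn.execute("INSERT INTO Order_Items VALUES (1, '0-672-31697-8', 1)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

# ...so a second copy is recorded by raising Quantity on the existing row.
conn.execute("""UPDATE Order_Items SET Quantity = Quantity + 1
                WHERE OrderID = 1 AND ISBN = '0-672-31697-8'""")
quantity = conn.execute("SELECT Quantity FROM Order_Items").fetchone()[0]
print(duplicate_allowed, quantity)  # False 2
```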
Think About the Questions You Want to Ask the Database
Continuing from the last section, think about what questions you want the database to answer. (Think back to those questions we mentioned at the start of the chapter. For example, what are Book-O-Rama’s bestselling books?) Make sure that the database contains all the data required, and that the appropriate links exist between tables to answer the questions you have.
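As an illustration, the bestseller question could be answered with a join between Books and Order_Items. The order lines below are invented, and only the columns needed for the query are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (ISBN TEXT PRIMARY KEY, Title TEXT);
CREATE TABLE Order_Items (
    OrderID INTEGER, ISBN TEXT REFERENCES Books(ISBN), Quantity INTEGER,
    PRIMARY KEY (OrderID, ISBN));
INSERT INTO Books VALUES
    ('0-672-31745-1', 'Installing Debian GNU/Linux'),
    ('0-672-31509-2', 'Teach Yourself GIMP in 24 Hours');
-- Invented order lines, for illustration only.
INSERT INTO Order_Items VALUES
    (1, '0-672-31745-1', 1),
    (2, '0-672-31745-1', 2),
    (2, '0-672-31509-2', 1);
""")

# Total copies ordered per title, highest first.
bestsellers = conn.execute("""
    SELECT Books.Title, SUM(Order_Items.Quantity) AS Copies
    FROM Order_Items
    JOIN Books ON Books.ISBN = Order_Items.ISBN
    GROUP BY Books.ISBN, Books.Title
    ORDER BY Copies DESC""").fetchall()
print(bestsellers[0])  # ('Installing Debian GNU/Linux', 3)
```

The query works only because Order_Items records both the ISBN and the quantity. Without either column, the question could not be answered from the data.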
Avoid Designs with Many Empty Attributes
If we wanted to add book reviews to the database, there are at least two ways we could do this. These two approaches are shown in Figure 7.7.
The first way means adding a Review column to the Books table. This way, there is a field for the Review to be added for each book. If many books are in the database, and the reviewer doesn’t plan to review them all, many rows won’t have a value in this attribute. This is called having a null value.
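The effect of the first approach can be sketched as follows. The review text is invented, and the Books table is trimmed to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (ISBN TEXT PRIMARY KEY, Title TEXT, Review TEXT);
-- Only one of the three books has been reviewed; the review text is invented.
INSERT INTO Books VALUES
    ('0-672-31745-1', 'Installing Debian GNU/Linux', 'A sample review.'),
    ('0-672-31509-2', 'Teach Yourself GIMP in 24 Hours', NULL),
    ('0-672-31687-8', 'Java 2 for Professional Developers', NULL);
""")

# COUNT(*) counts rows; COUNT(Review) skips rows where Review is NULL.
total, reviewed = conn.execute(
    "SELECT COUNT(*), COUNT(Review) FROM Books").fetchone()
print(total - reviewed)  # 2 rows carry a null Review
```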
ORDER_ITEMS
OrderID  ISBN           Quantity
...      0-672-31697-8  ...
...      0-672-31745-1  ...
...      0-672-31509-2  ...
...      0-672-31697-8  ...