Students table:
StudentID Student
Teachers table:
TeacherID Teacher Assistant
Tests table:
TestID TeacherID Test Date TotalPoints
Formats table:
TestID TestFormat
1 Multiple Choice
2 Multiple Choice
3 Multiple Choice
5 Multiple Choice
Grades table:
StudentID TestID Grade
Your first impression might be that we have unnecessarily complicated the situation, rather than improved it. For example, the Grades table is now a mass of numbers, the meaning of which is not completely obvious on quick inspection. This is true. However, remembering the ability of SQL to join tables together easily, you can also see that there is now much greater flexibility in this new design. Not only are we free to join together only those tables needed for any particular analysis, but we can now add new columns to these tables much more readily, without affecting anything else.
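As a minimal sketch of joining only the tables a given analysis needs, the following uses Python's built-in sqlite3 module with an in-memory database. The sample rows (student names, test name, date, and grades) are invented for illustration and do not come from the tables above.

```python
import sqlite3

# In-memory database holding a subset of the normalized tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT);
CREATE TABLE Tests (TestID INTEGER PRIMARY KEY, TeacherID INTEGER,
                    Test TEXT, Date TEXT, TotalPoints INTEGER);
CREATE TABLE Grades (StudentID INTEGER, TestID INTEGER, Grade INTEGER);

-- Sample rows are invented for illustration only.
INSERT INTO Students VALUES (1, 'Amy'), (2, 'Jon');
INSERT INTO Tests VALUES (1, 1, 'Pronoun Quiz', '2009-03-02', 10);
INSERT INTO Grades VALUES (1, 1, 8), (2, 1, 6);
""")

# Join only the tables this report needs: Grades, Students, and Tests.
# The Teachers and Formats tables are left out because they are not required.
rows = conn.execute("""
    SELECT Students.Student, Tests.Test, Grades.Grade
    FROM Grades
    INNER JOIN Students ON Grades.StudentID = Students.StudentID
    INNER JOIN Tests ON Grades.TestID = Tests.TestID
    ORDER BY Students.Student
""").fetchall()

for row in rows:
    print(row)
```

The numeric keys in the Grades table become readable again once they are joined back to the tables that describe them.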
Our information has become more modularized. For example, if we should decide that we want to capture additional information about each student, such as address and phone, we can simply add new columns to the Students table. Additionally, when we want to modify a student’s address or phone later, it only affects one row in the table.
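The idea of adding columns to the Students table and later changing a single row might look like the sketch below. The column names Address and Phone, and all sample values, are assumptions chosen for illustration. The syntax shown is SQLite's, and other databases may word ALTER TABLE slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT)")
conn.execute("INSERT INTO Students VALUES (1, 'Amy'), (2, 'Jon')")

# Add the new attributes; existing rows get NULL in the new columns.
conn.execute("ALTER TABLE Students ADD COLUMN Address TEXT")
conn.execute("ALTER TABLE Students ADD COLUMN Phone TEXT")

# Changing one student's details touches exactly one row.
cursor = conn.execute(
    "UPDATE Students SET Address = ?, Phone = ? WHERE StudentID = ?",
    ("12 Elm St", "555-0100", 1),
)
rows_changed = cursor.rowcount

students = conn.execute("SELECT * FROM Students ORDER BY StudentID").fetchall()
print(rows_changed)
print(students)
```

No other table needs to change, because the other tables refer to a student only through StudentID.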
The Art of Database Design
Ultimately, designing a database is much more than simply going through the normalization procedures. Database design is really more of an art than a science, and it requires asking and thinking about relevant business issues.
In our grades example, we presented one possible database design as an illustration of how to normalize data. In truth, there are many possibilities for designing this database. Much depends on the realities of how the data will be accessed and modified. Numerous questions can be asked to ascertain whether your design is as flexible and meaningful as it needs to be. For example:
■ Are there other tables that need to be added to our database? One obvious
choice would be a Subjects table, so you could easily select tests by subject,
Chapter 19 ■ Principles of Database Design
such as English or Math. If you did this, would you relate the subject to the
test or to the teacher who gave the test?
■ Is it possible for a grade to count in more than one subject? Maybe the
English and Social Studies teachers are doing a combined lesson and want
certain tests to count for both subjects. How do you account for that?
■ What do you do if a child flunks a grade and is now taking the same tests
for a second year? How do you differentiate his grade now from last year’s
grades?
■ How do you allow for special rules that teachers might implement, such
as dropping the lowest quiz score in a particular time period?
■ Are there special analysis requirements for the data? If there is more than
one teacher for the same subject, do you want to be able to compare the
average grades for the students of each teacher, to make sure that one
teacher isn’t unfairly inflating grades?
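To make the last question concrete, here is one hedged sketch of comparing average grades by teacher against the normalized design. The tables are simplified (some columns omitted), the teacher names and grades are invented, and expressing each grade as a percentage of TotalPoints is only one reasonable way to make tests of different sizes comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified versions of the normalized tables; sample data is invented.
CREATE TABLE Teachers (TeacherID INTEGER PRIMARY KEY, Teacher TEXT);
CREATE TABLE Tests (TestID INTEGER PRIMARY KEY, TeacherID INTEGER,
                    TotalPoints INTEGER);
CREATE TABLE Grades (StudentID INTEGER, TestID INTEGER, Grade INTEGER);

INSERT INTO Teachers VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO Tests VALUES (1, 1, 10), (2, 2, 20);
INSERT INTO Grades VALUES (1, 1, 8), (2, 1, 6), (1, 2, 18), (2, 2, 20);
""")

# Average each grade as a percentage of the test's total points, per teacher.
teacher_averages = conn.execute("""
    SELECT Teachers.Teacher,
           ROUND(AVG(100.0 * Grades.Grade / Tests.TotalPoints), 1) AS AvgPercent
    FROM Grades
    INNER JOIN Tests ON Grades.TestID = Tests.TestID
    INNER JOIN Teachers ON Tests.TeacherID = Teachers.TeacherID
    GROUP BY Teachers.Teacher
    ORDER BY Teachers.Teacher
""").fetchall()

for teacher, avg_percent in teacher_averages:
    print(teacher, avg_percent)
```

A query like this only works if the design records which teacher gave each test, which is exactly the kind of requirement these questions are meant to surface.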
The list of possible questions is endless. But the point is that data doesn’t exist in a
vacuum. There is a necessary interaction between data design and requirements in
the real world. Databases need to be designed in such a way as to allow for needed
flexibility. However, there is also a danger that databases can be overly designed to a
point where the data becomes unintelligible. A zealous database administrator may
decide to create 20 tables to allow for every possible situation. That, too, is
inadvisable. Database design is something of a balancing act in search of a design that is
sufficiently flexible but also intuitive and understandable by users of the system.
Alternatives to Normalization
We have emphasized that normalization is the overriding principle that should
be followed in designing a database. In certain situations, however, there are
viable alternatives that might make more sense.
For example, in the realm of data warehouse systems and software, many
practitioners advocate utilizing a star schema design for databases rather than
normalization. In a star schema, a certain amount of redundancy is allowed and
encouraged. The emphasis is on creating a data structure that more intuitively
reflects business realities, and also one that allows for quick processing of data by
special analytical software.
To give a brief overview of star schema designs, the main idea is to create a central fact table, which is related to any number of dimension tables. The fact table contains all the quantitative numbers that are additive in nature. In our prior example, the Grade column is such a number, since we can add up grades to obtain a meaningful total grade. The dimension tables contain information on all the entities that are related to the central facts, such as subject, time, teacher, student, and so on.
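A pared-down sketch of the fact and dimension idea follows. It is not a full star schema: it uses only two dimension tables and one additive measure, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive details about each entity.
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT);
CREATE TABLE Teachers (TeacherID INTEGER PRIMARY KEY, Teacher TEXT);

-- Fact table: keys pointing at the dimensions plus the additive Grade measure.
CREATE TABLE Grades (
    StudentID INTEGER REFERENCES Students (StudentID),
    TeacherID INTEGER REFERENCES Teachers (TeacherID),
    Grade INTEGER
);

-- Sample rows are invented for illustration only.
INSERT INTO Students VALUES (1, 'Amy'), (2, 'Jon');
INSERT INTO Teachers VALUES (1, 'Smith');
INSERT INTO Grades VALUES (1, 1, 8), (1, 1, 9), (2, 1, 6);
""")

# Because Grade is additive, summing it along any dimension is meaningful.
total_by_student = conn.execute("""
    SELECT Students.Student, SUM(Grades.Grade) AS TotalGrade
    FROM Grades
    INNER JOIN Students ON Grades.StudentID = Students.StudentID
    GROUP BY Students.Student
    ORDER BY Students.Student
""").fetchall()

print(total_by_student)
```

The same SUM could be grouped by teacher instead, simply by joining to the other dimension table.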
Furthermore, special analytical software exists that allows database developers to create cubes from their star schema databases. These cubes extend analysis capabilities, allowing users to drill down predefined hierarchies, which are defined in the various dimensions. A user of such a system would be able to drill down from viewing a semester’s worth of grades for a student, to his grades in any individual week.
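Cube software is beyond plain SQL, but the drill-down idea can be approximated with ordinary GROUP BY queries at two levels of a hierarchy. In this sketch the Dates dimension table, its Semester and Week columns, and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical time dimension holding a Semester > Week hierarchy.
CREATE TABLE Dates (Date TEXT PRIMARY KEY, Semester TEXT, Week INTEGER);
CREATE TABLE Grades (Date TEXT, StudentID INTEGER, Grade INTEGER);

INSERT INTO Dates VALUES
    ('2009-03-02', 'Spring', 1),
    ('2009-03-04', 'Spring', 1),
    ('2009-03-09', 'Spring', 2);
INSERT INTO Grades VALUES
    ('2009-03-02', 1, 8),
    ('2009-03-04', 1, 9),
    ('2009-03-09', 1, 7);
""")

# Top level: one student's grades summed over the whole semester.
by_semester = conn.execute("""
    SELECT Dates.Semester, SUM(Grades.Grade)
    FROM Grades INNER JOIN Dates ON Grades.Date = Dates.Date
    WHERE Grades.StudentID = 1
    GROUP BY Dates.Semester
""").fetchall()

# Drill down: the same total, broken out by week within the semester.
by_week = conn.execute("""
    SELECT Dates.Semester, Dates.Week, SUM(Grades.Grade)
    FROM Grades INNER JOIN Dates ON Grades.Date = Dates.Date
    WHERE Grades.StudentID = 1
    GROUP BY Dates.Semester, Dates.Week
    ORDER BY Dates.Week
""").fetchall()

print(by_semester)
print(by_week)
```

Analytical tools typically let the user move between these levels interactively rather than by writing a new query each time.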
Figure 19.2 shows what a database with a star schema design might look like for our grades example.
Figure 19.2
Star schema design.
In this design, the Grades table is the central fact table. The other tables are all dimension tables.
The first four columns in the Grades table (Date, TestID, StudentID, and TeacherID) are there only to relate the table to each of the dimensions. The other two columns have the additive numeric quantities we talked about. Notice that TotalPoints is now in the Grades table. In our normalized design, it was an attribute of the Tests table. By putting both the Grade and TotalPoints in the Grades table, we can use our analytical software to easily sum up grades and compute average grades (Grade divided by the TotalPoints) for any set of data.
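The benefit of keeping Grade and TotalPoints side by side can be sketched with a single query against the fact table alone. The six column names follow the description above; the sample rows and the choice to report the average as a percentage are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Fact table with four dimension keys and two additive measures.
CREATE TABLE Grades (
    Date TEXT, TestID INTEGER, StudentID INTEGER, TeacherID INTEGER,
    Grade INTEGER, TotalPoints INTEGER
);

-- Sample rows are invented for illustration only.
INSERT INTO Grades VALUES
    ('2009-03-02', 1, 1, 1, 8, 10),
    ('2009-03-09', 2, 1, 1, 18, 20),
    ('2009-03-02', 1, 2, 1, 6, 10);
""")

# Both measures live in one table, so no join is needed to compute
# points earned divided by points possible for each student.
averages = conn.execute("""
    SELECT StudentID,
           SUM(Grade) AS Points,
           SUM(TotalPoints) AS Possible,
           ROUND(100.0 * SUM(Grade) / SUM(TotalPoints), 1) AS Percent
    FROM Grades
    GROUP BY StudentID
    ORDER BY StudentID
""").fetchall()

for row in averages:
    print(row)
```

In the normalized design the same calculation would require a join to the Tests table to fetch TotalPoints, which is the redundancy trade-off the star schema accepts.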
Certainly, this is only a brief introduction to the subject of designing databases
for data warehouses. It illustrates the point that there are many different ways to
design a database, and the best way often relates to the type of software that will
be used with the data.
Looking Ahead
This chapter covered the principles of database design. We went over the basics
of the normalization process, showing how a database with a single table can be
converted into a more flexible structure with multiple tables, related by
additional key columns. We also emphasized that database design is not merely a
technical exercise. Attention must be paid to organizational realities and to
considerations as to how the data will be utilized. Finally, we briefly described
one alternative to the conventional normalized design, in an effort to emphasize
that there is often more than one approach to this endeavor.
In our final chapter, “Strategies for Displaying Data,” we’re going to discuss
some interesting possibilities for using reporting software tools to complement
our knowledge of SQL. In our quest to sharpen our SQL skills, we must not
forget that there is a world beyond SQL. We need to make sure that we don’t
expend our efforts in SQL when the underlying objective can be accomplished
more effectively through other means.