Notice that quizzes with a score less than 70 aren’t shown. For example, you can see Alec’s quiz score of 74, but not his quiz score of 58.
But what if you want to only display data for students who have an average quiz grade of 70 or more? Then you want to select on an average, not on individual rows. This is where the HAVING keyword comes in. You need to first group grades by student and then apply your selection criteria to an aggregate statistic based on the entire group. The following statement produces what we desire:
SELECT
Student AS 'Student',
AVG(Grade) AS 'Average Quiz Grade'
FROM Grades
WHERE GradeType = 'Quiz'
GROUP BY Student
HAVING AVG(Grade) >= 70
ORDER BY Student
The output is:
Student Average Quiz Grade
This SELECT has both a WHERE and a HAVING clause. The WHERE ensures that you only select rows with a GradeType of “Quiz.” The HAVING guarantees that you only select students with an average score of at least 70.
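To make the difference between the two clauses concrete, here is a minimal sketch. The rows below are hypothetical sample data, not the chapter's actual Grades table; only Alec's two quiz scores (74 and 58) come from the text above. The student Beth, the homework row, and the column types are assumptions made for illustration. The first query filters individual rows, and the second filters whole groups:

```sql
-- Hypothetical sample table; column types are assumptions for illustration.
CREATE TABLE Grades (
    Student   VARCHAR(20),
    GradeType VARCHAR(20),
    Grade     INTEGER
);

-- Alec's quiz scores are from the text; Beth and her rows are made up.
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Alec', 'Quiz', 74);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Alec', 'Quiz', 58);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Beth', 'Quiz', 88);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Beth', 'Quiz', 80);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Beth', 'Homework', 60);

-- Row-level filter: WHERE tests each row on its own,
-- so Alec's 74 is kept and his 58 is dropped.
SELECT Student, Grade
FROM Grades
WHERE GradeType = 'Quiz'
AND Grade >= 70
ORDER BY Student, Grade;

-- Group-level filter: WHERE first keeps only quiz rows,
-- then HAVING tests each student's average quiz grade.
SELECT Student, AVG(Grade) AS AvgQuizGrade
FROM Grades
WHERE GradeType = 'Quiz'
GROUP BY Student
HAVING AVG(Grade) >= 70
ORDER BY Student;
```

With these rows, the first query returns Alec's 74 along with both of Beth's quiz scores, while the second returns only Beth, because Alec's quiz average of 66 falls below 70. How the average is formatted (84, 84.0, and so on) depends on the database.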
What if you wanted to add a column with the GradeType value? If you attempt to add GradeType to the SELECT columnlist, the statement will produce an error. This is because all columns must be either listed in the GROUP BY or involved in an aggregation. If you want to show the GradeType column, it must be added to the GROUP BY clause, as follows:
SELECT
Student AS 'Student',
GradeType AS 'Grade Type',
AVG(Grade) AS 'Average Grade'
FROM Grades
WHERE GradeType = 'Quiz'
GROUP BY Student, GradeType
HAVING AVG(Grade) >= 70
ORDER BY Student
The resulting data is:
Student Grade Type Average Grade
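Grouping by more than one column produces one output row for each distinct combination of values in those columns. In the statement above, the WHERE clause leaves only quiz rows, so each student still appears at most once. The sketch below drops the WHERE and the HAVING to make every group visible. The sample rows and column types are hypothetical; only Alec's quiz scores of 74 and 58 are taken from the text.

```sql
-- Hypothetical sample table; column types are assumptions for illustration.
CREATE TABLE Grades (
    Student   VARCHAR(20),
    GradeType VARCHAR(20),
    Grade     INTEGER
);

-- Alec's quiz scores are from the text; every other row is made up.
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Alec', 'Quiz', 74);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Alec', 'Quiz', 58);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Alec', 'Homework', 95);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Beth', 'Quiz', 88);
INSERT INTO Grades (Student, GradeType, Grade) VALUES ('Beth', 'Homework', 60);

-- One row per (Student, GradeType) combination.
-- GradeType may appear in the column list because it is in the GROUP BY.
SELECT Student, GradeType, AVG(Grade) AS AvgGrade
FROM Grades
GROUP BY Student, GradeType
ORDER BY Student, GradeType;
```

With these rows, the query returns four groups: Alec's homework (95), Alec's quizzes (66), Beth's homework (60), and Beth's quiz (88). Note that not every database rejects a nonaggregated column that is missing from the GROUP BY. SQLite, for example, accepts such a statement and returns a value from an arbitrary row in the group, so listing the column in the GROUP BY remains the safer habit.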
Now that we’ve added the HAVING clause to the mix, let’s recap the general format of the SELECT statement:
SELECT columnlist
FROM tablelist
WHERE condition
GROUP BY columnlist
HAVING condition
ORDER BY columnlist
It should be emphasized that, when employing any of the above keywords in a SELECT, they need to be entered in the order shown. For example, the HAVING keyword always needs to come after a GROUP BY but before an ORDER BY.
Looking Ahead
In this chapter, we covered several forms of aggregation, starting with the simplest, that of eliminating duplicates. We then introduced a number of aggregate functions, which are a different class of functions from the scalar functions seen in Chapter 4. The real power of aggregate functions becomes apparent when they are used in conjunction with the GROUP BY keyword, which allows for true aggregation of data into groups. Finally, we covered the HAVING keyword, which allows you to apply group-level selection criteria to values in aggregate functions.
In our next chapter, “Combining Tables with an Inner Join,” we’re going to begin our exploration of a key topic in SQL, the ability to access data from multiple tables. Up until now, all SELECT queries have been against a single table. In the real world, this is an unrealistic scenario. The true value of relational databases lies in their ability to utilize multiple tables with related data. Seldom would one require data from only a single table.
The topic of accessing data from multiple tables will be directly addressed in Chapters 11 and 12. Chapter 11 covers the inner join and Chapter 12 looks at the outer join. Subsequently, Chapters 13 through 15 will explore variations on the same theme. After you complete the next five chapters, you will have mastered the essential techniques of obtaining data from multiple tables.
Chapter 10 ■ Summarizing Data
Chapter 11
Combining Tables with an Inner Join
Back in Chapter 1, we talked about the great advance of relational databases over their predecessors. The significant achievement of relational databases was their ability to allow data to be organized into any number of tables that are related but at the same time independent of each other. Unlike earlier databases, the relationships between tables in relational databases are not explicitly defined by a series of pointers. Instead, relationships are inferred by columns that tables have in common. Sometimes, these relationships are formalized by the definition of primary and foreign keys, but this isn’t always necessary.
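As a rough illustration of how a shared column can be formalized, the sketch below declares a primary key in one table and a matching foreign key in another. The two tables are hypothetical: the column names follow those used in this book's Customers and Orders examples, but the data types and the presence of a CustomerID column in Orders are assumptions.

```sql
-- Hypothetical table definitions; data types are assumptions for illustration.
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,   -- uniquely identifies each customer
    FirstName  VARCHAR(50),
    LastName   VARCHAR(50)
);

CREATE TABLE Orders (
    OrderID           INTEGER PRIMARY KEY,
    CustomerID        INTEGER,        -- the column the two tables have in common
    QuantityPurchased INTEGER,
    PricePerItem      DECIMAL(10, 2),
    -- Formalizes the relationship: each order refers to an existing customer.
    FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
);
```

Even without the FOREIGN KEY line, the two tables could still be related through their CustomerID columns. The constraint only asks the database to enforce that relationship.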
The great virtue of relational databases lies in the fact that someone can analyze business entities and then create an appropriate database design, which allows for maximum flexibility.
Let’s look at a common example. Most organizations have a business entity known as the “customer.” As such, it is typical for a database to contain a Customers table that defines each customer. Such a table would normally contain a primary key to uniquely identify each customer and any number of columns with attributes describing the customer. Common attributes might include phone number, address, city, state, and so on.
The main idea is that all information about the customer is stored in a single table and only in that table. This simplifies the task of data updates. When a customer changes his phone number, there is only one table that needs to be updated. However, the downside to this setup is that whenever someone needs any information about a customer, that person needs to access the Customers table to retrieve the information.
This brings us to the concept of a join. Let’s say that someone is analyzing products that have been purchased. Along with information about the products, it is often necessary to provide information about the customers who purchased each product. For example, an analyst may desire to obtain customer ZIP codes for a geographic analysis. The ZIP code is only stored in the Customers table. Product information is stored in a Products table. To get information from both customers and products, the tables must be joined together in such a way that the information matches correctly.
In essence, the promise of relational databases is fulfilled by the ability to join tables together in any desired manner.
Joining Two Tables
To begin our exploration of the join process, let’s revisit the Orders table that we first encountered in Chapter 3:
OrderID FirstName LastName QuantityPurchased PricePerItem
The use of this table in earlier chapters was somewhat misleading. In reality, a competent database designer would never create a table such as this. The problem is that it contains information about two separate entities: customers and orders. In the real world, the information would be split into at least two separate tables. A Customers table might look like this:
CustomerID FirstName LastName
Chapter 11 ■ Combining Tables with an Inner Join