Include or Exclude Grouped Rows
Using the HAVING Clause
• Clustering rows using a common grouping attribute with the GROUP BY clause and applying an aggregate function to each of these groups returns group-level results.
• The HAVING clause provides the language to limit the group-level results returned.
• The HAVING clause may only be specified if there is a GROUP BY clause present.
• All grouping is performed and group functions are executed prior to evaluating the HAVING clause.
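The points above can be sketched with a small runnable example. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Oracle, and the EMP table and its five rows are hypothetical data invented for the demonstration.

```python
import sqlite3

# Hypothetical EMP table held in an in-memory SQLite database (not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, 100), (10, 200), (20, 50), (20, 60), (30, 500)],
)

# Group-level results: one row per DEPT_ID, with SUM applied to each group.
all_groups = conn.execute(
    "SELECT dept_id, SUM(salary) FROM emp "
    "GROUP BY dept_id ORDER BY dept_id"
).fetchall()

# HAVING is evaluated after grouping and keeps only the qualifying groups.
kept_groups = conn.execute(
    "SELECT dept_id, SUM(salary) FROM emp "
    "GROUP BY dept_id HAVING SUM(salary) > 200 ORDER BY dept_id"
).fetchall()

print(all_groups)   # every group
print(kept_groups)  # only groups whose total exceeds 200
```

The second query returns a subset of the rows produced by the first, since HAVING filters whole groups rather than individual rows.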
Self Test
1. What result is returned by the following statement?
SELECT COUNT(*) FROM DUAL;
(Choose the best answer.)
A. NULL
B. 0
C. 1
D. None of the above
2. Choose one correct statement regarding group functions.
A. Group functions may only be used when a GROUP BY clause is present.
B. Group functions can operate on multiple rows at a time.
C. Group functions only operate on a single row at a time.
D. Group functions can execute multiple times within a single group.
3. What value is returned after executing the following statement?
SELECT SUM(SALARY) FROM EMPLOYEES;
Assume there are ten employee records and each contains a SALARY value of
100, except for one, which has a null value in the SALARY field. (Choose the best answer.)
A. 900
B. 1000
C. NULL
D. None of the above
4. Which values are returned after executing the following statement?
SELECT COUNT(*), COUNT(SALARY) FROM EMPLOYEES;
Assume there are ten employee records and each contains a SALARY value of
100, except for one, which has a null value in the SALARY field. (Choose all
that apply.)
A. 10 and 10
B. 10 and NULL
C. 10 and 9
D. None of the above
5. What value is returned after executing the following statement?
SELECT AVG(NVL(SALARY,100)) FROM EMPLOYEES;
Assume there are ten employee records and each contains a SALARY value
of 100, except for one employee, who has a null value in the SALARY field.
(Choose the best answer.)
A. NULL
B. 90
C. 100
D. None of the above
6. What value is returned after executing the following statement?
SELECT SUM((AVG(LENGTH(NVL(SALARY,0)))))
FROM EMPLOYEES
GROUP BY SALARY;
Assume there are ten employee records and each contains a SALARY value of
100, except for one, which has a null value in the SALARY field. (Choose the
best answer.)
A. An error is returned
B. 3
C. 4
D. None of the above
7. How many rows are returned by the following query?
SELECT SUM(SALARY), DEPARTMENT_ID FROM EMPLOYEES
GROUP BY DEPARTMENT_ID;
Assume there are 11 non-null and 1 null unique DEPARTMENT_ID values. All
records have a non-null SALARY value. (Choose the best answer.)
A. 12
B. 11
C. NULL
D. None of the above
8. What values are returned after executing the following statement?
SELECT JOB_ID, MAX_SALARY FROM JOBS GROUP BY MAX_SALARY;
Assume that the JOBS table has ten records with the same JOB_ID value of DBA and the same MAX_SALARY value of 100. (Choose the best answer.)
A. One row of output with the values DBA, 100
B. Ten rows of output with the values DBA, 100
C. An error is returned
D. None of the above
9. How many rows of data are returned after executing the following statement?
SELECT DEPT_ID, SUM(NVL(SALARY,100)) FROM EMP
GROUP BY DEPT_ID HAVING SUM(SALARY) > 400;
Assume the EMP table has ten rows and each contains a SALARY value of 100, except for one, which has a null value in the SALARY field. The first five rows have a DEPT_ID value of 10, while the second group of five rows, which includes the row with a null SALARY value, has a DEPT_ID value of 20. (Choose the best answer.)
A. Two rows
B. One row
C. Zero rows
D. None of the above
10. How many rows of data are returned after executing the following statement?
SELECT DEPT_ID, SUM(SALARY) FROM EMP
GROUP BY DEPT_ID HAVING SUM(NVL(SALARY,100)) > 400;
Assume the EMP table has ten rows and each contains a SALARY value of 100, except for one, which has a null value in the SALARY field. The first five rows have a DEPT_ID value of 10, while the second five rows, which include the row with a null SALARY value, have a DEPT_ID value of 20. (Choose the best answer.)
A. Two rows
B. One row
C. Zero rows
D. None of the above
Self Test Answers
1. ✓ C. The DUAL table has one row and one column. The COUNT(*)
function returns the number of rows in a table or group.
✗ A, B, and D.
2. ✓ B. By definition, group functions can operate on multiple rows at a time,
unlike single-row functions.
✗ A, C, and D. A group function may be used without a GROUP BY clause.
In this case, the entire dataset is operated on as a group. The COUNT function
is often executed against an entire table, which behaves as one group. D is
incorrect. Once a dataset has been partitioned into different groups, any
group functions execute once per group.
3. ✓ A. The SUM aggregate function ignores null values and adds non-null
values. Since nine rows contain the SALARY value 100, 900 is returned.
✗ B, C, and D. B would be returned if SUM(NVL(SALARY,100)) were
executed. C is a tempting choice, since regular arithmetic with NULL
values returns a NULL result. However, the aggregate functions, except
for COUNT(*), ignore NULL values.
4. ✓ C. COUNT(*) considers all rows, including those with NULL values,
while COUNT(SALARY) only considers the non-null rows.
✗ A, B, and D.
5. ✓ C. The NVL function converts the one NULL value into 100. Thereafter,
the average function adds the SALARY values and obtains 1000. Dividing this
by the number of records returns 100.
✗ A, B, and D. B would be returned if AVG(NVL(SALARY,0)) were selected.
It is interesting to note that if AVG(SALARY) were selected, 100 would have
also been returned, since the AVG function would sum the non-null values
and divide the total by the number of rows with non-null SALARY values. So
AVG(SALARY) would be calculated as: 900/9=100.
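The NULL handling described in answers 3 to 5 can be checked with a short sketch. It assumes an in-memory SQLite database in place of Oracle, a hypothetical EMPLOYEES table with only a SALARY column, and COALESCE standing in for Oracle's NVL, which SQLite does not provide.

```python
import sqlite3

# Ten hypothetical employee rows: nine salaries of 100 and one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [(100,)] * 9 + [(None,)],
)

def scalar(sql):
    """Run a query that returns one row and hand back that row as a tuple."""
    return conn.execute(sql).fetchone()

total = scalar("SELECT SUM(salary) FROM employees")[0]
counts = scalar("SELECT COUNT(*), COUNT(salary) FROM employees")
# COALESCE(salary, x) plays the role of NVL(SALARY, x) here.
avg_nvl_100 = scalar("SELECT AVG(COALESCE(salary, 100)) FROM employees")[0]
avg_nvl_0 = scalar("SELECT AVG(COALESCE(salary, 0)) FROM employees")[0]
avg_plain = scalar("SELECT AVG(salary) FROM employees")[0]

print(total, counts, avg_nvl_100, avg_nvl_0, avg_plain)
```

SUM and AVG skip the NULL salary, COUNT(*) counts every row, and substituting a value for the NULL changes the divisor that AVG uses.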
6. ✓ C. The dataset is segmented by the SALARY column. This creates two
groups: one with SALARY values of 100 and the other with a null SALARY
value. The average length of SALARY value 100 is 3 for the rows in the first
group. The NULL salary value is first converted into the number 0 by the NVL
function, and the average length of SALARY is 1. The SUM function operates
across the two groups, adding the values 3 and 1 and returning 4.
✗ A, B, and D. A seems plausible, since group functions may not be
nested more than two levels deep. Although there are four functions,
only two are group functions, while the others are single-row functions
evaluated before the group functions. B would be returned if the expression
SUM(AVG(LENGTH(SALARY))) were selected.
7. ✓ A. There are 12 distinct DEPARTMENT_ID values. Since this is
the grouping attribute, 12 groups are created, including 1 with a null
DEPARTMENT_ID value. Therefore, 12 rows are returned.
✗ B, C, and D.
8. ✓ C. Every item in the SELECT list that is not an argument of a group
function must appear in the GROUP BY clause. JOB_ID is selected but is not
a grouping attribute, so the statement returns an error.
✗ A, B, and D. These are incorrect, since the statement is invalid
and is disallowed by Oracle. Do not mistake the column named MAX_SALARY for the MAX(SALARY) function.
9. ✓ B. Two groups are created based on their common DEPT_ID values. The
group with DEPT_ID values of 10 consists of five rows with SALARY values
of 100 in each of them. Therefore, the SUM(SALARY) function returns 500 for this group, and it satisfies the HAVING SUM(SALARY) > 400 clause. The group with DEPT_ID values of 20 has four rows with SALARY values of 100 and one row with a NULL SALARY. SUM(SALARY) only returns 400, and this group does not satisfy the HAVING clause.
✗ A, C, and D. Beware of the SUM(NVL(SALARY,100)) expression in the
SELECT clause. This expression only determines what is displayed for each group. It does not restrict or limit the dataset in any way.
10. ✓ A. Two groups are created based on their common DEPT_ID values. The
group with DEPT_ID values of 10 consists of five rows with SALARY values of
100 in each of them. Therefore, the SUM(NVL(SALARY,100)) function returns
500 for this group and satisfies the HAVING SUM(NVL(SALARY,100))>400 clause. The group with DEPT_ID values of 20 has four rows with SALARY values of 100 and one row with a null SALARY. SUM(NVL(SALARY,100)) returns 500, and this group satisfies the HAVING clause. Therefore, two rows are returned.
✗ B, C, and D. Although the SELECT clause contains SUM(SALARY),
which returns 500 and 400 for the two groups, the HAVING clause contains the SUM(NVL(SALARY,100)) expression, which specifies the inclusion or exclusion criteria for a group-level row.
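The difference between questions 9 and 10 can be reproduced with the following sketch. It assumes an in-memory SQLite database rather than Oracle, uses COALESCE as a stand-in for NVL, and builds the EMP table exactly as the questions describe it.

```python
import sqlite3

# EMP as described in questions 9 and 10: five rows in department 10,
# five rows in department 20, one of which has a NULL salary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, 100)] * 5 + [(20, 100)] * 4 + [(20, None)],
)

# Question 9: the NULL substitution appears only in the SELECT list.
q9_rows = conn.execute(
    "SELECT dept_id, SUM(COALESCE(salary, 100)) FROM emp "
    "GROUP BY dept_id HAVING SUM(salary) > 400 ORDER BY dept_id"
).fetchall()

# Question 10: the NULL substitution appears only in the HAVING clause.
q10_rows = conn.execute(
    "SELECT dept_id, SUM(salary) FROM emp "
    "GROUP BY dept_id HAVING SUM(COALESCE(salary, 100)) > 400 "
    "ORDER BY dept_id"
).fetchall()

print(q9_rows)
print(q10_rows)
```

Only the expression in the HAVING clause decides which groups survive; the expression in the SELECT list decides what is reported for them.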
CHAPTER 12
SQL Joins
Exam Objectives
In this chapter you will learn to
• 051.6.1 Write SELECT Statements to Access Data from More Than One Table Using Equijoins and Nonequijoins
• 051.6.2 Join a Table to Itself Using a Self-Join
• 051.6.3 View Data That Does Not Meet a Join Condition Using Outer Joins
• 051.6.4 Generate a Cartesian Product of All Rows from Two or More Tables
The three pillars of relational theory are selection, projection, and joining. This chapter
focuses on the practical implementation of joining. Rows from different tables or views are associated with each other using joins. Support for joining has implications for the
way data is stored, and many data models such as third normal form or star schemas have emerged to exploit this feature.
Tables may be joined in several ways. The most common technique is called an
equijoin, where a row is associated with one or more rows in another table based on the equality of column values or expressions. Tables may also be joined using a nonequijoin,
where a row is associated with one or more rows in another table if its column values fall into a range determined by inequality operators.
A less common technique is to associate rows with other rows in the same table. This association is based on columns with logical and usually hierarchical relationships
with each other and is called a self-join. Rows with null or differing entries in common
join columns are excluded when equijoins and nonequijoins are performed. An outer join is available to fetch these one-legged or orphaned rows, if necessary.
A cross join or Cartesian product is formed when every row from one table is
joined to all rows in another. This join is often the result of missing or inadequate join conditions but is occasionally intentional.
Write SELECT Statements to Access Data
from More Than One Table Using Equijoins
and Nonequijoins
This section introduces the different types of joins in their primitive forms, outlining
the broad categories that are available before delving into an in-depth discussion of
the various join clauses. The modern ANSI-compliant and traditional Oracle syntaxes
are discussed, but emphasis is placed on the modern syntax. This section concludes with a discussion of nonequijoins and additional join conditions. Joining is described
by focusing on the following eight areas:
• Types of joins
• Joining tables using SQL:1999 syntax
• Qualifying ambiguous column names
• The NATURAL JOIN clause
• The natural JOIN USING clause
• The natural JOIN ON clause
• N-way joins and additional join conditions
• Nonequijoins
Types of Joins
Two basic joins are the equijoin and the nonequijoin. Joins may be performed between
multiple tables, but much of the following discussion will use two hypothetical tables
to illustrate the concepts and language of joins. The first table is called the source, and
the second is called the target. Rows in the source and target tables comprise one or
more columns. As an example, assume that the source and target are the COUNTRIES
and REGIONS tables from the HR schema, respectively.
The COUNTRIES table comprises three columns named COUNTRY_ID, COUNTRY_NAME,
and REGION_ID, while the REGIONS table comprises two columns named
REGION_ID and REGION_NAME. The data in these two tables is related via the
common REGION_ID column. Consider the following queries:
Query 1: select * from countries where country_id='CA';
Query 2: select region_name from regions where region_id='2';
Query 1 retrieves the column values associated with the row from the COUNTRIES
table with COUNTRY_ID='CA'. The REGION_ID value of this row is 2. Query 2 fetches
Americas as the region name from the REGIONS table for the row with REGION_ID=2,
thus identifying the one region in which Canada lies. Joining facilitates the retrieval of
column values from multiple tables using a single query.
The source and target tables can be swapped, so the REGIONS table could be the
source and the COUNTRIES table could be the target. Consider the following two queries:
Query 1: select * from regions where region_name='Americas';
Query 2: select country_name from countries where region_id='2';
Query 1 fetches one row with a REGION_ID value of 2. Joining in this reversed
manner allows the following question to be asked: What countries belong to the
Americas region? The answers from the second query are five countries named
Argentina, Brazil, Canada, Mexico, and the United States of America. These results
may be obtained from a single query that joins the tables together.
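As a rough sketch of how each pair of queries collapses into one joined query, the following example uses an in-memory SQLite database instead of Oracle. The three country rows and two region rows are a hypothetical subset chosen for illustration, not the full HR schema data.

```python
import sqlite3

# Hypothetical cut-down copies of the HR REGIONS and COUNTRIES tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    CREATE TABLE countries (
        country_id TEXT, country_name TEXT, region_id INTEGER);
    INSERT INTO regions VALUES (1, 'Europe'), (2, 'Americas');
    INSERT INTO countries VALUES
        ('CA', 'Canada', 2), ('BR', 'Brazil', 2), ('DE', 'Germany', 1);
""")

# COUNTRIES as the source: in which region does Canada lie?
canada_region = conn.execute(
    "SELECT r.region_name FROM countries c "
    "JOIN regions r ON c.region_id = r.region_id "
    "WHERE c.country_id = 'CA'"
).fetchall()

# REGIONS as the source: which countries belong to the Americas?
americas_countries = conn.execute(
    "SELECT c.country_name FROM regions r "
    "JOIN countries c ON r.region_id = c.region_id "
    "WHERE r.region_name = 'Americas' ORDER BY c.country_name"
).fetchall()

print(canada_region)
print(americas_countries)
```

Each joined query replaces a lookup of the REGION_ID value followed by a second query that uses it.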
Natural Joins
The natural join is implemented using three possible join clauses that use the
following keywords in different combinations: NATURAL JOIN, USING, and ON.
When the source and target tables share identically named columns, it is possible
to perform a natural join between them without specifying a join column. This is
sometimes referred to as a pure natural join. In this scenario, columns with the same
names in the source and target tables are automatically associated with each other.
Rows with matching column values in both tables are retrieved. The REGIONS and
COUNTRIES tables both share a commonly named column: REGION_ID. They may
be naturally joined without specifying join columns, as shown in the first two queries
in Figure 12-1.
The NATURAL JOIN keywords instruct Oracle to identify columns with identical
names between the source and target tables. Thereafter, a join is implicitly performed
between them. In the first query, the REGION_ID column is identified as the only
commonly named column in both tables. REGIONS is the source table and appears
after the FROM keyword. The target table is therefore COUNTRIES. For each row in the
REGIONS table, a match for the REGION_ID value is sought from all the rows in the
COUNTRIES table. An interim result set is constructed containing rows matching the
join condition. This set is then restricted by the WHERE clause. In this case, because
the COUNTRY_NAME must be Canada, the REGION_NAME Americas is returned.
The second query shows a natural join where COUNTRIES is the source table. The REGION_ID value for each row in the COUNTRIES table is identified and a search for
a matching row in the REGIONS table is initiated. If matches are found, the interim results are limited by any WHERE conditions. The COUNTRY_NAME from rows with Americas as their REGION_NAME is returned.
Sometimes more control must be exercised regarding which columns to use for joins. When the source and target tables contain identically named columns that you
want to exclude as join columns, the JOIN USING format may be used. Remember
that Oracle does not impose any rules stating that columns with the same name in two discrete tables must have a relationship with each other. The third query explicitly specifies that the REGIONS table be joined to the COUNTRIES table based on common
values in their REGION_ID columns. This syntax allows natural joins to be formed on
specific columns instead of on all commonly named columns.
The fourth query demonstrates the JOIN ON format of the natural join, which
allows join columns to be explicitly stated. This format does not depend on the columns
in the source and target tables having identical names. This form is more general and
is the most widely used natural join format.
Figure 12-1 Natural joins
TIP Be wary when using pure natural joins, since database designers may
assign the same name to key or unique columns. These columns may have
names like ID or SEQ_NO. If a pure natural join is attempted between such
tables, ambiguous and unexpected results may be returned.
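The three natural join formats can be compared side by side in the sketch below. The queries are written in the spirit of the discussion above and are not reproductions of Figure 12-1. The example assumes an in-memory SQLite database rather than Oracle, and the table contents are a hypothetical subset of the HR data.

```python
import sqlite3

# Hypothetical cut-down copies of the HR REGIONS and COUNTRIES tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    CREATE TABLE countries (
        country_id TEXT, country_name TEXT, region_id INTEGER);
    INSERT INTO regions VALUES (1, 'Europe'), (2, 'Americas');
    INSERT INTO countries VALUES
        ('CA', 'Canada', 2), ('BR', 'Brazil', 2), ('DE', 'Germany', 1);
""")

def run(sql):
    """Execute a query and return all of its rows."""
    return conn.execute(sql).fetchall()

# Pure natural join: REGION_ID is matched implicitly by name.
pure = run(
    "SELECT region_name FROM regions NATURAL JOIN countries "
    "WHERE country_name = 'Canada'"
)

# JOIN USING: the join column is named explicitly.
using = run(
    "SELECT region_name FROM regions JOIN countries USING (region_id) "
    "WHERE country_name = 'Canada'"
)

# JOIN ON: the join condition is spelled out in full.
on = run(
    "SELECT region_name FROM regions JOIN countries "
    "ON (regions.region_id = countries.region_id) "
    "WHERE country_name = 'Canada'"
)

print(pure, using, on)
```

All three forms return the same row here because REGION_ID is the only column name the two tables share. If the tables also shared an unrelated column such as ID, the pure natural join would silently add it to the join condition, which is the risk the TIP describes.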
Outer Joins
Not all tables share a perfect relationship, where every record in the source table
can be matched to at least one row in the target table. It is occasionally required that
rows with nonmatching join column values also be retrieved by a query. Suppose the
EMPLOYEES and DEPARTMENTS tables are joined with common DEPARTMENT_ID
values. EMPLOYEES records with null DEPARTMENT_ID values are excluded, along
with records whose DEPARTMENT_ID values are absent from the DEPARTMENTS table. An outer join fetches these rows.
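The effect can be sketched as follows, assuming an in-memory SQLite database in place of Oracle. The four employees, the two departments, and the department number 99 are hypothetical values invented for the illustration.

```python
import sqlite3

# Hypothetical data: Chen has no department and Diaz refers to
# department 99, which does not exist in DEPARTMENTS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (
        department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (last_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Admin'), (20, 'Sales');
    INSERT INTO employees VALUES
        ('Abel', 10), ('Baer', 20), ('Chen', NULL), ('Diaz', 99);
""")

# Inner join: rows without a matching DEPARTMENT_ID are dropped.
inner_rows = conn.execute(
    "SELECT e.last_name, d.department_name FROM employees e "
    "JOIN departments d ON e.department_id = d.department_id "
    "ORDER BY e.last_name"
).fetchall()

# Left outer join: every EMPLOYEES row is kept, with NULL placeholders
# where no DEPARTMENTS row matches.
outer_rows = conn.execute(
    "SELECT e.last_name, d.department_name FROM employees e "
    "LEFT OUTER JOIN departments d "
    "ON e.department_id = d.department_id "
    "ORDER BY e.last_name"
).fetchall()

print(inner_rows)
print(outer_rows)
```

The outer join returns the two orphaned employees in addition to the matched rows, with a null department name for each.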
Cross Joins
A cross join or Cartesian product derives its name from mathematics, where it is also
referred to as a cross product between two sets or matrices. This join creates one row
of output for every combination of source and target table rows.
If the source and target tables have three and four rows, respectively, a cross join
between them results in (3 × 4 = 12) rows being returned. Consider the row counts
retrieved from the queries in Figure 12-2.
Figure 12-2 Cross join
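The 3 × 4 arithmetic can be verified with a minimal sketch. It assumes an in-memory SQLite database rather than Oracle, and the SRC and TGT tables are hypothetical stand-ins for the source and target tables.

```python
import sqlite3

# Hypothetical source table with three rows and target table with four.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER);
    CREATE TABLE tgt (code TEXT);
    INSERT INTO src VALUES (1), (2), (3);
    INSERT INTO tgt VALUES ('a'), ('b'), ('c'), ('d');
""")

src_count = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
tgt_count = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]

# Every source row is paired with every target row.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM src CROSS JOIN tgt"
).fetchone()[0]

print(src_count, tgt_count, cross_count)
```

The cross join count equals the product of the two table counts because no join condition removes any combination.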