OCA/OCP Oracle Database 11g All-in-One Exam Guide
9. ✓ A, B, and D. Changes to any of these will generate redo.
✗ C. Changes to temporary segments do not generate redo.
10. ✓ A and B. Both DDL and access control commands include a COMMIT.
✗ C and D. C is wrong because a savepoint is only a marker within a transaction. D is wrong because this is a SQL*Plus command that acts locally on the user process; it has no effect on an active transaction.
11. ✓ C. Triggers cannot be packaged.
✗ A, B, and D. A and B are wrong because functions and procedures can be packaged. D is wrong because neither anonymous blocks nor triggers can be packaged.
12. ✓ C. This correctly describes the operation of the enqueue mechanism.
✗ A, B, and D. A is wrong because locks are granted sequentially, not randomly. B is wrong because the shared locks apply to the object; row locks must be exclusive. D is wrong because this is more like a description of how deadlocks are managed.
13. ✓ C. All DML occurs in the database buffer cache, and changes to both data blocks and undo blocks are protected by redo.
✗ A, B, and D. A is wrong because writing to disk is independent of executing the statement. B and D are incomplete: redo protects changes to both data blocks and undo blocks.
14. ✓ B. This is the option that would require downtime, because the datafile would have to be taken offline during the move and you cannot take it offline while the database is open.
✗ A, C, and D. These are wrong because they are all operations that can be carried out during normal running without end users being aware.
15. ✓ C. To calculate, take the largest figure for UNDOBLKS, which is for a ten-minute period. Divide by 600 to get the rate of undo generation in blocks per second, and multiply by the block size to get the figure in bytes. Multiply by the largest figure for MAXQUERYLEN, to find the space needed if the highest rate of undo generation coincided with the longest query, and divide by a billion to get the answer in gigabytes:
237014 / 600 * 4192 * 1740 / 1000000000 = 2.9 (approximately)
✗ A, B, and D. The following algorithm should be followed when sizing an undo tablespace: Calculate the rate at which undo is being generated at your peak workload, and multiply by the length of your longest query.
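The arithmetic in answer 15 can be checked with a query against DUAL. This is only a sketch: the literal values are the ones quoted in the answer, the column aliases are illustrative names, and the second query assumes the session has privileges to read V$UNDOSTAT.

```sql
-- Peak undo blocks per ten-minute interval, converted to blocks per second,
-- then to bytes per second, multiplied by the longest query in seconds,
-- and finally divided by a billion to express the result in gigabytes.
SELECT ROUND(237014 / 600 * 4192 * 1740 / 1000000000, 1) AS undo_gb
FROM   dual;

-- The same two peak figures read from V$UNDOSTAT; the result is in blocks,
-- so it still has to be multiplied by the database block size.
SELECT MAX(undoblks) / 600 * MAX(maxquerylen) AS undo_blocks_needed
FROM   v$undostat;
```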
CHAPTER 9
Retrieving, Restricting, and Sorting Data Using SQL
Exam Objectives
In this chapter you will learn to
• 051.1.1 List the Capabilities of SQL SELECT Statements
• 051.1.2 Execute a Basic SELECT Statement
• 051.2.1 Limit the Rows Retrieved by a Query
• 051.2.2 Sort the Rows Retrieved by a Query
• 051.2.3 Use Ampersand Substitution
This chapter contains several sections that are not directly tested by the exam but are considered prerequisite knowledge for every student. Two tools used extensively for exercises are SQL*Plus and SQL Developer, which are covered in Chapter 2. Oracle specialists use these every day in their work. The exercises and many of the examples are based on two demonstration sets of data. The first, known as the HR schema, is supplied by Oracle, while the second, the WEBSTORE schema, is designed, created, and populated later in this chapter. There are instructions on how to launch the tools and create the demonstration schemas.
The exam-testable sections cover the concepts behind the relational paradigm, the normalization of data into relational structures, and the retrieval of data stored in relational tables using the SELECT statement. The statement is introduced in its basic form and is progressively built on to extend its core functionality. This chapter also discusses the WHERE clause, which specifies one or more conditions that the Oracle server evaluates to restrict the rows returned by the statement. A further language enhancement is introduced by the ORDER BY clause, which provides data sorting capabilities. The chapter closes by discussing ampersand substitution: a mechanism that provides a way to reuse the same statement to execute different queries by substituting query elements at runtime.
List the Capabilities of SQL SELECT Statements
Knowing how to retrieve data in a set format using a query language is the first step toward understanding the capabilities of SELECT statements. Describing the relations involved provides a tangible link between the theory of how data is stored in tables and the practical visualization of the structure of these tables. These topics form an important precursor to the discussion of the capabilities of the SELECT statement. The three primary areas explored are as follows:
• Introducing the SQL SELECT statement
• The DESCRIBE table command
• Capabilities of the SELECT statement
Introducing the SQL SELECT Statement
The SELECT statement from Structured Query Language (SQL) is arguably the single most powerful nonspoken language construct. It is an elegant, flexible, and highly extensible mechanism created to retrieve information from a database table. A database would serve little purpose if it could not be queried to answer all sorts of interesting questions. For example, you may have a database that contains personal financial records like your bank statements, your utility bills, and your salary statements. You could easily ask the database for a date-ordered list of your electrical utility bills for the last six months or query your bank statement for a list of payments made to a certain account over the same period. The beauty of the SELECT statement is encapsulated in its simple, English-like format that allows questions to be asked of the database in a natural manner.
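As a rough illustration of that English-like format, the utility bill question might be phrased as follows. The UTILITY_BILLS table and its columns are hypothetical names invented for this sketch; they are not part of the HR or WEBSTORE schemas.

```sql
-- UTILITY_BILLS, BILL_DATE, AMOUNT, and UTILITY_TYPE are hypothetical names.
-- The query asks for electricity bills from the last six months, oldest first.
SELECT bill_date, amount
FROM   utility_bills
WHERE  utility_type = 'ELECTRICITY'
AND    bill_date >= ADD_MONTHS(SYSDATE, -6)
ORDER  BY bill_date;
```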
The DESCRIBE Table Command
To get the answers one seeks, one must ask the correct questions. An understanding of the terms of reference, which in this case are relational tables, is essential for the formulation of the correct questions. A structural description of a table is useful to establish what questions can be asked of it. The Oracle server stores information about all tables in a special set of relational tables called the data dictionary, in order to manage them. The data dictionary is quite similar to a regular language dictionary. It stores definitions of database objects in a centralized, ordered, and structured format. The data dictionary is discussed in detail in Chapter 1.
A clear distinction must be drawn between storing the definition and the contents of a table. The definition of a table includes information like table name, table owner, details about the columns that compose the table, and its physical storage size on disk. This information is also referred to as metadata. The contents of a table are stored in rows and are referred to as data.
The structural metadata of a table may be obtained by querying the database for the list of columns that compose it using the DESCRIBE command. The general form of the syntax for this command is
DESC[RIBE] <SCHEMA>.tablename
The parts of this command are as follows. The DESCRIBE keyword can be shortened to DESC. All tables belong to a schema or owner. If you are describing a table that belongs to the schema to which you have connected, the <SCHEMA> portion of the command may be omitted. Figure 9-1 shows how the EMPLOYEES table is described from SQL*Plus after connecting to the database as the HR user with the DESCRIBE EMPLOYEES command, and how the DEPARTMENTS table is described using the shorthand notation: DESC HR.DEPARTMENTS. The HR notational prefix could be omitted, since the DEPARTMENTS table belongs to the HR schema. The HR schema (and every other schema) has access to a special table called DUAL, which belongs to the SYS schema. This table can be structurally described with the command DESCRIBE SYS.DUAL.
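The three commands discussed above can be entered in SQL*Plus as shown in this short sketch, which assumes a session connected as the HR user. The final query is an assumed alternative, not taken from the text: it reads similar column metadata from the USER_TAB_COLUMNS data dictionary view.

```sql
-- SQL*Plus commands; DESCRIBE does not need a terminating semicolon.
DESCRIBE employees
DESC hr.departments
DESCRIBE sys.dual

-- Assumed alternative: list the column names, data types, and nullability
-- of EMPLOYEES directly from the data dictionary.
SELECT column_name, data_type, nullable
FROM   user_tab_columns
WHERE  table_name = 'EMPLOYEES'
ORDER  BY column_id;
```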
Describing tables yields interesting and useful results. You know which columns of a table can be selected, since their names are exposed. You also know the nature of the data contained in these columns, since the column data type is exposed. Chapter 7 details column types.
Mandatory columns, which are forced to store data for each row, are exposed by the “Null?” column output produced by the DESCRIBE command having the value NOT NULL. You are guaranteed that any column restricted by the NOT NULL constraint contains some data. It is important to note that NULL has special meaning for the Oracle server. NULL refers to an absence of data. Blank spaces do not count as NULL, since they are present in the row and have some length even though they are not visible.
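The difference between NULL and a blank space can be seen with two small queries. This is a sketch that assumes the standard HR schema, in which the COMMISSION_PCT column of EMPLOYEES is optional; the column aliases are illustrative names.

```sql
-- Rows in which no commission has been recorded at all.
SELECT last_name, commission_pct
FROM   employees
WHERE  commission_pct IS NULL;

-- A single blank space is present and has length 1, so it is not NULL.
SELECT LENGTH(' ') AS space_length,
       CASE WHEN ' ' IS NULL THEN 'NULL' ELSE 'NOT NULL' END AS space_is
FROM   dual;
```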
Capabilities of the SELECT Statement
Relational database tables are built on a mathematical foundation called relational theory. In this theory, relations or tables are operated on by a formal language called relational algebra. Relational algebra uses some specialized terms: relations store tuples, which have attributes. Or in Oracle-speak, tables store rows, which have columns. SQL is a commercial interpretation of the relational algebra constructs. Three concepts from relational theory encompass the capability of the SELECT statement: projection, selection, and joining.
Projection refers to the restriction of columns selected from a table. When requesting information from a table, you can ask to view all the columns. You can retrieve all data from the HR.DEPARTMENTS table with a simple SELECT statement. This query will return DEPARTMENT_ID, DEPARTMENT_NAME, MANAGER_ID, and LOCATION_ID information for every department record stored in the table. What if you wanted a list containing only the DEPARTMENT_NAME and MANAGER_ID columns? Well, you would request just those two columns from the table. This restriction of columns is called projection.
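The two requests described in this paragraph might be written as follows. This is a minimal sketch against the HR.DEPARTMENTS table named in the text.

```sql
-- All columns: DEPARTMENT_ID, DEPARTMENT_NAME, MANAGER_ID, and LOCATION_ID.
SELECT *
FROM   hr.departments;

-- Projection: only the two requested columns are returned.
SELECT department_name, manager_id
FROM   hr.departments;
```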
Selection refers to the restriction of the rows selected from a table. It is often not desirable to retrieve every row from a table. Tables may contain many rows, and instead of requesting all of them, selection provides a means to restrict the rows returned. Perhaps you have been asked to identify only the employees who belong to department 30. With selection it is possible to limit the result set to those rows of data with a DEPARTMENT_ID value of 30.
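The department 30 request could be sketched like this, using the WHERE clause that is covered later in the chapter.

```sql
-- Selection: only rows whose DEPARTMENT_ID is 30 are returned.
SELECT *
FROM   hr.employees
WHERE  department_id = 30;
```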
Joining, as a relational concept, refers to the interaction of tables with each other in a query. Third normal form presents the notion of separating different types of data into autonomous tables to avoid duplication and maintenance anomalies and to associate related data using primary and foreign key relationships. These relationships provide the mechanism to join tables with each other (discussed in Chapter 12).
Figure 9-1 Describing EMPLOYEES, DEPARTMENTS, and DUAL tables
Assume there is a need to retrieve the e-mail addresses for employees who work in the Sales department. The EMAIL column belongs to the EMPLOYEES table, while the DEPARTMENT_NAME column belongs to the DEPARTMENTS table. Projection and selection from the DEPARTMENTS table may be used to obtain the DEPARTMENT_ID value that corresponds to the Sales department. The matching rows in the EMPLOYEES table may be joined to the DEPARTMENTS table based on this common DEPARTMENT_ID value. The EMAIL column may then be projected from this set of results.
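The steps just described can be sketched in SQL. The join syntax is covered properly in Chapter 12; the table aliases E and D are illustrative choices, and the department name is assumed to be stored as 'Sales'.

```sql
-- Step 1: projection and selection on DEPARTMENTS to find the
-- DEPARTMENT_ID that corresponds to the Sales department.
SELECT department_id
FROM   hr.departments
WHERE  department_name = 'Sales';

-- Steps 2 and 3: join EMPLOYEES to DEPARTMENTS on the common
-- DEPARTMENT_ID value, then project only the EMAIL column.
SELECT e.email
FROM   hr.employees e
JOIN   hr.departments d ON e.department_id = d.department_id
WHERE  d.department_name = 'Sales';
```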
The SQL SELECT statement is mathematically governed by these three tenets. An unlimited combination of projections, selections, and joins provides the language to extract the relational data required.
EXAM TIP The three concepts of projection, selection, and joining, which form the underlying basis for the capabilities of the SELECT statement, are usually tested in the exam. You may be asked to choose the correct three fundamental concepts or to choose a statement that demonstrates one or more of these concepts.
Data Normalization
The process of modeling data into relational tables is known as normalization. There are commonly said to be three levels of normalization: the first, second, and third normal forms. There are higher levels of normalization: fourth and fifth normal forms are well defined, but not commonly used. It is possible for SQL to address un-normalized data, but this will usually be inefficient, as that is not what the language is designed to do. In most cases, data stored in a relational database and accessed with SQL should be normalized to the third normal form.
TIP There are often several possible normalized models for an application. It is important to use the most appropriate one; if the systems analyst gets this wrong, the implications can be serious for performance, storage needs, and development effort.
As an example of normalization, consider an un-normalized table called BOOKS that stores details of books, authors, and publishers, using the ISBN number as the primary key. A primary key is the one attribute (or combination of attributes) that can uniquely identify a record. These are two entries:
ISBN | Title | Authors | Publisher
12345 | Oracle 11g OCP SQL Fundamentals 1 Exam Guide | John Watson, Roopesh Ramklass | McGraw-Hill, Spear Street, San Francisco, CA 94105
67890 | Oracle 11g New Features Exam Guide | Sam Alapati | McGraw-Hill, Spear Street, San Francisco, CA 94105
Storing the data in this table gives rise to several anomalies. First, here is the insertion anomaly: it is impossible to enter details of authors who are not yet published, because there will be no ISBN number under which to store them. Second, a book cannot be deleted without losing the details of the publisher: a deletion anomaly. Third, if a publisher’s address changes, it will be necessary to update the rows for every book it has published: an update anomaly. Furthermore, it will be very difficult to identify every book written by one author. The fact that a book may have several authors means that the “author” field must be multivalued, and a search will have to search all the values. Related to this is the problem of having to restructure the table if a book comes along with more authors than the original design can handle. Also, the storage is very inefficient due to replication of address details across rows, and the possibility of error as this data is repeatedly entered is high. Normalization should solve all these issues.
The first normal form removes the repeating groups, in this case the multiple authors: pull them out into a separate table called AUTHORS. The data structures will now look like the following.
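Two rows in the BOOKS table:
ISBN | Title | Publisher
12345 | Oracle 11g OCP SQL Fundamentals 1 Exam Guide | McGraw-Hill, Spear Street, San Francisco, California
67890 | Oracle 11g New Features Exam Guide | McGraw-Hill, Spear Street, San Francisco, California
And three rows in the AUTHORS table:
Name | ISBN
John Watson | 12345
Roopesh Ramklass | 12345
Sam Alapati | 67890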
The one row in the BOOKS table is now linked to two rows in the AUTHORS table. This solves the insertion anomaly (there is no reason not to insert as many unpublished authors as necessary), the retrieval problem of identifying all the books by one author (one can search the AUTHORS table on just one name), and the problem of a fixed maximum number of authors for any one book (simply insert as many AUTHORS as are needed).
This is the first normal form: no repeating groups.
The second normal form removes columns from the table that are not dependent on the primary key. In this example, that is the publisher’s address details: these are dependent on the publisher, not the ISBN. The BOOKS table and a new PUBLISHERS table will then look like this:
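BOOKS
ISBN | Title | Publisher
12345 | Oracle 11g OCP SQL Fundamentals 1 Exam Guide | McGraw-Hill
67890 | Oracle 11g New Features Exam Guide | McGraw-Hill
PUBLISHERS
Publisher | Street | City | State
McGraw-Hill | Spear Street | San Francisco | California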
All the books published by one publisher will now point to a single record in PUBLISHERS. This solves the problem of storing the address many times, and it also solves the consequent update anomalies and the data consistency errors caused by inaccurate multiple entries.
Third normal form removes all columns that are interdependent. In the PUBLISHERS table, this means the address columns: the street exists in only one city, and the city can be in only one state; one column should do, not three. This could be achieved by adding an address code, pointing to a separate address table:
PUBLISHERS: Publisher, Address code
ADDRESSES: Address code, Street, City, State
One characteristic of normalized data that should be emphasized now is the use of primary keys and foreign keys. A primary key is the unique identifier of a row in a table, either one column or a concatenation of several columns (known as a composite key). Every table should have a primary key defined. This is a requirement of the relational paradigm. Note that the Oracle database deviates from this standard: it is possible to define tables without a primary key, though it is usually not a good idea, and some other RDBMSs do not permit this.
A foreign key is a column (or a concatenation of several columns) that can be used to identify a related row in another table. A foreign key in one table will match a primary key in another table. This is the basis of the many-to-one relationship. A many-to-one relationship is a connection between two tables, where many rows in one table refer to a single row in another table. This is sometimes called a parent-child relationship: one parent can have many children. In the BOOKS example so far, the keys are as follows:
BOOKS: Foreign key: Publisher
AUTHORS: Foreign key: ISBN
PUBLISHERS: Foreign key: Address code
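One way these keys could be declared is sketched below. The column names, data types, sizes, and the composite primary key on AUTHORS are all assumptions made for illustration; the text itself specifies only that ISBN identifies a book and that the three foreign keys listed above exist.

```sql
-- Illustrative DDL only: names, data types, and sizes are assumptions.
-- Parent tables are created before the tables that reference them.
CREATE TABLE addresses (
  address_code  NUMBER         PRIMARY KEY,
  street        VARCHAR2(60),
  city          VARCHAR2(40),
  state         VARCHAR2(40)
);

CREATE TABLE publishers (
  publisher     VARCHAR2(60)   PRIMARY KEY,
  address_code  NUMBER         REFERENCES addresses (address_code)
);

CREATE TABLE books (
  isbn          NUMBER         PRIMARY KEY,
  title         VARCHAR2(120)  NOT NULL,
  publisher     VARCHAR2(60)   REFERENCES publishers (publisher)
);

-- Composite primary key (an assumption): one row per author per book.
CREATE TABLE authors (
  name          VARCHAR2(60),
  isbn          NUMBER         REFERENCES books (isbn),
  CONSTRAINT authors_pk PRIMARY KEY (name, isbn)
);
```

Each REFERENCES clause is the many side of a many-to-one relationship: many BOOKS rows can point to one PUBLISHERS row, and many AUTHORS rows can point to one BOOKS row.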
These keys define relationships such as that one book can have several authors.
There are various standards for documenting normalized data structures, developed by different organizations as structured formal methods. Generally speaking, it really doesn’t matter which method one uses as long as everyone reading the documents understands it. Part of the documentation will always include a listing of the attributes that make up each entity (also known as the columns that make up each table) and an entity-relationship diagram representing graphically the foreign to primary key connections. A widely used standard is as follows:
• Primary key columns identified with a hash (#)
• Foreign key columns identified with a backslash (\)
• Mandatory columns (those that cannot be left empty) with an asterisk (*)
• Optional columns with a lowercase “o”
The second necessary part of documenting the normalized data model is the entity-relationship diagram. This represents the connections between the tables graphically. There are different standards for these; Figure 9-2 shows the entity-relationship diagram for the BOOKS example using a very simple notation limited to showing the direction of the one-to-many relationships, using what are often called crow’s feet to indicate which sides of the relationship are the many and the one. It can be seen that one BOOK can have multiple AUTHORS, and one PUBLISHER can publish many BOOKS. Note that the diagram also states that both AUTHORS and PUBLISHERS have exactly one ADDRESS. More complex notations can be used to show whether the link is required or optional, information that will match that given in the table columns listed previously.
This is a very simple example of normalization, and it is not in fact complete. If one author were to write several books, this would require multiple values in the ISBN column of the AUTHORS table. That would be a repeating group, which would have to be removed because repeating groups break the rule for first normal form. A challenging exercise with data normalization is ensuring that the structures can handle all possibilities.
A table in a real-world application may have hundreds of columns and dozens of foreign keys. Entity-relationship diagrams for applications with hundreds or thousands of entities can be challenging to interpret.
Figure 9-2 An entity-relationship diagram relating AUTHORS, BOOKS, PUBLISHERS, and ADDRESSES
Create the Demonstration Schemas
Throughout this book, there are many examples of SQL code that run against tables. The examples use tables in the HR schema, which is sample data that simulates a simple human resources application, and the WEBSTORE schema, which simulates an order entry application.
The HR schema can be created when the database is created; it is an option presented by the Database Configuration Assistant. If the HR schema does not exist, it can be created later by running scripts that exist in the database Oracle Home.
The HR and WEBSTORE Schemas
The HR demonstration schema consists of seven tables, linked by primary key to foreign key relationships. Figure 9-3 illustrates the relationships between the tables, as an entity-relationship diagram.
Two of the relationships shown in Figure 9-3 may not be immediately comprehensible. First, there is a many-to-one relationship from EMPLOYEES to EMPLOYEES. This is what is known as a self-referencing foreign key. This means that many employees can be connected to one employee, and it’s based on the fact that
Figure 9-3 The HR entity-relationship diagram (entities: REGIONS, COUNTRIES, LOCATIONS, DEPARTMENTS, EMPLOYEES, JOB_HISTORY, JOBS)