row selection, as there is with a DELETE.
One part of the definition of a table as stored in the data dictionary is the table’s physical location. When first created, a table is allocated a single area of space, of fixed size, in the database’s datafiles. This is known as an extent and will be empty. Then, as rows are inserted, the extent fills up. Once it is full, more extents will be allocated to the table automatically. A table therefore consists of one or more extents, which hold the rows. As well as tracking the extent allocation, the data dictionary also tracks how much of the space allocated to the table has been used. This is done with the high water mark. The high water mark is the last position in the last extent that has been used; all space below the high water mark has been used for rows at one time or another, and none of the space above the high water mark has been used yet.
Note that it is possible for there to be plenty of space below the high water mark that is not being used at the moment; this is because of rows having been removed with a DELETE command. Inserting rows into a table pushes the high water mark up. Deleting them leaves the high water mark where it is; the space they occupied remains assigned to the table but is freed up for inserting more rows.
Truncating a table resets the high water mark. Within the data dictionary, the recorded position of the high water mark is moved to the beginning of the table’s first extent. As Oracle assumes that there can be no rows above the high water mark, this has the effect of removing every row from the table. The table is emptied and remains empty until subsequent insertions begin to push the high water mark back up again.
In this manner, one DDL command, which does little more than make an update in the data dictionary, can annihilate billions of rows in a table.
The syntax to truncate a table couldn’t be simpler:
TRUNCATE TABLE table;
Figure 8-2 shows access to the TRUNCATE command through the SQL Developer navigation tree, but of course it can also be executed from SQL*Plus.
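The difference between DELETE and TRUNCATE can be observed with a sketch along the following lines. It assumes a hypothetical scratch table, HWM_DEMO, and relies on the BLOCKS column of USER_TABLES, which after statistics are gathered reports the number of blocks below the high water mark. The table name, column names, and row count are illustrative only.

```sql
-- Hypothetical scratch table; name, columns, and row count are illustrative.
CREATE TABLE hwm_demo (id NUMBER, padding VARCHAR2(100));

-- Fill the table so that the high water mark is pushed up.
INSERT INTO hwm_demo
  SELECT level, RPAD('x', 100, 'x') FROM dual CONNECT BY level <= 100000;
COMMIT;

-- DELETE removes the rows but leaves the high water mark where it is.
DELETE FROM hwm_demo;
COMMIT;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'HWM_DEMO');
END;
/
-- BLOCKS should still reflect the space used before the DELETE.
SELECT blocks FROM user_tables WHERE table_name = 'HWM_DEMO';

-- TRUNCATE is DDL: it resets the high water mark and cannot be rolled back.
TRUNCATE TABLE hwm_demo;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'HWM_DEMO');
END;
/
-- BLOCKS should now have dropped back to zero.
SELECT blocks FROM user_tables WHERE table_name = 'HWM_DEMO';
```

The slash on a line by itself is the SQL*Plus terminator for the anonymous PL/SQL blocks; other tools may need a different way of running them.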
MERGE
There are many occasions where you want to take a set of data (the source) and integrate it into an existing table (the target). If a row in the source data already exists in the target table, you may want to update the target row, or you may want to replace it completely, or you may want to leave the target row unchanged. If a row in the source does not exist in the target, you will want to insert it. The MERGE command lets you do this. A MERGE passes through the source data, for each row attempting to locate a matching row in the target. If no match is found, a row can be inserted; if a match is found, the matching row can be updated. The release 10g enhancement means that the target row can even be deleted, after being matched and updated. The end result is a target table into which the data in the source has been merged.
A MERGE operation does nothing that could not be done with INSERT, UPDATE, and DELETE statements, but with one pass through the source data it can do all three. Alternative code without a MERGE would require three passes through the data, one for each command.
The source data for a MERGE statement can be a table or any subquery. The condition used for finding matching rows in the target is similar to a WHERE clause. The clauses that update or insert rows are as complex as an UPDATE or an INSERT command. It follows that MERGE is the most complicated of the DML commands, which is not unreasonable, as it is (arguably) the most powerful. Use of MERGE is not on the OCP syllabus, but for completeness here is a simple example:
merge into employees e using new_employees n
on (e.employee_id = n.employee_id)
when matched then
update set e.salary=n.salary
when not matched then
insert (employee_id,last_name,salary)
values (n.employee_id,n.last_name,n.salary);
Figure 8-2 The TRUNCATE command in SQL Developer, from the command line and from the menus
For each row in NEW_EMPLOYEES, the statement looks for a row with the same EMPLOYEE_ID in EMPLOYEES. If there is such a row, its SALARY is updated. If there is not such a row, one will be inserted. Variations on the syntax allow the use of a subquery to select the source rows, and it is even possible to delete matching rows.
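As an illustration of those variations, the following sketch uses a subquery as the source and adds a DELETE clause. It assumes simplified EMPLOYEES and NEW_EMPLOYEES tables holding only the three columns shown, and the convention that a salary of zero marks a row to be removed is purely hypothetical.

```sql
-- Sketch only: assumes simplified EMPLOYEES and NEW_EMPLOYEES tables with
-- just these three columns. "Salary of zero means remove" is a made-up rule.
MERGE INTO employees e
USING (SELECT employee_id, last_name, salary
       FROM   new_employees
       WHERE  salary IS NOT NULL) n          -- the source is a subquery
ON (e.employee_id = n.employee_id)
WHEN MATCHED THEN
  UPDATE SET e.salary = n.salary
  DELETE WHERE (e.salary = 0)                -- tested after the update
WHEN NOT MATCHED THEN
  INSERT (employee_id, last_name, salary)
  VALUES (n.employee_id, n.last_name, n.salary);
```

The DELETE clause applies only to rows that were matched and updated by the same statement, which is why it is written inside the WHEN MATCHED branch.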
DML Statement Failures
Commands can fail for many reasons, including the following:
• Syntax errors
• References to nonexistent objects or columns
• Access permissions
• Constraint violations
• Space issues
Figure 8-3 shows several attempted executions of a statement with SQL*Plus.
Figure 8-3 Some examples of statement failure
In Figure 8-3, a user connects as SUE (password SUE, which is not an example of good security) and queries the EMPLOYEES table. The statement fails because of a simple syntax error, correctly identified by SQL*Plus. Note that SQL*Plus never attempts to correct such mistakes, even when it knows exactly what you meant to type. Some third-party tools may be more helpful, offering automatic error correction.
The second attempt to run the statement fails with an error stating that the object does not exist. This is because it does not exist in the current user’s schema; it exists in the HR schema. Having corrected that, the third run of the statement succeeds, but only just. The value passed in the WHERE clause is a string, ‘21-APR-2000’, but the column HIRE_DATE is not defined in the table as a string; it is defined as a date. To execute the statement, the database had to work out what the user really meant and cast the string as a date. In the last example, the typecasting fails. This is because the string passed is formatted as a European-style date, but the database has been set up as American: the attempt to match “21” to a month fails. The statement would have succeeded if the string had been ‘04/21/2007’.
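One way to avoid depending on implicit conversion is to state the date format in the statement itself. The sketch below assumes the HR.EMPLOYEES table with LAST_NAME and HIRE_DATE columns; the date value is illustrative.

```sql
-- Explicit conversion with a format mask: the result does not depend on
-- the session's date format settings.
SELECT last_name, hire_date
FROM   hr.employees
WHERE  hire_date = TO_DATE('21-04-2000', 'DD-MM-YYYY');

-- The same date written as an ANSI date literal, always in YYYY-MM-DD form.
SELECT last_name, hire_date
FROM   hr.employees
WHERE  hire_date = DATE '2000-04-21';
```

Either form tells the database exactly how to read the string, so the statement behaves the same way whether the database has been set up as European or American.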
If a statement is syntactically correct and has no errors with the objects to which it refers, it can still fail because of access permissions. If the user attempting to execute the statement does not have the relevant permissions on the tables to which it refers, the database will return an error identical to that which would be returned if the object did not exist. As far as the user is concerned, it does not exist.
Errors caused by access permissions are a case where SELECT and DML statements may return different results: it is possible for a user to have permission to see the rows in a table, but not to insert, update, or delete them. Such an arrangement is not uncommon; it often makes business sense. Perhaps more confusingly, permissions can be set up in such a manner that it is possible to insert rows that you are not allowed to see. And, perhaps worst of all, it is possible to delete rows that you can neither see nor update. However, such arrangements are not common.
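Arrangements of this kind are set up with object privileges. The following is a minimal sketch, assuming it is run by HR (the table owner) or another suitably privileged user; SUE is the user from Figure 8-3, and LOADER is a hypothetical account invented for the illustration.

```sql
-- SUE may query the table but not change it.
GRANT SELECT ON hr.employees TO sue;

-- LOADER (hypothetical account) may add rows without being able to read them.
GRANT INSERT ON hr.employees TO loader;

-- Connected as SUE, this query is permitted...
SELECT COUNT(*) FROM hr.employees;
-- ...but this DML statement is expected to be rejected:
DELETE FROM hr.employees WHERE employee_id = 100;
```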
A constraint violation can cause a DML statement to fail. For example, an INSERT command can insert several rows into a table, and for every row the database will check whether a row already exists with the same primary key. This occurs as each row is inserted. It could be that the first few rows (or the first few million rows) go in without a problem, and then the statement hits a row with a duplicate value. At this point it will return an error, and the statement will fail. This failure will trigger a reversal of all the insertions that had already succeeded. This is part of the SQL standard: a statement must succeed in total, or not at all. The reversal of the work is a rollback. The mechanisms of a rollback are described in the next section of this chapter, titled “Control Transactions.”
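The all-or-nothing behavior of a single statement can be seen with a small experiment. The sketch below assumes a hypothetical table PK_DEMO; the multi-row INSERT generates the values 1 through 5, one of which collides with a row that is already present.

```sql
-- Hypothetical table with one committed row.
CREATE TABLE pk_demo (id NUMBER PRIMARY KEY);
INSERT INTO pk_demo VALUES (3);
COMMIT;

-- One statement, five rows. The value 3 duplicates the existing key, so the
-- statement fails with ORA-00001 (unique constraint violated).
INSERT INTO pk_demo
  SELECT level FROM dual CONNECT BY level <= 5;

-- None of the five rows remains: only the original row is counted.
SELECT COUNT(*) FROM pk_demo;   -- 1
```

Note that only the failed statement is reversed. Any earlier statements in the same transaction are left as they were, awaiting a COMMIT or ROLLBACK.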
If a statement fails because of space problems, the effect is similar. A part of the statement may have succeeded before the database ran out of space. The part that did succeed will be automatically rolled back. Rollback of a statement is a serious matter. It forces the database to do a lot of extra work and will usually take at least as long as the statement has taken already (sometimes much longer).
the concept of a transaction. A related topic is read consistency; this is automatically implemented by the Oracle server, but to a certain extent programmers can manage it by the way they use the SELECT statement.
Database Transactions
Oracle’s mechanism for assuring transactional integrity is the combination of undo segments and redo log files: this mechanism is undoubtedly the best of any database yet developed and conforms perfectly with the international standards for data processing. Other database vendors comply with the same standards with their own mechanisms, but with varying levels of effectiveness. In brief, any relational database must be able to pass the ACID test: it must guarantee atomicity, consistency, isolation, and durability.
A is for Atomicity
The principle of atomicity states that either all parts of a transaction must successfully complete or none of them. (The reasoning behind the term is that an atom cannot be split, now well known to be a false assumption.) For example, if your business analysts have said that every time you change an employee’s salary you must also change the employee’s grade, then the atomic transaction will consist of two updates. The database must guarantee that both go through or neither. If only one of the updates were to succeed, you would have an employee on a salary that was incompatible with his grade: a data corruption, in business terms. If anything (anything at all!) goes wrong before the transaction is complete, the database itself must guarantee that any parts that did go through are reversed; this must happen automatically. But although an atomic transaction sounds small (like an atom), it can be enormous. To take another example, it is logically impossible for an accounting suite nominal ledger to be half in August and half in September: the end-of-month rollover is therefore (in business terms) one atomic transaction, which may affect millions of rows in thousands of tables and take hours to complete (or to roll back, if anything goes wrong). The rollback of an incomplete transaction may be manual (as when you issue the ROLLBACK command), but it must be automatic and unstoppable in the case of an error.
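The salary and grade example might be coded along the following lines. This is a sketch under stated assumptions: EMP_DEMO is a hypothetical table invented for the illustration, and the PL/SQL block is just one way of making sure that the two updates are committed together or not at all.

```sql
-- Hypothetical table standing in for the salary/grade example.
CREATE TABLE emp_demo (
  employee_id NUMBER PRIMARY KEY,
  salary      NUMBER,
  grade       VARCHAR2(1)
);
INSERT INTO emp_demo VALUES (100, 5000, 'B');
COMMIT;

BEGIN
  UPDATE emp_demo SET salary = 6000 WHERE employee_id = 100;
  UPDATE emp_demo SET grade  = 'C'  WHERE employee_id = 100;
  COMMIT;       -- both changes become permanent together
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;   -- neither change survives
    RAISE;      -- re-raise so that the caller sees the original error
END;
/
```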
C is for Consistency
The principle of consistency states that the results of a query must be consistent with the state of the database at the time the query started. Imagine a simple query that averages the value of a column of a table. If the table is large, it will take many minutes to pass through the table. If other users are updating the column while the query is in progress, should the query include the new or the old values? Should it include rows that were inserted or deleted after the query started? The principle of consistency requires that the database ensure that changed values are not seen by the query; it will give you an average of the column as it was when the query started, no matter how long the query takes or what other activity is occurring on the tables concerned. Oracle guarantees that if a query succeeds, the result will be consistent. However, if the database administrator has not configured the database appropriately, the query may not succeed: the famous Oracle error “ORA-01555 snapshot too old” is raised instead. This used to be an extremely difficult problem to fix with earlier releases of the database, but with recent versions the database administrator should always be able to prevent this.
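The configuration in question mostly concerns how long undo data is kept. The statements below are a minimal sketch of the kind of settings involved, assuming automatic undo management, DBA privileges, and an undo tablespace named UNDOTBS1; the tablespace name and the figure of 3600 seconds are illustrative assumptions.

```sql
-- Current target, in seconds, for how long committed undo data is kept.
SELECT value FROM v$parameter WHERE name = 'undo_retention';

-- Ask the instance to keep undo for at least an hour (3600 is illustrative).
ALTER SYSTEM SET undo_retention = 3600;

-- Optionally turn the target into a guarantee.
-- UNDOTBS1 is an assumed tablespace name.
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;
```

The guarantee is a trade-off: long-running queries are protected, but DML may fail if the undo tablespace fills up, because unexpired undo can no longer be overwritten.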
I is for Isolation
The principle of isolation states that an incomplete (that is, uncommitted) transaction must be invisible to the rest of the world. While the transaction is in progress, only the one session that is executing the transaction is allowed to see the changes; all other sessions must see the unchanged data, not the new values. The logic behind this is, first, that the full transaction might not go through (remember the principle of atomicity and automatic or manual rollback?) and that therefore no other users should be allowed to see changes that might be reversed. And second, during the progress of a transaction the data is (in business terms) incoherent: there is a short time when the employee has had their salary changed but not their grade. Transaction isolation requires that the database must conceal transactions in progress from other users: they will see the preupdate version of the data until the transaction completes, when they will see all the changes as a consistent set. Oracle guarantees transaction isolation: there is no way any session (other than that making the changes) can see uncommitted data. A read of uncommitted data is known as a dirty read, which Oracle does not permit (though some other databases do).
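Isolation is easiest to see with two sessions connected at the same time. The following sketch assumes a hypothetical table ISO_DEMO; the comments show which session runs each statement, and the statements are meant to be run in the order shown.

```sql
-- Setup (either session): a hypothetical table with one committed row.
CREATE TABLE iso_demo (id NUMBER PRIMARY KEY, salary NUMBER);
INSERT INTO iso_demo VALUES (1, 1000);
COMMIT;

-- Session A: change the row but do not commit.
UPDATE iso_demo SET salary = 2000 WHERE id = 1;

-- Session B: still sees the committed value, 1000.
SELECT salary FROM iso_demo WHERE id = 1;

-- Session A: sees its own uncommitted change, 2000.
SELECT salary FROM iso_demo WHERE id = 1;

-- Session A: make the change permanent.
COMMIT;

-- Session B: a query started after the commit now sees 2000.
SELECT salary FROM iso_demo WHERE id = 1;
```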
D is for Durability
The principle of durability states that once a transaction completes, it must be impossible for the database to lose it. During the time that the transaction is in progress, the principle of isolation requires that no one (other than the session concerned) can see the changes it has made so far. But the instant the transaction completes, it must be broadcast to the world, and the database must guarantee that the change is never lost; a relational database is not allowed to lose data. Oracle fulfills this requirement by writing out all change vectors that are applied to data to log files as the changes are done. By applying this log of changes to backups taken earlier, it is possible to repeat any work done in the event of the database being damaged. Of course, data can be lost through user error such as inappropriate DML, or dropping or truncating tables. But as far as Oracle and the DBA are concerned, such events are transactions like any other: according to the principle of durability, they are absolutely nonreversible.
Executing SQL Statements
The entire SQL language consists of only a dozen or so commands. The ones we are concerned with here are SELECT, INSERT, UPDATE, and DELETE.
EXAM TIP Always remember that server processes read blocks from datafiles into the database buffer cache; DBWn writes blocks from the database buffer cache to the datafiles.
Once the data blocks required for the query are in the database buffer cache, any further processing (such as sorting or aggregation) is carried out in the PGA of the session. When the execution is complete, the result set is returned to the user process. How does this relate to the ACID test just described? For consistency, if the query encounters a block that has been changed since the time the query started, the server process will go to the undo segment that protected the change, locate the old version of the data, and (for the purposes of the current query only) roll back the change. Thus any changes initiated after the query commenced will not be seen. A similar mechanism guarantees transaction isolation, though this is based on whether the change has been committed, not only on whether the data has been changed. Clearly, if the data needed to do this rollback is no longer in the undo segments, this mechanism will not work. That is when you get the “snapshot too old” error.
Figure 8-4 shows a representation of the way a SELECT statement is processed.
Figure 8-4 The stages of execution of a SELECT (diagram: user process, server process, system global area containing the database buffer cache, and datafiles, linked by numbered steps)
In the figure, Step 1 is the transmission of the SELECT statement from the user process to the server process. The server will search the database buffer cache to determine if the necessary blocks are already in memory, and if they are, proceed to Step 4. If they are not, Step 2 is to locate the blocks in the datafiles, and Step 3 is to copy them into the database buffer cache. Step 4 transfers the data to the server process, where there may be some further processing before Step 5 returns the result of the query to the user process.
Executing an UPDATE Statement
For any DML operation, it is necessary to work on both data blocks and undo blocks, and also to generate redo: the A, C, and I of the ACID test require generation of undo; the D requires generation of redo.
EXAM TIP Undo is not the opposite of redo! Redo protects all block changes, no matter whether it is a change to a block of a table segment, an index segment, or an undo segment. As far as redo is concerned, an undo segment is just another segment, and any changes to it must be made durable.
The first step in executing DML is the same as executing SELECT: the required blocks must be found in the database buffer cache, or copied into the database buffer cache from the datafiles. The only change is that an empty (or expired) block of an undo segment is needed too. From then on, things are a bit more complicated.
First, locks must be placed on any rows and associated index keys that are going to be affected by the operation. This is covered later in this chapter.
Then the redo is generated: the server process writes to the log buffer the change vectors that are going to be applied to the data blocks. This generation of redo is applied both to table block changes and to undo block changes: if a column of a row is to be updated, then the rowid and the new value of the column are written to the log buffer (which is the change that will be applied to the table block), and also the old value (which is the change that will be applied to the undo block). If the column is part of an index key, then the changes to be applied to the index are also written to the log buffer, together with a change to be applied to an undo block to protect the index change.
Having generated the redo, the update is carried out in the database buffer cache: the block of table data is updated with the new version of the changed column, and the old version of the changed column is written to the block of undo segment. From this point until the update is committed, all queries from other sessions addressing the changed row will be redirected to the undo data. Only the session that is doing the update will see the actual current version of the row in the table block. The same principle applies to any associated index changes.
Executing INSERT and DELETE Statements
Conceptually, INSERT and DELETE are managed in the same fashion as an UPDATE. The first step is to locate the relevant blocks in the database buffer cache, or to copy them into it if they are not there.
A crucial difference between INSERT and DELETE is in the amount of undo generated. When a row is inserted, the only undo generated is writing out the new rowid to the undo block. This is because to roll back an INSERT, the only information Oracle requires is the rowid, so that this statement can be constructed:
delete from table_name where rowid=rowid_of_the_new_row ;
Executing this statement will reverse the original change.
For a DELETE, the whole row (which might be several kilobytes) must be written to the undo block, so that the deletion can be rolled back if need be by constructing a statement that will insert the complete row back into the table.
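The difference in undo volume can be measured rather than taken on trust. The sketch below assumes a hypothetical table UNDO_DEMO and a user with permission to query the V$TRANSACTION and V$SESSION views; the row count and column sizes are illustrative.

```sql
-- Hypothetical table with fairly wide rows.
CREATE TABLE undo_demo (id NUMBER, padding VARCHAR2(1000));

INSERT INTO undo_demo
  SELECT level, RPAD('x', 1000, 'x') FROM dual CONNECT BY level <= 10000;

-- Undo blocks and undo records used so far by this session's transaction.
SELECT t.used_ublk, t.used_urec
FROM   v$transaction t
JOIN   v$session s ON s.taddr = t.addr
WHERE  s.sid = SYS_CONTEXT('USERENV', 'SID');

COMMIT;

DELETE FROM undo_demo;

-- The same query again: the DELETE should show far more undo blocks in use.
SELECT t.used_ublk, t.used_urec
FROM   v$transaction t
JOIN   v$session s ON s.taddr = t.addr
WHERE  s.sid = SYS_CONTEXT('USERENV', 'SID');

ROLLBACK;
```

The exact figures depend on block size and row length, so no particular numbers are suggested here; what matters is the comparison between the two results.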
The Start and End of a Transaction
A session begins a transaction the moment it issues any DML. The transaction continues through any number of further DML commands until the session issues either a COMMIT or a ROLLBACK statement. Only committed changes will be made permanent and become visible to other sessions. It is impossible to nest transactions. The SQL standard does not allow a user to start one transaction and then start another before terminating the first. This can be done with PL/SQL (Oracle’s proprietary third-generation language), but not with industry-standard SQL.
The explicit transaction control statements are COMMIT, ROLLBACK, and SAVEPOINT. There are also circumstances other than a user-issued COMMIT or ROLLBACK that will implicitly terminate a transaction:
• Issuing a DDL or DCL statement
• Exiting from the user tool (SQL*Plus or SQL Developer or anything else)
• If the client session dies
• If the system crashes
If a user issues a DDL (CREATE, ALTER, or DROP) or DCL (GRANT or REVOKE) command, the transaction in progress (if any) will be committed: it will be made permanent and become visible to all other users. This is because the DDL and DCL commands are themselves transactions. As it is not possible to nest transactions in SQL, if the user already has a transaction running, the statements the user has run will be committed along with the statements that make up the DDL or DCL command.
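The implicit commit is easy to demonstrate. The sketch below assumes two hypothetical tables, TX_DEMO and TX_DEMO2, that do not yet exist in the current schema.

```sql
CREATE TABLE tx_demo (id NUMBER);

-- This INSERT starts a transaction.
INSERT INTO tx_demo VALUES (1);

-- DDL: the transaction in progress is committed as a side effect.
CREATE TABLE tx_demo2 (id NUMBER);

-- There is nothing left to roll back; the inserted row is already permanent.
ROLLBACK;

SELECT COUNT(*) FROM tx_demo;   -- 1
```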
If you start a transaction by issuing a DML command and then exit from the tool you are using without explicitly issuing either a COMMIT or a ROLLBACK, the transaction will terminate, but whether it terminates with a COMMIT or a ROLLBACK is entirely dependent on how the tool is written. Many tools will have different behavior, depending on how the tool is exited. (For instance, in the Microsoft Windows environment, it is common to be able to terminate a program either by selecting the File | Exit options from a menu on the top left of the window, or by clicking an “X” in the top-right corner. The programmers who wrote the tool may well have coded different logic into these functions.) In either case, it will be a controlled exit, so the programmers should issue either a COMMIT or a ROLLBACK, but the choice is up to them.
If a client’s session fails for some reason, the database will always roll back the transaction. Such failure could be for a number of reasons: the user process can die or be killed at the operating system level, the network connection to the database server may go down, or the machine where the client tool is running can crash. In any of these cases, there is no orderly issue of a COMMIT or ROLLBACK statement, and it is up to the database to detect what has happened. The behavior is that the session is killed, and an active transaction is rolled back. The behavior is the same if the failure is on the server side. If the database server crashes for any reason, when it next starts up all transactions from any sessions that were in progress will be rolled back.
Transaction Control: COMMIT, ROLLBACK, SAVEPOINT, SELECT FOR UPDATE
Oracle’s implementation of the relational database paradigm begins a transaction implicitly with the first DML statement. The transaction continues until a COMMIT or ROLLBACK statement. The SAVEPOINT command is not part of the SQL standard and is really just an easy way for programmers to back out some statements, in reverse order. It need not be considered separately, as it does not terminate a transaction.
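A short illustration of backing out part of a transaction with a savepoint follows. The table SP_DEMO and the savepoint name AFTER_FIRST are hypothetical.

```sql
CREATE TABLE sp_demo (id NUMBER);

INSERT INTO sp_demo VALUES (1);
SAVEPOINT after_first;            -- marks a point within the transaction

INSERT INTO sp_demo VALUES (2);

-- Back out everything done since the savepoint; the transaction stays open.
ROLLBACK TO SAVEPOINT after_first;

-- Only the first insert is made permanent.
COMMIT;

SELECT COUNT(*) FROM sp_demo;     -- 1
```

A ROLLBACK TO SAVEPOINT does not end the transaction; only the final COMMIT (or a full ROLLBACK) does that.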
COMMIT
Commit processing is where many people (and even some experienced DBAs) show an incomplete, or indeed completely inaccurate, understanding of the Oracle architecture. When you say COMMIT, all that happens physically is that LGWR flushes the log buffer to disk. DBWn does absolutely nothing. This is one of the most important performance features of the Oracle database.
EXAM TIP What does DBWn do when you issue a COMMIT command?
Answer: absolutely nothing
To make a transaction durable, all that is necessary is that the changes that make up the transaction are on disk: there is no need whatsoever for the actual table data to be on disk, in the datafiles. If the changes are on disk, in the form of multiplexed redo log files, then in the event of damage to the database the transaction can be reinstantiated by restoring the datafiles from a backup taken before the damage occurred and applying the changes from the logs. This process is covered in detail in later chapters; for now, just hang on to the fact that a COMMIT involves nothing more than flushing the log buffer to disk, and flagging the transaction as complete. This is why a transaction involving millions of updates in thousands of files over many minutes or hours can