OCA Oracle Database 11g SQL Fundamentals I Exam Guide P2


data, but it is not appropriate for all applications. As a general rule, a relational analysis should be the first approach taken when modeling a system. Only if it proves inappropriate should one resort to nonrelational structures. Applications where the relational model has proven highly effective include virtually all Online Transaction Processing (OLTP) systems and Decision Support Systems (DSS). The relational paradigm can be demanding in its hardware requirements and in the skill needed to develop applications around it, but if the data fits, it has proved to be the most versatile model. There can be, for example, problems caused by the need to maintain the indexes that maintain the links between tables and the space requirements of maintaining multiple copies of the indexed data in the indexes themselves and in the tables in which the columns reside. Nonetheless, relational design is in most circumstances the optimal model.

A number of software publishers have produced database management systems that conform (with varying degrees of accuracy) to the relational paradigm; Oracle is only one. IBM was perhaps the first company to commit major resources to it, but their product (which later developed into DB2) was not ported to non-IBM platforms for many years. Microsoft’s SQL Server is another relational database that has been limited by the platforms on which it runs. Oracle databases, by contrast, have always been ported to every major platform from the first release. It may be this that gave Oracle the edge in the RDBMS marketplace.

A note on terminology: confusion can arise when discussing relational databases with people used to working with Microsoft products. SQL is a language and SQL Server is a database, but in the Microsoft world, the term SQL is often used to refer to either.

Data Normalization

The process of modeling data into relational tables is known as normalization and can be studied at university level for years. There are commonly said to be three levels of normalization: the first, second, and third normal forms. There are higher levels of normalization: fourth and fifth normal forms are well defined, but any normal data analyst (and certainly any normal human being) will not need to be concerned with them. It is possible for a SQL application to address un-normalized data, but this will usually be inefficient as that is not what the language is designed to do. In most cases, data stored in a relational database and accessed with SQL should be normalized to the third normal form.


There are often several possible normalized models for an application. It is important to use the most appropriate one: if the systems analyst gets this wrong, the implications can be serious for performance, storage needs, and development effort.

As an example of normalization, consider an un-normalized table called BOOKS that stores details of books, authors, and publishers, using the ISBN number as the primary key. A primary key is the one attribute (or attributes) that can uniquely identify a record. These are two entries:

12345   Oracle 11g OCP SQL Fundamentals 1 Exam Guide   John Watson, Roopesh Ramklass   McGraw-Hill, Spear Street, San Francisco, CA 94105
67890   Oracle 11g New Features Exam Guide             Sam Alapati                     McGraw-Hill, Spear Street, San Francisco, CA 94105

Storing the data in this table gives rise to several anomalies. First, here is the insertion anomaly: it is impossible to enter details of authors who are not yet published, because every row needs an ISBN as its primary key.

SCENARIO & SOLUTION

Scenario: Your organization is designing a new application. Who should be involved?

Solution: Everyone! The project team must involve business analysts (who model the business processes), systems analysts (who model the data), system designers (who decide how to implement the models), developers (you), database administrators, system administrators, and (most importantly) end users.

Scenario: It is possible that relational structures may not be suitable for a particular application. How can this be determined, and what should be done next? Can Oracle help?

Solution: Attempt to normalize the data into two-dimensional tables, linked with one-to-many relationships. If this really cannot be done, consider other paradigms. Oracle may well be able to help. For instance, maps and other geographical data really don’t work relationally. Neither does text data (such as word processing documents). But the Spatial and Text database options can be used for these purposes. There is also the possibility of using user-defined objects to store nontabular data.


Second, a book cannot be deleted without losing the details of the publisher: a deletion anomaly. Third, if a publisher’s address changes, it will be necessary to update the rows for every book it has published: an update anomaly. Furthermore, it will be very difficult to identify every book written by one author. The fact that a book may have several authors means that the “author” field must be multivalued, and a search will have to search all the values. Related to this is the problem of having to restructure the table if a book comes along with more authors than the original design can handle. Also, the storage is very inefficient due to replication of address details across rows, and the possibility of error as this data is repeatedly entered is high. Normalization should solve all these issues.
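The anomalies described above can be reproduced in a few lines. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for an Oracle database, the column names are assumptions, and the author "A. N. Author" and the new address are hypothetical values invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE books (
        isbn      TEXT PRIMARY KEY NOT NULL,
        title     TEXT,
        authors   TEXT,   -- multivalued: several names packed into one column
        publisher TEXT    -- name and address repeated on every row
    )""")
rows = [
    ("12345", "Oracle 11g OCP SQL Fundamentals 1 Exam Guide",
     "John Watson, Roopesh Ramklass",
     "McGraw-Hill, Spear Street, San Francisco, CA 94105"),
    ("67890", "Oracle 11g New Features Exam Guide",
     "Sam Alapati",
     "McGraw-Hill, Spear Street, San Francisco, CA 94105"),
]
conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", rows)

# Insertion anomaly: an unpublished (hypothetical) author has no ISBN,
# so there is no key value under which to store the row.
try:
    conn.execute("INSERT INTO books VALUES (NULL, NULL, 'A. N. Author', NULL)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Retrieval problem: finding one author means pattern-matching every row.
watson_books = conn.execute(
    "SELECT isbn FROM books WHERE authors LIKE '%John Watson%'").fetchall()

# Update anomaly: one change of address (hypothetical new address)
# has to touch every book the publisher has published.
cur = conn.execute(
    "UPDATE books SET publisher = 'McGraw-Hill, New Street, New York'"
    " WHERE publisher LIKE 'McGraw-Hill%'")
rows_touched = cur.rowcount
```

The value of `rows_touched` grows with the number of books per publisher, which is the cost that the later normal forms remove.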

The first normal form is to remove the repeating groups, in this case, the multiple authors: pull them out into a separate table called AUTHORS. The data structures will now look like the following.

Two rows in the BOOKS table:

12345   Oracle 11g OCP SQL Fundamentals 1 Exam Guide   McGraw-Hill, Spear Street, San Francisco, California
67890   Oracle 11g New Features Exam Guide             McGraw-Hill, Spear Street, San Francisco, California

And three rows in the AUTHORS table:

John Watson        12345
Roopesh Ramklass   12345
Sam Alapati        67890

One row in the BOOKS table is now linked to two rows in the AUTHORS table. This solves the insertion anomaly (there is no reason not to insert as many unpublished authors as necessary), the retrieval problem of identifying all the books by one author (one can search the AUTHORS table on just one name) and the problem of a fixed maximum number of authors for any one book (simply insert as many or as few AUTHORS as are needed).
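A minimal sketch of the first normal form structure follows, again using Python's sqlite3 module in place of Oracle. The column names are assumptions, "A. N. Author" is a hypothetical unpublished author, and the composite primary key on AUTHORS is omitted here to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (
        isbn      TEXT PRIMARY KEY,
        title     TEXT,
        publisher TEXT
    );
    -- One row per author; isbn may be NULL for an unpublished author.
    CREATE TABLE authors (
        name TEXT NOT NULL,
        isbn TEXT REFERENCES books (isbn)
    );
    INSERT INTO books VALUES
        ('12345', 'Oracle 11g OCP SQL Fundamentals 1 Exam Guide',
         'McGraw-Hill, Spear Street, San Francisco, California'),
        ('67890', 'Oracle 11g New Features Exam Guide',
         'McGraw-Hill, Spear Street, San Francisco, California');
    INSERT INTO authors VALUES
        ('John Watson', '12345'),
        ('Roopesh Ramklass', '12345'),
        ('Sam Alapati', '67890'),
        ('A. N. Author', NULL);   -- hypothetical unpublished author
""")

# Retrieval: an exact match on one name, with no pattern matching.
titles = conn.execute(
    "SELECT b.title FROM authors a JOIN books b ON b.isbn = a.isbn"
    " WHERE a.name = ?", ("John Watson",)).fetchall()

# Any number of authors per book: just count the linked rows.
author_count = conn.execute(
    "SELECT COUNT(*) FROM authors WHERE isbn = '12345'").fetchone()[0]
```

Adding a third author to a book is now an ordinary insert into AUTHORS rather than a change to the table design.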


This is the first normal form: no repeating groups.

The second normal form removes columns from the table that are not dependent on the primary key. In this example, that is the publisher’s address details: these are dependent on the publisher, not the ISBN. The BOOKS table and a new PUBLISHERS table will then look like this:

BOOKS

12345   Oracle 11g OCP SQL Fundamentals 1 Exam Guide   McGraw-Hill
67890   Oracle 11g New Features Exam Guide             McGraw-Hill

PUBLISHERS

McGraw-Hill   Spear Street   San Francisco   California

All the books published by one publisher will now point to a single record in PUBLISHERS. This solves the problem of storing the address many times, and also solves the consequent update anomalies and the data consistency errors caused by inaccurate multiple entries.
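The effect on the update anomaly can be sketched as follows, using Python's sqlite3 module as a stand-in for Oracle. The column names and the replacement street name are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publishers (
        publisher TEXT PRIMARY KEY,
        street    TEXT,
        city      TEXT,
        state     TEXT
    );
    CREATE TABLE books (
        isbn      TEXT PRIMARY KEY,
        title     TEXT,
        publisher TEXT REFERENCES publishers (publisher)
    );
    INSERT INTO publishers VALUES
        ('McGraw-Hill', 'Spear Street', 'San Francisco', 'California');
    INSERT INTO books VALUES
        ('12345', 'Oracle 11g OCP SQL Fundamentals 1 Exam Guide', 'McGraw-Hill'),
        ('67890', 'Oracle 11g New Features Exam Guide', 'McGraw-Hill');
""")

# The address is stored once, so a change (hypothetical new street)
# touches exactly one row regardless of how many books exist.
cur = conn.execute(
    "UPDATE publishers SET street = 'New Street'"
    " WHERE publisher = 'McGraw-Hill'")
rows_touched = cur.rowcount

# Every book sees the new address through the join.
streets = conn.execute(
    "SELECT DISTINCT p.street FROM books b"
    " JOIN publishers p ON p.publisher = b.publisher").fetchall()
```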

Third normal form removes all columns that are interdependent. In the PUBLISHERS table, this means the address columns: the street exists in only one city, and the city can be in only one state; one column should do, not three. This could be achieved by adding an address code, pointing to a separate address table, so that PUBLISHERS holds the code and ADDRESSES holds the address details.


of primary keys and foreign keys. A primary key is the unique identifier of a row in a table, either one column or a concatenation of several columns (known as a composite key). Every table should have a primary key defined. This is a requirement of the relational paradigm. Note that the Oracle database deviates from this standard: it is possible to define tables without a primary key, though it is usually not a good idea, and some other RDBMSs do not permit this.

A foreign key is a column (or a concatenation of several columns) that can be used to identify a related row in another table. A foreign key in one table will match a primary key in another table. This is the basis of the many-to-one relationship. A many-to-one relationship is a connection between two tables, where many rows in one table refer to a single row in another table. This is sometimes called a parent-child relationship: one parent can have many children. In the BOOKS example so far, the keys are as follows:

BOOKS        Foreign key: Publisher
AUTHORS      Foreign key: ISBN
PUBLISHERS   Foreign key: Address code
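How primary and foreign keys constrain the data can be sketched with Python's sqlite3 module, used here as a stand-in for Oracle. The column names, the address code value 1, and the helper `try_sql` are assumptions introduced for the illustration. Note that SQLite enforces foreign keys only after the pragma shown; that detail is specific to SQLite and is not a statement about Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE addresses (
        address_code INTEGER PRIMARY KEY,
        street TEXT, city TEXT, state TEXT
    );
    CREATE TABLE publishers (
        publisher    TEXT PRIMARY KEY,
        address_code INTEGER REFERENCES addresses (address_code)
    );
    CREATE TABLE books (
        isbn      TEXT PRIMARY KEY,
        title     TEXT,
        publisher TEXT NOT NULL REFERENCES publishers (publisher)
    );
    INSERT INTO addresses VALUES (1, 'Spear Street', 'San Francisco', 'California');
    INSERT INTO publishers VALUES ('McGraw-Hill', 1);
    INSERT INTO books VALUES
        ('12345', 'Oracle 11g OCP SQL Fundamentals 1 Exam Guide', 'McGraw-Hill');
""")

def try_sql(sql):
    """Hypothetical helper: True if the statement succeeds, False on a key violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A child row must match an existing parent (foreign key).
valid_ok = try_sql(
    "INSERT INTO books VALUES ('67890', 'Oracle 11g New Features Exam Guide', 'McGraw-Hill')")
orphan_ok = try_sql(
    "INSERT INTO books VALUES ('99999', 'Orphan', 'No Such Publisher')")
# A primary key value cannot be repeated.
duplicate_ok = try_sql(
    "INSERT INTO books VALUES ('12345', 'Duplicate ISBN', 'McGraw-Hill')")
# A parent cannot be deleted while children still refer to it.
delete_parent_ok = try_sql(
    "DELETE FROM publishers WHERE publisher = 'McGraw-Hill'")
```

Only the first statement succeeds; the other three are rejected by the key constraints, which is what keeps the many-to-one relationships consistent.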

These keys define relationships such as that one book can have several authors. There are various standards for documenting normalized data structures, developed by different organizations as structured formal methods. Generally speaking, it really doesn’t matter which method one uses as long as everyone reading the documents understands it. Part of the documentation will always include a listing of the attributes that make up each entity (also known as the columns that make up each table) and an entity-relationship diagram representing graphically the foreign to primary key connections. A widely used standard is as follows:

■ Primary key columns identified with a hash (#)

■ Foreign key columns identified with a backslash (\)

■ Mandatory columns (those that cannot be left empty) with an asterisk (*)

■ Optional columns with a lowercase “o”


The BOOKS tables can now be described as follows:

Table BOOKS

\* Publisher       Foreign key, link to the PUBLISHERS table

Table AUTHORS

#* Name            Together with the ISBN, the primary key
#\o ISBN           Part of the primary key, and a foreign key to the BOOKS table. Optional, because some authors may not yet be published

Table PUBLISHERS

\o Address code    Foreign key, link to the ADDRESSES table

Table ADDRESSES

#* Address code    Primary key

The second necessary part of documenting the normalized data model is the entity-relationship diagram. This represents the connections between the tables graphically. There are different standards for these; Figure 1-3 shows the entity-relationship diagram for the BOOKS example using a very simple notation limited to showing the direction of the one-to-many relationships, using what are often called crow’s feet to indicate which sides of the relationship are the many and the one. It can be seen that one BOOK can have multiple AUTHORS, and one PUBLISHER can publish many BOOKS. Note that the diagram also states that both AUTHORS and PUBLISHERS have exactly one ADDRESS. More complex notations can be used to show whether the link is required or optional, information which will match that given in the table columns listed previously.

FIGURE 1-3 An entity-relationship diagram


If one author were to write several books, this would require multiple values in the ISBN column of the AUTHORS table. That would be a repeating group, which would have to be removed because repeating groups break the rule for first normal form. A major exercise with data normalization is ensuring that the structures can handle all possibilities.

A table in a real-world application may have hundreds of columns and dozens of foreign keys. The standards for notation vary across organizations; the example given is very basic. Entity-relationship diagrams for applications with hundreds or thousands of entities can be challenging to interpret.

EXERCISE 1-2

Perform an Extended Relational Analysis

This is a paper-based exercise, with no specific solution.

Consider the situation where one author can write many books, and one book can have many authors. This is a many-to-many relationship, which cannot be fitted directly into the relational model. Sketch out data structures that demonstrate the problem, and develop another structure that would solve it. Following is a possible solution. The un-normalized table of books with many authors could look like this:

BOOKS

There could be two rows in this table:

11g SQL Fundamentals Exam Guide   John Watson, Roopesh Ramklass
10g DBA Exam Guide                John Watson, Damir Bersinic

And that of authors could look like this:

AUTHORS


There could be three rows in this table:

John Watson        11g SQL Fundamentals Exam Guide, 10g DBA Exam Guide
Roopesh Ramklass   11g SQL Fundamentals Exam Guide
Damir Bersinic     10g DBA Exam Guide

This many-to-many relationship needs to be resolved into many-to-one relationships by taking the repeating groups out of the two tables and storing them in a separate books-per-author table. It will also become necessary to introduce some codes, such as ISBNs to identify books and social security numbers to identify authors. This is a possible normalized structure:

BOOKS

AUTHORS

BOOKAUTHORS

#\* ISBN    Part of the primary key and a foreign key to BOOKS
#\* SSNO    Part of the primary key and a foreign key to AUTHORS

The rows in these normalized tables would be as follows:

BOOKS


AUTHORS

SSNO    Name
11111   John Watson
22222   Damir Bersinic
33333   Roopesh Ramklass

BOOKAUTHORS

Figure 1-4 shows the entity-relationship diagram for the original un-normalized structure, followed by the normalized structure.
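One way to try out the resolved structure is the sketch below, which uses Python's sqlite3 module as a stand-in for Oracle. The ISBN values 'B1' and 'B2' are hypothetical placeholders (the exercise's own ISBNs are not reproduced here), and the helper functions `titles_by` and `authors_of` are names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE books   (isbn TEXT PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE authors (ssno TEXT PRIMARY KEY, name  TEXT NOT NULL);
    -- The interposed entity: one row per (book, author) pair.
    CREATE TABLE bookauthors (
        isbn TEXT NOT NULL REFERENCES books (isbn),
        ssno TEXT NOT NULL REFERENCES authors (ssno),
        PRIMARY KEY (isbn, ssno)
    );
    INSERT INTO books VALUES
        ('B1', '11g SQL Fundamentals Exam Guide'),   -- hypothetical ISBNs
        ('B2', '10g DBA Exam Guide');
    INSERT INTO authors VALUES
        ('11111', 'John Watson'),
        ('22222', 'Damir Bersinic'),
        ('33333', 'Roopesh Ramklass');
    INSERT INTO bookauthors VALUES
        ('B1', '11111'), ('B1', '33333'),
        ('B2', '11111'), ('B2', '22222');
""")

def titles_by(name):
    """All titles written by one author, via the junction table."""
    rows = conn.execute(
        "SELECT b.title FROM authors a"
        " JOIN bookauthors ba ON ba.ssno = a.ssno"
        " JOIN books b ON b.isbn = ba.isbn"
        " WHERE a.name = ? ORDER BY b.title", (name,))
    return [title for (title,) in rows]

def authors_of(title):
    """All authors of one book, via the same junction table."""
    rows = conn.execute(
        "SELECT a.name FROM books b"
        " JOIN bookauthors ba ON ba.isbn = b.isbn"
        " JOIN authors a ON a.ssno = ba.ssno"
        " WHERE b.title = ? ORDER BY a.name", (title,))
    return [name for (name,) in rows]
```

Both directions of the many-to-many relationship are now answered by joining through BOOKAUTHORS, and neither BOOKS nor AUTHORS contains a repeating group.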

As a further exercise, consider the possibility that one publisher could have offices at several addresses, and one address could have offices for several companies. Authors will also have addresses, and this connection too needs to be defined. These enhancements can be added to the example worked through previously.

FIGURE 1-4 Un-normalized and normalized data models

First, an un-normalized many-to-many relationship: BOOKS, AUTHORS

The many-to-many relationship resolved, by interposing another entity: BOOKS, BOOKAUTHORS, AUTHORS


CERTIFICATION OBJECTIVE 1.03

Summarize the SQL Language

SQL is defined, developed, and controlled by international bodies. Oracle Corporation does not have to conform to the SQL standard but chooses to do so. The language itself can be thought of as being very simple (there are only 16 commands), but in practice SQL coding can be phenomenally complicated. That is why a whole book is needed to cover the bare fundamentals.

SQL Standards

Structured Query Language (SQL) was first invented by an IBM research group in the ’70s, but in fact Oracle Corporation (then trading as Relational Software, Inc.) claims to have beaten IBM to market by a few weeks with the first commercial implementation: Oracle 2, released in 1979. Since then the language has evolved enormously and is no longer driven by any one organization. SQL is now an international standard. It is managed by committees from ISO and ANSI. ISO is the Organisation Internationale de Normalisation, based in Geneva; ANSI is the American National Standards Institute, based in Washington, DC. The two bodies cooperate, and their SQL standards are identical.

Earlier releases of the Oracle database used an implementation of SQL that had some significant deviations from the standard. This was not because Oracle was being deliberately different: it was usually because Oracle implemented features that were ahead of the standard, and when the standard caught up, it used different syntax. An example is the outer join (detailed in Chapter 8), which Oracle implemented long before standard SQL; when standard SQL introduced an outer join, Oracle added support for the new join syntax while retaining support for its own proprietary syntax. Oracle Corporation ensures future compliance by inserting personnel onto the various ISO and ANSI committees and is now assisting with driving the SQL standard forward.
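To make the outer join example concrete, the sketch below shows both forms side by side. It runs only the standard syntax, using Python's sqlite3 module as a stand-in; the Oracle proprietary form with the (+) operator is held in a string for comparison and is not executed, since SQLite does not support it. The table layout and the unpublished author "A. N. Author" are assumptions made for the illustration.

```python
import sqlite3

# Oracle's older proprietary outer join, for comparison only (not executed here).
# The (+) marks the side that may have no matching row.
ORACLE_PROPRIETARY = """
    SELECT a.name, b.title
    FROM   authors a, books b
    WHERE  a.isbn = b.isbn (+)
"""

# The standard (ANSI/ISO) form of the same query.
STANDARD = """
    SELECT a.name, b.title
    FROM   authors a LEFT OUTER JOIN books b ON a.isbn = b.isbn
    ORDER  BY a.name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books   (isbn TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE authors (name TEXT NOT NULL, isbn TEXT);
    INSERT INTO books VALUES
        ('12345', 'Oracle 11g OCP SQL Fundamentals 1 Exam Guide'),
        ('67890', 'Oracle 11g New Features Exam Guide');
    INSERT INTO authors VALUES
        ('John Watson', '12345'),
        ('Sam Alapati', '67890'),
        ('A. N. Author', NULL);   -- hypothetical author with no book yet
""")

# The outer join keeps the author who has no matching book (title is NULL).
outer_rows = conn.execute(STANDARD).fetchall()

# An inner join would silently drop that author.
inner_count = conn.execute(
    "SELECT COUNT(*) FROM authors a JOIN books b ON a.isbn = b.isbn"
).fetchone()[0]
```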

SQL Commands

These are the 16 SQL commands, separated into commonly used groups:

Title: Oracle server technologies and the relational paradigm
Category: Exam Guide
