To create an external table, use the CREATE TABLE command with the keywords ORGANIZATION EXTERNAL. These tell Oracle that the table does not exist as a segment. Then specify the layout and location of the operating system file. For example,
create table new_dept
(deptno number(2),
dname varchar2(14),
loc varchar2(13))
organization external (
type oracle_loader
default directory jon_dir
access parameters
(records delimited by newline
badfile 'depts.bad'
discardfile 'depts.dsc'
logfile 'depts.log'
fields terminated by ','
missing field values are null)
location ('depts.txt'));
This command will create an external table that will be populated by the depts.txt file shown in the section “SQL*Loader” earlier in this chapter. The syntax for the
Figure 23-1  Managing directories with SQL*Plus
ACCESS PARAMETERS is virtually identical to the SQL*Loader controlfile syntax and
is used because the TYPE has been set to ORACLE_LOADER. The specification for the
DEFAULT DIRECTORY gives the Oracle directory where Oracle will look for the
source datafile, and where it will write the log and other files.
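A minimal sketch of the sort of directory management that Figure 23-1 depicts (the operating system path and the grantee SCOTT are purely illustrative):
create directory jon_dir as '/u01/app/oracle/load_files';
grant read, write on directory jon_dir to scott;
select directory_name, directory_path from dba_directories;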
External tables can be queried in exactly the same way as internal tables. Any SQL
involving a SELECT will function against an external table: they can be used in joins,
views, and subqueries. They cannot have indexes, constraints, or triggers.
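As an illustration, a hedged example of a query against the NEW_DEPT external table created above (the join to a conventional EMP table is hypothetical):
select d.dname, count(e.empno)
from   new_dept d left join emp e on e.deptno = d.deptno
group  by d.dname;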
Exercise 23-1: Use SQL*Loader and External Tables  In this exercise, you
will install and use SQL*Loader to insert data into a table, and also to generate the
CREATE TABLE script for an external table.
1. Connect to your database as user SYSTEM (in the examples, the SYSTEM
password is ORACLE) with SQL*Plus.
2. Create a table to use for the exercise:
create table names(first varchar2(10),last varchar2(10));
3. Using any editor that will create plain text files, create a file names.txt with
these values (or similar):
John,Watson
Roopesh,Ramklass
Sam,Alapati
4. Using the editor, create a controlfile names.ctl with these settings:
load data
infile 'names.txt'
badfile 'names.bad'
truncate
into table names
fields terminated by ','
trailing nullcols
(first,last)
This controlfile will truncate the target table before carrying out the insert.
5. From an operating system prompt, run SQL*Loader as follows:
sqlldr system/oracle control=names.ctl
6. Study the log file names.log that will have been generated.
7. With SQL*Plus, confirm that the rows have been inserted:
select * from names;
8. To generate a statement that will create an external table, you can use
SQL*Loader and an existing controlfile:
sqlldr userid=system/oracle control=names.ctl external_table=generate_only
9. This will have generated a CREATE TABLE statement in the log file
names.log, which will look something like this:
CREATE TABLE "SYS_SQLLDR_X_EXT_NAMES"
(
  "FIRST" VARCHAR2(10),
  "LAST" VARCHAR2(10)
)
ORGANIZATION external
(
  TYPE oracle_loader DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000 ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'names.bad' LOGFILE 'names.log_xt'
    READSIZE 1048576 FIELDS TERMINATED BY "," LDRTRIM
    MISSING FIELD VALUES ARE NULL REJECT ROWS WITH ALL NULL FIELDS
    (
      "FIRST" CHAR(255) TERMINATED BY ",",
      "LAST" CHAR(255) TERMINATED BY ","
    )
  )
  location ( 'names.txt' )
) REJECT LIMIT UNLIMITED
10. From your SQL*Plus session, create an Oracle directory pointing to the
operating system directory where your names.txt file is. For example,
create directory system_dmp as '/home/oracle';
11. Make any edits you wish to the command shown in Step 9. For example, you
might want to change the name of the table being created (“SYS_SQLLDR_X_EXT_NAMES” isn’t very useful) to something more meaningful. You will need
to change both the DEFAULT DIRECTORY and BADFILE settings to point to the directory created in Step 10.
12. Run the statement created in Step 11 from your SQL*Plus session.
13. Query the table with a few SELECT and DML statements. You will find that a
log file is generated for every SELECT, and that DML is not permitted.
14. Tidy up: delete the names.txt and names.ctl files; drop the tables; as SYS,
drop the directory.
Data Pump
In the normal course of events, SELECT and DML commands are used to extract data
from the database and to insert data into it, but there are occasions when you will need
a much faster method for bulk operations. For many reasons it may be desirable
to extract a large amount of data and the associated object definitions from a database
in a form that will allow it to be easily loaded into another. One obvious purpose for
extracting large amounts of data is for backups, but there are others, such as archiving
of historical data before deleting it from the live system, or to transfer data between
production and test environments, or between an online system and a data warehouse.
Data Pump (introduced with release 10g and enhanced with 11g) is a tool for large-scale,
high-speed data transfer between Oracle databases.
Data Pump Architecture
Data Pump is a server-side utility. You initiate Data Pump jobs from a user process,
either SQL*Plus or through Enterprise Manager, but all the work is done by server
processes. This improves performance dramatically over the old Export/Import utilities,
because the Data Pump processes running on the server have direct access to the
datafiles and the SGA; they do not have to go via a session. Also, it is possible to
launch a Data Pump job and then detach from it, leaving it running in the background.
You can reconnect to the job to monitor its progress at any time.
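As a sketch of this launch-and-reattach pattern (the job name HR_EXP and the use of the HR schema are illustrative; pressing CTRL-C in the expdp session detaches the client while the server processes continue):
expdp system/oracle schemas=hr directory=data_pump_dir
      dumpfile=hr.dmp job_name=hr_exp
Later, from any session, reattach to the running job:
expdp system/oracle attach=hr_exp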
There are a number of processes involved in a Data Pump job, two queues, a
number of files, and one table. First, the processes:
The user processes are expdp and impdp (for Unix) or expdp.exe and impdp.exe
(Windows). These are used to launch, control, and monitor Data Pump jobs.
Alternatively, there is an Enterprise Manager interface. The expdp or impdp user
process establishes a session against the database through a normal server process.
This session then issues commands to control and monitor Data Pump jobs. When
a Data Pump job is launched, at least two processes are started: a Data Pump Master
process (the DMnn) and one or more worker processes (named DWnn). If multiple
Data Pump jobs are running concurrently, each will have its own DMnn process, and
its own set of DWnn processes. As the name implies, the master process controls the
workers. If you have enabled parallelism, then each DWnn may make use of two or
more parallel execution servers (named Pnnn).
Two queues are created for each Data Pump job: a control queue and a status
queue. The DMnn divides up the work to be done and places individual tasks that
make up the job on the control queue. The worker processes pick up these tasks and
execute them, perhaps making use of parallel execution servers. This queue operates
on a deliver-exactly-once model: messages are enqueued by the DMnn and dequeued
by the worker that picks them up. The status queue is for monitoring purposes: the
DMnn places messages on it describing the state of the job. This queue operates on a
publish-and-subscribe model: any session (with appropriate privileges) can query the
queue to monitor the job’s progress.
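One convenient way to see which Data Pump jobs exist and what state they are in is to query the Data Pump dictionary views, for example:
select owner_name, job_name, operation, job_mode, state, attached_sessions
from   dba_datapump_jobs;
select * from dba_datapump_sessions;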
The files generated by Data Pump come in three forms: SQL files, dump files, and log
files. SQL files are DDL statements describing the objects included in the job. You can
choose to generate them (without any data) as an easy way of getting this information
out of the database, perhaps for documentation purposes or as a set of scripts to recreate
the database. Dump files contain the exported data. This is formatted with XML tags. The
use of XML means that there is a considerable overhead in dump files for describing the
data. A small table like the REGIONS table in the HR sample schema will generate a
94KB dump file; while this overhead may seem disproportionately large for a tiny
table like that, it becomes trivial for larger tables. The log files describe the history of the
job run.
EXAM TIP  Remember the three Data Pump file types: SQL files, log files, and
dump files.
Finally, there is the control table. This is created for you by the DMnn when you launch a job, and is used both to record the job’s progress and to describe it. It is included in the dump file as the final item of the job.
Directories and File Locations
Data Pump always uses Oracle directories. These are needed to locate the files that it will read or write, and its log files. One directory is all that is needed, but often a job will use several. If many gigabytes of data are to be written out in parallel to many files, you may want to spread the disk activity across directories in different file systems.
If a directory is not specified in the Data Pump command, there are defaults. Every
11g database will have an Oracle directory that can be used. This is named DATA_PUMP_DIR.
If the environment variable ORACLE_BASE has been set at database creation time,
the operating system location will be the ORACLE_BASE/admin/database_name/
dpdump directory. If ORACLE_BASE is not set, the directory will be ORACLE_HOME/
admin/database_name/dpdump (where database_name is the name of the database).
To identify the location in your database, query the view DBA_DIRECTORIES. However, the fact that this Oracle directory exists does not mean it can be used; any user wishing to use Data Pump will have to be granted read and/or write permissions on it first.
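For example, to confirm the default location and to allow a (hypothetical) user SCOTT to use it:
select directory_path from dba_directories
where  directory_name = 'DATA_PUMP_DIR';
grant read, write on directory data_pump_dir to scott;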
Specifying the directory (or directories) to use for a Data Pump job can be done at four levels. In decreasing order of precedence, these are
• A per-file setting within the Data Pump job
• A parameter applied to the whole Data Pump job
• The DATA_PUMP_DIR environment variable
• The DATA_PUMP_DIR directory object
So it is possible to control the location of every file explicitly, or a single Oracle
directory can be nominated for the job, or an environment variable can be used,
or, failing all of these, Data Pump will use the default directory. The environment
variable should be set on the client side but will be used on the server side. An
example of setting it on Unix is
DATA_PUMP_DIR=SCOTT_DIR; export DATA_PUMP_DIR
or on Windows:
set DATA_PUMP_DIR=SCOTT_DIR
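Returning to the top two levels of precedence, a hedged sketch (assuming Oracle directories named DP_DIR, DUMP_DIR2, and LOG_DIR already exist; the directory prefix on an individual file overrides the job-level DIRECTORY parameter):
expdp system/oracle schemas=hr directory=dp_dir
      dumpfile=dump_dir2:hr_%U.dmp logfile=log_dir:hr_exp.log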
Direct Path or External Table Path?
Data Pump has two methods for loading and unloading data: the direct path and the
external table path. The direct path bypasses the database buffer cache. For a direct
path export, Data Pump reads the datafile blocks directly from disk, extracts and
formats the content, and writes it out as a dump file. For a direct path import, Data
Pump reads the dump file, uses its content to assemble blocks of table data, and
writes them directly to the datafiles. The write is above the “high water mark” of the
table, with the same benefits as those described earlier for a SQL*Loader direct load.
The external table path uses the database buffer cache. Even though Data Pump is
manipulating files that are external to the database, it uses the database buffer cache
as though it were reading and writing an internal table. For an export, Data Pump
reads blocks from the datafiles into the cache through a normal SELECT process.
From there, it formats the data for output to a dump file. During an import, Data
Pump constructs standard INSERT statements from the content of the dump file and
executes them by reading blocks from the datafiles into the cache, where the INSERT
is carried out in the normal fashion. As far as the database is concerned, external table
Data Pump jobs look like absolutely ordinary (though perhaps rather large) SELECT
or INSERT operations. Both undo and redo are generated, as they would be for any
normal DML statement. Your end users may well complain while these jobs are in
progress. Commit processing is absolutely normal.
So what determines whether Data Pump uses the direct path or the external table
path? You as DBA have no control; Data Pump itself makes the decision based on the
complexity of the objects. Only simple structures, such as heap tables without active
triggers, can be processed through the direct path; more complex objects such as
clustered tables force Data Pump to use the external table path, because it requires
interaction with the SGA in order to resolve the complexities. In either case, the dump
file generated is identical.
EXAM TIP  The external table path insert uses a regular commit, like any
other DML statement. A direct path insert does not use a commit; it simply
shifts the high water mark of the table to include the newly written blocks.
Data Pump files generated by either path are identical.
Using Data Pump Export and Import
Data Pump is commonly used for extracting large amounts of data from one database
and inserting it into another, but it can also be used to extract other information such
as PL/SQL code or various object definitions. There are several interfaces: command-line
utilities, Enterprise Manager, and a PL/SQL API. Whatever purpose and technique
are used, the files are always in the Data Pump proprietary format. It is not possible to
read a Data Pump file with any tool other than Data Pump.
Capabilities
Fine-grained object and data selection facilities mean that Data Pump can export either
the complete database or any part of it. It is possible to export table definitions with
or without their rows; PL/SQL objects; views; sequences; or any other object type. If
exporting a table, it is possible to apply a WHERE clause to restrict the rows exported
(though this may make direct path impossible) or to instruct Data Pump to export a
random sample of the table expressed as a percentage.
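As an illustration of these two options (the HR examples are hypothetical; on most shells the quotation marks would need escaping, or the parameters could be placed in a parameter file):
expdp hr/hr tables=employees directory=data_pump_dir
      dumpfile=emp_dept50.dmp query=employees:"where department_id=50"
expdp hr/hr tables=employees directory=data_pump_dir
      dumpfile=emp_10pct.dmp sample=10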
Parallel processing can speed up Data Pump operations. Parallelism can come at two levels: the number of Data Pump worker processes, and the number of parallel execution servers each worker process uses.
An estimate facility can calculate the space needed for a Data Pump export, without actually running the job.
The Network Mode allows transfer of a Data Pump data set from one database to another without ever staging it on disk. This is implemented by a Data Pump export job on the source database writing the data over a database link to the target database, where a Data Pump import job reads the data from the database link and inserts it.
Remapping facilities mean that objects can be renamed or transferred from one schema to another and (in the case of data objects) moved from one tablespace to another as they are imported.
When exporting data, the output files can be compressed and encrypted.
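Hedged sketches of the estimate and network mode facilities (PROD_LINK is an assumed database link from the importing database to the source database):
expdp system/oracle schemas=hr estimate_only=yes estimate=blocks
impdp system/oracle network_link=prod_link schemas=hr remap_schema=hr:hr_test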
Using Data Pump with the Command-Line Utilities
The executables expdp and impdp are installed into the ORACLE_HOME/bin
directory. Following are several examples of using them. Note that in all cases the
command must be a single one-line command; the line breaks are purely for
readability.
To export the entire database,
expdp system/manager@orcl11g full=y
parallel=4
dumpfile=datadir1:full1_%U.dmp,
datadir2:full2_%U.dmp,
datadir3:full3_%U.dmp,
datadir4:full4_%U.dmp
filesize=2G
compression=all
This command will connect to the database as user SYSTEM and launch a full Data Pump export, using four worker processes working in parallel. Each worker will generate its own set of dump files, uniquely named according to the %U template, which generates strings of eight unique characters. Each worker will break up its output into files of 2GB (perhaps because of underlying file system restrictions) of compressed data.
A corresponding import job (which assumes that the files generated by the export have all been placed in one directory) would be
impdp system/manager@dev11g full=y
directory=data_dir
parallel=4
dumpfile=full1_%U.dmp,full2_%U.dmp,full3_%U.dmp,full4_%U.dmp
This command makes a selective export of the PL/SQL objects belonging to two schemas:
expdp system/manager schemas=hr,oe
include=procedure,function,package
directory=code_archive
dumpfile=hr_oe_code.dmp
This command will extract everything from a Data Pump export that was in the
HR schema, and import it into the DEV schema:
impdp system/manager
directory=usr_data
dumpfile=usr_dat.dmp
schemas=hr
remap_schema=hr:dev
Using Data Pump with Database Control
The Database Control interface to Data Pump generates the API calls that are invoked
by the expdp and impdp utilities, but unlike the utilities it makes it possible to see
the scripts and, if desired, copy, save, and edit them. To reach the Data Pump facilities,
from the database home page select the Data Movement tab. In the Move Row Data
section, there are four links that will launch wizards:
• Export to Export Files Define Data Pump export jobs.
• Import from Export Files Define Data Pump import jobs.
• Import from Database Define a Data Pump network mode import.
• Monitor Export and Import Jobs Attach to running jobs to observe their
progress, to pause or restart them, or to modify their operation.
The final stage of each wizard gives the option to see the PL/SQL code that is being
generated. The job is run by the Enterprise Manager job system, either immediately or
according to a schedule. Figure 23-2 shows this final step of scheduling a simple export
job of the HR.REGIONS table.
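The code the wizard generates is built on the DBMS_DATAPUMP PL/SQL API mentioned earlier. A minimal hand-written sketch along similar lines, exporting HR.REGIONS through the SYSTEM_DMP directory created in Exercise 23-1 (the job name and file name are illustrative):
declare
  h number;
begin
  -- define a table-mode export job
  h := dbms_datapump.open(operation => 'EXPORT', job_mode => 'TABLE',
                          job_name  => 'REGIONS_EXP');
  -- nominate the dump file and the Oracle directory to write it to
  dbms_datapump.add_file(handle => h, filename => 'regions.dmp',
                         directory => 'SYSTEM_DMP');
  -- restrict the job to the HR.REGIONS table
  dbms_datapump.metadata_filter(handle => h, name => 'SCHEMA_EXPR',
                                value  => 'IN (''HR'')');
  dbms_datapump.metadata_filter(handle => h, name => 'NAME_EXPR',
                                value  => 'IN (''REGIONS'')');
  -- start the job and detach, leaving it running in the background
  dbms_datapump.start_job(h);
  dbms_datapump.detach(h);
end;
/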
Figure 23-2  The final step of the Database Control Data Pump Export Wizard
Exercise 23-2: Perform a Data Pump Export and Import  In this exercise, you will carry out a Data Pump export and import using Database Control.
1. Connect to your database as user SYSTEM with SQL*Plus, and create a table
to use for the exercise:
create table ex232 as select * from all_users;
2. Connect to your database as user SYSTEM with Database Control. Navigate
to the Export Wizard: select the Data Movement tab from the database home page, then the Export To Export Files link in the Move Row Data section.
3. Select the radio button for Tables. Enter your operating system username and password for host credentials (if these have not already been saved as preferred credentials) and click CONTINUE.
4. In the Export: Tables window, click ADD and find the table SYSTEM.EX232. Click NEXT.
5. In the Export: Export Options window, select the directory SYSTEM_DMP (created in Exercise 23-1) as the Directory Object for Optional Files. Click NEXT.
6. In the Export: Files window, choose the directory SYSTEM_DMP and click NEXT.
7. In the Export: Schedule window, give the job a name and click NEXT to run the job immediately.
8. In the Review window, click SUBMIT JOB.
9. When the job has completed, study the log file that will have been created
in the operating system directory mapped onto the Oracle directory SYSTEM_DMP. Note the name of the Data Pump file EXPDAT01.DMP produced in the directory.
10. Connect to the database with SQL*Plus, and drop the table:
drop table system.ex232;
11. In Database Control, select the Data Movement tab from the database home
page, then the Import from Export Files link in the Move Row Data section.
12. In the Import: Files window, select your directory and enter the filename
noted in Step 9. Select the radio button for Tables. Enter your operating system username and password for host credentials (if these have not already been saved as preferred credentials) and click CONTINUE.
13. In the Import: Tables window, click ADD. Search for and select the SYSTEM.EX232 table. Click SELECT and NEXT.
14. In the Import: Re-Mapping window, click NEXT.
15. In the Import: Options window, click NEXT.
16. In the Import: Schedule window, give the job a name and click NEXT.
17. In the Import: Review window, click SUBMIT JOB.
18. When the job has completed, confirm that the table has been imported by
querying it from your SQL*Plus session.
Tablespace Export and Import
A variation on Data Pump export/import is the tablespace transport capability. This is a
facility whereby entire tablespaces and their contents can be copied from one database
to another. This is the routine (a command-line sketch follows the list):
1. Make the source tablespace(s) read only.
2. Use Data Pump to export the metadata describing the tablespace(s) and the
contents.
3. Copy the datafile(s) and Data Pump export file to the destination system.
4. Use Data Pump to import the metadata.
5. Make the tablespace(s) read-write on both source and destination.
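A minimal sketch of this routine, assuming a tablespace named EX_TBS and an illustrative datafile path (the ALTER TABLESPACE commands run in SQL*Plus, the Data Pump commands at the operating system prompt):
alter tablespace ex_tbs read only;
expdp system/oracle transport_tablespaces=ex_tbs
      directory=data_pump_dir dumpfile=ex_tbs.dmp
Copy the datafile(s) and ex_tbs.dmp to the destination system, then import the metadata there:
impdp system/oracle transport_datafiles='/u01/oradata/dev/ex_tbs01.dbf'
      directory=data_pump_dir dumpfile=ex_tbs.dmp
alter tablespace ex_tbs read write;   -- on both source and destination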
An additional step that may be required when transporting tablespaces from one
platform to another is to convert the endian format of the data. A big-endian platform
(such as Solaris on SPARC chips) stores a multibyte value such as a 16-bit integer with
the most significant byte first. A little-endian platform (such as Windows on Intel
chips) stores the least significant byte first. To transport tablespaces across platforms
with a different endian format requires converting the datafiles: you do this with the
RMAN command CONVERT.
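For example, a hedged sketch of converting a tablespace on a big-endian source before shipping it to a little-endian Windows destination (the tablespace name and staging path are illustrative):
RMAN> convert tablespace ex_tbs
      to platform 'Microsoft Windows IA (32-bit)'
      format '/u01/stage/%U';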
To determine the platform on which a database is running, query the column
PLATFORM_NAME in V$DATABASE. Then, to see the list of currently supported
platforms and their endianness, query the view V$TRANSPORTABLE_PLATFORM:
orcl > select * from v$transportable_platform order by platform_name;
PLATFORM_ID PLATFORM_NAME                        ENDIAN_FORMAT
----------- ------------------------------------ --------------
          6 AIX-Based Systems (64-bit)           Big
         16 Apple Mac OS                         Big
         19 HP IA Open VMS                       Little
         15 HP Open VMS                          Little
          5 HP Tru64 UNIX                        Little
          3 HP-UX (64-bit)                       Big
          4 HP-UX IA (64-bit)                    Big
         18 IBM Power Based Linux                Big
          9 IBM zSeries Based Linux              Big
         13 Linux 64-bit for AMD                 Little
         10 Linux IA (32-bit)                    Little
         11 Linux IA (64-bit)                    Little
         12 Microsoft Windows 64-bit for AMD     Little
          7 Microsoft Windows IA (32-bit)        Little
          8 Microsoft Windows IA (64-bit)        Little
         20 Solaris Operating System (AMD64)     Little
         17 Solaris Operating System (x86)       Little
          1 Solaris[tm] OE (32-bit)              Big
          2 Solaris[tm] OE (64-bit)              Big

19 rows selected.
Database Control has a wizard that takes you through the entire process of
transporting a tablespace (or several). From the database home page, select the Data
Movement tab and then the Transport Tablespaces link in the Move Database Files