Types of Output FilesGenerally speaking, there are four types of output files you can produce when extracting data with SQL*Plus: Delimited columns Fixed-width columns DML Data Manipulat
Trang 1Types of Output Files
Generally speaking, there are four types of output files you can produce when extracting data with SQL*Plus:
Delimited columns
Fixed-width columns
DML (Data Manipulation Language)
DDL (Data Definition Language)
There may be variations on these typesdelimited files, for example, could be either tab-delimited or comma-delimited, and you may be able to dream up some novel format, but, generally speaking, these are the most useful
Delimited Files
Delimited files use a special text character to separate each data value in a record Typically the delimiter is either a tab
or a comma, but any character may be used Here's an example of a comma-delimited file containing employee
information (ID, rate, hire date, and name):
1,135, 15-Nov-1961, Jonathan Gennick
2,95, 10-Jan-1991, Bohdan Khmelnytsky
3,200, 17-Jul-1984, Ivan Mazepa
This example illustrates a very commonly used format called the CSV (Comma Separated Values) format
CSV-formatted files use commas to delimit the values, and they enclose text fields within quotes The CSV format is
recognized by most spreadsheets and desktop databases, and is the format for the examples in this chapter
Fixed-Width Files
A fixed-width file contains data in columns, where each column is a certain width, and all values in that column are the same width Here's an example of the same employee data shown earlier, but formatted into fixed-width columns:
00113515-Nov-1961Jonathan Gennick
00209510-Jan-1991Bohdan Khmelnytsky
00320017-Jul-1984Ivan Mazepa
In the above example, the columns abut each other with no space in between If you aren't trying to match an existing file layout, you may prefer to allow at least one space between columns to aid readability
Trang 2
DML Files.
A DML file contains Data Manipulation Language statements, such as INSERT, DELETE, UPDATE, and SELECT This type
of file can be used as a quick and dirty way of extracting data from one database for insertion into another Here, for example,
is a file of INSERT statements that recreate some employee data:
INSERT INTO employee
(employee_ID,employee_billing_rate,employee_hire_date,employe e_name)
VALUES (1,135,TO_DATE(15-Nov-1961, DD-MON-YYYY), Jonathan Gennick);
INSERT INTO employee
(employee_ID,employee_billing_rate,employee_hire_date,employee_name)
VALUES (2,95, TO_DATE (10-Jan-1991, DD-MON-YYYY), Bohdan Khmelnytsky);
INSERT INTO employee
(employee_ID, employee_billing_rate, employee_hire_date, employee_name)
VALUES (3,200,TO_DATE(17-Jul-1984,DD-MON-YYYY),Ivan Mazepa);
You can generate these INSERT statements, based on existing data, using SQL*Plus and SQL Then you can easily apply those inserts to another database This may not seem to be the most efficient way of moving data around, but if you have a low volume of data, such as a few dozen records that you want to send off to a client, it works very well
DDL Files
A DDL file contains Data Definition Language statements It's not much different from a DML file, except that the goal is to modify your database rather than to extract data for another application Say, for example, that you need to create public
synonyms for all your tables You could use a SQL query to generate the needed CREATE PUBLIC SYNONYM statements, spool those to a file, and then execute that file You will find a brief example showing how to do this later in this chapter
Chapter 7, Advanced Scripting, explores this subject in greater depth.
Limitations of SQL*Plus
When using SQL*Plus to extract data, there are some limitations to keep in mind Because SQL*Plus was designed as a
reporting tool and not a data extraction tool, the output must be text If you need to write a file containing packed-decimal data
or binary data, SQL*Plus is not the tool to use
A second SQL*Plus limitation you may encounter when extracting data is the linesize This maximum output line supported
by SQL*Plus varies from one platform to the next, but under Windows NT/95 it is 32,767 characters
The final thing to keep in mind are the datatypes you can extract SQL*Plus was really designed to work with the traditional scalar datatypes, such as NUMBER,
Trang 3
VARCHAR2, etc It can't be used to extract large object types, such as BLOBs and CLOBs, nor can it handle other large, complex data types, such as VARRAYs and Oracle8 objects Here is a list of the datatypes you can easily extract using SQL*Plus:
VARCHAR2
NVARCHAR2
NUMBER
DATE
ROWID
CHAR
NCHAR
These types can be converted to character strings, and you can easily concatenate those strings together to produce whatever output format you want, whether that be comma-delimited, tab-delimited, or fixed-width Text is the key to extracting data with SQL*Plus Think text If you have nonalphanumeric data to extract, you should use a different tool Extracting the Data
To write a script to extract data from Oracle and place it in a flat file, follow these steps:
1 Formulate the query
2 Format the data
3 Spool the extract to a file
4 Make the script user-friendly
The last step, making the script user-friendly, isn't really necessary for a one-off effort However, if it's an extraction you are going to perform often, it's worth taking a bit of time to make it easy and convenient to use
Formulate the Query
The very first step in extracting data is to figure out just what data you need to extract You need to develop a SQL query that will return the data you need For the example in this chapter, we will just extract the employee data, so the query will look like this:
SELECT employee_id,
employee_name,
employee_hire_date,
employee_termination_date,
employee_billing_rate
FROM employee;
Trang 4
Keep in mind that you can write queries that are much more complicated than those shown here If necessary, you can join several tables together, or you can UNION several queries together
Format the Data
The next step, once you have your query worked out, is to format the data to be extracted The way to do this is to modify your query so it returns a single, long expression that combines the columns together in the format that you want
in your output file It's often necessary to include text literals in the SELECT statement as part of this calculation For example, if you want to produce a comma-delimited file, you will need to include those commas in your SELECT statement
Be sure to keep in mind the ultimate destination of the data If your purpose is to pull data for someone to load into a spreadsheet, you will probably want to use a comma-delimited format If you are passing data to another application, you may find it easier to format the data in fixed-width columns Dates require some extra thought With Oracle's
built-in TO_CHAR function, you can format a date any way you want Be sure, though, to use a format easily recognized by the application that needs to read that date
Comma-delimited
To produce a comma-delimited text file, you need to do two things First, you need to add commas between each field Second, you need to enclose text fields within quotes The final query looks like this:
SELECT TO_CHAR(employee_id) ¦¦ ,
¦¦ ¦¦ employee_name ¦¦ ,
¦¦TO_CHAR(employee_hire_date,MM/DD/YYYY) ¦¦ ,
¦¦ TO_CHAR(employee_termination_date, MM/DD/YYYY) ¦¦ ,
¦¦ TO_CHAR(employee_billing_rate)
FROM employee;
Oracle's TO_CHAR function has been used to explicitly convert numeric fields to text strings TO_CHAR is also used
to convert date fields to text, and a date format string is included to get the dates into MM/DD/YYYY format SQL's concatenation operator, ¦¦, is used to concatenate all the fields together into one long string, and you can see that
commas are included between fields The output from this query will look like this:
107, Bohdan Khmelnytsky,01/02/1998,,45
108, Pavlo Chubynsky,03/01/1994,11/15/1998,220
110, Ivan Mazepa,04/04/1998,09/30/1998,84
111, Taras Shevchenko,08/23/1976,,100
Trang 5
Notice that in addition to the commas, the employee_name field has been enclosed in quotes This is done to
accommodate the possibility that someone's name will contain a comma Most commercial programs that load comma-delimited data will allow text strings to be optionally enclosed in quotes
You can use the same technique to generate tab-delimited data Instead of a comma, use CHR(9) to put a tab character between each field CHR() is an Oracle SQL function that converts an ASCII code into a character The ASCII code uses the value 9 to represent a tab character
Fixed-width
The easiest way to produce an output file with fixed-width columns is to use the SQL*Plus COLUMN command to format the output from a standard SQL query The following example shows one way to dump the employee data in a fixed-width column format:
COLUMN employee_id FORMAT 09999 HEADING
COLUMN employee_name FORMAT A20 HEADING TRUNCATED
COLUMN employee_hire_date FORMAT A10 HEADING
COLUMN employee_termination_date FORMAT A10 HEADING
COLUMN employee_billing_rate FORMAT S099.99 HEADING
SELECT employee_id,
employee_name,
TO_CHAR(employee_hire_date,MM/DD/YYYY) employee_hire_date,
TO_CHAR(employee_termination_date, MM/DD/YYYY)
employee_termination_date,
employee_billing_rate
FROM employee;
Notice some things about the example:
The heading for each column has explicitly been set to a null string That's because, in the case of a numeric column, SQL*Plus will make the column wide enough to accommodate the heading You don't want that behavior You want the format specification to control the column width
Both numeric fields have been formatted to show leading zeros Most programs that read fixed-width data, such as COBOL programs, will expect this
The employee_billing_rate column will always display a leading sign character, either a + or a - The employee_id field,
on the other hand, displays a leading sign only for negative values Positive employee_id values are output with a
leading space Neither field would ever be negative, and these two approaches were taken just for illustrative purposes Finally, notice that the TRUNCATED option was used to format the employee_name field That's because the
employee_name can be up to 40 characters long in the database If TRUNCATED were not specified, any names that
Trang 6
pened to be longer than 20 characters would wrap to a second linedefinitely not what you want to happen in a flat file Here's how the output from the example will look:
00107 Bohdan Khmelnytsky 01/02/1998 +045.00
00108 Pavlo Chubynsky 03/01/1994 11/15/1998 +220.00
00110 Ivan Mazepa 04/04/1998 09/30/1998 +084.00
00111 Taras Shevchenko 08/23/1976 +100.00
Each column in the output is separated by one space, because that's the SQL*Plus default If you like, you can use the SET COLSEP command to change the number of spaces or eliminate them entirely To run the columns together, you could eliminate the space between columns by setting the column separator to a null string For example:
SET COLSEP
Now the output will look like this:
00107Bohdan Khmelnytsky 01/02/1998 +045.00
00108Pavlo Chubynsky 03/01/199411/15/1998+220.00
00110Ivan Mazepa 04/04/199809/30/1998+084.00
00111Taras Shevchenko 08/23/1976 +100.00
You can use any column separation string that you like, and you aren't limited to just one character
The command SET SPACE 0 has the same effect as SET COLSEP it eliminates the space between columns Oracle considers SET SPACE to be an obsolete command, but
some people still use it
Using SQL*Plus to format fixed-width output works best when you have some control over the format expected by the destination of that data If you are also writing the program to load the data somewhere else, of course you can code it to match what you can easily produce with SQL*Plus Sometimes though, you need to match an existing format required
by the destination; one you cannot change Depending on your exact requirements, it may be easier to code one large expression in your SQL statement, and use Oracle's built-in functions to gain more control over the output The
following example produces the same output as before, but without a leading sign character in the numeric fields:
SELECT LTRIM(TO_CHAR(employee_id,09999))
¦¦ SUBSTR(RPAD(employee_name,20, ),1,20)
Trang 7
¦¦ TO_CHAR(employee_hire_date, MM/DD/YYYY)
¦¦ NVL(TO_CHAR(employee_termination_date, MM/DD/YYYY), ) ¦¦ LTRIM(TO_CHAR(employee_billing_rate, 099.99))
FROM employee;
The output from this example looks like this:
00107Bohdan Khmelnytsky 01/02/1998 045.00
00108Pavlo Chubynsky 03/01/199411/15/1998220.00
00110Ivan Mazepa 04/04/199809/30/1998084.00
00111Taras Shevchenko 08/23/1976 100.00
Look carefully, and you will see that neither the employee_id field nor the employee_billing_rate field has a leading sign Not only do they not have a leading sign, there isn't even space left for one The LTRIM function was used to remove it
There is a wide variety of built-in Oracle functions available Add to that the ability to write your own, and you should
be able to generate output in any conceivable format
DML
If you are extracting data from Oracle in order to move it to another database, and if the volume of data isn't too high, you can use SQL*Plus to generate a file of INSERT statements Here is a query to generate INSERT statements for each employee record:
SELECT INSERT INTO employee ¦¦ chr(10)
¦¦ (employee_id, employee_name, employee_hire_date ¦¦ chr(10)
¦¦ ,employee_termination_date, employee_billing_rate) ¦¦ chr(10)
¦¦ VALUES ( ¦¦ TO_CHAR(employee_id) ¦¦ , ¦¦ chr(10)
¦¦ ¦¦ employee_name ¦¦ , ¦¦ chr(10)
¦¦ TO_DATE(
¦¦ TO_CHAR(employee_hire_date,MM/DD/YYYY)
¦¦ ,MM/DD/YYYY), ¦¦ chr(10)
¦¦ TO_DATE (
¦¦ TO_CHAR(employee_termination_date,MM/DD/YYYY)
¦¦ ,MM/DD/YYYY ), ¦¦ chr(10)
¦¦ ¦¦ TO_CHAR(employee_billing_rate) ¦¦ );
FROM employee;
As you can see, this type of query can get a bit hairy You have to deal with nested, quoted strings; you have to
concatenate everything together; and you have to place line breaks so the output at least looks decent The doubled-up quotes you see in the above statement are there because single quotes are required in the final output So, for example, the string MM/DD/YYYY ) will resolve to MM/DD/YYYY) when the SELECT statement is executed The SELECT statement just shown will produce the following INSERT statement for each employee record:
Trang 8
INSERT INTO employee
(employee_id, employee_name, employee_hire_date,
,employee_termination_date, employee_billing_rate)
VALUES (111,
Taras Shevcheriko,
TO_DATE( 08/23/1976, MM/DD/YYYY),
TO_DATE(, MM/DD/YYYY),
100);
This is not a technique I use very often, because it can be frustrating to get the SQL just right I use it most often on code tables and other small tables with only two or three columns I also use it sometimes when I'm sending data to a client That way I just send one file, and my client doesn't have to mess with SQL*Loader or Oracle's Import utility
DDL
Another twist on using SQL to write SQL is to generate commands that help you maintain your database Such
commands are referred to as Data Definition Language, or DDL, commands Using SQL*Plus to generate DDL scripts can help in automating many database administration tasks, and is often well worth the effort The following command, for example, generates CREATE PUBLIC SYNONYM commands for each table you own:
SELECT CREATE PUBLIC SYNONYM ¦¦ table_name
¦¦ for ¦¦ user ¦¦ , ¦¦ table_name ¦¦ ;
FROM USER_TABLES;
It's not unusual to need public synonyms, and if you have a large number of tables, you can save yourself a lot of typing
by letting SQL*Plus do the work for you The output from the above command looks like this:
CREATE PUBLIC SYNONYM EMPLOYEE for JONATHAN EMPLOYEE;
CREATE PUBLIC SYNONYM PROJECT for JONATHAN PROJECT;
CREATE PUBLIC SYNONYM PROJECT_HOURS for JONATHAN PROJECT_HOURS;
Once you have spooled the above commands to a file, you can then execute that file to create the synonyms In addition
to one-off tasks like creating synonyms, you can use SQL*Plus to generate DDL commands for use by ongoing
maintenance tasks Going beyond that, you can even use SQL*Plus to generate operating-system script files
Spool the Extract to a File
Once you have your query worked out and the data formatted the way you need it, it's time to spool your output to a file
In order to get a clean file, there are four things you must do
Trang 9
1 Set the linesize large enough to accommodate the longest possible line Pay close attention to this if you are
generating comma-delimited data You need to allow for the case where each field is at its maximum size Use the SET LINESIZE command for this
2 Turn off all pagination features You can use SET PAGESIZE 0 for this purpose It turns off all column headings, page headings, page footers, page breaks, etc
3 Turn feedback off with the SET FEEDBACK OFF command
4 Use the SET TRIMSPOOL ON command to eliminate trailing spaces in the output data file Use this for
comma-delimited output and when you generate a file of SQL statements Do not use this command if you are generating a file
with fixed-width columns
The following script generates a clean, comma-delimited file containing employee information:
This script extracts data from the employee
table and writes it to a text file in
a comma-delimited format
Set the linesize to accommodate the longest possible line
SET LINESIZE 136
Turn off all page headings, column headings, etc
SET PAGESIZE 0
Turn off feedback
SET FEEDBACK OFF
Eliminate trailing blanks at the end of a line
SET TRIMSPOOL ON
SET TERMOUT OFF
SPOOL C:\A\EMP_DATA.CSV
SELECT TO_CHAR(employee_id) ¦¦ ,
¦¦ ¦¦ employee_name ¦¦ ,
¦¦ TO_CHAR(employee_hire_date, MM/DD/YYYY) ¦¦ ,
¦¦ TO_CHAR(employee_termination_date, MM/DD/YYYY) ¦¦ ,
¦¦ TO_CHAR (employee_billing_rate)
FROM employee;
SPOOL OFF
Restore the default settings
SET LIMESIZE 80
SET PAGESIZE 24
SET FEEDBACK ON
SET TERMOUT ON
Trang 10
The SPOOL command is used to send the output to C:\A\EMP_DATA.CSV, and SET TERMOUT OFF is used to
disable the display while the data is being written to the file Here's how you run this script:
SQL> @C\EMPLOYEE_EXTRACT
SQL>
The resulting output file looks like this:
101,Jonathan Gennick,11/15/1961,,169
102,Jenny Gennick,09/16/1964,05/05/1998,135
104,Jeff Gennick,12/29/1987,04/01/1998,99
105,Horace Walker,06/15/1998,,121
107,Bohdan Khmelnytsky,01/02/1998,,45
108,Pavlo Chubynsky,03/01/1994,11/15/1998,220
110,Ivan Mazepa,04/04/1998,09/30/1998,84
111,Taras Shevchenko,08/23/1976,,100
112,Hermon Goche,11/15/1961,04/04/1998,70
113,Jacob Marley,03/03/1998,10/31/1998,300
211,Taras Shevchenko,08/23/1976,,100
You can see that the output contains no page or column headings of any kind It's just plain, comma-delimited data, with one line for each record in the table
Make Your Extract Script User-Friendly
There are at least two things you can do to improve on the extract script just shown in the previous section First, it might be nice to display a brief message to remind the user of what the script does This will serve to give the user confidence that he or she has indeed started the correct script The following PROMPT commands, added at the
beginning of the script, should serve the purpose:
PROMPT
PROMPT This script creates a comma-delimited text file containing
PROMPT employee data All records from the employee table will
PROMPT be output to this file
PROMPT
It may not be much, but it always makes me feel better to have some indication that the script I am executing really does what I thought it did For scripts that change data or do something difficult to reverse, I will often include a PAUSE command such as this:
PAUSE Press ENTER to continue, or ctrl-C to abort
Another nice thing to do would be to prompt for the output filename The following ACCEPT command will prompt the user for a filename, and put the user's response into the variable named output_file:
ACCEPT output_file CHAR PROMPT Enter the output filename >