SQL VISUAL QUICKSTART GUIDE


Listing 15.4 defines the sequence shown in Figure 15.4. You can use a sequence generator in a few ways. The SQL standard provides the built-in function NEXT VALUE FOR to increment a sequence value, as in:

INSERT INTO shipment(
  part_num,
  desc,
  quantity)
VALUES(
  NEXT VALUE FOR part_seq,
  'motherboard',
  5);

If you’re creating a column of unique values, you can use the keyword IDENTITY to define a sequence right in the CREATE TABLE statement:

CREATE TABLE parts (
  part_num INTEGER GENERATED ALWAYS AS
    IDENTITY(INCREMENT BY 1
    MINVALUE 1 MAXVALUE 10000 START WITH 1
    NO CYCLE),
  desc VARCHAR(100),
  quantity INTEGER);

This table definition lets you omit NEXT VALUE FOR when you insert a row:

INSERT INTO parts(
  desc,
  quantity)
VALUES(
  'motherboard',
  5);
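The effect of an identity column can be tried out in a system that lacks the IDENTITY keyword. The following is a minimal sketch, assuming SQLite (through Python’s built-in sqlite3 module) as a stand-in DBMS: an INTEGER PRIMARY KEY AUTOINCREMENT column plays the role of the identity column, the MINVALUE, MAXVALUE, and NO CYCLE options have no counterpart here, and the column name descr and the second part are illustrative assumptions.

```python
import sqlite3


def insert_parts(descriptions):
    """Insert rows without supplying part_num; return the stored rows."""
    conn = sqlite3.connect(":memory:")
    # Assumption: SQLite stands in for a DBMS with IDENTITY columns.
    # INTEGER PRIMARY KEY AUTOINCREMENT assigns 1, 2, 3, ... automatically.
    conn.execute(
        "CREATE TABLE parts ("
        " part_num INTEGER PRIMARY KEY AUTOINCREMENT,"
        " descr VARCHAR(100),"
        " quantity INTEGER)"
    )
    # part_num is omitted from the column list, as in the INSERT above.
    conn.executemany(
        "INSERT INTO parts (descr, quantity) VALUES (?, ?)",
        [(d, 5) for d in descriptions],
    )
    rows = conn.execute(
        "SELECT part_num, descr, quantity FROM parts ORDER BY part_num"
    ).fetchall()
    conn.close()
    return rows


rows = insert_parts(["motherboard", "power supply"])
```

Each inserted row receives the next integer in part_num even though the INSERT statement never mentions that column.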

SQL also provides ALTER SEQUENCE and DROP SEQUENCE to change and remove sequence generators.

Listing 15.4 Create a sequence generator for the consecutive integers 1 to 10,000. See Figure 15.4 for the result.

CREATE SEQUENCE part_seq
  INCREMENT BY 1
  MINVALUE 1
  MAXVALUE 10000
  START WITH 1
  NO CYCLE;

1
2
3
...
9998
9999
10000

Figure 15.4 The sequence that Listing 15.4 generates.

✔ Tip

■ Oracle, DB2, and PostgreSQL support CREATE SEQUENCE, ALTER SEQUENCE, and DROP SEQUENCE. In Oracle, use NOCYCLE instead of NO CYCLE. Check your DBMS documentation to see how sequences are used in your system. Most DBMSs don’t support IDENTITY columns because they have other (pre-SQL:2003) ways to define columns with unique values. See Table 3.18 in “Unique Identifiers” in Chapter 3. PostgreSQL’s generate_series() function offers a quick way to generate numbered rows.


A one-column table containing a sequence of consecutive integers makes it easy to solve problems that would otherwise be difficult with SQL’s limited computational power. Sequence tables aren’t really part of the data model; they’re auxiliary tables that are adjuncts to queries and other “real” tables. You can create a sequence table by using one of the methods just described. Alternatively, you can create one by using Listing 15.5, which creates the sequence table seq by cross-joining the intermediate table temp09 with itself. The CAST expression concatenates digit characters into sequential numbers and then casts them as integers. You can drop temp09 after seq is created. Figure 15.5 shows the result. The table seq contains the integer sequence 0, 1, 2, …, 9999. You can shrink or grow this sequence by changing the SELECT and FROM expressions in the INSERT INTO seq statement.

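Listing 15.5 Create a one-column table that contains consecutive integers. See Figure 15.5 for the result.

CREATE TABLE temp09 (
  i CHAR(1) NOT NULL PRIMARY KEY
  );

INSERT INTO temp09 VALUES('0');
INSERT INTO temp09 VALUES('1');
INSERT INTO temp09 VALUES('2');
INSERT INTO temp09 VALUES('3');
INSERT INTO temp09 VALUES('4');
INSERT INTO temp09 VALUES('5');
INSERT INTO temp09 VALUES('6');
INSERT INTO temp09 VALUES('7');
INSERT INTO temp09 VALUES('8');
INSERT INTO temp09 VALUES('9');

CREATE TABLE seq (
  i INTEGER NOT NULL PRIMARY KEY
  );

INSERT INTO seq
  SELECT CAST(t1.i || t2.i ||
      t3.i || t4.i AS INTEGER)
    FROM temp09 t1, temp09 t2,
      temp09 t3, temp09 t4;

DROP TABLE temp09;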

i
----
0
1
2
3
4
...
9996
9997
9998
9999

Figure 15.5 Result of Listing 15.5.
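Listing 15.5 can be checked quickly without a database server. The sketch below assumes SQLite, driven through Python’s built-in sqlite3 module, as a stand-in DBMS; SQLite accepts the || concatenation operator and the CAST expression unchanged. The function name build_seq is an illustrative assumption.

```python
import sqlite3


def build_seq(conn):
    """Create seq(i) holding 0..9999 by cross-joining a table of digits."""
    conn.execute("CREATE TABLE temp09 (i CHAR(1) NOT NULL PRIMARY KEY)")
    # Ten one-character rows, '0' through '9'.
    conn.executemany(
        "INSERT INTO temp09 VALUES (?)", [(str(d),) for d in range(10)]
    )
    conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
    # Four copies of temp09 give 10 * 10 * 10 * 10 = 10,000 digit strings.
    conn.execute(
        "INSERT INTO seq "
        "SELECT CAST(t1.i || t2.i || t3.i || t4.i AS INTEGER) "
        "FROM temp09 t1, temp09 t2, temp09 t3, temp09 t4"
    )
    conn.execute("DROP TABLE temp09")


conn = sqlite3.connect(":memory:")
build_seq(conn)
count, lowest, highest = conn.execute(
    "SELECT COUNT(*), MIN(i), MAX(i) FROM seq"
).fetchone()
```

Removing one copy of temp09 from the SELECT and FROM expressions would shrink the sequence to 0 through 999; adding one would grow it to 0 through 99999.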


A sequence table is especially useful for enumerative and datetime functions. Listing 15.6 lists the 95 printable characters in the ASCII character set (if that’s the character set in use). See Figure 15.6 for the result.

Listing 15.7 adds monthly intervals to today’s date (7-March-2005) for the next six months. See Figure 15.7 for the result. This example works on Microsoft SQL Server; the other DBMSs have similar functions that increment dates.

Sequence tables are handy for normalizing data that you’ve imported from a non-relational environment such as a spreadsheet. Suppose that you have the following non-normalized table, named au_orders, showing the order of the authors’ names on each book’s cover:

title_id author1 author2 author3
-------- ------- ------- -------
T01      A01     NULL    NULL
T02      A01     NULL    NULL
T03      A05     NULL    NULL
T04      A03     A04     NULL
T05      A04     NULL    NULL
T06      A02     NULL    NULL
T07      A02     A04     NULL
T08      A06     NULL    NULL
T09      A06     NULL    NULL
T10      A02     NULL    NULL
T11      A06     A03     A04
T12      A02     NULL    NULL
T13      A01     NULL    NULL

Listing 15.8 cross-joins au_orders with seq to produce Figure 15.8. You can DELETE the result rows with nulls in the column au_id, leaving the result set looking like the table title_authors in the sample database. Note that Listing 15.8 does the reverse of Listing 8.18 in Chapter 8.

Listing 15.6 List the characters associated with a set of character codes. See Figure 15.6 for the result.

SELECT
    i AS CharCode,
    CHR(i) AS Ch
  FROM seq
  WHERE i BETWEEN 32 AND 126;

CharCode Ch
-------- --
32
33       !
34       "
35       #
36       $
37       %
38       &
39       '
40       (
41       )
42       *
43       +
44       ,
45       -
46       .
47       /
48       0
49       1
50       2
51       3
52       4
...
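Listing 15.6 can be tried in the same way. This is a minimal sketch assuming SQLite through Python’s built-in sqlite3 module, where the function is spelled char() rather than CHR(); for brevity, seq is filled directly from Python’s range() instead of by Listing 15.5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
# Shortcut (assumption): populate seq from Python instead of Listing 15.5.
conn.executemany("INSERT INTO seq VALUES (?)", [(n,) for n in range(10000)])

# SQLite's char(i) returns the character whose code point is i.
printable = conn.execute(
    "SELECT i AS CharCode, char(i) AS Ch "
    "FROM seq WHERE i BETWEEN 32 AND 126 ORDER BY i"
).fetchall()

for code, ch in printable[:5]:
    print(code, repr(ch))
```

The query returns one row per code from 32 (space) through 126 (tilde), 95 rows in all.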


Listing 15.7 Increment today’s date to six months hence, in one-month intervals. See Figure 15.7 for the result.

SELECT
    i AS MonthsAhead,
    DATEADD("m", i, CURRENT_TIMESTAMP)
      AS FutureDate
  FROM seq
  WHERE i BETWEEN 1 AND 6;

MonthsAhead FutureDate
----------- ----------
1           2005-04-07
2           2005-05-07
3           2005-06-07
4           2005-07-07
5           2005-08-07
6           2005-09-07
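Other DBMSs spell the date arithmetic differently. As one illustration, the sketch below assumes SQLite through Python’s built-in sqlite3 module, where the date() function takes a modifier such as '+3 months' in place of DATEADD(). The starting date is passed as a fixed parameter (2005-03-07, the date used in the text) rather than read from the clock, so the output is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
# Shortcut (assumption): a short seq table is enough for this query.
conn.executemany("INSERT INTO seq VALUES (?)", [(n,) for n in range(100)])

# '+' || i || ' months' builds the modifiers '+1 months', '+2 months', ...
future = conn.execute(
    "SELECT i AS MonthsAhead, "
    "       date(?, '+' || i || ' months') AS FutureDate "
    "FROM seq WHERE i BETWEEN 1 AND 6 ORDER BY i",
    ("2005-03-07",),
).fetchall()

for months_ahead, future_date in future:
    print(months_ahead, future_date)
```

Replacing the parameter with 'now' would make the query follow the current date, as CURRENT_TIMESTAMP does in Listing 15.7.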

Listing 15.8 Normalize the table au_orders. See Figure 15.8 for the result.

SELECT title_id,
    (CASE WHEN i=1 THEN '1'
          WHEN i=2 THEN '2'
          WHEN i=3 THEN '3'
     END) AS au_order,
    (CASE WHEN i=1 THEN author1
          WHEN i=2 THEN author2
          WHEN i=3 THEN author3
     END) AS au_id
  FROM au_orders, seq
  WHERE i BETWEEN 1 AND 3
  ORDER BY title_id, i;

title_id au_order au_id
-------- -------- -----
T01      1        A01
T01      2        NULL
T01      3        NULL
T02      1        A01
T02      2        NULL
T02      3        NULL
T03      1        A05
T03      2        NULL
T03      3        NULL
T04      1        A03
T04      2        A04
T04      3        NULL
T05      1        A04
T05      2        NULL
T05      3        NULL
T06      1        A02
T06      2        NULL
T06      3        NULL
T07      1        A02
T07      2        A04
T07      3        NULL
T08      1        A06
T08      2        NULL
T08      3        NULL
T09      1        A06
T09      2        NULL
T09      3        NULL
T10      1        A02
T10      2        NULL
T10      3        NULL
T11      1        A06
T11      2        A03
T11      3        A04
T12      1        A02
T12      2        NULL
T12      3        NULL
T13      1        A01
T13      2        NULL
T13      3        NULL
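The whole normalization, including the DELETE step that removes the null rows, can be sketched end to end. The example assumes SQLite through Python’s built-in sqlite3 module; the table name normalized is an illustrative assumption, and seq is filled from Python for brevity.

```python
import sqlite3

AU_ORDERS = [
    ("T01", "A01", None, None), ("T02", "A01", None, None),
    ("T03", "A05", None, None), ("T04", "A03", "A04", None),
    ("T05", "A04", None, None), ("T06", "A02", None, None),
    ("T07", "A02", "A04", None), ("T08", "A06", None, None),
    ("T09", "A06", None, None), ("T10", "A02", None, None),
    ("T11", "A06", "A03", "A04"), ("T12", "A02", None, None),
    ("T13", "A01", None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE au_orders (title_id CHAR(3) PRIMARY KEY,"
    " author1 CHAR(3), author2 CHAR(3), author3 CHAR(3))"
)
conn.executemany("INSERT INTO au_orders VALUES (?, ?, ?, ?)", AU_ORDERS)
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO seq VALUES (?)", [(n,) for n in range(10)])

# Store the result of Listing 15.8 in a table (hypothetical name).
conn.execute(
    "CREATE TABLE normalized AS "
    "SELECT title_id, "
    " (CASE WHEN i=1 THEN '1' WHEN i=2 THEN '2' WHEN i=3 THEN '3' END)"
    "   AS au_order, "
    " (CASE WHEN i=1 THEN author1 WHEN i=2 THEN author2"
    "       WHEN i=3 THEN author3 END) AS au_id "
    "FROM au_orders, seq WHERE i BETWEEN 1 AND 3"
)
# Drop the placeholder rows that carry no author.
conn.execute("DELETE FROM normalized WHERE au_id IS NULL")

normalized = conn.execute(
    "SELECT title_id, au_order, au_id FROM normalized "
    "ORDER BY title_id, au_order"
).fetchall()
```

After the DELETE, 17 of the 39 cross-joined rows remain, one per title and author pair.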


✔ Tips

■ If you have a column of sequential integers that’s missing some numbers, you can fill in the gaps by EXCEPTing the column with a sequence column. See “Finding Different Rows with EXCEPT” earlier in this chapter.

■ To run Listing 15.5 in Microsoft Access and Microsoft SQL Server, change the CAST expression to:

t1.i + t2.i + t3.i + t4.i

To run Listing 15.5 in MySQL, change the CAST expression to:

CONCAT(t1.i, t2.i, t3.i, t4.i)

To run Listing 15.6 in Microsoft SQL Server and MySQL, change CHR(i) to CHAR(i).

To run Listing 15.8 in Microsoft Access, change the CASE expressions to Switch() function calls (see the DBMS Tip in “Evaluating Conditional Values with CASE” in Chapter 5):

(Switch(i=1, '1', i=2, '2', i=3, '3')) AS au_order,
(Switch(i=1, author1, i=2, author2, i=3, author3)) AS au_id
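The first tip, finding the gaps in a column of integers with EXCEPT, might look like the following. It is a minimal sketch assuming SQLite through Python’s built-in sqlite3 module; the table invoices, its column inv_num, and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO seq VALUES (?)", [(n,) for n in range(100)])

# Hypothetical table whose key column is missing 3, 6, and 7.
conn.execute("CREATE TABLE invoices (inv_num INTEGER NOT NULL PRIMARY KEY)")
conn.executemany(
    "INSERT INTO invoices VALUES (?)", [(n,) for n in (1, 2, 4, 5, 8)]
)

# Sequence values in the column's range, minus the values actually present.
missing = [
    row[0]
    for row in conn.execute(
        "SELECT i FROM seq "
        " WHERE i BETWEEN (SELECT MIN(inv_num) FROM invoices) "
        "             AND (SELECT MAX(inv_num) FROM invoices) "
        "EXCEPT "
        "SELECT inv_num FROM invoices "
        "ORDER BY 1"
    )
]
print(missing)
```

The values returned are the numbers to insert if the column is meant to be gap-free.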

Calendar Tables

Another useful auxiliary table is a calendar table. One type of calendar table has a primary-key column that holds one row for each calendar date (past and future) and other columns that indicate the date’s attributes: business day, holiday, international holiday, fiscal-month end, fiscal-year end, Julian date, business-day offsets, and so on. Another type of calendar table stores the starting and ending dates of events (in the columns event_id, start_date, and end_date, for example). Spreadsheets have more date-arithmetic functions than DBMSs, so it might be easier to build a calendar table in a spreadsheet and then import it as a database table.

Even if your DBMS has plenty of date-arithmetic functions, it might be faster to look up data in a calendar table than to call these functions in a query.
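A calendar table of the first type can also be generated by a short program instead of a spreadsheet. The sketch below is one possible approach, assuming SQLite through Python’s built-in sqlite3 module; the table and column names (calendar, cal_date, is_business_day) are hypothetical, and a business day is taken to be any weekday, with holidays ignored.

```python
import sqlite3
from datetime import date, timedelta


def build_calendar(conn, start, days):
    """Create calendar with one row per date, flagging weekdays."""
    conn.execute(
        "CREATE TABLE calendar ("
        " cal_date TEXT NOT NULL PRIMARY KEY,"   # ISO format, yyyy-mm-dd
        " is_business_day INTEGER NOT NULL)"     # 1 = Monday to Friday
    )
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        # weekday() is 0 for Monday through 6 for Sunday.
        rows.append((d.isoformat(), 1 if d.weekday() < 5 else 0))
    conn.executemany("INSERT INTO calendar VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
build_calendar(conn, date(2005, 3, 7), 14)

# A lookup replaces date arithmetic: count business days in a range.
business_days = conn.execute(
    "SELECT COUNT(*) FROM calendar "
    " WHERE cal_date BETWEEN '2005-03-07' AND '2005-03-20' "
    "   AND is_business_day = 1"
).fetchone()[0]
```

Further attribute columns (holiday, fiscal-month end, and so on) would be filled in the same loop or by later UPDATE statements.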


Finding Sequences, Runs, and Regions

A sequence is a series of consecutive values without gaps. A run is like a sequence, but the values don’t have to be consecutive, just increasing (that is, gaps are allowed). A region is an unbroken series of values that all are equal.

Finding these series requires a table that has at least two columns: a primary-key column that holds a sequence of consecutive integers and a column that holds the values of interest. The table temps (Listing 15.9 and Figure 15.9) shows a series of high temperatures over 15 days.

As a set-oriented language, SQL isn’t a good choice for finding series of values. The following queries won’t run very fast, so if you have a lot of data to analyze, you might consider exporting it to a statistical package or using a procedural host language.
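For comparison, here is what the procedural alternative might look like. It is a sketch only, written in Python as an example host language; the function find_series and its parameters are illustrative assumptions, and the data are the 15 high temperatures of the table temps. One pass over the values finds every maximal series of two or more rows.

```python
def find_series(values, same_series, min_len=2):
    """Return (start_id, end_id, length) for each maximal series.

    values      -- readings in id order; ids are 1-based positions
    same_series -- same_series(prev, cur) is True if cur continues
                   the series that prev belongs to
    min_len     -- shortest series to report
    """
    found = []
    start = 0  # 0-based index where the current series began
    for pos in range(1, len(values) + 1):
        at_end = pos == len(values)
        # A series ends at the last row or where the next row breaks it.
        if at_end or not same_series(values[pos - 1], values[pos]):
            length = pos - start
            if length >= min_len:
                found.append((start + 1, pos, length))
            start = pos
    return found


HI_TEMPS = [49, 46, 48, 50, 50, 50, 51, 52, 53, 50, 50, 47, 50, 51, 52]

sequences = find_series(HI_TEMPS, lambda a, b: b == a + 1)
runs = find_series(HI_TEMPS, lambda a, b: b > a)
regions_of_50 = find_series(HI_TEMPS, lambda a, b: a == 50 and b == 50)
```

Only the test passed as same_series changes between the three kinds of series: consecutive values for sequences, increasing values for runs, and equal values for regions.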

✔ Tip

■ These queries are based on the ideas in David Rozenshtein, Anatoly Abramovich, and Eugene Birger’s Optimizing Transact-SQL: Advanced Programming Techniques (SQL Forum Press). You can use the queries’ common framework to create similar queries that find other series of values.

Listing 15.9 List all the columns in the table temps. See Figure 15.9 for the result.

SELECT *
  FROM temps;

id hi_temp
-- -------
1  49
2  46
3  48
4  50
5  50
6  50
7  51
8  52
9  53
10 50
11 50
12 47
13 50
14 51
15 52


Listing 15.10 finds all the sequences in temps and lists each sequence’s start position, end position, and length. See Figure 15.10 for the result. This query is a lot to take in at first glance, but it’s easier to understand if you look at it piecemeal. Then you’ll be able to understand the rest of the queries in this section.

The subquery’s WHERE clause subtracts id from hi_temp, yielding (internally):

id hi_temp diff
-- ------- ----
1  49      48
2  46      44
3  48      45
4  50      46
5  50      45
6  50      44
7  51      44
8  52      44
9  53      44
10 50      40
11 50      39
12 47      35
13 50      37
14 51      37
15 52      37

In the column diff, note that successive differences are constant for sequences (50 – 6 = 44, 51 – 7 = 44, and so on). To find neighboring rows, the outer query cross-joins two instances of the same table (t1 and t2), as described in “Calculating Running Statistics” earlier in this chapter. The condition

WHERE (t1.id < t2.id)

guarantees that any t1 row represents an element with an index (id) lower than the corresponding t2 row.

Listing 15.10 List the starting point, ending point, and length of each sequence in the table temps. See Figure 15.10 for the result.

SELECT
    t1.id AS StartSeq,
    t2.id AS EndSeq,
    t2.id - t1.id + 1 AS SeqSize
  FROM temps t1, temps t2
  WHERE (t1.id < t2.id)
    AND NOT EXISTS(
      SELECT *
        FROM temps t3
        WHERE (t3.hi_temp - t3.id <> t1.hi_temp - t1.id
               AND t3.id BETWEEN t1.id AND t2.id)
           OR (t3.id = t1.id - 1
               AND t3.hi_temp - t3.id = t1.hi_temp - t1.id)
           OR (t3.id = t2.id + 1
               AND t3.hi_temp - t3.id = t1.hi_temp - t1.id)
      );

StartSeq EndSeq SeqSize
-------- ------ -------
6        9      4
13       15     3


The subquery detects sequence breaks with the condition

t3.hi_temp - t3.id <> t1.hi_temp - t1.id

The third instance of temps (t3) in the subquery is used to determine whether any row in a candidate sequence (t3) has the same difference as the sequence’s first row (t1). If so, it’s a sequence member. If not, the candidate pair (t1 and t2) is rejected.

The last two OR conditions determine whether the candidate sequence’s borders can expand. A row that satisfies these conditions means the current candidate sequence can be extended and is rejected in favor of a longer one.

✔ Tip

■ To find only sequences of n or more rows, add the WHERE condition

AND (t2.id - t1.id) >= n - 1

To change Listing 15.10 to find all sequences of four or more rows, for example, replace

WHERE (t1.id < t2.id)

with

WHERE (t1.id < t2.id)
  AND (t2.id - t1.id) >= 3

The result is:

StartSeq EndSeq SeqSize
-------- ------ -------
6        9      4
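Listing 15.10 and the length condition from the tip can be tried together. The following is a minimal sketch assuming SQLite through Python’s built-in sqlite3 module; the function find_sequences and its min_len parameter are illustrative assumptions, with min_len playing the role of n.

```python
import sqlite3

HI_TEMPS = [49, 46, 48, 50, 50, 50, 51, 52, 53, 50, 50, 47, 50, 51, 52]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temps ("
    " id INTEGER NOT NULL PRIMARY KEY,"
    " hi_temp INTEGER NOT NULL)"
)
# ids 1..15 paired with the readings shown in Figure 15.9.
conn.executemany(
    "INSERT INTO temps VALUES (?, ?)", enumerate(HI_TEMPS, start=1)
)


def find_sequences(conn, min_len=2):
    """Run Listing 15.10, keeping sequences of at least min_len rows."""
    query = """
        SELECT t1.id, t2.id, t2.id - t1.id + 1
          FROM temps t1, temps t2
         WHERE (t1.id < t2.id)
           AND (t2.id - t1.id) >= ? - 1
           AND NOT EXISTS(
               SELECT *
                 FROM temps t3
                WHERE (t3.hi_temp - t3.id <> t1.hi_temp - t1.id
                       AND t3.id BETWEEN t1.id AND t2.id)
                   OR (t3.id = t1.id - 1
                       AND t3.hi_temp - t3.id = t1.hi_temp - t1.id)
                   OR (t3.id = t2.id + 1
                       AND t3.hi_temp - t3.id = t1.hi_temp - t1.id))
         ORDER BY t1.id
    """
    return conn.execute(query, (min_len,)).fetchall()


all_sequences = find_sequences(conn)
long_sequences = find_sequences(conn, min_len=4)
```

With the default min_len of 2, the extra condition adds nothing beyond t1.id < t2.id, so the first call behaves like the unmodified listing.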


Listing 15.11 finds all the runs in temps and lists each run’s start position, end position, and length. See Figure 15.11 for the result. The logic of this query is similar to that of the preceding one but accounts for run values needing only to increase, not (necessarily) be consecutive. The fourth instance of temps (t4) is needed because there doesn’t have to be a constant difference between id and hi_temp values. The subquery cross-joins t3 and t4 to check rows in the middle of a candidate run, whose borders are t1 and t2. For every element between t1 and t2 (limited by BETWEEN), t3 and its predecessor t4 are compared to see whether their values are increasing.

Listing 15.11 List the starting point, ending point, and length of each run in the table temps. See Figure 15.11 for the result.

SELECT
    t1.id AS StartRun,
    t2.id AS EndRun,
    t2.id - t1.id + 1 AS RunLen
  FROM temps t1, temps t2
  WHERE (t1.id < t2.id)
    AND NOT EXISTS(
      SELECT *
        FROM temps t3, temps t4
        WHERE (t3.hi_temp <= t4.hi_temp
               AND t4.id = t3.id - 1
               AND t3.id BETWEEN t1.id + 1 AND t2.id)
           OR (t3.id = t1.id - 1
               AND t3.hi_temp < t1.hi_temp)
           OR (t3.id = t2.id + 1
               AND t3.hi_temp > t2.hi_temp)
      );

StartRun EndRun RunLen
-------- ------ ------
2        4      3
6        9      4
12       15     4
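The run query can be checked against the same data. This is a minimal sketch assuming SQLite through Python’s built-in sqlite3 module; the function name find_runs is an illustrative assumption, and the SQL is Listing 15.11 with an ORDER BY added so that the rows come back in a fixed order.

```python
import sqlite3

HI_TEMPS = [49, 46, 48, 50, 50, 50, 51, 52, 53, 50, 50, 47, 50, 51, 52]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temps ("
    " id INTEGER NOT NULL PRIMARY KEY,"
    " hi_temp INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO temps VALUES (?, ?)", enumerate(HI_TEMPS, start=1)
)


def find_runs(conn):
    """Run Listing 15.11: maximal series of strictly increasing values."""
    query = """
        SELECT t1.id, t2.id, t2.id - t1.id + 1
          FROM temps t1, temps t2
         WHERE (t1.id < t2.id)
           AND NOT EXISTS(
               SELECT *
                 FROM temps t3, temps t4
                WHERE (t3.hi_temp <= t4.hi_temp   -- not increasing
                       AND t4.id = t3.id - 1      -- t4 precedes t3
                       AND t3.id BETWEEN t1.id + 1 AND t2.id)
                   OR (t3.id = t1.id - 1
                       AND t3.hi_temp < t1.hi_temp)
                   OR (t3.id = t2.id + 1
                       AND t3.hi_temp > t2.hi_temp))
         ORDER BY t1.id
    """
    return conn.execute(query).fetchall()


runs = find_runs(conn)
```

Changing <= to < in the first condition (and the two border comparisons to <= and >=) would accept equal neighbors, turning the query into a search for nondecreasing series.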


Listing 15.12 finds all regions in temps with a high temperature of 50 and lists each region’s start position, end position, and length. See Figure 15.12 for the result.

✔ Tips

■ To rank regions by length, add an ORDER BY clause to the outer query:

ORDER BY t2.id - t1.id DESC

■ To list the individual ids that fall in a region (with value 50), type:

SELECT DISTINCT t1.id
  FROM temps t1, temps t2
  WHERE t1.hi_temp = 50
    AND t2.hi_temp = 50
    AND ABS(t1.id - t2.id) = 1;

The standard function ABS(), which all DBMSs support, returns the absolute value of its argument. The result is:

id
--
4
5
6
10
11

Listing 15.12 List the starting point, ending point, and length of each region (with value 50) in the table temps. See Figure 15.12 for the result.

SELECT
    t1.id AS StartReg,
    t2.id AS EndReg,
    t2.id - t1.id + 1 AS RegLen
  FROM temps t1, temps t2
  WHERE (t1.id < t2.id)
    AND NOT EXISTS(
      SELECT *
        FROM temps t3
        WHERE (t3.hi_temp <> 50
               AND t3.id BETWEEN
                 t1.id AND t2.id)
           OR (t3.id = t1.id - 1
               AND t3.hi_temp = 50)
           OR (t3.id = t2.id + 1
               AND t3.hi_temp = 50)
      );

StartReg EndReg RegLen
-------- ------ ------
4        6      3
10       11     2
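The region query and the query that lists individual ids can be run side by side. The sketch below assumes SQLite through Python’s built-in sqlite3 module; the function names find_regions and ids_in_regions are illustrative assumptions, and the region value is passed as a parameter instead of being fixed at 50.

```python
import sqlite3

HI_TEMPS = [49, 46, 48, 50, 50, 50, 51, 52, 53, 50, 50, 47, 50, 51, 52]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temps ("
    " id INTEGER NOT NULL PRIMARY KEY,"
    " hi_temp INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO temps VALUES (?, ?)", enumerate(HI_TEMPS, start=1)
)


def find_regions(conn, value):
    """Run Listing 15.12 for regions whose rows all equal value."""
    query = """
        SELECT t1.id, t2.id, t2.id - t1.id + 1
          FROM temps t1, temps t2
         WHERE (t1.id < t2.id)
           AND NOT EXISTS(
               SELECT *
                 FROM temps t3
                WHERE (t3.hi_temp <> :v
                       AND t3.id BETWEEN t1.id AND t2.id)
                   OR (t3.id = t1.id - 1 AND t3.hi_temp = :v)
                   OR (t3.id = t2.id + 1 AND t3.hi_temp = :v))
         ORDER BY t1.id
    """
    return conn.execute(query, {"v": value}).fetchall()


def ids_in_regions(conn, value):
    """List each id that has a neighbor with the same value."""
    query = """
        SELECT DISTINCT t1.id
          FROM temps t1, temps t2
         WHERE t1.hi_temp = :v
           AND t2.hi_temp = :v
           AND ABS(t1.id - t2.id) = 1
         ORDER BY t1.id
    """
    return [row[0] for row in conn.execute(query, {"v": value})]


regions = find_regions(conn, 50)
region_ids = ids_in_regions(conn, 50)
```

Row 13 also has a high temperature of 50 but appears in neither result, because a single row with no equal neighbor doesn’t form a region.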
