
Joe Celko's SQL for Smarties - Advanced SQL Programming P31


CHAPTER 13: BETWEEN and OVERLAPS Predicates

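The BETWEEN and OVERLAPS predicates are both ways of showing that one value lies within a range defined by two other values. The OVERLAPS predicate looks at two time periods (defined either by a starting and a termination time, or by a starting time and an INTERVAL) to see whether they overlap in time.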

13.1 The BETWEEN Predicate

The predicate <value expression> BETWEEN <low value expression> AND <high value expression> is a feature of SQL that is used often enough to deserve special attention. It is also just tricky enough to fool beginning programmers. This predicate is actually just shorthand for the expression:

((<low value expression> <= <value expression>) AND (<value expression> <= <high value expression>))

Please note that the end points are included in this definition. This predicate works with any data types that can be compared. Most programmers miss this fact and use it only for numeric values, but it can be used for character strings and temporal data as well. The <high value expression> and <low value expression> can be expressions or constants, but again, programmers tend to use just constants.
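The two claims above (BETWEEN is shorthand for a pair of comparisons, and it works for any comparable type) can be checked with a minimal sketch. It assumes Python's standard-library sqlite3 module purely as a convenient engine; SQLite has no DATE type, so the date is passed as an ISO 8601 string, which sorts in calendar order. The helper name `between_and_expansion` is hypothetical.

```python
import sqlite3


def between_and_expansion(value, low, high):
    """Return (BETWEEN result, expanded-comparison result) from SQLite.

    SQLite reports TRUE as 1 and FALSE as 0.
    """
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            # First column: the shorthand. Second column: its definition.
            "SELECT ? BETWEEN ? AND ?, (? <= ?) AND (? <= ?)",
            (value, low, high, low, value, value, high),
        ).fetchone()
    finally:
        conn.close()
    return row


numeric = between_and_expansion(15, 12, 15)            # end point included
text = between_and_expansion("m", "a", "z")            # character strings
temporal = between_and_expansion("2005-02-14", "2005-02-01", "2005-02-28")
outside = between_and_expansion(16, 12, 15)
```

In each case the two columns agree, which is all the shorthand promises; the comparison rules themselves (collation, date handling) remain those of the engine in use.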

13.1.1 Results with NULL Values

The results of this predicate with NULL values for <value expression>, <low value expression>, or <high value expression> follow directly from the definition. If both <low value expression> and <high value expression> are NULL, the result is UNKNOWN for any value of <value expression>. If <low value expression> or <high value expression> is NULL, but not both, the result depends on the comparison with the non-NULL bound: it is FALSE when that comparison is FALSE and UNKNOWN otherwise. If <value expression> is NULL, the results are UNKNOWN for any values of <low value expression> and <high value expression>.

13.1.2 Results with Empty Sets

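Notice that if <high value expression> is less than <low value expression>, the expression will always be FALSE unless the value is NULL; then it is UNKNOWN. That is a bit confusing, since there is no value to which <value expression> could resolve itself that would produce a TRUE result. But this follows directly from expanding the definition: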

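x BETWEEN 12 AND 15       depends on the value of x
x BETWEEN 15 AND 12       always FALSE
x BETWEEN NULL AND 15     UNKNOWN, or FALSE when x > 15
NULL BETWEEN 12 AND 15    always UNKNOWN
x BETWEEN 12 AND NULL     UNKNOWN, or FALSE when x < 12
x BETWEEN x AND x         always TRUE

These results take x to be non-NULL; when x itself is NULL, every one of these predicates is UNKNOWN.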
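The table above can be reproduced mechanically. The following is a minimal sketch, again assuming SQLite through Python's sqlite3 module as the engine; the helper name `truth` is hypothetical. It also exercises one case that follows from the expanded definition: with a NULL bound, the predicate is FALSE rather than UNKNOWN whenever the comparison against the other bound is already FALSE, because UNKNOWN AND FALSE is FALSE.

```python
import sqlite3


def truth(predicate, x=None):
    """Evaluate a predicate over a single value x and name the result.

    SQLite returns 1, 0, or NULL; these map to TRUE, FALSE, UNKNOWN.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # A one-row derived table supplies x so the predicate can name it.
        (result,) = conn.execute(
            f"SELECT {predicate} FROM (SELECT ? AS x)", (x,)
        ).fetchone()
    finally:
        conn.close()
    return {1: "TRUE", 0: "FALSE", None: "UNKNOWN"}[result]


in_range = truth("x BETWEEN 12 AND 15", 13)
empty_range = truth("x BETWEEN 15 AND 12", 13)
null_low = truth("x BETWEEN NULL AND 15", 13)
null_low_but_too_big = truth("x BETWEEN NULL AND 15", 20)
null_value = truth("NULL BETWEEN 12 AND 15")
null_high = truth("x BETWEEN 12 AND NULL", 13)
same_value = truth("x BETWEEN x AND x", 13)
empty_range_null_x = truth("x BETWEEN 15 AND 12", None)
```

Only the mapping from SQLite's 1/0/NULL to the three truth values is specific to this sketch; the results themselves come from the two-comparison definition.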

13.1.3 Programming Tips

The BETWEEN range includes the end points, so you have to be careful. Here is an example that deals with changing a percent range on a test into a letter grade:

Grades
low_score  high_score  grade
============================
90         100         'A'
80         90          'B'
70         80          'C'
60         70          'D'
00         60          'F'

However, this will not work when a student gets a grade on the borderlines (90, 80, 70, or 60). One way to solve the problem is to change the table by adding 1 to the low scores. Of course, the student who got 90.1 will argue that he should have gotten an ‘A’ and not a ‘B’. If you add 0.01 to the low scores, the student who got 90.001 will argue that he should have gotten an ‘A’ and not a ‘B’, and so forth. This is a problem with a continuous variable. A better solution might be to change the predicate to (score BETWEEN low_score AND high_score) AND (score > low_score) or simply to ((low_score < score) AND (score <= high_score)). Neither approach will be much different in this example, since few values will fall on the borders between grades and this table is very, very small.
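To make the borderline problem concrete, here is a minimal sketch that looks up a score in the Grades table both ways. It assumes SQLite via Python's sqlite3 module; the column types and the helper name `grades_for` are illustrative assumptions, not part of the original table definition.

```python
import sqlite3

GRADE_ROWS = [(90, 100, "A"), (80, 90, "B"), (70, 80, "C"),
              (60, 70, "D"), (0, 60, "F")]


def grades_for(score):
    """Return (grades matched by BETWEEN, grades matched by the half-open test)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Grades "
            "(low_score REAL NOT NULL, high_score REAL NOT NULL, "
            " grade CHAR(1) NOT NULL PRIMARY KEY)"
        )
        conn.executemany("INSERT INTO Grades VALUES (?, ?, ?)", GRADE_ROWS)
        closed = [g for (g,) in conn.execute(
            "SELECT grade FROM Grades "
            "WHERE ? BETWEEN low_score AND high_score ORDER BY grade",
            (score,))]
        half_open = [g for (g,) in conn.execute(
            "SELECT grade FROM Grades "
            "WHERE (low_score < ?) AND (? <= high_score) ORDER BY grade",
            (score, score))]
    finally:
        conn.close()
    return closed, half_open


borderline = grades_for(90)     # BETWEEN finds two rows, half-open finds one
just_over = grades_for(90.1)    # both agree away from the borders
zero = grades_for(0)            # half-open test excludes the lowest bound
```

One consequence worth noticing: because the half-open test excludes each low_score, a score of exactly 0 matches no row at all, so the lowest range would need its own adjustment (for example a low_score below zero) if such scores can occur.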

As a sidebar, the reader might want to look up an introductory book on fuzzy logic. In that model, an entity can have a degree of membership in a set, rather than being strictly in or out of the set. Some experimental databases use fuzzy logic.

Some SQL implementations make the BETWEEN predicate the better choice for larger tables of this sort. They will keep index values in trees whose nodes hold a range of values (look up a description of the B-Tree family in a computer science book). An optimizer can compare the range of values in the BETWEEN predicate to the range in each index node as a single action. If the BETWEEN predicate were presented as two comparisons, it might execute them as separate actions against the database, which would be slower.

13.2 OVERLAPS Predicate

The OVERLAPS predicate is a feature not yet available in most SQL implementations, because it requires more of the Standard SQL temporal data features than most implementations have. Many programmers have been faking the functionality of the INTERVAL data type with the existing date and time features of their products.

13.2.1 Time Periods and OVERLAPS Predicate

An INTERVAL is a measure of temporal duration, expressed in units such as days, hours, minutes, and so forth. This is how you add or subtract days to or from a date, hours and minutes to or from a time, and so forth. In the OVERLAPS predicate, time periods are defined as row values with two columns. The first column (the starting time) of the pair is always a <datetime> data type, and the second column (the termination time) is a <datetime> data type or an INTERVAL that can be used to compute a <datetime> value. If the starting and termination times are the same, this is an instantaneous event.

The result of the <overlaps predicate> is formally defined as the result of the following expression:

(S1 > S2 AND NOT (S1 >= T2 AND T1 >= T2))

OR (S2 > S1 AND NOT (S2 >= T1 AND T2 >= T1))

OR (S1 = S2 AND (T1 <> T2 OR T1 = T2))

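In this expression, S1 and S2 are the starting times of the two time periods, and T1 and T2 are their termination times.

The rules for the OVERLAPS predicate sound as if they should be intuitive, but they are not. The principles that we wanted in the standard were:

1. A time period includes its starting point but does not include its end point. The reason for this model is that it follows the ISO convention that there is no 24:00 today; midnight is 00:00 tomorrow. Half-open durations have closure properties that are useful. The concatenation of two half-open durations is a half-open duration.

2. If the time periods are not instantaneous, they overlap when they share a common time period.

3. If the first term of the predicate is a time period and the second term is an instantaneous event (a <datetime> data type), they overlap when the second term is in the time period (but is not the end point of the time period).

4. If the first and second terms are both instantaneous events, they overlap only when they are equal.

5. If the starting time is NULL and the finishing time is a <datetime> value, the finishing time becomes the starting time and we have an event. If the starting time is NULL and the finishing time is an INTERVAL value, then both the finishing and starting times are NULL.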
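Because few products implement OVERLAPS, it can help to try the formal expression on concrete values. The following is a minimal sketch in plain Python that transcribes the three-part expression given above term by term. The function name `overlaps` is hypothetical, both periods are assumed to be given as (start, termination) pairs with start <= termination, and the NULL rules are deliberately left out.

```python
from datetime import date


def overlaps(s1, t1, s2, t2):
    """Term-by-term transcription of the formal OVERLAPS expression.

    s1, t1: start and termination of the first period.
    s2, t2: start and termination of the second period.
    NULL handling is not modelled here.
    """
    return (
        (s1 > s2 and not (s1 >= t2 and t1 >= t2))
        or (s2 > s1 and not (s2 >= t1 and t2 >= t1))
        or (s1 == s2 and (t1 != t2 or t1 == t2))
    )


jan = lambda day: date(2005, 1, day)  # shorthand for dates in January 2005

shared_days = overlaps(jan(1), jan(10), jan(5), jan(15))
abutting = overlaps(jan(1), jan(10), jan(10), jan(20))
event_inside = overlaps(jan(1), jan(10), jan(5), jan(5))
event_at_end = overlaps(jan(1), jan(10), jan(10), jan(10))
equal_events = overlaps(jan(5), jan(5), jan(5), jan(5))
different_events = overlaps(jan(5), jan(5), jan(6), jan(6))
```

The abutting periods and the event that falls exactly on the end point do not overlap, which is the half-open behavior described in the principles; two instantaneous events overlap only when they are equal.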

Please consider how your intuition reacts to these results when the granularity is at the day level; remember that a day begins at 00:00.

Now let’s see what we have to do to handle overlapping times. Consider a table of hotel guests with the days of their stays and a table of special events being held at the hotel. The tables might look like this:

CREATE TABLE Guests
(guest_name CHARACTER(30) NOT NULL PRIMARY KEY,
 arrival_date DATE NOT NULL,
 depart_date DATE NOT NULL,
 ...);

Guests
guest_name       arrival_date  depart_date
==========================================
'Dorothy Gale'   '2005-02-01'  '2005-11-01'
'Indiana Jones'  '2005-02-01'  '2005-02-01'
'Don Quixote'    '2005-01-01'  '2005-10-01'
'James T Kirk'   '2005-02-01'  '2005-02-28'
'Santa Claus'    '2005-12-01'  '2005-12-25'

CREATE TABLE Celebrations
(celeb_name CHARACTER(30) PRIMARY KEY,
 start_date DATE NOT NULL,
 finish_date DATE NOT NULL,
 ...);

Celebrations
celeb_name            start_date    finish_date
===============================================
'Apple Month'         '2005-02-01'  '2005-02-28'
'Christmas Season'    '2005-12-01'  '2005-12-25'
'Garlic Festival'     '2005-01-15'  '2005-02-15'
'National Pear Week'  '2005-01-01'  '2005-01-07'
'New Year's Day'      '2005-01-01'  '2005-01-01'
'St Fred's Day'       '2005-02-24'  '2005-02-24'
'Year of the Prune'   '2005-01-01'  '2005-12-31'

The BETWEEN operator will work just fine with single dates that fall between the starting and finishing dates of these celebrations, but please remember that the BETWEEN predicate will include the end point of an interval, and the OVERLAPS predicate will not. To find out if a particular date occurs during an event, you can simply write queries like:

SELECT guest_name, ' arrived during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
   AND arrival_date <> finish_date;

This query will find the guests who arrived at the hotel during each event. The final predicate can be kept, if you want to conform to the ANSI convention, or dropped, if that makes more sense in your situation. From now on, we will keep both end points to make the queries easier to read.

SELECT guest_name, ' arrived during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date;

Results
guest_name       " arrived during "  celeb_name
==========================================================
'Dorothy Gale'   'arrived during'    'Apple Month'
'Dorothy Gale'   'arrived during'    'Garlic Festival'
'Dorothy Gale'   'arrived during'    'Year of the Prune'
'Indiana Jones'  'arrived during'    'Apple Month'
'Indiana Jones'  'arrived during'    'Garlic Festival'
'Indiana Jones'  'arrived during'    'Year of the Prune'
'Don Quixote'    'arrived during'    'National Pear Week'
'Don Quixote'    'arrived during'    'New Year's Day'
'Don Quixote'    'arrived during'    'Year of the Prune'
'James T Kirk'   'arrived during'    'Apple Month'
'James T Kirk'   'arrived during'    'Garlic Festival'
'James T Kirk'   'arrived during'    'Year of the Prune'
'Santa Claus'    'arrived during'    'Christmas Season'
'Santa Claus'    'arrived during'    'Year of the Prune'

The obvious question is which guests were at the hotel during each event. A common programming error when trying to find out if two intervals overlap is to write the query with the BETWEEN predicate, thus:

SELECT guest_name, ' was here during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
    OR depart_date BETWEEN start_date AND finish_date;

This is wrong, because it does not cover the case where the event began and finished during the guest’s visit. Seeing his error, the programmer will sit down and draw a timeline diagram of all four possible overlapping cases, as shown in Figure 13.1.

So the programmer adds more predicates, thus:

SELECT guest_name, ' was here during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
    OR depart_date BETWEEN start_date AND finish_date
    OR start_date BETWEEN arrival_date AND depart_date
    OR finish_date BETWEEN arrival_date AND depart_date;

Figure 13.1 Timeline Diagram of All Possible Overlapping Cases.

A thoughtful programmer will notice that the last predicate is not needed and might drop it, but either way, this is a correct query. But it is not the best answer. In the case of the overlapping intervals, there are two cases where a guest’s stay at the hotel and an event do not both fall within the same time frame: either the guest checked out before the event started, or the event ended before the guest arrived. If you want to do the logic, that is what the first predicate will work out to be when you also add the conditions that arrival_date <= depart_date and start_date <= finish_date. But it is easier to see in a timeline diagram, thus:

Both cases can be represented in one SQL statement as:

SELECT guest_name, celeb_name
  FROM Guests, Celebrations
 WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));

VIEW GuestCelebrations
guest_name       celeb_name
======================================
'Dorothy Gale'   'Apple Month'
'Dorothy Gale'   'Garlic Festival'
'Dorothy Gale'   'St Fred's Day'
'Dorothy Gale'   'Year of the Prune'
'Indiana Jones'  'Apple Month'
'Indiana Jones'  'Garlic Festival'
'Indiana Jones'  'Year of the Prune'
'Don Quixote'    'Apple Month'
'Don Quixote'    'Garlic Festival'
'Don Quixote'    'National Pear Week'
'Don Quixote'    'New Year's Day'
'Don Quixote'    'St Fred's Day'
'Don Quixote'    'Year of the Prune'
'James T Kirk'   'Apple Month'
'James T Kirk'   'Garlic Festival'
'James T Kirk'   'St Fred's Day'
'James T Kirk'   'Year of the Prune'
'Santa Claus'    'Christmas Season'
'Santa Claus'    'Year of the Prune'
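The difference between the three WHERE clauses discussed so far can be measured on the sample data. The following is a minimal sketch that assumes SQLite through Python's sqlite3 module, with ISO 8601 strings standing in for the DATE columns (they compare in calendar order). The constant names and the helper `pairs` are hypothetical.

```python
import sqlite3

GUESTS = [
    ("Dorothy Gale", "2005-02-01", "2005-11-01"),
    ("Indiana Jones", "2005-02-01", "2005-02-01"),
    ("Don Quixote", "2005-01-01", "2005-10-01"),
    ("James T Kirk", "2005-02-01", "2005-02-28"),
    ("Santa Claus", "2005-12-01", "2005-12-25"),
]
CELEBRATIONS = [
    ("Apple Month", "2005-02-01", "2005-02-28"),
    ("Christmas Season", "2005-12-01", "2005-12-25"),
    ("Garlic Festival", "2005-01-15", "2005-02-15"),
    ("National Pear Week", "2005-01-01", "2005-01-07"),
    ("New Year's Day", "2005-01-01", "2005-01-01"),
    ("St Fred's Day", "2005-02-24", "2005-02-24"),
    ("Year of the Prune", "2005-01-01", "2005-12-31"),
]

# The common error: only the guest's end points are tested.
TWO_BETWEENS = """arrival_date BETWEEN start_date AND finish_date
    OR depart_date BETWEEN start_date AND finish_date"""
# The repaired version with the event's end points added.
FOUR_BETWEENS = TWO_BETWEENS + """
    OR start_date BETWEEN arrival_date AND depart_date
    OR finish_date BETWEEN arrival_date AND depart_date"""
# The negated "no overlap" test.
NOT_DISJOINT = "NOT ((depart_date < start_date) OR (arrival_date > finish_date))"


def pairs(where_clause):
    """Return the set of (guest_name, celeb_name) rows the clause selects."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Guests (guest_name TEXT NOT NULL PRIMARY KEY,"
                     " arrival_date TEXT NOT NULL, depart_date TEXT NOT NULL)")
        conn.execute("CREATE TABLE Celebrations (celeb_name TEXT NOT NULL PRIMARY KEY,"
                     " start_date TEXT NOT NULL, finish_date TEXT NOT NULL)")
        conn.executemany("INSERT INTO Guests VALUES (?, ?, ?)", GUESTS)
        conn.executemany("INSERT INTO Celebrations VALUES (?, ?, ?)", CELEBRATIONS)
        rows = conn.execute(
            "SELECT guest_name, celeb_name FROM Guests, Celebrations "
            f"WHERE {where_clause}").fetchall()
    finally:
        conn.close()
    return set(rows)


missed = pairs(NOT_DISJOINT) - pairs(TWO_BETWEENS)
```

On this data the two-BETWEEN query misses exactly the pairs where the celebration starts and finishes inside the guest's stay, while the four-BETWEEN query and the negated form select the same rows.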

Figure 13.2 Timeline Diagram.

This VIEW is handy for other queries. The reason for using the NOT in the WHERE clause is so that you can add or remove it to reverse the sense of the query. For example, to find out how many celebrations each guest could have seen, you would write:

CREATE VIEW GuestCelebrations (guest_name, celeb_name)
AS SELECT guest_name, celeb_name
     FROM Guests, Celebrations
    WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));

SELECT guest_name, COUNT(*) AS celebcount
  FROM GuestCelebrations
 GROUP BY guest_name;

Results
guest_name       celebcount
===========================
'Dorothy Gale'   4
'Indiana Jones'  3
'Don Quixote'    6
'James T Kirk'   4
'Santa Claus'    2

Then, to find out how many guests were at the hotel during each celebration, you would write:

SELECT celeb_name, COUNT(*) AS guestcount
  FROM GuestCelebrations
 GROUP BY celeb_name;

Result
celeb_name            guestcount
================================
'Apple Month'         4
'Christmas Season'    1
'Garlic Festival'     4
'National Pear Week'  1
'New Year's Day'      1
'St Fred's Day'       3
'Year of the Prune'   5
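The view and both GROUP BY queries can be run end to end, together with the reversed-sense query obtained by removing the NOT. This is a minimal sketch under the same assumptions as before: SQLite via Python's sqlite3 module, ISO 8601 strings in place of DATE columns, and hypothetical helper names (`build_hotel_db`, `counts`).

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE Guests
(guest_name TEXT NOT NULL PRIMARY KEY,
 arrival_date TEXT NOT NULL,   -- ISO 8601 strings stand in for DATE
 depart_date TEXT NOT NULL);

CREATE TABLE Celebrations
(celeb_name TEXT NOT NULL PRIMARY KEY,
 start_date TEXT NOT NULL,
 finish_date TEXT NOT NULL);

INSERT INTO Guests VALUES
 ('Dorothy Gale', '2005-02-01', '2005-11-01'),
 ('Indiana Jones', '2005-02-01', '2005-02-01'),
 ('Don Quixote', '2005-01-01', '2005-10-01'),
 ('James T Kirk', '2005-02-01', '2005-02-28'),
 ('Santa Claus', '2005-12-01', '2005-12-25');

INSERT INTO Celebrations VALUES
 ('Apple Month', '2005-02-01', '2005-02-28'),
 ('Christmas Season', '2005-12-01', '2005-12-25'),
 ('Garlic Festival', '2005-01-15', '2005-02-15'),
 ('National Pear Week', '2005-01-01', '2005-01-07'),
 ('New Year''s Day', '2005-01-01', '2005-01-01'),
 ('St Fred''s Day', '2005-02-24', '2005-02-24'),
 ('Year of the Prune', '2005-01-01', '2005-12-31');

CREATE VIEW GuestCelebrations (guest_name, celeb_name)
AS SELECT guest_name, celeb_name
     FROM Guests, Celebrations
    WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));
"""


def build_hotel_db():
    """Create an in-memory database holding the sample tables and the view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn


def counts(conn, column):
    """Count view rows per distinct value of the given view column."""
    sql = f"SELECT {column}, COUNT(*) FROM GuestCelebrations GROUP BY {column}"
    return dict(conn.execute(sql).fetchall())


conn = build_hotel_db()
celebcount = counts(conn, "guest_name")
guestcount = counts(conn, "celeb_name")
# Removing the NOT reverses the sense: pairs that never coincide.
(never_met,) = conn.execute(
    "SELECT COUNT(*) FROM Guests, Celebrations "
    "WHERE (depart_date < start_date) OR (arrival_date > finish_date)"
).fetchone()
conn.close()
```

With 5 guests and 7 celebrations there are 35 possible pairs; since no date here is NULL, the pairs selected with and without the NOT partition that total.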

This last query is only part of the story. What the hotel management really wants to know is how many room nights were sold for a

Posted: 06/07/2014, 09:20
