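CHAPTER 13
BETWEEN and OVERLAPS Predicates
The BETWEEN and OVERLAPS predicates both offer a shorthand way of showing that one value lies within a range defined by two other values. The BETWEEN predicate works with scalar range limits; the OVERLAPS predicate looks at two time periods (defined either by start and end points or by a starting time and an INTERVAL) to see whether they overlap in time.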
13.1 The BETWEEN Predicate
The predicate <value expression> BETWEEN <low value expression> AND <high value expression> is a feature of SQL that is used often enough to deserve special attention. It is also just tricky enough to fool beginning programmers. This predicate is actually just shorthand for the expression:
((<low value expression> <= <value expression>) AND (<value expression> <= <high value expression>))
Please note that the end points are included in this definition. This predicate works with any data types that can be compared. Most programmers miss this fact and use it only for numeric values, but it can be used for character strings and temporal data as well. The <high value expression> and <low value expression> can be expressions or constants, but again, programmers tend to use just constants.
13.1.1 Results with NULL Values
The results of this predicate with NULL values for <value expression>, <low value expression>, or <high value expression> follow directly from the definition. If both <low value expression> and <high value expression> are NULL, the result is UNKNOWN for any value of <value expression>. If <low value expression> or <high value expression> is NULL, but not both of them, the result is determined by the value of <value expression> and its comparison with the remaining non-NULL term. If <value expression> is NULL, the results are UNKNOWN for any values of <low value expression> and <high value expression>.
13.1.2 Results with Empty Sets
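Notice that if <high value expression> is less than <low value expression>, the expression will always be FALSE unless the value is NULL; then it is UNKNOWN. That is a bit confusing, since there is no value to which <value expression> could resolve itself that would produce a TRUE result. But this follows directly from expanding the definition: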
x BETWEEN 12 AND 15       depends on the value of x
x BETWEEN 15 AND 12       always FALSE (UNKNOWN if x is NULL)
x BETWEEN NULL AND 15     FALSE if x > 15, otherwise UNKNOWN
NULL BETWEEN 12 AND 15    always UNKNOWN
x BETWEEN 12 AND NULL     FALSE if x < 12, otherwise UNKNOWN
x BETWEEN x AND x         always TRUE (UNKNOWN if x is NULL)
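The expansions above can be checked mechanically. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for an SQL engine; SQLite reports TRUE, FALSE, and UNKNOWN as 1, 0, and NULL (None in Python). The helper name `between` is hypothetical and exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def between(x, low, high):
    """Evaluate `x BETWEEN low AND high` in SQLite; None stands for NULL.

    Returns 1 (TRUE), 0 (FALSE), or None (UNKNOWN).
    """
    row = conn.execute("SELECT ? BETWEEN ? AND ?", (x, low, high)).fetchone()
    return row[0]

# Each case mirrors one line of the expansion table.
cases = [
    (13, 12, 15),      # inside the range
    (13, 15, 12),      # reversed limits
    (13, None, 15),    # NULL low limit, x not above the high limit
    (20, None, 15),    # NULL low limit, x above the high limit
    (None, 12, 15),    # NULL value expression
    (13, 12, None),    # NULL high limit, x not below the low limit
    (10, 12, None),    # NULL high limit, x below the low limit
    (13, 13, 13),      # x BETWEEN x AND x
]
for x, low, high in cases:
    print(x, low, high, "->", between(x, low, high))
```

A one-sided NULL does not always give UNKNOWN: when the remaining comparison is already FALSE, the AND in the expanded definition is FALSE as well.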
13.1.3 Programming Tips
The BETWEEN range includes the end points, so you have to be careful. Here is an example that deals with changing a percent range on a test into a letter grade:
Grades
low_score  high_score  grade
============================
    90        100       'A'
    80         90       'B'
    70         80       'C'
    60         70       'D'
     0         60       'F'
However, this will not work when a student gets a grade on the borderlines (90, 80, 70, or 60). One way to solve the problem is to change the table by adding 1 to the low scores. Of course, the student who got 90.1 will argue that he should have gotten an ‘A’ and not a ‘B’. If you add 0.01 to the low scores, the student who got 90.001 will argue that he should have gotten an ‘A’ and not a ‘B’, and so forth. This is a problem with a continuous variable. A better solution might be to change the predicate to (score BETWEEN low_score AND high_score) AND (score > low_score) or simply to ((low_score < score) AND (score <= high_score)). Neither approach will be much different in this example, since few values will fall on the borders between grades and this table is very, very small.
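The borderline problem is easy to reproduce. The sketch below assumes Python's built-in sqlite3 module and loads the Grades table shown earlier; the function names `grades_closed` and `grades_half_open` are hypothetical. It compares the closed BETWEEN range with the half-open predicate ((low_score < score) AND (score <= high_score)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Grades ("
    " low_score INTEGER NOT NULL,"
    " high_score INTEGER NOT NULL,"
    " grade CHAR(1) NOT NULL)"
)
conn.executemany(
    "INSERT INTO Grades VALUES (?, ?, ?)",
    [(90, 100, "A"), (80, 90, "B"), (70, 80, "C"), (60, 70, "D"), (0, 60, "F")],
)

def grades_closed(score):
    """Grades matched when both end points are included (BETWEEN)."""
    rows = conn.execute(
        "SELECT grade FROM Grades"
        " WHERE ? BETWEEN low_score AND high_score ORDER BY grade",
        (score,),
    )
    return [r[0] for r in rows]

def grades_half_open(score):
    """Grades matched when the low end point is excluded."""
    rows = conn.execute(
        "SELECT grade FROM Grades"
        " WHERE low_score < ? AND ? <= high_score ORDER BY grade",
        (score, score),
    )
    return [r[0] for r in rows]

print(grades_closed(90))       # a borderline score matches two rows
print(grades_half_open(90))    # the half-open form matches one row
print(grades_half_open(90.5))
print(grades_half_open(0))     # note: a score of exactly 0 matches no row
```

One consequence of excluding the low end point is that a score of exactly 0 falls outside every range, so the lowest row would need its own treatment in a real grading table.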
As a sidebar, the reader might want to look up an introductory book on fuzzy logic. In that model, an entity can have a degree of membership in a set, rather than being strictly in or out of the set. Some experimental databases use fuzzy logic.
Various implementations will make the BETWEEN predicate the better choice for larger tables of this sort. They will keep index values in trees whose nodes hold a range of values (look up a description of the B-Tree family in a computer science book). An optimizer can compare the range of values in the BETWEEN predicate to the range of values in the index nodes as a single action. If the BETWEEN predicate were presented as two comparisons, it might execute them as separate actions against the database, which would be slower.
13.2 OVERLAPS Predicate
The OVERLAPS predicate is a feature not yet available in most SQL implementations, because it requires more of the Standard SQL temporal data features than most implementations have. Many programmers have been “faking” the functionality of the INTERVAL data type with the existing date and time features of their products.
13.2.1 Time Periods and OVERLAPS Predicate
An INTERVAL is a measure of temporal duration, expressed in units such as days, hours, minutes, and so forth. This is how you add or subtract days to or from a date, hours and minutes to or from a time, and so forth. The time periods compared by the OVERLAPS predicate are defined as row values with two columns. The first column (the starting time) of the pair is always a <datetime> data type, and the second column (the termination time) is a <datetime> data type that can be used to compute a <datetime> value. If the starting and termination times are the same, this is an instantaneous event.
The result of the <overlaps predicate> is formally defined as the result of the following expression:
(S1 > S2 AND NOT (S1 >= T2 AND T1 >= T2))
OR (S2 > S1 AND NOT (S2 >= T1 AND T2 >= T1))
OR (S1 = S2 AND (T1 <> T2 OR T1 = T2))
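In this expression, S1 and S2 are the starting times of the two time periods, and T1 and T2 are their termination times. The rules for the OVERLAPS predicate should be intuitive, but they are not. The principles that we wanted in the standard were:
1. A time period includes its starting point but does not include its end point. The reason for this model is that it follows the ISO convention that there is no 24:00 today; midnight is 00:00 tomorrow. Half-open durations have closure properties that are useful. The concatenation of two half-open durations is a half-open duration.
2. If the time periods are not instantaneous, they overlap when they share a common time period.
3. If the first term of the predicate is an INTERVAL and the second term is an instantaneous event (a <datetime> data type), they overlap when the second term is in the time period (but is not the end point of the time period).
4. If the first and second terms are both instantaneous events, they overlap only when they are equal.
5. If the starting time is NULL and the finishing time is a known <datetime> value, the finishing time becomes the starting time and we have an event. If the starting time is NULL and the finishing time is an INTERVAL value, then both the finishing and starting times are NULL.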
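The formal expression can be transcribed directly into a small function to see how it behaves at the end points. This is only a sketch in Python, not a vendor implementation: the function name `overlaps` and the helper `jan` are hypothetical, NULLs are not handled, and each period is assumed to have a starting time that is not later than its termination time.

```python
from datetime import date

def overlaps(s1, t1, s2, t2):
    """Transcription of the formal OVERLAPS expression.

    (s1, t1) and (s2, t2) are the start and termination times of the
    two periods. NULLs are not modeled here.
    """
    return (
        (s1 > s2 and not (s1 >= t2 and t1 >= t2))
        or (s2 > s1 and not (s2 >= t1 and t2 >= t1))
        or (s1 == s2 and (t1 != t2 or t1 == t2))
    )

def jan(day):
    """Shorthand for a date in January 2005 (illustration only)."""
    return date(2005, 1, day)

# Abutting periods share only an end point, so they do not overlap.
print(overlaps(jan(1), jan(10), jan(10), jan(20)))
# Periods that share at least one day do overlap.
print(overlaps(jan(1), jan(11), jan(10), jan(20)))
# An instantaneous event inside a period overlaps it ...
print(overlaps(jan(1), jan(10), jan(5), jan(5)))
# ... but an event on the period's end point does not.
print(overlaps(jan(1), jan(10), jan(10), jan(10)))
# Two instantaneous events overlap only when they are equal.
print(overlaps(jan(5), jan(5), jan(5), jan(5)))
print(overlaps(jan(5), jan(5), jan(6), jan(6)))
```

The third term of the expression, (T1 <> T2 OR T1 = T2), is always true for non-NULL values, so two periods with the same starting time always overlap under this definition.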
Please consider how your intuition reacts to these results when the granularity is at the YEAR-MONTH-DAY level. Remember that a day begins at 00:00.
Because the OVERLAPS predicate is not yet common in SQL products, let’s see what we have to do to handle overlapping times. Consider a table of hotel guests with the days of their stays and a table of special events being held at the hotel. The tables might look like this:
CREATE TABLE Guests
(guest_name CHARACTER(30) NOT NULL PRIMARY KEY,
 arrival_date DATE NOT NULL,
 depart_date DATE NOT NULL,
 ...);
Guests
guest_name arrival_date depart_date
==============================================
'Dorothy Gale' '2005-02-01' '2005-11-01'
'Indiana Jones' '2005-02-01' '2005-02-01'
'Don Quixote' '2005-01-01' '2005-10-01'
'James T Kirk' '2005-02-01' '2005-02-28'
'Santa Claus' '2005-12-01' '2005-12-25'
CREATE TABLE Celebrations
(celeb_name CHARACTER(30) PRIMARY KEY,
 start_date DATE NOT NULL,
 finish_date DATE NOT NULL,
 ...);
Celebrations
celeb_name            start_date    finish_date
==================================================
'Apple Month'         '2005-02-01'  '2005-02-28'
'Christmas Season'    '2005-12-01'  '2005-12-25'
'Garlic Festival'     '2005-01-15'  '2005-02-15'
'National Pear Week'  '2005-01-01'  '2005-01-07'
'New Year's Day'      '2005-01-01'  '2005-01-01'
'St Fred's Day'       '2005-02-24'  '2005-02-24'
'Year of the Prune'   '2005-01-01'  '2005-12-31'
The BETWEEN operator will work just fine with single dates that fall between the starting and finishing dates of these celebrations, but please remember that the BETWEEN predicate will include the end point of an interval, and the OVERLAPS predicate will not. To find out if a particular date occurs during an event, you can simply write queries like:
SELECT guest_name, ' arrived during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
   AND arrival_date <> finish_date;
This query will find the guests who arrived at the hotel during each event. The final predicate can be kept, if you want to conform to the ANSI convention, or dropped, if that makes more sense in your situation. From now on, we will keep both end points to make the queries easier to read.
SELECT guest_name, ' arrived during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date;
Results
guest_name       " arrived during "  celeb_name
=========================================================
'Dorothy Gale'   'arrived during'    'Apple Month'
'Dorothy Gale'   'arrived during'    'Garlic Festival'
'Dorothy Gale'   'arrived during'    'Year of the Prune'
'Indiana Jones'  'arrived during'    'Apple Month'
'Indiana Jones'  'arrived during'    'Garlic Festival'
'Indiana Jones'  'arrived during'    'Year of the Prune'
'Don Quixote'    'arrived during'    'National Pear Week'
'Don Quixote'    'arrived during'    'New Year's Day'
'Don Quixote'    'arrived during'    'Year of the Prune'
'James T Kirk'   'arrived during'    'Apple Month'
'James T Kirk'   'arrived during'    'Garlic Festival'
'James T Kirk'   'arrived during'    'Year of the Prune'
'Santa Claus'    'arrived during'    'Christmas Season'
'Santa Claus'    'arrived during'    'Year of the Prune'
The obvious question is which guests were at the hotel during each event. A common programming error when trying to find out if two intervals overlap is to write the query with the BETWEEN predicate, thus:
SELECT guest_name, ' was here during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
    OR depart_date BETWEEN start_date AND finish_date;
This is wrong, because it does not cover the case where the event began and finished during the guest’s visit. Seeing his error, the programmer will sit down and draw a timeline diagram of all four possible overlapping cases, as shown in Figure 13.1.
So the programmer adds more predicates, thus:
SELECT guest_name, ' was here during ', celeb_name
  FROM Guests, Celebrations
 WHERE arrival_date BETWEEN start_date AND finish_date
    OR depart_date BETWEEN start_date AND finish_date
    OR start_date BETWEEN arrival_date AND depart_date
    OR finish_date BETWEEN arrival_date AND depart_date;
Figure 13.1 Timeline Diagram of All Possible Overlapping Cases.
A thoughtful programmer will notice that the last predicate is not needed and might drop it, but either way, this is a correct query. But it is not the best answer. In the case of the overlapping intervals, there are two cases where a guest’s stay at the hotel and an event do not both fall within the same time frame: either the guest checked out before the event started, or the event ended before the guest arrived. If you want to do the logic, that is what the first predicate will work out to be when you also add the conditions that arrival_date <= depart_date and start_date <= finish_date. But it is easier to see in a timeline diagram, thus:
Both cases can be represented in one SQL statement as:
SELECT guest_name, celeb_name
  FROM Guests, Celebrations
 WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));
VIEW GuestCelebrations
guest_name       celeb_name
======================================
'Dorothy Gale'   'Apple Month'
'Dorothy Gale'   'Garlic Festival'
'Dorothy Gale'   'St Fred's Day'
'Dorothy Gale'   'Year of the Prune'
'Indiana Jones'  'Apple Month'
'Indiana Jones'  'Garlic Festival'
'Indiana Jones'  'Year of the Prune'
'Don Quixote'    'Apple Month'
'Don Quixote'    'Garlic Festival'
'Don Quixote'    'National Pear Week'
'Don Quixote'    'New Year's Day'
'Don Quixote'    'St Fred's Day'
'Don Quixote'    'Year of the Prune'
'James T Kirk'   'Apple Month'
'James T Kirk'   'Garlic Festival'
'James T Kirk'   'St Fred's Day'
'James T Kirk'   'Year of the Prune'
'Santa Claus'    'Christmas Season'
'Santa Claus'    'Year of the Prune'
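The claim that the four-BETWEEN query and the NOT form agree, while the two-BETWEEN query misses a case, can be checked by brute force. The sketch below is in Python, with small integers standing in for dates; the function names are hypothetical. It assumes, as the text does, that arrival_date <= depart_date and start_date <= finish_date.

```python
from itertools import product

def two_between(arrival, depart, start, finish):
    """The common but incomplete test: only the guest's end points are checked."""
    return start <= arrival <= finish or start <= depart <= finish

def four_between(arrival, depart, start, finish):
    """The corrected test with all four BETWEEN predicates."""
    return (
        start <= arrival <= finish
        or start <= depart <= finish
        or arrival <= start <= depart
        or arrival <= finish <= depart
    )

def not_disjoint(arrival, depart, start, finish):
    """The NOT form: the stay neither ends before nor begins after the event."""
    return not (depart < start or arrival > finish)

days = range(6)
# Every well-formed (stay, event) pair over a small calendar.
pairs = [
    (a, d, s, f)
    for a, d, s, f in product(days, repeat=4)
    if a <= d and s <= f
]

mismatches = [p for p in pairs if four_between(*p) != not_disjoint(*p)]
missed = [p for p in pairs if not_disjoint(*p) and not two_between(*p)]

print(mismatches)     # empty: the two correct forms always agree
print(missed[0])      # one pair that the two-BETWEEN test fails to report
```

Every pair in `missed` is a case where the event begins and finishes strictly inside the guest's stay, which is exactly the case described in the text.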
Figure 13.2 Timeline Diagram.
This VIEW is handy for other queries. The reason for using the NOT in the WHERE clause is so that you can add or remove it to reverse the sense of the query. For example, to find out how many celebrations each guest could have seen, you would write:
CREATE VIEW GuestCelebrations (guest_name, celeb_name)
AS SELECT guest_name, celeb_name
FROM Guests, Celebrations
WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));
SELECT guest_name, COUNT(*) AS celebcount
FROM GuestCelebrations
GROUP BY guest_name;
Results
guest_name celebcount
=========================
'Dorothy Gale' 4
'Indiana Jones' 3
'Don Quixote' 6
'James T Kirk' 4
'Santa Claus' 2
Then, to find out how many guests were at the hotel during each celebration, you would write:
SELECT celeb_name, COUNT(*) AS guestcount
FROM GuestCelebrations
GROUP BY celeb_name;
Result
celeb_name guestcount
============================
'Apple Month' 4
'Christmas Season' 1
'Garlic Festival' 4
'National Pear Week' 1
'New Year's Day' 1
'St Fred's Day' 3
'Year of the Prune' 5
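Both sets of counts can be reproduced from the sample data. The following sketch assumes Python's built-in sqlite3 module, stores the dates as ISO 8601 text (which sorts in date order), and creates the view without a column list for portability across SQLite versions; only the three columns shown in the sample tables are declared. The variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Guests
(guest_name CHARACTER(30) NOT NULL PRIMARY KEY,
 arrival_date DATE NOT NULL,
 depart_date DATE NOT NULL);

CREATE TABLE Celebrations
(celeb_name CHARACTER(30) NOT NULL PRIMARY KEY,
 start_date DATE NOT NULL,
 finish_date DATE NOT NULL);

CREATE VIEW GuestCelebrations
AS SELECT guest_name, celeb_name
     FROM Guests, Celebrations
    WHERE NOT ((depart_date < start_date) OR (arrival_date > finish_date));
""")

# Sample rows copied from the Guests and Celebrations tables in the text.
conn.executemany("INSERT INTO Guests VALUES (?, ?, ?)", [
    ("Dorothy Gale", "2005-02-01", "2005-11-01"),
    ("Indiana Jones", "2005-02-01", "2005-02-01"),
    ("Don Quixote", "2005-01-01", "2005-10-01"),
    ("James T Kirk", "2005-02-01", "2005-02-28"),
    ("Santa Claus", "2005-12-01", "2005-12-25"),
])
conn.executemany("INSERT INTO Celebrations VALUES (?, ?, ?)", [
    ("Apple Month", "2005-02-01", "2005-02-28"),
    ("Christmas Season", "2005-12-01", "2005-12-25"),
    ("Garlic Festival", "2005-01-15", "2005-02-15"),
    ("National Pear Week", "2005-01-01", "2005-01-07"),
    ("New Year's Day", "2005-01-01", "2005-01-01"),
    ("St Fred's Day", "2005-02-24", "2005-02-24"),
    ("Year of the Prune", "2005-01-01", "2005-12-31"),
])

celeb_counts = dict(conn.execute(
    "SELECT guest_name, COUNT(*) AS celebcount"
    " FROM GuestCelebrations GROUP BY guest_name"
))
guest_counts = dict(conn.execute(
    "SELECT celeb_name, COUNT(*) AS guestcount"
    " FROM GuestCelebrations GROUP BY celeb_name"
))

print(celeb_counts)
print(guest_counts)
```

The two dictionaries hold the same numbers as the two result tables, and the totals in each agree because both queries group the same 19 rows of the view.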
This last query is only part of the story. What the hotel management really wants to know is how many room nights were sold for a