but it at least follows the specs that were given without making too many guesses as to what should have been done.
But can we do this without a loop and get a pure, nonprocedural SQL solution? Yes, there are several ways. Because the purpose of finding this weekday number is to insert a row in the table, why not do that in one procedure instead of finding the number in a function and then doing the insertion in another procedural step? Think at the level of a whole process and not in sequential steps.
This first answer is ugly looking and difficult to generalize, but it is fast if the optimizer factors out the tabular subquery in the WHEN clauses and computes it once. It also uses no local variables.
CREATE PROCEDURE InsertNewWeekDay (IN new_week_nbr INTEGER)
LANGUAGE SQL
INSERT INTO WeeklyReport (week_nbr, day_nbr)
VALUES (new_week_nbr,
        (CASE WHEN 1 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 1
              WHEN 2 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 2
              WHEN 3 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 3
              WHEN 4 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 4
              WHEN 5 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 5
              WHEN 6 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 6
              WHEN 7 NOT IN
                   (SELECT day_nbr FROM WeeklyReport WHERE week_nbr = new_week_nbr)
              THEN 7
              ELSE NULL END)); -- null will violate primary key
The thought process was to get the entire set of weekday numbers present in the week, and then compare them to each value in an ordered list. The CASE expression is just a way to hide that list. Although it is a step forward, it is not yet really a set-oriented solution.
Here is another version that uses a table constructor. This is more compact and easy to generalize. Here we are actually using a set-oriented solution! We are subtracting the set of actual days from the set of all possible days, and then looking at the minimum value in the result to get an answer.
CREATE PROCEDURE InsertNewWeekDay (IN new_week_nbr INTEGER)
LANGUAGE SQL
INSERT INTO WeeklyReport (week_nbr, day_nbr)
SELECT new_week_nbr, MIN(n)
  FROM (VALUES (1), (2), (3), (4), (5), (6), (7)) AS Weekdays(n)
 WHERE NOT EXISTS
       (SELECT *
          FROM WeeklyReport AS W
         WHERE W.week_nbr = new_week_nbr
           AND Weekdays.n = W.day_nbr);
You can also use a pure set operations approach. The set difference operator can remove all of the numbers that are present, so that we can pick the minimum value from the leftovers.
CREATE PROCEDURE InsertNewWeekDay (IN new_week_nbr INTEGER)
LANGUAGE SQL
INSERT INTO WeeklyReport (week_nbr, day_nbr)
SELECT new_week_nbr, MIN(n)
  FROM (VALUES (1), (2), (3), (4), (5), (6), (7)
        EXCEPT
        SELECT day_nbr
          FROM WeeklyReport AS W
         WHERE W.week_nbr = new_week_nbr) AS N(n);
If all seven days are present, the set difference is empty, MIN(n) returns a NULL for the day_nbr, and the NULL will violate the primary-key constraint.
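The set-difference idea and the full-week failure can be tried out with a small sketch. It assumes Python's standard sqlite3 module rather than SQL/PSM, so the procedure becomes a Python function, the table constructor is written as a common table expression, and the WeeklyReport DDL shown here is a guess at the schema (the NOT NULL declarations are spelled out because SQLite does not imply them for a composite key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed DDL: the text only tells us that (week_nbr, day_nbr) is the key.
conn.execute("""
    CREATE TABLE WeeklyReport
    (week_nbr INTEGER NOT NULL,
     day_nbr INTEGER NOT NULL CHECK (day_nbr BETWEEN 1 AND 7),
     PRIMARY KEY (week_nbr, day_nbr))
""")


def insert_new_week_day(conn, new_week_nbr):
    """Insert the lowest day number not yet present in the given week."""
    conn.execute(
        """
        WITH Weekdays(n) AS (VALUES (1), (2), (3), (4), (5), (6), (7))
        INSERT INTO WeeklyReport (week_nbr, day_nbr)
        SELECT :wk, MIN(n)
          FROM (SELECT n FROM Weekdays
                EXCEPT
                SELECT day_nbr FROM WeeklyReport WHERE week_nbr = :wk)
        """,
        {"wk": new_week_nbr},
    )


def days_in_week(conn, week_nbr):
    rows = conn.execute(
        "SELECT day_nbr FROM WeeklyReport WHERE week_nbr = ? ORDER BY day_nbr",
        (week_nbr,),
    )
    return [r[0] for r in rows]


# Hypothetical starting data: week 10 has days 1, 2, and 4.
conn.executemany(
    "INSERT INTO WeeklyReport (week_nbr, day_nbr) VALUES (?, ?)",
    [(10, 1), (10, 2), (10, 4)],
)

insert_new_week_day(conn, 10)            # fills the gap at day 3
days_after_first = days_in_week(conn, 10)

for _ in range(3):                       # adds days 5, 6, and 7
    insert_new_week_day(conn, 10)

try:
    insert_new_week_day(conn, 10)        # week is full, MIN(n) is NULL
    week_full_rejected = False
except sqlite3.IntegrityError:
    week_full_rejected = True

print(days_after_first, days_in_week(conn, 10), week_full_rejected)
```

The eighth call is rejected by the constraint instead of by an explicit test in the code, which is the behavior the procedures in this section rely on.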
Here is a third, generalized version with the Sequence table providing any range of integers desired. Just remember that the DDL has to also match that change.
CREATE PROCEDURE InsertNewWeekDay (IN new_week_nbr INTEGER)
LANGUAGE SQL
INSERT INTO WeeklyReport (week_nbr, day_nbr)
SELECT new_week_nbr, MIN(n)
  FROM (SELECT seq FROM Sequence WHERE seq <= 7 -- change to any value
        EXCEPT
        SELECT day_nbr
          FROM WeeklyReport AS W
         WHERE W.week_nbr = new_week_nbr) AS N(n);
In the case of only seven values, there is not going to be a huge difference in performance among any of these answers. However, with a huge number of values, the use of hashing or bit vector indexes would be a noticeable improvement over a loop.
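To see how the Sequence version generalizes, here is a minimal sketch, again assuming Python's sqlite3 module. The Sequence contents (1 through 31), the helper name lowest_free_number, and the DDL are illustrative assumptions. The WeeklyReport table is declared without a range check on day_nbr, which is one way of making the DDL match a wider range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sequence (seq INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE WeeklyReport
    (week_nbr INTEGER NOT NULL,
     day_nbr INTEGER NOT NULL,
     PRIMARY KEY (week_nbr, day_nbr));
""")
# Hypothetical data: integers 1..31, and a week that already has days 1..7.
conn.executemany("INSERT INTO Sequence (seq) VALUES (?)",
                 [(i,) for i in range(1, 32)])
conn.executemany("INSERT INTO WeeklyReport (week_nbr, day_nbr) VALUES (?, ?)",
                 [(10, d) for d in range(1, 8)])


def lowest_free_number(conn, week_nbr, upper_bound):
    """Return the smallest unused number in 1..upper_bound, or None."""
    row = conn.execute(
        """
        SELECT MIN(n)
          FROM (SELECT seq AS n FROM Sequence WHERE seq <= :ub
                EXCEPT
                SELECT day_nbr FROM WeeklyReport WHERE week_nbr = :wk)
        """,
        {"ub": upper_bound, "wk": week_nbr},
    ).fetchone()
    return row[0]


print(lowest_free_number(conn, 10, 7))    # None: nothing left in 1..7
print(lowest_free_number(conn, 10, 31))   # 8: first gap in the wider range
```

Only the upper bound changes between the two calls; the query text stays the same.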
10.2 Thinking of Columns as Fields
The original code was actually much worse, because the poster wanted to create and drop tables on the fly. The purpose is to load totals into a summary report table.
CREATE PROCEDURE SurveySummary()
LANGUAGE SQL
BEGIN
DECLARE sche_yes INTEGER;
DECLARE sche_no INTEGER;
DECLARE sche_mb INTEGER;
DECLARE sche_other INTEGER;
DECLARE how_yes INTEGER;
DECLARE how_no INTEGER;
DECLARE how_mb INTEGER;
DECLARE how_other INTEGER;
DECLARE paaexpl_yes INTEGER;
DECLARE paaexpl_no INTEGER;
DECLARE paaexpl_mb INTEGER;
DECLARE paaexpl_other INTEGER;
SET sche_yes = (SELECT COUNT(*) FROM SurveyForms WHERE sche = 1);
SET sche_no = (SELECT COUNT(*) FROM SurveyForms WHERE sche = 2);
SET sche_mb = (SELECT COUNT(*) FROM SurveyForms WHERE sche = 3);
SET sche_other = (SELECT COUNT(*) FROM SurveyForms WHERE NOT (sche IN (1, 2, 3)));
SET how_yes = (SELECT COUNT(*) FROM SurveyForms WHERE howwarr = 1);
SET how_no = (SELECT COUNT(*) FROM SurveyForms WHERE howwarr = 2);
SET how_mb = (SELECT COUNT(*) FROM SurveyForms WHERE howwarr = 3);
SET how_other = (SELECT COUNT(*) FROM SurveyForms WHERE NOT (howwarr IN (1, 2, 3)));
SET paaexpl_yes = (SELECT COUNT(*) FROM SurveyForms WHERE paaexpl = 1);
SET paaexpl_no = (SELECT COUNT(*) FROM SurveyForms WHERE paaexpl = 2);
SET paaexpl_mb = (SELECT COUNT(*) FROM SurveyForms WHERE paaexpl = 3);
SET paaexpl_other = (SELECT COUNT(*) FROM SurveyForms WHERE NOT (paaexpl IN (1, 2, 3)));
DELETE FROM SurveyWorkingtable;
INSERT INTO SurveyWorkingtable
VALUES (sche_yes, sche_no, sche_mb, sche_other,
        how_yes, how_no, how_mb, how_other,
        paaexpl_yes, paaexpl_no, paaexpl_mb, paaexpl_other);
END;
Why did the poster create a dozen local variables and then use scalar subqueries to load them? The poster is still thinking in terms of a 3GL programming language. In COBOL or other 3GL languages, the file containing the Construction Survey data would be read in one record at a time, and then each record would be read one field at a time, from left to right. A sequence of IF-THEN statements would look at the fields and increment the appropriate counter. When the entire file is read, the results would be written to the working file for the survey summary.
The poster looked at each column as if it were a field and asked how to get the value for it, in isolation from the whole. The poster had seen the use of a subquery expression and implemented it that way. The subqueries will not be well optimized, so this is actually going to run longer than if the poster had used SQL/PSM to mimic the classic COBOL program for this kind of summary.
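For readers who have not seen the classic record-at-a-time program, a rough sketch of that mindset follows, written in Python instead of COBOL. The sample records and the dictionary layout are invented for illustration; only the column names and the answer codes 1, 2, and 3 come from the procedure above, and a dictionary lookup stands in for the chain of IF-THEN statements.

```python
# Hypothetical sample data; each dict plays the role of one file record.
records = [
    {"sche": 1, "howwarr": 2, "paaexpl": 3},
    {"sche": 1, "howwarr": 9, "paaexpl": 1},
    {"sche": 2, "howwarr": 2, "paaexpl": 3},
]

# Answer codes from the procedure: 1 = yes, 2 = no, 3 = mb, anything else = other.
LABELS = {1: "yes", 2: "no", 3: "mb"}
# Field name in the record mapped to the counter prefix used by the poster.
PREFIX = {"sche": "sche", "howwarr": "how", "paaexpl": "paaexpl"}


def tally_record_at_a_time(records):
    """Read each record, then each field, and bump the matching counter."""
    counters = {
        f"{prefix}_{suffix}": 0
        for prefix in PREFIX.values()
        for suffix in ("yes", "no", "mb", "other")
    }
    for record in records:                      # one record at a time
        for field, prefix in PREFIX.items():    # one field at a time
            suffix = LABELS.get(record[field], "other")
            counters[f"{prefix}_{suffix}"] += 1
    return counters


summary = tally_record_at_a_time(records)
print(summary["sche_yes"], summary["how_other"], summary["paaexpl_mb"])
```

Note that this loop still reads the data only once, while the dozen scalar subqueries in the poster's procedure each ask for their own count over SurveyForms.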
Without repeating a dozen columns again, a set-oriented solution is this:
CREATE PROCEDURE SurveySummary()
LANGUAGE SQL
BEGIN
DELETE FROM SurveyWorkingtable;
INSERT INTO SurveyWorkingtable (sche_yes, sche_no, ..., paaexpl_other)
SELECT SUM (CASE WHEN sche = 1 THEN 1 ELSE 0 END) AS sche_yes,
       SUM (CASE WHEN sche = 2 THEN 1 ELSE 0 END) AS sche_no,
       ...
       SUM (CASE WHEN paaexpl NOT IN (1, 2, 3)
                 THEN 1 ELSE 0 END) AS paaexpl_other
  FROM SurveyForms;
END;
The trick was to ask what you want in each row of a summary table, as a completed unit of work, and not start at the column level. The answer is a tally of answers to some questions. The word tally leads you to SUM() or COUNT(), and you remember the trick with the CASE expression.
The final question is why not use a VIEW to get the summary instead of a procedure?
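One way to explore that question is the sketch below, which assumes Python's sqlite3 module and an invented SurveyForms layout with a few sample rows. The view name SurveySummaryView is hypothetical, and only the four sche tallies are shown; the other columns would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed DDL: the text gives only the column names.
    CREATE TABLE SurveyForms
    (sche INTEGER NOT NULL,
     howwarr INTEGER NOT NULL,
     paaexpl INTEGER NOT NULL);

    CREATE VIEW SurveySummaryView AS
    SELECT SUM(CASE WHEN sche = 1 THEN 1 ELSE 0 END) AS sche_yes,
           SUM(CASE WHEN sche = 2 THEN 1 ELSE 0 END) AS sche_no,
           SUM(CASE WHEN sche = 3 THEN 1 ELSE 0 END) AS sche_mb,
           SUM(CASE WHEN sche NOT IN (1, 2, 3) THEN 1 ELSE 0 END) AS sche_other
      FROM SurveyForms;
""")

# Hypothetical survey answers.
conn.executemany("INSERT INTO SurveyForms VALUES (?, ?, ?)",
                 [(1, 2, 3), (1, 9, 1), (2, 2, 3)])
before = conn.execute("SELECT * FROM SurveySummaryView").fetchone()

# A new form arrives; nothing has to be deleted, reloaded, or re-run.
conn.execute("INSERT INTO SurveyForms VALUES (7, 1, 1)")
after = conn.execute("SELECT * FROM SurveySummaryView").fetchone()

print(before)   # (2, 1, 0, 0)
print(after)    # (2, 1, 0, 1)
```

The view is always current with the base table, whereas the working table is only as fresh as the last call to the procedure.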
10.3 Thinking in Processes, Not Declarations
This is a simple schema for checking items out of an inventory. The original schema lacked keys and constraints that had to be added to give us this: