CHAPTER 10: THINKING IN SQL
CREATE TABLE Users
(user_id CHAR(8) NOT NULL PRIMARY KEY,
 password VARCHAR(10) NOT NULL,
 max_reserves INTEGER NOT NULL
   CHECK (max_reserves >= 0));

CREATE TABLE Reservations
(user_id CHAR(8) NOT NULL
   REFERENCES Users(user_id)
   ON UPDATE CASCADE
   ON DELETE CASCADE,
 item_id INTEGER NOT NULL
   REFERENCES Items(item_id));
The original narrative specification was:
Each user can reserve a maximum of (n) items. Whenever a user reserves something, the “max_reserves” field [sic] of the user is retrieved and checked. Then a record [sic] is inserted into the Reservations table, and the “max_reserves” field [sic] of the user is updated accordingly. I would like to ask if there is a better way to implement this system, because there is a chance that the user reserves more than the maximum number, if he or she is logged in from two computers.
The first proposal was for a stored procedure that looked like this in SQL/PSM:
CREATE PROCEDURE InsertReservations
(IN max_reserves INTEGER,
 IN my_user_id CHAR(8),
 IN my_item_id INTEGER)
LANGUAGE SQL
BEGIN
DECLARE my_count INTEGER;
SET my_count = (SELECT COUNT(*)
                  FROM Reservations
                 WHERE user_id = my_user_id);
IF my_count >= max_reserves
THEN RETURN ('You have Reached you MAX number of items');
ELSE INSERT INTO Reservations (user_id, item_id)
     VALUES (my_user_id, my_item_id);
END IF;
END;
Passing the maximum number of items as a parameter makes no sense, because you have to look it up; as written, the caller can pass any value he or she desires. Having a local variable for the count is redundant; SQL is orthogonal, and the scalar subquery can be used wherever the scalar variable is used.
Rows are not records and columns are not fields. SQL is a declarative language, not a procedural one. So a sequence of procedural steps like “Retrieve → check → insert → update” does not make sense. Instead, you say that you make a reservation such that the user is not over his or her limit. Think of logic, not process.
CREATE PROCEDURE MakeReservation
(IN my_user_id CHAR(8), IN my_item_id INTEGER)
LANGUAGE SQL
BEGIN
-- Insert the row only while the user is still under his or her limit.
-- The comparison is strict: a user already at max_reserves gets no new row.
INSERT INTO Reservations (user_id, item_id)
SELECT my_user_id, my_item_id
  FROM Users AS U
 WHERE U.user_id = my_user_id
   AND U.max_reserves
       > (SELECT COUNT(*)
            FROM Reservations AS R
           WHERE R.user_id = my_user_id);
-- add error handling here
END;
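To watch the predicate do the work, the same statement can be traced outside the procedure. This is only a sketch under assumptions: the Items table is referenced in the text but never shown, so a one-column stand-in is declared here, and the user 'jsmith' and the item numbers are made-up sample data. Literals take the place of the procedure's parameters, and the comparison is strict so that a user already at the limit gets no new row.

```sql
-- Stand-in for the Items table, which the text references but does not show.
CREATE TABLE Items
(item_id INTEGER NOT NULL PRIMARY KEY);

CREATE TABLE Users
(user_id CHAR(8) NOT NULL PRIMARY KEY,
 password VARCHAR(10) NOT NULL,
 max_reserves INTEGER NOT NULL
   CHECK (max_reserves >= 0));

CREATE TABLE Reservations
(user_id CHAR(8) NOT NULL
   REFERENCES Users(user_id)
   ON UPDATE CASCADE
   ON DELETE CASCADE,
 item_id INTEGER NOT NULL
   REFERENCES Items(item_id));

-- Made-up sample data: one user who is allowed a single reservation.
INSERT INTO Items (item_id) VALUES (1), (2);
INSERT INTO Users (user_id, password, max_reserves)
VALUES ('jsmith', 'secret', 1);

-- First request: COUNT(*) is 0 and 1 > 0 is true, so one row is inserted.
INSERT INTO Reservations (user_id, item_id)
SELECT 'jsmith', 1
  FROM Users AS U
 WHERE U.user_id = 'jsmith'
   AND U.max_reserves
       > (SELECT COUNT(*)
            FROM Reservations AS R
           WHERE R.user_id = 'jsmith');

-- Second request: COUNT(*) is now 1 and 1 > 1 is false,
-- so the SELECT returns no rows and nothing is inserted.
INSERT INTO Reservations (user_id, item_id)
SELECT 'jsmith', 2
  FROM Users AS U
 WHERE U.user_id = 'jsmith'
   AND U.max_reserves
       > (SELECT COUNT(*)
            FROM Reservations AS R
           WHERE R.user_id = 'jsmith');

-- Only the reservation for item 1 is present.
SELECT user_id, item_id
  FROM Reservations;
```

The check and the insertion are now one statement rather than separate steps, but how two sessions running it at the same moment interact still depends on the transaction isolation level in force, so that should be verified for the product at hand.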
Instead of recording the tally of reserved items in local storage, you can get it with a subquery expression. In fact, you might want to have a view to use for reports:
CREATE VIEW Loans (user_id, max_reserves, current_loans)
AS
SELECT U.user_id, U.max_reserves, COUNT(*)
FROM Reservations AS R, Users AS U
WHERE R.user_id = U.user_id
GROUP BY U.user_id, U.max_reserves;
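The view as written uses an inner join, so a user with no reservations does not appear in it at all. If a report should also list such users with a count of zero, one possible variant uses an outer join and counts a column from Reservations rather than COUNT(*). The view name LoansAllUsers is an assumption made for this sketch and is not part of the original schema.

```sql
-- Hypothetical variant of the Loans view; the name LoansAllUsers is
-- made up for illustration.
CREATE VIEW LoansAllUsers (user_id, max_reserves, current_loans)
AS
SELECT U.user_id, U.max_reserves,
       COUNT(R.user_id)  -- counts only matched rows, so unmatched users get 0
  FROM Users AS U
       LEFT OUTER JOIN
       Reservations AS R
       ON R.user_id = U.user_id
 GROUP BY U.user_id, U.max_reserves;

-- Report query: users who have reached their limit.
SELECT user_id, max_reserves, current_loans
  FROM LoansAllUsers
 WHERE current_loans >= max_reserves;
```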
10.4 Thinking the Schema Should Look Like the Input Forms
There are several versions of this error. The easiest one is a simple timecard form that gets modeled exactly as it is printed on the paper form:
CREATE TABLE Timecards
(user_id CHAR(8) NOT NULL,
 punch_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
 event_flag CHAR(3) DEFAULT 'IN ' NOT NULL
   CHECK (event_flag IN ('IN ', 'OUT')),
 PRIMARY KEY (user_id, punch_time));
But to answer even basic questions, you have to match up in and out times. Dr. Codd (1979) described a row in an RDBMS as containing a fact, but more than that, it should contain a whole fact and not half of it. The “half-fact” that John showed up at the job at 09:00 Hrs has nothing to do with paying him. I need to know that John was on the job from 09:00 to 17:00 Hrs. The correct design holds a whole fact in each row, thus:
CREATE TABLE Timecards
(user_id CHAR(8) NOT NULL,
 in_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
 out_time TIMESTAMP, -- null means current
 CHECK (in_time < out_time),
 PRIMARY KEY (user_id, in_time));
Many new SQL programmers are scared of NULLs, but this is a good use of them. We do not know the future, so we cannot assign a value to the out_time until we have that information.
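With the whole-fact design, clocking in is an INSERT, clocking out is an UPDATE of the one open row, and the usual questions become simple predicates. The following is a sketch only: the user 'jsmith' and the timestamps are made-up sample values, and the TIMESTAMP literals use Standard SQL syntax, which some products spell differently.

```sql
CREATE TABLE Timecards
(user_id CHAR(8) NOT NULL,
 in_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
 out_time TIMESTAMP, -- null means current
 CHECK (in_time < out_time),
 PRIMARY KEY (user_id, in_time));

-- Clock in: out_time is left NULL because it is not known yet.
-- The CHECK() evaluates to UNKNOWN for a NULL out_time, so the row is accepted.
INSERT INTO Timecards (user_id, in_time)
VALUES ('jsmith', TIMESTAMP '2005-01-03 09:00:00');

-- Who is on the job right now?
SELECT user_id, in_time
  FROM Timecards
 WHERE out_time IS NULL;

-- Clock out: complete the fact by closing the open row.
UPDATE Timecards
   SET out_time = TIMESTAMP '2005-01-03 17:00:00'
 WHERE user_id = 'jsmith'
   AND out_time IS NULL;

-- Completed shifts, one whole fact per row; no matching of IN and OUT rows.
SELECT user_id, in_time, out_time
  FROM Timecards
 WHERE out_time IS NOT NULL;
```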
Another common example is a simple order form that is copied directly into DDL. In skeleton form, the usual layout is something like this:
CREATE TABLE Orders
(order_nbr INTEGER NOT NULL PRIMARY KEY,
 order_total DECIMAL(12,2) NOT NULL,
 ...);
CREATE TABLE OrderDetails
(order_nbr INTEGER NOT NULL,
line_nbr INTEGER NOT NULL,
PRIMARY KEY (order_nbr, line_nbr),
item_id INTEGER NOT NULL
REFERENCES Inventory(item_id),
qty_ordered INTEGER NOT NULL
CHECK (qty_ordered > 0)
);
The order total can be computed from the order details, so it is redundant in the Orders table; but the total was a box on the paper form, so the newbie put it in the table.
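One way to drop the redundant column is to derive the total in a view. The sketch below rests on several assumptions: the text never shows Inventory, so a stand-in with an assumed unit_price column is declared; the REFERENCES Orders clause, the view name OrderTotals, and the sample rows are also additions made for illustration.

```sql
-- Stand-in for Inventory; unit_price is an assumed column.
CREATE TABLE Inventory
(item_id INTEGER NOT NULL PRIMARY KEY,
 unit_price DECIMAL(12,2) NOT NULL
   CHECK (unit_price >= 0.00));

-- Orders without the redundant order_total column.
CREATE TABLE Orders
(order_nbr INTEGER NOT NULL PRIMARY KEY);

CREATE TABLE OrderDetails
(order_nbr INTEGER NOT NULL
   REFERENCES Orders(order_nbr),  -- added for this sketch
 line_nbr INTEGER NOT NULL,
 PRIMARY KEY (order_nbr, line_nbr),
 item_id INTEGER NOT NULL
   REFERENCES Inventory(item_id),
 qty_ordered INTEGER NOT NULL
   CHECK (qty_ordered > 0));

-- The total is computed from the details whenever it is asked for.
CREATE VIEW OrderTotals (order_nbr, order_total)
AS
SELECT D.order_nbr, SUM(D.qty_ordered * I.unit_price)
  FROM OrderDetails AS D, Inventory AS I
 WHERE D.item_id = I.item_id
 GROUP BY D.order_nbr;

-- Made-up sample data.
INSERT INTO Inventory (item_id, unit_price)
VALUES (10, 20.00), (11, 5.50);
INSERT INTO Orders (order_nbr) VALUES (123);
INSERT INTO OrderDetails (order_nbr, line_nbr, item_id, qty_ordered)
VALUES (123, 1, 10, 2), (123, 2, 11, 4);

-- Order 123 totals 2 * 20.00 + 4 * 5.50 = 62.00.
SELECT order_nbr, order_total
  FROM OrderTotals;
```

Pricing from the current Inventory row keeps the sketch short; a schema that must remember the price actually charged would probably carry that price in the detail row instead.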
Nobody is actually buying or shipping a line number. Customers are ordering items, but the lines on the paper form are numbered, so the line numbers are in the OrderDetails table. This is dangerous, because if I repeat the same item on another line, I have to consolidate them in the database. Otherwise, quantity discounts will be missed, and I am wasting storage with redundant data.
For example, suppose the same item appears on two lines of my order #123. Each of those rows then holds only a “half-fact.” One says that I ordered two pairs of lime green pants, and the other says that I ordered three pairs of lime green pants on my order #123. The whole fact is that I ordered five pairs of lime green pants on my order #123.
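The half-facts can be put back together with a grouped query, and the split can be prevented outright by keying the table on the item instead of the line. Both ideas are sketched below under stated assumptions: the table name OrderItems, the item number 42 standing in for the lime green pants, and the sample rows are made up, and the REFERENCES clauses are left out so the fragment stands alone.

```sql
-- The form-shaped table from the text, reduced to the columns that matter here.
CREATE TABLE OrderDetails
(order_nbr INTEGER NOT NULL,
 line_nbr INTEGER NOT NULL,
 PRIMARY KEY (order_nbr, line_nbr),
 item_id INTEGER NOT NULL,
 qty_ordered INTEGER NOT NULL
   CHECK (qty_ordered > 0));

-- Two half-facts about the same item on order #123.
INSERT INTO OrderDetails (order_nbr, line_nbr, item_id, qty_ordered)
VALUES (123, 1, 42, 2), (123, 2, 42, 3);

-- Hypothetical replacement keyed on the item: a second row for
-- item 42 on order #123 would violate the PRIMARY KEY.
CREATE TABLE OrderItems
(order_nbr INTEGER NOT NULL,
 item_id INTEGER NOT NULL,
 qty_ordered INTEGER NOT NULL
   CHECK (qty_ordered > 0),
 PRIMARY KEY (order_nbr, item_id));

-- Consolidate the half-facts into whole facts: 2 + 3 = 5 for (123, 42).
INSERT INTO OrderItems (order_nbr, item_id, qty_ordered)
SELECT order_nbr, item_id, SUM(qty_ordered)
  FROM OrderDetails
 GROUP BY order_nbr, item_id;

SELECT order_nbr, item_id, qty_ordered
  FROM OrderItems;
```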
In 2004, I pointed this out to a programmer who had such a schema. She insisted that they needed the line numbers to be able to reproduce the original order exactly as it was keyed in. But in a following posting in the same thread, she complained that her people were spending hours every day verifying the quantity of items in orders they received, because their suppliers did not use the proper model to present a consolidated, sorted display of the data.