6.2.2.3 Other CASE Expressions The SQL-92 standard defines other functions in terms of the CASE expression, which makes the language a bit more compact and easier to implement.. COALESCE
Trang 1112 CHAPTER 6: CODING CHOICES
WHEN 9 THEN 'N/A' ELSE NULL END
could also be written as:
CASE WHEN iso_sex_code = 0 THEN 'Unknown' WHEN iso_sex_code = 1 THEN 'Male' WHEN iso_sex_code = 2 THEN 'Female' WHEN iso_sex_code = 9 THEN 'N/A' ELSE NULL END
There is a gimmick in this definition, however The expression:
CASE foo WHEN 1 THEN 'bar' WHEN NULL THEN 'no bar' END
becomes:
CASE WHEN foo = 1 THEN 'bar' WHEN foo = NULL THEN 'no_bar' —error!
ELSE NULL END
The second WHEN clause is always UNKNOWN Use the simple CASE expression when it is appropriate
6.2.2.3 Other CASE Expressions
The SQL-92 standard defines other functions in terms of the CASE expression, which makes the language a bit more compact and easier to implement For example, the COALESCE () function can be defined for one or two expressions by:
1 COALESCE (<value exp #1>) is equivalent to (<value exp #1>)
2 COALESCE (<value exp #1>, <value exp #2>) is equivalent to:
CASE WHEN <value exp #1> IS NOT NULL THEN <value exp #1>
ELSE <value exp #2> END
Trang 26.2 Pick Compact Constructions over Longer Equivalents 113
Then we can recursively define it for (n) expressions, where
(n >= 3), in the list by:
COALESCE (<value exp #1>, <value exp #2>, , n) as
equivalent to:
CASE WHEN <value exp #1> IS NOT NULL
THEN <value exp #1>
ELSE COALESCE (<value exp #2>, , n)
END
Likewise, NULLIF (<value exp #1>, <value exp #2>) is
equivalent to:
CASE WHEN <value exp #1> = <value exp #2>
THEN NULL
ELSE <value exp #1> END
Use the most compact form of these CASE expressions, and do not
expand them out to their definitions
6.2.3 Avoid Redundant Expressions
Rationale:
Most modern SQL engines are pretty smart This was not always the case,
so older SQL programmers will sometimes add redundant predicates to a
where clause For example, if none of the columns in the table Foobar is
NULL-able, then given:
SELECT a, b, c
FROM Foobar
WHERE a = b
AND b = c
AND c = a;
One of the three search conditions is redundant, because it can be
deduced from the other two Redundant predicates only confuse the
human readers and do not give information to a good optimizer
Trang 3114 CHAPTER 6: CODING CHOICES
Exceptions:
If your SQL has a bad optimizer and needs the extra help, then add redundant predicates
6.2.4 Seek a Compact Form
Rationale:
Many of the earlier SQL engines could not use an index on a column if it were in an expression, and they did not do any algebraic optimizations
Today, we do this bit of cleanup work because a simpler form of an expression is easier to maintain and to read:
SELECT a, b, c FROM Foobar WHERE a + 2 = b - 4;
And a little algebra becomes:
SELECT a, b, c FROM Foobar WHERE a = b + 2;
Exceptions:
If your SQL has a really good optimizer, and the complicated form is easier for a human being to read for some reason, then use it Sometimes there is no simple form
6.2.4.1 Use BETWEEN, Not AND-ed Predicates
Rationale:
Consider this simple query:
SELECT a, b, c FROM Foobar WHERE a <= b AND b <= c;
which can be written as:
SELECT a, b, c FROM Foobar WHERE b BETWEEN a AND c;
Trang 46.2 Pick Compact Constructions over Longer Equivalents 115
The BETWEEN is more compact and gives the reader information about the relationship among three columns that might not be so obvious amid a longer list of search conditions
Exceptions:
This rule makes sense from a readability standpoint, but it does not always stand up in terms of performance Consider DB2 for z/OS in which “<column name> BETWEEN <expression> AND <expression> is both indexable and a stage one predicate.” Without explaining what a stage one predicate is, it is preferred for performance
However, “<value> BETWEEN <column name 1>AND <column name 2>” is both stage two and nonindexable, but formulating the same using two <= predicates could be both stage one and indexable and therefore preferable for performance Likewise, the same execution plan applies to
“<column name 1> BETWEEN <column name 2> AND <column name 3>” predicates This will differ from DBMS to DBMS and platform to platform As optimizers get better, this will be less and less true
6.2.4.2 Use IN(), Not OR-ed predicates
Rationale:
The IN() predicate was first introduced in the Pascal programming language In SQL it has two forms; the list and the subquery The list form has a comma-separated list of values or expressions on the right-hand side The predicate returns a TRUE result if there is a match in that list with the left-hand side of the predicate It is shorthand for a list or OR-ed predicates For example consider:
SELECT a, b, c
FROM Foobar
WHERE a = b
OR a = c;
which can be written as:
SELECT a, b, c
FROM Foobar
WHERE a IN (b, c);
The IN() is more compact and gives the reader information about the relationship among three columns that might not be so obvious amid a
Trang 5116 CHAPTER 6: CODING CHOICES
longer list of search conditions The list can also consist of scalar expressions, but that is not common
Exceptions:
Watch out for NULLs! The IN () predicate is defined as a chain of OR-ed predicates, thus:
a IN (x, y, z) means ((a = x) OR (a = y) OR (a = z))
Therefore:
a IN (x, y, NULL) means ((a = x) OR (a = y) OR (a = NULL)) ((a = x) OR (a = y) OR UNKNOWN)
We are now in SQL’s three-valued logic Remember that a NULL is not the same thing as an UNKNOWN; SQL-92 has no Boolean data type; and you cannot use AND, OR, and NOT on a NULL
The NOT IN () predicate is defined as the negation of the IN():
a NOT IN (x, y, z)
means:
NOT (a IN (x, y, z)) NOT ((a = x) OR (a = y) OR (a = z)) (NOT(a = x) AND NOT(a = y) AND NOT(a = z)) DeMorgan's law ((a <> x) AND (a <> y) AND (a <> z)) definition
Now put in a NULL for one of the list elements:
((a <> x) AND (a <> y) AND UNKNOWN) (UNKNOWN)
If you wish to have a match on a NULL in a list, then you can COALESCE() the NULLs to the left-hand expression, thus:
WHERE a IN (x, y, COALESCE (z, a))