CHAPTER 10: VALUED PREDICATES
non-NULL values is defined by Table 10.1, where Degree means the number of columns in the row expression.
Table 10.1 Cases Where a Row Is a Mix of NULL and non-NULL Values

Expression     R IS NULL   R IS NOT NULL   NOT R IS NULL   NOT R IS NOT NULL
============================================================================
Degree = 1
  NULL         TRUE        FALSE           FALSE           TRUE
  No NULL      FALSE       TRUE            TRUE            FALSE
============================================================================
Degree > 1
  All NULLs    TRUE        FALSE           FALSE           TRUE
  Some NULLs   FALSE       FALSE           TRUE            TRUE
  No NULLs     FALSE       TRUE            TRUE            FALSE
Note that R IS NOT NULL has the same result as NOT R IS NULL if and only if R is of degree 1. This is a break in the usual pattern of predicates with a NOT option in them. Here are some examples:
(1, 2, 3) IS NULL = FALSE
(1, NULL, 3) IS NULL = FALSE
(1, NULL, 3) IS NOT NULL = FALSE
(NULL, NULL, NULL) IS NULL = TRUE
(NULL, NULL, NULL) IS NOT NULL = FALSE
NOT (1, 2, 3) IS NULL = TRUE
NOT (1, NULL, 3) IS NULL = TRUE
NOT (1, NULL, 3) IS NOT NULL = TRUE
NOT (NULL, NULL, NULL) IS NULL = FALSE
NOT (NULL, NULL, NULL) IS NOT NULL = TRUE
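The rules in Table 10.1 can be modeled outside of SQL to check the examples. What follows is only a minimal sketch in Python, assuming that None stands in for NULL and that a tuple stands in for a row expression; the function names row_is_null and row_is_not_null are hypothetical and are not part of any SQL product.

```python
from typing import Optional, Sequence


def row_is_null(row: Sequence[Optional[object]]) -> bool:
    """Model of R IS NULL: TRUE only when every column is NULL (None)."""
    return all(value is None for value in row)


def row_is_not_null(row: Sequence[Optional[object]]) -> bool:
    """Model of R IS NOT NULL: TRUE only when no column is NULL (None)."""
    return all(value is not None for value in row)


examples = [(1, 2, 3), (1, None, 3), (None, None, None)]

for row in examples:
    # Columns: R IS NULL, R IS NOT NULL, NOT R IS NULL, NOT R IS NOT NULL
    print(
        row,
        row_is_null(row),
        row_is_not_null(row),
        not row_is_null(row),
        not row_is_not_null(row),
    )
```

For the mixed row (1, None, 3), both row_is_null and row_is_not_null return False, which is why R IS NOT NULL and NOT R IS NULL disagree once the degree is greater than 1.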
10.1.1 Sources of NULLs
It is important to remember where NULLs can occur. They are more than just a possible value in a column. Aggregate functions on empty sets, OUTER JOINs, arithmetic expressions with NULLs, and OLAP operators all return NULLs. These constructs often show up as columns in VIEWs.
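Three of these sources can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for an SQL product; the Departments table, the column names, and the data are made up for illustration, and other products may differ in detail. Every column is declared NOT NULL, yet NULLs still come back from the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments
    (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
    CREATE TABLE Personnel
    (emp_id INTEGER PRIMARY KEY, dept_id INTEGER NOT NULL, salary INTEGER NOT NULL);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO Personnel VALUES (10, 1, 40000);
""")

# Aggregate function on an empty set: nobody earns more than a million.
empty_sum = conn.execute(
    "SELECT SUM(salary) FROM Personnel WHERE salary > 1000000"
).fetchone()[0]

# OUTER JOIN: 'Research' has no personnel, so emp_id is padded with NULL.
outer_rows = conn.execute("""
    SELECT D.dept_name, P.emp_id
      FROM Departments AS D
           LEFT OUTER JOIN Personnel AS P
           ON P.dept_id = D.dept_id
     ORDER BY D.dept_id
""").fetchall()

# Arithmetic with a NULL operand propagates the NULL.
null_arithmetic = conn.execute("SELECT 1 + NULL").fetchone()[0]

print(empty_sum, outer_rows, null_arithmetic)
```

In Python, the sqlite3 module reports an SQL NULL as None.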
10.2 IS [NOT]{TRUE | FALSE | UNKNOWN} Predicate
This predicate tests a condition that has the truth-value TRUE, FALSE, or UNKNOWN, and returns TRUE or FALSE. The syntax is:
<Boolean test> ::=
<Boolean primary> [IS [NOT] <truth value>]
<truth value> ::= TRUE | FALSE | UNKNOWN
<Boolean primary> ::=
<predicate> | <left paren> <search condition> <right paren>
As you would expect, the expression x IS NOT <truth value> is the same as NOT (x IS <truth value>), so the predicate can be defined as shown in Table 10.2.
Table 10.2 Defining the Predicate: True, False, or Unknown
IS
condition | TRUE FALSE UNKNOWN
=======================================
TRUE | TRUE FALSE FALSE
FALSE | FALSE TRUE FALSE
UNKNOWN | FALSE FALSE TRUE
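Table 10.2 can be reproduced with a few lines of code. This is only an illustrative model in Python, assuming that True, False, and None stand for TRUE, FALSE, and UNKNOWN; the names is_test and is_not_test are hypothetical.

```python
from typing import Optional

# Assumed mapping of SQL truth values to Python objects.
TRUTH_VALUES = {"TRUE": True, "FALSE": False, "UNKNOWN": None}


def is_test(condition: Optional[bool], truth_value: Optional[bool]) -> bool:
    """Model of <condition> IS <truth value>; the result is never UNKNOWN."""
    return condition is truth_value


def is_not_test(condition: Optional[bool], truth_value: Optional[bool]) -> bool:
    """Model of <condition> IS NOT <truth value>, defined as the negation."""
    return not is_test(condition, truth_value)


# Build the whole table: rows are conditions, columns are truth values.
table = {
    (cond_name, tv_name): is_test(cond, tv)
    for cond_name, cond in TRUTH_VALUES.items()
    for tv_name, tv in TRUTH_VALUES.items()
}

for cond_name in TRUTH_VALUES:
    print(cond_name, [table[(cond_name, tv_name)] for tv_name in TRUTH_VALUES])
```

Every entry in the table is a plain True or False, which reflects the point that the predicate maps three-valued logic back into two values.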
If you are familiar with some of Chris Date’s writings, his MAYBE(x) predicate is not the same as the ANSI (x) IS NOT FALSE predicate, but it is equivalent to the (x) IS UNKNOWN predicate. Date’s predicate excludes the case where all conditions in the predicate are TRUE.

Date points out that it is difficult to ask a conditional question in English. To borrow one of his examples (Date 1990), consider the problem of finding employees who might be programmers born before January 18, 1975 with a salary less than $50,000. The statement of the problem is a bit unclear as to what the “might be” covers: just being a programmer, or all three conditions. Let’s assume that we want some doubt on any of the three conditions. With this predicate, the answer is fairly easy to write:
SELECT *
FROM Personnel
WHERE (job = 'Programmer'
AND dob < CAST ('1975-01-18' AS DATE)
AND salary < 50000) IS UNKNOWN;
This could be expanded in the old SQL-89 to:

SELECT *
  FROM Personnel
 WHERE (job IS NULL AND dob < CAST ('1975-01-18' AS DATE) AND salary < 50000.00)
    OR (job = 'Programmer' AND dob IS NULL AND salary < 50000.00)
    OR (job = 'Programmer' AND dob < CAST ('1975-01-18' AS DATE) AND salary IS NULL)
    OR (job IS NULL AND dob IS NULL AND salary < 50000.00)
    OR (job IS NULL AND dob < CAST ('1975-01-18' AS DATE) AND salary IS NULL)
    OR (job = 'Programmer' AND dob IS NULL AND salary IS NULL)
    OR (job IS NULL AND dob IS NULL AND salary IS NULL);

The problem is that every possible combination of NULLs and non-NULLs has to be considered. Since there are three predicates involved, this gives us (2^3) = 8 combinations to check out. The combination with no NULLs at all is left out of the query, because it can only be TRUE or FALSE and never UNKNOWN, which leaves seven to test. The IS UNKNOWN predicate does not have to bother with the combinations, only the final logical value.
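The two formulations can be compared on a small table. The sketch below is an experiment rather than a definitive test: it runs on SQLite through Python's sqlite3 module, and it rests on several assumptions. SQLite has no IS UNKNOWN, so IS NULL applied to the whole Boolean expression is used in its place; dates are stored as ISO-8601 text so that string comparison orders them correctly; and the sample rows are invented. The expansion is generated from the 2^3 NULL and non-NULL choices, skipping the one with no NULLs, since that combination can never be UNKNOWN.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Personnel "
    "(emp_id INTEGER PRIMARY KEY, job TEXT, dob TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Personnel VALUES (?, ?, ?, ?)",
    [
        (1, "Programmer", "1970-05-01", 40000),  # all conditions TRUE
        (2, None, "1970-05-01", 40000),          # job is in doubt
        (3, "Programmer", None, None),           # dob and salary in doubt
        (4, None, "1980-01-01", 40000),          # dob test is FALSE
        (5, "Manager", None, 40000),             # job test is FALSE
        (6, None, None, None),                   # everything in doubt
    ],
)

# Stand-in for "(...) IS UNKNOWN" in SQLite.
maybe_sql = """
    SELECT emp_id
      FROM Personnel
     WHERE (job = 'Programmer'
            AND dob < '1975-01-18'
            AND salary < 50000) IS NULL
     ORDER BY emp_id
"""
maybe_ids = [row[0] for row in conn.execute(maybe_sql)]

# (condition when the column is known, condition when it is NULL)
tests = [
    ("job = 'Programmer'", "job IS NULL"),
    ("dob < '1975-01-18'", "dob IS NULL"),
    ("salary < 50000", "salary IS NULL"),
]
branches = []
for choice in product((0, 1), repeat=len(tests)):
    if not any(choice):
        continue  # no NULLs at all: TRUE or FALSE, never UNKNOWN
    parts = [tests[i][pick] for i, pick in enumerate(choice)]
    branches.append("(" + " AND ".join(parts) + ")")

expanded_sql = (
    "SELECT emp_id FROM Personnel WHERE "
    + " OR ".join(branches)
    + " ORDER BY emp_id"
)
expanded_ids = [row[0] for row in conn.execute(expanded_sql)]

print(len(branches), maybe_ids, expanded_ids)
```

On this data both queries select the same employees. Adding a fourth condition would double the number of branches in the expansion, while the predicate form would grow by one line.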
10.3 IS [NOT] NORMALIZED Predicate
<string> IS [NOT] NORMALIZED determines whether a Unicode string is in one of the four normal forms (D, C, KD, or KC). The use of the words “normal form” here is not the same as in a relational context. In the Unicode model, a single character can be built from several other characters. Accent marks can be put on basic Latin letters. Certain combinations of letters can be displayed as ligatures (‘ae’ becomes ‘æ’). Some languages, such as Korean (Hangul) and Vietnamese, build glyphs by concatenating symbols in two dimensions. Some languages have special forms of one letter that are determined by context, such as the terminal sigma in Greek or accented ‘u’ in Czech. In short, writing is more complex than just putting one letter after another.
The Unicode standard defines the order of such constructions in their normal forms. You can still produce the same results with different orderings and sometimes with different combinations of symbols. But it is very handy when you are searching such text to know that it is normalized, rather than to try to parse each glyph on the fly. You can find details about normalization and links to free software at www.unicode.org.
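The idea behind the predicate can be tried out with Python's standard unicodedata module, which implements the Unicode normal forms under the names NFD, NFC, NFKD, and NFKC. This is a sketch of the concept only, not of any SQL product's implementation; the helper is_normalized is a hypothetical name that simply checks whether normalizing a string leaves it unchanged.

```python
import unicodedata


def is_normalized(text: str, form: str = "NFC") -> bool:
    """Rough analogue of <string> IS NORMALIZED for one normal form."""
    return unicodedata.normalize(form, text) == text


composed = "\u00e9"     # e with acute accent as a single code point
decomposed = "e\u0301"  # letter e followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"     # LATIN SMALL LIGATURE FI

# The two spellings look the same on screen but compare unequal as strings.
print(composed == decomposed)

# Each spelling is normalized in a different form.
print(is_normalized(composed, "NFC"), is_normalized(decomposed, "NFC"))
print(is_normalized(composed, "NFD"), is_normalized(decomposed, "NFD"))

# After normalizing to one form, the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)

# The compatibility forms (KD, KC) also replace ligatures with plain letters.
print(unicodedata.normalize("NFKC", ligature))
```

Knowing that stored text is already in one form means a search can compare strings directly instead of normalizing both sides on every comparison.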
CHAPTER 11
CASE Expressions
The CASE expression is probably the most useful addition in SQL-92. This is a quick overview of how to use the expression, but you will find other tricks spread throughout the book.

The reason it is so important is that:

1. It works with any data type.
2. It allows the programmer to avoid procedural code by replacing IF-THEN-ELSE control flow with a CASE expression inside the query.
3. It makes SQL statements equivalent to primitive recursive functions. You can look up what that means in a book on the theory of computation, but it is a nice mathematical property that guarantees certain kinds of problems can be solved.
11.1 The CASE Expression
The CASE expression allows the programmer to pick a value based on a logical expression in his code. ANSI borrowed the idea and the syntax from the Ada programming language. Here is the syntax for a <case specification>:
<case specification> ::= <simple case> | <searched case>

<simple case> ::=
    CASE <case operand>
      <simple when clause>...
      [<else clause>]
    END

<searched case> ::=
    CASE <searched when clause>...
      [<else clause>]
    END

<simple when clause> ::= WHEN <when operand> THEN <result>

<searched when clause> ::= WHEN <search condition> THEN <result>

<else clause> ::= ELSE <result>

<case operand> ::= <value expression>

<when operand> ::= <value expression>

<result> ::= <result expression> | NULL

<result expression> ::= <value expression>
The searched CASE expression is probably the most used version of the expression. First, the expression is given a data type by finding the highest data type in its THEN clauses. The WHEN ... THEN clauses are evaluated in left-to-right order. The first WHEN clause that tests TRUE returns the value given in its THEN clause.

And, yes, you can nest CASE expressions inside each other. If no explicit ELSE clause is given for the CASE expression, then the database will insert an implicit ELSE NULL clause. If you wish to return a NULL from a THEN, however, you should use a CAST (NULL AS <data type>) expression to establish the data type for the compiler.
-- this works
CASE WHEN 1 = 1
     THEN NULL
     ELSE CAST(NULL AS INTEGER) END

-- this works
CASE WHEN 1 = 1
     THEN CAST(NULL AS INTEGER)
     ELSE NULL END

-- this does not work; no <result> to establish a data type
CASE WHEN 1 = 1
     THEN NULL
     ELSE NULL END

-- might or might not work in your SQL
CAST (CASE WHEN 1 = 1
           THEN NULL
           ELSE NULL END AS INTEGER)
I recommend always writing an explicit ELSE clause, so that you can change it later when you find a value to return. I would also recommend that you explicitly cast a NULL in the CASE expression THEN clause to the desired data type.

If the THEN clauses have results of different data types, the compiler will find the most general one and CAST() the others to it. But again, actual implementations might have slightly different ideas about how and when this casting should be done.
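The evaluation rules for the searched CASE expression (the first WHEN clause that tests TRUE wins, and a missing ELSE behaves like ELSE NULL) can be observed directly. The sketch below runs on SQLite through Python's sqlite3 module; the function name classify and the salary bands are made up for illustration, and SQLite's loose typing means that it says nothing about the data type rules discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No ELSE clause is written, so the implicit ELSE NULL applies.
BAND_SQL = """
    SELECT CASE WHEN ? >= 100000 THEN 'high'
                WHEN ? >= 50000 THEN 'medium'
                WHEN ? >= 0 THEN 'low'
           END
"""


def classify(salary):
    """Return the salary band chosen by the searched CASE expression."""
    return conn.execute(BAND_SQL, (salary, salary, salary)).fetchone()[0]


# 120000 satisfies all three WHEN clauses; only the first one is used.
print(classify(120000))

# 60000 fails the first test and stops at the second.
print(classify(60000))

# A negative salary matches nothing and falls into the implicit ELSE NULL.
print(classify(-5))

# A NULL salary makes every test UNKNOWN, so it also ends up as NULL.
print(classify(None))
```

Because the last two calls both return NULL for different reasons, an explicit ELSE clause with a distinct value makes the result easier to interpret.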
The <simple case expression> is defined as a searched CASE expression in which all the WHEN clauses are made into equality comparisons against the <case operand>. For example:
CASE iso_sex_code
WHEN 0 THEN 'Unknown'
WHEN 1 THEN 'Male'
WHEN 2 THEN 'Female'
WHEN 9 THEN 'N/A'
ELSE NULL END
This could also be written as:
CASE WHEN iso_sex_code = 0 THEN 'Unknown'
     WHEN iso_sex_code = 1 THEN 'Male'
     WHEN iso_sex_code = 2 THEN 'Female'
     WHEN iso_sex_code = 9 THEN 'N/A'
     ELSE NULL END
There is a gimmick in this definition, however. The expression:
CASE foo WHEN 1 THEN 'bar' WHEN NULL THEN 'no bar' END
becomes:
CASE WHEN foo = 1 THEN 'bar'
     WHEN foo = NULL THEN 'no bar' -- problem!
     ELSE NULL END
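The effect of this expansion can be checked with a short experiment. The sketch below uses SQLite through Python's sqlite3 module; the function names simple_case and searched_case are hypothetical. It contrasts the simple CASE form with a searched CASE that tests for NULL with IS NULL, which is one common way to get the intended result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def simple_case(foo):
    """Simple CASE: WHEN NULL is compared with equality, so it never matches."""
    sql = "SELECT CASE ? WHEN 1 THEN 'bar' WHEN NULL THEN 'no bar' END"
    return conn.execute(sql, (foo,)).fetchone()[0]


def searched_case(foo):
    """Searched CASE: IS NULL is a real test and can return TRUE."""
    sql = """
        SELECT CASE WHEN ? = 1 THEN 'bar'
                    WHEN ? IS NULL THEN 'no bar'
               END
    """
    return conn.execute(sql, (foo, foo)).fetchone()[0]


print(simple_case(1), searched_case(1))

# With a NULL operand, the simple form falls through to the implicit ELSE NULL.
print(simple_case(None), searched_case(None))
```

Only the searched form ever returns 'no bar'.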
The WHEN foo = NULL clause is always UNKNOWN, so the 'no bar' result can never be returned. This definition can get really weird with a random number generator in the expression. Let’s assume that RANDOM() uses a seed value and returns a uniformly distributed random floating point number between 0.0000 and 0.99999999...99 whenever it is called.

This expression can end up in the ELSE clause surprisingly often instead of returning a number word between one and five:
SET pick_one = CASE CAST((5.0 * RANDOM()) + 1 AS INTEGER)
               WHEN 1 THEN 'one'
               WHEN 2 THEN 'two'
               WHEN 3 THEN 'three'
               WHEN 4 THEN 'four'
               WHEN 5 THEN 'five'
               ELSE 'This should not happen' END;
The expansion will reproduce the CAST() expression for each WHEN clause, and the RANDOM() function will be reevaluated each time. You need to be sure that it is evaluated only once:
BEGIN
DECLARE pick_a_number INTEGER;
SET pick_a_number = CAST((5.0 * RANDOM()) + 1 AS INTEGER);
SET pick_one = CASE pick_a_number
WHEN 1 THEN 'one'
WHEN 2 THEN 'two'
WHEN 3 THEN 'three'
WHEN 4 THEN 'four'
WHEN 5 THEN 'five'
ELSE 'This should not happen' END;
END;
The variable pick_a_number is also expanded in the WHEN clause, but because it is not a function call, it is not evaluated over and over.
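The difference between the two versions can be estimated with a simulation. The sketch below is a model in Python, not SQL, and it rests on two assumptions: that a product really does reevaluate RANDOM() once per WHEN clause, and that CAST to INTEGER truncates rather than rounds. Under those assumptions each WHEN clause matches one time in five, so the expanded form should miss all five clauses about (4/5)^5, or roughly a third, of the time. The function names are hypothetical.

```python
import random

WORDS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}
FALLBACK = "This should not happen"


def pick_number(rng: random.Random) -> int:
    """Model of CAST((5.0 * RANDOM()) + 1 AS INTEGER), assuming truncation."""
    return int(5.0 * rng.random() + 1)


def expanded_case(rng: random.Random) -> str:
    """Each WHEN clause draws its own random number, as in the expansion."""
    for candidate, word in WORDS.items():
        if pick_number(rng) == candidate:
            return word
    return FALLBACK


def evaluated_once(rng: random.Random) -> str:
    """The number is drawn once and stored before the CASE is evaluated."""
    pick_a_number = pick_number(rng)
    return WORDS.get(pick_a_number, FALLBACK)


rng = random.Random(42)  # fixed seed so that the run is repeatable
trials = 100_000

fallback_rate = sum(expanded_case(rng) == FALLBACK for _ in range(trials)) / trials
once_fallbacks = sum(evaluated_once(rng) == FALLBACK for _ in range(trials))

print(f"expanded form reached ELSE in {fallback_rate:.1%} of trials")
print(f"evaluate-once form reached ELSE {once_fallbacks} times")
```

The expanded form also skews the results that it does return, since 'one' gets the first chance to match and 'five' the last.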
11.1.1 The COALESCE() and NULLIF() Functions
The SQL-92 Standard defines other functions in terms of the CASE expression, which makes the language a bit more compact and easier to implement. For example, the COALESCE() function can be defined for one or two expressions by:
1) COALESCE (<value exp #1>) is equivalent to (<value exp #1>)

2) COALESCE (<value exp #1>, <value exp #2>) is equivalent to

   CASE WHEN <value exp #1> IS NOT NULL
        THEN <value exp #1>
        ELSE <value exp #2> END
Then we can recursively define it for (n) expressions, where (n >= 3), in the list by:

COALESCE (<value exp #1>, <value exp #2>, ..., n), as equivalent to:

CASE WHEN <value exp #1> IS NOT NULL
     THEN <value exp #1>
     ELSE COALESCE (<value exp #2>, ..., n)
END
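The recursive definition can be checked against a built-in COALESCE() for three arguments. The sketch below uses SQLite through Python's sqlite3 module; the function names coalesce_builtin and coalesce_as_case are hypothetical, and the nested CASE is the definition above unrolled by hand for n = 3.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")


def coalesce_builtin(a, b, c):
    """The product's own COALESCE() with three arguments."""
    return conn.execute("SELECT COALESCE(?, ?, ?)", (a, b, c)).fetchone()[0]


def coalesce_as_case(a, b, c):
    """The same result written as nested CASE expressions."""
    sql = """
        SELECT CASE WHEN ? IS NOT NULL
                    THEN ?
                    ELSE CASE WHEN ? IS NOT NULL
                              THEN ?
                              ELSE ? END
               END
    """
    return conn.execute(sql, (a, a, b, b, c)).fetchone()[0]


# Try every pattern of NULL and non-NULL arguments.
mismatches = []
for a, b, c in product((None, 1), (None, 2), (None, 3)):
    if coalesce_builtin(a, b, c) != coalesce_as_case(a, b, c):
        mismatches.append((a, b, c))

print(coalesce_builtin(None, None, 3), coalesce_as_case(None, 2, 3), mismatches)
```

Note that the CASE form mentions each value expression twice, which matters when the expression is expensive or, like RANDOM(), can return a different value on each evaluation.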
Likewise,
NULLIF (<value exp #1>, <value exp #2>) is equivalent to: