CHAPTER 11: CASE EXPRESSIONS
CASE WHEN <value exp #1> = <value exp #2>
THEN NULL ELSE <value exp #1> END
11.1.2 CASE Expressions with GROUP BY
To determine how many employees of each gender you have in each department of your Personnel table, you can write:
SELECT dept_nbr,
SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS males,
SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS females
FROM Personnel
GROUP BY dept_nbr;
or:
SELECT dept_nbr,
COUNT(CASE WHEN gender = 'M' THEN 1 ELSE NULL END) AS males,
COUNT(CASE WHEN gender = 'F' THEN 1 ELSE NULL END) AS females
FROM Personnel
GROUP BY dept_nbr;
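As a quick check that the two forms agree, here is a minimal sketch that runs both queries through Python's sqlite3 module. The sample rows and the emp_id column are assumptions made up for illustration; only the Personnel, dept_nbr, and gender names come from the text.

```python
import sqlite3

# Hypothetical sample data; emp_id is an assumed surrogate key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Personnel "
    "(emp_id INTEGER PRIMARY KEY, dept_nbr CHAR(2) NOT NULL, gender CHAR(1) NOT NULL)"
)
conn.executemany(
    "INSERT INTO Personnel (dept_nbr, gender) VALUES (?, ?)",
    [("D1", "M"), ("D1", "F"), ("D1", "F"), ("D2", "M")],
)

# SUM() adds up the ones and zeros produced by the CASE expression.
sum_form = """
SELECT dept_nbr,
       SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS males,
       SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS females
  FROM Personnel
 GROUP BY dept_nbr
 ORDER BY dept_nbr;
"""

# COUNT() skips the NULLs produced by the CASE expression.
count_form = """
SELECT dept_nbr,
       COUNT(CASE WHEN gender = 'M' THEN 1 ELSE NULL END) AS males,
       COUNT(CASE WHEN gender = 'F' THEN 1 ELSE NULL END) AS females
  FROM Personnel
 GROUP BY dept_nbr
 ORDER BY dept_nbr;
"""

sum_rows = conn.execute(sum_form).fetchall()
count_rows = conn.execute(count_form).fetchall()
print(sum_rows)    # [('D1', 1, 2), ('D2', 1, 0)]
print(count_rows)  # same rows as the SUM() form
```

Both forms return the same counts here, including the zero for a department with no rows of one gender.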
I am not sure if there is any general rule as to which form will run faster. Aggregate functions can also appear inside a CASE expression. For example, assume you are given a table of employees’ skills:
CREATE TABLE PersonnelSkills
(emp_id CHAR(11) NOT NULL,
skill_id CHAR(11) NOT NULL,
primary_skill_ind CHAR(1) NOT NULL
CONSTRAINT primary_skill_given
CHECK (primary_skill_ind IN ('Y', 'N')),
PRIMARY KEY (emp_id, skill_id));
Each employee has a row in the table for each of her skills. If the employee has multiple skills, she will have multiple rows in the table, and the primary skill indicator will be a ‘Y’ for her main skill. If she has only one skill (which means one row in the table), the value of primary_skill_ind is indeterminate. The problem is to list each employee once along with her only skill, if she has only one row in the table, or her primary skill, if she has multiple rows in the table.
SELECT emp_id,
CASE WHEN COUNT(*) = 1
THEN MAX(skill_id)
ELSE MAX(CASE WHEN primary_skill_ind = 'Y'
THEN skill_id
ELSE NULL END)
END AS main_skill
FROM PersonnelSkills
GROUP BY emp_id;
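This solution looks at first like a violation of the rule in SQL that prohibits nested aggregate functions, but if you look closely, it is not. The outermost CASE expression resolves to an aggregate function, namely MAX(). The ELSE clause simply has to return an expression inside its MAX() that can be resolved to a single value.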
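The query can be tried out with a small sketch using Python's sqlite3 module. The employee and skill values below are hypothetical sample data, not taken from the text; the table definition and query follow the ones given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE PersonnelSkills
(emp_id CHAR(11) NOT NULL,
 skill_id CHAR(11) NOT NULL,
 primary_skill_ind CHAR(1) NOT NULL
   CONSTRAINT primary_skill_given
   CHECK (primary_skill_ind IN ('Y', 'N')),
 PRIMARY KEY (emp_id, skill_id))
""")

# Hypothetical rows: E1 has a single skill (indicator is irrelevant),
# E2 has three skills with 'cobol' flagged as the primary one.
conn.executemany(
    "INSERT INTO PersonnelSkills VALUES (?, ?, ?)",
    [("E1", "welding", "N"),
     ("E2", "typing", "N"),
     ("E2", "cobol", "Y"),
     ("E2", "sql", "N")],
)

main_skills = conn.execute("""
SELECT emp_id,
       CASE WHEN COUNT(*) = 1
            THEN MAX(skill_id)
            ELSE MAX(CASE WHEN primary_skill_ind = 'Y'
                          THEN skill_id
                          ELSE NULL END)
       END AS main_skill
  FROM PersonnelSkills
 GROUP BY emp_id
 ORDER BY emp_id
""").fetchall()

print(main_skills)  # [('E1', 'welding'), ('E2', 'cobol')]
```

Each branch of the outer CASE is a single aggregate value per group, which is why no aggregate is nested inside another.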
11.1.3 CASE, CHECK() Clauses and Logical Implication
A complicated logical predicate can be put into a CASE expression that returns either 1 (TRUE) or 0 (FALSE):
CONSTRAINT implication_example
CHECK (CASE WHEN dept_nbr = 'D1'
THEN CASE WHEN salary < 44000.00
THEN 1 ELSE 0 END
ELSE 1 END = 1)
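This is a logical implication operator. It is usually written as an arrow (⇒). In English, this condition says that if an employee is in department 'D1', then his salary is less than 44000.00, which is not the same as requiring (dept_nbr = 'D1' AND salary < 44000.00) in the constraint. In standard Boolean logic, there is a simple transformation called the Smisteru rule (after the engineer who discovered it), which says that (a ⇒ b) is equivalent to (NOT (a) OR b).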
In SQL, the Data Declaration Language (DDL) uses predicates in CHECK() constraints and treats TRUE and UNKNOWN alike. The Data Manipulation Language (DML) uses predicates in the WHERE and ON clauses and treats FALSE and UNKNOWN alike. How do you define logical implication with two different rules?
Let’s try the Smisteru transform first:
CREATE TABLE Foobar_DDL_1
(a CHAR(1) CHECK (a IN ('T', 'F')),
b CHAR(1) CHECK (b IN ('T', 'F')),
CONSTRAINT implication_example
CHECK (NOT (A = 'T') OR (B = 'T')));
INSERT INTO Foobar_DDL_1 VALUES ('T', 'T');
INSERT INTO Foobar_DDL_1 VALUES ('T', 'F'); -- fails
INSERT INTO Foobar_DDL_1 VALUES ('T', NULL);
INSERT INTO Foobar_DDL_1 VALUES ('F', 'T');
INSERT INTO Foobar_DDL_1 VALUES ('F', 'F');
INSERT INTO Foobar_DDL_1 VALUES ('F', NULL);
INSERT INTO Foobar_DDL_1 VALUES (NULL, 'T');
INSERT INTO Foobar_DDL_1 VALUES (NULL, 'F');
INSERT INTO Foobar_DDL_1 VALUES (NULL, NULL);
SELECT * FROM Foobar_DDL_1;
Results
a b
===========
NULL T
NULL F
NULL NULL
Now my original version:
CREATE TABLE Foobar_DDL
(a CHAR(1) CHECK (a IN ('T', 'F')),
b CHAR(1) CHECK (b IN ('T', 'F')),
CONSTRAINT implication_example_2
CHECK(CASE WHEN A = 'T'
THEN CASE WHEN B = 'T'
THEN 1 ELSE 0 END
ELSE 1 END = 1));
INSERT INTO Foobar_DDL
VALUES ('T', 'T'),
('T', 'F'), -- fails
('T', NULL),
('F', 'T'), ('F', 'F'), ('F', NULL),
(NULL, 'T'), (NULL, 'F'), (NULL, NULL);
SELECT * FROM Foobar_DDL;
Results
a b
===========
NULL T
NULL F
NULL NULL
operators!
Let’s now look at the query side of the house:
INSERT INTO Foobar_DML VALUES ('T', 'F');
INSERT INTO Foobar_DML VALUES ('T', NULL);
INSERT INTO Foobar_DML VALUES ('F', 'T');
INSERT INTO Foobar_DML VALUES ('F', 'F');
INSERT INTO Foobar_DML VALUES ('F', NULL);
INSERT INTO Foobar_DML VALUES (NULL, 'T');
INSERT INTO Foobar_DML VALUES (NULL, 'F');
INSERT INTO Foobar_DML VALUES (NULL, NULL);
Using the Smisteru rule as the search condition:
SELECT * FROM Foobar_DML WHERE (NOT (A ='T') OR (B = 'T'));
Results
a b
==========
NULL T
Using the original predicate:
SELECT * FROM Foobar_DML WHERE CASE WHEN A = 'T' THEN CASE WHEN B = 'T' THEN 1 ELSE 0 END ELSE 1 END = 1;
Results
a b
==========
NULL T
NULL F
NULL NULL
This is why I used the CASE expression; it works the same way in both the DDL and DML.
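The comparison can be reproduced with a rough sketch in Python's sqlite3 module. It assumes SQLite follows the usual rules (a CHECK() rejects only FALSE, a WHERE keeps only TRUE). The helper names and the scratch table name Foobar are hypothetical, and each row is inserted one at a time so that a rejected row does not affect the others.

```python
import sqlite3
from itertools import product

# The two candidate forms of "a implies b" from the text.
SMISTERU = "(NOT (a = 'T') OR (b = 'T'))"
CASE_FORM = ("(CASE WHEN a = 'T' "
             "THEN CASE WHEN b = 'T' THEN 1 ELSE 0 END "
             "ELSE 1 END = 1)")

# All nine combinations of 'T', 'F', and NULL (None in Python).
PAIRS = list(product(["T", "F", None], repeat=2))


def accepted_by_check(predicate):
    """Rows that survive when the predicate is a CHECK() constraint (DDL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Foobar "
        "(a CHAR(1) CHECK (a IN ('T', 'F')), "
        " b CHAR(1) CHECK (b IN ('T', 'F')), "
        f" CONSTRAINT implication_example CHECK {predicate})"
    )
    kept = []
    for pair in PAIRS:
        try:
            conn.execute("INSERT INTO Foobar VALUES (?, ?)", pair)
            kept.append(pair)
        except sqlite3.IntegrityError:
            pass  # the constraint evaluated to FALSE for this row
    return kept


def selected_by_where(predicate):
    """Rows returned when the predicate is a WHERE search condition (DML)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Foobar (a CHAR(1), b CHAR(1))")
    conn.executemany("INSERT INTO Foobar VALUES (?, ?)", PAIRS)
    rows = conn.execute(f"SELECT a, b FROM Foobar WHERE {predicate}").fetchall()
    return [pair for pair in PAIRS if pair in rows]


for name, pred in (("Smisteru", SMISTERU), ("CASE", CASE_FORM)):
    print(name, "DDL:", accepted_by_check(pred))
    print(name, "DML:", selected_by_where(pred))
```

Under these assumptions the CASE form keeps the same seven rows in both settings, while the Smisteru form keeps eight rows as a constraint and only five as a search condition. Note that the row ('T', NULL) is accepted by the Smisteru constraint but rejected by the CASE constraint.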
11.1.4 Subquery Expressions and Constants
course, there is more to it than that
The four flavors of subquery expressions are tabular, columnar, row, and scalar subquery expressions. As you might guess from the names, the tabular or table subquery returns a table as a result, so it has to appear any place that a table is used in SQL-92, which usually means it is in the FROM clause.
The columnar subquery returns a table with a single column in it. This was the important one in the original SQL-86 and SQL-89 standards, because the IN, <comp op> ALL and <comp op> {ANY|SOME} predicates were based on the ability of the language to convert the single column into a list of comparisons connected by ANDs or ORs.
The row subquery returns a single row. It can be used anywhere a row can be used. It is the basis for the singleton SELECT statement used in the embedded SQL. It is not used too much right now, but with the extension of theta operators to handle row comparisons, it might become more popular.
The scalar subquery returns a single scalar value. It can be used anywhere a scalar value can be used, which usually means it is in the SELECT or WHERE clauses. If a scalar subquery returns an empty result, it is converted to a NULL. If a scalar subquery returns more than one row, you get a cardinality violation.
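The empty-result case can be seen in a minimal sketch with Python's sqlite3 module: a scalar subquery that finds no row yields NULL, which Python reports as None. The table contents are hypothetical. SQLite does not raise an error when a scalar subquery finds more than one row, so the cardinality violation required by the standard is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical one-row table, for illustration only.
conn.execute("CREATE TABLE Personnel (emp_id INTEGER PRIMARY KEY, dept_nbr CHAR(2))")
conn.execute("INSERT INTO Personnel VALUES (1, 'D1')")

# The scalar subquery looks for an employee that does not exist,
# so it returns an empty result and is treated as NULL.
row = conn.execute("""
SELECT emp_id,
       (SELECT dept_nbr FROM Personnel WHERE emp_id = 99) AS missing_dept
  FROM Personnel
""").fetchone()

print(row)  # (1, None)
```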
I will make the very general statement now that the performance of scalar subqueries depends largely on the architecture of the hardware upon which your SQL is implemented. A massively parallel machine can allocate a processor to each scalar subquery and get drastic performance improvement.
A table constant of any shape can be constructed using the VALUES() expression. New SQL programmers think that this is only an option in the INSERT INTO statement, but Standard SQL allows you to use it to build a row as a comma-separated list of scalar expressions.
('GA', 30000, 30399),
('WY', 82000, 83100);
rebuilding it. It has no named base table.
11.2 Rozenshtein Characteristic Functions
A characteristic function converts a logical expression into a one if it is TRUE and into a zero if it is FALSE. The literature uses a lowercase delta (δ) or a capital chi (Χ) as the symbol for this operator. Programmers first saw this in Ken Iverson’s APL programming language, and then later in Donald Knuth’s books on programming theory. The name comes from the fact that it is used to define a set by giving a rule for membership in the set.
David Rozenshtein found ways of implementing characteristic functions with algebraic expressions on numeric columns in the Sybase T-SQL language before they had a CASE expression in their product. Without going into the details, I will borrow Dr. Rozenshtein’s notation and give the major formulas for putting converted numeric comparisons into a computed characteristic function:
δ(a = b) becomes (1 - ABS(SIGN(a - b)))
δ(a <> b) becomes (ABS(SIGN(a - b)))
δ(a < b) becomes (1 - SIGN(1 + SIGN(a - b)))
δ(a <= b) becomes (SIGN(1 - SIGN(a - b)))
δ(a > b) becomes (1 - SIGN(1 - SIGN(a - b)))
δ(a >= b) becomes (SIGN(1 + SIGN(a - b)))
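The comparison formulas are easy to verify by brute force. The sketch below restates them in plain Python rather than SQL. The sign() helper and the delta_* function names are hypothetical stand-ins for SQL's SIGN() and for the notation above, and the check covers only non-NULL integers.

```python
def sign(x):
    """Stand-in for SQL's SIGN(): returns -1, 0, or 1."""
    return (x > 0) - (x < 0)


# One function per formula, written exactly as the algebra above.
def delta_eq(a, b): return 1 - abs(sign(a - b))
def delta_ne(a, b): return abs(sign(a - b))
def delta_lt(a, b): return 1 - sign(1 + sign(a - b))
def delta_le(a, b): return sign(1 - sign(a - b))
def delta_gt(a, b): return 1 - sign(1 - sign(a - b))
def delta_ge(a, b): return sign(1 + sign(a - b))


def formulas_agree(values):
    """Compare each formula with the ordinary comparison it replaces."""
    for a in values:
        for b in values:
            expected = (a == b, a != b, a < b, a <= b, a > b, a >= b)
            actual = (delta_eq(a, b), delta_ne(a, b), delta_lt(a, b),
                      delta_le(a, b), delta_gt(a, b), delta_ge(a, b))
            if tuple(int(e) for e in expected) != actual:
                return False
    return True


print(formulas_agree(range(-3, 4)))  # True
```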
The basic logical operators can also be put into computed characteristic functions. If we ignore NULLs and use standard Boolean logic, we can write these expressions:
NOT δ(a) becomes (1 - δ(a))
(δ(a) AND δ(b)) becomes SIGN(δ(a) * δ(b))
(δ(a) OR δ(b)) becomes SIGN(δ(a) + δ(b))
If you remember George Boole’s original notation for Boolean Algebra, this will look very familiar. But be aware that if a or b is a NULL, then the results will be a NULL and not a one or zero, something Mr. Boole never thought about.
Character strings can be handled with the POSITION function, if you are careful:
δ(a = s) becomes POSITION(a IN s)
δ(a <> s) becomes SIGN(1 - POSITION(a IN s))
Rozenshtein’s book gives more tricks (Rozenshtein 1995), but many of them depend on Sybase’s T-SQL functions and are not portable. Another problem is that the code can become very hard to read, so that what is happening is not obvious to the next programmer to read the code. The CASE expression is usually the better choice, since a human being must maintain the code.
CHAPTER 12
LIKE Predicate
<like predicate> ::=
<match value> [NOT] LIKE <pattern>
[ESCAPE <escape character>]
<match value> ::= <character value expression>
<pattern> ::= <character value expression>
<escape character> ::= <character value expression>
Two wildcards are allowed in the <pattern> string. They are the ‘%’ and ‘_’ characters. The ‘_’ character represents a single arbitrary character; the ‘%’ character represents an arbitrary substring, possibly of length zero. Notice that there is no way to represent zero or one arbitrary character. This is not the case in many text-search languages, and can lead to