Joe Celko's SQL for Smarties - Advanced SQL Programming P29


252 CHAPTER 11: CASE EXPRESSIONS

CASE WHEN <value exp #1> = <value exp #2>

THEN NULL ELSE <value exp #1> END

11.1.2 CASE Expressions with GROUP BY

To determine how many employees of each gender you have in each department of your Personnel table, you can write:

SELECT dept_nbr,
       SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS males,
       SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS females
  FROM Personnel
 GROUP BY dept_nbr;

or:

SELECT dept_nbr,
       COUNT(CASE WHEN gender = 'M' THEN 1 ELSE NULL END) AS males,
       COUNT(CASE WHEN gender = 'F' THEN 1 ELSE NULL END) AS females
  FROM Personnel
 GROUP BY dept_nbr;
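As a quick check that the two forms agree, here is a minimal sketch that runs both through SQLite from Python's standard sqlite3 module. The engine choice, the emp_id column, and the sample rows are assumptions made for illustration; only dept_nbr and gender come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal Personnel table; emp_id is an assumed surrogate key.
conn.execute(
    "CREATE TABLE Personnel "
    "(emp_id INTEGER PRIMARY KEY, dept_nbr INTEGER NOT NULL, gender CHAR(1) NOT NULL)"
)
conn.executemany(
    "INSERT INTO Personnel (dept_nbr, gender) VALUES (?, ?)",
    [(1, "M"), (1, "F"), (1, "F"), (2, "M"), (2, "M")],  # made-up sample data
)

# Form 1: add up ones and zeros.
sum_form = conn.execute(
    """SELECT dept_nbr,
              SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS males,
              SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS females
         FROM Personnel
        GROUP BY dept_nbr
        ORDER BY dept_nbr"""
).fetchall()

# Form 2: COUNT(<expression>) skips NULLs, so only matching rows are counted.
count_form = conn.execute(
    """SELECT dept_nbr,
              COUNT(CASE WHEN gender = 'M' THEN 1 ELSE NULL END) AS males,
              COUNT(CASE WHEN gender = 'F' THEN 1 ELSE NULL END) AS females
         FROM Personnel
        GROUP BY dept_nbr
        ORDER BY dept_nbr"""
).fetchall()

print(sum_form)
print(count_form)
```

On this sample data both queries return the same rows, including a zero for the department with no female employees.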

I am not sure if there is any general rule as to which form will run faster.

expression. For example, assume you are given a table of employees’ skills:

CREATE TABLE PersonnelSkills
(emp_id CHAR(11) NOT NULL,
 skill_id CHAR(11) NOT NULL,
 primary_skill_ind CHAR(1) NOT NULL
   CONSTRAINT primary_skill_given
   CHECK (primary_skill_ind IN ('Y', 'N')),
 PRIMARY KEY (emp_id, skill_id));


Each employee has a row in the table for each of her skills. If the employee has multiple skills, she will have multiple rows in the table, and the primary skill indicator will be 'Y' for her main skill. If she has only one skill (which means one row in the table), the value of primary_skill_ind is indeterminate. The problem is to list each employee once along with her only skill, if she has only one row in the table, or her primary skill, if she has multiple rows in the table.

SELECT emp_id,
       CASE WHEN COUNT(*) = 1
            THEN MAX(skill_id)
            ELSE MAX(CASE WHEN primary_skill_ind = 'Y'
                          THEN skill_id
                          ELSE NULL END)
       END AS main_skill
  FROM PersonnelSkills
 GROUP BY emp_id;

This solution looks at first like a violation of the rule in SQL that prohibits nested aggregate functions, but if you look closely, it is not.

inside its MAX() that can be resolved to a single value.
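The following sketch runs the main-skill query against a tiny data set, assuming SQLite (through Python's sqlite3 module) as a stand-in engine. The employee and skill values are invented for illustration. Employee E2's primary skill is deliberately not the alphabetically highest one, so the output shows that the ELSE branch picks the flagged row rather than a plain MAX(skill_id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE PersonnelSkills
       (emp_id CHAR(11) NOT NULL,
        skill_id CHAR(11) NOT NULL,
        primary_skill_ind CHAR(1) NOT NULL
          CONSTRAINT primary_skill_given
          CHECK (primary_skill_ind IN ('Y', 'N')),
        PRIMARY KEY (emp_id, skill_id))"""
)
conn.executemany(
    "INSERT INTO PersonnelSkills VALUES (?, ?, ?)",
    [
        ("E1", "SQL", "N"),    # single skill: the indicator is indeterminate
        ("E2", "COBOL", "N"),
        ("E2", "APL", "Y"),    # flagged primary skill
        ("E2", "Zed", "N"),    # alphabetically highest, but not primary
    ],
)

main_skills = conn.execute(
    """SELECT emp_id,
              CASE WHEN COUNT(*) = 1
                   THEN MAX(skill_id)
                   ELSE MAX(CASE WHEN primary_skill_ind = 'Y'
                                 THEN skill_id
                                 ELSE NULL END)
              END AS main_skill
         FROM PersonnelSkills
        GROUP BY emp_id
        ORDER BY emp_id"""
).fetchall()

print(main_skills)  # [('E1', 'SQL'), ('E2', 'APL')]
```

The inner CASE sits inside MAX() and is evaluated per row; the outer CASE only chooses between two aggregate results, so no aggregate is nested inside another.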

11.1.3 CASE, CHECK() Clauses and Logical Implication

returns either 1 (TRUE) or 0 (FALSE):

CONSTRAINT implication_example

CHECK (CASE WHEN dept_nbr = 'D1'

THEN CASE WHEN salary < 44000.00

THEN 1 ELSE 0 END

ELSE 1 END = 1)

This is a logical implication operator. It is usually written as an arrow, ⇒


constraint. In standard Boolean logic, there is a simple transformation called the Smisteru rule (after the engineer who discovered it), which

In SQL, the Data Declaration Language (DDL) uses predicates in CHECK() constraints and treats TRUE and UNKNOWN alike. The Data Manipulation Language (DML) uses predicates in the WHERE and ON clauses and treats FALSE and UNKNOWN alike. How do you define logical implication with two different rules?

Let’s try the Smisteru transform first:

CREATE TABLE Foobar_DDL_1
(a CHAR(1) CHECK (a IN ('T', 'F')),
 b CHAR(1) CHECK (b IN ('T', 'F')),
 CONSTRAINT implication_example
 CHECK (NOT (a = 'T') OR (b = 'T')));

INSERT INTO Foobar_DDL_1 VALUES ('T', 'T');

INSERT INTO Foobar_DDL_1 VALUES ('T', 'F'); -- fails

INSERT INTO Foobar_DDL_1 VALUES ('T', NULL);

INSERT INTO Foobar_DDL_1 VALUES ('F', 'T');

INSERT INTO Foobar_DDL_1 VALUES ('F', 'F');

INSERT INTO Foobar_DDL_1 VALUES ('F', NULL);

INSERT INTO Foobar_DDL_1 VALUES (NULL, 'T');

INSERT INTO Foobar_DDL_1 VALUES (NULL, 'F');

INSERT INTO Foobar_DDL_1 VALUES (NULL, NULL);

SELECT * FROM Foobar_DDL_1;

Results

a b

===========

NULL T

NULL F

NULL NULL


Now my original version:

CREATE TABLE Foobar_DDL

(a CHAR(1) CHECK (a IN ('T', 'F')),

b CHAR(1) CHECK (b IN ('T', 'F')),

CONSTRAINT implication_example_2

CHECK(CASE WHEN A = 'T'

THEN CASE WHEN B = 'T'

THEN 1 ELSE 0 END

ELSE 1 END = 1));

INSERT INTO Foobar_DDL
VALUES ('T', 'T'),
       ('T', 'F'), -- fails
       ('T', NULL),
       ('F', 'T'), ('F', 'F'), ('F', NULL),
       (NULL, 'T'), (NULL, 'F'), (NULL, NULL);

SELECT * FROM Foobar_DDL;

Results

a b

===========

NULL T

NULL F

NULL NULL

operators!
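To see how the two CHECK() constraints treat each of the nine (a, b) combinations, the sketch below inserts the rows one at a time and records which ones are rejected. It assumes SQLite via Python's sqlite3 module as the engine; like standard SQL, SQLite rejects a row only when the CHECK() evaluates to FALSE and lets UNKNOWN through. The table name Foobar and the helper function are hypothetical, and other products may behave differently.

```python
import sqlite3

# All nine combinations of 'T', 'F', and NULL (None in Python).
PAIRS = [(a, b) for a in ("T", "F", None) for b in ("T", "F", None)]

CHECKS = {
    "smisteru": "NOT (a = 'T') OR (b = 'T')",
    "case": ("CASE WHEN a = 'T' "
             "THEN CASE WHEN b = 'T' THEN 1 ELSE 0 END "
             "ELSE 1 END = 1"),
}

def rejected_rows(check_sql):
    """Return the pairs that the given CHECK() constraint refuses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Foobar "
        "(a CHAR(1) CHECK (a IN ('T', 'F')), "
        " b CHAR(1) CHECK (b IN ('T', 'F')), "
        f" CONSTRAINT implication_example CHECK ({check_sql}))"
    )
    rejected = []
    for pair in PAIRS:
        try:
            conn.execute("INSERT INTO Foobar VALUES (?, ?)", pair)
        except sqlite3.IntegrityError:
            rejected.append(pair)  # constraint evaluated to FALSE
    conn.close()
    return rejected

ddl_rejected = {name: rejected_rows(sql) for name, sql in CHECKS.items()}
for name, rows in ddl_rejected.items():
    print(name, rows)
```

Under these assumptions the OR form rejects only ('T', 'F'), because ('T', NULL) evaluates to UNKNOWN and is allowed through. The CASE form never yields UNKNOWN, so it rejects ('T', NULL) as well.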

Let’s now look at the query side of the house:


INSERT INTO Foobar_DML VALUES ('T', 'F');

INSERT INTO Foobar_DML VALUES ('T', NULL);

INSERT INTO Foobar_DML VALUES ('F', 'T');

INSERT INTO Foobar_DML VALUES ('F', 'F');

INSERT INTO Foobar_DML VALUES ('F', NULL);

INSERT INTO Foobar_DML VALUES (NULL, 'T');

INSERT INTO Foobar_DML VALUES (NULL, 'F');

INSERT INTO Foobar_DML VALUES (NULL, NULL);

Using the Smisteru rule as the search condition:

SELECT * FROM Foobar_DML WHERE (NOT (A ='T') OR (B = 'T'));

Results

a b

==========

NULL T

Using the original predicate:

SELECT * FROM Foobar_DML
 WHERE CASE WHEN A = 'T'
            THEN CASE WHEN B = 'T'
                      THEN 1 ELSE 0 END
            ELSE 1 END = 1;

Results

a b

==========

NULL T

NULL F

NULL NULL


This is why I used the CASE expression; it works the same way in both the DDL and DML.
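The query-side comparison can be reproduced with a short sketch, again assuming SQLite through Python's sqlite3 module. The Foobar_DML table is created here without constraints, which is an assumption, since its declaration is not shown above. All nine combinations are loaded, and each search condition is applied in turn.

```python
import sqlite3

VALUES = ("T", "F", None)  # None is stored as NULL

conn = sqlite3.connect(":memory:")
# Assumed declaration: two unconstrained CHAR(1) columns.
conn.execute("CREATE TABLE Foobar_DML (a CHAR(1), b CHAR(1))")
conn.executemany(
    "INSERT INTO Foobar_DML VALUES (?, ?)",
    [(a, b) for a in VALUES for b in VALUES],
)

def selected(where_sql):
    """Return the set of (a, b) rows for which the search condition is TRUE."""
    return set(conn.execute("SELECT a, b FROM Foobar_DML WHERE " + where_sql))

smisteru_rows = selected("NOT (a = 'T') OR (b = 'T')")
case_rows = selected(
    "CASE WHEN a = 'T' "
    "THEN CASE WHEN b = 'T' THEN 1 ELSE 0 END "
    "ELSE 1 END = 1"
)

print(len(smisteru_rows), len(case_rows))   # 5 7
print(case_rows - smisteru_rows)            # rows only the CASE form returns
```

In a WHERE clause UNKNOWN behaves like FALSE, so the OR form drops (NULL, 'F') and (NULL, NULL) in addition to the two rows with a = 'T' that lack b = 'T'. The CASE form returns every row except ('T', 'F') and ('T', NULL), which are the same two rows it rejects as a CHECK() constraint in SQLite.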

11.1.4 Subquery Expressions and Constants

course, there is more to it than that.

The four flavors of subquery expressions are tabular, columnar, row, and scalar subquery expressions. As you might guess from the names, the tabular or table subquery returns a table as a result, so it can appear any place that a table is used in SQL-92, which usually means it is in the FROM clause.

The columnar subquery returns a table with a single column in it. This was the important one in the original SQL-86 and SQL-89 standards, because the IN, <comp op> ALL, and <comp op> {ANY|SOME} predicates were based on the ability of the language to

or ORs

The row subquery returns a single row. It can be used anywhere a

statement used in the embedded SQL. It is not used too much right now, but with the extension of theta operators to handle row comparisons, it might become more popular.

The scalar subquery returns a single scalar value. It can be used anywhere a scalar value can be used, which usually means it is in the SELECT or WHERE clauses. If a scalar subquery returns an empty result,

row, you get a cardinality violation.

I will make the very general statement now that the performance of scalar subqueries depends largely on the architecture of the hardware upon which your SQL is implemented. A massively parallel machine can allocate a processor to each scalar subquery and get drastic performance improvement.

A table constant of any shape can be constructed using the VALUES() expression. New SQL programmers think that this is only an

you to use it to build a row as a comma-separated list of scalar


.

('GA', 30000, 30399),

('WY', 82000, 83100);

rebuilding it. It has no named base table.

11.2 Rozenshtein Characteristic Functions

A characteristic function converts a logical expression into a one if it is

literature uses a lowercase delta (δ) or a capital chi (Χ) as the symbol for this operator. Programmers first saw this in Ken Iverson’s APL programming language, and then later in Donald Knuth’s books on programming theory. The name comes from the fact that it is used to define a set by giving a rule for membership in the set.

David Rozenshtein found ways of implementing characteristic functions with algebraic expressions on numeric columns in the Sybase

their product. Without going into the details, I will borrow Dr. Rozenshtein’s notation and give the major formulas for putting converted numeric comparisons into a computed characteristic function:

δ(a = b) becomes (1 - ABS(SIGN(a - b)))
δ(a <> b) becomes (ABS(SIGN(a - b)))
δ(a < b) becomes (1 - SIGN(1 + SIGN(a - b)))
δ(a <= b) becomes (SIGN(1 - SIGN(a - b)))
δ(a > b) becomes (1 - SIGN(1 - SIGN(a - b)))
δ(a >= b) becomes (SIGN(1 + SIGN(a - b)))

The basic logical operators can also be put into computed

logic, we can write these expressions:

NOT δ(a) becomes (1 - δ(a))
(δ(a) AND δ(b)) becomes SIGN(δ(a) * δ(b))
(δ(a) OR δ(b)) becomes SIGN(δ(a) + δ(b))
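The comparison and logical formulas can be checked mechanically. The sketch below is plain Python rather than SQL, with a hand-written sign() standing in for the SQL SIGN() function and hypothetical delta_* helper names. It compares each formula with the ordinary comparison or logical operator over a small grid of integers, on the assumption that no NULLs are involved.

```python
import operator

def sign(x):
    """Stand-in for SQL SIGN(): -1, 0, or 1."""
    return (x > 0) - (x < 0)

# Characteristic functions for numeric comparisons.
def delta_eq(a, b): return 1 - abs(sign(a - b))
def delta_ne(a, b): return abs(sign(a - b))
def delta_lt(a, b): return 1 - sign(1 + sign(a - b))
def delta_le(a, b): return sign(1 - sign(a - b))
def delta_gt(a, b): return 1 - sign(1 - sign(a - b))
def delta_ge(a, b): return sign(1 + sign(a - b))

# Logical operators applied to characteristic values (0 or 1).
def delta_not(p): return 1 - p
def delta_and(p, q): return sign(p * q)
def delta_or(p, q): return sign(p + q)

COMPARISONS = [
    (delta_eq, operator.eq), (delta_ne, operator.ne),
    (delta_lt, operator.lt), (delta_le, operator.le),
    (delta_gt, operator.gt), (delta_ge, operator.ge),
]

# Collect every case where a formula disagrees with the real comparison.
comparison_mismatches = [
    (f.__name__, a, b)
    for a in range(-3, 4)
    for b in range(-3, 4)
    for f, op in COMPARISONS
    if f(a, b) != int(op(a, b))
]

# Same check for the logical operators over all 0/1 inputs.
logic_mismatches = [
    (p, q)
    for p in (0, 1)
    for q in (0, 1)
    if delta_not(p) != int(not p)
    or delta_and(p, q) != int(p and q)
    or delta_or(p, q) != int(p or q)
]

print(comparison_mismatches, logic_mismatches)  # [] []
```

Both lists come back empty, so on non-NULL integers every formula agrees with the operator it replaces.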

If you remember George Boole’s original notation for Boolean Algebra, this will look very familiar. But be aware that if a or b is a NULL, then the results will be a NULL and not a one or zero, something Mr. Boole never thought about.

are careful

δ(a = s) becomes POSITION(a IN s)

δ(a <> s) becomes SIGN(1 - POSITION(a IN s))

Rozenshtein’s book gives more tricks (Rozenshtein 1995), but many of them depend on Sybase’s T-SQL functions and are not portable. Another problem is that the code can become very hard to read, so that what is happening is not obvious to the next programmer to read the

choice, since a human being must maintain the code.


CHAPTER 12: LIKE Predicate

<like predicate> ::=

[ESCAPE <escape character>]

allowed in the <pattern> string. They are the ‘%’ and ‘_’ characters. The ‘_’ character represents a single arbitrary character; the ‘%’ character represents an arbitrary substring, possibly of length zero. Notice that there is no way to represent zero or one arbitrary character. This is not the case in many text-search languages, and can lead to
