defined, 449
for numeric values, 449
with scalar subqueries, 454
for strings, 449
for temporal data types, 449
Median, 512–27
Celko’s first, 514–16
Celko’s second, 517–19
Celko’s third, 522–26
as central tendency measure, 513
with characteristic function, 520–22
Date’s first, 513–14
Date’s second, 516
defined, 512
defining, 523
financial, 521
Henderson’s, 526–27
Murchison’s, 516–17
statistical, 512, 520
Vaughan’s, with VIEWs, 519–20
See also Statistics
MERGE statement, 232–34
correlation name, 233
syntax, 233
Metaphone, 177–81
defined, 177
Pascal version, 177–81
See also Phonetic matching
MIN() function, 349, 449–50
defined, 449
for numeric values, 449
for strings, 450
for temporal data types, 449–50
Minimum subsets, 620
Missing tables, 187
Missing times
in contiguous events, 652–56
end date, 653, 655
start date, 653, 655
See also Time(s)
Missing values, 187–90
in columns, 187–89
context and, 189–90
multiple, 199
See also Values
Modes, 510–12
changes, 511
defined, 510
derived tables for, 512
multiple, 510
MOD() function, 114–15
computation, 605
odd/even determination, 472
Modifications
audit log, 164
bitemporal tables, 165–66
current, 146–50
nonsequenced, 155
sequenced, 150–55
Moreno, Francisco, 364
Multiple aggregation levels, 431–35
CASE expressions for, 434–35
grouped VIEWs for, 432
intent, 431
subquery expressions for, 433–34
Multiple column data elements, 201–9
currency conversion, 205–6
distance functions, 201–2
IP address storage, 202–5
rational numbers, 209
Social Security numbers, 206–9
Multiple criteria
extrema functions, 460–62
forms, 461
ordering, 461
Multiple parameter auxiliary tables, 488–89
Multiple translation auxiliary tables, 487–88
Multivalued dependencies (MVDs), 75, 76
Multivariable descriptive statistics, 546–48
covariance, 546–47
NULLs in, 548
Pearson’s r, 547
Murchison’s median, 516–17
N
Named columns, 576–79
Names
column, 5
SQL, 5–6
table, 5
use guidelines, 5–6
NaN (Not a Number), 103
NATURAL JOINs, 327
Natural keys, 89
NCHAR() data type, 169
Negative values, 114
Nested EXISTs, 410
Nested parenthesis, 565
Nested queries, 745–46
Nested set model, 631–39
acyclic directed graphs, loading, 682
adjacency list conversion to, 637–39
containment property, 634–35
converting, to adjacency list model, 635
converting adjacency list model to, 637–39
counting property, 633–34
defined, 631
deleting nodes and subtrees, 636–37
hierarchical aggregations, 636
results, 631, 632
self-JOIN query, 634
subordinates, 635
See also Hierarchies
Nested sets, 458
Nested UNIQUE constraints, 18–22
defined, 18
example, 18–22
Nested VIEWs, 370, 377–79
drawback, 378
restrictions, 377–78
See also VIEWs
Nesting
aggregate functions, 431, 433, 434
subqueries, 454
VIEWs, 370
Net Present Value (NPV), 494, 497
Nodes
all in graph, viewing, 682–83
children, 623–24
defined, 623
deleting, 630–31, 636–37
depth, 630
descendents, 630
duplicate, 695
edges, 684
finding, 629–30
indegree, 684–85
inserting, 628
internal, 686
isolated, 685–86
leaf, 623, 625, 626
outdegree, 684–85
pairs, 693
reachable, 683–84
root, 629
sink, 685
source, 685
splitting, 682
total number, 691
See also Graphs; Trees
Nonacyclic graphs, 703–4
Nonsequenced queries, 139, 144, 162, 163
Nonsequence modifications, 155
Nonsubversion Rule, 63
Normal forms, 64–87
1NF, 64–69
2NF, 70–71
3NF, 71–72
4NF, 75–76
5NF, 76–78
BCNF, 73–75
defined, 64
DKNF, 78–87
EKNF, 72–73
Normalization, 61–99
denormalization, 91–93
key types, 88–99
normal forms, 64–87
practical hints, 87–88
NOT DETERMINISTIC option, 607
Not equal (<>) operator, 235
NOT EXISTS() predicate, 291–92, 559
outer joins and, 303–4
subquery expression, 435
TRUE return, 302
NOT IN() predicate, 291–92, 294
NOT NULL constraint, 11–12
NULLIF() function, 110–11, 193, 251–52, 473
NULLs, 185–200
arithmetic and, 109–10
avoiding, 87
BETWEEN predicate results, 274
comparing, 190
concept, 109
converting values to/from, 110–13
cosine of, 193
in date fields, 196
dates, 655
design advice for, 195–98
encoding schemes and, 196
as eternity marker, 673
EXISTS predicate and, 300–302
FOREIGN KEYS and, 195
functions and, 193–94
general-purpose, 187, 346
as global, 110
groups and, 427
host languages and, 194–95
in host programs, 197–98
IN() predicate and, 293–95
INTERSECT/EXCEPT with, 600–601
INTERSECT/EXCEPT without, 599–600
introduction, 185–87
logic and, 190–93
math and, 193
multiple values, 198–200
in multivariable descriptive statistics, 548
“Not Applicable,” 188
not using, 186–87
ORDER BY clause and, 329–33
OUTER JOINs and, 342–44
PRIMARY KEY columns and, 195
propagation, 110
quantities and, 196
row comparisons and, 240
rules, 186
sources, 242
in subquery predicates, 191–93
values, treatment of, 62
Number generators, 45
Numbering regions, 551–52
Numbers
approximate, 102
converting to words, 117–18
exact, 102
lists, condensing, 567
lists, folding, 567–68
ordinal, 554
rational, 209
row, 606
sequence, filling in, 560–62
sequence, mapping to cycle, 481–83
Social Security, 206–9
summation, 444
Number theory operators, 113–16
Numeric data, 101–18
MAX() function for, 449
MIN() function for, 449
NUMERIC numeric type, 102
Numeric types, 101–6
BIGINT, 102
conversion, 105–7
DECIMAL, 102
INTEGER, 102
NUMERIC, 102
SMALLINT, 102
Numeric values
approximate, 102–3
exact, 102
NVARCHAR() data type, 169
NYSIIS algorithm, 181–82
O
OCTET_LENGTH() function, 173
ON clause
join conditions and, 354
OUTER JOINs and, 340
search predicate in, 340–41
One-level SELECT statement, 317–24
defined, 317–18
execution order, 318–20
FROM clause, 321, 324
GROUP BY clause, 319, 323
HAVING clause, 319
ORDER BY clause, 328–36
SELECT clause, 319–20
starting tables, 320–21
syntax, 318, 326–28
WHERE clause, 318, 323
See also SELECT statement
One-to-many relationships, 375
One True Lookup Table (OTLT), 491–93
data type choices, 493
defined, 491
See also Lookup auxiliary tables
Online Analytic Processing (OLAP), 709–18
CUBES function, 713–14
defined, 709
DENSE_RANK function, 711
enterprise-wide dimensional layer, 717–18
example, 716–17
functionality, 711–18
functions, specifying, 711
GROUPING operators, 712–14
languages, 710
RANK function, 711
ROLLUP function, 713, 716
ROW_NUMBER function, 711–12
Star Schema, 710–11
window clause, 714–16
Online Transaction Processing (OLTP), 709
data warehousing and, 709–10
speed, 710
OPEN statement, 55
Optimistic concurrency control, 727–29
Optimization, 731–60
Optimizers
<> comparison and, 736
cost-based, 731
defined, 731
“hot spots,” 756
JOIN orderings and, 740
JOIN pairs, 739
knowing, 754–56
rule-based, 731
types, 731
ORDER BY clause, 328–36
CASE expression and, 333–36
cursor and, 328
execution expense, 437
NULLs and, 329–33
rules, applying, 331
SELECT statement and, 328
syntax, 328
Ordering
default, 438
multiple criteria, 461
predicates, 460
strings, 171
subset, 437
Ordinal numbers, 554
ORed predicates, 292–93
OR function, 474–75
ORM (Object Role Model), 78
Outdegree, 684–85
OUTER JOINs, 336–51
aggregate functions and, 348–49
crosstabs by, 543–44
execution order, 341
FULL, 337, 349–50, 351
functioning of, 337–38
LEFT, 337, 343, 351
multiple, 346–48
NATURAL, 344–45
NOT EXISTS predicate and, 303–4
NULL result in, 349
NULLs and, 342–44
OLAP functions and, 342
ON clause and, 340
operators, 347
as query within SELECT clause, 341
RIGHT, 337, 351
searched, 344–45
self, 345–46
syntax, 337–42
table reconstruction from, 343
universal use, 336
WHERE clause and, 350–51
See also JOINs
Overlapping keys, 22–25, 34
OVERLAPS predicate, 275–85, 667
avoiding, 285
defined, 273
end point interval and, 278
result, 276
rules, 276
time periods and, 275–85
P
Packing joins, 358–59
Pairs
duration, 672–73
grouping into, 436–37
linear regression with, 548
node, 693
Parallel processing, 2
Parsing lists, 68–69
Partitions, 401–23
coverings and, 401–6
by functions, 403–4
by ranges, 402–3
by sequences, 404–6
Path enumeration model, 628–31
defined, 628
deleting nodes/subtrees, 630–31
finding levels/subordinates, 630
finding subtrees/nodes, 629–30
integrity constraints, 631
See also Trees
Paths, 686–95
cost, 692
with CTE, 697–705
eliminating, 694
endpoints, 683
finding, 686
by iteration, 688–90
least cost, 690
lengths, 687, 692
listing, 691–95
shortest, 687–88
shortest, without recursion, 689–90
steps, 693
tables holding, 691
See also Graphs
Patterns
% in front of, 263
special symbols, 267–68
tricks with, 262–64
See also LIKE predicate
Pearson’s r, 547
Period of applicability (PA), 150
Period of validity (PV), 150
Persistent tables, 3
Personal calendars, 643–45
Pessimistic concurrency control, 726–27
Phonetic matching, 175–82
Metaphone, 177–81
NYSIIS algorithm, 181–82
Soundex, 176–77
Soundex functions, 175–76
Physical addresses, 39
Physical Data Independence Rule, 63
Physical grouping, 716
Pointer structures, 382–83
Points inside polygons, 706–7
Polygons
convex, 706
defined, 706
points inside, 706–7
POSITION() function, 173, 259, 540
POWER() function, 116, 565
PRD() function, 468–73
DISTINCT option, 469
by expressions, 469–70
by logarithms, 470–73
Preallocated values, 44–45
Predicates, 17
ALL, 312, 313–14
ANY, 312, 313
BETWEEN, 240, 273–75
in CHECK() constraint, 254
CONTAINS, 613
dummy, 736
EXISTS, 216, 288, 299–308
IN, 192, 287–97, 742–44
IS [NOT] NORMALIZED, 244–45
LIKE, 261–71
NOT IN, 192, 291–92
ordering, 460
ORed, 292–93
OVERLAPS, 275–85, 667
quantified, 309–15
SIMILAR TO, 267–69
subquery, NULLs in, 191–93
UNIQUE, 314–15
valued, 241–45
WHERE, 144, 212–16
PRIMARY KEY constraint, 14, 738, 741
compound, 20
defined, 14
Primary keys
choosing, 22
fundamental requirement, 62
uniqueness, 751
Procedural loops, 326
Procedures, 53
Pseudo-random number generators, 609–10
defined, 609
linear congruence, 609–10
Q
QNaN (Quiet NaN), 103
Quantified predicates, 309–15
Quantifiers
defined, 309
EXISTS() predicate and, 304–5
forms, 309
as logical quantity, 309
missing data and, 311–13
Queries
ad hoc, 742
audit log, 160–64
bound, 557
current, 144
date arithmetic, 128–29
derived tables inside, 370
extra join information, 738–40
GROUP BY, 252–53
JOIN, 290
leaf nodes, 625
nested, 745–46
nonsequenced, 139, 144, 162, 163
partitioning data in, 401–23
procedural traversal, 627–28
recursive, 705
relational division, 408
runs and sequence, 557–62
scalar, 297
sequenced, 138, 144, 162, 163
sequenced JOINs, 141
temporal, 641–80
unnested, 733–38
VIEWs in, 370
Quintiles, 537–38
defined, 537–38
example, 538
See also Statistics
R
RANDOM() function, 250, 608–10, 611
Random numbers
calculating, 611–12
generators, 45, 609
Random-order keys, 45
Random order values, 45–48
additive congruential method, 45–46
defined, 45
four-bit generator, 46
tap positions, 47
See also Values
Range auxiliary tables, 489–90
Ranges
counters, 40
holes in, 572
partitioning by, 402–3
single-column tables, 402–3
RANK function, 711
Rankings, 533–37
defined, 533
defining, 556
query, 534
versions, 534–36
See also Statistics
Rational numbers, 209
Reachable nodes, 683–84
READ COMMITTED isolation level, 725
Read-only VIEWs, 371–73
READ UNCOMMITTED isolation level, 726
Reconvergent graphs, 681
Recursive queries, 705
Redundancy removal, 622
Redundant duplicates, 217–19
defined, 217
removal with ROWID, 218–19
rows, 617
in tables, 217–18
REFERENCES clause, 15–17
actions, 15–17
defined, 15
lookup tables and, 296
REFERENCES constraint, 36
<references specification>, 15
Referential actions, 16–17
CASCADE option, 16
deleting multiple tables without, 220
NO ACTION option, 16–17
SET DEFAULT option, 16
SET NULL option, 16
Referential constraints
EXISTS() and, 305–6
IN() predicate and, 295–96
See also Constraints
Referential integrity
declarative, 31
redundant duplicate removal and, 217–18
Regions
defined, 550
finding, of maximum size, 552–56
numbering, 551–52
Relational database management system (RDBMS), 418–19
Relational division, 406–8
CROSS JOIN, 408
defined, 406
exact, 409
example, 406–8
HAVING clause, 416
with JOINs, 412–13
query, 408
with remainder, 408–9
Romley’s, 414–18
Todd’s, 410–12
Relations, 61–63
Relationships
first rule, 549
many-to-many, 87
one-to-many, 375
REPEATABLE READ isolation level, 725
Repeating groups, 66–69
columns, 67–68
parsing lists, 68–69
See also Groups
RIGHT OUTER JOINs, 337, 351
ROLLUP group, 713
Romley’s division, 414–18
ROUND() function, 117
Rounding, 105–7, 612
conventions, 106
implementation, 106
types of, 444
Row comparisons, 238–40
defined, 238
NULLs and, 240
rules, 239–40
ROWID
physical addresses and, 39
redundant duplicate removal with, 218–19
ROW_NUMBER function, 711–12
Rows
attribute split, 33–34
candidate, 622
constructing, 397
deleting, 38
duplicate, 48–50
equality, 614
inserting, 38
numbers, 606
random, picking, 607–12
redundant duplicate, 617
in self-join, 33
sorting, 93–99
subqueries, 257, 310
subset, removing, 213
updating, 38
value-equivalent, prevention, 132–33
See also Tables
Rule-based optimizers, 731
Running differences, 530–31
Running statistics. See Cumulative statistics
Running totals, 529–30, 562
Runs
construction, 557
defined, 550
queries, 557–62
S
Scalar queries
IN() predicate and, 297
use, 433
Scalar subqueries, 257
comparisons, 310–11
with MAX() function, 454
placing, 310
running, 310
Schema-level constraints, 25–29
Schemas
bad design, 1–2
creating, 1, 3–5
default character set, 3
defined, 3
name, 3
Schema tables, 50–51
information, 50
querying, 50
Second Normal Form (2NF), 70–71
Seed values, 609
SELECT statement, 58, 143–44, 317–68
correlated subqueries, 324–26
DISTINCT option, 745, 746, 747
INNER JOINs and, 327
JOINs and, 317–36
one-level, 317–24
ORDER BY clause, 328–36
syntax, 326–28
Self joins, 606
in nested set model, 634
quintuple, 571
Self OUTER JOINs, 345–46
Sequenced duplicates, 132, 134, 135
Sequence deletions, 150–52
illustrated, 151–52
period of applicability (PA), 150
period of validity (PV), 150
physical modifications, 151
Sequenced JOINs, 140, 141
Sequenced modifications, 150–55
deletion, 150–52
update, 152–55
Sequenced queries, 138, 144, 162, 163
Sequence generators, 36–37, 42
Sequences, 554
columns, 549
mapping, into cycles, 481–83
missing values, finding, 554
numbers, filling in, 560–62
partition by, 404–6
queries, 557–62
resetting, 479
restrictions, 559
start/finish values, 553
Sequence tables, 477–85, 572
Cartesian product and, 485
constructor syntax, 479
defined, 477–78
general declaration, 478
Sequential access, 732
Sequential numbers
generating as keys, 36–48
in pure SQL, 39–41
SERIALIZABLE isolation level, 725
Sessions, 719–20
SET clause
execution, 225
row change, 225
UPDATE statement, 224, 225–26
Set functions. See Aggregate functions
Set operators, 591–603
ALL option, 601–2
defined, 591
division with, 413–14
EXCEPT, 596–601
INTERSECT, 596–601
UNION, 592–96
UNION ALL, 592–96
Sets, 591
equality, testing for, 589
nested, 695–97
SET TRANSACTION statement, 724–25
SIGN() function, 471, 475
SIMILAR TO predicate, 267–69
Single-column range tables, 402–3
Sink nodes, 685
SMALLINT numeric type, 102
Smisteru rule, 254, 256
SNaN (Signalling NaN), 103, 104
Snapshot isolation, 727–29
Social Security numbers (SSNs), 206–9
Area portion, 206
Group portion, 208–9
parts, 206
Serial portion, 209
Sorting
avoiding, 746–50
Bose-Nelson, 94, 95, 98
columns, 329
direction, controlling, 334
GROUP BY clause and, 437–38
grouped query results, 437
networks, 94
orders, 331
rows, 93–99
stable, 438
values, 333
by weekday names, 669–70
Soundex, 403
algorithm, 176–77
algorithm alternative, 177
defined, 175
drawback, 177
English pronunciation, 176
functions, 175–76
original, 176–77
See also Phonetic matching
Source nodes, 685
SQL
arrays, 575–89
graphs, 681–707
learning, 2
model for, 2
names, 5–6
numeric data in, 101–18
OLAP in, 709–18
optimizing, 731–60
static, 756–57
statistics, 509–48
temporal support, 167
trees/hierarchies, 623–40
working with, 4
SQRT() function, 116, 547
Stable sorts, 438
Standard deviation, 527–28
Star Schema, 710–11
Statements
ALTER TABLE, 5, 7–8
CLOSE, 56
CONNECT TO, 719–20
CREATE ASSERTION, 26
CREATE DOMAIN, 51–52
CREATE INDEX, 752
CREATE PROCEDURE, 53
CREATE SCHEMA, 3–5
CREATE TABLE, 5, 8–9
CREATE TEMP TABLE, 390–91
CREATE TRIGGER, 52–53
CREATE VIEW, 380
DEALLOCATE, 56
DECLARE CURSOR, 53–55, 58
DELETE, 58–59, 214
DELETE FROM, 211–20, 694
DROP ASSERTION, 25
DROP TABLE, 5, 6–7, 391
DROP VIEW, 389
FETCH, 55–56, 194
INSERT INTO, 221–23, 404
MERGE, 232–34
OPEN, 55