OrderID UNIQUEIDENTIFIER NOT NULL
FOREIGN KEY REFERENCES dbo.[Order]
ON DELETE CASCADE,
ProductID UNIQUEIDENTIFIER NULL
FOREIGN KEY REFERENCES dbo.Product,
Chapter 23, "T-SQL Error Handling," shows how to create triggers that handle custom referential integrity and cascading deletes for nonstandard data schemas or cross-database referential integrity.
Creating User-Data Columns
A user-data column stores user data. These columns typically fall into two categories: columns users use
to identify a person, place, thing, event, or action, and columns that further describe the person, place,
thing, event, or action.
SQL Server tables may have up to 1,024 columns, but well-designed relational-database tables seldom
have more than 25, and most have only a handful.
Data columns are created during table creation by listing the columns as parameters to the CREATE
TABLE command. The columns are listed within parentheses as column name, data type, and any
column attributes such as constraints, nullability, or default value:
CREATE TABLE TableName (
  ColumnName DATATYPE Attributes,
  ColumnName DATATYPE Attributes
  );
Data columns can be added to existing tables using the ALTER TABLE ADD columnname command:
ALTER TABLE TableName ADD ColumnName DATATYPE Attributes;
An existing column may be modified with the ALTER TABLE ALTER COLUMN command:
ALTER TABLE TableName ALTER COLUMN ColumnName NEWDATATYPE Attributes;
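As a concrete sketch of the three commands above, the following script creates a table, adds a column, and then widens it. The dbo.Contact table and its columns are hypothetical names chosen only for illustration:

```sql
-- Hypothetical table used only for illustration
CREATE TABLE dbo.Contact (
  ContactID INT NOT NULL IDENTITY PRIMARY KEY,
  LastName  VARCHAR(50) NOT NULL,
  Email     VARCHAR(100) NULL
  );
GO

-- Add a new column to the existing table
ALTER TABLE dbo.Contact
  ADD Phone VARCHAR(15) NULL;
GO

-- Widen the column; the nullability is restated explicitly
-- rather than left to the default
ALTER TABLE dbo.Contact
  ALTER COLUMN Phone VARCHAR(25) NULL;
GO
```

Restating NULL or NOT NULL in the ALTER COLUMN statement avoids depending on the session's default nullability settings.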
To list the columns for the current database using code, query the sys.objects and sys.columns catalog views.
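A minimal sketch of such a query follows. It joins the two catalog views on object_id and restricts the result to user tables; the column aliases are arbitrary:

```sql
-- List every column of every user table in the current database
SELECT o.name AS TableName,
       c.column_id,
       c.name AS ColumnName,
       TYPE_NAME(c.user_type_id) AS DataType,
       c.max_length,          -- length in bytes, -1 for (max) types
       c.is_nullable
  FROM sys.objects AS o
    JOIN sys.columns AS c
      ON c.object_id = o.object_id
  WHERE o.type = 'U'          -- 'U' = user table
  ORDER BY o.name, c.column_id;
```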
Column data types
The column’s data type serves two purposes:
■ It enforces the first level of data integrity. Character data won’t be accepted into a datetime
or numeric column. I have seen databases with every column set to nvarchar to ease
data entry. What a waste. The data type is a valuable data-validation tool that should not be
overlooked.
■ It determines the amount of disk storage allocated to the column.
Character data types
SQL Server supports several character data types, listed in Table 20-2.
TABLE 20-2
Character Data Types
Data Type | Description | Size in Bytes
Char(n) | Fixed-length character data up to 8,000 characters long using collation character set | Defined length * 1 byte
Nchar(n) | Unicode fixed-length character data | Defined length * 2 bytes
VarChar(n) | Variable-length character data up to 8,000 characters long using collation character set | 1 byte per character
VarChar(max) | Variable-length character data up to 2GB in length using collation character set | 1 byte per character
nVarChar(n) | Unicode variable-length character data up to 4,000 characters long using collation character set | 2 bytes per character
nVarChar(max) | Unicode variable-length character data up to 2GB in length using collation character set | 2 bytes per character
Text | Variable-length character data up to 2,147,483,647 characters in length. Warning: Deprecated | 1 byte per character
nText | Unicode variable-length character data up to 1,073,741,823 characters in length. Warning: Deprecated | 2 bytes per character
Sysname | A Microsoft user-defined data type used for table and column names that is the equivalent of nvarchar(128) | 2 bytes per character
Unicode data types are very useful for storing multilingual data. The cost, however, is the doubled size.
Some developers use nvarchar for all their character-based columns, while others avoid it at all costs.
I recommend using Unicode data when the database might use foreign languages; otherwise, use char
or varchar (or varchar(max) in place of the deprecated text type).
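The doubled storage cost is easy to see with the DATALENGTH() function, which returns the number of bytes used by a value. This is a small sketch; the variable names and the sample string are arbitrary:

```sql
-- Same four characters stored as non-Unicode and as Unicode
DECLARE @v VARCHAR(20)  = 'kite',
        @n NVARCHAR(20) = N'kite';

SELECT DATALENGTH(@v) AS VarcharBytes,   -- 4 bytes
       DATALENGTH(@n) AS NvarcharBytes;  -- 8 bytes
```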
Numeric data types
SQL Server supports several numeric data types, listed in Table 20-3.
Best Practice
When working with monetary values, be very careful with the data type. Using float or real data
types for money will cause rounding errors. The data types money and smallmoney are accurate
to one hundredth of a U.S. penny. For some monetary values, the client may request precision only to the
penny, in which case decimal is the more appropriate data type.
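The rounding problem can be sketched by adding ten cents ten times in both a float and a decimal variable. This is an illustration with arbitrary variable names, not a benchmark. On a typical installation the float total does not compare equal to 1.0, while the decimal total does:

```sql
DECLARE @f FLOAT        = 0,
        @d DECIMAL(9,2) = 0,
        @i INT          = 0;

WHILE @i < 10
  BEGIN
    SET @f = @f + 0.1;   -- 0.1 has no exact binary floating-point representation
    SET @d = @d + 0.1;   -- decimal stores 0.10 exactly
    SET @i = @i + 1;
  END;

SELECT CASE WHEN @f = 1.0 THEN 'equal' ELSE 'not equal' END AS FloatTotalVsOne,
       CASE WHEN @d = 1.0 THEN 'equal' ELSE 'not equal' END AS DecimalTotalVsOne;
```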
TABLE 20-3
Numeric Data Types
Data Type | Description | Size in Bytes
Smallint | Integers from -32,768 to 32,767 | 2 bytes
Int | Integers from -2,147,483,648 to 2,147,483,647 | 4 bytes
Bigint | Integers from -2^63 to 2^63-1 | 8 bytes
Decimal or Numeric | Fixed-precision numbers from -10^38 + 1 through 10^38 - 1 | Varies according to length
Money | Numbers from -922,337,203,685,477.5808 through 922,337,203,685,477.5807 (an 8-byte integer scaled by 10,000), accuracy to one ten-thousandth (.0001) | 8 bytes
SmallMoney | Numbers from -214,748.3648 through +214,748.3647, accuracy to one ten-thousandth (.0001) | 4 bytes
Float | Floating-point numbers ranging from -1.79E+308 through 1.79E+308, depending on the bit precision | 4 or 8 bytes
Date/Time data types
Traditionally, SQL Server stores both the date and the time in a single column using the datetime and
smalldatetime data types, described in Table 20-4. With SQL Server 2008, Microsoft released several
new date/time data types, making life much easier for database developers.
Some programmers (non-DBAs) choose character data types for date columns. This can cause a horrid conversion mess. Use the IsDate() function to sort through the bad data.
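One way to apply IsDate() to such a column is sketched below. The #ImportedOrder temporary table and its contents are hypothetical. The CASE expression guards the conversion so that only values IsDate() accepts are cast:

```sql
-- Hypothetical staging table with dates stored as character data
CREATE TABLE #ImportedOrder (
  OrderID       INT NOT NULL,
  OrderDateText VARCHAR(30) NULL
  );

INSERT #ImportedOrder (OrderID, OrderDateText) VALUES (1, '20080704');
INSERT #ImportedOrder (OrderID, OrderDateText) VALUES (2, 'sometime in July');
INSERT #ImportedOrder (OrderID, OrderDateText) VALUES (3, NULL);

-- Rows that cannot be converted and need cleanup
SELECT OrderID, OrderDateText
  FROM #ImportedOrder
  WHERE ISDATE(OrderDateText) = 0;

-- Convert only the valid values; invalid ones come back as NULL
SELECT OrderID,
       CASE WHEN ISDATE(OrderDateText) = 1
            THEN CAST(OrderDateText AS DATETIME)
       END AS OrderDate
  FROM #ImportedOrder;

DROP TABLE #ImportedOrder;
```

Note that the result of IsDate() depends on the session's language and date-format settings, so the same string may be valid on one connection and invalid on another.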
Other data types
Other data types, listed and described in Table 20-5, fulfill the needs created by unique values, binary
large objects, and variant data.
TABLE 20-4
Date/Time Data Types
Data Type | Description | Size in Bytes
Datetime | Date and time values from January 1, 1753 (the first full year after Britain adopted the Gregorian calendar), through December 31, 9999, accurate to three milliseconds | 8 bytes
Smalldatetime | Date and time values from January 1, 1900, through June 6, 2079, accurate to one minute | 4 bytes
DateTime2() | Date and time values from January 1, 0001, through December 31, 9999 (Gregorian calendar), variable accuracy from .01 seconds to 100 nanoseconds | 6–8 bytes depending on precision
Date | Date values from January 1, 0001, through December 31, 9999 (Gregorian calendar) | 3 bytes
Time(2) | Time values, variable accuracy from .01 seconds to 100 nanoseconds | 3–5 bytes depending on precision
Datetimeoffset | Date and time values from January 1, 0001, through December 31, 9999 (Gregorian calendar), variable accuracy from .01 seconds to 100 nanoseconds, includes embedded time zone offset | 8–10 bytes depending on precision
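The SQL Server 2008 date/time types in Table 20-4 can be compared side by side with a few variables. This is a sketch with arbitrary sample values; DATALENGTH() reports the bytes each value occupies, which varies with the declared precision:

```sql
DECLARE @d   DATE              = '20080704',
        @t   TIME(2)           = '13:45:30.12',
        @dt2 DATETIME2(7)      = '2008-07-04T13:45:30.1234567',
        @dto DATETIMEOFFSET(7) = '2008-07-04T13:45:30.1234567-05:00';

SELECT @d   AS DateValue,           DATALENGTH(@d)   AS DateBytes,
       @t   AS TimeValue,           DATALENGTH(@t)   AS TimeBytes,
       @dt2 AS DateTime2Value,      DATALENGTH(@dt2) AS DateTime2Bytes,
       @dto AS DateTimeOffsetValue, DATALENGTH(@dto) AS DateTimeOffsetBytes;
```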
TABLE 20-5
Other Data Types
Data Type | Description | Size in Bytes
Timestamp or Rowversion | Database-wide unique value generated automatically with every insert or update of the row | 8 bytes
Uniqueidentifier | System-generated 16-byte value | 16 bytes
Binary(n) | Fixed-length binary data up to 8,000 bytes | Defined length
VarBinary(n) | Variable-length binary data up to 8,000 bytes | Bytes used
VarBinary(max) | Variable-length binary data up to 2GB | Bytes used
Image | Variable-length binary data up to 2,147,483,647 bytes. Warning: Deprecated | Bytes used
Sql_variant | Can store values of most data types, up to 8,016 bytes | Depends on data type and length
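As a brief sketch of two of these types in use, the following hypothetical #Doc table generates its key with NEWID() and carries a rowversion column. Selecting the row before and after an update shows the rowversion value changing:

```sql
-- Hypothetical table used only for illustration
CREATE TABLE #Doc (
  DocID  UNIQUEIDENTIFIER NOT NULL PRIMARY KEY DEFAULT (NEWID()),
  Title  VARCHAR(50) NOT NULL,
  RowVer ROWVERSION
  );

INSERT #Doc (Title) VALUES ('Draft');
SELECT DocID, Title, RowVer FROM #Doc;   -- initial rowversion value

UPDATE #Doc SET Title = 'Final';
SELECT DocID, Title, RowVer FROM #Doc;   -- rowversion value has changed

DROP TABLE #Doc;
```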
Calculated columns
A calculated column is powerful in that it presents the results of a predefined expression the way a view
(a stored SQL SELECT statement) does, but without the overhead of a view. Calculated columns also
improve data integrity by performing the calculation at the table level, rather than trusting that each
query developer will get the calculation correct.
By default, a calculated column doesn’t actually store any data; instead, the data is calculated when
queried. However, since SQL Server 2005, calculated columns may be optionally persisted, in which
case they are calculated when entered and then stored as regular, but read-only, row data. They may
even be indexed. Personally, I’ve replaced several old triggers with persisted, indexed, calculated
columns with great success. They’re easy, and fast.
The syntax simply defines the formula for the calculation in lieu of designating a data type:
ColumnName AS Expression
The OrderDetail table from the OBXKites sample database includes a calculated column for the
extended price, as shown in the following abbreviated code:
CREATE TABLE dbo.OrderDetail (
  ...
  Quantity NUMERIC(7,2) NOT NULL,
  UnitPrice MONEY NOT NULL,
  ExtendedPrice AS Quantity * UnitPrice PERSISTED,
  ...
  )
  ON [Primary];
GO
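Because the OrderDetail code above is abbreviated, here is a complete, runnable sketch of a persisted, indexed calculated column. The dbo.OrderDetailDemo table and the index name are hypothetical, and the script assumes the session uses the standard SET options required for indexing computed columns (the Management Studio defaults):

```sql
-- Hypothetical table, loosely modeled on the OrderDetail example
CREATE TABLE dbo.OrderDetailDemo (
  OrderDetailID INT NOT NULL IDENTITY PRIMARY KEY,
  Quantity      NUMERIC(7,2) NOT NULL,
  UnitPrice     MONEY NOT NULL,
  ExtendedPrice AS Quantity * UnitPrice PERSISTED
  );
GO

-- A persisted calculated column may be indexed like a regular column
CREATE NONCLUSTERED INDEX IxOrderDetailDemoExtendedPrice
  ON dbo.OrderDetailDemo (ExtendedPrice);
GO

-- The calculated column is read-only, so it is left out of the insert
INSERT dbo.OrderDetailDemo (Quantity, UnitPrice)
  VALUES (3, 19.95);

SELECT Quantity, UnitPrice, ExtendedPrice
  FROM dbo.OrderDetailDemo;
```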
Sparse columns
New for SQL Server 2008, sparse columns use a completely different method for storing data within the
page. Normal columns have a predetermined designated location for the data. If there’s no data, then
some space is wasted. Even nullable columns use a bit to indicate the presence or absence of a null for
the column.
Sparse columns, however, store nothing on the page if no data is present for the column for that row.
To accomplish this, SQL Server essentially writes the list of sparse columns that have data into a list for
the row (5 bytes + 2–4 bytes for every sparse column with data). If the columns usually hold data,
then sparse columns actually require more space than normal columns. However, if the majority of
rows are null (I’ve heard a figure of 50%, but I’d rather go much higher), then the sparse column will
save space.
Because sparse columns are intended for columns that infrequently hold data, they can be used for very
wide tables, up to 30,000 columns.
To create a sparse column, add the SPARSE keyword to the column definition. The sparse column must
be nullable:
CREATE TABLE Foo (
FooPK INT NOT NULL IDENTITY PRIMARY KEY,
Name VARCHAR(25) NOT NULL,
ExtraData VARCHAR(50) SPARSE NULL
);
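To confirm which columns of a table are sparse, the sys.columns catalog view exposes an is_sparse flag. The following sketch uses a hypothetical dbo.FooDemo table with the same shape as Foo so that it can run on its own:

```sql
-- Hypothetical copy of the Foo table, named so it does not collide with Foo
CREATE TABLE dbo.FooDemo (
  FooPK     INT NOT NULL IDENTITY PRIMARY KEY,
  Name      VARCHAR(25) NOT NULL,
  ExtraData VARCHAR(50) SPARSE NULL
  );
GO

INSERT dbo.FooDemo (Name) VALUES ('no extra data');
INSERT dbo.FooDemo (Name, ExtraData) VALUES ('has extra data', 'rarely populated');

-- is_sparse = 1 only for the ExtraData column
SELECT name, is_sparse, is_nullable
  FROM sys.columns
  WHERE object_id = OBJECT_ID('dbo.FooDemo');
```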
Worst Practice
Any table design that requires sparse columns is a horrible design. A different pattern, probably a
supertype/subtype pattern, should be used instead. Please don’t ever implement a table with sparse
columns. Anyone who tells you they need to design a database with sparse columns should get a job flipping
burgers. Don’t let them design your database.
Column constraints and defaults
The database is only as good as the quality of the data. A constraint is a high-speed data-validation
check or business-logic check performed at the database-engine level. Besides the data type itself, SQL
Server includes five types of constraints:
■ Primary key constraint: Ensures a unique non-null key
■ Foreign key constraint: Ensures that the value points to a valid key
■ Nullability: Indicates whether the column can accept a null value
■ Check constraint: Custom Boolean constraint
■ Unique constraint: Ensures a unique value
SQL Server also includes the following column option:
■ Column Default: Supplies a value if none is specified in the INSERT statement
The column default is referred to as a type of constraint on one page of SQL Server Books Online, but
is not listed in the constraints on another page. I call it a column option because it does not constrain
user-data entry, nor does it enforce a data-integrity rule. However, it serves the column as a useful
option.
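The five constraint types and the column default can be seen together in one small sketch. The dbo.Department and dbo.Employee tables, their columns, and the constraint names are all hypothetical and are not part of the sample databases:

```sql
-- Hypothetical parent table
CREATE TABLE dbo.Department (
  DepartmentID INT NOT NULL
    CONSTRAINT PK_Department PRIMARY KEY,
  Name VARCHAR(50) NOT NULL
  );

-- Hypothetical child table showing each constraint type
CREATE TABLE dbo.Employee (
  EmployeeID INT NOT NULL IDENTITY
    CONSTRAINT PK_Employee PRIMARY KEY,                 -- primary key
  DepartmentID INT NOT NULL
    CONSTRAINT FK_Employee_Department
      FOREIGN KEY REFERENCES dbo.Department (DepartmentID),  -- foreign key
  Email VARCHAR(100) NOT NULL
    CONSTRAINT UQ_Employee_Email UNIQUE,                -- unique
  Salary DECIMAL(10,2) NOT NULL
    CONSTRAINT CK_Employee_Salary CHECK (Salary >= 0),  -- check
  HireDate DATETIME NOT NULL
    CONSTRAINT DF_Employee_HireDate DEFAULT (GETDATE()),-- column default
  MiddleName VARCHAR(50) NULL                           -- nullability
  );
```

Naming each constraint is optional, but explicit names are easier to reference later than the system-generated ones.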
Column nullability
A null value is an unknown value; typically, it means that the column has not yet had a user entry.
Chapter 9, "Data Types, Expressions, and Scalar Functions," explains how to define, detect, and handle nulls.
Whether or not a column will even accept a null value is referred to as the nullability of the column and
is configured by the null or not null column attribute.
New columns in SQL Server default to not null, meaning that they do not accept nulls. However, this
option is normally overridden by the connection property ansi_null_dflt_on. The ANSI standard is
to default to null, which accepts nulls, in table columns that aren’t explicitly created with a not null
option.
Best Practice
Because the default column nullability differs between ANSI SQL and SQL Server, it’s best to avoid relying
on the default behavior and explicitly declare null or not null when creating tables.
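A minimal sketch of this practice follows. The dbo.NullTestExplicit table is a hypothetical name; because every column states its nullability, the result is the same whatever the database and connection defaults happen to be:

```sql
-- Hypothetical table: nullability is declared on every column
CREATE TABLE dbo.NullTestExplicit (
  PK  INT NOT NULL IDENTITY,
  One VARCHAR(50) NULL,       -- optional data
  Two VARCHAR(50) NOT NULL    -- required data
  );
GO

INSERT dbo.NullTestExplicit (One, Two)
  VALUES (NULL, 'required');

-- Confirm the declared nullability from the catalog
SELECT name, is_nullable
  FROM sys.columns
  WHERE object_id = OBJECT_ID('dbo.NullTestExplicit');
```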
The following code demonstrates the ANSI default nullability versus SQL Server’s nullability. The
first test uses the SQL Server default by setting the database ANSI_NULL_DEFAULT option to false, and the
ANSI_NULL_DFLT_OFF connection setting to ON:
USE TempDB;
-- sp_dboption is deprecated and was removed in later versions of SQL Server;
-- ALTER DATABASE ... SET ANSI_NULL_DEFAULT OFF is the replacement
EXEC sp_dboption 'TempDB', ANSI_NULL_DEFAULT, 'false';
SET ANSI_NULL_DFLT_OFF ON;
The NullTest table is created without specifying the nullability:
CREATE TABLE NullTest(
  PK INT IDENTITY,
  One VARCHAR(50)
  );
The following code attempts to insert a null:
INSERT NullTest(One)
  VALUES (NULL);
Result:
Server: Msg 515, Level 16, State 2, Line 1
Cannot insert the value NULL into column 'One', table 'TempDB.dbo.NullTest';
column does not allow nulls. INSERT fails.
The statement has been terminated.
Because the nullability was set to the SQL Server default when the table was created, the column does
not accept null values. The second sample will rebuild the table with the ANSI SQL nullability default:
EXEC sp_dboption 'TempDB', ANSI_NULL_DEFAULT, 'true';
SET ANSI_NULL_DFLT_ON ON;
DROP TABLE NullTest;
CREATE TABLE NullTest(
  PK INT IDENTITY,
  One VARCHAR(50)
  );
The next example attempts to insert a null:
INSERT NullTest(One)
VALUES (NULL);
Result:
(1 row(s) affected)
Managing Optional Data
Databases attempt to model reality. In reality, sometimes there’s standard data that for one reason or
another doesn’t apply to a specific object. Some people don’t have a suffix (e.g., Jr. or Sr.). Some
addresses don’t have a second line. Some orders are custom jobs and don’t have part numbers.
Sometimes the missing data is only temporarily missing and it will be filled in later. A new customer supplies
her name and e-mail address, but not her street address. A new order doesn’t yet have a closed date, but will
have one later. Every employee will eventually have a termination date.
The usual method for handling optional or missing data is with a nullable column. Nulls are controversial at
best. Some database modelers use them constantly, while others believe that nulls are evil. Even the meaning
of null is debated, with some claiming null means unknown, others saying null means the absence of data.
When the bits hit the hard drive, there are three possible solutions for representing optional data in a
database. Rather than debate the merits of each option, this is an opportunity to apply the database objectives
from Chapter 2, "Data Architecture":
■ Nullable columns: These use a consistent bit to represent the fact that the column is
missing data.
■ Surrogate nulls: These use a data flag (e.g., "na", "n/a", empty string, -99) to
represent the missing data. While popular with data modelers who want to avoid nulls and
left outer joins, this solution has several problems.
Real data is being used to represent missing data, so every query must filter out the
missing data correctly. Using surrogate nulls for date/time columns is particularly
messy. Surrogate nulls in a numeric aggregate must be filtered out (nulls handle this
automatically). Over time, surrogate nulls tend to become less consistent as more users
or developers employ differing values for the surrogate null.
■ Absent rows: This solution removes the optional data column from the main table and
places it in another supertype/subtype table. If the data does not apply to a given row,
that row is not inserted into the subtype table, hence the name absent row. While this
completely eliminates nulls and surrogate nulls from the database and sounds correct
in theory, it presents a host of practical difficulties.
Queries are now very complex to code correctly. Left outer joins are required to retrieve or even test for the presence of data in the optional data column. This can create data integrity issues if developers use the wrong type of join; and it kills performance, as SQL Server has to read from multiple tables and indexes to retrieve data for
a single entity.
Inserts and updates have to parse out the columns to different tables.
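The trade-offs among the three approaches can be sketched with a few temporary tables. All table names, column names, and the -99 flag are hypothetical. The first query shows that AVG() skips nulls automatically but must be told to skip the surrogate value; the second shows the left outer join that the absent-row pattern requires:

```sql
-- Nullable column versus surrogate null (-99) in the same table
CREATE TABLE #Rating (
  ItemID         INT NOT NULL,
  ScoreNullable  INT NULL,
  ScoreSurrogate INT NOT NULL
  );
INSERT #Rating (ItemID, ScoreNullable, ScoreSurrogate) VALUES (1, 4, 4);
INSERT #Rating (ItemID, ScoreNullable, ScoreSurrogate) VALUES (2, 2, 2);
INSERT #Rating (ItemID, ScoreNullable, ScoreSurrogate) VALUES (3, NULL, -99);

SELECT AVG(ScoreNullable * 1.0)  AS AvgNullable,             -- null row ignored: average of 4 and 2
       AVG(ScoreSurrogate * 1.0) AS AvgSurrogateUnfiltered,  -- -99 skews the average
       AVG(CASE WHEN ScoreSurrogate <> -99
                THEN ScoreSurrogate * 1.0 END) AS AvgSurrogateFiltered
  FROM #Rating;

-- Absent rows: the optional suffix lives in a subtype table
CREATE TABLE #Person (
  PersonID INT NOT NULL PRIMARY KEY,
  Name     VARCHAR(50) NOT NULL
  );
CREATE TABLE #PersonSuffix (
  PersonID INT NOT NULL PRIMARY KEY,
  Suffix   VARCHAR(10) NOT NULL
  );
INSERT #Person (PersonID, Name) VALUES (1, 'Smith');
INSERT #Person (PersonID, Name) VALUES (2, 'Jones');
INSERT #PersonSuffix (PersonID, Suffix) VALUES (1, 'Jr.');

-- A left outer join is required; an inner join would silently drop Jones
SELECT p.Name, s.Suffix
  FROM #Person AS p
    LEFT OUTER JOIN #PersonSuffix AS s
      ON s.PersonID = p.PersonID;

DROP TABLE #Rating;
DROP TABLE #PersonSuffix;
DROP TABLE #Person;
```

Note that the left outer join brings the null back at query time for Jones, so the absent-row pattern removes nulls from storage but not from result sets.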
Creating Indexes
Indexes are the bridge from a query to the data. Without indexes, SQL Server must scan and filter to
select specific rows, a dog-slow process at best. With the right indexes, SQL Server screams.
SQL Server uses two types of indexes: clustered indexes, which reflect the logical sort order of the table,
and non-clustered indexes, which are additional b-trees typically used to perform rapid searches of
non-key columns. The columns by which the index is sorted are referred to as the key columns.
Within Management Studio’s Object Explorer, existing indexes for each table are listed under the
DatabaseName ➪ Tables ➪ TableName ➪ Indexes node. Every index property for new or existing
indexes may be managed using the Index Properties page, shown in Figure 20-7. The page is opened for
existing indexes by right-clicking on the index and choosing Properties. New indexes are created from
the context menu of the Indexes node under the selected table.
While this chapter covers the syntax and mechanics of creating indexes, Chapter 64,
"Indexing Strategies," explores how to design indexes for performance.
Using Management Studio, indexes are visible as nodes under the table in Object Explorer. Use the
Indexes context menu and select New Index to open the New Index form, which contains six pages:
■ General index information includes the index name, type, uniqueness, and key columns.
■ Index Options control the behavior of the index. In addition, an index may be disabled or
re-enabled.
■ Included Columns are non-key columns used for covering indexes.
■ The Storage page places the index on a selected filegroup.
■ The Spatial page has configuration options specific to indexes for the spatial data type.
■ The Filter page is for SQL Server 2008’s new WHERE clause option for indexes.
When opening the properties of an existing index, the Index Properties form also includes two
additional pages:
■ The Fragmentation page displays detailed information about the health of the index.
■ Extended Properties are user-defined additional properties.
FIGURE 20-7
Every index option may be set using Management Studio’s Index Properties page.
Changes made in the Index Properties page may be executed immediately using the OK button or
scheduled or scripted using the icons at the top of the page.
Indexes are created in code with the CREATE INDEX command. The following command creates a
clustered index named IxOrderID on the OrderID foreign key of the OrderDetail table:
CREATE CLUSTERED INDEX IxOrderID
ON dbo.OrderDetail (OrderID);
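The Included Columns and Filter options described for the New Index form also have CREATE INDEX equivalents. The following sketch uses a hypothetical dbo.OrderDetailIxDemo table and hypothetical index names so that it can run without the sample database:

```sql
-- Hypothetical table used only for illustration
CREATE TABLE dbo.OrderDetailIxDemo (
  OrderDetailID INT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,
  OrderID       INT NOT NULL,
  ProductID     INT NULL,
  Quantity      NUMERIC(7,2) NOT NULL,
  UnitPrice     MONEY NOT NULL
  );
GO

-- Clustered index on the foreign key, as in the example above
CREATE CLUSTERED INDEX IxDemoOrderID
  ON dbo.OrderDetailIxDemo (OrderID);

-- Non-clustered index with included (non-key) columns for covering queries
CREATE NONCLUSTERED INDEX IxDemoProductID
  ON dbo.OrderDetailIxDemo (ProductID)
  INCLUDE (Quantity, UnitPrice);

-- Filtered index (SQL Server 2008): only rows with a ProductID are indexed
CREATE NONCLUSTERED INDEX IxDemoProductIDFiltered
  ON dbo.OrderDetailIxDemo (ProductID)
  WHERE ProductID IS NOT NULL;
```

The primary key is declared NONCLUSTERED here only so that the clustered index can be created separately; a table may have just one clustered index.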
To retrieve fascinating index information from T-SQL code, use the following catalog views,
dynamic management functions, and metadata functions: sys.indexes, sys.index_columns, sys.stats,
sys.stats_columns, sys.dm_db_index_physical_stats, sys.dm_db_index_operational_stats,
INDEXKEY_PROPERTY(), and INDEX_COL().
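As a starting point, the following sketch lists the indexes and their columns for one table and then reports fragmentation for the same table. It assumes a dbo.OrderDetail table exists in the current database; substitute any table name:

```sql
-- Indexes and their key and included columns for one table
SELECT i.name AS IndexName,
       i.type_desc,
       i.is_unique,
       ic.key_ordinal,
       ic.is_included_column,
       c.name AS ColumnName
  FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
      ON ic.object_id = i.object_id
     AND ic.index_id  = i.index_id
    JOIN sys.columns AS c
      ON c.object_id = ic.object_id
     AND c.column_id = ic.column_id
  WHERE i.object_id = OBJECT_ID('dbo.OrderDetail')
  ORDER BY i.index_id, ic.key_ordinal;

-- Fragmentation for the same table's indexes
-- Arguments: database_id, object_id, index_id, partition_number, mode
SELECT index_id,
       avg_fragmentation_in_percent,
       page_count
  FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.OrderDetail'), NULL, NULL, 'LIMITED');
```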