SQL Server uses the file location information visible in the sys.master_files catalog view
most of the time. However, the Database Engine uses the file location information stored
in the primary file to initialize the file location entries in the master database when
attaching a database using the CREATE DATABASE statement with either the FOR ATTACH or
FOR ATTACH_REBUILD_LOG options.
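
As a rough illustration of the attach case described above, the following sketch names only the primary file and lets the Database Engine find the remaining files from the information stored in it. The database name and path are hypothetical, and the sketch assumes the other files are still in the locations recorded in the primary file.

```sql
-- Sketch only: database name and path are hypothetical.
-- Only the primary file is named; the locations of the other files
-- are read from the primary file itself.
CREATE DATABASE Customer
ON ( FILENAME = 'D:\SQL_data\Customer_Data1.mdf' )
FOR ATTACH;
GO

-- Check the file locations that master now records for the attached database.
SELECT name, physical_name, type_desc
FROM sys.master_files
WHERE database_id = DB_ID('Customer');
GO
```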
Every database can have three types of files:
Primary data file
Secondary data files
Log files
In addition, in SQL Server 2008, databases can also have FILESTREAM data files and
full-text files.
TABLE 34.1 The sysfiles Table
Column Name          Description
file_id              A file identification number that is unique within each database
file_guid            GUID for the file
type                 File type (0 = rows [that is, data files], 1 = log, 2 = FILESTREAM,
                     4 = full-text catalogs prior to SQL Server 2008)
type_desc            Description of the file type (ROWS, LOG, FILESTREAM, FULLTEXT)
data_space_id        0 represents a log file; values > 0 represent the ID of the
                     filegroup the data file belongs to
name                 The logical name of the file
physical_name        The physical name of the file, including path
state                File state (0 = ONLINE, 1 = RESTORING, 2 = RECOVERING,
                     3 = RECOVERY_PENDING, 4 = SUSPECT, 6 = OFFLINE, 7 = DEFUNCT)
state_desc           Description of the file state (ONLINE, RESTORING, RECOVERING,
                     RECOVERY_PENDING, SUSPECT, OFFLINE, DEFUNCT)
size                 Current size of the file in 8KB pages
max_size             Maximum file size in 8KB pages
growth               File growth setting (0 = fixed, > 0 = autogrow in units of 8KB
                     pages or by percentage if is_percent_growth is set to 1)
is_media_read_only   1 = file is on read-only media
is_read_only         1 = file is marked read-only
is_sparse            1 = file is a sparse file
is_percent_growth    1 = growth of file value is a percentage
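
The columns above can be read back with an ordinary query. The following is a minimal sketch against sys.master_files (where the physical path is exposed as physical_name); the column aliases are illustrative only. It converts the page-based size to megabytes and reports the growth setting in the units it was defined in.

```sql
-- Sketch: list every file the instance knows about.
-- size and growth are stored in 8KB pages; 128 pages = 1MB.
SELECT DB_NAME(database_id)  AS database_name,
       file_id,
       type_desc,
       name                  AS logical_name,
       physical_name,
       state_desc,
       size / 128            AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CONVERT(varchar(12), growth) + '%'
            ELSE CONVERT(varchar(12), growth / 128) + 'MB'
       END                   AS growth_setting
FROM sys.master_files
ORDER BY database_id, file_id;
GO
```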
Primary Data File
Every database has only one primary database file. The location of the primary database
file is stored in the master database (visible via the physical_name column in the
sys.master_files view). When SQL Server opens a database, it looks for this file and then
reads from the file information on the other files defined for the database.
The file extension for the primary database file defaults to .mdf. The primary database file
always belongs to the primary filegroup. It is often sufficient to have only one database file
for storing your tables and indexes (the primary database file). The file can, of course, be
created on a RAID partition to help spread I/O. However, if you need finer control over
placement of your tables across disks or disk arrays, or if you want to be able to back up
only a portion of your database via filegroups, you can create additional, secondary data
files for a database.
Secondary Data Files
A database can have any number of secondary files (in reality, the maximum number of
files per database is 32,767, but that should be sufficient for most implementations). You
can put a secondary file in the default filegroup or in another filegroup defined for the
database. Secondary data files have the file extension .ndf by default.
Following are some situations in which the use of secondary database files might be
beneficial:
You want to perform a partial backup. A backup can be performed for the entire
database or a subset of the database. The subset is specified as a set of files or
filegroups. The partial backup feature is useful for large databases, where it is
impractical to back up the entire database. When recovering with partial backups, a
transaction log backup must also be available. For more information about backups,
see Chapter 14, “Database Backup and Restore.”
You want more control over placement of database objects. When you create a table
or index, you can specify the filegroup in which the object is created. This could
help you spread I/O by placing your most active tables or indexes on separate
filegroups defined on separate disks or disk arrays.
Creating multiple files on a single disk provides no real performance benefit but
could help in recovery. If you have a 90GB database in a single file and have to
restore it, you need to have enough disk space available to create a new 90GB file. If
you don’t have 90GB of space available on a single disk, you cannot restore the
database. On the other hand, if the database was created with three files each 30GB in
size, you more likely will be able to find three 30GB chunks of space available on
your server.
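
To make the first situation concrete, here is a hedged sketch of backing up a single filegroup followed by the transaction log. The database name, filegroup name, and backup paths are hypothetical, and the sketch assumes the database uses the full recovery model so that log backups are possible.

```sql
-- Sketch only: names and paths are hypothetical.
-- Back up one filegroup rather than the whole database.
BACKUP DATABASE Customer
   FILEGROUP = 'Cust_Table'
   TO DISK = 'H:\SQL_backup\Customer_Cust_Table.bak';
GO

-- A log backup is also needed to bring the restored filegroup
-- to a point consistent with the rest of the database.
BACKUP LOG Customer
   TO DISK = 'H:\SQL_backup\Customer_log.trn';
GO
```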
The Log File
Each database must have at least one log file. The log file contains the transaction log
records of all changes made in a database (for more information on what is contained in
the transaction log, see Chapter 31, “Transaction Management and the Transaction Log”).
By default, log files have the file extension .ldf.
A database can have several log files, and each log file can have a maximum size of 2TB.
A log file cannot be part of a filegroup. No information other than transaction log records
can be written to a log file.
For more information on the log file and log file management, see Chapter 31.
File Management
In SQL Server 2008, you can specify that a database file should grow automatically as
space is needed. SQL Server can also shrink the size of the database if the space is not
needed. You can control whether to use this feature along with the increment by which
the file is to be expanded. The increment can be specified as a fixed number of megabytes
or as a percentage of the current size of the file. You can also set a limit on the maximum
size of the file or allow it to grow until no more space is available on the disk.
Listing 34.1 provides an example of a database being created with a 10MB growth increment
for the first database file, 20MB for the second, and 20% growth increment for the log file.
LISTING 34.1 Creating a Database with Autogrowth
CREATE DATABASE Customer
ON ( NAME='Customer_Data',                           -- primary data file
     FILENAME='D:\SQL_data\Customer_Data1.mdf',
     SIZE=50,                                        -- sizes default to MB
     MAXSIZE=100,
     FILEGROWTH=10),
   ( NAME='Customer_Data2',                          -- secondary data file
     FILENAME='E:\SQL_data\Customer_Data2.ndf',
     SIZE=100,
     FILEGROWTH=20)
LOG ON ( NAME='Customer_Log',
     FILENAME='F:\SQL_data\Customer_Log.ldf',
     SIZE=50,
     FILEGROWTH=20%)                                 -- percentage growth
GO
The Customer_Data file has an initial size of 50MB, a maximum size of 100MB, and a file
increment of 10MB.
The Customer_Data2 file has an initial size of 100MB, has a file growth increment of
20MB, and has no maximum size specified.
The transaction log has an initial size of 50MB; the file increases by 20% with each file
growth. The increment is based on the current file size, not the size originally specified.
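
Because a percentage increment is taken from the current size, it compounds: the 50MB log above would grow to 60MB, then 72MB, and so on. If a fixed increment is preferred, the growth setting can be changed after the fact. The following is a minimal sketch assuming the Customer database from Listing 34.1; the 10MB value is only an example.

```sql
-- Sketch: replace the 20% log growth with a fixed 10MB increment.
ALTER DATABASE Customer
MODIFY FILE ( NAME = Customer_Log, FILEGROWTH = 10MB );
GO

-- Confirm the settings; size and growth are reported in 8KB pages
-- unless is_percent_growth = 1.
SELECT name, size / 128 AS size_mb, growth, is_percent_growth, max_size
FROM Customer.sys.database_files;
GO
```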
When creating or expanding data files in SQL Server 2008, SQL Server uses fast file
initialization. This allows for the fast execution of the file creation and growth. With fast file
initialization, the space is added to the data file immediately, but without initializing the
logical pages in the data file with zeros. The existing disk content in the data file is not
overwritten until new data is written to the files. This provides a huge performance
advantage when a data file autogrows while an application is attempting to write data to the
database. The application does not need to wait until the space is initialized; it can begin
writing to the database immediately.
SQL Server also provides an option to autoshrink databases as well as manually shrink
databases. However, shrinking a database is a resource-intensive process and should be
done only if it is absolutely imperative to reclaim disk space. Also, if a data file is
constantly shrinking and growing, it can lead to excessive file fragmentation at the file
system level as well as excessive logical fragmentation within the file, both of which can
lead to poor I/O performance.
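
The following sketch shows one way to check and turn off the autoshrink option and, when space genuinely must be reclaimed, shrink a single file by hand. It assumes the Customer database and the Customer_Data2 file from Listing 34.1; the 100MB target is illustrative.

```sql
-- Sketch: see whether autoshrink is enabled for the database.
SELECT name, is_auto_shrink_on
FROM sys.databases
WHERE name = 'Customer';
GO

-- Turn autoshrink off to avoid repeated shrink/grow cycles.
ALTER DATABASE Customer SET AUTO_SHRINK OFF;
GO

-- Manually shrink one file to a target size (in MB).
USE Customer;
GO
DBCC SHRINKFILE (N'Customer_Data2', 100);
GO
```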
Using Filegroups
All databases have a primary filegroup that contains the primary data file. There can be
only one primary filegroup. If you don’t create any other filegroups or change the default
filegroup to a filegroup other than the primary filegroup, all files will be in the primary
filegroup unless specifically placed in another filegroup.
In addition to the primary filegroup, you can add one or more filegroups to the database,
and a filegroup can contain one or more files. The main purpose of using filegroups is to
provide more control over the placement of files and data on your server. When you
create a table or index, you can map it to a specific filegroup, thus controlling the
placement of data. A typical SQL Server database installation generally uses a single RAID array
to spread I/O across disks and create all files in the primary filegroup; more advanced
installations or installations with very large databases spread across multiple array sets can
benefit from the finer level of control of file and data placement afforded by additional
filegroups.
For example, for a simple database such as AdventureWorks, you can create just one
primary file that contains all data and objects and a log file that contains the transaction
log information. For a larger and more complex database, such as a securities trading
system where large data volumes and strict performance criteria are the norm, you might
create the database with one primary file and four additional secondary files. You can then
set up filegroups so you can place the data and objects within the database across all five
files. If you have a table that itself needs to be spread across multiple disk arrays for
performance reasons, you can place multiple files in a filegroup, each of which resides on a
different disk, and create the table on that filegroup. For example, you can create three
files (Data1.ndf, Data2.ndf, and Data3.ndf) on three disk arrays, respectively, and then
assign them to the filegroup called spread_group. Your table can then be created
specifically on the filegroup spread_group. Queries for data from the table are spread across the
three disk arrays, thereby improving I/O performance.
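
The spread_group arrangement just described might be set up along the following lines. This is a sketch only: the drive letters, file sizes, and the order_history table are hypothetical, and it assumes a Customer database already exists.

```sql
-- Sketch: one filegroup backed by three files on three disk arrays.
ALTER DATABASE Customer
ADD FILEGROUP spread_group;
GO

ALTER DATABASE Customer
ADD FILE
   ( NAME = Data1, FILENAME = 'E:\SQLData\Data1.ndf', SIZE = 100MB, FILEGROWTH = 20MB ),
   ( NAME = Data2, FILENAME = 'F:\SQLData\Data2.ndf', SIZE = 100MB, FILEGROWTH = 20MB ),
   ( NAME = Data3, FILENAME = 'G:\SQLData\Data3.ndf', SIZE = 100MB, FILEGROWTH = 20MB )
TO FILEGROUP spread_group;
GO

-- A table created on the filegroup is spread across all three files.
USE Customer;
GO
CREATE TABLE order_history
   ( order_no   INT            NOT NULL,
     order_date DATETIME       NOT NULL,
     details    NVARCHAR(1000) NULL )
ON spread_group;
GO
```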
If a filegroup contains more than one file, when space is allocated to objects stored in that
filegroup, the data is stored proportionally across the files. In other words, if you have one
file in a filegroup with twice as much free space as another, the first file has two extents
allocated from it for each extent allocated from the second file (extents and space
allocation are discussed in more detail later in this chapter).
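
Since proportional allocation depends on the free space in each file, it can be useful to see that free space directly. The following sketch uses the FILEPROPERTY function with the SpaceUsed property, which returns a count of 8KB pages; the column aliases are illustrative, and the query is assumed to run in the database of interest.

```sql
-- Sketch: size, used space, and free space per data file, by filegroup.
USE Customer;
GO
SELECT fg.name                                               AS filegroup_name,
       df.name                                               AS logical_file_name,
       df.size / 128.0                                       AS size_mb,
       FILEPROPERTY(df.name, 'SpaceUsed') / 128.0            AS used_mb,
       (df.size - FILEPROPERTY(df.name, 'SpaceUsed')) / 128.0 AS free_mb
FROM sys.database_files AS df
INNER JOIN sys.filegroups AS fg
   ON df.data_space_id = fg.data_space_id   -- log files (data_space_id 0) drop out
ORDER BY fg.name, df.file_id;
GO
```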
Listing 34.2 provides an example of using filegroups in a database to control the file
placement of the customer_info table.
LISTING 34.2 Using a Filegroup to Control Placement for a Table
CREATE DATABASE Customer
ON ( NAME='Customer_Data',
     FILENAME='C:\SQLData\Customer_Data1.mdf',
     SIZE=50,
     MAXSIZE=100,
     FILEGROWTH=10)
LOG ON ( NAME='Customer_Log',
     FILENAME='C:\SQLData\Customer_Log.ldf',
     SIZE=50,
     FILEGROWTH=20%)
GO
-- Add an empty filegroup to the database
ALTER DATABASE Customer
ADD FILEGROUP Cust_table
GO
-- Add a secondary data file to the new filegroup
ALTER DATABASE Customer
ADD FILE
   ( NAME='Customer_Data2',
     FILENAME='G:\SQLData\Customer_Data2.ndf',
     SIZE=100,
     FILEGROWTH=20)
TO FILEGROUP Cust_Table
GO
-- Create the table on the new filegroup
USE Customer
CREATE TABLE customer_info
   (cust_no INT, cust_address NCHAR(200), info NVARCHAR(3000))
ON Cust_Table
GO
TABLE 34.2 The sys.filegroups System Catalog View
Column Name        Description
name               Name of the data space, unique within the database
data_space_id      Data space ID number, unique within the database
type               FG = Filegroup
type_desc          Description of data space type: ROWS_FILEGROUP
is_default         1 = This is the default data space. The default data space is used
                   when a filegroup or partition scheme is not specified in a CREATE
                   TABLE or CREATE INDEX statement.
                   0 = This is not the default data space.
filegroup_guid     GUID for the filegroup
                   NULL = PRIMARY filegroup
log_filegroup_id   Not used; value is NULL
is_read_only       1 = Filegroup is read-only
                   0 = Filegroup is read/write
The CREATE DATABASE statement in Listing 34.2 creates a database with a primary database
file and log file. The first ALTER DATABASE statement adds a filegroup. A secondary
database file is added with the second ALTER DATABASE command. This file is added to the
Cust_Table filegroup. The CREATE TABLE statement creates a table; the ON Cust_Table
clause places the table in the Cust_Table filegroup (the Customer_Data2 file on the G: disk
partition).
The sys.filegroups system catalog view contains information about the database
filegroups defined within a database, as shown in Table 34.2.
The following statement returns the filename, size in megabytes (not including autogrow),
and the name of the filegroup to which each file belongs:
SELECT
convert(varchar(30), sf.name) as filename,
size/128 as size_in_MB,
convert(varchar(30), sfg.name) as filegroupname
FROM sys.database_files sf
INNER JOIN sys.filegroups sfg
ON sf.data_space_id = sfg.data_space_id
go
filename                       size_in_MB  filegroupname
------------------------------ ----------- ------------------------------
Customer_Data                  50          PRIMARY
Customer_Data2                 100         Cust_table
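
The is_default flag in sys.filegroups shows where objects go when no filegroup is named. The following sketch, assuming the Customer database from Listing 34.2, checks the flag and then makes Cust_Table the default filegroup so that new tables and indexes land there unless told otherwise.

```sql
-- Sketch: inspect the filegroups and see which one is the default.
USE Customer;
GO
SELECT name, data_space_id, is_default, is_read_only
FROM sys.filegroups;
GO

-- Make Cust_Table the default filegroup for new objects.
ALTER DATABASE Customer
MODIFY FILEGROUP Cust_Table DEFAULT;
GO
```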
FILESTREAM Filegroups
FILESTREAM storage is a new feature in SQL Server 2008 for storing unstructured data,
such as documents, images, and videos. FILESTREAM storage helps to solve the issues with
using unstructured data by integrating the SQL Server Database Engine with the NTFS file
system: the unstructured data is stored on the file system, and the database stores a
pointer to it. Although the actual data resides outside
the database in the NTFS file system, you can still use Transact-SQL (T-SQL) statements to
insert, update, query, and back up FILESTREAM data, while maintaining transactional
consistency between the unstructured data and corresponding structured data with the same
level of security.
NOTE
To use FILESTREAM storage, you must first enable FILESTREAM storage at the
Windows level as well as at the SQL Server instance level. You can enable FILESTREAM
at the Windows level during installation of SQL Server 2008 or at any time using SQL
Server Configuration Manager. After you enable FILESTREAM at the Windows level, you
next need to enable FILESTREAM for the SQL Server instance. You can do this either
through SQL Server Management Studio (SSMS) or via T-SQL.
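
For the T-SQL route, a minimal sketch follows. It assumes FILESTREAM has already been enabled at the Windows level; the access level value 2 (T-SQL and Win32 streaming access) is one possible choice, with 0 disabling FILESTREAM and 1 allowing T-SQL access only.

```sql
-- Sketch: enable FILESTREAM for the instance.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- Compare the configured level with the level actually in effect.
SELECT SERVERPROPERTY('FilestreamConfiguredLevel') AS configured_level,
       SERVERPROPERTY('FilestreamEffectiveLevel')  AS effective_level;
GO
```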
After you enable FILESTREAM for the SQL Server instance, you can enable it for a
database by creating a FILESTREAM filegroup. You can do this when the database is created (or
to an existing database) by adding a filegroup and including the CONTAINS FILESTREAM
clause. Unlike regular filegroups, a FILESTREAM filegroup can contain only a single file
reference, which is actually a file system folder rather than an actual file. The actual folder
must not exist (although the path up to the folder must exist); SQL Server creates the
filestream folder. For example, in Listing 34.3, the code adds a FILESTREAM filegroup
called Cust_FSGroup and adds the folder G:\SQLData\custinfo_FS into the filegroup. This
custinfo_FS folder is created by SQL Server in the G:\SQLData folder.
LISTING 34.3 Using a Filegroup to Control Placement for a Table
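-- Add the FILESTREAM filegroup
ALTER DATABASE Customer
ADD FILEGROUP Cust_FSGroup CONTAINS FILESTREAM
GO
-- Add the FILESTREAM container (a folder, not a file) to the filegroup
ALTER DATABASE Customer
ADD FILE
   ( NAME=custinfo_FS,
     FILENAME = 'G:\SQLData\custinfo_FS')
TO FILEGROUP Cust_FSGroup
GO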
If you look in the G:\SQLData\custinfo_FS folder, you should see a Filestream.hdr file
and an $FSLOG folder. The Filestream.hdr file is a FILESTREAM container header file that
should not be moved or modified.
As you can see in the example in Listing 34.3, for FILESTREAM files or filegroups, unlike
regular files, you do not specify size or growth information. No space is preallocated. The
file and filegroup grow as data is added to tables that have been created with
FILESTREAM columns.
As you create tables with FILESTREAM columns, a subfolder is created in the filegroup
folder for each table. The filenames are GUIDs. Each FILESTREAM column created in the
table results in another subfolder created under the table subfolder. The column subfolder
name is also a GUID. At this point, there still are no actual files created. That happens
after you start adding rows to the table. A file is created in the column subfolder for each
row inserted into the table with a non-NULL value for the FILESTREAM column.
For more information on creating and using tables with FILESTREAM columns, see
Chapter 42, “What’s New for Transact-SQL in SQL Server 2008.”
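
As a brief sketch of the behavior described above, the following creates a table with one FILESTREAM column on the Cust_FSGroup filegroup from Listing 34.3 and inserts a single row. The table and column names are hypothetical. A table with FILESTREAM columns must also have a unique, non-NULL uniqueidentifier column marked ROWGUIDCOL.

```sql
-- Sketch only: customer_docs and its columns are hypothetical names.
USE Customer;
GO
CREATE TABLE customer_docs
   ( doc_id   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
     cust_no  INT NOT NULL,
     doc_data VARBINARY(MAX) FILESTREAM NULL )
ON [PRIMARY]
FILESTREAM_ON Cust_FSGroup;
GO

-- Inserting a non-NULL value creates a file in the column subfolder.
INSERT INTO customer_docs (cust_no, doc_data)
VALUES (1, CAST('sample document body' AS VARBINARY(MAX)));
GO
```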
Database Pages
All information in SQL Server is stored at the page level. The page is the smallest level of
I/O in SQL Server and is the fundamental storage unit. Pages contain the data itself or
information about the physical layout of the data. The page size is the same for all page
types: 8KB, or 8,192 bytes. The pages are arranged in two basic types of storage structures:
linked data pages and index trees.
Databases are divided into logical 8KB pages. Within each file allocated to a database, the
pages are numbered contiguously from 0 to n. The actual number of pages in the database
file depends on the size of the file. Pages in a database are uniquely referenced by
specifying the database ID, the file ID for the file the page resides in, and the page number
within the file. When you expand a database with ALTER DATABASE, the new space is
added at the end of the file, and the page numbers continue incrementing from the
previous last page in the file. If you add a completely new file, its first page number is 0. When
you shrink a database, pages are removed from the end of the file only, starting at the
highest page in the database and moving toward lower-numbered pages until the database
reaches the specified size or a used page that cannot be removed. This ensures that page
numbers within a file are always contiguous.
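
The page numbering described above can be derived from the file size, which the catalog views report in 8KB pages. The following sketch lists, for each data file in the current database, the number of pages and the highest page number (page numbers start at 0); the column aliases are illustrative.

```sql
-- Sketch: page counts per data file in the current database.
USE Customer;
GO
SELECT DB_ID()   AS database_id,
       file_id,
       name      AS logical_file_name,
       size      AS page_count,            -- size is stored in 8KB pages
       size - 1  AS highest_page_number,   -- pages are numbered from 0
       size * 8  AS size_kb
FROM sys.database_files
WHERE type = 0;                            -- data (rows) files only
GO
```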
TABLE 34.3 Page Types
Page Type                  Stores
Data                       Data rows for all data except text, ntext, image, nvarchar(max),
                           varchar(max), varbinary(max), and xml data
Row Overflow               Data columns that cause a data row to exceed the 8,060 bytes per
                           page limit
LOB                        Large object types (text, ntext, image, nvarchar(max),
                           varchar(max), varbinary(max), and xml data, as well as varchar,
                           nvarchar, varbinary, and sql_variant when data row size
                           exceeds 8KB)
Index                      Index entries and pointers
Global Allocation Map      Information about allocated (used) extents
Page Free Space            Information about page allocation and free space on pages
Index Allocation Map       Information about extents used by a table or an index
Differential Changed Map   Information about which extents have been modified since the
                           last full database backup
Bulk Changed Map           Information about which extents have been used in a minimally
                           logged or bulk-logged operation since the last BACKUP LOG
                           statement
FIGURE 34.1 SQL Server page layout (an 8KB page of 8,192 bytes: a 96-byte header followed
by an 8,096-byte body)
Page Types
There are eight page types in SQL Server, as listed in Table 34.3.
All pages, regardless of type, have a similar layout. They all have a page header, which is
96 bytes, and a body, which consequently is 8,096 bytes. The page layout is shown in
Figure 34.1.
Data Pages
The actual data rows in tables are stored on data pages. Figure 34.2 shows the basic
structure of a data page.
The following sections discuss and examine the contents of the data page.
The Page Header
The page header contains control information for the page. Some fields assist when SQL
Server checks for consistency among its storage structures, and some fields are used when
navigating among the pages that constitute a table. Table 34.4 describes the more useful
fields contained in the page header.
FIGURE 34.2 The structure of a SQL Server data page (the page header occupies the first
96 bytes; data rows 0, 1, and 2 begin at byte addresses 96, 118, and 140; the row offset
table at the end of the page maps each row ID to its byte address)