CHAPTER 49 SQL Server Service Broker
Now that you have a sense of what you can accomplish with this utility, you should take
some time to explore it further by monitoring the status of a conversation of the service
you set up in the XBikeDistribution database. Further information is available in the
MSDN article “ssbdiagnose Utility.”
Related System Catalogs
A few of the system catalogs and dynamic management views (DMVs) might be of interest
to you if you’re debugging Service Broker applications or simply seeking a greater
understanding of how Service Broker works under the hood. Let’s take a look at some of them.
You’ve already seen sys.transmission_queue, which is used to store undelivered messages
in a particular database. This view is very useful because it provides the reason a message
is undeliverable (in transmission_status), the date sent (in enqueue_time), a conversation
identifier (conversation_handle), contract and service names (service_contract_name,
to_service_name, from_service_name), and more.
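As a rough illustration of the columns just described, a query along the following lines lists the undelivered messages in the current database, oldest first. It is only a sketch: the column selection and the ordering are choices made here for illustration.

```sql
-- List undelivered messages in the current database, oldest first.
SELECT
    conversation_handle,
    from_service_name,
    to_service_name,
    service_contract_name,
    enqueue_time,
    transmission_status   -- the reason the message has not been delivered
FROM sys.transmission_queue
ORDER BY enqueue_time;
```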
Another useful catalog is sys.service_queues, which holds the definitions of the queues
defined in a particular database. It has a few interesting columns:
activation_procedure: This column contains the name of the activated service program that is bound to the queue.
max_readers: This column contains the integer value specified for MAX_QUEUE_READERS in CREATE QUEUE.
is_retention_enabled: This column contains the Boolean value of the RETENTION flag in CREATE QUEUE.
You can use the value in the object_id column to figure out which queue is being
referenced in a particular error message, such as the following, which you may find in your
transmission queue someday: “This message could not be delivered because the
destination queue has been disabled. Queue ID: 325576198.” This error occurs when
your activated code throws an error in its body after receiving a message, rolls back the
receive, is activated again, and so on, until Service Broker intervenes and disables the
queue. (This happens after five consecutive rollbacks.) A similar error is raised if you set
ENCRYPTION = ON and don’t set up certificates.
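The lookup described above might be sketched as follows. The object ID is the one from the sample error text; the queue name dbo.MyQueue in the ALTER QUEUE statement is hypothetical and stands in for whatever name the first query returns.

```sql
-- Find the queue named in the error message by its object ID.
SELECT
    name,
    activation_procedure,
    max_readers,
    is_retention_enabled,
    is_receive_enabled,      -- 0 when the queue has been disabled
    is_activation_enabled
FROM sys.service_queues
WHERE object_id = 325576198;

-- After fixing the activated procedure, re-enable the queue.
-- dbo.MyQueue is a hypothetical name; substitute the name returned above.
ALTER QUEUE dbo.MyQueue WITH STATUS = ON;
```

Re-enabling the queue without first correcting the failing procedure simply lets the same message disable it again, so the fix should come first.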
To see all the services in a particular database, you can query sys.services. To see all the
active conversation groups, you can query sys.conversation_groups. The following query
shows how to use these views together:
SELECT
sq.name as QueueName,
ss.name as ServiceName,
cg.conversation_group_id as CGId
FROM sys.services ss
JOIN sys.service_queues sq
ON ss.service_queue_id = sq.object_id
LEFT JOIN sys.conversation_groups cg
ON cg.service_id = ss.service_id
To see all the contracts in a particular database, you can query sys.service_contracts. To
see all the message types, you can query sys.service_message_types. These catalog views
are brought together in the catalog view sys.service_contract_message_usages (showing
message types by contract). You can also link them to sys.service_contract_usages
(showing contracts by service) via the following query:
SELECT
s.Name ServiceName,
sc.Name ContractName,
smt.Name as MsgTypeName,
scmu.is_sent_by_initiator,
scmu.is_sent_by_target
FROM sys.services s
JOIN sys.service_contract_usages scu
ON scu.service_id = s.service_id
JOIN sys.service_contracts sc
ON sc.service_contract_id = scu.service_contract_id
JOIN sys.service_contract_message_usages scmu
ON scmu.service_contract_id = sc.service_contract_id
JOIN sys.service_message_types smt
ON smt.message_type_id = scmu.message_type_id
In addition, you can view any certificates you have created by querying
sys.certificates, routes via sys.routes, and remote service bindings via
sys.remote_service_bindings. Each side of a conversation is known as an endpoint, and
you can view endpoints by querying sys.conversation_endpoints.
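For instance, a query like the following shows each conversation endpoint alongside the local service it belongs to. This is a minimal sketch; the choice of columns is an assumption about what is most useful when debugging.

```sql
-- Show each conversation endpoint with its local service and current state.
SELECT
    ce.conversation_handle,
    ce.is_initiator,         -- 1 = initiator side, 0 = target side
    ce.state_desc,
    ce.far_service,          -- name of the service at the other end
    s.name AS LocalServiceName
FROM sys.conversation_endpoints ce
JOIN sys.services s
    ON s.service_id = ce.service_id;
```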
Five DMVs may be of interest in debugging live Service Broker applications:
sys.dm_broker_activated_tasks: Each row refers to a stored procedure being activated.
sys.dm_broker_connections: Each row refers to an in-use Service Broker network connection.
sys.dm_broker_forwarded_messages: Each row refers to a message currently being forwarded.
sys.dm_broker_queue_monitors: Each row refers to the current behavior of a SQL Server background task known as a queue monitor, which is responsible for activation.
sys.dm_broker_transmission_status: Each row refers to the status of a message being transmitted.
To see all the activated stored procedures in a given database, for example, you can try
the following:
SELECT
d.name DBName,
sq.name QueueName,
dmbat.spid SPID,
dmbat.procedure_name ProcName
FROM sys.dm_broker_activated_tasks dmbat
JOIN sys.databases d ON
d.database_id = dmbat.database_id
AND dmbat.database_id = DB_ID()
JOIN sys.service_queues sq
ON dmbat.queue_id = sq.object_id
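In the same spirit, the queue monitors for the current database can be inspected with a query such as this one. It is a sketch only; the selected columns are an illustrative choice.

```sql
-- Show queue monitor activity for queues in the current database.
SELECT
    sq.name AS QueueName,
    qm.state,
    qm.tasks_waiting,
    qm.last_activated_time,
    qm.last_empty_rowset_time
FROM sys.dm_broker_queue_monitors qm
JOIN sys.service_queues sq
    ON sq.object_id = qm.queue_id
WHERE qm.database_id = DB_ID();
```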
Summary
Like the addition of native Web services, the addition of Service Broker pushes SQL Server
even further outside the bounds of being a pure database server and into the application
server realm.
Because it is built directly into SQL Server databases, Service Broker inherently provides
backup and restoration, replication and failover, and single-mode transactions, which
together give Service Broker an edge over competing messaging technologies. Plus, as
you’ve seen, it’s extremely easy to set up and begin coding, because you need to do very
little groundwork; all the ingredients are already “in there.”
One issue not covered in this chapter is the fact that service programs may be written in
managed code that makes use of SQL Server CLR integration. At the time of this writing,
there is no officially released .NET Framework library for the Service Broker objects, so this
chapter does not cover the subject. However, Microsoft may release a Windows
Communication Foundation (WCF) channel that provides a Service Broker interface.
Chapter 50, “SQL Server Full-Text Search” (on the CD), looks at how SQL Server’s
Full-Text Search feature enables you to create an index of and perform specialized queries
against all textual data in your tables.
CHAPTER 50 SQL Server Full-Text Search
IN THIS CHAPTER
What’s New in SQL Server 2008 Full-Text Search
Upgrade Options in SQL Server 2008
How SQL Server FTS Works
Implementing SQL Server 2008 Full-Text Catalogs
Setting Up a Full-Text Index
Full-Text Searches
Full-Text Search Maintenance
Full-Text Search Performance
Full-Text Search Troubleshooting
This chapter looks at how to use SQL Server 2008 Full-Text
Search (FTS). SQL Server FTS allows you to do extremely fast
searches of textual contents stored in columns of the char,
nchar, varchar, nvarchar, varchar(max), nvarchar(max),
xml, and text data types and binary content stored in image
and varbinary(max) data types (if you have an IFilter for
the data stored in the image or varbinary(max) data types).
SQL Server FTS has considerable advantages over a search
based on a LIKE clause because it is faster, can search binary
content, and has language features that a LIKE clause does
not support. SQL Server FTS also allows you to include a
wildcard at the end of a word (for example, doing a search
on test* to match test, testing, tester, and testament).
However, SQL Server FTS does not allow a wildcard at the
beginning of a word; for these types of suffix-based
searches, you still have to use a LIKE clause.
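The difference between the two kinds of wildcard search can be sketched as follows. The table dbo.Documents and its columns DocumentID and Body are hypothetical, and the example assumes a full-text index already exists on Body.

```sql
-- Prefix search: answered from the full-text index.
-- Note the double quotes around the prefix term inside the search string.
SELECT DocumentID
FROM dbo.Documents            -- hypothetical table
WHERE CONTAINS(Body, N'"test*"');

-- A wildcard at the start of the pattern has no full-text equivalent, so LIKE is needed.
-- This pattern matches any row whose text contains "ing" anywhere,
-- not only at the end of a word, and it cannot use the full-text index.
SELECT DocumentID
FROM dbo.Documents
WHERE Body LIKE N'%ing%';
```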
SQL Server FTS creates an index similar to one you can find
at the back of any book. It contains a list of words, with
pointers to the tables and rows that contain the words. SQL
Server consults this index, called a full-text index, when you
issue a full-text query; it returns a list of rows that contain
the words in your search phrase.
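To get a feel for this book-index structure, you can list the words stored in an existing full-text index with one of the SQL Server 2008 DMVs. The sketch below assumes that the Production.ProductReview table in AdventureWorks has a full-text index; substitute any full-text indexed table of your own.

```sql
USE AdventureWorks;

-- List the most widely used indexed words and how many rows contain each.
SELECT TOP (20)
    display_term,
    column_id,
    document_count
FROM sys.dm_fts_index_keywords(
        DB_ID(N'AdventureWorks'),
        OBJECT_ID(N'Production.ProductReview'))   -- assumed to be full-text indexed
ORDER BY document_count DESC;
```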
SQL Server FTS ships, by default, in all editions of SQL
Server except SQL CE/Mobile and SQL Express. There are
several editions of SQL Server Express. If you want an
edition of SQL Server Express with Full-Text Search, you
need to download SQL Server 2008 R2 Express with
Advanced Services.
What’s New in SQL Server 2008 Full-Text Search
Microsoft spent more than five years developing SQL Server 2008 R2. The developers at
Microsoft spent that time improving the engine, tools, and performance and making this
version more user friendly. SQL Server 2008 introduces the following for Full-Text Search:
Full-text catalogs are now stored inside the database. In previous versions, they were
stored in the filesystem.
SQL Server execution plans are able to make intelligent queries against the
full-text catalogs.
Stop lists (also known as noise word files) are now stored in the database. You can
create any number of stop lists; each full-text indexed table or indexed view can
have its own specific stop list.
You are able to use two DMVs to troubleshoot indexing.
More languages are now supported for Full-Text Search. SQL Server 2008 supports 48
languages, up from the 23 supported in SQL Server 2005.
Full-text catalogs in log-shipped or mirrored databases do not need to be repopulated
when you fail over. In previous versions of SQL Server, if you log shipped or
mirrored a database that was full-text indexed, you would have to repopulate the
full-text indexes when you failed over to the secondary or mirror. In SQL Server
2008, this step is no longer necessary; on failover, the full-text indexes are
immediately accessible and queryable.
FTS works with the FILESTREAM property of varbinary(max) columns.
A new external engine performs the content extraction and indexing (it is called
FDHost, also known as the Filter Daemon Host).
DBCC CHECKDB validates the SQL Full-Text structures; however, it does not validate
their contents.
Considerable performance improvements have been made.
These new features are covered throughout this chapter.
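As a brief taste of the database-resident stop lists mentioned above, the following sketch creates a stop list, adds a word to it, and attaches it to a full-text index. The name MyStopList and the stop word are hypothetical, and the example assumes Production.ProductReview already has a full-text index. Stop list statements require the database compatibility level to be 100 or higher.

```sql
-- Create a stop list seeded from the system stop list.
CREATE FULLTEXT STOPLIST MyStopList FROM SYSTEM STOPLIST;   -- hypothetical name

-- Add a stop word for English (LCID 1033).
ALTER FULLTEXT STOPLIST MyStopList ADD N'bike' LANGUAGE 1033;

-- Associate the stop list with an existing full-text index.
ALTER FULLTEXT INDEX ON Production.ProductReview
    SET STOPLIST MyStopList;
```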
Upgrade Options in SQL Server 2008
When you upgrade SQL Server from a previous version to SQL Server 2008, or if you
attach or restore a SQL 2005 database to SQL Server 2008, you are prompted to upgrade
your full-text catalogs.
There are three options:
Import
Rebuild
Reset
In an import, the full-text catalog is imported from the SQL Server 2005 instance in the
case of an upgrade. In the case of a database attach or restore, the SQL 2005 full-text
catalogs are stored in the database in a different filegroup.
An import is faster than a rebuild; however, the import does not use the new word
breakers (described later), so for some languages where the word breakers have been improved,
you may not get consistent results when compared with catalogs that have been rebuilt.
In SQL Server 2008, if your full-text key is not an integer, the full-text engine builds a
document map mapping the full-text key with a more efficient integer-based key. If you
import your full-text catalog and your full-text key is not an integer key, your full-text
queries are not able to take advantage of this efficiency.
A rebuild takes advantage of the new word breakers and other capabilities of SQL Server
2008. However, the rebuild may take longer than an import.
A reset drops the SQL 2005 catalogs but retains the metadata, so your full-text indexes are
intact but not populated until you start a population in SQL Server 2008.
A rebuild is the preferred option but results in the full-text catalog being rebuilt;
consequently, full-text query results will be incomplete until the rebuild is complete.
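The upgrade option can also be set at the server level with T-SQL rather than through the prompt. The following is a hedged sketch: the upgrade_option action and its value mapping are given here from memory of the sp_fulltext_service documentation, so verify them against Books Online before relying on them.

```sql
-- Set the server-wide full-text upgrade option.
-- Assumed value mapping: 0 = rebuild, 1 = reset, 2 = import.
EXEC sp_fulltext_service 'upgrade_option', 0;   -- prefer rebuild
```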
How SQL Server FTS Works
As mentioned previously, in SQL Server 2008, the full-text catalogs are now stored inside the
database. This redesign has resulted in many architectural changes in SQL Server 2008
Full-Text Search.
The two main components of Full-Text Search are as follows:
Indexing: Extracts the textual content from your data and stores the words or tokens
in inverted file indexes.
Searching: Queries these inverted file indexes and returns the rows that match the
query.
Indexing
The indexing engine connects to your database and extracts the content from the tables
you are full-text indexing. It then sends this stream to COM components called filters (or
IFilters). These COM components run in an out-of-process service called the Filter
Daemon Host. These filters are able to understand the content and can extract text data
from it. For example, if you store XML or Word documents in your database, these
filters can understand this data or binary data and emit the words and/or tokens they find
there. The filters chosen are the default text ones if you are using char, varchar, or text
data types or XML if you are using the xml data type. If you are indexing varbinary
documents, the indexing engine reads the document type column and launches the filter
corresponding to the value stored in the document type column.
If you are storing Word documents in a varbinary data type column, and in your full-text
index creation statement you specified a document type column called DocumentType, the
contents of this column for that row should be doc, .doc, docx, or .docx.
You can obtain a list of filters in use by querying as follows:
select document_type from sys.fulltext_document_types
Each filter understands the file format of the type of document it indexes. For example,
the Word filter understands the file formats for Word documents and emits the textual
data it finds in the Word documents; the XML filter understands the XML documents and
emits the textual data it finds in them.
If you need to index documents for which the file type does not appear in the results of
sys.fulltext_document_types, you need to install that filter on the server running SQL
Server 2008 and then allow SQL Server 2008 to use it.
To allow SQL Server to use these third-party IFilters, you need to issue the following
command:
EXEC sp_FullText_Service 'load_os_resources', 1
This command loads the filter if it is installed on the OS. In most cases, this is sufficient.
In some cases, however, SQL Server wants to verify the signature/certificate embedded in the COM
component/filter. This can cause problems in two ways. First, the filter may not have a
certificate, and when SQL Server tries to validate the certificate with the issuing authority,
it is unable to do so. Second, the performance impact of having to validate the
certificate/signature causes the initial queries to take a long time as the validation process
proceeds. For these two reasons, you might want to disable the certificate/signature check
by using the following command:
EXEC sp_FullText_Service 'verify_signature', 0
Microsoft has published documentation on how to develop your own filters. For more
information on how to do this, consult
http://msdn2.microsoft.com/en-us/library/ms916793.aspx.
The filters then send the stream of textual data they emit to other components
called word breakers. Word breakers respect the language you specified to be used to index
your columns’ content.
The neutral word breaker basically breaks words at whitespace boundaries and at
punctuation (, : ; ’ “ ! -) and indexes only alphanumeric characters.
The English (U.S.) and British (or International English) word breakers index hyphenated
words without the hyphens and as their component words, so data-base is indexed as
data, base, and database. They also index acronyms as single letters and the whole word if
they are capitalized. For example, F.B.I. is indexed as f, b, i, and fbi (words are indexed
lowercase).
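You can watch a word breaker at work with the sys.dm_fts_parser DMV, which returns the terms a given language's word breaker produces for a string. The sketch below compares the English (LCID 1033) and neutral (LCID 0) word breakers; the sample phrase is chosen here for illustration, and the exact rows returned depend on the word breaker versions installed on your server.

```sql
-- Arguments: query string, LCID, stoplist ID (0 = system stop list),
-- accent sensitivity (0 = insensitive).

-- English word breaker
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(N'"data-base F.B.I."', 1033, 0, 0);

-- Neutral word breaker, for comparison
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(N'"data-base F.B.I."', 0, 0, 0);
```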
The English and British English word breakers are nearly identical, with the exception that
during the searching process, different stems may be used. For example, U.S. English speakers may
say oriented, whereas British English speakers may say orientated (in Canada, oriented is now
more common; however, in the rest of the English-speaking world, with the exception of
the United States, orientated is more common).
The German and Dutch word breakers index compound words as the compound and
constituent words. For example, the German word Volkswagen is indexed as volkswagen, volks, and wagen.
For Far Eastern languages, the word breakers break the sentence at whitespace and then
go through the “words” and extract characters. In some Far Eastern languages, characters
appear contiguous to each other in blocks that appear to Westerners as words. In fact,
each character is a word unto itself, and characters can be combined to form new words.
These characters may be indexed singly or in multiple character combinations.
By default, the word breaker used by the indexing process is the one for the default full-text
language specified in sp_configure, unless you specify that you want the contents of the
columns you are full-text indexing to be indexed in a different language:
exec sp_configure 'show advanced options', 1
reconfigure with override
exec sp_configure 'default full-text language'
Refer to the sections titled “Using the Full-Text Indexing Wizard to Build Full-Text Indexes
and Catalogs” and “Using T-SQL Commands to Build Full-Text Indexes and Catalogs.”
Some documents have language-specific tags in them that launch different word breakers
than the ones you specify on your server or in your full-text index creation statement. For
example, Word and XML documents have language tags embedded in them. If your Word
documents are in German, and you specify in your full-text index creation statement to
use the French word breakers, your Word documents are indexed in German, not French.
When the word breakers have done their work, the stop lists are applied and the stop words
are removed. Then the words are sent to the full-text indexes. The full-text indexes store
positional information, so they know where a word occurs in a document. These word
positions also reflect stop list words that were removed.
At any one time, there may be multiple temporary memory-resident full-text indexes. At
certain periods, these temporary full-text indexes are consolidated into a single master
full-text index. This process is called a master merge. You can force a master merge by
reorganizing a catalog (using the T-SQL statement ALTER FULLTEXT CATALOG MyCatalog
REORGANIZE, where your catalog is named MyCatalog) or optimizing it (an option available to you
in the Catalog Properties dialog).
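A master merge and a quick status check might look like the following sketch. MyCatalog is the example catalog name used in the text; the three properties queried are an illustrative selection.

```sql
-- Force a master merge of the catalog's full-text indexes.
ALTER FULLTEXT CATALOG MyCatalog REORGANIZE;

-- Check on the catalog afterward.
SELECT
    FULLTEXTCATALOGPROPERTY(N'MyCatalog', 'MergeStatus')    AS MergeStatus,     -- 1 = master merge in progress
    FULLTEXTCATALOGPROPERTY(N'MyCatalog', 'PopulateStatus') AS PopulateStatus,  -- 0 = idle
    FULLTEXTCATALOGPROPERTY(N'MyCatalog', 'ItemCount')      AS ItemCount;
```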
Searching
Although the indexer launches word breakers and filters as out-of-process SQL Server
components, the search process is entirely within the SQL Server engine. To query the
full-text indexes, you need to use the CONTAINS or FREETEXT predicates or their rowset analogs
(CONTAINSTABLE, FREETEXTTABLE).
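The following sketch shows one predicate form and one rowset form side by side. It assumes the Comments column of Production.ProductReview in AdventureWorks is full-text indexed; the search terms are arbitrary examples.

```sql
USE AdventureWorks;

-- Predicate form: filters rows in the WHERE clause.
SELECT ProductReviewID
FROM Production.ProductReview
WHERE FREETEXT(Comments, N'comfortable ride');

-- Rowset form: returns the full-text key and a relevance rank to join on.
SELECT pr.ProductReviewID, ct.[RANK]
FROM CONTAINSTABLE(Production.ProductReview, Comments, N'"bike*"') AS ct
JOIN Production.ProductReview AS pr
    ON pr.ProductReviewID = ct.[KEY]
ORDER BY ct.[RANK] DESC;
```

The rowset functions are the better fit when you want to order results by relevance, because the predicates only return a yes-or-no match.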
Just as the indexer applies the default server full-text language for indexing, the search process
also applies the default full-text language for searching. Consider a search on the French word courir (to
run). If you were to search in English on this word, it would search on courir and courirs.
However, on a server with the default full-text language setting for French, your search
would be conducted on couraient, courais, courait, courant, coure, courent, coures, courez,
couriez, courions, courir, courons, courra, courrai, courraient, courrais, courrait, courras, courrez,
courriez, courrions, courrons, courront, cours, court, couru, courue, courues, courumes, courumes,
coururent, courus, courusse, courussent, courusses, courussiez, courussions, courut, courutes.
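Rather than changing the server default, you can also name the language in the query itself. The sketch below assumes a hypothetical table dbo.Articles whose Body column is full-text indexed with French content; 1036 is the LCID for French.

```sql
-- Ask for the inflectional forms of courir using the French stemmer.
SELECT ArticleID
FROM dbo.Articles             -- hypothetical table
WHERE CONTAINS(Body, N'FORMSOF(INFLECTIONAL, courir)', LANGUAGE 1036);

-- To see which languages are available on the server:
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```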
Now that you understand the architecture of Full-Text Search, let’s discuss how to create
full-text catalogs.
NOTE
The examples in this chapter are based on the SQL Server 2005 version of the
AdventureWorks database. The 2005 version of the AdventureWorks database can be
installed using the same installer that installs the AdventureWorks2008 or
AdventureWorks2008R2 database. If you didn’t install AdventureWorks when you
installed either of these sample databases, simply relaunch the installer and choose to
install the AdventureWorks OLTP database.
For more information on downloading and installing the AdventureWorks sample
databases, see the Introduction chapter.
Implementing SQL Server 2008 Full-Text Catalogs
In SQL Server 2005 and previous versions, full-text catalogs were containers for your
full-text indexes. In SQL Server 2008, they are really virtual containers on which you can tag
various settings and have these settings apply to all indexes placed in that catalog (for
example, accent sensitivity) or rebuild all indexes in a catalog at one time.
To create a full-text catalog, you first need to full-text enable your database. To do this,
issue the following query in your database:
EXEC sp_fulltext_database 'enable'
You can also right-click on your database, select Properties, and then click on the Files tab.
Check Use Full-Text Indexing and then click OK.
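To confirm the result, you can check the relevant server and database properties. This is a minimal sketch that assumes the AdventureWorks database used throughout this chapter.

```sql
-- 1 means the full-text component is installed on this instance.
-- 1 means the database is full-text enabled.
SELECT
    FULLTEXTSERVICEPROPERTY('IsFullTextInstalled')              AS IsFullTextInstalled,
    DATABASEPROPERTYEX(N'AdventureWorks', 'IsFullTextEnabled')  AS IsFullTextEnabled;
```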
After doing this, you can create your catalog. There are two ways to do this: by using the
wizard or using T-SQL.
Before you create a full-text index, you must create a full-text catalog. In the wizard,
full-text catalog creation can be done alongside full-text index creation, but under the covers,
the catalog is always created first.
We first discuss creating the full-text catalog using the wizard and then using the T-SQL
commands.
Setting Up a Full-Text Index
There are two ways to create a full-text index:
Using T-SQL commands
Using the Full-Text Wizard
Using T-SQL Commands to Build Full-Text Indexes and Catalogs
In SQL 2008, full-text catalogs are “virtual.” They are just containers for full-text catalog
properties like accent sensitivity or catalog rebuild or population statements. They live
inside the database in SQL 2008, unlike SQL 2000 and 2005, where the catalogs and
full-text indexes resided in the filesystem.
To build your full-text catalogs and indexes, you need to use the CREATE FULLTEXT
commands.
NOTE
T-SQL commands are not case sensitive.
There are three commands for full-text index creation and maintenance:
CREATE FULLTEXT CATALOG
CREATE FULLTEXT INDEX
ALTER FULLTEXT INDEX
Let’s look at how they work.
CREATE FULLTEXT CATALOG
To create a full-text catalog in its simplest form, you enter this command:
USE AdventureWorks;
Create fulltext catalog MyCatalog
In this command, MyCatalog is the name of the catalog. The CREATE FULLTEXT CATALOG
statement has several switches:
ON FILEGROUP
IN PATH
WITH ACCENT_SENSITIVITY
AS DEFAULT
AUTHORIZATION
We next cover each of these parameters.
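As a preview, several of these switches can be combined in a single statement. The catalog name below is hypothetical, and the option values are arbitrary choices for illustration.

```sql
USE AdventureWorks;

-- Create an accent-insensitive catalog, make it the database default,
-- and assign ownership to dbo.
CREATE FULLTEXT CATALOG MyAccentInsensitiveCatalog   -- hypothetical name
    WITH ACCENT_SENSITIVITY = OFF
    AS DEFAULT
    AUTHORIZATION dbo;
```

Because AS DEFAULT changes which catalog new full-text indexes go into when none is named, leave that switch out if you only want to experiment.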