catalogs are queryable, but no results are returned until you rebuild them); 2, which
means the full-text indexes are imported into the database (however, the results may
be inconsistent because some of the full-text indexes are generated by the SQL Server 2005
full-text word breakers and not the SQL Server 2008 word breakers).
Now you know how to build full-text catalogs and indexes and modify them. The next
section describes how to get information on the catalogs and indexes you build.
Diagnostics
After you create catalogs and indexes, you occasionally need to get information about
your catalogs, tables, and indexes. For this purpose, Microsoft has supplied the
sp_help_fulltext_tables and sp_help_fulltext_columns stored procedures and the
system view sys.fulltext_catalogs.
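As a minimal sketch of how these objects might be called, the following batch assumes the MyCatalog catalog and the Person.Contact table used in the examples later in this section; substitute your own names.

```sql
-- List the full-text catalogs defined in the current database.
SELECT fulltext_catalog_id, name, is_default, is_accent_sensitivity_on
FROM sys.fulltext_catalogs;

-- List the tables registered for full-text indexing in one catalog.
-- 'MyCatalog' is the sample catalog name assumed throughout this section.
EXEC sp_help_fulltext_tables @fulltext_catalog_name = N'MyCatalog';

-- List the full-text indexed columns of one table.
EXEC sp_help_fulltext_columns @table_name = N'Person.Contact';
```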
These stored procedures and the view allow you to examine the state of your full-text tables,
columns, and catalogs. Microsoft recommends that rather than using these objects, you
use the OBJECTPROPERTY, COLUMNPROPERTY, and FULLTEXTCATALOGPROPERTY metadata
functions. Table 50.1 lists the full-text index properties for the OBJECTPROPERTY function.
TABLE 50.1 Full-Text Index Properties for the OBJECTPROPERTY Function
Property | Description | Return Values
TableFullTextBackgroundUpdateIndexOn | Indicates whether background update of the index (automatic change tracking) is enabled | 1 = true and 0 = false
TableFulltextCatalogId | Returns the catalog ID of the catalog the full-text index is placed on | Catalog ID or 0 (table not indexed)
TableFulltextChangeTrackingOn | Indicates whether change tracking is enabled | 1 = true and 0 = false
TableFulltextDocsProcessed | Returns the number of rows processed since indexing started |
TableFulltextFailCount | Returns the number of rows that failed to index |
TableFulltextItemCount | Returns the number of rows successfully indexed |
TableFulltextKeyColumn | Returns the ID of the key column of the unique index used by SQL Server FTS (normally the primary key) |
Table 50.2 lists the full-text index properties for the COLUMNPROPERTY function.
Table 50.3 lists the properties for the FULLTEXTCATALOGPROPERTY function.
TABLE 50.1 Continued
Property | Description | Return Values
TableFulltextPendingChanges | Returns the number of rows outstanding to be indexed |
TableFulltextPopulateStatus | Returns a number indicating the state of the population | 1 = full population is in progress; 2 = incremental population is in progress; 3 = propagation of tracked changes is in progress; 4 = background update index is in progress, such as autochange tracking; and 5 = full-text indexing is throttled or paused
TableHasActiveFulltextIndex | Indicates whether a table has an active full-text index on it | 1 = true and 0 = false
TABLE 50.2 Full-Text Index Properties for the COLUMNPROPERTY Function
Property | Description | Return Values
IsFulltextIndexed | Indicates whether a column is full-text indexed | 1 = true and 0 = false
FullTextTypeColumn | Returns the ID of the document type column |
TABLE 50.3 Properties for the FULLTEXTCATALOGPROPERTY Function
Property | Description | Return Values
AccentSensitivity | Indicates whether the catalog is accent sensitive | 1 = true and 0 = false
IndexSize | Returns the size of the full-text catalog |
ItemCount | Returns the number of items (rows) indexed in the catalog |
The following examples show how to query metadata functions using the full-text index
properties:
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableFullTextBackgroundUpdateIndexOn')
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableFulltextChangeTrackingOn')
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableFulltextKeyColumn')
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableFulltextPendingChanges')
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableFulltextPopulateStatus')
SELECT OBJECTPROPERTY(object_id('Person.Contact'), 'TableHasActiveFulltextIndex')
TABLE 50.3 Continued
Property | Description | Return Values
MergeStatus | Indicates whether a master merge is in progress | 1 = true and 0 = false
PopulateCompletionAge | Specifies when the last population completed, expressed as the number of seconds between 01/01/1990 00:00:00 and the completion time |
PopulateStatus | Returns the status of the population | 0 = idle, 1 = full population in progress, 2 = paused, 3 = throttled, 4 = recovering, 5 = shut down, 6 = incremental population in progress, 7 = building index, 8 = disk is full, paused, and 9 = change tracking
UniqueKeyCount | Returns the number of unique words indexed |
ResourceUsage | Returns a number indicating how aggressively SQL Server FTS is consolidating the catalog | Ranges from 1 to 5 (the most aggressive); 3 is the default
IsFulltextInstalled | Indicates whether SQL Server FTS is installed | 1 = true and 0 = false
LoadOSResources | Indicates whether third-party word breakers are loaded | 1 = true and 0 = false
VerifySignature | Determines whether signatures of word breakers and language resources are checked | 1 = true and 0 = false
ResourceUsage, IsFulltextInstalled, LoadOSResources, and VerifySignature are server-level properties; you read them with the FULLTEXTSERVICEPROPERTY function rather than FULLTEXTCATALOGPROPERTY.
SELECT COLUMNPROPERTY(object_id('Person.Contact'), 'charcol', 'IsFulltextIndexed')
SELECT COLUMNPROPERTY(object_id('Person.Contact'), 'VarbinaryColumn', 'FullTextTypeColumn')
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'indexsize')
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'itemcount')
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'mergestatus')
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'populatecompletionage')
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'populatestatus')
SELECT FULLTEXTSERVICEPROPERTY('loadosresources')
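The individual calls can also be combined into a small status report. The following is only a sketch: the column aliases are made up for illustration, the status labels follow the FULLTEXTCATALOGPROPERTY PopulateStatus codes, and the date conversion assumes that PopulateCompletionAge is reported as a number of seconds counted from January 1, 1990.

```sql
-- Table-level view of one full-text index.
SELECT
    OBJECTPROPERTY(OBJECT_ID('Person.Contact'), 'TableHasActiveFulltextIndex') AS HasActiveIndex,
    OBJECTPROPERTY(OBJECT_ID('Person.Contact'), 'TableFulltextItemCount')      AS RowsIndexed,
    OBJECTPROPERTY(OBJECT_ID('Person.Contact'), 'TableFulltextFailCount')      AS RowsFailed,
    OBJECTPROPERTY(OBJECT_ID('Person.Contact'), 'TableFulltextPendingChanges') AS RowsPending;

-- Catalog-level view, with the numeric status translated into text.
SELECT
    FULLTEXTCATALOGPROPERTY('MyCatalog', 'ItemCount') AS ItemCount,
    CASE FULLTEXTCATALOGPROPERTY('MyCatalog', 'PopulateStatus')
        WHEN 0 THEN 'idle'
        WHEN 1 THEN 'full population in progress'
        WHEN 2 THEN 'paused'
        WHEN 3 THEN 'throttled'
        WHEN 4 THEN 'recovering'
        WHEN 5 THEN 'shut down'
        WHEN 6 THEN 'incremental population in progress'
        WHEN 7 THEN 'building index'
        WHEN 8 THEN 'disk is full, paused'
        WHEN 9 THEN 'change tracking'
        ELSE 'unknown'
    END AS PopulateStatus,
    -- Assumption: the property is a count of seconds since 1990-01-01.
    DATEADD(ss,
            FULLTEXTCATALOGPROPERTY('MyCatalog', 'PopulateCompletionAge'),
            CAST('19900101' AS datetime)) AS LastPopulationCompleted;
```

A NULL in any column usually means the object or catalog name was not found, so it is worth checking the names before reading anything into the numbers.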
Using the Full-Text Indexing Wizard to Build Full-Text Indexes and
Catalogs
Although the T-SQL full-text commands provide a scriptable interface for creating full-text
catalogs and indexes, sometimes it is easier to use the Full-Text Indexing Wizard to create
them. To create a full-text index, follow these steps:
1. Connect to SQL Server in SQL Server Management Studio.
2. Expand the Databases folder.
3. Expand the database that contains the tables you want to full-text index.
4. Expand the Tables folder.
5. Right-click the table you want to full-text index (in this example, the
Production.Document table).
6. Select Full-Text Index, as shown in Figure 50.1.
FIGURE 50.1 Selecting the Full-Text Index menu in SSMS
You then click Define Full-Text Index to launch the Full-Text Indexing Wizard. On the
Welcome to the SQL Server Full-Text Indexing Wizard splash screen, you click Next to
bring up the Select an Index dialog, as shown in Figure 50.2. In the Unique Index
drop-down box, you select the unique index you want to use for the full-text index. In this
example, the only option is the primary key, PK_Document_DocumentID.
TIP
If there are multiple unique keys to choose from, it is recommended that you choose
the smallest of the unique keys. It is also a good idea to choose a unique key on a
static column that is unlikely to be modified.
You may get the message “A unique column must be defined on this table/view.” In this
case, you have to create a unique index or primary key on the table before you can
proceed. If a unique index or primary key exists, the Next button is enabled. When you
click the Next button, the next dialog you see is the Select Table Columns dialog (see Figure
50.3). In this dialog, you select the columns you want to index and the word breaker you
want to use to index the contents of each column.
Notice that the Select Table Columns dialog displays only the columns that can be
full-text indexed. In this example, the FileName and DocumentSummary columns will be
indexed by the server default full-text language. For the Document column, you select the
language (English) by clicking the drop-down box that displays the available languages.
The document type column (in this case FileExtension) also needs to be selected. You then click
Next and proceed to choose the population type from the Select Change Tracking dialog
(see Figure 50.4).
FIGURE 50.2 The Full-Text Indexing Wizard Select an Index dialog
FIGURE 50.3 The Full-Text Index Wizard Select Table Columns dialog
FIGURE 50.4 The Full-Text Index Wizard Select Change Tracking dialog
There are three options in the Select Change Tracking dialog: Automatically (continuous
change tracking), Manually (change tracking with scheduled or manual updates), and Do
Not Track Changes. If you specify Do Not Track Changes, the Start Full Population When
Index Is Created check box is enabled. You click Next to advance to the Select a Catalog
dialog. This dialog allows you to select an existing catalog or create a new catalog with
options to set the catalog accent sensitivity and to make it the default catalog. You click
Next to set incremental table and catalog populations. You click Next to view the summary
page and finish creating your full-text indexes and catalogs. You click Close to complete
the wizard. If you are running Service Pack 1, you need to right-click your table one more
time, select Full-Text Index, and select Enable Full-Text Index to start change tracking.
You are now ready to start querying your full-text indexes.
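For comparison, the choices made in the wizard correspond roughly to the following T-SQL. This is a sketch rather than the script the wizard itself generates: the catalog name DocumentCatalog is hypothetical, 1033 is the LCID for U.S. English, and the table, column, and key names are the ones used in the walkthrough above.

```sql
-- Create a new, accent-sensitive catalog and make it the default.
-- DocumentCatalog is a hypothetical name chosen for this example.
CREATE FULLTEXT CATALOG DocumentCatalog
    WITH ACCENT_SENSITIVITY = ON
    AS DEFAULT;

-- Index three columns. FileName and DocumentSummary use the server default
-- full-text language; Document uses English and takes its document type
-- from the FileExtension column.
CREATE FULLTEXT INDEX ON Production.Document
(
    FileName,
    DocumentSummary,
    Document TYPE COLUMN FileExtension LANGUAGE 1033
)
KEY INDEX PK_Document_DocumentID
ON DocumentCatalog
WITH CHANGE_TRACKING AUTO;   -- the "Automatically" option in the wizard

-- Equivalent of the Enable Full-Text Index menu item, if the index was disabled.
ALTER FULLTEXT INDEX ON Production.Document ENABLE;
```

Keeping a script like this alongside the database makes it easier to re-create the same index on another server than repeating the wizard steps by hand.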
Full-Text Searches
Four SQL clauses allow you to conduct full-text searches on your full-text indexed tables:
CONTAINS: Specifies a strict exact match, with options to make the search flexible.
CONTAINSTABLE: Returns a ranked rowset from SQL Server FTS implementing the
CONTAINS algorithm, which must be joined against the base table.
FREETEXT: Specifies a stemmed search that returns results for all generations of the
search phrase.
FREETEXTTABLE: Returns a ranked rowset from SQL Server FTS implementing the
FREETEXT algorithm, which must be joined against the base table.
CONTAINS and CONTAINSTABLE
The CONTAINS and CONTAINSTABLE predicates have the following parameters:
Search phrase
Generation
Proximity
Weighted
Search Phrase
The search phrase is the phrase or word that you are looking for in a full-text indexed
table. If you are searching for more than one word, you have to wrap your search phrase
in double quotation marks, as in this example:
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"search phrase"') -- search all columns
In this query, you are searching all full-text indexed columns. However, you can search a
single column, a list of columns, or all columns. The following example shows how:
SELECT * FROM Person.Contact
WHERE CONTAINS(FirstName, '"search phrase"') -- searching 1 column
SELECT * FROM Person.Contact
WHERE CONTAINS((FirstName, LastName), '"search phrase"') -- searching 2 columns
You can also use Boolean operators in your search phrase, as in this example:
SELECT * FROM Person.Contact WHERE CONTAINS(*, '"Ford"
AND NOT ("Harrison" OR "Betty")')
This example searches on Ford cars, where you don’t want hits to rows that contain
references to Harrison and Ford or Betty and Ford.
CONTAINS supports Boolean AND, OR, and AND NOT but not OR NOT.
You can also use wildcards in your searches by adding the * to the end of a word in your
search phrase. A wildcard added to one word acts as a wildcard on all words in the search
phrase, so a search on Al Anon* matches Alcoholics Anonymous, Al Anon, and
Alexander Anonuevo.
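As a short sketch of the prefix form against the same Person.Contact table: in a CONTAINS search condition, a prefix term is written inside double quotation marks, with the asterisk placed before the closing quotation mark. The search words here are purely illustrative.

```sql
-- Prefix search on a single word: matches Anon, Anonymous, Anonuevo, and so on.
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"Anon*"');

-- Prefix search on a phrase: each word in the phrase is treated as a prefix.
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"Al Anon*"');

-- A prefix term combined with a Boolean operator.
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"Ford*" AND NOT "Harrison"');
```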
Generation
The term generation refers to all forms of a word, which could be the word itself, all
declensions (that is, singular or plural forms, such as book and books), conjugations of a
word (such as book, booked, booking, and books), and thesaurus replacements and
substitutions of a word. To search on all generations of a word, you use a FREETEXT search or the
formsOf predicate. The following example shows how to use the formsOf predicate to
search on declensions and conjugations of a word:
SELECT * FROM Person.Contact WHERE CONTAINS(*, 'formsOf(inflectional,book)')
Generations of a word also include its thesaurus expansions and replacements. An
expansion is the word and other synonyms of the word (for example, book and volume or
car and automobile). An expansion can also include alternate spellings, abbreviations, and
nicknames. A replacement is a word that you want replaced in a search. For example, if
you have users searching on the word sex, and you want sex interpreted as gender, you
can replace the search on the term sex with a search on the word gender. To get the
thesaurus option to work, you need to edit the thesaurus file for your language. By
default, the thesaurus files are in C:\Program Files\Microsoft SQL
Server\MSSQL.X\MSSQL\FTData, where X is the instance number. There is a thesaurus file
for each full-text supported language; it is named TSXXX.XML, where XXX is a three-letter
identifier for the language. There also is another thesaurus file called TSGlobal.XML.
Changes made to the TSGlobal thesaurus file are effective in all languages but are
overridden by the language-specific thesaurus files. To make the thesaurus file effective, you have
to remove the comment marks and then restart MSFTESQL (the Microsoft SQL Server
Full-Text Search service). Notice that the thesaurus files have an XML element called
<diacritics = true/>. Setting this element to false makes the thesaurus not sensitive
to accents; otherwise, the thesaurus file is accent sensitive.
As mentioned previously, the thesaurus file has two sections: an expansion section and a
replacement section. The expansion section looks like this:
<expansion>
<sub>Internet Explorer</sub>
<sub>IE</sub>
<sub>IE5</sub>
<sub>IE6</sub>
</expansion>
The sub nodes refer to substitutes, so a search on Internet Explorer is expanded to
additional searches on Internet Explorer, IE, IE5, and IE6.
The replacement section looks like this:
<replacement>
<pat>NT5</pat>
<pat>W2K</pat><sub>Windows 2000</sub>
</replacement>
Here, searches on the patterns NT5 or W2K are replaced by a search on Windows 2000, so
your search will never find rows containing only the words NT5 or W2K.
To use the thesaurus option, you need to use the formsOf predicate. Here is an example of
a formsOf query:
SELECT * FROM Person.Contact WHERE CONTAINS(*, 'formsof(thesaurus,ie)')
Proximity
SQL Server 2008 FTS supports the proximity predicate, which allows you to search on
tokens that are close, or near, to each other. Near is defined as within 50 words. Words
separated by more than 50 words do not show up in a CONTAINS or CONTAINSTABLE search.
With a FREETEXT or FREETEXTTABLE search, the separation distance can be up to 1,326
words. Here is an example of a proximity-based search:
SELECT * FROM Person.Contact WHERE CONTAINS(*, '"peanut butter" NEAR "jam"')
Weighted
A weighted search allows you to assign different weights to search tokens; you use the
ISABOUT predicate to do a weighted search. If you want to search on Gulf of Mexico and
Oil, and you want to place more emphasis on Gulf of Mexico than on Oil, you could
query like this:
SELECT * FROM Person.Contact
WHERE CONTAINS(*, 'isabout("Gulf of Mexico" weight(0.7), Oil weight(0.1))')
You can use multiple weighted search items in a search, but doing so decreases the
search speed.
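Weights influence how rows are ranked rather than which rows match, so their effect is easiest to see through CONTAINSTABLE, which exposes the rank. The following sketch reuses the weighted search above and assumes ContactID is the full-text key of Person.Contact, as in the CONTAINSTABLE examples later in this section.

```sql
-- Rows that mention "Gulf of Mexico" should generally rank above rows
-- that mention only Oil, because of the higher weight.
SELECT c.ContactID, c.FirstName, c.LastName, k.[RANK]
FROM Person.Contact AS c
JOIN CONTAINSTABLE(Person.Contact, *,
        'ISABOUT("Gulf of Mexico" WEIGHT(0.7), Oil WEIGHT(0.1))') AS k
    ON k.[KEY] = c.ContactID
ORDER BY k.[RANK] DESC;
```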
LANGUAGE
Sometimes you might want to conduct a search in a different language than the default
full-text language for your server. For example, say you want to conduct a
German-language search on the contents of a column. To do this, you would use the LANGUAGE
predicate like this:
SELECT * FROM Person.Contact WHERE CONTAINS(*, 'volkswagen', LANGUAGE 1031)
In this search, German language rules are applied when searching the index. In this case,
the search on Volkswagen is expanded to a search on Volkswagen, wagen, and volk. If you
are storing multilingual content in a single column, you should have a column that
indicates the language of the content stored in the column. Otherwise, your searches might
return unwanted results from content in different languages.
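The following sketch shows one way the language column could be used. The table dbo.Articles and its columns ArticleID, Body, and ContentLcid are hypothetical and exist only for this illustration; sys.fulltext_languages is the catalog view that lists the languages available for full-text operations along with their LCIDs.

```sql
-- Look up the LCID of a full-text language (1031 is German).
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Hypothetical table: dbo.Articles(ArticleID, Body, ContentLcid), where
-- ContentLcid records the language of each row's Body text.
-- Restricting on ContentLcid keeps rows in other languages out of the result.
SELECT ArticleID
FROM dbo.Articles
WHERE ContentLcid = 1031
  AND CONTAINS(Body, 'volkswagen', LANGUAGE 1031);
```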
CONTAINSTABLE
CONTAINSTABLE supports all the predicates of the CONTAINS operator but returns a result set
containing only the key and rank. It also allows you to use the TOP_n_BY_RANK parameter
to return only the first n results. Because the CONTAINSTABLE predicate returns only the
key value and rank, you have to join it against the base table (or another related table)
to get meaningful results.
Here are some examples:
SELECT * FROM Person.Contact JOIN
(SELECT [key], rank FROM CONTAINSTABLE(Person.Contact, *, 'test')) AS k
ON k.[key] = Person.Contact.ContactID
In the following example, Person.Contact is a child table of the Sales.Individual table.
Sales.Individual has a foreign key relationship to the Person.Contact table’s primary
key, ContactID. This query illustrates how you could join the CONTAINSTABLE result set
from the Person.Contact table against the Sales.Individual table (this example also
illustrates the TOP_n_BY_RANK option):
SELECT * FROM Sales.Individual AS s
JOIN (SELECT [key], rank FROM CONTAINSTABLE(Person.Contact, *, 'jon', 100)) AS k
ON k.[key] = s.ContactID ORDER BY rank DESC
In this second query, you limit the results to the top 100 rows; that is, the query returns,
at most, the 100 rows with the highest rank values.
Keep in mind that CONTAINS is faster than FREETEXT, but it is a strict character-by-character
match, unless you use some of the word-generation searches.
FREETEXT and FREETEXTTABLE
FREETEXT and FREETEXTTABLE incorporate what Microsoft considers to be the natural way
to search. For example, if you were searching on book, you would expect to get hits to
rows containing the word books (the plural). If you were searching on the word swimming,
you would expect results containing the words swimming, swim, swims, swum, and so on.
The FREETEXT and FREETEXTTABLE queries implicitly search on all generations of a word
and include a proximity-based search. However, if you wrap your search in double
quotation marks, the FREETEXT and FREETEXTTABLE predicates do not do any stemming.
FREETEXT and FREETEXTTABLE also include the TOP_n_BY_RANK parameter.
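The following queries sketch these variations against the Person.Contact table used earlier. The search word and the limit of 10 rows are illustrative, and ContactID is assumed to be the full-text key, as in the CONTAINSTABLE examples.

```sql
-- Stemmed search: also matches rows containing swim, swims, swum, and so on.
SELECT * FROM Person.Contact
WHERE FREETEXT(*, 'swimming');

-- Double quotation marks inside the search string turn stemming off,
-- so only the exact word is matched.
SELECT * FROM Person.Contact
WHERE FREETEXT(*, '"swimming"');

-- Ranked variant: FREETEXTTABLE returns only the key and rank, so it is
-- joined back to the base table. The final argument keeps the 10
-- highest-ranked matches.
SELECT c.ContactID, c.FirstName, c.LastName, k.[RANK]
FROM Person.Contact AS c
JOIN FREETEXTTABLE(Person.Contact, *, 'swimming', 10) AS k
    ON k.[KEY] = c.ContactID
ORDER BY k.[RANK] DESC;
```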