Microsoft SQL Server 2008 study guide, part 55

The other solution to the multiple-column search problem is to add a column that holds all the text to be searched, and to duplicate the data from the original columns into this FullTextSearch column, either within an after trigger or by using a persisted computed column. This solution is not smooth either: it duplicates data and costs performance during inserts and updates. The crux of the decision about how to solve the multiple-column search is the conflict between fast reads and fast writes, that is, OLAP versus OLTP.
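As a rough illustration of the computed-column variant, the statement below is a minimal sketch. It assumes the Fable table lives in the dbo schema and has character columns named Title, Moral, and FableText (Moral is an assumed name; this chapter only refers to "the title and moral columns"). The new column would still have to be included in the table's full-text index before it could be searched.

```sql
-- Sketch only: dbo schema and the Moral column name are assumptions.
-- ISNULL keeps one NULL column from turning the whole string into NULL.
ALTER TABLE dbo.Fable
  ADD FullTextSearch AS
      (ISNULL(Title, '') + ' ' + ISNULL(Moral, '') + ' ' + ISNULL(FableText, ''))
      PERSISTED;
```

Because the column is persisted, the concatenation cost is paid on every insert and update, which is exactly the read-versus-write trade-off described above.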

Searches with wildcards

Because the full-text search engine has its roots in Windows Index and was not a SQL Server–developed component, its wildcards use the standard DOS conventions (asterisk for a multi-character wildcard, and double quotes) instead of SQL-style wildcards and SQL single quotes.

The other thing to keep in mind about full-text wildcards is that they work only at the end of a word, not at the beginning. Indexes search from the beginning of strings, as shown here:

SELECT Title FROM Fable
  WHERE CONTAINS (*, '"Hunt*"');

Result:

Title
--------------------------
The Hunter and the Woodman
The Ass in the Lion’s Skin
The Bald Knight

Phrase searches

Full-text search can attempt to locate full phrases if those phrases are surrounded by double quotes. For example, to search for the fable about the boy who cried wolf, searching for "Wolf! Wolf!" does the trick:

SELECT Title FROM Fable
  WHERE CONTAINS (*, '"Wolf! Wolf!"');

Result:

Title
-------------------------------
The Shepherd’s Boy and the Wolf

Word-proximity searches

When searching large documents, it’s nice to be able to specify the proximity of the search words. Full-text search implements a proximity switch by means of the NEAR option. The relative distance between the words is calculated, and, if the words are close enough (within about 30 words, depending on the size of the text), then full-text search returns true for the row.

The story of Androcles, the slave who pulls the thorn from the lion’s paw, is one of the longer fables in the sample database, so it’s a good test sample.

The following query attempts to locate the fable "Androcles" based on the proximity of the words "pardoned" and "forest" in the fable’s text:

SELECT Title
  FROM Fable
  WHERE CONTAINS (*, 'pardoned NEAR forest');

Result:

Title
---------
Androcles

The proximity switch can handle multiple words. The following query tests the proximity of the words "lion," "paw," and "bleeding":

SELECT Title
  FROM Fable
  WHERE CONTAINS (*, 'lion NEAR paw NEAR bleeding');

Result:

Title
---------
Androcles

The proximity feature can be used with CONTAINSTABLE; the RANK indicates relative proximity. The following query ranks the fables that mention the word "life" near the word "death" in order of proximity:

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN CONTAINSTABLE (Fable, *, 'life NEAR death') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY FTS.Rank DESC;

Result:

Title                        Rank
---------------------------  ----
The Serpent and the Eagle    7
The Eagle and the Arrow      1
The Woodman and the Serpent  1
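When only the closest matches matter, CONTAINSTABLE also accepts an optional top_n_by_rank argument after the search condition. The following is a small sketch against the same Fable table; the cutoff of 2 is an arbitrary value chosen for illustration, and no output is shown because it depends on the data.

```sql
-- Same proximity search, but the engine returns only the
-- two highest-ranked matches (2 is an illustrative cutoff).
SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN CONTAINSTABLE (Fable, *, 'life NEAR death', 2) AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY FTS.Rank DESC;
```

Limiting by rank inside CONTAINSTABLE is applied before the join, so it is not the same as filtering the joined result afterward.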

Word-inflection searches

The full-text search engine can perform linguistic analysis and base a search for different words on a common root word. This enables you to search for words without worrying about number or tense. For example, the inflection feature makes possible a search for the word "flying" that finds a row containing the word "flew." The language you specify for the table is critical in a case like this.

Something else to keep in mind is that the word base will not cross parts of speech, meaning that a search for a noun won’t locate a verb form of the same root. The following query demonstrates inflection by locating the fable with the word "flew" in "The Crow and the Pitcher":

SELECT Title FROM Fable
  WHERE CONTAINS (*, 'FORMSOF(INFLECTIONAL, fly)');

Result:

Title
------------------------
The Crow and the Pitcher
The Bald Knight
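Because inflection depends on the language, it can help to state the language explicitly in the query. The sketch below assumes U.S. English (LCID 1033) is the right choice for the Fable data; that is an assumption, not something this chapter specifies. The first statement lists the languages the server's full-text engine knows about.

```sql
-- List the languages available to full-text search on this server.
SELECT lcid, name
  FROM sys.fulltext_languages
  ORDER BY name;

-- Inflectional search with an explicit language term.
-- 1033 (U.S. English) is assumed here for illustration.
SELECT Title
  FROM Fable
  WHERE CONTAINS (*, 'FORMSOF(INFLECTIONAL, fly)', LANGUAGE 1033);
```

If the language term does not match the language used when the column was indexed, the stemming applied to the query and to the index can disagree, so the two should normally be kept in step.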

Thesaurus searches

The full-text search engine can perform thesaurus lookups for word replacements as well as synonyms. To configure your own thesaurus options, edit the thesaurus file. The location of the thesaurus file depends on your language and server.

The thesaurus file for your language follows the naming convention TSXXX.xml, where XXX is your language code (e.g., ENU for U.S. English, ENG for U.K. English, and so on). You need to remove the comment lines from your thesaurus file. If you open this file in a text editor, you will see two sections, or nodes: an expansion node and a replacement node. The expansion node is used to expand your search argument from one term to a set of equivalent terms. For example, in the thesaurus file, you will find the following expansion:

<expansion>
  <sub>Internet Explorer</sub>
  <sub>IE</sub>
  <sub>IE5</sub>
</expansion>

This will convert any searches on "IE" to search on "IE" or "IE5" or "Internet Explorer."

The replacement node is used to replace a search argument with another argument. For example, if you want the search argument sex interpreted as gender, you could use the replacement node to do that:

<replacement>
  <pat>sex</pat>
  <sub>gender</sub>
</replacement>

The pat element (sex) indicates the pattern you want substituted by the sub element (gender).

A FREETEXT query automatically uses the thesaurus file for the language. Here is an example of a generation-term query using the THESAURUS option:

SELECT * FROM TableName
  WHERE CONTAINS (*, 'FORMSOF(THESAURUS, "IE")');

This returns matches to rows containing IE, IE5, and Internet Explorer.
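Edits to the thesaurus file are not picked up until the file is reloaded. In SQL Server 2008 this can be done without restarting the service by calling sp_fulltext_load_thesaurus_file with the language's LCID. The sketch below assumes the U.S. English file was the one edited (LCID 1033); TableName is the same placeholder used in the query above.

```sql
-- Reload the thesaurus for LCID 1033 (U.S. English is assumed here).
EXEC sys.sp_fulltext_load_thesaurus_file 1033;

-- Re-run the thesaurus query to confirm the expansion is applied.
SELECT * FROM TableName
  WHERE CONTAINS (*, 'FORMSOF(THESAURUS, "IE")', LANGUAGE 1033);
```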

Variable-word-weight searches

In a search for multiple words, relative weight may be assigned, making one word critical to the search and another word much less important. The weights are set on a scale of 0.0 to 1.0.

The ISABOUT option enables weighting, and any hit on the given word allows the rows to be returned, so it functions as an implied Boolean OR operator.

The following two queries use the weight option with CONTAINSTABLE to highlight the differences among the words "lion," "brave," and "eagle" as the weighting changes. The query examines only the FableText column to prevent the results from being skewed by the shorter lengths of the text found in the title and moral columns. The first query weights the three words evenly:

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN CONTAINSTABLE
      (Fable, FableText,
       'ISABOUT (Lion weight (.5),
         Brave weight (.5), Eagle weight (.5))') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY Rank DESC;

Result:

Title                       Rank
--------------------------  ----
The Eagle and the Fox       85
The Hunter and the Woodman  50
The Serpent and the Eagle   50
The Eagle and the Arrow     21
The Ass in the Lion’s Skin  16

When the relative importance of the word "eagle" is elevated, it’s a different story:

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN CONTAINSTABLE
      (Fable, FableText,
       'ISABOUT (Lion weight (.2), Brave weight (.2),
         Eagle weight (.8))') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY Rank DESC;

Result:

Title                       Rank
--------------------------  ----
The Eagle and the Fox       102
The Serpent and the Eagle   59
The Eagle and the Arrow     25
The Hunter and the Woodman  14
The Ass in the Lion’s Skin  4

When all the columns participate in the full-text search, the small size of the moral and the title makes the target words seem relatively more important within the text. The next query uses the same weighting as the previous query but includes all columns (*):

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN CONTAINSTABLE
      (Fable, *,
       'ISABOUT (Lion weight (.2), Brave weight (.2),
         Eagle weight (.8))') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY Rank DESC;

Result:

Title                       Rank
--------------------------  ----
The Hunter and the Woodman  408
The Eagle and the Fox       102
The Eagle and the Arrow     80
The Serpent and the Eagle   80
The Ass in the Lion’s Skin  23

The ranking is relative, and is based on word frequency, word proximity, and the relative importance of a given word within the text. "The Wolf and the Kid" does not contain an eagle or a lion, but two factors favor bravado. First, "brave" is a rarer word than "lion" or "eagle" in both the column and the table. Second, the word "brave" appears in the moral as one of only 10 words. So even though "brave" was weighted less, it rises to the top of the list. It’s all based on word frequencies and statistics (and sometimes, I think, the phase of the moon!).

Fuzzy Searches

While the CONTAINS predicate and the CONTAINSTABLE-derived table perform exact word searches, the FREETEXT predicate expands on the CONTAINS functionality to include fuzzy, or approximate, full-text searches from free-form text.

Instead of searching for two or three words and adding the options for inflection and weighting, the fuzzy search handles the complexity of building searches that make use of all the full-text search engine options, and tries to solve the problem for you. Internally, the free-form text is broken down into multiple words and phrases, and the full-text search with inflections and weighting is then performed on the result.

Freetext

FREETEXT works within a WHERE clause just like CONTAINS, but without all the options. The following query uses a fuzzy search to find the fable about the big race:

SELECT Title
  FROM Fable
  WHERE FREETEXT
    (*, 'The tortoise beat the hare in the big race');

Result:

Title
-------------------------
The Hare and the Tortoise
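To get a feel for what FREETEXT saves you from writing, the query below is a hand-built approximation using CONTAINS. It is only a sketch of the idea (split the sentence into its significant words and match any inflected form of each); it is not the engine's actual internal rewrite, it ignores weighting and the thesaurus, and it may return a different set of rows than the FREETEXT query.

```sql
-- Hand-written stand-in for the FREETEXT query above:
-- match any inflected form of each significant word.
-- The choice of words is an assumption made for illustration.
SELECT Title
  FROM Fable
  WHERE CONTAINS
    (*, 'FORMSOF(INFLECTIONAL, tortoise)
         OR FORMSOF(INFLECTIONAL, beat)
         OR FORMSOF(INFLECTIONAL, hare)
         OR FORMSOF(INFLECTIONAL, race)');
```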

FreetextTable

Fuzzy searches benefit from the FREETEXT-derived table, which returns the ranking in the same way that CONTAINSTABLE does. The two queries shown in this section demonstrate a fuzzy full-text search using the FREETEXT-derived table. Here is the first query:

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN FREETEXTTABLE
      (Fable, *, 'The brave hunter kills the lion') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY Rank DESC;

Result:

Title                            Rank
-------------------------------  ----
The Hunter and the Woodman       257
The Ass in the Lion’s Skin       202
The Dogs and the Fox             100
The Goose With the Golden Eggs   72
The Shepherd’s Boy and the Wolf  72

Here is the second query:

SELECT Fable.Title, FTS.Rank
  FROM Fable
    INNER JOIN FREETEXTTABLE
      (Fable, *, 'The eagle was shot by an arrow') AS FTS
      ON Fable.FableID = FTS.[KEY]
  ORDER BY Rank DESC;

Result:

Title                             Rank
--------------------------------  ----
The Eagle and the Arrow           288
The Eagle and the Fox             135
The Serpent and the Eagle         112
The Hunter and the Woodman        102
The Father and His Two Daughters  72

Performance

SQL Server 2008’s full-text search engine is several orders of magnitude faster than the one in previous versions of SQL Server. However, you still might want to tune your system for optimal performance:

■ iFTS benefits from a very fast disk subsystem. Place your catalog on its own controller, preferably its own RAID 10 array. A sweet spot exists for SQL iFTS on eight-way servers.

■ After a full or incremental population, force a master merge, which will consolidate all the shadow indexes into a single master index, by issuing the following command:

ALTER FULLTEXT CATALOG catalog_name REORGANIZE;

■ You can also increase the maximum number of ranges that the gathering process can use. This is an advanced server option, so show advanced options must already be enabled. To change it, issue the following commands:

EXEC sp_configure 'max full-text crawl range', 32;
RECONFIGURE;
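Putting the tuning steps together, the following is a sketch of the full sequence, including turning on advanced options and checking on the catalog afterward. The name catalog_name is the same placeholder used above, and the value 32 is simply the one from the text, not a recommendation for every server.

```sql
-- 'max full-text crawl range' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max full-text crawl range', 32;
RECONFIGURE;

-- Force a master merge, then check progress.
-- catalog_name is a placeholder for your own catalog.
ALTER FULLTEXT CATALOG catalog_name REORGANIZE;

SELECT FULLTEXTCATALOGPROPERTY('catalog_name', 'MergeStatus')    AS MergeStatus,
       FULLTEXTCATALOGPROPERTY('catalog_name', 'PopulateStatus') AS PopulateStatus;
```

A MergeStatus of 1 means a master merge is still running; 0 means none is in progress.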

Summary

SQL Server indexes are not designed for searching for words in the middle of a column. If the database project requires flexible word searches, then Integrated Full-Text Search (iFTS) is the perfect solution, even though it requires additional development and administrative work.

■ iFTS requires configuring a catalog for each table to be searched.

■ iFTS catalogs are not populated synchronously within the SQL Server transaction. They are populated asynchronously following the transaction. The recommended method is using Change Tracking, which can automatically push changes as they occur.

■ CONTAINS is used within the WHERE clause and performs simple word searches, but it can also perform inflectional, proximity, and thesaurus searches.

■ CONTAINSTABLE functions like CONTAINS, but it returns a data set that can be referenced in a FROM clause.

■ FREETEXT and FREETEXTTABLE essentially turn on every advanced feature of iFTS and perform a fuzzy word search.
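The setup summarized in the first two points can be sketched in a few statements. The catalog name FableCatalog, the unique index name PK_Fable, the dbo schema, and the Moral column name are all hypothetical, chosen only to make the example concrete for the Fable table used in this chapter.

```sql
-- Hypothetical names: FableCatalog, PK_Fable, dbo, and Moral are assumptions.
CREATE FULLTEXT CATALOG FableCatalog;

-- KEY INDEX must name a unique, single-column, non-nullable index.
-- CHANGE_TRACKING AUTO pushes changes to the full-text index
-- asynchronously as rows are modified.
CREATE FULLTEXT INDEX ON dbo.Fable (Title, Moral, FableText)
  KEY INDEX PK_Fable
  ON FableCatalog
  WITH CHANGE_TRACKING AUTO;
```

With automatic change tracking in place, new or updated fables become searchable shortly after the transaction commits, without a scheduled population.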

As you read through this "Beyond Relational" part of the book, I hope you’re getting a sense of the breadth of data SQL Server can manage. The next chapter concludes this part with Filestream, a new way to store large BLOBs with SQL Server.

Developing with SQL Server

IN THIS PART

Chapter 20: Creating the Physical Database Schema
Chapter 21: Programming with T-SQL
Chapter 22: Kill the Cursor!
Chapter 23: T-SQL Error Handling
Chapter 24: Developing Stored Procedures
Chapter 25: Building User-Defined Functions
Chapter 26: Creating DML Triggers
Chapter 27: Creating DDL Triggers
Chapter 28: Building the Data Abstraction Layer
Chapter 29

Part II of this book was all about writing set-based queries. Part III extended the select command to data types beyond relational. This part continues to expand on select to provide programmable flow of control to develop server-side solutions; and SQL Server has a large variety of technologies to choose from to develop server-side code, from the mature T-SQL language to .NET assemblies hosted within SQL Server.

This part opens with DDL commands (create, alter, and drop), and progresses through 10 chapters of Transact-SQL that build on one another into a crescendo with the data abstraction layer and dynamic SQL. The final chapter fits CLR programming into the picture.

So, unleash the programmer within and have fun. There’s a whole world of developer possibilities with SQL Server 2008.

If SQL Server is the box, then Part IV is all about thinking inside the box, and moving the processing as close to the data as possible.

Date posted: 04/07/2014, 09:20