The Language of SQL- P22 pdf

The result is:

FirstName LastName

Matching by Sound

Let’s turn from matching letters and characters to matching sounds. SQL provides two functions that give you some interesting ways to compare the sounds of words or phrases. The two functions are SOUNDEX and DIFFERENCE.

Let’s first look at an example that utilizes the SOUNDEX function:

SELECT

SOUNDEX ('Smith') AS 'Sound of Smith',

SOUNDEX ('Smythe') AS 'Sound of Smythe'

The result is:

Sound of Smith Sound of Smythe

S530 S530

The SOUNDEX function always returns a four-character response, which is a sort of code for the sound of the phrase. The first character is always the first letter of the phrase. In this case, the first character is S because both Smith and Smythe begin with an S.

The remaining three characters are calculated from an analysis of the sound of the rest of the phrase. Internally, the function first removes all vowels and the letter Y. So, the function takes the MITH from SMITH and converts it to MTH. Likewise, it takes the MYTHE from SMYTHE and converts it to MTH. It then assigns a number to represent the sound of the phrase. In this example, that number turns out to be 530: M is coded as 5, T is coded as 3, and, because H contributes no digit, a 0 pads the code to three digits.

Since SOUNDEX returns a value of S530 for both Smith and Smythe, you can conclude that they probably have very similar sounds.
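One way to convince yourself of the vowel-removal step is to feed SOUNDEX a spelling with the vowel already taken out. The statement below is only a sketch: 'Smth' is a made-up spelling used for illustration, and the column aliases are our own. By the rules just described, all three columns should come back with the same code.

```sql
-- Sketch: spellings that differ only in vowels (or the letter Y)
-- should share a single SOUNDEX code.
SELECT
    SOUNDEX('Smith')  AS SoundOfSmith,
    SOUNDEX('Smythe') AS SoundOfSmythe,
    SOUNDEX('Smth')   AS SoundOfSmth;  -- 'Smth' is an invented spelling, vowel removed
```

If the third column matches the first two, it confirms that the I in Smith and the Y and E in Smythe play no part in the result.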

Microsoft SQL Server provides one additional function, called DIFFERENCE, which works in conjunction with the SOUNDEX function.

DATABASE DIFFERENCES: MySQL and Oracle

The DIFFERENCE function isn’t available in MySQL or Oracle.
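Both MySQL and Oracle do offer SOUNDEX, so a rough substitute is to compare two SOUNDEX values directly. The following is a minimal sketch, assuming the Actors table used in this chapter, with its FirstName and LastName columns. It corresponds only to the strictest case, in which every character of the two codes is identical, and gives you no way to ask for partial matches.

```sql
-- Sketch for databases without DIFFERENCE:
-- treat two values as matching when their SOUNDEX codes are equal.
SELECT
    FirstName,
    LastName
FROM Actors
WHERE SOUNDEX(FirstName) = SOUNDEX('John');
```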

Here’s an example, using the same words:

SELECT

DIFFERENCE ('Smith', 'Smythe') AS 'The Difference'

The result is:

The Difference

4

The DIFFERENCE function always requires two arguments. Internally, the function first retrieves the SOUNDEX values for each of the arguments and then compares those values. If it returns a value of 4, as in the previous example, that means that all four characters in the SOUNDEX value are identical. A value of 0 means that none of the characters is identical. Therefore, a DIFFERENCE value of 4 indicates the highest possible match, and a value of 0 is the lowest possible match.
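To see how the two functions relate, it can help to place the SOUNDEX codes next to the DIFFERENCE value in a single row. This is just an illustrative sketch for Microsoft SQL Server, and the column aliases are our own. For Smith and Smythe, both codes are S530, so all four positions agree and the DIFFERENCE value is 4.

```sql
-- Sketch (SQL Server): show the two SOUNDEX codes that DIFFERENCE compares,
-- alongside the DIFFERENCE value itself.
SELECT
    SOUNDEX('Smith')               AS SoundexOfSmith,
    SOUNDEX('Smythe')              AS SoundexOfSmythe,
    DIFFERENCE('Smith', 'Smythe')  AS DifferenceValue;
```

You can substitute any pair of words for the two literals to see how many positions of their codes agree.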

With this in mind, here’s an example of how the DIFFERENCE function can be used to retrieve values that are very similar in sound to a specific phrase. Working from the Actors table, you’re going to attempt to find rows with a first name that sounds like John. The SELECT statement is:

SELECT

FirstName,

LastName

FROM Actors

WHERE DIFFERENCE (FirstName, 'John') = 4

The results are:

FirstName LastName

Chapter 9 ■ Inexact Matches

The DIFFERENCE function concluded that both John and Jon had a difference value of 4 between the name and the specified value of John.

If you want to analyze exactly why these two rows were selected, you can alter your SELECT to show both the SOUNDEX and DIFFERENCE values for all rows in the table:

SELECT

FirstName,

LastName,

DIFFERENCE (FirstName, 'John') AS 'Difference Value',

SOUNDEX (FirstName) AS 'Soundex Value'

FROM Actors

This returns:

FirstName LastName Difference Value Soundex Value

Notice that both Jon Voight and John Wayne have a SOUNDEX value of J500 and a DIFFERENCE value of 4 for their first names. This explains why they were initially selected. Also notice that Julie Andrews has a DIFFERENCE value of 3. If you had specified a WHERE clause where the DIFFERENCE value equaled 3 or 4, that actor would have been selected as well.
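The looser selection just described might be written as follows. This is a sketch for Microsoft SQL Server against the same Actors table. It uses IN to state the two acceptable values literally. Since 4 is the highest value DIFFERENCE can return, a condition of >= 3 would select the same rows.

```sql
-- Sketch: accept first names whose DIFFERENCE value against 'John' is 3 or 4.
SELECT
    FirstName,
    LastName
FROM Actors
WHERE DIFFERENCE(FirstName, 'John') IN (3, 4);
```

Lowering the threshold widens the net, at the cost of returning names, such as Julie, that most people would not say sound like John.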

Looking Ahead

This concludes our study of matching phrases by pattern or sound. Matching by patterns is an important and widely used function of SQL. Any time you enter a word in a search box and attempt to retrieve all entities containing that word, you are utilizing pattern matching. Efforts to match by sound are much less common. The technology exists, but there is an inherent difficulty in translating words to sounds. The English language, or any language for that matter, contains too many quirks and exceptions for such a match to be reliable.

In our next chapter, “Summarizing Data,” we’re going to turn our attention to ways to separate data into groups and summarize the values in those groups with various statistics. Back in Chapter 4, we talked about scalar functions. The next chapter will introduce another type of function, called aggregate functions. These aggregate functions will allow you to summarize your data in many useful ways. For example, you’ll be able to look at any group of orders and determine the number of orders, the total dollar amount of the orders, and the average order size. With these techniques, you’ll be able to move beyond the presentation of detailed data and begin to truly add value for your users as you deliver summarized information.
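As a small preview, the three statistics just mentioned map onto the standard aggregate functions COUNT, SUM, and AVG. The statement below is only a sketch: the Orders table and its OrderAmount column are hypothetical names chosen for illustration, and the next chapter's own examples may look different.

```sql
-- Sketch: Orders and OrderAmount are hypothetical names used for illustration only.
SELECT
    COUNT(*)          AS NumberOfOrders,    -- how many orders there are
    SUM(OrderAmount)  AS TotalAmount,       -- total dollar amount of the orders
    AVG(OrderAmount)  AS AverageOrderSize   -- average order size
FROM Orders;
```

However many rows the table holds, this statement returns a single summary row.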

Summarizing Data

Up until now, we’ve been presenting data basically as it exists in a database. Sure, we’ve used some functions to move things around and have created some additional calculations, but the rows we’ve retrieved have corresponded to rows in the underlying database. We now want to turn to various methods to summarize our data.

The computer term usually associated with this type of endeavor is aggregation, which means “to combine into groups.” The ability to aggregate and summarize your data is key to being able to move beyond a mere display of data to something approaching real information. There’s a bit of magic involved when users view summarized data in a report. They understand and appreciate that you’ve been able to extract some real meaning from the mass of data in a database, in order to present a clearer picture of what it all means.

Eliminating Duplicates

Although it doesn’t provide a true aggregation, the most elementary way to summarize data is to eliminate duplicates. SQL has a keyword named DISTINCT, which provides an easy way to remove duplicate rows from your output.
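A minimal sketch of the keyword in use, borrowing the Actors table from the previous chapter purely for illustration (the book's own example may use a different table):

```sql
-- Sketch: list each first name once, however many actors share it.
SELECT DISTINCT
    FirstName
FROM Actors;
```

DISTINCT applies to the entire row of selected columns, so adding LastName to the column list would remove only those rows in which both names repeat.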

Posted: 05/07/2014, 05:20