
Professional Information Technology-Programming Book part 112 doc


ABC01: $23.45

HGG42: $5.31

CFMX1: $899.00

XTC99: $69.96

Total items found: 4

(?<=\$)[0-9.]+

ABC01: $23.45

HGG42: $5.31

CFMX1: $899.00

XTC99: $69.96

Total items found: 4

That did the trick. (?<=\$) matches $ but does not consume it, so only the prices (without the leading $ signs) are returned.

Compare the first and last expressions used in this example. \$[0-9.]+ matched $ followed by a dollar amount. (?<=\$)[0-9.]+ also matched $ followed by a dollar amount. The difference between the two is not in what they located while performing the search; it is in what they included in the results. The former located and included the $. The latter located the $ so as to correctly find the prices, but did not include that $ in the matched results.
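The comparison above can be reproduced with a short sketch. It assumes Python's standard re module, which the original text does not name; any engine with lookbehind support should behave the same way for these two patterns.

```python
import re

# Sample text taken from the example above.
text = """ABC01: $23.45
HGG42: $5.31
CFMX1: $899.00
XTC99: $69.96
Total items found: 4"""

# The $ is part of the pattern proper, so it is consumed and returned.
with_dollar = re.findall(r"\$[0-9.]+", text)

# The $ is only asserted by the lookbehind, so it is left out of the match.
without_dollar = re.findall(r"(?<=\$)[0-9.]+", text)

print(with_dollar)     # ['$23.45', '$5.31', '$899.00', '$69.96']
print(without_dollar)  # ['23.45', '5.31', '899.00', '69.96']
```

Both patterns locate the same four positions; only the returned text differs.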

Lookahead patterns may be variable length; they may contain . and +, for example, so as to be highly dynamic.

Lookbehind patterns, on the other hand, must generally be fixed length. This is a restriction imposed by almost all regular expression implementations.
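As one illustration of this restriction, the sketch below uses Python's re module (an assumption; the text names no specific engine). The helper lookbehind_compiles and the sample text are hypothetical names introduced only for this example.

```python
import re

text = "ABC01: $23.45\nHGG42: $5.31"

# Variable-length lookahead: ".+" inside (?=...) is accepted.
codes = re.findall(r"[A-Z0-9]+(?=: \$.+)", text)
print(codes)  # ['ABC01', 'HGG42']

def lookbehind_compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A lookbehind that is exactly one character wide is accepted.
fixed_ok = lookbehind_compiles(r"(?<=\$)[0-9.]+")
# Adding "+" makes the lookbehind width variable, which re rejects.
variable_ok = lookbehind_compiles(r"(?<=\$+)[0-9.]+")

print(fixed_ok)     # True
print(variable_ok)  # False
```

Other engines draw the line differently, so it is worth checking the documentation of the implementation in use.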

Combining Lookahead and Lookbehind

Lookahead and lookbehind operations may be combined, as in the following example (the solution to the problem at the start of this lesson):

<HEAD>

<TITLE>Ben Forta's Homepage</TITLE>

</HEAD>

(?<=<[tT][iI][tT][lL][eE]>).*(?=</[tT][iI][tT][lL][eE]>)

<HEAD>

<TITLE>Ben Forta's Homepage</TITLE>

</HEAD>

That worked. (?<=<[tT][iI][tT][lL][eE]>) is a lookbehind operation that matches (but does not consume) <TITLE>; (?=</[tT][iI][tT][lL][eE]>) similarly matches (but does not consume) </TITLE>. All that is returned is the title text (as that is all that was consumed).

Tip

In the preceding example, it may be worthwhile to escape the < (the first character being matched) to prevent ambiguity, so (?<=\< instead of (?<=<.
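The title example can be tried out as follows. This is a minimal sketch assuming Python's re module; the variable names are illustrative only. It runs the pattern both as written and with the leading < escaped, as the tip suggests.

```python
import re

html = """<HEAD>
<TITLE>Ben Forta's Homepage</TITLE>
</HEAD>"""

# The lookbehind asserts the opening tag and the lookahead asserts the
# closing tag; only the text between them is consumed.
pattern = r"(?<=<[tT][iI][tT][lL][eE]>).*(?=</[tT][iI][tT][lL][eE]>)"
match = re.search(pattern, html)
title = match.group(0) if match else None
print(title)  # Ben Forta's Homepage

# The same pattern with the first < escaped.
escaped = r"(?<=\<[tT][iI][tT][lL][eE]>).*(?=</[tT][iI][tT][lL][eE]>)"
escaped_match = re.search(escaped, html)
escaped_title = escaped_match.group(0) if escaped_match else None
print(escaped_title)  # Ben Forta's Homepage
```

In Python both spellings behave identically; whether the escape is needed, or even allowed, depends on the implementation.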


Negating Lookaround

As seen thus far, lookahead and lookbehind are usually used to match text, essentially to specify the location of text to be returned (by specifying the text before or after the desired match). These are known as positive lookahead and positive lookbehind. The term positive refers to the fact that they look for a match.

A lesser-used form of lookaround is the negative lookaround. Negative lookahead looks ahead for text that does not match the specified pattern, and negative lookbehind similarly looks behind for text that does not match the specified pattern.

You might have expected to be able to use ^ to negate a lookaround, but no, the syntax is a little different. Lookaround operations are negated using ! (which replaces the =). Table 9.1 lists all the lookaround operations.

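Table 9.1 Lookaround Operations

Class     Description
(?=)      Positive lookahead
(?!)      Negative lookahead
(?<=)     Positive lookbehind
(?<!)     Negative lookbehind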
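The four operations of Table 9.1 can be compared side by side. The sketch below assumes Python's re module, and the sample text (including the percentage) is hypothetical, chosen so that each operation selects a different set of numbers.

```python
import re

# Hypothetical sample text, not taken from the lesson.
text = "cost $30, qty 100, rate 15%"

operations = {
    "(?=)":  r"\d+(?=%)",       # positive lookahead: digits followed by %
    "(?!)":  r"\b\d+\b(?!%)",   # negative lookahead: whole numbers not followed by %
    "(?<=)": r"(?<=\$)\d+",     # positive lookbehind: digits preceded by $
    "(?<!)": r"\b(?<!\$)\d+",   # negative lookbehind: numbers not preceded by $
}

results = {name: re.findall(pattern, text) for name, pattern in operations.items()}

for name, found in results.items():
    print(name, found)
# (?=) ['15']
# (?!) ['30', '100']
# (?<=) ['30']
# (?<!) ['100', '15']
```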

Tip

Generally, any regular expression implementation that supports lookahead supports both positive and negative lookahead. Similarly, implementations that support lookbehind support both positive and negative lookbehind.

To demonstrate the difference between positive and negative lookbehind, here is an example. The following block of text contains numbers, both prices and quantities. First we'll just obtain the prices:


I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order

(?<=\$)\d+

I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order

This is very similar to the example seen previously. \d+ matches numbers (one or more digits), and (?<=\$) looks behind to match (but not consume) the $ (escaped as \$). Therefore, the numbers in the two prices were matched, but not the quantities.
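The positive lookbehind above can be checked with a few lines of Python (an assumed engine; the lesson itself is engine-neutral):

```python
import re

text = """I paid $30 for 100 apples,
50 oranges, and 60 pears
I saved $5 on this order"""

# Only digits immediately preceded by $ are returned; the $ is not consumed.
prices = re.findall(r"(?<=\$)\d+", text)
print(prices)  # ['30', '5']
```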

Now we'll do the opposite, locating just the quantities but not the prices:

I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order

\b(?<!\$)\d+\b


I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order

Again, \d+ matched numbers, but this time only the quantities were matched and not the prices. The expression (?<!\$) is a negative lookbehind that matches only when what precedes the numbers is not a $. Changing the = in the lookbehind to ! changes the pattern from positive to negative.
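The negative lookbehind, with its word boundaries, can be sketched the same way, again assuming Python's re module:

```python
import re

text = """I paid $30 for 100 apples,
50 oranges, and 60 pears
I saved $5 on this order"""

# \b anchors the match to whole numbers; (?<!\$) rejects any number
# whose first digit is immediately preceded by $.
quantities = re.findall(r"\b(?<!\$)\d+\b", text)
print(quantities)  # ['100', '50', '60']
```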

You may be wondering why the pattern in the negative lookbehind example defines word boundaries (using \b). To understand why this is necessary, here is the same example without those boundaries:

I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order

(?<!\$)\d+

I paid $30 for 100 apples,

50 oranges, and 60 pears

I saved $5 on this order
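Running the boundary-free pattern shows what goes wrong. In this sketch (Python's re module assumed), each match is paired with the character that precedes it, which makes the stray match visible: the 0 in $30 is preceded by 3, not by $, so the negative lookbehind lets it through.

```python
import re

text = """I paid $30 for 100 apples,
50 oranges, and 60 pears
I saved $5 on this order"""

# Pair each match with the character immediately before it.
loose = [
    (m.group(0), text[m.start() - 1])
    for m in re.finditer(r"(?<!\$)\d+", text)
]
print(loose)  # [('0', '3'), ('100', ' '), ('50', '\n'), ('60', ' ')]
```

With \b in place, a match can start only at the beginning of a number, so the engine cannot begin at the second digit of a price.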

Posted: 07/07/2014, 03:20