static method you’ll usually use getClass(). You don’t need to use the
current class as the node identifier, but that’s the usual practice.
Once you create the node, it’s available for either loading or reading data. This example loads the node with various types of items, and then gets the
keys(). These come back as a String[], which you might not expect if you’re used to keys() in the collections library. Here, they’re converted to
a List, which is used to produce an Iterator for printing the keys and values. Notice the second argument to get(). This is the default value
that is produced if there isn’t any entry for that key value. While
iterating through a set of keys, you always know there’s an entry, so using
null as the default is safe, but normally you’ll be fetching a named key, as
with UsageCount, where the default value is zero.
This way, the first time you run the program the UsageCount will be
zero, but on subsequent invocations it will be nonzero.
When you run PreferencesDemo.java you’ll see that the UsageCount
does indeed increment every time you run the program, but where is the data stored? No local file appears after the program is run the first time. The Preferences API uses appropriate system resources to accomplish its task, and these vary depending on the OS. In Windows, the registry is used (since it’s already a hierarchy of nodes with key-value pairs). The whole point is that the information is stored for you so that you don’t have to worry about how it works from one system
to another.
There’s more to the Preferences API than shown here. Consult the JDK documentation, which is fairly understandable, for further details.
Regular expressions
To finish this chapter, we’ll look at regular expressions, which were added
in JDK 1.4 but have long been integral to standard Unix utilities like sed and awk, and to languages like Python and Perl (some would argue that they are the predominant reason for Perl’s success). Technically, these are string
manipulation tools (previously delegated to the String, StringBuffer, and StringTokenizer classes in Java), but they are typically used in
conjunction with I/O, so it’s not too far-fetched to include them here.[5]
Regular expressions are powerful and flexible text-processing tools. They allow you to specify, programmatically, complex patterns of text that can
be discovered in an input string. Once you discover these patterns, you can then react to them any way you want. Although the syntax of regular expressions can be intimidating at first, they provide a compact and dynamic language that can be employed to solve all sorts of string processing, matching and selection, editing, and verification problems in
a completely general way.
Creating regular expressions
You can begin learning regular expressions with a useful subset of the possible constructs. A complete list of constructs for building regular
expressions can be found in the JavaDocs for the Pattern class in
package java.util.regex.
Characters
B          The specific character B
\xhh       Character with hex value 0xhh
\uhhhh     The Unicode character with hex representation 0xhhhh
\n         Newline
\r         Carriage return
[5] A chapter dedicated to strings will have to wait until the 4th edition. Mike Shea contributed to this section.
The power of regular expressions begins to appear when defining
character classes. Here are some typical ways to create character classes, and some predefined classes:
Character Classes
[abc]          Any of the characters a, b, or c (same as a|b|c)
[^abc]         Any character except a, b, and c (negation)
[a-zA-Z]       Any character a through z or A through Z (range)
[abc[hij]]     Any of a, b, c, h, i, j (same as a|b|c|h|i|j) (union)
[a-z&&[hij]]   Either h, i, or j (intersection)
\s             A whitespace character (space, tab, newline, formfeed, carriage return)
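To make the character-class table concrete, here is a minimal sketch that tests one character at a time against several of the classes. The class name CharClassSketch and the helper one( ) are made up for this illustration; only Pattern.matches( ) comes from the library.

```java
import java.util.regex.Pattern;

// Illustrative sketch: CharClassSketch and one() are hypothetical names.
class CharClassSketch {
  // True if the whole input is a single character belonging to the class.
  static boolean one(String charClass, String input) {
    return Pattern.matches(charClass, input);
  }
  public static void main(String[] args) {
    System.out.println(one("[abc]", "b"));         // true
    System.out.println(one("[^abc]", "b"));        // false: negation
    System.out.println(one("[abc[hij]]", "h"));    // true: union
    System.out.println(one("[a-z&&[hij]]", "k"));  // false: intersection
    System.out.println(one("\\s", "\t"));          // true: tab is whitespace
  }
}
```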
In other languages, “\\” means “I want to insert a plain old (literal)
backslash in the regular expression. Don’t give it any special meaning.” In
Java, “\\” means “I’m inserting a regular expression backslash, so the
following character has special meaning.” For example, if you want to indicate one or more word characters, your regular expression string will
be “\\w+”. If you want to insert a literal backslash, you say “\\\\”.
However, things like newlines and tabs just use a single backslash: “\n\t”.
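The backslash rules are easier to trust once you see them run. The following sketch (the class and constant names are invented for this example) stores the three kinds of strings just described and checks what each one matches.

```java
import java.util.regex.Pattern;

// Illustrative sketch: the class and constant names are hypothetical.
class BackslashSketch {
  static final String WORD = "\\w+";       // the regex \w+ : one or more word characters
  static final String BACKSLASH = "\\\\";  // the regex \\  : one literal backslash
  static final String NL_TAB = "\n\t";     // a real newline followed by a real tab

  public static void main(String[] args) {
    System.out.println(WORD.length());                      // 3: the characters \ w +
    System.out.println(Pattern.matches(WORD, "hello42"));   // true
    System.out.println(Pattern.matches(BACKSLASH, "\\"));   // true: input is one backslash
    System.out.println(Pattern.matches(NL_TAB, "\n\t"));    // true
  }
}
```

Note that the Java compiler consumes one level of backslashes before the regular expression engine ever sees the string, which is why the literal backslash needs four of them in source code.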
What’s shown here is only a sampling; you’ll want to have the
java.util.regex.Pattern JDK documentation page bookmarked or on
your “start” menu so you can easily access all the possible regular
expression patterns.
Logical Operators
(X)    A capturing group. You can refer to the ith captured group later in the expression.
As an example, each of the following represents a valid regular expression, and all will successfully match the character sequence "Rudolph":
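The list of expressions is not reproduced in this copy of the text, so the patterns in the sketch below are illustrative assumptions rather than the original list. Each one is checked against "Rudolph" with Pattern.matches( ); the class name RudolphSketch is likewise made up.

```java
import java.util.regex.Pattern;

// Illustrative sketch: the patterns are assumed examples, not the original list.
class RudolphSketch {
  static final String[] PATTERNS = {
    "Rudolph",               // the literal text
    "[rR]udolph",            // either case for the first letter
    "[rR][aeiou][a-z]ol.*",  // character classes plus "anything after ol"
    "R.*"                    // R followed by anything
  };
  static boolean matchesRudolph(String regex) {
    return Pattern.matches(regex, "Rudolph");
  }
  public static void main(String[] args) {
    for (String regex : PATTERNS)
      System.out.println(regex + " -> " + matchesRudolph(regex));
  }
}
```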
A quantifier describes the way that a pattern absorbs input text:
• Greedy: Quantifiers are greedy unless otherwise altered. A greedy
expression finds as many possible matches for the pattern as possible. A typical cause of problems is assuming that your pattern
will only match the first possible group of characters, when it’s actually greedy and will keep going.
• Reluctant: Specified with a question mark. Matches the minimum
necessary number of characters to satisfy the pattern. Also called
lazy, minimal matching, non-greedy, or ungreedy.
• Possessive: At the time of writing, only available in Java (not in other
languages), and more advanced, so you probably won’t use it right away. As a regular expression is applied to a string, it
generates many states so that it can backtrack if the match fails. Possessive quantifiers do not keep those intermediate states, which prevents backtracking. They can be used to prevent a regular expression from running away and also to make it execute more efficiently.
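The three kinds of quantifier are easiest to compare side by side on the same input. This is a minimal sketch, assuming a made-up helper findAll( ) that collects every match; the input string and patterns are chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: QuantifierSketch and findAll() are hypothetical names.
class QuantifierSketch {
  // Collects the text of every successive match of regex in input.
  static List<String> findAll(String regex, String input) {
    List<String> found = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find())
      found.add(m.group());
    return found;
  }
  public static void main(String[] args) {
    String input = "xfooxxxxxxfoo";
    System.out.println(findAll(".*foo", input));   // greedy:     [xfooxxxxxxfoo]
    System.out.println(findAll(".*?foo", input));  // reluctant:  [xfoo, xxxxxxfoo]
    System.out.println(findAll(".*+foo", input));  // possessive: []
  }
}
```

The possessive version finds nothing because ".*+" swallows the entire input and refuses to give any of it back, so there is nothing left for "foo" to match.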
Greedy Reluctant Possessive Matches
For example, the expression
abc+
might seem like it would match the sequence ‘abc’ one or more times, and
if you apply it to the input string ‘abcabcabc’ you will in fact get three
matches. However, the expression actually says “match ‘ab’ followed by
one or more occurrences of ‘c’.” To match the entire string ‘abc’ one or more times, you must say:
(abc)+
You can easily be fooled when using regular expressions; it’s a new language, on top of Java.
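A short experiment shows the difference between the two expressions. The sketch below uses invented names (GroupingSketch, findAll( )) and simply lists the matches each expression produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: GroupingSketch and findAll() are hypothetical names.
class GroupingSketch {
  static List<String> findAll(String regex, String input) {
    List<String> found = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    while (m.find())
      found.add(m.group());
    return found;
  }
  public static void main(String[] args) {
    System.out.println(findAll("abc+", "abcabcabc"));    // [abc, abc, abc]
    System.out.println(findAll("(abc)+", "abcabcabc"));  // [abcabcabc]
    System.out.println(findAll("abc+", "abccc"));        // [abccc]: only the 'c' repeats
  }
}
```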
CharSequence
JDK1.4 defines a new interface called CharSequence, which establishes
a definition of a character sequence, abstracted from the String or
Pattern and Matcher
As a first example, the following class can be used to test regular
expressions against an input string. The first argument is the input string
to match against, followed by one or more regular expressions to be
applied to the input. Under Unix/Linux, the regular expressions must be quoted on the command line.
This program can be useful in testing regular expressions as you construct them, to see that they produce your intended matching behavior.
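//: c12:TestRegularExpression.java
// Allows you to easily try out regular expressions.
// {Args: abcabcabcdefabc "abc+" "(abc)+" "(abc){2,}" }
import java.util.regex.*;
public class TestRegularExpression {
  public static void main(String[] args) {
    if(args.length < 2) {
      System.out.println("Usage:\n" +
        "java TestRegularExpression " +
        "characterSequence regularExpression+");
      System.exit(0);
    }
    System.out.println("Input: \"" + args[0] + "\"");
    for(int i = 1; i < args.length; i++) {
      // The listing is cut off at this point; the rest of the
      // loop body is an assumed completion, not the original text:
      System.out.println(
        "Regular expression: \"" + args[i] + "\"");
      Pattern p = Pattern.compile(args[i]);
      Matcher m = p.matcher(args[0]);
      while(m.find())
        System.out.println("Match \"" + m.group() +
          "\" at positions " +
          m.start() + "-" + (m.end() - 1));
    }
  }
} ///:~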
A Pattern object represents a compiled version of a regular expression. The static
compile( ) method compiles a regular expression string into a Pattern object. As seen above, you can use the matcher( ) method and the input string to produce a Matcher object from the compiled Pattern object. Pattern also has a
static boolean matches(String regex, CharSequence input)
for quickly checking whether regex matches the entire input, and a split( ) method that produces an array of String that has been broken around matches of the regex.
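Here is a small sketch of those two conveniences. Note that Pattern.matches( ) succeeds only when the regular expression matches the whole input, not just part of it. The class and helper names (PatternStaticsSketch, wholeMatch( ), splitOn( )) are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch: class and method names are hypothetical.
class PatternStaticsSketch {
  // Pattern.matches() compiles the regex and requires the entire input to match.
  static boolean wholeMatch(String regex, String input) {
    return Pattern.matches(regex, input);
  }
  // split() breaks the input around each match of the regex.
  static List<String> splitOn(String regex, String input) {
    return Arrays.asList(Pattern.compile(regex).split(input));
  }
  public static void main(String[] args) {
    System.out.println(wholeMatch("\\d+", "12345"));      // true
    System.out.println(wholeMatch("\\d+", "12345a"));     // false: trailing 'a'
    System.out.println(splitOn(",\\s*", "a, b,c,   d"));  // [a, b, c, d]
  }
}
```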
A Matcher object is generated by calling Pattern.matcher( ) with the input string as an argument. The Matcher object is then used to access
the results, using methods to evaluate the success or failure of different types of matches:
boolean matches()
boolean lookingAt()
boolean find()
boolean find(int start)
The matches( ) method is successful if the pattern matches the entire input string, while lookingAt( ) is successful if the input string, starting
at the beginning, is a match to the pattern.
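To see how the three kinds of match differ, the following sketch runs matches( ), lookingAt( ), and find( ) on the same pattern and input. The names MatchKindsSketch and check( ) are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: MatchKindsSketch and check() are hypothetical names.
class MatchKindsSketch {
  // Returns { matches(), lookingAt(), find() } for one regex/input pair.
  static boolean[] check(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    boolean whole = m.matches();     // the entire input must match
    boolean prefix = m.lookingAt();  // the match must start at index 0
    m.reset();                       // make find() search from the beginning
    boolean anywhere = m.find();     // the match may start anywhere
    return new boolean[] { whole, prefix, anywhere };
  }
  public static void main(String[] args) {
    String[] inputs = { "Java", "Java rocks", "I like Java" };
    for (String input : inputs) {
      boolean[] r = check("Java", input);
      System.out.println("\"" + input + "\": matches=" + r[0]
        + " lookingAt=" + r[1] + " find=" + r[2]);
    }
  }
}
```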
public class FindDemo {
private static Test monitor = new Test();
public static void main(String[] args) {
}
} ///:~
The pattern “\\w+” indicates “one or more word characters,” so it will simply split the input up into words. find( ) is like an iterator, moving forward through the input string. However, the second version of find( )
can be given an integer argument that tells it the character position for the beginning of the search. This version resets the search position to the value of the argument, as you can see from the output.
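The two versions of find( ) can be sketched as follows. The input string and the names FindFromSketch and wordsFrom( ) are assumptions made for this example; it starts the search at a given index and then keeps iterating with the no-argument version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: FindFromSketch and wordsFrom() are hypothetical names.
class FindFromSketch {
  // Lists the words found when the search begins at index start.
  static List<String> wordsFrom(String input, int start) {
    Matcher m = Pattern.compile("\\w+").matcher(input);
    List<String> words = new ArrayList<>();
    if (m.find(start)) {   // resets the matcher, then searches from start
      words.add(m.group());
      while (m.find())     // continues after the previous match
        words.add(m.group());
    }
    return words;
  }
  public static void main(String[] args) {
    String input = "Evening is full";
    System.out.println(wordsFrom(input, 0));  // [Evening, is, full]
    System.out.println(wordsFrom(input, 3));  // [ning, is, full]
    System.out.println(wordsFrom(input, 8));  // [is, full]
  }
}
```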
Groups
Groups are regular expressions set off by parentheses, which can be called
up later with their group number. Group zero indicates the whole
expression match, group one is the first parenthesized group, etc. Thus in
matcher's pattern. Group zero is not included in this count.
public String group( ) returns group zero (the entire match) from the previous match operation (find( ), for example).
public String group(int i) returns the given group number during the
previous match operation. If the match was successful but the group specified failed to match any part of the input string, then null is returned.
public int start(int group) returns the start index of the group found
in the previous match operation.
public int end(int group) returns the index of the last character, plus
one, of the group found in the previous match operation.
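The group methods can be exercised together in a few lines. In this sketch the pattern, the input, and the names GroupSketch and describe( ) are all invented for illustration; the second group is optional, so it shows the null result and the -1 indexes you get for a group that did not take part in the match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: GroupSketch and describe() are hypothetical names.
class GroupSketch {
  // For every match, records each group as "number:text[start,end)".
  static List<String> describe(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    List<String> out = new ArrayList<>();
    while (m.find())
      for (int g = 0; g <= m.groupCount(); g++)  // group zero is the whole match
        out.add(g + ":" + m.group(g)
          + "[" + m.start(g) + "," + m.end(g) + ")");
    return out;
  }
  public static void main(String[] args) {
    // Group 2 is optional, so it is null in the second match.
    for (String line : describe("(\\w+)=(\\d+)?", "width=42 height="))
      System.out.println(line);
  }
}
```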
Here’s an example of regular expression groups:
//: c12:Groups.java
import java.util.regex.*;
import com.bruceeckel.simpletest.*;
public class Groups {
private static Test monitor = new Test();
static public final String poem =
"Twas brillig, and the slithy toves\n" +
"Did gyre and gimble in the wabe.\n" +
"All mimsy were the borogoves,\n" +
"And the mome raths outgrabe.\n\n" +
"Beware the Jabberwock, my son,\n" +
"The jaws that bite, the claws that catch.\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
  public static void main(String[] args) {
    Matcher m =
      Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
        .matcher(poem);
"[in the wabe.][in][the wabe.][the][wabe.]",
"[were the borogoves,]" +
"[bird, and shun][bird,][and shun][and][shun]",
"[The frumious Bandersnatch.][The]" +
"[frumious Bandersnatch.][frumious][Bandersnatch.]" });
}
} ///:~
The poem is the first part of Lewis Carroll’s “Jabberwocky,” from Through the Looking Glass. You can see that the regular expression pattern has a
number of parenthesized groups, consisting of one or more
non-whitespace characters (‘\S+’) followed by one or more whitespace characters (‘\s+’). The goal is to capture the last three words on each line; the end of a line is delimited by ‘$’. However, the normal behavior is to match ‘$’ with the end of the entire input sequence, so we must explicitly
tell the regular expression to pay attention to newlines within the input.
This is accomplished with the ‘(?m)’ pattern flag at the beginning of the
sequence (pattern flags will be shown shortly).
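The effect of ‘(?m)’ on ‘$’ is easy to check in isolation. This sketch (MultilineSketch and lastWords( ) are made-up names, and the three-line input is just an example) captures the last word before each ‘$’ with and without the flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: MultilineSketch and lastWords() are hypothetical names.
class MultilineSketch {
  // Finds the last run of non-whitespace characters before each '$'.
  static List<String> lastWords(String text, boolean multiline) {
    String regex = (multiline ? "(?m)" : "") + "(\\S+)$";
    Matcher m = Pattern.compile(regex).matcher(text);
    List<String> words = new ArrayList<>();
    while (m.find())
      words.add(m.group(1));
    return words;
  }
  public static void main(String[] args) {
    String text = "one two\nthree four\nfive six";
    System.out.println(lastWords(text, true));   // [two, four, six]
    System.out.println(lastWords(text, false));  // [six]
  }
}
```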
start() and end()
Following a successful matching operation, start( ) returns the start index of the previous match, and end( ) returns the index of the last character matched, plus one. Invoking either start( ) or end( ) following
an unsuccessful matching operation (or prior to a matching operation
being attempted) produces an IllegalStateException. The following program also demonstrates matches( ) and lookingAt( ):
//: c12:StartEnd.java
import java.util.regex.*;
import com.bruceeckel.simpletest.*;
public class StartEnd {
private static Test monitor = new Test();
public static void main(String[] args) {
String[] input = new String[] {
"Java has regular expressions in 1.4",
"regular expressions now expressing in Java",
"Java represses oracular expressions"
m1 = p1.matcher(input[i]),
m2 = p2.matcher(input[i]);
      while(m1.find())
        System.out.println("m1.find() '" + m1.group() +
          "' start = "+ m1.start() + " end = " + m1.end());
      while(m2.find())
        System.out.println("m2.find() '" + m2.group() +
          "' start = "+ m2.start() + " end = " + m2.end());
      if(m1.lookingAt()) // No reset() necessary
        System.out.println("m1.lookingAt() start = "
          + m1.start() + " end = " + m1.end());
      if(m2.lookingAt())
        System.out.println("m2.lookingAt() start = "
          + m2.start() + " end = " + m2.end());
      if(m1.matches()) // No reset() necessary
        System.out.println("m1.matches() start = "
          + m1.start() + " end = " + m1.end());
      "m1.find() 'ressions' start = 20 end = 28",
      "m2.find() 'Java has regular expressions in 1.4'" +
      " start = 0 end = 35",
      "m2.lookingAt() start = 0 end = 35",
      "m2.matches() start = 0 end = 35",
      "input 1: regular expressions now " +
      "expressing in Java",
      "m1.find() 'regular' start = 0 end = 7",
      "m1.find() 'ressions' start = 11 end = 19",
      "m1.find() 'ressing' start = 27 end = 34",
      "m2.find() 'Java' start = 38 end = 42",
      "m1.lookingAt() start = 0 end = 7",
      "input 2: Java represses oracular expressions",
      "m1.find() 'represses' start = 5 end = 14",
      "m1.find() 'ressions' start = 27 end = 35",
      "m2.find() 'Java represses oracular expressions' " +
      "start = 0 end = 35",
      "m2.lookingAt() start = 0 end = 35",
      "m2.matches() start = 0 end = 35"
expression starts matching at the very beginning of the input. While
matches( ) only succeeds if the entire input matches the regular
expression, lookingAt( ) [6] succeeds if only the first part of the input matches.
[6] I have no idea how they came up with this method name, or what it’s supposed to refer
to. But it’s reassuring to know that whoever comes up with nonintuitive method names is still employed at Sun, and that their apparent policy of not reviewing code designs is still
in place. Sorry for the sarcasm, but this kind of thing gets tiresome after a few years.
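The IllegalStateException behavior of start( ) and end( ) is worth seeing once. The sketch below wraps the two calls in a made-up helper, span( ), and calls it before any match, after a successful find( ), and after a failed one. The pattern and input are illustrative choices.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: StartEndSketch and span() are hypothetical names.
class StartEndSketch {
  // Reports "start-end" for the previous match, or "no match"
  // if start() throws because there is no current match.
  static String span(Matcher m) {
    try {
      return m.start() + "-" + m.end();
    } catch (IllegalStateException e) {
      return "no match";
    }
  }
  public static void main(String[] args) {
    Matcher m = Pattern.compile("re\\w*").matcher("Java represses");
    System.out.println(span(m));  // no match: nothing attempted yet
    m.find();
    System.out.println(span(m));  // 5-14: 'represses'
    m.find();                     // fails: no further match
    System.out.println(span(m));  // no match
  }
}
```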
Pattern flags
An alternative compile( ) method accepts flags that affect the behavior
of regular expression matching:
Pattern Pattern.compile(String regex, int flag)
where flag is drawn from among the following Pattern class constants:
Pattern.CANON_EQ
  Two characters will be considered to match if, and only if, their full canonical decompositions match. The expression “a\u030A”, for example, will match the string “\u00E5” (å) when this flag is specified. By default, matching does not take canonical equivalence into account.
Pattern.CASE_INSENSITIVE (?i)
  This flag allows your pattern to match without regard to case (upper or lower). By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.
Pattern.COMMENTS (?x)
  In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of a line. Comments mode can also be enabled via the embedded flag expression.
Pattern.DOTALL (?s)
  In dotall mode, the expression ‘.’ matches any character, including a line terminator. By default, the ‘.’ expression does not match line terminators.
Pattern.MULTILINE (?m)
  In multiline mode, the expressions ‘^’ and ‘$’ match the beginning and ending of a line, respectively. ‘^’ also matches the beginning of the input string, and ‘$’ also matches the end of the input string. By default, these expressions only match at the beginning and the end of the entire input string.
Pattern.UNICODE_CASE (?u)
  When this flag is specified, case-insensitive matching, when enabled by the CASE_INSENSITIVE flag, is done in a manner consistent with the Unicode Standard. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched.
Pattern.UNIX_LINES (?d)
  In this mode, only the ‘\n’ line terminator is recognized in the behavior of ‘.’, ‘^’, and ‘$’.
Particularly useful among these flags are
Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, and
Pattern.COMMENTS (which is helpful for clarity and/or
documentation). Note that the behavior of most of the flags can also be obtained by inserting the parenthesized characters, shown in the table with the flags, into your regular expression, preceding the place where you want the mode to take effect.
You can combine the effect of these and other flags through an "OR" (‘|’)
operation:
//: c12:ReFlags.java
import java.util.regex.*;
import com.bruceeckel.simpletest.*;
public class ReFlags {
private static Test monitor = new Test();
public static void main(String[] args) {
Pattern p = Pattern.compile("^java",
Pattern.CASE_INSENSITIVE|Pattern.MULTILINE);
Matcher m = p.matcher(
"java has regex\nJava has regex\n" +
"JAVA has pretty good regular expressions\n" +
"Regular expressions are in Java");
This creates a pattern that will match lines starting with "java",
"Java", "JAVA", etc., and attempt a match for each line within a multiline set (matches starting at the beginning of the character sequence and following each line terminator within the character sequence). Note that
the group( ) method only produces the matched portion.
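As a quick check that the flag constants and the embedded flag expressions are interchangeable, here is a sketch that runs the same search both ways. FlagSketch and lineStarts( ) are names invented for this example, and the input is shortened from the one above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: FlagSketch and lineStarts() are hypothetical names.
class FlagSketch {
  // Collects every match of regex, compiled with the given flags.
  static List<String> lineStarts(String regex, int flags, String input) {
    Matcher m = Pattern.compile(regex, flags).matcher(input);
    List<String> found = new ArrayList<>();
    while (m.find())
      found.add(m.group());
    return found;
  }
  public static void main(String[] args) {
    String input = "java one\nJava two\nJAVA three\nnot Java";
    // Flags passed as constants combined with '|':
    System.out.println(lineStarts("^java",
      Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, input)); // [java, Java, JAVA]
    // The same flags embedded in the expression:
    System.out.println(lineStarts("(?im)^java", 0, input));  // [java, Java, JAVA]
    // No flags at all:
    System.out.println(lineStarts("^java", 0, input));       // [java]
  }
}
```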
split()
Splitting divides an input string into an array of String objects, delimited
by the regular expression.
String[] split(CharSequence charseq)
String[] split(CharSequence charseq, int limit)
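The second form limits the number of pieces produced. A minimal sketch, assuming an input string chosen for illustration and invented names (SplitLimitSketch, pieces( )):

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch: SplitLimitSketch and pieces() are hypothetical names.
class SplitLimitSketch {
  // A limit of 0 means "no limit", the same as the one-argument split().
  static List<String> pieces(String regex, String input, int limit) {
    return Arrays.asList(Pattern.compile(regex).split(input, limit));
  }
  public static void main(String[] args) {
    String input = "This!!unusual use!!of exclamation!!points";
    System.out.println(pieces("!!", input, 0));
    // [This, unusual use, of exclamation, points]
    System.out.println(pieces("!!", input, 3));
    // [This, unusual use, of exclamation!!points]
  }
}
```

With a limit of 3 the pattern is applied at most twice, so everything after the second delimiter stays in the last element.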
This is a quick and handy way of breaking up input text over a common boundary:
//: c12:SplitDemo.java
import java.util.regex.*;
import com.bruceeckel.simpletest.*;
import java.util.*;
public class SplitDemo {
private static Test monitor = new Test();
  public static void main(String[] args) {
"[This, unusual use, of exclamation, points]",
"[This, unusual use, of exclamation!!points]",
"[Aha!, String, has, a, split(), built, in!]"
respectively. This is a very important method, because it allows you to call
methods and perform other processing in order to produce the replacement (replaceFirst( ) and replaceAll( ) are only able to put in fixed strings).
With this method, you can programmatically pick apart the groups and create powerful replacements.
appendTail(StringBuffer sbuf) is invoked after one or more invocations of the appendReplacement( ) method in
order to copy the remainder of the input string.
Here’s an example that shows the use of all the replace operations. In addition, the block of commented text at the beginning is extracted and processed with regular expressions, for use as input in the rest of the example:
/*! Here's a block of text to use as input to
the regular expression matcher. Note that we'll
first extract the block of text by looking for
the special delimiters, then process the
extracted block. !*/
public class TheReplacements {
  private static Test monitor = new Test();
  public static void main(String[] args) throws Exception {
    String s = TextFile.read("TheReplacements.java");
    // Match the specially-commented block of text above:
    Matcher mInput =
      Pattern.compile("/\\*!(.*)!\\*/", Pattern.DOTALL)
        .matcher(s);
    if(mInput.find())
      s = mInput.group(1); // Captured by parentheses
    // Replace two or more spaces with a single space:
    // Process the find information as you
    // perform the replacements:
    while(m.find())
      m.appendReplacement(sbuf, m.group().toUpperCase());
    // Put in the remainder of the text:
    m.appendTail(sbuf);
    System.out.println(sbuf);
    monitor.expect(new String[]{
      "Here's a block of text to use as input to",
      "the regular expression matcher. Note that we'll",
      "first extract the block of text by looking for",
      "the special delimiters, then process the",
      "extracted block. ",
      "H(VOWEL1)rE's A blOck Of tExt tO UsE As InpUt tO",
      "thE rEgUlAr ExprEssIOn mAtchEr. NOtE thAt wE'll",
      "fIrst ExtrAct thE blOck Of tExt by lOOkIng fOr",
      "thE spEcIAl dElImItErs, thEn prOcEss thE",
      "ExtrActEd blOck. "
    });
  }
} ///:~
The file is opened and read using the TextFile.read( ) method
introduced earlier in this chapter. mInput is created to match all the text (notice the grouping parentheses) between ‘/*!’ and ‘!*/’. Then, two or more
spaces are reduced to a single space, and any space at the beginning
of each line is removed (in order to do this on all lines and not just the beginning of the input, multiline mode must be enabled). These two replacements are performed with the equivalent (but more convenient, in
this case) replaceAll( ) that’s part of String. Note that since each
replacement is only used once in the program, there’s no extra cost to
doing it this way rather than precompiling it as a Pattern.
replaceFirst( ) only performs the first replacement that it finds. In addition, the replacement strings in replaceFirst( ) and replaceAll( )
are fixed text (apart from “$g” group references), so if you want to perform some processing on each
replacement they don’t help. In that case, you need to use
appendReplacement( ), which allows you to write any amount of code
in the process of performing the replacement. In the above example, a
group( ) is selected and processed (in this example, setting the vowel found by the regular expression to upper case) as the resulting sbuf is
being built. Normally, you would step through and perform all the
replacements and then call appendTail( ), but if you wanted to simulate replaceFirst( ) (or “replace n”) you would just do the replacement one time and then call appendTail( ) to put the rest into sbuf.
appendReplacement( ) also allows you to refer to captured groups
directly in the replacement string by saying “$g”, where ‘g’ is the group number. However, this is for simpler processing and wouldn’t give you the desired results in the above program.
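The replace operations just described can be condensed into a few small methods. This is a sketch under stated assumptions: the class and method names are invented, and the inputs are arbitrary. It shows a computed replacement for every match, a simulated replaceFirst( ) that stops after one replacement, and a “$g” group reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: ReplaceSketch and its methods are hypothetical names.
class ReplaceSketch {
  // Computed replacement for every match: upper-case each vowel.
  static String upperVowels(String s) {
    Matcher m = Pattern.compile("[aeiou]").matcher(s);
    StringBuffer sbuf = new StringBuffer();
    while (m.find())
      m.appendReplacement(sbuf, m.group().toUpperCase());
    m.appendTail(sbuf);  // copy whatever follows the last match
    return sbuf.toString();
  }
  // Simulated replaceFirst(): replace once, then append the tail.
  static String upperFirstVowel(String s) {
    Matcher m = Pattern.compile("[aeiou]").matcher(s);
    StringBuffer sbuf = new StringBuffer();
    if (m.find())
      m.appendReplacement(sbuf, m.group().toUpperCase());
    m.appendTail(sbuf);
    return sbuf.toString();
  }
  // Group references: $1 and $2 name the captured groups.
  static String swapPair(String s) {
    return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
  }
  public static void main(String[] args) {
    System.out.println(upperVowels("the regular expression"));  // thE rEgUlAr ExprEssIOn
    System.out.println(upperFirstVowel("hello world"));         // hEllo world
    System.out.println(swapPair("key=value"));                  // value=key
  }
}
```

Because ‘$’ and ‘\’ have special meaning in a replacement string, a computed replacement that might contain them would need to be escaped first; the upper-cased vowels here never do.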
public class Resetting {
  private static Test monitor = new Test();
  public static void main(String[] args) throws Exception {
    Matcher m = Pattern.compile("[frb][aiu][gx]")
      .matcher("fix the rug with bags");
reset( ) without any arguments sets the Matcher to the beginning of the
current sequence.
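Both forms of reset( ) can be tried with the same pattern. In this sketch the second input string and the names ResetSketch and drain( ) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: ResetSketch and drain() are hypothetical names.
class ResetSketch {
  // Consumes all remaining matches from the matcher's current position.
  static List<String> drain(Matcher m) {
    List<String> found = new ArrayList<>();
    while (m.find())
      found.add(m.group());
    return found;
  }
  public static void main(String[] args) {
    Matcher m = Pattern.compile("[frb][aiu][gx]")
      .matcher("fix the rug with bags");
    System.out.println(drain(m));  // [fix, rug, bag]
    m.reset();                     // back to the start of the same input
    System.out.println(drain(m));  // [fix, rug, bag]
    m.reset("fix the rig with rags");  // same pattern, new input
    System.out.println(drain(m));  // [fix, rig, rag]
  }
}
```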
Regular expressions and Java I/O
Most of the examples so far have shown regular expressions applied to static strings. The following example shows one way to apply regular
expressions to search for matches in a file. Inspired by Unix’s grep,
JGrep.java takes two arguments: a filename and the regular expression
that you want to match. The output shows each line where a match occurs and the match position(s) within the line.
public class JGrep {
  public static void main(String[] args) throws Exception {
    // Iterate through the lines of the input file:
    ListIterator it = new TextFile(args[0]).listIterator();
    while(it.hasNext()) {
ArrayList, from that array a ListIterator is produced. The result is an
iterator that will allow you to move through the lines of the file (forward and backward).
Each input line is used to produce a Matcher, and the result is scanned with find( ). Note that ListIterator.nextIndex( ) keeps track of the
line numbers.
The test arguments open the JGrep.java file to read as input, and search for words starting with [Ssct].
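The same idea can be sketched without the TextFile utility. The version below is not the JGrep listing; it is a stand-in that works on a list of lines already in memory (GrepSketch, grep( ), and the sample lines are all assumptions), but it uses the same ListIterator.nextIndex( ) trick for line numbers and the same find( ) loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: GrepSketch and grep() are hypothetical names.
class GrepSketch {
  // Returns "lineNumber: match: position" for every match in every line.
  static List<String> grep(List<String> lines, String regex) {
    Pattern p = Pattern.compile(regex);
    List<String> hits = new ArrayList<>();
    ListIterator<String> it = lines.listIterator();
    while (it.hasNext()) {
      int lineNumber = it.nextIndex();  // index of the line about to be read
      Matcher m = p.matcher(it.next());
      while (m.find())
        hits.add(lineNumber + ": " + m.group() + ": " + m.start());
    }
    return hits;
  }
  public static void main(String[] args) {
    List<String> lines =
      Arrays.asList("Static code", "plain line", "the end");
    // Words starting with S, s, c, or t:
    for (String hit : grep(lines, "\\b[Ssct]\\w+"))
      System.out.println(hit);
  }
}
```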
Is StringTokenizer needed?
The new capabilities provided with regular expressions might prompt you
to wonder whether the original StringTokenizer class is still necessary.
Before JDK 1.4, the way to split a string into parts was to “tokenize” it
with StringTokenizer. But now it’s much easier and more succinct to do
the same thing with regular expressions:
//: c12:ReplacingStringTokenizer.java
import java.util.regex.*;
import com.bruceeckel.simpletest.*;
import java.util.*;
public class ReplacingStringTokenizer {
  private static Test monitor = new Test();
  public static void main(String[] args) {
    String input = "But I'm not dead yet! I feel happy!";
    StringTokenizer stoke = new StringTokenizer(input);
    while(stoke.hasMoreElements())
      System.out.println(stoke.nextToken());
    System.out.println(Arrays.asList(input.split(" ")));
    monitor.expect(new String[] {
With regular expressions you can also split a string into parts using more complex patterns, something that’s much more difficult with
StringTokenizer. It seems safe to say that regular expressions replace
any tokenizing classes in earlier versions of Java.
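As one example of a “more complex pattern,” the sketch below splits on any run of whitespace or punctuation, so the punctuation does not end up stuck to the words the way it does with a default StringTokenizer. The delimiter set and the names TokenSketch, byTokenizer( ), and byRegex( ) are choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative sketch: TokenSketch and its methods are hypothetical names.
class TokenSketch {
  // Default StringTokenizer: splits on whitespace only.
  static List<String> byTokenizer(String input) {
    List<String> tokens = new ArrayList<>();
    StringTokenizer stoke = new StringTokenizer(input);
    while (stoke.hasMoreTokens())
      tokens.add(stoke.nextToken());
    return tokens;
  }
  // Regular expression: splits on runs of whitespace or punctuation.
  static List<String> byRegex(String input) {
    return Arrays.asList(input.split("[\\s,;.!?]+"));
  }
  public static void main(String[] args) {
    String input = "But I'm not dead yet! I feel happy!";
    System.out.println(byTokenizer(input));
    // [But, I'm, not, dead, yet!, I, feel, happy!]
    System.out.println(byRegex(input));
    // [But, I'm, not, dead, yet, I, feel, happy]
  }
}
```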
You can learn much more about regular expressions in Mastering
Regular Expressions, 2nd Edition, by Jeffrey E. F. Friedl (O’Reilly, 2002).
the kinds of objects a stream will accept by redefining the toString( )
method that’s automatically called when you pass an object to a method
that’s expecting a String (Java’s limited “automatic type conversion”).
There are questions left unanswered by the documentation and design of the I/O stream library. For example, it would have been nice if you could say that you want an exception thrown if you try to overwrite a file when opening it for output; some programming systems allow you to specify that you want to open an output file, but only if it doesn’t already exist. In
Java, it appears that you are supposed to use a File object to determine whether a file exists, because if you open it as a FileOutputStream or FileWriter it will get overwritten (unless you open it in append mode).
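One way to get “open for output only if the file doesn’t exist” with a File object is File.createNewFile( ), which creates the file atomically and returns false if it is already there. The sketch below is only one possible approach, written for a current JDK; the names NoOverwriteSketch and writeIfAbsent( ) and the temporary file name are made up for the example.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative sketch: NoOverwriteSketch and writeIfAbsent() are hypothetical names.
class NoOverwriteSketch {
  // Writes text only if the file did not exist; returns false otherwise.
  static boolean writeIfAbsent(File file, String text) {
    try {
      if (!file.createNewFile())  // false means the file already exists
        return false;
      try (FileWriter out = new FileWriter(file)) {
        out.write(text);
      }
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
  public static void main(String[] args) {
    File f = new File(System.getProperty("java.io.tmpdir"),
      "no-overwrite-" + System.nanoTime() + ".txt");
    System.out.println(writeIfAbsent(f, "first"));   // true: file was created
    System.out.println(writeIfAbsent(f, "second"));  // false: file left untouched
    f.delete();
  }
}
```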
The I/O stream library brings up mixed feelings; it does much of the job and it’s portable. But if you don’t already understand the decorator
pattern, the design is nonintuitive, so there’s extra overhead in learning and teaching it. It’s also incomplete: for example, I shouldn’t have to write
utilities like TextFile, and there’s no support for the kind of output
formatting that virtually every other language’s I/O package supports.
However, once you do understand the decorator pattern and begin using
the library in situations that require its flexibility, you can begin to benefit
from this design, at which point its cost in extra lines of code may not bother you as much.
If you do not find what you’re looking for in this chapter (which has only been an introduction, and is not meant to be comprehensive), you can
find in-depth coverage in Java I/O, by Elliotte Rusty Harold (O’Reilly,
1999).
Exercises
Solutions to selected exercises can be found in the electronic document The Thinking in Java
Annotated Solution Guide, available for a small fee from www.BruceEckel.com.
1. Open a text file so that you can read the file one line at a time.
Read each line as a String and place that String object into a LinkedList. Print all of the lines in the LinkedList in reverse
order.
2. Modify Exercise 1 so that the name of the file you read is provided
as a command-line argument.
3. Modify Exercise 2 to also open a text file so you can write text into
it. Write the lines in the LinkedList, along with line numbers (do
not attempt to use the “LineNumber” classes), out to the file.
4. Modify Exercise 2 to force all the lines in the LinkedList to upper
case and send the results to System.out.
5. Modify Exercise 2 to take additional command-line arguments of
words to find in the file. Print all lines in which any of the words match.
6. Modify DirList.java so that the FilenameFilter actually opens
each file and accepts the file based on whether any of the trailing arguments on the command line exist in that file.
7. Modify DirList.java to produce all the file names in the current
directory and subdirectories that satisfy the given regular
expression. Hint: use recursion to traverse the subdirectories.
8. Create a class called SortedDirList with a constructor that takes
file path information and builds a sorted directory list from the
files at that path. Create two overloaded list( ) methods that will
either produce the whole list or a subset of the list based on an
argument. Add a size( ) method that takes a file name and
produces the size of that file.
9. Modify WordCount.java so that it produces an alphabetic sort
instead, using the tool from Chapter 11.
10. Modify WordCount.java so that it uses a class containing a
String and a count value to store each different word, and a Set
of these objects to maintain the list of words.
11. Modify IOStreamDemo.java so that it uses
LineNumberReader to keep track of the line count. Note that
it’s much easier to just keep track programmatically.
12. Starting with section 4 of IOStreamDemo.java, write a program
that compares the performance of writing to a file when using buffered and unbuffered I/O.
13. Modify section 5 of IOStreamDemo.java to eliminate the spaces
in the line produced by the first call to in5.readUTF( ).
14. Repair the program CADState.java as described in the text.
15. In Blips.java, copy the file and rename it to BlipCheck.java
and rename the class Blip2 to BlipCheck (making it public and removing the public scope from the class Blips in the process). Remove the //! marks in the file and execute the program
including the offending lines. Next, comment out the default
constructor for BlipCheck. Run it and explain why it works. Note that after compiling, you must execute the program with “java Blips” because the main( ) method is still in class Blips.
16. In Blip3.java, comment out the two lines after the phrases “You
must do this:” and run the program. Explain the result and why it differs from when the two lines are in the program.
17. (Intermediate) In Chapter 8, locate the
GreenhouseController.java example, which consists of four files. GreenhouseController contains a hard-coded set of
events. Change the program so that it reads the events and their relative times from a text file. (Challenging: Use a design patterns
factory method to build the events; see Thinking in Patterns with Java at www.BruceEckel.com.)
18. For the phrase “Java now has regular expressions” evaluate
whether the following expressions will find a match:
^Java
\Breg.*
n.w\s+h(a|i)s
s?
21. Modify JGrep.java to use Java NIO memory-mapped files.
22. Modify JGrep.java to accept a directory name or a file name as
argument (if a directory is provided, search should include all files
in the directory). Hint: you can generate a list of filenames with:
String[] filenames = new File(".").list();
13: Concurrency
Objects provide a way to divide a program into
independent sections. Often, you also need to turn a
program into separate, independently running subtasks.
Each of these independent subtasks is called a thread, and you program
as if each thread runs by itself and has the CPU to itself. Some underlying
mechanism is actually dividing up the CPU time for you, but in general,
you don’t have to think about it, which makes programming with multiple
threads a much easier task.
A process is a self-contained running program with its own address space.
A multitasking operating system is capable of running more than one
process (program) at a time, while making it look like each one is
chugging along on its own, by periodically switching the CPU from one
task to another. A thread is a single sequential flow of control within a
process. A single process can thus have multiple concurrently executing
threads.
There are many possible uses for multithreading, but in general, you’ll
have some part of your program tied to a particular event or resource, and
you don’t want that to hold up the rest of your program. So you create a
thread associated with that event or resource and let it run independently
of the main program.
Concurrent programming is like stepping into an entirely new world and
learning a new programming language, or at least a new set of language
concepts. With the appearance of thread support in most microcomputer
operating systems, extensions for threads have also been appearing in
programming languages or libraries. In all cases, thread programming:
1. Seems mysterious and requires a shift in the way you think about
programming.
2. Looks similar to thread support in other languages, so when you
understand threads, you understand a common tongue.
And although support for threads can make Java a more complicated language, this isn’t entirely the fault of Java; threads are tricky.
Understanding concurrent programming is on the same order of difficulty
as understanding polymorphism. If you apply some effort, you can fathom the basic mechanism, but it generally takes deep study and understanding
in order to develop a true grasp of the subject. The goal of this chapter is
to give you a solid foundation in the basics of concurrency, so that you can understand the concepts and write reasonable multithreaded programs.
Be aware that you can easily become overconfident, so if you are writing anything complex you will need to study dedicated books on the topic.
Motivation
One of the most compelling reasons for concurrency is to produce a responsive user interface. Consider a program that performs some CPU-intensive operation and thus ends up ignoring user input and being unresponsive. The basic problem is that the program needs to continue performing its operations, and at the same time it needs to return control
to the user interface so that the program can respond to the user. If you have a “quit” button, you don’t want to be forced to poll it in every piece of code you write in your program, and yet you want the quit button to be
responsive, as if you were checking it regularly.
A conventional method cannot continue performing its operations and at the same time return control to the rest of the program. In fact, this sounds like an impossible thing to accomplish, as if the CPU must be in two places at once, but this is precisely the illusion that concurrency provides.
Concurrency can also be used to optimize throughput. For example, you might be able to do important work while you’re stuck waiting for input to arrive on an I/O port. Without threading, the only reasonable solution is polling the I/O port, which is awkward and can be difficult.
If you have a multiprocessor machine, multiple threads may be
distributed across multiple processors, which can dramatically improve
throughput. This is often the case with powerful multiprocessor web servers, which can distribute large numbers of user requests across CPUs
in a program that allocates one thread per request.
One thing to keep in mind is that a program with many threads must be able to run on a single-CPU machine. Therefore, it must also be possible
to write the same program without using any threads. However,
multithreading provides a very important organizational benefit, so that the design of your program can be greatly simplified. Some types of problems, such as simulation (a video game, for example), are very difficult to solve without support for concurrency.
The threading model is a programming convenience to simplify juggling several operations at the same time within a single program. With
threads, the CPU will pop around and give each thread some of its time. Each thread has the consciousness of constantly having the CPU to itself, but the CPU’s time is actually sliced between all the threads. The
exception to this is if your program is running on multiple CPUs, but one
of the great things about threading is that you are abstracted away from this layer, so your code does not need to know whether it is actually running on a single CPU or many. Thus, threads are a way to create transparently scalable programs: if a program is running too slowly, it can easily be made faster by adding CPUs to your computer. Multitasking and multithreading tend to be the most reasonable ways to utilize
multiprocessor systems.
Threading can reduce computing efficiency somewhat in single-CPU machines, but the net improvement in program design, resource
balancing, and user convenience is often quite valuable. In general, by being able to use threads you’re able to create a more loosely-coupled design, since otherwise parts of your code would be forced to explicitly pay attention to other tasks which would normally be handled by threads.
Basic threads
The simplest way to create a thread is to inherit from java.lang.Thread,
which has all the wiring necessary to create and run threads. The most
important method for Thread is run( ), which you must override to
make the thread do your bidding. Thus, run( ) is the code that will be
executed “simultaneously” with the other threads in a program.
The following example creates five threads, each with a unique
identification number generated with a static variable. The Thread’s run( ) method is overridden to count down each time it passes through its loop and to return when the count is zero (at the point when run( )
returns, the thread is terminated by the threading mechanism).
//: c13:SimpleThread.java
// Very simple Threading example
import com.bruceeckel.simpletest.*;
public class SimpleThread extends Thread {
private static Test monitor = new Test();
private int countDown = 5;
private static int threadCount = 0;
public SimpleThread() {
super("" + ++threadCount); // Store the thread name
start();
}
public String toString() {
return "#" + getName() + ": " + countDown;
The thread objects are given specific names by calling the appropriate
Thread constructor. This name is retrieved in toString( ) using
getName( ).
A Thread object’s run( ) method virtually always has some kind of loop
that continues until the thread is no longer necessary, so you must
establish the condition on which to break out of this loop (or, in the case
above, simply return from run( )). Often, run( ) is cast in the form of
an infinite loop, which means that, barring some factor that causes run( )
to terminate, it will continue forever (later in the chapter you’ll see how to safely signal a thread to stop).
In main( ) you can see a number of threads being created and run. The start( ) method in the Thread class performs special initialization for the thread and then calls run( ). So the steps are: the constructor is called
to build the object, it calls start( ) to configure the thread, and the thread execution mechanism calls run( ). If you don’t call start( ) (which you
don’t have to do in the constructor, as you will see in subsequent
examples) the thread will never be started.
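As a minimal, self-contained sketch of these steps (not the book's listing: the class name CountdownThread and the helper runAll( ) are made up for illustration, and the simpletest framework is left out), the following starts each thread outside the constructor. It uses join( ), covered later in the chapter, only so that the caller can wait for the threads to finish:

```java
// Sketch only: CountdownThread and runAll() are illustrative names.
public class CountdownThread extends Thread {
  private static int threadCount = 0;
  private int countDown = 5;
  public CountdownThread() {
    super("" + ++threadCount); // Store the thread name
  }
  public int remaining() { return countDown; }
  public void run() {
    while(true) {
      System.out.println("#" + getName() + ": " + countDown);
      if(--countDown == 0) return; // Returning from run() ends the thread
    }
  }
  // Creates and starts n threads, waits for all of them, and returns them.
  public static CountdownThread[] runAll(int n) {
    CountdownThread[] threads = new CountdownThread[n];
    for(int i = 0; i < n; i++) {
      threads[i] = new CountdownThread();
      threads[i].start(); // Without this call, run() is never executed
    }
    for(int i = 0; i < n; i++) {
      try {
        threads[i].join(); // Wait for the thread to die
      } catch(InterruptedException e) {
        throw new RuntimeException(e);
      }
    }
    return threads;
  }
  public static void main(String[] args) {
    runAll(5);
  }
}
```

The order in which the lines from different threads are interleaved is up to the scheduler, so it can change from run to run.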
The output for one run of this program will be different from that of another, because the thread scheduling mechanism is not deterministic.
In fact, you may see dramatic differences in the output of this simple
program between one version of the JDK and the next. For example, a previous JDK didn’t time-slice very often, so thread 1 might loop to extinction first, then thread 2 would go through all of its loops, etc. This was virtually the same as calling a routine that would do all the loops at once, except that starting up all those threads is more expensive. In JDK 1.4 you get something like the above output, which indicates better time-slicing behavior by the scheduler; each thread seems to be getting regular service. Generally these kinds of JDK behavioral changes have not been mentioned by Sun, so you cannot plan on any consistent threading
behavior. The best approach is to be as conservative as possible while writing threading code.
When main( ) creates the Thread objects it isn’t capturing the
references for any of them. With an ordinary object, this would make it
fair game for garbage collection, but not with a Thread. Each Thread
“registers” itself so there is actually a reference to it someplace and the
garbage collector can’t clean it up until the thread exits its run( ) and
dies.
Yielding
If you know that you’ve accomplished what you need to in your run( )
method, you can give a hint to the thread scheduling mechanism that you’ve done enough and that some other thread might as well have the
CPU. This hint (and it is only a hint; there’s no guarantee your
implementation will listen to it) takes the form of the yield( ) method.
public class YieldingThread extends Thread {
private static Test monitor = new Test();
private int countDown = 5;
private static int threadCount = 0;
public YieldingThread() {
super("" + ++threadCount);
start();
}
public String toString() {
return "#" + getName() + ": " + countDown;
By using yield( ), the output is evened up quite a bit. But note that if the
output string is longer, you will see output that is roughly the same as it
was in SimpleThread.java (try it: change toString( ) to put out
longer and longer strings to see what happens). Since the scheduling
mechanism is preemptive, it decides to interrupt a thread and switch to
another whenever it wants, so if I/O (which is executed via the main( ) thread) takes too long it gets interrupted before run( ) has a chance to yield( ). In general, yield( ) is useful only in rare situations and you can’t
rely on it to do any serious tuning of your application.
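Here is a rough, stand-alone sketch of a thread that offers up the CPU after each pass through its loop. The names YieldSketch, yieldCount( ) and runAll( ) are invented for this illustration, and the call is written as Thread.yield( ) because recent Java compilers reject an unqualified yield( ) call:

```java
// Sketch only: YieldSketch and its helpers are illustrative names.
public class YieldSketch extends Thread {
  private int countDown = 5;
  private int yields = 0;
  public YieldSketch(String name) { super(name); }
  public int yieldCount() { return yields; }
  public void run() {
    while(countDown > 0) {
      System.out.println("#" + getName() + ": " + countDown--);
      yields++;
      Thread.yield(); // Only a hint; the scheduler may ignore it
    }
  }
  // Starts n yielding threads and waits until all have finished.
  public static YieldSketch[] runAll(int n) {
    YieldSketch[] threads = new YieldSketch[n];
    for(int i = 0; i < n; i++) {
      threads[i] = new YieldSketch("" + (i + 1));
      threads[i].start();
    }
    for(int i = 0; i < n; i++) {
      try {
        threads[i].join();
      } catch(InterruptedException e) {
        throw new RuntimeException(e);
      }
    }
    return threads;
  }
  public static void main(String[] args) {
    runAll(5);
  }
}
```

Each thread always makes five passes and five yield( ) calls, but how evenly the output is interleaved still depends on the scheduler.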
Sleeping
Another way you can control the behavior of your threads is by calling
sleep( ) to cease execution for a given number of milliseconds. If you replace the call to yield( ) in the above example with a call to sleep( ),
you get the following:
//: c13:SleepingThread.java
// Calling sleep() to wait for awhile
import com.bruceeckel.simpletest.*;
public class SleepingThread extends Thread {
private static Test monitor = new Test();
private int countDown = 5;
private static int threadCount = 0;
public SleepingThread() {
super("" + ++threadCount);
start();
}
public String toString() {
return "#" + getName() + ": " + countDown;
public static void
main(String[] args) throws InterruptedException {
learn about those methods later). Usually, if you’re going to break out of a
suspended thread using interrupt( ) you will use wait( ) rather than sleep( ), so ending up inside of the catch clause is unlikely. Here, we
follow the maxim “don’t catch an exception unless you know what to do
with it” by re-throwing it as a RuntimeException.
You’ll notice that the output is deterministic: each thread counts down
before the next one starts. This is because join( ) (which you’ll learn about shortly) is used on each thread, so that main( ) waits for the thread
to complete before continuing. If you did not use join( ), you’d see that the threads tend to run in any order, which means that sleep( ) is also
not a way for you to control the order of thread execution. It just stops the execution of the thread for a while. The only guarantee that you have is that the thread will sleep at least 100 milliseconds, but it may take longer before the thread resumes execution because the thread scheduler still has to get back to it after the sleep interval expires.
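The following sketch shows the combination described here: each thread sleeps inside its loop, and the caller joins each thread before starting the next, which is what forces the ordering. SleepJoinSketch, runInOrder( ) and the shared log are names made up for this example, and the sleep interval is shortened to 10 milliseconds:

```java
// Sketch only: SleepJoinSketch and runInOrder() are illustrative names.
public class SleepJoinSketch extends Thread {
  // Shared record of the order in which threads finish
  private static final StringBuffer log = new StringBuffer();
  private int countDown = 3;
  public SleepJoinSketch(String name) { super(name); }
  public void run() {
    while(countDown-- > 0) {
      try {
        sleep(10); // Suspend this thread for roughly 10 ms
      } catch(InterruptedException e) {
        throw new RuntimeException(e);
      }
    }
    log.append(getName()); // Record completion
  }
  // Starts threads 1..n one at a time, joining each before the next.
  public static String runInOrder(int n) {
    log.setLength(0);
    for(int i = 1; i <= n; i++) {
      Thread t = new SleepJoinSketch("" + i);
      t.start();
      try {
        t.join(); // Wait for this thread before starting the next
      } catch(InterruptedException e) {
        throw new RuntimeException(e);
      }
    }
    return log.toString();
  }
  public static void main(String[] args) {
    System.out.println(runInOrder(3));
  }
}
```

The ordering comes entirely from join( ). If all the threads were started first and joined afterwards, the completion order would again be up to the scheduler.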
If you must control the order of execution of threads, your best bet is not
to use threads at all but instead to write your own cooperative routines which hand control to each other in a specified order.
Priority
The priority of a thread tells the scheduler how important this thread is.
Although the order that the CPU attends to an existing set of threads is indeterminate, if there are a number of threads blocked and waiting to be run, the scheduler will lean towards the one with the highest priority first. However, this doesn’t mean that threads with lower priority don’t get run (that is, you can’t get deadlocked because of priorities). Lower priority threads just tend to run less often.
Here’s SimpleThread.java modified so that the priority levels are demonstrated. The priorities are adjusted using Thread’s
setPriority( ) method.
//: c13:SimplePriorities.java
// Shows the use of thread priorities
import com.bruceeckel.simpletest.*;
public class SimplePriorities extends Thread {
private static Test monitor = new Test();
private int countDown = 5;
private volatile double d = 0; // No optimization
public SimplePriorities(int priority) {
setPriority(priority);
start();
}
public String toString() {
return super.toString() + ": " + countDown;
In this version, toString( ) is overridden to use Thread.toString( ),
which prints the thread name (which you can set yourself via the
constructor; here it’s automatically generated as Thread-1, Thread-2,
etc.), the priority level, and the “thread group” that the thread belongs to.
Because the threads are self-identifying, there is no threadNumber in this example. The overridden toString( ) also shows the countdown
value of the thread.
You can see that the priority level of thread #1 is at the highest level, and all the rest of the threads are at the lowest level.
Inside run( ), 100,000 repetitions of a rather expensive floating-point calculation have been added, involving double addition and division. The variable d has been made volatile to ensure that no optimization is
performed. Without this calculation, you don’t see the effect of setting the
priority levels (try it: comment out the for loop containing the double
calculations). With the calculation, you see that thread #1 is given a higher preference by the thread scheduler (at least, this was the behavior on my Windows 2000 machine). Even though printing to the console is also an expensive behavior, you won’t see the priority levels that way because console printing doesn’t get interrupted (otherwise the console display would get garbled during threading), whereas the math calculation above can be interrupted. The calculation takes long enough that the thread scheduling mechanism jumps in and changes threads, and pays attention
to the priorities so that thread 1 gets preference.
You can also read the priority of an existing thread with getPriority( )
and change it at any time (not just in the constructor, as above) with
setPriority( ).
Although the JDK has 10 priority levels, this doesn’t map well to many operating systems. For example, Windows 2000 has 7 priority levels which are not fixed, so the mapping is indeterminate (although Sun’s Solaris has 2^31 levels). The only portable approach is to stick to
MAX_PRIORITY, NORM_PRIORITY and MIN_PRIORITY when
you’re adjusting priority levels.
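A small sketch of reading and changing a priority with the three portable constants follows. PrioritySketch and its two helper methods are hypothetical names used only for this illustration; the Thread constants and the getPriority( )/setPriority( ) calls are the standard ones:

```java
// Sketch only: PrioritySketch and its helpers are illustrative names.
public class PrioritySketch {
  // Sets a priority on a new (unstarted) thread and reads it back.
  public static int priorityAfterSet(int priority) {
    Thread t = new Thread();
    t.setPriority(priority);
    return t.getPriority();
  }
  // Reports whether setPriority() rejects the given value.
  public static boolean isRejected(int priority) {
    try {
      new Thread().setPriority(priority);
      return false;
    } catch(IllegalArgumentException e) {
      return true; // Outside MIN_PRIORITY..MAX_PRIORITY
    }
  }
  public static void main(String[] args) {
    System.out.println("MIN_PRIORITY = " + Thread.MIN_PRIORITY);
    System.out.println("NORM_PRIORITY = " + Thread.NORM_PRIORITY);
    System.out.println("MAX_PRIORITY = " + Thread.MAX_PRIORITY);
    System.out.println("after set: "
      + priorityAfterSet(Thread.MIN_PRIORITY));
    System.out.println("11 rejected: " + isRejected(11));
  }
}
```

Note that setPriority( ) silently caps the value at the maximum allowed for the thread's group, so the value you read back can be lower than the one you asked for.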
Daemon threads
A “daemon” thread is one that is supposed to provide a general service in the background as long as the program is running, but is not part of the essence of the program. Thus, when all of the non-daemon threads
complete, the program is terminated. Conversely, if there are any non-daemon threads still running, the program doesn’t terminate. There is, for
instance, a non-daemon thread that runs main( ).
You must set the thread to be a daemon by calling setDaemon( ) before
it is started. In run( ), the thread is put to sleep for a little bit. Once the
threads are all started, the program terminates immediately, before any
threads can print themselves, because there are no non-daemon threads
(other than main( )) holding the program open. Thus, the program
terminates without printing any output.
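To make the ordering rule concrete, here is a hedged sketch (DaemonSketch, startDaemon( ) and canChangeAfterStart( ) are invented names, not part of the book's code) that marks a thread as a daemon before starting it and then shows that the flag can no longer be changed once the thread is running:

```java
// Sketch only: DaemonSketch and its helpers are illustrative names.
public class DaemonSketch {
  // Creates a background thread that loops until interrupted.
  public static Thread startDaemon() {
    Thread t = new Thread() {
      public void run() {
        while(true) {
          try {
            sleep(100);
          } catch(InterruptedException e) {
            return; // Interrupted: leave run() and die
          }
        }
      }
    };
    t.setDaemon(true); // Must be called before start()
    t.start();
    return t;
  }
  // Tries to change the daemon flag of a thread that is already running.
  public static boolean canChangeAfterStart(Thread running) {
    try {
      running.setDaemon(false);
      return true;
    } catch(IllegalThreadStateException e) {
      return false; // Too late: the thread has been started
    }
  }
  public static void main(String[] args) {
    Thread t = startDaemon();
    System.out.println("isDaemon() = " + t.isDaemon());
    System.out.println("changeable after start = "
      + canChangeAfterStart(t));
    // main() returns here and the program exits,
    // even though the daemon thread is still looping.
  }
}
```

If the setDaemon(true) call is removed, the same program never exits on its own, because the looping thread is then a non-daemon thread holding it open.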
You can find out if a thread is a daemon by calling isDaemon( ). If a
thread is a daemon, then any threads it creates will automatically be daemons, as the following example demonstrates:
//: c13:Daemons.java
// Daemon threads spawn other daemon threads
import java.io.*;
import com.bruceeckel.simpletest.*;
class Daemon extends Thread {
private Thread[] t = new Thread[10];
public Daemon() {
setDaemon(true);
start();
}
public void run() {
for(int i = 0; i < t.length; i++)
t[i] = new DaemonSpawn(i);
for(int i = 0; i < t.length; i++)
public void run() {
while(true)
yield();
}
}
public class Daemons {
private static Test monitor = new Test();
public static void main(String[] args) throws Exception {
Thread d = new Daemon();
System.out.println("d.isDaemon() = " + d.isDaemon());
// Allow the daemon threads to
// finish their startup processes:
The Daemon thread sets its daemon flag to “true” and then spawns a
bunch of other threads, which do not set themselves to daemon mode, to
show that they are daemons anyway. Then it goes into an infinite loop
that calls yield( ) to give up control to the other threads.
There’s nothing to keep the program from terminating once main( )
finishes its job, since there is nothing but daemon threads running. So that you can see the results of starting all the daemon threads, the
main( ) thread is put to sleep for a second. Without this you see only
some of the results from the creation of the daemon threads. (Try
sleep( ) calls of various lengths to see this behavior.)
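The inheritance of daemon status can also be checked in isolation. In this sketch (the class DaemonInheritSketch and the method childIsDaemon( ) are hypothetical names), a parent thread creates a child that never calls setDaemon( ), and the child's flag is reported back to the caller:

```java
// Sketch only: DaemonInheritSketch is an illustrative name.
public class DaemonInheritSketch {
  // Runs a parent thread with the given daemon status and reports
  // whether a thread created inside it is a daemon.
  public static boolean childIsDaemon(boolean parentIsDaemon) {
    final boolean[] result = new boolean[1];
    Thread parent = new Thread() {
      public void run() {
        Thread child = new Thread(); // Never calls setDaemon()
        result[0] = child.isDaemon();
      }
    };
    parent.setDaemon(parentIsDaemon);
    parent.start();
    try {
      parent.join(); // Wait so that result[0] has been written
    } catch(InterruptedException e) {
      throw new RuntimeException(e);
    }
    return result[0];
  }
  public static void main(String[] args) {
    System.out.println("child of daemon: " + childIsDaemon(true));
    System.out.println("child of non-daemon: " + childIsDaemon(false));
  }
}
```

The child does not even need to be started for the check to work, because the daemon flag is copied from the creating thread when the Thread object is constructed.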