HANDOUT #7 AUTOMATA
K5 & K6, Computer Science Department, Văn Lang University
Second semester Feb, 2002
Instructor: Trần Đức Quang
Major themes:
1. Patterns and Pattern Matching
2. Finite State Machines and Automata
3. Deterministic and Nondeterministic Automata
Reading: Sections 10.2 and 10.3.
7.1 PATTERNS AND PATTERN MATCHING
A pattern is a set of objects with some recognizable property. One type of pattern is a
set of character strings, such as the set of legal C identifiers, each of which is a string
of letters, digits, and underscores, beginning with a letter or underscore.
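The identifier pattern just described can be checked with a few lines of C. The following is a minimal sketch; the function name is_c_identifier is an assumption made for illustration, and the check covers only the shape of the string (it does not exclude reserved words such as if).

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s is a nonempty string of letters, digits, and underscores
   that begins with a letter or an underscore, and 0 otherwise.
   The name is_c_identifier is hypothetical; keywords are not excluded. */
int is_c_identifier(const char *s)
{
    size_t i;

    if (s == NULL || s[0] == '\0')
        return 0;                       /* the empty string is not an identifier */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;                       /* must begin with a letter or underscore */
    for (i = 1; s[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;                   /* illegal character inside the name */
    }
    return 1;
}
```

With this sketch, strings such as x and _tmp1 match the pattern, while 2fast and a-b do not.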
Given a pattern and an input, the process of determining whether the input matches the
pattern is called pattern matching, a problem also known as pattern recognition. In
compiling, for example, an essential step is to recognize the patterns of constructs in a program before translating the program into the desired code. Let’s see an illustration of the first phase of this process.
Consider an if-statement in C,
if (a==b)
x = 1;
A C compiler reads input characters from the left, one at a time, and collects them into
small groups of characters (lexemes or tokens), each matching some lexical pattern. This
phase is called lexical analysis. Our statement, for example, may be grouped into the following tokens, each of which has its own pattern:
1. The keyword if
2. The left parenthesis (
3. The identifier a
4. The comparison operator ==
5. The identifier b
6. The right parenthesis )
7. The identifier x
8. The assignment operator =
9. The integer 1
10. The statement-terminator ;
White space characters (blanks, tabs, and newlines) would also be eliminated during this phase.
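The grouping of characters into tokens can be sketched in C as follows. This is only an illustration under stated assumptions, not how any particular compiler does it: the names TokenKind, next_token, and tokenize are hypothetical, and the scanner knows only the token patterns that occur in the example statement.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, limited to those in the example statement. */
typedef enum {
    TOK_IF, TOK_IDENT, TOK_INT, TOK_EQ, TOK_ASSIGN,
    TOK_LPAREN, TOK_RPAREN, TOK_SEMI, TOK_END, TOK_ERROR
} TokenKind;

/* Reads one token starting at *p, copies its text into lexeme
   (truncated to size - 1 characters), and advances *p past it. */
TokenKind next_token(const char **p, char *lexeme, size_t size)
{
    const char *s = *p;
    const char *start;
    size_t len;
    TokenKind kind;

    while (isspace((unsigned char)*s))      /* white space is discarded */
        s++;
    start = s;

    if (*s == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*s) || *s == '_') {
        while (isalnum((unsigned char)*s) || *s == '_')
            s++;
        /* the keyword if has the shape of an identifier, so check it here */
        kind = (s - start == 2 && strncmp(start, "if", 2) == 0) ? TOK_IF : TOK_IDENT;
    } else if (isdigit((unsigned char)*s)) {
        while (isdigit((unsigned char)*s))
            s++;
        kind = TOK_INT;
    } else if (*s == '=') {
        s++;
        if (*s == '=') { s++; kind = TOK_EQ; }  /* == comparison */
        else kind = TOK_ASSIGN;                 /* =  assignment */
    } else if (*s == '(') { s++; kind = TOK_LPAREN; }
    else if (*s == ')') { s++; kind = TOK_RPAREN; }
    else if (*s == ';') { s++; kind = TOK_SEMI; }
    else { s++; kind = TOK_ERROR; }             /* unknown character */

    len = (size_t)(s - start);
    if (size > 0) {
        if (len >= size)
            len = size - 1;
        memcpy(lexeme, start, len);
        lexeme[len] = '\0';
    }
    *p = s;
    return kind;
}

/* Scans src to its end, storing the first max token kinds in out.
   Returns the total number of tokens found (TOK_END is not counted). */
size_t tokenize(const char *src, TokenKind *out, size_t max)
{
    char lexeme[64];
    size_t n = 0;
    TokenKind k;

    while ((k = next_token(&src, lexeme, sizeof lexeme)) != TOK_END) {
        if (n < max)
            out[n] = k;
        n++;
    }
    return n;
}
```

Applied to the if-statement above, tokenize reports ten tokens, in the same order as the list.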
7.2 STATE MACHINES AND AUTOMATA
Programs that search for patterns often have a special structure. We can identify certain positions in the code at which we know something particular about the program’s progress toward its goal of finding an instance of a pattern. We call these positions
states. The overall behavior of the program can be viewed as moving from state to
state as it reads its input.
To see the behavior of such a program, we can draw a graph with a node for each
state, and an arc for each move from state to state (called a transition). A graph for
a program recognizing English words with five vowels in order is shown below:
There are two important states in this graph, one with an incoming arc labeled
start (state 0), and the other with a double circle (state 5). The former, the start state,
is the state in which we begin to recognize the pattern; the latter, the accepting state,
is the state we reach after having found our pattern, at which point we "accept". There may be several accepting states but only one start state.
Such a graph is called a finite automaton or just automaton.
We can design a pattern-matching program by first designing the automaton and then mechanically translating it into a program. I will give an example in the next section.
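As a small preview, the five-vowel automaton can be written as a loop that keeps the current state in a variable. This is a sketch under assumptions: the function name has_vowels_in_order is hypothetical, only lowercase letters are handled, and state k is taken to mean that the first k vowels of a, e, i, o, u have been seen in order.

```c
#include <stddef.h>

/* Simulates the six-state automaton: in state k (0..4) the next vowel
   awaited is vowels[k]; state 5 is the accepting state.
   Assumes lowercase input; the function name is hypothetical. */
int has_vowels_in_order(const char *word)
{
    static const char vowels[] = "aeiou";
    int state = 0;                      /* start state */
    size_t i;

    for (i = 0; word[i] != '\0' && state < 5; i++) {
        if (word[i] == vowels[state])
            state++;                    /* arc labeled with the awaited vowel */
        /* any other character: follow the self-loop, state unchanged */
    }
    return state == 5;                  /* accept only in state 5 */
}
```

For example, abstemious and facetious are accepted, while sequoia is not, because no e follows its only a.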
An automaton can be viewed as a state machine consisting of a finite control, an input
tape, and a head that reads a sequence of symbols written on the tape. At each step of
its operation, the machine reads a symbol on the tape, changes its state, and moves the head one symbol to the right. A picture of such a machine is shown in the figure on the next page.
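[Figure: automaton recognizing words with five vowels in order. States 0 to 5, with the start arrow entering state 0; arc labels include Λ − a, Λ − e, Λ − i, Λ − o, Λ − u, and Λ.]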
7.3 DETERMINISTIC AND NONDETERMINISTIC AUTOMATA
The automaton discussed in the previous section has an important property. For any
state s and any input character x, there is at most one transition out of state s whose label includes x. Such an automaton is said to be deterministic.
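The definition can be turned into a direct check over a list of arcs. The sketch below is one possible encoding, not something taken from the handout: the Transition type and the function is_deterministic are hypothetical, states are numbered with integers, and each arc label is given as a string containing the characters it includes.

```c
#include <stddef.h>
#include <string.h>

/* One arc of the graph: from state `from` to state `to`, labeled with
   the set of characters listed in the string `label`. (Hypothetical type.) */
typedef struct {
    int from;
    int to;
    const char *label;
} Transition;

/* Returns 1 if no state has two arcs whose labels share a character,
   that is, if the automaton is deterministic; returns 0 otherwise. */
int is_deterministic(const Transition *t, size_t n)
{
    size_t i, j;
    const char *c;

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (t[i].from != t[j].from)
                continue;               /* arcs leave different states */
            for (c = t[i].label; *c != '\0'; c++) {
                if (strchr(t[j].label, *c) != NULL)
                    return 0;           /* *c labels two arcs out of one state */
            }
        }
    }
    return 1;
}
```

For instance, a state with one arc labeled "01" and another arc labeled "1" makes the automaton nondeterministic, because the character 1 appears on both.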
It is straightforward to convert a deterministic finite automaton (DFA) into a program.
We create a piece of code for each state. The code for state s examines its input and decides which of the transitions out of s, if any, should be followed. If a transition from state s to state t is selected, then the code for state s must arrange for the code of state
t to be executed next, perhaps by using a goto-statement.
Suppose we have a DFA for a bounce filter.
You need not understand its meaning. Just observe that the DFA has the start state
a and the two accepting states c and d, and that it examines the input characters 1 and 0.
From this DFA, we can mechanically produce a simple program by following the guideline above. The resulting program is given on the next page.
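[Figure: a state machine with a finite control and an input tape.]
[Figure: DFA for the bounce filter, with an arrow labeled start and transitions on the inputs 1 and 0.]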
#include <stdio.h>

void bounce(void)
{
    int x;  /* int rather than char, so that EOF from getchar() fits */

    /* state a (start state) */
a:  x = getchar();
    if (x == '0') goto a;   /* transition to state a */
    if (x == '1') goto b;   /* transition to state b */
    goto finis;             /* any other input ends the run */

    /* state b */
b:  x = getchar();
    if (x == '0') goto a;   /* transition to state a */
    if (x == '1') goto c;   /* transition to state c */
    goto finis;

    /* state c */
c:  x = getchar();
    if (x == '0') goto d;   /* transition to state d */
    if (x == '1') goto c;   /* transition to state c */
    goto finis;

    /* state d */
d:  x = getchar();
    if (x == '0') goto a;   /* transition to state a */
    if (x == '1') goto c;   /* transition to state c */
    goto finis;

finis: ;
}
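The same transitions can also be stored in a table and driven by a loop, which makes it easy to report whether the run ends in an accepting state. The following is a sketch under assumptions: the names next_state and bounce_accepts are hypothetical, the input comes from a string rather than from getchar, and any character other than 0 or 1 is treated as a rejection (the goto version simply stops there).

```c
#include <stddef.h>

enum { STATE_A, STATE_B, STATE_C, STATE_D };

/* next_state[s][x] is the state reached from state s on input '0' (x = 0)
   or input '1' (x = 1), copied from the transitions of the goto program. */
static const int next_state[4][2] = {
    /* a */ { STATE_A, STATE_B },
    /* b */ { STATE_A, STATE_C },
    /* c */ { STATE_D, STATE_C },
    /* d */ { STATE_A, STATE_C }
};

/* Returns 1 if the DFA is in accepting state c or d after reading all of
   input, and 0 otherwise. Rejecting on characters other than '0' and '1'
   is an assumption of this sketch. */
int bounce_accepts(const char *input)
{
    int state = STATE_A;                /* start state */
    size_t i;

    for (i = 0; input[i] != '\0'; i++) {
        if (input[i] != '0' && input[i] != '1')
            return 0;
        state = next_state[state][input[i] - '0'];
    }
    return state == STATE_C || state == STATE_D;
}
```

With this table, the input 11 ends in state c and 110 ends in state d, so both are accepted; 1100 returns to state a and is rejected.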
Although it is easy to convert a DFA into a program, designing one is more difficult. In fact, there is a generalization of DFAs that is conceptually more natural. This kind
of automaton, called a nondeterministic finite automaton (NFA for short), may have two or
more transitions containing the same symbol out of one state.
Note that a DFA is technically an NFA as well, one that happens not to have multiple transitions on one symbol.
NFAs are not directly implementable by programs, but they are useful conceptual tools for a number of applications. Moreover, by using the "subset construction", it is possible to convert any NFA to a DFA that accepts the same set of character strings, but this topic is beyond the scope of our discussion.
For illustration, I show only one NFA, in the following figure.
[Figure: an NFA with an arrow labeled start; arc labels include Λ, m, a, and n.]
Note that we use the symbol Λ to indicate any legal symbol.
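Although an NFA cannot be translated state by state into goto-code, a program can still follow it by keeping track of the set of states the NFA might be in, which amounts to performing the subset construction on the fly. The sketch below does this for a hypothetical NFA that is an assumption of this example and is not read off the figure: state 0 loops on any letter, an arc labeled m leads from state 0 to state 1, a leads from 1 to 2, n leads from 2 to 3, and state 3 is accepting. The names nfa_step and nfa_accepts are likewise hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical NFA (an assumption of this sketch):
   0 -any letter-> 0, 0 -m-> 1, 1 -a-> 2, 2 -n-> 3, state 3 accepting.
   Bit k of a mask is set when state k is one of the possible states. */
static unsigned nfa_step(unsigned active, char x)
{
    unsigned next = 0;

    if (active & 1u) {                              /* state 0 */
        if (isalpha((unsigned char)x))
            next |= 1u;                             /* loop on any letter */
        if (x == 'm')
            next |= 2u;                             /* second arc on the same symbol */
    }
    if ((active & 2u) && x == 'a')
        next |= 4u;                                 /* 1 -a-> 2 */
    if ((active & 4u) && x == 'n')
        next |= 8u;                                 /* 2 -n-> 3 */
    return next;
}

/* Returns 1 if some sequence of choices leads the NFA to state 3
   after reading all of input, and 0 otherwise. */
int nfa_accepts(const char *input)
{
    unsigned active = 1u;                           /* only the start state */
    size_t i;

    for (i = 0; input[i] != '\0'; i++)
        active = nfa_step(active, input[i]);
    return (active & 8u) != 0;
}
```

Under these assumptions the NFA accepts exactly the strings of letters that end in man: human and mailman are accepted, command is not.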
7.4 GLOSSARY
Pattern: Mẫu. See the definition in the text.
Pattern Matching: Đối sánh mẫu, so mẫu.
Recognition: Nhận dạng.
Identifier: Định danh. A name of a data object in a program.
Character: Ký tự. Any symbol that we may input from the keyboard, including letters, digits, special symbols such as +, ^, and some nonprintable symbols.
Letter: Chữ cái.
Digit: Ký số, chữ số.
Underscore: Dấu gạch thấp _.
Input: Nguyên liệu, dữ liệu nhập.
Output: Thành phẩm, dữ liệu xuất.
Code: Mã lệnh, mã chương trình. A full program or program segment in any form, such
as a high-level language or machine language.
Compilation: Quá trình biên dịch. Sometimes also called translation.
Compiler: Trình biên dịch.
Interpreter: Trình thông dịch.
Translator: Chương trình dịch (nói chung).
Lexeme: Từ tố.
Token: Thẻ từ.
Assignment operator: Toán tử gán.
Statement-terminator: Dấu kết thúc câu lệnh.
Instance: Thể hiện.
Automaton, automata (pl.): Automat, Ôtômat.
Deterministic finite automata: Automat hữu hạn đơn định (tất định).
Nondeterministic finite automata: Automat hữu hạn đa định (không đơn định, không
tất định).
State: Trạng thái.
Transition: Chuyển vị.
Start state: Khởi trạng.
Accepting state, final state: Trạng thái kiểm nhận, kết trạng.
Finite control: Bộ điều khiển hữu hạn.
Input tape: Băng nguyên liệu.
Head: Đầu đọc.