VIETNAM NATIONAL UNIVERSITY, HANOI UNIVERSITY OF ENGINEERING AND TECHNOLOGY
BUI PHI DIEP
AVOIDING STATE-SPACE EXPLOSION
IN MODEL-CHECKER
MASTER THESIS OF INFORMATION TECHNOLOGY
Hanoi - 2014
Major: Computer science
Code: 60480101
MASTER THESIS OF INFORMATION TECHNOLOGY
SUPERVISOR: Assoc. Prof. Nguyen Viet Ha
Dr Mohamed Faouzi Atig
Hanoi - 2014
or diploma at University of Engineering and Technology (UET/Coltech) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET/Coltech or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.
Date:
ABSTRACT
Model-checking is a well-known technique for the program verification problem (i.e., checking that the program satisfies a given property). However, Model-checking suffers from the state-space explosion problem. This is more visible in the case of concurrent/parallel programs. Therefore, developing new efficient techniques to address the state-space explosion problem (such as slicing) is a crucial and difficult challenge in Model-checking.
In this thesis, we present a new slicing method for handling the standard state explosion problem. Our slicing method consists of three steps: (1) creating an abstraction with respect to a subset of program variables, which leads to an over-approximation of the input program, (2) reconstructing a program with another subset of variables from a counterexample of the abstracted program, and (3) refining the abstraction if the counterexample is a spurious one. The process stops when either there are no more counterexamples remaining or there exists a counterexample after investigating all variables. In the former case, the program is correct; in the latter case, the program contains an error. We have implemented a prototype tool and run it successfully on standard benchmarks, together with several challenging examples. The experimental results show the efficiency of our method.
mistakes, sending to me at diepbpfÐvrt.elu.vr is appreciated.
An example of program abstraction, counterexample and reconstructed program
Refined abstraction
CFG of concurrent program and their abstraction
A program with its initial abstraction
List of Tables
2.1 The syntax of program
2.2 Conditions on state transitions $(v_1, \Omega_1) \to_P (v_2, \Omega_2)$ for each vertex type
4.1 Experimental results of verifying concurrent programs in comparison with
Chapter 1
Introduction
In this chapter, we describe the motivation of our work and why it is important. Initially, we start with the necessities of program verification and Model-Checking. Then we state the state explosion problem, which is unavoidable when applying Model-Checking. Next, we summarize our solutions and results. Finally, we describe related work and the thesis's structure.
1.1 Motivation
Software is everywhere. Software appears in diverse areas such as education, healthcare systems and transportation. Besides, the development of multicore architectures leads software to be designed for many cores. This helps software run faster, but unfortunately software becomes larger and more complex. Software developers now need more effort not only to make software but also to guarantee that it works correctly. Errors in software are difficult to find and fix. Therefore, we need efficient techniques, for example program verification and testing, to help us handle complex program errors.
Program verification is a technique that considers a program with given properties of
Program testing is another technique to find program errors. Program testing considers a program with a pair of input and output, called a test case. If the program executes with the given input and produces the given output, it passes the test case. Otherwise, the program has an error related to the given input and output. There are a number of limitations of program testing. First, program testing requires a large number of test cases to cover all possible program executions. More importantly, program testing cannot handle nondeterministic programs, for example a program with many threads and interleavings between threads. Bugs such as Heisenbugs [1] are difficult to find by program testing. So in the scope of this thesis, we only focus on using program verification.
Model checking is a popular verification technique. The key idea of model checking is that it represents the program by a model, i.e., a Kripke structure [2], and verifies the model instead of the original program. The work in [3] shows a number of advantages of model checking in comparison with other verification techniques, e.g., automated theorem proving: it is faster, it provides counterexamples, which is one of the biggest advantages of model checking, and it uses temporal logics to describe properties of the program.
SPIN [4] is an efficient and well-known model checker. SPIN provides an intuitive notation for design specification, a concise notation for correctness claims, and consistency between the two notations. In particular, the design specifications are written in the Promela language [5], and correctness claims are written in Linear Temporal Logic [6]. The structure of SPIN is shown in Figure 1.1. It takes input from the front end XSPIN. The input is the program specification, including the program design and its correctness claims, both written in the Promela language. If the specification has no syntax errors, SPIN generates a verifier, optimizes it and executes it. If a counterexample of the program is detected, it is sent back to the simulator to inspect the error in detail.
However, SPIN faces the state explosion problem. Ideally, SPIN visits program states and stores the valuation of each state. The state space that SPIN traverses may contain billions of reachable states. Therefore, SPIN requires much time when verifying large programs, especially concurrent programs.
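To give a feeling for the growth, the following minimal sketch counts the possible interleavings of independent threads. It is only an illustration and not part of SPIN: the function name `interleavings` is hypothetical, and we assume every thread executes the same number of atomic steps and that no reduction technique is applied.

```python
from math import factorial


def interleavings(threads: int, steps: int) -> int:
    """Count the orderings of `threads` threads with `steps` atomic steps each.

    This is the multinomial coefficient (threads * steps)! / (steps!) ** threads.
    """
    return factorial(threads * steps) // (factorial(steps) ** threads)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, "threads of 5 steps:", interleavings(n, 5))
```

Even with only five steps per thread, going from two to three threads raises the count from 252 to 756756 execution orders.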
One of the efficient methods to handle the state explosion problem is slicing [7]. A number of slicing methods have been developed, e.g., static slicing [7], dynamic slicing [8] and conditioned slicing [9]. In general, instead of considering the whole program, slicing focuses on a number of statements in the program with respect to the slicing criterion. The slicing criterion is often a pair of a program location and a set of program variables. Slicing has proved to be useful in testing [10] and debugging [8].
Figure 1.1: Structure of SPIN. Components shown: XSPIN Front-End, Promela Parser, Verifier Generator, Optimization, Executable On-The-Fly Verifier, Simulation, Counterexample.
We now propose a new slicing method to handle some basic cases of the explosion problem. The difference between our method and other slicing methods is that we do not concentrate on any specified subset of program variables. All variables in the program are considered. At each step of our method, a subset of program variables is selected. A new executable program, which consists of the statements of the original program with respect to the selected variables, is generated and verified.
Figure 1.2 describes our method. Our method aims to check program safety. The process of our method is described as follows.
• Our method takes a program as input. It then creates an initial abstraction of the program with respect to a subset of program variables. The abstraction is an over-approximation of the original program.
• The abstraction is then verified by a model checker. If the abstraction is safe, the process stops and the program is concluded to be safe. Otherwise, a counterexample is generated. The counterexample execution trace shows why there exists a bug in the program. The counterexample is used to reconstruct a simulation program. In general, the simulation program is another abstraction of the original program, which not only follows the statements in the counterexample execution trace but also contains statements with respect to a new subset of variables of the original program. The model checker then verifies the new abstract program.
• Our method stops when either there are no more variables remaining in the program or no more counterexamples are detected. In the former case, the program contains an error. In the latter case, the program is safe.
Figure 1.2: Our method
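The process above can be sketched as a loop. This is a minimal sketch under stated assumptions and not the prototype tool itself: `verify_by_slicing`, `model_check` and `pick_more` are hypothetical names, the model checker is passed in as a callable that returns a counterexample or `None`, and a counterexample is simplified to a list of statement labels.

```python
from typing import Callable, List, Optional, Set

Trace = List[str]  # a counterexample, simplified to a list of statement labels


def verify_by_slicing(
    all_vars: Set[str],
    initial_vars: Set[str],
    model_check: Callable[[Set[str], Optional[Trace]], Optional[Trace]],
    pick_more: Callable[[Set[str], Trace], Set[str]],
) -> str:
    """Return "safe" or "error" following the stopping rule described above."""
    selected = set(initial_vars)
    # Verify the initial abstraction built from the selected variables.
    cex = model_check(selected, None)
    while cex is not None:
        remaining = all_vars - selected
        if not remaining:
            # A counterexample survives after all variables were investigated.
            return "error"
        # Add new variables; fall back to all remaining ones to guarantee progress.
        selected |= (pick_more(remaining, cex) & remaining) or remaining
        # Verify the program reconstructed from the counterexample.
        cex = model_check(selected, cex)
    return "safe"
```

Because the set of selected variables grows strictly in every iteration, the loop terminates after at most as many rounds as there are program variables.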
program with respect to the slicing criterion.
The main issue of static slicing is that a static slice may contain many statements that do not affect the value of the variables of interest. To overcome the issue, dynamic slicing [8] is proposed to reduce the number of considered statements. In particular, dynamic slicing preserves the behavior of the program for a specific input. So instead of involving all statements that potentially affect the slicing criterion, dynamic slicing reduces the search space. Therefore, the slicing criterion of dynamic slicing is a triple $(l, V, I)$ where $l$ and $V$ are a program location and a subset of variables in the program respectively, and $I$ is the program input.
Conditioned slicing [9] is another slicing method that bridges the gap between static slicing and dynamic slicing. While static slicing does not care about the input, dynamic slicing specifies the input in detail, which means a large number of inputs is needed to cover the whole program. Conditioned slicing provides information about the input without being so specific as to give the precise values, i.e., it uses a boolean expression to relate the possible values of the inputs. The slicing criterion of conditioned slicing is a triple $(l, V, F(V))$ where $l$ and $V$ are a program location and a subset of variables in the program respectively, and $F(V)$ is a first order logic formula on the variables in $V$.
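To make the notion of a slicing criterion concrete, the following minimal sketch computes a static backward slice for the criterion $(l, V)$. It is an illustration only: we assume a straight-line program without branches or loops, each statement is reduced to the variable it defines and the variables it uses, and the name `backward_slice` is hypothetical.

```python
from typing import List, Set, Tuple

Assignment = Tuple[str, Set[str]]  # (defined variable, used variables)


def backward_slice(program: List[Assignment], location: int, variables: Set[str]) -> List[int]:
    """Indices of the statements before `location` that may affect `variables`."""
    relevant = set(variables)
    kept = []
    for index in range(location - 1, -1, -1):
        defined, used = program[index]
        if defined in relevant:
            kept.append(index)
            # The defined variable is explained by this statement;
            # the variables it reads become relevant instead.
            relevant.discard(defined)
            relevant |= used
    return sorted(kept)


# 0: a = input   1: b = a + 1   2: c = 5   3: d = b * 2   4: e = c + 1
example = [("a", set()), ("b", {"a"}), ("c", set()), ("d", {"b"}), ("e", {"c"})]
slice_for_d = backward_slice(example, 5, {"d"})
```

For the criterion (5, {d}) only statements 0, 1 and 3 are kept, so the statements on c and e are sliced away.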
the abstracted program are Boolean variables, which represent predicates. So each abstract state in the abstracted program models a number of states in the original program. There exists a transition from an abstract state if at least one corresponding concrete state in the original program has the transition, that is, an over-approximation. The main challenge of predicate abstraction is finding a set of predicates for a program.
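The existential construction of abstract transitions can be sketched as follows. This is a minimal sketch that enumerates a small finite concrete state space explicitly; real predicate abstraction tools do not enumerate states this way, and the names `abstract_state` and `abstract_transitions` are hypothetical.

```python
from typing import Callable, Iterable, List, Set, Tuple

AbstractState = Tuple[bool, ...]


def abstract_state(state: int, predicates: List[Callable[[int], bool]]) -> AbstractState:
    """Map a concrete state to the truth values of the predicates."""
    return tuple(p(state) for p in predicates)


def abstract_transitions(
    states: Iterable[int],
    step: Callable[[int], List[int]],
    predicates: List[Callable[[int], bool]],
) -> Set[Tuple[AbstractState, AbstractState]]:
    """An abstract edge exists if at least one concrete state has the transition."""
    edges = set()
    for s in states:
        for t in step(s):
            edges.add((abstract_state(s, predicates), abstract_state(t, predicates)))
    return edges


# Toy system: a counter x in 0..4 that is incremented until it reaches 4.
predicates = [lambda x: x > 0, lambda x: x > 2]
edges = abstract_transitions(range(5), lambda x: [x + 1] if x < 4 else [], predicates)
```

In this toy system the abstract state (True, False) has an edge to itself, because the concrete transition from 1 to 2 stays inside it. The abstraction therefore admits runs that loop there forever although no concrete run does, which is exactly the over-approximation.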
Figure 1.3: CEGAR framework
Counterexample-guided abstraction refinement (CEGAR) is a technique that automates searching for predicates. The technique consists of four basic steps as shown in Figure 1.3. In the figure, we use some notations including $P$, $A(P)$, $\varphi$, $Cex$ to denote the original program, abstracted program, program specification and counterexample, respectively.
• Create Initial Abstraction: Construct the abstracted program of the original program by over-approximating the behavior of the original one. The abstracted program is a finite-state program whose variables are Boolean.
• Check the counterexample: The counterexample of the abstracted program may be a spurious counterexample because of over-approximation, which means the counterexample does not correspond to a valid execution of the original program. If the counterexample turns out to be an actual counterexample, the original program does not satisfy the specification. Otherwise, the abstracted program needs a refinement.
• Refine the abstraction: Because of the existence of the spurious counterexample, it is necessary to eliminate the counterexample from the abstracted program by adding additional predicates, which rule out behaviors of the abstracted program that do not correspond to the original program. The model checker then resumes with the refined abstraction.
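The counterexample check can be illustrated by replaying an abstract trace on the concrete program. The sketch below is a simplification under stated assumptions: the concrete states are enumerated explicitly, the abstraction function `alpha` maps a concrete state to its abstract state, and the name `is_spurious` is hypothetical.

```python
from typing import Callable, Hashable, Iterable, List


def is_spurious(
    trace: List[Hashable],
    initial_states: Iterable[int],
    step: Callable[[int], List[int]],
    alpha: Callable[[int], Hashable],
) -> bool:
    """True if no concrete execution follows the abstract trace."""
    # Concrete initial states that match the first abstract state.
    current = {s for s in initial_states if alpha(s) == trace[0]}
    for abstract_next in trace[1:]:
        # Keep only concrete successors that match the next abstract state.
        current = {t for s in current for t in step(s) if alpha(t) == abstract_next}
        if not current:
            return True
    return not current
```

For a counter that is incremented from 0 to 4, with abstract states given by the predicates x > 0 and x > 2, the abstract trace (False, False), (True, False), (True, True) is spurious: the only concrete run 0, 1, 2 is still in (True, False) after two steps.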
1.3 Thesis structure
The organization of this thesis is as follows. Chapter 2 describes our slicing method for sequential programs. It also includes the program syntax and the transition system of sequential programs. Chapter 3 defines our slicing method for concurrent programs. We only focus on the differences between this method and the method for sequential programs. Chapter 4 presents experimental results with many tests on both sequential and concurrent programs. Most of them are well-known algorithms, for example Dekker [15], Lamport [16], Peterson [17], etc. Finally, conclusions and future work are in Chapter 5.
Chapter 2
Slicing Sequential Programs
In this chapter, we describe our slicing method for sequential programs. We aim to check program safety. Safety properties take the form of assertions that specify invariants of the program. In Promela, assertions are assert statements: assert(e), where e is an expression. If the expression e evaluates to 0 at run time, the program aborts. Safety checking is thus reduced to checking whether an assert statement can be reached with its expression evaluating to 0.
language with the standard
a Promela program as well as the transition system of Promela. We then describe our method in detail for sequential Promela programs.
Definition 1 (Directed Graph). A directed graph is a tuple $G = (V, E)$ where $V$ is a set of vertices and $E \subseteq V \times V$ is a set of edges, together with functions $start$ and $end$ that associate a start vertex and an end vertex with each edge. A path of length $k$ from node $n$ to node $m$ is a sequence of edges $(e_1, \ldots, e_k)$ such that $end(e_i) = start(e_{i+1})$ for $1 \le i < k$.
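As a small illustration of Definition 1, the sketch below represents an edge as a pair of vertices and checks whether a sequence of edges is a path. The function name `is_path` is hypothetical, and two points are our own assumptions: the first edge must start at $n$ and the last edge must end at $m$, and the empty sequence is treated as a path of length 0 from a node to itself.

```python
from typing import List, Tuple

Edge = Tuple[str, str]  # (start vertex, end vertex)


def start(edge: Edge) -> str:
    return edge[0]


def end(edge: Edge) -> str:
    return edge[1]


def is_path(edges: List[Edge], n: str, m: str) -> bool:
    """Check that `edges` is a path from node n to node m."""
    if not edges:
        return n == m
    if start(edges[0]) != n or end(edges[-1]) != m:
        return False
    # Consecutive edges must be chained: end(e_i) = start(e_{i+1}).
    return all(end(a) == start(b) for a, b in zip(edges, edges[1:]))
```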
2.1 Program Syntax
The program syntax is presented in Table 2.1. Each program P declares a set of variables Vars and a procedure init. A variable type is either integer or boolean. All variables are global. Variables are statically scoped as in C. A variable identifier is a C-style identifier. Procedure init is constructed from a sequence of statements. Statements may be labelled and are built inductively by composition with control-flow statements. Expressions are built in the usual way from constants, variables and the standard logical connectives. There are three statements that can affect the control flow of a
Table 2.1: The syntax of program (only partially recoverable)
type: int | bool
if option fi ; Conditional statement
do option od ; Loop statement
assert expr ; Assert statement
2.2 Statements, Variables
The term statement denotes an instance that can be derived from the nonterminal stmt in Table 2.1. Let $P$ be a program with $n$ statements and $Stmt$ be the set of statements in $P$. Let $Type : Stmt \to \{Skip, Goto, Assignment, Condition, Assertion\}$ be the function indicating the type of the statements in $Stmt$.
Because the sequential program only has one procedure, we assume that all variables and statement labels are globally unique. Let $Vars(P)$ be the set of variables in $P$ and $Vars(s_i)$ be the set of variables appearing in statement $s_i$. If $s_i$ is an assignment statement, let $Vars_r(s_i)$ and $Vars_l(s_i)$ be the sets of variables in the right side and the left side of $s_i$ respectively, so that $Vars(s_i) = Vars_r(s_i) \cup Vars_l(s_i)$.
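For a simple assignment, the two variable sets can be computed mechanically. The sketch below is an illustration only: it uses Python's own parser as a stand-in for a Promela parser, which works because simple assignments such as `x = y + 1` have the same shape in both languages, and the name `assignment_vars` is hypothetical.

```python
import ast
from typing import Set, Tuple


def assignment_vars(statement: str) -> Tuple[Set[str], Set[str]]:
    """Return (left-side variables, right-side variables) of an assignment."""
    node = ast.parse(statement).body[0]
    if not isinstance(node, ast.Assign):
        raise ValueError("not an assignment statement")
    left = {
        n.id
        for target in node.targets
        for n in ast.walk(target)
        if isinstance(n, ast.Name)
    }
    right = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
    return left, right


left, right = assignment_vars("x = y + z * 2")
all_vars = left | right  # the union of the left-side and right-side variables
```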
2.3 Program Control-Flow Graph
This section defines the control-flow graph of a sequential program. The program has a single procedure, which is modeled by a directed graph.
The control flow graph of a sequential program $P$ is a directed graph $G_P = (V_P, E_P)$ where $V_P$ is a set of vertices and $E_P \subseteq V_P \times V_P$ is the set of edges. The set $V_P$ contains one vertex for each statement in $P$ and one additional vertex $Exit$ for the program, which indicates where the program stops. Furthermore, $V_P$ has one vertex $Err$ to indicate the failure of assert statements. Each edge connects two vertices, for example an edge from $v_1$ to $v_2$ is denoted by $(v_1, v_2)$. Edges are constructed by the function $Next_P : V_P \to V_P$. $Next_P$ has a recursive definition based on the program syntax in Table 2.1. In the table, each statement has an sseq node as its parent. If statement $s_i$ is not the last statement in its parent sseq, $Next_P(s_i)$ is the statement immediately following $s_i$ in the sequence. Otherwise, let $a$ be the closest ancestor of $s_i$ in the syntax tree such that $a$ is a stmt node and $a$ is not the last statement in its sequence. If $a$ exists, $Next_P(s_i)$ is the statement immediately following $a$. Otherwise, $Next_P(s_i)$ is $Exit$.
We then define $Succ_P$ based on the $Next_P$ function:
• If $s_i$ is goto L, $Succ_P(s_i) = \{s_j\}$ where $s_j$ is the statement labelled L.
• If $s_i$ is either an if or a do statement, then $Succ_P(s_i) = \{s_{i_k}\}$ where $s_{i_k}$, $1 \le k \le n$, are the options of the if or do respectively.
• If $s_i$ is the last statement in an option of a do statement, then $Succ_P(s_i) = \{s_{j_k}\}$ where $s_{j_k}$, $1 \le k \le n$, are the options of the do.
• If $s_i$ is an assert statement, then $Succ_P(s_i) = \{Next_P(s_i), Err\}$.
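The rules above can be sketched as a recursive walk over a nested statement structure. This is a minimal sketch under stated assumptions: the class `Stmt` and the functions `build_succ` and `control_flow` are hypothetical, every option is a non-empty sequence whose first statement is its condition, statements not covered by the rules simply continue with their `Next`, and leaving a do loop (for example with break) is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Stmt:
    label: str        # unique vertex name, e.g. "l1"
    kind: str         # "skip", "assign", "cond", "goto", "if", "do" or "assert"
    target: str = ""  # label jumped to (goto only)
    options: List[List["Stmt"]] = field(default_factory=list)  # if/do only


def build_succ(seq: List[Stmt], follow: Set[str], succ: Dict[str, Set[str]]) -> None:
    """Fill `succ` for a sequence; `follow` is where control goes after its last statement."""
    for i, s in enumerate(seq):
        nxt = {seq[i + 1].label} if i + 1 < len(seq) else set(follow)
        if s.kind == "goto":
            succ[s.label] = {s.target}
        elif s.kind in ("if", "do"):
            heads = {opt[0].label for opt in s.options}
            succ[s.label] = heads
            # The end of a do option returns to the options; an if option continues after the if.
            inner_follow = heads if s.kind == "do" else nxt
            for opt in s.options:
                build_succ(opt, inner_follow, succ)
        elif s.kind == "assert":
            succ[s.label] = nxt | {"Err"}
        else:
            succ[s.label] = nxt


def control_flow(program: List[Stmt]) -> Dict[str, Set[str]]:
    succ: Dict[str, Set[str]] = {}
    build_succ(program, {"Exit"}, succ)
    return succ


program = [
    Stmt("l1", "assign"),
    Stmt("l2", "if", options=[
        [Stmt("l3", "cond"), Stmt("l4", "assert")],
        [Stmt("l5", "cond"), Stmt("l6", "skip")],
    ]),
    Stmt("l7", "skip"),
]
succ = control_flow(program)
```

In this toy program the assert at `l4` gets the successors `l7` and `Err`, and the last statement `l7` leads to `Exit`.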
Example 1. Consider the example in Figure 2.1. The left figure is the program source. The program has one function named init. The control flow graph of init is shown in the right figure. There is an if statement at line 5 with two options z2 < 3 and z2 > 3 at line 6 and line 8 respectively, so two edges $(l_5, l_6)$ and $(l_5, l_8)$ are added. Similarly, two edges $(l_7, Exit)$ and $(l_9, Exit)$ illustrate the program flow when it finishes the if statement. The program also has an assert statement at line 7, so we have an edge $(l_7, Err)$ to indicate the program transition when the assertion fails.
Figure 2.1: An example of a program and its control flow graph. (a) An example of Promela program. (b) Program control flow graph.
2.4 Program Transition System
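For a set $Vars' \subseteq Vars$, a valuation $\Omega$ to $Vars'$ is a function that associates a value with each variable $var \in Vars'$. For any function $f : D \to R$, $d \in D$, $r \in R$, the function $f[d/r] : D \to R$ is defined as $f[d/r](d') = r$ if $d = d'$ and $f(d')$ otherwise. For example, if $Vars' = \{x, y\}$ and $\Omega = \{(x, 1), (y, 2)\}$ then $\Omega[x/0] = \{(x, 0), (y, 2)\}$. For any vertex $v \in V$ and valuation $\Omega$, $v[\Omega]$ denotes the statement obtained from $v$ by replacing every variable $var \in Vars(v)$ on which $\Omega$ is defined with $\Omega(var)$. For example, if statement $v$ is x = y + 1 and $\Omega = \{(y, 1)\}$ then $v[\Omega]$ is x = 1 + 1.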
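Both operations are easy to mimic with a dictionary as valuation. The following minimal sketch is an illustration only: the names `update` and `substitute` are hypothetical, values are integers, and the substitution works on the statement text, so it replaces every occurrence of a variable on which the valuation is defined.

```python
import re
from typing import Dict


def update(valuation: Dict[str, int], var: str, value: int) -> Dict[str, int]:
    """Return a copy of the valuation in which `var` is mapped to `value`."""
    new_valuation = dict(valuation)
    new_valuation[var] = value
    return new_valuation


def substitute(statement: str, valuation: Dict[str, int]) -> str:
    """Replace every identifier defined in the valuation by its value."""
    return re.sub(
        r"[A-Za-z_]\w*",
        lambda match: str(valuation.get(match.group(0), match.group(0))),
        statement,
    )


omega = {"x": 1, "y": 2}
omega_updated = update(omega, "x", 0)              # x is now mapped to 0
statement = substitute("x = y + 1", {"y": 1})      # y is replaced by 1
```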
A state $\eta$ of the program is a pair $(s_i, \Omega)$ where $s_i$ is a statement and $\Omega$ is the valuation at the statement. Intuitively, a state is represented by a statement and the valuation of the variables at that location. Let $States(P)$ be the set of states in program $P$.
We use $\eta_1 \to_P \eta_2$ to denote that there exists a transition from state $\eta_1$ to state $\eta_2$. Formally, $\eta_1 \to_P \eta_2$ holds if $\eta_1 = (i, \Omega_1)$ and $\eta_2 = (j, \Omega_2)$ and $\eta_1, \eta_2 \in States(P)$. Table 2.2 shows the state transition for each vertex type.
Table 2.2: Conditions on state transitions $(v_1, \Omega_1) \to_P (v_2, \Omega_2)$ for each vertex type
• skip, goto statements have one successor. Consider vertex $v$ representing either a skip or a goto statement. When execution passes from $v$ to its unique successor $sSucc_P(v)$, the valuation of variables remains the same.
• Assignments have one successor and the state changes in the expected manner.
• The transitions for if, do statements depend on their options. Consider vertex $v$ representing an if or a do statement, where $w_i$, $1 \le i \le n$, is the vertex representing the condition of an option of $v$ and $n$ is the number of options. Let $tSucc_P(v)$ be the set of vertices in $Succ_P(v)$ that are evaluated to be true in the current state: $tSucc_P(v) = \{w \in Succ_P(v) : \Omega(w) = 1\}$. The successor is chosen non-deterministically from $tSucc_P(v)$. The valuation of variables remains the same.
• The transition for a condition vertex depends on the valuation of expr. If expr is evaluated to be true in the current state, execution passes to its successor and the valuation of variables remains the same. Otherwise, there is no transition created.
• assert statements have two successors. Consider vertex $v$ representing an assert(expr) statement, $Succ_P(v) = \{w, Err\}$ where $w = Next_P(v)$. If expr is evaluated to be true in the current state, $w$ is selected as its successor. Otherwise, $Err$ is its successor. The valuation of variables remains the same.
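The transition rules can be summarised in a single successor function. The sketch below is our own simplification and not the semantics implemented by SPIN: the class `CFG` and the function `step` are hypothetical, expressions are modeled as Python callables on the valuation, and the non-deterministic choice is represented by returning all possible successor states.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Valuation = Dict[str, int]
State = Tuple[str, Valuation]


@dataclass
class CFG:
    kind: Dict[str, str]                    # vertex type of every vertex
    succ: Dict[str, Set[str]]               # successor vertices
    assign: Dict[str, Tuple[str, Callable[[Valuation], int]]] = field(default_factory=dict)
    guard: Dict[str, Callable[[Valuation], bool]] = field(default_factory=dict)


def step(state: State, cfg: CFG) -> List[State]:
    """All states reachable from `state` in one transition."""
    v, omega = state
    kind = cfg.kind[v]
    targets = sorted(cfg.succ[v])
    if kind in ("skip", "goto"):
        return [(w, omega) for w in targets]
    if kind == "assign":
        var, expr = cfg.assign[v]
        updated = {**omega, var: expr(omega)}
        return [(w, updated) for w in targets]
    if kind in ("if", "do"):
        # Only options whose condition holds in the current state can be chosen.
        return [(w, omega) for w in targets if cfg.guard[w](omega)]
    if kind == "cond":
        return [(w, omega) for w in targets] if cfg.guard[v](omega) else []
    if kind == "assert":
        if cfg.guard[v](omega):
            return [(w, omega) for w in targets if w != "Err"]
        return [("Err", omega)]
    raise ValueError(f"unknown vertex type: {kind}")
```

Returning a list rather than a single state keeps the function deterministic while still exposing every choice that an if or do vertex may take.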
2.5 Slicing for Sequential Programs
Our method checks whether the program $P$ is safe or not. The program is safe if all assert statements in the program are evaluated to true. Otherwise, it fails. Our methodology contains the following steps as shown in Figure 1.2:
1. Creating an abstraction of the program: We select some variables $Vars_a \subseteq Vars$ concerning the safety properties in $P$ for our initial abstraction. We then transform program $P$ to $P_a$ by abstracting variables in $Vars_a$. In particular, in program $P_a$, statements which change the value of variables in $Vars_a$ are transformed to be nondeterministic while other statements are eliminated.
2. Reconstructing the counterexample: We use a model checker to check whether $P_a$ is safe. If it is, the process stops and $P$ is concluded to be safe. Otherwise, we have a counterexample $Cex$ of $P_a$. $Cex$ potentially corresponds to a concrete counterexample of $P$. To determine whether it does, we provide more information from $P$. If no variables remain in the program, the process stops. Otherwise, we reconstruct the counterexample with some remaining variables in $Vars$. Let $P_r$ be the program reconstructed from $Cex$. We then use the model checker to verify $P_r$.
Definition 2. Given a vertex $v$ and a variable $x$, the abstract vertex of $v$ with respect to $x$ is defined as follows:
yi Varale) = 2} ¥ Vars(e) =
if Verse(v} # {a} A Vursi(u)
* il Varate) # {2} A‘t'ype(v} = Condition
fz} AType(v) = Aseigernent
`