VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY
BUI PHI DIEP
AVOIDING STATE-SPACE EXPLOSION
IN MODEL-CHECKER
MASTER THESIS OF INFORMATION TECHNOLOGY
Hanoi - 2014
VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY
BUI PHI DIEP
AVOIDING STATE-SPACE EXPLOSION
IN MODEL-CHECKER
Major: Computer science
Code: 60480101
MASTER THESIS OF INFORMATION TECHNOLOGY
SUPERVISOR: Assoc. Prof. Nguyen Viet Ha
Hanoi - 2014
Declaration of Authorship
I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET/Coltech) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET/Coltech or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.
Signed:
Date:
Abstract
Model checking is a well-known technique for the program verification problem (i.e., checking that a program satisfies a given property). However, model checking suffers from the state-space explosion problem, which is especially visible in the case of concurrent and parallel programs. Therefore, developing new efficient techniques to address the state-space explosion problem (such as slicing) is a crucial and difficult challenge in model checking.
In this thesis, we present a new slicing method for handling the standard state explosion problem. Our slicing method consists of three steps: (1) creating an abstraction with respect to a subset of program variables, which leads to an over-approximation of the input program, (2) reconstructing a program with another subset of variables from a counterexample of the abstracted program, and (3) refining the abstraction if the counterexample is a spurious one. The process stops when either there are no more counterexamples remaining or there exists a counterexample after investigating all variables. In the former case, the program is correct; in the latter case, the program contains an error. We have implemented a prototype tool and run it successfully on standard benchmarks, together with several challenging examples. The experimental results show the efficiency of our method.
Acknowledgements
First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Nguyen Viet Ha, for his patient guidance and continuous support throughout the years. I would like to give my honest appreciation to my co-supervisor Dr. Mohamed Faouzi Atig and Prof. Parosh Aziz Abdulla for their great support. They were always there when I needed help, and responded to queries helpfully and promptly. The encouragement from my family, my friends in UET-VNUH and Uppsala University, and my girlfriend, Diu Cap, is also very important to me. If you find any mistakes when reading this thesis, please send them to me at diepbp@vnu.edu.vn.
Contents
Declaration of Authorship
Acknowledgements
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 Related work
1.2.1 Program Slicing
1.2.2 Predicate Abstraction
1.3 Thesis structure
2 Slicing Sequential Programs
2.1 Program Syntax
2.2 Statements, Variables
2.3 Program Control-Flow Graph
2.4 Program Transition System
2.5 Variable Slicing for Sequential Programs
2.5.0.1 Creating an initial abstraction
2.5.1 Reconstructing Counterexample
2.5.2 Refining the abstraction
3 Slicing Concurrent Programs
3.1 Program Syntax
3.2 Statements
3.3 Program Control-Flow Graph
3.4 Variable Slicing for Concurrent Programs
3.4.1 Reconstructing the counterexample
4 Experiment
4.1 Sequential Programs
4.2 Concurrent Programs
5 Conclusions and Future Work
List of Figures
1.1 Structure of SPIN
1.2 Our method
1.3 CEGAR framework
2.1 An example of a program and its control-flow graph
2.2 An example of program abstraction, counterexample and reconstructed program
2.3 Refined abstraction
3.1 CFG of concurrent programs and their abstraction
3.2 A concurrent counterexample
3.3 Simulating the concurrent counterexample
4.1 A program with its initial abstraction
List of Tables
2.1 The syntax of programs
2.2 Conditions on state transitions $\langle v_1, \Omega_1 \rangle \to_P \langle v_2, \Omega_2 \rangle$ for each vertex type
4.1 Experimental results of verifying concurrent programs in comparison with SPIN
4.2 Column Information
4.3 Experimental results of verifying concurrent programs in comparison with tools in SV-COMP
Chapter 1
Introduction
In this chapter, we describe the motivation for our work and why it is important. We first explain the need for program verification and model checking. We then state the state explosion problem, which is unavoidable when applying model checking. Next, we summarize our solution and results. Finally, we describe related work and the structure of the thesis.
1.1 Motivation
Software is everywhere. It appears in diverse areas such as education, healthcare, and transportation. Moreover, the development of multicore architectures leads software to be designed to run on many cores. This helps software run faster, but it also makes software larger and more complex. Software developers now need more effort not only to build software but also to guarantee that it works correctly. Errors in software are difficult to find and fix. Therefore, we need efficient techniques, for example program verification and testing, to help us handle complex program errors.
Program verification is a technique that considers a program together with given properties of interest and determines whether the program satisfies these properties. The considered properties can be, for example, valuations of variables at a program location, or the absence of out-of-memory errors. The result of verification is either that the properties hold, which means that no execution of the program violates them, or that the properties do not hold because of a specific execution. Note that program verification is undecidable in general: no verification tool can handle every type of program with every type of property. In practice, each program verification tool focuses on a small class of programs with a restricted set of properties.
Program testing is another technique for finding program errors. Program testing considers a program together with a pair of input and expected output, called a test case. If the program executes with the given input and returns the given output, the program is correct for that test case. Otherwise, the program has an error related to the given input and output. Program testing has a number of limitations. First, it requires a large number of test cases to cover all possible program executions. More importantly, program testing cannot reliably handle nondeterministic programs, for example a program with many threads and interleavings between threads. Bugs such as Heisenbugs [1] are difficult to find by program testing. Therefore, in this thesis we focus only on program verification.
Model checking is a popular verification technique. The key idea of model checking is that it represents the program by a model, e.g., a Kripke structure [2], and verifies the model instead of the original program. The work in [3] shows a number of advantages of model checking in comparison with other verification techniques such as automated theorem proving: it is faster, it provides counterexamples (one of its biggest advantages), and it uses temporal logics to describe program properties.
SPIN [4] is an efficient and well-known model checker. SPIN provides an intuitive notation for design specifications, a concise notation for correctness claims, and a check of the consistency between the two. In particular, design specifications are written in the Promela language [5], and correctness claims are written in Linear Temporal Logic [6]. The structure of SPIN is shown in Figure 1.1. It takes its input from the front end XSPIN. The input is the program specification, including the program design and its correctness claims, both expressed in Promela. If the specification has no syntax errors, SPIN generates a verifier, optimizes it, and executes it. If a counterexample is detected, it is sent back to the simulator so that the error can be inspected in detail.
However, SPIN faces the state explosion problem. In principle, SPIN visits every reachable program state and stores the valuation of each state. The state space that SPIN explores may contain billions of reachable states. Therefore, SPIN requires much time when verifying large programs, especially concurrent programs.
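To make this growth concrete, the following minimal Python sketch enumerates the reachable states of a toy concurrent program by explicit-state breadth-first search. It is only an illustration under our own assumptions, not SPIN's algorithm: the toy program (each thread increments a private counter) and the name `count_reachable_states` are hypothetical.

```python
from collections import deque

def count_reachable_states(num_threads, max_count):
    """Explicit-state BFS over a toy concurrent program (illustrative only).

    Each thread owns a counter that it increments from 0 up to max_count.
    A global state is the tuple of all counters; any thread may move next.
    """
    initial = (0,) * num_threads
    visited = {initial}            # every visited valuation is stored
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for t in range(num_threads):          # every interleaving choice
            if state[t] < max_count:
                succ = state[:t] + (state[t] + 1,) + state[t + 1:]
                if succ not in visited:
                    visited.add(succ)
                    queue.append(succ)
    return len(visited)

# The number of stored states is (max_count + 1) ** num_threads.
for n in range(1, 5):
    print(n, count_reachable_states(n, 9))
```

With ten counter values per thread, each additional thread multiplies the number of stored states by ten, which is the exponential behavior described above.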
One efficient method for handling the state explosion problem is slicing [7]. A number of slicing methods have been developed, e.g., static slicing [7], dynamic slicing [8] and conditioned slicing [9]. In general, instead of considering the whole program, slicing focuses on the statements of the program that are relevant to a slicing criterion. The slicing criterion is often a pair of a program location and a subset of program variables. Slicing has proved to be useful in testing [10] and debugging [8].
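As an illustration of a slicing criterion (a program location and a set of variables), the sketch below computes a backward static slice of straight-line code. It is a simplification under our own assumptions: statements are encoded as hypothetical (defined variable, used variables) pairs, only data dependences are followed, and control flow is ignored.

```python
def backward_slice(statements, location, variables):
    """Return indices of statements that may affect `variables` at `location`.

    statements: list of (defined_variable, set_of_used_variables) describing
    straight-line code. `location` is the point just before statement index
    `location`. Control dependences are ignored in this sketch.
    """
    relevant = set(variables)       # variables whose values still matter
    in_slice = []
    for i in range(location - 1, -1, -1):    # walk backwards from the criterion
        defined, used = statements[i]
        if defined in relevant:
            in_slice.append(i)
            relevant.discard(defined)        # this definition hides older ones
            relevant |= set(used)            # its operands become relevant
    return sorted(in_slice)

# Toy program:
# 0: a = 1
# 1: b = 2
# 2: c = a + 1
# 3: d = b * 2
# 4: e = c + a
program = [("a", set()), ("b", set()), ("c", {"a"}), ("d", {"b"}), ("e", {"c", "a"})]
print(backward_slice(program, 5, {"e"}))   # [0, 2, 4]
```

For the criterion (5, {e}), statements 1 and 3 are dropped because they only influence b and d.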
Figure 1.1: Structure of SPIN (components: XSPIN front end, Promela parser, LTL parser and translator, syntax error reports, verifier generator, simulation, optimization, executable on-the-fly verifier, counterexample)
We now propose a new slicing method to handle some basic cases of the state explosion problem. The difference between our method and other slicing methods is that we do not concentrate on any fixed subset of program variables: all variables in the program are considered. At each step of our method, a subset of program variables is selected. A new executable program, which consists of the statements of the original program with respect to the selected variables, is generated and verified.
Figure 1.2 describes our method, which aims to check program safety. The process is as follows.
• Our method takes a program as input. It then creates an initial abstraction of the program with respect to a subset of program variables. The abstraction is an over-approximation of the original program.
• The abstraction is then verified by a model checker. If the abstraction is safe, the process stops and the original program is concluded to be safe. Otherwise, a counterexample is generated. The counterexample execution trace shows why there may be a bug in the program. The counterexample is used to reconstruct a simulation program. In general, the simulation program is another abstraction of the original program, which not only follows the statements in the counterexample execution trace but also contains statements with respect to a new subset of variables of the original program. The model checker then verifies the new abstract program.
Figure 1.2: Our method
• If the reconstructed program is safe, the counterexample is spurious. Our method then refines the initial abstraction and runs the model checker again. Otherwise, a new counterexample is generated and our method reconstructs it again with another subset of variables.
• Our method stops when either there are no more variables remaining in the program or no more counterexamples are detected. In the former case, the program contains an error. In the latter case, the program is safe.
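The steps above can be sketched as a control loop. The following Python code is a hedged outline, not the implementation of our tool: every callable (`abstract`, `model_check`, `reconstruct`, `refine`, `grow`) is a hypothetical placeholder, and `make_toy` is an invented oracle that only serves to exercise the loop.

```python
def verify(program, all_vars, abstract, model_check, reconstruct, refine, grow):
    """Control loop of the method described above (sketch).

    All callables are hypothetical placeholders supplied by the caller:
      abstract(program, vars)         -> abstraction over `vars`
      model_check(prog)               -> counterexample trace, or None if safe
      reconstruct(program, cex, vars) -> simulation program following `cex`
      refine(abstraction, cex, vars)  -> refined abstraction
      grow(vars, all_vars)            -> `vars` plus at least one new variable
    """
    all_vars = frozenset(all_vars)
    tracked = grow(frozenset(), all_vars)          # initial subset of variables
    abstraction = abstract(program, tracked)
    while True:
        cex = model_check(abstraction)
        if cex is None:
            return "safe"                          # no counterexample remains
        sim_vars = tracked
        while True:
            if sim_vars == all_vars:
                return "error"                     # cex survives all variables
            sim_vars = grow(sim_vars, all_vars)
            new_cex = model_check(reconstruct(program, cex, sim_vars))
            if new_cex is None:                    # the counterexample was spurious
                abstraction = refine(abstraction, cex, sim_vars)
                tracked = sim_vars
                break
            cex = new_cex

def make_toy(needed, buggy):
    """Invented oracle: safety is provable once all `needed` variables are tracked."""
    def model_check(prog):
        tracked = prog[-1]
        return ["trace"] if buggy or not needed <= tracked else None
    def abstract(program, vs):
        return ("abs", frozenset(vs))
    def reconstruct(program, cex, vs):
        return ("sim", frozenset(vs))
    def refine(abstraction, cex, vs):
        return ("abs", frozenset(vs))
    def grow(vs, all_vars):
        return frozenset(vs) | {min(all_vars - vs)}
    return abstract, model_check, reconstruct, refine, grow

print(verify("P", {"x", "y", "z"}, *make_toy({"y"}, False)))
print(verify("P", {"x", "y", "z"}, *make_toy(set(), True)))
```

In the first call the initial counterexample disappears once y is tracked, so the abstraction is refined and found safe; in the second call the counterexample persists after all variables have been considered, so an error is reported.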
The main issue of static slicing is that a static slice may contain many statements that do not affect the values of the variables of interest. To overcome this issue, dynamic slicing [8] was proposed to reduce the number of considered statements. In particular, dynamic slicing preserves the behavior of the program for a specific input. Thus, instead of involving all statements that potentially affect the slicing criterion, dynamic slicing reduces the search space. The slicing criterion of dynamic slicing is therefore a triple (l, V, I), where l is a program location, V is a subset of the program variables, and I is the program input.
Conditioned slicing [9] is another slicing method that bridges the gap between static slicing and dynamic slicing. Static slicing does not take the input into account, whereas dynamic slicing specifies the input in detail, which means that a large number of inputs is needed to cover the whole program. Conditioned slicing provides information about the input without being so specific as to give precise values, i.e., it uses a Boolean expression to constrain the possible values of the inputs. The slicing criterion of conditioned slicing is a triple (l, V, F(V)), where l is a program location, V is a subset of the program variables, and F(V) is a first-order logic formula over the variables in V.
1.2.2 Predicate Abstraction
Predicate abstraction is a technique proposed by Graf and Saidi [12] and Colon and Uribe [13]. The technique is widely applied in software verification. Instead of tracking the concrete values of the variables in the program, it tracks predicates over these values. When verifying a program, an abstracted program is created to represent it, using Existential Abstraction [14]. The variables of the abstracted program are Boolean variables, which represent predicates. Each abstract state in the abstracted program therefore models a number of states of the original program. There exists a transition from an abstract state if at least one corresponding concrete state in the original program has that transition, which makes the abstraction an over-approximation. The main challenge of predicate abstraction is finding a suitable set of predicates for a program.
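The following Python sketch illustrates existential abstraction on a small explicit transition system. It is an illustration under our own assumptions: the function name, the toy system (a counter from 0 to 5) and the two predicates are hypothetical, and real tools compute the abstraction symbolically rather than by enumeration.

```python
def existential_abstraction(states, transitions, predicates):
    """Abstract an explicit transition system with respect to `predicates`.

    An abstract state is the tuple of truth values of the predicates.
    An abstract transition exists if at least one pair of concrete states
    mapped to its endpoints is connected by a concrete transition.
    """
    def alpha(s):
        return tuple(p(s) for p in predicates)
    abstract_states = {alpha(s) for s in states}
    abstract_transitions = {(alpha(s), alpha(t)) for (s, t) in transitions}
    return abstract_states, abstract_transitions

# Toy concrete system: a counter x that steps 0 -> 1 -> ... -> 5 (no cycles).
states = range(6)
transitions = [(x, x + 1) for x in range(5)]
predicates = [lambda x: x % 2 == 0, lambda x: x >= 3]

abs_states, abs_trans = existential_abstraction(states, transitions, predicates)
print(len(abs_states), len(abs_trans))   # 4 5
```

In this toy system the abstract states (True, False) and (False, False) are connected in both directions, so the abstraction contains a cycle that the concrete counter does not have. Such extra behavior is what over-approximation introduces and what can produce spurious counterexamples.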
Figure 1.3: CEGAR framework (steps: Create Initial Abstraction, Model Check, Generate Counterexample, Check Spurious Counterexample, Refine, Stop)
Counterexample-guided abstraction refinement (CEGAR) is a technique that automates the search for predicates. The technique consists of four basic steps, as shown in Figure 1.3. In the figure, we use the notations P, A(P), $\varphi$, and Cex to denote the original program, the abstracted program, the program specification and the counterexample, respectively.
• Create Initial Abstraction: Construct an abstracted program by over-approximating the behavior of the original program. The abstracted program is a finite-state program whose variables are Boolean.
• Verify abstraction: The abstracted program is finite-state, so a model checker for programs with Boolean variables is used to verify whether the abstracted program satisfies a given specification $\varphi$. If the model checker reports that the abstracted program is safe, the process finishes. Otherwise, the abstracted program does not satisfy the specification and a counterexample is returned.
• Check the counterexample: The counterexample of the abstracted program may be spurious because of over-approximation, which means that the counterexample does not correspond to a valid execution of the original program. If the counterexample turns out to be an actual counterexample, the original program does not satisfy the specification. Otherwise, the abstracted program needs a refinement.
• Refine the abstraction: Because of the existence of the spurious counterexample, it is necessary to eliminate the counterexample from the abstracted program by adding additional predicates, which represent behaviors in the abstracted program