A Parameterized Unit Test Framework Based on Symbolic Java PathFinder

Anh-Hoang Truong and Thanh-Nhan Vu

College of Technology, Vietnam National University

144 Xuan Thuy, Hanoi, Vietnam

Email: {hoangta, nhanvt.mcs07}@vnu.edu.vn

Abstract – Parameterized unit testing has recently gained a lot of attention, as it saves testing cost and is more efficient in terms of code coverage. We present a framework for running parameterized unit tests (PUTs) based on Java PathFinder (JPF) and JUnit. Our approach is based on the model checking and symbolic execution of JPF for generating standard unit tests. As a result, we achieve high structural and path coverage. The generated unit tests are automatically executed by JUnit, so programmers immediately receive assertion failures, if any. Currently, our approach mainly works with numeric and boolean data types, but it is possible to extend our framework to other data types such as strings.

Keywords – Testing, Parameterized Unit Test

I INTRODUCTION

There are many examples of the damage caused by software errors, especially now that software is ubiquitous. According to a report by the US National Institute of Standards and Technology, software failures cost the US economy $60 billion per annum, yet improvements in software testing infrastructure, which could save one-third of this cost, are still limited [21].

Software testing [1], the most commonly used technique for validating the quality of software, takes about half of the total cost of software development. One of the reasons is that testing is still done manually in various phases. Automated testing helps developers reduce the cost of producing software and increase its reliability. Numerous methods have been proposed to support and automate parts of software testing.

To produce test data, random testing randomly generates a stream of bits and sends it to the program as input parameters. The main advantage of this method is its simplicity. It also does not require special machines or computing resources. However, this approach is not efficient, as one path of a program may be executed many times while some other paths are difficult to get executed. In other words, it is difficult to achieve the adequacy criterion [5]. Symbolic execution can resolve the drawbacks of random testing. In this method, the variables that normally contain concrete values are replaced by their symbolic counterparts, which express a range of possible values using symbolic expressions. Based on these symbolic data, the model checker generates an input data set that covers all possible executions of the program. It usually requires an external solver [3] to find solutions for the symbolic expressions.

Model checking has been a popular topic for the last two decades and has recently become widely used to analyse software programs. However, model checking is hard due to the complexity of the code, and it often cannot completely analyse the program’s state space because of the large amount of memory it requires and the path explosion problem. For these reasons, many popular model checkers rely on abstractions to reduce the size of the state space [20]. However, these techniques are not well suited for handling code that manipulates complex data, as they introduce too many predicates, making the abstraction process inefficient.

Java PathFinder [2] uses model checking [4] without relying on abstraction (which cannot always achieve good code coverage), augmenting it instead with symbolic execution. It allows us to extend its capabilities by listening to events whenever a bytecode instruction is executed. We build on this feature to add an extension that runs parameterized unit tests and creates standard JUnit unit tests. We then rely on the JUnit test framework to run these generated unit tests. As a result, programmers do not have to write unit tests by hand, and the generated unit tests ensure high path coverage. This is the main contribution of this paper.

The rest of the paper is organized as follows. Section 2 discusses symbolic execution in JPF and other background that is used in the later sections. Section 3 presents related work. Section 4 gives the details of our approach. We show some experimental results in Section 5 and conclude in Section 6.

II BACKGROUND

A Unit Test and Parameterized Unit Test

Unit Test (UT) [11, 12] is a concept from traditional testing techniques. A unit test is a method that has no parameters and returns void. A UT is used to test a single unit of code. Each UT contains three parts: input values, a sequence of instructions, and assertions. A UT fails when any of its assertions is violated or an exception is thrown. The disadvantage of a UT is that it can check only some specific execution paths of a program.

Parameterized Unit Test (PUT) [7, 11, 12] is an improvement on the unit test. A PUT is a unit test with parameters, so it accepts different input values passed via those parameters. Usually these input values are generated automatically by a tool.

2009 International Conference on Knowledge and Systems Engineering

The relationship between UT and PUT is shown in Figure 1. Traditional UTs can be generalized to PUTs, and PUTs can be instantiated back into UTs.

Figure 1: The relationship between UT and PUT
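As a minimal illustration of this relationship, the sketch below contrasts a traditional UT with a PUT and shows the PUT being instantiated back into a UT by fixing its argument. The class name UtVersusPut, the method under test abs and the chosen input values are hypothetical and are not taken from the framework.

```java
// Hypothetical example: all names and values are illustrative assumptions.
final class UtVersusPut {

    // Code under test: absolute value of an int (illustrative only).
    static int abs(int v) {
        return v < 0 ? -v : v;
    }

    // Traditional UT: no parameters, returns void, fixed input, one assertion.
    static void testAbsOfMinusThree() {
        int result = abs(-3);
        if (result != 3) {
            throw new AssertionError("expected 3 but got " + result);
        }
    }

    // PUT: the same kind of test generalized over its input value.
    // The property must hold for every input that a tool supplies.
    static void testAbsIsNonNegative(int v) {
        if (v == Integer.MIN_VALUE) {
            return; // assumption: skip the one input whose negation overflows
        }
        int result = abs(v);
        if (result < 0) {
            throw new AssertionError("abs(" + v + ") is negative: " + result);
        }
    }

    // Instantiating the PUT with a concrete argument gives back an ordinary UT.
    static void testAbsIsNonNegative_minus3() {
        testAbsIsNonNegative(-3);
    }
}
```

The instantiated method has the shape of a standard unit test again (no parameters, void return type), which is the form a test runner such as JUnit expects.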

B Symbolic Execution

Symbolic execution [6] is a way of executing a program in which the program variables that contain concrete values are replaced by their symbolic counterparts, which express a range of possible values using symbolic expressions. In symbolic execution, the values of variables and the return values of programs are symbolic expressions over the symbolic inputs. During the execution of a program P, if the value of a variable depends on the input parameters, the machine calculates a symbolic value to replace the concrete value of the variable. Given a variable x, the symbolic value of x can take one of the following forms:

(a) An input symbol

(b) A formula consisting of symbolic values and operators

(c) A formula consisting of symbolic values, concrete values and operators

Operators in symbolic execution can be addition (+), subtraction (-), multiplication (*), division (/), etc. When a program is executed in symbolic mode, concrete types are replaced with the corresponding symbolic types, and concrete operations are replaced with calls to methods that implement the corresponding operations on symbolic expressions.
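The three forms of symbolic value can be pictured with a small expression type. The following is only a sketch: the types SymExpr, InputSymbol, Concrete and BinaryOp are assumptions made for illustration and are not the classes used by Symbolic JPF.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: these types are assumptions, not Symbolic JPF classes.
interface SymExpr {
    // Evaluates the expression once every input symbol has a concrete value.
    int eval(Map<String, Integer> inputs);
}

// Form (a): an input symbol such as X or Y.
final class InputSymbol implements SymExpr {
    final String name;
    InputSymbol(String name) { this.name = name; }
    public int eval(Map<String, Integer> inputs) { return inputs.get(name); }
    @Override public String toString() { return name; }
}

// A concrete value that appears inside a symbolic formula (needed for form (c)).
final class Concrete implements SymExpr {
    final int value;
    Concrete(int value) { this.value = value; }
    public int eval(Map<String, Integer> inputs) { return value; }
    @Override public String toString() { return Integer.toString(value); }
}

// Forms (b) and (c): an operator applied to two sub-expressions.
final class BinaryOp implements SymExpr {
    final char op;
    final SymExpr left;
    final SymExpr right;
    BinaryOp(char op, SymExpr left, SymExpr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    public int eval(Map<String, Integer> inputs) {
        int l = left.eval(inputs);
        int r = right.eval(inputs);
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }
    @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
}

final class SymExprDemo {
    // Symbolic value of x after "x = x + y; x = x * 2" when x = X and y = Y initially.
    static SymExpr build() {
        SymExpr x = new InputSymbol("X");
        SymExpr y = new InputSymbol("Y");
        x = new BinaryOp('+', x, y);               // replaces the concrete "x + y"
        x = new BinaryOp('*', x, new Concrete(2)); // mixes symbolic and concrete values
        return x;
    }

    // Substitutes concrete inputs into the symbolic value built above.
    static int evalAt(int xValue, int yValue) {
        Map<String, Integer> inputs = new HashMap<>();
        inputs.put("X", xValue);
        inputs.put("Y", yValue);
        return build().eval(inputs);
    }
}
```

Each concrete arithmetic operation becomes a call that builds a new expression node, which mirrors the replacement of concrete operations by methods on symbolic expressions described above.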

Figure 2 shows an example of symbolic execution. The lower part is the symbolic execution tree of the program above it. In the tree, the numbers outside the boxes are the line numbers of the statements in the program.

In symbolic execution, the state of a single-threaded program consists of three parts: the symbolic values of the expressions; a path condition (PC), which is a set of constraints that the input values must satisfy for execution to follow that path; and a program counter, which indicates the next statement to be executed.

The PC is a boolean formula over the input variables that describes which conditions must be true in the state. It accumulates the constraints that the inputs must satisfy in order for an execution to follow the particular path. When working with variables of an array type, a condition must be added to the PC to ensure that there is no out-of-bounds array access.
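To make the accumulation of a path condition concrete, the sketch below instruments the Swap program of Figure 2 by hand and records, for a given concrete input, the constraints over the input symbols X and Y that are collected along the executed path. The class SwapPathCondition, its method names and the brute-force search are assumptions made for this example; a real tool would pass the PC to a constraint solver instead.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: hand-instrumented version of the Swap program of Figure 2.
// The class and method names are assumptions made for this example.
final class SwapPathCondition {

    // Runs Swap on concrete inputs and returns the path condition of the path
    // that was taken, written over the input symbols X and Y.
    static List<String> run(int x, int y) {
        List<String> pc = new ArrayList<>();
        if (x > y) {
            pc.add("X > Y");
            x = x + y;   // symbolic value of x: X + Y
            y = x - y;   // symbolic value of y: (X + Y) - Y = X
            x = x - y;   // symbolic value of x: (X + Y) - X = Y
            if (x - y > 0) {
                // Y - X > 0 contradicts X > Y (ignoring integer overflow)
                pc.add("Y - X > 0");
                pc.add("assert(false) reached");
            } else {
                pc.add("Y - X <= 0");
            }
        } else {
            pc.add("X <= Y");
        }
        return pc;
    }

    // Brute-force stand-in for a constraint solver over a small input range:
    // returns true if some input in [-bound, bound] x [-bound, bound]
    // reaches the assertion.
    static boolean assertionReachable(int bound) {
        for (int x = -bound; x <= bound; x++) {
            for (int y = -bound; y <= bound; y++) {
                if (run(x, y).contains("assert(false) reached")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

For small inputs the search finds no input that reaches the assertion, which matches the observation that the path condition X > Y together with Y - X > 0 is unsatisfiable over the mathematical integers.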

A symbolic execution tree (SET) can be used to characterize all execution paths of a program. A symbolic execution tree of a program is a (possibly infinite) tree in which the nodes are program states during symbolic execution and the arcs are possible transitions between states. All the leaf nodes of a SET whose PC is satisfiable represent final states of the program, while the paths from the root to these nodes represent the different execution paths. Moreover, all feasible execution paths of the program are represented in the SET. Every satisfying valuation of a PC (in a leaf node) gives a real input; all such inputs drive the program along the same execution path, and the number of such concrete executions may be infinite.

public void Swap(int x, int y){
1:  if (x > y) {
2:      x = x + y;
3:      y = x - y;
4:      x = x - y;
5:      if (x - y > 0)
6:          assert(false);
    }
}

Figure 2: A simple program and its symbolic execution tree

There are two types of symbolic execution:

• Static symbolic execution: at every branching point, the PC is updated and a constraint solver [3] determines whether the corresponding path is feasible. If a path is not feasible, the execution backtracks to the previous node, so only feasible paths are executed.

• Dynamic symbolic execution: a symbolic execution technique based on dynamic program analysis, in which a program is executed many times with different values of the input parameters. First, each input parameter is given a random value and the program is executed with these values; constraints are collected during the execution, and new constraints are generated automatically from the collected ones.

Test input generators use an algorithm to generate a set of input data so that all executable paths are examined. The algorithm is sketched as follows:

The program is executed with random values of the parameters and a given depth bound on the SET (which is useful when the program contains recursion or infinite loops). First, the SET is initialized. Then the symbolic values of the variables are calculated based on the given values of the input parameters, and path constraints are generated accordingly. When a new constraint is generated, a new node is added to the SET. When the execution finishes, the path of the SET that was followed is marked as examined. With concrete values, one concrete path of the SET is examined; for each branch not taken, a new node is created and marked as non-examined. The new node stores the corresponding path constraints.

After a path is completed, a non-examined node is chosen and a new constraint is built by collecting all the constraints in the nodes on the path from the root of the SET to the chosen node. The new constraint is sent to a constraint solver, which determines one of the following:

• If the constraint is not satisfiable, another non-examined node is chosen and the above step is repeated.

• If the constraint is satisfiable, the constraint solver generates concrete values for the input parameters. These values are used for the next concrete execution.

The algorithm terminates when all possible paths of the SET have been examined.
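The loop described above can be sketched in Java as follows. This is a minimal sketch under stated assumptions: the program under test is a hypothetical two-branch method instrumented by hand, the class DynamicSymbolicSketch and its methods are illustrative names, the depth bound is omitted, and a brute-force search over a small integer range stands in for the external constraint solver.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

// Illustrative sketch of the test-input generation loop. Every name is an
// assumption, and the brute-force "solver" stands in for a real constraint solver.
final class DynamicSymbolicSketch {

    // One branch decision: a label plus a predicate over the inputs (x, y).
    static final class Constraint {
        final String label;
        final BiPredicate<Integer, Integer> holds;
        Constraint(String label, BiPredicate<Integer, Integer> holds) {
            this.label = label;
            this.holds = holds;
        }
        Constraint negate() {
            return new Constraint("!(" + label + ")", holds.negate());
        }
    }

    // Hand-instrumented program under test: returns the constraints of the path taken.
    static List<Constraint> execute(int x, int y) {
        List<Constraint> path = new ArrayList<>();
        Constraint c1 = new Constraint("x > y", (a, b) -> a > b);
        path.add(x > y ? c1 : c1.negate());
        if (x > y) {
            Constraint c2 = new Constraint("x == 2 * y", (a, b) -> a == 2 * b);
            path.add(x == 2 * y ? c2 : c2.negate());
        }
        return path;
    }

    // Brute-force solver: searches [-bound, bound] for inputs satisfying all constraints.
    static int[] solve(List<Constraint> constraints, int bound) {
        for (int x = -bound; x <= bound; x++) {
            for (int y = -bound; y <= bound; y++) {
                final int fx = x;
                final int fy = y;
                if (constraints.stream().allMatch(c -> c.holds.test(fx, fy))) {
                    return new int[] {x, y};
                }
            }
        }
        return null; // not satisfiable within the searched range
    }

    // Text key that identifies a path by the labels of its constraints.
    static String key(List<Constraint> path) {
        StringBuilder sb = new StringBuilder();
        for (Constraint c : path) {
            sb.append(c.label).append(" && ");
        }
        return sb.toString();
    }

    // Returns one concrete input per distinct path found, starting from (x0, y0).
    static List<int[]> generate(int x0, int y0, int bound) {
        List<int[]> inputs = new ArrayList<>();
        Set<String> examined = new HashSet<>();
        Deque<List<Constraint>> nonExamined = new ArrayDeque<>();
        int[] next = {x0, y0};
        while (next != null) {
            List<Constraint> path = execute(next[0], next[1]);
            if (examined.add(key(path))) {
                inputs.add(next);
                // For every branch on this path, record the branch not taken.
                for (int i = 0; i < path.size(); i++) {
                    List<Constraint> sibling = new ArrayList<>(path.subList(0, i));
                    sibling.add(path.get(i).negate());
                    nonExamined.push(sibling);
                }
            }
            // Choose non-examined nodes until the solver finds a satisfiable one.
            next = null;
            while (next == null && !nonExamined.isEmpty()) {
                next = solve(nonExamined.pop(), bound);
            }
        }
        return inputs;
    }
}
```

Calling generate(0, 0, 5) on this hypothetical program yields three inputs, one for each of its three feasible paths. The returned inputs play the role of the concrete values from which unit tests would be generated.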

All concrete values of the input parameters and the analysis information of the corresponding concrete executions (the method summary) are used for generating UTs and for reporting purposes.

Figure 3: Testing system structure

C Lazy Initialization

Lazy initialization is an algorithm for generalizing traditional symbolic execution to support advanced constructs of modern programming languages such as Java and C++. With lazy initialization, uninitialized variables are initialized when they are first accessed during the execution of the program [19].

III RELATED WORKS

Several approaches and tools have been proposed and developed to generate test cases based on symbolic execution [8, 10, 11, 13, 14, 16]. Microsoft Pex (Program EXploration), a model checker under active development, can execute parameterized unit tests (PUTs) [7, 16, 18], and it inspired this work.

Pex can generate a traditional unit test suite with high code coverage. It allows writing and executing parameterized unit tests in the .NET environment. A parameterized unit test is simply a method that takes parameters, calls the code under test, and states assertions [15]. Given a parameterized unit test written in a .NET language, Pex automatically produces a small unit test suite with high code and assertion coverage. It performs a systematic white-box program analysis: it learns the program behaviour by monitoring execution traces and uses a constraint solver to generate new test cases with different behaviours. However, the main drawbacks of Pex are that it can analyse .NET applications only and that only one PUT can be executed in each run.

Some research introduces methods to support symbolic execution in the Java environment by instrumenting Java bytecode [9, 19, 20]. In tools like JCUTE and CUTE, the original Java programs are converted to programs in a simpler intermediate language (such as SCIL or Jimple) [17] and augmented with code to support symbolic execution.

TABLE I EXAMPLES OF CONVERTING A PROGRAM FROM JAVA TO SCIL

Java                    SCIL
----------------------  ------------------------
p[i] = q[j];            t1 = p + i;
                        t2 = q + j;
                        *t1 = *t2;

assert (x > 0);         if (x > 0) goto L;
x = 10;                 ERROR;
                        L: x = 10;

if (x > 0)              if (x <= 0) goto L1;
    x = 1;              x = 1;
else                    if (true) goto L2;
    x = -1;             L1: x = -1;
y = x + y;              L2: y = x + y;

Table I shows examples of converting a program from Java to SCIL. As we can see, the array reference is changed to pointer operations in the first row, and the assert command is converted to an if statement in the second row.

The instrumented programs are then tested by the model in Figure 3. The original program is instrumented with additional code so that the Test Executor can talk with the Test Input Selector to get test data. The data is then sent to the Test Generator to produce unit tests.

IV OUR APPROACH

A Java PathFinder Structure

JPF is a special Java Virtual Machine that executes a program not just once (like a normal VM), but theoretically in all possible ways. It helps to check for property violations such as deadlocks or unhandled exceptions along all potential execution paths. If it finds an error, JPF reports the whole execution path that leads to it. Unlike a normal debugger, JPF keeps track of every step on the way to the defect, as it is augmented with the capability of storing states, matching states and backtracking during its execution.

The capability of storing and matching states lets JPF check every state before executing it. A state is only executed if it is new (it has never been executed before); otherwise JPF goes back to the nearest state that has not yet been fully explored (the backtracking capability).

The key mechanisms of Symbolic JPF are:

Bytecode Instruction Factory: by default, a standard JVM interprets bytecode instructions according to concrete execution semantics, but in Symbolic JPF, JPF's own virtual machine uses an instruction factory to replace or extend the standard concrete execution semantics of bytecodes with non-standard symbolic execution. The BCEL library (ByteCode Engineering Library) is used in the interpreting process. JPF allows the standard execution semantics to be replaced by means of a configurable InstructionFactory.

Attributes associated with the program state: JPF manages program states similarly to a standard JVM. Each state consists of a call stack, a heap that stores the values of the fields, and scheduling information. The call stack consists of stack frames corresponding to the methods that are being executed. Each stack frame stores symbolic information in its local variables and operand stack.

Attributes that are associated with the program values are called slot attributes. These attributes store the symbolic values and expressions that are created during symbolic execution.

The combination of the above mechanisms allows dynamic modification of execution semantics, i.e., changing mid-stream from system-level, concrete execution semantics to symbolic execution semantics, thus providing the integrated test generation capability described later in this paper.

Symbolic JPF also uses other mechanisms such as:

Choice generators (CG): These handle branching conditions during symbolic execution. A CG creates a non-deterministic choice in JPF’s search and adds the condition (or its negation) to the corresponding path condition. A constraint solver is used to check whether the path condition is satisfiable. Symbolic JPF uses one of two constraint solvers: Choco and IASolver.

Listeners: There are two types of listener, the search listener and the VM listener, used for printing the results of the symbolic analysis and for enabling dynamic change of execution semantics, respectively.

Native peers: This is the Model Java Interface (MJI), which helps execution in JPF interact with the lower-level virtual machine. Symbolic JPF uses MJI for modelling native libraries, e.g., to capture java.lang.Math library calls and send them to the constraint solver.

B Combine Concrete Execution and Symbolic Execution

Unit-level program testing is well supported in Symbolic JPF. Because Symbolic JPF can combine concrete execution and symbolic execution, it can execute a method with a mix of concrete and symbolic values for the input parameters.

For symbolic execution, we need to configure JPF to use the SymbolicInstructionFactory class, which interprets bytecode according to symbolic execution semantics [2]. We also need to indicate which method is to be executed, and which input parameters and global fields are concrete and which are symbolic.

In the first stages, the program is executed concretely, since all the symbolic bytecode instruction classes delegate execution to the “concrete” super-class if there are no symbolic attributes associated with the data.

A listener monitors the concrete execution of the program within JPF’s VM and starts symbolic execution the first time the method with the specified name is invoked: it computes new symbolic values and stores them in the attributes of the specified symbolic inputs (parameters and global fields).

From that point, the methods invoked by the designated method continue to process the symbolic information stored in the attributes. Once the method returns, JPF prints out the method summary.

We can write a special listener to monitor conditions on concrete variables during concrete execution. When the conditions are met, it starts symbolic execution of the corresponding method. It is also possible to switch back to concrete execution: JPF solves the constraints in the current path condition, computes the corresponding concrete values of the program variables, and continues in concrete mode.

The capability of analysing a unit with mixed concrete and symbolic inputs lets us use concrete execution of a program to set up different concrete global contexts for the unit-level symbolic analysis. It also helps to reduce the complexity of path conditions in the symbolic analysis. This is useful when we want to test a complex program for which full path coverage is impossible: the state space can be split into several sub-spaces by using some concrete parameters mixed with symbolic ones.

C PUT Generation in Symbolic JPF

The idea is to use the combination of dynamic symbolic execution and lazy initialization in Symbolic JPF to build a system that accepts a PUT and generates appropriate UTs, so that they form a test suite with high path coverage.

Symbolic JPF already supports symbolic execution with input parameters of numeric types (int, float, long, double). Based on this feature of JPF, we develop a testing framework that allows us to write PUTs for testing Java classes whose methods have input parameters of numeric types. Methods are executed on symbolic inputs that represent all possible concrete inputs. Values of variables are represented as symbolic expressions. A constraint solver processes the constraints over these symbolic expressions to generate concrete test inputs that make the execution follow the chosen path.

To run the generated UTs we need JUnit, a testing framework for the Java programming language that allows us to write and execute unit tests. After writing PUTs, we pass them to Symbolic JPF to generate concrete UTs, which are then executed automatically in JUnit.

The structure of a UT in JUnit is as follows:

public class ClassName extends TestCase {
    public ClassName(String name) {
        super(name);
    }

    protected void setUp() throws Exception {
        super.setUp();
    }

    protected void tearDown() throws Exception {
        super.tearDown();
    }

    // JUnit test methods
}

Our PUTs follow the same structure, so we do not need to rewrite the assertion library. First, we must configure Symbolic JPF so that it can execute a PUT. The configuration is set as follows:

// tell JPF to use symbolic bytecode
+vm.insn_factory.class=gov.nasa.jpf.symbc.SymbolicInstructionFactory
// use a listener for result reporting
+jpf.listener=gov.nasa.jpf.symbc.SymbolicListener
// use Math libraries
+vm.peer_packages=gov.nasa.jpf.symbc:gov.nasa.jpf.jvm
// choose a constraint solver
+symbolic.dp=iasolver
// name of the symbolic method and type of each parameter: concrete or symbolic
+symbolic.method=UnitUnderTest(sym#sym#con)
// name of the class that contains the symbolic method
Main

Symbolic JPF is a virtual machine, so the testing class must have a main method to start symbolic execution. In the main method, the testing method that needs to be executed symbolically is called with concrete values.

Testing classes have no main method, so to let these classes be executed in JPF directly we add a PUTDriver component to work around this problem. The PUTListener class allows the system to control the execution process, report results and generate UTs.
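One way such a driver could work is sketched below: a class with a main method that uses reflection to call each PUT method of a testing class with initial concrete values. This is an assumption made for illustration, not the PUTDriver implementation; the names PutDriverSketch, SamplePut and runAll are hypothetical, and the choice of 0 as the initial concrete value is arbitrary.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver sketch; it is not the PUTDriver implementation of the paper.
final class PutDriverSketch {

    // Example testing class without a main method (illustrative only).
    public static final class SamplePut {
        public int calls = 0;

        public void testSum(int x, int y) {
            calls++; // a real PUT would call the code under test and assert here
        }
    }

    // Calls every public method named test* that takes only int parameters,
    // passing 0 as the initial concrete value of each parameter.
    static List<String> runAll(Object testInstance) {
        List<String> invoked = new ArrayList<>();
        for (Method m : testInstance.getClass().getDeclaredMethods()) {
            if (!m.getName().startsWith("test") || !Modifier.isPublic(m.getModifiers())) {
                continue;
            }
            Class<?>[] types = m.getParameterTypes();
            boolean allInts = true;
            for (Class<?> t : types) {
                allInts &= (t == int.class);
            }
            if (!allInts) {
                continue;
            }
            Object[] args = new Object[types.length];
            Arrays.fill(args, 0);
            try {
                m.invoke(testInstance, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("could not run " + m.getName(), e);
            }
            invoked.add(m.getName());
        }
        return invoked;
    }

    // The entry point that a virtual machine needs in order to start execution.
    public static void main(String[] args) {
        System.out.println(runAll(new SamplePut()));
    }
}
```

With a driver of this shape, the testing class itself stays free of a main method, while the virtual machine still has a single entry point from which each PUT is first invoked concretely.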

Figure 6: Architecture of the PUT framework

Our proposed system has the architecture shown in Figure 6. The main component of the system is PUTRunner, which uses a PUTDriver to generate data for the PUTRunner. PUTListener listens to the events generated by PUTRunner in order to generate unit tests, which are then executed as standard unit tests.

V EXPERIMENT

We show here a simple method, Euclid_GCD (Euclid’s algorithm for finding the greatest common divisor of two integers), in class NeedTest that we need to test.

public int Euclid_GCD(int x, int y){
    while (x != y){
        if (x > y) x = x - y;
        if (x < y) y = y - x;
    }
    return x;
}

We write a PUT method that tests the above method as follows:

void testEuclid(int x, int y){
    int z;
    NeedTest ut = new NeedTest();
    Assume.isTrue(x>0 && x<100 && y>0 && y<100);
    z = ut.Euclid_GCD(x, y);
    Assert.assertEquals(x/(x/z), y/(y/z));
}

In this class, we need the Assume statement to limit the range of the input values. With this limitation, we stay within the working range of the Euclid GCD algorithm (all pairs of positive integers: x > 0 && y > 0) and avoid path explosion in symbolic execution (x < 100 && y < 100).
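The effect of the assumption can be checked with a plain Java sketch that runs the PUT body on every pair in a small range. The helper assume and the exception AssumptionViolated are hypothetical stand-ins: we assume here that an input violating the assumption is skipped rather than reported as a failure, which the paper does not spell out.

```java
// Illustrative sketch: assume() is a hypothetical helper, not the framework's Assume class.
final class EuclidPutCheck {

    // Thrown when an input falls outside the assumed range; such inputs are skipped.
    static final class AssumptionViolated extends RuntimeException { }

    static void assume(boolean condition) {
        if (!condition) {
            throw new AssumptionViolated();
        }
    }

    // Subtraction-based Euclid's algorithm, as in the method under test.
    // It does not terminate if exactly one argument is 0, hence the assumption.
    static int euclidGcd(int x, int y) {
        while (x != y) {
            if (x > y) x = x - y;
            if (x < y) y = y - x;
        }
        return x;
    }

    // The PUT body: returns normally if the assertion holds for (x, y).
    static void testEuclid(int x, int y) {
        assume(x > 0 && x < 100 && y > 0 && y < 100);
        int z = euclidGcd(x, y);
        if (x / (x / z) != y / (y / z)) {
            throw new AssertionError("assertion failed for " + x + ", " + y);
        }
    }

    // Runs the PUT on every pair in [lo, hi] x [lo, hi] and counts the pairs
    // that passed the assumption and the assertion.
    static int countChecked(int lo, int hi) {
        int checked = 0;
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                try {
                    testEuclid(x, y);
                    checked++;
                } catch (AssumptionViolated skipped) {
                    // input outside the assumed range: neither a pass nor a failure
                }
            }
        }
        return checked;
    }
}
```

Because z divides both x and y, both sides of the assertion reduce to z, so the property holds for every pair admitted by the assumption; pairs containing 0 or 100 are skipped before the loop in euclidGcd is entered.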

Executing this PUT produces the generated unit tests as follows:

public class GeneratedJUnits_by_PUT extends TestCase{
    private PUT cls_PUT = new PUT();
    public GeneratedJUnits_by_PUT(String name){ super(name); }
    protected void setUp() throws Exception { super.setUp(); }
    protected void tearDown() throws Exception { super.tearDown(); }
    public void testNeedTest_1_100(){
        cls_PUT.testEuclid(1, 100); }
    public void testNeedTest_1_1(){
        cls_PUT.testEuclid(1, 1); }
    public void testNeedTest_3_2(){
        cls_PUT.testEuclid(3, 2); }
    public void testNeedTest_55_34(){
        cls_PUT.testEuclid(55, 34); }
}

JUnit automatically executes these unit tests, so we immediately get assertion errors, if any. This is really useful because the generated unit tests cover all paths and we save a lot of the manual work of writing them all down, especially for large functions and projects.
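How a generated test class of this shape could be produced from the solved inputs is sketched below. The paper does not show how PUTListener emits the source, so the class UnitTestEmitter, its emit method and the naming scheme test_x_y are assumptions for illustration only; junit.framework.TestCase is the JUnit 3 base class that the generated class extends.

```java
// Hypothetical sketch of emitting JUnit 3 style test source from solved inputs;
// the paper does not show how PUTListener does this.
final class UnitTestEmitter {

    // Builds the source of a test class that calls putMethod once per input pair.
    // Assumes non-negative inputs so that the values form valid method names.
    static String emit(String className, String putClass, String putMethod, int[][] inputs) {
        StringBuilder src = new StringBuilder();
        src.append("import junit.framework.TestCase;\n\n");
        src.append("public class ").append(className).append(" extends TestCase {\n");
        src.append("    private ").append(putClass).append(" cls_PUT = new ")
           .append(putClass).append("();\n");
        for (int[] in : inputs) {
            src.append("    public void test_").append(in[0]).append('_').append(in[1])
               .append("() { cls_PUT.").append(putMethod).append('(')
               .append(in[0]).append(", ").append(in[1]).append("); }\n");
        }
        src.append("}\n");
        return src.toString();
    }
}
```

A call such as emit("GeneratedJUnits_by_PUT", "PUT", "testEuclid", new int[][] {{3, 2}, {55, 34}}) returns source text with one test method per input pair, each of which instantiates the PUT with concrete arguments.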

VI CONCLUSION

Based on Symbolic JPF and JUnit, we have proposed a parameterized testing framework for Java and implemented it as a tool. Our experiments show a working tool that can save the manual work of creating test cases and increase coverage while maintaining a compact test suite, in the sense that no two test cases make the program run along the same path. More importantly, our approach is extensible to handle branching conditions on other data types such as strings. We plan to add this functionality in the next release of the tool.

ACKNOWLEDGEMENT

This research was partly supported by Vietnam National University, Hanoi under the project QGTD.09.02. We thank Binh-Duong Tran (K50) for the initial implementation of the tool.

REFERENCES

[1] B. Beizer, “Software Testing Techniques,” Van Nostrand Reinhold Co., New York, NY, USA, 2nd edition, 1990.

[2] Corina Pasareanu, “Combining Unit-level Symbolic Execution and System-level Concrete Execution for Testing NASA Software,” ISSTA’08 paper.

[4] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled, “Model Checking,” The MIT Press, January 2000.

[5] H. Zhu, P. A. V. Hall, and J. H. R. May, “Software Unit Test Coverage and Adequacy,” ACM Computing Surveys, 29(4), ISSN 0360-0300, December 1997, pp. 366–427.

[6] J. C. King, “Symbolic execution and program testing,” Communications of the ACM, 19(7):385–394, 1976.

[7] J. de Halleux and N. Tillmann, “Parameterized unit testing with Pex (tutorial),” In Proc. of Tests and Proofs (TAP’08), volume 4966 of LNCS, pages 171–181, Prato, Italy, April 2008. Springer.

[8] Jon Edvardsson, “Techniques for Automatic Generation of Tests from Programs and Specifications,” Department of Computer and Information Science, Linkoping SE-581, Sweden, 2006.

[9] Kari Kähkönen, “Evaluation of Java PathFinder Symbolic Execution Extension,” June 2007.

[10] Michael, C.C.; McGraw, G.E.; Schatz, M.A.; Walton, C.C., “Genetic algorithms for dynamic test data generation,” Automated Software Engineering, 1997. Proceedings., 12th IEEE International Conference, 1-5 Nov. 1997, pages 307–308.

[11] N. Tillmann and W. Schulte, “Unit tests reloaded: Parameterized unit testing with symbolic execution,” IEEE Software, 23(4):38–47, 2006.

[12] N. Tillmann and W. Schulte, “Parameterized unit tests,” In Proceedings of the 10th European Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 253–262. ACM, 2005.

[13] Patrice Godefroid, “Compositional dynamic test generation,” In Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 47–54. ACM, 2007.

[14] Patrice Godefroid, Nils Klarlund, and Koushik Sen, “DART: Directed automated random testing,” In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 213–223. ACM, 2005.

[15] Peli de Halleux and Nikolai Tillmann, “Parameterized Test Patterns For Effective Testing with Pex,” Copyright Microsoft Corporation. October 21, 2008.

[16] Petri Ihantola, “Automatic test data generation for programming exercises with symbolic execution and Java PathFinder,” Master's thesis, Helsinki University of Technology, Department of Theoretical Computer Science, 2006.

[17] Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie J. Hendren, Patrick Lam, and Vijay Sundaresan, “Soot: A Java bytecode optimization framework,” In Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative Research (CASCON), page 13. IBM, 1999.

[18] S. Anand, P. Godefroid, and N. Tillmann, “Demand-driven compositional symbolic execution,” In Proc. of TACAS’08, volume 4963 of LNCS, pages 367–381. Springer, 2008.

[19] Sarfraz Khurshid, Corina S. Pasareanu, and Willem Visser, “Generalized symbolic execution for model checking and testing,” In 9th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), volume 2619 of Lecture Notes in Computer Science, pages 553–568. Springer, 2003.

[20] Saswat Anand, C. S. Pasareanu, and W. Visser, “JPF-SE: A symbolic execution extension to Java PathFinder,” In Proc. of the 13th TACAS Conference, 2007.

[21] Willem Visser, Corina S. Pasareanu, and Sarfraz Khurshid, “Test input generation with Java PathFinder,” In Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pages 97–107. ACM, 2004.
