A SOFTWARE APPROACH FOR LOWER POWER CONSUMPTION
Bui Ngoc Hai
Faculty of Information Technology
University of Engineering and Technology
Vietnam National University, Hanoi
Supervised by
Assoc. Prof. Dr. Nguyen Ngoc Binh
A thesis submitted in fulfillment of the requirements for the degree of
Master of Science in Computer Science
April 2014
ORIGINALITY STATEMENT
‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’
Hanoi, April 25th, 2014
Signed
ABSTRACT
Optimizing power consumption is an important topic in embedded system engineering, especially for embedded systems that run on battery power. Power optimization can be achieved by software techniques, and instruction scheduling is an effective software approach for reducing the power cost of the processor(s). In this thesis, we propose using a genetic algorithm for low power instruction scheduling. Our algorithm is applied to each basic block of assembly code to generate a lower power program. In the experiments, we use two open source simulation tools, the SimpleScalar Tool Set and SimplePower: the algorithm is applied to assembly programs of the SimpleScalar instruction set, and these programs are compiled and then have their power consumption measured by SimplePower. The experimental results show the effectiveness of the proposed method. In future work, this scheduling method will be combined with the idea of reducing memory access for low power design.
First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Dr. Nguyen Ngoc Binh, for giving me the opportunity to work with him and for his patient guidance and continuous support throughout the years.
I would like to give my honest appreciation to my colleagues at the Laboratory of Embedded Systems for their great support. I would also like to thank all my friends who gave me moral support during this work.
Finally, this thesis would not have been possible without the moral support and love of my parents and my brother. Thank you!
Table of Contents
Chapter 1 Introduction
1.1 Software power optimization
1.2 Power optimization by instruction scheduling
1.3 Our work
1.4 Thesis organization
Chapter 2 Related Work
2.1 Software power estimation
2.2 Energy code driven generation for low power
2.3 Reducing memory access
2.4 Software power optimization using symbolic algebra
2.5 List scheduling for low power
2.6 Instruction scheduling to reduce switching activity
2.7 Low power instruction scheduling as traveling salesman problem
2.8 Force-directed scheduling for low power
2.9 Instruction scheduling to reduce the off-chip power
2.10 Energy-oriented and performance-oriented combination scheduling
2.11 Criticality-directed and Uncriticality-directed instruction scheduling for low power
2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm
Chapter 3 Instruction Scheduling for Low Power
3.1 Problem description
3.2 Partitioning Basic Blocks of assembly code
3.3 Data Flow Graph construction
3.4 Generating Power Dissipation Table
4.2 Topological sorting
4.3 Representation of chromosome
4.4 Cross Over operator
4.5 Mutation operator
4.6 Fitness function
4.7 Genetic Algorithm for low power scheduling
Chapter 5 Experiments
5.1 SimpleScalar tool set
5.2 SimplePower simulator
5.3 Experimental benchmarks set
5.4 Experimental results
5.5 Analysis and evaluation
Chapter 6 Conclusion and Future Work
References
Appendix A Some important source code
Appendix B Source code of benchmark programs
Appendix C Power Dissipation Table
Appendix D An example of scheduling a basic block
List of Figures
Figure 2.1 List scheduling for low power
Figure 3.1 Flow of low power instruction scheduling
Figure 3.2 An example of a Basic Block and its Data Flow Graph
Figure 3.3 Examples of Basic Blocks
Figure 3.4 Algorithm to construct a DFG
Figure 3.5 PDT generation example
Figure 4.1 Topological sorting with random priorities assignment
Figure 4.2 Chromosome representation
Figure 4.3 Cross over operator
Figure 4.4 Cross over operator example
Figure 4.5 Mutation operator
Figure 4.6 Genetic algorithm for low power scheduling
Figure 5.1 Experimental framework
Figure 5.2 SimpleScalar simulator software architecture
Figure 5.3 SimplePower result example
List of Tables
Table 3.1 Instruction set architecture
Table 5.1 Experimental benchmark set
Table 5.2 Experimental results of GA scheduling
Table 5.3 Experimental results of list scheduling
Table 5.4 Results comparison of two algorithms
ISA Instruction Set Architecture
PDT Power Dissipation Table
PISA Portable Instruction Set Architecture
PSO Particle Swarm Optimization
RAW Read after Write
RTL Register Transfer Level
TSP Travelling Salesman Problem
WAR Write after Read
WAW Write after Write
List of Notations
E_p Energy (power) of a program
B_i Power base cost of instruction i
O_i,j Overhead cost between two instructions i and j
n Number of instructions in a basic block
N_i Number of times i gets executed
N_i,j Number of times the pair (i,j) gets executed
E_k Energy cost of other effects of the program
P(t) Population at the loop t
x_i^t Solution i at the loop t
Chapter 1
Introduction
In embedded system engineering, optimization is an important problem. Embedded systems always have limited resources such as memory size, processor speed, and power supply. Optimization makes the system work more efficiently within the available resources. Optimizing power consumption is an important issue, especially for embedded systems that run on batteries. Since embedded devices are often portable and powered by DC cells, reducing power consumption helps prolong the operating life of such systems. Today, embedded devices are becoming more popular in daily life as well as in science and technology. Many people are attracted to popular embedded devices such as smart phones, tablet computers, and MP3 players. Ubiquitous devices will need a longer battery life. As the ability to reduce the power consumption of an embedded device becomes more important, this optimization problem has become a major challenge for designers and manufacturers. They must continually improve product quality to meet the needs of users, and low power consumption is a necessary requirement.
1.1 Software power optimization
Software controls most of the hardware activity in a system; therefore, it can have a significant effect on the system's power dissipation. Power can thus be optimized by software techniques. Many software techniques have been proposed for optimizing the power consumption of the processor, including instruction scheduling [1-10], reducing memory access, energy cost driven code generation, and instruction packing [2-3,11]. Since the order of instructions controls internal switching in the processor, it affects the power drawn by the processor during execution. Therefore, choosing a suitable order of instructions can reduce the power consumption of the system. In terms of energy consumption, memory accesses are more expensive than register accesses, so optimal register allocation that reduces the number of memory operands can also reduce power. If a table of power costs of individual instructions is available, the total power cost can be reduced by using a code generator that selects suitable low power instructions. Another method is optimizing program source code for low power. An example of this method is proposed in [12], where the authors optimize C code by approximating complex expressions with simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. Other techniques include software-controlled voltage and frequency scaling for dynamic power optimization [13].
1.2 Power optimization by instruction scheduling
One of the main factors that affect the power consumption of a system is how the assembly instructions are scheduled or combined: the power consumed during the execution of an instruction depends on the previous instruction. For a given program in C or another high level language, there can be more than one sequence of instructions (assembly code) for a given processor. Therefore, a suitable order of instructions in a program can result in lower power consumption. Instruction scheduling for low power is an effective software approach for power optimization; it reorders the assembly instructions of a program so that power consumption is reduced while the semantics of the program are preserved. Many scheduling techniques have been proposed, most of which aim to reduce the overhead cost between pairs of instructions [2-5,7-9,11]. Moreover, some techniques have other objectives for power reduction, such as reducing switching activity [6,22] and exploiting the critical path [14].
1.3 Our work
In this thesis, we introduce our method for optimizing power consumption by instruction scheduling. Scheduling is an NP-hard problem; the main difficulty is that the search space of possible instruction orders is very large. When searching for a good schedule, the usual stumbling block is satisfying the constraints of the data flow graph, i.e., when we create a new order of instructions, we are not sure whether it satisfies the data flow graph or not. Hence, our approach uses a genetic algorithm with a chromosome encoding that handles the data dependency problem better. This method was introduced in [15], where the authors proposed it to solve the traveling salesman problem (TSP). We apply their method to the scheduling problem in order to reduce energy consumption. This algorithm has the advantage of being less prone to getting trapped in a local optimum. To find solutions in the large search space, we use a heuristic table, called the Power Dissipation Table (PDT), which is generated by power simulations. A PDT for an instruction set with n instructions is an (n × n) matrix, where each entry PDT(i,j) is the power consumed in the execution of instruction i followed by instruction j. Each entry is used as the overhead cost between i and j, and this table is used for evaluating solutions. The original assembly programs are divided into basic blocks, and then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that represents the data dependencies of the instructions in a basic block. Our algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs a low power instruction sequence. For the experiments, we use two open source simulation tools: the SimpleScalar Tool Set [16-18] and SimplePower [19]. A subset of the SimpleScalar instruction set is considered, and SimplePower is used to simulate the power consumption. The algorithm is applied to assembly programs of the SimpleScalar ISA. Then these programs are compiled and their power consumption is measured by SimplePower.
1.4 Thesis organization
The remainder of this thesis is organized as follows. Chapter 2 introduces related work on software power optimization and instruction scheduling for low power. Chapter 3 describes the steps of our low power instruction scheduling problem in detail. Chapter 4 presents the proposed approach, based on a genetic algorithm, for this problem. Chapter 5 reports the simulation tools, the benchmarks used in the experiments, the experimental results and their analysis. Chapter 6 presents our conclusions and introduces future work.
Chapter 2
Related Work
This chapter summarizes related research on software power optimization.
2.1 Software power estimation
The first step in power optimization is power estimation. The ability to estimate software power consumption helps to verify that a design meets its power constraints and to verify the correctness of power optimization methods.
V. Tiwari [1,20-21] was the first researcher to propose an energy estimation model for a processor, and he also proposed the idea of scheduling assembly code for low power consumption. In his model, each instruction in the instruction set architecture consumes a fixed energy cost called the base energy cost. The base energy cost is computed as the product of the voltage and the average current in the processor while running a loop with a sequence of the same instruction. The other main component of the model is the inter-instruction effects. These include the effect of circuit state, the effect of resource constraints (e.g., pipeline stalls and write buffer stalls), and the effect of cache misses. The circuit state overhead (overhead cost) between a pair of instructions is the difference between the actual cost of the pair and the average of the base costs of the individual instructions.
The total power of a program is calculated as the sum of the base energy costs of all instructions and all inter-instruction effects. The total power cost E_p of an assembly program is given by equation (2.1):
$$E_p = \sum_{i} (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_{k} E_k \qquad (2.1)$$
where B_i is the base cost of instruction i, N_i is the number of times it is executed, and, for every pair of consecutive instructions (i,j), O_{i,j} is the circuit state overhead between i and j and N_{i,j} is the number of times this pair is executed. E_k is the energy of the other inter-instruction effects, such as pipeline stalls, write buffer stalls and cache misses, that occur during program execution. Our scheduling method uses the energy model proposed by Tiwari for finding solutions.
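As a minimal sketch of how the model above can be evaluated, the following Python fragment counts instruction and pair occurrences in an executed trace and sums the three terms. The function name `program_energy`, the cost tables and the trace are hypothetical values chosen for illustration; they are not measurements from this thesis.

```python
from collections import Counter

def program_energy(trace, base_cost, overhead, other_effects=0.0):
    """Estimate E_p for an executed instruction trace (sketch of the model above)."""
    # N_i: how many times each instruction i is executed.
    n_i = Counter(trace)
    # N_{i,j}: how many times instruction i is immediately followed by j.
    n_ij = Counter(zip(trace, trace[1:]))
    base = sum(base_cost[i] * count for i, count in n_i.items())
    inter = sum(overhead.get(pair, 0.0) * count for pair, count in n_ij.items())
    # other_effects stands for the sum of the E_k terms (stalls, cache misses).
    return base + inter + other_effects

# Hypothetical costs, for illustration only.
BASE_COST = {"addu": 1.0, "lw": 2.0, "sw": 2.5}
OVERHEAD = {("addu", "lw"): 0.3, ("lw", "addu"): 0.2, ("lw", "sw"): 0.4}

trace = ["lw", "addu", "lw", "sw"]
energy = program_energy(trace, BASE_COST, OVERHEAD, other_effects=0.5)
```

Only the middle term depends on the order of the instructions, which is why reordering can lower the total without changing what the program computes.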
2.2 Energy code driven generation for low power
The code generation method was proposed by V. Tiwari et al. [2-3]. This approach selects instructions based on their power cost. The main idea is that the total power cost can be reduced by using a code generator that selects suitable low power instructions.
2.3 Reducing memory access
V. Tiwari et al. also proposed a memory operand reduction approach in [2-3]. This approach is based on the fact that instructions with memory operands have a very high energy cost compared to instructions with register operands. Therefore, a large energy reduction can be obtained by reducing the number of memory operands, and efficient register management can bring this benefit by replacing memory access instructions with register access instructions in such a way that the semantics of the program are not changed.
2.4 Software power optimization using symbolic algebra
In [13], the authors optimized C code to reduce power cost. The main idea is approximating complex expressions by simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. This work is a sequence of techniques. First, an energy profiler is used to find all energy critical code blocks. Second, a tool is used to transform floating-point data to fixed-point data. Third, complex nonlinear arithmetic expressions are approximated by polynomials. Finally, the polynomial representations of the critical basic blocks are mapped to the instruction set using symbolic algebra.
2.5 List scheduling for low power
Instruction scheduling for low power was first presented by V. Tiwari [1-3,11], together with the instruction level power model and the idea of reordering assembly instructions. Instruction scheduling for low power aims to reduce the circuit state overhead (overhead cost), which is the energy dissipated due to switching from the execution of one instruction to another. His research indicated that instructions can be reordered to have a smaller amount of circuit state overhead; therefore, lower power can be obtained with a suitable order of instructions. V. Tiwari used the list scheduling algorithm [11] for his experiments. It is a basic scheduling algorithm with a greedy strategy. In fact, list scheduling is just a simple topological sorting algorithm. The algorithm operates on each basic block of instructions. Starting from a data flow graph that represents the data dependencies of a basic block, at each step it chooses from the priority list the instruction with the lowest overhead cost (the circuit state overhead between the previously scheduled instruction and the candidate); the priority list contains all instructions that do not depend on any unscheduled instruction. This simple algorithm is used to show the potential for power reduction by reordering assembly code. List scheduling is shown in Fig. 2.1.
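The greedy strategy described above can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in [11]: the function name `list_schedule`, the overhead values and the example block are hypothetical, and the first instruction is taken as the lowest-numbered ready vertex rather than a random one so that the result is reproducible.

```python
def list_schedule(opcodes, arcs, overhead):
    """Greedy list scheduling of one basic block.

    opcodes:  opcode name of each instruction (index = vertex id)
    arcs:     set of (i, j) pairs meaning i must be scheduled before j
    overhead: dict mapping (previous opcode, next opcode) to a cost
    """
    preds = {v: set() for v in range(len(opcodes))}
    succs = {v: set() for v in range(len(opcodes))}
    for i, j in arcs:
        preds[j].add(i)
        succs[i].add(j)
    # The ready list holds every vertex whose predecessors are all scheduled.
    ready = sorted(v for v in preds if not preds[v])
    schedule = []
    while ready:
        if schedule:
            prev = opcodes[schedule[-1]]
            # Pick the candidate with the smallest overhead after `prev`;
            # the vertex id breaks ties deterministically.
            chosen = min(ready, key=lambda v: (overhead.get((prev, opcodes[v]), 0.0), v))
        else:
            chosen = ready[0]  # assumption: deterministic instead of random
        ready.remove(chosen)
        schedule.append(chosen)
        for s in sorted(succs[chosen]):
            preds[s].discard(chosen)
            if not preds[s]:
                ready.append(s)
    return schedule

# Hypothetical block: 1 depends on 0, and 3 depends on 1 and 2.
OPCODES = ["lw", "addu", "lw", "sw"]
ARCS = {(0, 1), (1, 3), (2, 3)}
OVERHEAD = {("lw", "lw"): 0.1, ("lw", "addu"): 0.5, ("addu", "lw"): 0.4,
            ("addu", "sw"): 0.2, ("lw", "sw"): 0.3}
schedule = list_schedule(OPCODES, ARCS, OVERHEAD)
```

Because each choice only looks one instruction ahead, the result is a valid topological order but not necessarily the one with the lowest total overhead.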
2.6 Instruction scheduling to reduce switching activity
In [6], C.-L. Su et al. proposed a cold scheduling algorithm to reduce the switching activity of the processor. The authors suggested a memory addressing method using Gray code, in which the representations of consecutive numbers differ in exactly one bit. Gray code addressing reduces the number of bit switches on the address buses, which leads to a large reduction in power consumption because most program instructions access consecutively addressed memory locations. The cold scheduling technique is a software approach based on a traditional list scheduling algorithm that reduces the switching activity of the control path.
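The one-bit property of Gray code is easy to check numerically. The sketch below uses the standard binary-reflected Gray code and counts the bit flips on a hypothetical 4-bit address bus that steps through 16 consecutive addresses; the helper names are chosen for illustration and the example is not taken from [6].

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def bit_flips(a: int, b: int) -> int:
    """Number of bus lines that toggle when the value changes from a to b."""
    return bin(a ^ b).count("1")

# Walk through addresses 0..15 in order and count the toggles on the bus.
binary_flips = sum(bit_flips(a, a + 1) for a in range(15))
gray_flips = sum(bit_flips(to_gray(a), to_gray(a + 1)) for a in range(15))
```

For this sequential walk, plain binary addressing toggles 26 bits in total, whereas Gray code addressing toggles exactly one bit per step, 15 in total.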
    Input: List L, a DFG of a basic block;
    For each vertex v in DFG
        if v has no predecessor, add v to L;
    Choose a random vertex i from L;
    Schedule i, then remove i from L, remove i and all arcs of i from DFG;
    Update L by adding new vertices with no predecessor;
    While L is not empty do
        Choose the highest priority j in L;
            (overhead cost between the previous instruction and j is smallest)
        Schedule j;
        Remove j from L, remove j and all arcs of j from DFG;
        Update L by adding new vertices with no predecessor;
    End_While
    Return schedule

Figure 2.1 List scheduling for low power

2.7 Low power instruction scheduling as a traveling salesman problem
K. Choi et al. [4] presented another method by formulating the instruction scheduling problem as a traveling salesman problem (TSP). They used a minimum spanning tree and a simulated annealing technique to search for optimal solutions. The scheduling technique uses a power dissipation table (PDT), which is an (n × n) matrix where each entry (i,j) is the average power consumed when instruction i is followed by instruction j. The scheduling algorithm uses a control flow and data dependency graph for each basic block together with the PDT. The problem of instruction reordering for low power is transformed into finding the tour of lowest cost (TSP) in the constraint graph.
2.8 Force-directed instruction scheduling for low power
P. Dongale [5], in his master's thesis, proposed the force-directed instruction scheduling for low power (FP-ISLP) algorithm. This is an application of the classic force-directed scheduling algorithm to the low power problem. Like Choi's method above, this method also uses a PDT as a heuristic table for finding a good instruction order for low power.
2.9 Instruction scheduling to reduce the off-chip power
Another low power scheduling technique was presented by H. Tomiyama et al. [22]. This method aims to reduce off-chip power by reducing the bit switching on the off-chip buses. The scheduling algorithm attempts to find an optimal instruction order for each basic block by decreasing the difference between the binary representations of two consecutive instructions in memory. The power consumption is then reduced because the number of transitions on the data bus is minimized.
2.10 Energy-oriented and performance-oriented combination scheduling
A. Parikh et al. [7-8] proposed performance-oriented scheduling, energy-oriented scheduling, and a method that combines these two approaches. Performance-oriented scheduling uses the list scheduling algorithm with time as the objective parameter. The energy-oriented approach also uses list scheduling, but with circuit-state overhead (inter-instruction effect) as the objective parameter. In energy-oriented scheduling, at each step the scheduler selects the next node with the least circuit-state overhead. In the combined method, the selections of the scheduler are mainly based on one parameter, and the other parameter is considered only when there is a tie; specifically, the decisions are based on circuit-state overhead, and the time parameter is used only when the overhead values are tied.
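The tie-breaking rule of the combined method amounts to a lexicographic comparison, which can be sketched as follows. The function name `pick_next`, the overhead values and the per-instruction time values are hypothetical; how [7-8] actually derive the time parameter is not described here, so a fixed number per opcode is assumed.

```python
def pick_next(ready, prev_op, overhead, time_cost):
    """Select the next instruction: least overhead first, time only on ties."""
    # Tuples compare element by element, so time_cost only matters when
    # two candidates have exactly the same overhead after prev_op.
    return min(ready, key=lambda op: (overhead[(prev_op, op)], time_cost[op]))

# Hypothetical values, for illustration only.
OVERHEAD = {("sw", "lw"): 0.2, ("sw", "mult"): 0.2, ("sw", "addu"): 0.5}
TIME_COST = {"lw": 2, "mult": 4, "addu": 1}

choice = pick_next(["mult", "lw", "addu"], "sw", OVERHEAD, TIME_COST)
```

Here `lw` and `mult` tie on overhead, so the smaller time value decides in favor of `lw`, even though `addu` would be fastest on its own.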
2.11 Criticality-directed and uncriticality-directed instruction scheduling for low power
Two methods called criticality-directed and uncriticality-directed instruction scheduling for low power were proposed by S. Watanabe et al. [14]. The critical path is the longest path in a data flow graph (DFG). Instructions on a critical path determine the execution time of the program. These are called critical instructions. The first algorithm is based on the idea that the functional units in the processor differ in performance and power consumption. Only critical instructions are scheduled on fast and power-hungry units and the rest are scheduled on the slow and power-efficient ones, so the total power consumption can be reduced. In contrast, for the second algorithm, instead of finding the critical instructions, the authors proposed a method to exploit uncritical ones. Only uncritical instructions are scheduled on power-efficient units, and energy consumption can be reduced.
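To make the notion of critical instructions concrete, the following sketch finds the longest path in a small DFG by memoized recursion. The function name `critical_path`, the graph and the latency values are hypothetical, and the sketch only identifies the critical path; it does not reproduce the unit assignment of [14].

```python
from functools import lru_cache

def critical_path(n, arcs, latency):
    """Return (length, path) of the longest latency-weighted path in a DAG."""
    succs = {v: [] for v in range(n)}
    for i, j in arcs:
        succs[i].append(j)

    @lru_cache(maxsize=None)
    def longest_from(v):
        # Longest path that starts at v: v itself plus the best continuation.
        tail = max((longest_from(s) for s in succs[v]),
                   key=lambda p: p[0], default=(0, ()))
        return (latency[v] + tail[0], (v,) + tail[1])

    return max((longest_from(v) for v in range(n)), key=lambda p: p[0])

# Hypothetical DFG: 0 -> 1 -> 3 -> 4 and 2 -> 3, with instruction 0 taking 2 cycles.
ARCS = [(0, 1), (1, 3), (2, 3), (3, 4)]
LATENCY = [2, 1, 1, 1, 1]
length, path = critical_path(5, ARCS, LATENCY)
uncritical = sorted(set(range(5)) - set(path))
```

In this example instructions 0, 1, 3 and 4 are critical, and instruction 2 is the only uncritical one, so it is the candidate for a slower, power-efficient unit.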
2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm
C. Nian et al. [9] introduced a scheduling method based on the Particle Swarm Optimization (PSO) algorithm to reduce signal transitions. The authors map the low power instruction scheduling problem to a discrete PSO problem and modify the PSO solution so that it satisfies the data flow graph, in order to find a better instruction order with fewer signal transitions; the velocity updating formula is also improved in order to get better results.
2.13 Summary
In this chapter, we presented some existing work on software power optimization, focusing mainly on the instruction scheduling approach. The first step in optimizing the power dissipation of a program is power estimation. Software power consumption can then be reduced by reducing memory access, by energy cost driven code generation, by optimizing source code, and by instruction scheduling.
Chapter 3
Instruction Scheduling for Low Power
This chapter describes the low power instruction scheduling problem and explains the steps needed to solve it.
3.1 Problem description
Our scheduling problem involves the following steps:
- Divide the assembly program code into basic blocks
- Construct a Data Flow Graph for each basic block
- Apply the scheduling algorithm to each basic block
The original assembly programs are divided into basic blocks, then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that represents the data dependencies of the instructions in a basic block. The scheduling algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs a low power instruction sequence. Scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order such that the total cost along all vertices is the smallest, or at least smaller than that of the original order; the cost between two consecutive vertices is the overhead cost between them. The flow diagram of our instruction scheduling problem is shown in Fig. 3.1.
Figure 3.1 Flow of low power instruction scheduling
A basic block (BB) is a piece of code that has only one entry point and one exit point. One entry point means that no instruction within it, other than the first, is the destination of a jump instruction anywhere in the program. One exit point means that only the last instruction can cause the program to begin executing instructions in a different basic block.
A Data Flow Graph (DFG) is a graph that represents the data dependencies of the instructions in a basic block. It is a directed acyclic graph in which each instruction of the basic block is represented by a vertex and each arc represents the dependency between a pair of instructions. Fig. 3.2 shows an example of a basic block and its DFG.
[Figure 3.1 flow: Source Code -> Assembly Code -> Divide into Basic Blocks -> Construct Data Flow Graph -> Apply Scheduling Algorithm (using the Power Dissipation Table) -> Scheduled Assembly Code]
Figure 3.2 An example of a Basic Block and its Data Flow Graph
We cannot measure the overhead cost between each pair of instructions directly, but we can measure the energy consumption of each pair of instructions. This measurement includes the base energy cost of each instruction and the overhead cost between them. By measuring the power consumption of pairs in this way, we build a Power Dissipation Table (PDT). This is a matrix in which each element (i, j) represents the power dissipation when instruction i is followed by instruction j. This table is used instead of the overhead cost table.
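The following sketch shows one way a candidate instruction order could be checked against the DFG and scored with a PDT, under the assumption that the cost of an order is the sum of the PDT entries of its consecutive pairs. The function names, the dependency arcs and the PDT values are hypothetical and are not taken from the table in Appendix C.

```python
def is_valid_order(order, arcs):
    """True if every arc (i, j) of the DFG has i placed before j."""
    position = {v: k for k, v in enumerate(order)}
    return all(position[i] < position[j] for i, j in arcs)

def schedule_cost(order, opcodes, pdt):
    """Sum of PDT entries over consecutive instruction pairs (assumed cost)."""
    return sum(pdt[opcodes[a]][opcodes[b]] for a, b in zip(order, order[1:]))

# Hypothetical block and PDT values, for illustration only.
OPCODES = ["lw", "addu", "lw", "sw"]
ARCS = {(0, 1), (1, 3), (2, 3)}
PDT = {
    "lw":   {"lw": 4.1, "addu": 3.6, "sw": 4.8},
    "addu": {"lw": 3.5, "addu": 2.2, "sw": 3.9},
    "sw":   {"lw": 4.7, "addu": 3.8, "sw": 5.0},
}

original = [0, 1, 2, 3]
reordered = [0, 2, 1, 3]
cost_original = schedule_cost(original, OPCODES, PDT)
cost_reordered = schedule_cost(reordered, OPCODES, PDT)
```

Both orders respect the arcs, and with these made-up numbers the reordered sequence scores 11.6 against 11.9 for the original, which is the kind of difference the scheduler searches for.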
3.2 Partitioning Basic Blocks of Assembly code
The source programs are written in C and compiled by the SimpleScalar compiler (ssbig-na-sstrix-gcc) to obtain assembly code for the SimpleScalar ISA. Then, these assembly programs are divided into basic blocks. The algorithm [23] for generating basic blocks from a listing of code is simple; it is described as follows.
First, find the set of leaders in the code. A leader is the first instruction of a basic block. An instruction is a leader if it falls into one of the following three categories:
- The first instruction of the program is a leader
- An instruction that is the target (label) of a branch/jump instruction is a leader
- An instruction that follows a branch/jump instruction is a leader
For each leader, its basic block consists of the leader and all following instructions up to, but not including, the next leader. Because the control can never pass through the end of a basic block, some block boundaries may have to be modified after finding the basic blocks. For example, labels, jump instructions and assembly directives are not taken into account in basic blocks.
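The leader rules above can be sketched as a single pass over an assembly listing. This is a simplified illustration, not the tool used in this thesis: the function name `split_basic_blocks` and the set of branch/jump opcodes are assumptions, every label is treated as a possible branch target, and, as described above, labels, branch/jump instructions and directives are left out of the resulting blocks.

```python
def split_basic_blocks(lines, branch_ops):
    """Split an assembly listing into basic blocks of ordinary instructions."""
    blocks = [[]]
    for line in lines:
        text = line.strip()
        if not text or text.startswith((".", "#")):
            continue                  # blank line, directive, or comment
        if text.endswith(":"):
            blocks.append([])         # label: the next instruction is a leader
        elif text.split()[0] in branch_ops:
            blocks.append([])         # the instruction after a branch is a leader
        else:
            blocks[-1].append(text)   # ordinary instruction joins the open block
    return [block for block in blocks if block]

# Hypothetical listing and opcode set, for illustration only.
BRANCH_OPS = {"bne", "beq", "j", "jal"}
LISTING = [
    "main:",
    "    addu $2,$0,1",
    "    move $3,$2",
    "$L5:",
    "    addu $2,$3,1",
    "    move $3,$2",
    "    bne $2,$4,$L5",
    "    subu $sp,$sp,16",
    "    j $31",
]
blocks = split_basic_blocks(LISTING, BRANCH_OPS)
```

The listing yields three blocks: the two instructions after `main:`, the two after `$L5:`, and the single `subu` that follows the conditional branch.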
Fig. 3.3 shows an example of some basic blocks:

    $L5:
        addu $2,$3,1
        move $3,$2

Figure 3.3 Examples of Basic Blocks
3.3 Data Flow Graph construction
A data flow graph is constructed for each basic block. This graph represents the data dependencies among the instructions in the given basic block. To construct the data flow graph, we need to understand the instruction set architecture and the data dependencies between its instructions. An instruction j is dependent on a previous instruction i if instruction j shares a register or a memory location with instruction i, and therefore j cannot be executed until instruction i has completed execution and its result has been written back.
There are three types of data dependencies:
- Read After Write (RAW): instructions i and j have a RAW dependency if i writes to a register or memory operand and after that, j reads from this location
- Write After Read (WAR): instructions i and j have a WAR dependency if i reads from a register or memory operand and after that, j writes to this location
- Write After Write (WAW): instructions i and j have a WAW dependency if they write to the same register or memory location
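The three dependency types above reduce to set intersections between the locations each instruction reads and writes. The sketch below assumes each instruction has already been decoded into hypothetical `reads` and `writes` sets, and it compares every pair of instructions, which is a simpler and more conservative variant than an algorithm that stops at the first WAR arc.

```python
def dependencies(first, second):
    """Kinds of dependency that force `second` to stay after `first`."""
    kinds = set()
    if first["writes"] & second["reads"]:
        kinds.add("RAW")   # second reads what first wrote
    if first["reads"] & second["writes"]:
        kinds.add("WAR")   # second overwrites what first read
    if first["writes"] & second["writes"]:
        kinds.add("WAW")   # both write the same location
    return kinds

def build_dfg(block):
    """Map each dependent pair (i, j), i before j, to its dependency kinds."""
    arcs = {}
    for i in range(len(block)):
        for j in range(i + 1, len(block)):
            kinds = dependencies(block[i], block[j])
            if kinds:
                arcs[(i, j)] = kinds
    return arcs

# Hypothetical decoded block, for illustration only.
BLOCK = [
    {"text": "addu $2,$3,1",  "writes": {"$2"}, "reads": {"$3"}},
    {"text": "move $3,$2",    "writes": {"$3"}, "reads": {"$2"}},
    {"text": "addu $2,$4,$5", "writes": {"$2"}, "reads": {"$4", "$5"}},
]
dfg = build_dfg(BLOCK)
```

In this block the first two instructions are linked by both a RAW and a WAR dependency, the first and third by a WAW dependency, and the second and third by a WAR dependency, so no reordering is possible.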
The algorithm to construct a data flow graph of a given basic block is shown in Fig. 3.4.

    For each instruction i
    Begin
        For each instruction j after i
        Begin
            If i and j have WAW then create arc ij
            If i and j have RAW then create arc ij
        End
        For each address read by i
            For each instruction j after i
            Begin
                if i and j have WAR then create arc ij, and break;
            End
    End

Figure 3.4 Algorithm to construct a DFG
3.4 Generating Power Dissipation Table
We use a subset of the SimpleScalar ISA for our experiments; it includes the 16 instructions listed in Table 3.1 below:
Table 3.1 Instruction set architecture
No    Instruction    Description                  Example
2     subu           Subtract unsigned integer    subu $sp,$sp,16
...   ...            ... a subroutine             jal BubbleSort
This subset of the instruction set architecture covers all the benchmark programs used in the experiments. Among the 16 instructions, there are 4 jump/branch instructions that can be ignored, so we have to generate a PDT with 12 × 12 elements.
To create the PDT, we do the following: each element (i, j) of the table is the power measured when instruction i is followed by instruction j and then by a nop instruction, with this sequence repeated 20,000 times to avoid loop overheads. An example of PDT generation is described in Fig. 3.5. The SimpleScalar Tool Set [16-18] and SimplePower [19] are used. We use the SimpleScalar ISA to write the assembly programs that create the PDT, and the SimpleScalar compiler ssbig-na-sstrix-gcc is used to compile them. SimplePower is a power simulator for the SimpleScalar ISA; it is used to measure the elements of the PDT and to measure the power consumption of assembly programs. We automatically create 144 assembly files, each corresponding to one pair of instructions. These files are named in alphabetical order so that they can be processed automatically by SimplePower. Running all these programs, we obtain the PDT.
Figure 3.5 PDT generation example
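The automatic creation of one program per instruction pair might look like the following sketch. Everything specific in it is an assumption made for illustration: the function names, the `pdt_RR_CC.s` naming scheme, the example operands, and the bare `main:` label. A real input for the SimpleScalar compiler would also need assembler directives and an exit sequence, which are omitted, and the sketch only builds the program text; it does not invoke any tool.

```python
from itertools import product

def pair_program(instr_i, instr_j, repeats=20000):
    """Assembly text that repeats the sequence: instr_i, instr_j, nop."""
    body = [instr_i, instr_j, "nop"] * repeats
    return "\n".join(["main:"] + ["\t" + line for line in body]) + "\n"

def pair_files(examples, repeats=20000):
    """Map a sortable file name to the program text of every ordered pair."""
    names = sorted(examples)
    files = {}
    for row, col in product(range(len(names)), repeat=2):
        # Zero-padded indices keep alphabetical order equal to matrix order.
        file_name = f"pdt_{row:02d}_{col:02d}.s"
        files[file_name] = pair_program(examples[names[row]],
                                        examples[names[col]], repeats)
    return files

# Hypothetical operand choices for three opcodes, for illustration only.
EXAMPLES = {
    "addu": "addu $2,$3,1",
    "move": "move $3,$2",
    "subu": "subu $4,$4,16",
}
files = pair_files(EXAMPLES, repeats=2)
```

With the 12 instructions of the subset, the same loop would produce the 12 × 12 = 144 files mentioned above, and each measured value would fill the PDT entry given by the row and column indices in the file name.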
3.5 Summary
In this chapter, we presented the details of our low power instruction scheduling problem. We described the steps of the problem, including partitioning the code into basic blocks, constructing the DFG, and applying the scheduling algorithm to each basic block. We also presented the method used to build the PDT that is used by our scheduling algorithm.
the initial population, construction of the cross over operator, mutation operator, and building fitness function. This chapter introduces the genetic algorithm (GA) approach based on topological sorting for the low power scheduling problem.
4.1 Genetic Algorithm
In computer science, a genetic algorithm (GA) is a heuristic search algorithm that imitates the process of natural evolution [24] This algorithm is generally used to generate good solutions to search and optimization problems Genetic algorithms belong to the larger family of evolutionary algorithms, which generate solutions to optimization problems using techniques based on natural evolution, such as inheritance, mutation, crossover, and selection Genetic algorithms are applied in bioinformatics, computer science, economics, chemistry, engineering, mathematics, physics and other fields
Genetic Algorithms (GA) were developed in the 1970s by the work of Holland and his colleagues This concept is based on the main idea of the evolutionary theory Genetic Algorithms, as well as evolutionary algorithms in general, formed on the notion that the natural evolution is the most perfect and most reasonable process; and it is optimal This concept can be seen as an axiom, that is suitable with objective reality The evolutionary process represents the optimum in the feature that the later generation is always better (more developed,
Trang 28more complete) than the previous generation Throughout the process of natural evolution, new generations are generated to supplement, replace the older generations by two basic processes: reproduction and natural selection Each individual to survive and develop has to adapt to the environment, individuals that adapt can survive; poorly adapted individuals can be destroyed
Each individual has a set of chromosomes. Each chromosome consists of many genes linked in a chain-like structure and represents the individual's traits. Individuals of the same species have the same chromosome structure but different gene structures; this makes individuals of the same species differ from one another and decides the survival of each individual. Because the natural environment always changes, chromosome structures also change to adapt to the environment, and later generations are always more adaptable than former generations. These structures are formed by the random exchange of information with the external environment or between chromosomes.
From this idea, scientists developed the genetic algorithm concept based on natural selection and the rules of evolution. Genetic algorithms simulate basic processes of nature: crossover, mutation, reproduction and natural selection. Here, each individual can have only one chromosome. The chromosome is divided into genes that are arranged in a linear array. Each individual (or chromosome) represents a possible solution of the problem. Each operation on the set of chromosomes is equivalent to a search step in the solution space of the problem [25]. The search process has to achieve two goals:
- Exploiting the best solutions
- Considering the entire search space
A genetic algorithm (or any evolutionary algorithm) to solve a particular problem must include the following five components:
- Encoding solutions: a genetic representation for the solutions of the problem
- Creating the initial population
- Constructing the fitness function: a function to evaluate solutions according to their level of "adaptation"
- Constructing genetic operators (selection, crossover and mutation)
- Choosing algorithm parameters (population size, number of generations, stop condition, crossover probability and mutation probability)
A GA searches for the optimal solution in many directions by maintaining a population of solutions and promoting the formation of these solutions and the exchange of information between them. The population undergoes an evolutionary process: in each generation, new individuals are created by the crossover and mutation operators with a certain probability. Then the relatively "good" individuals are retained while the relatively "bad" individuals are removed, creating a new generation that is better than the previous one.
At iteration $t$, the GA maintains a set of possible solutions (individuals, or chromosomes) called the population $P(t) = \{x_1^t, x_2^t, \ldots, x_m^t\}$, where the number of individuals $m$ is the population size. Each solution $x_i^t$ is evaluated to determine its suitability. Some individuals are chosen for reproduction by crossover and mutation. Then a new set of solutions is formed by selecting the more suitable solutions. This leads to a new population $P(t+1)$, with the hope that it contains individuals that are better adapted than those of the previous population.
Thus, a GA is essentially an iterative algorithm. It aims to solve a search problem using an artificial selection mechanism and the evolution of genes. In this process, the survival of an individual depends on the features of its chromosome and on the selection process. A GA applies the selection, crossover and mutation operators to chromosomes to create new chromosomes; these operators essentially copy chromosomes, modify chromosomes and exchange sub-chromosomes.
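The iterative process described above can be sketched as a small, problem-independent loop. This is only a minimal illustration under stated assumptions: the names `run_ga`, `p_cross` and `p_mut` are hypothetical, the selection step simply keeps the best individuals of parents plus offspring, and a lower fitness value is taken to be better, as in the scheduling problem of this thesis.

```python
import random

def run_ga(init_population, fitness, crossover, mutate,
           generations=50, p_cross=0.8, p_mut=0.1, rng=None):
    """Generic GA loop (sketch); a lower fitness value is better."""
    rng = rng or random.Random()
    population = list(init_population)
    size = len(population)
    for _ in range(generations):
        offspring = []
        # Crossover: pair two random parents with probability p_cross.
        for _ in range(size):
            if rng.random() < p_cross:
                a, b = rng.sample(population, 2)
                offspring.append(crossover(a, b, rng))
        # Mutation: each individual yields a mutant with probability p_mut.
        for individual in population:
            if rng.random() < p_mut:
                offspring.append(mutate(individual, rng))
        # Selection: keep the `size` best of parents plus offspring
        # (an assumed, elitist scheme).
        population = sorted(population + offspring, key=fitness)[:size]
    return population[0]
```

Because the best individuals are always retained in this sketch, the best fitness value in the population never gets worse from one generation to the next.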
A GA differs from conventional optimization algorithms in the following features:
- A GA works with an encoding of the variables, not with the variables themselves
- A GA searches over a population of individuals rather than from a single point, which reduces the risk of stopping at a local optimum without reaching the global optimum
- A GA only needs the information from the fitness function in order to find good solutions; it does not need other supporting information
- The basic operations of the algorithm are based on random integration, and on probabilistic and nondeterministic selection
The mechanism of a genetic algorithm is simple, yet the evaluation and selection performed after each step can make it more powerful than conventional algorithms. As a result, a GA can often reach a good solution faster than such algorithms.
4.2 Topological sorting
A topological sort is an ordering of the vertices of a directed acyclic graph such that if there is a path from vertex $v_i$ to vertex $v_j$, then $v_j$ appears after $v_i$ in the ordering. In each step of the topological sorting procedure, any vertex without incoming edges is selected, and the vertex and its position are stored. Then the vertex and all the arcs leaving it are removed from the graph. As mentioned in chapter 3, scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order whose total cost over all vertices is the smallest, or at least smaller than that of the original order. The cost between two consecutive vertices here is the overhead cost between the two corresponding instructions.
More than one sequence of vertices can be derived from a directed graph using this topological sorting procedure. To overcome this issue and to obtain a feasible complete path from a directed graph, an ordering technique that combines the topological sort with a random assignment of priorities is used [15]. To derive a unique sequence from a directed graph, a different priority is randomly assigned to each vertex in the graph; whenever several vertices have no incoming edges, the priorities decide which one is selected. Therefore, a string of priorities can represent a feasible path. The topological sorting procedure with priority assignment is shown in Fig. 4.1.
Figure 4.1 Topological sorting with random priorities assignment
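As a rough illustration of the procedure above, the following sketch decodes a priority assignment into one topological order. It is written under assumptions that the text does not fix: the function name `decode_priorities` is hypothetical, the graph is given as a successor dictionary, and among the vertices without incoming edges the one with the largest priority is selected.

```python
def decode_priorities(succ, priority):
    """Return the topological order selected by a priority assignment.

    succ:     dict mapping every vertex to a list of its successors (a DAG)
    priority: dict mapping every vertex to a distinct priority value;
              the largest value is selected first (assumed convention)
    """
    # Count the incoming edges of every vertex.
    indegree = {v: 0 for v in succ}
    for v in succ:
        for w in succ[v]:
            indegree[w] += 1
    ready = [v for v in succ if indegree[v] == 0]
    order = []
    while ready:
        # Among vertices without incoming edges, pick by priority.
        v = max(ready, key=lambda u: priority[u])
        ready.remove(v)
        order.append(v)
        # Removing v also removes its outgoing arcs.
        for w in succ[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    if len(order) != len(succ):
        raise ValueError("graph contains a cycle")
    return order

# Example DAG: 1 -> 3, 2 -> 3, 2 -> 4, 3 -> 5, 4 -> 5
dag = {1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
print(decode_priorities(dag, {1: 2, 2: 5, 3: 1, 4: 4, 5: 3}))  # [2, 4, 1, 3, 5]
```

Because all priorities are distinct, each priority assignment yields exactly one order, and every order produced this way respects the dependencies of the graph.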
4.3 Representation of chromosome
According to the topological sorting algorithm above, instead of using a sequence of vertices of the graph, we can use a string of priorities to represent a chromosome; each string of priorities represents one individual in the population. Fig. 4.2 shows an example of the chromosome representation.
Figure 4.2 Chromosome representation
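A chromosome of this kind could be generated as in the sketch below. It assumes that the priorities of a graph with n vertices are the distinct integers 1 to n, so that a chromosome is a random permutation; the names `random_chromosome` and `initial_population` are hypothetical.

```python
import random

def random_chromosome(n, rng):
    """Return a random priority string: a permutation of 1..n (assumed encoding)."""
    genes = list(range(1, n + 1))
    rng.shuffle(genes)
    return genes

def initial_population(n, size, rng):
    """Return `size` random chromosomes for a graph with n vertices."""
    return [random_chromosome(n, rng) for _ in range(size)]

population = initial_population(n=5, size=4, rng=random.Random(0))
```

In such an encoding, gene i holds the priority of vertex i, so every chromosome decodes to a feasible schedule.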
4.4 Cross Over operator
This operator generates a new chromosome from two old chromosomes. Our crossover operator is based on the Moon Cross Over operator introduced in [15], with a small modification. To generate a new chromosome, the Moon Cross Over first selects a substring of an old chromosome; we do the same, but reverse this substring. Our crossover operator is described in Fig. 4.3, and Fig. 4.4 shows an example of it.
Figure 4.3 Cross over operator
child = null;
k = 0;
Choose 2 chromosomes c_a, c_b; c_a = a_1 a_2 ... a_n; c_b = b_1 b_2 ... b_n;
Select 2 genes a_i and a_j randomly from c_a (i < j);
if a_i ≠ b_k then child = <child, a_i, b_k>;
else child = <child, a_i>;
Else if j == n then
    i = i - 1;
    k = k + 1;
    if a_i ≠ b_k then child = <b_k, a_i, child>;
    else child = <a_i, child>;
Else
    i = i - 1;
    k = k + 1;
    if a_i ≠ b_k then child = <a_i, child, b_k>;
    else child = <a_i, child>;
End_While
Figure 4.4 Cross over operator example
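The general idea stated in this section (take a substring of one parent, reverse it, and complete the child from the other parent) can be illustrated with the simplified stand-in below. It is not the procedure of Fig. 4.3: the way the remaining genes are filled in is an assumption, borrowed from a common order-preserving crossover, and the name `reversed_segment_crossover` is hypothetical.

```python
import random

def reversed_segment_crossover(parent_a, parent_b, rng):
    """Simplified crossover sketch that always returns a valid permutation."""
    n = len(parent_a)
    # Select two positions i < j in parent_a at random.
    i, j = sorted(rng.sample(range(n), 2))
    # Take the substring a_i..a_j and reverse it.
    segment = parent_a[i:j + 1][::-1]
    used = set(segment)
    # Remaining genes come from parent_b, in parent_b's order (assumed rule).
    rest = [g for g in parent_b if g not in used]
    # The reversed segment stays at its original position i.
    return rest[:i] + segment + rest[i:]

child = reversed_segment_crossover([1, 2, 3, 4, 5, 6],
                                   [6, 5, 4, 3, 2, 1],
                                   random.Random(0))
```

Whatever fill rule is used, the child has to remain a string of distinct priorities, which this sketch guarantees by skipping genes already present in the segment.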
4.5 Mutation operator
In the mutation operation, two genes are selected at random and swapped. Fig. 4.5 shows an example of the mutation operator.
Figure 4.5 Mutation operator
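A swap mutation of this kind might be written as follows; the function name `swap_mutation` is hypothetical, and the sketch returns a new chromosome instead of modifying its argument.

```python
import random

def swap_mutation(chromosome, rng):
    """Return a copy of the chromosome with two randomly chosen genes swapped."""
    mutant = list(chromosome)
    # Two distinct positions are drawn, so exactly two genes change place.
    i, j = rng.sample(range(len(mutant)), 2)
    mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant

mutant = swap_mutation([3, 1, 5, 2, 4], random.Random(0))
```

Since the operator only exchanges two priorities, the mutant is still a string of distinct priorities.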
4.6 Fitness function
The fitness function is the total cost from the first vertex to the last vertex of the sorted sequence of vertices. In our problem, the cost of the path between each pair of consecutive vertices is the PDT value for switching between the two instructions that correspond to these vertices. Our goal is to select the sequence that has the smallest fitness function value.
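The fitness evaluation can be sketched as a sum over consecutive pairs of the sorted sequence. The data layout is an assumption made for illustration: `opcode` maps each vertex to its instruction, `pdt` is a dictionary keyed by ordered instruction pairs, and the cost values in the example are invented, not measured.

```python
def schedule_cost(order, opcode, pdt):
    """Sum the PDT overhead between each pair of consecutive instructions."""
    return sum(pdt[(opcode[u], opcode[v])] for u, v in zip(order, order[1:]))

# Hypothetical instructions and overhead values, for illustration only.
opcode = {1: "add", 2: "lw", 3: "mul"}
pdt = {
    ("add", "lw"): 3, ("lw", "add"): 3,
    ("lw", "mul"): 5, ("mul", "lw"): 4,
    ("add", "mul"): 2, ("mul", "add"): 2,
}
print(schedule_cost([1, 2, 3], opcode, pdt))  # 3 + 5 = 8
print(schedule_cost([1, 3, 2], opcode, pdt))  # 2 + 4 = 6
```

In this toy example the second order would be preferred, since its fitness value is smaller, provided that it also satisfies the dependencies of the Data Flow Graph.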