A Software Approach for Lower Power Consumption: M.A. Thesis



A SOFTWARE APPROACH FOR LOWER POWER CONSUMPTION

Bui Ngoc Hai

Faculty of Information Technology

University of Engineering and Technology

Vietnam National University, Hanoi

Supervised by

Assoc. Prof. Dr. Nguyen Ngoc Binh

A thesis submitted in fulfillment of the requirements for the degree of

Master of Science in Computer Science

April 2014


ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Hanoi, April 25th, 2014

Signed


ABSTRACT

Optimizing power consumption is an important topic in embedded system engineering, especially for embedded systems that run on battery power. Power optimization can be achieved by software techniques, and instruction scheduling is an effective software approach for reducing the power cost of the processor(s). In this thesis, we propose using a genetic algorithm for low power instruction scheduling. Our algorithm is applied to each basic block of assembly code to generate a lower power program. In the experiments, we use two open source simulation tools, the SimpleScalar Tool Set and SimplePower. The algorithm is applied to assembly programs written for the SimpleScalar instruction set; these programs are then compiled and their power consumption is measured by SimplePower. The experimental results show the effectiveness of the proposed method. In future work, this scheduling method will be combined with the idea of reducing memory access for low power design.


First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Dr. Nguyen Ngoc Binh, for giving me the opportunity to work with him and for his patient guidance and continuous support throughout the years.

I would like to express my sincere appreciation to my colleagues at the Laboratory of Embedded Systems for their great support. I would also like to thank all my friends who gave me moral support during this work.

Finally, this thesis would not have been possible without the moral support and love of my parents and my brother. Thank you!


Table of Contents

Chapter 1 Introduction
1.1 Software power optimization
1.2 Power optimization by instruction scheduling
1.3 Our work
1.4 Thesis organization

Chapter 2 Related Work
2.1 Software power estimation
2.2 Energy code driven generation for low power
2.3 Reducing memory access
2.4 Software power optimization using symbolic algebra
2.5 List scheduling for low power
2.6 Instruction scheduling to reduce switching activity
2.7 Low power instruction scheduling as traveling salesman problem
2.8 Force-directed scheduling for low power
2.9 Instruction scheduling to reduce the off-chip power
2.10 Energy-oriented and performance-oriented combination scheduling
2.11 Criticality-directed and Uncriticality-directed instruction scheduling for low power
2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm

Chapter 3 Instruction Scheduling for Low Power
3.1 Problem description
3.2 Partitioning Basic Blocks of assembly code
3.3 Data Flow Graph construction
3.4 Generating Power Dissipation Table

4.2 Topological sorting
4.3 Representation of chromosome
4.4 Cross Over operator
4.5 Mutation operator
4.6 Fitness function
4.7 Genetic Algorithm for low power scheduling

Chapter 5 Experiments
5.1 SimpleScalar tool set
5.2 SimplePower simulator
5.3 Experimental benchmarks set
5.4 Experimental results
5.5 Analysis and evaluation

Chapter 6 Conclusion and Future Work

References

Appendix A Some important source code
Appendix B Source code of benchmark programs
Appendix C Power Dissipation Table
Appendix D An example of scheduling a basic block


List of Figures

Figure 2.1 List scheduling for low power
Figure 3.1 Flow of low power instruction scheduling
Figure 3.2 An example of a Basic Block and its Data Flow Graph
Figure 3.3 Examples of Basic Blocks
Figure 3.4 Algorithm to construct a DFG
Figure 3.5 PDT generation example
Figure 4.1 Topological sorting with random priorities assignment
Figure 4.2 Chromosome representation
Figure 4.3 Cross over operator
Figure 4.4 Cross over operator example
Figure 4.5 Mutation operator
Figure 4.6 Genetic algorithm for low power scheduling
Figure 5.1 Experimental framework
Figure 5.2 SimpleScalar simulator software architecture
Figure 5.3 SimplePower result example


List of Tables

Table 3.1 Instruction set architecture
Table 5.1 Experimental benchmark set
Table 5.2 Experimental results of GA scheduling
Table 5.3 Experimental results of list scheduling
Table 5.4 Results comparison of two algorithms


ISA Instruction Set Architecture

PDT Power Dissipation Table

PISA Portable Instruction Set Architecture

PSO Particle Swarm Optimization

RAW Read after Write

RTL Register Transfer Level

TSP Travelling Salesman Problem

WAR Write after Read

WAW Write after Write


List of Notations

Energy Power of a program

Power Base Cost of instruction i

$O_{i,j}$ Overhead cost between two instructions i and j

n Number of instructions in a basic block

$N_i$ Number of times instruction i is executed

$N_{i,j}$ Number of times the pair (i,j) is executed

$E_k$ Energy cost of other effects of the program

P(t) Population at loop t

$x_i^t$ Solution i at loop t


Chapter 1

Introduction

In embedded system engineering, optimization is an important problem. Embedded systems always have limited resources, such as memory size, processor speed and power supply. Optimization makes the system work more efficiently within the allowed resources. Optimizing power consumption is an important issue, especially for embedded systems that run on battery power. Since embedded devices are portable and use DC powered cells, optimized power consumption helps prolong the lifetime of such systems. Today, embedded devices are becoming more popular in daily life as well as in science and technology. Many people are attracted to popular embedded devices such as smart phones, tablet computers and MP3 players. Ubiquitous devices need a longer battery life. As the ability to reduce the power consumption of an embedded device becomes more important, this optimization problem has become a major challenge for designers and manufacturers. They must continually improve product quality to meet the needs of users, and low power consumption is a necessary requirement.

1.1 Software power optimization

Software controls most of the hardware activity in a system and can therefore have a significant effect on the system's power dissipation. Power can be optimized by software techniques, and many such techniques have been proposed for optimizing the power consumption of the processor. They include instruction scheduling [1-10], reducing memory access, energy cost driven code generation, and instruction packing [2-3,11]. Because the order of instructions controls internal switching in the processor, it affects the power drawn by the processor during execution. Choosing a suitable order of instructions can therefore reduce the power consumption of the system. In terms of energy consumption, memory accesses are more expensive than register accesses, so optimal register allocation that reduces the number of memory operands can also reduce power. If a table of the power costs of individual instructions is available, the total power cost can be reduced by a code generator that selects suitable low power instructions. Another method is optimizing program source code for low power. An example of this method is proposed in [12], where the authors optimize C code by approximating complex expressions with simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. Other techniques include software-controlled voltage and frequency scaling for dynamic power optimization [13].

1.2 Power optimization by instruction scheduling

One of the main factors affecting the power consumption of a system is how the assembly instructions are scheduled or combined, because the power consumed during the execution of an instruction depends on the previous instruction. For a given program in C or another high level language, there can be more than one sequence of instructions (assembly code) for a given processor. A suitable order of instructions in a program can therefore result in lower power consumption. Instruction scheduling for low power is an effective software approach to power optimization: it reorders the assembly instructions of a program so that power consumption is reduced while the semantics of the program are preserved. Many scheduling techniques have been proposed, most of which aim to reduce the overhead cost between pairs of instructions [2-5,7-9,11]. Some techniques have other objectives for power reduction, such as reducing switching activity [6,22] and using the critical path [14].

1.3 Our work

In this thesis, we introduce our method for optimizing power consumption by instruction scheduling. Scheduling is an NP-hard problem; the main difficulty is that the search space of possible instruction orders is very large. When searching for a good schedule, the usual stumbling block is satisfying the constraints of the data flow graph: when we create a new order of instructions, we do not know in advance whether it satisfies the data flow graph. Hence, our approach uses a genetic algorithm with a chromosome encoding that handles the data dependency constraints better. This encoding was introduced in [15], where the authors proposed it to solve the traveling salesman problem (TSP). We apply their method to the scheduling problem in order to reduce energy consumption. The algorithm has the advantage of being less likely to get trapped in a local optimum. To find solutions in the large search space, we use a heuristic table, called the Power Dissipation Table (PDT), which is generated by power simulations. A PDT for an instruction set with n instructions is an (n × n) matrix, where each entry PDT(i,j) is the power consumed when instruction i is followed by instruction j. Each entry is used as the overhead cost between i and j, and the table is used for evaluating solutions.

The original assembly programs are divided into basic blocks, and a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that represents the data dependencies between the instructions in a basic block. Our algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs a low power instruction sequence. For the experiments, we use two open source simulation tools: the SimpleScalar Tool Set [16-18] and SimplePower [19]. A subset of the SimpleScalar instruction set is considered, and SimplePower is used to simulate the power consumption. The algorithm is applied to assembly programs for the SimpleScalar ISA. These programs are then compiled and their power consumption is measured by SimplePower for visual observation.

1.4 Thesis organization

The remainder of this thesis is organized as follows. Chapter 2 introduces related work on software power optimization and instruction scheduling for low power. Chapter 3 describes the steps of our low power instruction scheduling problem in detail. Chapter 4 presents the proposed approach to this problem, based on a genetic algorithm. Chapter 5 reports the simulation tools, the benchmarks used in the experiments, the experimental results and their analysis. Chapter 6 presents our conclusions and introduces future work.


Chapter 2

Related Work

This chapter summarizes related research on software power optimization.

2.1 Software power estimation

The first step in power optimization is power estimation. The ability to estimate software power consumption helps to verify that a design meets its power constraints and to check the correctness of power optimization methods.

V. Tiwari [1,20-21] was the first researcher to propose an energy estimation model for a processor, and he also proposed the idea of scheduling assembly code for low power consumption. In his model, each instruction in the instruction set architecture consumes a fixed energy cost called the base energy cost. The base energy cost is computed as the product of the voltage and the average current in the processor while running a loop containing a sequence of the same instruction. The other main component of the model is the inter-instruction effects. These include the effect of circuit state, the effect of resource constraints (e.g. pipeline stalls and write buffer stalls) and the effect of cache misses. The circuit state overhead (overhead cost) between a pair of instructions is the difference between the actual cost of the pair and the average of the base costs of the individual instructions.

The total power of a program is calculated as the sum of the base energy costs of all instructions and all inter-instruction effects. The total power cost $E_p$ of an assembly program is given by equation (2.1):

$$E_p = \sum_{i} (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_{k} E_k \qquad (2.1)$$

where $B_i$ is the base cost of instruction i and $N_i$ is the number of times it is executed; for every pair of consecutive instructions (i,j), $O_{i,j}$ is the circuit state overhead between i and j and $N_{i,j}$ is the number of times this pair is executed; and $E_k$ is the energy of the other inter-instruction effects, such as pipeline stalls, write buffer stalls and cache misses, that occur during program execution. Our scheduling method uses the energy model proposed by Tiwari for finding solutions.
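As an illustration of how the model in this section can be evaluated, the following minimal Python sketch sums base costs, circuit state overheads and a lump term for the other effects over an executed instruction trace. The function name `program_energy`, the cost values and the trace are hypothetical and are not taken from the thesis or from Tiwari's measurements.

```python
from collections import Counter

def program_energy(trace, base_cost, overhead, other_effects=0.0):
    """Sum base costs, pairwise overheads and other effects over a trace.

    trace         -- executed instructions in order (list of opcode names)
    base_cost     -- opcode -> base cost B_i
    overhead      -- (opcode_i, opcode_j) -> circuit state overhead O_ij
    other_effects -- lump sum for stalls and cache misses (sum of E_k)
    """
    n_i = Counter(trace)                    # N_i: executions of each instruction
    n_ij = Counter(zip(trace, trace[1:]))   # N_ij: executions of each consecutive pair
    base = sum(base_cost[op] * count for op, count in n_i.items())
    # Pairs missing from the table are treated as zero overhead (an assumption).
    inter = sum(overhead.get(pair, 0.0) * count for pair, count in n_ij.items())
    return base + inter + other_effects

# Hypothetical costs, for illustration only.
base_cost = {"addu": 1.0, "lw": 2.0, "sw": 2.5}
overhead = {("lw", "addu"): 0.5, ("addu", "sw"): 0.25, ("addu", "lw"): 0.25}
trace = ["lw", "addu", "sw", "lw", "addu"]

total = program_energy(trace, base_cost, overhead)
```

Counting pairs from the executed trace, rather than from the static listing, keeps the pair counts consistent with the instruction counts when a block runs many times.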

2.2 Energy code driven generation for low power

The code generation method was proposed by V. Tiwari et al. [2-3]. This approach selects instructions based on their power cost. The main idea is that the total power cost can be reduced by a code generator that selects suitable low power instructions.

2.3 Reducing memory access

V. Tiwari et al. also proposed a memory operand reduction approach in [2-3]. This approach is based on the fact that instructions with memory operands have a very high energy cost compared to instructions with register operands. A large energy reduction can therefore be obtained by reducing the number of memory operands. Efficient register management brings this benefit by replacing memory access instructions with register access instructions in such a way that the semantics of the program are not changed.

2.4 Software power optimization using symbolic algebra

In [13], the authors optimized C programs to reduce power cost. The main idea is to approximate complex expressions by simple polynomial expressions and to convert floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. The work is a sequence of techniques. First, an energy profiler is used to find all energy critical code blocks. Second, a tool is used to transform floating-point data to fixed-point data. Third, complex nonlinear arithmetic expressions are approximated by polynomials. Finally, the polynomial representations of the critical basic blocks are mapped to the instruction set using symbolic algebra.

2.5 List scheduling for low power

Instruction scheduling for low power was first presented by V. Tiwari [1-3,11], together with the instruction level power model and the idea of reordering assembly instructions. Instruction scheduling for low power aims to reduce the circuit state overhead (overhead cost), which is the energy dissipated due to switching from the execution of one instruction to another. His research indicated that instructions can be reordered to have a smaller amount of circuit state overhead; therefore, lower power can be obtained with a suitable order of instructions. V. Tiwari used the List Scheduling Algorithm [11] for his experiments. It is a basic scheduling algorithm with a greedy strategy; in essence, list scheduling is a simple topological sorting algorithm. The algorithm operates on each basic block of instructions. Starting from a data flow graph that represents the data dependencies of a basic block, at each step it chooses from the priority list the instruction with the lowest overhead cost (the circuit state overhead between the previous instruction and the candidate). The priority list contains all instructions that do not depend on any unscheduled instruction. This simple algorithm is used to show the potential for power reduction by reordering assembly code. List scheduling is shown in Fig 2.1.
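The greedy strategy of Fig 2.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by Tiwari or in this thesis: instructions are identified by their index in the basic block, the DFG is given as a map from each instruction to its set of predecessors, the cost table values are hypothetical, and the first instruction is taken deterministically (lowest index) instead of at random.

```python
def list_schedule(preds, opcode, cost):
    """Greedy list scheduling of one basic block.

    preds  -- instruction index -> set of indices it depends on
    opcode -- opcode name of each instruction, by index
    cost   -- (opcode_prev, opcode_next) -> overhead cost
    """
    remaining = {v: set(preds.get(v, ())) for v in range(len(opcode))}
    ready = sorted(v for v, p in remaining.items() if not p)
    schedule = []
    while ready:
        if not schedule:
            nxt = ready[0]  # Fig 2.1 picks at random; lowest index keeps this repeatable
        else:
            prev = opcode[schedule[-1]]
            # Smallest overhead after the previous instruction; ties go to the lower index.
            nxt = min(ready, key=lambda v: (cost[(prev, opcode[v])], v))
        ready.remove(nxt)
        schedule.append(nxt)
        del remaining[nxt]
        # Instructions whose last unscheduled predecessor was nxt become ready.
        for v, p in remaining.items():
            if nxt in p:
                p.discard(nxt)
                if not p:
                    ready.append(v)
    return schedule

# Hypothetical block: 0: lw, 1: addu (needs 0), 2: lw, 3: sw (needs 2).
opcode = ["lw", "addu", "lw", "sw"]
preds = {1: {0}, 3: {2}}
cost = {("lw", "lw"): 2, ("lw", "addu"): 3, ("lw", "sw"): 4, ("addu", "sw"): 1}

order = list_schedule(preds, opcode, cost)
```

Because each choice looks only one instruction ahead, the result depends on the starting instruction and is not guaranteed to be the cheapest order overall.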

2.6 Instruction scheduling to reduce switching activity

In [6], C.-L. Su et al. proposed a cold scheduling algorithm to reduce the switching activity of the processor. The authors suggested a memory addressing method using Gray code, in which the representations of consecutive numbers differ in exactly one bit. Gray code addressing reduces the number of bit switches on the address buses, which leads to a large reduction in power consumption because most program instructions access consecutively addressed memory locations. The cold scheduling technique is a software approach based on a traditional list scheduling algorithm that reduces the switching activity of the control path.
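The one-bit property of Gray code can be checked with a short Python sketch. It counts the bit switches seen on a bus when 16 consecutive addresses are sent in plain binary and in reflected binary Gray code; the helper names are illustrative and the 16-address range is an arbitrary choice.

```python
def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bit_switches(words):
    """Total number of bit positions that change between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

addresses = list(range(16))
binary_switches = bit_switches(addresses)                     # plain binary addressing
gray_switches = bit_switches([to_gray(a) for a in addresses])  # Gray code addressing
```

For this sequential access pattern the binary encoding produces 26 bit switches and the Gray encoding 15, one per step.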

Input: List L, a DFG of a basic block;
For each vertex v in DFG
    if v has no predecessor, add v to L;
Choose a random vertex i from L;
Schedule i, then remove i from L, remove i and all arcs of i from DFG;
Update L by adding new vertices with no predecessor;
While L is not empty do
    Choose the highest priority j in L;
    (overhead cost between the previous instruction and j is smallest)
    Schedule j;
    Remove j from L, remove j and all arcs of j from DFG;
    Update L by adding new vertices with no predecessor;
End_While
Return schedule

Figure 2.1 List scheduling for low power

2.7 Low power instruction scheduling as a traveling salesman problem

K. Choi et al. [4] presented another method by formulating the instruction scheduling problem as a traveling salesman problem (TSP). They used the minimum spanning tree and a simulated annealing technique to find solutions. The scheduling technique uses a power dissipation table (PDT), which is an (n × n) matrix where each entry (i,j) is the average power consumed when instruction i is followed by instruction j. The scheduling algorithm uses a control flow and data dependency graph for each basic block together with the PDT. The problem of instruction reordering for low power is transformed into finding the tour of lowest cost (TSP) in the constraint graph.

2.8 Force-directed instruction scheduling for low power

P. Dongale [5], in his master thesis, proposed the force-directed scheduling algorithm for low power (FP-ISLP). This is an application of the classic force-directed scheduling algorithm to the low power problem. Like Choi's method above, this method uses a PDT as a heuristic table for finding a good instruction order for low power.

2.9 Instruction scheduling to reduce the off-chip power

Another low power scheduling technique was presented by H. Tomiyama et al. [22]. This method aims to reduce the off-chip power by reducing the bit switching on the off-chip buses. The scheduling algorithm attempts to find an optimal instruction order for each basic block by decreasing the difference between the binary representations of two consecutive instructions in memory. The power consumption is reduced because the number of transitions on the data bus is minimized.

2.10 Energy-oriented and performance-oriented combination scheduling

A. Parikh et al. [7-8] proposed performance-oriented scheduling, energy-oriented scheduling, and a method that combines the two approaches. Performance-oriented scheduling uses the list scheduling algorithm with time as the objective parameter. The energy-oriented approach also uses list scheduling, but with the circuit-state overhead (inter-instruction effect) as the objective parameter. In energy-oriented scheduling, at each step the scheduler selects the next node with the least circuit-state overhead. In the combined method, the selections of the scheduler are based mainly on one parameter, and the other parameter is considered only when there is a tie. Specifically, the decisions are based on the circuit-state overhead, and the time parameter is used only when the overhead values are tied.
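The tie-breaking rule of the combined method amounts to a lexicographic comparison, which can be sketched in Python as below. The function `pick_next`, the candidate names and the numbers are hypothetical; in particular, how the time parameter is computed is an assumption left outside this sketch.

```python
def pick_next(ready, overhead, time):
    """Select the ready instruction with the least circuit-state overhead.

    The time value is consulted only to break ties between equal overheads.
    """
    return min(ready, key=lambda v: (overhead[v], time[v]))

# Hypothetical candidates: b and c tie on overhead, c wins on time.
ready = ["a", "b", "c"]
overhead = {"a": 3, "b": 2, "c": 2}
time = {"a": 1, "b": 5, "c": 4}

chosen = pick_next(ready, overhead, time)
```

Swapping the two components of the key gives the opposite priority, with time first and overhead as the tie-breaker.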


2.11 Criticality-directed and uncriticality-directed instruction scheduling for low power

Two methods, called criticality-directed and uncriticality-directed instruction scheduling for low power, were proposed by S. Watanabe et al. [14]. The critical path is the longest path in a data flow graph (DFG). Instructions on a critical path determine the execution time of the program; these are called critical instructions. The first algorithm is based on the idea that the functional units in a processor differ in performance and power consumption. Only critical instructions are scheduled on fast, power-hungry units and the rest are scheduled on slow, power-efficient ones, so the total power consumption can be reduced. In contrast, instead of finding the critical instructions, the second algorithm exploits uncritical ones: only uncritical instructions are scheduled on power-efficient units, which reduces energy consumption.
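To make the notion of a critical path concrete, the following Python sketch finds the longest latency-weighted path in a small DFG. It only illustrates the definition given above and does not reproduce the scheduling method of [14]; the graph, the latencies and the name `critical_path` are hypothetical.

```python
def critical_path(succs, latency):
    """Return (length, path) of the longest latency-weighted path in a DAG.

    succs   -- node -> list of successor nodes
    latency -- node -> execution latency (must list every node)
    """
    # Topological order (Kahn's algorithm).
    indegree = {v: 0 for v in latency}
    for u in succs:
        for v in succs[u]:
            indegree[v] += 1
    stack = [v for v in latency if indegree[v] == 0]
    order = []
    while stack:
        u = stack.pop()
        order.append(u)
        for v in succs.get(u, ()):
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    # best[v] is the length of the longest path that ends at v.
    best = dict(latency)
    prev = {v: None for v in latency}
    for u in order:
        for v in succs.get(u, ()):
            if best[u] + latency[v] > best[v]:
                best[v] = best[u] + latency[v]
                prev[v] = u
    end = max(best, key=best.get)
    length = best[end]
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    return length, path[::-1]

# Hypothetical DFG: 0 feeds 1 and 2, both feed 3, and 3 feeds 4.
succs = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
latency = {0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

length, path = critical_path(succs, latency)
```

In this example instruction 2 is not on the critical path, so under the idea described above it would be a candidate for a slow, power-efficient unit.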

2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm

C. Nian et al. [9] introduced a scheduling method based on the Particle Swarm Optimization (PSO) algorithm to reduce signal transitions. The authors map the low power instruction scheduling problem onto a discrete PSO problem and modify the PSO solution so that it satisfies the data flow graph, in order to find a better instruction order with fewer signal transitions. The velocity updating formula is also improved in order to obtain better results.

2.13 Summary

In this chapter, we presented some existing work in software power optimization, focusing mainly on the instruction scheduling approach. The first step in optimizing the power dissipation of a program is power estimation. Software power consumption can then be reduced by reducing memory access, code generation, optimizing source code, and instruction scheduling.


Chapter 3

Instruction Scheduling for Low Power

This chapter describes the low power instruction scheduling problem and explains the steps needed to solve it.

3.1 Problem description

Our scheduling problem involves the following steps:

- Divide the assembly program code into basic blocks
- Construct a Data Flow Graph for each basic block
- Apply the scheduling algorithm to each basic block

The original assembly programs are divided into basic blocks, and a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that represents the data dependencies between the instructions in a basic block. The scheduling algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs a low power instruction sequence. Scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order whose total cost over all vertices is minimal, or at least smaller than that of the original order, where the cost between two consecutive vertices is the overhead cost between the corresponding instructions. The flow diagram of our instruction scheduling problem is shown in Fig 3.1.

Figure 3.1 Flow of low power instruction scheduling (flow diagram: Source Code -> Assembly Code -> Divide into Basic Blocks -> Construct Data Flow Graph -> Apply Scheduling Algorithm, which also takes the Power Dissipation Table as input -> Scheduled Assembly Code)

A basic block (BB) is a piece of code that has only one entry point, its first instruction, and one exit point, after its last instruction. One entry point means that no instruction inside the block, other than the first, is the destination of a jump instruction anywhere in the program. One exit point means that only the last instruction can cause the program to begin executing instructions in a different basic block.

A Data Flow Graph (DFG) is a graph that represents the data dependencies of the instructions in a basic block. It is a directed acyclic graph in which each instruction of the basic block is represented by a vertex and each arc represents the dependency between a pair of instructions. Fig 3.2 shows an example of a basic block and its DFG.

Figure 3.2 An example of a Basic Block and its Data Flow Graph

Here, we cannot measure the overhead cost between each pair of instructions directly, but we can measure the energy consumption of each pair of instructions. This measurement includes the base energy cost of each instruction and the overhead cost between them. By measuring the power consumption of pairs in this way, we build a Power Dissipation Table (PDT). This is a matrix in which each element (i, j) represents the power dissipation when instruction i is followed by instruction j. This table is used in place of the overhead cost table.
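The objective described in this section can be sketched in Python as two small helpers: one checks that an instruction order respects the data flow graph, and one sums the PDT entries of consecutive instructions. The names `is_valid_order` and `schedule_cost`, the block and the table values are hypothetical and serve only to illustrate the idea.

```python
def is_valid_order(order, preds):
    """True if every instruction appears after all of its predecessors."""
    position = {v: k for k, v in enumerate(order)}
    return all(position[p] < position[v] for v in order for p in preds.get(v, ()))

def schedule_cost(order, opcode, pdt):
    """Sum of the PDT entries over consecutive instruction pairs."""
    return sum(pdt[(opcode[a], opcode[b])] for a, b in zip(order, order[1:]))

# Hypothetical block: 0: lw, 1: addu (needs 0), 2: lw, 3: sw (needs 2).
opcode = ["lw", "addu", "lw", "sw"]
preds = {1: {0}, 3: {2}}
pdt = {("lw", "addu"): 3, ("addu", "lw"): 4, ("lw", "sw"): 4,
       ("lw", "lw"): 2, ("addu", "sw"): 1}

original = [0, 1, 2, 3]
reordered = [0, 2, 1, 3]
```

With these made-up values the original order costs 11 and the reordered one costs 6, and both orders satisfy the dependencies, so the reordered sequence would be preferred.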

3.2 Partitioning Basic Blocks of Assembly code

The source programs are written in C and compiled by the SimpleScalar compiler (ssbig-na-sstrix-gcc) to obtain assembly code for the SimpleScalar ISA. These assembly programs are then divided into basic blocks. The algorithm [23] for generating basic blocks from a code listing is simple and is described as follows.

First, find the set of leaders in the code. A leader is the first instruction of a basic block. An instruction is a leader if it falls into one of the following three categories:

- The first instruction of the program is a leader

- The instruction that is a label (target) of a branch/jump instruction is a leader

- The instruction that follows a branch/jump instruction is a leader

For each leader, its basic block consists of the leader and all the following instructions up to, but not including, the next leader. Because control can never pass through the end of a basic block, some block boundaries may have to be modified after finding the basic blocks. In practice, labels, jump instructions and assembly directives are not counted as part of a basic block.
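The leader rules above can be sketched in Python as follows. This is a simplified illustration, not the thesis implementation: every label is treated as a possible branch target, lines starting with a dot are treated as assembler directives, and the set `BRANCH_OPS` is an assumed list of control-transfer mnemonics, since the four jump/branch instructions of the subset are not named here. Labels, branches and directives are left out of the resulting blocks, as described above.

```python
# Assumed control-transfer mnemonics (illustrative, not from the thesis).
BRANCH_OPS = {"j", "jal", "jr", "beq", "bne"}

def basic_blocks(lines):
    """Split an assembly listing into basic blocks of plain instructions."""
    blocks, current = [], []

    def close_block():
        if current:
            blocks.append(list(current))
            current.clear()

    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("."):
            continue                  # blank line or assembler directive
        if line.endswith(":"):
            close_block()             # a label makes the next instruction a leader
            continue
        if line.split()[0] in BRANCH_OPS:
            close_block()             # the instruction after a branch is a leader
            continue
        current.append(line)
    close_block()
    return blocks

# Hypothetical listing in the style of the examples in this chapter.
listing = [
    ".text",
    "main:",
    "subu $sp,$sp,16",
    "sw $31,12($sp)",
    "jal BubbleSort",
    "lw $31,12($sp)",
    "$L5:",
    "addu $2,$3,1",
    "move $3,$2",
    "bne $2,$4,$L5",
    "addu $sp,$sp,16",
    "j $31",
]

blocks = basic_blocks(listing)
```

Treating every label as a leader can split a block more often than strictly necessary, which is safe for scheduling because it only narrows the range within which instructions may be reordered.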

Fig 3.3 shows an example of some basic blocks:

$L5:
    addu $2,$3,1
    move $3,$2

Figure 3.3 Examples of Basic Blocks

3.3 Data Flow Graph construction

A data flow graph is constructed for each basic block. This graph represents the data dependencies among the instructions in the given basic block. To construct the data flow graph, we need to understand the instruction set architecture and the data dependencies between its instructions. An instruction j is dependent on a previous instruction i if instruction j shares a register or a memory location with instruction i, so that j cannot be executed until instruction i has completed execution and its result has been written back.

There are three types of data dependencies:

- Read After Write (RAW): instructions i and j have a RAW dependency if i writes to a register or memory operand and, after that, j reads from this location

- Write After Read (WAR): instructions i and j have a WAR dependency if i reads from a register or memory operand and, after that, j writes to this location

- Write After Write (WAW): instructions i and j have a WAW dependency if they write to the same register or memory location
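The three dependency types can be checked with simple set intersections, as in the Python sketch below. Each instruction is described only by the set of locations it writes and the set it reads; this representation and the name `build_dfg` are assumptions made for illustration. The sketch adds an arc for every dependent pair, so it may contain more (redundant) arcs than a construction that stops at the first WAR dependency, but it imposes the same ordering constraints.

```python
def build_dfg(instrs):
    """Build predecessor sets from a list of (writes, reads) location sets.

    Returns a dict: instruction index -> set of earlier indices it depends on.
    """
    preds = {j: set() for j in range(len(instrs))}
    for i, (writes_i, reads_i) in enumerate(instrs):
        for j in range(i + 1, len(instrs)):
            writes_j, reads_j = instrs[j]
            raw = writes_i & reads_j    # j reads what i wrote
            war = reads_i & writes_j    # j overwrites what i read
            waw = writes_i & writes_j   # both write the same location
            if raw or war or waw:
                preds[j].add(i)
    return preds

# Hypothetical block, written as (writes, reads):
block = [
    ({"$2"}, {"$3"}),   # 0: addu $2,$3,1
    ({"$3"}, {"$2"}),   # 1: move $3,$2    RAW on $2 and WAR on $3 with 0
    ({"$4"}, {"$5"}),   # 2: addu $4,$5,1  independent of 0 and 1
    ({"$2"}, {"$4"}),   # 3: addu $2,$4,1  WAW with 0, WAR with 1, RAW with 2
]

dfg = build_dfg(block)
```

Memory operands can be handled in the same way by adding a location name for each address, or a single conservative name for all of memory when addresses cannot be told apart.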

The algorithm to construct a data flow graph of a given basic block is shown in Fig 3.4.

For each instruction i
Begin
    For each instruction j after i
    Begin
        If i and j have WAW then create arc ij
        If i and j have RAW then create arc ij
    End
    For each address read by i
        For each instruction j after i
        Begin
            if i and j have WAR then create arc ij, and break;
        End
End

Figure 3.4 Algorithm to construct a DFG

3.4 Generating Power Dissipation Table

We use a subset of the SimpleScalar ISA for our experiments. It includes the 16 instructions listed in Table 3.1 below:


Table 3.1 Instruction set architecture

No   Instruction   Description                  Example
2    subu          Subtract unsigned integer    subu $sp,$sp,16
…    jal           … a subroutine               jal BubbleSort

This subset of the instruction set architecture covers all the benchmark programs used in the experiments. Among the 16 instructions, there are 4 jump/branch instructions that can be ignored, so we have to generate a PDT with 12×12 elements.

To create the PDT, we do the following: each element (i, j) of the table is the power measured for a program in which instruction i is followed by instruction j and then by the instruction nop, with this sequence repeated 20,000 times to avoid loop overheads. An example of PDT generation is shown in Fig 3.5. The SimpleScalar Tool Set [16-18] and SimplePower [19] are used. We use the SimpleScalar ISA to write the assembly programs from which the PDT is built, and the SimpleScalar compiler ssbig-na-sstrix-gcc is used to compile them. SimplePower is a power simulator for the SimpleScalar ISA; it is used to measure the elements of the PDT and the power consumption of assembly programs. We automatically create 144 assembly files, each corresponding to one pair of instructions. These files are named in alphabetical order so that they can be processed automatically by SimplePower. Running all these programs, we obtain the PDT.
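The enumeration of instruction pairs can be sketched in Python as below. The sketch only shows how the n × n ordered pairs and the resulting matrix are organised; the function names, the three example instructions and the dictionary of measured values are hypothetical, and the actual file template, file names and SimplePower invocation used in the thesis are not reproduced here.

```python
from itertools import product

def pair_programs(examples):
    """Map every ordered opcode pair (i, j) to the body: i, then j, then nop."""
    return {(a, b): [examples[a], examples[b], "nop"]
            for a, b in product(examples, repeat=2)}

def build_pdt(opcodes, measured):
    """Arrange per-pair measurements into an n x n matrix indexed by opcode order."""
    return [[measured[(a, b)] for b in opcodes] for a in opcodes]

# Three example instructions; a 12-instruction subset would give 144 pairs.
examples = {
    "addu": "addu $2,$3,1",
    "subu": "subu $sp,$sp,16",
    "move": "move $3,$2",
}

programs = pair_programs(examples)
```

Each body would then be repeated and wrapped in a complete assembly file before simulation, and the measured values collected into the matrix with `build_pdt`.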

Figure 3.5 PDT generation example

3.5 Summary

In this chapter, we presented the details of our low power instruction scheduling problem. We described the steps of the problem, including BB partitioning, DFG construction and applying the scheduling algorithm to each BB. We also presented the method used to build the PDT that is used by our scheduling algorithm.


Chapter 4

This chapter introduces the genetic algorithm (GA) approach based on topological sorting for the low power scheduling problem, including the initial population, construction of the crossover operator, the mutation operator, and the fitness function.

4.1 Genetic Algorithm

In computer science, a genetic algorithm (GA) is a heuristic search algorithm that imitates the process of natural evolution [24] This algorithm is generally used to generate good solutions to search and optimization problems Genetic algorithms belong to the larger family of evolutionary algorithms, which generate solutions to optimization problems using techniques based on natural evolution, such as inheritance, mutation, crossover, and selection Genetic algorithms are applied in bioinformatics, computer science, economics, chemistry, engineering, mathematics, physics and other fields

Genetic algorithms were developed in the 1970s through the work of Holland and his colleagues. The concept is based on the main idea of evolutionary theory. Genetic algorithms, like evolutionary algorithms in general, are founded on the notion that natural evolution is the most perfect and most reasonable process, and that it is optimal. This notion can be regarded as an axiom that is consistent with objective reality. The evolutionary process is optimal in the sense that each later generation is always better (more developed, more complete) than the previous one. Throughout natural evolution, new generations are produced to supplement and replace older generations through two basic processes: reproduction and natural selection. To survive and develop, each individual has to adapt to the environment: individuals that adapt can survive, while poorly adapted individuals can be eliminated.

Each individual has a set of chromosomes. Each chromosome consists of many genes linked in a chain-like structure, representing the individual's traits. Individuals of the same species have the same chromosome structure but different gene structures; this is what distinguishes individuals of the same species and decides the survival of each individual. Because the natural environment always changes, chromosome structures also change to adapt to it, and later generations are always more adaptable than former generations. These structures are formed by the random exchange of information with the external environment or between chromosomes.

From this idea, scientists developed the genetic algorithm concept based on the rules of natural selection and evolution. Genetic algorithms simulate basic processes of nature: cross over, mutation, reproduction and natural selection. Here, each individual has only one chromosome. The chromosome is divided into genes that are arranged in a linear array. Each individual (or chromosome) represents a possible solution of the problem. Each operation on the set of chromosomes is equivalent to searching for a solution in the solution space of the problem [25]. The search process has to achieve two goals:

- Exploiting the best solutions

- Considering the entire search space

A genetic algorithm (or any evolutionary algorithm) to solve a particular problem must include the following five components:

- Encoding solutions: a genetic representation for the solution of the problem

- Creating the initial population

- Constructing the fitness function: a function that evaluates solutions according to their level of "adaptation"

- Constructing genetic operators (selection, cross over and mutation)

- Choosing algorithm parameters (population size, number of generations, stop condition, cross over probability and mutation probability)

A GA searches for the optimal solution in many directions by maintaining a population of solutions and promoting the formation and exchange of information between these solutions. The population undergoes an evolutionary process: in each generation, new individuals are created by the cross over and mutation operators with a certain probability. The relatively "good" individuals are then retained while the relatively "bad" individuals are removed, creating a new generation that is better than the previous one.

At iteration $t$, the GA maintains a set of possible solutions (individuals, or chromosomes) called the population $P(t) = \{x^t_1, x^t_2, \ldots, x^t_m\}$, where $m$ is the population size. Each solution $x^t_i$ is evaluated to determine its fitness. Some individuals are chosen for reproduction by cross over and mutation. A new set of solutions is then formed by selecting the more suitable solutions. This leads to a new population $P(t+1)$, which is hoped to contain individuals that are better adapted than those of the previous population.
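The transition from P(t) to P(t+1) can be illustrated with a minimal generic loop. This is a sketch, not the thesis implementation: the function name `evolve`, its default parameter values, and the survivor rule (keep the fittest individuals among parents and offspring) are assumptions, and the bit-string operators in the usage part are toy stand-ins chosen only to make the example runnable.

```python
import random


def evolve(population, fitness, crossover, mutate, generations=50,
           crossover_prob=0.8, mutation_prob=0.1, seed=0):
    """Generic GA loop that minimises `fitness` over lists of genes."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        offspring = []
        for _ in range(size):
            parent_a, parent_b = rng.sample(population, 2)
            if rng.random() < crossover_prob:
                child = crossover(parent_a, parent_b, rng)
            else:
                child = list(parent_a)
            if rng.random() < mutation_prob:
                child = mutate(child, rng)
            offspring.append(child)
        # Assumed survivor rule: keep the `size` fittest of parents and offspring.
        population = sorted(population + offspring, key=fitness)[:size]
    return min(population, key=fitness)


# Toy usage: minimise the number of 1-bits in an 8-bit string.
def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def flip_one_bit(c, rng):
    i = rng.randrange(len(c))
    return c[:i] + [1 - c[i]] + c[i + 1:]


start = [[random.Random(s).randint(0, 1) for _ in range(8)] for s in range(6)]
best = evolve(start, fitness=sum, crossover=one_point, mutate=flip_one_bit)
print(best, sum(best))
```

Because the survivors are drawn from parents and offspring together, the best individual of P(t+1) is never worse than the best individual of P(t) under this particular rule.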

Thus, a GA is essentially an iterative algorithm. It aims to solve a search problem based on an artificial selection mechanism and the evolution of genes. In this process, the survival of an individual depends on the features of its chromosome and on the selection process. A GA applies the selection, cross over and mutation operators to chromosomes to create new chromosomes; these operators essentially copy chromosomes, modify chromosomes and exchange sub-chromosomes.

A GA differs from conventional optimization algorithms in the following features:

- A GA works with an encoding of the variables, not with the variables themselves

- A GA searches over a population of individuals rather than from a single point, which reduces the risk of stopping at a local optimum and failing to reach the global optimum

- A GA only needs the information from the fitness function in order to find good solutions; it does not need other supporting information

- The basic operations of the algorithm are based on random integration and on probabilistic, nondeterministic selection

The mechanism of a genetic algorithm is simple, yet the evaluation and selection performed after each step make it more powerful than many conventional algorithms. As a result, a GA can often reach a good solution faster than those algorithms.

4.2 Topological sorting

A topological sort is an ordering of the vertices of a directed acyclic graph such that if there is a path from vertex $v_i$ to vertex $v_j$, then $v_j$ appears after $v_i$ in the ordering. In each step of the topological sorting procedure, any vertex without incoming edges is selected, and the vertex and its position are stored. The vertex and all the arcs leaving it are then removed from the graph. As mentioned in Chapter 3, scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order whose total cost over all vertices is the smallest, or at least smaller than that of the original order. The cost between two consecutive vertices here is the overhead cost between the two vertices.

More than one sequence of vertices can be derived from a directed graph using this topological sorting procedure. To overcome this issue and obtain a feasible complete path from a directed graph, an ordering technique that combines the topological sort with random assignment of priorities is used [15]. To derive a unique sequence from a directed graph, a different priority is randomly assigned to each vertex in the graph. A string of priorities can therefore represent a feasible path. The topological sorting procedure with priority assignment is shown in Fig. 4.1.


Figure 4.1 Topological sorting with random priorities assignment
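The priority-based procedure can be sketched as below. The sketch assumes that, among the vertices with no remaining incoming edges, the one with the highest priority is chosen; the text does not say whether the highest or the lowest value wins, so this is an assumption. The graph encoding (a dictionary from vertex to successor list) and the function name `priority_topological_sort` are likewise illustrative.

```python
def priority_topological_sort(successors, priority):
    """Return the topological order decoded from a priority assignment.

    successors: dict mapping each vertex to the list of vertices it points to.
    priority:   dict mapping each vertex to a distinct priority value.
    Assumption: among ready vertices, the highest priority is scheduled first.
    """
    indegree = {v: 0 for v in successors}
    for targets in successors.values():
        for w in targets:
            indegree[w] += 1

    ready = [v for v in successors if indegree[v] == 0]
    order = []
    while ready:
        v = max(ready, key=lambda u: priority[u])
        ready.remove(v)
        order.append(v)
        # Removing v releases the successors whose last predecessor it was.
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)

    if len(order) != len(successors):
        raise ValueError("graph contains a cycle")
    return order


# Toy data flow graph: i1 and i2 must both precede i3, which precedes i4.
dfg = {"i1": ["i3"], "i2": ["i3"], "i3": ["i4"], "i4": []}
priority = {"i1": 2, "i2": 4, "i3": 1, "i4": 3}
print(priority_topological_sort(dfg, priority))  # ['i2', 'i1', 'i3', 'i4']
```

Since the priorities are distinct, each priority string decodes to exactly one feasible order, which is what allows the string to serve as an encoding of a schedule.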

4.3 Representation of chromosome

According to the topological sorting algorithm above, instead of using a sequence of vertices of the graph, we can use a string of priorities to represent a chromosome; each string of priorities represents one individual in the population. Fig. 4.2 shows an example of the chromosome representation.

Figure 4.2 Chromosome representation
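A priority-string chromosome and an initial population could be generated as in the following sketch. It assumes that the priorities of a block with n vertices are the distinct integers 1 to n, so that a chromosome is a random permutation; the function names and the seed parameter are illustrative choices, not taken from the thesis.

```python
import random


def random_chromosome(n_vertices, rng):
    """Return a random priority string: gene k is the priority of vertex k."""
    genes = list(range(1, n_vertices + 1))
    rng.shuffle(genes)
    return genes


def initial_population(n_vertices, size, seed=0):
    """Create `size` random chromosomes for a block with `n_vertices` vertices."""
    rng = random.Random(seed)
    return [random_chromosome(n_vertices, rng) for _ in range(size)]


population = initial_population(n_vertices=5, size=4)
for chromosome in population:
    print(chromosome)
```

Every permutation is a valid chromosome under this encoding, because the dependency constraints are enforced later, when the priority string is decoded by the topological sort.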

4.4 Cross Over operator

This operator generates a new chromosome from two old chromosomes. Our cross over operator is based on the Moon Cross Over operator introduced in [15], with a small modification. To generate a new chromosome, the Moon Cross Over first selects a substring of an old chromosome; we do the same, but reverse this substring. Our cross over operator is described in Fig. 4.3, and Fig. 4.4 shows an example of it.

Figure 4.3 Cross over operator

child = null;
k = 0;
Choose 2 chromosomes c_a, c_b; c_a = a_1 a_2 ... a_n; c_b = b_1 b_2 ... b_n;
Select 2 genes a_i and a_j randomly from c_a (i < j);
        if a_i ≠ b_k then child = <child, a_i, b_k>;
        else child = <child, a_i>;
    Else if j == n then
        i = i - 1;
        k = k + 1;
        if a_i ≠ b_k then child = <b_k, a_i, child>;
        else child = <a_i, child>;
    Else
        i = i - 1;
        k = k + 1;
        if a_i ≠ b_k then child = <a_i, child, b_k>;
        else child = <a_i, child>;
End_While


Figure 4.4 Cross over operator example
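The idea of keeping a reversed substring of one parent can be illustrated with the simplified operator below. It is not a transcription of the procedure in Fig. 4.3: it is a stand-in that copies a randomly chosen substring of the first parent in reversed order and fills the remaining positions with the missing genes in the order in which they occur in the second parent. The function name and the filling rule are assumptions made for the example.

```python
import random


def reversed_substring_crossover(parent_a, parent_b, rng):
    """Simplified cross over on priority strings (permutations).

    A substring a_i..a_j of parent_a is reversed and kept at the same
    positions; all other positions are filled from parent_b, left to
    right, skipping genes that are already in the substring.
    """
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))
    segment = parent_a[i:j + 1][::-1]
    kept = set(segment)
    filler = iter([g for g in parent_b if g not in kept])
    head = [next(filler) for _ in range(i)]
    tail = list(filler)
    return head + segment + tail


rng = random.Random(3)
a = [1, 2, 3, 4, 5, 6]
b = [6, 4, 2, 1, 3, 5]
child = reversed_substring_crossover(a, b, rng)
print(child)
```

The child is again a permutation of the same priorities, so it can be decoded into a feasible schedule without any repair step.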

4.5 Mutation operator

In the mutation operation, two genes are randomly selected and swapped. Fig. 4.5 shows an example of the mutation operator.

Figure 4.5 Mutation operator
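A swap mutation of this kind can be written in a few lines. The sketch below assumes that a chromosome is a list of distinct priorities and returns a new list rather than modifying its argument; the function name is illustrative.

```python
import random


def swap_mutation(chromosome, rng):
    """Return a copy of the chromosome with two randomly chosen genes swapped."""
    i, j = rng.sample(range(len(chromosome)), 2)
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(7)
original = [3, 1, 4, 2, 5]
mutated = swap_mutation(original, rng)
print(original, "->", mutated)
```

Swapping two genes keeps the chromosome a permutation, so the mutated priority string still decodes to a valid topological order.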

4.6 Fitness function

The fitness function is the total cost from the first vertex to the last vertex of the sorted sequence of vertices. In our problem, the cost of the path between each pair of consecutive vertices is the PDT value for switching between the two instructions that correspond to these vertices. Our goal is to select the sequence that has the smallest fitness function value.
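The fitness evaluation amounts to summing the PDT entries of consecutive instruction pairs, as in the sketch below. The PDT is modelled as a dictionary keyed by ordered pairs of mnemonics, which is an assumed data layout, and the numbers in `pdt` are made-up illustrative values, not measurements from SimplePower.

```python
def schedule_cost(order, pdt):
    """Sum the overhead cost of every pair of consecutive instructions."""
    return sum(pdt[(prev, nxt)] for prev, nxt in zip(order, order[1:]))


# Illustrative overhead values only; real entries come from the PDT.
pdt = {
    ("lw", "add"): 3, ("add", "lw"): 3,
    ("add", "sw"): 2, ("sw", "add"): 2,
    ("lw", "sw"): 1, ("sw", "lw"): 1,
}

print(schedule_cost(["lw", "add", "sw"], pdt))  # 3 + 2 = 5
print(schedule_cost(["add", "lw", "sw"], pdt))  # 3 + 1 = 4
```

Of two feasible orders for the same block, the GA prefers the one with the lower sum, here the second.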
