Thesis: A Software Approach for Lower Power Consumption



A SOFTWARE APPROACH FOR LOWER POWER CONSUMPTION

Bui Ngoc Hai

Faculty of Information Technology University of Engineering and Technology

Vietnam National University, Hanoi

Supervised by

Assoc. Prof. Dr. Nguyen Ngoc Binh

A thesis submitted in fulfillment of the requirements for the degree of

Master of Science in Computer Science

April 2014


ORIGINALITY STATEMENT

“I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.”

Hanoi, April 25th, 2014

Signed


ABSTRACT

Optimizing power consumption is an important topic in embedded system engineering, especially for embedded systems that use a battery power source. Power optimization can be achieved by software techniques, and instruction scheduling is an effective software approach for reducing the power cost of processors. In this thesis, we propose using a genetic algorithm for low power instruction scheduling. Our algorithm is applied to each basic block of assembly code to generate a lower power program. In the experiment section, we use two open source simulation tools, SimpleScalar Tool Set and SimplePower. The algorithm is applied to assembly programs of the SimpleScalar Instruction Set; these programs are compiled and then have their power consumption measured by SimplePower. The experimental results show the effectiveness of our proposed method. This scheduling method will be combined with the idea of reducing memory access for low power design in our further work.


ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Dr. Nguyen Ngoc Binh, for giving me the opportunity to work with him and for his patient guidance and continuous support throughout the years.

I would like to give my honest appreciation to my colleagues at the Laboratory of Embedded Systems for their great support. I also would like to thank all my friends who gave me moral support during this work.

Finally, this thesis would not have been possible without the moral support and love of my parents and my brother. Thank you!


Table of Contents

1.2 Power optimization by instruction scheduling

1.4 Thesis organization

Chapter 2 Related Work

2.1 Software power estimation

2.2 Energy code driven generation for low power

2.3 Reducing memory access

2.4 Software power optimization using symbolic algebra

2.5 List scheduling for low power

2.6 Instruction scheduling to reduce switching activity

2.7 Low power instruction scheduling as traveling salesman problem

2.9 Instruction scheduling to reduce the off-chip power

2.10 Energy-oriented and performance-oriented combination scheduling

2.11 Criticality-directed and Uncriticality-directed instruction scheduling for low power

3.3 Data Flow Graph construction

Chapter 4 Genetic Algorithm for low power instruction scheduling

5.8 Analysis and evaluation

Chapter 6 Conclusion and Future Work

References

Appendix A Some important source code

Appendix B Source code of benchmark programs

Appendix C Power Dissipation Table

Appendix D An example of scheduling a basic block


List of Figures

Figure 2.1 List scheduling for low power

Figure 3.1 Flow of low power instruction scheduling

Figure 3.2 An example of a Basic Block and its Data Flow Graph

Figure 3.3 Examples of Basic Blocks

Figure 3.4 Algorithm to construct a DFG

Figure 3.5 PDT generation example

Figure 4.1 Topological sorting with random priorities assignment

Figure 4.2 Chromosome representation

Figure 4.3 Cross over operator

Figure 4.4 Cross over operator example

Figure 4.5 Mutation operator

Figure 4.6 Genetic algorithm for low power scheduling

Figure 5.1 Experimental framework

Figure 5.2 SimpleScalar simulator software architecture

Figure 5.3 SimplePower result example


List of Tables

Table 3.1 Instruction set architecture

Table 5.1 Experimental benchmark set

Table 5.2 Experimental results of GA scheduling

Table 5.3 Experimental results of list scheduling

Table 5.4 Results comparison of two algorithms


List of Abbreviations

GA Genetic Algorithm

ISA Instruction Set Architecture

PDT Power Dissipation Table

PISA Portable Instruction Set Architecture

PSO Particle Swarm Optimization

RAW Read after Write

RTL Register Transfer Level

TSP Travelling Salesman Problem

WAR Write after Read

WAW Write after Write


List of Notations

Sum

Energy

Power of a program

Power

Base cost of instruction i

Overhead cost between two instructions i and j

Number of instructions in a basic block

Number of times instruction i gets executed

Number of times the pair (i, j) gets executed

Energy cost of other effects of the program

Population at loop t

Solution i at loop t

Vertex i


Chapter 1

Introduction

In embedded system engineering, optimization is an important problem. Embedded systems always have limited resources such as the size of memory, the speed of the processor, and the power supply. Optimization makes the system work more efficiently within the allowed resources. Optimizing power consumption is an important issue, especially for embedded systems using a battery power source. Since embedded devices are portable and use DC powered cells, optimized power consumption can help prolong the lifetime of such systems. Today, embedded devices are becoming more popular in daily life as well as in science and technology. Many people are attracted by popular embedded devices such as smart phones, tablet computers and MP3 players. Ubiquitous devices will need a longer battery life. Since the ability to reduce the power consumption of an embedded device is increasingly important, this optimization problem has become a major challenge for designers and manufacturers. They must continually improve product quality to meet the needs of users, and low power consumption is a necessary requirement.

1.1 Software power optimization

Software controls most of the hardware activity in a system; therefore, it can have a significant effect on the power dissipation of the system. Power can be optimized by software techniques, and many such techniques have been proposed for optimizing the power consumption of the processor. Some techniques are instruction scheduling [1-10], reducing memory access, energy cost driven code generation, and instruction packing [2-3,11]. Since the order of instructions controls internal switching in the processor, it can affect the power of the processor during execution. Therefore, choosing a suitable order of instructions can reduce the power consumption of the system. In terms of energy consumption, memory accesses are more expensive than register accesses, so optimal register allocation that reduces the number of memory operands can also reduce power. If we can obtain a table of power costs of individual instructions, a reduction in total power cost can be obtained by using a code generator which selects proper low power instructions. Another method is optimizing program source code for low power. An example of this method is proposed in [12], where the authors optimize the C code by approximating complex expressions by simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced.

Some other techniques are software-controlled voltage and frequency scaling for dynamic power optimization [13].

1.2 Power optimization by instruction scheduling

One of the main features that affect the power consumption of a system is how the assembly instructions are scheduled or combined together: the power consumed during the execution of an instruction depends on the previous instruction. For a given program in C or another high level language, there can be more than one sequence of instructions (assembly code) for a given processor. Therefore, a suitable order of instructions in a program can result in lower power consumption. Instruction scheduling for low power is an effective software approach for power optimization; it reorders the assembly instructions of a program so that power consumption is reduced while keeping the semantics of the program. Many scheduling techniques have been proposed, most of which aim to reduce the overhead cost between pairs of instructions [2-5,7-9,11]. Moreover, some techniques have other objectives for power reduction, such as reducing switching activity [6,22] and using the critical path [14].

... the data flow graph, i.e., when we create a new order of instructions, we are not sure whether it satisfies the data flow graph or not. Hence, our approach uses a genetic algorithm with a chromosome encoding that solves the data dependency problems better. This method was introduced in [15], where the authors proposed it to solve the traveling salesman problem (TSP). We apply their method to the scheduling problem in order to reduce energy consumption. This algorithm has the advantage of avoiding local optima. For finding solutions in the large search space, we use a heuristic table, called the Power Dissipation Table (PDT), which is generated by power simulations. A PDT for an instruction set with n instructions is an (n × n) matrix, where each entry PDT(i, j) is the power cost consumed in the execution of instruction i followed by instruction j. Each entry is used as the overhead cost between i and j, and this table is used for evaluating the solution. The original assembly programs are divided into basic blocks, and then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that presents the data dependencies of instructions in a basic block. Our algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and its output is the low power instruction sequence. For experiments, we use two open source simulation tools: SimpleScalar Tool Set [16-18] and SimplePower [19]. A subset of the SimpleScalar Instruction Set is considered and SimplePower is used to simulate the power consumption. The algorithm is applied to assembly programs of the SimpleScalar ISA. Then these programs are compiled and their power consumption is measured by SimplePower for visual observation.

1.4 Thesis organization

The remainder of this thesis is organized as follows. Chapter 2 introduces related work on software power optimization and instruction scheduling for low power. Chapter 3 describes the steps of our low power instruction scheduling problem in detail. Chapter 4 presents the proposed approach based on a genetic algorithm for this problem. Chapter 5 reports the simulation tools, the benchmarks for the experiments, the experimental results and their analysis. Chapter 6 presents our conclusions and introduces future work.


Chapter 2

Related Work

This chapter summarizes related research on software power optimization.

2.1 Software power estimation

The first step for power optimization is power estimation. The ability to estimate software power consumption can help to verify that a design meets its power constraints and to verify the correctness of power optimization methods.

V. Tiwari [1,20-21] was the first researcher to propose an energy estimation model for a processor, and he also proposed the idea of scheduling assembly code for low power consumption. In his model, each instruction in the instruction set architecture consumes a fixed energy cost called the base energy cost. The base energy cost is computed as the product of the voltage and the average current in the processor while running a loop with a sequence of the same instruction. The other main component of the model is the inter-instruction effects. These include the effect of circuit state, the effect of resource constraints (e.g., pipeline stalls and write buffer stalls) and the effect of cache misses. The circuit state overhead (overhead cost) between a pair of instructions is the difference between the actual cost of the pair and the average of the base costs of the individual instructions.

The total power of a program is calculated as the sum of the base energy costs of all instructions and all inter-instruction effects. The total power cost $E_p$ of an assembly program is given by equation (2.1):

$$E_p = \sum_{i} (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_{k} E_k \qquad (2.1)$$

where $B_i$ is the base cost of each instruction i, $N_i$ is the number of times it is executed, and, for every pair of consecutive instructions (i, j), $O_{i,j}$ is the circuit state overhead between i and j and $N_{i,j}$ is the number of times this pair is executed. $E_k$ is the energy of the other inter-instruction effects, such as pipeline stalls, write buffer stalls and cache misses, that occur during program execution. Our scheduling method uses the energy model proposed by Tiwari for finding solutions.
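As an illustration of equation (2.1), the following minimal sketch evaluates the model for an executed instruction trace. The function name `program_energy`, the dictionary layout and all numeric costs are hypothetical and chosen only for this example; they are not measured values from the thesis.

```python
from collections import Counter

def program_energy(trace, base_cost, overhead, other_effects=()):
    """Evaluate equation (2.1) for an executed instruction trace.

    trace         -- opcode names in execution order
    base_cost     -- dict: opcode -> base cost B_i
    overhead      -- dict: (opcode_i, opcode_j) -> circuit state overhead O_ij
    other_effects -- iterable of E_k terms (stalls, cache misses, ...)
    """
    n_i = Counter(trace)                    # N_i: executions of each instruction
    n_ij = Counter(zip(trace, trace[1:]))   # N_ij: executions of each consecutive pair
    base = sum(base_cost[op] * n for op, n in n_i.items())
    inter = sum(overhead.get(pair, 0.0) * n for pair, n in n_ij.items())
    return base + inter + sum(other_effects)

# Hypothetical costs, for illustration only.
base_cost = {"addu": 1.0, "lw": 2.0}
overhead = {("addu", "lw"): 0.5, ("lw", "addu"): 0.25}
total = program_energy(["addu", "lw", "addu", "lw"], base_cost, overhead, [0.5])
print(total)
```

Pairs that are missing from `overhead` are treated as having zero overhead, which is a simplification of this sketch.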

2.2 Energy code driven generation for low power

The code generation method was proposed by V. Tiwari et al. [2-3]. This approach selects instructions based on their power cost. The main idea is that a reduction in total power cost can be obtained by using a code generator which selects suitable low power instructions.

2.3 Reducing memory access

V. Tiwari et al. also proposed a memory operand reduction approach in [2-3]. This approach is based on the fact that instructions with memory operands have a very high energy cost compared to instructions with register operands. Therefore, much energy reduction can be obtained by reducing the number of memory operands, and efficient register management can bring this benefit by replacing memory access instructions with register access instructions so that the semantics of the program are not changed.

2.4 Software power optimization using symbolic algebra

In [13], the authors optimized C code to reduce power cost. The main idea is approximating complex expressions by simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. This work is a sequence of techniques. First, an energy profiler is used to find all energy critical code blocks. Second, a tool is used to transform floating-point data to fixed-point data. Third, complex nonlinear arithmetic expressions are approximated by polynomials. Finally, the polynomial representations of the critical basic blocks are mapped to the instruction set using symbolic algebra.

2.5 List scheduling for low power

Instruction scheduling for low power was first presented by V. Tiwari [1-3,11], together with the instruction level power model and the idea of reordering assembly instructions. Instruction scheduling for low power aims to reduce the circuit state overhead (overhead cost), which is the energy dissipated due to switching from the execution of one instruction to another. His research indicated that instructions can be reordered to have a smaller amount of circuit state overhead; therefore, we can obtain low power with a suitable order of instructions. V. Tiwari used the list scheduling algorithm [11] for his experiments. It is a basic scheduling algorithm with a greedy strategy. In fact, list scheduling is just a simple topological sorting algorithm. The algorithm works on each basic block of instructions. From a data flow graph that presents the data dependencies of a basic block, at each step it chooses the instruction with the lowest overhead cost (the circuit state overhead between the previous instruction and it) from the priority list, where the priority list contains all instructions that do not depend on any unscheduled instruction. This simple algorithm is used to show the potential of power reduction by reordering assembly code. List scheduling is shown in Fig. 2.1.

2.6 Instruction scheduling to reduce switching activity

In [6], C.-L. Su et al. proposed a cold scheduling algorithm to reduce the switching activity of the processor. The authors suggested a memory addressing method using Gray code. Gray code has a one-bit difference in representation between consecutive numbers. The use of Gray code addressing can reduce the number of bit switches on the address buses, leading to a large reduction of power consumption because most program instructions access consecutively addressed memory locations. The cold scheduling technique is a software approach based on a traditional list scheduling algorithm to reduce the switching activity of the control path.
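The one-bit property of Gray code can be checked with a short sketch. It counts the bit switches seen on a bus when 16 consecutive addresses are sent in plain binary and in Gray code. The helper names `to_gray` and `bit_switches` are introduced here for illustration and do not come from [6].

```python
def to_gray(n: int) -> int:
    """Standard binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bit_switches(codes):
    """Total number of bit flips between consecutive words of a sequence."""
    return sum(bin(a ^ b).count("1") for a, b in zip(codes, codes[1:]))

addresses = list(range(16))                      # consecutive addresses 0..15
binary_switches = bit_switches(addresses)
gray_switches = bit_switches([to_gray(a) for a in addresses])
print(binary_switches, gray_switches)
```

For this sequence the Gray coded bus flips exactly one bit per step, while the binary bus flips several bits whenever a carry propagates.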


Input: List L, a DFG of a basic block

For each vertex v in DFG:
    if v has no predecessor, add v to L;
Choose a random vertex i from L;
Schedule i, then remove i from L, remove i and all arcs of i from DFG;
Update L by adding new vertices with no predecessor;
While L is not empty do
    Choose the highest priority j in L
    (overhead cost between the previous instruction and j is smallest);
    Schedule j;
    Remove j from L, remove j and all arcs of j from DFG;
    Update L by adding new vertices with no predecessor;
End_While
Return schedule

Figure 2.1 List scheduling for low power
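The procedure of Fig. 2.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in [11]: instructions are numbered 0..n-1, the DFG is given as a set of arcs, `cost(u, v)` is a caller-supplied overhead function, and the first instruction is taken deterministically (the first ready vertex) instead of randomly so that the result is reproducible.

```python
def list_schedule(n, arcs, cost):
    """Greedy list scheduling of instructions 0..n-1.

    arcs -- iterable of (u, v) pairs: v depends on u
    cost -- function (u, v) -> overhead of executing v right after u
    """
    preds = {v: set() for v in range(n)}
    succs = {v: set() for v in range(n)}
    for u, v in arcs:
        preds[v].add(u)
        succs[u].add(v)
    ready = [v for v in range(n) if not preds[v]]   # the list L of Fig. 2.1
    schedule, prev = [], None
    while ready:
        if prev is None:
            j = ready[0]                            # assumption: fixed first choice
        else:
            j = min(ready, key=lambda v: cost(prev, v))
        ready.remove(j)
        schedule.append(j)
        for s in sorted(succs[j]):                  # remove arcs of j, update L
            preds[s].discard(j)
            if not preds[s]:
                ready.append(s)
        prev = j
    return schedule

# Hypothetical overhead matrix for four instructions.
C = [[0, 5, 1, 9],
     [5, 0, 9, 1],
     [1, 9, 0, 2],
     [9, 1, 2, 0]]
order = list_schedule(4, {(0, 2), (1, 3)}, lambda u, v: C[u][v])
print(order)
```

Because the choice at each step is greedy, the result is a valid topological order but not necessarily the order with the lowest total overhead.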

2.7 Low power instruction scheduling as a traveling salesman problem

K. Choi et al. [4] presented another method by formulating the instruction scheduling problem as a traveling salesman problem (TSP). They used the minimum spanning tree and a simulated annealing technique for finding optimal solutions. The scheduling technique uses a power dissipation table (PDT), which is an (n × n) matrix, where each entry (i, j) is the average power consumed when instruction i is followed by instruction j. The scheduling algorithm uses a control flow and data dependency graph for each basic block and the PDT. The problem of instruction reordering for low power is transformed into finding the tour of lowest cost (TSP) in the constraint graph.

2.8 Force-directed instruction scheduling for low power

P. Dongale [5], in his master's thesis, proposed the force-directed instruction scheduling for low power algorithm (FD-ISLP). This is an application of the classic force-directed scheduling algorithm to the low power problem. Like Choi's method above, this method also uses a PDT as a heuristic table for finding a good instruction order for low power.

2.9 Instruction scheduling to reduce the off-chip power

Another low power scheduling technique was presented by H. Tomiyama et al. [22]. This method aims to reduce the off-chip power by reducing the bit switching on the off-chip buses. The scheduling algorithm attempts to find an optimal instruction order for each basic block by decreasing the difference between the binary representations of two consecutive instructions in memory. The power consumption is then reduced because the number of transitions on the data bus is minimized.

2.10 Energy-oriented and performance-oriented combination scheduling

A. Parikh et al. [7-8] proposed performance-oriented scheduling, energy-oriented scheduling, and a method combining these two approaches. The performance-oriented scheduling uses the list scheduling algorithm with time as the objective parameter. The energy-oriented approach also uses list scheduling, but uses circuit-state overhead (inter-instruction effect) as the objective parameter. In the energy-oriented scheduling, at each step, the scheduler selects the next node with the least circuit-state overhead. In the combined method, the selections of the scheduler are mainly based on one parameter, and the other parameter is considered only when there is a tie. Specifically, the decisions are based on circuit-state overhead, and the time parameter is used only when there is a tie between the overhead values.


2.11 Criticality-directed and uncriticality-directed instruction scheduling for low power

Two methods called criticality-directed and uncriticality-directed instruction scheduling for low power were proposed by S. Watanabe et al. [14]. The critical path is the longest path in a data flow graph (DFG). Instructions on a critical path determine the execution time of the program. These are called critical instructions. The first algorithm is based on the idea that the functional units in the processor differ in performance and power consumption. Only critical instructions are scheduled in fast and power-hungry units and the rest are scheduled in the slow and power-efficient ones, so the total power consumption can be reduced. In contrast, for the second algorithm, instead of finding the critical instructions, the authors proposed a method to exploit uncritical ones. Only uncritical instructions are scheduled in power-efficient units, and energy consumption can be reduced.
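The notion of critical instructions can be made concrete with a small sketch that finds every vertex lying on a longest path of a DFG. It is only an illustration of the definition above and not the scheduling algorithm of [14]. The function name, the per-instruction `latency` list and the example graph are assumptions, and the vertices are assumed to be numbered in program order so that every arc (u, v) has u < v.

```python
def critical_instructions(n, arcs, latency):
    """Return the set of vertices that lie on some longest path of a DAG.

    Vertices 0..n-1 must be numbered so that every arc (u, v) has u < v.
    latency[v] is the (assumed) execution time of instruction v.
    """
    preds = {v: [] for v in range(n)}
    succs = {v: [] for v in range(n)}
    for u, v in arcs:
        preds[v].append(u)
        succs[u].append(v)
    # longest path that ends at v (including v)
    to_v = [0] * n
    for v in range(n):
        to_v[v] = latency[v] + max((to_v[u] for u in preds[v]), default=0)
    # longest path that starts at v (including v)
    from_v = [0] * n
    for v in reversed(range(n)):
        from_v[v] = latency[v] + max((from_v[s] for s in succs[v]), default=0)
    length = max(to_v, default=0)
    # v is critical if the longest path through v has the critical length
    return {v for v in range(n) if to_v[v] + from_v[v] - latency[v] == length}

# Hypothetical block: two paths 0-1-3 and 0-2-3, plus an independent vertex 4.
critical = critical_instructions(
    5, {(0, 1), (1, 3), (0, 2), (2, 3)}, latency=[1, 3, 1, 1, 1])
print(sorted(critical))
```

In this example the instructions outside the returned set are the uncritical ones that the second method of [14] would send to the power-efficient units.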

2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm

C. Mian et al. [9] introduced a scheduling method based on the Particle Swarm Optimization (PSO) algorithm to reduce signal transitions. The authors map the low power instruction scheduling problem to a discrete PSO problem and modify the PSO solution to satisfy the data flow graph, in order to find a better instruction order with fewer signal transitions. The velocity updating formula is also improved in order to get better results.

2.13 Summary

In this chapter, we presented some existing work in software power optimization, focusing mainly on the instruction scheduling approach. The first step in optimizing the power dissipation of a program is power estimation. Software power consumption can then be reduced by reducing memory access, code generation, optimizing source code, and instruction scheduling.


Chapter 3

Instruction Scheduling for Low Power

This chapter describes the low power instruction scheduling problem and explains the steps needed to solve it.

3.1 Problem description

Our scheduling problem involves the following steps

* Construct a Data Flow Graph for each basic block

Original assembly programs are divided into basic blocks, then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that presents the data dependencies of instructions in a basic block. The scheduling algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs the low power instruction sequence. Scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is finding a topological order such that the total cost through all vertices is the smallest, or at least smaller than that of the original order; the cost between two consecutive vertices here is the overhead cost between them. The flow diagram of our instruction scheduling problem is shown in Fig. 3.1.

[Flow diagram beginning with Source Code and Assembly Code]

Figure 3.1 Flow of low power instruction scheduling

A basic block (BB) is a piece of code which has only one entry point, at its first instruction, and one exit point, after its last instruction. One entry point means that no instruction within it, other than the first, is the destination of a jump instruction anywhere in the program. One exit point means that only the last instruction can cause the program to begin executing instructions in a different basic block.

A Data Flow Graph (DFG) is a graph which presents the data dependencies of the instructions in a basic block. It is a directed acyclic graph where each instruction of the basic block is represented by a vertex and each arc represents the dependency of an instruction pair. Fig. 3.2 shows an example of a basic block and its DFG.

sw $fp,20($sp)
move $fp,$sp
sw $3,22($fp)
sw $6,24($fp)
li $4,0x00000002
sw $4,0($fp)

Figure 3.2 An example of a Basic Block and its Data Flow Graph

Here, we cannot measure the overhead cost between each pair of instructions directly, but we can measure the energy consumption of each pair of instructions. This power includes the base energy cost of each instruction and the overhead cost between them. By measuring the power consumption of pairs in this way, we build a Power Dissipation Table (PDT). This is a matrix where each element (i, j) represents the power dissipation when instruction i is followed by instruction j; this table is used instead of the overhead cost table.
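To show how the PDT is used to compare instruction orders, the sketch below sums the PDT entries over consecutive pairs of a schedule. The function name `schedule_cost`, the dictionary representation of the table and all numeric values are hypothetical; real entries would come from power simulation. The sketch does not check data dependencies, so it assumes that both orders are valid topological orders of the DFG.

```python
def schedule_cost(opcodes, pdt):
    """Sum the PDT entries over consecutive instruction pairs of a schedule."""
    return sum(pdt[(a, b)] for a, b in zip(opcodes, opcodes[1:]))

# Hypothetical PDT entries, for illustration only.
pdt = {
    ("lw", "addu"): 3.0,
    ("addu", "sw"): 2.5,
    ("lw", "sw"): 4.0,
    ("sw", "addu"): 2.0,
}
original_cost = schedule_cost(["lw", "sw", "addu"], pdt)
reordered_cost = schedule_cost(["lw", "addu", "sw"], pdt)
print(original_cost, reordered_cost)
```

A scheduler would prefer the second order here because its total pairwise cost is lower.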

3.2 Partitioning Basic Blocks of Assembly code

The source programs are written in C and compiled by the SimpleScalar compiler (ssbig-na-sstrix-gcc) to obtain assembly code for the SimpleScalar ISA. Then, these assembly programs are divided into basic blocks. The algorithm [23] for generating basic blocks from a listing of code is simple; it is described as follows.

First, find the set of leaders in the code; a leader is the first instruction of a basic block. An instruction is a leader if it falls into one of the following three categories:

- The first instruction is a leader

- The instruction that is a label (target) of a branch/jump instruction is a leader

- The instruction that follows a branch/jump instruction is a leader

For each leader, its basic block consists of this leader and all the following instructions until the next leader. Because the control can never pass through the end of a basic block, some block boundaries may have to be modified after finding the basic blocks. For example, in practice, labels, jump instructions and assembly directives are not taken into account in basic blocks.

Fig. 3.3 shows an example of some basic blocks:

addu $2,$3,1
move $3,$2

Figure 3.3 Examples of Basic Blocks
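The leader rules above can be sketched as a small Python function. This is a simplified illustration and not the tool used in the thesis. It assumes one instruction per line, labels on their own line ending with a colon, directives starting with a dot, and a hypothetical set `BRANCHES` of branch and jump opcodes. Every label is treated as a possible branch target, and branch instructions themselves are left out of the blocks, following the remark that labels, jumps and directives are not taken into account.

```python
# Assumed branch/jump opcodes; the exact set used in the thesis is not listed here.
BRANCHES = {"j", "jal", "jr", "beq", "bne"}

def split_basic_blocks(lines):
    """Split assembly text lines into basic blocks (lists of instructions)."""
    blocks, current = [], []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("."):
            continue                         # skip blank lines and directives
        is_label = text.endswith(":")
        is_branch = not is_label and text.split()[0] in BRANCHES
        if is_label or is_branch:
            # a label starts a new block; a branch ends the current one
            if current:
                blocks.append(current)
            current = []
            continue
        current.append(text)
    if current:
        blocks.append(current)
    return blocks

listing = [
    ".text",
    "main:",
    "addu $2,$3,1",
    "move $3,$2",
    "bne $4,$5,$L7",
    "sw $4,0($fp)",
    "$L7:",
    "lw $2,0($fp)",
    "j $31",
]
blocks = split_basic_blocks(listing)
print(blocks)
```

Treating every label as a leader may split a block more often than strictly necessary, which is safe for scheduling because it only restricts reordering.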

3.3 Data Flow Graph construction

A data flow graph is constructed for each basic block. This graph presents the data dependencies among the instructions in the given basic block. To construct the data flow graph, we need to understand the instruction set architecture and the data dependencies between its instructions. An instruction j is dependent on a previous instruction i if instruction j shares a register or a memory location with instruction i, and therefore j cannot be executed until instruction i has completed execution and its result has been written back.

There are three types of data dependencies:

- Read After Write (RAW): instructions i and j have a RAW dependency if i writes to a register or memory operand and after that, j reads from this location

- Write After Read (WAR): instructions i and j have a WAR dependency if i reads from a register or memory operand and after that, j writes to this location

- Write After Write (WAW): instructions i and j have a WAW dependency if they write to the same register or memory location

The algorithm to construct a data flow graph of a given basic block is shown in Fig. 3.4.

For each instruction i
Begin
    For each instruction j after i
    Begin
        [...]
        if i and j have WAR then create arc (i, j), and break;
    End
End

Figure 3.4 Algorithm to construct a DFG
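The three dependency types can be turned into a short DFG construction sketch. This is an assumption-laden illustration, not a transcription of Fig. 3.4: each instruction is described by hand as a pair of sets (locations read, locations written), no assembly parser is included, and an arc is added for every dependent pair without the `break` of the figure. The extra arcs are redundant but do not change which instruction orders are valid.

```python
def build_dfg(block):
    """Build the dependency arcs of a basic block.

    block -- list of (reads, writes) pairs, one per instruction in program
             order; each element is a set of register or memory location names
    Returns the set of arcs (i, j) with i < j, meaning j must stay after i.
    """
    arcs = set()
    for i, (reads_i, writes_i) in enumerate(block):
        for j in range(i + 1, len(block)):
            reads_j, writes_j = block[j]
            raw = writes_i & reads_j      # j reads what i wrote
            war = reads_i & writes_j      # j overwrites what i read
            waw = writes_i & writes_j     # both write the same location
            if raw or war or waw:
                arcs.add((i, j))
    return arcs

# Operand sets written by hand for three example instructions:
#   0: addu $2,$3,1    1: move $3,$2    2: sw $3,0($fp)
block = [
    ({"$3"}, {"$2"}),
    ({"$2"}, {"$3"}),
    ({"$3", "$fp"}, {"mem"}),
]
dfg = build_dfg(block)
print(sorted(dfg))
```

Memory is modeled here as a single location named "mem", a conservative simplification that orders all stores and loads against each other.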

3.4 Generating Power Dissipation Table

We use a subset of the SimpleScalar ISA for our experiments; it includes the 16 instructions listed in Table 3.1 below:

Table 3.1 Instruction set architecture

No. | Instruction | Description | Example
1 | addu | Add unsigned integer | addu $2,$3,$4
2 | subu | Subtract unsigned integer | subu $sp,$sp,16
6 | slt | Tests if one register is less than another | slt $5,$3,$6
7 | sra | Shift right arithmetic | sra $4,$5,3
8 | mflo | Move from LO register | mflo $5
12 | mult | Multiply two registers | mult $7,$8
 | jal | Jump and link - used to call a subroutine | jal BubbleSort
16 | bne | Branch on not equal | bne $4,$5,$L7

This subset of the instruction set architecture covers all the benchmark programs that are used for the experiments. Among the 16 instructions, there are 4 jump/branch instructions that can be ignored. So we have to generate a PDT with 12 × 12 elements.

To create the PDT, we do the following: each element (i, j) of the table is the power measured by running a program in which instruction i is followed by instruction j and then by the instruction nop, with this sequence repeated 20,000 times to avoid loop overheads. An example of PDT generation is described in Fig. 3.5. SimpleScalar Tool Set [16-18] and SimplePower [19] are used. We use the SimpleScalar ISA to write the assembly programs for the PDT, and the SimpleScalar compiler ssbig-na-sstrix-gcc is used to compile them. SimplePower is a power simulator for the SimpleScalar ISA; it is used to measure the elements of the PDT and to measure the power consumption of assembly programs. We automatically create 144 assembly files, each corresponding to one pair of instructions. These files are named in alphabetical order so that they can be processed automatically in SimplePower. By running all these programs, we obtain the PDT.

subu $2,$4,$6
nop
addu $3,$5,$7
subu $2,$4,$6
nop

Figure 3.5 PDT generation example
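The automatic creation of one test program per instruction pair can be sketched as follows. This is a hypothetical generator, not the script used in the thesis: the `EXAMPLES` table covers only three opcodes instead of the twelve, the operands are arbitrary, the file naming scheme is an assumption, and only the repeated body is produced, without the prologue and epilogue that a complete SimpleScalar assembly program needs.

```python
from itertools import product

# Hypothetical example instruction for each opcode (three of the twelve).
EXAMPLES = {
    "addu": "addu $3,$5,$7",
    "subu": "subu $2,$4,$6",
    "sra": "sra $4,$5,3",
}

def pdt_program(first, second, repeats=20000):
    """Body of the test program for PDT entry (first, second):
    the pair followed by nop, unrolled `repeats` times."""
    body = [EXAMPLES[first], EXAMPLES[second], "nop"] * repeats
    return "\n".join(body) + "\n"

def pdt_programs(repeats=20000):
    """Map an assumed file name to the program text for every ordered pair."""
    return {
        f"{a}_{b}.s": pdt_program(a, b, repeats)
        for a, b in product(sorted(EXAMPLES), repeat=2)
    }

programs = pdt_programs(repeats=2)       # small repeat count for display
print(len(programs))
print(programs["addu_subu.s"])
```

With all twelve opcodes in `EXAMPLES`, the same loop would yield the 144 files mentioned above.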

Chapter 4

Genetic Algorithm for Low Power Instruction Scheduling

... and building the fitness function. This chapter introduces the genetic algorithm (GA) approach based on topological sorting for the low power scheduling problem.

4.1 Genetic Algorithm

In computer science, a genetic algorithm (GA) is a heuristic search algorithm that imitates the process of natural evolution [24]. This algorithm is generally used to generate good solutions to search and optimization problems. Genetic algorithms belong to the larger family of evolutionary algorithms, which generate solutions to optimization problems using techniques based on natural evolution, such as inheritance, mutation, crossover, and selection. Genetic algorithms are applied in bioinformatics, computer science, economics, chemistry, engineering, mathematics, physics and other fields.

Genetic algorithms were developed in the 1970s through the work of Holland and his colleagues. The concept is based on the main idea of evolutionary theory. Genetic algorithms, like evolutionary algorithms in general, are built on the notion that natural evolution is the most perfect and most reasonable process, and that it is optimal. This notion can be seen as an axiom that is consistent with objective reality. The evolutionary process is optimal in the sense that a later generation is always better (more developed, more complete) than the previous generation. Throughout the process of natural evolution, new generations are generated to supplement and replace the older generations by two basic processes: reproduction and natural selection. To survive and develop, each individual has to adapt to the environment; individuals that adapt can survive, while poorly adapted individuals can be destroyed.

Each individual has a set of chromosomes. Each chromosome consists of many genes linked in a chain-like structure, representing the individual's traits. Individuals of the same species have the same chromosome structure but their gene structures are different; this makes the difference between individuals of the same species and decides the survival of the individual. Because the natural environment always changes, chromosome structures also change to adapt to the environment, and later generations are always more adaptable than former generations. These structures are formed through the random exchange of information with the external environment or between chromosomes.

From this idea, in a genetic algorithm each individual can have only one chromosome. The chromosome is divided into genes that are arranged in a linear array. Each individual (or chromosome) represents a possible solution of the problem. Each operation on the set of chromosomes is equivalent to searching for a solution in the solution space of the problem [25]. The search process has to achieve two goals:

- Exploiting the best solutions

- Considering the entire search space

A genetic algorithm (or any evolutionary algorithm) for solving a particular problem must include the following five components:

- Encoding solutions: a genetic representation for the solution of the problem

- Creating the initial population

- Constructing the fitness function: a function that evaluates solutions according to their level of "adaptation"

- Constructing genetic operators (selection, cross over and mutation)

- Choosing algorithm parameters (population size, number of generations, stop condition, cross over probability and mutation probability)

A GA searches for the optimal solution in many directions by maintaining a population of solutions and promoting the formation and exchange of information between these solutions. The population undergoes an evolutionary process: in each generation, new individuals are created by the cross over operator and the mutation operator, each applied with a certain probability. Then the relatively "good" individuals are retained while the relatively "bad" individuals are removed, creating a new generation that is better than the previous one.

At iteration t, the GA maintains a set of possible solutions (or individuals, or chromosomes) called the population $P(t) = \{x_1^t, x_2^t, \dots, x_n^t\}$, where the number of individuals n is the population size. Each solution $x_i^t$ is evaluated to determine its suitability. Some individuals are chosen for reproduction by cross over and mutation. Then a new set of solutions is formed by selecting the more suitable solutions. This leads to a new population $P(t+1)$, which is hoped to contain individuals that are better adapted than those of the previous population.
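The loop from one population to the next can be sketched as follows. This is a minimal, generic sketch and not the thesis implementation: the function name `run_ga`, the toy bit-string problem, the one-point cross over, the keep-the-fittest selection and all parameter values are assumptions chosen only for illustration.

```python
import random

def run_ga(fitness, n_genes, pop_size=20, generations=50,
           p_cross=0.8, p_mut=0.05, seed=0):
    """Generic GA loop that maximizes `fitness` over bit strings (illustrative)."""
    rng = random.Random(seed)
    # P(0): random initial population
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            child = a[:]
            if rng.random() < p_cross:           # one-point cross over
                cut = rng.randrange(1, n_genes)
                child = a[:cut] + b[cut:]
            # mutation: flip each gene with probability p_mut
            child = [g ^ 1 if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        # selection: keep the pop_size fittest of parents and offspring -> P(t+1)
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)

# Toy problem (assumption): maximize the number of 1 genes.
best = run_ga(fitness=sum, n_genes=16)
print(best, sum(best))
```

Because parents compete with their offspring in the selection step, the best fitness in this sketch never decreases from one generation to the next.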

Thus, essentially, a GA is an iterative algorithm. It aims to solve search problems based on an artificial selection mechanism and the evolution of genes. In this process, the survival of an individual depends on the features of its chromosome and on the selection process. A GA applies the operators selection, cross over and mutation to the chromosomes to create new chromosomes; in essence, selection copies chromosomes, cross over exchanges sub-chromosomes, and mutation modifies chromosomes.

A GA differs from conventional optimization algorithms in the following features:

- A GA works with an encoding of the variables, not with the variables themselves

- A GA searches from a population of individuals, not from a single point, which reduces the chance of stopping at a local optimum without reaching the global optimum

- A GA only needs the information from the fitness function in order to find good solutions; it does not need other supporting information

- The basic operations of the algorithm rely on random combination and on probabilistic, nondeterministic selection

The mechanism of a genetic algorithm is simple, yet it can be more powerful than conventional algorithms because of the evaluation and selection performed after each step. As a result, a GA can often approach the optimal solution faster than other algorithms.

4.2 Topological sorting

A topological sort is an ordering of the vertices of a directed acyclic graph such that if there is a path from vertex $v_i$ to vertex $v_j$, then $v_j$ appears after $v_i$ in the ordering. At each step, the topological sorting procedure selects any vertex without incoming edges and stores the vertex and its position. Then the vertex and all arcs leaving it are removed from the graph. As mentioned in chapter 3, scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order such that the total cost through all vertices is the smallest, or at least smaller than that of the original order. The cost between two consecutive vertices is the overhead cost between these two vertices.
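The basic procedure (repeatedly pick a vertex without incoming edges, record it, and delete its outgoing arcs) is commonly known as Kahn's algorithm. The sketch below illustrates it; the function name `topological_sort` and the four-instruction data flow graph are hypothetical examples, not taken from the thesis.

```python
from collections import deque

def topological_sort(vertices, edges):
    """Kahn's algorithm: repeatedly remove a vertex with no incoming edges."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:       # removing u deletes its outgoing arcs
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(vertices):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

# Hypothetical data flow graph: I3 needs I1 and I2, I4 needs I3.
dfg_vertices = ["I1", "I2", "I3", "I4"]
dfg_edges = [("I1", "I3"), ("I2", "I3"), ("I3", "I4")]
print(topological_sort(dfg_vertices, dfg_edges))
```

Here the first-in, first-out queue fixes which ready vertex is taken first; a different tie-breaking rule would produce a different valid order from the same graph.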

More than one sequence of vertices can be derived from a directed graph using this topological sorting procedure. To overcome this issue and obtain a feasible complete path from a directed graph, an ordering technique that combines the topological sort with a random assignment of priorities is used [15]. To derive a unique sequence from a directed graph, a different priority is randomly assigned to each vertex in the graph. Therefore, a string of priorities can represent a feasible path. The topological sorting procedure with priority assignment is shown in Fig 4.1.


Input: a Data Flow Graph

List L;

While (any vertex remains) do

If every vertex has a predecessor then

According to the topological sorting algorithm above, instead of using a sequence of vertices of the graph, we can use a string of priorities to represent a chromosome; each string of priorities represents one individual in the population. Fig 4.2 shows an example of chromosome representation.
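To make the representation concrete, the sketch below decodes a string of priorities into one feasible sequence: among the vertices that currently have no predecessor, the one with the highest priority is scheduled next. The function name `decode`, the "highest priority first" convention, and the example graph and priority values are assumptions made for illustration.

```python
def decode(priorities, edges):
    """Turn a priority string (vertex -> priority) into one topological order."""
    indegree = {v: 0 for v in priorities}
    successors = {v: [] for v in priorities}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    remaining = set(priorities)
    order = []
    while remaining:
        ready = [v for v in remaining if indegree[v] == 0]
        if not ready:                        # every vertex has a predecessor
            raise ValueError("graph is not acyclic")
        u = max(ready, key=priorities.get)   # highest priority first (assumed)
        order.append(u)
        remaining.remove(u)
        for v in successors[u]:              # delete the arcs leaving u
            indegree[v] -= 1
    return order

# Hypothetical dependencies: I3 needs I1 and I2, I4 needs I2.
edges = [("I1", "I3"), ("I2", "I3"), ("I2", "I4")]
chromosome_a = {"I1": 4, "I2": 1, "I3": 3, "I4": 2}
chromosome_b = {"I1": 1, "I2": 4, "I3": 2, "I4": 3}
print(decode(chromosome_a, edges))
print(decode(chromosome_b, edges))
```

Since the priorities are all different, each chromosome decodes to exactly one sequence, and every decoded sequence respects the dependencies of the graph.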

4.4 Cross Over operator

This operator generates a new chromosome from two old chromosomes. Our cross over operator is based on the Moon Cross Over introduced in [15], with a small modification. To generate a new chromosome, the Moon Cross Over first selects a substring of an old chromosome; we do the same, but reverse this string. Our cross over operator is described in Fig 4.3, and Fig 4.4 shows an example of it.

child = <a_i, a_{i+1}, ..., a_j>;
sub_c2 = the remaining substring that results from deleting the genes in child from c2 (sub_c2 = c2 - child);
While (length of child < n) do
    If i == 1 then
        i = n; k = k + 1;
        if a_i ≠ b_k then child = <child, a_i, b_k>;
        else child = <child, a_i>;
    Else if j
        i = i - 1;
        k = k + 1;
        if a_i ≠ b_k then child = <b_k, a_i, child>;
        else child = <a_i, child>;
    if a ≠ b_k then child = <a, child, b>;
    else child = <a, child>;

Figure 4.3 Cross over operator
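The sketch below is a simplified stand-in, not the operator of Fig 4.3. It only mirrors the two ideas stated in the text: a substring of one parent is selected and reversed, and the child is completed with genes of the other parent. It assumes that a chromosome is a list of distinct priorities; the function name `reversed_substring_crossover` and the example parents are hypothetical.

```python
import random

def reversed_substring_crossover(parent1, parent2, rng=random):
    """Simplified cross over on priority strings (lists of distinct priorities).

    Takes a random substring of parent1, reverses it, keeps it at the same
    position, and fills the other positions with the remaining priorities
    in the order in which they appear in parent2.
    """
    n = len(parent1)
    i, j = sorted(rng.sample(range(n + 1), 2))    # parent1[i:j] is non-empty
    segment = parent1[i:j][::-1]
    used = set(segment)
    rest = [g for g in parent2 if g not in used]  # parent2 order, no duplicates
    return rest[:i] + segment + rest[i:]

p1 = [3, 1, 4, 2, 5]
p2 = [5, 4, 3, 2, 1]
child = reversed_substring_crossover(p1, p2, random.Random(1))
print(child)
```

The child is again a list of distinct priorities, so it can be decoded into a feasible sequence in the same way as its parents.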


The fitness function is the total cost from the first vertex to the last vertex of the sorted sequence of vertices. In our problem, the cost of the path between each pair of consecutive vertices is the corresponding PDT value incurred when switching between the two instructions that correspond to these vertices. Our goal is to select the sequence that has the smallest fitness function value.
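As a small worked example, the fitness of a sequence can be computed by summing the switching cost of every pair of consecutive instructions. The cost values below are hypothetical placeholders for entries of the cost table (arbitrary units), and treating the cost as symmetric is an assumption made only to keep the example short.

```python
def fitness(sequence, overhead):
    """Total switching cost between consecutive instructions of a schedule."""
    return sum(overhead[(a, b)] for a, b in zip(sequence, sequence[1:]))

# Hypothetical costs between instruction pairs (arbitrary units).
pair_cost = {("I1", "I2"): 5, ("I1", "I3"): 2, ("I1", "I4"): 7,
             ("I2", "I3"): 6, ("I2", "I4"): 1, ("I3", "I4"): 4}
# Assumption: switching from a to b costs the same as switching from b to a.
overhead = {**pair_cost, **{(b, a): c for (a, b), c in pair_cost.items()}}

original = ["I1", "I2", "I3", "I4"]
reordered = ["I2", "I4", "I1", "I3"]
print(fitness(original, overhead), fitness(reordered, overhead))
```

With these placeholder costs the original order costs 5 + 6 + 4 = 15 and the reordered sequence costs 1 + 7 + 2 = 10, so the GA would prefer the reordered sequence, provided it still satisfies the dependencies of the graph.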

Title	A Software Approach for Lower Power Consumption
Author	Bui Ngoc Hai
Supervisor	Assoc. Prof. Dr. Nguyen Ngoc Binh
University	University of Engineering and Technology, Vietnam National University, Hanoi
Major	Computer Science
Type	Thesis
Year of publication	2014
City	Hanoi

Pages	66