A SOFTWARE APPROACH FOR LOWER POWER
CONSUMPTION
Bui Ngoc Hai
Faculty of Information Technology University of Engineering and Technology
Vietnam National University, Hanoi
Supervised by
Assoc Prof Dr Nguyen Ngoc Binh
A thesis submitted in fulfillment of the requirements for the degree of
Master of Science in Computer Science
April 2014
ORIGINALITY STATEMENT
"I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged."
Hanoi, April 25th, 2014
Signed
ABSTRACT
Optimizing power consumption is an important topic in embedded system engineering, especially for embedded systems that use a battery power source. Power optimization can be achieved by software techniques, and instruction scheduling is an effective software approach for reducing the power cost of the processor(s). In this thesis, we propose using a genetic algorithm for low power instruction scheduling. Our algorithm is applied to each basic block of assembly code to generate a lower power program. In the experiment section, we use two open source simulation tools, SimpleScalar Tool Set and SimplePower. The algorithm is applied to assembly programs of the SimpleScalar Instruction Set; these programs are compiled and then have their power consumption measured by SimplePower. The experimental results show the effectiveness of our proposed method. This scheduling method will be combined with the idea of reducing memory access for low power design in our further work.
ACKNOWLEDGEMENTS
First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Dr. Nguyen Ngoc Binh, for giving me the opportunity to work with him and for his patient guidance and continuous support throughout the years.
I would like to give my honest appreciation to my colleagues at the Laboratory of Embedded Systems for their great support. I also would like to thank all my friends who gave me moral support during this work.
Finally, this thesis would not have been possible without the moral support and love of my parents and my brother. Thank you!
Table of Contents
1.2 Power optimization by instruction scheduling
1.4 Thesis organization
Chapter 2 Related Work
2.1 Software power estimation
2.2 Energy code driven generation for low power
2.3 Reducing memory access
2.4 Software power optimization using symbolic algebra
2.5 List scheduling for low power
2.6 Instruction scheduling to reduce switching activity
2.7 Low power instruction scheduling as traveling salesman problem
2.9 Instruction scheduling to reduce the off-chip power
2.10 Energy-oriented and performance-oriented combination scheduling
2.11 Criticality-directed and Uncriticality-directed instruction scheduling for low power
3.3 Data Flow Graph construction
Chapter 4 Genetic Algorithm for low power Instruction scheduling
5.8 Analysis and evaluation
Chapter 6 Conclusion and Future Work
References
Appendix A Some important source code
Appendix B Source code of benchmark programs
Appendix C Power Dissipation Table
Appendix D An example of scheduling a basic block
List of Figures
Figure 2.1 List scheduling for low power
Figure 3.1 Flow of low power instruction scheduling
Figure 3.2 An example of a Basic Block and its Data Flow Graph
Figure 3.3 Examples of Basic Blocks
Figure 3.4 Algorithm to construct a DFG
Figure 3.5 PDT generation example
Figure 4.1 Topological sorting with random priorities assignment
Figure 4.2 Chromosome representation
Figure 4.3 Cross over operator
Figure 4.4 Cross over operator example
Figure 4.5 Mutation operator
Figure 4.6 Genetic algorithm for low power scheduling
Figure 5.1 Experimental framework
Figure 5.2 SimpleScalar simulator software architecture
Figure 5.3 SimplePower result example
List of Tables
Table 3.1 Instruction set architecture
Table 5.1 Experimental benchmark set
Table 5.2 Experimental results of GA scheduling
Table 5.3 Experimental results of list scheduling
Table 5.4 Results comparison of two algorithms
List of Abbreviations
GA    Genetic Algorithm
ISA   Instruction Set Architecture
PDT   Power Dissipation Table
PISA  Portable Instruction Set Architecture
PSO   Particle Swarm Optimization
RAW   Read after Write
RTL   Register Transfer Level
TSP   Travelling Salesman Problem
WAR   Write after Read
WAW   Write after Write
List of Notations
Sum
Energy
Power of a program
Power
Base cost of instruction i
Overhead cost between two instructions i and j
Number of instructions in a basic block
Number of times i gets executed
Number of times the pair (i, j) gets executed
Energy cost of other effects of the program
Population at the loop t
Solution i at the loop t
Vertex i
Chapter 1
Introduction
In embedded system engineering, optimization is an important problem. Embedded systems always have limited resources such as memory size, processor speed and power supply. Optimization makes the system work more efficiently within the available resources. Optimizing power consumption is an important issue, especially for embedded systems using a battery power source. Since embedded devices are portable and use DC powered cells, optimized power consumption can help prolong the lifetime of such systems. Today, embedded devices are becoming more popular in daily life as well as in science and technology. Many people have become attracted by popular embedded devices such as smart phones, tablet computers and MP3 players. Ubiquitous devices will need a longer battery life. As the ability to reduce the power consumption of an embedded device becomes more important, this optimization problem has become a major challenge for designers and manufacturers. They must continually improve product quality to meet the needs of users, and low power consumption is a necessary requirement.
1.1 Software power optimization
Software controls most of the hardware activity in a system; therefore, it can have a significant effect on the power dissipation of the system. Power can be optimized by software techniques. Many software techniques have been proposed for optimizing the power consumption of the processor. Some of them are instruction scheduling [1-10], reducing memory access, energy cost driven code generation, and instruction packing [2-3,11]. Since the order of instructions controls internal switching in the processor, it can affect the power of the processor during execution. Therefore, choosing a suitable order of instructions can reduce the power consumption of the system. In terms of energy consumption, memory accesses are more expensive than register accesses, so optimal register allocation that reduces the memory operands can also reduce power. If we can obtain a table of power costs of individual instructions, a reduction in total power cost can be obtained by using a code generator which selects proper low power instructions. Another method is optimizing program source code for low power. An example of this method is proposed in [12], where the authors optimize the C code by approximating complex expressions by simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. Some other techniques are software voltage control and frequency scaling for dynamic power optimization [13].
1.2 Power optimization by instruction scheduling
One of the main features that affect the power consumption of a system is how the assembly instructions are scheduled or combined together: the power consumed during the execution of an instruction depends on the previous instruction. For a given program in C or another high level language, there can be more than one sequence of instructions (assembly code) for a given processor. Therefore, a suitable order of instructions in a program can result in lower power consumption. Instruction scheduling for low power is an effective software approach for power optimization; it reorders the assembly instructions of a program so that power consumption is reduced while keeping the semantics of the program. Many scheduling techniques have been proposed, most of which aim to reduce the overhead cost between pairs of instructions [2-5,7-9,11]. Moreover, some techniques have other objectives for power reduction, such as reducing switching activity [6,22] and using the critical path [14].
the data flow graph, i.e. when we create a new order of instructions, we are not sure whether it satisfies the data flow graph or not. Hence, our approach uses a genetic algorithm with a chromosome encoding that handles the data dependency problem better. This method was introduced in [15], where the authors proposed it to solve the traveling salesman problem (TSP). We apply their method to the scheduling problem in order to reduce energy consumption. This algorithm has the advantage of avoiding local optima. For finding solutions in the large search space, we use a heuristic table, called the Power Dissipation Table (PDT), which is generated by power simulations. A PDT for an instruction set with n instructions is an (n × n) matrix, where each entry PDT(i, j) is the power cost consumed in the execution of instruction i followed by instruction j. Each entry is used as the overhead cost between i and j, and this table is used for evaluating solutions. The original assembly programs are divided into basic blocks, and then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that presents the data dependencies of the instructions in a basic block. Our algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and its output is the low power instruction sequence. For experiments, we use two open source simulation tools: SimpleScalar Tool Set [16-18] and SimplePower [19]. A subset of the SimpleScalar Instruction Set is considered and SimplePower is used to simulate the power consumption. The algorithm is applied to assembly programs of the SimpleScalar ISA. Then these programs are compiled and their power consumption is measured by SimplePower for visual observation.
1.4 Thesis organization
The remainder of this thesis is organized as follows. Chapter 2 introduces related work on software power optimization and instruction scheduling for low power. Chapter 3 describes the steps of our low power instruction scheduling problem in detail. Chapter 4 presents the proposed approach based on a genetic algorithm for this problem. Chapter 5 reports the simulation tools, the benchmarks for the experiments, the experimental results and their analysis. Chapter 6 presents our conclusions and introduces future work.
Chapter 2
Related Work
This chapter summarizes related research on software power optimization.
2.1 Software power estimation
The first step in power optimization is power estimation. The ability to estimate software power consumption helps to verify that a design meets its power constraints and to verify the correctness of power optimization methods.
V. Tiwari [1,20-21] was the first researcher to propose an energy estimation model for a processor, and he also proposed the idea of scheduling assembly code for low power consumption. In his model, each instruction in the instruction set architecture consumes a fixed energy cost called the base energy cost. The base energy cost is computed as the product of the voltage and the average current in the processor while running a loop with a sequence of the same instruction. The other main component of the model is the inter-instruction effects. These include the effect of circuit state, the effect of resource constraints (e.g. pipeline stalls and write buffer stalls) and the effect of cache misses. The circuit state overhead (overhead cost) between a pair of instructions is the power difference between the actual cost of the pair and the average of the base costs of the individual instructions.
The total power of a program is calculated as the sum of the base energy costs of all instructions and all inter-instruction effects. The total power cost $E_p$ of an assembly program is given by equation (2.1):
$$E_p = \sum_i (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_k E_k \qquad (2.1)$$
where $B_i$ is the base cost of each instruction i and $N_i$ is the number of times it is executed; for every pair of consecutive instructions (i, j), $O_{i,j}$ is the circuit state overhead between i and j and $N_{i,j}$ is the number of times this pair is executed. $E_k$ is the energy of the other inter-instruction effects, such as pipeline stalls, write buffer stalls and cache misses, that occur during program execution. Our scheduling method uses the energy model proposed by Tiwari for finding solutions.
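As a minimal sketch of how equation (2.1) can be evaluated, the Python fragment below counts $N_i$ and $N_{i,j}$ from an executed instruction trace. The function name `program_energy`, the cost values and the trace are illustrative assumptions, not measurements from this thesis, and the sum over $E_k$ is passed in as a single number.

```python
from collections import Counter

def program_energy(trace, base_cost, overhead, other_effects=0.0):
    """Evaluate equation (2.1) for an executed instruction trace.

    trace         -- opcodes in execution order (hypothetical input format)
    base_cost     -- B_i for each opcode
    overhead      -- O_ij for each consecutive opcode pair (i, j)
    other_effects -- the sum of E_k (stalls, cache misses), given as one number
    """
    n_i = Counter(trace)                    # N_i: executions of each instruction
    n_ij = Counter(zip(trace, trace[1:]))   # N_ij: executions of each consecutive pair
    base = sum(base_cost[i] * n for i, n in n_i.items())
    circuit = sum(overhead.get(pair, 0.0) * n for pair, n in n_ij.items())
    return base + circuit + other_effects

# Illustrative numbers only.
base_cost = {"addu": 1.0, "subu": 1.5, "sw": 2.0}
overhead = {("addu", "subu"): 0.5, ("subu", "sw"): 0.25, ("sw", "addu"): 0.75}
trace = ["addu", "subu", "sw", "addu", "subu"]
energy = program_energy(trace, base_cost, overhead, other_effects=1.0)
print(energy)  # 7.0 (base) + 2.0 (overhead) + 1.0 (other effects) = 10.0
```

Only the middle term depends on the instruction order, which is why reordering instructions can change the total.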
2.2 Energy code driven generation for low power
The code generation method was proposed by V. Tiwari et al. [2-3]. This approach selects instructions based on their power cost. The main idea is that a reduction in total power cost can be obtained by using a code generator which selects suitable low power instructions.
2.3 Reducing memory access
V. Tiwari et al. also proposed a memory operand reduction approach in [2-3]. This approach is based on the fact that instructions with memory operands have a very high energy cost compared to instructions with register operands. Therefore, a large energy reduction can be obtained by reducing the number of memory operands, and efficient register management can bring this benefit by replacing memory access instructions with register access instructions in such a way that the semantics of the program are not changed.
2.4 Software power optimization using symbolic algebra
In [13], the authors optimized C code to reduce power cost. The main idea is approximating complex expressions by simple polynomial expressions and converting floating point data to fixed point data, so that the energy dissipation of the program execution is reduced. This work is a sequence of techniques. First, an energy profiler is used to find all energy critical code blocks. Second, a tool is used to transform floating-point data to fixed-point data. Third, complex nonlinear arithmetic expressions are approximated by polynomials. Finally, the polynomial representations of the critical basic blocks are mapped to the instruction set using symbolic algebra.
2.5 List scheduling for low power
Instruction scheduling for low power was first presented by V. Tiwari [1-3,11], together with the instruction level power model and the idea of reordering assembly instructions. Instruction scheduling for low power aims to reduce the circuit state overhead (overhead cost), which is the energy dissipated due to switching from the execution of one instruction to another. His research indicated that instructions can be reordered to obtain a smaller amount of circuit state overhead; therefore, we can obtain low power with a suitable order of instructions. V. Tiwari used the List Scheduling Algorithm [11] for his experiments. It is a basic scheduling algorithm with a greedy strategy. In fact, list scheduling is just a simple topological sorting algorithm. The algorithm operates on each basic block of instructions. From a data flow graph that presents the data dependencies of a basic block, at each step it chooses the instruction with the lowest overhead cost (the circuit state overhead between the previous instruction and it) from the priority list, where the priority list is the list containing all instructions that have no dependence on any other instruction. This simple algorithm is used to show the potential of power reduction by reordering assembly code. List scheduling is shown in Fig. 2.1.
2.6 Instruction scheduling to reduce switching activity
In [6], C.-L. Su et al. proposed a cold scheduling algorithm to reduce the switching activity of the processor. The authors suggested a memory addressing method using Gray code, in which the representations of consecutive numbers differ in exactly one bit. The use of Gray code addressing can reduce the number of bit switches on the address buses, leading to a large reduction in power consumption because most program instructions access consecutively addressed memory locations. The cold scheduling technique is a software approach based on a traditional list scheduling algorithm to reduce the switching activity of the control path.
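The one-bit property of Gray code can be checked with a short Python sketch. It uses the standard binary-reflected Gray code `n ^ (n >> 1)`; the 16-address range and the helper names are illustrative assumptions and do not model any particular bus from [6].

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def bit_flips(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Count bus transitions while stepping through 16 consecutive addresses.
binary_flips = sum(bit_flips(a, a + 1) for a in range(15))
gray_flips = sum(bit_flips(to_gray(a), to_gray(a + 1)) for a in range(15))
print(binary_flips, gray_flips)  # 26 transitions in plain binary, 15 in Gray code
```

Each step between consecutive Gray-coded addresses toggles exactly one line, whereas plain binary counting toggles up to four lines in this range.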
Input: List L, a DFG of a basic block;
For each vertex v in DFG:
    if v has no predecessor, add v to L;
Choose a random vertex i from L;
Schedule i, then remove i from L, remove i and all arcs of i from DFG;
Update L by adding new vertices with no predecessor;
While L is not empty do
    Choose the highest priority j in L;
    (overhead cost between the previous instruction and j is smallest)
    Schedule j;
    Remove j from L, remove j and all arcs of j from DFG;
    Update L by adding new vertices with no predecessor;
End_While
Return schedule
Figure 2.1 List scheduling for low power
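The greedy procedure of Fig. 2.1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the cited work: the DFG is given as a hypothetical mapping from each vertex to its set of predecessors, `opcode` maps vertices to mnemonics, and `pdt` holds made-up overhead costs for opcode pairs.

```python
import random

def list_schedule(preds, opcode, pdt, rng=random):
    """Greedy list scheduling: always pick the ready vertex with the lowest overhead.

    preds  -- {vertex: set of predecessor vertices} (assumed DFG encoding)
    opcode -- {vertex: mnemonic}
    pdt    -- {(previous mnemonic, next mnemonic): overhead cost}
    """
    remaining = {v: set(p) for v, p in preds.items()}
    ready = [v for v, p in remaining.items() if not p]   # the list L
    if not ready:
        return []
    schedule = []
    current = rng.choice(ready)          # the first instruction is chosen at random
    while True:
        schedule.append(current)
        ready.remove(current)
        del remaining[current]
        for v, p in remaining.items():   # remove the arcs of the scheduled vertex
            if current in p:
                p.discard(current)
                if not p:
                    ready.append(v)      # v has no predecessor left
        if not ready:
            return schedule
        prev = opcode[current]
        current = min(ready, key=lambda v: pdt[(prev, opcode[v])])

# Hypothetical basic block: vertex 0 feeds 1 and 2, vertex 1 feeds 3.
preds = {0: set(), 1: {0}, 2: {0}, 3: {1}}
opcode = {0: "li", 1: "addu", 2: "sw", 3: "addu"}
pdt = {("li", "addu"): 3.0, ("li", "sw"): 5.0,
       ("addu", "addu"): 1.0, ("addu", "sw"): 2.0}
schedule = list_schedule(preds, opcode, pdt)
print(schedule)  # [0, 1, 3, 2]
```

Because each choice only looks one instruction ahead, the result is a valid topological order but not necessarily the order with the lowest total cost.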
2.7 Low power instruction scheduling as a traveling salesman problem
K. Choi et al. [4] presented another method by formulating the instruction scheduling problem as a traveling salesman problem (TSP). They used the minimum spanning tree and a simulated annealing technique for finding optimal solutions. The scheduling technique uses a power dissipation table (PDT), which is an (n × n) matrix, where each entry (i, j) is the average power consumed when instruction i is followed by instruction j. The scheduling algorithm uses a control flow and data dependency graph for each basic block and the PDT. The problem of instruction reordering for low power is transformed into finding the tour of lowest cost (TSP) in the constraint graph.
2.8 Force-directed instruction scheduling for low power
P. Dongale [5], in his master thesis, proposed the algorithm force-directed instruction scheduling for low power (FD-ISLP). This is an application of the classic force-directed scheduling algorithm to the low power problem. Like Choi's method above, this method also uses a PDT as a heuristic table for finding a good instruction order for low power.
2.9 Instruction scheduling to reduce the off-chip power
Another low power scheduling technique was presented by H. Tomiyama et al. [22]. This method aims to reduce the off-chip power by reducing the bit switching on the off-chip buses. The scheduling algorithm attempts to find an optimal instruction order for each basic block by decreasing the difference between the binary representations of two consecutive instructions in memory. The power consumption is then reduced because the number of transitions on the data bus is minimized.
2.10 Energy-oriented and performance-oriented combination scheduling
A. Parikh et al. [7-8] proposed performance-oriented scheduling, energy-oriented scheduling, and a method combining these two approaches. Performance-oriented scheduling uses the list scheduling algorithm with time as the objective parameter. The energy-oriented approach also uses list scheduling, but uses the circuit-state overhead (inter-instruction effect) as the objective parameter. In energy-oriented scheduling, at each step the scheduler selects the next node with the least circuit-state overhead. In the combined method, the selections of the scheduler are mainly based on one parameter, and the other parameter is considered only when there is a tie. Specifically, the decisions are based on the circuit-state overhead, and the time parameter is used only when there is a tie between the overhead values.
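The tie-breaking rule of the combined method can be illustrated with a lexicographic key in Python. This is only a sketch: `pick_next`, the `latency` table standing in for the time parameter, and all numbers are hypothetical and are not taken from [7-8].

```python
def pick_next(ready, prev, overhead, latency, energy_first=True):
    """Select the next instruction from the ready list.

    With energy_first=True the circuit-state overhead decides and the time
    parameter only breaks ties; with energy_first=False the roles are swapped.
    """
    if energy_first:
        key = lambda v: (overhead[(prev, v)], latency[v])
    else:
        key = lambda v: (latency[v], overhead[(prev, v)])
    return min(ready, key=key)

# Illustrative values: addu and sw tie on overhead after li.
overhead = {("li", "addu"): 2, ("li", "sw"): 2, ("li", "mult"): 3}
latency = {"addu": 2, "sw": 3, "mult": 1}
ready = ["addu", "sw", "mult"]

energy_choice = pick_next(ready, "li", overhead, latency)                # "addu"
speed_choice = pick_next(ready, "li", overhead, latency, energy_first=False)  # "mult"
```

Comparing tuples gives the "primary parameter first, secondary only on a tie" behaviour without any explicit tie-handling code.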
2.11 Criticality-directed and uncriticality-directed instruction scheduling for low power
Two methods called Criticality-directed and Uncriticality-directed instruction scheduling for low power were proposed by S. Watanabe et al. [14]. The critical path is the longest path in a data flow graph (DFG). Instructions on a critical path determine the execution time of the program. These are called critical instructions. The first algorithm is based on the idea that the functional units in the processor differ in performance and power consumption. Only critical instructions are scheduled on fast and power-hungry units and the rest are scheduled on the slow and power-efficient ones, so the total power consumption can be reduced. In contrast, for the second algorithm, instead of finding the critical instructions, the authors proposed a method to exploit uncritical ones. Only uncritical instructions are scheduled on power-efficient units, and energy consumption can be reduced.
2.12 Low power instruction scheduling using Particle Swarm Optimization algorithm
C. Mian et al. [9] introduced a scheduling method based on the Particle Swarm Optimization (PSO) algorithm to reduce signal transitions. The authors map the low power instruction scheduling problem to a discrete PSO problem and modify the PSO solution to satisfy the data flow graph, in order to find a better instruction order with fewer signal transitions; the velocity updating formula is also improved in order to get better results.
2.13 Summary
In this chapter, we presented some existing works in software power optimization, focusing mainly on the instruction scheduling approach. The first step in optimizing the power dissipation of a program is power estimation. Software power consumption can be reduced by reducing memory access, code generation, optimizing source code, and instruction scheduling.
Chapter 3
Instruction Scheduling for Low Power
This chapter describes the low power instruction scheduling problem and explains the steps needed to solve it.
3.1 Problem description
Our scheduling problem involves the following steps:
- Divide the assembly program code into basic blocks
- Construct a Data Flow Graph for each basic block
- Apply the scheduling algorithm to each basic block
Original assembly programs are divided into basic blocks, then a Data Flow Graph (DFG) is constructed for each basic block. This is a directed graph that presents the data dependencies of the instructions in a basic block. The scheduling algorithm is applied to each basic block of an assembly program; it takes as input the data flow graph of a given basic block and the power dissipation table, and it outputs the low power instruction sequence. Scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is finding a topological order such that the total cost through all vertices is the smallest, or at least smaller than that of the original order; the cost between two consecutive vertices is the overhead cost between them. The flow diagram of our instruction scheduling problem is shown in Fig. 3.1.
[Flow diagram; recoverable stages: Source Code, Assembly Code]
Figure 3.1 Flow of low power instruction scheduling
A basic block (BB) is a piece of code which has only one entry point and one exit point after the last instruction in the BB. One entry point means that there is no instruction within it that is the destination of a jump instruction anywhere in the program. One exit point means that only the last instruction can cause the program to begin executing instructions in a different basic block.
A Data Flow Graph (DFG) is a graph which presents the data dependencies of the instructions in a basic block. It is a directed acyclic graph where each instruction of the basic block is represented by a vertex and each arc represents the dependency of an instruction pair. Fig. 3.2 shows an example of a basic block and its DFG.
sw $fp,20($sp)
move $fp,$sp
sw $3,22($fp)
sw $6,24($fp)
li $4,0x00000002
sw $4,0($fp)
Figure 3.2 An example of a Basic Block and its Data Flow Graph
Here, we cannot measure the overhead cost between each pair of instructions directly, but we can measure the energy consumption of each pair of instructions. This power includes the base energy cost of each instruction and the overhead cost between them. By measuring the power consumption of pairs in this way, we build a Power Dissipation Table (PDT). This is a matrix, where each element (i, j) of the table represents the power dissipation when instruction i is followed by instruction j. This table is used instead of the overhead cost table.
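Given such a table, the cost of a candidate instruction order is the sum of the PDT entries of its consecutive pairs. The Python sketch below illustrates this evaluation; the function name `schedule_cost`, the dictionary encoding of the PDT and all values are assumptions made for the example, and data dependencies are ignored here.

```python
def schedule_cost(sequence, pdt):
    """Total cost of an instruction order: sum of PDT entries of consecutive pairs."""
    return sum(pdt[(a, b)] for a, b in zip(sequence, sequence[1:]))

# Made-up 3 x 3 PDT, indexed by (previous opcode, next opcode).
pdt = {
    ("addu", "addu"): 4, ("addu", "subu"): 7, ("addu", "sw"): 5,
    ("subu", "addu"): 6, ("subu", "subu"): 4, ("subu", "sw"): 8,
    ("sw", "addu"): 5,   ("sw", "subu"): 9,   ("sw", "sw"): 4,
}

original = ["addu", "subu", "addu", "sw"]
reordered = ["subu", "addu", "addu", "sw"]   # assumed to respect the DFG
original_cost = schedule_cost(original, pdt)     # 7 + 6 + 5 = 18
reordered_cost = schedule_cost(reordered, pdt)   # 6 + 4 + 5 = 15
```

A scheduler can use such a function to compare candidate orders of the same basic block.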
3.2 Partitioning Basic Blocks of Assembly code
The source programs are written in C and compiled by the SimpleScalar compiler (ssbig-na-sstrix-gcc) to obtain assembly code of the SimpleScalar ISA. Then, these assembly programs are divided into basic blocks. The algorithm [23] for generating basic blocks from a listing of code is simple and is described as follows.
First, find the set of leaders in the code; a leader is the first instruction of a basic block. Leaders fall into one of the following three categories:
- The first instruction is a leader
- The instruction that is a label (target) of a branch/jump instruction is a leader
- The instruction that follows a branch/jump instruction is a leader
For each leader, its basic block consists of this leader and all the following instructions until the next leader. Because the control can never pass through the end of a basic block, some block boundaries may have to be modified after finding the basic blocks. For example, labels, jump instructions and assembly directives are not taken into account in basic blocks.
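The leader rules above can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the thesis: labels are lines ending with a colon, directives start with a dot, every label is treated as a potential branch target, and `BRANCH_OPS` is an illustrative set of branch/jump mnemonics.

```python
BRANCH_OPS = {"j", "jal", "jr", "beq", "bne"}   # assumed branch/jump mnemonics

def split_basic_blocks(lines):
    """Split assembly lines into basic blocks using the leader rules.

    Labels, branch/jump instructions and directives end or separate blocks
    but are not stored in them, as described in the text.
    """
    blocks, current = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        is_label = line.endswith(":")
        is_branch = not is_label and line.split()[0] in BRANCH_OPS
        if is_label or is_branch:
            # The next instruction is a leader, so close the current block.
            if current:
                blocks.append(current)
                current = []
            continue
        if line.startswith("."):        # assembler directive
            continue
        current.append(line)
    if current:
        blocks.append(current)
    return blocks

listing = [
    "main:",
    "addu $2,$3,1",
    "move $3,$2",
    "bne $4,$5,$L7",
    "subu $2,$2,1",
    "$L7:",
    "sw $2,0($fp)",
    "j $31",
]
blocks = split_basic_blocks(listing)
# [['addu $2,$3,1', 'move $3,$2'], ['subu $2,$2,1'], ['sw $2,0($fp)']]
```

Each returned block can then be passed on to data flow graph construction and scheduling.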
Fig 3.3 shows an example of some basic blocks:
addu $2,$3,1
move $3,$2
Figure 3.3 Examples of Basic Blocks
3.3 Data Flow Graph construction
A data flow graph is constructed for each basic block. This graph presents the data dependencies among the instructions in the given basic block. To construct the data flow graph, we need to understand the instruction set architecture and the data dependencies between its instructions. An instruction j is dependent on a previous instruction i if instruction j shares a register or a memory location with instruction i, and therefore j cannot be executed until instruction i has completed execution and its result has been written back.
There are three types of data dependencies:
- Read After Write (RAW): instructions i and j have a RAW dependency if i writes to a register or memory operand and after that, j reads from this location
- Write After Read (WAR): instructions i and j have a WAR dependency if i reads from a register or memory operand and after that, j writes to this location
- Write After Write (WAW): instructions i and j have a WAW dependency if they write to the same register or memory location
The algorithm to construct a data flow graph of a given basic block is shown in Fig. 3.4.
For each instruction j after i Begin
    if i and j have WAR then create arc (i, j), and break;
End
End
Figure 3.4 Algorithm to construct a DFG
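The three dependency types can be checked with set intersections, as in the Python sketch below. It is a conservative illustration and not a transcription of Fig. 3.4: every dependent pair receives an arc, and each instruction is given as a hypothetical pair of read and write location sets instead of being parsed from assembly text.

```python
def build_dfg(block):
    """Return {(i, j): set of dependency kinds} for a basic block.

    block -- list of (reads, writes) pairs, one per instruction, where both
             elements are sets of register or memory location names
    """
    arcs = {}
    for i, (reads_i, writes_i) in enumerate(block):
        for j in range(i + 1, len(block)):
            reads_j, writes_j = block[j]
            kinds = set()
            if writes_i & reads_j:
                kinds.add("RAW")     # j reads what i wrote
            if reads_i & writes_j:
                kinds.add("WAR")     # j overwrites what i read
            if writes_i & writes_j:
                kinds.add("WAW")     # both write the same location
            if kinds:
                arcs[(i, j)] = kinds
    return arcs

# Hypothetical block:
#   0: li   $4,2        writes $4
#   1: addu $5,$4,$4    reads $4, writes $5
#   2: li   $4,7        writes $4
#   3: subu $6,$7,$8    reads $7,$8, writes $6
block = [
    (set(), {"$4"}),
    ({"$4"}, {"$5"}),
    (set(), {"$4"}),
    ({"$7", "$8"}, {"$6"}),
]
dfg = build_dfg(block)
# {(0, 1): {'RAW'}, (0, 2): {'WAW'}, (1, 2): {'WAR'}}
```

Instruction 3 has no arcs, so a scheduler is free to place it anywhere in the block.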
3.4 Generating Power Dissipation Table
We use a subset of the SimpleScalar ISA for our experiments. It includes the 16 instructions listed in Table 3.1 below:
Table 3.1 Instruction set architecture
No. | Instruction | Description | Example
1 | addu | Add unsigned integer | addu $2,$3,$4
2 | subu | Subtract unsigned integer | subu $sp,$sp,16
6 | slt | Tests if one register is less than another | slt $5,$3,$6
7 | sra | Shift right arithmetic | sra $4,$5,3
8 | mflo | Move from LO register | mflo $5
12 | mult | Multiply two registers | mult $7,$8
 | jal | Jump and link - used to call a subroutine | jal BubbleSort
16 | bne | Branch on not equal | bne $4,$5,$L7
This subset of the instruction set architecture covers all the benchmark programs that are used for the experiments. Among the 16 instructions, there are 4 jump/branch instructions that can be ignored, so we have to generate a PDT with 12 × 12 elements.
To create the PDT, we do the following: each element (i, j) of the table is the power measured by placing instruction j after instruction i, followed by the instruction nop, with this sequence repeated 20,000 times to avoid loop overheads. An example of PDT generation is described in Fig. 3.5. SimpleScalar Tool Set [16-18] and SimplePower [19] are used. We use the SimpleScalar ISA to create the assembly programs for the PDT, and the SimpleScalar compiler ssbig-na-sstrix-gcc is used to compile them. SimplePower is a power simulator for the SimpleScalar ISA; it is used to measure the elements of the PDT and to measure the power consumption of assembly programs. We automatically create 144 assembly files, each corresponding to a pair of instructions. These files are named in alphabetical order so that they can be processed automatically in SimplePower. Running all these programs, we obtain the PDT.
subu $2,$4,$6
nop
addu $3,$5,$7
subu $2,$4,$6
nop
Figure 3.5 PDT generation example
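The automatic creation of one measurement program per instruction pair might look like the following Python sketch. It only builds the repeated instruction body; the file naming scheme `pdt_<i>_<j>.s`, the operand templates and the absence of any program prologue are assumptions for illustration, not the format actually used in this thesis.

```python
from itertools import product

# Hypothetical operand templates for three of the non-branch instructions.
TEMPLATES = {
    "addu": "addu $3,$5,$7",
    "subu": "subu $2,$4,$6",
    "sra": "sra $4,$5,3",
}

def pair_program(first, second, repeats=20000):
    """Unrolled body: instruction i, instruction j, nop, repeated `repeats` times."""
    return "\n".join([first, second, "nop"] * repeats) + "\n"

def all_pair_programs(templates, repeats=20000):
    """Map an assumed file name to the program text for every ordered pair."""
    programs = {}
    for a, b in product(sorted(templates), repeat=2):
        programs[f"pdt_{a}_{b}.s"] = pair_program(templates[a], templates[b], repeats)
    return programs

# A small repeat count keeps the example short; the text uses 20,000.
programs = all_pair_programs(TEMPLATES, repeats=4)
print(len(programs))  # 9 programs for 3 instructions, 144 for 12
```

Sorting the mnemonics gives file names in alphabetical order, which keeps the mapping between files and PDT entries predictable.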
and building fitness function. This chapter introduces the genetic algorithm (GA) approach based on topological sorting for the low power scheduling problem.
4.1 Genetic Algorithm
In computer science, a genetic algorithm (GA) is a heuristic search algorithm that imitates the process of natural evolution [24]. This algorithm is generally used to generate good solutions to search and optimization problems. Genetic algorithms belong to the larger family of evolutionary algorithms, which generate solutions to optimization problems using techniques based on natural evolution, such as inheritance, mutation, crossover, and selection. Genetic algorithms are applied in bioinformatics, computer science, economics, chemistry, engineering, mathematics, physics and other fields.
Genetic Algorithms (GA) were developed in the 1970s through the work of Holland and his colleagues. This concept is based on the main idea of evolutionary theory. Genetic algorithms, as well as evolutionary algorithms in general, are founded on the notion that natural evolution is the most perfect and most reasonable process, and that it is optimal. This concept can be seen as an axiom that is consistent with objective reality. The evolutionary process represents the optimum in the sense that the later generation is always better (more developed, more complete) than the previous generation. Throughout the process of natural evolution, new generations are generated to supplement and replace the older generations by two basic processes: reproduction and natural selection. To survive and develop, each individual has to adapt to the environment; individuals that adapt can survive, while poorly adapted individuals can be destroyed.
Each individual has a set of chromosomes. Each chromosome consists of many genes linked in a chain-like structure, representing this individual's traits. Individuals of the same species have the same chromosome structure but their gene structures are different; this makes the difference between individuals of the same species and decides the survival of the individual. Because the natural environment always changes, chromosome structures also change to adapt to the environment, and later generations are always more adaptable than former generations. These structures are formed due to the random exchange of information with the external environment or between chromosomes.
From this idea, each individual can have only one chromosome. The chromosome is divided into genes that are arranged in a linear array. Each individual (or chromosome) represents a possible solution of the problem. Each operation on the set of chromosomes is equivalent to finding a solution in the solution space of the problem [25]. The search process has to achieve two goals:
- Exploiting the best solutions
- Considering the entire search space
A genetic algorithm (or any evolutionary algorithm) that solves a particular problem must include the following five components:
- Encoding solutions: a genetic representation for the solutions of the problem
- Creating the initial population
- Constructing the fitness function: a function that evaluates solutions according to their level of "adaptation"
- Constructing genetic operators (selection, cross over and mutation)
- Choosing algorithm parameters (population size, number of generations, stop condition, cross over probability and mutation probability)
A GA searches for the optimal solution in many directions by maintaining a population of solutions and promoting the formation of these solutions and the exchange of information between them. The population undergoes an evolutionary process: in each generation, new individuals are created by the cross over operator and the mutation operator with a certain probability. Then the relatively "good" individuals are retained while the relatively "bad" individuals are removed, creating a new generation better than the previous one.
At loop $t$, the GA maintains a set of possible solutions (or individuals, or chromosomes) called the population $P(t) = \{x_1^t, x_2^t, \ldots, x_n^t\}$ (the number of individuals $n$ is the population size). Each solution $x_i^t$ is evaluated to determine its suitability. Some individuals are chosen for reproduction by cross over and mutation. Then a new set of solutions is formed by selecting the more suitable solutions. This leads to a new population $P(t+1)$, which is hoped to contain individuals that are better adapted than those of the previous population.
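The loop from P(t) to P(t+1) described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this thesis: the function name run_ga, its parameters, the default parameter values, and the toy bit-string problem are all hypothetical, and selection is simplified to keeping the fittest individuals among parents and offspring.

```python
import random

def run_ga(fitness, init_individual, crossover, mutate,
           pop_size=20, generations=50, p_cross=0.8, p_mut=0.1, seed=0):
    """Generic generational GA that minimizes `fitness` (illustrative sketch)."""
    rng = random.Random(seed)
    # P(0): the initial population
    population = [init_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            child = parent
            if rng.random() < p_cross:      # cross over with some probability
                mate = rng.choice(population)
                child = crossover(parent, mate, rng)
            if rng.random() < p_mut:        # mutation with some probability
                child = mutate(child, rng)
            offspring.append(child)
        # Selection: keep the pop_size fittest individuals to form P(t+1)
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return min(population, key=fitness)

# Toy problem (hypothetical): minimize the number of 1-bits in a 12-bit string.
def toy_init(rng):
    return [rng.randint(0, 1) for _ in range(12)]

def toy_cross(a, b, rng):
    cut = rng.randrange(1, len(a))          # one-point cross over
    return a[:cut] + b[cut:]

def toy_mutate(a, rng):
    i = rng.randrange(len(a))               # flip one randomly chosen bit
    return a[:i] + [1 - a[i]] + a[i + 1:]

best = run_ga(sum, toy_init, toy_cross, toy_mutate)
print(best, sum(best))
```

Because the fittest individuals of the parents and the offspring are always kept, the best fitness in the population never gets worse from one generation to the next in this sketch.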
Thus, essentially, a GA is an iterative algorithm. It aims to solve a search problem based on an artificial selection mechanism and the evolution of genes. In this process, the survival of an individual depends on the features of its chromosome and on the selection process. A GA applies the operators selection, cross over and mutation to the chromosomes to create new chromosomes; these operators essentially copy chromosomes, modify chromosomes and exchange sub-chromosomes.
A GA differs from conventional optimization algorithms in the following features:
- A GA works with an encoding of the variables, not with the variables themselves
- A GA searches over a population of individuals, not from a single point, so it reduces the risk of stopping at a local optimum without reaching the global optimum
- A GA only needs the information from the fitness function in order to find good solutions; it does not need other supporting information
- The basic operations of the algorithm are based on randomness and on probabilistic, nondeterministic selection
The mechanism of a genetic algorithm is simple, yet it can be more powerful than conventional algorithms because of the evaluation and selection performed after each step. As a result, a GA can often reach a good solution faster than such algorithms.
4.2 Topological sorting
A topological sort is an ordering of the vertices of a directed acyclic graph such that if there is a path from vertex $v_i$ to vertex $v_j$, then $v_j$ appears after $v_i$ in the ordering. In each step of the topological sorting procedure, any vertex without incoming edges is selected, and the vertex and its position are stored. Then the vertex and all the arcs leaving it are removed from the graph. As mentioned in chapter 3, scheduling is similar to the topological sorting problem: from the Data Flow Graph, we have to choose an order that satisfies the constraints of the graph. Our scheduling problem is to find a topological order such that the total cost through all vertices is the smallest, or at least smaller than that of the original order. The cost between two consecutive vertices here is the overhead cost between them.
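The basic procedure (repeatedly selecting a vertex without incoming edges and removing it with its outgoing arcs) is commonly known as Kahn's algorithm. The sketch below is an illustration only; the function names and the four-vertex example graph are hypothetical and are not taken from the experiments in this thesis.

```python
from collections import deque

def topological_order(num_vertices, edges):
    """Kahn's algorithm: repeatedly remove a vertex with no incoming edges."""
    successors = {v: [] for v in range(num_vertices)}
    in_degree = {v: 0 for v in range(num_vertices)}
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1
    ready = deque(v for v in range(num_vertices) if in_degree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:             # "remove" the arcs leaving u
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)
    if len(order) != num_vertices:
        raise ValueError("graph has a cycle; no topological order exists")
    return order

def is_valid_schedule(order, edges):
    """Check that every edge (u, v) has u placed before v in the order."""
    position = {v: i for i, v in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)

# Hypothetical data flow graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(topological_order(4, edges))           # [0, 1, 2, 3]
print(is_valid_schedule([0, 2, 1, 3], edges))  # True: another valid order
```

The example graph admits two valid orders, [0, 1, 2, 3] and [0, 2, 1, 3], which is exactly the freedom that the scheduling problem exploits when searching for a cheaper order.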
More than one sequence of vertices can be derived from a directed graph using this topological sorting procedure. To overcome this issue and to obtain a feasible complete path from a directed graph, an ordering technique that combines the topological sort with a random assignment of priorities is used [15]. To derive a unique sequence from a directed graph, a different priority is randomly assigned to each vertex in the graph. Therefore, a string of priorities can represent a feasible path. The topological sorting procedure with priority assignment is shown in Fig 4.1.
Input: a Data Flow Graph
List L;
While (any vertex remains) do
    If every vertex has a predecessor then
According to the topological sorting algorithm above, instead of using a sequence of vertices of the graph, we can use a string of priorities to represent a chromosome; each string of priorities represents one individual in the population. Fig 4.2 shows an example of the chromosome representation.
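To illustrate how a string of priorities determines one feasible sequence, the following sketch decodes a chromosome into a schedule. It is a minimal sketch under stated assumptions: the function name decode_priorities is hypothetical, a larger number is assumed to mean a higher priority, and the example graph and priority strings are made up.

```python
def decode_priorities(priorities, edges):
    """Decode a priority string (one distinct priority per vertex) into a schedule.

    Assumption: among the vertices without remaining predecessors, the one
    with the largest priority value is scheduled next.
    """
    n = len(priorities)
    successors = {v: [] for v in range(n)}
    in_degree = [0] * n
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1
    remaining = set(range(n))
    schedule = []
    while remaining:
        ready = [v for v in remaining if in_degree[v] == 0]
        if not ready:
            # every remaining vertex has a predecessor: the graph is cyclic
            raise ValueError("no feasible order exists")
        chosen = max(ready, key=lambda v: priorities[v])
        schedule.append(chosen)
        remaining.remove(chosen)
        for w in successors[chosen]:        # remove the arcs leaving `chosen`
            in_degree[w] -= 1
    return schedule

# Hypothetical data flow graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(decode_priorities([4, 1, 3, 2], edges))  # [0, 2, 1, 3]
print(decode_priorities([1, 4, 2, 3], edges))  # [0, 1, 2, 3]
```

Since every priority string decodes to an order that respects the graph, genetic operators can modify the priorities freely without producing infeasible schedules.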
4.4 Cross Over operator
This operator generates a new chromosome from two old chromosomes. Our cross over operator is based on the operator called Moon Cross Over introduced in [15], with a small modification. To generate a new chromosome, the Moon Cross Over first selects a substring of an old chromosome; we do the same, but reverse this string. Our cross over operator is described in Fig 4.3, and Fig 4.4 shows an example of it.
child = <a_i, a_(i+1), ..., a_j>
sub_c1 = the remaining substring resulting from deleting the genes in child from c1
sub_c2 = c2 - child
While (length of child < n) do
    If i == 1 then
        j = j+1; k = k+1;
        if a_j ≠ b_k then child = <child, a_j, b_k>;
        else child = <child, a_j>;
    Else if j == n then
        i = i-1;
        k = k+1;
        if a_i ≠ b_k then child = <b_k, a_i, child>;
        else child = <a_i, child>;
    Else
        if a_i ≠ b_k then child = <a_i, child, b_k>;
        else child = <a_i, child>;
Figure 4.3 Cross over operator
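The general idea of building a child from a reversed substring of one parent and the remaining genes of the other parent can be sketched as below. This is a simplified, order-based variant written only for illustration; it is not the exact operator of Fig 4.3, and the function names and example chromosomes are hypothetical. Chromosomes are assumed to be strings of distinct priorities.

```python
import random

def crossover_at(parent_a, parent_b, i, j):
    """Simplified sketch (not the exact operator of Fig 4.3).

    Takes the substring parent_a[i..j], reverses it, keeps it at position i,
    and fills the other positions with the remaining genes in the order in
    which they appear in parent_b.
    """
    segment = parent_a[i:j + 1][::-1]       # reversed substring of parent a
    used = set(segment)
    filler = [g for g in parent_b if g not in used]
    return filler[:i] + segment + filler[i:]

def reversed_substring_crossover(parent_a, parent_b, rng):
    """Pick the substring bounds at random, then apply crossover_at."""
    i, j = sorted(rng.sample(range(len(parent_a)), 2))
    return crossover_at(parent_a, parent_b, i, j)

# Hypothetical priority strings
a = [1, 2, 3, 4, 5]
b = [3, 5, 1, 4, 2]
print(crossover_at(a, b, 0, 1))             # [2, 1, 3, 5, 4]
print(reversed_substring_crossover(a, b, random.Random(1)))
```

The child always contains each priority exactly once, so it remains a valid chromosome for the priority-based representation.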
The fitness function is the total cost from the first vertex to the last vertex of the sorted sequence of vertices. In our problem, the cost of the path between each pair of consecutive vertices is the corresponding PDT value for switching between the two instructions that correspond to these vertices. Our goal is to select the sequence that has the smallest fitness function value.
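As a small illustration of this fitness function, the sketch below sums the overhead cost over each pair of consecutive vertices in a schedule. The function name schedule_cost, the dictionary layout of the overhead table, and all cost values are hypothetical; they are not measured PDT values.

```python
def schedule_cost(schedule, overhead):
    """Total overhead cost between consecutive vertices of a schedule."""
    return sum(overhead[(u, v)] for u, v in zip(schedule, schedule[1:]))

# Hypothetical overhead table: overhead[(u, v)] is the cost of executing
# instruction v directly after instruction u (made-up numbers).
overhead = {
    (0, 1): 3, (0, 2): 1,
    (1, 2): 2, (2, 1): 2,
    (1, 3): 1, (2, 3): 4,
}

candidates = [[0, 1, 2, 3], [0, 2, 1, 3]]
for s in candidates:
    print(s, schedule_cost(s, overhead))    # costs 9 and 4

best = min(candidates, key=lambda s: schedule_cost(s, overhead))
print(best)                                 # [0, 2, 1, 3]
```

Both candidate orders are valid for the same dependency constraints, but the second one has the smaller fitness value and would therefore be preferred.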