MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
GENETIC ALGORITHMS FOR SOLVING
BOUNDED DIAMETER MINIMUM
SPANNING TREE PROBLEM
By
Huynh Thi Thanh Binh
Supervisor: Associate Professor Nguyen Duc Nghia
A Dissertation submitted in partial fulfillment of the requirements
for the Degree of Doctor of Philosophy in Engineering
Hanoi, 2011
2.3.2.2 Center-Based Tree Construction Algorithm
2.3.2.4 Improved Greedy Heuristics (RGH-I and CBTC-I)
2.3.2.5 Hierarchical clustering heuristic algorithm HCH
2.3.3 Metaheuristic algorithms
2.3.4 Conclusion
3.4 Discussion
4 Genetic algorithm with multi-parent recombination operator
5 Multi-population Genetic Algorithm
5.1 Structure of the genetic algorithm
5.2.1 Problem instances
5.2.2 Experiment setup
Results
6 Steady-state genetic algorithm
6.1 Steady-state genetic algorithm structure
6.1.1 Individual representation and initial population
6.1.2 Crossover
6.1.3 Mutation
6.1.4 Selection
6.2 Replacement policy
6.3 Experiments
6.3.1 Problem instances
6.3.2 Experiment setup
6.3.3 Parameter
List of Figures
1.1 Scheme of genetic algorithm
2.1 The BDST with 19 vertices and bounded diameter D = 4; 7 is the center of the tree
2.2 The BDST with 19 vertices and bounded diameter D = 5; v1, v2 are the centers of the tree
2.3 The best BDST found by OTTC algorithm on the Euclidean problem
A spanning tree on twelve nodes and its edge-set representation
A spanning tree on eleven nodes and its permutation-code representation
2.10 Center Move Mutation
2.11 Edge Delete Mutation
2.12 Subtree-Optimize Mutation
A star-like structure of a typical solution to the BDMST problem
Greedy Edge Delete Local search
The best BDST found by CBRC heuristic on the Euclidean problem instance with n = 100, D = 10
The best BDST found by CBRC-I heuristic on the Euclidean problem instance with n = 100, D = 10
Comparison between the sum of the best solutions found by EA-xy2 algorithm on all the problem instances
Comparison between the sum of the best solutions found by EA-xy5 algorithm on all the problem instances
Comparison between the sum of the best solutions found by EA-xy7 algorithm on all the problem instances
Comparison between the sum of the best solutions found by EA-xy9 algorithm on all the problem instances
Comparison between the sum of the best solutions found by EA-xdk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
Comparison between the sum of the best solutions found by EA-xgk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
Comparison between the sum of the best solutions found by EA-xmk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
Comparison between the sum of the average solutions found by EA-xy2 algorithm on all the problem instances (x = b, r, l)
Comparison between the sum of the average solutions found by EA-xy5 algorithm on all the problem instances
Comparison between the sum of the average solutions found by EA-xy7 algorithm on all the problem instances
Comparison between the sum of the average solutions found by EA-xy9 algorithm on all the problem instances (x = b, r, l)
Comparison between the sum of the average solutions found by EA-xdk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
Comparison between the sum of the average solutions found by EA-xgk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
4.14 Comparison between the sum of the average solutions found by EA-xmk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)
4.15 Comparison between the best solution found by GA1, GA2, GA3, GA4, GA5, GA6 on all the problem instances
4.16 Comparison between the standard deviation of the solution found by GA1, GA2, GA3, GA4, GA5, GA6 on all the problem instances
Multi-population model
‘Lhe comparision between the best results found by GAn, GA, GAr
CAy und AGA on the instance with n — 250, D — 15, instunee Ls
‘Lhe comparision between the mean results found by Gan, Gly, GAr
GAy aud HGA on the instance with n — 250, D — 15, instance Ls
‘Lhe number of individuals from GAu1, GAv,, GAts, GArs migrate to GA pinat
Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Euclidean instances of the BDMST problem with n = 100 and D = 5, 7, 9, 11, 13, 15
Euclidean instances of the BDMST problem with n = 250 and D = 15, 20, 23, 25, 27, 30, 35
Non-Euclidean instances of the BDMST problem with n = 100 and D =
Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 250 and D = 5, 10, 13, 15, 17, 20, 25
Non-Euclidean instances of the BDMST problem with n = 500 and D = 10, 15, 18, 20, 22, 23, 30
Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 1000 and D = 15, 20, 23, 25, 27, 30, 35
The rate of the heuristic algorithms used for initialization of the population in each experimental genetic algorithm
Comparison between the results found by EA-xy2; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances
Comparison between the results found by EA-xy5; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances
Comparison between the results found by EA-xy7; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances
Comparison between the results found by EA-xy9; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances
Comparison between the results with different crossover probability on the Euclidean problem instance with 250 vertices, D = 15
Comparison between the results with different crossover probability on the Euclidean problem instance with 250 vertices, D = 15
Comparison between the results found by RJ-ESEA, PEA-I, HGA, MHGA on the 20 Euclidean problem instances
Comparison between the results found by RJ-ESEA, PEA-I, HGA, MHGA on the 20 Non-Euclidean problem instances
Result of Gan, GA, GAy, GA and LGA on 20 Luclidean 11442"
Results of PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I on the 20 Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000
Average number of iterations required by PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I to reach the best solution on the 20 Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000
Results of PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I on the 20 Non-Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000
Average number of iterations required by PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I to reach the best solution on the 20 Non-Euclidean instances of the BDMST problem of size 100, 250, 500
Abstract
The Bounded Diameter Minimum Spanning Tree (BDMST) problem is a combinatorial optimization problem that arises in many applications, such as the design of wire-based communication networks under quality-of-service requirements and the design of linear lightwave networks, where it can minimize interference by limiting the traffic in the network lines. Another practical application requiring a BDMST arises in data compression, where some algorithms compress a file utilizing a tree data structure and decompress a path in the tree to access a record. The problem also arises in ad hoc wireless networks and in distributed mutual exclusion algorithms.
Let G = (V, E) be a connected undirected graph with positive edge weights w(e) (e is an edge of the graph). The BDMST problem can be formulated as follows: among the spanning trees of G whose diameters do not exceed a given upper bound D ≥ 2, find a spanning tree with the minimal cost. [...] studies of the BDMST problem, and without loss of generality, we will assume that G is a complete graph.
This problem is known to be NP-hard for 4 ≤ D < |V| - 1. Moreover, the BDMST problem has also been shown to be hard to approximate: there is no polynomial-time algorithm that can guarantee a solution whose cost is within log(|V|) of the optimum, unless P = NP. Therefore, heuristic and metaheuristic techniques are currently the only practical methods for obtaining good solutions to the BDMST problem, especially when |V| is large.
In this thesis, we survey the literature on the BDMST and then present new algorithms for solving this problem.
First, we propose a greedy heuristic algorithm called Center-Based Recursive Clustering (CBRC). We extend the concept of center to each level of the partially constructed spanning tree. The algorithm can be seen as recursively clustering the vertices of the graph: every internal node of the spanning tree is the center of the sub-graph in the subtree rooted at this node, and we recurse to find the best center. The new heuristic is compared with other well-known heuristics for solving the BDMST problem, namely the One-Time-Tree-Construction (OTTC), the Randomized Greedy Heuristic (RGH) of Raidl and Julstrom, the Center-Based Tree Construction (CBTC) of Julstrom, and the Randomized Greedy Heuristic with post-improvement (RGH-I) and Center-Based Tree Construction with post-improvement (CBTC-I) of Singh and Gupta.
Next, we introduce a multi-parent recombination operator in genetic algorithms (GAs) for solving the BDMST problem. The proposed multi-parent recombination operator allows more than two parents to be used to create offspring. We consider three different methods for choosing parents and three new methods for adding edges from the parents to the offspring. For each combination of a parent-choosing method and an edge-adding method, we also experiment with genetic algorithms that use different numbers of parents. We discuss and analyze the efficiency of using different heuristic algorithms to initialize the population in genetic algorithms for solving the BDMST problem.
We present a new genetic algorithm (GA) which uses multiple populations, where each population is initialized with a different well-known heuristic. The individuals in each population subsequently compete for positions in a selection population, using a simulated annealing mechanism based on proportionate selection. In the selection population, they combine and evolve toward the optimum. We compare our results with other GAs.
Besides generational genetic algorithms, many researchers have recently become interested in steady-state genetic algorithms. We present steady-state genetic algorithms which use different heuristic algorithms for decoding. We modify the decoder and the replacement policy used in PEA-I so as to improve its performance. We use four decoders based on different well-known heuristic algorithms: RGH, RGH-I, CBRC and CBRC-I.
Experimental results are also reported to compare the efficiency of different heuristic and genetic algorithms for solving the BDMST problem.
Acknowledgements
First of all, I would like to thank my supervisor, Professor Nguyen Duc Nghia, for his guidance and assistance, which significantly contributed to the further development of my research and writing skills. I further want to thank Professor RI Bob McKay and Dr Nguyen Xuan Hoai, who helped me so much during the time I was a PhD student.
I also would like to thank the committee members: Professor Hoang Van Kiem, Professor Nguyen Thuc Hai, Professor Nguyen Thanh Thuy, Associate Professor Thai Cong Cuong, Associate Professor Huynh Quyet Thang, Associate Professor Tran Dinh Khang, Associate Professor Ngo Quoc Tao, Dr Ho Cam Ha, Dr Bui Thu Lam, Dr Ngo Hong Son, Dr Truong Thi Dieu Linh, Dr Le Minh Hoang, Dr Le Trong Vinh.
I would like to give special thanks to my parents, my husband and my daughters, who gave me unconditional support and encouragement during the long time I needed to conduct research and write this thesis.
Also, thanks to the Ministry of Education and Training, Hanoi University of Science and Technology, and the National Foundation for Science and Technology Development for their funding for my research. I would like to thank my colleagues at the School of Information and Communication Technology and my friends for their comments and encouragement.
Chapter 1
Introduction
Network design problems are active topics in research. The selection of an optimal configuration or design of a network occurs in many different application contexts, including transportation (airline, railroad, traffic, and mass transit), communication (telephone and computer networks), electric power systems, and oil and gas pipelines. Many real-world problems can be mapped to a formulation dealing with nodes and edges within a graph. For example, telephone companies are particularly interested in the minimum spanning tree, because the minimum spanning tree of a set of sites defines the wiring scheme that connects the sites using as little wire as possible. It is the mother of all network design problems. The minimum spanning tree is a fundamental problem and can be solved in polynomial time using Prim's or Kruskal's algorithm.
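As a point of reference for the unconstrained problem, the following is a minimal sketch of Kruskal's algorithm with a simple union-find structure. The function name `kruskal_mst`, the `(weight, u, v)` edge format and the small example graph are illustrative assumptions, not taken from the thesis.

```python
def kruskal_mst(n, edges):
    """Return (total_weight, tree_edges) of a minimum spanning tree.

    Assumes a connected graph on vertices 0..n-1; `edges` is a list of
    (weight, u, v) tuples. Names and format are illustrative only.
    """
    parent = list(range(n))

    def find(x):
        # Find the representative of x's component, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):      # scan edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
            total += w
    return total, tree


# Hypothetical 4-site example.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
weight, tree = kruskal_mst(4, edges)
```

Note that this procedure places no limit on the diameter of the resulting tree, which is exactly the constraint the BDMST problem adds.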
Another example concerns a traffic network whose nodes represent both origin and destination areas for the vehicular traffic of a city and also intersections in the road network. The arcs correspond to streets in the city, and the arc flows are the amounts of traffic traversing the streets. A typical network design problem would be to select a subset of the possible road improvements subject to a budget constraint. The design objective would be to minimize the total travel cost for all travelers in the city network.
It is interesting to see the wide range of network models that are related to the fixed charge design problem. If all arc construction costs are set to zero, then the fixed charge design model becomes a series of shortest path problems. If all arc routing costs are set to zero, the fixed charge design model becomes a Steiner tree problem on a graph. Since the fixed charge design problem contains the Steiner problem as a special case, we can [...] and totally dominate the routing costs (i.e., the optimal network design must be a tree), then the fixed charge design problem becomes the optimum communication spanning tree problem defined by Hu [25].
Scott 4] has introduced another network synthesis problem, called the "optimal network" problem, that is closely related to the fixed charge design problem. The arc routing costs in this problem are all linear functions of the total flow. Arc capacities, which are all initially zero, can be raised to infinity. The objective is to minimize the total routing cost, subject to the usual capacity and flow routing constraints and the added constraint that the total construction costs cannot exceed a given budget. Optimal network problem is [...]
In communication network design, requirements can be, for example, a limitation of the maximum communication delay or the guarantee of a minimum signal-to-noise ratio; thus, the number of relaying nodes on any path between two communication partners needs to be restricted. This problem is the BDMST.
The BDMST problem has many applications: in the design of wire-based communication networks under quality-of-service requirements; in linear lightwave networks, where it can minimize interference in the network by limiting the traffic in the network lines; in data compression, where some algorithms compress a file utilizing a tree data structure and decompress a path in the tree to access a record; in ad hoc wireless networks; and in distributed mutual exclusion algorithms. More details about the applications of the BDMST are [...]
1.1 Motivation
The BDMST problem has applications in several areas, such as communication network design, distributed mutual exclusion, linear lightwave networks and bit-compression for information retrieval. The theses of Abdalla [1] and Martin Gruber [17] present detailed information about the motivation for the BDMST. Additional fields of application are described in [34], where the BDMST appears as a subproblem within the vehicle routing problem. Paper [3] deals with ad hoc wireless networks, while paper [6] presents dynamic routing algorithms for multicasting in a linear lightwave network. We consider several applications below.
In communication network design, the requirement can be a limitation of the maximum communication delay or the guarantee of a minimum signal-to-noise ratio. Thus, the number of relaying nodes on any path between two communication partners needs to be limited by a given constant.
In distributed mutual exclusion, before entering a critical section a computer in a distributed environment has to signal its intention and ask for permission. A relevant part of the costs of these operations is the length of the longest path the messages between the computers have to travel. Thus, when a tree structure is used as the underlying communication infrastructure, as proposed in [43], the diameter of the tree has a direct influence on the efficiency of the mutual exclusion algorithm.
In a distributed system, messages pass from node to node. In [43], Raymond uses a logical spanning tree structure on a network of processors. Messages are passed among processors requesting entrance to a critical section and processors granting the privilege to enter. The maximum number of messages generated per critical-section execution is 2d, where d is the diameter of the spanning tree. Therefore a small diameter is essential for the efficiency of the algorithm. Minimizing edge weights reduces the cost of the network.
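To make the role of the diameter concrete, here is a small sketch that computes the diameter of a tree (in number of edges) using two breadth-first searches and compares the 2d message bound for two spanning trees on the same five nodes. The helper names `farthest` and `tree_diameter` and the example trees are illustrative assumptions.

```python
from collections import deque


def farthest(adj, start):
    """BFS from `start`; return (a farthest node, its distance in edges)."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        u = queue.popleft()
        last = u                      # BFS dequeues in non-decreasing distance
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return last, dist[last]


def tree_diameter(tree_edges):
    """Diameter of a tree given as a list of (u, v) edges."""
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    a, _ = farthest(adj, next(iter(adj)))   # a is one end of a longest path
    _, d = farthest(adj, a)                 # distance to the other end
    return d


star = [(0, 1), (0, 2), (0, 3), (0, 4)]     # hypothetical star-shaped tree
path = [(0, 1), (1, 2), (2, 3), (3, 4)]     # hypothetical path-shaped tree
star_bound = 2 * tree_diameter(star)        # 2d message bound for the star
path_bound = 2 * tree_diameter(path)        # 2d message bound for the path
```

On these two trees the star has diameter 2 and the path has diameter 4, so the 2d bound doubles from 4 to 8 messages even though both trees span the same nodes.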
Another application can be found in information retrieval systems, where large data structures called bitmaps are used in compressing large files; see [9]. It is required to compress the files so that they occupy less memory space, while allowing reasonably fast access [...] only vectors within a cluster are coded relative to a representative but also the cluster representatives themselves relative to each other, where the relation of the clusters is expressed by a graph spanning them all. The decoding process leads to the problem of creating a minimum spanning tree where the Hamming distance between two clusters is used as the cost function. The length of the paths within this tree has a considerable impact on the time required to decompress bit-vectors that are part of the corresponding clusters. As a consequence, there has to be a trade-off between the compression rate (cost of the spanning tree) and the (de-)compression time (diameter of the tree).
The BDMST is a challenge. We would like to propose new algorithms for solving this problem that find good solutions in reasonable time.
[...] by heuristics. Heuristics, and especially metaheuristics, can be seen as an alternative when large instances have to be solved in reasonable time, although these approaches are not able to guarantee that the optimum is reached.
There are many heuristic algorithms based on different approaches, such as greedy heuristics, local search and evolutionary algorithms. These approaches can only be applied to specific problems. Recently, researchers have used metaheuristic algorithms: computational methods that optimize a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Metaheuristics make few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics do not guarantee that an optimal solution is ever found. Examples of metaheuristic algorithms are: Iterated Local Search [31], Tabu Search [14], Variable Neighborhood Search (VNS) [23], Simulated Annealing [30], Ant Colony Optimization (ACO) [11], Evolutionary Algorithms (EA) [5], and Memetic Algorithms [32].
We will briefly overview greedy heuristic algorithms, local search and genetic algorithms, which we use for developing new algorithms for solving the BDMST.
A greedy heuristic algorithm is an algorithm that follows the problem-solving metaheuristic of making the locally optimal choice at each stage with the hope of finding the global optimum.
In general, greedy algorithms have five pillars:
1. A candidate set, from which a solution is created
2. A selection function, which chooses the best candidate to be added to the solution
3. A feasibility function, which is used to determine whether a candidate can be used to contribute to a solution
4. An objective function, which assigns a value to a solution, or a partial solution
5. A solution function, which indicates when we have discovered a complete solution
Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most problems for which they work well have two properties:
• Greedy choice property: the choice made by a greedy algorithm may depend on choices made so far, but not on future choices or on all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one.
• Optimal substructure: a problem has optimal substructure if an optimal solution to the problem contains optimal solutions to its subproblems.
Greedy algorithms mostly (but not always) fail to find the globally optimal solution, because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early, which prevents them from finding the best overall solution later. For example, all known greedy coloring algorithms for the graph coloring problem and all other NP-complete problems do not consistently find optimum solutions. Nevertheless, they are useful because they are quick to think up and often give good approximations to the optimum.
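The five components listed above can be sketched as a generic loop. The example below is only an illustration on a coin-change toy problem, not one of the BDMST heuristics of this thesis; the names `greedy` and `greedy_change` and the coin systems are assumptions chosen to show one case where the greedy choice is optimal and one where it is not.

```python
def greedy(candidates, select, feasible, is_solution):
    """Generic greedy loop built from the components described above."""
    solution = []
    remaining = list(candidates)               # candidate set
    while remaining and not is_solution(solution):
        best = select(remaining)               # selection function
        if feasible(solution, best):           # feasibility function
            solution.append(best)
        else:
            remaining.remove(best)             # discard an unusable candidate
    return solution


def greedy_change(coins, amount):
    """Pay `amount` by always taking the largest coin that still fits.

    The objective function here is the number of coins, len(solution).
    """
    return greedy(
        candidates=coins,
        select=max,
        feasible=lambda sol, c: sum(sol) + c <= amount,
        is_solution=lambda sol: sum(sol) == amount,   # solution function
    )


good = greedy_change([1, 5, 10, 25], 30)   # greedy is optimal here
bad = greedy_change([1, 3, 4], 6)          # greedy commits to 4 too early
```

With coins 1, 3 and 4, the greedy result for 6 uses three coins (4 + 1 + 1), while 3 + 3 uses only two, which illustrates the early-commitment failure described above.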
Local search is a metaheuristic for solving computationally hard optimization problems. Local search can be used on problems that can be formulated as finding a solution maximizing a criterion among a number of candidate solutions. Local search algorithms move from solution to solution in the space of candidate solutions (the search space) until a solution deemed optimal is found or a time bound has elapsed.
A local search algorithm starts from a candidate solution and then iteratively moves to a neighbor solution. This is only possible if a neighborhood relation is defined on the search space. As an example, a neighbor of a vertex cover is another vertex cover differing from it by only one node. For boolean satisfiability, the neighbors of a truth assignment are usually the truth assignments differing from it only by the evaluation of one variable. The same problem may have multiple different neighborhoods defined on it; local optimization with neighborhoods that involve changing up to k components of the solution is often referred to as k-opt.
Figure 1.1: Scheme of genetic algorithm
Termination of local search can be based on a time bound. Another common choice is to terminate when the best solution found by the algorithm has not been improved in a given number of steps. Local search algorithms are typically incomplete algorithms, as the search may stop even if the best solution found by the algorithm is not optimal. This can happen even if termination is due to the impossibility of improving the solution, as the optimal solution can lie far from the neighborhood of the solutions crossed by the algorithm.
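A minimal sketch of this scheme, assuming a best-improvement strategy on bit strings with a one-bit-flip neighborhood, is given below. The names `local_search`, `flip_neighbors` and `trap`, and the step bound, are illustrative assumptions; the `trap` function is a toy objective constructed so that the search stops at a local optimum that is not the global one.

```python
def local_search(start, neighbors, score, max_steps=1000):
    """Best-improvement local search (maximization).

    Stops at a local optimum or after `max_steps` moves (a simple step bound).
    """
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=score)
        if score(best) <= score(current):
            break                      # no neighbor improves: local optimum
        current = best
    return current


def flip_neighbors(bits):
    """All bit tuples that differ from `bits` in exactly one position."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


def trap(bits):
    """Toy objective: number of ones, except all-zeros scores highest."""
    return len(bits) + 1 if not any(bits) else sum(bits)


easy = local_search((0, 0, 1, 0), flip_neighbors, sum)     # reaches all ones
stuck = local_search((1, 1, 0, 0), flip_neighbors, trap)   # stops at all ones
```

In the second run the search ends at the all-ones string with score 4, although the all-zeros string scores 5; no single bit flip from the final solution improves it, which is the incompleteness discussed above.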
The genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
The general scheme of a GA is given in Figure 1.1.
GAs are useful and efficient when:
• The search space is large, complex or poorly understood.
• Domain knowledge is scarce or expert knowledge is difficult to encode to narrow the search space.
• No mathematical analysis is available.
• Traditional search methods fail.
Representation: Objects forming possible solutions within the original problem context are called phenotypes; their encodings, the individuals within the GA, are called genotypes. The representation step specifies the mapping from the phenotypes onto a set of genotypes. Candidate solution, phenotype and individual are used to denote points of the space of [...] be used for points in the genotype space.
Mutation Operator: It is applied to one genotype and delivers a modified mutant, its child or offspring. In general, mutation is supposed to cause a random, unbiased change. Mutation also has a theoretical role: it can guarantee that the search space is connected.
Crossover Operator: A binary variation operator is called recombination or crossover. This operator merges information from two parent genotypes into one or two offspring genotypes. Like mutation, crossover is a stochastic operator: the choice of what parts of each parent are combined, and the way these parts are combined, depend on random drawings. The principle behind crossover is simple: by mating two individuals with different but desirable features, we can produce an offspring which combines both of those features.
Parent Selection Mechanism: The role of parent selection (mating selection) is to distinguish among individuals based on their quality, to allow the better individuals to become parents of the next generation. Parent selection is probabilistic. Thus, high-quality individuals get a higher chance to become parents than those with low quality. Nevertheless, low-quality individuals are often given a small but positive chance; otherwise the whole search could become too greedy and get stuck in a local optimum.
Survivor Selection Mechanism: The role of survivor selection is to distinguish among individuals based on their quality. In a GA, the population size is (almost always) constant, thus a choice has to be made on which individuals will be allowed in the next generation. This decision is based on their fitness values, favoring those with higher quality. As opposed to parent selection, which is stochastic, survivor selection is often deterministic, for instance, ranking the unified multiset of parents and offspring and selecting the top segment (fitness-biased), or selecting only from the offspring (age-biased).
Termination Condition: Notice that a GA is stochastic and mostly there are no guarantees of reaching an optimum. Commonly used conditions for termination are the following:
1. The maximally allowed CPU time elapses
2. The total number of fitness evaluations reaches a given limit
3. For a given period of time, the fitness improvement remains under a threshold value
4. The population diversity drops under a given threshold
Population: The role of the population is to hold possible solutions. A population is a multiset of genotypes. In almost all GA applications, the population size is constant, not changing during the evolutionary search.
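The components described above (representation, parent selection, crossover, mutation, survivor selection, termination, population) can be put together in a few lines. The following is a minimal sketch on bit-string genotypes, not one of the BDMST genetic algorithms of this thesis; the function name `genetic_algorithm`, all parameter values, the binary tournament, the one-point crossover and the fixed generation budget are illustrative assumptions.

```python
import random


def genetic_algorithm(fitness, length, pop_size=20, generations=50,
                      p_mut=0.05, seed=0):
    """Minimal generational GA on bit lists (maximization); length >= 2."""
    rng = random.Random(seed)
    # Population: a multiset of genotypes of constant size.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def tournament():
        # Parent selection: stochastic, biased toward fitter individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):       # termination: fixed generation budget
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, length)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene independently with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        # Survivor selection: rank parents and offspring, keep the top segment.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)


# Toy fitness: number of ones in the genotype.
best = genetic_algorithm(sum, length=16)
initial_best = genetic_algorithm(sum, length=16, generations=0)
```

Because survivors are chosen from the union of parents and offspring, the best fitness in the population can never decrease from one generation to the next in this sketch.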
Both exact and heuristic methods have their strengths and weaknesses. In practice, combining them into hybrid algorithms often improves solution quality (faster algorithms and/or better solutions) by exploiting synergies [17]. Classifications and surveys of different hybridizations of exact optimization techniques with metaheuristics can be found in [41, 33, 42].
[...] problem. In this thesis, we will use local search and genetic algorithms for developing new algorithms for solving the BDMST problem.
Contributions will be presented in four chapters and can be summarized as follows:
1. We propose the Center-Based Recursive Clustering (CBRC) heuristic algorithm. CBRC is based on RGH (and CBTC). We extend the concept of center to each level of the partially constructed spanning tree. The algorithm can be seen as recursively clustering the vertices of the graph: every internal node of the spanning tree is the center of the sub-graph in the subtree rooted at this node, and we recurse to find the best center. We also survey the relationship between the weight of the tree and the bounded diameter. We experiment and compare the results of our algorithm with those of others (RGH, RGH-I, CBTC, OTTC, CBTC-I) on Euclidean and Non-Euclidean instances of up to 1000 vertices. On the Euclidean instances, the results show the effectiveness of our algorithms on the best, mean and deviation values. On the Non-Euclidean instances, the best results found by CBRC-I are the same as those found by OTTC.
2. We also introduce three multi-parent recombination operators in genetic algorithms for solving the BDMST problem. We consider three different methods for choosing parents: the first one is based on the Levenshtein distance between the parents, the second one uses the best individual in the population, and the last one uses a randomly chosen individual in the population. We also experiment with each method of choosing parents combined with three ways of adding edges from the parents into the offspring: choose the edge randomly, choose the edge with minimum weight, or choose the edge with minimum weight among the edges shared by the most parents. We experiment on Euclidean instances of up to 1000 vertices. We concentrate on analyzing the recombination operator in genetic algorithms, so we compare the results of our algorithms, which use the three multi-parent recombination operators mentioned above, with another genetic algorithm that uses a two-parent recombination operator on the same problem.
3. We propose a new hybrid genetic algorithm for solving the BDMST problem. The new genetic algorithm uses multiple populations, where each population is initialized with a different well-known heuristic. The individuals in each population subsequently compete for positions in a selection population, using a simulated annealing mechanism based on proportionate selection; in the selection population, they combine and evolve toward the optimum. Our approach therefore employs different initial biases by using different heuristics for initialization, and hybridizes the individuals from these populations to promote the exploratory capacity of the GA. We compare our results with other genetic algorithms, namely the genetic algorithm of Raidl and Julstrom in [4] (called RJ-ESEA), the genetic algorithm of Singh and Gupta in [46] (called PEA-I) and the genetic algorithm in each population, on Euclidean and Non-Euclidean instances of up to 1000 vertices. The results show the effectiveness of our algorithm.
4. We propose steady-state genetic algorithms which use different heuristic algorithms for decoding. We modify the decoder and the replacement policy used in PEA-I so as to improve its performance. We use four decoders based on different well-known heuristic algorithms: RGH, RGH-I, CBRC, CBRC-I. We experiment on Euclidean and Non-Euclidean instances of up to 1000 vertices, and the results show that our algorithms outperform the others.
This diysctation iy orgunized uy follow
In chapter 1, we introduce the motivation of the thesis, methodologies Scope of researches and voutributious ure also prescuted
After the introduction, chapter 2 present formulation of the BALMST problem and sum- marize the related works in the field of the BDMST problem To our best knowledge, all
of the algorithms for solving BD.MST only snitable for one kind of the problem inscance: Euclidean or Non-Buelidean instances 8o, in the remain chapters, we will preset our algorithms for solving RD MST We hops that our propose algorithms ean he applied for both Luclidean and Non-Liuclidean instances to find better solution
A uew greedy Leurisilic algurithur (Cemer-Based Recursive Clustering) is presented in chapter 3
Evolutionary alyoriluns have proven elective ou several hard spanuing tree probleus So,
in the chapter 4,
6, We present our genetic algorithms for solving BUMS
An EA's recombination operator should provide strong heritability. This means that the tree produced by recombining parent trees should consist mostly of parental edges. It is also beneficial to favor edges that are common to the parents. In chapter 4, we present a multi-parent recombination operator in a genetic algorithm for solving BDMST.
Almost all genetic algorithms for solving the BDMST problem strongly depend on their particular heuristics, in that the heuristics were usually used to initialize GA populations and played an important role in the design of genetic operators. However, it has been suggested in the literature that the behaviours of different heuristics vary over different classes of problem instances [46].
In chapter 5, we introduce a new hybrid genetic algorithm for solving BDMST problems that uses multiple populations, where each population is initialized with a different well-known heuristic. Chapter 6 introduces a steady-state genetic algorithm for solving the BDMST problem which uses different heuristics for decoding the tree.
Finally, the conclusion summarizes the work.
Chapter 2
Bounded Diameter Minimum Spanning Tree and Related Works
This chapter presents the formulation of BDMST and summarizes the related works in the field of the BDMST problem.
Before introducing the approaches for solving BDMST, we state the problem.
We need to introduce some concepts relating to tree diameter and center before the BDMST problem can be formally stated.
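Let $T = (V, E_T)$ be a tree with node set $V$ and edge set $E_T$.
Definition: (Eccentricity) The eccentricity of a node $v$ is the maximum number of edges on the path between $v$ and any other node within the tree $T$.
Definition 3: (Diameter) The diameter of a tree $T$ is the maximum eccentricity of its nodes.
Definition 4: (Center) The center of a tree is the vertex (if the diameter of the tree is even) or the two connected vertices (if the diameter is odd) of minimum eccentricity. Suppose that a diameter of the tree is defined by the path $v_1, v_2, \ldots, v_k, v_{k+1}$.
If $k$ is even then $v_{k/2+1}$ is called a center of the tree. If $k$ is odd then $v_{(k+1)/2}$ and $v_{(k+1)/2+1}$ are the centers of the tree.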
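The definitions of eccentricity, diameter and center can be checked on small trees with a few lines of code. The following is a minimal sketch, assuming the tree is given as a list of undirected edges; the function names `hop_distances` and `tree_metrics` are illustrative and do not come from the thesis.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: number of edges from source to every node of the tree."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tree_metrics(edges):
    """Return (eccentricities, diameter, sorted centers) of a tree given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    # Eccentricity: largest hop distance from a node to any other node.
    ecc = {v: max(hop_distances(adj, v).values()) for v in adj}
    diameter = max(ecc.values())
    min_ecc = min(ecc.values())
    # Centers: the node(s) of minimum eccentricity.
    centers = sorted(v for v in adj if ecc[v] == min_ecc)
    return ecc, diameter, centers

# A path on five nodes has an even diameter and a single center;
# a path on four nodes has an odd diameter and two adjacent centers.
even_case = tree_metrics([(1, 2), (2, 3), (3, 4), (4, 5)])
odd_case = tree_metrics([(1, 2), (2, 3), (3, 4)])
```

Running a breadth-first search from every node costs quadratic time in total, which is fine for illustration; the heuristics reviewed later avoid this by tracking depths from a fixed center.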
Definition 5: (Bounded Diameter Minimum Spanning Tree Problem - BDMST) Let $G = (V, E)$ be a connected undirected graph with positive edge weights $w(e)$. The BDMST problem can be formulated as follows: among all spanning trees of $G$ whose diameters do not exceed a given upper bound $D \ge 2$, find the spanning tree with the minimal cost (sum of the weights on edges of the tree). As in almost all studies of the BDMST problem, and without loss of generality, we will assume that $G$ is a complete graph.
Thus, we can formulate the problem as:
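One way to write this formulation out, using the notation of Definition 5 (the symbols $W(T)$ and $\operatorname{diam}(T)$ are shorthand assumed here for the cost and the diameter of a tree $T$):

```latex
\min_{T = (V, E_T)} \; W(T) = \sum_{e \in E_T} w(e)
\quad \text{subject to} \quad \operatorname{diam}(T) \le D ,
```

where $T$ ranges over the spanning trees of $G$.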
In figure 2.1, the bounded diameter is an even number, so the center of the tree is only one vertex. In figure 2.2, the bounded diameter is an odd number, so $v_1$, $v_2$ are the centers of the tree and $(v_1, v_2)$ is the center edge.
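Definition 6: (Decision BDMST problem) Let $G = (V, E)$ be a connected undirected graph with positive edge weights $w(e)$. Given a diameter bound $D$ and a value $q$, is there a spanning tree with diameter less than or equal to $D$ such that the weight of the tree is $q$?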
In the case $D = 2$, the tree is a star: all other nodes are connected to a single node, the center of the tree. In the case $D = 3$, the center is a single edge, and all remaining nodes of the graph are connected to one of its endpoints by the cheaper edge. Therefore the optimal BDMST can be found in polynomial time by enumerating all stars in $O(n^2)$ ($D = 2$), respectively by iterating over all edges and connecting the remaining nodes in time $O(m \cdot n)$ ($D = 3$), which is bounded above by $O(n^3)$ for complete graphs. In the case $4 \le D < |V| - 1$, BDMST becomes an NP-hard problem. Details about the special cases with $D < 4$ can be found in [16]. A reduction of BDMST is introduced in [13, 17].
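The two polynomial special cases can be illustrated directly. The sketch below assumes a complete graph given as a symmetric weight matrix `w` (a list of lists) with at least two vertices; the function names are hypothetical and chosen only for this example.

```python
def best_star(w):
    """D = 2: try every vertex as the hub of a star; O(n^2) on a complete graph."""
    n = len(w)
    best_cost, best_edges = float("inf"), None
    for c in range(n):
        cost = sum(w[c][v] for v in range(n) if v != c)
        if cost < best_cost:
            best_cost = cost
            best_edges = [(c, v) for v in range(n) if v != c]
    return best_cost, best_edges

def best_double_star(w):
    """D = 3: try every edge (a, b) as the center edge and attach each
    remaining vertex to the cheaper endpoint; O(n^3) on a complete graph."""
    n = len(w)
    best_cost, best_edges = float("inf"), None
    for a in range(n):
        for b in range(a + 1, n):
            edges, cost = [(a, b)], w[a][b]
            for v in range(n):
                if v in (a, b):
                    continue
                end = a if w[a][v] <= w[b][v] else b
                edges.append((end, v))
                cost += w[end][v]
            if cost < best_cost:
                best_cost, best_edges = cost, edges
    return best_cost, best_edges
```

A star is also a feasible solution for $D = 3$, and the second routine covers it automatically: if every remaining vertex prefers the same endpoint, the result is a star.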
Some of the well-known constrained minimum spanning tree problems require minimizing the weighted diameter of the spanning tree of a randomly-weighted graph. These problems are closely related to the problems that require optimizing the weighted radius of the spanning tree. The main difference between these problems and the BDMST problem lies in the way they disregard the number of edges in the longest path in the tree. Approaches to solve these problems can sometimes be modified to solve the BDMST problem, and vice versa. In this section, we introduce some optimization and decision problems related to BDMST.
Let $G = (V, E)$ be a connected undirected graph with positive edge weights $w(e)$. Suppose $T = (V, E_T)$ is a spanning tree of $G$.
Problem 1: Bounded Weighted Diameter Minimum Spanning Tree problem (BWDMST). Among all spanning trees of $G$ whose weighted diameters do not exceed a given upper bound $D$, find the spanning tree with the minimal cost.
Problem 2: Minimum Weighted Diameter Bounded Spanning Tree problem (MWDBST). Among all spanning trees of $G$ whose weights do not exceed a given upper bound $S$, find the spanning tree with the minimal weighted diameter.
Problem 3: Bounded Weighted Radius Minimum Spanning Tree problem (BWRMST). Among all spanning trees of $G$ whose weighted radii do not exceed a given upper bound $R$, find the spanning tree with the minimal cost.
Problem 4: Minimum Weighted Radius Bounded Spanning Tree problem (MWRBST). Among all spanning trees of $G$ whose weights do not exceed a given upper bound $S$, find the spanning tree with the minimal weighted radius.
Problem 5: Bounded Weighted Diameter Bounded Spanning Tree problem (BWDBST). Among all spanning trees of $G$ whose weighted diameters do not exceed a given upper bound $D$, find a spanning tree whose weight does not exceed a given upper bound $S$.
Problem 6: Bounded Weighted Radius Bounded Spanning Tree problem (BWRBST). Among all spanning trees of $G$ whose weighted radii do not exceed a given upper bound $R$, find a spanning tree whose weight does not exceed a given upper bound $S$.
Two applications closely related to the BDMST problem are mentioned below.
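Problem 7: Hop Constrained Minimum Spanning Tree Problem (HCMST). Given a graph $G = (V, E)$ with positive edge weights $w(e)$, a root node $r$ and a hop bound $H$, find a minimum-cost spanning tree of $G$ in which the path from $r$ to any other node consists of no more than $H$ edges.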
A generalization of HCMST can be defined as follows:
Problem 8: Distance or Delay Constrained Minimum Spanning Tree Problem. Given a graph $G = (V, E)$ with positive edge weights $w(e)$ and delay values $d_e > 0$, a root $r$ and a delay bound $L$, find a spanning tree $T = (V, E_T)$ of $G$ of minimal cost such that the total delay of the edges on the path from $r$ to any other node is less than $L$.
The three problems below are other constrained optimization problems concerning spanning trees.
Problem 9: k-Cardinality Tree Problem. Given an undirected graph $G = (V, E)$ with edge weights and a positive integer $k$, the k-Cardinality Tree problem consists of finding a subtree $T$ of $G$ with exactly $k$ edges and the minimum possible weight.
Problem 10: Degree-Constrained Minimum Spanning Tree Problem (DCMST). Let $G = (V, E)$ be a connected undirected graph with positive edge weights $w(e)$. DCMST can be formulated as follows: among the spanning trees of $G$ whose degree does not exceed a given upper bound $d \ge 2$, find the spanning tree with minimum cost.
Problem 11: Capacitated Minimum Spanning Tree Problem (CMST). Given an undirected weighted graph $G$, a node $r$ of $G$ and an integral value $Q$, CMST consists of finding a minimum spanning tree $T$ of $G$ rooted at $r$ such that the number of nodes of each subtree of $T$ does not exceed $Q$.
All of the above problems are NP-hard; see [24].
In the next section, we will review the approaches for solving BDMST
The BDMST problem has also been shown to be hard to approximate, in that there is no polynomial time algorithm which could guarantee to find a solution that has a cost within $\log(|V|)$ of the optimum, unless P = NP. Techniques for solving the BDMST problem may be classified into two main categories: exact methods and inexact (heuristic) methods. Exact algorithms are guaranteed to find an optimal solution, but their run-time increases dramatically with the instance size, so they often apply only to small instances. Heuristic algorithms are used for larger instances and aim to find good solutions in a limited time.
2.3.1 Exact approaches
Exact approaches for solving the BDMST problem are based on mixed integer linear programming [33, 15]. Achuthan et al. [35] presented three branch-and-bound algorithms for it and solved instances with up to 100 vertices. Gouveia and Magnanti [15] described network flow models that solved instances with up to 100 vertices and 1,000 edges, and Santos et al. [2] extended the methods of Achuthan et al. [194]. They presented a formulation based on lifted Miller-Tucker-Zemlin inequalities.
They model the BDMST problem in two cases, even diameter and odd diameter, and solve them separately. They experiment on graphs with at most $|V| = 40$ and $|E| = 200$. However, being deterministic and exhaustive in nature, exact approaches could only be used to solve small problem instances (e.g. complete graphs with less than 100 nodes).
2.3.2.1 One Time Tree Construction Algorithm
Abdalla et al. [2] presented a greedy heuristic algorithm, the One Time Tree Construction (OTTC), for solving the BDMST problem. OTTC is based on Prim's algorithm in [37]. It starts with a set of vertices, initially containing a randomly chosen vertex. The set is then repeatedly extended by adding a new vertex that is nearest (in cost) to the set, as long as the inclusion of the new node does not violate the constraint on the diameter of the tree. The time for appending each new edge is, in the worst case, $O(n^2)$. This step is repeated $n - 1$ times, so the running time of the algorithm is $O(n^3)$. The quality of the tree identified by the algorithm depends heavily on the start vertex. To identify a low-weight BDST, the algorithm should be run starting from each vertex in the target graph. The time of the entire process is then $O(n^4)$. This algorithm is time consuming, and its performance is strongly dependent on the starting vertex.
Figure 2.3 shows the best BDST found by OTTC, with diameter bound $D = 5$, on a Euclidean problem instance with $n = 100$ vertices.
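The OTTC construction can be sketched as follows. This is an illustrative reading of the description above, not the original implementation: it assumes a complete graph given as a symmetric weight matrix `w`, a bound `D >= 2`, and a hypothetical function name `ottc`. Hop distances and eccentricities inside the growing tree are updated after each attachment, which gives the quadratic cost per step.

```python
def ottc(w, D, start):
    """Prim-style growth from `start`; an attachment is allowed only if the
    new leaf keeps the hop diameter of the tree at most D."""
    n = len(w)
    in_tree = [False] * n
    in_tree[start] = True
    hops = [[0] * n for _ in range(n)]   # hop distances between tree vertices
    ecc = [0] * n                        # eccentricities inside the current tree
    edges, cost = [], 0
    for _ in range(n - 1):
        best = None
        for u in range(n):
            # A leaf attached to u gets eccentricity ecc[u] + 1.
            if not in_tree[u] or ecc[u] + 1 > D:
                continue
            for v in range(n):
                if not in_tree[v] and (best is None or w[u][v] < best[0]):
                    best = (w[u][v], u, v)
        weight, u, v = best
        # Update hop distances and eccentricities for the new leaf v.
        for x in range(n):
            if in_tree[x]:
                hops[v][x] = hops[x][v] = hops[u][x] + 1
                ecc[x] = max(ecc[x], hops[x][v])
        ecc[v] = max(hops[v][x] for x in range(n) if in_tree[x])
        in_tree[v] = True
        edges.append((u, v))
        cost += weight
    return cost, edges
```

Since the result depends on the start vertex, the full procedure would call `ottc(w, D, s)` for every vertex `s` and keep the cheapest tree.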
2.3.2.2 Center-Based Tree Construction Algorithm
In [28], the Center-Based Tree Construction heuristic (CBTC) applies the same Prim-based strategy but uses the start vertex as the center of the spanning tree (if $D$ is even) or as one of the two vertices in the center (if $D$ is odd). This algorithm does not need to bound each vertex's eccentricity. It suffices to bound each vertex's depth, that is, the number of edges on the path from the tree's center to the vertex. No vertex can be more than $\lfloor D/2 \rfloor$ edges from the center, and the depth, and thus the eligibility, of a vertex is fixed when it joins the tree. Updating this algorithm's data structures requires only linear time in the worst case (constant time when a new vertex's depth is $\lfloor D/2 \rfloor$), so the time complexity of the algorithm is $O(n^2)$, and $O(n^3)$ if starting at each vertex.
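A depth-based construction of this kind can be sketched as below. The sketch assumes a complete graph as a symmetric weight matrix `w` and `D >= 2`; the function name `cbtc` is illustrative, and for odd `D` it takes the nearest neighbour of the start vertex as the second center, which is an assumption of this example rather than a detail stated above.

```python
def cbtc(w, D, center):
    """Grow a tree around a fixed center, bounding each vertex's depth by D // 2."""
    n = len(w)
    max_depth = D // 2
    depth = {center: 0}
    edges, cost = [], 0
    if D % 2 == 1:
        # Assumed rule: the second center is the start vertex's nearest neighbour.
        second = min((v for v in range(n) if v != center), key=lambda v: w[center][v])
        depth[second] = 0
        edges.append((center, second))
        cost += w[center][second]
    # best[v]: cheapest eligible tree vertex for each vertex v still outside the tree.
    best = {v: min(depth, key=lambda u: w[u][v]) for v in range(n) if v not in depth}
    while best:
        v = min(best, key=lambda x: w[best[x]][x])   # greedy Prim-style choice
        u = best.pop(v)
        depth[v] = depth[u] + 1
        edges.append((u, v))
        cost += w[u][v]
        if depth[v] < max_depth:
            # v may accept children: linear-time update of the cheapest links.
            for x in best:
                if w[v][x] < w[best[x]][x]:
                    best[x] = v
        # Otherwise v sits at the maximum depth and nothing needs updating.
    return cost, edges
```

The two branches after each attachment mirror the linear-time and constant-time updates mentioned in the text.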
Julstrom also modified the CBTC algorithm by choosing the starting vertex and all subsequent vertices at random from those not yet in the spanning tree. The connection of each new vertex $v$ to the tree remains greedy: it always uses the lowest-weight edge that connects $v$ to a vertex in the tree whose depth is less than $\lfloor D/2 \rfloor$. The modified algorithm is called Randomized Center-based Tree Construction (RTC). The time complexity of RTC, like that of CBTC, is $O(n^2)$. Running the randomized heuristic $n$ times and reporting the best solution is thus $O(n^3)$.
2.3.2.3 Randomized Greedy Heuristic Algorithm
Raidl and Julstrom proposed in [40] a modified version of OTTC, called Randomized Greedy Heuristic (RGH). RGH starts by randomly selecting a vertex and keeping it as the fixed center during the search. It then repeatedly extends the spanning tree from the center by adding a randomly chosen vertex from the remaining vertices and connecting it to a vertex that is already in the tree via an edge with the smallest weight.
The algorithm also differs from OTTC in that it begins by fixing the center of the tree. The starting vertex $v_0$ is chosen randomly. If $D$ is even, $v_0$ is the center. If $D$ is odd, another vertex $v_1$ is chosen at random and $v_0$, $v_1$ are the centers: the edge joining them is the first in the tree. Instead of maintaining the eccentricities of vertices and the path lengths between vertices, the randomized heuristic stores the depth of each connected vertex: the number of edges on the path from it to the center. This value is set when a vertex joins the tree and does not subsequently change. No vertex may have a depth greater than $\lfloor D/2 \rfloor$; otherwise the diameter constraint is violated or $v_0$ (or $v_1$) is displaced from the center.
A sketch of the RGH algorithm is presented in Algorithm 1, whose main loop repeats the following two steps:
    v ← a random vertex from U;
    u ← vertex from C with smallest w((u, v));
Identifying the vertex $u \in C$ that is nearest to $v$ requires time $O(|C|) = O(n)$.
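The randomized construction described in this section can be sketched in runnable form. The version below is an illustration under stated assumptions: the graph is complete and given as a symmetric weight matrix `w`, `D >= 2`, `U` holds the unconnected vertices and `C` the tree vertices that may still take children; the function name `rgh` and the `rng` parameter are introduced only for this example.

```python
import random

def rgh(w, D, rng=random):
    """Randomized greedy construction: random vertex order, greedy attachment."""
    n = len(w)
    max_depth = D // 2
    U = list(range(n))          # unconnected vertices, visited in random order
    rng.shuffle(U)
    v0 = U.pop()
    depth = {v0: 0}
    edges, cost = [], 0
    if D % 2 == 1:
        # Odd D: a second random center, joined to the first by the center edge.
        v1 = U.pop()
        depth[v1] = 0
        edges.append((v0, v1))
        cost += w[v0][v1]
    C = list(depth)             # tree vertices whose depth is below max_depth
    while U:
        v = U.pop()                              # a random vertex from U
        u = min(C, key=lambda c: w[c][v])        # vertex from C with smallest w(u, v)
        depth[v] = depth[u] + 1
        edges.append((u, v))
        cost += w[u][v]
        if depth[v] < max_depth:
            C.append(v)
    return cost, edges, depth
```

Because each run is cheap, the heuristic is typically repeated many times with different random orders and the best tree is kept; passing a seeded `random.Random` instance as `rng` makes a run reproducible.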
2.3.2.4 Improved Greedy Heuristics (RGH-I and CBTC-I)
Singh and Gupta [46] extended the greedy constructive heuristic with a local search step that re-evaluates previous vertex connections after appending each new vertex. For each vertex $v$, they check whether it can be connected to a better parent vertex than the one to which it is currently connected without violating the diameter constraint. The vertex which offers the maximum reduction in the cost of the BDST is selected, and the whole subtree rooted at vertex $v$ is deleted from its current location and reconnected to the tree via the selected vertex.
This improvement is also applicable to CBTC, and the resulting algorithm will be denoted CBTC-I. Figures 2.4 and 2.5 show the best BDSTs found by CBTC and CBTC-I respectively on the Euclidean problem instance with 100 vertices and $D = 10$. The tree in figure 2.5 is found by applying the local search to the best tree found by the CBTC algorithm (figure 2.4); the differences can be seen at the circle marks.
Figures 2.6 and 2.7 show the best BDSTs found by RGH and RGH-I respectively on the Euclidean problem instance with $n = 100$, $D = 10$. The tree in figure 2.7 is found by applying the local search to the best tree found by the RGH algorithm (figure 2.6); the differences can be seen at the circle marks. Singh and Gupta [46] experiment on Euclidean instances with 50, 100, 250, 500 and 1000 vertices, with the diameter bound set to 5, 10, 15, 20 and 25 respectively.
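The reconnection step of Singh and Gupta described at the start of this section can be sketched on a depth-based tree representation. This is a simplified illustration, not their implementation: it assumes the tree is stored as a `parent` list (with `None` for a center) and a `depth` list, it makes a single pass over the vertices in index order instead of running after each appended vertex, and the names `subtree_of` and `reconnect_pass` are hypothetical.

```python
def subtree_of(parent, v):
    """Vertices of the subtree rooted at v, derived from the parent list."""
    children = {}
    for x, p in enumerate(parent):
        if p is not None:
            children.setdefault(p, []).append(x)
    stack, members = [v], []
    while stack:
        x = stack.pop()
        members.append(x)
        stack.extend(children.get(x, []))
    return members

def reconnect_pass(w, parent, depth, max_depth):
    """Move each subtree to its cheapest feasible parent; return the total saving."""
    total_saving = 0
    for v in range(len(parent)):
        if parent[v] is None:            # centers stay where they are
            continue
        sub = subtree_of(parent, v)
        inside = set(sub)
        height = max(depth[x] for x in sub) - depth[v]
        # Feasible parents lie outside the subtree and are shallow enough that
        # the deepest vertex of the moved subtree stays within max_depth.
        candidates = [p for p in range(len(parent))
                      if p not in inside and depth[p] + 1 + height <= max_depth]
        p = min(candidates, key=lambda c: w[c][v])
        saving = w[parent[v]][v] - w[p][v]
        if saving > 0:
            shift = depth[p] + 1 - depth[v]
            for x in sub:                # the whole subtree moves with v
                depth[x] += shift
            parent[v] = p
            total_saving += saving
    return total_saving
```

The current parent is always among the candidates, so a vertex is moved only when a strictly cheaper feasible connection exists. Here `max_depth` plays the role of $\lfloor D/2 \rfloor$ from the center-based heuristics.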
2.3.2.5 Hierarchical Clustering Heuristic Algorithm (HCH)
In [21], Gruber and Raidl propose a constructive heuristic that exploits a hierarchical clustering to guide the process of building a backbone. The clustering heuristic constructs diameter-constrained trees in three steps: determining a hierarchical clustering, reducing the height of this clustering according to the diameter bound, and finally deriving a BDMST from this height-restricted clustering.
They experiment on the Euclidean instances from Beasley's OR-Library [7] with $|V| = 1000$, using the first 15 instances. On large Euclidean instances the BDMSTs obtained by HCH outperform those of other construction heuristics significantly, especially when the diameter bound is tight, and HCH takes only a few seconds, but it cannot be applied to the Non-Euclidean instances.
2.3.2.6 Comments
Singh and Gupta [46] experiment with and compare the results of OTTC, RGH, RGH-I, CBTC and CBTC-I on Euclidean and Non-Euclidean instances in which the numbers of vertices are 50, 100, 250, 500 and 1000 and the diameter bound is set to 5, 10, 15, 20 and 25 respectively. The experimental results show that:
On the Non-Euclidean instances, RGH-I and CBTC-I give better results than RGH and CBTC respectively, on both the best and the average results. Both RGH and RGH-I perform much worse than OTTC, CBTC and CBTC-I. Even RGH-I cannot compete with OTTC, CBTC and CBTC-I. On almost all instances, OTTC gives the best results on the minimum and mean values.
In [28], Julstrom experiments on 240 graphs: 120 Euclidean and an equal number with edge weights chosen at random. The Euclidean graphs consist of points randomly placed in the unit square, 30 graphs each of $n = 100$, 250, 500 and 1,000 points. In each set of graphs, 15 instances can be found in the OR-Library [7], where they are listed as instances of the Euclidean Steiner problem, and 15 more were randomly generated. In each set, the points are the vertices of complete graphs whose edge weights are the Euclidean distances between the points.
Four more sets of 30 complete graphs also consist of $n = 100$, 250, 500 and 1,000 vertices. The edge weights of these graphs were chosen at random on the interval [0.01, 0.99].
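The two kinds of test instances described above are easy to generate. The sketch below is an assumption-laden illustration rather than the generator used in [28]: it draws points and weights from uniform distributions, and the function names are invented for this example.

```python
import math
import random

def euclidean_instance(n, rng):
    """n random points in the unit square; weights are Euclidean distances."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    w = [[math.dist(p, q) for q in pts] for p in pts]
    return pts, w

def random_weight_instance(n, rng, lo=0.01, hi=0.99):
    """Complete graph whose edge weights are drawn at random from [lo, hi]."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.uniform(lo, hi)
    return w

# Example: one small instance of each kind from a seeded generator.
rng = random.Random(0)
points, w_euclid = euclidean_instance(10, rng)
w_random = random_weight_instance(10, rng)
```

Both functions return symmetric weight matrices with a zero diagonal, which is the input format assumed by the other sketches in this chapter.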
On the Euclidean instances, the diameter bound is set to 5, 10, 15, 25 for $|V| = 100$; 10, 15, 20, 40 for $|V| = 250$; 15, 30, 45, 60 for $|V| = 500$; and 20, 40, 60, 100 for $|V| = 1000$. On the random edge weight instances, the diameter bound is set to 5, 7, 10, 15 for $|V| = 100$; 5, 10, 15, 20 for $|V| = 250$; 10, 15, 20, 30 for $|V| = 500$; and 10, 30, 30, 50 for $|V| = 1000$.
The experimental results in [28, 46, 40] show that:
On the Euclidean instances, the best and average results found by RGH-I are better
CBTC finds trees that are slightly shorter than those OTTC finds, but RTC trees are much shorter than those of OTTC and CBTC. When OTTC and CBTC are applied to problem instances whose vertices are points in Euclidean space and whose edge weights are the distances between the points, the weights of the BDMSTs found by the heuristics are much larger than the minimum, especially in the case where $D$ is smaller than $n$. OTTC and CBTC build backbones of short edges; the remaining points connect to these backbones via longer edges, so OTTC and CBTC build longer trees than necessary. This observation holds for almost all BDMST problem instances. With larger diameter bounds, the differences in the three algorithms' results diminish, to the particular advantage of CBTC.
On random weight instances, the trees CBTC identifies have on average lower weights than those of OTTC. RTC is always worse than both OTTC and CBTC. The lack of Euclidean structure in the random-weight instances makes OTTC and CBTC better than RTC.
2.3.3 Metaheuristic algorithms
Besides the greedy construction heuristics, several research groups have developed evolutionary algorithms (EAs) for solving the BDMST, in the hope that they can find good results within a reasonable time.
In an EA, the representation method plays an important role and determines all the operators in the algorithm.
Representation methods: There are many methods for representing individuals, especially spanning trees: Characteristic vectors, Predecessor coding, Prüfer number, Link and node, Edge-set encoding and Permutation code. In this thesis, we will use Edge-set encoding and Permutation code.
• Edge-set encoding: The problem of spanning tree representation has been studied extensively in the literature. References [36, 26] and especially [45] contain substantial discussions and analysis of different representations from theoretical and practical perspectives. For the BDMST problem, three representations have been