Doctoral dissertation: Genetic algorithms for solving bounded diameter minimum spanning tree problem


MINISTRY OF EDUCATION AND TRAINING HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

GENETIC ALGORITHMS FOR SOLVING

BOUNDED DIAMETER MINIMUM

SPANNING TREE PROBLEM

By

Huynh Thi Thanh Bình

Supervisor: Associate Professor Nguyen Duc Nghia

A Dissertation submitted in partial fulfillment of the requirements

for the Degree of Doctor of Philosophy in Engineering

Hanoi, 2011

2.3.2.2 Center-Based Tree Construction Algorithm

2.3.2.4 Improved Greedy Heuristics (RGH-I and CBTC-I)

2.3.2.5 Hierarchical clustering heuristic algorithm HCH

2.3.3 Metaheuristic algorithms

2.3.4 Conclusion

3.4 Discussion

4 Genetic algorithm with multi-parent recombination operator

5 Multi-population Genetic Algorithm

5.1 Structure of the genetic algorithm

5.2.1 Problem instances

5.2.2 Experiment setup

Result

6 Steady-state genetic algorithm

6.1 Steady-state genetic algorithm structure

6.1.1 Individual representation and initial population

6.1.2 Crossover

6.1.3 Mutation

6.1.4 Selection

6.2 Replacement policy

6.3 Experiments

6.3.1 Problem instances

6.3.2 Experiment setup

6.3.3 Parameter

List of Figures

1.1 Scheme of genetic algorithm

2.1 The BDST with 19 vertices and bounded diameter D = 4; 7 is the center of the tree

2.2 The BDST with 19 vertices and an odd bounded diameter D; two vertices are the centers of the tree

2.3 The best BDST found by OTTC algorithm on the Euclidean problem

A spanning tree on twelve nodes and its edge-set representation

A spanning tree on eleven nodes and its permutation-code representation

2.10 Center Move Mutation

2.11 Edge Delete Mutation

2.12 Subtree-Optimize Mutation

A star-like structure of a typical solution to the BDMST problem

Greedy Edge Delete Local search

The best BDST found by CBRC heuristic on the Euclidean problem instance with n = 100, D = 10

The best BDST found by CBRC-I heuristic on the Euclidean problem instance with n = 100, D = 10

Comparison between the sum of the best solutions found by EA-xy2 algorithm on all the problem instances

Comparison between the sum of the best solutions found by EA-xy5

Comparison between the sum of the best solutions found by EA-xy7

Comparison between the sum of the best solutions found by EA-xy9

Comparison between the sum of the best solutions found by EA-xdk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

Comparison between the sum of the best solutions found by EA-xgk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

Comparison between the sum of the best solutions found by EA-xmk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

Comparison between the sum of the average solutions found by EA-xy2 algorithm on all the problem instances (x = b, r, l)

Comparison between the sum of the average solutions found by EA-xy5

Comparison between the sum of the average solutions found by EA-xy7

Comparison between the sum of the average solutions found by EA-xy9 algorithm on all the problem instances (x = b, r, l)

Comparison between the sum of the average solutions found by EA-xdk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

Comparison between the sum of the average solutions found by EA-xgk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

4.14 Comparison between the sum of the average solutions found by EA-xmk algorithm on all the problem instances (x = b, r, l; k = 2, 5, 7, 9)

Comparison between the best solutions found by GA1, GA2, GA3, GA4, GA5, GA6 on all the problem instances

Comparison between the standard deviation of the solutions found by GA1, GA2, GA3, GA4, GA5, GA6 on all the problem instances

Multi-population model

The comparison between the best results found by the four single-population GAs and HGA on the instance with n = 250, D = 15

The comparison between the mean results found by the four single-population GAs and HGA on the instance with n = 250, D = 15

The number of individuals that migrate from the four single-population GAs to the final population

List of Tables

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Euclidean instances of the BDMST problem with n = 100 and D = 5, 7, 9, 11, 13, 15

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Euclidean instances of the BDMST problem with n = 250 and D = 15, 20, 23, 25, 27, 30, 35

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 100 and D = [...]

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 250 and D = 5, 10, 13, 15, 17, 20, 25

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 500 and D = 10, 15, 18, 20, 22, 23, 30

Results of OTTC, CBTC, RGH, CBRC, CBRC-I, RGH-I on the Non-Euclidean instances of the BDMST problem with n = 1000 and D = 15, 20, 23, 25, 27, 30, 35

The rate of the heuristic algorithms used for initialization of the population in each experimental genetic algorithm

Comparison between the results found by EA-xy2; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances

Comparison between the results found by EA-xy5; x = d, g, m; y = l, r, b

Comparison between the results found by EA-xy7; x = d, g, m; y = l, r, b

Comparison between the results found by EA-xy9; x = d, g, m; y = l, r, b on the 20 Euclidean problem instances

Comparison between the results with different crossover probability on the Euclidean problem instance with 250 vertices, D = 15

Comparison between the results with different crossover probability on the Euclidean problem instance with 250 vertices, D = 15

Comparison between the results found by RJ-ESEA, PEA-I, HGA, MHGA on the 20 Euclidean problem instances

Comparison between the results found by RJ-ESEA, PEA-I, HGA, MHGA on the 20 Non-Euclidean problem instances

Results of the four single-population GAs and HGA on the 20 Euclidean instances

Results of PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I on the 20 Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000

Average number of iterations required by PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I to reach the best solution on the 20 Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000

Results of PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I on the 20 Non-Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000

Average number of iterations required by PEA-RGH, PEA-RGHI, PEA-CBRC, PEA-CBRCI, PEA-I to reach the best solution on the 20 Non-Euclidean instances of the BDMST problem of size 100, 250, 500 and 1,000

Abstract

The Bounded Diameter Minimum Spanning Tree (BDMST) problem is a combinatorial optimization problem that arises in many applications, such as the design of wire-based communication networks under quality of service requirements and the design of linear lightwave networks, where it can minimize interference in the network by limiting the traffic in the network lines. Another practical application requiring a BDMST arises in data compression, where some algorithms compress a file utilizing a tree data-structure and decompress a path in the tree to access a record. Further applications are found in ad-hoc wireless networks and distributed mutual exclusion algorithms.

Let G = (V, E) be a connected undirected graph with positive edge weights w(e) (e is an edge of the graph). The BDMST problem can be formulated as follows: among the spanning trees of G whose diameters do not exceed a given upper bound D ≥ 2, find a spanning tree of minimum total weight. As in earlier studies of the BDMST problem, and without loss of generality, we will assume that G is a complete graph.

This problem is known to be NP-hard for 4 ≤ D < |V| - 1. Moreover, the BDMST problem has been shown to be hard to approximate, in that there is no polynomial-time algorithm which could guarantee to find a solution which has a cost within log(|V|) of the optimum, unless P = NP. Therefore, heuristic and metaheuristic techniques are currently the only practical methods for improving the solution quality in solving the BDMST problem, especially when |V| is large.

In this thesis, we survey the literature on the BDMST and then present new algorithms for solving this problem.

First, we propose a greedy heuristic algorithm called Center-Based Recursive Clustering (CBRC). We extend the concept of center to each level of the partially constructed spanning tree. The algorithm can be seen as recursively clustering the vertices of the graph: every internal node of the spanning tree is the center of the sub-graph in the subtree rooted at this node, and we recurse to find the best center. The new heuristic is compared with other well-known heuristics for solving the BDMST problem, namely, the One-Time-Tree-Construction (OTTC), the Randomized Greedy Heuristic (RGH) of Raidl and Julstrom, the Center-Based Tree Construction (CBTC) of Julstrom, and the Randomized Greedy Heuristic with post-improvement (RGH-I) and Center-Based Tree Construction with post-improvement (CBTC-I) of Singh and Gupta.

Then, we introduce a multi-parent recombination operator in Genetic Algorithms (GAs) for solving the BDMST problem. The proposed multi-parent recombination operator allows using more than two parents to create offspring. We consider three different methods for choosing parents. Three new methods for adding edges from the parents to the offspring are also considered. For each of the three methods of choosing parents and the three ways of adding edges, we also experiment with genetic algorithms for solving the BDMST problem with different numbers of parents. We discuss and analyze the efficiency of using different heuristic algorithms to initialize the population in genetic algorithms for solving the BDMST problem.

We present a new genetic algorithm (GA) which uses multiple populations, where each population is initialized with a different well-known heuristic. The individuals in each population will subsequently compete for positions in a selection population, using a simulated annealing mechanism based on proportionate selection. In the selection population, they will combine and evolve toward the optimum. We compare our results with other GAs.

Besides generational genetic algorithms, many researchers have recently become interested in steady-state genetic algorithms. We present steady-state genetic algorithms which use different heuristic algorithms for decoding. We modify the decoder and the replacement policy used in PEA-I so as to improve its performance. We use four decoders based on different well-known heuristic algorithms: RGH, RGH-I, CBRC and CBRC-I.

Experimental results are also reported to compare the efficiency of different heuristic and genetic algorithms for solving the BDMST problem.

Acknowledgements

First of all, I would like to thank my supervisor, Professor Nguyen Duc Nghia, for his guidance and assistance, which significantly contributed to the further development of my research and writing skills. I further want to thank Professor R. I. (Bob) McKay and Dr Nguyen Xuan Hoai, who helped me so much during the time I was a PhD student.

I also would like to thank the committee members: Professor Hoang Van Kiem, Professor Nguyen Thuc Hai, Professor Nguyen Thanh Thuy, Associate Professor Thai Cong Cuong, Associate Professor Huynh Quyet Thang, Associate Professor Tran Dinh Khang, Associate Professor Ngo Quoc Tao, Dr To Cam Tla, Dr Bui Thu Lam, Dr Ngo Hong Son, Dr Truong Thi Dieu Linh, Dr Le Minh Hoang and Dr Le Trong Vinh.

I would like to give special thanks to my parents, my husband and my daughters, who gave me unconditional support and encouragement during the long time I needed to conduct research and write this thesis.

Also, thanks to the Ministry of Education and Training, Hanoi University of Science and Technology, and the National Foundation for Science and Technology Development for their funding of my research. I would like to thank my colleagues at the School of Information and Communication Technology and my friends for their comments and encouragement.

Chapter 1

Introduction

Network design problems are active topics in research. The selection of an optimal configuration or design of a network occurs in many different application contexts, including transportation (airline, railroad, traffic, and mass transit), communication (telephone and computer networks), electric power systems, and oil and gas pipelines. Many real-world problems can be mapped to a formulation dealing with nodes and edges within a graph. For example, telephone companies are particularly interested in the minimum spanning tree, because the minimum spanning tree of a set of sites defines the wiring scheme that connects the sites using as little wire as possible. It is the mother of all network design problems. The minimum spanning tree is a fundamental problem and can be solved easily in polynomial time by using Prim's or Kruskal's algorithm.
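To make the polynomial-time claim concrete, the following is a minimal Python sketch of Kruskal's algorithm with a simple union-find structure. The function name `kruskal`, the vertex labels 0..n-1 and the `(weight, u, v)` edge format are assumptions chosen for this illustration; they are not notation taken from the thesis.

```python
def kruskal(n, edges):
    """Return (total_weight, tree_edges) of a minimum spanning tree.

    n     -- number of vertices, labelled 0..n-1 (illustrative convention)
    edges -- iterable of (weight, u, v) tuples of a connected graph
    """
    parent = list(range(n))

    def find(x):
        # Find the component representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # the edge joins two components: keep it
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree


# Small hypothetical instance with 4 sites and 5 possible links.
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))   # (6, [(0, 1), (1, 3), (1, 2)])
```

Sorting the edges dominates the running time, so the sketch runs in O(|E| log |E|). Note that nothing here limits the diameter of the resulting tree, which is exactly what the BDMST problem adds.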

Another example concerns a traffic network whose nodes represent both origin and destination areas for the vehicular traffic of a city and also intersections in the road network. The arcs correspond to streets in the city, and the arc flows are the amount of traffic traversing the streets. A typical network design problem would be to select a subset of the possible road improvements subject to a budget constraint. The design objective would be to minimize the total travel cost for all travelers in the city network.

It is interesting to see the wide range of network models that are related to the fixed charge design problem. If all arc construction costs are set to zero, then the fixed charge design model becomes a series of shortest path problems. If all arc routing costs are set to zero, the fixed charge design model becomes a Steiner tree problem on a graph. Since the fixed charge design problem contains the Steiner problem as a special case, we can [...] and totally dominate the routing costs (i.e., the optimal network design must be a tree), then the fixed charge design problem becomes the optimum communication spanning tree problem defined by Hu [25].

Scott [4] has introduced another network synthesis problem, called the "optimal network" problem, that is closely related to the fixed charge design problem. The arc routing costs in this problem are all linear functions of the total flow. Arc capacities, which are all initially zero, can be raised to infinity. The objective is to minimize total routing cost, subject to the usual capacity and flow routing constraints and the added constraint that the total construction costs cannot exceed a given budget. The optimal network problem is [...]

In communication network design, the requirements can be, for example, a limitation of the maximum communication delay or the guarantee of a minimum signal-to-noise ratio. Thus, the number of relaying nodes on any path between two communication partners needs to be restricted. This problem is the BDMST.

The BDMST problem has many applications: in the design of wire-based communication networks under quality of service requirements; in linear lightwave networks, where it can minimize interference in the network by limiting the traffic in the network lines; in data compression, where some algorithms compress a file utilizing a tree data-structure and decompress a path in the tree to access a record; and in ad-hoc wireless networks and distributed mutual exclusion algorithms. More detail about the applications of the BDMST is [...]

1.1 Motivation

The BDMST problem has applications in several areas, such as communication network design, distributed mutual exclusion, linear lightwave networks and bit-compression for information retrieval. The theses of Abdalla [1] and Martin Gruber [17] present detailed information about the motivation of the BDMST. Additional fields of application are described in [34], where the BDMST appears as a subproblem within the vehicle routing problem. Paper [3] deals with ad hoc wireless networks, while paper [6] presents dynamic routing algorithms for multicasting in a linear lightwave network. We consider several applications below.

In communication network design, the requirement can be a limitation of the maximum communication delay or the guarantee of a minimum signal-to-noise ratio. Thus, the number of relaying nodes on any path between two communication partners needs to be limited by a given constant.

In distributed mutual exclusion, before entering a critical section a computer in a distributed environment has to signal its intention and ask for permission. A relevant part of the costs for these operations is the length of the longest path the messages between the computers have to travel. Thus, when a tree structure is used as the underlying communication infrastructure, as proposed in [43], its diameter has a direct influence on the efficiency of the mutual exclusion algorithm.

In a distributed system, messages pass from node to node. In [43], Raymond uses a logical spanning tree structure on a network of processors. Messages are passed among processors requesting entrance to a critical section and processors granting the privilege to enter. The maximum number of messages generated per critical-section execution is 2d, where d is the diameter of the spanning tree. Therefore a small diameter is essential for the efficiency of the algorithm, while minimizing edge weights reduces the cost of the network.

Another application can be found in information retrieval systems, where large data structures called bitmaps are used in compressing large files, see [9]. It is required to compress the files so that they occupy less memory space, while allowing reasonably fast access.

[...] not only vectors within a cluster are coded relative to a representative but also the cluster representatives themselves relative to each other, where the relation of the clusters is expressed by a graph spanning them all. The decoding process leads to the problem of creating a minimum spanning tree where the Hamming distance between two clusters is used as the cost function. The length of the paths within this tree has a considerable impact on the time required to decompress bit-vectors that are part of the corresponding clusters. As a consequence, there has to be a trade-off between the compression rate (cost of the spanning tree) and the (de-)compression time (diameter of the tree).

The BDMST is a challenging problem. We would like to propose new algorithms for solving this problem that find good solutions in reasonable time.

[...] by heuristics. Heuristics and especially metaheuristics can be seen as an alternative when large instances have to be solved in reasonable time, although these approaches are not able to guarantee reaching the optimum.

There are many heuristic algorithms based on different approaches, such as greedy heuristics, local search and evolutionary algorithms. These approaches can only be applied to specific problems. Recently, researchers have used metaheuristic algorithms: computational methods that optimize a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Metaheuristics make few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics do not guarantee that an optimal solution is ever found. Examples of metaheuristic algorithms are Iterated Local Search [31], Tabu Search [14], Variable Neighborhood Search (VNS) [23], Simulated Annealing [30], Ant Colony Optimization (ACO) [11], Evolutionary Algorithms (EA) [5], and Memetic Algorithms [32].

We will briefly overview greedy heuristic algorithms, local search and genetic algorithms, which we use for developing new algorithms for solving the BDMST.

A greedy heuristic algorithm is an algorithm that follows the problem-solving metaheuristic of making the locally optimal choice at each stage with the hope of finding the global optimum.

In general, greedy algorithms have five pillars:

1. A candidate set, from which a solution is created

2. A selection function, which chooses the best candidate to be added to the solution

3. A feasibility function, which is used to determine if a candidate can be used to contribute to a solution

4. An objective function, which assigns a value to a solution, or a partial solution

5. A solution function, which will indicate when we have discovered a complete solution
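The five components listed above can be sketched as a small generic loop. This is only an illustration under our own assumptions: the function `greedy` and its parameter names are hypothetical, and the toy coin-paying instance is not taken from the thesis.

```python
def greedy(candidates, select, feasible, is_complete):
    """Generic greedy loop (illustrative; all names are hypothetical)."""
    solution = []
    remaining = list(candidates)                      # 1. candidate set
    while remaining and not is_complete(solution):    # 5. solution function
        best = select(remaining)                      # 2. selection function
        if feasible(solution, best):                  # 3. feasibility function
            solution.append(best)
        else:
            remaining.remove(best)    # this candidate can never fit again
    return solution


# Toy instance: pay 63 units with coins 25, 10, 5, 1 (each reusable).
target = 63
coins = greedy(
    candidates=[25, 10, 5, 1],
    select=max,                                       # prefer the largest coin
    feasible=lambda sol, c: sum(sol) + c <= target,
    is_complete=lambda sol: sum(sol) == target,
)
# 4. objective function: here simply the number of coins used.
print(coins, len(coins))   # [25, 25, 10, 1, 1, 1] 6
```

In a greedy heuristic for the BDMST, the candidates would be edges and the feasibility function would have to reject any edge that pushes the tree diameter above D.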

Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most problems for which they work well have two properties:

• Greedy choice property: the choice made by a greedy algorithm may depend on choices made so far, but not on future choices or on all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one.

• Optimal substructure: a problem has optimal substructure if an optimal solution to the problem contains optimal solutions to its subproblems.

Greedy algorithms mostly (but not always) fail to find the globally optimal solution, because they usually do not operate exhaustively on all the data. They can commit to certain choices too early, which prevents them from finding the best overall solution later. For example, all known greedy coloring algorithms for the graph coloring problem, and greedy algorithms for all other NP-complete problems, do not consistently find optimum solutions. Nevertheless, they are useful because they are quick to think up and often give good approximations to the optimum.
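A small worked example of such an early commitment, assuming a toy change-making problem that is not part of the thesis: with coin values 1, 3 and 4, always taking the largest coin first uses three coins for the amount 6, while an exhaustive dynamic program finds a two-coin solution. The function names below are hypothetical.

```python
def greedy_change(coins, amount):
    """Always take the largest coin that still fits."""
    used = []
    for c in sorted(coins, reverse=True):
        while amount >= c:
            amount -= c
            used.append(c)
    return used


def optimal_change(coins, amount):
    """Dynamic programming over all amounts: fewest coins, exact."""
    best = [[]] + [None] * amount      # best[a] = shortest coin list summing to a
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] is not None:
                cand = best[a - c] + [c]
                if best[a] is None or len(cand) < len(best[a]):
                    best[a] = cand
    return best[amount]


print(greedy_change([1, 3, 4], 6))    # [4, 1, 1]  (3 coins)
print(optimal_change([1, 3, 4], 6))   # [3, 3]     (2 coins)
```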

Local search is a metaheuristic for solving computationally hard optimization problems. Local search can be used on problems that can be formulated as finding a solution maximizing a criterion among a number of candidate solutions. Local search algorithms move from solution to solution in the space of candidate solutions (the search space) until a solution deemed optimal is found or a time bound has elapsed.

A local search algorithm starts from a candidate solution and then iteratively moves to a neighbor solution. This is only possible if a neighborhood relation is defined on the search space. As an example, the neighborhood of a vertex cover is another vertex cover differing by only one node. For boolean satisfiability, the neighbors of a truth assignment are usually the truth assignments differing from it only by the evaluation of one variable. The same problem may have multiple different neighborhoods defined on it; local optimization with neighborhoods that involve changing up to k components of the solution is often referred to as k-opt.

Figure 1.1: Scheme of genetic algorithm

Termination of local search can be based on a time bound. Another common choice is to terminate when the best solution found by the algorithm has not been improved in a given number of steps. Local search algorithms are typically incomplete algorithms, as the search may stop even if the best solution found by the algorithm is not optimal. This can happen even if termination is due to the impossibility of improving the solution, as the optimal solution can lie far from the neighborhood of the solutions crossed by the algorithm.
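The following minimal sketch illustrates both points: a best-improvement local search with a step bound, and the fact that it can stop at a solution that is not optimal. The search space (positions in a short list of costs, with left and right positions as neighbors) is a hypothetical toy chosen for this example, and a step bound stands in for a time bound.

```python
def local_search(start, cost, neighbours, max_steps=1000):
    """Best-improvement local search (illustrative sketch).

    Stops after `max_steps` moves or as soon as no neighbour is strictly
    better than the current solution (a local optimum).
    """
    current, current_cost = start, cost(start)
    for _ in range(max_steps):
        best = min(neighbours(current), key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:      # no improving neighbour
            break
        current, current_cost = best, best_cost
    return current, current_cost


COSTS = [5, 3, 4, 6, 2, 0, 1]              # hypothetical cost landscape


def cost(i):
    return COSTS[i]


def neighbours(i):
    # Neighbourhood relation: positions directly left and right of i.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(COSTS)]


print(local_search(0, cost, neighbours))   # (1, 3): stuck in a local optimum
print(local_search(3, cost, neighbours))   # (5, 0): reaches the global optimum
```

Starting from position 0, the search halts at cost 3 because both neighbors are worse, even though cost 0 exists elsewhere in the search space.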

The genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.

The general scheme of a GA is given in Figure 1.1.

GAs are useful and efficient when:

• The search space is large, complex or poorly understood

• Domain knowledge is scarce or expert knowledge is difficult to encode to narrow the search space

• No mathematical analysis is available

• Traditional search methods fail

Representation: Objects forming possible solutions within the original problem context are called phenotypes; their encodings, the individuals within the GA, are called genotypes. The representation step specifies the mapping from the phenotypes onto a set of genotypes. Candidate solution, phenotype and individual are used to denote points of the space of [...] be used for points in the genotype space.

Mutation Operator: It is applied to one genotype and delivers a modified mutant, its child or offspring. In general, mutation is supposed to cause a random unbiased change. Mutation has a theoretical role: it can guarantee that the space is connected.

Crossover Operator: A binary variation operator is called recombination or crossover. This operator merges information from two parent genotypes into one or two offspring genotypes. Similarly to mutation, crossover is a stochastic operator: the choice of what parts of each parent are combined, and the way these parts are combined, depend on random drawings. The principle behind crossover is simple: by mating two individuals with different but desirable features, we can produce an offspring which combines both of those features.

Parent Selection Mechanism: The role of parent selection (mating selection) is to distinguish among individuals based on their quality, to allow the better individuals to become parents of the next generation. Parent selection is probabilistic. Thus, high quality individuals get a higher chance to become parents than those with low quality. Nevertheless, low quality individuals are often given a small but positive chance; otherwise the whole search could become too greedy and get stuck in a local optimum.

Survivor Selection Mechanism: The role of survivor selection is to distinguish among individuals based on their quality. In a GA, the population size is (almost always) constant, thus a choice has to be made on which individuals will be allowed in the next generation. This decision is based on their fitness values, favoring those with higher quality. As opposed to parent selection, which is stochastic, survivor selection is often deterministic, for instance, ranking the unified multiset of parents and offspring and selecting the top segment (fitness-biased), or selecting only from the offspring (age-biased).

Termination Condition: Notice that a GA is stochastic and mostly there are no guarantees of reaching an optimum. Commonly used conditions for termination are the following:

1. The maximally allowed CPU time elapses

2. The total number of fitness evaluations reaches a given limit

3. For a given period of time, the fitness improvement remains under a threshold value

4. The population diversity drops under a given threshold

Population: The role of the population is to hold possible solutions. A population is a multiset of genotypes. In almost all GA applications, the population size is constant, not changing during the evolutionary search.
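The components described above can be put together in a few lines. The sketch below is a generic illustration on the toy "OneMax" problem (maximize the number of ones in a bit string), not one of the BDMST algorithms of this thesis; the function name, the parameter values and the choice of binary tournament, one-point crossover and bit-flip mutation are all assumptions made for this example.

```python
import random


def genetic_algorithm(n_bits=20, pop_size=30, generations=60,
                      p_cross=0.9, p_mut=0.05, seed=1):
    """Minimal generational GA for OneMax (illustrative parameters)."""
    rng = random.Random(seed)
    fitness = sum                                   # OneMax: count the ones

    # Population: a fixed-size multiset of genotypes (bit lists).
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def pick_parent():
        # Probabilistic parent selection: binary tournament.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):                    # termination: generation limit
        if max(map(fitness, pop)) == n_bits:        # ... or optimum reached
            break
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = pick_parent(), pick_parent()
            if rng.random() < p_cross:              # one-point crossover
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit independently with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        # Deterministic survivor selection: top segment of parents + offspring.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)


best = genetic_algorithm()
print(sum(best), best)
```

Because survivor selection keeps the best of parents and offspring, the best fitness in the population never decreases from one generation to the next. For the BDMST, the genotype would encode a spanning tree and the fitness would be its total edge weight.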

Both exact and heuristic methods have their strengths and weaknesses. In practice, combining them into hybrid algorithms often improves the results (faster algorithms and/or better solutions) by exploiting synergies [17]. Classifications and surveys of different hybridizations of exact optimization techniques with metaheuristics can be found in [41, 33, 42].

[...] problem. In this thesis, we will use local search and genetic algorithms for developing new algorithms for solving the BDMST problem.

Contributions will be presented in four chapters and can be summarized as follows:

1. We propose the Center-Based Recursive Clustering (CBRC) heuristic algorithm. CBRC is based on RGH (and CBTC). We extend the concept of center to each level of the partially constructed spanning tree. The algorithm can be seen as recursively clustering the vertices of the graph: every internal node of the spanning tree is the center of the sub-graph in the subtree rooted at this node, and we recurse to find the best center. We also survey the relationship between the weight of the tree and the bounded diameter. We experiment and compare the results of our algorithm with those of others (RGH, RGH-I, CBTC, OTTC, CBTC-I) on the Euclidean and Non-Euclidean instances with up to 1000 vertices. On the Euclidean instances, the results show the effectiveness of our algorithms on the best, mean and deviation values. On the Non-Euclidean instances, the best results found by CBRC-I are the same as the ones found by OTTC.

2. We also introduce three multi-parent recombination operators in genetic algorithms for solving the BDMST problem. We consider three different methods for choosing parents: the first one is based on the Levenshtein distance between the parents, the second one uses the best individual in the population, and the last one uses randomly chosen individuals in the population. We also experiment with each method of choosing parents combined with three ways of adding edges from the parents into the offspring: choose the edge randomly, choose the edge which has minimum weight, or choose the edge which has minimum weight among the edges shared by the most parents. We experiment on the Euclidean instances with up to 1000 vertices. We concentrate on analyzing the recombination operator in genetic algorithms, so we compare the results of our algorithms using, respectively, the three mentioned multi-parent recombination operators with another genetic algorithm using a two-parent recombination operator on the same problem.

3. We propose a new hybrid genetic algorithm for solving the BDMST problem. The new genetic algorithm uses multiple populations, where each population is initialized with a different well-known heuristic. The individuals in each population will subsequently compete for positions in a selection population, using a simulated annealing mechanism based on proportionate selection; in the selection population, they will combine and evolve toward the optimum. Our approach therefore employs different initial biases by using different heuristics for initialization, and hybridizes the individuals from these populations to promote the exploratory capacity of the GA. We compare our results with other genetic algorithms, namely, the genetic algorithm in [4] of Raidl and Julstrom (called RJ-ESEA), the genetic algorithm of Singh and Gupta in [46] (called PEA-I) and the genetic algorithm in each population, on the Euclidean and Non-Euclidean instances with up to 1000 vertices. The results show the effectiveness of our algorithm.

4. We propose steady-state genetic algorithms which use different heuristic algorithms for decoding. We modify the decoder and the replacement policy used in PEA-I so as to improve its performance. We use four decoders based on different well-known heuristic algorithms: RGH, RGH-I, CBRC and CBRC-I. We experiment on the Euclidean and Non-Euclidean instances with up to 1000 vertices, and the results show that our algorithms outperform the others.

This dissertation is organized as follows.

In chapter 1, we introduce the motivation of the thesis and the methodologies. The scope of the research and the contributions are also presented.

After the introduction, chapter 2 presents the formulation of the BDMST problem and summarizes the related works in the field of the BDMST problem. To the best of our knowledge, all of the algorithms for solving the BDMST are only suitable for one kind of problem instance: Euclidean or Non-Euclidean. So, in the remaining chapters, we present our algorithms for solving the BDMST. We hope that our proposed algorithms can be applied to both Euclidean and Non-Euclidean instances to find better solutions.

A new greedy heuristic algorithm (Center-Based Recursive Clustering) is presented in chapter 3.

Evolutionary algorithms have proven effective on several hard spanning tree problems. So, in chapters 4, 5 and 6, we present our genetic algorithms for solving the BDMST.

An EA's recombination operator should provide strong heritability. This means that the tree produced by recombining parent trees should consist mostly of parental edges. It is also beneficial to favor edges that are common to the parents. In chapter 4, we present a multi-parent recombination operator in a genetic algorithm for solving BDMST.

Almost all genetic algorithms for solving the BDMST problem strongly depend on their particular heuristics, in that the heuristics are usually used to initialize GA populations and play an important role in the design of genetic operators. However, it has been suggested in the literature that the behaviours of different heuristics vary over different classes of problem instances [46].

In chapter 5, we introduce a new hybrid multi-population genetic algorithm for solving BDMST problems, in which each population is initialized with a different well-known heuristic. Chapter 6 introduces steady-state genetic algorithms for solving the BDMST problem which use different heuristics for decoding the tree.

Finally, the conclusion summarizes the work.


Chapter 2

Bounded Diameter Minimum

Spanning Tree and Related Works

This chapter presents the formulation of BDMST and summarizes the related works in the field of the BDMST problem.

Before introducing the approaches for solving BDMST, we state the problem.

We need to introduce some concepts relating to tree diameter and center before the BDMST problem can be formally stated.

Let T = (V, E_T) be a tree with node set V and edge set E_T.

number of edges on the path between v and any other node within the tree T

Definition 3: (Diameter) The diameter of a tree


the tree is even) or the two connected vertices (if the diameter is odd) of minimum eccentricity. Suppose that a diameter of the tree is defined by the path $v_0, v_1, \ldots, v_k$.

If $k$ is even then $v_{k/2}$ is called a center of the tree. If $k$ is odd then $v_{\lfloor k/2 \rfloor}$ and $v_{\lceil k/2 \rceil}$ are

Definition 5: (Bounded Diameter Minimum Spanning Tree Problem - BDMST) Let G = (V, E) be a connected undirected graph with positive edge weights w(e). The BDMST problem can be formulated as follows: among all spanning trees of G whose diameters do not exceed a given upper bound $D \geq 2$, find the spanning tree with the minimal cost (the sum of the weights of the edges of the tree). As in almost all studies of the BDMST problem, and without loss of generality, we will assume that G is a complete graph.
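To make Definition 5 concrete, the following is a minimal sketch of how the diameter of a candidate spanning tree could be measured and checked against the bound D. The function names (`hop_distances`, `tree_diameter`, `tree_cost`, `is_feasible`), the vertex numbering 0..n-1 and the edge-list input are assumptions made for illustration; they are not taken from the thesis.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: number of edges from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tree_diameter(n, edges):
    """Diameter (in edges) of a tree on vertices 0..n-1 given as an edge list."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    from_any = hop_distances(adj, 0)
    far = max(from_any, key=from_any.get)   # one endpoint of a longest path
    return max(hop_distances(adj, far).values())

def tree_cost(w, edges):
    """Sum of the edge weights, with w given as a symmetric matrix."""
    return sum(w[u][v] for u, v in edges)

def is_feasible(n, edges, D):
    """Assumes the edges already form a spanning tree; checks only the bound."""
    return len(edges) == n - 1 and tree_diameter(n, edges) <= D
```

The two breadth-first searches rely on the fact that, in a tree, the vertex farthest from any start vertex is an endpoint of some longest path, so the whole check runs in time linear in n.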

Thus, we can formulate the problem as:

so the center of the tree is only one vertex. In figure 2.2, the bounded diameter is an odd number, so $v_1$, $v_2$ are the centers of the tree and $(v_1, v_2)$ is the center edge.

Definition 6: (Decision BDMST problem) Let G = (V, E) be a connected undirected

spanning tree with diameter less than or equal to D and the weight of the tree is q?


node, the center of the tree. In the case D = 3, the center is a single edge, and all remaining nodes of the graph are connected to one of its endpoints by the cheaper edge. Therefore the optimal BDMST can be found in polynomial time, by enumerating all stars in $O(n^2)$ (D = 2), or by iterating over all edges and connecting the remaining nodes in time $O(m \cdot n)$ (D = 3), which is bounded above by $O(n^3)$ for complete graphs. In the case $4 \leq D < |V| - 1$, BDMST becomes an NP-hard problem. Details about the special cases with D < 4 can be found in [16]. Reduction of BDMST is introduced in [13, 17].
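The two polynomial special cases can be sketched as follows. This is an illustrative enumeration under the assumption that the graph is complete and given as a symmetric weight matrix `w`; the function names `best_star` and `best_double_star` are hypothetical.

```python
def best_star(w):
    """D = 2: try every vertex as the hub of a star; O(n^2) overall."""
    n = len(w)
    best_cost, best_center = None, None
    for c in range(n):
        cost = sum(w[c][v] for v in range(n) if v != c)
        if best_cost is None or cost < best_cost:
            best_cost, best_center = cost, c
    edges = [(best_center, v) for v in range(n) if v != best_center]
    return best_cost, edges

def best_double_star(w):
    """D = 3: try every edge (a, b) as the center edge and attach each
    remaining vertex to the cheaper endpoint; O(n^3) on a complete graph."""
    n = len(w)
    best_cost, best_edges = None, None
    for a in range(n):
        for b in range(a + 1, n):
            cost, edges = w[a][b], [(a, b)]
            for v in range(n):
                if v == a or v == b:
                    continue
                end = a if w[a][v] <= w[b][v] else b
                cost += w[end][v]
                edges.append((end, v))
            if best_cost is None or cost < best_cost:
                best_cost, best_edges = cost, edges
    return best_cost, best_edges
```

On a complete graph the number of edges m is n(n-1)/2, which is where the $O(n^3)$ bound for D = 3 comes from.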

Some of the well-known constrained minimum spanning tree problems require minimizing the weighted diameter of the spanning tree of a randomly-weighted graph. These problems are closely related to the problems that require optimizing the weighted radius of the spanning tree. The main difference between these problems and the BDMST problem lies in the way they disregard the number of edges in the longest path in the tree. Approaches to solving these problems can sometimes be modified to solve the BDMST problem, and vice versa. In this section, we introduce some optimization and decision problems related to BDMST.

Let G = (V, E) be a connected undirected graph with positive edge weights w(e). Suppose T = (V, E_T) is a spanning tree of G.

Problem 1: Bounded Weighted Diameter Minimum Spanning Tree problem (BWDMST). Among all spanning trees of G whose weighted diameters do not exceed a given upper bound D, find the spanning tree with the minimal cost.

Problem 2: Minimum Weighted Diameter Bounded Spanning Tree problem (MWDBST)

find the spanning tree with the minimal weighted diameter.

Problem 3: Bounded Weighted Radius Minimum Spanning Tree problem (BWRMST)

R, find the spanning tree with the minimal cost.

Problem 4: Minimum Weighted Radius Bounded Spanning Tree problem (MWRBST)

find the spanning tree with the minimal weighted radius.

Problem 5: Bounded Weighted Diameter Bounded Spanning Tree problem (BWDBST). Among all spanning trees of G whose weighted diameters do not exceed a given upper bound D, find a spanning tree whose weight does not exceed a given upper bound S.

Problem 6: Bounded Weighted Radius Bounded Spanning Tree problem (BWRBST)

R, find a spanning tree whose weight does not exceed a given upper bound S.

Two applications closely related to the BDMST problem are mentioned below.

Problem 7: Hop Constrained Minimum Spanning Tree Problem (HCMST). Given a graph G = (V, E) with positive

consists of no more than H edges.

A generalization of HCMST can be defined as follows:

Problem 8: Distance or Delay Constrained Minimum Spanning Tree Problem. Given a graph G = (V, E) with positive edge weights w(e) and delay values $d_e > 0$, a root r and a delay bound L, find a spanning tree T = (V, E_T) of G that has minimal cost and in which the total delay of the edges on the path from r to any other node is less than L.

The three problems below are other constrained optimization problems concerning spanning trees.

Problem 9: k-Cardinality Tree Problem. Given an undirected graph G = (V, E) with edge weights and a positive integer k, the k-Cardinality Tree problem consists of finding a subtree T of G with exactly k edges and the minimum possible weight.

Problem 10: Degree-Constrained Minimum Spanning Tree Problem. Let G = (V, E) be a connected undirected graph with positive edge weights w(e). DCMST can be formulated as follows: among the spanning trees of G whose degree does not exceed a given upper bound d > 2, find the spanning tree with minimum cost.

Problem 11: Capacitated Minimum Spanning Tree Problem. Given an undirected weighted graph G, a node r of G and an integral value Q, CMST consists of finding a minimum spanning tree T of G rooted at r such that the number of nodes of each subtree of T does not exceed Q.

All of the above problems are NP-hard and can be found in [24].

In the next section, we will review the approaches for solving BDMST

The BDMST problem has also been shown to be hard to approximate, in that there is no polynomial time algorithm which could guarantee to find a solution that has a cost within log(|V|) of the optimum, unless P = NP. Techniques for solving the BDMST problem may be classified into two main categories: exact methods and inexact (heuristic) methods. Exact algorithms are guaranteed to find an optimal solution, but their run-time increases dramatically with the instance size, so they often apply only to small instances. Heuristic algorithms are used for larger instances and aim to find good solutions in a limited time.

2.3.1 Exact approaches

Exact approaches for solving the BDMST problem are based on mixed integer linear programming [33, 15]. Achuthan et al. [35] presented three branch-and-bound algorithms for it and solved instances with up to 100 vertices. Gouveia and Magnanti [15] described network flow models that solved instances with up to 100 vertices and 1,000 edges, and Santos et al. [2] extended the methods of Achuthan et al. [194]. They presented a formulation based on lifted Miller-Tucker-Zemlin inequalities responsible for

They model the BDMST problem in two cases, even diameter and odd diameter, and solve them separately. They experiment on graphs with at most |V| = 40 and |E| = 200. However, being deterministic and exhaustive in nature, exact approaches can only be used to solve small problem instances (e.g. complete graphs with fewer than 100 nodes).


2.3.2.1 One Time Tree Construction Algorithm

Abdalla et al. [2] presented a greedy heuristic algorithm, the One Time Tree Construction (OTTC), for solving the BDMST problem. OTTC is based on Prim's algorithm in [37]. It starts with a set of vertices, initially containing a randomly chosen vertex. The set is then repeatedly extended by adding a new vertex that is nearest (in cost) to the set, as long as the inclusion of the new node does not violate the constraint on the diameter of the tree. The time for appending each new edge is, in the worst case, $O(n^2)$. This step is repeated n - 1 times, so the algorithm time is $O(n^3)$. The quality of the tree identified by the algorithm depends heavily on the start vertex. To identify a low-weight BDST, the algorithm should be run starting from each vertex in the target graph. The time of the entire process is then $O(n^4)$. This algorithm is time consuming, and its performance is strongly dependent on the starting vertex.

Figure 2.3 shows a smallest BDST found by OTTC, of diameter D = 5 on n = 100


2.3.2.2 Center-Based Tree Construction Algorithm

In [28], the Center-Based Tree Construction Heuristic (CBTC) applies the same Prim-based strategy but uses the start vertex as the center of the spanning tree (if D is even) or as one of the two vertices in the center (if D is odd). This algorithm does not need to bound each vertex's eccentricity. It suffices to bound each vertex's depth, the number of edges on the path from the tree's center to the vertex. No vertex can be more than $\lfloor D/2 \rfloor$ edges from the center, and the depth, and thus the eligibility, of a vertex is fixed when it joins the tree. Updating this algorithm's data structures requires only linear time in the worst case (constant time when a new vertex's depth is $\lfloor D/2 \rfloor$), so the time complexity of the algorithm is $O(n^2)$, and $O(n^3)$ if starting at each vertex.

Julstrom also modified the CBTC algorithm by choosing the starting vertex and all subsequent vertices at random from those not yet in the spanning tree. The connection of each new vertex v to the tree remains greedy: it always uses the lowest-weight edge that connects v to a vertex in the tree whose depth is less than $\lfloor D/2 \rfloor$. The modified algorithm is called Randomized center-based Tree Construction (RTC). The time complexity of RTC, like that of CBTC, is $O(n^2)$. Running the randomized heuristic n times and reporting the best solution is thus $O(n^3)$.
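The depth-bounded, Prim-style construction used by CBTC can be sketched as below. This is a minimal illustration, not Julstrom's code: the function name `cbtc`, the weight-matrix input `w` and the `best` dictionary are assumptions, and for odd D the second center is taken to be the vertex nearest to the start vertex, which the text above does not specify. The sketch requires D >= 2.

```python
def cbtc(w, D, start):
    """Greedy center-based construction on a complete graph with symmetric
    weight matrix w. Returns (edges, depth); no depth exceeds D // 2."""
    n = len(w)
    max_depth = D // 2
    depth = {start: 0}
    edges = []
    best = {}  # vertex not in tree -> (cheapest eligible edge weight, parent)

    def offer(u):
        # A vertex may take children only while its depth is below the bound.
        if depth[u] < max_depth:
            for v in range(n):
                if v not in depth and (v not in best or w[u][v] < best[v][0]):
                    best[v] = (w[u][v], u)

    offer(start)
    if D % 2 == 1 and n > 1:
        # Assumption: the second center is the vertex nearest to the start.
        second = min((v for v in range(n) if v != start),
                     key=lambda v: w[start][v])
        depth[second] = 0
        edges.append((start, second))
        best.pop(second, None)
        offer(second)
    while len(depth) < n:
        v = min(best, key=lambda x: best[x][0])  # cheapest eligible attachment
        _, u = best.pop(v)
        depth[v] = depth[u] + 1
        edges.append((u, v))
        offer(v)
    return edges, depth
```

Each vertex triggers at most one linear scan when it joins the tree, and each selection is another linear scan, which matches the $O(n^2)$ bound quoted for a single start vertex. Replacing the `min` selection by a uniformly random choice of v would give an RTC-like variant.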

2.3.2.3 Randomized Greedy Heuristic Algorithm

Raidl and Julstrom proposed in [40] a modified version of OTTC, called the Randomized Greedy Heuristic (RGH). RGH starts by randomly selecting a vertex and keeping it as the fixed center during the search. It then repeatedly extends the spanning tree from the center by adding a randomly chosen vertex from the remaining vertices, and connecting it to a vertex that is already in the tree via an edge with the smallest weight.

The algorithm also differs from OTTC in that it begins by fixing the center of the tree. The starting vertex $v_0$ is chosen randomly. If D is even, $v_0$ is the center. If D is odd, another vertex $v_1$ is chosen at random and $v_0$, $v_1$ are the centers: the edge joining them is the first in the tree. Instead of maintaining the eccentricities of vertices and the path lengths between vertices, the randomized heuristic stores the depth of each connected vertex: the number of edges on the path from it to the center. This value is set when a vertex joins the tree and does not subsequently change. No vertex may have a depth greater than $\lfloor D/2 \rfloor$; otherwise the diameter constraint is violated or $v_0$ (or $v_1$) is displaced from the center.

A sketch of the RGH algorithm is presented in Algorithm 1.

Identifying the vertex u ∈ C that is nearest to v requires time O(|C|) = O(n). This

v ← a random vertex from U;

u ← vertex from C with smallest w(u, v);
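Since only fragments of Algorithm 1 survive here, the following is a hedged reconstruction of RGH from the prose description above, not the authors' listing. The names `rgh`, `unconnected` (the set U) and `candidates` (the set C) are illustrative, the graph is assumed complete with weight matrix `w`, and D >= 2 is required.

```python
import random

def rgh(w, D, rng=None):
    """Randomized greedy construction: vertices join in random order, each
    attached by its cheapest edge to a tree vertex that may still take
    children. Returns (edges, depth)."""
    rng = rng or random.Random()
    n = len(w)
    max_depth = D // 2
    unconnected = list(range(n))      # U, visited in random order
    rng.shuffle(unconnected)
    v0 = unconnected.pop()
    depth = {v0: 0}
    candidates = [v0]                 # C: tree vertices with depth < max_depth
    edges = []
    if D % 2 == 1 and unconnected:
        v1 = unconnected.pop()        # second center, also chosen at random
        depth[v1] = 0
        candidates.append(v1)
        edges.append((v0, v1))
    while unconnected:
        v = unconnected.pop()                            # random vertex from U
        u = min(candidates, key=lambda c: w[c][v])       # nearest vertex in C
        edges.append((u, v))
        depth[v] = depth[u] + 1
        if depth[v] < max_depth:
            candidates.append(v)
    return edges, depth
```

Shuffling once and popping from the end is equivalent to drawing a uniformly random vertex from U at every step. Passing a seeded `random.Random` makes a run reproducible, which is convenient when the heuristic is called repeatedly and the best tree is kept.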


2.3.2.4 Improved Greedy Heuristics (RGH-I and CBTC-I)

Singh and Gupta [46] extended the greedy constructive heuristics with a local search step that reevaluates previous vertex connections after appending each new vertex. They check, for each vertex v, whether it can be connected to a better parent vertex than the one to which it is currently connected without violating the diameter constraint. The vertex which offers the maximum reduction in the cost of the BDST is selected, and the whole subtree rooted at vertex v is deleted from its current location and reconnected to the tree via the selected vertex.
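The reconnection step can be illustrated with the sketch below. It is a simplified reading of the description above rather than Singh and Gupta's implementation: it assumes the tree is stored as a `parent` dictionary (centers map to `None`) together with a `depth` dictionary, visits the vertices in index order, and moves a subtree whenever a cheaper admissible parent exists. The names `subtree` and `reattach_pass` are hypothetical.

```python
def subtree(children, v):
    """All vertices of the subtree rooted at v, including v itself."""
    stack, found = [v], []
    while stack:
        x = stack.pop()
        found.append(x)
        stack.extend(children[x])
    return found

def reattach_pass(w, parent, depth, max_depth):
    """One pass over all vertices; updates parent and depth in place and
    returns True if at least one subtree was moved."""
    n = len(w)
    children = {v: [] for v in range(n)}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)
    improved = False
    for v in range(n):
        p = parent[v]
        if p is None:                 # center vertices keep their place
            continue
        sub = subtree(children, v)
        inside = set(sub)
        height = max(depth[x] for x in sub) - depth[v]
        best_u, best_w = p, w[p][v]
        for u in range(n):
            # The new parent must lie outside the moved subtree, and the
            # deepest vertex of that subtree must stay within the depth bound.
            if u in inside or depth[u] + 1 + height > max_depth:
                continue
            if w[u][v] < best_w:
                best_u, best_w = u, w[u][v]
        if best_u != p:
            children[p].remove(v)
            children[best_u].append(v)
            parent[v] = best_u
            shift = depth[best_u] + 1 - depth[v]
            for x in sub:
                depth[x] += shift
            improved = True
    return improved
```

Calling `reattach_pass` repeatedly until it returns `False` gives a simple local search; since every move strictly lowers the tree cost, the loop terminates.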

This improvement is applicable to CBTC also and the obtained algorithm will be denoted

instance with 100 vertices and D = 10, respectively. The tree in figure 2.5 was found by applying the local search to the best tree found by the CBTC algorithm (figure 2.4); the change can be seen at the circle mark.

Figures 2.6 and 2.7 show the best BDSTs found by RGH and RGH-I, respectively, on the Euclidean problem instance with n = 100 and D = 10. The tree in figure 2.7 was found by applying the local search to the best tree found by the RGH algorithm (figure 2.6); the change can be seen at the circle mark. Singh and Gupta [46] experiment on Euclidean instances with 50, 100, 250, 500 and 1000 vertices, with the diameter bound set to 5, 10, 15, 20 and 25 respectively.

In [21], Gruber and Raidl propose a constructive heuristic that exploits a hierarchical clustering to guide the process of building a backbone. The clustering heuristic constructs diameter constrained trees in three steps: determining a hierarchical clustering, reducing the height of this clustering according to the diameter bound, and finally deriving a BDMST from this height-restricted clustering.

They experiment on the Euclidean instances from Beasley's OR-Library [7] with |V| = 1000; the first 15 instances are used. On large Euclidean instances, the BDMSTs obtained by the HCH outperform those of other construction heuristics significantly, especially when the diameter bound is tight, and the heuristic takes only a few seconds, but it cannot be applied to Non-Euclidean instances.


2.3.2.6 Comments

In [46], Singh and Gupta experiment with and compare the results of OTTC, RGH, RGH-I, CBTC and CBTC-I on Euclidean and Non-Euclidean instances in which the number of vertices is 50, 100, 250, 500 and 1000 and the diameter bound is set to 5, 10, 15, 20 and 25 respectively. The experimental results show that:

On the Non-Euclidean instances, RGH-I and CBTC-I give better best and average results than RGH and CBTC respectively. Both RGH and RGH-I perform much worse than OTTC, CBTC and CBTC-I; even RGH-I cannot compete with them. On almost all instances, OTTC gives the best minimum and mean values.

In [28], Julstrom experiments on 240 graphs: 120 Euclidean and an equal number with edge weights chosen at random. The Euclidean graphs consist of points randomly placed in the unit square, 30 graphs each of n = 100, 250, 500 and 1,000 points. In each set of graphs, 15 instances can be found in the OR-Library [7], where they are listed as instances of the Euclidean Steiner problem, and 15 more were randomly generated. In each set, the points are the vertices of complete graphs whose edge weights are the Euclidean distances between the points.

Four more sets of 30 complete graphs also consist of n = 100, 250, 500 and 1,000 vertices. The edge weights of these graphs were chosen at random on the interval [0.01, 0.99].
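Instances of the two kinds described above could be generated along the following lines. This is only a sketch of the stated recipe (points in the unit square with Euclidean distances, or independent weights drawn from [0.01, 0.99]); the function names and the use of Python's `random.Random` generator are assumptions, and the instances used in the cited experiments were not produced by this code.

```python
import math
import random

def euclidean_instance(n, rng):
    """n random points in the unit square; weights are Euclidean distances."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    w = [[math.dist(p, q) for q in points] for p in points]
    return points, w

def random_weight_instance(n, rng, low=0.01, high=0.99):
    """Complete graph whose edge weights are drawn uniformly from [low, high]."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.uniform(low, high)
    return w
```

Random weights generally violate the triangle inequality, which is one way to see why heuristics tuned to Euclidean structure may behave differently on such instances.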

On the Euclidean instances, the diameter bound is set to 5, 10, 15, 25 for |V| = 100; 10, 15, 20, 40 for |V| = 250; 15, 30, 45, 60 for |V| = 500; and 20, 40, 60, 100 for |V| = 1000. On the random edge weight instances, the diameter bound is set to 5, 7, 10, 15 for |V| = 100; 5, 10, 15, 20 for |V| = 250; 10, 15, 20, 30 for |V| = 500; and 10, 30, 30, 50 for |V| = 1000.

The experimental results in [28, 46, 40] show that:

On the Euclidean instances, the best and average results found by RGH-I are better

that are slightly shorter than those OTTC finds, but RTC trees are much shorter than those of OTTC and CBTC. When OTTC and CBTC are applied to problem instances whose vertices are points in Euclidean space and whose edge weights are the distances between the points, the weights of the BDMSTs found by the heuristics are much larger than the minimum, especially in the case where D is smaller than n. OTTC and CBTC build backbones of short edges; the remaining points connect to these backbones via longer edges, so OTTC and CBTC build longer trees than necessary. This observation holds for almost all BDMST problem instances. With larger diameter bounds, the differences in the three algorithms' results diminish, to the particular advantage of CBTC.

On random weight instances, the trees CBTC identifies have on average lower weights than those of OTTC. RTC is always worse than both OTTC and CBTC. The lack of Euclidean structure in the random-weight instances makes OTTC and CBTC better than RTC.

2.3.3 Metaheuristic algorithms

Besides the greedy construction heuristics, several research groups have developed evolutionary algorithms (EAs) for solving the BDMST, in the hope of finding good results within reasonable time.

In an EA, the representation method plays an important role and determines all the operators in the algorithm.

Representation methods: There are many methods for representing individuals, especially spanning trees: Characteristic vectors, Predecessor coding, Prüfer number, Link and node, Edge-set-encoding, Permutation code. In this thesis, we will use Edge-set-encoding and Permutation code.

• Edge-set-encoding: The problem of spanning tree representation has been studied extensively in the literature. References [36, 26] and especially [45] contain substantial discussions and analysis of different representations from theoretical and practical perspectives. For the BDMST problem, three representations have been


Title	Genetic Algorithms For Solving Bounded Diameter Minimum Spanning Tree Problem
Author	Huynh Thi Thanh Binh
Supervisor	Associate Professor Nguyen Duc Nghia
Institution	Hanoi University of Science and Technology
Field	Engineering
Type	Dissertation
Year of publication	2011
City	Hanoi
