1997a, who use a three-stage GA for computer network topology design, routing and capacity assignment; and Pierre and Legault 1998, who use a bit-string GA for computer mesh network desi
Trang 1Node-Pair Encoding Genetic
Programming for Optical Mesh Network Topology Design
Mark C Sinclair
Telecommunications is a vital and growing area, important not only in its own right, but also for the service it provides to other areas of human endeavour Moreover, there currently seems to be a demand for an ever-expanding set of telecommunication services of ever-increasing bandwidth One particular technology that has the potential to provide the huge bandwidths necessary if such broadband services are to be widely adopted, is multi-wavelength all-optical transport networks (Mukherjee, 1997) However, the development of such networks presents scientists and engineers with a challenging range of difficult design and optimisation problems
One such problem is mesh network topology design In the general case, this starts with
a set of node locations and a traffic matrix, and determines which of the node pairs are to be directly connected by a link The design is guided by an objective function, often cost-based, which allows the ‘fitness’ of candidate networks to be evaluated In the more specific problem of the topology design of multi-wavelength all-optical transport networks, the nodes would be optical cross-connects, the links optical fibres, and the traffic static Suitable routing and dimensioning algorithms must be selected, with sufficient allowance for restoration paths, to ensure that the network would at least survive the failure of any single component (node or link)
Telecommunications Optimization: Heuristic and Adaptive Techniques, edited by D.W Corne, M.J Oates and G.D Smith
Copyright © 2000 John Wiley & Sons Ltd ISBNs: 0-471-98855-3 (Hardback); 0-470-84163X (Electronic)
Trang 2In previous work, Sinclair (1995; 1997) has applied a simple bit-string Genetic Algorithm (GA), a hybrid GA and, with Aiyarak and Saket (1997) three different Genetic Programming (GP) approaches to this problem In this chapter, a new GP approach, inspired by edge encoding (Luke and Spector, 1996), is described (a shorter version of this chapter was first published at GECCO’99 (Sinclair, 1999)) It was hoped that this would provide better results than the best previous GP approach, and perhaps even prove competitive with GAs The eventual aim of the author’s current research into GP encoding schemes for mesh network topologies is to provide a more scalable approach than the inherently non-scalable encoding provided by bit-string GAs
The chapter is structured as follows Section 6.2 presents some of the previous work on Evolutionary Computation (EC) for network topology design; section 6.3 provides a more detailed description of the problem tackled, including the network cost model; and section 6.4 summarises two previous solution attempts In section 6.5 the node-pair encoding GP approach is described; then in section 6.6 some experimental results are outlined; and finally, section 6.7 records both conclusions and suggestions for further work
Over the years, at least 30 papers have been published on EC approaches to network
topology design Two of the earliest are by Michalewicz (1991) and Kumar et al (1992).
Michalewicz uses a two-dimensional binary adjacency matrix representation, and problem-specific versions of mutation and crossover, to evolve minimum spanning tree topologies
for computer networks Kumar et al tackle three constrained computer network topology
problems, aiming for maximum reliability, minimum network diameter or minimum average hop count Their GA is bit-string, but uses problem-specific crossover, as well as a repair operator to correct for redundancy in their network representation
For optical network topology design, in addition to the papers by Sinclair, mentioned
above, there is the work of Paul et al (1996) and Brittain et al (1997) Both these groups of
authors have addressed constrained minimum-cost Passive Optical Network (PON) topology design for local access However, while problem-specific representations are used
by both, as well as problem-specific genetic operators by Paul et al., only Brittain et al.
provide full details of their algorithm This uses a two-part chromosome comprising a bit-string and a permutation, with each part manipulated with appropriate standard operators
Other recent work of interest includes Dengiz et al (1997; Chapter 1, this volume) on a hybrid GA for maximum all-terminal network reliability, Ko et al (1997a), who use a
three-stage GA for computer network topology design, routing and capacity assignment; and Pierre and Legault (1998), who use a bit-string GA for computer mesh network design
Given the locations of the n nodes (optical cross connects) and the static traffic
requirements, t ij,between them, the problem is to determine which of the n(n–1)/2 possible
bi-directional links (optical fibres) should be used to construct the network The number of possible topologies is thus 2n(n− 1 / 2 To illustrate, Figure 6.1 shows a 15-node network design problem, for which an example mesh topology design is given in Figure 6.2
Trang 3Figure 6.1 Example network design problem.
The cost model used to guide the design was developed by Sinclair (1995) for minimum-cost topology design of the European Optical Network (EON) as part of the
COST 239 initiative (O’Mahony et al., 1993) It assumes static
two-shortest-node-disjoint-path routing (Chen, 1990) between node pairs, and that a reliability constraint is used This
is to ensure that there are two, usually fully-resourced, node-disjoint routes between node pairs, thus guaranteeing the network will survive the failure of any single component
To determine the cost of a given topology, separate models for both links and nodes are required The intention is to approximate the relative contribution to purchase, installation and maintenance costs of the different network elements, while ensuring the model is not too dependent on the details of the element designs, nor too complex for use in the ‘inner loop’ of a design procedure
Trang 4Figure 6.2 Example mesh topology.
First, the two-shortest-node-disjoint routes are determined for all node pairs In this, the
‘length’ of each path is taken to be the sum of the contributing link weights, with the weight
of the bi-directional link between nodes i and j given by:
j ij
i
ij N L N
where N i and N j are the node effective distances of nodes i and j, respectively (see below) and L ij is the length of link (i,j) in km Then, the link carried traffic is determined for each
link by summing the contributions from all the primary and restoration routes that make use
of it The restoration traffic is weighted by a parameter K R, but restoration routes are
usually taken to be fully-resourced (i.e K = 1.0), and thus are assumed to carry the same
Trang 5traffic as the corresponding primary routes, i.e the traffic they would be required to carry if
their primary route failed The capacity of link (i,j) is taken to be:
) ( ceilK T ij
ij K T
where T ij is the carried traffic in Gbit/s on the link, and ceilx() rounds its argument up to the
nearest x – here, the assumed granularity of the transmission links, K G (say 2.5 Gbit/s) The
factor of K T (say 1.4) is to allow for stochastic effects in the traffic The cost of link (i,j) is
then given by:
ij ij
ij V L
where α is a constant (here taken to be 1.0) Increasing capacity necessarily implies increased cost due, for example, to wider transmission bandwidth, narrower wavelength separation and/or increasing number and speed of transmitters and receivers With α = 1, a linear dependence of cost on capacity is assumed, but with α < 1, the cost can be adjusted
to rise more slowly with increases in the link capacity The linear link length dependency approximates the increasing costs of, for example, duct installation, fibre blowing and/or the larger number of optical amplifiers with increasing distance
Node effective distance was used as a way of representing the cost of nodes in an optical network in equivalent distance terms It can be regarded as the effective distance added to a path as a result of traversing a node By including it in link weights (equation 6.1) for the calculation of shortest paths, path weights reflect the cost of the nodes traversed (and half the costs of the end nodes) As a result, a longer geographical path may have a lower weight if it traverses fewer nodes, thereby reflecting the relatively high costs of
optical switching The node effective distance of node i was taken to be:
n i
i K n K
where n i is the degree of node i, i.e the number of bi-directional links attached to it The constants K0 and K n were taken to be, say 200 km and 100 km, respectively, as these were judged to be reasonable for the network diameters of 1400–3000 km used here Node effective distance thus increases as the switch grows more complex Node capacity is the sum of the capacities of all the attached links, i.e
∑
=
j ij
i V
where V i is the capacity of node i in Gbit/s The node cost was taken to be:
i i
i N V
The cost is thus derived as if the node were a star of links, each of half the node effective length, and each having the same capacity as the node itself Further, if all the links
Trang 6attached to a node are of the same capacity, the node costs would grow approximately with the square of the node degree, corresponding, for example, to the growth in the number of crosspoints in a simple optical space switch Overall, the relative costs of nodes and links
can be adjusted by setting the values of K0 and K n appropriately
The network cost is then taken to be the sum of the costs of all the individual links and nodes comprising the network However, to ensure the reliability constraint is met, the
actual metric used also includes a penalty function This adds P R (say 250,000) to the
network cost for every node pair that has no alternative path, and P N (say 500,000) for every node pair with no routes at all between them This avoids the false minimum of the topology with no links at all, whose cost, under the link and node cost models employed,
would be zero The values selected for P R and P N must be sufficiently large to consistently ensure no penalties are being incurred by network topologies in the latter part of an evolutionary algorithm run However, values substantially larger than this should be avoided, as they may otherwise completely dominate the costs resulting from the topology itself, and thereby limit evolutionary progress
Two previous attempts at optical mesh topology design using the same cost model are outlined below An additional attempt with a hybrid GA (Sinclair, 1997) has been excluded from consideration due to the highly problem-specific nature of its encoding and operators
In 1995, Sinclair used a bit-string GA to tackle two variants of the EON: a small illustrative problem consisting of just the central nine nodes, as well as the full network of ~20 nodes
The encoding simply consisted of a bit for each of the n(n–1)/2 possible links Clearly, this
representation scales poorly as problem size increases, as the bit string grows Ο(n2), rather
than only with the number of links actually required, m, i.e Ο(m) ≈ Ο(n) provided node
degree remains approximately constant with network size The genetic operators were single-point crossover (probability 0.6) and mutation (probability 0.001); and fitness-proportionate selection (window-scaling, window size 5) was used The GA was generational, although an elitist strategy was employed
For the full EON, with a population of 100 and ten runs of less than 48,000 trials each, a network design was obtained with a cost (at 6.851×106) which was some 5.1% lower than
an earlier hand-crafted design (O’Mahony et al., 1993) In addition, the GA network design
was of superior reliability, at least in terms of the reliability constraint
More recently, Aiyarak et al (1997) described three different approaches to applying GP to
the problem The most successful of these, Connected Nodes (CN), encodes the design as a program describing how the network should be connected Only one terminal and one function are required The terminal is the ephemeral random integer constant ℜ, over the
Trang 7range of n node identification numbers (ids) The function is con, which takes two
arguments, representing the ids of two nodes that are to be connected in the network
topology represented by the program (note: in Aiyarak et al., 1997), this function was
called connect2) As each con function must also provide a return value, it simply returns its first argument However, if its two id arguments are equal, it does nothing apart
from returning their value The program tree is evaluated depth-first: children before
parents For example, Figure 6.3 shows a small target network and Figure 6.4 a corresponding ‘hand-crafted’ connected-nodes GP tree Executing the tree would connect those node pairs indicated above each of the con functions: (4,3), (1,4), (1,3), and
so on, resulting in the desired network Clearly, with this representation, minimum program size grows only with the number of links (Ο(m) ≈ Ο(n)), as only a single con and at most two terminals are required for each node pair connected In Aiyarak et al (1997), the only
genetic operator used was crossover, and tournament selection was employed
Experimental results were obtained for both the nine central nodes, as well as the full EON, establishing the superiority of the CN approach over the two other GP encoding methods In addition, the network design cost obtained with CN (6.939×106) was only some 1% above Sinclair’s earlier GA result (1995) However, the computational burden of CN was far greater: the best design was obtained using two runs of 500,000 trials each on a population of 1000 individuals
Figure 6.3 Target network.
The two node-pair encoding (NP) GP approaches presented in this chapter are based on an earlier edge encoding for graphs described by Luke and Spector (1996) Their approach evolved a location-independent topology, with both the number of nodes and their interconnections specified by the GP program Here, however, the number of nodes is fixed
in advance, and the GP program is only required to specify the links
As in the CN approach, the program again describes how the network should be
connected However, for NP, the program tree is evaluated top-down: children after
parents The functions operate on a node pair (represented by two node ids, a and b), then pass the possibly-modified node-pair to each of their children Similarly, terminals operate
on the node pair passed to them Overall execution of the tree starts with (a,b) = (0,0)
2 3
4
Trang 8Figure 6.4 CN program for the target network.
This use of a current node pair is similar to the concept of the current edge in Luke and Spector (1996), but without any structure equivalent to their node stack Two different variants of node-pair encoding are described here (NP1 and NP2); the difference is due to the arities of their functions/terminals These are given, for both variants, in Table 6.1 The function/terminal add adds a link between the current node pair (a,b), provided
a ≠ b; whereas terminal cut removes the link between the current node pair, if there is one To allow the current node pair to change, thus moving the focus of program execution
to a different point in the network, five of the functions (da, ia, ia2, ia4, ia8) modify the value of variable a The choice of 1, 2, 4 and 8 for the values by which a may be increased was motivated by minimum-description length considerations By providing the reverse function (rev), similar modifications can be made, in effect, to variable b The double (dbl) and triple (tpl) functions allow operations on node pairs that are numerically close together to be accomplished using a smaller depth of tree For example,
in Figure 6.5, the tpl in the lower left of the diagram, which refers to node pair (2,1), enables links (2,1), (3,1) and (4,1) to be added by its subtrees Without this tpl, an equivalent subtree based on just adds, ia?s and nops would be two levels deeper Finally,
con
con
1
con
con
con
3
con
con
con
3
(4,3) (1,4)
(1,3)
(1,2)
(1,0)
(2,3) (0,4)
(0,3)
Trang 9terminal nop does nothing to its node pair, thus allowing a program branch to terminate without either adding or removing a link
Table 6.1 NP function/terminal sets.
Arity Abbr NP1 NP2 Description
rev 1 1 Reverse: (a,b) = (b,a)
da 1 2 (decrement a) mod n
ia 1 2 (increment a) mod n
ia2 1 2 (increase a by 2) mod n
ia4 1 2 (increase a by 4) mod n
ia8 1 2 (increase a by 8) mod n
dbl 2 2 Double: pass current node pair to both children
tpl 3 3 Triple: pass current node pair to all three children add 1 0 Add link (a,b)
cut 0 0 Cut link (a,b)
nop 0 0 Do nothing
In NP1, add is a function and has a single child, whereas in NP2 it is a terminal Further, in NP1, the functions that modify variable a all have just one child, whereas in NP2, they have two The effect of these differences is to encourage taller, narrower program trees in NP1, with further operations following the addition of a link, and shallower, broader trees in NP2, with link addition ending a program branch This is illustrated by the ‘hand-crafted’ program trees, for the target network of Figure 6.3, given in Figures 6.5 and 6.6 using NP1 and NP2 respectively In both diagrams, the current node pair is indicated above each of the add function/terminals It should be noted that the tree
in Figure 6.5 has a depth of 10 and uses 28 program nodes, whereas that in Figure 6.6 is only 7 levels deep and uses just 24 function/terminals
For the relative assessment of the different approaches to optical mesh network topology design, it was decided to use the full 20-node EON (Sinclair, 1995) and five additional network design problems For the latter, the initial node locations and traffic requirements
were generated using the approach described by Griffith et al (1996), although further
modified to ensure reasonable node separations Each network has 15 nodes, covers a 1000km ×1000km area and carries an overall traffic of 1500 Gbit/s Due to their smaller
size, reduced penalties of P R = 50,000 and P N = 100,000 were used for the 15-node networks, rather than the larger values suggested in section 6.3, which were those applied for the EON
Trang 10Figure 6.5 NP1 program for target network.
The bit-string GA developed by Sinclair (1995), and reviewed in section 6.4.1, is referred
to here as GA1 (with a maximum of 15,000 trials) when used on the five 15-node problems, and as GA4 (with a maximum of 50,000 trials) on the EON (Table 6.2) In addition, in an attempt to improve on Sinclair’s results, both a higher mutation rate and a larger population size were also tried for both the 15-node problems and the EON From a few trials runs, reasonably successful algorithms with a higher mutation rate (GA2) and a larger population size (GA3) were discovered for the 15-node problems Also, for the EON, algorithms with
an increased mutation probability of 0.01, and increased population sizes of 200, 500 and
1000 (the latter three still with mutation probability 0.001) were tried, without good results,
add
dbl
dbl
rev dbl add
add nop
ia4 add nop
ia2 add nop
rev ia2 tpl
nop
add nop
(4,0)
(4,3)
(2,3) (4,1) (3,1)
(2,1)
(0,3) (1,0)