Natural Computing for Simulation-Based Optimization and Beyond




SpringerBriefs in Operations Research


SpringerBriefs present concise summaries of cutting-edge research and practical applications across a wide spectrum of fields. Featuring compact volumes of 50 to 125 pages, the series covers a range of content from professional to academic. Typical topics might include:

• A timely report of state-of-the-art analytical techniques

• A bridge between new research results, as published in journal articles, and a contextual literature review

• A snapshot of a hot or emerging topic

• An in-depth case study or clinical example

• A presentation of core concepts that students must understand in order to make independent contributions

SpringerBriefs in Operations Research showcase emerging theory, empirical research, and practical application in the various areas of operations research, management science, and related fields, from a global author community. Briefs are characterized by fast, global electronic dissemination, standard publishing contracts, standardized manuscript preparation and formatting guidelines, and expedited production schedules.

More information about this series at http://www.springer.com/series/11467


Silja Meyer-Nieberg • Nadiia Leopold • Tobias Uhlig


Silja Meyer-Nieberg

ITIS GmbH

Neubiberg, Bayern, Germany

Nadiia Leopold

Bundeswehr University Munich

Neubiberg, Bayern, Germany

Tobias Uhlig

Bundeswehr University Munich

Neubiberg, Bayern, Germany

ISSN 2195-0482 ISSN 2195-0504 (electronic)

SpringerBriefs in Operations Research

ISBN 978-3-030-26214-3 ISBN 978-3-030-26215-0 (eBook)

https://doi.org/10.1007/978-3-030-26215-0

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.


This brief bridges the gap between the areas of simulation studies on the one hand and optimization with natural computing on the other. Most overviews concerning the connecting area of simulation-based or simulation optimization do not focus on natural computing. While they often mention the area briefly as one of the sources of potential techniques, they concentrate on methods stemming from classical optimization. Since natural computing methods have been applied with great success in several application areas, a review concerning potential benefits and pitfalls for simulation studies is merited. The brief presents such an overview and combines it with an introduction to natural computing and selected major approaches as well as a concise treatment of general simulation-based optimization. As such, it is the first review which covers both the methodological background and recent application cases. Therefore, it will be of interest to practitioners from either field as well as to people starting their research.

The brief is intended to serve two purposes. First, it can be used to gain more information concerning natural computing, its major dialects, and their usage for simulation studies. Here, we also cover the areas of multi-objective optimization and neuroevolution. While the latter is only seldom mentioned in connection with simulation studies, it is a powerful potential technique, as is pointed out below. Second, the reader is provided with an overview of several areas of simulation-based optimization which range from logistic problems to engineering tasks.

Additionally, the brief focuses on the usage of surrogate and meta-models. It takes two research directions into close consideration which are rarely considered in simulation-based optimization: (evolutionary) data farming and digital games. Data farming is a relatively new and lively subarea of exploratory simulation studies. As it often aims to find weaknesses in the simulated systems, it benefits from direct search and as such from natural computing. The brief presents recent application examples. Digital games, which are also termed soft simulations, are interesting from several vantage points. First of all, they represent a vibrant and rapidly progressing research field in the area of natural computing. So far, however, the communities are disjunct, resulting in a slow migration of concepts and ideas from one area to the other.



Notwithstanding, both fields may profit from each other. Therefore, the brief contains a concise review concerning natural computing and digital games. Second, one of the major research directions in digital games focuses on the development of convincing non-player characters, in other words, on deriving good controllers. Often employed methods comprise, for example, genetic programming and neuroevolution. Here, we arrive at another point where the brief diverges from traditional overviews: behavioral and controller learning. Despite the abundance of approaches for games, it has only seldom been considered in the related area of simulation. It is our belief that it offers great potential benefits, especially if simulation-based optimization is used to identify weaknesses or to conduct stress tests.

Overall, the brief will appeal to two major research communities in operations research: optimization and simulation. It is of interest to both experienced practitioners and newcomers to the field.

Tobias Uhlig


1 Introduction to Simulation-Based Optimization 1

1.1 Natural Computing and Simulation 1

1.2 Simulation-Based Optimization 3

1.2.1 From Task to Optimization 6

1.2.2 A Brief Classification of Simulation-Based Optimization 7

References 8

2 Natural Computing and Optimization 9

2.1 Evolutionary Algorithms 9

2.1.1 Genetic Algorithms 12

2.1.2 Evolution Strategies 14

2.1.3 Differential Evolution 16

2.1.4 Genetic Programming 17

2.2 Swarm-Based Methods 18

2.2.1 Ant Colony Optimization 18

2.2.2 Particle Swarm Optimization 19

2.3 Neuroevolution 21

2.4 Natural Computing and Multi-Objective Optimization 22

References 27

3 Simulation-Based Optimization 31

3.1 On Using Natural Computing 31

3.2 Simulation-Based Optimization: From Industrial Optimization to Urban Transportation 33

3.3 Simplifying Matters: Surrogate Assisted Evolution 39

3.4 Evolutionary Data Farming 43

3.5 Soft Simulations: Digital Games and Natural Computing 46

References 50

4 Conclusions 59



Chapter 1

Introduction to Simulation-Based Optimization

Abstract Natural computing techniques first appeared in the 1960s and gained more and more importance with the increase of computing resources. Today they are among the established techniques for black-box optimization, which characterizes tasks where an analytical model cannot be obtained and the optimization technique can only utilize the function evaluations themselves. A classical application area is simulation-based optimization. Here, natural computing techniques have been applied with great success. But before we can focus on the application areas, we first have to take a closer look at what we mean when we refer to optimization, simulation, and natural computing. The present chapter is devoted to a concise introduction to the field.

1.1 Natural Computing and Simulation

Natural computing (NC) comprises approaches that adopt principles found in nature, mimicking evolutionary and other natural processes, e.g., implementing simple brain models or simulating swarm behavior [1]. Methods belonging to natural computing are therefore quite diverse, ranging across evolutionary algorithms, swarm-based techniques, and neural networks. Further examples include artificial immune systems [2], DNA computing [3], quantum systems (e.g., see the respective sections in [1]), or even slime moulds [4]. Simulation-based analyses and simulation-based optimization (SBO) are among the earliest application areas. Today, success stories of natural computing include examples from the engineering or industrial domain [5], computational red teaming, and evolutionary data farming [6]. This book presents an overview of current natural computing techniques as well as their applications in the broad area of simulation. We will refer to this area as simulation-based optimization, but it should be noted that the term simulation optimization is also common. In general, two main applications can be distinguished: The first uses natural computing to optimize control parameters of a simulated system, see Fig. 1.1. Usually, this does not change the intrinsic structures or behavioral routines of the system itself. Commonly used NC methods for this application scenario are genetic algorithms, evolution strategies, or particle swarm optimization. The second approach transforms


Fig. 1.1 Optimizing

Fig. 1.2 Behavioral learning

The present survey is structured as follows: Sect. 1.2 provides a brief overview of simulation-based optimization in general. The following sections cover the most common natural computation approaches for optimization and their fundamental working principles. Here, we present the large field of evolutionary algorithms, swarm-based methods, and evolutionary neural networks. Special attention is paid to the growing field of multi-objective optimization in Sect. 2.4. Afterwards, exemplary applications of simulation-based optimization with natural computing are described in Sect. 3. The section in turn consists of five parts: First, we discuss the general applicability of NC approaches. Afterwards, we display the spectrum of application cases in Sect. 3.2. The third part zooms in on the use of meta-models or surrogate-assisted approaches. These approaches have been introduced to reduce the impact of the expensive evaluations. As direct search methods, the NC methods require the computation of a performance measure, the so-called fitness, to assess the quality of a potential solution. In the case of simulation-based optimization, evaluating an individual is based on conducting simulation runs. Since nearly all approaches operate with several solutions at a time, using natural computing can be time-consuming, especially when used together with stochastic multi-agent systems or finite element simulations.



Fig. 1.3 Simulation-based optimization: simulation evaluates solution candidates provided by an optimization approach and returns performance measures that are used to steer the computation

1.2 Simulation-Based Optimization

The goal in optimization is to find a solution to a problem of the following form:

min f(x), x ∈ X,  (1.1)

with X the space of feasible solutions, also called the search space, and f : X → R the function of interest. Typically, the position of the solution and its function value are of interest. Following common practice, the discussion is restricted to the minimization case. However, the transfer to maximization can be done easily [7].

In simulation-based optimization the function f is not directly available and is replaced by simulation (see Fig. 1.3).

Simulation executes a model to calculate values of interest; essentially, it is model-based experimentation. The employed model is an abstract representation of a system and approximates the properties and behavior of the modeled system. Consequently, simulation provides only an estimate ˆf of the exact function of interest f. This estimate can be written as an expectation:

f(x) ≈ ˆf(x) = E[F(x, θ)],  (1.2)

where θ is a random variable and F is a sample performance measure. Simulation is often applied when stochastic and dynamic effects are intrinsic properties of the considered system and prohibit the direct formulation of the function of interest.
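As an illustration of Eq. (1.2), the expectation E[F(x, θ)] is typically estimated by averaging repeated simulation runs. The following sketch is our own toy example, not from the book; `simulate` is a hypothetical stand-in for an actual simulation model.

```python
import random

# Hypothetical stand-in for one simulation run: F(x, theta) returns a
# noisy sample of the true performance f(x) (here: the 2-D sphere).
def simulate(x, rng):
    true_value = sum(xi ** 2 for xi in x)    # f(x), unknown in practice
    return true_value + rng.gauss(0.0, 0.5)  # stochastic effects theta

def estimate_f(x, n_runs=200, seed=42):
    """Estimate f_hat(x) = E[F(x, theta)] by averaging independent runs."""
    rng = random.Random(seed)
    samples = [simulate(x, rng) for _ in range(n_runs)]
    return sum(samples) / n_runs

x = (1.0, 2.0)
print(estimate_f(x))  # close to f(x) = 5.0, up to sampling error
```

The more runs are averaged, the smaller the sampling error of ˆf(x), which is exactly why expensive simulations make such estimates costly.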

Trang 12

The influence of stochasticity also needs to be taken into account for the optimization task. Then, (1.1) should be changed to

E[F(x, θ)] → min, x ∈ X.  (1.3)

Common performance measures can be divided into two classes: threshold measures and moment-based measures [8]. The former considers the probability of realizations of F below a certain threshold q. The goal is to find a maximal value

P({F(x, θ) ≤ q}) → max  (1.4)

if minimization of the original objective is required. This amounts to minimizing the frequency of disadvantageous outliers. Statistical moment-based measures, defined by

E[F^k(x, θ)] → min  (1.5)

with k ∈ N, k ≥ 1, demand minimality with respect to the k-th moment (see e.g. [8, 9]). In other words, moment-based measures are concerned with the minimization of the non-central moments of the distribution. Usually the first and the second moments are considered. For a more detailed description, see [8]. It should be noted that for practical applications, the statistical estimates of the moments have to be used.

Before continuing, it is worthwhile to take a closer look at the concept of noise.
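In practice, both measure classes are computed from statistical estimates over repeated runs; a toy sketch of our own (the simulation stand-in and its noise model are illustrative assumptions):

```python
import random

def sample_F(x, n, rng):
    """Draw n noisy performance samples F(x, theta) from a toy simulation."""
    f_x = sum(xi ** 2 for xi in x)              # underlying f(x) (sphere)
    return [f_x + rng.gauss(0.0, 1.0) for _ in range(n)]

def threshold_measure(samples, q):
    """Estimate P(F <= q): the fraction of runs at or below threshold q."""
    return sum(s <= q for s in samples) / len(samples)

def moment_measure(samples, k):
    """Estimate the k-th non-central moment E[F^k]."""
    return sum(s ** k for s in samples) / len(samples)

rng = random.Random(1)
samples = sample_F((1.0, 1.0), n=1000, rng=rng)   # true f(x) = 2
print(threshold_measure(samples, q=2.0))  # roughly 0.5 for symmetric noise
print(moment_measure(samples, k=1))       # roughly E[F] = 2
```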

Noise in general means that the function evaluations are not exact and that disturbances occur, due to measurement errors or stochastic elements in simulations. Instead of the exact f-value at x, only the noisy

F(x) = g(f, x, ε)  (1.6)

can be observed. The variable ε stands for an n-dimensional random variable. A special case of (1.6) is the so-called additive noise

F(x) = f(x) + ε,  (1.7)

which is often considered in the theoretical literature. The additive noise term is commonly assumed to follow a normal distribution with zero mean and standard deviation σ_ε. This case is known as the standard noise model. Figure 1.4 illustrates the effects of standard noise in the case of the sphere f(x) = Σ_{i=1}^{N} x_i². The figure shows the



Fig. 1.4 The influence of noise in the case of the sphere in 2D: (a) no noise, (b) additive noise

isoclines of the function. As the noisy case (Fig. 1.4b) shows, random influences on the function evaluation may lead to strong distortions of the fitness landscape.

Actuator noise, another noise model, is used for situations where the noise overshadows the positions in the search vector x itself:

F(x) = f(x + ε).  (1.8)

This model is not as well explored as the standard noise model, although it is often encountered in practical optimization, for instance, in path planning. In both cases, additive and actuator noise, any evaluation of F at the position x results in a different value. In order to gain better estimates of the expected values, techniques like resampling are commonly applied.

Actuator noise, Eq. (1.8), is strongly related to robustness. In several publications, it is treated as a subproblem of robust optimization. In the case of robust optimization, the parameters (either design/control or environmental) vary naturally. The "noise" is thus not a result of measurement errors, but an intrinsic property of the process itself. In these cases, the goal of the optimization is not to find a single, isolated optimum of the function f, but rather to identify parameter regions (or a solution) in which the solution quality is retained even if (small) variations of the parameters occur. This can be observed in aerodynamic design, where a suitable wing shape should be optimal with respect to varying wind currents.


1.2.1 From Task to Optimization

The function f that shall be optimized usually stems from a modeling process during which a real-world problem is simplified. The modeler captures the objectives of the decision makers, the underlying process with its restrictions, and of course the forces of influence. In terms of optimization, the objective function, the set of restrictions and interdependencies, and the decision variables must be obtained. It may be infeasible to find closed, analytical expressions for the objective, the restrictions, and the interdependencies in all cases. Here, simulation-based optimization or numerical approximations become important.

Modeling in itself is a type of art: While the model should be as simple as possible, it must capture the relevant processes for the optimization adequately. This guarantees that solutions based on this model are transferable to the modeled system. Otherwise, an optimal solution for the model could lead to unexpected or even disastrous effects when it is implemented in reality. It should be noted that the model already represents a first (function) approximation of the real system. Thus, nearly any optimization takes place on a surrogate of the reality. A further challenge is that a real-life system may be represented adequately by several models of different types.

The type of the employed model predetermines the methods that can be applied, and it also affects the computational performance. If f, for example, is an affine-linear function f(x) = c^T x and X = {x ∈ R^n | Ax ≤ b}, the model belongs to the class of linear programming with continuous variables. These problems can be solved by the simplex algorithm or by interior point methods. Optimal solutions can be obtained efficiently even for large-scale problems, provided that they exist. However, not all real-world problems can be represented by affine-linear functions, and not all decision variables are continuous and deterministic. In the case of discrete variables, so-called NP-hard optimization problems are often encountered, even if the objective function and the restrictions remain affine-linear. It remains unknown today whether efficient exact algorithms for these kinds of problems exist. So far, only exponential time complexity could be achieved, see [10, p. 25f].
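For intuition, a tiny affine-linear instance can even be solved by brute-force vertex enumeration, exploiting the fact that an optimum of a bounded LP lies at a vertex of the feasible polyhedron. This is an illustrative sketch of our own; real solvers use the simplex or interior point methods mentioned above.

```python
from itertools import combinations

# Toy LP: minimize c^T x subject to A x <= b, x in R^2.
c = [-1.0, -2.0]                      # minimize -x1 - 2*x2
A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b = [4.0, 3.0, 3.0, 0.0, 0.0]         # x1+x2<=4, x1<=3, x2<=3, x>=0

def intersect(r1, r2):
    """Intersection point of two constraint boundaries (or None)."""
    (a1, b1), (a2, b2) = (A[r1], b[r1]), (A[r2], b[r2])
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None                    # parallel constraints
    x1 = (b1 * a2[1] - a1[1] * b2) / det
    x2 = (a1[0] * b2 - b1 * a2[0]) / det
    return (x1, x2)

def feasible(x):
    return all(sum(ai * xi for ai, xi in zip(row, x)) <= bi + 1e-9
               for row, bi in zip(A, b))

# Enumerate all feasible vertices and pick the one with minimal c^T x.
vertices = [p for i, j in combinations(range(len(A)), 2)
            if (p := intersect(i, j)) and feasible(p)]
best = min(vertices, key=lambda x: c[0] * x[0] + c[1] * x[1])
print(best)  # optimal vertex (1.0, 3.0)
```

Enumeration is exponential in the number of constraints, which is precisely why the simplex and interior point methods matter for large-scale problems.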

Sometimes it is not sufficient to consider only one goal for the optimization process. It may be necessary to take several objectives into account, e.g., product quality and product costs. Furthermore, these goals may be conflicting: Maximizing the product quality usually does not go along with a minimization of the costs. Multi-objective optimization, which aims at the identification of compromise solutions, is generally more difficult than single-objective optimization. Problems that can be solved in polynomial time, that is, problems for which efficient algorithms exist, may become NP-hard when more than one criterion has to be taken into account.

Usually, simulation-based optimization is concerned with finding optimal parameters for the simulation model. The focus of this paper lies on the usage of NC methods in the area of simulation, as described in the next subsection. In contrast to most of the other reviews, we also address controller learning since it offers great potential benefits.


1.2.2 A Brief Classification of Simulation-Based Optimization

To optimize the measure, the simulation must be coupled with suitable optimization methods. Usually, optimization algorithms require many simulation runs since the quality of a parameter setting can only be assessed by executing the simulation model.

Consider, for instance, the case of industrial design, e.g., aerodynamic shape design. The actual shape is determined by a group of variables or, more correctly, by their parameter settings. The effect of a specific shape configuration can only be assessed by conducting simulations, which usually entails time-consuming computational fluid dynamics. Additionally, some influences, as for example wind speed and wind direction, must be assumed to be random. Thus, the problem becomes a stochastic task requiring a multitude of simulations. The goal of the optimization in these types of applications is often the identification of a robust optimum since the solution has to retain its quality under a variety of environmental conditions.

Commonly, simulation-based optimization can be subdivided into three classes, depending on the nature of the search space. An extensive overview of the classes and methods (not NC methods) can be found, for example, in [11]. Here, we only cover the main approaches so that the natural computing methods can be assessed in the general context. The first class contains problems with a discrete and finite search space that contains only a few (<100) solutions. These tasks are classified as ranking-and-selection problems. Two main methodologies can be applied to them, see [12]: frequentist and Bayesian approaches.
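A naive equal-allocation sketch of ranking and selection over a handful of designs might look as follows. This is purely illustrative and of our own making; real frequentist or Bayesian procedures control the probability of correct selection and allocate runs adaptively.

```python
import random

# Hidden true mean performances of three hypothetical designs
# (unknown to the selection procedure; smaller is better).
TRUE_MEANS = {"design_a": 5.0, "design_b": 3.0, "design_c": 4.2}

def simulate(design, rng):
    """One noisy simulation run of a design."""
    return TRUE_MEANS[design] + rng.gauss(0.0, 1.0)

def select_best(designs, runs_per_design, seed=0):
    """Equal allocation: estimate each mean, return the apparent best."""
    rng = random.Random(seed)
    means = {d: sum(simulate(d, rng) for _ in range(runs_per_design))
                / runs_per_design
             for d in designs}
    return min(means, key=means.get)          # minimization

print(select_best(list(TRUE_MEANS), runs_per_design=100))
```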

The following category comprises large or infinite discrete search spaces, an area where heuristics and metaheuristics are often applied. Neither heuristics nor metaheuristics guarantee an optimal solution. Instead, they usually deliver a good but not necessarily optimal solution for a problem. Heuristics are designed for specific problem classes and cannot be transferred easily to other classes. Such a class could be, for example, the traveling salesman problem: the task of finding the shortest or cost-optimal tour visiting a number of given locations. In contrast, metaheuristics can be used for several problem classes. Their general structure remains unchanged, and only minimal aspects must be adapted to the given problem class. However, it should be noted that metaheuristics are usually less efficient than specially adapted heuristics, since they do not rely on explicit problem knowledge. Therefore, they are usually employed whenever fine-tuned problem-specific methods are either unavailable or their development would be too costly. The class of metaheuristics is vast and, aside from natural computing, comprises simulated annealing, tabu search,


iterated local search, and many more. Since the focus here lies on natural computing, the reader is referred to [13] for an overview of further metaheuristics.
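As a concrete example of a metaheuristic from the list above, a minimal simulated annealing sketch for a toy discrete problem; the neighborhood, cooling schedule, and parameters are illustrative choices of our own, not from the book:

```python
import math
import random

def sphere(x):
    """Toy discrete objective: sphere over integer vectors."""
    return sum(xi ** 2 for xi in x)

def simulated_annealing(x0, t0=10.0, cooling=0.995, steps=5000, seed=11):
    rng = random.Random(seed)
    x, fx, t = list(x0), sphere(x0), t0
    for _ in range(steps):
        neighbor = list(x)
        i = rng.randrange(len(x))
        neighbor[i] += rng.choice((-1, 1))      # random neighbor move
        f_n = sphere(neighbor)
        # Accept improvements always; deteriorations with Boltzmann prob.
        if f_n < fx or rng.random() < math.exp((fx - f_n) / t):
            x, fx = neighbor, f_n
        t *= cooling                            # cool down
    return x, fx

x_best, f_best = simulated_annealing([7, -6, 4])
print(x_best, f_best)
```

The same skeleton applies to many discrete problems: only the objective and the neighbor move need to be exchanged, which is what makes it a metaheuristic rather than a problem-specific heuristic.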

Continuous search spaces represent the final category. Here, gradient-based methods (which approximate the gradient by using finite differences), direct search methods, stochastic approximation algorithms, or other numerical optimization methods can be applied, see [14] for an introduction. Direct search methods or zero-order methods only make use of function evaluations. They are applied when further information, gradient or Hessian, is unobtainable. The class of direct search methods for continuous optimization is vast and comprises, for instance, methods like Hooke-and-Jeeves, Nelder-Mead (the downhill simplex method), simulated annealing, and natural computing.

References

1. Rozenberg, G., Bäck, T., Kok, J.N. (eds.): Handbook of Natural Computing. Springer (2012)

2. Read, M., Andrews, P.S., Timmis, J.: An introduction to artificial immune systems. In: Rozenberg et al. (eds.) [1], pp. 1575–1597

3. Kari, L., Seki, S., Sosík, P.: DNA computing—foundations and implications. In: Rozenberg et al. (eds.) [1]

7. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. Natural Computing Series. Springer, Berlin (2003)

8. Beyer, H.G., Sendhoff, B.: Robust optimization—a comprehensive survey. Comput. Methods Appl. Mech. Eng. 196(33–34), 3190–3218 (2007). https://doi.org/10.1016/j.cma.2007.03.003

9. Beyer, H.G., Sendhoff, B.: Functions with noise-induced multimodality: a test for evolutionary robust optimization—properties and performance analysis. IEEE Trans. Evol. Comput. 10(5), 507–526 (2006)

10. Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundations and Applications. Morgan Kaufmann (2005)

11. Fu, M.C. (ed.): Handbook of Simulation Optimization. Springer (2015)

12. Hong, L.J., Nelson, B.L.: A brief introduction to optimization via simulation. In: Winter Simulation Conference, WSC '09, pp. 75–85. Winter Simulation Conference (2009). http://dl.acm.org/citation.cfm?id=1995456.1995472

13. Michalewicz, Z., Fogel, D.B.: How to Solve It: Modern Heuristics, 2nd edn. Springer (2004)

14. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (1999)


Chapter 2

Natural Computing and Optimization

Abstract This chapter introduces the area of natural computing. The field encompasses a multitude of classes, ranging from evolutionary algorithms to firefly techniques. The present chapter focuses on a selected set of established techniques: evolutionary algorithms, swarm-based methods, and neuroevolution. The first two classes are mainly used for parameter optimization (with the exception of genetic programming, a specific type of evolutionary algorithm), whereas the third class is applied for learning the structures of controllers. As such, the methods selected illustrate the main concepts of natural computing and serve to show the broadness of the application areas. The last part of the chapter is devoted to multi-objective optimization, an important task in practice which is often solved with natural computing.

2.1 Evolutionary Algorithms

Evolutionary algorithms (EAs) implement principles from natural evolution. From the optimization perspective, they can be seen as population-based stochastic or randomized optimization algorithms. Evolutionary algorithms comprise several subtypes (see e.g. [1–3]). This section describes briefly the basic concepts of the most established EAs before presenting some additional instances.

Figure 2.1 shows the general structure of an evolutionary algorithm. The two fundamental principles of an EA are reproduction and selection, based directly or indirectly on the so-called fitness. Reproduction is the process of deriving new candidate solutions based on existing ones. Selection steers the evolutionary process by picking favorable candidate solutions based on their fitness. To assess the fitness of a candidate solution, a fitness function is employed. The function may be the objective function itself or a derived function that can be more easily evaluated and used in the algorithm. It implies the objective of the optimization by measuring the quality of candidate solutions. Therefore, depending on the application, it may either be an analytical function or a performance measure which depends on the outcome of simulations. Since this function is used to evaluate and compare the population members, it must be chosen carefully, as the evolutionary pressure, a main driving force of the progress, depends on it.
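The evaluate-select-vary loop described above can be condensed into a short sketch. This is a toy EA of our own with illustrative operator choices (intermediate recombination, Gaussian mutation, elitist survivor selection), minimizing the sphere function; it is not an algorithm taken from the book.

```python
import random

def fitness(x):
    """Sphere function: smaller is better."""
    return sum(xi ** 2 for xi in x)

def evolve(dim=5, mu=20, generations=100, seed=3):
    rng = random.Random(seed)
    # Initial parent population of mu random candidate solutions.
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(mu):
            p1, p2 = rng.sample(pop, 2)                     # mating selection
            child = [(a + b) / 2 for a, b in zip(p1, p2)]   # recombination
            child = [g + rng.gauss(0, 0.3) for g in child]  # mutation
            offspring.append(child)
        # Survivor selection: keep the best mu of parents + offspring.
        pop = sorted(pop + offspring, key=fitness)[:mu]
    return pop[0]

best = evolve()
print(fitness(best))   # near zero after a few thousand evaluations
```

Note that every offspring evaluation here is a cheap function call; in simulation-based optimization each call would be one or more simulation runs, which is what drives the cost considerations discussed in Chapter 1.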




Fig. 2.1 The general working principle of evolutionary algorithms [1]

The population of potential solutions forms the core of the algorithm. The so-called parent population contains candidate solutions for the given task. It also provides the basis for creating new tentative solutions: the offspring. Following common practice in the NC literature, candidate solutions are simply called solutions in the remainder of the paper. This does not mean that they are optimal solutions.

First, a subset of the parent population is chosen. This parent or mating selection can be implemented in several ways, depending on the evolutionary algorithm and on the application. Stochastic selection is very common: Each individual i is assigned a selection probability p_i ≥ 0. To realize random draws based on the probabilities, techniques like roulette-wheel selection or stochastic universal sampling are applied. Often, stochastic universal sampling is preferred due to its better statistical properties, see e.g. [1, p. 61f]. The selection probability p_i of an individual i is usually determined based on the fitness function values f_i. For example, the selection probability of the i-th individual may be defined as

p_i = (2 − s)/μ + 2i(s − 1)/(μ(μ − 1)).

The parameter s ∈ [1, 2] controls the selection pressure [1]. The techniques introduced above require the evaluation of the whole population, a task that may be inefficient if the population is large and the evaluation is coupled with simulations.
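The selection probabilities above and stochastic universal sampling can be sketched as follows. This is our own illustrative implementation; it assumes individuals are sorted from worst (rank i = 0) to best (rank i = μ − 1).

```python
import random

def ranking_probabilities(mu, s):
    """Selection probabilities p_i = (2-s)/mu + 2i(s-1)/(mu(mu-1))."""
    return [(2 - s) / mu + 2 * i * (s - 1) / (mu * (mu - 1))
            for i in range(mu)]

def sus(probabilities, n, rng):
    """Stochastic universal sampling: n equally spaced pointers, one spin."""
    pointers = [(rng.random() + k) / n for k in range(n)]
    chosen, cumulative, i = [], 0.0, 0
    for p in pointers:                  # pointers are already in order
        while cumulative + probabilities[i] < p:
            cumulative += probabilities[i]
            i += 1
        chosen.append(i)
    return chosen

probs = ranking_probabilities(mu=5, s=1.5)
print(probs, sum(probs))          # probabilities sum to 1
rng = random.Random(0)
print(sus(probs, n=4, rng=rng))   # indices of the selected parents
```

Because all pointers come from a single random spin, SUS produces selection counts that deviate less from the expected values than independent roulette-wheel draws, which is the statistical advantage mentioned above.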

Another very popular form is tournament selection. In tournament selection, population members are randomly chosen for a tournament, that is, for a comparison of their fitness. The winner (or winners) of the tournament are then used in the creation of the offspring. Since the tournament size is usually small compared to the population size, this selection type is usually more efficient.

The operators that create the offspring are called variation operators. Two processes are performed: recombination and mutation, usually applied in that order. Recombination is an n-ary operator that combines traits of two or more parents. In genetic algorithms, the term crossover is more common; it refers to combining the properties of two parents using cut and paste to create two offspring. The more general recombination may involve more than one parent and may create just one offspring. The importance of recombination differs among the EA variants: It is the main search operator in genetic algorithms, whereas it is not used in evolutionary programming. Recombination may be coupled with a stochastic decision whether to perform recombination or not. This is often the case in genetic algorithms and in genetic programming.

The result of recombination is then mutated. Mutation is a unary operator that randomly changes some traits of an individual. The significance of mutation varies for different types of EAs: While it is only a background operator in genetic algorithms, it is the sole variation operator in evolutionary programming. As in the case of recombination, there may first be a stochastic decision whether a specific offspring should be mutated or not.

After the offspring population is created, the new population has to be determined. This survivor selection is organized in various forms. There are EAs (e.g. some evolution strategies) which discard the old parent population and deterministically take the best μ of the offspring. In contrast, genetic algorithms often swap only part of the population with new solutions. Again, the selection may be deterministic or stochastic and ranges from rank-based selection through fitness-based selection to tournament selection.

An evolutionary algorithm terminates when a predefined stopping condition is satisfied. This may be the computing time or the number of fitness evaluations. This type of condition is usually coupled with criteria that consider the search progress of the EA: If, for instance, the search stagnates for several generations, the EA may terminate although the time resources were not exhausted.

It should be noted that there are usually two phases in natural search: an exploration phase and an exploitation phase. During exploration, the algorithm, i.e., the population, explores the search space, spreading the population members in the space. Exploration usually occurs at the beginning of a run, since information on good regions in the search space is sparse. As the search continues, more and more information becomes available. At a certain point, which depends on the task, the algorithm should therefore switch to the exploitation phase and converge into good regions of the search space. Both processes must be carefully balanced: a too long exploration phase may waste scarce computing resources, whereas a premature end may result in suboptimal solutions. A lot of work in natural computing addresses mechanisms to control these phases either implicitly or explicitly. It should be mentioned that, for example, restart algorithms may switch from exploitation back to exploration. Successful variants, e.g., the BIPOP-CMA-ES, restart the run after convergence with an increased population and therefore a greater potential for exploration [5, 7].

Fig. 2.2 Concepts used in genetic algorithms and their real-world counterparts

Genetic algorithms (GAs) are among the earliest evolutionary algorithms and are probably the best known. They were invented in the 1960s [8, 9] by Holland to analyze the behavior of adaptive systems and to serve as simple models of evolution. Their original form operated on bit strings, whereas today GAs comprise various forms and application areas. They are perhaps the most diverse group of all EAs, used for discrete and combinatorial optimization as well as for continuous optimization problems. In the case of simulation-based optimization, they are among the most commonly used methods. Genetic algorithms differentiate between a so-called genotype and phenotype. The latter encodes the candidate solution for evaluation purposes, whereas the operations of the GA (recombination, mutation) are performed on the genotype. Figure 2.2 illustrates this concept: elements of the genotype space are translated to phenotypes that can be evaluated and mapped to fitness values. Elements of the phenotype space correspond to real-world solutions from the actual problem domain. The fitness space relates to the real-world optimization goals, capturing the objectives of the given problem. The genotype space has no real-world equivalent; instead, it is an abstraction that enables more efficient searching by using data structures that can be modified and combined easily.


choose one similar to the phenotype (2, 3, 1, 4), that is, a vector where the i-th entry denotes the i-th city of the tour. The same tour, however, could be captured using random keys (3.1, 1.8, 2.3, 3.9), a vector where the i-th entry encodes the relative position of city i in the tour, i.e., the genotype is decoded by sorting its entries and ordering the corresponding cities accordingly. Further representations exist as well. The genotype-phenotype mapping has to be chosen carefully, since variation operates on the genotype, and small changes in the genotype should result in small changes

of the phenotype and the fitness. Therefore, the representation or the genotype-phenotype mapping (see [1] for a more detailed discussion) must be tailored to the problem that shall be solved.

The choice of the genotype determines the available variation operators. On bit strings, recombination can be organized as a 1-point crossover, where the two genomes cross and break at a randomly chosen point (see Fig. 2.3). The first offspring then takes the first part of the genome from the first parent and the second part from the other parent, whereas the second offspring uses the remaining parts. In contrast, recombining real-valued genomes may be realized by computing a weighted mean of the parents in each entry. Similarly, mutation on bit strings may be implemented as a bit flip coupled with a certain mutation probability, whereas for a real-valued representation a normally distributed random variable may be added to an offspring. These are just two examples of the large group of variation operators that have been introduced. These operators must be chosen adequately with respect to the genotype and to the application task. Even if the genotype is fixed, several choices typically remain. Consequently, a large number of these operators exist. Consider for example a permutation genotype, i.e., the first representation of the TSP above. In this case, [1, p. 216f] lists nine recombination types alone. Each form of recombination aims to preserve some characteristics which are assumed to be beneficial for solving the problem. For example, in order crossover, two indices i and j, i < j, are chosen. Between these indices, the entries of the first parent are copied into the first of the two offspring. The rest is filled with feasible components of the second parent: entries are acceptable if they are not already part of the offspring; entries already present are skipped. The remaining entries are copied into


the offspring, starting after j and circling back to the beginning.1 In this manner, the relative ordering of entries of the second parent is preserved, as far as possible. For the second offspring, the parents switch their roles. As seen, order crossover tries to retain the relative order of the elements (a should come after b) but not the absolute order of the elements (a should be first, b second). An operator which aims at preserving the absolute order is cycle crossover.
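The order crossover just described can be sketched in a few lines; this is an illustrative implementation (function and variable names are our own), assuming the cut points i, j delimit a half-open segment [i, j) and are passed in as arguments:

```python
def order_crossover(p1, p2, i, j):
    """Order crossover (OX) for permutation genotypes: the segment p1[i:j]
    is copied into the offspring, the remaining slots are filled with the
    entries of p2 in the order in which they appear, starting after j and
    wrapping around (the genotype is treated as a ring)."""
    n = len(p1)
    child = [None] * n
    child[i:j] = p1[i:j]                       # copy the segment of parent 1
    used = set(p1[i:j])
    # Entries of p2 that are already part of the offspring are skipped.
    fill = [p2[k % n] for k in range(j, j + n) if p2[k % n] not in used]
    slots = [k % n for k in range(j, j + n) if child[k % n] is None]
    for slot, city in zip(slots, fill):
        child[slot] = city
    return child

# Example: cut points i=2, j=5 on two 8-city tours.
child = order_crossover([1, 2, 3, 4, 5, 6, 7, 8],
                        [8, 7, 6, 5, 4, 3, 2, 1], 2, 5)
# -> [7, 6, 3, 4, 5, 2, 1, 8]
```

For the second offspring, the same function is called with the parents swapped.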

Common to all GAs is that mutation is used only as a background operator, giving the algorithm the chance to cover the whole search space. The main search operator is recombination; mutation is therefore applied only with a small probability. Selection in GAs varies widely. This concerns the parent selection as well as the survivor selection.

A special form of GAs are real-coded genetic algorithms (RCGAs), see e.g. [2, 10, 11], which use a real-valued representation and specialized recombination operators. Mutation is not present, but it should be noted that the properties of the special recombination operators are very similar to those of the mutation operators in evolution strategies [12]. Two main classes of crossover variants can be distinguished: parent centric, including, e.g., versions like blend crossover [13], simulated binary crossover [10], or parent centric crossover (PCX) [14], and mean centric crossover, represented by, e.g., unimodal normal distribution crossover [15] and simplex crossover [2]. While the former create the offspring close to the selected parents, the latter take the mean or the centroid as the basis to spawn the new candidate solutions. Perhaps the

simplest crossover type in RCGAs is blend crossover (BLX-α) [13], which shall serve as an example. It is realized by choosing two parents x_1 and x_2. The two offspring, x′_1, x′_2, are then created component-wise as

x′_{1,j} = x_{1,j} + γ_j (x_{2,j} − x_{1,j})
x′_{2,j} = x_{2,j} − γ_j (x_{2,j} − x_{1,j})   (2.1)

with γ_j ∼ U(−α, 1 + α). The parameter α controls the extent of the change caused by the uniform random variables γ_j. The component-wise difference between the parents, and therefore the parent diversity, controls the spread of the offspring. Exploration beyond the borders defined by the two parents is enforced by α. We will see that the concept of using the distribution of the (good) parent solutions to create the offspring is also applied in other variants of natural computing.
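Equation (2.1) translates directly into code; the following sketch uses one γ_j per component, shared by both offspring as in the equation, and an illustrative default for α:

```python
import random

def blend_crossover(x1, x2, alpha=0.5):
    """BLX-alpha following Eq. (2.1): per component, gamma_j is drawn
    uniformly from (-alpha, 1 + alpha), so offspring components may lie
    beyond the interval spanned by the parents, enforcing exploration."""
    y1, y2 = [], []
    for a, b in zip(x1, x2):
        gamma = random.uniform(-alpha, 1.0 + alpha)
        y1.append(a + gamma * (b - a))   # x'_{1,j}
        y2.append(b - gamma * (b - a))   # x'_{2,j}
    return y1, y2

o1, o2 = blend_crossover([0.0, 0.0], [1.0, 1.0], alpha=0.5)
```

With these parents and α = 0.5, every offspring component lies in (−0.5, 1.5), illustrating how α stretches the sampling interval beyond the parents.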

Evolution strategies (ESs) were invented by Rechenberg, Schwefel, and Bienert [16, 17]. Interestingly, their first application area was discrete optimization. Today, however, they are predominantly used in continuous optimization and are seen as efficient metaheuristics for this area, as several studies have revealed [18, 19].

1 The genotype is treated as a ring.

Evolution strategies operate directly on the search space using N-dimensional vectors.

Mutation is the main search operator. Indeed, all ESs employ mutation, while there are ES types that do not use recombination [20]. If recombination is used, it is usually performed by choosing ρ of the μ parents uniformly at random to create an offspring. The recombination is then realized by computing the average of the selected parents. This type is called intermediate recombination. Another form exists, termed dominant or discrete recombination, where for each component of the recombinant one parent is chosen at random and its respective entry is copied into the offspring. Dominant recombination, however, is seldom used for continuous optimization. Often, ρ = μ, in which case all parents contribute and the offspring only differ after mutation has taken place. The recombination can be realized as a weighted average or convex sum

⟨x⟩ = Σ_{m=1}^{ρ} w_m x_m,  with w_m ≥ 0 and Σ_{m=1}^{ρ} w_m = 1.

The recombinant is then mutated,

x_l = ⟨x⟩ + σ N_l(0, C),  l = 1, …, λ,

resulting in λ offspring, which are then evaluated with the fitness function. The random variable σN(0, C) has zero mean and covariance matrix σ²C. A very important

task in evolution strategies is an appropriate adaptation of the covariance matrix. This is in contrast to genetic algorithms, which usually operate with a constant mutation probability. The extent of the changes must be adapted to the fitness landscape, that is, the step size σ and the mutation directions C must fit the form of the area the population is currently in. If this is not the case, the ES may perform inefficiently or may not converge to good solutions at all. Since this is a very critical task, research on adaptive and self-adaptive methods has a long history in ESs (see e.g. [21, 22]). Today, covariance matrix adaptation (CMA), which estimates the covariance matrix given the search history and the present population, is usually seen as the state of the art. The best known version, the CMA-ES, stems from Hansen, Ostermeier, and Gawelczyk (see [23] for more details). More recently, Beyer and Sendhoff [24] introduced the CMSA-ES (covariance matrix self-adaptation evolution strategy), which performs comparably. Concerning survivor selection, evolution strategies follow deterministic schemes. There are two main types: comma- and plus-selection. Comma-selection discards the old parent population and takes the μ best of the λ offspring. Therefore, λ > μ is necessary. Plus-selection takes the μ best individuals from the old parent and the offspring populations. Here, λ < μ is possible, and a fit individual may persist for a long time.
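As a rough illustration of this loop, the following sketch implements a (μ/μ, λ)-ES with intermediate recombination over all parents and comma-selection. It deliberately keeps σ fixed and C = I, whereas a practical ES would adapt both (e.g., via CMA or CMSA); all parameter values and names are illustrative:

```python
import random

def comma_es(fitness, n, mu=3, lam=12, sigma=0.1, generations=50):
    """Minimal (mu/mu, lambda)-ES sketch for minimization with a fixed,
    isotropic mutation distribution (no covariance or step-size adaptation)."""
    parents = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(mu)]
    for _ in range(generations):
        # Intermediate recombination: centroid of all mu parents (rho = mu).
        centroid = [sum(p[j] for p in parents) / mu for j in range(n)]
        # Gaussian mutation creates lambda offspring around the recombinant.
        offspring = [[c + random.gauss(0.0, sigma) for c in centroid]
                     for _ in range(lam)]
        # Comma-selection: the mu best offspring; old parents are discarded.
        offspring.sort(key=fitness)
        parents = offspring[:mu]
    return min(parents, key=fitness)

best = comma_es(lambda x: sum(v * v for v in x), n=3)
```

On the sphere function used here, the population contracts towards the optimum; without step-size adaptation, however, progress stalls once the distance to the optimum is of the order of σ.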

Natural (gradient) evolution strategies (NESs) [25] are at the boundary between estimation of distribution algorithms and evolution strategies. They operate with

an explicit probability model, a normal distribution, the parameters θ of which are chosen so that the expected fitness

J(θ) = ∫ f(x, θ) τ(x, θ) dx,   (2.4)

with τ(x, θ) the probability density function of x given θ, is optimized. Potentially,

the maximization problem could be solved via stochastic gradient ascent with a comparatively simple update rule for the statistical parameters. However, problems such as slow convergence or prematurely reduced step sizes were encountered. Therefore, several changes were introduced. First, fitness shaping strengthens the influence of higher-quality solutions. Several transformations are possible; the main requirement is that they respect the monotonicity of the original fitness function. Typically, ranking-based transformations are recommended. It should be noted that fitness shaping is introduced based on empirical evidence and not on theory. Second, natural gradients replace the "normal variant". Natural gradients were first considered by Amari [26] for learning in artificial neural networks. Their usage in NESs postulates that the step taken in the parameter space should not only optimize the expected fitness but should also cause the resulting distribution to remain close to the previous distribution. Natural evolution strategies can be interpreted as model-based stochastic optimization methods which additionally implement some heuristic components.

Differential evolution (DE) is another EA type which is predominantly used for continuous optimization, see, e.g., [27, 28] for an introduction. Like ESs, it operates on N-dimensional vectors. Among others, DE differs from most EAs in the variation order: first, mutation is performed, followed by recombination. Differential evolution was introduced in 1995 by Storn and Price [29]. It considers distance vectors between population members as a foundation for the changes. The distance vectors indicate the population diversity: they are large when the population is spread throughout the space, and smaller when the population converges. Using distance vectors, differential evolution can potentially traverse between different local optimizers as long as there are population members in their respective vicinity. As stated by Storn, differential evolution has the ability for contour matching; thus, the population can adapt to various forms of the fitness landscape, see [30]. To create an offspring, a parent vector (target vector) x_l is chosen at random from the population. The following mutation process combines traits of several population members.


First, one basis vector x_i is chosen, to which the weighted distance vector between two other members x_j, x_k is added, i.e., the mutant reads x_i + β(x_j − x_k). The members are chosen at random from the population, excluding the parent vector. The factor β ∈ (0, 1) is a control parameter whose value is set before starting the optimization run. Later, further mutation types were introduced. The changes may concern all components of the mutation: for example, the weighting factor, which may be chosen at random. Further changes concern the basis vector, which, e.g., may be chosen as the best individual of the population, and of course the distance vectors themselves. The mutation result is then recombined with the parent member by choosing the entries at random from both candidates. The recombination guarantees that the parent is not reproduced, i.e., at least one component is exchanged. The classical DE then performs a pairwise comparison between offspring and parent: the better one survives and enters the next population. Other DE types create a pool of offspring solutions and take the best candidates of the parent and offspring populations. Differential evolution is an efficient EA, although it is seen as not robust if the evaluations are overlaid by noise, i.e., random perturbations [31, 32].
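A single classic DE variation step, mutation followed by recombination with one forced component exchange, might be sketched as follows; the control parameters beta and cr and all names are illustrative:

```python
import random

def de_trial(population, l, beta=0.8, cr=0.5):
    """One classic DE variation step for target vector population[l]:
    the mutant is a basis vector plus the weighted distance vector of two
    other members; recombination then mixes mutant and target, exchanging
    at least one component so the parent is never reproduced."""
    n = len(population[l])
    # Three distinct members, none of them the target vector.
    i, j, k = random.sample([m for m in range(len(population)) if m != l], 3)
    xi, xj, xk = population[i], population[j], population[k]
    mutant = [xi[d] + beta * (xj[d] - xk[d]) for d in range(n)]
    forced = random.randrange(n)        # guarantees one exchanged component
    return [mutant[d] if (d == forced or random.random() < cr)
            else population[l][d] for d in range(n)]

pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
trial = de_trial(pop, 0)
```

In the classical scheme, the trial vector would then replace `pop[0]` only if its fitness is at least as good.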

Genetic programming (GP) was developed in the 1990s, see [33, 34]. It can be seen as a subtype of a GA which uses a special representation [1], usually a program tree. Its first application area was the evolution of computer programs. Today, GP is used in machine learning, data mining, or robot control. Instead of optimizing parameters, it can be seen as optimizing the form of a model or a function that shall solve a certain task. Usually, GP operates on syntax trees where the inner nodes encode functions or operators and the leaves implement constants or input variables. Due to the tree form, GP uses specially adapted recombination and mutation methods. Recombination or crossover combines parts of two trees and has strong variational effects usually more often associated with mutation. This has two consequences: mutation is used only with a very small probability, and modern GP types preserve the well-performing members of the parent population in order to safeguard against possible detrimental crossover effects. Genetic programming operates with large populations with over 500 members; tournament selection is thus very common. A well-known problem encountered in genetic programming is bloat, an exponential increase of the tree size during the course of the run. Usually, the excess parts are neutral with respect to the function that is learned and can be compared to the non-coding parts of genomes. There are many theories, some contradicting each other, concerning this phenomenon, its effects, and its causes. The large program size causes a slower execution of the code; therefore, methods exist which either try to limit the tree size an individual can reach or remove non-coding parts from the tree. However, the non-coding parts may have benefits, as discussed e.g. in [33, 35]: mutation can be a disruptive event which destroys important parts of the structure, but if it occurs in parts of the tree which have been neutral so far, its effects may be dampened.


2.2 Swarm-Based Methods

Swarm-based methods consider the swarming behavior of animals as the foundation for the algorithms. Two methods are probably best known: ant colony optimization, which is inspired by ant behavior, and particle swarm optimization, which is modeled to mimic the behavior of bird swarms or fish schools. Both were introduced in the 1990s and progressed fast from first academic investigations to industrial use. Further methods mimic the behavior of bees or wasps and are used, e.g., for routing tasks in networks. In the following, the two best known methods are discussed.

Ant colony optimization (ACO) was introduced in the 1990s by Dorigo and others for the traveling salesman problem (TSP). While it is not among the best approaches for tackling the TSP, it is one of the best for vehicle routing problems and has been applied successfully to other routing problems as well as to scheduling and assignment problems [36]. The working principle is best explained using the TSP. An ACO is a constructive metaheuristic, i.e., each member of the population (ant) constructs a candidate solution from scratch by moving on the construction graph which represents the problem. In the case of the TSP, the construction graph is the TSP graph itself. In other applications, it may be more difficult to find a good graph representation for the problem. Ant colony optimization has two phases: during the first phase, the ants construct candidate solutions by moving on the construction graph, connecting the components of the solution. At each node of the graph, the ant has to make a stochastic decision which edge it should take. The decision is influenced by the artificial pheromone trails τ of the ants and by problem-specific information η which provides guesses which components may be beneficial for the solution. The probability that an ant k, which is currently at the node i, moves towards an adjacent node j (and thus adds the component or the edge v_ij to the solution) follows

p_ij^k = τ_ij^α η_ij^β / Σ_{l ∈ N(i)} τ_il^α η_il^β,  j ∈ N(i),

in most cases. The symbol N(i) denotes all admissible adjacent nodes j for the ant k in the construction graph. In the case of the TSP, the distance between cities is usually taken into account as problem-specific information; the so-called heuristic information can then be encoded as η_ij = 1/d_ij, with d_ij the distance between i and j.

After the ants have constructed their individual solutions, the algorithm switches to the second phase, the pheromone update. First, a process called evaporation is started, during which the pheromone trails on the edges are decreased. This enables the ACO to forget bad solutions over time. Afterwards, the ants deposit pheromone on the edges they have visited. The pheromone amount is proportional to the quality


of the solution the ant has built. The more ants have used an edge, and the better the overall solution to which the edge belongs, the more pheromone is deposited. Such edges become more attractive in the following iterations, leading to the deposition of more pheromone, which in turn increases the attractiveness again. Such processes are termed autocatalytic and are one of the main working principles of ACO. Several ACO methods have been developed. They differ in various points: some only allow the best ant(s) to deposit pheromone, others allocate extra amounts to the best ant, and some use a process called local pheromone evaporation, in which ants remove pheromone after they have used an edge in order to enforce exploration. For an introduction to ACO see [36].
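The stochastic edge choice of the construction phase can be sketched as a roulette-wheel selection over the admissible neighbours; the weighting exponents alpha and beta and the nested-dict data layout are illustrative assumptions:

```python
import random

def next_node(i, allowed, tau, eta, alpha=1.0, beta=2.0):
    """Stochastic edge choice of an ant at node i: the probability of
    moving to a node j in `allowed` (e.g., the cities not yet visited) is
    proportional to tau[i][j]**alpha * eta[i][j]**beta."""
    weights = [tau[i][j] ** alpha * eta[i][j] ** beta for j in allowed]
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for j, w in zip(allowed, weights):
        acc += w
        if r <= acc:
            return j
    return allowed[-1]          # numerical safety net

tau = {0: {1: 1.0, 2: 1.0}}                  # pheromone trails
eta = {0: {1: 1.0 / 2.0, 2: 1.0 / 8.0}}      # heuristic info eta_ij = 1/d_ij
```

With equal pheromone, the ant strongly prefers the nearer city 1 here; as pheromone accumulates on good edges, the τ term increasingly dominates the decision.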

Particle swarm optimization (PSO) is typically used for continuous search spaces. It is modeled after the swarming behavior of birds or fish. In its simplest form, the particles move through the search space by updating their velocity vector v_i(t), which indicates direction and extent of the movement (see Fig. 2.4). The update considers information of the swarm and of the search history. Usually, there are three main components:

v(t + 1) = ωv(t) + c₁r₁ · (x̂(t) − x(t)) + c₂r₂ · (y(t) − x(t))   (2.8)

The first component is the old velocity v(t), which is included as a momentum term to

safeguard against abrupt changes and to enable the swarm to leave the boundaries of the initial region. The second component, c₁r₁ · (x̂(t) − x(t)), is called the social component. It gives the particle the tendency to move towards the current best member x̂(t) of the swarm. This contribution is combined with stochastic influences c₁r₁ enforcing exploration. The symbol · denotes a component-wise multiplication. At this point, the swarm resembles a multi-point stochastic hill climber: ignoring the old velocities, all members of the swarm would move towards the current best solution. However, a particle also considers information from its own search history in c₂r₂ · (y(t) − x(t)). The cognitive component gives the particle the tendency to return to the best point y(t) it has found so far. Again, stochastic influences are present.

The PSO described above is the original form of the so-called global best PSO, since the best individual is determined using all swarm members. There are also types of PSO which consider local neighborhoods of particles. Here, several different topologies are in use. The simplest and oldest local topology is the ring, where each particle is connected to k neighbors. The neighborhoods overlap, allowing a slower propagation of a good position through the swarm. Other topologies include lattice-like structures or combine well-connected clusters with sparse connections between the different clusters. The connectivity determines the convergence speed: fully connected structures converge faster, whereas sparser structures exhibit longer


Fig. 2.4 Particles moving through a search space using a momentum component (blue), a social component (red), and a cognitive component (green)

exploration phases. Therefore, the former are used in unimodal optimization, whereas the latter are typically applied when the problem is assumed to be multimodal. Nearly all PSO approaches determine the neighborhood based on the particle indices and not on the distance in search or function space. The reasons for this are twofold: first, determining the pairwise Euclidean distance between all swarm members increases the computational burden considerably. Second, local neighborhoods in the search space would also keep information concerning good solutions contained there. Particles farther away would incorporate the information belatedly. The swarm would therefore have the tendency to compare mainly the solution quality in local search space neighborhoods and operate similarly to parallel local search procedures.

Particle swarm optimization is also quite efficient, operating with swarm sizes of 10–30 individuals. Over the years, a lot of variants have been developed; the reader is referred to [37] for an overview. The velocity update equation (2.8) provides an example with the inertia weight ω, ω > 0. Inertia weights are one means to deal with a problem that was encountered early in PSO research: the velocity vectors showed a strong tendency to increase, resulting in large positional changes of the particles. Other common methods include using a constriction factor χ, χ > 0.
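A single velocity and position update according to Eq. (2.8) may be sketched as follows; the parameter values are illustrative:

```python
import random

def pso_step(x, v, y, x_hat, omega=0.7, c1=1.5, c2=1.5):
    """One global-best PSO update per Eq. (2.8): momentum term, social
    pull towards the swarm best x_hat, and cognitive pull towards the
    particle's personal best y. Fresh r1, r2 are drawn per component."""
    new_v, new_x = [], []
    for d in range(len(x)):
        r1, r2 = random.random(), random.random()
        vd = (omega * v[d]
              + c1 * r1 * (x_hat[d] - x[d])   # social component
              + c2 * r2 * (y[d] - x[d]))      # cognitive component
        new_v.append(vd)
        new_x.append(x[d] + vd)               # position update
    return new_x, new_v

x, v = pso_step(x=[0.0], v=[0.0], y=[1.0], x_hat=[1.0])
```

Without an inertia weight ω < 1 (or a constriction factor), repeated updates of this kind tend to let the velocities grow, which is exactly the problem discussed above.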


2.3 Neuroevolution

Artificial neural networks (ANNs) are simple models of brains, from the biological viewpoint. From the mathematical sciences, they can be interpreted as non-linear function approximators. Their building blocks are neurons, which receive, process, and propagate signals. Usually, a neuron receives inputs from connected neurons via weighted links, aggregates and transforms the information, and passes it on to other neurons. A neural network is able to learn by adjusting the weights or by altering the network's topology, for example by introducing new neurons or by changing the connections between neurons. Traditionally, research in ANNs focused on weight learning methods, with relatively few approaches, as for example optimal brain surgeon, optimal brain damage, or cascade correlation, addressing the question of improving the network structure. Neural networks are also often coupled with evolutionary algorithms, for instance in the areas of reinforcement learning or computational intelligence in games. Here, two main directions can be distinguished: either the EA substitutes traditional weight learning techniques, or it additionally changes the network topology. In both cases, the question of representation arises. Very common are direct representations, that is, the EA operates on a more or less straightforward representation of the structure or the weights of the network. Other representations, e.g., developmental representations, are also used, see e.g. [38]. In the case of the first main application area, the learning and generalization capabilities of the network depend on the structure defined by the user. If the network is too small, it cannot solve the task. If the network is too large, it is prone to overfitting, i.e., overadapting to the training set. If the user is able to specify an appropriate structure beforehand, however, the learning task is easier. Weight learning is often performed with ESs and PSO, see [39] for an example.
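As a toy illustration of the first direction, direct weight evolution on a fixed topology, the following sketch evolves the weight vector of a tiny network with a simple (1+1)-style elitist scheme. The network size, the regression task, and all parameters are our own illustrative assumptions, not taken from [39]:

```python
import math
import random

def mlp(w, x):
    """Tiny fixed-topology network (1 input, 2 hidden tanh units, 1 linear
    output); its 7 weights form the direct genotype evolved below."""
    h1 = math.tanh(w[0] * x + w[1])
    h2 = math.tanh(w[2] * x + w[3])
    return w[4] * h1 + w[5] * h2 + w[6]

def evolve_weights(loss, n=7, sigma=0.2, generations=200):
    """(1+1)-style weight evolution: Gaussian mutation of the genotype,
    keeping the better of parent and offspring (plus-selection)."""
    parent = [random.uniform(-1.0, 1.0) for _ in range(n)]
    best_loss = loss(parent)
    for _ in range(generations):
        child = [wi + random.gauss(0.0, sigma) for wi in parent]
        child_loss = loss(child)
        if child_loss <= best_loss:
            parent, best_loss = child, child_loss
    return parent

# Illustrative task: fit sin(x) on [-1, 1] from sampled points.
samples = [(x / 10.0, math.sin(x / 10.0)) for x in range(-10, 11)]
loss = lambda w: sum((mlp(w, xi) - ti) ** 2 for xi, ti in samples)
weights = evolve_weights(loss)
```

The genotype here is simply the flat weight vector; no gradient information is used, which is what makes such schemes attractive for reinforcement learning settings where gradients are unavailable.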

Other approaches, called topology and weight evolving artificial neural networks (TWEANNs), change the structure of the network. An example for this class is the group of neuroevolution of augmenting topologies (NEAT) approaches (see e.g. [40] for an introduction). The NEAT approaches work with two types of genes: node genes for neurons and connection genes coding links between two neurons. An individual of the population is a complete neural network expressed by the genes. These individuals are changed via adapted crossover and mutation operators. In the beginning, very simple network structures are used, which grow during the run if there is an evolutionary advantage (complexification). Mutation may concern the structure or the weights. In the latter case, it makes random changes to the weight of a connection. In the former case, it introduces new nodes or new connections into the network. As a result, individuals may have genotypes of different lengths (and of course structures), which makes crossover difficult. NEAT therefore tries to identify similar genes in individuals, which are then changed during crossover, whereas dissimilar genes are copied from the better parent. Only one offspring is created. When major structural changes occur, it usually takes time to fine-tune the weights so that the new network performs as well as it can. If such a network is compared to one with an inherently inferior structure but with already adapted weights,


it loses the comparison. Innovations are thus in danger of not being accepted. To safeguard against this effect, NEAT uses niching, grouping similar individuals into a subspecies. Competition on the individual level only occurs within a niche.

NEAT and its successors, e.g., rtNEAT for real-time learning [41] and HyperNEAT [38] for large networks, are successful and perform superiorly if the task requires structure learning and if it is difficult to identify appropriate structures beforehand. If this is not the case, NEAT may waste resources by explicitly evolving the topology. The approach has been applied to several areas, ranging from optimal control [42, 43] to computer games [44, 45] and collision warning systems [46, 47]. The original NEAT variants addressed single-objective problems. Multi-objective variants have been introduced, among others, by [48, 49], with [48] using the SPEA2 as multi-objective EA and [49] applying the NSGA-II. Both multi-objective methods are described in more detail in Sect. 2.4. In order to cope with situations that require subroutines or modules suited to specific tasks, [50] introduced Modular Multi-objective NEAT (MM-NEAT).

2.4 Natural Computing and Multi-Objective Optimization

In practical optimization, several objectives may appear. Often, these criteria are conflicting, that is, maximizing one goal results in decreasing the value of other goals. This is the area of multi-objective optimization, which remains a fundamental challenge, see e.g. [51–53]. Analogous to a single-objective problem, a multi-objective problem with n objectives is formulated as

min_{x ∈ X} (f_1(x), f_2(x), …, f_n(x)).   (2.9)

According to [51], a multi-objective model can be seen as an intermediate result

of a modeling process where the decision maker is faced with conflicting objectives. Further specifications of the decision makers' preferences are required, which would eventually transform the model into a single-objective problem. In the case that the preferences of the decision makers are known beforehand, it is possible to use a solution-oriented transformation such as the weighted sum method [51]. The problem can then be solved with single-objective optimization. Since the decision makers state their preferences before the optimization run and enable the determination of a utility function, these methods are also known as a priori methods. The weighted sum method is suitable for convex optimization problems but may fail for more general problems. In this case, the ε-constraint method can be applied: here, only one objective is optimized, whereas the others are transformed into constraints, see e.g. [52] for an overview on this and other methods.
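The weighted sum transformation is easily sketched; the two conflicting sample objectives are illustrative:

```python
def weighted_sum(objectives, weights):
    """A priori scalarization: collapse the objective vector into a single
    objective via a convex combination whose weights encode the decision
    maker's preferences."""
    def scalarized(x):
        return sum(w * f(x) for w, f in zip(weights, objectives))
    return scalarized

f1 = lambda x: x ** 2            # first objective
f2 = lambda x: (x - 2) ** 2      # conflicting second objective
g = weighted_sum([f1, f2], [0.5, 0.5])
```

Minimizing g with any single-objective method yields one Pareto optimal compromise (here x = 1); different weight vectors yield different compromises, but for non-convex fronts some Pareto optimal points are unreachable this way.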

However, identifying the preferences beforehand is not always possible. In this case, the decision makers need to compare several alternatives before they are able to state which one they prefer. For this, it is necessary to identify so-called compromise


or Pareto optimal solutions, which are not surpassed by any other solution. In other words, no other solution or decision exists that would lead to a better outcome in at least one objective without worsening the other criteria. This set of compromise solutions is stored in an archive. After the optimization process is complete, it is presented to a human decision maker or analyzed further. In order to achieve the goal of finding or at least approximating the Pareto optimal solutions (called the Pareto set), the task is reformulated as a set-oriented problem. In most applications, the Pareto front in the function space can only be approximated, however. That is, the corresponding solution set contains the vectors that are non-dominated by any other known candidate. Since the goal of the optimization is to find a good Pareto front approximation, there are two criteria that the process has to take into account: on the one hand, the algorithm needs to identify non-dominated solutions; on the other, the entire front should be approximated. Thus, the solutions that are retained in the repository should be as diverse as possible.

The concept of Pareto dominance (and other dominance measures) usually leads to incomparable sets, and thus incomparable Pareto set approximations. Therefore, further quality indicators are necessary to define a total preorder on the solution space; for an overview see e.g. [51, 53]. Here, the discussion is restricted to the hypervolume indicator I_H(A), also called the S-metric. The indicator is defined with respect to a reference set R in the function space. First, the subset of the function space that is defined by the front F(A) = {f(a) | a ∈ A} and the reference set is determined as H(A, R) = {g ∈ F | ∃a ∈ A, ∃r ∈ R : f(a) ≤ g ≤ r} [51]. The indicator then denotes the Lebesgue measure of this subset, I_H(A) = λ(H(A, R)). Larger values of the indicator are preferred.
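For two objectives and a single reference point r, the hypervolume reduces to a sum of axis-aligned rectangles. The sketch below illustrates this special case under minimization; the function name and the single-reference-point simplification are assumptions for illustration:

```python
def hypervolume_2d(front, ref):
    """Hypervolume (S-metric) of a 2-D minimization front w.r.t. reference point ref.

    Assumes every point weakly dominates ref, i.e. p[0] <= ref[0] and p[1] <= ref[1].
    """
    # Keep only non-dominated points: after sorting by f1 ascending,
    # a point is non-dominated iff its f2 is strictly below all previous f2 values.
    nd, best_f2 = [], float("inf")
    for p in sorted(front):
        if p[1] < best_f2:
            nd.append(p)
            best_f2 = p[1]
    # Sum the rectangles spanned between consecutive front points and ref.
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in nd:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv
```

For the front {(1, 3), (2, 2), (3, 1)} with reference point (4, 4), the three rectangles have areas 3, 2, and 1, so the indicator value is 6.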

For the area of natural computing, several algorithms have been introduced. They can be grouped into several classes: algorithms that are based on dominance and dominance ranking, algorithms that apply decomposition techniques, and algorithms that optimize indicator functions. Here, some examples of the first and the third group are described. Algorithms belonging to the first group require diversification mechanisms in order to ensure a spreading of the solutions over the Pareto front. Concerning dominance-based methods, two approaches and their derivatives are identified as standard multi-objective evolutionary algorithms in the literature: the non-dominated sorting genetic algorithm and the strength Pareto evolutionary algorithm, both described here briefly in their revised versions. It should be noted that both algorithms are used for continuous as well as discrete problems. Therefore, neither defines the details of the recombination and mutation processes, leaving those to the application task.

The non-dominated sorting genetic algorithm, the NSGA-II, mainly addresses the question of survivor selection [54]. It uses a ranking of the present population by introducing levels of non-dominance. The first rank or first front contains all solutions that are non-dominated. The second rank, the second front, consists of all members which are only dominated by individuals from rank one. This continues until the rank of all solutions has been determined. However, since the size of the archive is limited, it remains necessary to distinguish the quality of individuals belonging to the same rank.
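The ranking into successive non-dominated fronts can be sketched as follows; this is a plain O(n²·m) illustration for minimization, not the bookkeeping-optimized procedure of [54]:

```python
def non_dominated_ranks(points):
    """Assign each point (a tuple of objective values, minimization) a front rank.

    Rank 0 holds the non-dominated points; rank 1 holds points dominated only
    by rank-0 points, and so on.
    """
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    remaining = list(range(len(points)))
    ranks = [None] * len(points)
    rank = 0
    while remaining:
        # Current front: points not dominated by any other remaining point.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        for i in front:
            ranks[i] = rank
        remaining = [i for i in remaining if i not in front]
        rank += 1
    return ranks
```

For the points (1, 1), (2, 2), (1, 2), (3, 3) this yields the ranks 0, 2, 1, 3: each point is dominated by all points of the lower ranks.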

2 Natural Computing and Optimization

The algorithm tries to enlarge the population diversity. For this, the concept of crowding is introduced. The less crowded the area around an individual is, the more valuable it is for the search. If a part of the front is already well covered, however, selection should prefer less densely populated areas. The preference relation used for the selection is termed crowded-comparison. It first compares individuals using their non-domination ranks. Individuals with a lower rank number are always preferred. In the case of rank parity, an individual is selected if its crowding distance is larger. The measure operates on the m-dimensional function space. The crowding distance is an aggregated measure over all function components. For each criterion, the population is sorted independently. An individual then has two neighbors in each objective: one with a better value and one with a worse value than itself. The difference between the function values of these neighbors is then obtained and set in relation to the maximal spread of the criterion. The crowding distance is then obtained as the sum over all criteria. In the case that an individual is a boundary point of one objective, it is assigned a value of infinity, guaranteeing its selection if its non-domination rank is sufficiently high. The algorithm uses the crowding distance for survivor selection following an elitist scheme. The offspring and the parent population are combined and the rank of all individuals is determined. The next parent population is filled based on the fronts. As long as a front can enter the population in its entirety, the non-dominated ranking remains the sole selection measure. If a front only fits partially, the crowding distance comes into play and the remaining places are filled by the front members with the largest crowding distances.

Siegmund et al. [55] explicitly addressed the problem of using multi-objective optimization in the context of stochastic simulations. In order to cope with the resulting noisy optimization problem, the authors augmented a variant of the NSGA-II with resampling strategies. Instead of using the crowding distance, the R-NSGA-II determines the distance to predefined reference points which are set by the decision makers. Several resampling schemes, ranging from static over time-based schemes to schemes that take concepts from multi-objective optimization into account, were evaluated. Here, a time-based scheme is briefly described following [55]. Resampling approaches re-evaluate a candidate solution several times in order to derive a more reliable estimate. However, costs are incurred for each evaluation. Therefore, the number of samples and the sample set itself for the re-evaluation must be determined with care. The time-based resampling in [55] starts with only a small sampling budget at the beginning of a run, which is gradually increased. Furthermore, the authors proposed and compared a new dynamic resampling approach called distance-based dynamic resampling. For the two-objective noisy test function considered in the paper, the new scheme and a time-based resampling scheme performed best.
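The crowding distance described above can be sketched as follows; this illustrative version assumes minimization and a front given as a list of objective vectors:

```python
def crowding_distances(front):
    """NSGA-II crowding distance for the members of one front.

    Boundary points of each objective get infinity; interior points accumulate
    the gap between their two neighbors in each objective's sorted order,
    normalized by the maximal spread of that objective.
    """
    n, m = len(front), len(front[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # no spread in this objective, nothing to accumulate
        for pos in range(1, n - 1):
            i = order[pos]
            dist[i] += (front[order[pos + 1]][k] - front[order[pos - 1]][k]) / (hi - lo)
    return dist
```

For the front {(1, 3), (2, 2), (3, 1)} the two boundary points receive infinity in both objectives, while the middle point accumulates 1.0 per objective, giving a crowding distance of 2.0.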

The strength Pareto evolutionary algorithm 2 (SPEA2) [56] again addresses the two main goals of multi-objective optimization: minimizing the distance to the Pareto front and spreading the population. The algorithm operates with an archive which contains the currently non-dominated solutions. The archive has a maximal capacity; therefore, at times solutions must be deleted. The truncation method applied preserves solutions at the boundaries of the objectives.


The algorithm uses the concept of strength S(i), which equals the number of solutions that are dominated by the individual i. In contrast, the raw fitness of an individual i is given as the combined strengths of all individuals that dominate i. In the case of individuals with the same raw fitness, the SPEA2 determines the distance to the kth nearest neighbor and uses the inverse to estimate the density, which is then added to the raw fitness. The algorithm first creates a mating pool by performing binary tournament selection based on the fitness, with replacement, until the pool is filled. Afterwards, crossover and mutation processes are performed and the fitness of all solutions is (re)assessed. After the new population has been created, the non-dominated front is determined by considering the population and the current archive. If the size of the new non-dominated front does not exceed the size of the archive, it is copied into the archive and the remaining free places are filled with the best remaining solutions, i.e., the population and old archive members with sufficiently small fitness values. If the size of the new non-dominated front is too large, however, the truncation mechanism has to be applied.
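The strength and raw fitness values of the SPEA2 can be sketched as follows; this is an O(n²) illustration for minimization, and the density term used for tie-breaking is omitted:

```python
def spea2_fitness(points):
    """SPEA2 strength and raw fitness for a combined population/archive.

    strength[i]: number of solutions dominated by point i.
    raw[i]: sum of the strengths of all solutions dominating point i
    (0 means the point is non-dominated).
    """
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    n = len(points)
    strength = [sum(dominates(points[i], points[j]) for j in range(n))
                for i in range(n)]
    raw = [sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
           for i in range(n)]
    return strength, raw
```

For the chain (1, 1), (2, 2), (3, 3) the strengths are 2, 1, 0 and the raw fitness values 0, 2, 3: lower raw fitness is better, and only the first point is non-dominated.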

The S-metric selection evolutionary multi-objective algorithm (SMS-EMOA) [57] uses the dominated hypervolume, a concept that combines the hypervolume with solution dominance. It is used here as an example for this algorithm type; other hypervolume-based algorithms exist, see [51]. The SMS-EMOA uses a plus-strategy (μ + 1), or steady-state strategy, creating one offspring in each generation. The offspring may enter the population if this increases the S-metric. Since the size of the population is kept constant, one individual must be deleted. This is determined by dividing the population into ranks (similar to the NSGA-II) and deleting one individual from the worst front. The individual is determined by considering the changes to the S-metric of the front caused by its elimination: the front member whose removal causes the least loss is deleted. Using the hypervolume requires the definition of reference points. The SMS-EMOA applies an adaptive reference point based on the nadir point n [58, p. 35], which for minimization is defined component-wise as n_i = max_{a ∈ A*} f_i(a), i = 1, …, m, where A* denotes the (approximated) Pareto set. The hypervolume is not the only possible choice: for example, the so-called R2 indicator may also be used, which is easier to compute. Using the R2 indicator has a potential drawback since it is only weakly monotonic [59]. However, as argued in [59], this may not represent a serious problem provided that the objective functions assume continuous values.

Multi-objective approaches have also been considered in swarm-based optimization. For example, several multi-objective ant colony optimization (MOACO) approaches have been introduced; see e.g. [60, 61] for an overview. As pointed out in [61], the performance may vary significantly according to the design choices made. In the case of ant colony optimization, an adaptation to the multi-objective case can be
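The hypervolume-based survivor selection of the SMS-EMOA, removing the front member whose elimination costs the least S-metric, can be sketched for the two-objective case as follows; this is a naive recompute-per-member illustration, not the incremental scheme of [57]:

```python
def hv2d(front, ref):
    """Hypervolume of a 2-D minimization front w.r.t. reference point ref."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(front):
        if f2 < prev_f2:  # skip dominated points
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

def least_hv_contributor(front, ref):
    """Index of the front member whose removal loses the least hypervolume."""
    total = hv2d(front, ref)
    losses = [total - hv2d(front[:i] + front[i + 1:], ref)
              for i in range(len(front))]
    return min(range(len(front)), key=losses.__getitem__)
```

For the front {(1, 3), (2, 2.5), (3, 1)} with reference point (4, 4), the middle point contributes the least dominated hypervolume and would be removed.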
