
Lecture: Data Mining, Genetic Algorithm


DOCUMENT INFORMATION

Basic information

Title: Genetic Algorithm
Author: Trịnh Tấn Đạt
Supervisor: TAN DAT TRINH, Ph.D.
Institution: Saigon University
Major: Information Technology
Type: lecture
Year of publication: 2024
City: Ho Chi Minh City
Pages: 70
File size: 2.95 MB


Trang 1

Trịnh Tấn Đạt

Faculty of Information Technology, Saigon University

Email: trinhtandat@sgu.edu.vn

Website: https://sites.google.com/site/ttdat88/

Trang 2

• Introduction: Genetic Algorithm (GA)

• GA Operators and Parameters

• Example

Trang 3

History Of Genetic Algorithms

• “Evolutionary Computing” was introduced in the 1960s by I. Rechenberg.

• John Holland introduced genetic algorithms and developed the idea in his book “Adaptation in Natural and Artificial Systems” in 1975.

• In 1992 John Koza used a genetic algorithm to evolve programs to perform certain tasks. He called his method “Genetic Programming”.

Trang 4

What Are Genetic Algorithms (GAs)?

• GAs are search and optimization techniques based on Darwin’s Principle of Natural Selection and Genetic Inheritance.

• A class of probabilistic optimization algorithms.

• Widely used in business, science and engineering.

Trang 5

Basic Idea Of Principle Of Natural Selection

“Select The Best, Discard The Rest”

An Example of Natural Selection:

• Rabbits are fast and smart.

• Some of them are faster and smarter than other rabbits. Thus, they are less likely to be eaten by foxes.

• They have a better chance of survival and start breeding.

• The resulting baby rabbits will (on average) be faster and smarter.

• Now the evolved species is faster and smarter.

Genetic Algorithms Implement Optimization Strategies By Simulating Evolution Of Species Through Natural Selection

Trang 6

Classes of Search Techniques

[Figure: taxonomy of search techniques. Surviving labels: Genetic Programming, Genetic Algorithms, Sort]

Trang 7

Nature to Computer Mapping

• GAs use a vocabulary borrowed from natural genetics

Nature → Computer

Population → Set of solutions

Individuals in environment → Solutions to a problem

Individual’s degree of adaptation to its surrounding environment → Solution quality (fitness function)

Chromosome → Encoding for a solution

Gene → Part of the encoding of a solution

Selection, crossover and mutation in nature’s evolutionary process → Stochastic operators

Trang 8

Working Mechanism Of GAs

Trang 9

Simple Genetic Algorithm

Simple_Genetic_Algorithm()
{
    Initialize the Population;
    Calculate Fitness Function;
    While (Fitness Value != Optimal Value)
    {
        Selection;   // Natural selection, survival of the fittest
        Crossover;   // Reproduction, propagate favorable characteristics
        Mutation;    // Random change that maintains diversity
        Calculate Fitness Function;
    }
}
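The loop above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the toy objective (count the 1 bits), the parameter defaults, the `max_gens` cap and all function names are hypothetical and do not come from the slides.

```python
import random

def fitness(chrom):
    # Toy objective (an assumption, not from the slides): number of 1 bits.
    return sum(chrom)

def select(pop, fits, rng):
    # Fitness-proportional choice of one parent; +1 avoids all-zero weights.
    return rng.choices(pop, weights=[f + 1 for f in fits], k=1)[0]

def simple_ga(n_bits=20, pop_size=30, pc=0.8, pm=0.01, max_gens=200, seed=0):
    rng = random.Random(seed)
    # Initialize the population and calculate fitness.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    fits = [fitness(c) for c in pop]
    best = max(pop, key=fitness)
    gens = 0
    # Stop at the optimum, or at a generation cap (the cap is an added safeguard).
    while fitness(best) != n_bits and gens < max_gens:
        nxt = []
        while len(nxt) < pop_size:
            a, b = select(pop, fits, rng)[:], select(pop, fits, rng)[:]
            if rng.random() < pc:                      # crossover
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                       # mutation
                for i in range(n_bits):
                    if rng.random() < pm:
                        child[i] = 1 - child[i]
                nxt.append(child)
        pop = nxt[:pop_size]
        fits = [fitness(c) for c in pop]
        best = max(pop + [best], key=fitness)          # remember the best so far
        gens += 1
    return best, gens
```

Calling `simple_ga()` returns the best chromosome found and the number of generations used.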

Trang 10

Designing GAs

⚫ How to represent chromosomes ?

⚫ How to create an initial population?

⚫ How to define fitness function?

⚫ How to define genetic operators?

⚫ How to generate next generation?

⚫ How to define stopping criteria?

Trang 11

GA Operators and Parameters

Trang 12

Search Space and Population

• The search space S is the finite set of possible solutions.

• Each solution x ∈ S is called an individual.

• A population of size N is a subset of the search space S.

• Start with a population of randomly generated individuals, or use:

  • a previously saved population

  • a set of solutions provided by a human expert

  • a set of solutions provided by another heuristic algorithm

Trang 13

Representation (Encoding)

Encoding is the process of representing the solution in the form of a string (chromosome) that conveys the necessary information.

• Just as each gene in a chromosome controls a particular characteristic of the individual, each bit in the string represents a characteristic of the solution.

Trang 14

Binary Encoding

Trang 15

Binary Encoding

Binary encoding is the most common method of encoding. Chromosomes are strings of 1s and 0s.

• In classic genetic algorithms, binary strings of fixed length m are used.

• In order to be able to encode each solution of the search space S in a one-to-one way, the inequality $|S| \le 2^m$ must hold.

• For example, we have S = {1, 2, …, 15}. Choose m = 4 to represent the 15 values, because $2^3 = 8 < 15 \le 2^4 = 16$.
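The choice of m can be checked with a few lines of Python. This is a small sketch; the helper names `min_bits`, `encode` and `decode` are hypothetical.

```python
def min_bits(size):
    # Smallest m such that size <= 2**m.
    m = 0
    while 2 ** m < size:
        m += 1
    return m

def encode(value, m):
    # Fixed-length binary string of m bits.
    return format(value, f"0{m}b")

def decode(bits):
    return int(bits, 2)

m = min_bits(15)           # 15 elements in S = {1, ..., 15}
print(m, encode(5, m), decode(encode(5, m)))
```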

Trang 16

Value Encoding

• Every chromosome is a string of some values.

• Values can be anything connected to the problem, from numbers, real numbers or characters to complicated objects.

Trang 17

Permutation Encoding

• Permutation encoding can be used in ordering problems, such as the travelling salesman problem or the task ordering problem.

• In permutation encoding, every chromosome is a string of numbers that represents a position in a sequence.

Trang 18

Tree Encoding

• Tree encoding is used mainly for evolving programs or expressions, for genetic programming.

• In tree encoding, every chromosome is a tree of some objects, such as functions or commands in a programming language.

Trang 19

Fitness function

• A fitness value is assigned to each solution depending on how close it actually is to solving the problem.

A fitness function is a nonnegative function $f : S \to \mathbb{R}_{\ge 0}$.

A fitness function quantifies the optimality of a solution (chromosome) so that a particular solution may be ranked against all the other solutions.

Trang 21

• The primary objective of the selection operator is to emphasize the good solutions and eliminate the bad solutions in a population, while keeping the population size constant.

• “Select The Best, Discard The Rest”.

  • Identify the good solutions in a population.

  • Make multiple copies of the good solutions.

  • Eliminate bad solutions from the population so that multiple copies of good solutions can be placed in the population.

Selection is the process that determines which solutions are preserved and allowed to reproduce and which ones die out.

Trang 22

Random Selection

• Chromosomes are randomly selected from the population to be parents for crossover.

Trang 23

Roulette Wheel Selection

• Roulette Wheel Selection (fitness-proportional selection; stochastic sampling with replacement) is an instance of a reproduction operator.

• Strings that are fitter are assigned a larger slot and hence have a better chance of appearing in the new population.

• For example, after spinning 4 times, we have the new population {2, 4, 2, 1}.
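A minimal sketch of the spinning procedure, assuming nonnegative fitness values with a positive total. The function name and the sample fitness values are hypothetical.

```python
import random

def roulette_wheel_select(fitnesses, n_spins, rng):
    # Each individual owns a slot whose width equals its fitness.
    total = sum(fitnesses)
    picks = []
    for _ in range(n_spins):
        r = rng.uniform(0, total)          # where the wheel stops
        running = 0.0
        for idx, f in enumerate(fitnesses):
            running += f
            if r < running:
                picks.append(idx)
                break
        else:
            # Reached only when r equals the total: take the last slot.
            picks.append(len(fitnesses) - 1)
    return picks

rng = random.Random(42)
# Four individuals; sampling is with replacement, so indices may repeat.
print(roulette_wheel_select([10.0, 40.0, 5.0, 25.0], 4, rng))
```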

Trang 25

• Elitism can very rapidly increase the performance of a GA, because it prevents losing the best solution found.

Trang 26

Tournament Selection

In tournament selection, we select K individuals from the population at random and select the best out of these to become a parent.
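Tournament selection is short enough to sketch directly. The function name and the tournament size parameter `k` are assumptions made for illustration.

```python
import random

def tournament_select(population, fitness, k, rng):
    # Draw k distinct contestants at random and return the fittest one.
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

rng = random.Random(0)
population = list(range(10))               # toy individuals
parent = tournament_select(population, lambda x: x, 3, rng)
print(parent)
```

A larger `k` raises the selection pressure, because weak individuals are less likely to win a bigger tournament.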

Trang 28

Age Based Selection

• We don’t have a notion of fitness.

• Each individual is allowed to stay in the population for a finite number of generations, during which it is allowed to reproduce. After that, it is removed from the population no matter how good its fitness is.

Trang 29

Fitness Based Selection

• The children tend to replace the least fit individuals in the population.

Trang 30

• After selection, a specified percentage pc of chromosomes in the mating pool P'(t) is chosen at random.

• The selected chromosomes are mated at random, and each pair of parents undergoes a crossover operation.

Crossover is the process in which two chromosomes (strings) combine their genetic material (bits) to produce new offspring that possess characteristics of both.

The crossover probability pc is another parameter of the genetic algorithm.

Typical values are between 60% and 90%.

Trang 31

One-point Crossover

• A random point is chosen on the individual chromosomes (strings) and the genetic material is exchanged at this point.
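As a minimal sketch, one-point crossover on two equal-length strings can be written as follows. The function name is hypothetical, and the cut point is passed in explicitly so that the result is easy to follow.

```python
import random

def one_point_crossover(p1, p2, cut):
    # Exchange everything to the right of the cut point.
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

rng = random.Random(3)
p1, p2 = "11111", "00000"
cut = rng.randint(1, len(p1) - 1)          # random point strictly inside the string
print(cut, one_point_crossover(p1, p2, cut))
```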

Trang 33

Uniform crossover

• Bits are randomly copied from the first or from the second parent.
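A short sketch of uniform crossover, assuming each position is taken from either parent with equal probability. The 50/50 split and the function name are assumptions; the slide does not state the probability.

```python
import random

def uniform_crossover(p1, p2, rng):
    # For each position, copy the gene from p1 or p2 with equal probability.
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]

rng = random.Random(7)
print(uniform_crossover([0] * 8, [1] * 8, rng))
```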

Trang 34

Arithmetic crossover

• Some arithmetic operation is performed to make a new offspring.
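The slide does not say which arithmetic operation is used. As one possible illustration, the sketch below combines two bit strings with a bitwise AND; both the choice of operation and the function name are assumptions.

```python
def and_crossover(p1, p2):
    # Offspring bit is 1 only where both parents have a 1.
    return "".join("1" if a == "1" and b == "1" else "0" for a, b in zip(p1, p2))

print(and_crossover("1100", "1010"))
```

For real-valued chromosomes, a weighted average of the two parents is another common choice.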

Trang 35

Tree crossover

Trang 36

Partially Matched Crossover (PMX)

Trang 37

• Crossover between two good solutions MAY NOT ALWAYS yield a better or equally good solution.

• However, since the parents are good, the probability of the child being good is high.

• If an offspring is not good (a poor solution), it will be removed in the next iteration during “Selection”.

Trang 38

• After crossover, a specified percentage pm of genes in the pool P''(t) is chosen at random.

• A selected chromosome undergoes mutation.

Mutation is the process by which a string is deliberately changed so as to maintain diversity in the population.

The mutation probability pm is another parameter of the genetic algorithm.

Typical values are below 1%.

Trang 39

• The classical mutation operator is bit-flip mutation, which inverts a selected bit (0 becomes 1 and 1 becomes 0).
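Bit-flip mutation can be sketched as below, assuming each bit is flipped independently with probability pm. The function name is hypothetical.

```python
import random

def bit_flip_mutation(chrom, pm, rng):
    # Flip each bit independently with probability pm.
    return [1 - g if rng.random() < pm else g for g in chrom]

rng = random.Random(5)
print(bit_flip_mutation([0, 1, 0, 1, 1, 0, 0, 1], 0.25, rng))
```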

Trang 40

Advantages Of GAs

Global Search Methods :

• GAs search for the function optimum starting from a population of points of the function domain, not from a single point.

• This characteristic suggests that GAs are global search methods.

Trang 41

• Hill climbing (gradient descent or ascent) method

• A new point is selected from the neighborhood of the current point based on its fitness value.

[Figure: fitness landscape with a local optimum and the global optimum]

Trang 42

[Figure: searchers on a hilly landscape. Speech bubbles: “I am not at the top.”, “My height is better!”, “I am at the top. Height is …”, “I will continue”. Caption: a few microseconds after]

Trang 43

Advantages Of GAs

• Exploiting the best solutions

  • Takes the current search information from the experience of the last search to guide the search toward the direction that might be close to the best solutions.

  • Comes from the selection operator and the crossover operator.

• Exploring the search space

  • Widens the search to reach all possible solutions around the search space.

  • Comes from the mutation operator and the crossover operator.

Important task: GAs can balance exploitation and exploration.

• Too much exploitation leads to premature convergence.

• Too much exploration leads to non-convergence and to no fitter solution.

• Hill climbing only exploits the best solution. It neglects exploration of the search space.

Trang 44

Advantages Of GAs

Blind Search Methods

• GAs only use information about the fitness function to solve the optimization problem.

GAs use probabilistic transition rules

• This makes them more robust and applicable to a large range of problems.

GAs can be easily used on parallel machines

• This reduces computation cost significantly.

Trang 46

GA Examples

Maximum of Function

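• Let’s consider the function

$$f(x) = x \sin(10\pi x) + 1$$

Problem: find $x_0 \in [-1, 2]$ such that

$$f(x_0) \ge f(x) \quad \text{for all } x \in [-1, 2]$$

• First derivative

$$f'(x) = \sin(10\pi x) + 10\pi x \cos(10\pi x) = 0$$

The solutions are

$$x_i = \frac{2i - 1}{20} + \epsilon_i \quad \text{for } i = 1, 2, \ldots$$

$$x_0 = 0$$

$$x_i = \frac{2i + 1}{20} - \epsilon_i \quad \text{for } i = -1, -2, \ldots$$

where the $\epsilon_i$ are positive terms that approach zero as $|i|$ grows.

For $x_{19} \approx 1.85$, $f(x_{19}) \approx 2.85$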

Trang 47

GA Examples

How GAs can solve this problem ?!?

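• Representation (convert real numbers to chromosomes)

• Using binary vectors as a chromosome

Length of chromosome m = 22: resolving [-1, 2] (length 3) to six decimal places requires $3 \cdot 10^6$ real numbers, and $2^{21} < 3 \cdot 10^6 \le 2^{22}$.

• The mapping from a binary string <b21 b20 … b0> into a real number x from [-1, 2] is given by

$$x' = \sum_{i=0}^{21} b_i \, 2^i, \qquad x = -1 + x' \cdot \frac{3}{2^{22} - 1}$$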
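The mapping can be sketched in Python as below. The function names are hypothetical, the objective is assumed to be f(x) = x·sin(10πx) + 1 as in this example, and the 22-bit string is an arbitrary chromosome chosen only for illustration.

```python
import math

def decode(bits, lo=-1.0, hi=2.0):
    # Read the string as an integer x', then scale it linearly into [lo, hi].
    x_prime = int(bits, 2)
    return lo + x_prime * (hi - lo) / (2 ** len(bits) - 1)

def f(x):
    # Objective assumed for this example.
    return x * math.sin(10 * math.pi * x) + 1

v = "1000101110110101000111"               # arbitrary 22-bit chromosome
x = decode(v)
print(round(x, 6), round(f(x), 6))
```

The all-zeros string maps to the left end of the interval and the all-ones string to the right end.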

Trang 48

 For example, a chromosome

Trang 49

GA Examples

• Initial population

Create the population randomly; each chromosome v is a binary string of 22 bits.

Fitness function: eval(v) = f(x), where x is the real number encoded by v and

$$f(x) = x \sin(10\pi x) + 1$$

Trang 50

On the other hand

These offspring evaluate to

The second offspring has a better evaluation than both of its parents, which evaluate to $f(v_2) = 0.0788$ and $f(v_3) = 2.2506$.

Trang 51

GA Examples

• Assume that pop_size = 50, pc = 25% and pm = 1%.

• After 150 generations, we have

Trang 52

GA Examples

Traveling Salesman Problem (TSP)

The travelling salesman must visit every city in his territory exactly once and then return to the starting point.

Given the cost of travel between all cities, how should he plan his itinerary for minimum total cost?

Cost = {money, distance, time, …}

Trang 53

GA Examples

Binary Representation

Trang 54

GA Examples

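Why can we not use a binary string?

Fail! Crossover or mutation on binary-coded cities can build illegal chromosomes, such as tours with invalid or repeated cities.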

Trang 55

• Look for the most natural expression of the problem.

• Create genetic operators that avoid building illegal chromosomes

Nonbinary Representation

Trang 56

GA Examples

Swap Mutation

The following mutation operator is adapted to the path representation:
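A minimal sketch of swap mutation on the path representation, assuming a tour is a list of city labels. The function name is hypothetical.

```python
import random

def swap_mutation(tour, rng):
    # Pick two distinct positions and exchange the cities found there.
    i, j = rng.sample(range(len(tour)), 2)
    child = list(tour)
    child[i], child[j] = child[j], child[i]
    return child

rng = random.Random(11)
print(swap_mutation([1, 2, 3, 4, 5, 6, 7], rng))
```

Because the operator only exchanges two entries, the result is always a valid permutation of the cities.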

Trang 57

GA Examples

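Why can we not use single-point crossover?

Fail! Swapping tails between two tours can repeat some cities and omit others, so the offspring is an illegal chromosome.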

Trang 58

GA Examples

PMX-Crossover

Partially Matched Crossover (PMX) also avoids building illegal chromosomes:
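One common formulation of PMX is sketched below. It is an illustration under assumptions, not necessarily the exact variant shown on the slide: the child copies a segment from the first parent and fills the remaining positions from the second parent, following the segment mapping whenever a city would be duplicated. The function name and the sample tours are hypothetical.

```python
def pmx(p1, p2, lo, hi):
    n = len(p1)
    child = [None] * n
    # Copy the matching segment from the first parent.
    child[lo:hi] = p1[lo:hi]
    # Map each copied city to the city p2 holds at the same position.
    mapping = {p1[i]: p2[i] for i in range(lo, hi)}
    for i in list(range(lo)) + list(range(hi, n)):
        gene = p2[i]
        # Follow the mapping until the city is not already in the segment.
        while gene in mapping:
            gene = mapping[gene]
        child[i] = gene
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(pmx(p1, p2, 3, 7))
```

Every city appears exactly once in the child, so the offspring is a legal tour.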

Trang 59

How Do GAs Work?

Maximize a function of k variables, $f(x_1, \ldots, x_k)$, where $x_i \in [a_i, b_i]$ and $f(x_1, \ldots, x_k) > 0$ for all $x_i$.

Representation (binary string)

We divide $[a_i, b_i]$ into $(b_i - a_i) \cdot 10^4$ equal-size ranges.

Each $x_i$ → binary string of the smallest length $m_i$ that satisfies

$$(b_i - a_i) \cdot 10^4 \le 2^{m_i} - 1$$

To recover the real value from a binary string:

$$x_i = a_i + \mathrm{decimal}(\text{string}_i) \cdot \frac{b_i - a_i}{2^{m_i} - 1}$$

Thus, each chromosome is represented by a binary string of length $m = \sum_{i=1}^{k} m_i$:

v = (string_1 string_2 … string_k)

Trang 60

GA Examples

• Selection (Roulette wheel selection)

Trang 61

GA Examples

• Crossover (pc is the probability of crossover)

• Mutation (pm is the probability of mutation)

• For example, maximize the function

$x_1 \in [-3, 12.1]$ → $15.1 \cdot 10^4$ equal-size ranges

$x_1$ → binary string of length 18

$x_2 \in [4.1, 5.8]$ → $1.7 \cdot 10^4$ equal-size ranges

$x_2$ → binary string of length 15
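The two string lengths can be reproduced with a short calculation, assuming the rule that the number of ranges must not exceed 2^m − 1. The helper name `bits_needed` is hypothetical.

```python
def bits_needed(a, b, decimals=4):
    # Number of equal-size ranges for the requested precision.
    ranges = round((b - a) * 10 ** decimals)
    # Smallest m with ranges <= 2**m - 1.
    m = 1
    while 2 ** m - 1 < ranges:
        m += 1
    return m

m1 = bits_needed(-3.0, 12.1)
m2 = bits_needed(4.1, 5.8)
print(m1, m2, m1 + m2)
```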

Trang 62

GA Examples

The total length of a chromosome is m=18+15=33 bits

Trang 63

GA Examples

Assume the population size (pop_size) equals 20. All 33 bits in every chromosome are initialized randomly.

Trang 64

Choose $q_{11}$ and $q_4$, etc.

New population

Trang 65

• Crossover with pc = 25%. For each chromosome, generate a random number r; if r < 0.25, we select the chromosome for crossover.

Chromosomes v2’, v11’, v13’ and v18’ were selected for crossover.

• Mutation with pm = 1%. We have 33*20 = 660 bits, so we expect (on average) 6.6 mutations per generation. Generate 660 random numbers r; if r < 0.01, we mutate the bit.
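As a quick check of the arithmetic, with pop_size = 20 and m = 33 bits per chromosome the expected counts per generation follow directly from the two probabilities (reading pc as 0.25, consistent with the test r < 0.25):

```latex
E[\text{mutated bits}] = p_m \cdot m \cdot \text{pop\_size} = 0.01 \cdot 33 \cdot 20 = 6.6
\qquad
E[\text{chromosomes selected for crossover}] = p_c \cdot \text{pop\_size} = 0.25 \cdot 20 = 5
```

The four chromosomes selected in this run are therefore close to the expected five.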

Trang 66

After applying crossover and mutation, we have a new population

Trang 67

Genetic Algorithms (GAs) implement optimization strategies based on simulation of the natural law of evolution of a species by natural selection.

• The basic GA operators are: encoding, selection, crossover, mutation.

• GAs have been applied to a variety of function optimization problems.

• GAs have been shown to be highly effective in searching a large, poorly defined search space, even in the presence of noise.

Trang 68

1) We're going to optimize a very simple problem: trying to create a list of N numbers that equal X when summed together.

• EX1: N = 5; X = 8; one solution is [2, 0, 0, 4, 2]

• EX2: N = 5 and X = 200; then these would all be appropriate solutions:

lst = [40,40,40,40,40]

lst = [50,50,50,25,25]

lst = [200,0,0,0,0]

Ref : https://lethain.com/genetic-algorithms-cool-name-damn-simple/
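One natural fitness measure for this exercise is the distance between the list's sum and the target, where 0 means a perfect solution. The sketch below assumes that definition; the function name is hypothetical and the referenced tutorial may organize its code differently.

```python
def fitness(individual, target):
    # Lower is better: 0 means the numbers sum exactly to the target.
    return abs(target - sum(individual))

print(fitness([2, 0, 0, 4, 2], 8))
print(fitness([50, 50, 50, 25, 25], 200))
print(fitness([1, 1, 1, 1, 1], 8))
```

Note that this is a minimization problem, so selection should favor individuals with the smallest fitness value.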

Trang 69

2) Use a genetic algorithm to maximize a function of many variables

Ex 1: Consider the function z = f(x, y) = -x^2 + 2x - y^2 + 4y

Find (x*, y*) such that z is maximum.

Ref:

https://github.com/philipkiely/floydhub_genetic_algorithm_tutorial/blob/master/geneticmax.py
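For Ex 1 the exact answer is available analytically, which gives a target for checking a GA's result. Assuming the function reads z = -x^2 + 2x - y^2 + 4y, completing the square gives:

```latex
z = -x^2 + 2x - y^2 + 4y
  = -(x - 1)^2 + 1 - (y - 2)^2 + 4
  = 5 - (x - 1)^2 - (y - 2)^2
```

Both squared terms are nonnegative, so the maximum is z = 5 at (x*, y*) = (1, 2). A GA run on this function should converge toward that point.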

Ex 2: The equation is shown below:

Trang 70

3) Evolution of a salesman

Ref : genetic-algorithm-tutorial-for-python-6fe5d2b3ca35

Posted: 16/12/2023, 20:11