
AI : Neural Network for beginners (Part 2 of 3)

By Sacha Barber, 29 Jan 2007

Download demo project (includes source code) - 812 Kb

Introduction

This article is part 2 of a series of 3 articles that I am going to post. The proposed article content will be as follows:

1. Part 1: An introduction to Perceptron networks (single layer neural networks).

2. Part 2: This one. It is about multi-layer neural networks and the back propagation training method, used to solve a non-linear classification problem such as the logic of an XOR gate. This is something that a Perceptron can't do, as explained further within this article.

3. Part 3: How to use a genetic algorithm (GA) to train a multi-layer neural network to solve some logic problem.

Summary

This article will show how to use a multi-layer neural network to solve the XOR logic problem.

A Brief Recap (From part 1 of 3)

Before we commence with the nitty gritty of this new article, which deals with multi-layer Neural Networks, let's just revisit a few key concepts. If you haven't read Part 1, perhaps you should start there.

Perceptron Configuration ( Single layer network)

The inputs (x1, x2, x3 ... xm) and connection weights (w1, w2, w3 ... wm) shown below are typically real values, both positive (+) and negative (-).

The perceptron itself consists of weights, the summation processor, an activation function, and an adjustable threshold processor (called bias hereafter).

For convenience, the normal practice is to treat the bias as just another input. The following diagram illustrates the revised configuration.


The bias can be thought of as the propensity (a tendency towards a particular way of behaving) of the perceptron to fire irrespective of its inputs. The perceptron configuration shown above fires if the weighted sum > 0, or, if you are into maths-type explanations:

$$f(x) = \begin{cases} 1 & \text{if } \sum_{i} w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}$$

where b is the bias.
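The firing rule can be sketched in a few lines of C#. This is a minimal illustration, not code from the demo project: the names `PerceptronSketch` and `Fire` are made up for this example, and the bias is passed as a separate argument rather than as an extra input.

```csharp
using System;

public static class PerceptronSketch
{
    // Returns 1 when the weighted sum of the inputs plus the bias is > 0, else 0.
    // 'inputs' and 'weights' must have the same length.
    public static int Fire(double[] inputs, double[] weights, double bias)
    {
        if (inputs.Length != weights.Length)
        {
            throw new ArgumentException("inputs and weights must have the same length");
        }

        // Start from the bias, which acts like a weight on a constant input of 1.
        double sum = bias;
        for (int i = 0; i < inputs.Length; i++)
        {
            sum += inputs[i] * weights[i];
        }

        return sum > 0.0 ? 1 : 0;
    }
}
```

With weights of 1 and 1 and a bias of -1.5, for instance, `Fire` behaves like an AND gate: only the input pair (1, 1) pushes the weighted sum above zero.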

So that's the basic operation of a perceptron. But we now want to build more layers of these, so let's carry on to the new stuff.

So Now The New Stuff (More layers)

From this point on, anything that is being discussed relates directly to this article's code.

In the summary at the top, the problem we are trying to solve was how to use a multi-layer neural network to solve the XOR logic problem. So how is this done? Well, it's really an incremental build on what Part 1 already discussed. So let's march on.

What does the XOR logic problem look like? Well, it looks like the following truth table:

Input 1 | Input 2 | Output
   0    |    0    |   0
   0    |    1    |   1
   1    |    0    |   1
   1    |    1    |   0

Remember, with a single layer (perceptron) we can't actually achieve the XOR functionality, as it is not linearly separable. But with a multi-layer network, this is achievable.
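To make that claim concrete, here is a small sketch of a two-layer network with hand-picked weights that reproduces XOR using the same "fire if the weighted sum > 0" rule. Nothing here is trained or taken from the demo project: the class name `XorByHand` and the specific weights are assumptions chosen for illustration. One hidden node behaves like OR, the other like NAND, and the output node ANDs them together.

```csharp
using System;

public static class XorByHand
{
    // Threshold activation: fire (1) when the weighted sum is above zero.
    private static int Step(double weightedSum)
    {
        return weightedSum > 0.0 ? 1 : 0;
    }

    // Two hidden nodes feeding one output node, all with fixed weights.
    public static int Xor(int a, int b)
    {
        int hiddenOr = Step(1.0 * a + 1.0 * b - 0.5);     // fires unless both inputs are 0
        int hiddenNand = Step(-1.0 * a - 1.0 * b + 1.5);  // fires unless both inputs are 1
        return Step(1.0 * hiddenOr + 1.0 * hiddenNand - 1.5); // fires only if both hidden nodes fire
    }
}
```

No single node in this sketch computes XOR on its own; it is the combination through the hidden layer that does it, which is why the extra layer matters.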

What Does The New Network Look Like

The new network that will solve the XOR problem will look similar to a single layer network. We are still dealing with inputs / weights / outputs. What is new is the addition of the hidden layer.


As already explained above, there is one input layer, one hidden layer and one output layer.

It is by using the inputs and weights that we are able to work out the activation for a given node. This is easily achieved for the hidden layer, as it has direct links to the actual input layer.

The output layer, however, knows nothing about the input layer, as it is not directly connected to it. So to work out the activation for an output node we need to make use of the outputs from the hidden layer nodes, which are used as inputs to the output layer nodes.

This entire process can be thought of as a pass forward from one layer to the next.

This still works like it did with a single layer network; the activation for any given node is still worked out as follows:

$$a = \sum_{i} w_i I_i$$

where w_i is the weight(i) and I_i is the input(i) value.

You see, it's the same old stuff; no demons, smoke or magic here. It's stuff we've already covered.
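The pass forward can be sketched as one small routine applied twice, once per layer. This is an illustrative sketch under stated assumptions, not the demo project's `NeuralNetwork` class: the names `ForwardPassSketch`, `FeedLayer` and `Forward` are hypothetical, the weight matrices are indexed `[source, target]` with the last row holding the bias weights, and the weighted sum is passed through a sigmoid.

```csharp
using System;

public static class ForwardPassSketch
{
    public static double Sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // Computes the outputs of one layer from the outputs of the previous layer.
    // weights[i, j] connects source node i to target node j; the last row is the bias.
    public static double[] FeedLayer(double[] inputs, double[,] weights)
    {
        int sources = weights.GetLength(0);
        int targets = weights.GetLength(1);
        if (sources != inputs.Length + 1)
        {
            throw new ArgumentException("weights must have one row per input plus a bias row");
        }

        double[] outputs = new double[targets];
        for (int j = 0; j < targets; j++)
        {
            double sum = weights[sources - 1, j]; // bias weight times a constant input of 1
            for (int i = 0; i < inputs.Length; i++)
            {
                sum += weights[i, j] * inputs[i];
            }
            outputs[j] = Sigmoid(sum);
        }
        return outputs;
    }

    // Input layer -> hidden layer -> output layer.
    public static double[] Forward(double[] inputs, double[,] inputToHidden, double[,] hiddenToOutput)
    {
        double[] hidden = FeedLayer(inputs, inputToHidden);
        return FeedLayer(hidden, hiddenToOutput);
    }
}
```

The output layer never sees `inputs` directly; it only receives the `hidden` array, which mirrors the description above.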

So that's how the network looks/works. So now I guess you want to know how to go about training it.


Types Of Learning

There are essentially 2 types of learning that may be applied to a Neural Network: "Reinforcement" and "Supervised".

Reinforcement

In Reinforcement learning, during training, a set of inputs is presented to the Neural Network. Say the output is 0.75 when the target was 1.0.

The error (1.0 - 0.75) is used for training ('wrong by 0.25').

What if there are 2 outputs? Then the errors are combined to give a single number (typically the sum of squared errors).

E.g. "your total error on all outputs is 1.76".

Note that this just tells you how wrong you were, not in which direction you were wrong.

Using this method we may never get a result, or it could be a case of 'Hunt the needle'.
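A sum of squared errors of the kind mentioned above could be computed as follows. This is a minimal sketch with hypothetical names (`ErrorSketch`, `SumSquaredError`), not a routine from the demo project.

```csharp
using System;

public static class ErrorSketch
{
    // Collapses the per-output errors into a single non-negative number.
    public static double SumSquaredError(double[] targets, double[] outputs)
    {
        if (targets.Length != outputs.Length)
        {
            throw new ArgumentException("targets and outputs must have the same length");
        }

        double total = 0.0;
        for (int i = 0; i < targets.Length; i++)
        {
            double difference = targets[i] - outputs[i];
            total += difference * difference; // squaring discards the sign of the error
        }
        return total;
    }
}
```

For a single output of 0.75 against a target of 1.0 this gives 0.25 squared, i.e. 0.0625. Because each difference is squared, an output of 1.25 would give exactly the same number, which is the "how wrong, but not in which direction" limitation described above.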

NOTE: Part 3 of this series will be using a GA to train a Neural Network, which is Reinforcement learning. The GA simply does what a GA does, running all the normal GA phases to select weights for the Neural Network. There is no back propagation of values. The Neural Network is just good or just bad. As one can imagine, this process takes a lot more steps to get to the same result.

Supervised

In Supervised Learning the Neural Network is given more information.

Not just 'how wrong' it was, but 'in what direction it was wrong'. It is like 'Hunt the needle', but where you are told 'North a bit', 'West a bit'.

So you get, and use, far more information in Supervised Learning, and this is the normal form of Neural Network learning algorithm. Back Propagation (what this article uses) is Supervised Learning.

Learning Algorithm

In brief, to train a multi-layer Neural Network, the following steps are carried out:

1. Start off with random weights (and biases) in the Neural Network.

2. Try one or more members of the training set, and see how bad the output(s) are compared to what they should be (the target output(s)).

3. Jiggle the weights a bit, aiming to improve the outputs.

4. Now try with a new lot of the training set, or repeat again, jiggling the weights each time.

5. Keep repeating until you get quite accurate outputs.

This is what this article submission uses to solve the XOR problem. This is also called "Back Propagation" (normally called BP or BackProp).

Backprop allows you to use the error at the output to adjust the weights arriving at the output layer, but then also allows you to calculate the effective error 1 layer back, and use this to adjust the weights arriving there, and so on, back-propagating errors through any number of layers.
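The training steps and the layer-by-layer error flow can be put together in a compact sketch for a 2-2-1 network. This is not the demo project's trainer: `XorBackpropSketch` and its members are hypothetical names, the network size, weight layout (last row or entry is the bias weight) and per-sample updates are assumptions, and no momentum is used.

```csharp
using System;

public static class XorBackpropSketch
{
    static readonly double[][] Samples =
    {
        new double[] { 0.0, 0.0 }, new double[] { 0.0, 1.0 },
        new double[] { 1.0, 0.0 }, new double[] { 1.0, 1.0 }
    };
    static readonly double[] Targets = { 0.0, 1.0, 1.0, 0.0 };

    static double Sigmoid(double x) { return 1.0 / (1.0 + Math.Exp(-x)); }

    // Step 1: random starting weights in [-1, 1). w1 is [3, 2], w2 has 3 entries.
    public static void RandomInit(double[,] w1, double[] w2, Random rng)
    {
        for (int i = 0; i < 3; i++)
        {
            w2[i] = rng.NextDouble() * 2.0 - 1.0;
            for (int h = 0; h < 2; h++) w1[i, h] = rng.NextDouble() * 2.0 - 1.0;
        }
    }

    // Forward pass; 'hidden' receives two activations plus a constant 1.0 for the bias.
    static double Forward(double[] x, double[,] w1, double[] w2, double[] hidden)
    {
        for (int h = 0; h < 2; h++)
            hidden[h] = Sigmoid(x[0] * w1[0, h] + x[1] * w1[1, h] + w1[2, h]);
        hidden[2] = 1.0;
        return Sigmoid(hidden[0] * w2[0] + hidden[1] * w2[1] + hidden[2] * w2[2]);
    }

    // Sum of squared errors over the four XOR samples.
    public static double TotalError(double[,] w1, double[] w2)
    {
        double[] hidden = new double[3];
        double sum = 0.0;
        for (int s = 0; s < Samples.Length; s++)
        {
            double diff = Targets[s] - Forward(Samples[s], w1, w2, hidden);
            sum += diff * diff;
        }
        return sum;
    }

    // Steps 2 to 5: present each sample, back-propagate the error, nudge the weights, repeat.
    public static void Train(double[,] w1, double[] w2, int epochs, double learningRate)
    {
        double[] hidden = new double[3];
        double[] deltaHidden = new double[2];
        for (int e = 0; e < epochs; e++)
        {
            for (int s = 0; s < Samples.Length; s++)
            {
                double[] x = Samples[s];
                double output = Forward(x, w1, w2, hidden);

                // Error at the output, scaled by the sigmoid slope at that output.
                double deltaOut = output * (1.0 - output) * (Targets[s] - output);

                // Effective error one layer back, using the weights before they change.
                for (int h = 0; h < 2; h++)
                    deltaHidden[h] = hidden[h] * (1.0 - hidden[h]) * w2[h] * deltaOut;

                for (int h = 0; h < 3; h++)
                    w2[h] += learningRate * deltaOut * hidden[h];

                for (int h = 0; h < 2; h++)
                {
                    w1[0, h] += learningRate * deltaHidden[h] * x[0];
                    w1[1, h] += learningRate * deltaHidden[h] * x[1];
                    w1[2, h] += learningRate * deltaHidden[h]; // bias input is 1.0
                }
            }
        }
    }
}
```

A typical use would be to allocate `new double[3, 2]` and `new double[3]`, call `RandomInit`, then `Train` for a few thousand epochs and compare `TotalError` before and after. How far the error falls depends on the starting weights and the learning rate; a network this small can sit on a plateau for a long time.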

The trick is the use of a sigmoid as the non-linear transfer function (which was covered in Part 1). The sigmoid is used as it offers the ability to apply differentiation techniques.

Because this is nicely differentiable, it so happens that

$$f'(x) = f(x)\,(1 - f(x))$$

which in the context of the article can be written as

delta_outputs[i] = outputs[i] * (1.0 - outputs[i]) * (targets[i] - outputs[i])

It is by using this calculation that the weight changes can be applied back through the network.
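For readers who want to see where the outputs[i] * (1.0 - outputs[i]) factor comes from, the short derivation below differentiates the standard logistic sigmoid. It assumes the sigmoid from Part 1 is the usual 1 / (1 + e^{-x}) form.

```latex
\begin{align*}
f(x)  &= \frac{1}{1 + e^{-x}} \\
f'(x) &= \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
       = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
       = f(x)\,\bigl(1 - f(x)\bigr)
\end{align*}
```

Since a node's output is already f(x), the slope can be computed from the output alone, with no further call to the exponential. Multiplying that slope by the error (targets[i] - outputs[i]) gives the delta used for the output layer.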

Things To Watch Out For

Valleys: Using the rolled ball metaphor, there may well be valleys with steep sides and a gently sloping floor. Gradient descent tends to waste time swooshing up and down each side of the valley (think ball!).

So what can we do about this? Well, we add a momentum term that tends to cancel out the back and forth movements and emphasizes any consistent direction. The search will then go down such valleys with gentle bottom-slopes much more successfully (faster).
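One common way to write a momentum term is to add a fraction of the previous weight change to the current one. The sketch below illustrates that idea for a single weight; the names `MomentumSketch` and `WeightChanges` and the momentum value are assumptions for this example, and it is not taken from the demo project.

```csharp
using System;

public static class MomentumSketch
{
    // Turns a sequence of raw steps (for example delta * input for one weight)
    // into the weight changes actually applied when momentum is used:
    //   change[t] = learningRate * steps[t] + momentum * change[t - 1]
    public static double[] WeightChanges(double[] steps, double learningRate, double momentum)
    {
        double[] changes = new double[steps.Length];
        double previousChange = 0.0;
        for (int t = 0; t < steps.Length; t++)
        {
            changes[t] = learningRate * steps[t] + momentum * previousChange;
            previousChange = changes[t];
        }
        return changes;
    }
}
```

With a learning rate of 0.1 and a momentum of 0.9, two steps in the same direction (1.0 then 1.0) give changes of 0.1 and 0.19, so the consistent direction is emphasized. Two opposing steps (1.0 then -1.0) give 0.1 and -0.01, so the back and forth movement is largely cancelled.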

Starting The Training

This is probably best demonstrated with a code snippet from the article's actual code:

/// <summary>
/// The main training method. The expected target values are passed in to this
/// method as parameters, and the <see cref="NeuralNetwork">NeuralNetwork</see>
/// is then updated with small weight changes for this training iteration.
/// Momentum is what the article text relies on to avoid valleys; note that the
/// update lines in this listing apply only the learning-rate step.
/// If you don't know what valleys means, read the article's associated text.
/// </summary>
/// <param name="target">A double[] array containing the target value(s)</param>
private void train_network(double[] target)
{
    // Delta (error term) arrays for this pass. The extra hidden slot
    // (NumberOfHidden + 1) is presumably the bias node.
    double[] delta_hidden = new double[nn.NumberOfHidden + 1];
    double[] delta_outputs = new double[nn.NumberOfOutputs];

    // Get the delta value for the output layer:
    // sigmoid slope at the output times (target - output)
    for (int i = 0; i < nn.NumberOfOutputs; i++)
    {
        delta_outputs[i] =
            nn.Outputs[i] * (1.0 - nn.Outputs[i]) * (target[i] - nn.Outputs[i]);
    }

    // Get the delta value for the hidden layer:
    // output deltas propagated back through the hidden-to-output weights
    for (int i = 0; i < nn.NumberOfHidden + 1; i++)
    {
        double error = 0.0;
        for (int j = 0; j < nn.NumberOfOutputs; j++)
        {
            error += nn.HiddenToOutputWeights[i, j] * delta_outputs[j];
        }
        delta_hidden[i] = nn.Hidden[i] * (1.0 - nn.Hidden[i]) * error;
    }

    // Now update the weights between hidden & output layer
    for (int i = 0; i < nn.NumberOfOutputs; i++)
    {
        for (int j = 0; j < nn.NumberOfHidden + 1; j++)
        {
            // step scaled by the learning rate, in the direction that reduces the error
            nn.HiddenToOutputWeights[j, i] += nn.LearningRate * delta_outputs[i] * nn.Hidden[j];
        }
    }

    // Now update the weights between input & hidden layer
    for (int i = 0; i < nn.NumberOfHidden; i++)
    {
        for (int j = 0; j < nn.NumberOfInputs + 1; j++)
        {
            // step scaled by the learning rate, in the direction that reduces the error
            nn.InputToHiddenWeights[j, i] += nn.LearningRate * delta_hidden[i] * nn.Inputs[j];
        }
    }
}
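The hidden-layer delta and the two weight-update loops in train_network correspond to the equations below. The symbols are introduced here for illustration and do not appear in the article: h_j stands for nn.Hidden[j], x_i for nn.Inputs[i], w_{jk} for nn.HiddenToOutputWeights[j, k], v_{ij} for nn.InputToHiddenWeights[i, j], \eta for nn.LearningRate, and \delta^{\text{out}}_{k} for delta_outputs[k].

```latex
\begin{align*}
\delta^{\text{hid}}_{j} &= h_j\,(1 - h_j)\sum_{k} w_{jk}\,\delta^{\text{out}}_{k} \\
w_{jk} &\leftarrow w_{jk} + \eta\,\delta^{\text{out}}_{k}\,h_j \\
v_{ij} &\leftarrow v_{ij} + \eta\,\delta^{\text{hid}}_{j}\,x_i
\end{align*}
```

Each weight moves in proportion to the delta of the node it feeds into and the activation of the node it comes from, which is why the hidden deltas must be computed before the hidden-to-output weights are changed.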

So Finally The Code

Well, the code for this article looks like the following class diagram (it's Visual Studio 2005 C#, .NET v2.0).


The main classes that people should take the time to look at would be :

NN_Trainer_XOR : Trains a Neural Network to solve the XOR problem

TrainerEventArgs : Training event args, for use with a GUI

NeuralNetwork : A configurable Neural Network

NeuralNetworkEventArgs : Training event args, for use with a GUI

SigmoidActivationFunction : A static method to provide the sigmoid activation function

The rest are a GUI I constructed simply to show how it all fits together.

NOTE : the demo project contains all code, so I won't list it here.

Code Demos

The DEMO application attached has 3 main areas which are described below:

LIVE RESULTS Tab

It can be seen that this has very nearly solved the XOR problem (you will probably never get it 100% accurate).

TRAINING RESULTS Tab

Viewing the training phase target/outputs together

Viewing the training phase errors


TRAINED RESULTS Tab

Viewing the trained target/outputs together

Viewing the trained errors


It is also possible to view the Neural Network's final configuration using the "View Neural Network Config" button. If people are interested in what weights the Neural Network ended up with, this is the place to look.

What Do You Think ?

That's it. I would just like to ask, if you liked the article, please vote for it.

Points of Interest

I think AI is fairly interesting, that's why I am taking the time to publish these articles. So I hope someone else finds it interesting, and that it might help further someone's knowledge, as it has my own.

Anyone who wants to look further into AI-type stuff, and finds the content of this article a bit basic, should check out Andrew Kirillov's articles on CodeProject, as his are more advanced, and very good. In fact anything Andrew seems to do is very good.

History

v1.0 24/11/06


License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board.

About the Author

Sacha Barber

Software Developer (Senior)

United Kingdom

I currently hold the following qualifications (amongst others; I also studied Music Technology and Electronics, for my sins):

- MSc (Passed with distinctions), in Information Technology for E-Commerce

- BSc Hons (1st class) in Computer Science & Artificial Intelligence

Both of these at Sussex University, UK.

Award(s)

I am lucky enough to have won a few awards for Zany Crazy code articles over the years:

Microsoft C# MVP 2012

Codeproject MVP 2012

Microsoft C# MVP 2011

Codeproject MVP 2011

Microsoft C# MVP 2010

Codeproject MVP 2010

Microsoft C# MVP 2009

Codeproject MVP 2009

Microsoft C# MVP 2008

Codeproject MVP 2008

And numerous codeproject awards, which you can see over at my blog.

Last updated 30 Jan 2007. Article copyright 2006 by Sacha Barber.

Comments and Discussions

36 messages have been posted for this article. Visit http://www.codeproject.com/Articles/16508/AI-Neural-Network-for-beginners-Part-2-of-3 to post and view comments on this article.
