psychology, biology, and engineering, to name a few.
edge is known, then it can be beneficial to use it). In other words, they are
Of course, sufficient data is typically needed for effective solutions.
mechanism that specifies how to reason over the rule-base, (3) fuzzification
Stable Adaptive Control and Estimation for Nonlinear Systems:
Neural and Fuzzy Approximator Techniques.
Jeffrey T. Spooner, Manfredi Maggiore, Raúl Ordóñez, Kevin M. Passino
Copyright © 2002 John Wiley & Sons, Inc. ISBNs: 0-471-41546-4 (Hardback); 0-471-22113-9 (Electronic)
domain-specific knowledge in a manner similar to how an expert system is
easier to specify a good initial guess for the fuzzy system.
In this chapter we define some basic fuzzy systems and neural networks,
The brain is made up of a huge number of different types of neurons in-
an input region, which contains numerous small branches called dendrites
tial, which lasts only a millisecond or two, is triggered and an impulse is sent along the axon, away from the stimulus that
triggered the action.
In our model of an artificial neuron, as is typical, we will preserve the
dendrites
previous neurons are called recurrent neural networks, whereas networks
in which information is allowed to proceed in a single direction are called feedforward neural networks. Examples of recurrent and feedforward neural networks are shown in Figure 3.2, where each circle represents a neuron, and the lines represent the transmission of signals along axons. To simplify analysis, we will focus on the use of feedforward neural networks for the estimation and control schemes in the chapters to follow.
The input vector to the neuron is x = [x_1, ..., x_n]^T, where x_i, 1 ≤ i ≤ n,
voltage caused by an electrochemical reaction, in this artificial framework,
x_1 may be a variable representing, for example, the value for a temperature
past knowledge and may be allowed to change over time based upon new neural stimuli. Using both the weights and inputs, a mapping is performed
to quantify the relation between w and x. This relationship, for example, may describe the "collinearity" of w and x (discussed below). We will denote the input mapping by
s = w ⊛ x
A number of input mappings have been used in the neural network literature, the most popular being the inner product and Euclidean input mappings. Another useful but not as widely used input mapping is the weighted average.
Inner Product: The inner product input mapping (also commonly referred
to as the dot product) may be considered to be a measure of the similarity between the orientation of the vectors w and x, defined by
s = w ⊛ x = w^T x    (3.1)
where θ is the angle between w and x. Thus, if w and x are orthogonal, then s(x) = 0. Similarly, the inner product increases as x and w become collinear (the angle θ decreases, and at θ = 0, x and w are collinear).
We will find that for each of the techniques in this book, it is required that we know how a change in the weights will affect the output of the neural network. To determine this, the gradient of each neuron output with respect to the weights will be calculated. For the standard inner product input mapping, we find that
∂s/∂w = x^T    (3.2)
Notice that the gradient for the inner product input mapping with respect
to the weights is simply the value of the inputs themselves
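To make this concrete, the inner product mapping (3.1) and its gradient may be checked numerically. The following sketch uses NumPy; the variable names are ours, not the book's notation.

```python
import numpy as np

# Inner product input mapping, s = w^T x.
def inner_product(w, x):
    return float(np.dot(w, x))

w = np.array([1.0, -2.0, 0.5])
x = np.array([2.0, 1.0, 4.0])
s = inner_product(w, x)        # 1*2 + (-2)*1 + 0.5*4 = 2.0
grad = x                       # the gradient ds/dw is just the input x

# Orthogonal w and x give s = 0.
s_orth = inner_product(np.array([0.0, 1.0]), np.array([1.0, 0.0]))
```

Perturbing any w_i by a small amount changes s by (approximately) x_i times that amount, which is exactly what the gradient states.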
Weighted Average: Another input mapping related to the inner product is the weighted average, defined by
s = w ⊛ x = w^T x / Σ_{i=1}^{n} x_i    (3.3)
Geometrically speaking, the weighted average again determines to what degree the two vectors are collinear. To ensure that the input mapping is well defined for all x (even x = 0), we may choose to rewrite (3.3) as
This mapping will result in s ≥ 0 since it is the Euclidean norm of the difference between x and w.
∂s/∂w = -(x - w)^T / ||x - w||
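The Euclidean input mapping and its gradient may also be verified numerically. The sketch below assumes the mapping s = ||x - w|| (our reading of the text); names are illustrative.

```python
import numpy as np

# Euclidean input mapping: s = ||x - w||, which is always >= 0.
def euclidean(w, x):
    return float(np.linalg.norm(x - w))

# Its gradient with respect to w, valid when x != w:
#   ds/dw = -(x - w)^T / ||x - w||
def euclidean_grad(w, x):
    d = x - w
    return -d / np.linalg.norm(d)

w = np.array([1.0, 2.0])
x = np.array([4.0, 6.0])
s = euclidean(w, x)            # sqrt(3^2 + 4^2) = 5.0
g = euclidean_grad(w, x)       # [-0.6, -0.8]
```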
ron is often also provided a constant bias as an input, so that now x' = [1, x_1, ..., x_n]^T with x ∈ R^n. We have chosen the bias input to be 1 since the corresponding weight may be adjusted to scale
this value. That is, the weight vector is now changed to
w = [w_0, w_1, ..., w_n]^T
with w_0 corresponding to the bias input. The same neuron input mappings as described above apply to the case where a bias term is used.
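The bias augmentation amounts to prepending a constant 1 to the input vector, after which the inner product mapping applies unchanged. A minimal sketch:

```python
import numpy as np

# Augment the input with a constant bias of 1; the extra weight w0 then
# acts as an adjustable offset: s = w0 + w1*x1 + ... + wn*xn.
def with_bias(x):
    return np.concatenate(([1.0], x))

x = np.array([3.0, -1.0])
w = np.array([0.5, 2.0, 1.0])          # [w0, w1, w2]
s = float(np.dot(w, with_bias(x)))     # 0.5 + 2*3 + 1*(-1) = 5.5
```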
According to Figure 3.3, after the input mapping, the neuron produces an output using an activation function. This activation function transforms the value produced by the input mapping into a value which is suitable for another neuron, or possibly a value which may be understood by an external system (e.g., as an input to an actuator).
Definition 3.2: A function ψ : R → R is said to be a squashing func-
function, on the other hand, ψ(s) ∈ L_∞ does not hold for all s.
With ψ(·) continuously differentiable, the change in the neuron output with respect to the neuron weights may be obtained from the chain rule as ∂ψ/∂w = (∂ψ/∂s)(∂s/∂w).
This formula will become useful when determining how to adjust the neuron weights so that the output of the squashing function changes in some desired manner. A few of the more commonly used activation functions will now
be defined
original activation functions studied by McCulloch and Pitts. It is defined
by

ψ(s) = 1 if s > 0, and ψ(s) = 0 otherwise.
Although we will not use this activation function in our adaptive schemes (because its derivative is not well defined), it will prove valuable in the function approximation proofs in Chapter 5.
input to output, defined by ψ(s) = s.
This is a monotonic, unbounded activation function. The gradient of the linear activation function is simply ∂ψ(s)/∂s = 1. Linear
activation functions are often used to generate the outputs of multilayer neural networks.
rated linear activation function, defined as
ψ(s) = tanh(s) = (e^s - e^(-s)) / (e^s + e^(-s))    (3.13)
This is a monotonic, bipolar activation function whose derivative is
∂ψ(s)/∂s = 1 - ψ^2(s)
sigmoid. In general, a sigmoid is any such "S-shaped" function, and is often specified as the logistic function

ψ(s) = 1 / (1 + e^(-s)).    (3.14)

Note that the gradient, ∂ψ(s)/∂s = ψ(s)(1 - ψ(s)), is defined in terms of ψ(·) itself.
tions is the radial basis function, commonly defined by
ψ(s) = exp(-s^2/γ^2),    (3.15)
have been used in a number of applications owing to their mathematical properties. Their function approximation properties will be discussed in Chapter 5.
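The activation functions discussed above may be collected in a few lines of code. The sketch below is illustrative; the logistic form is assumed for the sigmoid, and γ = 1 is an arbitrary choice for the radial basis function (3.15).

```python
import numpy as np

def threshold(s):
    return 1.0 if s > 0 else 0.0       # derivative not well defined at s = 0

def linear(s):
    return s                            # gradient is 1 everywhere

def tanh_act(s):
    return float(np.tanh(s))            # derivative: 1 - psi(s)^2

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))     # derivative: psi(s) * (1 - psi(s))

def rbf(s, gamma=1.0):
    return float(np.exp(-s**2 / gamma**2))   # equation (3.15)
```

Note that threshold, tanh, sigmoid, and rbf are all bounded (squashing in the sense of Definition 3.2), while linear is not.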
used. They tend to be limited only by one's imagination and practical implementations. Even though we can define and use rather sophisticated activation functions, the ones presented above tend to be sufficient. We will discuss this point in more detail in Chapter 5 during a discussion on universal approximation capabilities of neural networks. To give you an idea of other possibilities for activation functions, consider
ψ(s) = 1 - s    (3.16)
inner product input mapping and with x = [1, 2]^T. What set of weights, w = [w_1, w_2]^T, will cause ψ(s) = 0? This is equivalent to finding w_1 and w_2 such that s = w_1 + 2w_2 = 0. Any point along the line w_1 = -2w_2 will satisfy this. Notice that with two adjustable weights, an infinite number of choices for w_1 and w_2 exist which cause
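The solution line in this example is easy to verify numerically; the following sketch checks a few points on w_1 = -2w_2.

```python
import numpy as np

# With x = [1, 2]^T, every w on the line w1 = -2*w2 gives
# s = w^T x = w1 + 2*w2 = 0.
x = np.array([1.0, 2.0])
for w2 in (-1.0, 0.5, 3.0):
    w = np.array([-2.0 * w2, w2])
    s = float(np.dot(w, x))
    assert s == 0.0
```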
tilayer feedforward neural network. It is a collection of artificial neurons
in which the output of one neuron is passed to the input of another. The neurons (or nodes) are typically arranged in collections called layers, such that the outputs of all the nodes in a given layer are passed to the inputs of the nodes in the next layer.
An input layer is the input vector x, while the output layer is the connection between the neural network and the rest of the world. A hidden layer is a collection of nodes which lie between the input and output layers,
as shown in Figure 3.5. Without a hidden layer, the MLP reduces to a collection of n neurons operating in parallel.
[Figure 3.5: A multilayer perceptron, with the hidden layer and output layer indicated.]
In a fully connected MLP, each neuron output is fed to each neuron input within the next layer. If we let θ be a vector of all the adjustable parameters,
consider the following example:
nodes with activation functions defined by ψ_j(s) for the jth hidden node, and a single output node defined using a linear activation function. The input-output mapping for this MLP is defined by
included with each node. There are q neurons in the hidden layer, n
Within a multilayer perceptron, if there are many layers and many nodes
in each layer, there will be a large number of adjustable parameters (e.g., weights and biases). MLPs with several hundred or thousands of adjustable weights are common in complex real-world applications.
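The single-hidden-layer structure described in the example above can be sketched as a forward pass. The weight names and shapes below are illustrative, not the book's notation; tanh is one choice of squashing function for the hidden nodes.

```python
import numpy as np

# One-hidden-layer MLP: q hidden nodes with tanh activations and bias
# terms, and a single linear output node.
def mlp(x, W1, b1, w2, b2):
    z = np.tanh(W1 @ x + b1)     # hidden-layer outputs (q values)
    return float(w2 @ z + b2)    # linear output node

rng = np.random.default_rng(0)
n, q = 3, 5                      # n inputs, q hidden neurons
W1 = rng.standard_normal((q, n))
b1 = rng.standard_normal(q)
w2 = rng.standard_normal(q)
b2 = 0.0
y = mlp(rng.standard_normal(n), W1, b1, w2, b2)
# Because tanh is bounded by 1, |y| <= sum(|w2|) + |b2|.
```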
of radial basis activation functions with an associated Euclidean input mapping (but there are many ways to define this class of neural networks). The output is then taken as a linear activation function with an inner product
or weighted average input mapping. An RBNN with two inputs and 4 nodes
is shown in Figure 3.6.
The input-output relationship in an RBNN with x = [x_1, ..., x_n]^T as an input is given by
y(x, θ) = Σ_{i=1}^{m} w_i exp(-||x - c_i||^2 / γ^2)    (3.19)
the output node. Typically, the values of the vectors c_i, i = 1, ..., m, and the scalar γ are held fixed, while the values of θ are adjusted so that the mapping produced by the RBNN matches some desired mapping. Because the adjustable weights appear linearly, we may express (3.19) as
y(x, θ) = θ^T ζ(x),    (3.20)
Figure 3.6 Radial basis neural network with 4 nodes.
If the weighted average input mapping is used in the output node, the RBNN becomes

y(x, θ) = Σ_{i=1}^{m} w_i ζ_i(x) / Σ_{i=1}^{m} ζ_i(x),    (3.21)

where ζ_i(x) = exp(-||x - c_i||^2 / γ^2).
The Gaussian form of the activation functions lets one view the RBNN
as a weighted average when using (3.21), where the value of w_i is weighted more heavily when x is close to c_i. Thus the input space is broken into overlapping regions with centers corresponding to the c_i's, as shown in Figure 3.6.
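A short sketch of the RBNN forward pass may help here. It assumes Gaussian basis functions centered at the c_i (our reading of the equations above); with normalization enabled, the output is a convex combination of the weights w_i, which is the weighted-average interpretation.

```python
import numpy as np

# RBNN output: Gaussian basis functions centered at c_i;
# normalized=True gives the weighted-average form.
def rbnn(x, centers, w, gamma=1.0, normalized=False):
    zeta = np.array([np.exp(-np.linalg.norm(x - c)**2 / gamma**2)
                     for c in centers])
    if normalized:
        zeta = zeta / zeta.sum()        # weights sum to 1
    return float(w @ zeta)

centers = [np.array([0.0, 0.0]), np.array([1.0, 1.0]),
           np.array([0.0, 1.0]), np.array([1.0, 0.0])]   # 4 nodes, 2 inputs
w = np.array([1.0, 2.0, 3.0, 4.0])
y = rbnn(np.array([0.1, 0.0]), centers, w, normalized=True)
```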
If past input values are also processed by a neural network we obtain what
using a single input and associated delayed values is shown in Figure 3.7 (here 1/z denotes a delay of T).
Here, u is the input and we can, for instance, let x = [u(k), ..., u(k - n)]^T
so that the output of the neural network is y = y(x, θ) (θ is a vector holding
the parameters of the neural network). In estimation and control applications it is common to let the input to the neural network be a sequence of past inputs and outputs of the process.
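Forming the network input from delayed values amounts to building a regressor from a stored sequence. A minimal sketch (the helper name is ours):

```python
# Build the regressor x = [u(k), ..., u(k - n)]^T from a stored input
# sequence, as when delayed values of u feed the network (Figure 3.7).
def regressor(u_history, n):
    # u_history[-1] is the most recent sample u(k); return the n + 1
    # newest samples ordered newest-first.
    return list(reversed(u_history[-(n + 1):]))

u = [0.1, 0.2, 0.3, 0.4, 0.5]    # u(0), ..., u(4)
x = regressor(u, 2)               # [u(4), u(3), u(2)]
```

In practice, past process outputs would be stacked into the same regressor alongside past inputs.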
“if the current speed is 50 miles per hour and the desired speed is 55 miles per hour then press down on the accelerator a bit more.” Other rules may incorporate information about the rate at which the speed is approaching,
or departing from, the desired speed. The fuzzy controller uses fuzzy sets and fuzzy logic to implement a set of rules about how to control the vehicle speed. During operation, it determines which control rules apply to the current situation, and applies these in an analogous way to how a human would if he or she were physically controlling the system. In this way, it is said that the fuzzy controller emulates the human cognitive decision-making process (or, in other words, it conducts "inference").
A fuzzy system is shown in Figure 3.8. Here, we show the rule-base that holds the set of rules about, for example, how to control a process. Also,
we see explicit inclusion of the "inference mechanism," which is the part
of the fuzzy system that decides which rules should be used, and applies them. Not shown here, but discussed below, are the processes that transform information into a form that can be used by the inference mechanism ("fuzzification") and that transform the conclusions of the inference mechanism into
a form that can be used in practical applications ("defuzzification").
A multiple-input single-output (MISO) fuzzy system is a nonlinear mapping
To define a MIMO fuzzy system with m outputs, simply define m MISO fuzzy
Here, F_b^a is the ath linguistic value associated with the linguistic variable x̃_b that describes input x_b. Similarly, G^a is the ath linguistic value associated with the linguistic variable ỹ that describes the output y. Linguistic variables are simply word descriptions of, for example, numeric variables (e.g.,
“speed” might be the linguistic variable for the velocity of the vehicle that
we denote with v(t)). Linguistic variables change over time and hence take
on specific linguistic values (typically adjectives). For instance, "speed" is
"small" or "speed" is "large." The "linguistic rules" listed above are those gathered from a human expert.
tween the desired speed v_d(t) and the actual (sensed) speed v(t) (i.e.,
controller input variable x_1(t) could be "speed-error," so that the lin-
linguistic values for the speed-error linguistic variable and that these
the fuzzy controller is the change in throttle angle that we denote as
has linguistic values "increase," "stay-the-same," and "decrease."
The three rules in the rule-base of the form listed above would be
Rule R_1 says that if the vehicle is traveling at a speed less than the desired speed, then increase the amount of throttle (i.e., make y(t) positive). Rule R_2 says that if the actual speed is
greater than the desired speed, then decrease the amount of throttle (i.e., make y(t) negative). Rule R_3 says that if the actual speed is close to the desired speed, then do not move the throttle angle (i.e.,
To apply the knowledge represented in the linguistic rules, we further quantify the meaning of the rules using fuzzy sets and fuzzy logic. In particular, using fuzzy set theory, the rule-base is expressed as a set of
where F_b^a and G^a are fuzzy sets defined by
a particular linguistic statement. For example, μ_{F_b^a} quantifies how well the linguistic variable x̃_b, that represents x_b, is described by the linguistic value F_b^a. There are many ways to define membership functions [170]. For instance, Table 3.1 specifies triangular membership functions with center
c and width w, and it specifies Gaussian membership functions with center
c and width σ. It is good practice to sketch these six functions, labeling all aspects of the plots.
quantify the meaning of each of the rule premises and consequents
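The triangular and Gaussian membership functions just described can be sketched in a few lines. The parameterization below is one common choice (peak 1 at the center c, with the stated widths); Table 3.1's exact formulas may differ slightly.

```python
import math

# Triangular membership function with center c and base width w.
def triangular(x, c, w):
    return max(0.0, 1.0 - abs(x - c) / (w / 2.0))

# Gaussian membership function with center c and width sigma.
def gaussian(x, c, sigma):
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)

mu_center = triangular(50.0, c=50.0, w=20.0)   # membership 1 at the center
mu_edge = triangular(60.0, c=50.0, w=20.0)     # membership 0 at the edge
mu_g = gaussian(55.0, c=50.0, sigma=5.0)
```

In the cruise-control example, triangular(v_error, c, w) would quantify how well "speed-error" is described by a value such as "small."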