
Therefore, the prospective dialogue strategy taken by player1 is computed by taking into account the following probabilistic noise:

\hat{b}_i = b_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)    (15)

where \hat{b}_i can be obtained by shifting the original pay-off matrix in Eq. (6). We suggest that this additive noise on the pay-off may play a crucial role in the stability of dialogue strategies, and in particular prevents having to use dead reckoning. It is also reminiscent of a regularization effect in machine learning theory. In practice, regularization has been applied to a wide range of sensory technologies. In our case, the dialogue strategy incorporating Eq. (15) is capable of real-world competence. This may be true for intelligent sensory technology, for instance that proposed by (Koshizen, 2002). Such a technology learns cross-correlations among different sensors, selecting the sensors that best minimize a robot's predictive localization error by modeling the uncertainty (Koshizen, 2001). That is, probabilistic models computed from sonar and infrared sensors were employed to estimate each robot's location.
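A minimal sketch of the noise perturbation in Eq. (15), assuming a small normalized pay-off matrix and a hand-picked noise scale sigma (both are illustrative assumptions, not values taken from the text):

```python
import numpy as np

def perturb_payoff(payoff, sigma=0.05, rng=None):
    """Shift each pay-off entry by zero-mean Gaussian noise, as in Eq. (15)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=sigma, size=np.shape(payoff))
    return np.asarray(payoff) + noise

# Illustrative normalized 2x2 pay-off matrix (placeholder values).
b_true = np.array([[1.0, 0.0],
                   [0.6, 0.2]])
b_hat = perturb_payoff(b_true, sigma=0.05)
print(b_hat)
```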

Figures 8.6–8.7 describe the computational aspect resulting from several simulations of the proposed dialogue strategy; 'type 2' indicates that players choose their dialogue actions statistically, subject to the approximation of the true pay-off matrix.

Altogether, the players interacted 5000 times. The interactions were made up of 100 sets, and each set consisted of 50 steps. The initial value of the possible numbers of the pay-off matrix was 1000 points. All components of the pay-off matrix were normalized. The plotted points represent the dialogue actions taken by player1 during the interactions. The rule of the interactions was assumed to follow the context of the modified IPD game. As a result, the actual pay-off matrix of player2 was cooperative, so the pay-off matrix was approximated by inhibiting anticooperative actions during the interactions.
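A minimal sketch of the interaction schedule described above (100 sets of 50 steps); the step routine `play_one_step` is a hypothetical placeholder, since the text does not specify the per-step dialogue logic:

```python
import numpy as np

N_SETS, STEPS_PER_SET = 100, 50      # 100 sets x 50 steps = 5000 interactions

def play_one_step(rng):
    """Hypothetical placeholder for one dialogue interaction of the
    modified IPD game; here it merely samples an action for player1."""
    return rng.choice(["cooperate", "defect"], p=[0.8, 0.2])

rng = np.random.default_rng(0)
actions = [play_one_step(rng) for _ in range(N_SETS * STEPS_PER_SET)]
print(len(actions))                  # 5000 interactions altogether
```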

Figures 8.8–8.9 illustrate the Total Squared Error (TSE), which corresponds to the (squared) Euclidean distance between a true pay-off matrix and an approximated pay-off matrix. The TSE was calculated during the IPD game between the two players. In our computation, the TSE is given by the following equation:

TSE = \sum_{i=1}^{4} (b_i - \hat{b}_i)^2    (16)
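A direct implementation of Eq. (16), with purely illustrative pay-off values:

```python
import numpy as np

def total_squared_error(b_true, b_hat):
    """Eq. (16): sum of squared differences over the four pay-off entries."""
    return float(np.sum((np.asarray(b_true) - np.asarray(b_hat)) ** 2))

b_true = [1.0, 0.0, 0.6, 0.2]      # true entries b_1..b_4 (placeholders)
b_hat = [0.9, 0.1, 0.55, 0.25]     # approximated entries
print(total_squared_error(b_true, b_hat))  # 0.025
```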


Furthermore, let us penalize the TSE shown in Eq. (16). That is,

TSE^{*} = \sum_{i=1}^{4} (b_i - \hat{b}_i)^2 + \Omega(f)    (17)

where \Omega(f) denotes the smoothness function, normally called the regularization term or regularizer (Tikhonov, 1963). In machine learning theory, the regularizer \Omega(f) is known to represent a complexity term. It can also express the smoothness of the approximated function f given x. The regularizer has been applied to real-world applications (Vauhkonen, 1998)(Koshizen and Rosseel, 2001).
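A sketch of the penalized criterion in Eq. (17). The concrete form of \Omega(f) is not given here, so a simple weighted squared-norm penalty on the parameters of the approximated pay-off function is assumed purely for illustration:

```python
import numpy as np

def omega(f_params, lam=0.1):
    """Assumed smoothness/complexity term Omega(f): a weighted squared norm."""
    return lam * float(np.sum(np.asarray(f_params) ** 2))

def penalized_tse(b_true, b_hat, f_params, lam=0.1):
    """Eq. (17): the TSE of Eq. (16) plus the regularization term Omega(f)."""
    tse = float(np.sum((np.asarray(b_true) - np.asarray(b_hat)) ** 2))
    return tse + omega(f_params, lam)

# Illustrative values only; here f is parameterized by the approximated entries.
b_true = [1.0, 0.0, 0.6, 0.2]
b_hat = [0.9, 0.1, 0.55, 0.25]
print(penalized_tse(b_true, b_hat, f_params=b_hat))  # 0.025 + 0.1185 = 0.1435
```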

A crucial difference when comparing Fig. 8.8 with Fig. 8.9 is the size of the variance before the dialogue strategies (type 1 and/or type 2) are undertaken.

Furthermore, the second term of Eq. (17) corresponds to the smoothness of (probabilistic) generative models, which are obtained by a learning scheme. The models can be used for selective purposes in order to acquire the unique model that best fits the 'true' density function. Therefore, the result of the learning scheme can be further minimized. Generally, this process is called model selection. Theoretically, results obtained by Eq. (17) are closely related to Eq. (16) (e.g., Hamza, 2002). In our case, TSE* is not calculated with the regularizer \Omega(f) explicitly, though it is implicitly brought about by the actual calculation according to Eq. (17).
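As a hedged illustration of this model-selection reading (no concrete procedure is given in the text), one could score several candidate generative models of the pay-off function by a fit-plus-complexity criterion in the spirit of Eq. (17) and keep the lowest-scoring one; the candidate set, complexity measures, and weight below are assumptions:

```python
import numpy as np

def penalized_score(b_true, b_hat, complexity, lam=0.01):
    """Data-fit term of Eq. (16) plus a complexity penalty, as in Eq. (17)."""
    fit = float(np.sum((np.asarray(b_true) - np.asarray(b_hat)) ** 2))
    return fit + lam * complexity

b_true = [1.0, 0.0, 0.6, 0.2]
# Hypothetical candidates: (approximated pay-off entries, complexity measure).
candidates = {
    "coarse": ([0.8, 0.2, 0.5, 0.3], 1.0),
    "medium": ([0.95, 0.05, 0.6, 0.2], 2.0),
    "fine":   ([1.0, 0.0, 0.6, 0.2], 4.0),
}
best = min(candidates, key=lambda k: penalized_score(b_true, *candidates[k]))
print(best)  # "medium": best fit/complexity trade-off under these assumptions
```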

We attempt to enhance our proposed computation by taking into account Eq. (17). That is, the dialogue strategy must be able to reduce the uncertainty of the other's pay-offs. In practice, players inquire about pay-offs explicitly. The inquiry reducing the uncertainty of player2's pay-off corresponds to \Omega(f), which takes into account the past experience of such inquiries. Additionally, \Omega may provide self-control of the interactive dialogue with respect to the inquiries. In fact, many inquiries would sometimes be regarded as troublesome by others; therefore, self-control is needed.

During the dialogue interaction, the inquiry by each player is performed as part of our dialogue strategy, in order to reduce the uncertainty about the true pay-off of player2. The uncertainty is also modeled by probability density functions. Thus, we expect that our proposed computation in Eq. (17) is capable of providing a better minimization than the TSE in Eq. (16).
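The form of these density functions is not specified here; as a purely illustrative sketch, the belief about a single pay-off entry of player2 can be modeled as a Gaussian whose variance shrinks as inquiries accumulate (a standard Bayesian/Kalman-style update, assumed rather than taken from the text):

```python
import numpy as np

def update_payoff_belief(mu, var, observation, obs_var=0.05):
    """One Gaussian belief update about another player's pay-off entry:
    each inquiry (noisy observation) reduces the variance, i.e. the uncertainty."""
    gain = var / (var + obs_var)
    mu_new = mu + gain * (observation - mu)
    var_new = (1.0 - gain) * var
    return mu_new, var_new

mu, var = 0.5, 1.0                      # broad initial belief about the pay-off
rng = np.random.default_rng(0)
for _ in range(10):                     # ten inquiries; the true value is 0.8
    obs = 0.8 + rng.normal(0.0, 0.05)
    mu, var = update_payoff_belief(mu, var, obs)
print(round(mu, 3), round(var, 5))      # mean near 0.8, variance much smaller
```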

Figures 8.4 to 8.9 present several computational results, which were obtained by the original model and the improved model. The biggest difference between the original and the improved models was that the approximated pay-off function f involved a probabilistic property. It certainly affects the dialogue strategy, which is capable of making generative models smooth by reducing the uncertainty. In order to be effective, a long-lasting interaction between the two players must be ensured, as described before. The probabilistic property can cope with fluctuations of the pay-off in others. This can often resolve the problem that there is no longer a unique best strategy, as in the IPD game. The initial variances created by each action are relatively large (Figs. 8.4 and 8.6), whereas in Figs. 8.5 and 8.7 they are smaller. In these figures, (+) denotes a true pay-off value and (•) denotes an approximated value calculated by a pay-off function. Since Figs. 8.6 and 8.7 were obtained from the computational results of the probabilistic pay-off function, the approximated values could come close to the true values. The inverse can also be seen in Figs. 8.4 and 8.5. Additionally, Figs. 8.8 and 8.9 illustrate the TSE for each case represented in Figs. 8.4, 8.5 and Figs. 8.6, 8.7. The final TSE is 0.650999 (Fig. 8.8: left), 0.011754 (Fig. 8.8: right), 0.0000161 (Fig. 8.9: left) and 0.000141 (Fig. 8.9: right), respectively. From all the results, one can see that the computation involving a probabilistic pay-off function showed better performance with respect to TSE because it avoided the dead-reckoning problem across the pay-off in others.

Fig. 8.10. Analogous correspondences between pattern regression and user modeling:

- Pattern Classification <-> User Classification
- Discriminant Function <-> Pay-off Function
- Mean Squared Error for Discriminant Function Approximation (Standard Expectation Maximization) <-> Total Squared Error for Pay-off Function Approximation (Mutual Expectation Maximization)
- Regularization to parameterize Degree of Generalization <-> Regularization to parameterize Degree of Satisfaction

Figure 8.10 shows analogous correspondences between pattern regression and user modeling. From the figure, we can clearly see many structural similarities between corresponding elements, such as classification, functional approximation, and regularization. We can also see cross-correlations between pattern regression and user modeling.

8.6 Conclusion and Future Work

In this paper, we theoretically presented a computational method for user modeling (UM) that can be used for estimating the pay-offs of a user. Our proposed system allows a pay-off matrix in others to be approximated based on inductive game theory. This means that behavioral examples in others need to be employed in the pay-off approximation. Inductive game theory involves social cooperative issues, which take into account a dialogue strategy in terms of maximizing the pay-off function. We noted that the dialogue strategy had to bring into play long-lasting interactions with each other, so that the approximated pay-off matrix could be used for estimating pay-offs in others. This forms a substructure for inducing social cooperative behavior, which leads to the maximum reciprocal expectation thereof.

In our work, we provided a computational model of the social cooperation mechanism, using inductive game theory. In the theory, predictive dialogue strategies were assumed to be implemented based on behavioral decisions taken by others. Additionally, induction is taken as a general principle for the cognitive process of each individual.

The first simulation was carried out using the IPD game to minimize a total squared error (TSE), which was calculated from both a true and an approximated pay-off matrix. It is noted that minimizing the TSE can essentially be identical to maximizing the expectation of a pay-off matrix. In the simulation, inquiring about pay-offs in others was considered a computational aspect of the dialogue strategy. Then, a second simulation, in which a pay-off matrix can be approximated by a probability distribution function (PDF), was undertaken. Since we assumed that the pay-off matrix could fluctuate over time, a probabilistic form of the pay-off matrix would be suitable to deal with the uncertainty. Consequently, the result obtained by the second simulation (Section 8.5.2) provided better performance because it escaped from the dead-reckoning problem of a pay-off matrix. Moreover, we also pointed out the significance of how the probabilistic pay-off function could cope with real-world competence when behavioral analysis is used to model pay-offs in others. In principle, the behavioral analysis can be calculated by sensory technology based on vision and audition. Additionally, it would make no sense for the pay-off matrix to change in daily communication. This means that sensing technology has to come up with a way to reduce the uncertainty of someone's pay-offs. Consequently, this could lead to approximating the pay-off distribution function accurately.

Furthermore, in the second simulation, we pointed out that the proposed dialogue strategy could play a role in refining the estimated pay-off function. This is reminiscent of the model selection problem in machine learning theory. In the theory, (probabilistic) generative models are selectively eliminated in terms of generalization performance. Our dialogue strategy brings into play the on-line maintenance of user models. That is, the dialogue strategy leads to a long-lasting interaction, which allows user models to be selected in terms of approximating a true pay-off density function. More specifically, the dialogue strategy would allow inquiry to reduce the uncertainty of pay-offs in others. The timing and content quality of inquiries to others should also be noted as being part of a human-like dialogue strategy involving cognitive capabilities. Our study has shown that inductive game theory effectively provides a theoretical motivation to elicit the proposed dialogue strategy, which is feasible with maximum mutual expectation and uncertainty reduction. Nevertheless, substantial studies will still be required to establish our algorithm in connection with inductive game theory.

Another future extension of this work could be to apply our proposed computation to humanoid robot applications, allowing humanoid robots to carry out reciprocal interactions with humans. Our computation of UM suggests that users resist the temptation to defect for short-term gain and instead persist in mutual cooperation between robots and humans. A long-lasting interaction will thus require estimations of the other's pay-offs. Importantly, the long-lasting interaction could also be used to evaluate how much the robots gain the satisfaction of humans. We are convinced that this could be one of the most faithful aspects, particularly when humanoid robots are considered for man-machine interaction. Consequently, our work provides a new scheme of man-machine interaction, which is computed by maximizing a mutual expectation of pay-off functions in others.

References

1. Newell, A., Unified Theories of Cognition, Cambridge, MA: Harvard University Press, 1983.
2. Newell, A., Unified Theories of Cognition, Cambridge, MA: Harvard University Press, 1983.
3. Fischer, G., User Modeling in Human-Computer Interaction. User Modeling and User-Adapted Interaction, 11:65-86, 2001.
4. Basu, C., Hirsh, H. and Cohen, W., Recommendation as Classification: Using Social and Content-Based Information in Recommendation. In: AAAI-98, Proceedings of the Fifteenth National Conference on Artificial Intelligence, Madison, Wisconsin, 714-720, 1998.
5. Gervasio, M., Iba, W. and Langley, P., Learning to Predict User Operations for Adaptive Scheduling. In: AAAI-98, Proceedings of the Fifteenth National Conference on Artificial Intelligence, Madison, Wisconsin, 721-726, 1998.
6. Nash, J.F., Non-Cooperative Games. Annals of Mathematics, 54:286-295, 1951.
7. Kaneko, M. and Matsui, A., Inductive Game Theory: Discrimination and Prejudices. Journal of Public Economic Theory, Blackwell Publishers, Inc., 1(1):101-137, 1999.
8. Axelrod, R. and Hamilton, W.D., The Evolution of Cooperation. Science, 211, 1390-1396, 1981.
9. Axelrod, R.M., The Evolution of Cooperation, New York: Basic Books, 1984.
10. Boyd, R., Is the Repeated Prisoner's Dilemma a Good Model of Reciprocal Altruism? Ethology and Sociobiology, Vol. 9, 211-222, 1988.
11. Nesse, R.M., Evolutionary Explanations of Emotions. Human Nature, 1, 261-289, 1990.
12. Trivers, R., Social Evolution, Menlo Park, CA: Cummings, 1985.
13. Webb, G.I., Pazzani, M.J. and Billsus, D., Machine Learning for User Modeling. User Modeling and User-Adapted Interaction, 11(1-2), 19-29, 2001.
14. Valiant, L.G., A Theory of the Learnable. Communications of the ACM, 27, 1134-1142, 1984.
15. Debreu, G., Continuity Properties of Paretian Utility. International Economic Review, Vol. 5, pp. 285-293, 1964.
16. Baker, F. and Rachlin, H., Probability of Reciprocation in Repeated Prisoner's Dilemma Games. Journal of Behavioral Decision Making, 14, 51-67, John Wiley & Sons, Ltd.
17. Andre, E., Rist, T. and Mueller, J., Integrating Reactive and Scripted Behaviors in a Life-Like Presentation Agent. Proceedings of the International Conference on Autonomous Agents, 261-268.
18. Noma, T., Zhao, L. and Badler, N.I., Design of a Virtual Human Presenter. IEEE Computer Graphics and Applications, 20(4), 79-85, 2000.
19. Legerstee, M., Barna, J. and DiAdamo, C., Precursors to the Development of Intention at 6 Months: Understanding People and Their Actions. Developmental Psychology, 36(5), 627-634.
20. Lieberman, H., Letizia: An Agent that Assists Web Browsing. In: IJCAI-95, Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, Montreal, Canada, 924-929, 1995.
21. Dempster, A.P., Laird, N.M. and Rubin, D.B., Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38, 1977.
22. Koshizen, T., Improved Sensor Selection Technique by Integrating. Journal of Intelligent and Robotic Systems, 2001.
23. Koshizen, T., Yamada, S. and Tsujino, H., Semantic Rewiring Mechanism of Neural Cross-supramodal Integration Based on Spatial and Temporal Properties of Attention. Neurocomputing, 52-54, 643-648, 2003.
24. Tikhonov, A.N., On Solving Ill-Posed Problems and the Method of Regularization. Doklady Akademii Nauk USSR, 153, 501-504, 1963.
25. Vauhkonen, M., Vadasz, D. and Kaipio, J.P., Tikhonov Regularization and Prior Information in Electrical Impedance Tomography. IEEE Transactions on Medical Imaging, 17(2), 285-293, 1998.
26. Koshizen, T. and Rosseel, Y., A New EM Algorithm Using the Tikhonov Regularizer and the GMB-REM Robot's Position Estimation System. International Journal of Knowledge-based Intelligent Engineering Systems, 5, 2-14, 2001.
27. Hamza, A.B., Krim, H. and Unal, G.B., Unifying Probabilistic and Variational Estimation. IEEE Signal Processing Magazine, 37-47, 2002.
