Trends in Intelligent Systems and Computer Engineering


Lecture Notes in Electrical Engineering

Volume 6

Trends in Intelligent Systems and Computer Engineering
Oscar Castillo, Li Xu, and Sio-Iong Ao
ISBN 978-0-387-74934-1, 2008

Recent Advances in Industrial Engineering and Operations Research
Alan H. S. Chan and Sio-Iong Ao
ISBN 978-0-387-74903-7, 2008

Advances in Communication Systems and Electrical Engineering
Xu Huang, Yuh-Shyan Chen, and Sio-Iong Ao
ISBN 978-0-387-74937-2, 2008

Time-Domain Beamforming and Blind Source Separation
Julien Bourgeois and Wolfgang Minker
ISBN 978-0-387-68835-0, 2007

Digital Noise Monitoring of Defect Origin
Telman Aliev
ISBN 978-0-387-71753-1, 2007

Multi-Carrier Spread Spectrum 2007
Simon Plass, Armin Dammann, Stefan Kaiser, and K. Fazel
ISBN 978-1-4020-6128-8, 2007

Oscar Castillo • Li Xu • Sio-Iong Ao

Editors

Editors

Sio-Iong Ao
IAENG Secretariat
Unit 1, 1/F
Hong Kong
People's Republic of China

Oscar Castillo
Tijuana Institute of Technology
Department of Computer Science
Chula Vista CA 91909
USA

Li Xu
Zhejiang University
College of Electrical Engineering
Department of Systems Science and Engineering
Yu-Quan Campus
310027 Hangzhou
People's Republic of China

Library of Congress Control Number: 2007935315

© 2008 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

9 8 7 6 5 4 3 2 1

springer.com

Preface

A large international conference, Intelligent Systems and Computer Engineering, was held in Hong Kong, March 21–23, 2007, under the International MultiConference of Engineers and Computer Scientists (IMECS) 2007. The IMECS 2007 is organized by the International Association of Engineers (IAENG), a nonprofit international association for engineers and computer scientists. The IMECS conferences serve as good platforms for the engineering community to meet with each other and to exchange ideas. The conferences also strike a balance between theoretical and application development. The conference committees have been formed with over two hundred committee members who are mainly research center heads, faculty deans, department heads, professors, and research scientists from over thirty countries. The conferences are truly international meetings with a high level of participation from many countries. The response that we have received for the multiconference is excellent. There have been more than one thousand one hundred manuscript submissions for the IMECS 2007. All submitted papers have gone through the peer review process and the overall acceptance rate is 58.46%.

This volume contains revised and extended research articles on intelligent systems and computer engineering written by prominent researchers participating in the multiconference IMECS 2007. There is huge demand in society, not only for theories but also for applications, for intelligent systems and computer engineering to meet the needs of rapidly developing top-end high technologies and to improve the quality of life. Topics covered include automated planning, expert systems, machine learning, fuzzy systems, knowledge-based systems, computer systems organization, computing methodologies, and industrial applications. The papers are representative of these subjects. The book offers state-of-the-art advances in intelligent systems and computer engineering and also serves as an excellent reference work for researchers and graduate students working with intelligent systems and computer engineering.

Sio-Iong Ao, Oscar Castillo, and Li Xu
July 2007
Hong Kong, Mexico, and China

Contents

Preface
Contributors

1 A Metamodel-Assisted Steady-State Evolution Strategy for Simulation-Based Optimization
Anna Persson, Henrik Grimm, and Amos Ng

2 Automatically Defined Groups for Knowledge Acquisition from Computer Logs and Its Extension for Adaptive Agent Size
Akira Hara, Yoshiaki Kurosawa, and Takumi Ichimura

3 Robust Hybrid Sliding Mode Control for Uncertain Nonlinear Systems Using Output Recurrent CMAC
Chih-Min Lin, Ming-Hung Lin, and Chiu-Hsiung Chen

4 A Dynamic GA-Based Rhythm Generator
Tzimeas Dimitrios and Mangina Eleni

5 Evolutionary Particle Swarm Optimization: A Metaoptimization Method with GA for Estimating Optimal PSO Models
Hong Zhang and Masumi Ishikawa

6 Human–Robot Interaction as a Cooperative Game
Kang Woo Lee and Jeong-Hoon Hwang

7 Swarm and Entropic Modeling for Landmine Detection Robots
Cagdas Bayram, Hakki Erhan Sevil, and Serhan Ozdemir

8 Iris Recognition Based on 2D Wavelet and AdaBoost Neural Network
Anna Wang, Yu Chen, Xinhua Zhang, and Jie Wu

9 An Improved Multiclassifier for Soft Fault Diagnosis of Analog Circuits
Anna Wang and Junfang Liu

10 The Effect of Background Knowledge in Graph-Based Learning in the Chemoinformatics Domain
Thashmee Karunaratne and Henrik Boström

11 Clustering Dependencies with Support Vectors
I. Zoppis and G. Mauri

12 A Comparative Study of Gender Assignment in a Standard Genetic Algorithm
K. Tahera, R. N. Ibrahim, and P. B. Lochert

13 PSO Algorithm for Primer Design
Ming-Hsien Lin, Yu-Huei Cheng, Cheng-San Yang, Hsueh-Wei Chang, Li-Yeh Chuang, and Cheng-Hong Yang

14 Genetic Algorithms and Heuristic Rules for Solving the Nesting Problem in the Package Industry
Roberto Selow, Flávio Neves, Jr., and Heitor S. Lopes

15 MCSA-CNN Algorithm for Image Noise Cancellation
Te-Jen Su, Yi-Hui, Chiao-Yu Chuang, and Wen-Pin Tsai

16 An Integrated Approach Providing Exact SNP IDs from Sequences
Yu-Huei Cheng, Cheng-San Yang, Hsueh-Wei Chang, Li-Yeh Chuang, and Cheng-Hong Yang

17 Pseudo-Reverse Approach in Genetic Evolution
Sukanya Manna and Cheng-Yuan Liou

18 Microarray Data Feature Selection Using Hybrid GA-IBPSO
Cheng-San Yang, Li-Yeh Chuang, Chang-Hsuan Ho, and Cheng-Hong Yang

19 Discrete-Time Model Representations for Biochemical Pathways
Fei He, Lam Fat Yeung, and Martin Brown

20 Performance Evaluation of Decision Tree for Intrusion Detection Using Reduced Feature Spaces
Behrouz Minaei Bidgoli, Morteza Analoui, Mohammad Hossein Rezvani, and Hadi Shahriar Shahhoseini

21 Novel and Efficient Hybrid Strategies for Constraining the Search Space in Frequent Itemset Mining
B. Kalpana and R. Nadarajan

Jun Takezawa

24 Prediction Method for Real Thai Stock Index Based on Neurofuzzy Approach
Monruthai Radeerom, Chonawat Srisa-an, and M.L. Kulthon Kasemsan

25 Innovative Technology Management System with Bibliometrics in the Context of Technology Intelligence
Hua Chang, Jürgen Gausemeier, Stephan Ihmels, and Christoph Wenzelmann

26 Cobweb/IDX: Mapping Cobweb to SQL
Konstantina Lepinioti and Stephen McKearney

27 Interoperability of Performance and Functional Analysis for Electronic System Designs in Behavioural Hybrid Process Calculus (BHPC)
Ka Lok Man and Michel P. Schellekens

28 Partitioning Strategy for Embedded Multiprocessor FPGA Systems
Trong-Yen Lee, Yang-Hsin Fan, Yu-Min Cheng, Chia-Chun Tsai, and Rong-Shue Hsiao

29 Interpretation of Sound Tomography Image for the Recognition of Ganoderma Infection Level in Oil Palm
Mohd Su'ud Mazliham, Pierre Loonis, and Abu Seman Idris

30 A Secure Multiagent Intelligent Conceptual Framework for Modeling Enterprise Resource Planning
Kaveh Pashaei, Farzad Peyravi, and Fattaneh Taghyareh

31 On Generating Algebraic Equations for A5-Type Key Stream Generator
Mehreen Afzal and Ashraf Masood

32 A Simulation-Based Study on Memory Design Issues for Embedded Systems
Mohsen Sharifi, Mohsen Soryani, and Mohammad Hossein Rezvani

33 SimDiv: A New Solution for Protein Comparison
Hassan Sayyadi, Sara Salehi, and Mohammad Ghodsi

34 Using Filtering Algorithm for Partial Similarity Search on 3D Shape Retrieval System
Yingliang Lu, Kunihiko Kaneko, and Akifumi Makinouchi

35 Topic-Specific Language Model Based on Graph Spectral Approach for Speech Recognition
Shinya Takahashi

36 Automatic Construction of FSA Language Model for Speech Recognition by FSA DP-Matching
Tsuyoshi Morimoto and Shin-ya Takahashi

37 Density: A Context Parameter of Ad Hoc Networks
Muhammad Hassan Raza, Larry Hughes, and Imran Raza

38 Integrating Design by Contract Focusing Maximum Benefit
Jörg Preißinger

39 Performance Engineering for Enterprise Applications
Marcel Seelig, Jan Schaffner, and Gero Decker

40 A Framework for UML-Based Software Component Testing
Weiqun Zheng and Gary Bundell

41 Extending the Service Domain of an Interactive Bounded Queue
Walter Dosch and Annette Stümpel

42 A Hybrid Evolutionary Approach to Cluster Detection
Junping Sun, William Sverdlik, and Samir Tout

43 Transforming the Natural Language Text for Improving Compression Performance
Ashutosh Gupta and Suneeta Agarwal

44 Compression Using Encryption
Ashutosh Gupta and Suneeta Agarwal

Index

Contributors

Behrouz Minaei Bidgoli
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA, minaeibi@cse.msu.edu

Henrik Boström
Skövde Cognition and Artificial Intelligence Lab, School of Humanities and Informatics, University of Skövde, SE-541 28 Skövde, Sweden, henrik.bostrom@his.se

Martin Brown
School of Electronic and Electrical Engineering, The University of Manchester, Manchester M60 1QD, UK, martin.brown@manchester.ac.uk

Gary Bundell
Centre for Intelligent Information Processing Systems, School of Electrical, Electronic and Computer Engineering, University of Western Australia, Crawley, WA 6009, Australia, bundell@ee.uwa.edu.au

B Eng Hua Chang
Heinz Nixdorf Institute, University of Paderborn, Fuerstenallee 11, 33102 Paderborn, Germany, Hua.Chang@hni.uni-paderborn.de

Hsueh-Wei Chang
Environmental Biology, Kaohsiung, changhw@kmu.edu.tw

Chiu-Hsiung Chen
Department of Computer Sciences and Information Engineering, China University of Technology, HuKou Township 303, Taiwan, Republic of China, chchchen@cute.edu.tw

Kaohsiung University, yuhuei.cheng@gmail.com

Mohammad Ghodsi
Computer Engineering Department, Sharif University of Technology, Tehran, Iran
IPM School of Computer Science, Tehran, Iran, ghodsi@sharif.edu

Human-Robot Interaction Research Center, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Korea

R. N. Ibrahim
Department of Mechanical Engineering, Monash University, Wellington Rd., Clayton 3800, Australia, Raafat.Ibrahim@eng.monash.edu.au

Takumi Ichimura
Graduate School of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-higashi, Asaminami-ku, Hiroshima 731-3194, Japan, ichimura@its.hiroshima-cu.ac.jp

Masumi Ishikawa
Kyushu Institute of Technology, 2-4 Hibikino, Wakamatsu, Kitakyushu 808-0196, Japan, ishikawa@brain.kyutech.ac.jp

Science Program in Information Technology (MSIT), Faculty of Information Technology, Rangsit University, Pathumtani, Thailand 12000, kasemsan@rangsit.rsu.ac.th

Yoshiaki Kurosawa
Graduate School of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-higashi, Asaminami-ku, Hiroshima 731-3194, Japan,

Kaohsiung University, iamminghsien@gmail.com

Cheng-Yuan Liou
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, Republic of China

CPGEI, Universidade Tecnológica Federal do Paraná (UTFPR), Av. 7 de setembro, 3165 - Curitiba - Paraná, Brazil, hslopes@cpgei.cefetpr.br

Electronics and Computer Science Department, Fukuoka University, 8-19-1 Nanakuma, Jonan-ku, Fukuoka 814-0180, Japan, morimoto@tlsun.tl.fukuoka-u-ac.jp

The University of Isfahan, Iran

Flávio Neves Junior
CPGEI, Universidade Tecnológica Federal do Paraná (UTFPR), Av. 7 de setembro, 3165 - Curitiba - Paraná, Brazil, hslopes@cpgei.cefetpr.br

Amos Ng
Centre for Intelligent Automation, University of Skövde, Sweden

Shinichi Oeda
Department of Information and Computer Engineering, Kisarazu National College of Technology, Kisarazu, Japan

Imran Raza
Department of Computer Science, COMSATS Institute of Information Technology, Lahore, Pakistan, iraza@ciitlahore.edu.pk

Muhammad Hassan Raza
Department of Engineering Mathematics and Internetworking, Dalhousie University, Halifax, Nova Scotia, Canada, hraza@dal.ca

Mohammad Hossein Rezvani
Computer Engineering Department, Iran University of Science and Technology, Narmak, Tehran 16846, Iran, rezvani@iust.ac.ir

Hasso-Plattner-Institute for Software Systems Engineering, Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam, Germany, marcel.seelig@hpi.uni-potsdam.de

Roberto Selow
Electrical Engineering Department, Centro Universitário Positivo, Rua Prof. Pedro Viriato Parigot de Souza, 5300 - Curitiba - Paraná, Brazil, rselow@unicenp.edu.br

Idris Abu Seman
Malaysia Palm Oil Board No. 6, Persiaran Institusi, Bandar Baru Bangi, 43000 Kajang, Malaysia, idris@mpob.gov.my

Hakki Erhan Sevil
Mechanical Engineering Department, Izmir Institute of Technology, Turkey, erhansevil@iyte.edu.tr

Hadi Shahriar Shahhoseini
Electrical Engineering Department, Iran University of Science and Technology, Narmak, Tehran 16844, Iran, hshsh@iust.ac.ir

Mohsen Sharifi
Iran University of Science and Technology, Computer Engineering Department, Tehran 16846-13114, Iran, msharifi@iust.ac.ir

Yue Shen
School of Computer & Information Engineering, Hunan Agricultural University, Changsha 410128, China, shenyue@hunau.edu.cn

Mohsen Soryani
Iran University of Science and Technology, Computer Engineering Department, Tehran 16846-13114, Iran, soryani@iust.ac.ir

Chonawat Srisa-an
Science Program in Information Technology (MSIT), Faculty of Information Technology, Rangsit University, Pathumtani, Thailand 12000, chonawat@rangsit.rsu.ac.th

Department of Preventive Medicine, St. Marianna University School of Medicine, Kawasaki, Japan

Mazliham Mohd Su'ud
Universiti Kuala Lumpur, Sek 14, Jalan Teras Jernang 43650 Bandar Baru Bangi, Selangor, Malaysia, mazliham@tm.net.my

Universite de La Rochelle, Laboratoire Informatique Image Interaction, Avenue Michel Crepeau 17000 La Rochelle, France

Department of Emergency and Intensive Care Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan

Chia-Chun Tsai
Department of Computer Science and Information Engineering, Nanhua University, Chia-Yi, Taiwan, Republic of China, chun@mail.nhu.edu.tw

Wen-Pin Tsai
Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan 807, Republic of China

Anna Wang
Institute of Electronic Information Engineering, College of Information Science and Engineering, Northeastern University, Shenyang, China,

National Kaohsiung University, chyang@cc.kuas.edu.tw

Lam Fat Yeung
Department of Electronic Engineering, City University of Hong Kong, Hong Kong, eelyeung@cityu.edu.hk

Jiangsu Provincial Key Laboratory of Computer Information Processing Technology, Suzhou University, Suzhou 2150063, China, hunanyufei@126.com
College of Computer & Communication, Hunan University, Changsha 410082, China, yufei@hunau.net

Hong Zhang
Kyushu Institute of Technology, 2-4 Hibikino, Wakamatsu, Kitakyushu 808-0196, Japan, zhang@brain.kyutech.ac.jp

Xinhua Zhang
414# mailbox, North Eastern University, Shen Yang, Liao Ning, China 110004, wanganna@mail.neu.edu.cn

Weiqun Zheng
Centre for Intelligent Information Processing Systems, School of Electrical, Electronic and Computer Engineering, University of Western Australia, Crawley, WA 6009, Australia, zheng@ee.uwa.edu.au

of computing time. This poses a serious hindrance to the practical application of EAs in real-world scenarios, and to address this problem the incorporation of computationally efficient metamodels has been suggested, so-called metamodel-assisted EAs [11]. The purpose of metamodels is to approximate the relationship between the input and output variables of a simulation by computationally efficient mathematical models. If the original simulation is represented as

This chapter presents a new metamodel-assisted EA for optimization of computationally expensive simulation-optimization problems. The proposed algorithm is basically an evolution strategy inspired by concepts from genetic algorithms. For maximum parallelism and increased efficiency, the algorithm uses a steady-state design. The chapter describes how the algorithm is successfully applied to optimize two real-world problems in the manufacturing domain. The first problem concerns optimal buffer allocation in a car engine production line, and the second concerns optimal production scheduling in a manufacturing cell for aircraft engines. In both problems, artificial neural networks (ANNs) are used as the metamodel.

In the next section, background information on EAs is presented and some examples of combining EAs and ANNs are given.

Oscar Castillo et al. (eds.), Trends in Intelligent Systems and Computer Engineering. © Springer Science+Business Media, LLC 2008

a guided random search. In evolving a population of solutions, EAs apply biologically inspired operations for selection, crossover, and mutation. The solutions in the initial population are usually generated randomly, covering the entire search space. During each generation, some solutions are selected to breed offspring for the next generation of the population. Either a complete population is bred at once (generational approach), or one individual at a time is bred and inserted into the population (steady-state approach).

The solutions in the population are evaluated using a simulation (Fig. 1.1). The EA feeds a solution to the simulation, which measures its performance. Based on the evaluation feedback given from the simulation, possibly in combination with previous evaluations, the EA generates a new set of solutions. The evaluation of solutions continues until a user-defined termination criterion has been fulfilled. This criterion may, for example, be that (a) a solution that satisfies a certain fitness level has been found, (b) the evaluation process has been repeated a certain number of times, or (c) the best solutions in the last n evaluations have not changed (convergence has been reached).

Fig. 1.1 Evaluation of solutions using a simulation model

Two well-defined EAs have served as the basis for much of the activity in the field: evolution strategies and genetic algorithms, which are described in the following.

Evolution strategies (ESs) are a variant of EAs founded in the middle of the 1960s. In an ES, λ offspring are generated from µ parents (λ ≥ µ) [1]. The selection of parents to breed offspring is random-based and independent of the parents' fitness values. Mutation of offspring is done by adding a normally distributed random value, where the standard deviation of the normal distribution usually is self-adaptive. The µ out of the λ generated offspring having the best fitness are selected to form the next generation of the population.
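The generation step described above can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names, the toy `sphere` cost function, the fixed mutation strength `sigma` (the text notes that the standard deviation is usually self-adaptive), and the convention that lower cost means better fitness are all assumptions made for illustration.

```python
import random


def sphere(x):
    """Toy cost function used only for illustration (lower is better)."""
    return sum(v * v for v in x)


def es_generation(parents, cost, lam, sigma=1.0, rng=random):
    """One ES generation: lam offspring from mu parents, best mu offspring survive.

    Assumptions: fixed sigma instead of self-adaptation, cost is minimized.
    """
    mu = len(parents)
    if lam < mu:
        raise ValueError("lam must be at least mu")
    offspring = []
    for _ in range(lam):
        # Parents are picked uniformly at random, independent of their fitness.
        parent = rng.choice(parents)
        # Mutation adds a normally distributed random value to every variable.
        offspring.append([v + rng.gauss(0.0, sigma) for v in parent])
    # The mu best of the lam offspring form the next generation.
    offspring.sort(key=cost)
    return offspring[:mu]
```

Calling `es_generation(parents, sphere, lam=10)` repeatedly gives a simple optimization loop; the parents themselves are discarded each generation, which is what distinguishes this scheme from elitist variants.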

Genetic algorithms (GAs) became widely recognized in the early 1970s [4]. In a GA, µ offspring are generated from µ parents. The parental selection process is fitness-based and individuals with high fitness have a higher probability to be selected for breeding the next generation of the population. Different methods exist for the selection of parents. One example is tournament selection, in which a few individuals are chosen at random and the one with the best fitness is selected as the winner. In this selection method individuals with worse fitness may also be selected, which prevents premature convergence. A common approach is that the best individuals among the parents are carried over to the next generation unaltered, a strategy known as elitism.
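Tournament selection and elitism as described above might look as follows in Python. This is a sketch under assumptions: the names `tournament_select` and `next_generation_with_elitism`, the tournament size `k`, the `breed` callback, and the minimization convention are illustrative and not taken from the chapter.

```python
import random


def tournament_select(population, cost, k=2, rng=random):
    """Pick k individuals at random and return the one with the lowest cost."""
    contestants = rng.sample(population, k)
    return min(contestants, key=cost)


def next_generation_with_elitism(population, cost, breed, n_elite=1, rng=random):
    """Build a new population of the same size.

    The n_elite best parents are copied over unaltered (elitism); the rest
    are bred from parents chosen by tournament selection.
    """
    ranked = sorted(population, key=cost)
    new_population = [list(ind) for ind in ranked[:n_elite]]
    while len(new_population) < len(population):
        p1 = tournament_select(population, cost, rng=rng)
        p2 = tournament_select(population, cost, rng=rng)
        new_population.append(breed(p1, p2))
    return new_population
```

Because only `k` randomly drawn individuals compete, a weak individual still wins whenever its opponents happen to be weaker, which is the diversity-preserving effect mentioned in the text.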

1.2.2 Combining Evolutionary Algorithms and Artificial Neural Networks

The use of metamodels was first proposed to reduce the limitations of time-consuming simulations. Traditionally, regression and response surface methods have been two of the most common metamodeling approaches. In recent years, however, ANNs have gained increased popularity as this technique requires fewer assumptions and less precise information about the systems being modeled when compared with traditional techniques [3]. The first work providing the foundations for developing ANN metamodels for simulation was done. Both of these studies yielded results that indicated the potential applications of ANNs as metamodels for discrete-event and continuous simulation, particularly when saving computational cost is important.

In general terms, an ANN is a nonlinear statistical data modeling method used to model complex relationships between inputs and outputs. Originally, the inspiration for the technique was from the area of neuroscience and the study of neurons as information processing elements in the central nervous system. ANNs have universal approximation characteristics and the ability to adapt to changes through training. Instead of only following a set of rules, ANNs are able to learn underlying relationships between inputs and outputs from a collection of training examples, and to generalize these relationships to previously unseen data. These attributes make ANNs very suitable to be used as surrogates for computationally expensive simulation models.

Fig. 1.2 Evaluation of solutions using both a simulation model and a metamodel (diagram labels: Evolutionary Algorithm, Solution, Performance, Evaluation Component, Simulation, ANN)

There exist several different approaches to using ANNs as simulation surrogates. The most straightforward approach is to first train the ANN using historical data and then completely replace the simulation with the ANN during the optimization process. This approach can, however, only be successful when there is a small discrepancy between the outputs from the ANN and the simulation. Due to lack of data and the high complexity of real-world problems, it is generally difficult to develop an ANN with sufficient approximation accuracy that is globally correct, and ANNs often suffer from large approximation errors which may introduce false optima [6]. Therefore, most successful approaches instead alternate between the ANN and the simulation model during optimization (Fig. 1.2).

In conjunction with EAs, ANNs have proven to be very useful for reducing the time consumption of optimizations. Most work within this area has focused on GAs, but there are also a few reports of combining ANNs with ESs. Some examples of this work are presented in the following.

Bull [2] presents an approach where an ANN is used in conjunction with a GA to optimize a theoretical test problem. The ANN is first trained with a number of initial samples to approximate the simulation and the GA then uses the ANN for evaluations. Every 50 generations, the best individual in the population is evaluated using the simulation. This individual then replaces the sample representing the worst fitness in the training dataset and the ANN is retrained. The author found that the combination of GAs and ANNs has great potential, but that one must be careful so that the optimization is not misled by the ANN when the fitness landscape of the modelled system is complex.

Jin et al. [6] propose another approach for using ANNs in combination with GAs. The main idea of this approach is that the frequency at which the simulation is used and the ANN is updated is determined by the estimated accuracy of the ANN. The authors introduce the concept of evolution control and propose two control methods: controlled individuals and controlled generations. With controlled individuals, part of the individuals in a population is chosen and evaluated using the simulation. The controlled individuals can be chosen either randomly or according to their fitness values. With controlled generations, the whole population is evaluated with the simulation for N generations in every M generations (N ≤ M). Online learning of the ANN is applied after each call to the simulation when new training data are available. The authors carry out empirical studies to investigate the convergence properties of the implemented evolution strategy on two benchmark problems. They find that correct convergence occurs with both control mechanisms.

A third approach of combining ANNs and GAs is presented by Khu et al. [7]. The authors propose a strategic and periodic scheme of updating the ANN to ensure that it is constantly relevant as the search progresses. In the suggested approach, the whole population is first evaluated using the ANN and the best individuals in the population are then evaluated using the simulation. The authors implement an ANN and a GA for hydrological model calibration and show that there is a significant advantage in using ANNs for water and environmental system design.

Hüsken et al. [5] present an approach of combining ANNs and ESs. The authors propose an approach in which λ offspring are generated from µ parents and evaluated using the ANN (λ > µ). The ANN evaluations are the basis for the preselection of s (0 < s < λ) individuals to be simulated. Of the s simulated individuals, the µ individuals having the highest simulation fitness form the next generation of the population. The authors apply their proposed algorithm to optimize an example problem in the domain of aerodynamic design and experiment on different ANN architectures. Results from the study show that structurally optimized networks exhibit better performance than standard networks.
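The preselection scheme summarized above (metamodel ranking of λ offspring, simulation of the s most promising, survival of the best µ) can be sketched in a few lines of Python. The function name and the two callables `metamodel_fitness` and `simulated_fitness` are hypothetical placeholders for the ANN and the simulation; higher fitness is assumed to be better, following the wording of the text.

```python
def preselect_then_simulate(offspring, metamodel_fitness, simulated_fitness, s, mu):
    """Return the mu survivors among lam offspring using metamodel preselection.

    offspring          -- list of lam candidate solutions
    metamodel_fitness  -- cheap approximate fitness (stand-in for the ANN)
    simulated_fitness  -- expensive fitness (stand-in for the simulation)
    """
    if not (0 < s < len(offspring)) or mu > s:
        raise ValueError("require 0 < s < lam and mu <= s")
    # Rank all offspring with the cheap metamodel and keep the s most promising.
    candidates = sorted(offspring, key=metamodel_fitness, reverse=True)[:s]
    # Only the preselected candidates are run through the expensive simulation.
    scored = [(simulated_fitness(ind), ind) for ind in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # The mu individuals with the highest simulated fitness survive.
    return [ind for _, ind in scored[:mu]]
```

The number of expensive evaluations per generation drops from λ to s, at the price of possibly discarding a good offspring that the metamodel ranks poorly.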

1.3 A New Metamodel-Assisted Steady-State Evolution Strategy

In this chapter an optimization algorithm based on an ES and inspired by concepts from GAs is proposed. The algorithm uses a steady-state design, in which one individual at a time is bred and inserted into the population (as opposed to generational approaches in which a whole generation is created at once). The main reason for choosing a steady-state design is that it has a high degree of parallelism, which is a very important efficiency factor when simulation evaluations are computationally expensive.

The implementation details of the algorithm are presented with pseudocode in Fig. 1.3. An initial population of µ solutions is first generated and evaluated using the simulation. The simulated samples are used to construct a metamodel (e.g., an ANN). Using crossover and mutation, λ offspring are generated from parents in the population chosen using the GA concept of tournament selection. The offspring are evaluated using the metamodel and one individual is selected to be simulated, again using tournament selection. When the individual has been simulated, the simulation input–output sample is used to train the metamodel online. Before the simulated individual is inserted into the population, one of the µ solutions already in the population is removed. Similar to the previous selection processes, the individual to be replaced is chosen using tournament selection. In the replacement strategy, the GA concept of elitism is used; that is, the individual in the population having the highest fitness is always preserved.

population ← Generate Initial Population( )
for each individual in population
    Mutate(individual)
    Metamodel Evaluation(individual)
    offspring.Add(individual)
end
replacement individual ← Select For Replacement(offspring)
Simulation Evaluation(replacement individual)
Update Metamodel(replacement individual)
population.Remove(Select Individual For Removal(population))
population.Add(replacement individual)
end

Fig. 1.3 Pseudocode of proposed algorithm

To make use of parallel processing nodes, several iterations of the optimization loop are executed in parallel.
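The loop described in this section can be sketched serially in Python as follows. This is an illustrative reading of the prose, not the authors' code. Several pieces are assumptions: a nearest-neighbour lookup stands in for the ANN metamodel, cost is minimized, the removal tournament discards the worse of two non-elite individuals, mutation uses a fixed `sigma`, and all names, bounds, and default values are hypothetical. The parallel execution of iterations mentioned above is left out.

```python
import random


class NearestNeighbourMetamodel:
    """Stand-in for the ANN: predicts the cost of the closest simulated sample."""

    def __init__(self):
        self.samples = []  # (solution, simulated cost) pairs

    def update(self, solution, cost):
        self.samples.append((list(solution), cost))

    def predict(self, solution):
        def squared_distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], solution))
        return min(self.samples, key=squared_distance)[1]


def tournament(items, rng, pick_worst=False):
    """Binary tournament over (solution, cost) pairs."""
    contestants = rng.sample(items, min(2, len(items)))
    chooser = max if pick_worst else min
    return chooser(contestants, key=lambda item: item[1])


def steady_state_es(simulate, dim, mu=20, lam=15, iterations=100,
                    sigma=1.0, bounds=(-10.0, 10.0), seed=0):
    """Serial sketch of the metamodel-assisted steady-state loop (mu >= 2)."""
    rng = random.Random(seed)
    metamodel = NearestNeighbourMetamodel()
    population = []
    # Initial population: every individual is simulated and fed to the metamodel.
    for _ in range(mu):
        solution = [rng.uniform(*bounds) for _ in range(dim)]
        cost = simulate(solution)
        population.append((solution, cost))
        metamodel.update(solution, cost)
    history = []
    for _ in range(iterations):
        # Breed lam offspring and score them with the cheap metamodel only.
        offspring = []
        for _ in range(lam):
            p1 = tournament(population, rng)[0]
            p2 = tournament(population, rng)[0]
            cut = rng.randrange(1, dim) if dim > 1 else 0
            child = [v + rng.gauss(0.0, sigma) for v in p1[:cut] + p2[cut:]]
            offspring.append((child, metamodel.predict(child)))
        # One offspring is chosen by tournament, simulated, and used for training.
        chosen = tournament(offspring, rng)[0]
        cost = simulate(chosen)
        metamodel.update(chosen, cost)
        # Elitism: the current best individual can never be removed.
        best = min(population, key=lambda item: item[1])
        removable = [item for item in population if item is not best]
        victim = tournament(removable, rng, pick_worst=True)
        index = next(i for i, item in enumerate(population) if item is victim)
        population.pop(index)
        population.append((chosen, cost))
        history.append(min(item[1] for item in population))
    return min(population, key=lambda item: item[1]), history
```

Each iteration costs exactly one simulation run, so a call with `mu=20` and `iterations=100` performs 120 simulations in total; with a two-hour simulation, this budget is what makes the metamodel screening of the other offspring worthwhile.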

1.4 Real-World Optimization

This section describes how the algorithm presented in the previous section has been applied to the optimization of two real-world problems in the manufacturing domain.

1.4.1 Real-World Optimization Problems

1.4.1.1 Buffer Allocation Problem

The first problem considered is about finding optimal buffer levels in a production line at the engine manufacturer Volvo Cars Engines, Skövde, Sweden. The Volvo Cars factory is responsible for supplying engine components for car engines to assembly plants and the specific production line studied in this chapter is responsible for the cylinder blocks. Production is largely customer order-driven and several models are built on the same production line, which imposes major demands on flexibility. As a way to achieve improved production in the cylinder block line, the management team wants to optimize its buffer levels. It is desirable to find a configuration of the buffer levels that maximizes the overall throughput of the line, while simultaneously minimizing the lead time of cylinder blocks. To analyze the system and perform optimizations, a detailed simulation model of the line has been developed using the QUEST software package.

For the scenario considered here, 11 buffers are subject to optimization and a duration corresponding to a two-week period of production is simulated. As the production line is complex and the simulation model is very detailed, one single simulation run for a period of this length takes about two hours to complete. Because there is a high degree of stochastic behavior in the production line due to unpredictable machine breakdowns, the simulation of each buffer level configuration is replicated five times and the average output of the five replications is taken as the simulation result. The optimization objective is described by

$$f = w_1 \, \frac{\sum_{i \in C} \mathrm{leadtime}_i}{\mathrm{num\_cylinderblocks}} - w_2 \cdot \mathrm{throughput}$$

where $C$ is the set of all cylinder blocks and $w_n$ is the weighted importance of objective $n$. The goal of the optimization is to minimize the objective function value.
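As a rough illustration of how the replicated simulation result and the weighted objective fit together, consider the Python sketch below. It assumes an objective of the form "weighted average lead time minus weighted throughput", which is minimized. The callable `run_simulation`, its `seed` argument, the result keys, and the default weights are all hypothetical; the real model is built in QUEST and its interface is not described in the text.

```python
from statistics import mean


def replicated_result(run_simulation, buffer_levels, replications=5):
    """Average the outputs of several stochastic simulation replications.

    run_simulation is a hypothetical callable returning a dict with the keys
    'mean_lead_time' and 'throughput' for one replication.
    """
    runs = [run_simulation(buffer_levels, seed=r) for r in range(replications)]
    return {
        "mean_lead_time": mean(run["mean_lead_time"] for run in runs),
        "throughput": mean(run["throughput"] for run in runs),
    }


def objective(result, w1=1.0, w2=1.0):
    """Weighted objective to minimize: low lead time and high throughput are good."""
    return w1 * result["mean_lead_time"] - w2 * result["throughput"]
```

With five replications of a two-hour run, a single objective evaluation costs about ten hours of simulation time, which is the motivation for screening candidates with a metamodel first.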

1.4.1.2 Production Scheduling Problem

The second problem considered is a production scheduling problem at Volvo Aero (Sweden). The largest activity at Volvo Aero is development and production of advanced components for aircraft engines and gas turbines. Nowadays, more than 80% of all new commercial aircraft with more than 100 passengers are equipped with engine components from Volvo Aero. Volvo Aero also produces engine components for space rockets. As a partner of the European space program, they develop rocket engine turbines and combustion chambers.

At the Volvo Aero factory studied in this chapter, a new manufacturing cell has recently been introduced for the processing of engine components. The highly automated cell comprises multiple operations and is able to process several component types at the same time. After a period of initial tests, full production is now to be started in the cell. Similar to other manufacturing companies, Volvo Aero continuously strives for competitiveness and cost reduction, and it is therefore important that the new cell is operated as efficiently as possible.

To aid production planning, a simulation model of the cell has been built using the SIMUL8 software package. The simulation model provides a convenient way to perform what-if analyses of different production scenarios without the need of experimenting with the real system. Besides what-if analyses, the simulation model can also be used for optimization of the production. We describe how the simulation model has been used to enhance the production by optimization of the scheduling of components to be processed in the cell. For the production to be as efficient as possible, it is interesting to find a schedule that is optimal with respect to maximal utilization in combination with minimal shortage, tardiness, and wait-time of components. The optimization objective is described by

where P is the set of all products and w is the weighted importance of an objective. The goal of the optimization is to minimize the objective function value.

1.4.2 Optimization Parameters

The population comprises 20 individuals (randomly initialized). From the parent population, 15 offspring are generated by performing a one-point crossover between two solutions (with a probability of 0.5) selected using tournament selection, that is, taking the better of two randomly chosen solutions. Each value in a created offspring is mutated using a Gaussian distribution with a deviation that is randomly selected from the interval (0,10).
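The offspring generation described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiments: solutions are plain lists of numbers, lower fitness is better (minimization), the probability of 0.5 is read as the chance that crossover is applied, and a new deviation is drawn for every value. All function names are hypothetical.

```python
import random

def tournament(population, fitness, rng):
    """Binary tournament: the better (lower fitness) of two random solutions."""
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if fitness[a] <= fitness[b] else population[b]

def one_point_crossover(p1, p2, rng, prob=0.5):
    """With probability `prob`, splice p1 and p2 at a random cut; else copy p1."""
    if len(p1) > 1 and rng.random() < prob:
        cut = rng.randint(1, len(p1) - 1)
        return list(p1[:cut]) + list(p2[cut:])
    return list(p1)

def mutate(child, rng, max_sigma=10.0):
    """Add Gaussian noise to every value; sigma is drawn from (0, max_sigma)."""
    return [x + rng.gauss(0.0, rng.uniform(0.0, max_sigma)) for x in child]

def make_offspring(population, fitness, n_offspring=15, seed=None):
    """Create offspring by tournament selection, crossover, and mutation."""
    rng = random.Random(seed)
    offspring = []
    for _ in range(n_offspring):
        p1 = tournament(population, fitness, rng)
        p2 = tournament(population, fitness, rng)
        offspring.append(mutate(one_point_crossover(p1, p2, rng), rng))
    return offspring
```

For the buffer allocation problem, each solution would hold 11 values, one per buffer; rounding or clamping to feasible buffer sizes is left out of this sketch.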

1.4.3 Metamodel

For each of the two optimization problems, a fast metamodel of the simulation model is constructed by training an ANN to estimate fitness as a function of input parameters (buffer levels and planned lead-times, respectively). The ANN has a feedforward architecture with two hidden layers (Fig 1.4).

Fig 1.4 Conceptual illustration of ANN (input parameters 1 to n enter the input layer, followed by hidden layer 1, hidden layer 2, and an output layer that produces the fitness)

When the optimization starts the ANN is untrained. After each generation, the newly simulated samples are added to the training dataset and the ANN is trained with the most recent samples (at most 500) using continuous training. To avoid overfitting, 10% of the training data is used for cross-validation. The training data is linearly normalized to values between 0 and 1. If any of the new samples has a lower or higher value than any earlier samples, renormalization of the data is performed and the weights of the ANN are reset.
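The bookkeeping around the training data can be sketched as below. The class and method names are hypothetical, only the fitness target is normalized for brevity (the inputs would be treated the same way), and the ANN itself is omitted: the sketch only reports when a new sample falls outside the known range, which is the point at which the weights would be reset.

```python
from collections import deque

class TrainingWindow:
    """Keeps the most recent samples and min-max normalizes fitness to [0, 1]."""

    def __init__(self, max_samples=500):
        self.samples = deque(maxlen=max_samples)  # oldest samples drop out
        self.low = None
        self.high = None

    def add(self, inputs, fitness):
        """Store a sample; return True if the value range grew (reset needed)."""
        self.samples.append((list(inputs), fitness))
        if self.low is None:
            self.low = self.high = fitness
            return True
        if fitness < self.low or fitness > self.high:
            self.low = min(self.low, fitness)
            self.high = max(self.high, fitness)
            return True
        return False

    def normalized(self):
        """Return the stored samples with fitness scaled linearly to [0, 1]."""
        if not self.samples:
            return []
        span = (self.high - self.low) or 1.0  # avoid division by zero
        return [(x, (y - self.low) / span) for x, y in self.samples]

    def split(self, validation_share=0.1):
        """Hold out the newest share of the samples for validation."""
        data = self.normalized()
        n_val = max(1, int(len(data) * validation_share)) if len(data) > 1 else 0
        return data[:len(data) - n_val], data[len(data) - n_val:]
```

Holding out the newest samples for validation is one possible choice; a random 10% split would serve the same purpose.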

1.4.4 Platform

The optimization has been realized using the OPTIMIZE platform, which is an Internet-based parallel and distributed computing platform that lets multiple users run experiments and optimizations with different deterministic/stochastic simulation systems [10]. In the platform various EAs, ANN-based metamodels, deterministic/stochastic simulation systems, and a corresponding database management system are integrated in a parallel and distributed fashion and made available to users through Web services technology.

1.5 Results

This section presents the results of the proposed algorithm applied to the two real-world optimization problems described in the previous section. For an indication of the performance of the proposed algorithm, a standard steady-state ES not using a metamodel is also implemented for the two optimization problems. This algorithm uses the same representation, objective function, and mutation operator as the proposed metamodel-assisted algorithm.

In Fig 1.5, results from the buffer allocation optimization are shown. In this experiment, 100 simulations have been performed (where each simulation is the average result of five replications). Figure 1.6 shows results from the production scheduling problem. In this experiment, 1000 simulations have been performed and the presented result is the average of 10 replications of the optimization.

As Figs 1.5 and 1.6 show, the proposed metamodel-assisted algorithm converges significantly faster than the standard ES for both optimization problems, which indicates the potential of using a metamodel.

Fig 1.5 Optimization results for buffer allocation problem (horizontal axis: simulation; series: using metamodel, not using metamodel)

Fig 1.6 Optimization results for production scheduling problem

1.6 An Improved Offspring Selection Procedure

A possible enhancement of the proposed algorithm would be an improved offspring selection procedure. In the selection of the next offspring to be inserted into the population, a number of different approaches have been proposed in the literature. The most common approach is to simply select the offspring having the best metamodel fitness. Metamodels in real-world optimization problems are, however, often subject to estimation errors and when these uncertainties are not accounted for, a premature and suboptimal convergence may occur on complex problems with many misleading local optima [12]. Poor solutions might be kept for the next generation and the good ones might be excluded. Optimization without taking the uncertainties into consideration is therefore likely to perform badly [9]. Although this is a well-known problem, the majority of existing metamodel-assisted EAs do not account for metamodel uncertainties.

We suggest a new offspring selection procedure that is aware of the uncertainty in metamodel estimations. In this procedure, the probability of each offspring having the highest simulation fitness among all offspring is quantified and taken into account when selecting the offspring to be inserted into the population. This means that a higher confidence in the potential of an offspring will increase the chances that it is selected.

1.6.1 Overall Selection Procedure

First of all, each offspring is evaluated using the metamodel and assigned a metamodel fitness value. The accuracy of the metamodel is then measured and its estimation error is expressed through an error probability distribution. This distribution, in combination with the metamodel fitness values, is used to calculate the probability of each offspring having the highest simulation fitness (the formulas used for the calculation are presented in the next section). Based on these probabilities, one offspring is chosen using roulette wheel selection to be simulated and inserted into the population.
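The final step of this procedure, roulette wheel selection, can be sketched as follows. The function name is hypothetical and the probabilities are assumed to have been computed beforehand; they need not sum exactly to one, because the wheel is scaled by their total.

```python
import random

def roulette_wheel(probabilities, rng=random):
    """Return an index drawn with chance proportional to its probability."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    threshold = rng.random() * total
    running = 0.0
    for index, p in enumerate(probabilities):
        running += p
        if threshold < running:
            return index
    return len(probabilities) - 1  # guard against rounding at the upper edge
```

Compared with always taking the offspring with the best metamodel fitness, this keeps a nonzero chance of simulating an offspring whose estimate is only slightly worse, which matters when the metamodel error is of the same size as the fitness differences.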

1.6.2 Formulas for Probability Calculation

The metamodel error is represented by a probability distribution e. This distribution is derived from a list of differences between metamodel fitness value and simulation fitness value for samples in a test set. Based on e, the offspring probabilities are calculated using two functions: f and F.

The function f is a probability distribution over x of the simulation output given a metamodel output o, according to Eq 1.1.

The function F is a cumulative probability distribution for a given metamodel output o, representing the probability that the simulated output would be less than the value of x (in case of a maximization problem), according to Eq 1.2.

Based on the two functions f and F, the probability of an offspring a having the highest simulation fitness among all offspring is calculated according to Eq 1.3.
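As a rough illustration of how f and F can be combined, the sketch below uses an empirical counterpart that is not necessarily the form of Eqs 1.1 to 1.3. It assumes a maximization problem, that each test-set difference d is defined as simulation fitness minus metamodel fitness (so a simulated value is o + d), that every d is equally likely, and that the errors of different offspring are independent. All names are hypothetical.

```python
from bisect import bisect_left
from math import prod

def make_F(errors):
    """Build an empirical F(x, o): share of errors d with o + d < x."""
    ordered = sorted(errors)

    def F(x, o):
        # bisect_left counts the errors strictly smaller than x - o
        return bisect_left(ordered, x - o) / len(ordered)

    return F

def prob_best(outputs, errors):
    """Estimate, per offspring, the chance of the highest simulation fitness."""
    F = make_F(errors)
    result = []
    for a, o_a in enumerate(outputs):
        total = 0.0
        for d in errors:  # each d gives one equally likely simulated value
            x = o_a + d
            total += prod(F(x, o_b) for b, o_b in enumerate(outputs) if b != a)
        result.append(total / len(errors))
    return result
```

Because F uses a strict inequality, exact ties are counted for neither offspring, so the estimates can sum to slightly less than one; roulette wheel selection over these values is unaffected as long as the wheel is scaled by their total.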

The proposed algorithm is successfully applied to optimize two real-world problems in the manufacturing domain. The first problem considered is about finding optimal buffer levels in a car engine production line, and the second problem considered is about optimal production scheduling in a manufacturing cell for aircraft engines. In both problems, an ANN is used as the metamodel.

Results from the optimization show that the algorithm is successful in optimizing both real-world problems. A comparison with a corresponding algorithm not using a metamodel indicates that the use of metamodels may be very efficient in simulation-based optimization of complex problems.

A possible enhancement of the algorithm in the form of an improved offspring selection procedure that is aware of uncertainties in metamodel estimations is also discussed in the chapter. In this procedure, the probability of each offspring having the highest simulation fitness among all offspring is quantified and taken into consideration when selecting the offspring to be inserted into the population.

References

1. Beyer, H.G., Schwefel, H.P. (2002) Evolution strategies: A comprehensive introduction. Natural Computing 1(1), pp. 3–52.

2. Bull, L. (1999) On model-based evolutionary computation. Soft Computing (3), pp. 76–82.

3. Fonseca, D.J., Navaresse, D.O., Moynihan, G.P. (2003) Simulation metamodeling through artificial neural networks. Engineering Applications of Artificial Intelligence 16(3), pp. 177–183.

4. Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor.

5. Hüsken, M., Jin, Y., Sendhoff, B. (2005) Structure optimization of neural networks for evolutionary design optimization. Soft Computing - A Fusion of Foundations, Methodologies and Applications 9(1), pp. 21–28.

6. Jin, Y., Olhofer, M., Sendhoff, B. (2002) A framework for evolutionary optimization with approximate fitness functions. IEEE Transactions on Evolutionary Computation 6(5), pp. 481–494.

7. Khu, S.T., Savic, D., Liu, Y., Madsen, H. (2004) A fast evolutionary-based metamodelling approach for the calibration of a rainfall-runoff model. In: Proceedings of the First Biennial Meeting of the International Environmental Modelling and Software Society, pp. 147–152, Osnabruck, Germany.

8. Laguna, M., Marti, R. (2002) Neural network prediction in a system for optimizing simulations. IIE Transactions (34), pp. 273–282.

9. Lim, D., Ong, Y.-S., Lee, B.-S. (2005) Inverse multi-objective robust evolutionary design optimization in the presence of uncertainty. In: Proceedings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 55–62, Washington, DC.

10. Ng, A., Grimm, H., Lezama, T., Persson, A., Andersson, M., Jägstam, M. (2007) Web services for metamodel-assisted parallel simulation optimization. In: Proceedings of The IAENG International Conference on Internet Computing and Web Services (ICICWS'07), March 21–23, pp. 879–885, Hong Kong.

11. Ong, Y.S., Nair, P.B., Keane, A.J., Wong, K.W. (2004) Surrogate-assisted evolutionary optimization frameworks for high-fidelity engineering design problems. In: Knowledge Incorporation in Evolutionary Computation, pp. 307–332, Springer, New York.

12. Ulmer, H., Streichert, F., Zell, A. (2003) Evolution strategies assisted by Gaussian processes with improved pre-selection criterion. In: Proceedings of IEEE Congress on Evolutionary Computation (CEC'03), December 8–12, 2003, pp. 692–699, Canberra, Australia.


Chapter 2

Automatically Defined Groups for Knowledge Acquisition from Computer Logs

and Its Extension for Adaptive Agent Size

Akira Hara, Yoshiaki Kurosawa, and Takumi Ichimura

2.1 Introduction

Recently, large amounts of data have come to be stored in databases through the advance of computer and network environments. Acquiring knowledge from the databases is important for analyses of the present condition of the systems and for predictions of coming incidents. The log file is one of the databases stored automatically in computer systems. Unexpected incidents such as system troubles as well as the histories of daily service programs' actions are recorded in the log files. System administrators have to check the messages in the log files in order to analyze the present condition of the systems. However, the descriptions of the messages are written in various formats according to the kinds of service programs and application software. It may be difficult to understand the meaning of the messages without the manuals or specifications. Moreover, the log files become enormous, and important messages are liable to mingle with a lot of insignificant messages. Therefore, checking the log files is a troublesome task for administrators.

Log monitoring tools such as SWATCH [1], in which regular expressions for representing problematic phrases are used for pattern matching, are effective for detecting well-known typical error messages. However, various programs running in the systems may be open source software or software companies' products, and they may have been newly developed or upgraded recently. Therefore, it is impossible to detect all the problematic messages by the predefined rules. In addition, in order to cope with illegal use by hackers, it is important to detect unusual behavior such as the start of an unexpected service program, even if the message does not correspond to an error message. To realize this system, the error-detection rules depending on the environment of the systems should be acquired adaptively by means of evolution or learning.

Genetic programming (GP) [2] is one of the evolutionary computation methods, and it can optimize tree structural programs. Much research on extracting rules from databases by GP has been done in recent years. In the research [3–5], the tree structural program in a GP individual represents an IF-THEN rule. In order to acquire multiple rules, we previously proposed a method that unites GP with cooperative problem-solving by multiple agents. We called this method automatically defined groups (ADG) [6, 7]. By using this method, we developed a rule extraction algorithm for databases [8–12]. In this system, two or more rules hidden in the database, and the respective rules' importance, can be acquired by cooperation of agents. However, we meet a problematic situation when the database has many latent rules. In this case, the number of agents runs short for search and for evaluation of each rule because the number of agents is fixed in advance. In order to solve this problem, we have improved ADG so that the method can treat a variable number of agents. In other words, the number of agents increases adaptively according to the acquired rules.

Oscar Castillo et al. (eds.), Trends in Intelligent Systems and Computer Engineering. © Springer Science+Business Media, LLC 2008

In Sect 2.2, we explain the algorithm of ADG, and the application to rule extraction from classified data. In Sect 2.3, we describe how to extract rules from log files by ADG, and show a preliminary experiment using a centralized control server for many client computers. In Sect 2.4, we describe an issue in the case where we apply the rule-extracting algorithm to a large-scale log file, and then we propose the ADG with variable agent size for solving the problem. We also show the results of experiments using the large-scale log files. In Sect 2.5, we describe conclusions and future work.

2.2 Rule Extraction by ADG

2.2.1 Automatically Defined Groups

In the field of data processing, clustering the enormous data and then extracting common characteristics from each cluster of data are important for knowledge acquisition. In order to accomplish this task, we adopt a multiagent approach, in which agents compete with one another for their share of the data, and each agent generates a rule for the assigned data; the former corresponds to the clustering of data, and the latter corresponds to the rule extraction in each cluster. As a result, all rules are extracted by multiagent cooperation. However, we do not know how many rules exist in the given data and how data should be allotted to each agent. Moreover, as we prepare abundant agents, the number of tree structural programs increases in an individual. Therefore, search performance declines.

Fig 2.1 Concept of automatically defined groups

In order to solve these problems, we have proposed an original evolutionary method, automatically defined groups. The method is an extension of GP, and it optimizes both the grouping of agents and the tree structural program of each group in the process of evolution. By grouping multiple agents, we can prevent the increase of search space and perform an efficient optimization. Moreover, we can easily analyze agents' behavior group by group. Respective groups play different roles from one another for cooperative problem-solving. The acquired group structure is utilized for understanding how many roles are needed and which agents have the same role. That is, the following three points are automatically acquired by using ADG.

• How many groups (roles) are required to solve the problem?

• To which group does each agent belong?

• What is the program of each group?

In the original ADG, each individual consists of a predefined number of agents. The individual maintains multiple trees, each of which functions as a specialized program for a distinct group as shown in Fig 2.1. We define a group as the set of agents referring to the same tree for the determination of their actions. All agents belonging to the same group use the same program.

When generating an initial population, agents in each GP individual are divided into several groups at random. Crossover operations are restricted to corresponding tree pairs. For example, a tree referred to by agent 1 in an individual breeds with a tree referred to by agent 1 in another individual. This breeding strategy is called restricted breeding [13–15]. In ADG, we also have to consider the sets of agents that refer to the trees used for the crossover. The group structure is optimized by dividing or unifying the groups according to the inclusion relationship of the sets.

The concrete processes are as follows. We arbitrarily choose an agent for two parental individuals. A tree referred to by the agent in each individual is used for crossover. We use T and T′ as expressions of these trees, respectively. In each parental individual, we decide a set A(T), the set of agents that refer to the selected tree T. When we perform a crossover operation on trees T and T′, there are the following three cases.

(a) If the relationship of the sets is A(T) = A(T′), the structure of each individual is unchanged.

(b) If the relationship of the sets is A(T) ⊃ A(T′), the division of groups takes place in the individual with T, so that the only tree referred to by the agents in A(T) ∩ A(T′) can be used for crossover. The individual which maintains T′ is unchanged. Figure 2.2 (type b) indicates an example of this type of crossover.

(c) If the relationship of the sets is A(T) ⊅ A(T′) and A(T) ⊄ A(T′), the unification of groups takes place in both individuals so that the agents in A(T) ∪ A(T′) can refer to an identical tree. Figure 2.2 (type c) shows an example of this crossover.

Fig 2.2 Examples of crossover (type b and type c)

We expect that the search works efficiently and the adequate group structure is acquired by using this method.
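The group adjustment that precedes crossover can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: each individual is reduced to a hypothetical dictionary that maps an agent to an integer tree identifier, the trees themselves are not modeled, the mirrored case in which the second set contains the first is handled symmetrically to case (b), and in case (c) the unified group keeps the tree of the chosen agent.

```python
def agents_of(groups, agent):
    """A(T): the set of agents that refer to the same tree as `agent`."""
    tree = groups[agent]
    return {a for a, t in groups.items() if t == tree}

def adjust_groups(groups1, groups2, agent):
    """Return adjusted copies of both agent->tree maps and the case label."""
    g1, g2 = dict(groups1), dict(groups2)
    a1, a2 = agents_of(g1, agent), agents_of(g2, agent)
    if a1 == a2:
        return g1, g2, "a"  # case (a): nothing changes
    if a1 > a2 or a1 < a2:
        # case (b): divide the larger group so both sides match the intersection
        big, shared = (g1, a2) if a1 > a2 else (g2, a1)
        new_tree = max(big.values()) + 1  # identifier for the split-off tree
        for a in shared:
            big[a] = new_tree
        return g1, g2, "b"
    # case (c): neither set contains the other, so unify over the union
    union = a1 | a2
    for g in (g1, g2):
        target = g[agent]
        for a in union:
            g[a] = target
    return g1, g2, "c"
```

After the adjustment, the chosen agent refers to a tree with the same agent set in both individuals, so the crossover itself always operates on corresponding groups.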


2.2.2 Rule Extraction from Classified Data

In some kinds of databases, each datum is classified into the positive or negative case (or more than two categories). For example, patient diagnostic data in hospitals are classified into some categories according to their diseases. It is an important task to extract characteristics for a target class. However, even if data belong to the same class, all the data in the class do not necessarily have the same characteristics. A part of a dataset might show a different characteristic. It is possible to apply ADG to rule extraction from such classified data. In ADG, multiple tree structural rules are generated evolutionally, and each rule represents the characteristic of a subset in the same class of data. Figure 2.3 shows a concept of rule extraction using ADG. Each agent group extracts a rule for the divided subset, and the rules acquired by multiple groups can cover all the data in the target class. Moreover, when agents are grouped, the load of each agent and predictive accuracy of its rule are considered. As a result, a lot of agents come to belong in the group with the high use-frequency and high-accuracy rule. In other words, we can regard the number of agents in each group as the degree of importance of the rule. Thus, two or more rules and the degree of importance of respective rules can be acquired at the same time. This method was applied to medical data and the effectiveness has been verified [8–11].

Fig 2.3 Concept of rule extraction using ADG (a database whose target class is covered by the rules for subsets 1, 2, and 3)

2.3 Knowledge Acquisition from Log Files by ADG

2.3.1 How to Extract Rules from Unclassified Log Messages

We apply the rule extraction method using ADG to detect trouble in computer systems from log files. In order to use the method described in the previous section, we need supervised information for its learning phase. In other words, we have to classify each message in the log files into two classes: normal message class and abnormal message class indicating system trouble. However, this is a difficult task because complete knowledge for computer administration is needed and log data are of enormous size. In order to classify log messages automatically into the appropriate class, we consider a state transition pattern of computer system operation. We focus on the following two different states and make use of the difference of the states as the supervised information.

1. Normal state. This is the state in the period of stable operation of the computer system. We assume that the administrators keep good conditions of various system configurations in this state. Therefore, frequently observed messages (e.g., “Successfully access,” “File was opened,” etc.) are not concerned with the error messages. Of course, some insignificant warning messages (e.g., “Short of paper in printer,” etc.) may sometimes appear.

2. Abnormal state. This is the state in the period of unstable operation of the computer system. The transition to the abnormal state may happen due to hardware trouble such as hard disk drive errors, or by restarting service programs with new configurations in the current system. Moreover, some network security attacks may cause the unstable state. In this state, many error messages (e.g., “I/O error,” “Access denied,” “File not found,” etc.) are included in the log files. Of course, the messages observed in the normal state also appear in the abnormal state.

The extraction of rules is performed by using log files in the respective states. First, we define the base period of the normal state, which seems to be stable, and define the testing period, which might be in the abnormal state. Then we prepare the two databases. One is composed of log messages in the normal state period, and the other is composed of log messages in the abnormal state period. By evolutionary computations, we can find rules, which respond to the messages appearing only in the abnormal state.

For knowledge representation to detect a remarkable problematic case, we use logical expressions, which return true only to such problematic messages. The tagging procedure using regular expressions as described in [16] was used for the preprocessing of the log files and the representation of the rules. Figure 2.4 shows an illustration of the preprocessing. Each message in the log files is separated into several fields (e.g., daemon name field, host name field, comment field, etc.) by the preprocessing, and each field is tagged. Moreover, words that appear in the log messages are registered in the word lists for respective tags beforehand.


[server1 : /var/log/messages]
2005/11/14 12:58:16 server1 named unexpected RCODE(SERVFAIL) resolving ’host.there.ne.jp/A/IN’
2006/12/11 14:34:09 server1 smbd write_data: write failure in writing to client. Error Connection reset by peer
:

preprocessing

<HOST> server1 </HOST> <LOGNAME> messages </LOGNAME>
<DATE> 2005/11/14 </DATE> <TIME> 12:58:16 </TIME>
<COMP> server1 </COMP> <DAEMON> named </DAEMON>
<EXP> unexpected RCODE(SERVFAIL) resolving ’host.there.ne.jp/A/IN’ </EXP>

<HOST> server1 </HOST> <LOGNAME> messages </LOGNAME>
<DATE> 2006/12/11 </DATE> <TIME> 14:34:09 </TIME>
<COMP> server1 </COMP> <DAEMON> smbd </DAEMON>
<EXP> write_data: write failure in writing to client. Error Connection reset by peer </EXP>

Word Lists (one per tag, e.g., DAEMON Tag, EXP Tag)
1 server1
2 server2
:

Fig 2.4 Preprocessing to log files
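A preprocessing step of this kind can be sketched with a single regular expression. The pattern and the function names below are hypothetical and only fit the line layout of the two sample messages (date, time, computer name, daemon name, free text); they are not the tagging procedure of [16].

```python
import re

# Assumed layout: "YYYY/MM/DD hh:mm:ss computer daemon free text"
LINE = re.compile(
    r"(?P<DATE>\d{4}/\d{2}/\d{2}) (?P<TIME>\d{2}:\d{2}:\d{2}) "
    r"(?P<COMP>\S+) (?P<DAEMON>\S+) (?P<EXP>.*)"
)

def tag_message(host, logname, line):
    """Split one log line into fields and wrap each field in its tag."""
    match = LINE.match(line)
    if match is None:
        return None  # the line does not follow the assumed layout
    fields = {"HOST": host, "LOGNAME": logname, **match.groupdict()}
    return " ".join(f"<{tag}> {value} </{tag}>" for tag, value in fields.items())

def register(word_list, word):
    """Return the 1-based index of `word`, appending it if it is new."""
    if word not in word_list:
        word_list.append(word)
    return word_list.index(word) + 1
```

Real log files mix several layouts, so a practical tagger would try a list of patterns, one per service program, and fall back to an untagged comment field when none of them matches.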

The rule is made by the conjunction of multiple terms, each of which judges whether the selected word is included in the field of the selected tag. The following expression is an example of the rule.

(and (include <DAEMON> 3)(include <EXP> 8))

We assume that the word “nfsd” is registered third in the word list for the <DAEMON> tag, and the word “failure” is registered eighth in the word list for the <EXP> tag. For example, this rule returns true to the message including the following strings.

<DAEMON>nfsd</DAEMON> <EXP>Warning:access failure</EXP>

Multiple trees in an individual of ADG represent the respective logical expressions. Each message in the log files is input to all trees in the individual. Then, calculations are performed to determine whether the message satisfies each logical
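The evaluation of the example rule given above can be sketched as follows. The representation is an assumption made for illustration: a rule is a nested tuple, a message is a dictionary from tag name to field text, `include` is read as a substring test, and the word lists in the usage example are filled with placeholder words except for the two entries named in the text.

```python
def include(message, tag, index, word_lists):
    """True if the word registered at 1-based `index` occurs in the tag's field."""
    word = word_lists[tag][index - 1]
    return word in message.get(tag, "")

def evaluate(rule, message, word_lists):
    """Evaluate a nested tuple such as ("and", ("include", "DAEMON", 3), ...)."""
    op = rule[0]
    if op == "include":
        _, tag, index = rule
        return include(message, tag, index, word_lists)
    if op == "and":
        return all(evaluate(term, message, word_lists) for term in rule[1:])
    raise ValueError(f"unknown operator: {op}")

# Hypothetical word lists: only "nfsd" (third) and "failure" (eighth) come from the text.
WORD_LISTS = {
    "DAEMON": ["named", "smbd", "nfsd"],
    "EXP": ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "failure"],
}
RULE = ("and", ("include", "DAEMON", 3), ("include", "EXP", 8))
MESSAGE = {"DAEMON": "nfsd", "EXP": "Warning:access failure"}
print(evaluate(RULE, MESSAGE, WORD_LISTS))
```

Referring to words by their position in a per-tag list keeps the terminal set of the GP trees small: a term only has to choose a tag and an integer rather than an arbitrary string.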

Ngày đăng: 22/11/2014, 04:11

Nguồn tham khảo

Tài liệu tham khảo Loại Chi tiết
1. S. Murthy and J. J. Garcia-Luna-Aceves (1996) An efficient routing protocol for wireless networks. In: ACM Mobile Networks and Applications Journal, Special Issue on Routing in Mobile Communication Networks. Volume 1. No. 2 Sách, tạp chí
Tiêu đề: ACM Mobile Networks and Applications Journal
3. D. D. Perkins, H. D. Hughes, and C. B. Owen (2002) Factors affecting the performance of ad hoc networks. Communications, 2002. ICC 2002. IEEE International Conference Sách, tạp chí
Tiêu đề: Communications
4. Y. Tseng et al. (1999) The broadcast storm problem in a mobile ad-hoc network. In: Proceed- ings of ACM International Conference on Mobile Computing and Networking (MOBICOM) Sách, tạp chí
Tiêu đề: Proceed-
5. C. K. Toh (2002) Ad hoc wireless networks, protocols and systems. In: Proceeings of IEEE Conference. Prentice Hall, Upper Saddle River, NJ Sách, tạp chí
Tiêu đề: Proceeings of IEEE"Conference
6. L. Hughes and Y. Zhang (2004) Self-limiting adaptive protocols for controlled flooding in ad hoc networks. In: Proceedings of Ad-hoc, Mobile, and Wireless Networks. IEEE, Canada Sách, tạp chí
Tiêu đề: Proceedings of Ad-hoc, Mobile, and Wireless Networks
7. L. Hughes, Y. Zhang, and K. Shumon (2003) Cartesian ad hoc routing protocol. In: Proceed- ings of Second International Conference, ADHOC-NOW. Springer, Montreal, Canada Sách, tạp chí
Tiêu đề: Proceed-"ings of Second International Conference, ADHOC-NOW
8. A. Durresi, V. Paruchuri, L. Barolli, and Jain Raj (2005) QoS-energy aware broadcast for sensor networks. In: Proceedings of Parallel Architectures, Algorithms and Networks. ISPAN 2005. 8th International Symposium Sách, tạp chí
Tiêu đề: Proceedings of Parallel Architectures, Algorithms and Networks. ISPAN
9. J. C. Boettcher and L. M. Gaines (2005) Industry Research Using the Economic Census: How to Find It, How to Use It, Greenwood Press, Westport, CT Sách, tạp chí
Tiêu đề: Industry Research Using the Economic Census: How"to Find It, How to Use It
10. Ferguson Niels et al. (2003) Practical cryptography. In: Proceedings of IEEE Conference, IEE Sách, tạp chí
Tiêu đề: Proceedings of IEEE Conference
12. A. McIntosh et al. (1994) Statistical analysis of CCSN/SS7 traffic data from working CCS subnetworks. IEEE Journal on Selected Areas in Communications, 12(3) Sách, tạp chí
Tiêu đề: Statistical analysis of CCSN/SS7 traffic data from working CCS subnetworks
Tác giả: A. McIntosh, et al
Nhà XB: IEEE Journal on Selected Areas in Communications
Năm: 1994
13. George G. Morgan (2004) How to Do Everything with Your Genealogy. McGraw-Hill Pro- fessional, New York Sách, tạp chí
Tiêu đề: How to Do Everything with Your Genealogy
14. Jon D. Fricker and Robert K. Whitford (2004) Fundamentals of Transportation Engineering:A Multimodal Approach. Prentice Hall, Upper Saddle River, NJ Sách, tạp chí
Tiêu đề: Fundamentals of Transportation Engineering:"A Multimodal Approach
15. Bob O’Hara and Al Petrick (1999) IEEE 802.11 Handbook, A Designer’s Companion. IEEE Press, Washington, DC Sách, tạp chí
Tiêu đề: IEEE 802.11 Handbook, A Designer’s Companion
16. Ranjith S. Jayaram and Injong Rhee (2003) A case for delay-based congestion control for CDMA 2.5G networks. In: Proceedings of International Conference on Ubiquitous Comput- ing. Springer, Seattle Sách, tạp chí
Tiêu đề: Proceedings of International Conference on Ubiquitous Comput-"ing
17. E. Guadagnoli and W. F. Velicer (1988) Relation of sample size to the stability of component patterns. Psychological Bulletin, Volume 103. No. 2:265–275 Sách, tạp chí
Tiêu đề: Psychological Bulletin
18. C. H. Yu and J. Behrens (1994) Misconceptions in statistical power and dynamic graphics as a remediation. Poster session presented at the Annual Meeting of American Statistical Association. Toronto, Canada. Volume 18. No. 2 Sách, tạp chí
Tiêu đề: Annual Meeting of American Statistical"Association
19. I. G. Dambolena (1984) Teaching the central limit theorem through computer simulation.Mathematics and Computer Education: 128–132 Sách, tạp chí
Tiêu đề: Mathematics and Computer Education
20. Olive J. Dunn and Virginia A. Clark (1987) Applied Statistics: Analysis of Variance and Regression, 2nd Edition. John Wiley and Sons, New York Sách, tạp chí
Tiêu đề: Applied Statistics: Analysis of Variance and"Regression
2. Yunjung Yi and Mario Gerla (2002) Efficient flooding in ad hoc networks using on-demand (passive) cluster formation. In: Proceedings of MOBIHOC 2002 Khác
11. Dawn Song, David Wagner, and Xuqing Tian (2001) Timing analysis of keystrokes and timing attacks on SSH. In: Proceedings of 10th USENIX Security Symposium Khác

TỪ KHÓA LIÊN QUAN