Torsten Bohlin
Automatic Control, Signals, Sensors and Systems
Royal Institute of Technology (KTH)
SE-100 44 Stockholm
Sweden
British Library Cataloguing in Publication Data
Bohlin, Torsten, 1931–
Practical grey-box process identification : theory and applications. – (Advances in industrial control)
1. Process control – Mathematical models
2. Process control – Mathematical models – Case studies
I. Title
670.4'27

ISBN-13: 978-1-84628-402-1
ISBN-10: 1-84628-402-3

Library of Congress Control Number: 2006925303

Advances in Industrial Control series ISSN 1430-9491
ISBN-13: 978-1-84628-402-1
© Springer-Verlag London Limited 2006

MATLAB® and Simulink® are registered trademarks of The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, U.S.A. http://www.mathworks.com

Modelica® is a registered trademark of the "Modelica Association". http://www.modelica.org/

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Printed in Germany

9 8 7 6 5 4 3 2 1

Springer Science+Business Media
springer.com
Advances in Industrial Control

Series Editors

Professor Michael J. Grimble, Professor of Industrial Systems and Director
Professor Michael A. Johnson, Professor (Emeritus) of Control Systems and Deputy Director
Industrial Control Centre
Department of Electronic and Electrical Engineering

Series Advisory Board

Professor E.F. Camacho
Escuela Superior de Ingenieros

Department of Electrical and Computer Engineering
The University of Newcastle

Department of Electrical Engineering
National University of Singapore
4 Engineering Drive 3
Singapore 117576

Department of Electrical and Computer Engineering

Electronic Engineering Department
City University of Hong Kong
Tat Chee Avenue

Pennsylvania State University
Department of Mechanical Engineering

Department of Electrical Engineering
National University of Singapore
4 Engineering Drive 3
Singapore 117576

Professor Ikuo Yamamoto
Kyushu University Graduate School
Marine Technology Research and Development Program
MARITEC, Headquarters, JAMSTEC
2-15 Natsushima, Yokosuka
Kanagawa 237-0061
Japan
To the KTH class of F53
Series Editors' Foreword

The series Advances in Industrial Control aims to report and encourage technology transfer in control engineering. The rapid development of control technology has an impact on all areas of the control discipline. New theory, new controllers, actuators, sensors, new industrial processes, computer methods, new applications, new philosophies, new challenges. Much of this development work resides in industrial reports, feasibility study papers and the reports of advanced collaborative projects. The series offers an opportunity for researchers to present an extended exposition of such new work in all aspects of industrial control for wider and rapid dissemination.

Experienced practitioners in the field of industrial control often say that about 70–80% of project time is spent on understanding and modelling a process, developing a simulation and then testing, calibrating and validating the simulation. Control design and investigations will then absorb the other 20–30% of the project time; thus, it is perhaps a little surprising that there is so little published on the formal procedures and tools for performing these developmental modelling tasks compared with the provision of simulation software tools. There is a very clear difference between these two types of activities: simulation tools usually comprise libraries of numerical routines and a logical framework for their interconnection, often based on graphical representations like block diagrams, but arriving at a consistent model that replicates observed physical process behaviour is a far more demanding objective. Such is the agenda underlying the inspirational work of Torsten Bohlin reported in his new Advances in Industrial Control monograph, Practical Grey-box Identification.

The starting point for this work lies in the task of providing models for a range of industrial production processes including: Baker's yeast production, steel rinsing (the rinsing of moving steel strip in a rolling-mill process), continuous pulp digestion, cement milling, an industrial recovery boiler process (a pulp production process unit) and cardboard manufacturing. The practical experience of producing these models supplied the raw data for understanding and abstracting the steps needed in a formal grey-box identification procedure; thus, it was a project that has been active for over 15 years, and over this period the grey-box identification procedure was formulated, tested, re-formulated and so on until a generic procedure of wide applicability finally emerged.
In parallel with this extraction of the fundamental grey-box identification procedure has been the development of the Process Model Calibrator and Validator software, the so-called MoCaVa software. This contains the tools that implement the general steps of grey-box identification. Consequently, it is based on an holistic approach to process modelling that uses a graphical block-diagram representation but incorporates routines like loss-function minimisation for model fitting and other statistical tools to allow testing of model hypotheses. The software has been tested and validated through its use and development with an extensive and broadly based group of individual processes, some of which are listed above.
This monograph captures three aspects of Torsten Bohlin's work in this area. Firstly, there is an introduction to the theory and fundamentals of grey-box identification (Part I) that carefully defines white-box, black-box and grey-box identification. From this emerge the requirements of a grey-box procedure and the need for software to implement the steps. Secondly, there is the MoCaVa software itself. This is available for free download from a Springer website whose location is given in the book. Part II of the monograph is a tutorial introduction and user's guide to the use of the MoCaVa software. For added realism, the tutorial is based on a drum-boiler model. Finally, the experience of the tutorial introduction is put to good use with the two fully documented case studies given as Part III of the monograph. Process engineers will be able to work at their own pace through the model development for a rinsing process for steel strip in a rolling mill and the prediction of quality in a cardboard manufacturing process. The value of the case studies is two-fold, since they provide a clear insight into the procedures of grey-box identification and give in-depth practical experience of using the MoCaVa software for industrial processes; both of these are clearly transferable skills.
The Advances in Industrial Control monograph series has often included volumes on process modelling and system identification, but it is believed that this is only the second ever volume in the series on the generic steps in an holistic grey-box identification procedure. The volume will be welcomed by industrial process control engineers for its insights into the practical aspects of process model identification. Academics and researchers will undoubtedly be inspired by the more generic theoretical and procedural aspects that the volume contributes to the science and practice of system identification.

M.J. Grimble and M.A. Johnson
Industrial Control Centre
Glasgow, Scotland, U.K.
Preface

Those who have tried the conventional approaches to making mathematical models of industrial production processes have probably also experienced the limitations of the available methods. They have either to build the models from first principles, or else to apply one of the 'black-box' methods based on statistical estimation theory. Both approaches work well under the circumstances for which they were designed, and they have the advantage that there are well developed tools for facilitating the work. Generally, the modelling tools (based on first principles) have their applications to electrical, mechanical, and hydrodynamical systems, where much is known about the principles governing such systems. In contrast, the statistical methods have their applications in cases where little is known in advance, or when detailed knowledge is irrelevant for the purpose of the modelling, typically for design of feedback control.

In modelling for the process industry, however, prior knowledge is typically partial, the effects of unknown input ('disturbances') are not negligible, and it is desirable to have reproducibility of the model, for instance for the monitoring of unmeasured variables, for feed-forward control, or for long-range prediction of variables with much delayed responses to control action. Conceivably, 'grey-box' identification, which is a 'hybrid' of the two approaches, would help the situation by exploiting both of the two available sources of information, namely i) such invariant prior knowledge as may be available, and ii) response data from experiments. Thus, grey-box methods would have their applications whenever there is some invariant prior knowledge of the process and it would be a waste of information not to use it.
After the first session on grey-box identification at the 4th IFAC Symposium on Adaptive Systems in Control and Signal Processing in 1992, and the first special issue in Int. J. Adaptive Control and Signal Processing in 1994, the approach has now been reasonably well accepted as a paradigm for how to address the practical problems in modelling physical processes. There are now quite a number of publications, most about special applications (a Google search for "Grey box model" in 2005 gave 691 hits). Applying the approach in practice raises a number of fundamental questions, in addition to the practical problems: How can I make use of what I do know? How much of my prior knowledge is useful and even correct, when used in the particular environment? What do I do about the unknown disturbances I cannot get rid of? Are my experiment data sufficient and relevant? How do I know when the model is good enough?
It was the desire to find some answers to these questions that initiated a long-range project at the Automatic Control department of KTH. The present book is based on the results of that project. It stands on three legs:

i) A theoretical investigation of the fundamentals of grey-box identification. It revealed that sufficiently many theoretical principles were available in the literature for answering the questions that needed to be answered. The compilation was published in a book (Bohlin, 1991a), which ended with a number of procedures for doing grey-box identification properly.

ii) A software tool, MoCaVa (Process Model Calibrator & Validator), based on one of the procedures (Bohlin and Isaksson, 2003).

iii) A number of case studies of grey-box identification of industrial processes. They were carried out in order to see whether the theoretical procedure would also be a practical one, and to test the software being developed in parallel. Most case studies have been done by PhD students at the department under the supervision of the author. The extent of the work was roughly one thesis per case.
prac-This book will focus on the software and the case studies Thus it will serve as a
manual to MoCaVa, as well as illustrating how to apply MoCaVa efficiently Success
in grey−box identification, as in other design, will no doubt depend of the skill of thecraftsman using the tool, and I believe that skill is best gained by exercise, and casestudies to be a good introduction
In addition, there is a ‘theory’ chapter with the purpose of describing the basic
de-liberations, derivations, and decisions behind MoCaVa and the way it is constructed.
The purpose is to provide additional information to anyone who wants to understandmore of its properties than revealed in the user’s manual This may help the user to ap-praise the strengths and weaknesses of the program, either in order to be able to do the
same with the models that come out of it, or even to develop MoCaVa further (The
source code can be downloaded from Springer.) The focus is therefore on the
applica-bility of the theories for the purpose of MoCaVa, rather than on the theories
to be understood by readers who are not used to strict mathematics And, conversely,
it would be impractical to try and solve all problems of grey−boxidentification by ing on intuition and reasoning alone, however clever Therefore, the mathematics isinterpreted in intuitive terms, and necessary approximations motivated in the sameway, whenever the mathematical problems become unsurmountable, or an exact solu-tion would take prohibitively long for a computer to process The following is one of
rely-my favorite quotations: “The man thinks The theory helps him to think, and to tain his thinking consistent in complex situations” (Peterka)
main-The method presented in this book for building grey−box models of physical jects has three kinds of support: A systematic procedure to follow, a software packagefor doing it, and case studies for learning how to use it Part I motivates and describes
ob-the procedure and ob-the MoCaVa software Part II is a tutorial on ob-the use of MoCaVa
based on simple examples Part III contains two extensive case studies of full−scaleindustrial processes
How to Use this Book
Successful grey-box identification of industrial processes requires knowledge of two kinds: i) how the process works, and ii) how the software works. Since the knowledge is normally not resident within the same person, two must contribute. Call them "process engineer" and "model designer". The latter should preferably have taken a course in 'Process identification'.

Part I is for the "model designer", who needs to understand how the MoCaVa software operates, in order to appreciate its limitations – what it can and cannot do.

Part II is for both. It is a tutorial on running MoCaVa, written for anyone who actually wants to build a grey-box model. It is also useful as an introduction to the case studies, since it is based on two pervading simple examples.

Part III is also for both. It develops the case studies in some detail, highlighting the contributions of the three 'actors' in the session, viz. the engineer, the model designer/program operator, and the MoCaVa program. The technical details in Part III are probably of interest only to those working in the relevant businesses (steel or paper & pulp), but are still important as illustrations of the issues that must be considered in practical grey-box process identification.

The style of Parts II and III deviates somewhat from what is customary in textbooks, namely to use sentences in passive form, free of an explicit subject. The idea of the customary practice is that science and engineering statements should be valid irrespective of the subject. Unfortunately, the custom is devastating for the understanding when describing processes where there are indeed several subjects involved; "who does what" becomes crucial. Therefore, Part II is written more like a user's manual. In describing grey-box identification practice there are, logically, no less than five 'actors' involved:
– The customer/engineer (providing the prior information about the physical process and appraising the result).
– The model designer/user of the program tools (often the same person as the customer, but not if he/she lacks sufficient knowledge of the physical process to be modelled).
– The computer and program (analyzing the evidence of the data).
– The author of this book (trying to reason with a reader).
– The reader of the book (trying to understand what the author tries to say).
In order to reduce the risk of confusion when describing a grey-box identification session – a process that involves at least the first three actors – the following convention will be used in the book:

The contributions of the different actors are marked with symbols at the beginning of the paragraph: one symbol for the operator (doing key pressing and mouse clicking), one for MoCaVa (computing and displaying the results), and one for the model builder (watching the screen, deliberating, and occasionally calculating on paper). It will no doubt help the reader who wants to follow the examples on a computer that the operator symbol states explicitly what to do at each moment, and the MoCaVa symbol points to the expected response. There are also paragraphs without an initiating symbol – they have the ordinary rôle of the author talking to a reader.

Also as a convention, Courier fonts are used for code, as well as for variables that appear in the code, and for names of submodels, files, and paths. Helvetica Narrow is used for user communication windows and for labels that appear in screen images.
The book uses a number of special terms and concepts of relevance to process identification. Some should be well-known or self-explanatory to model designers, but probably not all. The "Glossary of Terms" contains short definitions, without mathematics, and some with clarifying examples. The list serves the same purpose as the 'hypertext' function in HTML documentation, although less conveniently. The contents of Part II is also available in HTML format. This form has the well-known advantage that explanations of some key concepts become available at a mouse click, and only if needed. In Part II explanations appear either under the headers Help or Hints, or else as references to sections in the appendix, which unavoidably means either wading through text mass (that can possibly be skipped), or looking up the appropriate sections in the appendix. In order to reduce the length of Part II the number of printed screen images is also smaller than in the HTML document.

MoCaVa is downloadable from www.springer.com/1-84628-402-3, together with all material needed for running the case studies. (The package also contains the HTML manual as well as on-line help facilities.) This offers a possibility to get more direct experience of the model-design session. It would therefore be possible to use Parts II and III as study material for a course in grey-box process identification.
Acknowledgements
The author is indebted to the following individuals who participated in the Grey-box development project:

Stefan Graebe, who wrote the first C version of the IdKit tool box, and later participated in the Continuous Casting case study.
James Sørlie, who investigated possible interfaces to other programs.
Bohao Liao, who investigated search methods.
Ning He, who investigated real-time evaluation of Likelihood.
Anders Hasselkvist, who wrote Predat.
Tomas Wenngren, who wrote the first GUI.
Germund Mathiasson and Jiri Uosukainen, who wrote the first version of Validate.
Olle Ehrengren, who wrote the first version of Simulate.
Ping Fan, who did the Baker's Yeast case study.
Björn Sohlberg, who did the first Steel Rinsing case study.
Jonas Funkquist, who did the Pulp Digester case study.
Oliver Havelange, who did the Cement Milling case study.
Jens Pettersson, who did the second Cardboard case study.
Ola Markusson, who did the EEG-signals case study.
Bengt Nilsson, who contributed process knowledge to the Cardboard case study.
Jan Erik Gustavsson, who contributed process knowledge to the Recovery Boiler case study.
Alf Isaksson, who participated in the Pulp Refiner and Drive Train cases, and headed the MoCaVa project between 1998 and 2001.
Linus Loquist, who designed the MoCaVa home page.
Contents

Part I Theory of Grey-box Process Identification
1 Prospects and Problems 3
1.1 Introduction 3
1.2 White, Black, and Grey Boxes 4
1.2.1 White−box Identification 5
1.2.2 Black−box Identification 6
1.2.3 Grey−box Identification 10
1.3 Basic Questions 13
1.3.1 Calibration 14
1.3.2 How to Specify a Model Set 15
1.4 … and a Way to Get Answers 17
1.5 Tools for Grey−box Identification 18
1.5.1 Available Tools 18
1.5.2 Tools that Need to Be Developed 21
2 The MoCaVa Solution 23
2.1 The Model Set 23
2.1.1 Time Variables and Sampling 24
2.1.2 Process, Environment, and Data Interfaces 25
2.1.3 Multi−component Models 27
2.1.4 Expanding a Model Class 29
2.2 The Modelling Shell 31
2.2.1 Argument Relations and Attributes 34
2.2.2 Graphic Representations 37
2.3 Prior Knowledge 41
2.3.1 Hypotheses 42
2.3.2 Credibility Ranking 43
2.3.3 Model Classes with Inherent Conservation Law 43
2.3.4 Modelling ‘Actuators’ 44
2.3.5 Modelling ‘Input Noise’ 46
2.3.6 Standard I/O Interface Models 49
2.4 Fitting and Falsification 51
2.4.1 The Loss Function 52
2.4.2 Nesting and Fair Tests 54
2.4.3 Evaluating Loss and its Derivatives 55
2.4.4 Predictor 56
2.4.5 Equivalent Discrete−time Model 56
2.5 Performance Optimization 57
2.5.1 Controlling the Updating of Sensitivity Matrices 58
2.5.2 Exploiting the Sparsity of Sensitivity Matrices 59
2.5.3 Using Performance Optimization 60
2.6 Search Routine 62
2.7 Applicability 65
2.7.1 Applications 65
2.7.2 A Method for Grey−box Model Design 67
2.7.3 What is Expected from the User? 68
2.7.4 Limitations of MoCaVa 69
2.7.5 Diagnostic Tools 69
2.7.6 What Can Go Wrong? 71
Part II Tutorial on MoCaVa

3 Preparations 77
3.1 Getting Started 77
3.1.1 System Requirements 77
3.1.2 Downloading 77
3.1.3 Installation 77
3.1.4 Starting MoCaVa 78
3.1.5 The HTML User’s Manual 78
3.2 The ‘Raw’ Data File 78
3.3 Making a Data File for MoCaVa 78
4 Calibration 83
4.1 Creating a New Project 83
4.2 The User’s Guide and the Pilot Window 85
4.3 Specifying the Data Sample 86
4.3.1 The Time Range Window 86
4.4 Creating a Model Component 88
4.4.1 Handling the Component Library 89
4.4.2 Entering Component Statements 90
4.4.3 Classifying Arguments 92
4.4.4 Specifying I/O Interfaces 95
4.4.5 Specifying Argument Attributes 98
4.4.6 Specifying Implicit Attributes 100
4.4.7 Assigning Data 100
4.5 Specifying Model Class 101
4.6 Simulating 103
4.6.1 Setting the Origin of the Free Parameter Space 103
4.6.2 Selecting Variables to be Plotted 104
4.6.3 Appraising Model Class 105
4.7 Handling Data Input 106
4.8 Fitting a Tentative Model Structure 107
4.8.1 Search Parameters 108
4.8.2 Appraising the Search Result 111
4.9 Testing a Tentative Model Structure 113
4.9.1 Appraising a Tentative Model 116
4.9.2 Nesting 118
4.9.3 Interpreting the Test Results 119
4.10 Refining a Tentative Model Structure 121
4.11 Multiple Alternative Structures 122
4.12 Augmenting a Disturbance Model 124
4.13 Checking the Final Model 132
4.14 Terminals and ‘Stubs’ 134
4.15 Copying Components 135
4.16 Effects of Incorrect Disturbance Structure 138
4.17 Exporting/Importing Parameters 140
4.18 Suspending and Exiting 141
4.18.1 The Score Table 142
4.19 Resuming a Suspended Session 143
4.20 Checking Integration Accuracy 143
5 Some Modelling Support 147
5.1 Modelling Feedback 147
5.1.1 The Model Class 148
5.1.2 User’s Functions and Library 153
5.2 Rescaling 154
5.3 Importing External Models 159
5.3.1 Using Dymola as Modelling Tool for MoCaVa 160
5.3.2 Detecting Over−parametrization 166
5.3.3 Assigning Variable Input to Imported Models 170
5.3.4 Selective Connection of Arguments to Dymola Models 173
Part III Case Studies

6 Case 1: Rinsing of the Steel Strip in a Rolling Mill 185
6.1 Background 185
6.2 Step 1: A Phenomenological Description 185
6.2.1 The Process Proper 185
6.2.2 The Measurement Gauges 188
6.2.3 The Input 189
6.3 Step 2: Variables and Causality 189
6.3.1 The variables 189
6.3.2 Cause and effect 190
6.3.3 Data Preparation 191
6.3.4 Relations to Measured Variables 192
6.4 Step 3: Modelling 194
6.4.1 Basic Mass Balances 194
6.4.2 Strip Input 201
6.5 Step 4: Calibration 203
6.6 Refining the Model Class 206
6.6.1 The Squeezer Rolls 206
6.6.2 The Entry Rolls 211
6.7 Continuing Calibration 213
6.8 Refining the Model Class Again 215
6.8.1 Ventilation 215
6.9 More Hypothetical Improvements 217
6.9.1 Effective Mixing Volumes 217
6.9.2 Avoiding the pitfall of ‘Data Description’ 219
6.10 Modelling Disturbances 222
6.10.1 Pickling 222
6.10.2 State Noise 223
6.11 Determining the Simplest Environment Model 225
6.11.1 Variable Input Acid Concentration 225
6.11.2 Unexplained Variation in Residual Acid Concentration 225
6.11.3 Checking for Possible Over−fitting 229
6.11.4 Appraising Roller Conditions 233
6.12 Conclusions from the Calibration Session 233
7 Case 2: Quality Prediction in a Cardboard Making Process 235
7.1 Background 235
7.2 Step 1: A Phenomenological Description 235
7.3 Data Preparation 237
7.4 Step 2: Variables and Causality 244
7.4.1 Relations to Measured Variables 247
7.5 Step 3: Modelling 248
7.5.1 The Bending Stiffness 248
7.5.2 The Paper Machine 253
7.5.3 The Pulp Feed 260
7.5.4 Control Input 262
7.5.5 The Pulp Mixing 265
7.5.6 Pulp Input 267
7.5.7 The Pulp Constituents 269
7.6 Step 4: Calibration 271
7.7 Expanding the Tentative Model Class 279
7.7.1 The Pulp Refining 279
7.7.2 The Mixing−tank Dynamics 284
7.7.3 The Machine Chests 287
7.7.4 Filtering the “Kappa” Input 289
7.8 Checking for Over−fitting: The SBE Rule 290
7.9 Ending a Calibration Session 293
7.9.1 ‘Black−box’ vs ‘White−box’ Extensions 293
7.9.2 Determination vs Randomness 294
7.10 Modelling Disturbances 295
7.11 Calibrating Models with Stochastic Input 296
7.11.1 Determination vs Randomness Revisited 299
7.11.2 A Local Minimum 304
7.12 Conclusions from the Calibration Session 306
A Mathematics and Algorithms 313
A.1 The Model Classes 313
A.2 The Loss Derivatives 316
A.3 The ODE Solver 317
A.3.1 The Reference Trajectory 317
A.3.2 The State Deviation 318
A.3.3 The Equivalent Discrete−time Sensitivity Matrices 318
A.4 The Predictor 321
A.4.1 The Equivalent Discrete−time Model 322
A.5 Mixed Algebraic and Differential Equations 322
A.6 Performance Optimization 326
A.6.1 The SensitivityUpdateControl Function 327
A.6.2 Memoization 330
A.7 The Search Routine 330
A.8 Library Routines 331
A.8.1 Output Conversion 331
A.8.2 Input Interpolators 331
A.8.3 Input Filters 334
A.8.4 Disturbance Models 335
A.9 The Advanced Specification Window 337
B.2.1 Optimization for Speed 337
B.2.2 User’s Checkpoints 338
B.2.3 Internal Integration Interval 338
B.2.4 Debugging 339
Glossary 341
References 345
Index 349
Part I Theory of Grey-box Process Identification

1 Prospects and Problems

1.1 Introduction

"System identification" is typically defined as follows: "Given a parametric class of models, find the member that fits given experiment data with the minimum loss according to a given criterion" (Ljung, 1987). Now, the three "given" conditions concern anyone who intends to apply it, whether that is in the form of theory, method, or computer program. Sometimes "given" means that prerequisites are built into the software, sometimes that they are expected as input from the user of the software.
When one is faced with a given object instead, and possibly also with a given purpose for the model, it is certainly not obvious how to get the answers to the questions posed by identification software. It is therefore important that developers of such software do what they can to facilitate the answering. It is not necessarily a desirable ambition to make the software more automatic by demanding less from the user. He or she is still responsible for the quality of the result, and any input that a user is able to provide, but is not asked for, may be a waste of information and reduce the quality of the model. A better goal is therefore to make the software demand its input in a form that the user can supply more easily.

Secondly, user input (both prior knowledge and experiment data) is often uncertain, irrelevant, contradictory, or even false. A second goal for the software designer is therefore to provide tools for appraising the user's input. Admittedly, any software must have something 'given', but it makes a difference whether the software wants assumptions, taken for facts, or just hypotheses that will be subject to tests. This motivates the decision to base MoCaVa on the 'grey-box' approach.
The general and somewhat vague idea of grey-box identification is that when one is making a mathematical model of a physical object, there are two sources of information, namely response data and prior knowledge. And grey-box identification methods are such methods that can use both.

In practice, "prior knowledge" means different things. And generally, prior knowledge is not easy to reconcile with the form of the models assumed by a particular identification method. In fact, each method starts with assuming a model class, and each model class requires its particular form of prior knowledge. What one can generally do in order to take prior knowledge into account is to start with a versatile class of models, for which there are general tools available for analysis and identification, and try and adapt its freedom, its 'design parameters', i.e., the specifications one has to enter into the identification program, to the prior knowledge. This means that the 'grey-box identification methods' tend to be as many and as diversified as the conventional identification methods, also starting with given classes of models. This makes it hard to delimit grey-box identification from other identification, and also to make a survey of 'grey-box identification methods'.
Neither is that the purpose of this chapter. Instead, it is to survey the fundamentals the MoCaVa software is based on. A user of the program will conceivably benefit from an understanding of the purposes of the operations performed by various routines in the program. Generally, MoCaVa is constructed by specializing and codifying the general concepts used in (Bohlin, 1991a) and following one of the procedures derived in that book.

In addition, the chapter will briefly discuss the prospects and problems of developing grey-box identification software further.
develop-1.2 Black, White, and Grey Boxes
Commercially available tools for making mathematical models of dynamic processes are of two kinds, with different demands on the user. On one hand there are modelling tools, generally associated with simulation software (e.g., Dymola, http://www.dynasim.se/www/Publications.pdf), which require the user to provide a complete specification of the equations governing the process, either expressed as statements written in some modelling language, such as Modelica® (Tiller, 2001), or by connecting components from a library. This alternative may be supported by combining the modelling tools with tools for parameter optimization (e.g., HQP, http://sourceforge.net/projects/hqp). Call this "white-box" identification.

On the other hand there are "black-box" system identification tools (e.g., the MATLAB® System Identification Toolbox), which require the user to accept one of the generic model structures (e.g., linear) and then to determine which tools to use in the particular case, and in what order, as well as the values of a number of design parameters (order numbers, weighting factors, etc.). Finally, the user must interpret the resulting model, which is expressed in a form that is not primarily adapted to the physical object. Unless the model is to be used directly for design of feedback control, there is some further translation to do.
Generally, the user has two sources of information on which to base the model making: prior knowledge and experiment data. "White-box" identification uses mainly one source and "black-box" identification the other. The strength of "white-box" identification is that it allows the user to exploit invariant prior knowledge. Its weakness is its inability to cope with the unknown and with random effects in the object and its environment. The latter is the strength of "black-box" identification based on statistical methods, but also means that the reproducibility of its results may be in doubt. In essence, "black-box" identification produces 'data descriptions', and repeating the experiment may well produce a much different model. This may or may not be a problem, depending on what the model is to be used for.
The idea of "grey-box" identification is to use both sources, and thus to combine the strengths of the two approaches in order to reduce the effects of their weaknesses. When following Ljung's definition of "system identification", and regardless of the 'colour' of the 'box', the designer of a model of a physical object must do two things: i) specify a class of models, and ii) fit its free elements to data. Call this "Modelling" and "Fitting". A method with a darker shade of 'grey' uses less prior knowledge to delimit the model class. Even if most available identification methods tend to be more or less 'grey', the following notations allow a formal distinction between the generic 'white', 'black', and 'grey box' approaches to model design.
1.2.1 White−box Identification
Since both the model class definition and the fitting procedure are implemented as algorithms, they can be described formally as functions.
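In outline – with notation as in the following paragraphs, and with the hat on θ used here only to mark the fitted values – the two functions can be sketched as:

Modelling:  F(u_t, θ) → z(t|θ)

Fitting:  θ̂ = arg min_θ E(u_N, θ)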
The model designer specifies the model class F, which may contain a given number of unknown parameters θ. Given a control sequence u_t (where the subscript denotes the input history from some initial time up to the present time t) and the parameter vector θ, a simulation program allows the computing of the model's response z(t|θ) at any time. Any unknown parameters θ are then estimated by applying an optimization program minimizing the deviation between measured response data y_N and those components of the model's output z_N that correspond to the measured values. The deviation is measured by a given loss function E. The latter is usually a sum of squared instantaneous deviations, but various filtering schemes may be used to suppress particular types of data contamination.
instanta-The following are some well−knownobstacles to designing “white boxes” in tice:
prac-: Unknown relations between some variables: Engineers often do not have the
com-plete mathematical knowledge of the object to be able to write a simulation model
: Too many relations for convenience: When they do have the knowledge, the result
is often too complex a model to be possible to simulate with the ease required forparameter fitting Many physical phenomena are describable only by partial differ-ential equations Simulation would then require supercomputers, and identifica-tion an order of magnitude more (Car and airplane designers could possibly affordthe luxury.)
: Unknown complexity: It falls solely on the designer to determine how much of the
known relations to include in the model
: Sensitivity to low−frequency disturbances: Comparing output of deterministic
models with data in the presence of low−frequency disturbances generally givespoor parameter estimates
: Primitive validation: If one would try and use only literature values for parameters,
or make separate experiments to determine some of them, in order to avoid thecumbersome calibration of a complex model and the usually expensive exper-imentation on a large process, this makes it the more difficult to validate the model
Remark 1.1 The sensitivity to disturbances can sometimes be reduced by clever design of the loss function. This requires some prior information on the object's environment.
envi-Example 1.1
Consider a cylindrical tank with cross-section area A filled with liquid of density ρ up to a level z, under pressure p, and having a free outlet at the bottom with area a. The tank is replenished with the volume flow f. According to Bernoulli's law the variations in the level will be governed by the following differential equation:

dz/dt = −a √(z g + p/ρ) + f/A   (1.3)

With u = (f, p) as varying control variables, the equation cannot be solved analytically, but given values of θ = (A, a, ρ, g), an ODE solver will be able to produce a sequence of values {z(kh|θ) | k = 1, ..., N} of z sampled with interval h. Hence F is defined as the ODE solver operating on an equation of some form like

der(z) = -a*sqrt(z*g + p/rho) + f/A

with given constant parameters a, A, g, rho and variable control input p, f.
With a recorded sequence of measurements y_N = {y(kh) | k = 1, ..., N} of the tank level z during an experiment with known, step-wise changing input sequences u_N, it will be possible to set up and evaluate the loss function

E(u_N, θ) = Σ_{k=1}^{N} [y(kh) − z(kh|θ)]²

for any given value of θ. Applying an optimization program, it will then be possible to minimize the loss function with respect to any combination of the parameters, and in this way estimate the values of any unknowns among (A, a, ρ), but not the value of gravity g.
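As an illustration of this white-box procedure, the following Python sketch simulates the tank equation and fits (A, a, ρ) by minimizing the sum of squared deviations. The input profiles, the 'true' parameter values, the noise level and the solver settings are invented for the illustration and are not taken from the example above:

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

g, h, N = 9.81, 1.0, 200                      # gravity, sampling interval, number of samples
t_obs = h * np.arange(1, N + 1)

def f_in(t):                                   # replenishing flow (made-up step profile)
    return 0.02 + 0.01 * (t > 100)

def p_in(t):                                   # tank pressure (made-up step profile)
    return 1000.0 * (1.0 + 0.5 * (t > 50))

def simulate(theta, z0=1.0):
    """Simulate the level z(kh|theta) for theta = (A, a, rho)."""
    A, a, rho = theta
    def rhs(t, x):
        z = max(x[0], 0.0)
        return [-a * np.sqrt(z * g + p_in(t) / rho) + f_in(t) / A]
    sol = solve_ivp(rhs, (0.0, t_obs[-1]), [z0], t_eval=t_obs, max_step=h)
    return sol.y[0]

# Synthetic 'measurements' generated from an assumed true parameter vector.
theta_true = np.array([2.0, 0.003, 1000.0])
rng = np.random.default_rng(0)
y = simulate(theta_true) + 0.01 * rng.standard_normal(N)

def loss(theta):
    """Sum of squared deviations between data and simulated response, E(u_N, theta)."""
    return float(np.sum((y - simulate(theta)) ** 2))

res = minimize(loss, x0=np.array([1.5, 0.002, 900.0]), method="Nelder-Mead")
print("estimated (A, a, rho):", res.x)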
1.2.2 Black−box Identification
Defining this case is somewhat more complicated, since the task usually involves determining one or more integer 'order' numbers, the values of which determine the number of parameters to be fitted (Ljung, 1987; Söderström and Stoica, 1989).

The designer cannot change F_n, which is particular to the method, except by specifying an order index n. The latter normally determines the number of unknown parameters θ_n. However, the model class accepts a second, random input signal ω_t (usually 'white noise') in order to model the effects of random disturbances. For given order numbers the parameters θ_n are estimated by minimizing the deviation between response data and m-steps-predicted output (usually one step) according to a given loss function E. The difference between the model and the predictor is that the latter uses previous, m steps delayed, response data y_{t−m} in addition to the control sequence u_t for computing the predicted responses. The predictor P_n is uniquely determined by F_n. However, exact and applicable predictors are known only for special classes F_n, and this limits the versatility of black-box identification programs. Unknown orders n are usually determined by increasing the order stepwise, and stopping when the loss reduction drops below a threshold χ². A popular alternative is to use a loss function that weights the increasing complexity associated with increasing n, which allows minimization with respect to both the integer parameters n and the real parameters θ_n (Akaike, 1974). The model classes are most often linear, but nonlinear black-box model classes are also used (Billings, 1980).
mini-The following are practical difficulties:
– Restricted and unfamiliar form: Many engineers do not feel comfortable with models produced by black-box identification programs based on statistical methods. Mainly, the model structure and parameters do not have a physical interpretation, and this makes it difficult to compare the estimates with values from other sources.
– Over-parametrization: The number of parameters increases rapidly with the number of variables, and even more so when the model class is nonlinear. This leads easily to 'over-fitting', with all sorts of numerical problems and poor accuracy.
– Poor reproducibility: What is produced is a 'data description'. If this is also to be an 'object description' the model class must contain a good model of the object. If it does not – if much of the variation in the data is caused by phenomena that are not modelled well enough by F_n as effects of known input u_t – the fitting procedure tends to use the remaining free arguments ω_t and θ_n to reduce the deviations. In other words, what cannot be modelled as response to control will be modelled as disturbance. In this way even a structurally wrong model may still predict well at a short range. If the data sequence is long, the estimated parameter accuracy may even be high. This means that one may well get a good model, with good short-range predicting ability and a high theoretical accuracy, but when the identification is repeated with a different data set, an equally 'good' but different model is obtained. That will not necessarily mean that the object has changed between the experiments; it may be a consequence of fitting a model with the wrong structure. Generally, it will be difficult to get reproducibility with black-box models, unless the dynamics of the whole object are invariant, including the properties of disturbances, and the model structure is right.
The basic cause of the poor reproducibility of black boxes is that it is not possible to enter the invariant and object-specific relations that are the basis of white-box models. To gain the advantages of convenience and quick results, the model designer is in fact willing to discard any result from previous research on the object of the modelling.
Remark 1.2 Adaptive control will conceivably be able to alleviate the effects of poor reproducibility, and benefit from the good predictive ability of the model, but this can be exploited only for feedback control of such variables that have online sensors. Monitoring of unmeasured variables, as well as control with long delays, will still be hazardous.

Remark 1.3 Tulleken (1993) has suggested a way to force some prior knowledge into black-box structures, thus making the models less 'black'.
Example 1.2
With the same tank object as in Example 1.1, one could choose to ignore the findings of Bernoulli and describe the process as a "black box". A linear model is the most popular first choice, but if one would suspect that the process is nonlinear, and also take into account some rudimentary prior knowledge (that a hole at the bottom tends to reduce the level), the following heuristic form would also be conceivable:

dz/dt = p_1 z + p_2 p + p_3 f − [p_4 z + p_5 p + p_6 f]^α   (1.9)

Incidentally, this form contains the 'true' process, Equation 1.3, with p_1 = p_2 = p_6 = 0, p_3 = 1/A, p_4 = a²g, p_5 = a²/ρ, and α = ½. But normally, that is not the case.
A more likely, and 'blacker', form would be

dz/dt = p_1 + p_2 z + p_3 p + p_4 f + p_5 z² + p_6 p² + p_7 f²   (1.10)

This will define a deterministic black box of second order, F_n(u_t, 0, θ_n) → z(t|θ_n), where n = 2, u = (f, p) and θ_2 = (p_1, ..., p_7). It can be processed as in Example 1.1. If the parameters are many enough, if measurements are accurate, and if the experiment is not subject to external or internal disturbance, the resulting model may even perform almost as well as the white box.
experi-If, however, the varying pressure p is not recorded, it might still be possible to use
the following form
dz dt = p1+ p2z + p4f + p5z2+ p7f2+ v (1.11)
where ω is ‘white noise’, and v is ‘Brownian motion’ to model the unknown term
p3p + p6p2 Hence, θ2= (p1, p2, p4, p5, p7, p8)
When models have unknown input it becomes necessary to find the one-step (or m-step) predictor associated with them, in order to be able to minimize the sum of squares of prediction errors. Exact predictors are known only for some classes of models. And even if the model belongs to a class which does allow a predictor to be derived, the derivation is usually no simple task.
deriva-However, black−box identification programs have already done this for fairly eral classes of models that do allow exact derivation One such class is the NARMAX(for Nonlinear Auto Regressive Moving Average with eXternal input) discrete−timemodel class (Billings, 1980)
gen-y (τ) +
nz ν=1
na k=1
a ν
k P ν [y(τ − k)]=
nz ν=1
nb k=1
b ν
k P ν [u(τ − k)]
+ c0w (τ) +
nc k=0
The parameter array θ nis the collection of all a ν
k,b ν
k, andc kin Equation 1.13
Notice that Equation 1.13 contains only measured output y in addition to the input u, which is an essential restriction, but makes it easy to derive a predictor (which is why the class is defined in this way). Since the values of w(τ) can be computed recursively from Equation 1.13, and since E{w(τ) | y_{τ−1}} = 0, the predictor follows directly as

ŷ(τ|τ−1) = −Σ_{ν=1}^{n_z} Σ_{k=1}^{n_a} a_k^ν P_ν[y(τ−k)] + Σ_{ν=1}^{n_z} Σ_{k=1}^{n_b} b_k^ν P_ν[u(τ−k)] + 0 + Σ_{k=1}^{n_c} c_k w(τ−k)   (1.14)

The special case of n_c = 0 (NARX) is particularly convenient, since the predictor in Equation 1.14 will then be a linear function in all unknown parameters, and the loss function therefore quadratic; this makes it technically easy to fit a large number of parameters.
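Because the NARX predictor is linear in the parameters, fitting it amounts to ordinary least squares on a matrix of lagged polynomial regressors. The sketch below illustrates this for a single input and output; the function names and the choice P_ν(x) = x^ν are made for the illustration:

import numpy as np

def narx_regressors(y, u, na, nb, nz):
    """Regressor matrix of lagged polynomial terms P_nu[y(t-k)] and P_nu[u(t-k)]."""
    kmax = max(na, nb)
    rows, targets = [], []
    for t in range(kmax, len(y)):
        row = [y[t - k] ** nu for nu in range(1, nz + 1) for k in range(1, na + 1)]
        row += [u[t - k] ** nu for nu in range(1, nz + 1) for k in range(1, nb + 1)]
        rows.append(row)
        targets.append(y[t])
    return np.array(rows), np.array(targets)

def fit_narx(y, u, na, nb, nz):
    """Least-squares fit of y(t) = Phi(t) @ theta + e(t); theta collects (-a, b)."""
    Phi, target = narx_regressors(y, u, na, nb, nz)
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    loss = float(np.sum((target - Phi @ theta) ** 2))
    return theta, loss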
Now back to the original model, Equations 1.11 and 1.12. If, for simplicity, one assumes that the sampling is dense enough, the model can be replaced by a discrete-time difference equation (Equation 1.21). Assuming that the measurements are accurate enough to allow y(τ) to replace z(τh), and with u(τ) = f(τh), makes Equation 1.21 embedded in Equation 1.13, with P_ν(u) ≡ u^ν. After a and b have been determined by minimizing Equation 1.15, the remaining parameter c_0 can be computed from the minimum loss.

However, reconstructing the original parameters p from the estimated a, b, and c creates an over-determined system of equations; there are five unknowns and nine equations. This can be solved too, for instance using a pseudo-inverse, but it still causes a complication, since the relations are case dependent and not preprogrammed into the identification package.
If one would want to avoid even having to determine the order numbers and setting up Equations 1.11 and 1.12, and hence also to reconstruct the parameters, it is possible to specify sufficiently large order numbers, and let the necessary order numbers be determined by the identification program. The SFI rule (Stepwise Forward Inclusion) achieves this (Leontaritis and Billings, 1987). It is a recursive procedure:

The SFI rule:
  Initialize n_a = n_b = n_c = n_z = 0
  While significant reduction, repeat:
    For x ∈ (a, b, c, z), do:
      Alternative order numbers: n → ν; ν_x + 1 → ν_x
      Compute the alternative loss Q_x
    If the best alternative reduces the loss significantly, then accept it and indicate significant reduction
It is possible to design the loss function E and compute the χ² threshold in such a way that the decision on model order can be associated with risk values. The "Maximum-Power" loss function used by MoCaVa minimizes the risk of choosing a lower order when a higher order is correct. The threshold value χ² is based on a given risk of choosing the higher-order model when the lower-order one is correct.
1.2.3 Grey-box Identification

In practice, the usable grey-box model classes are restricted to those that the software is able to convert into algorithms simple enough to allow i) simulation, ii) automatic derivation of at least an approximate predictor, and iii) fitting of parameters. The particular limitations imposed by MoCaVa will be specified below.
der-Remark 1.4 Continuous−time white noise into nonlinear models has to be handled
with care (Åström, 1970; Graebe, 1990b) In practice the equations have to be
discre-tized, and ω replaced by discrete−time white noise w (Section A.1, Restriction #3).
As would be expected, there are difficulties also with grey-box identification; some have been experienced using MoCaVa3 and its predecessors. Some may vanish with further development, others are fundamental and will remain:
– Heavy computing: MoCaVa needs, in principle, to evaluate the sensitivities of all state derivatives with respect to all states and all noise variables for all instants in the time range of the data sample, and for deviations in all parameters that are to be fitted. And this must be repeated until convergence. And again, the whole process must be repeated until a satisfactory model structure has been found. In the worst case each evaluation requires access to the model, which altogether creates a very large number of executions of the algorithm defining the model. For other than small models the dominating part of the execution time is spent inside the user's model. Since the time it takes to run the user's model once is not 'negotiable', the only option for improving the design of MoCaVa is to try and reduce the number of model accesses by taking shortcuts. However, since the model structure is relatively free, it is difficult to exploit special structural properties in order to be able to find the shortcuts, like the black-box methods are able to. A way that is still open is to have MoCaVa analyze the user-determined structure, in order to find such shortcuts. MoCaVa3 is provided with some tools to do this (see Section 2.5).
– Interactive: It is difficult to reduce the time spent by the user in front of the computer, for instance by doing the heavy computing overnight.
– Failures: More freedom means more possibilities to set up problems that MoCaVa cannot cope with. The result may be that the search cannot fit parameters or, worse, produces a model that is wrong, because the assumptions built into its design are not satisfied in the particular case. The 'advanced' options that may become necessary to use with complex models require some user's specifications of approximation levels, and this adds another risk. The causes of failures are discussed in Section 2.7.6.
– Stopping: Available criteria for deciding when to stop the calibration session are somewhat blunt instruments. When a model cannot be falsified by the default conditional tests, this may well be so because the user has run out of ideas on how to improve it. In that case unconditional tests will have to do. However, they do not generally have maximum power, and therefore have a larger risk of letting a wrong model pass. A user may have to supply subjective assessment, in particular by looking at 'transients' in the plotted residual sequences.
– Too much stochastics: Stochastic models are marvellous short-range predictors, and therefore generally excel in reducing the loss, in particular with slowly responding processes. Technically, they have at their disposal the whole sequence of noise input to manipulate, in addition to the free parameters, in order to reduce the loss. However, they have a tendency to assign even some responses of known input to disturbances, if given too much freedom to do so. The result is inferior reproducibility, since disturbances are by definition not reproducible.

Example 1.3
Return again to the tank object, Equation 1.3, and assume that the varying pressure p has not been recorded during the experiment. Since the model class F_n is not preprogrammed but defined by the user in each case, the user must enter code like

der(z) = -a*sqrt(z*g + p/rho) + f/A

and, in addition, specify a model for describing the unknown p. The latter may well be a black-box (preprogrammed) model, unless one knows something particular about p that does not let it be modelled by a black box.
Since p is not in the data, the next step is to find a predictor for it, and for the output. When the model is nonlinear, an optimal predictor is usually not practical, but suboptimal predictors are. Most common are various types of EKF (Extended Kalman Filter).

Armed with such a predictor, it is possible to proceed as in the black-box case, although the mode of operation of the program will be different. Mainly, the EKF (which is preprogrammed) must call a function that defines the model class, and which depends on the entered code, and therefore must be compiled and linked for each case (like in white-box identification, and like in any ODE solver in a simulation program).
de-If, again for simplicity, the sampling interval h is short enough, and if the Brownian motion is used to model p, the discrete−time equivalent of the model will be
z (t + h) = z(t) + h [− a z(t) g + p(t) Ã + f (t) A] (1.25)
wherew iare uncorrelated Gaussian random sequences with zero means and unit
vari-ances, and σ and λ are parameters introduced to measure the average size of ment errors and the average rate of change of the unknown pressure p Unlike in Exam-
measure-ple 1.2, the measurements errors need not be small
It is convenient to use the state-equation form, with state x = (z, p) (Equations 1.28 to 1.31), and to base the predictor on an EKF with A(τ) = ∇_x G(x̄) and C = H. The optimal filter gain K(τ) is computed using an algorithm that involves the solution R_xx(τ) of the Riccati equation associated with Equations 1.28 to 1.31.
Remark 1.5 It is more common to use G(x̂) in Equation 1.35 instead of G(x̄), since this makes a better approximation, should the estimate x̂ drift far from the reference trajectory x̄. On the other hand, this will make the EKF more susceptible to large disturbances, and will thus increase the risk of instability. With Equation 1.35, both this and Equation 1.36 are stable as long as the model is, while a negative value of x̂_1 g + x̂_2/ρ, for instance caused by a spurious value in y(τ), would cause a run-time error in the evaluation of G(x̂).
Notice that most of the matrices that are needed to handle the consequences of having an unknown input depend on τ, which means that the calculations generally take much more time than with a white box.
The predictor is given by Equation 1.33, the prediction error by Equation 1.34, and the loss function is computed from Equation 2.23:

Q(θ) = ½ Σ_{k=1}^{N} [ε(k|θ)² / r(k|θ) + ln r(k|θ)]

where ε is the prediction error and r its variance.
It is a function of θ = (a, A, ρ, λ, σ), which can therefore be estimated by an optimization routine, like in the white-box case.
An option that can be copied from the black-box identification procedure is the estimation of model complexity, by testing whether all parameters are significant or, alternatively, whether some could possibly be left out from the model. For instance, the SFI rule will work, and risk values for making a wrong decision can be computed.
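The prediction-error computation for this example can be sketched as below: a discrete-time extended Kalman filter for the state (z, p) accumulates the Gaussian prediction-error loss, which is then minimized over θ = (a, A, ρ, λ, σ). The discretization, the initial state and covariance, and the noise parametrization are assumptions made for this illustration, not MoCaVa's internal algorithm:

import numpy as np
from scipy.optimize import minimize

def ekf_loss(theta, y, f, h, g=9.81):
    """Gaussian prediction-error loss for the tank with unmeasured pressure p."""
    a, A, rho, lam, sigma = theta
    x = np.array([y[0], 0.0])                   # state estimate (z, p); initial guess
    P = np.diag([1.0, 1.0e4])                   # initial state covariance (illustrative)
    Q = np.diag([0.0, h * lam ** 2])            # process noise of the Brownian pressure
    R = sigma ** 2
    C = np.array([1.0, 0.0])                    # only the level z is measured
    loss = 0.0
    for k in range(1, len(y)):
        # Time update through the nonlinear state equation and its Jacobian.
        s = np.sqrt(max(g * x[0] + x[1] / rho, 1e-9))
        x_pred = np.array([x[0] + h * (-a * s + f[k - 1] / A), x[1]])
        F = np.array([[1.0 - h * a * g / (2.0 * s), -h * a / (2.0 * rho * s)],
                      [0.0, 1.0]])
        P = F @ P @ F.T + Q
        # Measurement update and accumulation of the prediction-error loss.
        e = y[k] - C @ x_pred
        S = float(C @ P @ C + R)
        K = P @ C / S
        x = x_pred + K * e
        P = P - np.outer(K, C @ P)
        loss += 0.5 * (e ** 2 / S + np.log(S))
    return loss

# Example use (y, f are the recorded level and flow sequences, h the sampling interval):
# theta0 = np.array([0.003, 2.0, 1000.0, 10.0, 0.01])
# result = minimize(lambda th: ekf_loss(th, y, f, h=1.0), theta0, method="Nelder-Mead")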
1.3 Basic Questions
MoCaVa has been conceived with the following scenario in mind: Suppose a production process is to be described by a dynamic model for simulation or other purposes. A number of submodels (first principles or heuristic relations) for parts of the process are available as prior information, developed under more or less well controlled conditions. However, when the submodels are assembled into a model for the integrated process, all their input and output are no longer controlled or measured, and the environment is no longer the same as when the submodels were developed. In addition, unmodelled phenomena and unmeasurable input (disturbances) may affect the responses significantly. It is not known which of the submodels are needed for a satisfactory model, or whether there will still remain unexplained phenomena in the data when all prior information has been used. And, again, prior model information is more or less precise, reliable, and relevant.
This raises a number of questions from a model designer:
: How can I make use of what I do know? One usually knows some, but not all And
it may be a waste not to use it
: How much of my prior knowledge is useful, or even correct when used in the
par-ticular environment? Too much detailed knowledge tends only to contribute to thecomplexity of the model and less to satisfying the purpose for which it is made
Trang 31Formulas obtained from the literature are often derived and verified in an ment quite different from the circumstances under which they will be put to use.
: What do I do about the disturbances I cannot eliminate? This is the opposite problem: too little prior knowledge. The response of an object is usually the effect of two kinds of input, known and unknown; call the second kind “disturbances”. If one does not have a model for computing the unknown input, and cannot just neglect it, then some model will obviously have to be assumed instead.
: Are my experiment data sufficient and relevant? Can I use ordinary data loggings, obtained during normal production and therefore at little cost? Or do I have to make specially designed experiments (and lose production while they are going on)?
: How do I know when the model is good enough? It may (or may not) be hazardous just to try to use the model for the purpose it was designed for, for instance control, and see if it works. That depends, of course, on what it costs to fail.
1.3.1 Calibration
Needless to say, none of these questions can be answered in advance. Considering the diversity of a user's prior information, originating in a variety of more or less reliable sources, it is also very unlikely that one would be able to formulate, much less solve, a mathematical problem that, given prior input and data, would produce a 'best' model according to a given criterion (and thus be able to retain the usual definition of the identification problem).
However, it is possible to conceive a multistep procedure for making a model that satisfies many of the demands one may have on it, while taking the user's prior knowledge into account. The steps in this procedure require the solution of less demanding subproblems, like fitting to data and testing whether one model is significantly better than another. The literature offers principles and ideas for solving many of the subproblems, and a number of those have been compiled into a systematic procedure for grey-box identification (Bohlin, 1986, 1991a, 1994a). One of the procedures has also been implemented as a User's Shell (IKUS) to IdKit (Bohlin, 1993). However, its principles are general and can be implemented with other toolboxes that are general enough and open enough.
MoCaVa is based on two such ‘trial-and-error’ procedures, calibration and validation. The procedures operate on sets of models, since it is not given a priori how extensive the set has to be in order to satisfy the experiment data and the purpose of the model making.
The calibration routine finds the simplest model that is consistent with prior knowledge and not falsified by the experiment data. It is a double loop of refinement and falsification derived from basic principles of inference (Bohlin, 1991a):
Calibration procedure:
  While the model set is falsified, repeat
    Refine the tentative model set
    Fit model parameters
    Falsify the tentative model: Until falsified, repeat
      Specify an alternative model set
      If any alternative model is better, then indicate falsified
Notice that the procedure works with two sets and two models, namely tentative, which is the best so far, and alternative, which may or may not be better.
The questions that have to be answered are now i) how to specify a model set, ii) how to fit a model within a given set, and iii) how to decide whether an alternative model is better than a tentative one.
1.3.2 How to Specify a Model Set
The following first structuring of the model set F is motivated by the mode of operation of computers and common system software, like Unix and Windows. Assume there is a given component library $\{\mathcal{C}_i\}$, such that a given selection of its members will combine into a system defining an algorithm able to compute response values $z(t)$, given input arguments. Define the model sets
Model: $F(u_t, \omega_t, c, \nu, \theta_\nu)$   (1.39)
Model structure: $F(u_t, \omega_t, c, \nu, \cdot)$   (1.40)
Model class: $F(u_t, \omega_t, c, \cdot, \cdot)$   (1.41)
Model library: $F(u_t, \omega_t, \cdot, \cdot, \cdot)$   (1.42)
where
: A model library is the set of all models that can be formed by combining components $\mathcal{C}_i$. It is the maximum set within which to look for a model.
: A model class is a smaller set, defined by the argument $c$, which is an array of indices of selected components.
: A model structure is an even smaller set, where also the free-space index $\nu$ is given. This determines the dimension of the free parameter space with coordinates $\theta_\nu$.
: A model is a single member of the model structure, selected by specifying also the values of the free coordinates $\theta_\nu$. It includes all specifications necessary to carry out a simulation of the model, given the control input, the random sequence, and the time range.
Notice that this creates two means of refining a model set: with more components, or with more free parameters. A change of model class requires recompilation in order to generate efficient code; a change of model structure or model does not. The definition also concretizes 'prior knowledge' as (hypothetical) algebraic relations between variables, together with various variable attributes (like ranges, scales, values, and uncertainty). The case-specific model library contains the prior knowledge of the object, while class and structure will depend also on the experiment data.
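One way to picture the four nested sets in code is as a chain of small data structures, each adding one more piece of information. The class names and fields below are purely illustrative and are not MoCaVa's internal representation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class ModelLibrary:
    """All components available for the case (Equation 1.42)."""
    components: Sequence[Callable]       # one simulation routine per component

@dataclass
class ModelClass:
    """A selection c of components (Equation 1.41); changing it changes
    the generated code and therefore requires recompilation."""
    library: ModelLibrary
    c: Sequence[int]                     # indices of selected components

@dataclass
class ModelStructure:
    """A class plus the free-space index ν (Equation 1.40), which fixes
    the dimension of the free parameter space."""
    model_class: ModelClass
    nu: Sequence[int]                    # order numbers / parameter counts

@dataclass
class Model:
    """A single member of the structure (Equation 1.39): the structure
    plus numerical values of the free coordinates θ_ν."""
    structure: ModelStructure
    theta: Sequence[float]
```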
The slightly more structured procedure becomes
Calibration procedure:
  While the structure is falsified, repeat
    Refine the tentative model structure: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \cdot)$
    Fit model parameters: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \hat{\theta}_\nu)$
    Falsify the tentative model: Until falsified, repeat
      If no more alternative structures,
        then expand the alternative model class: → $F(u_t, \omega_t, c, \cdot, \cdot)$
      Specify alternative model structures: → $F(u_t, \omega_t, c, \nu, \cdot)$
      If any alternative model is better,
        then indicate falsified and assign $c \to \hat{c}$, $\nu \to \hat{\nu}$
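Read as ordinary control flow, the double loop might be sketched as follows. All callables (`fit`, `propose_alternatives`, `expand_class`, `is_better`) are placeholders for the fitting and testing tools discussed in Section 1.5 and Chapter 2; none of them are MoCaVa functions.

```python
def calibrate(tentative, data, fit, propose_alternatives, expand_class, is_better):
    """Schematic double loop of refinement and falsification."""
    while True:
        model = fit(tentative, data)              # fit the tentative structure
        candidates = list(propose_alternatives(tentative))
        falsified = False
        while not falsified:
            if not candidates:                    # no alternative structures left:
                wider = expand_class(tentative)   # widen the alternative model class
                if wider is None:                 # nothing left to try:
                    return model                  # the tentative model is unfalsified
                candidates = list(propose_alternatives(wider))
            alternative = candidates.pop(0)
            if is_better(fit(alternative, data), model, data):
                falsified = True                  # tentative structure falsified;
                tentative = alternative           # the better alternative becomes tentative
```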
The procedure fits and tests an expanding sequence of hypothetical model structures. It starts with those based on the simplest and most reliable components in the component library, for instance those based on mass and energy balances. The structure is then expanded by a procedure of 'pruning and cultivation': hypothetical submodels (components) are augmented and the structure is tested again. Those that do not contribute to reducing the value of the loss function are eliminated; those that do contribute become candidates for further refinement.
The procedure is interactive: the computer does the elimination, and also suggests the most probable of a limited number of alternatives; the model designer suggests the alternatives. In this way a user of MoCaVa is given an opportunity to exploit his or her knowledge of the physical object, and to exercise a probably increasing skill in modelling, in order to reduce the number of tests of alternatives that would otherwise be needed. The construction is based on the belief that, even if it is difficult to specify the right model structure in advance, an engineer is usually good at improving a model once it has been revealed where it fails. As a last resort it is possible to use empirical 'black boxes' to model those parts or physical phenomena of a process that have been revealed as significant, but for which there is no prior knowledge.
$P_1$ and $P_2$ are polynomials, for instance Legendre polynomials, of first and second order, and $\omega$ is continuous white noise with unit power density.
Then Equation 1.43 defines the model $F(u_t, \omega_t, c, \nu, \theta_\nu)$ with

$$\theta_\nu = (\theta^a_{11}, \ldots, \theta^a_{1n}, \theta^b_{11}, \ldots, \theta^b_{1n}, \theta^a_{21}, \ldots, \theta^a_{2n}, \theta^b_{21}, \ldots, \theta^b_{2n}, \theta^c_{1}, \ldots, \theta^c_{n}, \theta^d_{1}, \ldots, \theta^d_{n}) \qquad (1.46)$$

The point of the double indexing with $c$ and $\nu$ is that it allows the definition of a number of smaller model classes $F(u_t, \omega_t, c, \cdot, \cdot)$, for instance:
: Linear and deterministic: $c = (1)$, $\nu = (n_a, n_b)$
: Nonlinear and deterministic: $c = (1, 2)$, $\nu = (n_a, n_b, n_a, n_b)$
plus a number of less likely alternatives. Notice that a change of class changes the functions (differential equations) and hence the source code of the computer program, which means recompilation.
Each class allows a number of model structures $F(u_t, \omega_t, c, \nu, \cdot)$, defined by the values of the order numbers $\nu$, which also determine the number of parameters in the model structure. A change of structure does not generally require recompilation, provided enough space has been allocated for a maximum order, or dynamic allocation is used.
When the values of the parameters are also given, this defines the model.
The model library $F(u_t, \omega_t, \cdot, \cdot, \cdot)$ is the set of model classes from which the user can pick one by specifying $c$. Each transfer function in Equation 1.43 defines a component. If, as in this example, all can be combined, this generates eight model classes in the library, including the ‘null’ model class $y = 0$.
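Since every subset of the component set defines a class, three transfer-function components give $2^3 = 8$ classes, including the empty ('null') one. A small illustration follows; the component names are invented for the example.

```python
from itertools import combinations

# One entry per transfer-function component in Equation 1.43; the labels
# are invented for the illustration.
components = {1: "linear dynamics", 2: "nonlinear correction", 3: "disturbance filter"}

# Every subset of the component indices is a model class c, including the
# empty selection, i.e. the 'null' class y = 0.
classes = [c for r in range(len(components) + 1)
           for c in combinations(sorted(components), r)]
print(len(classes))          # -> 8
```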
1.4 … and a Way to Get Answers
How does this answer the original questions from a model designer?
: Question: How can I make use of what I do know?
Answer: By entering hypotheses, and by specifying which of those to try next.
: Question: How much of my prior knowledge is useful, or even correct when used
in the particular environment?
Answer: That is useful which reduces the loss significantly. It is correct if fitting its parameters yields values that do not contradict any prior knowledge.
: Question: What do I do with the disturbances I cannot eliminate?
Answer: Describe them as stochastic processes.
: Question: How do I know when the model is good enough?
Answer: There are two meanings of “good enough”:
i) The model is not good enough for the available data and prior knowledge as long as it can still be falsified.
ii) The model is good enough for its purpose when the validation procedure yields a satisfactory result.
: Question: Are my experiment data sufficient and relevant?
Answer: They are, if the validation procedure yields a result satisfying the purpose of the model design.
Remark 1.6. Valid logical objections can be raised against these rather flat answers. For instance, it is possible to conceive of cases where the experiment data are adequate for the purpose, but where the calibration procedure has failed to reveal errors in the model structure (because there are no better alternative hypotheses). It is also possible to conceive of disturbances that do not let themselves be described by stochastic processes of the types available in the library, or at all. Everything hinges on the assumption that the model library will indeed allow an adequate modelling.
Remark 1.7. Even if there are cases where it is theoretically correct to use the same data for calibration and validation ('auto-validation'), it is generally safer to base the validation on a second data set ('cross-validation'). Still, much of the result of the validation procedure hinges on the assumption that the second data set is demanding enough. If it is not, the validation procedure will not reveal an inadequate model; in fact, a failure will not be revealed until the model has been put to use and has failed obviously. The costs related to the latter event will therefore determine how much work to put into the validation process. For instance, paper machine control can afford to fail occasionally; Mars landers rather not at all.
Remark 1.8. Logically, the calibration and validation procedures have little to do with one another, since the meanings of a “good model” are different. A model may well be good enough for such a limited purpose as feedback control, and thus be easily validated in that respect, but still be unable to satisfy an extensive data sequence generated by a complex object. Conversely, a model fitted to a data sequence containing little dynamic information may well satisfy those data, as well as all one knows about the object in advance, but still be unable to satisfy its purpose when validated with different and more demanding data.
1.5 Tools for Grey-box Identification
The following is a list of what is needed for realizing the calibration and validationschemes:
: A versatile class of models: So that it does contain a suitable model for the particular purpose. It must be possible to simulate and fit the models conveniently.
: A tool to restrict this class according to prior knowledge: This is the whole point of the grey-box concept. It means that there must be some modelling tool allowing the user to formulate the prior knowledge conveniently. Model class restriction is what identification is all about, and user-supported restriction is what grey-box identification is all about.
: A tool to fit parameters: In order to find the model that agrees best with the data.
: A tool to falsify fitted models: In order to eliminate incorrect hypotheses about the object.
: A tool to validate models: So that the model will not be more complicated than needed for the purpose.
: A procedure to follow: In addition to the tool kit there is also a need for some kind of 'handbook' or 'guide' on how to build grey-box models using the tools. Again, grey-box model making is an interactive process: at each step, the software may or may not need more information from the user, or more data, depending on whether the result so far is satisfactory or not.
Remark 1.9. The list leaves out the problem of what to do when no model is good enough for the purpose. An answer is to try to get better data, and there are methods for doing this in the literature, again valid for certain classes of models. MoCaVa does not support this.
1.5.1 Available Tools
Some of the tools have been available for some time. Let's look at what they can and cannot do, in order to find out what more is needed.
Nonlinear State Models
A reasonably general form that evades the subtleties of continuous-time stochastic differential equations, and lets itself be simulated, is a continuous-discrete state-space model, where $x$ is the state vector, $u$ is the known control, $\omega$ and $w$ are continuous and discrete 'white noises', and $p$ are parameters.
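A standard continuous-discrete state-space form with exactly these ingredients is, for instance,

$$\frac{dx(t)}{dt} = f\bigl(x(t), u(t), p, t\bigr) + \omega(t), \qquad y(t_k) = h\bigl(x(t_k), u(t_k), p\bigr) + w(t_k),$$

with the continuous white noise $\omega$ driving the state equation and the discrete white noise $w$ corrupting the sampled output; the exact parametrization used in the packages referenced below may differ in detail.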
This is a rather versatile class that suits many physical processes, and it is used, for instance, in the identification toolboxes in the Cypros (Camo, 1987), MATRIXx (Gupta et al., 1993), IdKit (Graebe, 1990a-d; Graebe and Bohlin, 1992; Bohlin and Graebe, 1994a,b), and CTSM (Kristensen and Madsen, 2003) packages. Given models of this form, the tools fit the parameters p to given data. All use the Maximum Likelihood criterion, but different ways of fitting. The first two are commercial products; IdKit is not commercial, but it has been used for case studies and has been developed into MoCaVa.
Kristensen, Madsen, and Jørgensen (2004) use a somewhat more general form of Equation 1.48, where the variance of the 'diffusion term' ω may depend on u, p, and t.
What are the obstacles to a wider application of grey-box Maximum Likelihood identification tools? Mainly that they are difficult to use for anyone other than experts. There are difficulties with
: Modelling: Since one usually has to try a large number of structures before finding a suitable one, in particular with models of the complexity required by a full-scale industrial process, it becomes quite a tedious task to write all the models, and also to maintain the necessary housekeeping of all rejected attempts.
: Setup and interpretation: It is easy to set up meaningless problems for the tools to solve (which they gladly do). It is more difficult to see whether the solutions are any good.
: The state-differential equations also leave out important dynamical properties of some real objects, for instance those containing delays, or phenomena better described by partial differential equations, or containing hard discontinuities, like dry friction.
Modelling Tools
They are tools to enter prior knowledge. Examples are Simulink® (www.mathworks.com/products/simulink), Bond graphs (Margolis, 1992), Dymola (Elmqvist, 1978), Omola (Mattson et al., 1993), and Modelica® (Tiller, 2001). Simulink is probably the most well known, and the most adapted to the way control engineers like to describe systems; it generates the model-defining statements from block diagrams. The others are in principle model specification languages and tools, and they are normally combined with simulation programs that accept models defined in these particular languages. Sørlie (1994a, 1995a, 1996d) has shown a way to use Omola to write models for IdKit. It is still a considerable effort to write models in these languages, instead of directly in some programming language, such as M-files or C (in addition to the effort of learning a new language). However, the advantage of using a comprehensive modelling language is that it prevents the writing of inconsistent model equations. It is also possible to include extensive libraries of component models, thus simplifying the modelling. There is still no guarantee that the identification problems set up using these tools make sense.
The languages were developed for simulation purposes. There are some problems with using them for grey-box identification:
: Specialized languages: The languages are basic, and the user has to learn one of them. Like other computer languages they tend to develop towards covering more and more objects, and this makes them more general and more abstract. Libraries may show a way out, but are of course limited by what the vendor finds profitable to develop. In addition, since calibrating and validating a model is a much more demanding task than simulating it, the development tends to allow the writing of models increasingly less suitable for identification purposes. Again, more libraries may be a way out, if specialized to suit the identification purpose.
: ODE solving and parameter optimization: There are special numerical problems associated with combining standard optimizers with efficient ODE solvers that use step-length control. The numerical errors interfere. This means in practice that both integration and optimization have to be done with extremely high numerical precision (see the sketch after this list). There is at least one program (diffpar, Edsberg and Wikström, 1995) designed to do simultaneous integration and optimization; it handles only models without disturbances.
: Not predicting models: Grey-box identification is not simulation plus fitting; it is prediction plus fitting (and more). Modelling languages do not primarily produce predicting models. The difference is that a predictor uses past input and output to compute the next output, whereas a simulator uses only past input. The difference matters when disturbances are important. Whenever it pays to have feedback control it also pays to use a predictive model, most obviously if the purpose is Model Predictive Control. Even if it is possible, in principle, to derive a predicting model from a simulating one, this is no easy task. It is known as 'the nonlinear filtering problem', and, in fact, only a few cases have been solved so far. In practice it is not as bad as that, since approximating filters may be enough. Sørlie (1996) has investigated the possibilities of combining Omola with an Extended Kalman Filter.
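The numerical interference mentioned in the 'ODE solving and parameter optimization' item can be illustrated with a toy fit in which the ODE is solved inside the loss function. The model, data, and tolerance values are invented for the illustration; the point is only that the solver tolerances must be far tighter than the accuracy the optimizer expects of the loss.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

t_data = np.linspace(0.0, 5.0, 26)
y_data = np.exp(-0.7 * t_data)            # invented measurements of a decay process

def loss(p):
    """Sum-of-squares loss with the ODE solved inside the objective.
    Loose tolerances would make this function 'noisy' in p and ruin the
    search; hence the very tight rtol/atol."""
    sol = solve_ivp(lambda t, x: -p[0] * x, (0.0, 5.0), [1.0],
                    t_eval=t_data, rtol=1e-10, atol=1e-12)
    return np.sum((sol.y[0] - y_data) ** 2)

result = minimize(loss, x0=[0.3], method="Nelder-Mead")
print(result.x)                           # should come out close to 0.7
```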
Optimization Tools
Classical optimization methods are those of Fletcher-Powell and Newton-Raphson type, and there are well-developed computer libraries for doing the kind of continuous parameter optimization needed in white-, black-, and grey-box identification alike. A particular prerequisite of model fitting is that one cannot usually afford to evaluate the loss function a large number of times. Quasi-Newton methods are particularly effective for predictive models (Liao, 1989a): one obtains the fast convergence of a second-order search method from evaluations of first-order derivatives of the model output. However, this enhances the search problem in more difficult cases:
: Multiple minima: Global search methods, like the new and intuitively attractive method of 'genetic programming', tend to take an uncomfortably large number of loss evaluations. Alternatively, local search methods may have to be applied several times with different start values.
: Discontinuities: The presence of discontinuities in the model's parameter dependence ruins any search based on assumptions of continuity. Less serious, but still troublesome, are discontinuities in parameter sensitivity.
Validation and Falsification
Once again, these tasks have basically different purposes: falsification decides whether a model is good enough for the available data; validation decides whether it is good enough for its purpose. A model can be both “false” and “valid”, as well as any other of the four possible combinations of the outcomes of the two tests.
There are several quite general statistical tests for the falsification task, and most black-box identification packages support some of them, mainly 'chi-square' and 'cross-correlation' tests. They are typically used for order determination. Likelihood-Ratio tests are applicable to nonlinear models and, in addition, have maximum discriminating power, i.e., they have the maximum probability of rejecting an incorrect model for a given risk of rejecting a correct one.
Validation is conventionally done by making a loss function that reflects the purpose of the modelling, evaluating the loss for the candidate model, and seeing whether it is below a likewise given threshold. The simplest case is when the modelling is done for control purposes, because a suitable loss is then the prediction-error variance (with the model evaluated on a different data sample).
Remark 1.10. Falsification methods are sometimes found under the “validation” keyword in the literature.
… and has been implemented in MoCaVa.
1.5.2 Tools that Need to Be Developed
Generally, there are enough tools to make grey-box models, and evidence that it can be done in practice, if one knows how to use the tools. What remains is to make it easier. This is not without problems, however: the man-machine communication problem has to be considered, and communication has two directions:
: User input: What prior information is it reasonable to ask from the user? The problem is enhanced by the fact that users in different branches of engineering have different ways of looking at models, and therefore different kinds of prior knowledge. This means that, ideally, there should be different man-machine interfaces for different categories of users. The interface implemented in MoCaVa is designed for process engineers more than for control engineers.
: User support: The task that rests most heavily on the user is deciding what to do next, when a model has been found inadequate. What the computer can conceivably do to facilitate this is to present the evidence of the test results in a way that reveals at which point the model fails and that is also easy to understand. Unfortunately, general tests are rather blunt instruments in this respect. The result of a statistical test has the binary value of either “passed” or “failed” (in practice, it tends to be “failed”, since maximum-power statistical tests are honed to razor sharpness in that respect).
However, there are some means of getting more information out of testing a given model. An option in MoCaVa works in connection with the stepwise refinement and falsification of the model structure outlined above. It is based on an idea that can be illustrated by the following simple example. Assume that the current tentative structure is expanded by a free parameter $p$, whose value is known a priori to be positive. Instead of limiting the search to positive values, it is more informative to proceed as follows: do not limit the search to positive values. Then the test has one of three possible outcomes, as depicted in Figure 1.1: hypothesis $H_0$ represents the tentative model ($p = 0$) and $H_1$ an alternative ($p \neq 0$). The particular case that there is an alternative but inadmissible model with a significantly lower loss, $Q(p < 0)$, means that $H_0$ is still rejected (since a better model does exist), but $H_1$ is not the one, and the alternative structure does not contain one. This gives two pieces of information to the model designer: 1) continue the search for a better model, and 2) use another model structure. In addition, the component of the total model to improve is the one containing the unsuccessful expansion. This determines whether a component is worth cultivating or not. In conclusion, statistical tests give a two-valued answer, but tests combined with prior structure knowledge may yield more.
Remark 1.11. Notice that $H_0$ is rejected as soon as there is some alternative model $H_1$ within the alternative structure with a loss below the $\chi^2$ threshold. This means that there is no need to search for the alternative with the smallest loss in order to test the tentative model, except when it cannot be rejected.
Figure 1.1. Illustrating the three possible results of falsification (the loss $Q(p)$ plotted against $p$): $H_0$ not rejected; $H_0$ rejected with $H_1$ better; $H_0$ rejected but $H_1$ wrong.
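The three-way reading of the test can be written out explicitly as follows; the function, its arguments, and the 5% risk level are invented for the illustration.

```python
from scipy.stats import chi2

def three_way_outcome(Q0, Q1, p_hat, risk=0.05):
    """Three possible results of testing an expansion by one parameter p
    that is known a priori to be positive.  Q0 and Q1 are the losses of the
    tentative and the freely fitted alternative model, p_hat the fitted p."""
    significant = 2.0 * (Q0 - Q1) > chi2.ppf(1.0 - risk, df=1)
    if not significant:
        return "H0 not rejected: keep the tentative structure"
    if p_hat > 0:
        return "H0 rejected, H1 better: adopt the expanded structure"
    return ("H0 rejected, but H1 inadmissible (p < 0): keep searching, "
            "and improve the component containing this expansion")
```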
Conditional and Unconditional Tests
The rule used to decide whether a tentative model structure is falsified or not depends on the alternative structure, and is therefore 'conditional' on the alternative. 'Unconditional' tests do not assume an explicit alternative, but instead aim at testing the basic hypothesis that the known and unknown inputs are independent. If they are not, there is obviously information in the input data that could be used to improve the estimation of the unknown input, and thus the predicting ability of the model.
The disadvantage of unconditional tests is that they are less discriminating, i.e., they let a wider range of similar models pass the test. This is so because the set of implicit 'alternatives' is much wider. However, they are still applicable when the model designer has run out of useful prior knowledge.
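A simple unconditional check of the independence hypothesis is to cross-correlate the known input with the prediction residuals and compare against a confidence band. The sketch below is one common way of doing this; it is illustrative only and is not MoCaVa's test.

```python
import numpy as np

def independence_check(u, residuals, max_lag=20):
    """Flag dependence between the known input u and the prediction residuals.

    If any cross-correlation falls outside an approximate 95% band, there is
    information left in the data that a better model could exploit."""
    u = (u - u.mean()) / u.std()
    e = (residuals - residuals.mean()) / residuals.std()
    N = len(e)
    bound = 1.96 / np.sqrt(N)                    # ~95% band under independence
    for lag in range(max_lag + 1):
        r = np.dot(u[:N - lag], e[lag:]) / (N - lag)
        if abs(r) > bound:
            return True                          # dependence found: model inadequate
    return False
```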
The following modified calibration procedure takes into account the prospects offered by the various tests:
Calibration procedure:
  While there is a better model, repeat
    Refine the tentative model structure: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \cdot)$
    Fit model parameters: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \hat{\theta}_\nu)$
    Test the tentative model: Until better model, repeat
      If no more alternative structures,
        then expand the alternative model class: → $F(u_t, \omega_t, c, \cdot, \cdot)$
      Specify alternative model structures: → $F(u_t, \omega_t, c, \nu, \cdot)$
      If an alternative model is significantly better, then indicate falsified
      If an admissible alternative model is significantly better,
        then indicate better model and assign $c \to \hat{c}$, $\nu \to \hat{\nu}$
    If unfalsified, then test unconditionally: → falsified | unfalsified
2 The MoCaVa Solution
The analysis in Chapter 1 outlines what the purpose of the model making would require MoCaVa to do. That must be reconciled, somehow, with the restrictions set by what a computer can do in reasonable execution time. MoCaVa therefore contains further restrictions in order to compromise between the two. In essence, MoCaVa makes use of the following tools:
: Modified Likelihood-Ratio and correlation tests
: The general calibration procedure outlined in Section 1.5.2
: A collection of heuristic validation rules
Chapter 2 describes how these general tools are implemented, and motivates the restrictions that make it possible.
2.1 The Model Set
A second compromise that must be made in the design of MoCaVa is that between the conflicting goals of versatility and convenience in the user's modelling task. The model set used in MoCaVa is therefore structured further to adapt to common properties of industrial production processes, in particular to continuous transport processes. The latter may be characterized as systems of separate units, each accepting flows of commodities from one or more preceding units, changing their properties, and feeding the product to one or more following units. Since there is an obvious cause-and-effect relationship between the input and output variables of the units, state-vector models (defined by assignment statements) are convenient to use in those cases.
Secondly, the operation of an individual unit is generally a result of interaction between particular physical phenomena (at least, 'first principles' are generally expressed in this way). The different phenomena may also be described by submodels.
A third common characteristic of production processes is that the operation of some units may be affected by the operations of other units, namely control units. Instead of flows (mass or energy), these produce information input to the affected unit, but they are still describable by the same type of submodel.
In order to satisfy the requirements, MoCaVa is able to administer the creation of submodels and to connect them into systems.
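A minimal sketch of what 'connecting submodels into systems' can look like in code is given below; the unit functions and the dictionary-based flow convention are invented for the illustration and do not reflect MoCaVa's actual mechanism.

```python
from typing import Callable, Dict, List

# Each submodel maps its input flows/signals to its output flows/signals.
Submodel = Callable[[Dict[str, float]], Dict[str, float]]

def run_chain(units: List[Submodel], feed: Dict[str, float]) -> Dict[str, float]:
    """Connect transport-process units in series: the product stream of one
    unit becomes the feed of the next (information inputs from control units
    could be merged into the same dictionaries)."""
    stream = feed
    for unit in units:
        stream = unit(stream)
    return stream

# Invented example units: a heater raises temperature, a mixer dilutes.
heater = lambda s: {**s, "T": s["T"] + 15.0}
mixer = lambda s: {**s, "conc": 0.5 * s["conc"]}
print(run_chain([heater, mixer], {"T": 60.0, "conc": 0.8}))
```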