Torsten Bohlin
Automatic Control, Signals, Sensors and Systems
Royal Institute of Technology (KTH)
SE-100 44 Stockholm
Sweden
British Library Cataloguing in Publication Data
Bohlin, Torsten, 1931–
Practical grey-box process identification : theory and applications. – (Advances in industrial control)
1. Process control – Mathematical models
2. Process control – Mathematical models – Case studies
I. Title
670.4'27

ISBN-13: 978-1-84628-402-1
ISBN-10: 1-84628-402-3

Library of Congress Control Number: 2006925303

Advances in Industrial Control series ISSN 1430-9491
ISBN-13: 978-1-84628-402-1
© Springer-Verlag London Limited 2006

MATLAB® and Simulink® are registered trademarks of The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, U.S.A. http://www.mathworks.com

Modelica® is a registered trademark of the "Modelica Association". http://www.modelica.org/

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Printed in Germany

9 8 7 6 5 4 3 2 1

Springer Science+Business Media
springer.com
Advances in Industrial Control

Series Editors

Professor Michael J. Grimble, Professor of Industrial Systems and Director
Professor Michael A. Johnson, Professor (Emeritus) of Control Systems and Deputy Director
Industrial Control Centre
Department of Electronic and Electrical Engineering

Series Advisory Board

Professor E.F. Camacho
Escuela Superior de Ingenieros

Department of Electrical and Computer Engineering
The University of Newcastle

Department of Electrical Engineering
National University of Singapore
4 Engineering Drive 3
Singapore 117576

Department of Electrical and Computer Engineering

Electronic Engineering Department
City University of Hong Kong
Tat Chee Avenue

Pennsylvania State University
Department of Mechanical Engineering

Department of Electrical Engineering
National University of Singapore
4 Engineering Drive 3
Singapore 117576

Professor Ikuo Yamamoto
Kyushu University Graduate School
Marine Technology Research and Development Program
MARITEC, Headquarters, JAMSTEC
2-15 Natsushima, Yokosuka
Kanagawa 237-0061
Japan
To the KTH class of F53
Series Editors' Foreword

The series Advances in Industrial Control aims to report and encourage technology transfer in control engineering. The rapid development of control technology has an impact on all areas of the control discipline. New theory, new controllers, actuators, sensors, new industrial processes, computer methods, new applications, new philosophies, new challenges. Much of this development work resides in industrial reports, feasibility study papers and the reports of advanced collaborative projects. The series offers an opportunity for researchers to present an extended exposition of such new work in all aspects of industrial control for wider and rapid dissemination.

Experienced practitioners in the field of industrial control often say that about 70–80% of project time is spent on understanding and modelling a process, developing a simulation and then testing, calibrating and validating the simulation. Control design and investigations will then absorb the other 20–30% of the project time; thus, it is perhaps a little surprising that there is so little published on the formal procedures and tools for performing these developmental modelling tasks compared with the provision of simulation software tools. There is a very clear difference between these two types of activities: simulation tools usually comprise libraries of numerical routines and a logical framework for their interconnection, often based on graphical representations like block diagrams, but arriving at a consistent model that replicates observed physical process behaviour is a far more demanding objective. Such is the agenda underlying the inspirational work of Torsten Bohlin reported in his new Advances in Industrial Control monograph, Practical Grey-box Identification.

The starting point for this work lies in the task of providing models for a range of industrial production processes including: Baker's yeast production, steel rinsing (the rinsing of moving steel strip in a rolling-mill process), continuous pulp digestion, cement milling, an industrial recovery boiler process (a pulp production process unit) and cardboard manufacturing. The practical experience of producing these models supplied the raw data for understanding and abstracting the steps needed in a formal grey-box identification procedure; thus, it was a project that has been active for over 15 years, and over this period the grey-box identification procedure was formulated, tested, re-formulated and so on until a generic procedure of wide applicability finally emerged.
In parallel with this extraction of the fundamental grey-box identification procedure has been the development of the Process Model Calibrator and Validator software, the so-called MoCaVa software. This contains the tools that implement the general steps of grey-box identification. Consequently, it is based on an holistic approach to process modelling that uses a graphical block-diagram representation but incorporates routines like loss-function minimisation for model fitting and other statistical tools to allow testing of model hypotheses. The software has been tested and validated through its use and development with an extensive and broadly based group of individual processes, some of which are listed above.
This monograph captures three aspects of Torsten Bohlin's work in this area. Firstly, there is an introduction to the theory and fundamentals of grey-box identification (Part I) that carefully defines white-box, black-box and grey-box identification. From this emerge the requirements of a grey-box procedure and the need for software to implement the steps. Secondly, there is the MoCaVa software itself. This is available for free download from a Springer website whose location is given in the book. Part II of the monograph is a tutorial introduction and user's guide to the use of the MoCaVa software. For added realism, the tutorial is based on a drum-boiler model. Finally, the experience of the tutorial introduction is put to good use with the two fully documented case studies given as Part III of the monograph. Process engineers will be able to work at their own pace through the model development for a rinsing process for steel strip in a rolling mill and the prediction of quality in a cardboard manufacturing process. The value of the case studies is two-fold, since they provide a clear insight into the procedures of grey-box identification and give in-depth practical experience of using the MoCaVa software for industrial processes; both of these are clearly transferable skills.
The Advances in Industrial Control monograph series has often included volumes on process modelling and system identification, but it is believed that this is only the second ever volume in the series on the generic steps in an holistic grey-box identification procedure. The volume will be welcomed by industrial process control engineers for its insights into the practical aspects of process model identification. Academics and researchers will undoubtedly be inspired by the more generic theoretical and procedural aspects that the volume contributes to the science and practice of system identification.

M.J. Grimble and M.A. Johnson
Industrial Control Centre
Glasgow, Scotland, U.K.
Preface

Those who have tried the conventional approaches to making mathematical models of industrial production processes have probably also experienced the limitations of the available methods. They have either to build the models from first principles, or else to apply one of the 'black-box' methods based on statistical estimation theory. Both approaches work well under the circumstances for which they were designed, and they have the advantage that there are well developed tools for facilitating the work. Generally, the modelling tools (based on first principles) have their applications to electrical, mechanical, and hydrodynamical systems, where much is known about the principles governing such systems. In contrast, the statistical methods have their applications in cases where little is known in advance, or when detailed knowledge is irrelevant for the purpose of the modelling, typically for design of feedback control.

In modelling for the process industry, however, prior knowledge is typically partial, the effects of unknown input ('disturbances') are not negligible, and it is desirable to have reproducibility of the model, for instance for the monitoring of unmeasured variables, for feed-forward control, or for long-range prediction of variables with much delayed responses to control action. Conceivably, 'grey-box' identification, which is a 'hybrid' of the two approaches, would help the situation by exploiting both of the two available sources of information, namely i) such invariant prior knowledge as may be available, and ii) response data from experiments. Thus, grey-box methods would have their applications whenever there is some invariant prior knowledge of the process and it would be a waste of information not to use it.
After the first session on grey-box identification at the 4th IFAC Symposium on Adaptive Systems in Control and Signal Processing in 1992, and the first special issue in Int. J. Adaptive Control and Signal Processing in 1994, the approach has now been reasonably well accepted as a paradigm for how to address the practical problems in modelling physical processes. There are now quite a number of publications, most about special applications (a Google search for "Grey box model" in 2005 gave 691 hits). Applying the approach in practice raises a number of fundamental questions, in addition to the practical problems: How can I make use of what I do know? How much of my prior knowledge is useful and even correct, when used in the particular environment? What do I do about the unknown disturbances I cannot get rid of? Are my experiment data sufficient and relevant? How do I know when the model is good enough?
It was the desire to find some answers to these questions that initiated a long-range project at the Automatic Control department of KTH. The present book is based on the results of that project. It stands on three legs:

i) A theoretical investigation of the fundamentals of grey-box identification. It revealed that sufficiently many theoretical principles were available in the literature for answering the questions that needed to be answered. The compilation was published in a book (Bohlin, 1991a), which ended with a number of procedures for doing grey-box identification properly.

ii) A software tool, MoCaVa (Process Model Calibrator & Validator), based on one of the procedures (Bohlin and Isaksson, 2003).

iii) A number of case studies of grey-box identification of industrial processes. They were carried out in order to see whether the theoretical procedure would also be a practical one, and to test the software being developed in parallel. Most case studies have been done by PhD students at the department under the supervision of the author. The extent of the work was roughly one thesis per case.
prac-This book will focus on the software and the case studies Thus it will serve as a
manual to MoCaVa, as well as illustrating how to apply MoCaVa efficiently Success
in grey−box identification, as in other design, will no doubt depend of the skill of thecraftsman using the tool, and I believe that skill is best gained by exercise, and casestudies to be a good introduction
In addition, there is a ‘theory’ chapter with the purpose of describing the basic
de-liberations, derivations, and decisions behind MoCaVa and the way it is constructed.
The purpose is to provide additional information to anyone who wants to understandmore of its properties than revealed in the user’s manual This may help the user to ap-praise the strengths and weaknesses of the program, either in order to be able to do the
same with the models that come out of it, or even to develop MoCaVa further (The
source code can be downloaded from Springer.) The focus is therefore on the
applica-bility of the theories for the purpose of MoCaVa, rather than on the theories
to be understood by readers who are not used to strict mathematics And, conversely,
it would be impractical to try and solve all problems of grey−boxidentification by ing on intuition and reasoning alone, however clever Therefore, the mathematics isinterpreted in intuitive terms, and necessary approximations motivated in the sameway, whenever the mathematical problems become unsurmountable, or an exact solu-tion would take prohibitively long for a computer to process The following is one of
rely-my favorite quotations: “The man thinks The theory helps him to think, and to tain his thinking consistent in complex situations” (Peterka)
main-The method presented in this book for building grey−box models of physical jects has three kinds of support: A systematic procedure to follow, a software packagefor doing it, and case studies for learning how to use it Part I motivates and describes
ob-the procedure and ob-the MoCaVa software Part II is a tutorial on ob-the use of MoCaVa
based on simple examples Part III contains two extensive case studies of full−scaleindustrial processes
How to Use this Book
Successful grey-box identification of industrial processes requires knowledge of two kinds: i) how the process works, and ii) how the software works. Since the knowledge is normally not resident within the same person, two must contribute. Call them "process engineer" and "model designer". The latter should preferably have taken a course in 'Process identification'.

Part I is for the "model designer", who needs to understand how the MoCaVa software operates, in order to appreciate its limitations – what it can and cannot do.

Part II is for both. It is a tutorial on running MoCaVa, written for anyone who actually wants to build a grey-box model. It is also useful as an introduction to the case studies, since it is based on two pervading simple examples.

Part III is also for both. It develops the case studies in some detail, highlighting the contributions of the three 'actors' in the session, viz. the engineer, the model designer/program operator, and the MoCaVa program. The technical details in Part III are probably of interest only to those working in the relevant businesses (steel or paper & pulp), but are still important as illustrations of the issues that must be considered in practical grey-box process identification.

The style of Parts II and III deviates somewhat from what is customary in textbooks, namely to use sentences in passive form, free of an explicit subject. The idea of the customary practice is that science and engineering statements should be valid irrespective of the subject. Unfortunately, the custom is devastating for the understanding when describing processes where there are indeed several subjects involved; "who does what" becomes crucial. Therefore, Part II is written more like a user's manual. In describing grey-box identification practice there are, logically, no less than five 'actors' involved:
– The customer/engineer (providing the prior information about the physical process and appraising the result).
– The model designer/user of the program tools (often the same person as the customer, but not if he/she lacks sufficient knowledge of the physical process to be modelled).
– The computer and program (analyzing the evidence of the data).
– The author of this book (trying to reason with a reader).
– The reader of the book (trying to understand what the author tries to say).
In order to reduce the risk of confusion when describing a grey-box identification session – a process that involves at least the first three actors – the following convention will be used in the book:

The contributions of the different actors are marked with symbols at the beginning of the paragraph: one symbol for the operator (doing key pressing and mouse clicking), one for MoCaVa (computing and displaying the results), and one for the model builder (watching the screen, deliberating, and occasionally calculating on paper). It will no doubt help the reader who wants to follow the examples on a computer that the operator symbol states explicitly what to do at each moment, and the MoCaVa symbol points to the expected response. There are also paragraphs without an initiating symbol – they have the ordinary rôle of the author talking to a reader.

Also as a convention, Courier fonts are used for code, as well as for variables that appear in the code, and for names of submodels, files, and paths. Helvetica Narrow is used for user communication windows and for labels that appear in screen images.
The book uses a number of special terms and concepts of relevance to process identification. Some should be well-known or self-explanatory to model designers, but probably not all. The "Glossary of Terms" contains short definitions, without mathematics, and some with clarifying examples. The list serves the same purpose as the 'hypertext' function in HTML documentation, although less conveniently. The contents of Part II is also available in HTML format. This form has the well-known advantage that explanations of some key concepts become available at a mouse click, and only if needed. In Part II explanations appear either under the headers Help or Hints, or else as references to sections in the appendix, which unavoidably means either wading through text mass (that can possibly be skipped), or looking up the appropriate sections in the appendix. In order to reduce the length of Part II the number of printed screen images is also smaller than in the HTML document.

MoCaVa is downloadable from www.springer.com/1-84628-402-3, together with all material needed for running the case studies. (The package also contains the HTML manual as well as on-line help facilities.) This offers a possibility to get more direct experience of the model-design session. It would therefore be possible to use Parts II and III as study material for a course in grey-box process identification.
Acknowledgements
The author is indebted to the following individuals who participated in the Grey-box development project:

Stefan Graebe, who wrote the first C version of the IdKit tool box, and later participated in the Continuous Casting case study.
James Sørlie, who investigated possible interfaces to other programs.
Bohao Liao, who investigated search methods.
Ning He, who investigated real-time evaluation of Likelihood.
Anders Hasselkvist, who wrote Predat.
Tomas Wenngren, who wrote the first GUI.
Germund Mathiasson and Jiri Uosukainen, who wrote the first version of Validate.
Olle Ehrengren, who wrote the first version of Simulate.
Ping Fan, who did the Baker's Yeast case study.
Björn Sohlberg, who did the first Steel Rinsing case study.
Jonas Funkquist, who did the Pulp Digester case study.
Oliver Havelange, who did the Cement Milling case study.
Jens Pettersson, who did the second Cardboard case study.
Ola Markusson, who did the EEG-signals case study.
Bengt Nilsson, who contributed process knowledge to the Cardboard case study.
Jan Erik Gustavsson, who contributed process knowledge to the Recovery Boiler case study.
Alf Isaksson, who participated in the Pulp Refiner and Drive Train cases, and headed the MoCaVa project between 1998 and 2001.
Linus Loquist, who designed the MoCaVa home page.
Contents

Part I Theory of Grey-box Process Identification
1 Prospects and Problems 3
1.1 Introduction 3
1.2 White, Black, and Grey Boxes 4
1.2.1 White−box Identification 5
1.2.2 Black−box Identification 6
1.2.3 Grey−box Identification 10
1.3 Basic Questions 13
1.3.1 Calibration 14
1.3.2 How to Specify a Model Set 15
1.4 … and a Way to Get Answers 17
1.5 Tools for Grey−box Identification 18
1.5.1 Available Tools 18
1.5.2 Tools that Need to Be Developed 21
2 The MoCaVa Solution 23
2.1 The Model Set 23
2.1.1 Time Variables and Sampling 24
2.1.2 Process, Environment, and Data Interfaces 25
2.1.3 Multi−component Models 27
2.1.4 Expanding a Model Class 29
2.2 The Modelling Shell 31
2.2.1 Argument Relations and Attributes 34
2.2.2 Graphic Representations 37
2.3 Prior Knowledge 41
2.3.1 Hypotheses 42
2.3.2 Credibility Ranking 43
2.3.3 Model Classes with Inherent Conservation Law 43
2.3.4 Modelling ‘Actuators’ 44
2.3.5 Modelling ‘Input Noise’ 46
2.3.6 Standard I/O Interface Models 49
2.4 Fitting and Falsification 51
2.4.1 The Loss Function 52
2.4.2 Nesting and Fair Tests 54
2.4.3 Evaluating Loss and its Derivatives 55
2.4.4 Predictor 56
2.4.5 Equivalent Discrete−time Model 56
2.5 Performance Optimization 57
2.5.1 Controlling the Updating of Sensitivity Matrices 58
2.5.2 Exploiting the Sparsity of Sensitivity Matrices 59
2.5.3 Using Performance Optimization 60
2.6 Search Routine 62
2.7 Applicability 65
2.7.1 Applications 65
2.7.2 A Method for Grey−box Model Design 67
2.7.3 What is Expected from the User? 68
2.7.4 Limitations of MoCaVa 69
2.7.5 Diagnostic Tools 69
2.7.6 What Can Go Wrong? 71
Part II Tutorial on MoCaVa

3 Preparations 77
3.1 Getting Started 77
3.1.1 System Requirements 77
3.1.2 Downloading 77
3.1.3 Installation 77
3.1.4 Starting MoCaVa 78
3.1.5 The HTML User’s Manual 78
3.2 The ‘Raw’ Data File 78
3.3 Making a Data File for MoCaVa 78
4 Calibration 83
4.1 Creating a New Project 83
4.2 The User’s Guide and the Pilot Window 85
4.3 Specifying the Data Sample 86
4.3.1 The Time Range Window 86
4.4 Creating a Model Component 88
4.4.1 Handling the Component Library 89
4.4.2 Entering Component Statements 90
4.4.3 Classifying Arguments 92
4.4.4 Specifying I/O Interfaces 95
4.4.5 Specifying Argument Attributes 98
4.4.6 Specifying Implicit Attributes 100
4.4.7 Assigning Data 100
4.5 Specifying Model Class 101
4.6 Simulating 103
4.6.1 Setting the Origin of the Free Parameter Space 103
4.6.2 Selecting Variables to be Plotted 104
4.6.3 Appraising Model Class 105
4.7 Handling Data Input 106
4.8 Fitting a Tentative Model Structure 107
4.8.1 Search Parameters 108
4.8.2 Appraising the Search Result 111
4.9 Testing a Tentative Model Structure 113
4.9.1 Appraising a Tentative Model 116
4.9.2 Nesting 118
4.9.3 Interpreting the Test Results 119
4.10 Refining a Tentative Model Structure 121
4.11 Multiple Alternative Structures 122
4.12 Augmenting a Disturbance Model 124
4.13 Checking the Final Model 132
4.14 Terminals and ‘Stubs’ 134
4.15 Copying Components 135
4.16 Effects of Incorrect Disturbance Structure 138
4.17 Exporting/Importing Parameters 140
4.18 Suspending and Exiting 141
4.18.1 The Score Table 142
4.19 Resuming a Suspended Session 143
4.20 Checking Integration Accuracy 143
5 Some Modelling Support 147
5.1 Modelling Feedback 147
5.1.1 The Model Class 148
5.1.2 User’s Functions and Library 153
5.2 Rescaling 154
5.3 Importing External Models 159
5.3.1 Using Dymola as Modelling Tool for MoCaVa 160
5.3.2 Detecting Over−parametrization 166
5.3.3 Assigning Variable Input to Imported Models 170
5.3.4 Selective Connection of Arguments to Dymola Models 173
Part III Case Studies

6 Case 1: Rinsing of the Steel Strip in a Rolling Mill 185
6.1 Background 185
6.2 Step 1: A Phenomenological Description 185
6.2.1 The Process Proper 185
6.2.2 The Measurement Gauges 188
6.2.3 The Input 189
6.3 Step 2: Variables and Causality 189
6.3.1 The variables 189
6.3.2 Cause and effect 190
6.3.3 Data Preparation 191
6.3.4 Relations to Measured Variables 192
6.4 Step 3: Modelling 194
6.4.1 Basic Mass Balances 194
6.4.2 Strip Input 201
6.5 Step 4: Calibration 203
6.6 Refining the Model Class 206
6.6.1 The Squeezer Rolls 206
6.6.2 The Entry Rolls 211
6.7 Continuing Calibration 213
6.8 Refining the Model Class Again 215
6.8.1 Ventilation 215
6.9 More Hypothetical Improvements 217
6.9.1 Effective Mixing Volumes 217
6.9.2 Avoiding the pitfall of ‘Data Description’ 219
6.10 Modelling Disturbances 222
6.10.1 Pickling 222
6.10.2 State Noise 223
6.11 Determining the Simplest Environment Model 225
6.11.1 Variable Input Acid Concentration 225
6.11.2 Unexplained Variation in Residual Acid Concentration 225
6.11.3 Checking for Possible Over−fitting 229
6.11.4 Appraising Roller Conditions 233
6.12 Conclusions from the Calibration Session 233
7 Case 2: Quality Prediction in a Cardboard Making Process 235
7.1 Background 235
7.2 Step 1: A Phenomenological Description 235
7.3 Data Preparation 237
7.4 Step 2: Variables and Causality 244
7.4.1 Relations to Measured Variables 247
7.5 Step 3: Modelling 248
7.5.1 The Bending Stiffness 248
7.5.2 The Paper Machine 253
7.5.3 The Pulp Feed 260
7.5.4 Control Input 262
7.5.5 The Pulp Mixing 265
7.5.6 Pulp Input 267
7.5.7 The Pulp Constituents 269
7.6 Step 4: Calibration 271
7.7 Expanding the Tentative Model Class 279
7.7.1 The Pulp Refining 279
7.7.2 The Mixing−tank Dynamics 284
7.7.3 The Machine Chests 287
7.7.4 Filtering the “Kappa” Input 289
7.8 Checking for Over−fitting: The SBE Rule 290
7.9 Ending a Calibration Session 293
7.9.1 ‘Black−box’ vs ‘White−box’ Extensions 293
7.9.2 Determination vs Randomness 294
7.10 Modelling Disturbances 295
7.11 Calibrating Models with Stochastic Input 296
7.11.1 Determination vs Randomness Revisited 299
7.11.2 A Local Minimum 304
7.12 Conclusions from the Calibration Session 306
A Mathematics and Algorithms 313
A.1 The Model Classes 313
A.2 The Loss Derivatives 316
A.3 The ODE Solver 317
A.3.1 The Reference Trajectory 317
A.3.2 The State Deviation 318
A.3.3 The Equivalent Discrete−time Sensitivity Matrices 318
A.4 The Predictor 321
A.4.1 The Equivalent Discrete−time Model 322
A.5 Mixed Algebraic and Differential Equations 322
A.6 Performance Optimization 326
A.6.1 The SensitivityUpdateControl Function 327
A.6.2 Memoization 330
A.7 The Search Routine 330
A.8 Library Routines 331
A.8.1 Output Conversion 331
A.8.2 Input Interpolators 331
A.8.3 Input Filters 334
A.8.4 Disturbance Models 335
A.9 The Advanced Specification Window 337
B.2.1 Optimization for Speed 337
B.2.2 User’s Checkpoints 338
B.2.3 Internal Integration Interval 338
B.2.4 Debugging 339
Glossary 341
References 345
Index 349
Part I Theory of Grey-box Process Identification

1 Prospects and Problems

1.1 Introduction

"System identification" is typically defined as follows: "Given a parametric class of models, find the member that fits given experiment data with the minimum loss according to a given criterion" (Ljung, 1987). Now, the three "given" conditions concern anyone who intends to apply it, whether that is in the form of theory, method, or computer program. Sometimes "given" means that prerequisites are built into the software, sometimes that they are expected as input from the user of the software.
When one is faced with a given object instead, and possibly also with a given purpose for the model, it is certainly not obvious how to get the answers to the questions posed by identification software. It is therefore important that developers of such software do what they can to facilitate the answering. It is not necessarily a desirable ambition to make the software more automatic by demanding less from the user. He or she is still responsible for the quality of the result, and any input that a user is able to provide, but is not asked for, may be a waste of information and reduce the quality of the model. A better goal is therefore to make the software demand its input in a form that the user can supply more easily.

Secondly, user input (both prior knowledge and experiment data) is often uncertain, irrelevant, contradictory, or even false. A second goal for the software designer is therefore to provide tools for appraising the user's input. Admittedly, any software must have something 'given', but it makes a difference whether the software wants assumptions, taken for facts, or just hypotheses that will be subject to tests. This motivates the decision to base MoCaVa on the 'grey-box' approach.
The general and somewhat vague idea of grey-box identification is that when one is making a mathematical model of a physical object, there are two sources of information, namely response data and prior knowledge. And grey-box identification methods are such methods that can use both.

In practice, "prior knowledge" means different things. And generally, prior knowledge is not easy to reconcile with the form of the models assumed by a particular identification method. In fact, each method starts with assuming a model class, and each model class requires its particular form of prior knowledge. What one can generally do in order to take prior knowledge into account is to start with a versatile class of models, for which there are general tools available for analysis and identification, and try and adapt its freedom, its 'design parameters', i.e., the specifications one has to enter into the identification program, to the prior knowledge. This means that the 'grey-box identification methods' tend to be as many and as diversified as the conventional identification methods, also starting with given classes of models. This makes it hard to delimit grey-box identification from other identification, and also to make a survey of 'grey-box identification methods'.
Neither is that the purpose of this chapter. Instead, it is to survey the fundamentals the MoCaVa software is based on. A user of the program will conceivably benefit from an understanding of the purposes of the operations performed by various routines in the program. Generally, MoCaVa is constructed by specializing and codifying the general concepts used in (Bohlin, 1991a) and following one of the procedures derived in that book.

In addition, the chapter will briefly discuss the prospects and problems of developing grey-box identification software further.
develop-1.2 Black, White, and Grey Boxes
Commercially available tools for making mathematical models of dynamic processes are of two kinds, with different demands on the user. On one hand there are modelling tools, generally associated with simulation software (e.g., Dymola, http://www.dynasim.se/www/Publications.pdf), which require the user to provide a complete specification of the equations governing the process, either expressed as statements written in some modelling language, such as Modelica® (Tiller, 2001), or by connecting components from a library. This alternative may be supported by combining the modelling tools with tools for parameter optimization (e.g., HQP, http://sourceforge.net/projects/hqp). Call this "white-box" identification.

On the other hand there are "black-box" system identification tools (e.g., the MATLAB® System Identification Toolbox), which require the user to accept one of the generic model structures (e.g., linear) and then to determine which tools to use in the particular case, and in what order, as well as the values of a number of design parameters (order numbers, weighting factors, etc.). Finally, the user must interpret the resulting model, which is expressed in a form that is not primarily adapted to the physical object. Unless the model is to be used directly for design of feedback control, there is some further translation to do.
Generally, the user has two sources of information on which to base the model making: prior knowledge and experiment data. "White-box" identification uses mainly one source and "black-box" identification the other. The strength of "white-box" identification is that it allows the user to exploit invariant prior knowledge. Its weakness is its inability to cope with the unknown and with random effects in the object and its environment. The latter is the strength of "black-box" identification based on statistical methods, but also means that the reproducibility of its results may be in doubt. In essence, "black-box" identification produces 'data descriptions', and repeating the experiment may well produce a much different model. This may or may not be a problem, depending on what the model is to be used for.
The idea of "grey-box" identification is to use both sources, and thus to combine the strengths of the two approaches in order to reduce the effects of their weaknesses. When following Ljung's definition of "system identification", and regardless of the 'colour' of the 'box', the designer of a model of a physical object must do two things: i) specify a class of models, and ii) fit its free elements to data. Call this "Modelling" and "Fitting". A method with a darker shade of 'grey' uses less prior knowledge to delimit the model class. Even if most available identification methods tend to be more or less 'grey', the following notations allow a formal distinction between the generic 'white', 'black', and 'grey box' approaches to model design.
1.2.1 White−box Identification
Since both the model class definition and the fitting procedure are implemented as algorithms, they can be described formally as functions.
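In outline – with notation as in the following paragraphs, and with the hat on θ used here only to mark the fitted values – the two functions can be sketched as:

Modelling:  F(u_t, θ) → z(t|θ)

Fitting:  θ̂ = arg min_θ E(u_N, θ)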
The model designer specifies the model class F, which may contain a given number of unknown parameters θ. Given a control sequence u_t (where the subscript denotes the input history from some initial time up to the present time t) and the parameter vector θ, a simulation program allows the computing of the model's response z(t|θ) at any time. Any unknown parameters θ are then estimated by applying an optimization program minimizing the deviation between measured response data y_N and those components of the model's output z_N that correspond to the measured values. The deviation is measured by a given loss function E. The latter is usually a sum of squared instantaneous deviations, but various filtering schemes may be used to suppress particular types of data contamination.
instanta-The following are some well−knownobstacles to designing “white boxes” in tice:
prac-: Unknown relations between some variables: Engineers often do not have the
com-plete mathematical knowledge of the object to be able to write a simulation model
: Too many relations for convenience: When they do have the knowledge, the result
is often too complex a model to be possible to simulate with the ease required forparameter fitting Many physical phenomena are describable only by partial differ-ential equations Simulation would then require supercomputers, and identifica-tion an order of magnitude more (Car and airplane designers could possibly affordthe luxury.)
: Unknown complexity: It falls solely on the designer to determine how much of the
known relations to include in the model
: Sensitivity to low−frequency disturbances: Comparing output of deterministic
models with data in the presence of low−frequency disturbances generally givespoor parameter estimates
: Primitive validation: If one would try and use only literature values for parameters,
or make separate experiments to determine some of them, in order to avoid thecumbersome calibration of a complex model and the usually expensive exper-imentation on a large process, this makes it the more difficult to validate the model
Remark 1.1 The sensitivity to disturbances can sometimes be reduced by clever design of the loss function. This requires some prior information on the object's environment.
envi-Example 1.1
Consider a cylindrical tank with cross-section area A filled with liquid of density ρ up to a level z, under pressure p, and having a free outlet at the bottom with area a. The tank is replenished with the volume flow f. According to Bernoulli's law the variations in the level will be governed by the following differential equation:

dz/dt = −a √(z g + p/ρ) + f/A   (1.3)

With u = (f, p) as varying control variables, the equation cannot be solved analytically, but given values of θ = (A, a, ρ, g), an ODE solver will be able to produce a sequence of values {z(kh|θ) | k = 1, ..., N} of z sampled with interval h. Hence F is defined as the ODE solver operating on an equation of some form like

der(z) = -a*sqrt(z*g + p/rho) + f/A

with given constant parameters a, A, g, rho and variable control input p, f.
With a recorded sequence of measurements y_N = {y(kh) | k = 1, ..., N} of the tank level z during an experiment with known, step-wise changing input sequences u_N, it will be possible to set up and evaluate the loss function

E(u_N, θ) = Σ_{k=1}^{N} [y(kh) − z(kh|θ)]²

for any given value of θ. Applying an optimization program, it will then be possible to minimize the loss function with respect to any combination of the parameters, and in this way estimate the values of any unknowns among (A, a, ρ), but not the value of gravity g.
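As an illustration of this white-box procedure, the following Python sketch simulates the tank equation and fits (A, a, ρ) by minimizing the sum of squared deviations. The input profiles, the 'true' parameter values, the noise level and the solver settings are invented for the illustration and are not taken from the example above:

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

g, h, N = 9.81, 1.0, 200                      # gravity, sampling interval, number of samples
t_obs = h * np.arange(1, N + 1)

def f_in(t):                                   # replenishing flow (made-up step profile)
    return 0.02 + 0.01 * (t > 100)

def p_in(t):                                   # tank pressure (made-up step profile)
    return 1000.0 * (1.0 + 0.5 * (t > 50))

def simulate(theta, z0=1.0):
    """Simulate the level z(kh|theta) for theta = (A, a, rho)."""
    A, a, rho = theta
    def rhs(t, x):
        z = max(x[0], 0.0)
        return [-a * np.sqrt(z * g + p_in(t) / rho) + f_in(t) / A]
    sol = solve_ivp(rhs, (0.0, t_obs[-1]), [z0], t_eval=t_obs, max_step=h)
    return sol.y[0]

# Synthetic 'measurements' generated from an assumed true parameter vector.
theta_true = np.array([2.0, 0.003, 1000.0])
rng = np.random.default_rng(0)
y = simulate(theta_true) + 0.01 * rng.standard_normal(N)

def loss(theta):
    """Sum of squared deviations between data and simulated response, E(u_N, theta)."""
    return float(np.sum((y - simulate(theta)) ** 2))

res = minimize(loss, x0=np.array([1.5, 0.002, 900.0]), method="Nelder-Mead")
print("estimated (A, a, rho):", res.x)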
1.2.2 Black−box Identification
Defining this case is somewhat more complicated, since the task usually involves determining one or more integer 'order' numbers, the values of which determine the number of parameters to be fitted (Ljung, 1987; Söderström and Stoica, 1989).

The designer cannot change F_n, which is particular to the method, except by specifying an order index n. The latter normally determines the number of unknown parameters θ_n. However, the model class accepts a second, random input signal ω_t (usually 'white noise') in order to model the effects of random disturbances. For given order numbers the parameters θ_n are estimated by minimizing the deviation between response data and m-steps-predicted output (usually one step) according to a given loss function E. The difference between the model and the predictor is that the latter uses previous, m steps delayed, response data y_{t−m} in addition to the control sequence u_t for computing the predicted responses. The predictor P_n is uniquely determined by F_n. However, exact and applicable predictors are known only for special classes F_n, and this limits the versatility of black-box identification programs. Unknown orders n are usually determined by increasing the order stepwise, and stopping when the loss reduction drops below a threshold χ². A popular alternative is to use a loss function that weights the increasing complexity associated with increasing n, which allows minimization with respect to both the integer parameters n and the real parameters θ_n (Akaike, 1974). The model classes are most often linear, but nonlinear black-box model classes are also used (Billings, 1980).
mini-The following are practical difficulties:
– Restricted and unfamiliar form: Many engineers do not feel comfortable with models produced by black-box identification programs based on statistical methods. Mainly, the model structure and parameters do not have a physical interpretation, and this makes it difficult to compare the estimates with values from other sources.
– Over-parametrization: The number of parameters increases rapidly with the number of variables, and even more so when the model class is nonlinear. This leads easily to 'over-fitting', with all sorts of numerical problems and poor accuracy.
– Poor reproducibility: What is produced is a 'data description'. If this is also to be an 'object description' the model class must contain a good model of the object. If it does not – if much of the variation in the data is caused by phenomena that are not modelled well enough by F_n as effects of known input u_t – the fitting procedure tends to use the remaining free arguments ω_t and θ_n to reduce the deviations. In other words, what cannot be modelled as response to control will be modelled as disturbance. In this way even a structurally wrong model may still predict well at a short range. If the data sequence is long, the estimated parameter accuracy may even be high. This means that one may well get a good model, with good short-range predicting ability and a high theoretical accuracy, but when the identification is repeated with a different data set, an equally 'good' but different model is obtained. That will not necessarily mean that the object has changed between the experiments; it may be a consequence of fitting a model with the wrong structure. Generally, it will be difficult to get reproducibility with black-box models, unless the dynamics of the whole object are invariant, including the properties of disturbances, and the model structure is right.
The basic cause of the poor reproducibility of black boxes is that it is not possible to enter the invariant and object-specific relations that are the basis of white-box models. To gain the advantages of convenience and quick results, the model designer is in fact willing to discard any result from previous research on the object of the modelling.
Remark 1.2 Adaptive control will conceivably be able to alleviate the effects of poor reproducibility, and benefit from the good predictive ability of the model, but this can be exploited only for feedback control of such variables that have online sensors. Monitoring of unmeasured variables, as well as control with long delays, will still be hazardous.

Remark 1.3 Tulleken (1993) has suggested a way to force some prior knowledge into black-box structures, thus making the models less 'black'.
Example 1.2
With the same tank object as in Example 1.1, one could choose to ignore the findings of Bernoulli and describe the process as a "black box". A linear model is the most popular first choice, but if one would suspect that the process is nonlinear, and also take into account some rudimentary prior knowledge (that a hole at the bottom tends to reduce the level), the following heuristic form would also be conceivable:

dz/dt = p_1 z + p_2 p + p_3 f − [p_4 z + p_5 p + p_6 f]^α   (1.9)

Incidentally, this form contains the 'true' process, Equation 1.3, with p_1 = p_2 = p_6 = 0, p_3 = 1/A, p_4 = a²g, p_5 = a²/ρ, and α = ½. But normally, that is not the case.
A more likely, and 'blacker', form would be

dz/dt = p_1 + p_2 z + p_3 p + p_4 f + p_5 z² + p_6 p² + p_7 f²   (1.10)

This will define a deterministic black box of second order, F_n(u_t, 0, θ_n) → z(t|θ_n), where n = 2, u = (f, p) and θ_2 = (p_1, ..., p_7). It can be processed as in Example 1.1. If the parameters are many enough, if measurements are accurate, and if the experiment is not subject to external or internal disturbance, the resulting model may even perform almost as well as the white box.
experi-If, however, the varying pressure p is not recorded, it might still be possible to use
the following form
dz dt = p1+ p2z + p4f + p5z2+ p7f2+ v (1.11)
where ω is ‘white noise’, and v is ‘Brownian motion’ to model the unknown term
p3p + p6p2 Hence, θ2= (p1, p2, p4, p5, p7, p8)
When models have unknown input it becomes necessary to find the one-step (or m-step) predictor associated with them, in order to be able to minimize the sum of squares of prediction errors. Exact predictors are known only for some classes of models. And even if the model belongs to a class which does allow a predictor to be derived, the derivation is usually no simple task.
deriva-However, black−box identification programs have already done this for fairly eral classes of models that do allow exact derivation One such class is the NARMAX(for Nonlinear Auto Regressive Moving Average with eXternal input) discrete−timemodel class (Billings, 1980)
gen-y (τ) +
nz ν=1
na k=1
a ν
k P ν [y(τ − k)]=
nz ν=1
nb k=1
b ν
k P ν [u(τ − k)]
+ c0w (τ) +
nc k=0
The parameter array θ nis the collection of all a ν
k,b ν
k, andc kin Equation 1.13
Notice that Equation 1.13 contains only measured output y in addition to the input u, which is an essential restriction, but makes it easy to derive a predictor (which is why the class is defined in this way). Since the values of w(τ) can be computed recursively from Equation 1.13, and since E{w(τ) | y_{τ−1}} = 0, the predictor follows directly as

ŷ(τ|τ−1) = −Σ_{ν=1}^{n_z} Σ_{k=1}^{n_a} a_k^ν P_ν[y(τ−k)] + Σ_{ν=1}^{n_z} Σ_{k=1}^{n_b} b_k^ν P_ν[u(τ−k)] + 0 + Σ_{k=1}^{n_c} c_k w(τ−k)   (1.14)

The special case of n_c = 0 (NARX) is particularly convenient, since the predictor in Equation 1.14 will then be a linear function in all unknown parameters, and the loss function therefore quadratic; this makes it technically easy to fit a large number of parameters.
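Because the NARX predictor is linear in the parameters, fitting it amounts to ordinary least squares on a matrix of lagged polynomial regressors. The sketch below illustrates this for a single input and output; the function names and the choice P_ν(x) = x^ν are made for the illustration:

import numpy as np

def narx_regressors(y, u, na, nb, nz):
    """Regressor matrix of lagged polynomial terms P_nu[y(t-k)] and P_nu[u(t-k)]."""
    kmax = max(na, nb)
    rows, targets = [], []
    for t in range(kmax, len(y)):
        row = [y[t - k] ** nu for nu in range(1, nz + 1) for k in range(1, na + 1)]
        row += [u[t - k] ** nu for nu in range(1, nz + 1) for k in range(1, nb + 1)]
        rows.append(row)
        targets.append(y[t])
    return np.array(rows), np.array(targets)

def fit_narx(y, u, na, nb, nz):
    """Least-squares fit of y(t) = Phi(t) @ theta + e(t); theta collects (-a, b)."""
    Phi, target = narx_regressors(y, u, na, nb, nz)
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    loss = float(np.sum((target - Phi @ theta) ** 2))
    return theta, loss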
Now back to the original model, Equations 1.11 and 1.12. If, for simplicity, one assumes that the sampling is dense enough, the model can be replaced by a discrete-time difference equation (Equation 1.21). Assuming that the measurements are accurate enough to allow y(τ) to replace z(τh), and with u(τ) = f(τh), makes Equation 1.21 embedded in Equation 1.13, with P_ν(u) ≡ u^ν. After a and b have been determined by minimizing Equation 1.15, the remaining parameter c_0 can be computed from the minimum loss.

However, reconstructing the original parameters p from the estimated a, b, and c creates an over-determined system of equations; there are five unknowns and nine equations. This can be solved too, for instance using a pseudo-inverse, but it still causes a complication, since the relations are case dependent and not preprogrammed into the identification package.
If one would want to avoid even having to determine the order numbers and setting up Equations 1.11 and 1.12, and hence also to reconstruct the parameters, it is possible to specify sufficiently large order numbers, and let the necessary order numbers be determined by the identification program. The SFI rule (Stepwise Forward Inclusion) achieves this (Leontaritis and Billings, 1987). It is a recursive procedure:

The SFI rule:
  Initialize n_a = n_b = n_c = n_z = 0
  While significant reduction, repeat:
    For x ∈ (a, b, c, z), do:
      Alternative order numbers: n → ν; ν_x + 1 → ν_x
      Compute the alternative loss Q_x
    If the best alternative reduces the loss significantly, then accept it and indicate significant reduction
It is possible to design the loss function E and compute the χ² threshold in such a way that the decision on model order can be associated with risk values. The "Maximum-Power" loss function used by MoCaVa minimizes the risk of choosing a lower order when a higher order is correct. The threshold value χ² is based on a given risk of choosing the higher-order model when the lower-order one is correct.
1.2.3 Grey-box Identification

In practice, the usable grey-box model classes are restricted to those that the software is able to convert into algorithms simple enough to allow i) simulation, ii) automatic derivation of at least an approximate predictor, and iii) fitting of parameters. The particular limitations imposed by MoCaVa will be specified below.
der-Remark 1.4 Continuous−time white noise into nonlinear models has to be handled
with care (Åström, 1970; Graebe, 1990b) In practice the equations have to be
discre-tized, and ω replaced by discrete−time white noise w (Section A.1, Restriction #3).
As would be expected, there are difficulties also with grey-box identification; some have been experienced using MoCaVa3 and its predecessors. Some may vanish with further development, others are fundamental and will remain:
– Heavy computing: MoCaVa needs, in principle, to evaluate the sensitivities of all state derivatives with respect to all states and all noise variables for all instants in the time range of the data sample, and for deviations in all parameters that are to be fitted. And this must be repeated until convergence. And again, the whole process must be repeated until a satisfactory model structure has been found. In the worst case each evaluation requires access to the model, which altogether creates a very large number of executions of the algorithm defining the model. For other than small models the dominating part of the execution time is spent inside the user's model. Since the time it takes to run the user's model once is not 'negotiable', the only option for improving the design of MoCaVa is to try and reduce the number of model accesses by taking shortcuts. However, since the model structure is relatively free, it is difficult to exploit special structural properties in order to be able to find the shortcuts, like the black-box methods are able to. A way that is still open is to have MoCaVa analyze the user-determined structure, in order to find such shortcuts. MoCaVa3 is provided with some tools to do this (see Section 2.5).
– Interactive: It is difficult to reduce the time spent by the user in front of the computer, for instance by doing the heavy computing overnight.
– Failures: More freedom means more possibilities to set up problems that MoCaVa cannot cope with. The result may be that the search cannot fit parameters or, worse, produces a model that is wrong, because the assumptions built into its design are not satisfied in the particular case. The 'advanced' options that may become necessary to use with complex models require some user's specifications of approximation levels, and this adds another risk. The causes of failures are discussed in Section 2.7.6.
– Stopping: Available criteria for deciding when to stop the calibration session are somewhat blunt instruments. When a model cannot be falsified by the default conditional tests, this may well be so because the user has run out of ideas on how to improve it. In that case unconditional tests will have to do. However, they do not generally have maximum power, and therefore have a larger risk of letting a wrong model pass. A user may have to supply subjective assessment, in particular by looking at 'transients' in the plotted residual sequences.
– Too much stochastics: Stochastic models are marvellous short-range predictors, and therefore generally excel in reducing the loss, in particular with slowly responding processes. Technically, they have at their disposal the whole sequence of noise input to manipulate, in addition to the free parameters, in order to reduce the loss. However, they have a tendency to assign even some responses of known input to disturbances, if given too much freedom to do so. The result is inferior reproducibility, since disturbances are by definition not reproducible.

Example 1.3
Return again to the tank object, Equation 1.3, and assume that the varying pressure p has not been recorded during the experiment. Since the model class F_n is not preprogrammed but defined by the user in each case, the user must enter code like

der(z) = -a*sqrt(z*g + p/rho) + f/A

and, in addition, specify a model for describing the unknown p. The latter may well be a black-box (preprogrammed) model, unless one knows something particular about p that does not let it be modelled by a black box.
Since p is not in the data, the next step is to find a predictor for it, and for the output. When the model is nonlinear, an optimal predictor is usually not practical, but suboptimal predictors are. Most common are various types of EKF (Extended Kalman Filter).

Armed with such a predictor, it is possible to proceed as in the black-box case, although the mode of operation of the program will be different. Mainly, the EKF (which is preprogrammed) must call a function that defines the model class, and which depends on the entered code, and therefore must be compiled and linked for each case (like in white-box identification, and like in any ODE solver in a simulation program).
de-If, again for simplicity, the sampling interval h is short enough, and if the Brownian motion is used to model p, the discrete−time equivalent of the model will be
z (t + h) = z(t) + h [− a z(t) g + p(t) Ã + f (t) A] (1.25)
wherew iare uncorrelated Gaussian random sequences with zero means and unit
vari-ances, and σ and λ are parameters introduced to measure the average size of ment errors and the average rate of change of the unknown pressure p Unlike in Exam-
measure-ple 1.2, the measurements errors need not be small
It is convenient to use the state-equation form, with state x = (z, p) (Equations 1.28 to 1.31), and to base the predictor on an EKF with A(τ) = ∇_x G(x̄) and C = H. The optimal filter gain K(τ) is computed using an algorithm that involves the solution R_xx(τ) of the Riccati equation associated with Equations 1.28 to 1.31.
Remark 1.5 It is more common to use G(x̂) in Equation 1.35 instead of G(x̄), since this makes a better approximation, should the estimate x̂ drift far from the reference trajectory x̄. On the other hand, this will make the EKF more susceptible to large disturbances, and will thus increase the risk of instability. With Equation 1.35, both this and Equation 1.36 are stable as long as the model is, while a negative value of x̂_1 g + x̂_2/ρ, for instance caused by a spurious value in y(τ), would cause a run-time error in the evaluation of G(x̂).
Notice that most of the matrices that are needed to handle the consequences of having an unknown input depend on τ, which means that the calculations generally take much more time than with a white box.
The predictor is given by Equation 1.33, the prediction error by Equation 1.34, and the loss function is computed from Equation 2.23:

Q(θ) = ½ Σ_{k=1}^{N} [ε(k|θ)² / r(k|θ) + ln r(k|θ)]

where ε is the prediction error and r its variance.
It is a function of θ = (a, A, ρ, λ, σ), which can therefore be estimated by an optimization routine, like in the white-box case.
An option that can be copied from the black-box identification procedure is the estimation of model complexity, by testing whether all parameters are significant or, alternatively, whether some could possibly be left out from the model. For instance, the SFI rule will work, and risk values for making a wrong decision can be computed.
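The prediction-error computation for this example can be sketched as below: a discrete-time extended Kalman filter for the state (z, p) accumulates the Gaussian prediction-error loss, which is then minimized over θ = (a, A, ρ, λ, σ). The discretization, the initial state and covariance, and the noise parametrization are assumptions made for this illustration, not MoCaVa's internal algorithm:

import numpy as np
from scipy.optimize import minimize

def ekf_loss(theta, y, f, h, g=9.81):
    """Gaussian prediction-error loss for the tank with unmeasured pressure p."""
    a, A, rho, lam, sigma = theta
    x = np.array([y[0], 0.0])                   # state estimate (z, p); initial guess
    P = np.diag([1.0, 1.0e4])                   # initial state covariance (illustrative)
    Q = np.diag([0.0, h * lam ** 2])            # process noise of the Brownian pressure
    R = sigma ** 2
    C = np.array([1.0, 0.0])                    # only the level z is measured
    loss = 0.0
    for k in range(1, len(y)):
        # Time update through the nonlinear state equation and its Jacobian.
        s = np.sqrt(max(g * x[0] + x[1] / rho, 1e-9))
        x_pred = np.array([x[0] + h * (-a * s + f[k - 1] / A), x[1]])
        F = np.array([[1.0 - h * a * g / (2.0 * s), -h * a / (2.0 * rho * s)],
                      [0.0, 1.0]])
        P = F @ P @ F.T + Q
        # Measurement update and accumulation of the prediction-error loss.
        e = y[k] - C @ x_pred
        S = float(C @ P @ C + R)
        K = P @ C / S
        x = x_pred + K * e
        P = P - np.outer(K, C @ P)
        loss += 0.5 * (e ** 2 / S + np.log(S))
    return loss

# Example use (y, f are the recorded level and flow sequences, h the sampling interval):
# theta0 = np.array([0.003, 2.0, 1000.0, 10.0, 0.01])
# result = minimize(lambda th: ekf_loss(th, y, f, h=1.0), theta0, method="Nelder-Mead")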
1.3 Basic Questions
MoCaVa has been conceived with the following scenario in mind: Suppose a production process is to be described by a dynamic model for simulation or other purposes. A number of submodels (first principles or heuristic relations) for parts of the process are available as prior information, developed under more or less well controlled conditions. However, when the submodels are assembled into a model for the integrated process, all their input and output are no longer controlled or measured, and the environment is no longer the same as when the submodels were developed. In addition, unmodelled phenomena and unmeasurable input (disturbances) may affect the responses significantly. It is not known which of the submodels are needed for a satisfactory model, or whether there will still remain unexplained phenomena in the data when all prior information has been used. And, again, prior model information is more or less precise, reliable, and relevant.
This raises a number of questions from a model designer:
: How can I make use of what I do know? One usually knows some, but not all And
it may be a waste not to use it
: How much of my prior knowledge is useful, or even correct when used in the
par-ticular environment? Too much detailed knowledge tends only to contribute to thecomplexity of the model and less to satisfying the purpose for which it is made
Trang 31Formulas obtained from the literature are often derived and verified in an ment quite different from the circumstances under which they will be put to use.
: What do I do about the disturbances I cannot eliminate? This is the opposite problem: too little prior knowledge. The response of an object is usually the effect of two kinds of input, known and unknown; call the second kind “disturbances”. If one does not have a model for computing the unknown input, and cannot just neglect it, then some model will obviously have to be assumed instead.
: Are my experiment data sufficient and relevant? Can I use ordinary data loggings, obtained during normal production and therefore at little cost? Or do I have to make specially designed experiments (and lose production while they are going on)?
: How do I know when the model is good enough? It may (or may not) be hazardous just to try to use the model for the purpose it was designed for, for instance control, and see if it works. That depends, of course, on what it costs to fail.
1.3.1 Calibration
Needless to say, none of these questions can be answered in advance. Considering the diversity of a user's prior information, originating in a variety of more or less reliable sources, it is also very unlikely that one would be able to formulate, much less solve, a mathematical problem that, given prior input and data, would produce a 'best' model according to a given criterion (and thus be able to retain the usual definition of the identification problem).
However, it is possible to conceive a multistep procedure for making a model that satisfies many of the demands one may have on it, while taking the user's prior knowledge into account. The steps in this procedure require the solution of less demanding subproblems, like fitting to data and testing whether one model is significantly better than another. The literature offers principles and ideas for solving many of the subproblems, and a number of those have been compiled into a systematic procedure for grey-box identification (Bohlin, 1986, 1991a, 1994a). One of the procedures has also been implemented as a User's Shell (IKUS) to IdKit (Bohlin, 1993). However, its principles are general and can be implemented with other toolboxes that are general enough and open enough.
MoCaVa is based on two such ‘trial-and-error’ procedures, calibration and validation. The procedures operate on sets of models, since it is not given a priori how extensive the set has to be in order to satisfy the experiment data and the purpose of the model making.
The calibration routine finds the simplest model that is consistent with prior knowledge and not falsified by the experiment data. It is a double loop of refinement and falsification derived from basic principles of inference (Bohlin, 1991a):
Calibration procedure:
  While the model set is falsified, repeat
    Refine the tentative model set
    Fit model parameters
    Falsify the tentative model: Until falsified, repeat
      Specify an alternative model set
      If any alternative model is better, then indicate falsified
Notice that the procedure works with two sets and two models, namely tentative, which is the best so far, and alternative, which may or may not be better.
The questions that have to be answered are now i) how to specify a model set, ii) how to fit a model within a given set, and iii) how to decide whether an alternative model is better than a tentative one.
1.3.2 How to Specify a Model Set
The following first structuring of the model set F is motivated by the mode of operation of computers and common system software, like Unix and Windows. Assume there is a given component library $\{\mathcal{C}_i\}$, such that a given selection of its members will combine into a system defining an algorithm able to compute response values $z(t)$, given input arguments. Define the model sets
Model: $F(u_t, \omega_t, c, \nu, \theta_\nu)$   (1.39)
Model structure: $F(u_t, \omega_t, c, \nu, \cdot)$   (1.40)
Model class: $F(u_t, \omega_t, c, \cdot, \cdot)$   (1.41)
Model library: $F(u_t, \omega_t, \cdot, \cdot, \cdot)$   (1.42)
where
: A model library is the set of all models that can be formed by combining components $\mathcal{C}_i$. It is the maximum set within which to look for a model.
: A model class is a smaller set, defined by the argument $c$, which is an array of indices of selected components.
: A model structure is an even smaller set, where also the free-space index $\nu$ is given. This determines the dimension of the free parameter space with coordinates $\theta_\nu$.
: A model is a single member of the model structure, selected by specifying also the values of the free coordinates $\theta_\nu$. It includes all specifications necessary to carry out a simulation of the model, given the control input, the random sequence, and the time range.
Notice that this creates two means of refining a model set: with more components, or with more free parameters. A change of model class requires recompilation in order to generate efficient code; a change of model structure or model does not. The definition also concretizes 'prior knowledge' as (hypothetical) algebraic relations between variables, together with various variable attributes (like ranges, scales, values, and uncertainty). The case-specific model library contains the prior knowledge of the object, while class and structure will depend also on the experiment data.
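One way to picture the four nested sets in code is as a chain of small data structures, each adding one more piece of information. The class names and fields below are purely illustrative and are not MoCaVa's internal representation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class ModelLibrary:
    """All components available for the case (Equation 1.42)."""
    components: Sequence[Callable]       # one simulation routine per component

@dataclass
class ModelClass:
    """A selection c of components (Equation 1.41); changing it changes
    the generated code and therefore requires recompilation."""
    library: ModelLibrary
    c: Sequence[int]                     # indices of selected components

@dataclass
class ModelStructure:
    """A class plus the free-space index ν (Equation 1.40), which fixes
    the dimension of the free parameter space."""
    model_class: ModelClass
    nu: Sequence[int]                    # order numbers / parameter counts

@dataclass
class Model:
    """A single member of the structure (Equation 1.39): the structure
    plus numerical values of the free coordinates θ_ν."""
    structure: ModelStructure
    theta: Sequence[float]
```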
The slightly more structured procedure becomes
Calibration procedure:
  While the structure is falsified, repeat
    Refine the tentative model structure: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \cdot)$
    Fit model parameters: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \hat{\theta}_\nu)$
    Falsify the tentative model: Until falsified, repeat
      If no more alternative structures,
        then expand the alternative model class: → $F(u_t, \omega_t, c, \cdot, \cdot)$
      Specify alternative model structures: → $F(u_t, \omega_t, c, \nu, \cdot)$
      If any alternative model is better,
        then indicate falsified and assign $c \to \hat{c}$, $\nu \to \hat{\nu}$
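Read as ordinary control flow, the double loop might be sketched as follows. All callables (`fit`, `propose_alternatives`, `expand_class`, `is_better`) are placeholders for the fitting and testing tools discussed in Section 1.5 and Chapter 2; none of them are MoCaVa functions.

```python
def calibrate(tentative, data, fit, propose_alternatives, expand_class, is_better):
    """Schematic double loop of refinement and falsification."""
    while True:
        model = fit(tentative, data)              # fit the tentative structure
        candidates = list(propose_alternatives(tentative))
        falsified = False
        while not falsified:
            if not candidates:                    # no alternative structures left:
                wider = expand_class(tentative)   # widen the alternative model class
                if wider is None:                 # nothing left to try:
                    return model                  # the tentative model is unfalsified
                candidates = list(propose_alternatives(wider))
            alternative = candidates.pop(0)
            if is_better(fit(alternative, data), model, data):
                falsified = True                  # tentative structure falsified;
                tentative = alternative           # the better alternative becomes tentative
```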
The procedure fits and tests an expanding sequence of hypothetical model structures. It starts with those based on the simplest and most reliable components in the component library, for instance those based on mass and energy balances. The structure is then expanded by a procedure of 'pruning and cultivation': hypothetical submodels (components) are augmented and the structure is tested again. Those that do not contribute to reducing the value of the loss function are eliminated; those that do contribute become candidates for further refinement.
The procedure is interactive: the computer does the elimination, and also suggests the most probable of a limited number of alternatives; the model designer suggests the alternatives. In this way a user of MoCaVa is given an opportunity to exploit his or her knowledge of the physical object, and to exercise a probably increasing skill in modelling, in order to reduce the number of tests of alternatives that would otherwise be needed. The construction is based on the belief that, even if it is difficult to specify the right model structure in advance, an engineer is usually good at improving a model once it has been revealed where it fails. As a last resort it is possible to use empirical 'black boxes' to model those parts or physical phenomena of a process that have been revealed as significant, but for which there is no prior knowledge.
$P_1$ and $P_2$ are polynomials, for instance Legendre polynomials, of first and second order, and $\omega$ is continuous white noise with unit power density.
Then Equation 1.43 defines the model $F(u_t, \omega_t, c, \nu, \theta_\nu)$ with

$$\theta_\nu = (\theta^a_{11}, \ldots, \theta^a_{1n}, \theta^b_{11}, \ldots, \theta^b_{1n}, \theta^a_{21}, \ldots, \theta^a_{2n}, \theta^b_{21}, \ldots, \theta^b_{2n}, \theta^c_{1}, \ldots, \theta^c_{n}, \theta^d_{1}, \ldots, \theta^d_{n}) \qquad (1.46)$$

The point of the double indexing with $c$ and $\nu$ is that it allows the definition of a number of smaller model classes $F(u_t, \omega_t, c, \cdot, \cdot)$, for instance:
: Linear and deterministic: $c = (1)$, $\nu = (n_a, n_b)$
: Nonlinear and deterministic: $c = (1, 2)$, $\nu = (n_a, n_b, n_a, n_b)$
plus a number of less likely alternatives. Notice that a change of class changes the functions (differential equations) and hence the source code of the computer program, which means recompilation.
Each class allows a number of model structures $F(u_t, \omega_t, c, \nu, \cdot)$, defined by the values of the order numbers $\nu$, which also determine the number of parameters in the model structure. A change of structure does not generally require recompilation, provided enough space has been allocated for a maximum order, or dynamic allocation is used.
When the values of the parameters are also given, this defines the model.
The model library $F(u_t, \omega_t, \cdot, \cdot, \cdot)$ is the set of model classes from which the user can pick one by specifying $c$. Each transfer function in Equation 1.43 defines a component. If, as in this example, all can be combined, this generates eight model classes in the library, including the ‘null’ model class $y = 0$.
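Since every subset of the component set defines a class, three transfer-function components give $2^3 = 8$ classes, including the empty ('null') one. A small illustration follows; the component names are invented for the example.

```python
from itertools import combinations

# One entry per transfer-function component in Equation 1.43; the labels
# are invented for the illustration.
components = {1: "linear dynamics", 2: "nonlinear correction", 3: "disturbance filter"}

# Every subset of the component indices is a model class c, including the
# empty selection, i.e. the 'null' class y = 0.
classes = [c for r in range(len(components) + 1)
           for c in combinations(sorted(components), r)]
print(len(classes))          # -> 8
```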
1.4 … and a Way to Get Answers
How does this answer the original questions from a model designer?
: Question: How can I make use of what I do know?
Answer: By entering hypotheses, and by specifying which of those to try next.
: Question: How much of my prior knowledge is useful, or even correct when used
in the particular environment?
Answer: That is useful which reduces the loss significantly. It is correct if fitting its parameters yields values that do not contradict any prior knowledge.
: Question: What do I do with the disturbances I cannot eliminate?
Answer: Describe them as stochastic processes.
: Question: How do I know when the model is good enough?
Answer: There are two meanings of “good enough”:
i) The model is not good enough for the available data and prior knowledge as long as it can still be falsified.
ii) The model is good enough for its purpose when the validation procedure yields a satisfactory result.
: Question: Are my experiment data sufficient and relevant?
Answer: They are, if the validation procedure yields a result satisfying the purpose of the model design.
Remark 1.6. Valid logical objections can be raised against these rather flat answers. For instance, it is possible to conceive of cases where the experiment data are adequate for the purpose, but where the calibration procedure has failed to reveal errors in the model structure (because there are no better alternative hypotheses). It is also possible to conceive of disturbances that do not let themselves be described by stochastic processes of the types available in the library, or at all. Everything hinges on the assumption that the model library will indeed allow an adequate modelling.
Remark 1.7. Even if there are cases where it is theoretically correct to use the same data for calibration and validation ('auto-validation'), it is generally safer to base the validation on a second data set ('cross-validation'). Still, much of the result of the validation procedure hinges on the assumption that the second data set is demanding enough. If it is not, the validation procedure will not reveal an inadequate model; in fact, a failure will not be revealed until the model has been put to use and has failed obviously. The costs related to the latter event will therefore determine how much work to put into the validation process. For instance, paper machine control can afford to fail occasionally; Mars landers rather not at all.
Remark 1.8. Logically, the calibration and validation procedures have little to do with one another, since the meanings of a “good model” are different. A model may well be good enough for such a limited purpose as feedback control, and thus be easily validated in that respect, but still be unable to satisfy an extensive data sequence generated by a complex object. Conversely, a model fitted to a data sequence containing little dynamic information may well satisfy those data, as well as all one knows about the object in advance, but still be unable to satisfy its purpose when validated with different and more demanding data.
1.5 Tools for Grey-box Identification
The following is a list of what is needed for realizing the calibration and validationschemes:
: A versatile class of models: So that it does contain a suitable model for the particular purpose. It must be possible to simulate and fit the models conveniently.
: A tool to restrict this class according to prior knowledge: This is the whole point of the grey-box concept. It means that there must be some modelling tool allowing the user to formulate the prior knowledge conveniently. Model class restriction is what identification is all about, and user-supported restriction is what grey-box identification is all about.
: A tool to fit parameters: In order to find the model that agrees best with the data.
: A tool to falsify fitted models: In order to eliminate incorrect hypotheses about the object.
: A tool to validate models: So that the model will not be more complicated than needed for the purpose.
: A procedure to follow: In addition to the tool kit there is also a need for some kind of 'handbook' or 'guide' on how to build grey-box models using the tools. Again, grey-box model making is an interactive process: at each step, the software may or may not need more information from the user, or more data, depending on whether the result so far is satisfactory or not.
Remark 1.9. The list leaves out the problem of what to do when no model is good enough for the purpose. An answer is to try to get better data, and there are methods for doing this in the literature, again valid for certain classes of models. MoCaVa does not support this.
1.5.1 Available Tools
Some of the tools have been available for some time. Let's look at what they can and cannot do, in order to find out what more is needed.
Nonlinear State Models
A reasonably general form that evades the subtleties of continuous-time stochastic differential equations, and lets itself be simulated, is a continuous-discrete state-space model, where $x$ is the state vector, $u$ is the known control, $\omega$ and $w$ are continuous and discrete 'white noises', and $p$ are parameters.
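A standard continuous-discrete state-space form with exactly these ingredients is, for instance,

$$\frac{dx(t)}{dt} = f\bigl(x(t), u(t), p, t\bigr) + \omega(t), \qquad y(t_k) = h\bigl(x(t_k), u(t_k), p\bigr) + w(t_k),$$

with the continuous white noise $\omega$ driving the state equation and the discrete white noise $w$ corrupting the sampled output; the exact parametrization used in the packages referenced below may differ in detail.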
This is a rather versatile class that suits many physical processes, and it is used, for instance, in the identification toolboxes in the Cypros (Camo, 1987), MATRIXx (Gupta et al., 1993), IdKit (Graebe, 1990a-d; Graebe and Bohlin, 1992; Bohlin and Graebe, 1994a,b), and CTSM (Kristensen and Madsen, 2003) packages. Given models of this form, the tools fit the parameters p to given data. All use the Maximum Likelihood criterion, but different ways of fitting. The first two are commercial products; IdKit is not commercial, but it has been used for case studies and has been developed into MoCaVa.
Kristensen, Madsen, and Jørgensen (2004) use a somewhat more general form of Equation 1.48, where the variance of the 'diffusion term' ω may depend on u, p, and t.
What are the obstacles to a wider application of grey-box Maximum Likelihood identification tools? Mainly that they are difficult to use for anyone other than experts. There are difficulties with
: Modelling: Since one usually has to try a large number of structures before finding a suitable one, in particular with models of the complexity required by a full-scale industrial process, it becomes quite a tedious task to write all the models, and also to maintain the necessary housekeeping of all rejected attempts.
: Setup and interpretation: It is easy to set up meaningless problems for the tools to solve (which they gladly do). It is more difficult to see whether the solutions are any good.
: The state-differential equations also leave out important dynamical properties of some real objects, for instance those containing delays, or phenomena better described by partial differential equations, or containing hard discontinuities, like dry friction.
Modelling Tools
They are tools to enter prior knowledge. Examples are Simulink® (www.mathworks.com/products/simulink), Bond graphs (Margolis, 1992), Dymola (Elmqvist, 1978), Omola (Mattson et al., 1993), and Modelica® (Tiller, 2001). Simulink is probably the most well known, and the most adapted to the way control engineers like to describe systems; it generates the model-defining statements from block diagrams. The others are in principle model specification languages and tools, and they are normally combined with simulation programs that accept models defined in these particular languages. Sørlie (1994a, 1995a, 1996d) has shown a way to use Omola to write models for IdKit. It is still a considerable effort to write models in these languages, instead of directly in some programming language, such as M-files or C (in addition to the effort of learning a new language). However, the advantage of using a comprehensive modelling language is that it prevents the writing of inconsistent model equations. It is also possible to include extensive libraries of component models, thus simplifying the modelling. There is still no guarantee that the identification problems set up using these tools make sense.
The languages were developed for simulation purposes. There are some problems with using them for grey-box identification:
: Specialized languages: The languages are basic, and the user has to learn one of them. Like other computer languages they tend to develop towards covering more and more objects, and this makes them more general and more abstract. Libraries may show a way out, but are of course limited by what the vendor finds profitable to develop. In addition, since calibrating and validating a model is a much more demanding task than simulating it, the development tends to allow the writing of models increasingly less suitable for identification purposes. Again, more libraries may be a way out, if specialized to suit the identification purpose.
: ODE solving and parameter optimization: There are special numerical problems associated with combining standard optimizers with efficient ODE solvers that use step-length control. The numerical errors interfere. This means in practice that both integration and optimization have to be done with extremely high numerical precision (see the sketch after this list). There is at least one program (diffpar, Edsberg and Wikström, 1995) designed to do simultaneous integration and optimization; it handles only models without disturbances.
: Not predicting models: Grey-box identification is not simulation plus fitting; it is prediction plus fitting (and more). Modelling languages do not primarily produce predicting models. The difference is that a predictor uses past input and output to compute the next output, whereas a simulator uses only past input. The difference matters when disturbances are important. Whenever it pays to have feedback control it also pays to use a predictive model, most obviously if the purpose is Model Predictive Control. Even if it is possible, in principle, to derive a predicting model from a simulating one, this is no easy task. It is known as 'the nonlinear filtering problem', and, in fact, only a few cases have been solved so far. In practice it is not as bad as that, since approximating filters may be enough. Sørlie (1996) has investigated the possibilities of combining Omola with an Extended Kalman Filter.
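The numerical interference mentioned in the 'ODE solving and parameter optimization' item can be illustrated with a toy fit in which the ODE is solved inside the loss function. The model, data, and tolerance values are invented for the illustration; the point is only that the solver tolerances must be far tighter than the accuracy the optimizer expects of the loss.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

t_data = np.linspace(0.0, 5.0, 26)
y_data = np.exp(-0.7 * t_data)            # invented measurements of a decay process

def loss(p):
    """Sum-of-squares loss with the ODE solved inside the objective.
    Loose tolerances would make this function 'noisy' in p and ruin the
    search; hence the very tight rtol/atol."""
    sol = solve_ivp(lambda t, x: -p[0] * x, (0.0, 5.0), [1.0],
                    t_eval=t_data, rtol=1e-10, atol=1e-12)
    return np.sum((sol.y[0] - y_data) ** 2)

result = minimize(loss, x0=[0.3], method="Nelder-Mead")
print(result.x)                           # should come out close to 0.7
```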
Optimization Tools
Classical optimization methods are those of Fletcher-Powell and Newton-Raphson type, and there are well-developed computer libraries for doing the kind of continuous parameter optimization needed in white-, black-, and grey-box identification alike. A particular prerequisite of model fitting is that one cannot usually afford to evaluate the loss function a large number of times. Quasi-Newton methods are particularly effective for predictive models (Liao, 1989a): one obtains the fast convergence of a second-order search method from evaluations of first-order derivatives of the model output. However, this enhances the search problem in more difficult cases:
: Multiple minima: Global search methods, like the new and intuitively attractive method of 'genetic programming', tend to take an uncomfortably large number of loss evaluations. Alternatively, local search methods may have to be applied several times with different start values.
: Discontinuities: The presence of discontinuities in the model's parameter dependence ruins any search based on assumptions of continuity. Less serious, but still troublesome, are discontinuities in parameter sensitivity.
Validation and Falsification
Once again, these tasks have basically different purposes: falsification decides whether a model is good enough for the available data; validation decides whether it is good enough for its purpose. A model can be both “false” and “valid”, as well as any other of the four possible combinations of the outcomes of the two tests.
There are several quite general statistical tests for the falsification task, and most black-box identification packages support some of them, mainly 'chi-square' and 'cross-correlation' tests. They are typically used for order determination. Likelihood-Ratio tests are applicable to nonlinear models and, in addition, have maximum discriminating power, i.e., they have the maximum probability of rejecting an incorrect model for a given risk of rejecting a correct one.
Validation is conventionally done by making a loss function that reflects the purpose of the modelling, evaluating the loss for the candidate model, and seeing whether it is below a likewise given threshold. The simplest case is when the modelling is done for control purposes, because a suitable loss is then the prediction-error variance (with the model evaluated on a different data sample).
Remark 1.10. Falsification methods are sometimes found under the “validation” keyword in the literature.
… and has been implemented in MoCaVa.
1.5.2 Tools that Need to Be Developed
Generally, there are enough tools to make grey-box models, and evidence that it can be done in practice, if one knows how to use the tools. What remains is to make it easier. This is not without problems, however: the man-machine communication problem has to be considered, and communication has two directions:
: User input: What prior information is it reasonable to ask from the user? The problem is enhanced by the fact that users in different branches of engineering have different ways of looking at models, and therefore different kinds of prior knowledge. This means that, ideally, there should be different man-machine interfaces for different categories of users. The interface implemented in MoCaVa is designed for process engineers more than for control engineers.
: User support: The task that rests most heavily on the user is deciding what to do next, when a model has been found inadequate. What the computer can conceivably do to facilitate this is to present the evidence of the test results in a way that reveals at which point the model fails and that is also easy to understand. Unfortunately, general tests are rather blunt instruments in this respect. The result of a statistical test has the binary value of either “passed” or “failed” (in practice, it tends to be “failed”, since maximum-power statistical tests are honed to razor sharpness in that respect).
However, there are some means of getting more information out of testing a given model. An option in MoCaVa works in connection with the stepwise refinement and falsification of the model structure outlined above. It is based on an idea that can be illustrated by the following simple example. Assume that the current tentative structure is expanded by a free parameter $p$, whose value is known a priori to be positive. Instead of limiting the search to positive values, it is more informative to proceed as follows: do not limit the search to positive values. Then the test has one of three possible outcomes, as depicted in Figure 1.1: hypothesis $H_0$ represents the tentative model ($p = 0$) and $H_1$ an alternative ($p \neq 0$). The particular case that there is an alternative but inadmissible model with a significantly lower loss, $Q(p < 0)$, means that $H_0$ is still rejected (since a better model does exist), but $H_1$ is not the one, and the alternative structure does not contain one. This gives two pieces of information to the model designer: 1) continue the search for a better model, and 2) use another model structure. In addition, the component of the total model to improve is the one containing the unsuccessful expansion. This determines whether a component is worth cultivating or not. In conclusion, statistical tests give a two-valued answer, but tests combined with prior structure knowledge may yield more.
Remark 1.11. Notice that $H_0$ is rejected as soon as there is some alternative model $H_1$ within the alternative structure with a loss below the $\chi^2$ threshold. This means that there is no need to search for the alternative with the smallest loss in order to test the tentative model, except when it cannot be rejected.
Figure 1.1. Illustrating the three possible results of falsification (the loss $Q(p)$ plotted against $p$): $H_0$ not rejected; $H_0$ rejected with $H_1$ better; $H_0$ rejected but $H_1$ wrong.
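The three-way reading of the test can be written out explicitly as follows; the function, its arguments, and the 5% risk level are invented for the illustration.

```python
from scipy.stats import chi2

def three_way_outcome(Q0, Q1, p_hat, risk=0.05):
    """Three possible results of testing an expansion by one parameter p
    that is known a priori to be positive.  Q0 and Q1 are the losses of the
    tentative and the freely fitted alternative model, p_hat the fitted p."""
    significant = 2.0 * (Q0 - Q1) > chi2.ppf(1.0 - risk, df=1)
    if not significant:
        return "H0 not rejected: keep the tentative structure"
    if p_hat > 0:
        return "H0 rejected, H1 better: adopt the expanded structure"
    return ("H0 rejected, but H1 inadmissible (p < 0): keep searching, "
            "and improve the component containing this expansion")
```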
Conditional and Unconditional Tests
The rule used to decide whether a tentative model structure is falsified or not depends on the alternative structure, and is therefore 'conditional' on the alternative. 'Unconditional' tests do not assume an explicit alternative, but instead aim at testing the basic hypothesis that the known and unknown inputs are independent. If they are not, there is obviously information in the input data that could be used to improve the estimation of the unknown input, and thus the predicting ability of the model.
The disadvantage of unconditional tests is that they are less discriminating, i.e., they let a wider range of similar models pass the test. This is so because the set of implicit 'alternatives' is much wider. However, they are still applicable when the model designer has run out of useful prior knowledge.
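A simple unconditional check of the independence hypothesis is to cross-correlate the known input with the prediction residuals and compare against a confidence band. The sketch below is one common way of doing this; it is illustrative only and is not MoCaVa's test.

```python
import numpy as np

def independence_check(u, residuals, max_lag=20):
    """Flag dependence between the known input u and the prediction residuals.

    If any cross-correlation falls outside an approximate 95% band, there is
    information left in the data that a better model could exploit."""
    u = (u - u.mean()) / u.std()
    e = (residuals - residuals.mean()) / residuals.std()
    N = len(e)
    bound = 1.96 / np.sqrt(N)                    # ~95% band under independence
    for lag in range(max_lag + 1):
        r = np.dot(u[:N - lag], e[lag:]) / (N - lag)
        if abs(r) > bound:
            return True                          # dependence found: model inadequate
    return False
```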
The following modified calibration procedure takes into account the prospects offered by the various tests:
Calibration procedure:
  While there is a better model, repeat
    Refine the tentative model structure: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \cdot)$
    Fit model parameters: → $F(u_t, \omega_t, \hat{c}, \hat{\nu}, \hat{\theta}_\nu)$
    Test the tentative model: Until better model, repeat
      If no more alternative structures,
        then expand the alternative model class: → $F(u_t, \omega_t, c, \cdot, \cdot)$
      Specify alternative model structures: → $F(u_t, \omega_t, c, \nu, \cdot)$
      If an alternative model is significantly better, then indicate falsified
      If an admissible alternative model is significantly better,
        then indicate better model and assign $c \to \hat{c}$, $\nu \to \hat{\nu}$
    If unfalsified, then test unconditionally: → falsified | unfalsified
2 The MoCaVa Solution
The analysis in Chapter 1 outlines what the purpose of the model making would require MoCaVa to do. That must be reconciled, somehow, with the restrictions set by what a computer can do in reasonable execution time. MoCaVa therefore contains further restrictions in order to compromise between the two. In essence, MoCaVa makes use of the following tools:
: Modified Likelihood-Ratio and correlation tests
: The general calibration procedure outlined in Section 1.5.2
: A collection of heuristic validation rules
Chapter 2 describes how these general tools are implemented, and motivates the restrictions that make it possible.
2.1 The Model Set
A second compromise that must be made in the design of MoCaVa is that between the conflicting goals of versatility and convenience in the user's modelling task. The model set used in MoCaVa is therefore structured further to adapt to common properties of industrial production processes, in particular to continuous transport processes. The latter may be characterized as systems of separate units, each accepting flows of commodities from one or more preceding units, changing their properties, and feeding the product to one or more following units. Since there is an obvious cause-and-effect relationship between the input and output variables of the units, state-vector models (defined by assignment statements) are convenient to use in those cases.
Secondly, the operation of an individual unit is generally a result of interaction between particular physical phenomena (at least, 'first principles' are generally expressed in this way). The different phenomena may also be described by submodels.
A third common characteristic of production processes is that the operation of some units may be affected by the operations of other units, namely control units. Instead of flows (mass or energy), these produce information input to the affected unit, but they are still describable by the same type of submodel.
In order to satisfy the requirements, MoCaVa is able to administer the creation of submodels and to connect them into systems.
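A minimal sketch of what 'connecting submodels into systems' can look like in code is given below; the unit functions and the dictionary-based flow convention are invented for the illustration and do not reflect MoCaVa's actual mechanism.

```python
from typing import Callable, Dict, List

# Each submodel maps its input flows/signals to its output flows/signals.
Submodel = Callable[[Dict[str, float]], Dict[str, float]]

def run_chain(units: List[Submodel], feed: Dict[str, float]) -> Dict[str, float]:
    """Connect transport-process units in series: the product stream of one
    unit becomes the feed of the next (information inputs from control units
    could be merged into the same dictionaries)."""
    stream = feed
    for unit in units:
        stream = unit(stream)
    return stream

# Invented example units: a heater raises temperature, a mixer dilutes.
heater = lambda s: {**s, "T": s["T"] + 15.0}
mixer = lambda s: {**s, "conc": 0.5 * s["conc"]}
print(run_chain([heater, mixer], {"T": 60.0, "conc": 0.8}))
```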