[Equation not recoverable from the source: an inequality involving the covariance $P$ and the transition matrix $\Phi$.]
If the equal sign is taken, that is, if the local estimates are de-correlated, then on the one hand the global optimal fusion estimate can be realized by Theorem 4.1. On the other hand, the initial covariance matrix and the process noise covariance of each sub-filter are enlarged by $\beta_i^{-1}$ times, so the filtering result of every local filter will no longer be optimal.
4.2 Structure and Performance Analysis of the Combined Filter
The combined filter is a two-level filter. What distinguishes it from traditional distributed filters is the use of information distribution to share information among the sub-filters. The information fusion structure of the combined filter is shown in Fig. 4.1.
[Fig. 4.1 Structure of the Combined Filter; block label: Optimal fusion]
From the filter structure shown in Fig. 4.1, the fusion process of the combined filter can be divided into the following four steps.
Step 1, initial values and information distribution: Let the initial value of the global state be $X_0$ and the covariance be $Q_0$. Denote the state estimate vector, the system covariance matrix and the state vector covariance matrix of the $i$-th local filter by $\hat{X}_i, Q_i, P_i$ ($i = 1, \dots, N$), and those of the master filter by $\hat{X}_m, Q_m, P_m$. The information is distributed to the sub-filters and the master filter through the information distribution factors $\beta_i$ according to the following rules:
$$Q_i = \beta_i^{-1} Q_g, \qquad P_i = \beta_i^{-1} P_g, \qquad \beta_m + \sum_{i=1}^{N} \beta_i = 1$$
Step 2, time update: The time update is carried out independently in each filter, by the usual Kalman prediction of the state estimate and its covariance.
Step 3, measurement update: As the master filter takes no measurements, there is no measurement update in the master filter. The measurement update occurs only in the local sub-filters, each of which processes its own measurements by the usual Kalman measurement update.
Step 4, optimal information fusion: The information of the state equation and the information of the process equation are apportioned by the information distribution so as to eliminate the correlation among the sub-filters. The core algorithm of the combined filter then fuses the local information of every local filter to obtain the optimal state estimate:
$$\hat{X}_g(k|k) = P_g(k|k) \sum_{i=1}^{N,m} P_i^{-1}(k|k)\, \hat{X}_i(k|k), \qquad P_g(k|k) = \Big( \sum_{i=1}^{N,m} P_i^{-1}(k|k) \Big)^{-1}$$
where the sums run over the $N$ sub-filters and the master filter.
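The four steps above can be illustrated with a minimal sketch for a scalar state. It is not the authors' implementation: the function names, the scalar model (transition `phi`, measurement gain `h`, measurement noise variance `r`) and the convention that the first factor belongs to the master filter are all assumptions made for this example, and every factor is assumed to be strictly positive.

```python
from dataclasses import dataclass


@dataclass
class Local:
    x: float  # state estimate of one filter
    p: float  # error variance of that estimate
    q: float  # process noise variance used by that filter


def distribute(x_g, p_g, q_g, betas):
    """Step 1: P_i = P_g / beta_i, Q_i = Q_g / beta_i (factors must sum to 1)."""
    if abs(sum(betas) - 1.0) > 1e-12 or min(betas) <= 0.0:
        raise ValueError("factors must be positive and sum to one")
    return [Local(x_g, p_g / b, q_g / b) for b in betas]


def time_update(f, phi):
    """Step 2: standard Kalman prediction, run independently in every filter."""
    return Local(phi * f.x, phi * f.p * phi + f.q, f.q)


def measurement_update(f, z, h, r):
    """Step 3: information-form measurement update (sub-filters only)."""
    info = 1.0 / f.p + h * h / r
    p = 1.0 / info
    x = p * (f.x / f.p + h * z / r)
    return Local(x, p, f.q)


def fuse(filters):
    """Step 4: P_g = (sum P_i^-1)^-1, X_g = P_g * sum(P_i^-1 X_i)."""
    p_g = 1.0 / sum(1.0 / f.p for f in filters)
    x_g = p_g * sum(f.x / f.p for f in filters)
    return x_g, p_g


def federated_cycle(x_g, p_g, q_g, phi, betas, sensors):
    """One cycle; betas[0] is the master factor (assumed convention),
    betas[1:] match `sensors`, a list of (z, h, r) tuples."""
    filters = [time_update(f, phi) for f in distribute(x_g, p_g, q_g, betas)]
    master, subs = filters[0], filters[1:]
    subs = [measurement_update(f, z, h, r) for f, (z, h, r) in zip(subs, sensors)]
    return fuse([master] + subs)


sensors = [(1.2, 1.0, 1.0), (0.8, 1.0, 4.0)]  # made-up measurements
x_a, p_a = federated_cycle(1.0, 2.0, 0.5, 1.0, [0.2, 0.4, 0.4], sensors)
x_b, p_b = federated_cycle(1.0, 2.0, 0.5, 1.0, [0.5, 0.1, 0.4], sensors)
print(x_a, p_a)
print(x_b, p_b)
```

Running the cycle with two different sets of factors gives the same fused estimate up to rounding, which is a convenient sanity check on the distribution and fusion rules.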
State Optimal Estimation for Nonstandard Multi-sensor Information Fusion System
... correlation between the sub-filters and the master filter and among the sub-filters themselves, and to enlarge the initial covariance matrix and the process noise covariance of each sub-filter by $\beta_i^{-1}$ times, the filtering result of each local filter will not be optimal. However, the information lost through the variance upper-bound technique can be re-synthesized in the final fusion process to obtain the globally optimal solution.
The above analysis of the state fusion estimation structure shows that the centralized fusion structure gives the optimal fusion estimate of the system state in the minimum-variance sense. In the combined filter, the optimal fusion algorithm is applied to the local filtering estimates to synthesize the global state estimate. Because of the variance upper-bound technique, the local filters become suboptimal, while the global filter obtained after synthesis is globally optimal; that is, the combined filtering process is equivalent to the centralized fusion filtering process. To sum up, the variance upper-bound technique greatly simplifies the combined filtering algorithm: it makes the local estimates suboptimal, but the global estimate after fusion of the local estimates is optimal, i.e. the combined filtering model is equivalent to the centralized filtering model in estimation accuracy.
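The equivalence stated above can be checked over one filtering cycle. The derivation below is a sketch under stated assumptions: a linear model with transition matrix $\Phi$, mutually independent measurement noises, and measurement matrices $H_i$ with noise covariances $R_i$ for the sub-filters (these symbols are introduced here for illustration), together with the distribution rules $Q_i = \beta_i^{-1} Q_g$, $P_i = \beta_i^{-1} P_g$ and factors that sum to one.

```latex
\begin{aligned}
P_i(k|k-1) &= \Phi P_i(k-1|k-1)\Phi^{T} + Q_i
            = \beta_i^{-1}\bigl(\Phi P_g(k-1|k-1)\Phi^{T} + Q_g\bigr)
            = \beta_i^{-1} P_g(k|k-1), \\
P_i^{-1}(k|k) &= \beta_i P_g^{-1}(k|k-1) + H_i^{T} R_i^{-1} H_i, \quad i = 1,\dots,N, \qquad
P_m^{-1}(k|k) = \beta_m P_g^{-1}(k|k-1), \\
P_g^{-1}(k|k) &= \sum_{i=1}^{N,m} P_i^{-1}(k|k)
             = \Bigl(\beta_m + \sum_{i=1}^{N}\beta_i\Bigr) P_g^{-1}(k|k-1)
               + \sum_{i=1}^{N} H_i^{T} R_i^{-1} H_i \\
            &= P_g^{-1}(k|k-1) + \sum_{i=1}^{N} H_i^{T} R_i^{-1} H_i .
\end{aligned}
```

The last line is the information-form covariance update of a centralized Kalman filter that processes all $N$ measurements at once. The factors have dropped out of the global result, although each local $P_i(k|k)$ still depends on its own $\beta_i$.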
4.3 Adaptive Determination of Information Distribution Factor
The analysis of the estimation performance of the combined filter shows that the information distribution principle not only eliminates the correlation between sub-filters caused by the common reference information, so that every sub-filter can filter independently, but also makes the global fused estimate optimal. This is the key technique of the combined filter fusion algorithm. Even so, different information distribution principles lead to different structures and different characteristics (fault tolerance, precision and amount of calculation) of the combined filter. Consequently, many studies on the selection of the information distribution factor of the combined filter have appeared in recent years. In the traditional structure of the combined filter, the factors used to distribute information to the subsystems are predetermined and kept unchanged, which makes it difficult to reflect the dynamic nature of the subsystems during information fusion. The main objective and research direction is therefore to design an information distribution principle that is simple, effective, dynamically adaptive and practical. The aim is to keep the overall performance of the combined filter close to the best performance of the local systems during filtering: a high-precision subsystem is given a larger information distribution factor and a lower-precision subsystem a smaller one, so that the loss in overall estimation accuracy is reduced. An adaptive method for determining the information distribution factors can better reflect the changes in estimation accuracy of the subsystems and reduce the impact of subsystem failure or precision degradation, thereby improving the overall estimation accuracy and the adaptability and fault tolerance of the whole system. A contradictory view on determining the information distribution factor is given in Literature [28]. It argues that, when the statistical characteristics of the noise are known, the globally optimal estimation accuracy has nothing to do with the values of the information distribution factors, so there is no need for adaptive determination.
Combining these findings from the literature, the rules for determining the information distribution factor should be considered from two aspects.
1) When the conditions required by Kalman filtering are met, for example when the statistical properties of the noise are known exactly, the filter performance analysis in Section 4.2 shows that if the values of the information distribution factors satisfy the information conservation principle, the combined filter is globally optimal. In other words, the globally optimal estimation accuracy is unrelated to the values of the information distribution factors, although these values do influence the estimation accuracy of each sub-filter. In the information distribution process, the process information obtained by each sub-filter is $\beta_i^{-1} Q_g$, $\beta_i^{-1} P_g$, and the Kalman filter automatically weights the information according to its quality: the smaller $\beta_i$ is, the lower the weight of the process information, so the accuracy of the sub-filter depends on the accuracy of the measurement information; conversely, the larger $\beta_i$ is, the more the accuracy of the sub-filter depends on the accuracy of the process information.
2) When the statistical properties of the noise are unknown or a subsystem fails, the global estimate loses its optimality and its accuracy degrades, so an adaptive way of determining the information distribution factor becomes necessary. The information distribution factor is then determined dynamically and adaptively from the accuracy of each sub-filter, in order to overcome the loss of accuracy caused by a faulty subsystem and to maintain relatively high accuracy in the global estimate. In determining the adaptive information distribution factor, a less precise sub-filter should be allocated a smaller factor, so that the overall output of the combined filtering model has better fusion performance, that is, higher estimation accuracy and fault tolerance.
In a Kalman filter, the trace of the error covariance matrix $P$ contains the variances of the estimated state components or of their linear combinations. The estimation accuracy of the filter for the state vector or its linear combinations can therefore be assessed by analysing the trace of $P$. This leads to the following definition:
Definition 4.1: The estimation accuracy attenuation factor of the $i$-th local filter is
$$EDOP_i = \sqrt{\operatorname{tr}(P_i P_i^{T})}$$
where $P_i$ is the estimation error covariance matrix of the $i$-th local filter and $\operatorname{tr}(\cdot)$ denotes the trace of a matrix. Introducing the estimation accuracy attenuation factor $EDOP_i$ amounts to characterizing $P_i$ by a matrix norm: the larger the norm, the larger the corresponding estimation covariance matrix and the poorer the filtering effect, and vice versa.
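According to the definition of the estimation accuracy attenuation factor, the information distribution factor $\beta_i$ in the combined filtering process is computed from the $EDOP_i$ values of the sub-filters, so that a sub-filter with a smaller $EDOP_i$, i.e. higher accuracy, receives a larger factor.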
Obviously, $\beta_i$ satisfies the information conservation principle and has a very intuitive physical meaning: it reflects the estimation performance of the sub-filters and improves the fusion performance of the global filter by adjusting the proportion of the local estimate information in the global estimate information. In particular, when the performance degradation of a subsystem causes its local estimation error covariance matrix to increase abnormally, the adaptive information distribution gives the combined filter strong robustness and fault tolerance.
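The adaptive rule described in this section can be sketched as follows. The example assumes $EDOP_i = \sqrt{\operatorname{tr}(P_i P_i^{T})}$, which for a symmetric covariance matrix equals the square root of the sum of its squared eigenvalues, and assumes that each factor is taken proportional to $1/EDOP_i$ and normalised so that the factors sum to one. This normalisation is one rule consistent with the stated properties, not necessarily the formula used by the authors, and the function names and covariance values are hypothetical.

```python
import math


def edop(p):
    """Frobenius norm of P: sqrt(tr(P P^T)) = sqrt(sum of squared entries)."""
    return math.sqrt(sum(v * v for row in p for v in row))


def adaptive_factors(covariances):
    """Assumed rule: beta_i proportional to 1 / EDOP_i, normalised to sum to 1."""
    inverse = [1.0 / edop(p) for p in covariances]
    total = sum(inverse)
    return [w / total for w in inverse]


# Made-up local covariances: one accurate, one mediocre, one degraded sub-filter.
P1 = [[1.0, 0.0], [0.0, 1.0]]
P2 = [[4.0, 0.0], [0.0, 4.0]]
P3 = [[100.0, 0.0], [0.0, 100.0]]

betas = adaptive_factors([P1, P2, P3])
print(betas)
```

With these values the degraded sub-filter receives a factor below one percent, so its contribution to the shared information is sharply reduced, while the factors still sum to one as the information conservation principle requires.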
5 Summary
This chapter focuses on non-standard multi-sensor information fusion systems affected by various nonlinear, uncertain and correlated factors. Such systems are common in practical applications because of differences in the measuring principles and characteristics of the sensors and in the measuring environment.
To deal with these non-standard factors, three solution schemes, based respectively on semi-parametric modeling, multi-model fusion and self-adaptive estimation, are put forward, and the corresponding fusion estimation models and algorithms are presented.
(1) By introducing the concept of semi-parametric regression analysis into non-standard multi-sensor state fusion estimation theory, the corresponding fusion estimation model and a parametric/non-parametric solution algorithm are established. The semi-parametric modeling method separates the model error caused by nonlinear and uncertain factors and thereby weakens its influence on the precision of the state fusion estimate. In addition, it is proved in theory that the state estimate obtained by this algorithm is the optimal fusion estimate.
(2) Two multi-model fusion estimation methods, based respectively on multiple model adaptive estimation and on interacting multiple model fusion, are studied to deal with the nonlinear and time-varying factors in multi-sensor fusion systems and to realize the optimal fusion estimate of the state.
(3) A self-adaptive fusion estimation strategy is introduced to deal with the local dependency and system parameter uncertainty in multi-sensor dynamical systems and to realize the optimal fusion estimate of the state. The fusion model of the federated filter and its optimality are studied; the fusion algorithms for correlated and for uncorrelated sub-filters are presented; the structure and algorithm scheme of the federated filter are designed; and its estimation performance, which is strongly influenced by the information allocation factors, is analysed. Finally, a selection method is discussed in which the information allocation factors are determined dynamically and self-adaptively according to the eigenvalue square decomposition of the covariance matrix.
6 References
Hall L D, Llinas J. Handbook of Multisensor Data Fusion. Boca Raton, FL, USA: CRC Press, 2001.
Bedworth M, O’Brien J. The Omnibus Model: A New Model of Data Fusion. IEEE Transactions on Aerospace and Electronic System, 2000, 15(4): 30-36.
Heintz F, Doherty P. A Knowledge Processing Middleware Framework and its Relation to the JDL Data Fusion Model. Proceedings of the 7th International Conference on Information Fusion, 2005, pp: 1592-1599.
Llinas J, Waltz E. Multisensor Data Fusion. Norwood, MA: Artech House, 1990.
X R Li, Yunmin Zhu, Chongzhao Han. Unified Optimal Linear Estimation Fusion-Part I: Unified Models and Fusion Rules. Proc. 2000 International Conf. on Information Fusion, July 2000.
Jiongqi Wang, Haiyin Zhou, Deyong Zhao, et al. State Optimal Estimation with Nonstandard Multi-sensor Information Fusion. System Engineering and Electronics, 2008, 30(8): 1415-1420.
Kennet A, Maybeck P S. Multiple Model Adaptive Estimation with Filter Spawning. IEEE Transaction on Aerospace Electron System, 2002, 38(3): 755-768.
Bar-Shalom Y, Campo L. The Effect of the Common Process Noise on the Two-sensor Fused-track Covariance. IEEE Transaction on Aerospace and Electronic Systems, 1986, Vol.22: 803-805.
Morariu V I, Camps O I. Modeling Correspondences for Multi Camera Tracking Using Nonlinear Manifold Learning and Target Dynamics. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June, 2006, pp: 545-552.
Stephen C, Stubberud, Kathleen A, et al. Data Association for Multisensor Types Using Fuzzy Logic. IEEE Transaction on Instrumentation and Measurement, 2006, 55(6): 2292-2303.
Hammerand D C, Oden J T, Prudhomme S, Kuczma M S. Modeling Error and Adaptivity in Nonlinear Continuum System. NTIS No: DE2001-780285/XAB.
Crassidis J L, et al. A Real-time Error Filter and State Estimator. AIAA-943550, 1994: 92-102.
Flammini A, Marioli D, et al. Robust Estimation of Magnetic Barkhausen Noise Based on a Numerical Approach. IEEE Transaction on Instrumentation and Measurement, 2002, 16(8): 1283-1288.
Donoho D L, Elad M. On the Stability of the Basis Pursuit in the Presence of Noise. http://www-stat.stanford.edu/-donoho/reports.html
Sun H Y, Wu Y. Semi-parametric Regression and Model Refining. Geospatial Information Science, 2002, 4(5): 10-13.
Green P J, Silverman B W. Nonparametric Regression and Generalized Linear Models. London: Chapman and Hall, 1994.
Petros Maragos, FangKuo Sun. Measuring the Fractal Dimension of Signals: Morphological Covers and Iterative Optimization. IEEE Trans. on Signal Processing, 1998(1): 108~121.
Sugihara G, May R M. Nonlinear Forecasting as a Way of Distinguishing Chaos From Measurement Error in Time Series. Nature, 1990, 344: 734-741.
Roy R, Paulraj A, Kailath T. ESPRIT Estimation of Signal Parameters Via Rotational Invariance Technique. IEEE Transaction Acoustics, Speech, Signal Processing, 1989, 37: 984-98.
Aufderheide B, Prasad V, Bequette B W. A Comparison of Fundamental Model-based and Multi Model Predictive Control. Proceeding of IEEE 40th Conference on Decision and Control, 2001: 4863-4868.
Aufderheide B, Bequette B W. A Variably Tuned Multiple Model Predictive Controller Based on Minimal Process Knowledge. Proceedings of the IEEE American Control Conference, 2001, 3490-3495.
X Rong Li, Vesselin P Jilkov. A Survey of Maneuvering Target Tracking-Part V: Multiple-Model Methods. Proceeding of SPIE Conference on Signal and Data Processing of Small Targets, San Diego, CA, USA, 2003.
Berg T M, et al. General Decentralized Kalman Filters. Proceedings of the American Control Conference, Maryland, June, 1994, pp. 2273-2274.
Nahin P J, Pokoski J L. NCTR Plus Sensor Fusion of Equals IFNN. IEEE Transaction on AES, 1980, Vol. AES-16, No.3, pp. 320-337.
Bar-Shalom Y, Blom H A. The Interacting Multiple Model Algorithm for Systems with Markovian Switching Coefficients. IEEE Transaction on Aut Con, 1988, AC-33: 780-783.
X Rong Li, Vesselin P Jilkov. A Survey of Maneuvering Target Tracking-Part I: Dynamic Models. IEEE Transaction on Aerospace and Electronic Systems, 2003, 39(4): 1333-1361.
Huimin Chen, Thiagalingam Kirubarajan, Yaakov Bar-Shalom. Track-to-track Fusion Versus Centralized Estimation: Theory and Application. IEEE Transactions on AES, 2003, 39(2): 386-411.
Ham F M. Observability, Eigenvalues and Kalman Filtering. IEEE Transactions on Aerospace and Electronic Systems, 1982, 19(2): 156-164.
Xianda Zhang. Matrix Analysis and Application. Beijing: Tsinghua University Press, 2004.
X Rong Li. Information Fusion for Estimation and Decision. International Workshop on Data Fusion in 2002, Beijing, China.
Air traffic trajectories segmentation based on time-series sensor data
José L Guerrero, Jesús García and José M Molina
X
Air traffic trajectories segmentation based on time-series sensor data
José L Guerrero, Jesús García and José M Molina
University Carlos III of Madrid
Spain
1 Introduction
Air traffic control (ATC) is a safety-critical area that requires strict validation in real
conditions (Kennedy & Gardner, 1998). The amount of data in this domain has grown
exponentially with the increase in the number of passengers and flights, which has led to
the need for automated processes that support the work of human operators (Wickens et
al., 1998). These automation procedures can be divided into two basic processes: the online
tracking of the aircraft (along with the decisions taken according to this information) and
the offline validation of that tracking process. The validation is usually separated into two
sub-processes: segmentation (Guerrero & Garcia, 2008), which divides the initial data into a
series of segments, and reconstruction (Pérez et al., 2006, García et al., 2007), which
approximates the segments the trajectory was divided into with different models. The
reconstructed trajectories are used for the analysis and evaluation of the online tracking
results.
This validation assessment of ATC centers is done with recorded datasets (usually named
opportunity traffic), which are used to reconstruct the necessary reference information. The
reconstruction process transforms multi-sensor plots to a common coordinate frame and
organizes the data into trajectories of individual aircraft. Then, for each trajectory, segments
of different modes of flight (MOF) must be identified, each one corresponding to a time
interval in which the aircraft is flying with a different type of motion. These segments are a
valuable description of the real data, providing information to analyze the behavior of target
objects (where uniform motion flight and maneuvers are performed, their magnitudes,
durations, etc.). The performance assessment of ATC multisensor/multitarget trackers
requires this reconstruction analysis based on available air data, in a domain usually named
opportunity trajectory reconstruction (OTR) (Garcia et al., 2009).
OTR is a batch process in which all the available real data from all available sensors is
used to obtain smoothed trajectories for all the individual aircraft in the area of interest.
It requires accurate association of original measurements to reconstructed trajectories,
bias estimation and correction to align all sensor measurements, and adaptive multisensor
smoothing to obtain the final interpolated trajectory. It should be pointed out that this
off-line batch processing is potentially quite different from the usual real-time data fusion
systems used for ATC, due to the differences in the data processing order and its specific
processing techniques, along with the different availability of information (the whole
trajectory can be used by the algorithms in order to perform the best possible reconstruction).
OTR works as a special multisensor fusion system that aims to estimate the target kinematic
state, taking advantage of both past and future target position reports (a smoothing
problem). In the ATC domain, the typical sensors providing data for reconstruction are the
following:
• Radar data, from primary (PSR), secondary (SSR), and Mode S radars (Shipley,
1971). These measurements have random errors in the order of hundreds of
meters (with a value that increases linearly with the distance to the radar).
• Multilateration data from Wide Area Multilateration (WAM) sensors (Yang et al.,
2002). They have much lower errors (in the order of 5-100 m), which also grow
linearly with the distance to the sensor positions.
• Automatic dependent surveillance (ADS-B) data (Drouilhet et al., 1996). Its quality
depends on the aircraft equipment, with a general trend towards GPS/GNSS,
which has errors in the order of 5-20 meters.
The complementary nature of these sensor techniques brings a number of benefits (high
degree of accuracy, extended coverage, systematic error estimation and correction, etc.) and
poses new challenges for the fusion process, which must guarantee an improvement with
respect to any of those sensor techniques used alone.
After a preprocessing phase that expresses all measurements in a common reference frame
(the stereographic plane used for visualization), the measurements of the studied
trajectories have the following attributes: detection time, stereographic projections of their
x and y components, covariance matrix, and real motion model (MM), an attribute only
included in simulated trajectories, which are used for algorithm learning and validation.
With these input attributes, we will look for a domain transformation that allows us to
classify our samples into a particular motion model with maximum accuracy, according to
the model we are applying.
The movement of an aircraft in the ATC domain can be simplified into a series of basic
MM’s. The most commonly considered ones are the uniform, accelerated and turn MM’s. The
general idea of the algorithm proposed in this chapter is to analyze these models
individually and exploit the available information in three consecutive phases.
The first phase receives the information in the common reference frame and the analyzed
model, and outputs a set of synthesized attributes which will be handled by a learning
algorithm in order to classify the different measurements of the trajectory. These
synthesized attributes are based on domain transformations according to the analyzed
model by means of local information analysis (their value is based on the definition of
segments of measurements from the trajectory). They are obtained for each measurement
belonging to the trajectory (in fact, this process can be seen as a data pre-processing step
for the data mining techniques (Famili et al., 1997)).
The second phase applies data mining techniques (Eibe, 2005) to the synthesized
attributes from the previous phase, providing as its output an individual classification for
each measurement belonging to the analyzed trajectory. This classification identifies the
measurement according to the model introduced in the first phase (determining whether it
belongs to that model or not).
The third phase takes the data mining classification as its input and refines it according
to the knowledge of the possible MM’s and their transitions, correcting possible
misclassifications, and provides the final classification for each of the trajectory’s measurements. This refinement is performed by applying a filter.
Finally, segments are constructed over those classifications (by joining adjacent measurements with the same classification value). These segments fall into two groups: those belonging to the analyzed model (which are already a final output of the algorithm) and those which do not belong to it, which have to be processed by different models. It must be noted that the number of measurements processed by each model is reduced with each application of this cycle (due to the segments already obtained as a final output) and thus more detailed models with lower complexity should be applied first. Using the introduced division into three MM’s, the proposed order is the following: uniform, accelerated and finally turn model. Figure 1 explains the algorithm’s approach.
Fig. 1. Overview of the algorithm’s approach (flowchart: trajectory input data and analyzed model → first phase: domain transformation → synthesized attributes → second phase: data mining techniques → preliminary classifications → third phase: results filtering → refined classifications → segment construction → for each output segment, "Belongs to model?": YES → final segmentation results, NO → apply next model)
The validation of the algorithm is carried out by generating a set of test trajectories that is as representative as possible. This implies not using exact covariance matrices (but estimations of their value) and carefully choosing the shapes of the simulated trajectories. We have based our results on four types of simulated trajectories, each having two different samples. Uniform, turn and accelerated trajectories are a direct validation of our three basic MM’s. The fourth trajectory type, racetrack, is a typical situation during landing procedures. The validation is performed, for a fixed model, with its true positives rate (TPR, the rate of measurements correctly classified among all those belonging to the model) and false positives rate (FPR, the rate of measurements incorrectly classified among all those not belonging to the model). This work will show the results of the three consecutive phases using
a uniform motion model.
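The two validation rates can be computed directly from per-measurement labels. The following is a minimal sketch, assuming boolean labels for each measurement (True meaning "belongs to the analyzed model"); the function name `tpr_fpr` and the NaN convention for an empty class are illustrative choices, not part of the chapter.

```python
from typing import Sequence, Tuple


def tpr_fpr(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (TPR, FPR) for one fixed motion model.

    truth[k]     is True if measurement k really belongs to the model.
    predicted[k] is True if the classifier assigned measurement k to the model.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    true_pos = sum(1 for t, p in zip(truth, predicted) if t and p)
    false_pos = sum(1 for t, p in zip(truth, predicted) if (not t) and p)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives

    # A rate is undefined (NaN) when its reference class is empty.
    tpr = true_pos / positives if positives else float("nan")
    fpr = false_pos / negatives if negatives else float("nan")
    return tpr, fpr
```

For example, with three measurements that belong to the model (two of them recognized) and two that do not (one of them wrongly accepted), the sketch yields a TPR of 2/3 and an FPR of 1/2.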
The remainder of this work is organized as follows: the second section deals with the problem definition, both in general and particularized for the chosen approach. The third section presents the general algorithm in detail, followed
by three sections detailing the three phases of that algorithm when the uniform movement
model is applied: the fourth section presents the different alternatives for the domain
transformation and chooses among them the ones included in the final algorithm, the fifth
presents some representative machine learning techniques that can be applied to obtain the
classification results, and the sixth introduces the filtering refinement over the previous
results, leading to the segment synthesis process. The seventh section covers the results
obtained with the explained phases, determining the machine learning technique used and
providing the segmentation results, both numerically and graphically, to provide the reader
with easy validation tools for the presented algorithm. Finally, a conclusions section based
on these results is presented.
2 Problem definition
2.1 General problem definition
As presented in the introduction, each analyzed trajectory ($T^i$) is composed of a
collection of sensor reports (or measurements), which are defined by the following vector:
$\vec{x}_j^i = (x_j^i, y_j^i, t_j^i, R_j^i), \quad j \in \{1, \dots, N^i\}$ (1)
where $j$ is the measurement number, $i$ the trajectory number, $N^i$ is the number of
measurements in a given trajectory, $x_j^i, y_j^i$ are the stereographic projections of the
measurement, $t_j^i$ is the detection time and $R_j^i$ is the covariance matrix (representing
the error introduced by the measuring device). From this problem definition, our objective is
to divide our trajectory into a series of segments ($B_k^i$), according to our estimated MOF.
This is performed as an off-line process (meaning that we may use past and future
information from our trajectory). The segmentation problem can be formalized using the
following notation:
$T^i = \bigcup_k B_k^i, \quad B_k^i = \{\vec{x}_j^i\}, \quad j \in \{k_{min}, \dots, k_{max}\}$ (2)
In the general definition of this problem, these segments are obtained by comparison
with a test model applied over different windows (aggregations) of measurements coming
from our trajectory in order to obtain a fitness value; the segmentation operation is then
decided as a function of that fitness value (Mann et al. 2002), (Garcia et al., 2006).
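As an illustration of the notation in (1) and (2), the sketch below stores one sensor report as a small record and represents each segment by its first and last measurement index. The names `Measurement`, `Segment` and `is_partition`, the zero-based indexing, and the requirement that segments be contiguous and non-overlapping are assumptions made for this example; equation (2) itself only states that the union of the segments is the trajectory.

```python
from dataclasses import dataclass
from typing import List, Tuple

Matrix2x2 = Tuple[Tuple[float, float], Tuple[float, float]]


@dataclass(frozen=True)
class Measurement:
    """One sensor report: stereographic position, detection time, covariance."""
    x: float
    y: float
    t: float
    R: Matrix2x2


# A segment is stored as inclusive (k_min, k_max) indices into the trajectory.
Segment = Tuple[int, int]


def is_partition(segments: List[Segment], n: int) -> bool:
    """True if the segments cover indices 0..n-1 exactly once, with no gaps."""
    expected_start = 0
    for k_min, k_max in sorted(segments):
        if k_min != expected_start or k_max < k_min:
            return False
        expected_start = k_max + 1
    return expected_start == n
```

A trajectory is then simply a list of `Measurement` records ordered by detection time, and a candidate segmentation can be sanity-checked with `is_partition` before it is evaluated.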
Offline segmentation algorithms may be divided into different approaches. One possible
approach is to consider the whole data from the trajectory, with the obtained segments as
the problem’s basic division unit (a global approach), where the basic operation of the
segmentation algorithm is the division of the trajectory into those segments (examples of
this approach are the bottom-up and top-down families (Keogh et al., 2003)). In the ATC
domain, there have been approaches based on a direct adaptation of online techniques,
basically combining the results of the forward application of the algorithm (the pure online
technique) with its backward application (applying the online technique to the time series
in reverse order of the measurements’ detection times) (Garcia et al., 2006). An alternative
is to obtain a different classification value for each of the trajectory’s measurements (using
their local information) and to obtain the segments as a synthesized solution built upon
that classification (basically, by joining adjacent measurements sharing the same MM into a common segment). This approach allows several refinements to be applied to the classification results before the final synthesis is performed, and is thus the one explored in the solution presented in this chapter.
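The synthesis step described above, joining adjacent measurements that share the same classification, can be sketched in a few lines. This is only an illustration: the function name `synthesize_segments`, the zero-based inclusive indices and the use of plain string labels are assumptions of the example.

```python
from itertools import groupby
from typing import Hashable, List, Sequence, Tuple


def synthesize_segments(labels: Sequence[Hashable]) -> List[Tuple[int, int, Hashable]]:
    """Join runs of adjacent, equally labelled measurements into segments.

    Returns (first_index, last_index, label) triples, indices inclusive.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((start, start + length - 1, label))
        start += length
    return segments
```

Because the segments are derived from the per-measurement labels, any correction applied to those labels beforehand changes the resulting segments without touching this step.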
2.2 Local approach problem definition
We have presented our problem as an offline process, meaning that we may use information both from the past and from the future. Introducing this fact into our local representation, we restrict that information to a certain local segment around the measurement which we would like to classify. These intervals are centered on that measurement, but their boundaries can be expressed either in number of measurements (3) or according to their detection time values (4):
$B(\vec{x}_m^i) = \{\vec{x}_j^i\}, \quad j \in [m-p, \dots, m, \dots, m+p]$ (3)
$B(\vec{x}_m^i) = \{\vec{x}_j^i\}, \quad t_j^i \in \{t_m^i - p, \dots, t_m^i, \dots, t_m^i + p\}$ (4)
Once we have chosen a window around our current measurement, we have to apply a function to that segment in order to obtain its transformed value. This general classification function $F(\vec{x}_j^i)$, using measurement boundaries, may be represented with the following formulation:
$F(\vec{x}_m^i) = F(\vec{x}_m^i \mid T^i) \Rightarrow F(\vec{x}_m^i \mid B(\vec{x}_m^i)) = F_p(\vec{x}_{m-p}^i, \dots, \vec{x}_m^i, \dots, \vec{x}_{m+p}^i)$ (5)
From this formulation of the problem we can already see some of the choices available: how to choose the segments (according to (3) or (4)), which classification function to apply in (5) and how to perform the final segment synthesis. Figure 2 shows an example of the local approach for trajectory segmentation.
Fig. 2. Overview of the local approach for trajectory segmentation
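The two ways of bounding the local segment, by number of measurements as in (3) or by detection time as in (4), can be sketched as follows. The half-width is called `p` here; the function names, the zero-based indices, the clipping of windows at the ends of the trajectory and the requirement that detection times be sorted in ascending order are assumptions of this example rather than choices stated in the text.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def window_by_count(n: int, m: int, p: int) -> List[int]:
    """Indices j with m - p <= j <= m + p, clipped to the valid range 0..n-1."""
    return list(range(max(0, m - p), min(n - 1, m + p) + 1))


def window_by_time(times: Sequence[float], m: int, p: float) -> List[int]:
    """Indices j with t_m - p <= t_j <= t_m + p.

    `times` must be sorted in ascending order.
    """
    first = bisect_left(times, times[m] - p)
    last = bisect_right(times, times[m] + p)
    return list(range(first, last))
```

With irregular sampling the two definitions differ: a count-based window always holds the same number of measurements away from the trajectory ends, whereas a time-based window holds however many measurements fall inside the interval.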
3 General algorithm proposal
As presented in the introduction, we will consider three basic MM’s and classify our
measurements individually according to them (Guerrero & Garcia, 2008). If a measurement
is classified as unknown, it will be included in the input data for the next model’s analysis.
This general algorithm introduces a design criterion based on the previously introduced
TPR and FPR, which relate to the type II and type I errors respectively: the FPR is the type I
error rate, and 1 − TPR is the type II error rate (Allchin, 2001). The design criterion is to
keep the FPR as low as possible, since measurements already assigned to a wrong model
will not be analyzed by the following ones (and thus will remain wrongly classified, leading
to a poorer trajectory reconstruction). The proposed order for the analysis of the MM’s is the
same in which they have been introduced, and the choice is based on how accurately we
can represent each of them.
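The model-by-model cycle can be sketched as a cascade in which each model only sees the measurements that the previous models left unclassified. This is a simplified illustration working on measurement indices rather than on constructed segments; the function name `cascade`, the `(name, classify)` pairs and the "unknown" label for leftovers are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A classifier receives measurement indices and returns one bool per index:
# True means "this measurement belongs to my motion model".
Classifier = Callable[[List[int]], List[bool]]


def cascade(n: int, models: Sequence[Tuple[str, Classifier]]) -> Dict[int, str]:
    """Apply the models in order; each one only sees still-unclassified data."""
    remaining = list(range(n))
    assigned: Dict[int, str] = {}
    for name, classify in models:
        if not remaining:
            break
        belongs = classify(remaining)
        still_unknown = []
        for index, accepted in zip(remaining, belongs):
            if accepted:
                # Accepted measurements are final and never revisited,
                # which is why a low false positives rate matters.
                assigned[index] = name
            else:
                still_unknown.append(index)
        remaining = still_unknown
    for index in remaining:
        assigned[index] = "unknown"
    return assigned
```

The sketch makes the design criterion visible: a false positive in an early model is never passed on to a later one, so it cannot be corrected further down the cascade.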
In the local approach problem definition section, the segmentation problem was divided
into two different sub-problems: the definition of the $F(\vec{x}_m^i)$ function (to perform
measurement classification) and a final segment synthesis over that classification.
According to the phases presented in the introduction, we divide the definition of the
classification function $F(\vec{x}_j^i)$ into two different tasks: a domain transformation
$Dt(\vec{x}_j^i)$ (domain specific, which defines the first phase of our algorithm) and a final
classification $Cl(Dt(\vec{x}_j^i))$ (based on general classification algorithms, represented by
the data mining techniques introduced in the second phase). The final synthesis over the
classification results includes the refinement of that classification introduced by the
filtering process and the actual construction of the output segments (third phase of the
proposed algorithm).
The domain transformation $Dt(\vec{x}_j^i)$ from the initial data in the common reference
frame must deal with the following issues: segmentation (the decision of using an
independent classification for each measurement or treating segments as an indivisible
unit); definition of the boundaries of the segments, which involves segment extension
(whether the segments are defined by number of points or according to their detection time
values) and segment resolution (the choice of the length of those segments, and how it
affects our results); domain transformations (the different possible models used in order to
obtain an accurate classification in the following phases); and threshold choosing technique
(obtaining a value for a threshold in order to pre-classify the measurements in the
transformed domain).
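One hypothetical way to choose such a threshold, in line with the design criterion of keeping the FPR low, is to pick the largest threshold whose FPR on labelled (simulated) data stays under a target. The sketch below is not the technique used in this chapter: it assumes that the transformed attribute is a residue-like value where small values indicate a good fit to the model, and the names `choose_threshold` and `max_fpr` are illustrative.

```python
from typing import Optional, Sequence


def choose_threshold(values: Sequence[float], truth: Sequence[bool],
                     max_fpr: float) -> Optional[float]:
    """Largest threshold T such that accepting `value <= T` keeps FPR <= max_fpr.

    values[k] is the transformed attribute of measurement k (small = good fit).
    truth[k]  is True if measurement k really belongs to the model.
    Returns None when no candidate threshold satisfies the constraint.
    """
    if len(values) != len(truth):
        raise ValueError("values and truth must have the same length")
    negatives = sum(1 for t in truth if not t)
    best = None
    for candidate in sorted(set(values)):
        false_pos = sum(1 for v, t in zip(values, truth) if v <= candidate and not t)
        fpr = false_pos / negatives if negatives else 0.0
        if fpr > max_fpr:
            break  # FPR can only grow as the threshold increases
        best = candidate
    return best
```

Raising `max_fpr` admits more measurements into the model (a higher TPR) at the cost of more false positives, which is the trade-off the design criterion resolves in favour of a low FPR.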
The second phase introduces a set of machine learning techniques to determine
whether each of the measurements belongs to the analyzed model or not, based on the
pre-classifications obtained in the first phase. In this second phase we have to choose a
$Cl(Dt(\vec{x}_j^i))$ technique, along with its configuration parameters, to be included in the
algorithm proposal. The considered techniques are decision trees (C4.5, (Quinlan, 1993)),
clustering (EM, (Dellaert, 2002)), neural networks (multilayer perceptron, (Gurney, 1997))
and Bayesian nets (Jensen & Graven-Nielsen, 2007) (along with the simplified naive Bayes
approach (Rish, 2001)).
Finally, the third phase (segment synthesis) proposes a filter, based on domain
knowledge, to reanalyze the trajectory classification results and correct those values which
may not follow this knowledge (essentially, based on the required smoothness in MM
changes). To obtain the final output for the model analysis, the isolated measurements are
joined according to their classification into the final segments of the algorithm.
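As a rough illustration of how a smoothness requirement can correct isolated misclassifications, the sketch below applies a sliding majority vote to the per-measurement labels. It is a generic stand-in, not the filter proposed in this chapter; the function name `majority_filter`, the `half_width` parameter and the tie rule (keep the original label) are assumptions.

```python
from typing import List, Sequence


def majority_filter(labels: Sequence[bool], half_width: int = 1) -> List[bool]:
    """Replace each label by the majority label of its neighbourhood.

    The neighbourhood spans `half_width` measurements on each side and is
    clipped at the ends of the trajectory. Ties keep the original label.
    """
    n = len(labels)
    smoothed = []
    for m in range(n):
        window = labels[max(0, m - half_width):min(n, m + half_width + 1)]
        votes = sum(1 for label in window if label)
        if 2 * votes > len(window):
            smoothed.append(True)
        elif 2 * votes < len(window):
            smoothed.append(False)
        else:
            smoothed.append(labels[m])
    return smoothed
```

A single "does not belong" label surrounded by "belongs" labels is flipped, while a genuine change of motion model, which persists over several measurements, is preserved.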
The formalization of these phases and the subsequent changes performed to the data is presented in the following vectors, representing the input and output data for our three processes:
The following aspects are the key parameters for this phase, presented along with the different alternatives compared for them (it must be noted that the possibilities compared here are not the only possible ones, but representative examples of different possible approaches):
• Transformation function: correlation coefficient / best linear unbiased estimator (BLUE) residue
• Segmentation granularity: segment study / independent study
• Segment extension (time / samples) and segment resolution (length of the segment, using the boundary units imposed by the previous decision)
• Threshold choosing technique: choice of a threshold to classify data in the transformed domain
Each of these parameters requires an individual validation in order to build the final algorithm tested in the experimental section, so each of them is analyzed in its own section.
4.1 Transformation function analysis
The choice of transformation function is probably the most crucial decision in this first phase of our algorithm. The comparison presented here tries to determine whether introducing noise information (in the form of covariance matrices) yields a real increase in accuracy. This section compares a correlation coefficient (CC) (Meyer, 1970), a general statistic with no noise information, with a BLUE residue (Kay, 1993), which introduces the noise of the measuring process. This analysis was originally proposed in (Guerrero & Garcia, 2008). The equations for the CC statistic are the following:
Trang 15Air traffic trajectories segmentation based on time-series sensor data 37
3 General algorithm proposal
As presented in the introduction section, we will consider three basic MM’s and classify our
measurements individually according to them (Guerrero & Garcia, 2008) If a measurement
is classified as unknown, it will be included in the input data for the next model’s analysis
This general algorithm introduces a design criterion based on the introduced concepts of
TPR and FPR, respectively equivalent to the type I and type II errors (Allchin, 2001) The
design criterion will be to keep a FPR as low as possible, understanding that those
measurements already assigned to a wrong model will not be analyzed by the following
ones (and thus will remain wrongly classified, leading to a poorer trajectory reconstruction)
The proposed order for this analysis of the MM’s is the same in which they have been
introduced, and the choice is based on how accurately we can represent each of them
In the local approach problem definition section, the segmentation problem was divided
into two different sub-problems: the definition of the $F(\vec{x}_i)$ function (to perform
measurement classification) and a final segment synthesis over that classification.
According to the different phases presented in the introduction section, we will divide the
definition of the classification function $F(\vec{x}_i^{\,j})$ into two different tasks: a domain
transformation $Dt(\vec{x}_i^{\,j})$ (domain specific, which defines the first phase of our algorithm) and
a final classification $Cl(Dt(\vec{x}_i^{\,j}))$ (based on general classification algorithms, represented by
the data mining techniques which are introduced in the second phase). The final synthesis
over the classification results includes the refinement over that classification introduced by
the filtering process and the actual construction of the output segment (third phase of the
proposed algorithm).
The introduction of the domain transformation $Dt(\vec{x}_i^{\,j})$ from the initial data in the common
reference frame must deal with the following issues: segmentation (which covers the
decision of using an independent classification for each measurement or treating segments as
an indivisible unit); definition of the boundaries of the segments, which involves segment
extension (whether segments are defined by number of points or according
to their detection time values) and segment resolution (the choice of the length
of those segments, and how it affects our results); domain transformations (the different
possible models used in order to obtain an accurate classification in the following phases);
and threshold choosing technique (obtaining a value for a threshold in order to pre-classify
the measurements in the transformed domain).
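
The two segment extension alternatives (by number of points or by detection time) can be illustrated with a small sketch. It assumes, purely for illustration, that a segment is a window centered on the measurement under analysis and that detection times are sorted in ascending order; the function names `window_by_samples` and `window_by_time` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def window_by_samples(n_points: int, center: int, half_width: int) -> range:
    """Indices of a segment holding up to `half_width` points on each side."""
    start = max(0, center - half_width)
    stop = min(n_points, center + half_width + 1)
    return range(start, stop)


def window_by_time(times: Sequence[float], center: int, half_span: float) -> range:
    """Indices of all points detected within `half_span` of the center point.

    `times` must be sorted in ascending order (assumption).
    """
    lo = bisect_left(times, times[center] - half_span)
    hi = bisect_right(times, times[center] + half_span)
    return range(lo, hi)


# Irregularly sampled detection times (made-up values, in seconds).
toy_times = [0.0, 4.0, 8.0, 9.0, 10.0, 20.0]
by_samples = list(window_by_samples(len(toy_times), 1, 1))
by_time = list(window_by_time(toy_times, 1, 2.0))
```

With irregular sampling the two definitions differ: the sample-based window around the second point always contains three points, whereas the time-based window of two seconds on each side contains only that point. The segment resolution corresponds to `half_width` or `half_span`, depending on the boundary unit chosen.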
The second phase introduces a set of machine learning techniques to determine
whether each of the measurements belongs to the analyzed model or not, based on the
pre-classifications obtained in the first phase. In this second phase we will have to choose a
$Cl(Dt(\vec{x}_i^{\,j}))$ technique, along with its configuration parameters, to be included in the
algorithm proposal. The considered techniques are decision trees (C4.5, (Quinlan, 1993)),
clustering (EM, (Dellaert, 2002)), neural networks (multilayer perceptron, (Gurney, 1997))
and Bayesian nets (Jensen & Graven-Nielsen, 2007) (along with the simplified naive Bayes
approach (Rish, 2001)).
Finally, the third phase (segment synthesis) will propose a filter, based on domain
knowledge, to reanalyze the trajectory classification results and correct those values which
do not agree with this knowledge (essentially, the required smoothness in MM
changes). To obtain the final output for the model analysis, the isolated measurements will
be joined according to their classification into the final segments of the algorithm.
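
As a rough illustration of this third phase, the sketch below smooths a per-measurement boolean classification and then joins consecutive equal labels into segments. The sliding majority vote is an assumption standing in for the domain-knowledge filter, whose exact form is not given here; `smooth_labels` and `build_segments` are hypothetical names.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def smooth_labels(labels: Sequence[bool], half_width: int = 1) -> List[bool]:
    """Replace each label by the majority vote of its neighborhood.

    Isolated labels that contradict their neighbors are flipped, which
    enforces smoothness in model changes. Ties resolve to False.
    """
    n = len(labels)
    smoothed = []
    for i in range(n):
        window = labels[max(0, i - half_width): min(n, i + half_width + 1)]
        smoothed.append(2 * sum(window) > len(window))
    return smoothed


def build_segments(labels: Sequence[bool]) -> List[Tuple[int, int, bool]]:
    """Join consecutive equal labels into (first_index, last_index, label)."""
    segments = []
    start = 0
    for value, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((start, start + length - 1, value))
        start += length
    return segments


raw = [True, True, False, True, True, False, False, False]
segments = build_segments(smooth_labels(raw))
```

Here the isolated `False` at index 2 is corrected by the filter, so the output contains two segments instead of four.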
The formalization of these phases and the subsequent changes performed to the data are presented in the following vectors, representing the input and output data for our three processes:
The following aspects are the key parameters for the first phase, presented along with the different alternatives compared for them (the possibilities compared here are not the only possible ones, but representative examples of different approaches):
- Transformation function: correlation coefficient / best linear unbiased estimator (BLUE) residue
- Segmentation granularity: segment study / independent study
- Segment extension (time / samples) and segment resolution (length of the segment, using the boundary units imposed by the previous decision)
- Threshold choosing technique: choice of a threshold to classify data in the transformed domain
Each of these parameters requires an individual validation in order to build the final algorithm tested in the experimental section. Each of them is analyzed in its own section.
4.1 Transformation function analysis
The choice of transformation function is probably the most crucial decision in this first phase of our algorithm. The comparison presented tries to determine whether there is a real accuracy increase from introducing noise information (in the form of covariance matrices). This section compares a correlation coefficient (CC) (Meyer, 1970), a general statistic with no noise information, with a BLUE residue (Kay, 1993), which introduces the noise in the measuring process. This analysis was originally proposed in (Guerrero & Garcia, 2008). The equations for the CC statistic are the following: