Data Fusion for Remote-Sensing Applications
Anne H.S. Solberg
CONTENTS
11.1 Introduction 250
11.2 The "Multi" Concept in Remote Sensing 250
11.2.1 The Multi-Spectral or Multi-Frequency Aspect 250
11.2.2 The Multi-Temporal Aspect 251
11.2.3 The Multi-Polarization Aspect 251
11.2.4 The Multi-Sensor Aspect 251
11.2.5 Other Sources of Spatial Data 251
11.3 Multi-Sensor Data Registration 252
11.4 Multi-Sensor Image Classification 254
11.4.1 A General Introduction to Multi-Sensor Data Fusion for Remote-Sensing Applications 254
11.4.2 Decision-Level Data Fusion for Remote-Sensing Applications 254
11.4.3 Combination Schemes for Combining Classifier Outputs 256
11.4.4 Statistical Multi-Source Classification 257
11.4.5 Neural Nets for Multi-Source Classification 257
11.4.6 A Closer Look at Dempster–Shafer Evidence Theory for Data Fusion 258
11.4.7 Contextual Methods for Data Fusion 259
11.4.8 Using Markov Random Fields to Incorporate Ancillary Data 260
11.4.9 A Summary of Data Fusion Architectures 260
11.5 Multi-Temporal Image Classification 260
11.5.1 Multi-Temporal Classifiers 263
11.5.1.1 Direct Multi-Date Classification 263
11.5.1.2 Cascade Classifiers 263
11.5.1.3 Markov Chain and Markov Random Field Classifiers 264
11.5.1.4 Approaches Based on Characterizing the Temporal Signature 264
11.5.1.5 Other Decision-Level Approaches to Multi-Temporal Classification 264
11.6 Multi-Scale Image Classification 264
11.7 Concluding Remarks 266
11.7.1 Fusion Level 267
11.7.2 Selecting a Multi-Sensor Classifier 267
11.7.3 Selecting a Multi-Temporal Classifier 267
11.7.4 Approaches for Multi-Scale Data 267
Acknowledgment 267
References 267
11.1 Introduction

Combining data from several sensors can give a more consistent interpretation of the scene compared to an interpretation based on data from a single sensor.
This development opens up the potential for a significant change in how earth observation data are analyzed. Traditionally, such data have been analyzed by means of a single satellite image at a time. The emerging, exceptionally good coverage in space, time, and the spectrum opens the way for analysis of time series of data, for combining different sensor types, for combining imagery at different scales, and for better integration with ancillary data and models. Thus, data fusion to combine data from several sources is becoming increasingly important in many remote-sensing applications.

This chapter provides a tutorial on data fusion for remote-sensing applications. The main focus is on methods for multi-source image classification, but separate sections on multi-sensor image registration, multi-scale classification, and multi-temporal image classification are also included. The remainder of the chapter is organized as follows: Section 11.2 presents the "multi" concept in remote sensing. Multi-sensor data registration is treated in Section 11.3. Classification strategies for multi-sensor applications are discussed in Section 11.4. Multi-temporal image classification is discussed in Section 11.5, while multi-scale approaches are discussed in Section 11.6. Concluding remarks are given in Section 11.7.
11.2 The "Multi" Concept in Remote Sensing
The variety of different sensors already available or being planned creates a number of possibilities for data fusion to provide better capabilities for scene interpretation. This is referred to as the "multi" concept in remote sensing. The "multi" concept includes multi-temporal, multi-spectral or multi-frequency, multi-polarization, multi-scale, and multi-sensor image analysis. In addition to the concepts discussed here, imaging using multiple incidence angles can also provide additional information [1,2].
11.2.1 The Multi-Spectral or Multi-Frequency Aspect
The measured backscatter values for an area vary with the wavelength band. A land-use category will give different image signals depending on the frequency used, and by using different frequencies, a spectral signature that characterizes the land-use category can be found. A description of the scattering mechanisms for optical sensors can be found in Ref. [3], while Ref. [4] contains a thorough discussion of the backscattering mechanisms in the microwave region. Multi-spectral optical sensors have demonstrated this effect for a substantial number of applications over several decades; they are now followed by high-spatial-resolution multi-spectral sensors such as Ikonos and QuickBird, and by hyperspectral sensors on satellite platforms (e.g., Hyperion).
11.2.2 The Multi-Temporal Aspect
The term multi-temporal refers to the repeated imaging of an area over a period of time. By analyzing an area through time, it is possible to develop interpretation techniques based on an object's temporal variations and to discriminate different pattern classes accordingly. Multi-temporal imagery allows the study of how the backscatter of different areas varies with time, weather conditions, and seasons. It also allows monitoring of processes that change over time.

The principal advantage of multi-temporal analysis is the increased amount of information for the study area. The information provided by a single image is, for certain applications, not sufficient to properly distinguish between the desired pattern classes. This limitation can sometimes be resolved by examining the pattern of temporal changes in the spectral signature of an object. This is particularly important for vegetation applications. Multi-temporal image analysis is discussed in more detail in Section 11.5.

11.2.3 The Multi-Polarization Aspect

The multi-polarization aspect is related to microwave image data. The polarization of an electromagnetic wave refers to the orientation of the electric field during propagation. A review of the theory and features of polarization is given in Refs. [5,6].
11.2.4 The Multi-Sensor Aspect
With an increasing number of operational and experimental satellites, information about a phenomenon can be captured using different types of sensors.
Fusion of images from different sensors requires some additional preprocessing and poses certain difficulties that are not solved by traditional image classifiers. Each sensor has its own characteristics, and the captured image usually contains various artifacts that should be corrected or removed. The images also need to be geometrically corrected and co-registered. Because multi-sensor images are often not acquired on the same date, the multi-temporal nature of the data must also often be taken into account.
Figure 11.1 shows a simple visualization of two synthetic aperture radar (SAR) images of an oil spill in the Baltic Sea, imaged by the ENVISAT ASAR sensor and the Radarsat SAR sensor. The images were taken a few hours apart. During this time, the oil slick drifted to some extent and became more irregular in shape.
11.2.5 Other Sources of Spatial Data
The preceding sections have addressed spatial data in the form of digital images obtained from remote-sensing satellites. For most regions, additional information is available in the form of various kinds of maps covering, for example, topography, ground cover, and elevation. Frequently, maps contain spatial information not obtainable from a single remotely sensed image. Such maps represent a valuable information resource in addition to the satellite images. To integrate map information with a remotely sensed image, the map must be available in digital form, for example, in a GIS.
11.3 Multi-Sensor Data Registration
A prerequisite for data fusion is that the data are co-registered and geometrically and radiometrically corrected. Data co-registration can be simple if the data are georeferenced. In that case, the co-registration consists merely of resampling the images to a common map projection. However, an image-matching step is often necessary to obtain subpixel accuracy in the matching. Complicating factors for multi-sensor data are the different appearances of the same object imaged by different sensors, and nonrigid changes in object position between multi-temporal images.
FIGURE 11.1 (See color insert following page 240.) Example of multi-sensor visualization of an oil spill in the Baltic Sea, created by combining an ENVISAT ASAR image with a Radarsat SAR image taken a few hours later.
The image resampling can be done at various stages of the image interpretation process. Resampling an image affects the spatial statistics of neighboring pixels, which is of importance for many radar image feature extraction methods that use speckle statistics or texture. When fusing a radar image with other data sources, a solution might be to transform the other data sources to the geometry of the radar image. When fusing multi-temporal radar images, an alternative might be to use images from the same image mode of the sensor, for example, only ascending scenes within a given incidence angle range. If this is not possible and the spatial information from the original geometry is important, the data can be fused with resampling done after classification by the sensor-specific classifiers.
An image-matching step may be necessary to achieve subpixel accuracy in the co-registration even if the data are georeferenced. A survey of image registration methods is given by Zitova and Flusser [7]. A full image registration process generally consists of four steps (a small code sketch illustrating the matching and resampling steps follows the list):
• Feature extraction. This is the step where regions, edges, contours, and other features that can be used to represent tie-points in the set of images to be matched are extracted. This is a crucial step, as the registration accuracy can be no better than what is achieved for the tie-points.
Feature extraction can be grouped into area-based methods [8,9], feature-based methods [10–12], and hybrid approaches [7]. In area-based methods, the gray levels of the images are used directly for matching, often by statistical comparison of pixel values in small windows; these methods are best suited for images from the same or highly similar sensors. Feature-based methods are application-dependent, as the type of features to use as tie-points needs to be tailored to the application. Features can be extracted either from the spatial domain (edges, lines, regions, intersections, and so on) or from the frequency domain (e.g., wavelet features). Spatial features can perform well for matching data from heterogeneous sensors, for example, optical and radar images. Hybrid approaches combine area-based and feature-based techniques, for example correlation-based matching together with an edge-based approach, and they are useful for matching data from heterogeneous sensors.
• Feature matching. In this step, the correspondence between the tie-points or features in the sensed image and the reference image is found. Area-based methods perform the matching using correlation, Fourier-transform methods, or optical flow [13]. Fourier-transform methods exploit the equivalence between correlation in the spatial domain and multiplication in the Fourier domain to perform the matching in the Fourier domain [10,11]. Correlation-based methods are best suited for data from similar sensors. The optical flow approach involves estimating the relative motion between two images and is a broad approach. It is commonly used in video analysis, but only a few studies have used it in remote-sensing applications [29,30].
• Transformation selection. This concerns the choice of mapping function and the estimation of its parameters based on the established feature correspondence. The affine transform model is commonly used for remote-sensing applications, where the images normally are preprocessed for geometrical correction, a step that justifies the use of affine transforms.
• Image resampling. In this step, the image is transformed by means of the mapping function. Image values at non-integer coordinates are computed by the appropriate interpolation technique. Normally, either nearest neighbor or bilinear interpolation is used. Nearest neighbor interpolation is applicable when no new pixel values should be introduced. Bilinear interpolation is often a good trade-off between accuracy and computational complexity compared to cubic or higher-order interpolation.
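The following is a minimal sketch of the matching and resampling steps for the simple case where the two images are already on a common grid and differ only by a translation. It uses phase correlation (an area-based matching technique) to estimate the offset and bilinear interpolation for the resampling; the function names and the use of NumPy/SciPy are illustrative assumptions, not the chapter's own implementation.

```python
import numpy as np
from scipy import ndimage

def estimate_shift(reference, sensed):
    """Return the (row, col) shift that aligns `sensed` to `reference`,
    estimated by phase correlation (an area-based matching method)."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(sensed))
    cross_power /= np.abs(cross_power) + 1e-12      # keep phase information only
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peak indices beyond half the image size wrap around to negative shifts
    return np.array([p if p <= s // 2 else p - s
                     for p, s in zip(peak, correlation.shape)], dtype=float)

def register(reference, sensed):
    """Resample `sensed` onto the grid of `reference` (bilinear, order=1)."""
    return ndimage.shift(sensed, estimate_shift(reference, sensed),
                         order=1, mode="nearest")

# Toy usage: a synthetic image shifted by (3, -2) pixels
reference = np.random.rand(128, 128)
sensed = np.roll(reference, (3, -2), axis=(0, 1))
aligned = register(reference, sensed)   # aligned is close to reference
```

In practice, multi-sensor registration normally requires tie-point features and a full affine model rather than a pure translation, but the matching-then-resampling structure is the same.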
11.4 Multi-Sensor Image Classification
The literature on data fusion in the computer vision and machine intelligence domains is substantial. For an extensive review of data fusion, we recommend the book by Abidi and Gonzalez [16]. Multi-sensor architectures, sensor management, and the design of sensor setups are also thoroughly discussed in Ref. [17].
11.4.1 A General Introduction to Multi-Sensor Data Fusion for Remote-Sensing Applications
Fusion can be performed at the signal, pixel, feature, or decision level of representation (see Figure 11.2). In signal-based fusion, signals from different sensors are combined to create a new signal with a better signal-to-noise ratio than the original signals [18]. Techniques for signal-level data fusion typically involve classic detection and estimation methods [19]. If the data are noncommensurate, they must be fused at a higher level.
Pixel-based fusion consists of merging information from different images on a pixel-by-pixel basis to improve the performance of image processing tasks such as segmentation [20]. Feature-based fusion consists of merging features extracted from different signals or images [21]. In feature-level fusion, features are extracted from multiple sensor observations, combined into a concatenated feature vector, and classified using a standard classifier. Symbol-level or decision-level fusion consists of merging information at a higher level of abstraction. Based on the data from each single sensor, a preliminary classification is performed. Fusion then consists of combining the outputs from the preliminary classifications.

The main approaches to data fusion in the remote-sensing literature are statistical methods [22–25], Dempster–Shafer theory [26–28], and neural networks [22,29]. We will discuss each of these approaches in the following sections. The best level and methodology for a given remote-sensing application depend on several factors: the complexity of the classification problem, the available data set, and the goal of the analysis.
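As a concrete illustration of feature-level fusion as described above, the sketch below concatenates per-sensor feature vectors and feeds them to a standard classifier. The feature dimensions and the choice of a scikit-learn random forest are illustrative assumptions; any conventional classifier could be substituted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-pixel features assumed to be extracted from each sensor beforehand,
# e.g. optical band values and SAR texture measures (illustrative only).
n_pixels = 10_000
optical_features = np.random.rand(n_pixels, 4)      # sensor 1
sar_features = np.random.rand(n_pixels, 3)          # sensor 2
labels = np.random.randint(0, 5, n_pixels)          # training class labels

# Feature-level fusion: concatenate the per-sensor features for each pixel
fused_features = np.concatenate([optical_features, sar_features], axis=1)

classifier = RandomForestClassifier(n_estimators=100).fit(fused_features, labels)
predicted_classes = classifier.predict(fused_features)
```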
11.4.2 Decision-Level Data Fusion for Remote-Sensing Applications
In the general multi-sensor fusion case, we have a set of images X1, ..., XP from P sensors. The class labels of the scene are denoted C. The Bayesian approach is to assign each pixel to the class that maximizes the posterior probability P(C | X1, ..., XP).
FIGURE 11.2 (See color insert following page 240.) An illustration of data fusion on different levels: pixel-level, feature-level, and decision-level fusion.
For decision-level fusion, the following conditional independence assumption is used:

P(X1, ..., XP | C) = P(X1 | C) · · · P(XP | C)

This assumption means that the measurements from the different sensors are considered to be conditionally independent.
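Under this assumption the posterior factorizes as P(C | X1, ..., XP) ∝ P(C)^(1-P) · P(C | X1) · · · P(C | XP), so decision-level fusion can be carried out by multiplying the sensor-specific posteriors and renormalizing. A minimal NumPy sketch, assuming each sensor-specific classifier outputs per-pixel class posteriors:

```python
import numpy as np

def fuse_posteriors(sensor_posteriors, prior):
    """Decision-level fusion under conditional independence:
    P(C | X1..XP) is proportional to P(C)**(1 - P) * prod_p P(C | Xp).

    sensor_posteriors: list of (n_pixels, n_classes) arrays, one per sensor
    prior:             (n_classes,) array of class prior probabilities
    """
    P = len(sensor_posteriors)
    fused = np.prod(np.stack(sensor_posteriors), axis=0) * prior ** (1 - P)
    return fused / fused.sum(axis=1, keepdims=True)   # renormalize per pixel

# Toy example: two sensors, three classes, one pixel
p1 = np.array([[0.7, 0.2, 0.1]])
p2 = np.array([[0.5, 0.4, 0.1]])
print(fuse_posteriors([p1, p2], prior=np.full(3, 1 / 3)))
```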
11.4.3 Combination Schemes for Combining Classifier Outputs
In the data fusion literature [30], various alternative methods have been proposed for combining the outputs from the sensor-specific classifiers by weighting the influence of each sensor. This is termed consensus theory. The weighting schemes can be linear, logarithmic, or of a more general form (see Figure 11.3).
The simplest choice, the linear opinion pool (LOP), combines the sensor-specific posteriors as a weighted sum, while the logarithmic opinion pool (LOGP) uses a weighted product. The weights are supposed to represent the sensor's reliability. The weights can be selected by heuristic methods based on their goodness [3], for example by weighting a sensor's influence by a factor proportional to its overall classification accuracy on the training data set. An alternative approach for a linear combination pool is to use a genetic algorithm [32].
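In their standard forms (assumed here; λp denotes the weight assigned to sensor p, and the weights typically sum to one), the two pools are

LOP:   P(C \mid X_1, \ldots, X_P) = \sum_{p=1}^{P} \lambda_p \, P(C \mid X_p)

LOGP:  P(C \mid X_1, \ldots, X_P) \propto \prod_{p=1}^{P} P(C \mid X_p)^{\lambda_p}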
An approach using a neural net to optimize the weights is presented in Ref. [30]. Yet another possibility is to choose the weights in such a way that they not only weigh the individual data sources but also the classes within the data sources [33].
Benediktsson et al. [30,31] use a multi-layer perceptron (MLP) neural network to combine the class-conditional probability densities P(Xp | C). This allows a more flexible, nonlinear combination scheme. They compare the classification accuracy of MLPs to LOPs and LOGPs, and find that the neural net combination performs best.
Benediktsson and Sveinsson [34] provide a comparison of different weighting schemes for the LOP and LOGP, genetic algorithms with and without pruning, parallel consensus neural nets, and conjugate gradient backpropagation (CGBP) nets on a single multi-source data set. The best results were achieved by using a CGBP net to optimize the weights in an LOGP.
A study that contradicts the weighting of different sources is found in Ref. [35]. In this study, three different data sets (optical and radar) were merged using the LOGP, and the weights were varied between 0 and 1. The best results for all three data sets were obtained using equal weights.
11.4.4 Statistical Multi-Source Classification
Statistical methods for fusion of remotely sensed data can be divided into four categories: the augmented vector approach, stratification, probabilistic relaxation, and extended statistical fusion. In the augmented vector approach, data from different sources are concatenated as if they were measurements from a single sensor, and the fused data vector is then classified using ordinary single-source classifiers [36]. This is the most common approach in application-oriented studies of multi-source classification, because no special software is needed. It is an example of pixel-level fusion. Such a classifier is, however, difficult to use when the data cannot be modeled with a common probability density function, or when the data set includes ancillary data (e.g., from a GIS). Stratification has been used to incorporate ancillary GIS data in the classification process. The GIS data are stratified into categories, and a spectral model is then used for each of these categories [37].
Richards et al. [38] extended the methods used for spatially contextual classification based on probabilistic relaxation to incorporate ancillary data. The methods based on extended statistical fusion [10,43] were derived by extending the concepts used for classification of single-sensor data. Each data source is considered independently, and the classification results are fused using weighted linear combinations.
By using a statistical classifier, one often assumes that the data have a multivariate Gaussian distribution. Recent developments in statistical classifiers based on regression theory include choices of nonlinear classifiers [11–13,18–20,26,28,33,38,39–56]. For a comparison of neural nets and regression-based nonlinear classifiers, see Ref. [57].
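To make the multivariate Gaussian assumption and the augmented vector approach concrete, the sketch below trains a simple maximum-likelihood classifier on a stacked (augmented) feature vector. It is a minimal illustration rather than the chapter's own implementation; the feature dimensions and class labels are placeholders.

```python
import numpy as np
from scipy.stats import multivariate_normal

def train_gaussian_ml(features, labels):
    """Estimate per-class mean and covariance for a Gaussian ML classifier."""
    return {c: (features[labels == c].mean(axis=0),
                np.cov(features[labels == c], rowvar=False))
            for c in np.unique(labels)}

def classify_gaussian_ml(features, params):
    """Assign each sample to the class with the highest Gaussian log-likelihood."""
    classes = sorted(params)
    loglik = np.column_stack([
        multivariate_normal.logpdf(features, mean=params[c][0], cov=params[c][1])
        for c in classes])
    return np.asarray(classes)[np.argmax(loglik, axis=1)]

# Augmented vector approach: stack the measurements from two sensors per pixel
optical = np.random.rand(500, 4)
sar = np.random.rand(500, 2)
stacked = np.hstack([optical, sar])
labels = np.random.randint(0, 3, 500)

params = train_gaussian_ml(stacked, labels)
predicted = classify_gaussian_ml(stacked, params)
```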
11.4.5 Neural Nets for Multi-Source Classification
Many multi-sensor studies have used neural nets because no specific assumptions about the underlying probability densities are needed [40,58]. A drawback of neural nets in this respect is that they act like a black box, in that the user cannot control how the different data sources are used. It is also difficult to explicitly use a spatial model for neighboring pixels (although one can extend the input vector from measurements of a single pixel to measurements of neighboring pixels). Guan et al. [41] utilized contextual information by using a network of neural networks with which they built a quadratic regularizer. Another drawback is that specifying a neural network architecture involves specifying a large number of parameters. A classification experiment should take care in choosing them and should test different configurations, which makes the complete training process very time-consuming [52,58]. Hybrid approaches combining statistical methods and neural networks for data fusion have also been proposed [30]. Benediktsson et al. [30] apply a statistical model to each individual source and use neural nets to reach a consensus decision. Most applications involving a neural net use an MLP or a radial basis function network, but other neural network architectures can be used [59–61].

Neural nets for data fusion can be applied at the pixel, feature, and decision levels. For pixel- and feature-level fusion, a single neural net is used to classify the joint feature vector or pixel measurement vector. For decision-level fusion, a network combination like the one outlined in Figure 11.4 is often used [29]. An MLP neural net is first used to classify the images from each source separately. Then, the outputs from the sensor-specific nets are fused and weighted in a fusion network.
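A compact sketch of this decision-level architecture, written here with PyTorch as an assumed framework: one small MLP per sensor produces class scores, and a fusion network combines them. The layer sizes, class count, and training details are placeholders rather than the configurations used in the cited studies.

```python
import torch
import torch.nn as nn

N_CLASSES = 5

def sensor_net(n_features):
    """Sensor-specific MLP producing class scores for one data source."""
    return nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                         nn.Linear(32, N_CLASSES))

class DecisionFusionNet(nn.Module):
    """Decision-level fusion: sensor-specific nets followed by a fusion net."""
    def __init__(self, feature_dims):
        super().__init__()
        self.sensor_nets = nn.ModuleList([sensor_net(d) for d in feature_dims])
        self.fusion = nn.Sequential(
            nn.Linear(N_CLASSES * len(feature_dims), 32), nn.ReLU(),
            nn.Linear(32, N_CLASSES))

    def forward(self, inputs):
        # inputs: one tensor per sensor, each of shape (batch, n_features_p)
        per_sensor = [net(x) for net, x in zip(self.sensor_nets, inputs)]
        return self.fusion(torch.cat(per_sensor, dim=1))

# Toy forward pass with two sensors providing 4 and 3 features per pixel
model = DecisionFusionNet([4, 3])
class_scores = model([torch.randn(8, 4), torch.randn(8, 3)])
```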
11.4.6 A Closer Look at Dempster–Shafer Evidence Theory for Data Fusion
Dempster–Shafer theory of evidence provides a representation of multi-source data using two central concepts: plausibility and belief. Mathematical evidence theory was first introduced by Dempster in the 1960s, and later extended by Shafer [62].

A good introduction to Dempster–Shafer evidence theory for remote-sensing data fusion is given in Ref. [28].
Plausibility (Pls) and belief (Bel) are derived from a mass function m, which takes values in the [0,1] interval. The belief and plausibility functions for an element A are defined as follows.
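In the standard Dempster–Shafer formulation (assumed here), with the mass function m defined over subsets of the frame of discernment Θ,

Bel(A) = \sum_{B \subseteq A} m(B), \qquad Pls(A) = \sum_{B \cap A \neq \emptyset} m(B) = 1 - Bel(\bar{A}).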
FIGURE 11.4 (See color insert following page 240.) Network architecture for decision-level fusion using neural networks: sensor-specific neural nets feed a fusion net that produces the classified image.
Trang 11They are sometimes referred to as lower and upper probability functions The belief value
of hypothesis A can be interpreted as the minimum uncertainty value about A, and itsplausibility as the maximum uncertainty [28]
Evidence from p different sources is combined by combining the associated mass functions.
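The combination is normally done with Dempster's orthogonal sum, assumed here to be the rule the text refers to. For two sources with mass functions m_1 and m_2 it reads

(m_1 \oplus m_2)(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)}, \qquad A \neq \emptyset,

and evidence from p sources is combined by applying the rule repeatedly, which is valid because the orthogonal sum is commutative and associative.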
The concepts of evidence theory belong to a different school of thought than Bayesian multi-sensor models. Researchers coming from one school often have a tendency to dislike the modeling used in the alternative theory, and not many neutral comparisons of the two approaches exist. The main advantage of evidence theory is its robustness in the way information from several heterogeneous sources is combined. A disadvantage is the underlying assumption that the evidence from different sources is independent. According to Ref. [43], Bayesian theory assumes that imprecision about uncertainty in the measurements is zero and that uncertainty about an event is measured only by its probability. The author disagrees with this, pointing out that in Bayesian modeling, uncertainty about the measurements can be modeled in the priors. Priors of this kind are not always used, however. Priors in a Bayesian model can also be used to model spatial context and temporal class development. It might be argued that Dempster–Shafer theory can be more appropriate for a high number of heterogeneous sources. However, most papers on data fusion for remote sensing consider only two or at most three different sources.
11.4.7 Contextual Methods for Data Fusion
Remote-sensing data have an inherent spatial nature To account for this, contextualinformation can be incorporated in the interpretation process Basically, the effect ofcontext in an image-labeling problem is that when a pixel is considered in isolation, itmay provide incomplete information about the desired characteristics By considering thepixel in context with other measurements, more complete information might be derived.Only a limited set of studies have involved spatially contextual multi-source classifica-tion Richards et al [38] extended the methods used for spatial contextual classificationbased on probabilistic relaxation to incorporate ancillary data Binaghi et al [63] presented
a knowledge-based framework for contextual classification based on fuzzy set theory Wanand Fraser [61] used multiple self-organizing maps for contextual classification LeHe´garat-Mascle et al [28] combined the use of a Markov random field model with the
image segmentation method based on Markov random fields with adaptive hoods Markov random fields have also been used for data fusion in other applicationdomains [65,66]
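As a sketch of how a Markov random field prior can inject spatial context into a per-pixel classifier, the following implements a few sweeps of the iterated conditional modes (ICM) algorithm with a Potts-style smoothness prior. The data term is assumed to come from any per-pixel class log-likelihood (for example, one of the classifiers sketched earlier), and the smoothing weight beta is an illustrative parameter, not a value from the cited studies.

```python
import numpy as np

def icm_contextual_classification(loglik, beta=1.5, n_iter=5):
    """Contextual classification with a Potts MRF prior via ICM.

    loglik: (rows, cols, n_classes) array of per-pixel class log-likelihoods
    beta:   spatial smoothness weight (larger values give smoother label maps)
    """
    rows, cols, n_classes = loglik.shape
    labels = np.argmax(loglik, axis=2)        # start from per-pixel ML labels
    for _ in range(n_iter):
        for i in range(rows):
            for j in range(cols):
                # Count how many 4-neighbors currently carry each class label
                neighbor_counts = np.zeros(n_classes)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        neighbor_counts[labels[ni, nj]] += 1
                # Local decision: data term plus Potts smoothness term
                labels[i, j] = np.argmax(loglik[i, j] + beta * neighbor_counts)
    return labels

# Toy usage: random log-likelihoods for a 64 x 64 scene with 4 classes
loglik = np.log(np.random.dirichlet(np.ones(4), size=(64, 64)))
smoothed_labels = icm_contextual_classification(loglik)
```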