Abstract
Pansharpening is a pixel-level fusion technique used to increase the spatial resolution of the multispectral image while simultaneously preserving its spectral information. In this paper, we provide a review of the pansharpening methods proposed in the literature, giving a clear classification of them and a description of their main characteristics. Finally, we analyze how the quality of the pansharpened images can be assessed, both visually and quantitatively, and examine the different quality measures proposed for that purpose.
1 Introduction
Nowadays, huge quantities of satellite images are available from many earth observation platforms, such as SPOT [1], Landsat 7 [2], IKONOS [3], QuickBird [4] and OrbView [5]. Moreover, due to the growing number of satellite sensors, the acquisition frequency of the same scene is continuously increasing. Remote sensing images are recorded in digital form and then processed by computers to produce image products useful for a wide range of applications.
The spatial resolution of a remote sensing imaging system is expressed as the area of the ground captured by one pixel and affects the reproduction of details within the scene. As the pixel size is reduced, more scene details are preserved in the digital representation [6]. The instantaneous field of view (IFOV) is the ground area sensed at a given instant of time. The spatial resolution depends on the IFOV: for a given number of pixels, the finer the IFOV is, the higher the spatial resolution. Spatial resolution can also be viewed as the clarity of the high-frequency detail information available in an image. In remote sensing, spatial resolution is usually expressed in meters or feet, representing the length of the side of the area covered by a pixel. Figure 1 shows three images of the same ground area but with different spatial resolutions. The image at 5 m depicted in Figure 1a was captured by the SPOT 5 satellite, while the other two images, at 10 m and 20 m, are simulated from the first image. As can be observed in these images, the detail information becomes clearer as the spatial resolution increases from 20 m to 5 m.
Spectral resolution is the electromagnetic bandwidth of the signals captured by the sensor producing a given image. The narrower the spectral bandwidth is, the higher the spectral resolution. If the platform captures images with a few spectral bands, typically 4-7, they are referred to as multispectral (MS) data, while if the number of spectral bands is measured in hundreds or thousands, they are referred to as hyperspectral (HS) data [7]. Together with the MS or HS image, satellites usually provide a panchromatic (PAN) image. This is an image that contains reflectance data representative of a wide range of wavelengths from the visible to the thermal infrared; that is, it integrates the chromatic information, hence the name "pan"chromatic. A PAN image of the visible bands captures a combination of red, green and blue data into a single measure of reflectance. Remote sensing systems are designed within often competing constraints, among the most important ones being the trade-off between IFOV and signal-to-noise ratio (SNR). Since MS, and to a greater extent HS, sensors have reduced spectral bandwidths compared to PAN sensors, they typically have, for a given IFOV, a reduced spatial resolution in order to collect more photons and preserve the image SNR. Many sensors such as SPOT, ETM+, IKONOS, OrbView and
QuickBird have a set of MS bands and a co-registered higher spatial resolution PAN band. With appropriate algorithms, it is possible to combine these data and produce MS imagery with higher spatial resolution. This concept is known as multispectral or multisensor merging, fusion or pansharpening (of the lower-resolution image) [8].
Pansharpening can consequently be defined as a pixel-level fusion technique used to increase the spatial resolution of the MS image [9]. Pansharpening is shorthand for panchromatic sharpening, meaning the use of a PAN (single band) image to sharpen an MS image. In this sense, to sharpen means to increase the spatial resolution of an MS image. Thus, pansharpening techniques increase the spatial resolution while simultaneously preserving the spectral information in the MS image, giving the best of the two worlds: high spectral resolution and high spatial resolution [7]. Some of the applications of pansharpening include improving geometric correction, enhancing certain features not visible in either of the single data alone, change detection using temporal data sets and enhancing classification [10].
During the past years, an enormous number of pansharpening techniques have been developed, and in order to choose the one that best serves the user needs, there are some points, mentioned by Pohl [9], that have to be considered. In the first place, the objective or application of the pansharpened image can help in defining the necessary spectral and spatial resolution. For instance, some users may require frequent, repetitive coverage with relatively low spatial resolution (e.g., meteorology applications), others may desire the highest possible spatial resolution (e.g., mapping), while other users may need both high spatial resolution and frequent coverage, plus rapid image delivery (e.g., military surveillance).
Then, the data that are most useful to meet the needs of the pansharpening application, like the sensor, the satellite coverage and atmospheric constraints such as cloud cover and sun angle, have to be selected. We are mostly interested in sensors that can simultaneously capture a PAN channel with high spatial resolution and some MS channels with high spectral resolution, like the SPOT 5, Landsat 7 and QuickBird satellites. In some cases, PAN and MS images captured by different satellite sensors at different dates for the same scene can be used for some applications [10], as in the case of fusing different MS SPOT 5 images captured at different times with one PAN IKONOS image [11], which can be considered as a multisensor, multitemporal and multiresolution pansharpening case.
We also have to take into account the need for data pre-processing, like registration, upsampling and histogram matching, as well as the selection of a pansharpening technique that makes the combination of the data most successful. Finally, evaluation criteria are needed to specify which is the most successful pansharpening approach.
In this paper, we examine the classical and state-of-the-art pansharpening methods described in the literature, giving a clear classification of the methods and a description of their main characteristics. To the best of our knowledge, there is no recent paper providing a complete overview of the different pansharpening methods. However, some papers partially address the classification of pansharpening methods, see [12] for instance, or relate already proposed techniques to more global paradigms [13-15].
This paper is organized as follows. In Section 2, data pre-processing techniques are described. In Section 3, a classification of the pansharpening methods is presented, with a description of the methods related to each category and some examples; in this section, we also point out open research problems in each category. In Section 4, we analyze how the quality of the pansharpened images can be assessed both visually and quantitatively and examine the different quality measures proposed for that purpose. Finally, Section 5 concludes the paper.
2 Image pre-processing
The array of pixels that constitutes a digital image is determined by a combination of scanning in the cross-track direction (orthogonal to the motion of the sensor platform) and by the platform motion along the in-track direction. A pixel is created whenever the sensor system electronically samples the continuous data stream provided by the scanning [8]. The image data recorded by sensors and aircraft can contain errors in geometry and in the measured brightness values of the pixels (the latter referred to as radiometric errors) [16]. The relative motion of the platform, the non-idealities in the sensors themselves and the curvature of the Earth can lead to geometric errors of varying degrees of severity. The radiometric errors can result from the instrumentation used to record the data, the wavelength dependence of solar radiation and the effect of the atmosphere. For many applications using these images, it is necessary to make corrections in geometry and brightness before the data are used. By using correction techniques [8,16], an image can be registered to a map coordinate system and therefore has its pixels addressable in terms of map coordinates rather than pixel and line numbers, a process often referred to as geocoding.
The Earth Observing System Data and Information System (EOSDIS) receives "raw" data from all spacecraft and processes it to remove telemetry errors, eliminate communication artifacts and create Level 0 Standard Data Products that represent raw science data as measured by the instruments. Other levels of remote sensing data processing were defined in [17] by the NASA Earth Science program. In Level 1A, the reconstructed, unprocessed instrument data at full resolution, time-referenced and annotated with ancillary information (including radiometric and geometric calibration coefficients and georeferencing parameters), are computed and appended, but not applied, to the Level 0 data (i.e., Level 0 can be fully recovered from Level 1A). Some instruments have Level 1B data products, where the data resulting from Level 1A are processed to sensor units. At Level 2, the geophysical variables are derived (e.g., ocean wave height, soil moisture, ice concentration) at the same resolution and location as the Level 1 data. Level 3 maps the variables on uniform space-time grids, usually with some completeness and consistency, and finally, Level 4 gives the results from the analysis of the previous levels' data. For many applications, Level 1 data are the most fundamental data records with significant scientific utility, and they are the foundation upon which all subsequent data sets are produced. For pansharpening, where the accuracy of the input data is crucial, at least radiometric and geometric corrections need to be performed on the satellite data. Radiometric correction rectifies defective columns and missing lines and reduces the non-uniformity of the sensor response among detectors. The geometric correction deals with systematic effects such as the panoramic effect, earth curvature and rotation. Note, however, that even with geometrically registered PAN and MS images, differences might appear between the images, as described in [10]. These differences include object disappearance or appearance and contrast inversion due to different spectral bands or different times of acquisition. Besides, both sensors do not aim exactly at the same direction, and acquisition times are not identical, which has an impact on the imaging of fast-moving objects.
Once the image data have been processed to one of the standard levels previously described, and in order to apply pansharpening techniques, the images are pre-processed to accommodate the pansharpening algorithm requirements. This pre-processing may include registration, resampling and histogram matching of the MS and PAN images. Let us now study these processes in detail.
2.1 Image registration
Many applications of remote sensing image data require two or more scenes of the same geographical region, acquired at different dates or from different sensors, to be processed together. In this case, the role of image registration is to make the pixels in the two images precisely coincide with the same points on the ground [8]. Two images can be registered to each other by registering each to a map coordinate base separately, or one image can be chosen as a master to which the other is to be registered [16]. However, due to the different physical characteristics of the different sensors, the problem of registration is more complex than the registration of images from the same type of sensor [18] and also has to face problems like features present in one image that might appear only partially in the other image or do not appear at all. Contrast reversal in some image regions, multiple intensity values in one image that need to be mapped to a single intensity value in the other, or considerably dissimilar images of the same scene produced by the image sensor when configured with different imaging parameters are also problems to be solved by the registration techniques.
Many image registration methods have been proposed in the literature. They can be classified into two categories: area-based methods and feature-based methods. Examples of area-based methods, which deal with the images without attempting to detect common objects, include Fourier methods, cross-correlation and mutual information methods [19]. Since the gray-level values of the images to be matched may be quite different, and taking into account that for any two different image modalities neither the correlation nor the mutual information is maximal when the images are spatially aligned, area-based techniques are not well adapted to the multisensor image registration problem [18]. Feature-based methods, which extract and match the common structures (features) from two images, have been shown to be more suitable for this task. Example methods in this category include methods using spatial relations, those based on invariant descriptors, relaxation, and pyramidal and wavelet image decompositions, among others [19].
2.2 Image upsampling and interpolation
When the registered remote sensing image is too coarse and does not meet the required resolution, upsampling may be needed to obtain a higher-resolution version of the image. The upsampling process may involve interpolation, usually performed via convolution of the image with an interpolation kernel [20]. In order to reduce the computational cost, separable interpolants are usually preferred [19]. Many interpolants for various applications have been proposed in the literature; a brief discussion of interpolation methods used for image resampling is provided in [19]. Interpolation methods specific to remote sensing, such as the one described in [21], have also been proposed. In [22], the authors study the application of different interpolation methods to remote sensing imagery. These methods include nearest neighbor interpolation, which only considers the closest pixel to the interpolated point and thus requires the least processing time of all interpolation algorithms; bilinear interpolation, which creates the new pixel in the target image from a weighted average of its four nearest neighboring pixels in the source image; and interpolation with a smoothing filter, which produces a weighted average of the pixels contained in the area spanned by the filter mask and hence images with smooth transitions in gray level. Interpolation with a sharpening filter, in contrast, enhances details that have been blurred and highlights fine details; however, sharpening filters produce aliasing in the output image, an undesirable effect that can be avoided by applying interpolation with unsharp masking, which subtracts a blurred version of an image from the image itself. The authors of [22] conclude that only bilinear interpolation, interpolation with a smoothing filter and interpolation with unsharp masking have the potential to be used to interpolate remote sensing images. Note that interpolation does not increase the high-frequency detail information in the image, but it is needed to match the number of pixels of images with different spatial resolutions.
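To make the bilinear case concrete, the following is a minimal sketch of bilinear upsampling of an MS band to the PAN grid using NumPy; the integer scale factor and the edge handling are illustrative assumptions, not tied to any particular sensor.

```python
import numpy as np

def bilinear_upsample(band: np.ndarray, factor: int) -> np.ndarray:
    """Upsample a 2D band by an integer factor using bilinear interpolation."""
    h, w = band.shape
    out_h, out_w = h * factor, w * factor
    # Coordinates of output pixels mapped back into the source grid.
    rows = np.linspace(0, h - 1, out_h)
    cols = np.linspace(0, w - 1, out_w)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    dr = (rows - r0)[:, None]   # fractional row offsets
    dc = (cols - c0)[None, :]   # fractional column offsets
    # Weighted average of the four nearest source pixels.
    top = band[np.ix_(r0, c0)] * (1 - dc) + band[np.ix_(r0, c1)] * dc
    bottom = band[np.ix_(r1, c0)] * (1 - dc) + band[np.ix_(r1, c1)] * dc
    return top * (1 - dr) + bottom * dr

# Example: bring a 256x256 MS band to a 1024x1024 PAN grid (factor 4).
ms_band = np.random.rand(256, 256)
hr_band = bilinear_upsample(ms_band, 4)
```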
2.3 Histogram matching
Some pansharpening algorithms assume that the spectral characteristics of the PAN image match those of each band of the MS image, or those of a transformed image based on the MS image. Unfortunately, this is not usually the case [16], and those pansharpening methods are prone to spectral distortions. Matching the histograms of the PAN image and the MS bands will minimize brightness mismatching during the fusion process, which may help to reduce the spectral distortion in the pansharpened image. Although there are general purpose histogram matching techniques, such as the ones described, for instance, in [16] and [20], that could be used in remote sensing, specific techniques like the one presented in [23] are expected to provide more appropriate images for the application of pansharpening techniques. The technique in [23] minimizes the modification of the spectral information of the fused high-resolution multispectral (HRMS) image with respect to the original low-resolution multispectral (LRMS) image. This method modifies the value of the PAN image at each pixel (i, j) as

StretchedPAN(i, j) = (PAN(i, j) − μ_PAN) · σ_b / σ_PAN + μ_b, (1)

where μ_PAN and μ_b are the means of the PAN image and MS image band b, respectively, and σ_PAN and σ_b are the standard deviations of the PAN image and MS image band b, respectively. This technique ensures that the mean and standard deviation of the PAN image and the MS bands are within the same range, thus reducing the chromatic difference between the two images.
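A minimal sketch of this stretch, applied per MS band with NumPy (the function and variable names are ours, not from [23]):

```python
import numpy as np

def match_pan_to_band(pan: np.ndarray, ms_band: np.ndarray) -> np.ndarray:
    """Stretch the PAN image so its mean and std match one MS band (Eq. 1)."""
    mu_pan, sigma_pan = pan.mean(), pan.std()
    mu_b, sigma_b = ms_band.mean(), ms_band.std()
    return (pan - mu_pan) * (sigma_b / sigma_pan) + mu_b
```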
3 Pansharpening categories
Once the remote sensing images have been pre-processed to satisfy the pansharpening method requirements, the pansharpening process is performed. The literature shows a large collection of pansharpening methods developed over the last two decades, as well as a large number of terms used to refer to image fusion. In 1980, Wong et al. [24] proposed a technique for the integration of Landsat Multispectral Scanner (MSS) and Seasat synthetic aperture radar (SAR) images based on the modulation of the intensity of each pixel of the MSS channels with the value of the corresponding pixel of the SAR image, hence named the intensity modulation (IM) integration method. Other scientists evaluated multisensor image data in the context of co-registered [25], resolution enhancement [26] or coincident [27] data analysis. After the launch of the French SPOT satellite system in February of 1986, the civilian remote sensing sector was provided with the capability of applying high-resolution MS imagery to a range of land use and land cover analyses. Cliche et al. [28], who worked with SPOT simulation data prior to the satellite's launch, showed that simulated 10-m resolution color images can be produced by modulating each SPOT MS (XS) band with
PAN data individually, using three different intensity modulation (IM) methods. Welch et al. [29] merged SPOT PAN and XS data using the Intensity-Hue-Saturation (IHS) transformation, a method previously proposed by Haydn et al. [30] to merge Landsat MSS with Return Beam Vidicon (RBV) data and Landsat MSS with Heat Capacity Mapping Mission data. In 1988, Chavez et al. [31] used SPOT panchromatic data to "sharpen" Landsat Thematic Mapper (TM) images by high-pass filtering (HPF) the SPOT PAN data before merging it with the TM data. A review of the so-called classical methods, which include IHS, HPF, the Brovey transform (BT) [32] and principal component substitution (PCS) [33,34], among others, can be found in [9].
In 1987, Price [35] developed a fusion technique based on the statistical properties of remote sensing images for the combination of the two different spatial resolutions of the High Resolution Visible (HRV) SPOT sensor. Besides the Price method, the literature shows other pansharpening methods based on the statistical properties of the images, such as spatially adaptive methods [36] and Bayesian-based methods [37,38].
More recently, multiresolution analysis employing the generalized Laplacian pyramid (GLP) [39,40], the discrete wavelet transform [41,42] and the contourlet transform [43-45] has been used in pansharpening, following the basic idea of extracting from the PAN image the spatial detail information not present in the low-resolution MS image and injecting it into the latter.
Image fusion methods have been classified in several ways. Schowengerdt [8] classified them into spectral domain, spatial domain and scale-space techniques. Ranchin and Wald [46] classified them into three groups: projection and substitution methods, relative spectral contribution methods and those relevant to the ARSIS concept (from its French acronym "Amélioration de la Résolution Spatiale par Injection de Structures", which means "Enhancement of the spatial resolution by structure injection"). It was found that many of the existing image fusion methods, such as the HPF and additive wavelet transform (AWT) methods, can be accommodated within the ARSIS concept [13], but Tu et al. [47] found that the PCS, BT and AWT methods could also be considered as IHS-like image fusion methods. Meanwhile, Bretschneider et al. [12] classified the IHS and PCA methods as transformation-based methods, in a classification that also included more categories such as addition and multiplication fusion, filter fusion (which includes the HPF method), fusion based on inter-band relations, wavelet decomposition fusion and further fusion methods (based on statistical properties). Fusion methods that involve linear forward and backward transforms had been classified by Sheftigara [48] as component substitution methods. Recently, two comprehensive frameworks that generalize previously proposed fusion methods such as IHS, BT, PCA, HPF or AWT and study the relationships between different methods have been proposed in [14,15].
Although it is not possible to find a universal classification, in this work we classify the pansharpening methods into the following categories according to the main technique they use:
(1) Component Substitution (CS) family, which includes IHS, PCS and Gram-Schmidt (GS), because all these methods usually utilize a linear transformation and the substitution of some components in the transformed domain.
(2) Relative Spectral Contribution family, which includes BT, IM and P+XS, where a linear combination of the spectral bands, instead of substitution, is applied.
(3) High-Frequency Injection family, which includes HPF and HPM; these two methods inject high-frequency details extracted by subtracting a low-pass filtered PAN image from the original one.
(4) Methods based on the statistics of the image, which include Price and spatially adaptive methods, Bayesian-based and super-resolution methods.
(5) Multiresolution family, which includes generalized Laplacian pyramid, wavelet and contourlet methods and any combination of multiresolution analysis with methods from other categories.
Note that although the proposed classification defines five categories, as we have already mentioned, some methods can be classified into several categories; thus, the limits of each category are not sharp and there are many relations among them. These relations will be explained when the categories are described.
3.1 Component substitution family
The component substitution (CS) methods start by upsampling the low-resolution MS image to the size of the PAN image. Then, the MS image is transformed into a set of components, usually using a linear transform of the MS bands. The CS methods work by substituting a component of the (transformed) MS image, Cl, with a component, Ch, derived from the PAN image. The CS methods are physically meaningful only when these two components, Cl and Ch, contain almost the same spectral information. In other words, the Cl component should contain all the redundant information of the MS and PAN images, while Ch should contain more spatial information. An improper construction of the Cl component tends to introduce high spectral distortion. The general algorithm for the CS sharpening techniques is summarized in Algorithm 1. This algorithm has been generalized by Tu et al. [47], where the authors also prove that the forward and backward transforms are not needed and that steps 2-5 of Algorithm 1 can be summarized as finding a new component Cl and adding the difference between the PAN image and this new component to each upsampled MS image band. This framework has been further extended by Wang et al. [14] and Aiazzi et al. [15] in the so-called general image fusion (GIF) and extended GIF (EGIF) protocols, respectively.
Algorithm 1 Component substitution
1. Upsample the MS image to the size of the PAN image.
2. Forward transform the MS image into the desired components.
3. Match the histogram of the PAN image with the Cl component.
4. Replace the Cl component with the histogram-matched PAN image.
5. Backward transform the components to obtain the pansharpened image.
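As an illustration, the following is a minimal sketch of the fast IHS-like scheme of Tu et al. [47], in which the Cl component is a linear combination of the MS bands and the PAN-component difference is injected into each upsampled band; the unweighted band average used here is an illustrative assumption, and the function name is ours.

```python
import numpy as np

def cs_pansharpen(ms_up: np.ndarray, pan: np.ndarray) -> np.ndarray:
    """Generalized component substitution (fast IHS-like scheme).

    ms_up: upsampled MS image, shape (bands, H, W), co-registered with pan.
    pan:   PAN image, shape (H, W), histogram-matched to the intensity.
    """
    # Build the Cl component as a linear combination of the MS bands
    # (here an unweighted average -- an illustrative choice).
    intensity = ms_up.mean(axis=0)
    # Steps 2-5 collapse to injecting the PAN-intensity difference into each band.
    detail = pan - intensity
    return ms_up + detail[None, :, :]
```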
The CS family includes many popular pansharpening methods, such as the IHS, PCS and Gram-Schmidt (GS) methods [48,49], each of them involving a different transformation of the MS image. CS techniques are attractive because they are fast and easy to implement and allow users' expectations to be fulfilled most of the time, since they provide pansharpened images with good visual/geometrical quality in most cases [50]. However, the results obtained by these methods highly depend on the correlation between the bands, and since the same transform is applied to the whole image, they do not take into account local dissimilarities between the PAN and MS images [10,51].
A single type of transform does not always produce the optimal component required for substitution, and it would be difficult to choose the appropriate spectral transformation method for diverse data sets. In order to alleviate this problem, recent methods incorporate statistical tests or weighted measures to adaptively select an optimal component for substitution and transformation. This results in a new approach known as adaptive component substitution [52-54].
The Intensity-Hue-Saturation (IHS) pansharpening method [31,55] is one of the classical techniques included in this family. It uses the IHS color space, which is often chosen due to the tendency of the human visual cognitive system to treat the intensity (I), hue (H) and saturation (S) components as roughly orthogonal perceptual axes. The IHS transform was originally applied to RGB true color, but in remote sensing applications, and for display purposes only, arbitrary bands are assigned to the RGB channels to produce false color composites [14]. The ability of the IHS transform to effectively separate spatial information (band I) from spectral information (bands H and S) [20] makes it very applicable to pansharpening. There are different models of the IHS transform, differing in the method used to compute the intensity value; Smith's hexacone and triangular models are two of the most widely used ones [7]. An example of an image pansharpened using the IHS method is shown in Figure 2b.
The major limitation of this technique is that only three bands are involved. Tu et al. [47] proposed a generalized IHS transform that overcomes this dimensional limitation. In any case, since the spectral response of I, as synthesized from the MS bands, does not generally match the radiometry of the histogram-matched PAN image [50], when the fusion result is displayed in color composition, large spectral distortion may appear as color changes. In order to minimize the spectral distortion in IHS pansharpening, Tu et al. [56] proposed a new adaptive IHS method in which the intensity band approximates the PAN image for IKONOS images as closely as possible. This adaptive IHS has been extended by Rahmani et al. [52] to deal with any kind of image by determining the coefficients a_i that best approximate the PAN image as a linear combination of the MS bands, although local dissimilarities between the MS and PAN images might remain [10].
Another method in the CS family is principal component substitution (PCS), which relies on the principal component analysis (PCA) mathematical transformation. PCA, also known as the Karhunen-Loève transform or the Hotelling transform, is widely used in signal processing, statistics and many other areas. This transformation generates a new set of rotated axes, in which the new image spectral components are not correlated. The largest amount of the variance is mapped to the first component, with decreasing variance going to each of the following ones. The sum of the variances in all the components is equal to the total variance present in the original input images. PCA and the calculation of the transformation matrices can be performed following the steps specified in [20]. Theoretically, the first principal component, PC1, collects the information that is common to all bands used as input data to the PCA, i.e., the spatial information, while the spectral information that is specific to each band is captured in the other principal components [42,33]. This makes PCS an adequate technique when merging MS and PAN images. PCS is similar to the IHS method, with the main advantage that an arbitrary number of bands can be considered. However, some spatial information may not be mapped to the first component, depending on the degree of correlation and spectral contrast existing among the MS bands [33], resulting in the same problems the IHS method had. To overcome this drawback, Shah et al. [53] proposed a new adaptive PCA-based pansharpening method that determines, using cross-correlation, the appropriate PC component to be substituted by the PAN image. By replacing this PC component with the high spatial resolution PAN component, the adaptive PCA method produces better results than traditional ones [53].
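A minimal sketch of PCS under these definitions, using NumPy for the PCA; substituting PC1 after matching the PAN statistics to it follows the general scheme described above, but this is an illustrative implementation, not the exact procedure of [33] or [53].

```python
import numpy as np

def pcs_pansharpen(ms_up: np.ndarray, pan: np.ndarray) -> np.ndarray:
    """Principal component substitution.

    ms_up: upsampled MS image, shape (bands, H, W); pan: PAN image, shape (H, W).
    """
    b, h, w = ms_up.shape
    X = ms_up.reshape(b, -1)                 # one row per band
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    # Eigendecomposition of the band covariance matrix gives the PCA rotation.
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
    V = eigvecs[:, order]
    pcs = V.T @ Xc                           # principal components
    # Match the PAN statistics to PC1, then substitute it.
    p = pan.reshape(-1)
    pcs[0] = (p - p.mean()) * (pcs[0].std() / p.std()) + pcs[0].mean()
    # Backward transform (V is orthogonal, so its transpose is its inverse).
    return (V @ pcs + mean).reshape(b, h, w)
```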
A widespread CS technique is Gram-Schmidt (GS) spectral sharpening. This method was invented by Laben and Brower in 1998 and patented by Eastman Kodak [57]. The GS transformation, as described in [58], is a common technique used in linear algebra and multivariate statistics. GS is used to orthogonalize matrix data or bands of a digital image, removing the redundant (i.e., correlated) information that is contained in multiple bands. If there were perfect correlation between the input bands, the GS orthogonalization process would produce a final band with all its elements equal to zero. For its use in pansharpening, the GS transformation has been modified [57]: in the modified process, the mean of each band is subtracted from each pixel in the band before the orthogonalization is performed, to produce a more accurate outcome.
In GS-based pansharpening, a lower-resolution PAN band needs to be simulated and used as the first band of the input to the GS transformation, together with the MS image. Two methods are used in [57] to simulate this band. In the first method, the LRMS bands are combined into a single lower-resolution PAN (LR PAN) image as the weighted mean of the MS image, where the weights depend on the spectral response of the MS bands and the high-resolution PAN (HR PAN) image and on the optical transmittance of the PAN band. The second method simulates the LR PAN image by blurring and subsampling the observed PAN image. The major difference in the results, mostly noticeable in a true color display, is that the first method exhibits outstanding spatial quality, but spectral distortions may occur; this distortion is due to the fact that the average of the MS spectral bands is not likely to have the same radiometry as the PAN image. The second method is unaffected by spectral distortion but generally suffers from lower sharpness and spatial enhancement. This is due to the injection mechanism of high-pass details taken from the PAN image, which is embedded into the inverse GS transformation: the inverse transform is carried out using the full-resolution PAN, while the forward transformation uses the low-resolution approximation of PAN obtained by resampling the decimated PAN image provided by the user. In order to avoid this drawback, Aiazzi et al. [54] proposed an Enhanced GS method, where the LR PAN is generated by a weighted average of the MS bands and the weights are estimated to minimize the MMSE with the downsampled PAN.
Figure 2 Results of some classical pansharpening methods using SPOT 5 images: (a) original LRMS image, (b) IHS, (c) BT, (d) HPF.
GS is more general than PCA, which can be understood as a particular case of GS in which the LR PAN is the first principal component [15].
3.2 Relative Spectral Contribution (RSC) family
The RSC family can be considered a variant of the CS pansharpening family in which a linear combination of the spectral bands, instead of a substitution, is applied.
Let PAN^h be the high spatial resolution PAN image, MS^l_b the b-th low-resolution MS image band, h the original spatial resolution of PAN and l the original spatial resolution of MS_b (l < h), while MS^h_b is the image MS^l_b resampled at resolution h. The synthetic (pansharpened) bands HRMS^h_b are given at each pixel (i, j) by

HRMS^h_b(i, j) = MS^h_b(i, j) · PAN^h(i, j) / Σ_{b'=1}^{B} MS^h_{b'}(i, j), (3)

where b = 1, 2, ..., B and B is the number of MS bands. The process flow of the RSC sharpening techniques is shown in Algorithm 2. This family does not specify what to do when MS^l_b lies outside the spectral range of PAN^h. In Equation 3, there is an influence of the other spectral bands on the assessment of HRMS^h_b, thus causing a spectral distortion. Furthermore, the method does not preserve the original spectral content once the pansharpened images HRMS^h_b are brought back to the original low spatial resolution [46]. These methods include the Brovey transform (BT) [32], the P+XS [59,60] and the intensity modulation (IM) method.
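A minimal sketch of Equation 3 (the Brovey-style normalization), assuming the MS image has already been upsampled to the PAN grid; the small epsilon guarding against division by zero is our addition.

```python
import numpy as np

def rsc_pansharpen(ms_up: np.ndarray, pan: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Relative spectral contribution fusion (Eq. 3, Brovey-style).

    ms_up: upsampled MS image, shape (bands, H, W); pan: PAN image, shape (H, W).
    """
    total = ms_up.sum(axis=0) + eps    # sum over bands at each pixel
    return ms_up * pan[None, :, :] / total[None, :, :]
```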
The Brovey transform provides excellent contrast in the image domain but greatly distorts the spectral characteristics [62]. The Brovey-sharpened image is not suitable for pixel-based classification, as the pixel values are changed drastically [7]. A variation of the BT method subtracts the intensity of the MS image from the PAN image before applying Equation 3 [14]. Although the first BT method injects more spatial details, the second one better preserves the spectral details.
The concept of intensity modulation (IM) was originally proposed by Wong et al. [24] in 1980 for integrating Landsat MSS and Seasat SAR images. Later, this method was used by Cliche et al. [28] for enhancing the spatial resolution of three-band SPOT MS (XS) images. As a method in the relative spectral contribution family, we can derive IM from Equation 3 by replacing the sum of all MS bands with the intensity component of the IHS transformation [6]. Note that the use of the IHS transformation limits to three the number of bands utilized by this method. The intensity modulation may cause color distortion if the spectral range of the intensity replacement (or modulation) image is different from the spectral range covered by the three bands used in the color composition [63]. In the literature, different versions based on the IM concept have been used [6,28,63]. The relations between the RSC and CS families have been studied in depth in [14,47], where these families are considered as particular cases of the GIHS and GIF protocols, respectively. The authors also found that RSC methods are closely related to CS methods, with the difference, as already noted, that the contribution of the PAN image varies locally.
3.3 High-frequency injection family
The high-frequency injection family methods were first proposed by Schowengerdt [64], working on full-resolution and spatially compressed Landsat MSS data. He demonstrated the use of a high-resolution band to "sharpen" or edge-enhance lower-resolution bands having the same approximate wavelength characteristics. Some years later, Chavez [65] proposed a project whose primary objective was to extract the spectral information from the Landsat TM and combine (inject) it with the spatial information from a data set having much higher spatial resolution. To extract the details from the high-resolution data set, he used a high-pass filter in order to "enhance the high-frequency/spatial information but, more important, suppress the low frequency/spectral information in the higher-resolution image" [31]. This was necessary so that simple addition of the images did not distort the spectral balance of the combined product.
A useful concept for understanding spatial filtering is that any image is made of spatial components at different kernel sizes. Suppose we process an image in such a way that the value at each output pixel is the average of a small neighborhood of input pixels (a box filter). The result is a low-pass (LP) blurred version of the original image, which we will denote LP. Subtracting this image from the original one produces a high-pass (HP) image that represents the difference between each original pixel and the average of its neighborhood. This relation can be written as the following equation:

HP = original − LP, (4)

which is valid for any neighborhood size (scale). As the neighborhood size is increased, the LP image hides successively larger and larger structures, while the HP image picks up the smaller structures lost in the LP image (see Equation 4) [8].
The idea behind this type of spatial domain fusion is to transfer the high-frequency content of the PAN image to the MS images by applying spatial filtering techniques [66]. However, the size of the filter kernels cannot be arbitrary, because it has to reflect the radiometric normalization between the two images. Chavez et al. [34] suggested that the best kernel size is approximately twice the ratio of the spatial resolutions of the sensors, which produces edge-enhanced synthetic images with the least spectral distortion and edge noise. According to [67], pansharpening methods based on injecting high-frequency components into resampled versions of the MS data have demonstrated superior performance compared with many other pansharpening methods, such as the methods in the CS family. Several variations of high-frequency injection pansharpening methods have been proposed, such as High-Pass Filtering pansharpening and High-Pass Modulation.
As we have already mentioned, the main idea of the high-pass filtering (HPF) pansharpening method is to extract the high-frequency information from the PAN image and then add or inject it into the MS image previously expanded to match the PAN pixel size. This spatial information extraction is performed by applying a low-pass spatial filter to the PAN image,

filteredPAN = h0 * PAN, (5)

where h0 is a low-pass filter and * is the convolution operator. The spatial information injection is performed by adding, pixel by pixel, the high-frequency image that results from subtracting filteredPAN from the original PAN image, to the MS image [31,68]. There are many different filters that can be used: box, Gaussian, Laplacian, and so on. Recently, the use of the modulation transfer function (MTF) of the sensor as the low-pass filter has been proposed in [69]. The MTF is the amplitude spectrum of the system point spread function (PSF) [70]. In [69], the HP image is also multiplied by a weight selected to maximize the Quality Not requiring a Reference (QNR) criterion proposed in that paper.
As expected, HPF images present low spectral distortion. However, the ripple in the filter's frequency response has some negative impact [14]. The HPF method can be considered the predecessor of an extended group of image pansharpening procedures based on the same principle: to extract spatial detail information from the PAN image not present in the MS image and inject it into the latter in a multiresolution framework. This principle is known as the ARSIS concept [46].
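A minimal sketch of HPF fusion following Equations 4 and 5, using a simple box filter as h0; tying the kernel size to the resolution ratio follows the suggestion of [34], but the exact values here are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def hpf_pansharpen(ms_up: np.ndarray, pan: np.ndarray, ratio: int = 4) -> np.ndarray:
    """High-pass filtering pansharpening.

    ms_up: upsampled MS image, shape (bands, H, W); pan: PAN image, shape (H, W).
    """
    kernel = 2 * ratio + 1                             # ~twice the resolution ratio, odd-sized
    filtered_pan = uniform_filter(pan, size=kernel)    # Eq. 5 with a box filter
    hp = pan - filtered_pan                            # Eq. 4
    return ms_up + hp[None, :, :]
```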
In the High Pass Modulation (HPM) method, the PAN image is multiplied by each band of the LRMS image and normalized by a low-pass filtered version of the PAN image to estimate the enhanced MS image bands. The principle of HPM is to transfer the high-frequency information of the PAN image to the LRMS band b (LRMS_b) with a modulation coefficient k_b, which equals the ratio between the LRMS band and the low-pass filtered version of the PAN image [14]. Thus, the algorithm assumes that each pixel of the enhanced (sharpened) MS image in band b is simply proportional to the corresponding higher-resolution image pixel. This constant of proportionality is a spatially variable gain factor, calculated as

k_b(i, j) = LRMS_b(i, j) / filteredPAN(i, j), (6)

where filteredPAN is a low-pass filtered version of the PAN image (see Equation 5) [8]. According to [14], where HFI has also been formulated into the GIF framework and its relations with the CS, RSC and some multiresolution family methods are explored, when the low-pass filter is chosen as in the HPF method, the HPM method will give slightly better performance than HPF because the color of the pixels is not biased toward gray.
The process flow of the HFI sharpening techniques is shown in Algorithm 3, and a pansharpened image obtained using the HPM method is shown in Figure 2d. Note that the HFI methods are closely related, as we will see later, to the multiresolution family; the main differences are the types of filters used, the fact that a single level of decomposition is applied to the images and the different origins of the approaches.
Algorithm 3 High-frequency injection
1. Upsample the MS image to the size of the PAN image.
2. Obtain filteredPAN by applying a low-pass filter to the PAN image (Equation 5).
3. Calculate the high-frequency image by subtracting the filtered PAN from the original PAN.
4. Obtain the pansharpened image by adding the high-frequency image to each band of the MS image (modulated by the factor k_b(i, j) of Equation 6 in the case of HPM).
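Under the same assumptions as the HPF sketch above, HPM replaces the additive injection with the multiplicative gain of Equation 6; the epsilon guard against division by zero is our addition.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def hpm_pansharpen(ms_up: np.ndarray, pan: np.ndarray,
                   ratio: int = 4, eps: float = 1e-12) -> np.ndarray:
    """High Pass Modulation: modulate the PAN high frequencies by k_b (Eq. 6)."""
    filtered_pan = uniform_filter(pan, size=2 * ratio + 1) + eps
    hp = pan - filtered_pan                     # high-frequency detail (Eq. 4)
    k = ms_up / filtered_pan[None, :, :]        # spatially variable gain (Eq. 6)
    return ms_up + k * hp[None, :, :]
```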
3.4 Methods based on the statistics of the image
The methods based on the statistics of the image comprise a set of methods that exploit the statistical characteristics of the MS and PAN images in the pansharpening process. The first known method in this family was proposed by Price [35] to combine PAN and MS imagery from dual-resolution satellite instruments, based on the substantial redundancy existing in the PAN data and the local correlation between the PAN and MS images. Later, the method was improved by Price [71] by computing the local statistics of the images, and by Park et al. [36] in the so-called spatially adaptive algorithm.
Price's method [71] uses the statistical relationship between each band of the LRMS image and the HR image to sharpen the former. It models the relationship between the pixels of each band of the HRMS image z_b, the PAN image x and the corresponding band of the LRMS image y_b linearly as

z_b = ŷ_b + â (x − x̂), (7)

where ŷ_b represents the band y_b upsampled to the size of the HRMS image by pixel replication, x̂ represents the panchromatic image downsampled to the size of the MS image by averaging the pixels of x in the area covered by the pixels of y and upsampled again to its original size by pixel replication, and â is a matrix defined as the upsampling, by pixel replication, of a weight matrix a whose elements are calculated from a 3 × 3 window around each LR image pixel. Price's algorithm succeeds in preserving the low-resolution radiometry in the fusion process, but it sometimes produces blocking artifacts, because it uses the same weight for all the HR pixels corresponding to one LR pixel. If the HR and LR images have little correlation, the blocking artifacts will be severe. A pansharpened image obtained using Price's method, as proposed in [71], is shown in Figure 3a.
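A structural sketch of Equation 7 follows; the weight a is computed here as a local least-squares regression slope between the LR band and the downsampled PAN over a 3 × 3 window, which is our illustrative stand-in for the exact weighting of [71].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def price_sharpen(y_b: np.ndarray, x: np.ndarray, factor: int) -> np.ndarray:
    """Sketch of Price's model z_b = y_b_up + a_up * (x - x_hat) (Eq. 7)."""
    h, w = y_b.shape
    # Downsample PAN to the LR grid by block averaging.
    x_lr = x.reshape(h, factor, w, factor).mean(axis=(1, 3))
    # Local regression slope of y_b on x_lr over 3x3 windows (illustrative weight).
    mean_x = uniform_filter(x_lr, 3)
    mean_y = uniform_filter(y_b, 3)
    cov = uniform_filter(x_lr * y_b, 3) - mean_x * mean_y
    var = uniform_filter(x_lr * x_lr, 3) - mean_x ** 2
    a = cov / (var + 1e-12)
    # Upsample everything back to the HR grid by pixel replication.
    rep = lambda img: np.kron(img, np.ones((factor, factor)))
    return rep(y_b) + rep(a) * (x - rep(x_lr))
```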
The spatially adaptive algorithm [36] starts from Price's method [71], but with a more general and improved mathematical model. It features adaptive insertion of information according to the local correlation between the two images, preventing spectral distortion as much as possible while sharpening the MS images simultaneously. This algorithm also has the advantage that a number of high-resolution images, not only one PAN image, can be utilized as references of high-frequency information, which is not the case for most methods [36].
Besides those methods, most of the papers in this family have used the Bayesian framework to model the knowledge about the images and estimate the pansharpened image. Since the work of Mascarenhas [37], a number of pansharpening methods have been proposed using the Bayesian framework (see [72,73], for instance). Bayesian methods model the degradation suffered by the original HRMS image, z, as the conditional probability distribution of the observed LRMS image, y, and the PAN image, x, given the original z, called the likelihood and denoted as p(y, x|z).
Figure 3 Results of some statistical pansharpening methods using SPOT 5 images: (a) Price, (b) super-resolution [76].
They take into account the available prior knowledge about the expected characteristics of the pansharpened image, modeled in the so-called prior distribution p(z), to determine the posterior probability distribution p(z|y, x) by using Bayes' law,

p(z|y, x) = p(y, x|z) p(z) / p(y, x), (8)

where p(y, x) is the joint probability distribution. Inference is performed on the posterior distribution to draw estimates of the HRMS image, z.
The main advantage of the Bayesian approach is that it places the problem of pansharpening into a clear probabilistic framework [73], although assigning suitable distributions for the conditional and prior distributions and selecting an inference method are critical points that lead to different Bayesian-based pansharpening methods.
As prior distribution, Fasbender et al. [73] assumed a noninformative prior p(z) ∝ 1, which gives equal probability to all possible solutions; that is, no solution is preferred, as no clear information on the HRMS image is available. This prior has also been used by Hardie et al. [74]. In [37], the prior information is carried by an interpolation operator and its covariance matrix; both are used as the mean vector and the covariance matrix, respectively, for a Bayesian synthesis process. In [75], the prior knowledge about the smoothness of the object luminosity distribution within each band makes it possible to model the distribution of z using a simultaneous autoregressive (SAR) model as

p(z_b) ∝ exp{ −(α_b / 2) ||C z_b||² }, (9)

where C represents the Laplacian operator and 1/α_b the variance of the Gaussian distribution of z_b, b = 1, ..., B, with B being the number of bands in the MS image.
More advanced models try to incorporate a smoothness constraint while preserving the edges in the image. Those models include the adaptive SAR model [38], Total Variation (TV) [76], Markov Random Field (MRF)-based models [77] and Stochastic Mixing Models (SMM) [78]. Note that the described models do not take into account the correlations between the MS bands. In [79], the authors propose a TV prior model to take into account spatial pixel relationships and a quadratic model to enforce similarity between the pixels in the same position in the different bands.
It is usual to model the LRMS and PAN images as degraded versions of the HRMS image through two different processes: one modeling the LRMS image, usually described as

y = g_s(z) + n_s, (10)

where g_s(z) represents a function that relates z to y and n_s represents the noise of the LRMS image, and a second one that models how the PAN image is obtained from the HRMS image, written as

x = g_p(z) + n_p, (11)

where g_p(z) represents a function that relates z to x and n_p represents the noise of the PAN image. Note that, since the success of the pansharpening algorithm will be limited by the accuracy of these models, the physics of the sensor should be considered; in particular, the MTF of the sensor and the sensor's spectral response should be taken into account.
The conditional distribution of the observed images given the original one, p(y, x|z), is usually defined as

p(y, x|z) = p(y|z) p(x|z), (12)

by considering that the observed LRMS image and the PAN image are independent given the HRMS image. This allows an easier formulation of the degradation models. However, Fasbender et al. [73] took into account that y and x may carry information of quite different quality about z and defined p(y, x|z) = p(y|z)^{2(1−w)} p(x|z)^{2w}, with w ∈ [0, 1] weighting the relative contribution of each observed image. Different models have been proposed for the conditional distributions p(y|z) and p(x|z). The simplest model is to assume that g_s(z) = z, so that y = z + n_s [73], where n_s ~ N(0, Σ_s). Note that in this case, y has the same resolution as z, so an interpolation method has to be used to obtain y from the observed MS image. However, most of the authors consider the relation y = Hz + n_s, where H is a matrix representing the blurring (usually characterized by the sensor MTF), the sensor integration function and the spatial subsampling, and n_s is the capture noise, assumed to be Gaussian with zero mean and variance 1/β, leading to the distribution

p(y|z) ∝ exp{ −(β / 2) ||y − Hz||² }. (13)
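A minimal sketch of the observation model y = Hz + n_s, with H implemented as an MTF-like Gaussian blur followed by decimation; the blur width and noise level are illustrative assumptions, not calibrated sensor values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(z_band: np.ndarray, factor: int = 4,
            blur_sigma: float = 1.5, noise_beta: float = 1e4) -> np.ndarray:
    """Simulate one LRMS band from an HRMS band: y = Hz + n_s."""
    blurred = gaussian_filter(z_band, sigma=blur_sigma)   # sensor blur (MTF surrogate)
    decimated = blurred[::factor, ::factor]               # spatial subsampling
    noise = np.random.normal(0.0, np.sqrt(1.0 / noise_beta), decimated.shape)
    return decimated + noise
```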
This model has been extensively used [77,78,80], and it is the basis for the so-called super-resolution-based methods [81], such as the ones described, for instance, in [38,76]. The degradation model in [37] can also be written in this way. A pansharpened image using the super-