OPTICAL IMAGING AND SPECTROSCOPY


† Pixel coding … mask pixel. If each value of H can be independently selected, the number of code values greatly exceeds the number of signal pixels reconstructed. Pixel coding is commonly used in spectroscopy and spectral imaging. Structured spatial and temporal modulation of object illumination is also an example of pixel coding. In imaging systems, focal plane foveation and some forms of embedded readout circuit processing may also be considered as pixel coding. The impulse response of a pixel coded system is shift-variant. Physical constraints typically limit the maximum value or total energy of the elements of H.

† Convolutional coding refers to systems with shift-invariant impulse response $h(x - x')$. As we have seen in imaging system analysis, convolutional coding is exceedingly common in optical systems, with conventional focal imaging as the canonical example. Further examples arise in dispersive spectroscopy. We further divide convolutional coding into projective coding, under which code parameters directly modulate the spatial structure of the impulse response, and Fourier coding, under which code parameters modulate the spatial structure of the transfer function. Coded aperture imaging and computed tomography are examples of projective coding systems. Section 10.2 describes the use of pupil plane modulation to implement Fourier coding for extended depth of field. The number of code elements in a convolutional code corresponds to the number of resolution elements in the impulse response. Since the support of the impulse response is usually much less than the support of the image, the number of code elements per image pixel is much less than one.

† Implicit coding refers to systems where code parameters do not directly modulate H. Rather, the physical structure of optical elements and the sampling geometry are selected to create an invertible measurement code. Reference structure tomography, van Cittert–Zernike-based imaging, and Fourier transform spectroscopy are examples of implicit coding. Spectral filtering using thin-film filters is another example of implicit coding. More sophisticated spatiospectral coding using photonic crystal, plasmonic, and thin-film filters is under exploration. The number of coding parameters per signal pixel in current implicit coding systems is much less than one, but as the science of complex optical design and fabrication develops, one may imagine more sophisticated implicit coding systems.

The goal of this chapter is to provide the reader with a context for discussing spectrometer and imager design in Chapters 9 and 10. We do not discuss physical implementations of pixel, convolutional, or implicit codes in this chapter. Each coding strategy arises in diverse situations; practical sensor codes often combine aspects of all three. In considering sensor designs, the primary goal is always to compare system performance metrics against design choices. Accurate sampling and signal estimation models are central to such comparisons. We learned how to model sampling in Chapter 7; the present chapter discusses basic strategies for signal estimation and how these strategies impact code design for each type of code.

The reader may find the pace of discussion a bit unusual in this chapter. Apt comparison may be made with Chapter 3, which progresses from traditional Fourier sampling theory through modern multiscale sampling. Similarly, the present chapter describes results that are 50–200 years old in discussing linear estimation strategies for pixel and convolutional coding in Sections 8.2 and 8.3. As with wavelets in Chapter 3, Sections 8.4 and 8.5 describe relatively recent perspectives, focusing in this case on regularization, generalized sampling, and nonlinear signal inference. A sharp distinction exists in the impact of modern methods, however. In the transition from Fourier to multiband sampling, new theories augment and extend Shannon's basic approach. Nonlinear estimators, on the other hand, substantially replace and revolutionize traditional linear estimators and completely undermine traditional approaches to sampling code design. As indicated by the hierarchy of data readout and processing steps described in Section 7.4, nonlinear processing has become ubiquitous even in the simplest and most isomorphic sensor systems. A system designer refusing to apply multiscale methods can do reasonable, if unfortunately constrained, work, but competitive design cannot refuse the benefits of nonlinear inference.

While the narrative of this chapter through coding strategies also outlines the basic landscape of coding and inverse problems, our discussion just scratches the surface of digital image estimation and analysis. We cannot hope to provide even a representative bibliography, but we note that more recent accessible discussions of inverse problems in imaging are presented by Blahut [21], Bertero and Boccacci [19], and Barrett and Myers [8]. The point estimation problem and regularization methods are well covered by Hansen [111], Vogel [241], and Aster et al. [6]. A modern text covering image processing, generalized sampling, and convex optimization has yet to be published, but the text and extensive websites of Boyd and Vandenberghe [24] provide an excellent overview of the broad problem.

The range of the code elements $h_{ij}$ is constrained in physical systems. Typically, $h_{ij}$ is nonnegative. Common additional constraints include $0 \le h_{ij} \le 1$ or $\sum_i h_{ij} \le 1$. Design of H subject to constraints is a weighing design problem. A classic example of the weighing design problem is illustrated in Fig. 8.3. The problem is to determine the masses of N objects using a balance. One may place objects singly or in groups on the left or right side. One places a calibrated mass on the right side to balance the scale. The ith measurement takes the form

$$g_i + \sum_j h_{ij} m_j = 0$$

where $m_j$ is the mass of the jth object; $h_{ij}$ is $+1$ for objects on the right, $-1$ for objects on the left, and 0 for objects left out of the ith measurement. While one might naively choose to weigh each object on the scale in series (e.g., select $h_{ij} = \delta_{ij}$), this strategy is just one of many possible weighing designs and is not necessarily the one that produces the best estimate of the object weights. The "best" strategy is the one that enables the most accurate estimation of the weights in the context of a noise and error model for measurement. If, for example, the error in each measurement is independent of the masses weighed, then one can show that the mean-square error in weighing the set of objects is reduced by group testing using the Hadamard testing strategy discussed below.
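As a concrete illustration of the weighing-design advantage (not from the text), the following minimal numpy sketch compares one-at-a-time weighing against Hadamard group weighing for eight objects; the factor-of-n variance reduction it exhibits is the classical Hadamard advantage discussed below. The masses, noise level, and trial count are illustrative placeholders.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)
n = 8
masses = rng.uniform(1.0, 2.0, n)   # true masses of the N = 8 objects
sigma = 0.01                        # per-measurement error, independent of load
H = hadamard(n)                     # bipolar (+1/-1) weighing design, H'H = nI

trials = 5000
mse_single, mse_hadamard = 0.0, 0.0
for _ in range(trials):
    # weigh each object in series: g = I m + noise
    g1 = masses + sigma * rng.standard_normal(n)
    # weigh in Hadamard groups: g = H m + noise; least-squares inverse is H'/n
    g2 = H @ masses + sigma * rng.standard_normal(n)
    est2 = H.T @ g2 / n
    mse_single += np.mean((g1 - masses) ** 2)
    mse_hadamard += np.mean((est2 - masses) ** 2)

print(mse_single / trials)    # ~ sigma^2
print(mse_hadamard / trials)  # ~ sigma^2 / n: the Hadamard advantage
```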

8.2.1 Linear Estimators

In statistics, the problem of estimating f from g in Eqn (8.1) is called point estimation. The most common solution relies on a regression model with a goal of minimizing the difference between the measurement vector $Hf_e$ produced by an estimate of f and the observed measurements g. The mean-square regression error is

$$\varepsilon(f_e) = \langle (g - Hf_e)'(g - Hf_e) \rangle \qquad (8.3)$$

The minimum of $\varepsilon$ with respect to $f_e$ occurs at $\partial\varepsilon/\partial f_e = 0$, which is equivalent to the normal equations $H'Hf_e = H'g$, whose solution is the ordinary least-squares estimate $f_e = (H'H)^{-1}H'g$.

So far, we have made no assumptions about the noise vector n. We have only assumed that our goal is to find a signal estimate that minimizes the mean-square error when placed in the forward model for the measurement. If the expected value of the noise vector $\langle n \rangle$ is nonzero, then the linear estimate $f_e$ will in general be biased. If, on the other hand, $\langle n \rangle = 0$ and the noise components are uncorrelated with uniform variance, the Gauss–Markov theorem states that the ordinary least-squares (OLS) estimate has minimum variance among unbiased linear estimators: if $\Sigma_{\tilde f_e}$ is the covariance for another linear estimator $\tilde f_e$, then $\Sigma_{\tilde f_e} - \Sigma_{f_e}$ is a positive semidefinite matrix.

In practical sensor systems, many situations arise in which the axioms of the Gauss–Markov theorem are not valid and in which nonlinear estimators are preferred. The OLS estimator, however, is a good starting point for the fundamental challenge of sensor system coding, which is to codesign H and signal inference algorithms so as to optimize system performance metrics. Suppose, specifically, that the system metric is the mean-square estimation error.

The selection of H for a given measurement system balances the goal of minimizing estimation error against physical implementation constraints. In the case that $\sum_j h_{ij} \le 1$, for example, the best choice is the identity $h_{ij} = \delta_{ij}$. This is the most common case for imaging, where the amount of energy one can extract from each pixel is finite.

When code elements may take negative values ($-1 \le h_{ij} \le 1$), the least-squares variance is minimized by a class of matrices first constructed by Hadamard. A Hadamard matrix $H_n$ of order n is an $n \times n$ matrix with elements $\pm 1$ such that $H_n' H_n = nI$. If $H_a$ and $H_b$ are Hadamard matrices, then the Kronecker product $H_{ab} = H_a \otimes H_b$ is a Hadamard matrix of order ab. Applying this rule to

$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

we find

$$H_4 = H_2 \otimes H_2 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

Recursive application of the Kronecker product yields Hadamard matrices for $n = 2^m$. In addition to n = 1 and n = 2, it is conjectured that Hadamard matrices exist for all n = 4m, where m is an integer. Currently (2008) n = 668 (m = 167) is the smallest number for which this conjecture is unproven.
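A minimal sketch of the Kronecker (Sylvester) construction just described; the assertion checks the defining property $H_n'H_n = nI$.

```python
import numpy as np

def hadamard_sylvester(m):
    """Build the order n = 2**m Hadamard matrix by repeated Kronecker products."""
    H2 = np.array([[1, 1], [1, -1]])
    H = np.array([[1]])
    for _ in range(m):
        H = np.kron(H, H2)
    return H

H8 = hadamard_sylvester(3)
n = H8.shape[0]
assert np.array_equal(H8.T @ H8, n * np.eye(n, dtype=int))  # H'H = nI
```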

Assuming that the measurement matrix H is a Hadamard matrix, $H'H = nI$ and least-squares inversion reduces the variance of each estimated component by a factor of n relative to measuring each component separately. The nonnegative constraint $0 \le h_{ij} \le 1$ is common in imaging and spectroscopy. As discussed by Harwit and Sloane [114], minimum variance least-squares estimation under this constraint is achieved using the Hadamard S matrix

$$S_n = \tfrac{1}{2}(\mathbf{1} - H_n)$$

where $\mathbf{1}$ is the matrix of all ones. Under this definition, the first row and column of $S_n$ vanish, meaning that $S_n$ is an $(n-1) \times (n-1)$ measurement matrix. The effect of using the S matrix of order n rather than the bipolar Hadamard matrix is an approximately fourfold increase in the least-squares variance.

Spectroscopic systems often simulate Hadamard measurement by subtracting S-matrix measurements from measurements based on the complement $\bar S_n = (H_n + \mathbf{1})/2$. This difference isolates $g = H_n f$. The net effect of this subtraction is to increase the variance of each effective measurement by a factor of 2, meaning that least squares processing produces a factor of 2 greater signal estimation variance. This result is better than for the S matrix alone because the number of measurements has been doubled.
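The complement-difference trick can be checked numerically. A sketch under the definitions above (the masks are kept at full n × n size here for simplicity, whereas the text's S matrix drops the vanishing first row and column; the signal and noise level are illustrative):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(1)
n = 16
H = hadamard(n)             # bipolar Hadamard matrix
S = (1 - H) // 2            # 0/1 S-type mask, S_n = (1 - H_n)/2
S_bar = (1 + H) // 2        # complementary 0/1 mask

f = rng.uniform(size=n)     # nonnegative spectrum
sigma = 0.01
g_S = S @ f + sigma * rng.standard_normal(n)
g_bar = S_bar @ f + sigma * rng.standard_normal(n)

g = g_bar - g_S             # isolates H f; noise variance doubles to 2 sigma^2
f_est = H.T @ g / n         # least-squares inverse of the Hadamard system
print(np.mean((f_est - f) ** 2))
```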

The naive approach to inversion of Eqn (8.17) divides the Fourier spectrum of the measured data by the system transfer function according to the convolution theorem [Eqn (3.18)] to obtain an estimate of the object spectrum, $\hat f_{est}(u,v) = \hat g(u,v)/\hat h(u,v)$ (8.18). The mean-square error of an estimate is, by Parseval's theorem,

$$e^2 = \iint (f - f_{est})^2 \, dx \, dy = \iint (\hat f - \hat f_{est})^2 \, du \, dv \qquad (8.19)$$

Noting that $\varepsilon(u, v) = (\hat f - \hat f_{est})^2$ is nonnegative everywhere, one minimizes $e^2$ by minimizing $\varepsilon(u, v)$ at all (u, v). Supposing that $\hat f_{est} = \hat w(u, v)\,\hat g(u, v)$, and assuming that the signal and noise spectra are uncorrelated, minimizing the expected error with respect to $\hat w$ yields the Wiener filter

$$\hat w(u, v) = \frac{\hat h^*(u, v)\, S_f(u, v)}{|\hat h(u, v)|^2 S_f(u, v) + S_n(u, v)} \qquad (8.21)$$

The Wiener filter reduces to the direct inversion filter of Eqn (8.18) if the signal-to-noise ratio $S_f/S_n \gg 1$. At spatial frequencies for which the noise power spectrum becomes comparable to $|\hat h(u, v)|^2 S_f(u, v)$, the noise spectrum term in the denominator prevents the weak transfer function from amplifying noise in the detected data. Substituting in Eqn (8.20), the mean-square error at spatial frequency (u, v) for the Wiener filter is

$$\varepsilon(u, v) = \frac{S_f(u, v)}{1 + |\hat h(u, v)|^2 \,[S_f(u, v)/S_n(u, v)]} \qquad (8.22)$$

Convolutional code design consists of selection of $\hat h(u, v)$ to optimize some metric. While minimization of the mean-square error is not the only appropriate design metric, it is an attractive goal. Since the Wiener error decreases monotonically with $|\hat h(u, v)|^2$, error minimization is achieved by maximizing $|\hat h(u, v)|^2$ across the target spatial spectrum.
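A minimal frequency-domain implementation of Eqn (8.21), assuming a constant signal-to-noise power ratio (the same assumption used for Fig. 8.4); the scene, box PSF, and SNR value below are illustrative placeholders, not the book's data.

```python
import numpy as np

def wiener_deconvolve(g, h, snr):
    """Wiener-filter estimate of f from g = h * f + n, per Eqn (8.21),
    with S_f/S_n approximated by the constant `snr`."""
    H = np.fft.fft2(h, s=g.shape)                  # transfer function h-hat
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)  # Wiener filter w-hat
    return np.real(np.fft.ifft2(W * np.fft.fft2(g)))

# illustrative use: blur a random scene with a 4x4 box PSF and add noise
rng = np.random.default_rng(0)
f = rng.uniform(size=(256, 256))
h = np.zeros((256, 256)); h[:4, :4] = 1.0 / 16     # box point spread function
g = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))
g += 0.01 * rng.standard_normal(g.shape)
f_est = wiener_deconvolve(g, h, snr=100.0)
```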

Code design is trivial for focal imaging, where Eqn (8.22) indicates clear advantages for forming as tight a point spread function as possible. Ideally, one selects $h(x, y) = \delta(x, y)$, such that $\hat h(u, v)$ is constant. As discussed in Section 8.1, however, in certain situations design to the goal $h(x, y) = \delta(x, y)$ is not the best choice. Of course, as discussed in Sections 8.4 and 8.5, one is unlikely to invert using the Wiener filter in such situations.

Figure 8.4 illustrates the potential advantage of coding for coded aperture systems by plotting the error of Eqn (8.22) under the assumption that the signal and noise power spectra are constant. The error decreases as the order of the coded aperture increases, although the improvement is sublinear in the throughput of the mask. The student will, of course, wish to compare the estimation noise of the Wiener filter with the earlier SNR analysis of Eqns (2.47) and (2.48).

The nonuniformity of the SNR across the spectral band illustrated in Fig. 8.4 is typical of linear deconvolution strategies. Estimation error tends to be particularly high near nulls or minima in the MTF. Nonlinear methods, in contrast, may utilize relationships between spectral components to estimate information even from bands where the system transfer function vanishes. Nonlinear strategies are also more effective in enforcing structural prior knowledge, such as the nonnegativity of optical signals.

The Wiener filter is an example of regularization. Regularization constrains inverse problems to keep noise from weakly sensed signal components from swamping data from more strongly sensed components. The Wiener filter specifically damps noise from null regions of the system transfer function. In discrete form, Eqn (8.17) is implemented by Toeplitz matrices. Hansen presents a recent review of deconvolution and regularization with Toeplitz matrices [112]. We consider regularization in more detail in the next section.

8.4 IMPLICIT CODING

A coding strategy is "explicit" if the system designer directly sets each element $h_{ij}$ of the system response H and "implicit" if H is determined indirectly from design parameters. Coded aperture spectroscopy (Section 9.3) and wavefront coding (Section 10.2.2) are examples of explicit code designs. Most optical systems, however, rely on implicit coding strategies where a relatively small number of lens or filter parameters determine the large-scale system response. Even in explicitly coded systems, the actual system response always differs somewhat from the design response.

Reference structure tomography (RST; Section 2.7) provides a simple example of the relationship between physical system parameters and sensor response. Physical parameters consist of the size and location of reference structures. Placing one reference structure in the embedding space potentially modulates the visibility for all sensors. While the RST forward model is linear, optimization of the reference structure against coding and object estimation metrics is nonlinear. This problem is mostly academic in the RST context, but the nonlinear relationship between optical system parameters and the forward model is a ubiquitous issue in design.

Figure 8.4 Relative mean-square error as a function of spatial frequency for MURA coded apertures of various orders. The MURA code is described by Eqn (2.45). We assume that $S_f(u, v)$ is constant and that $S_f(u, v)/S_n(u, v) = 10$.

The present section considers coding and signal estimation when H cannot be explicitly encoded. Of course, an implicitly encoded system response is unlikely to assume an ideal Hadamard or identity matrix form. On the other hand, we may find that the Hadamard form is less ideal than we have previously supposed. Our goals are to consider (1) signal estimation strategies when H is ill-conditioned and (2) design goals for implicit ill-conditioned H.

The $m \times n$ measurement matrix H has a singular value decomposition (SVD)

$$H = U \Lambda V'$$

where the columns of U and V are orthonormal and $\Lambda$ is diagonal with r nonzero singular values $\lambda_i$. Inversion of $g = Hf + n$ using the SVD is straightforward. The data and object null spaces are spanned by the $m - r$ and $n - r$ vectors in U and V corresponding to null singular values. The data range is spanned by the columns of $U_r = (u_1, u_2, \ldots, u_r)$. The object range is spanned by the columns of $V_r = (v_1, v_2, \ldots, v_r)$. The generalized or Moore–Penrose pseudoinverse of H is $H^\dagger = V_r \Lambda_r^{-1} U_r'$, and the pseudoinverse estimate is

$$f_e = H^\dagger g = P_{V_H} f + \sum_{i=1}^{r} \frac{\langle u_i, n \rangle}{\lambda_i} v_i \qquad (8.26)$$

where $P_{V_H} f$ is the projection of the object onto $V_H$. The problem with naive inversion is immediately obvious from Eqn (8.26). If noise is uniformly distributed over the data space, then the noise components corresponding to small singular values are amplified by the factor $1/\lambda_i$.

Regularization of the pseudoinverse consists of removing or damping the effect of singular components corresponding to small singular values. The most direct regularization strategy consists of simply forming a pseudoinverse from a subset of the singular values with $\lambda_i$ greater than some threshold, thereby improving the effective condition number. This approach is called truncated SVD reconstruction.
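A truncated SVD pseudoinverse is a few lines of numpy; the retained rank k is a design choice, shown here as an illustrative parameter:

```python
import numpy as np

def tsvd_estimate(H, g, k):
    """Truncated SVD reconstruction: invert only the k largest
    singular values, damping the 1/lambda_i noise amplification."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    coeff = (U.T @ g)[:k] / s[:k]   # projections onto the kept singular vectors
    return Vt[:k].T @ coeff
```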

Consider, for example, the shift-coded downsampling matrix. A simple downsampling matrix takes Haar averages at a certain level. For example, 4× downsampling is effectively a projection up two levels on the Haar basis. A 4× downsampling matrix takes the form

$$H = \begin{bmatrix} \frac14 & \frac14 & \frac14 & \frac14 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \frac14 & \frac14 & \frac14 & \frac14 & \cdots \\ \vdots & & & & & & & & \ddots \end{bmatrix} \qquad (8.27)$$

In general, downsampling by the factor d projects f from $\mathbb{R}^n$ to $\mathbb{R}^{n/d}$.

Digital superresolution over multiple apertures or multiple exposures combines downsampled images with diverse sampling phases to restore $f \in \mathbb{R}^n$ from d different projections in $\mathbb{R}^{n/d}$. We discuss digital superresolution in Section 10.4.2. For the present purposes, the shift-coded downsampling operator is useful to illustrate regularization. By "shift coding" we mean the matrix that includes all single pixel shifts of the downsampling vector. For 4× downsampling the shift-coded operator is

$$H = \begin{bmatrix} \frac14 & \frac14 & \frac14 & \frac14 & 0 & 0 & \cdots & 0 \\ 0 & \frac14 & \frac14 & \frac14 & \frac14 & 0 & \cdots & 0 \\ 0 & 0 & \frac14 & \frac14 & \frac14 & \frac14 & \cdots & 0 \\ \vdots & & & \ddots & & & & \vdots \end{bmatrix} \qquad (8.28)$$
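The operators of Eqns (8.27) and (8.28) are easy to construct explicitly. Boundary handling for the shift-coded rows (truncation versus wraparound) is not recoverable from the text, so the sketch below truncates the kernel at the image edge; computing the singular values reproduces a spectrum of the kind shown in Fig. 8.5.

```python
import numpy as np

def downsample_op(n, d=4):
    """(n/d) x n Haar-average downsampling matrix, as in Eqn (8.27)."""
    H = np.zeros((n // d, n))
    for i in range(n // d):
        H[i, i * d:(i + 1) * d] = 1.0 / d
    return H

def shift_coded_op(n, d=4):
    """n x n shift-coded operator, as in Eqn (8.28): all single-pixel
    shifts of the length-d averaging kernel (truncated at the boundary)."""
    H = np.zeros((n, n))
    for i in range(n):
        H[i, i:i + d] = 1.0 / d
    return H

s = np.linalg.svd(shift_coded_op(256), compute_uv=False)
print(s[:3], s[-3:])   # large values near 1, small values approaching zero
```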

The singular value spectrum of a 256 × 256 shift-coded 4× downsample operator is illustrated in Fig. 8.5. Only one set of singular vectors is shown because the data and object space vectors are identical for Toeplitz matrices (e.g., matrices representing shift-invariant transformations) [112]. This singular value spectrum is typical of many measurement systems. Large singular values correspond to relatively low-frequency features in singular vectors. Small singular values correspond to singular vectors containing high-frequency components. By truncating the basis, one effectively lowpass-filters the reconstruction.

Transformations of images are greatly simplified if the system operator is separable in Cartesian coordinates. A separable downsampling operator may operate on an image f with a left operator $H_l$ for vertical downsampling and a right operator $H_r$ for horizontal downsampling. As an example, Fig. 8.6(a) shows a particular image consisting of a 256 × 256-pixel array. We model measured data from this image as

$$g = H_l f H_r' + n \qquad (8.29)$$

The least mean-square estimate of the image for shift-coded 4× downsampling with $\sigma^2 = 10^{-4}$ normally distributed additive noise is illustrated in Fig. 8.6(b). As expected, the mean-square error is enormous because of the ill-conditioned measurement operators. Figure 8.6(c) is a truncated SVD reconstruction from the same data using the first 125 of 256 singular vectors. One observes both artifacts and blurring in the truncated SVD image; the loss of spatial resolution is illustrated in a detail from the center of the image in Fig. 8.7.

The mean-square error in the truncated SVD reconstruction (0.037) exceeds the measurement variance by more than two orders of magnitude. The MSE includes effects due to both noise and reconstruction bias, however. Since the truncated SVD reconstruction is not of full rank, image components in the null space of the reconstruction operator are dropped and lead to bias in the estimated image. One may consider that the goal of truncated SVD reconstruction is to measure the projection of f on the subspace spanned by the high singular value components. In this case, one is more interested in the error between the estimated projection and the true projection, $\|P_{V_H} f - P_{V_H} f_e\|^2$. For the image of Fig. 8.6(c) the mean-square projection error is $3.3 \times 10^{-4}$, which is 3× larger than the measurement variance. The vast majority of the difference between the reconstructed image and the original arises from bias due to the structure of the singular vectors. As discussed in Section 8.5, it might be possible to remove this bias using nonlinear inversion algorithms.

Figure 8.5 Singular values of a 256 × 256 shift-coded 4× downsample operator.

Figure 8.6 A 256 × 256 image reconstructed using linear least-squares and truncated SVD: (a) original; (b) least-squares reconstruction, MSE = 51.4; (c) truncated SVD, MSE = 4.18 × 10⁻³.

Figure 8.7 Detail of the original image (a) and the truncated SVD reconstruction (b).

Tikhonov regularization addresses the noise sensitivity of the pseudoinverse by constraining the norm of the estimated signal. The basic idea is that since noise causes large fluctuations, damping such fluctuations may reduce noise sensitivity. The goal is to find $f_e$ minimizing $\|g - Hf_e\|^2 + \lambda_o^2 \|f_e\|^2$. In terms of the SVD, the Tikhonov solution is

$$f_e = \sum_{i=1}^{r} \frac{\lambda_i^2}{\lambda_i^2 + \lambda_o^2} \, \frac{\langle u_i, g \rangle}{\lambda_i} \, v_i$$

As $\lambda_o \to 0$, the Tikhonov solution is the pseudoinverse solution (or least squares in the case of a rectangular system matrix). One may expect the Tikhonov solution to resemble the order-k truncated SVD solution in the range that $\lambda_k \approx \lambda_o$. Figure 8.8 is a Tikhonov reconstruction of the data from Fig. 8.6 with $\lambda_o = 0.3$. There is no Tikhonov regularization parameter that obtains MSE comparable to the truncated SVD for this particular image, but one may expect images with more high-frequency content to achieve better Tikhonov restoration. Just as estimation of $S_n(u, v)$ is central to the Wiener filter, determination of $\lambda_o$ is central to Tikhonov regularization. Tikhonov regularization is closely related to Wiener filtering; both are part of a large family of similar noise damping strategies. Since our primary focus here is on the design of H, we refer the reader to the literature for further discussion [111].

Figure 8.8 Reconstruction of the 4× downsampled shift-coded system using Tikhonov regularization. Detail images at the bottom compare the same original and reconstructed regions as in Fig. 8.7.

The nominal design goal for implicit coding is basically the same as for pixel and convolutional coding: making the singular spectrum flat. Hadamard, Fourier transform, and identity matrices perform well under least-squares inversion because their singular values are all equal. Any measurement matrix formed of orthogonal row vectors similarly achieves uniform and independent estimation of the singular values (with the measurement row vectors forming the object space singular vectors). For the reasons listed in Section 8.3, however, there are many situations where unitary H is impossible or undesirable.
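In SVD terms, the Tikhonov estimate above is a reweighting of the pseudoinverse components by the filter factors $\lambda_i^2/(\lambda_i^2 + \lambda_o^2)$; a minimal sketch:

```python
import numpy as np

def tikhonov_estimate(H, g, lam0):
    """Tikhonov-regularized estimate: pseudoinverse components
    damped by the filter factor lam_i^2 / (lam_i^2 + lam0^2)."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    # s/(s^2 + lam0^2) = [s^2/(s^2 + lam0^2)] * (1/s); finite even as s -> 0
    return Vt.T @ ((s / (s ** 2 + lam0 ** 2)) * (U.T @ g))
```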

For implicit coding systems in particular, one seeks to optimize sensor system performance over a limited range of physical control parameters. Relatively subtle changes in sampling strategy may substantially impact signal estimation. As an example, consider again a 4× downsampling system. Suppose that one can implement any 8-element shift-invariant sampling code with four elements equal to 1/4 and four elements equal to 0. The downsampling code 11110000/4, with SVD spectrum illustrated in Fig. 8.5, is one such example, but there are 70 different possible codes. Figure 8.9 plots the singular values for three such codes for a 128 × 128 measurement matrix. The 11110000 code produces the largest singular values for low-frequency singular vectors but lower singular values in the midrange of frequency response. The other example codes produce fewer low-frequency singular vectors and yield higher singular values in midrange. Figure 8.10 shows the $\lambda_o = 0.3$ Tikhonov reconstruction of the detail region shown in Figs. 8.7 and 8.8 for these codes with $\sigma^2 = 10^{-4}$. The MSE is higher for the noncompact PSFs, but one can argue that the Tikhonov reconstruction using the 11100100 code captures features missed by the 11110000 code. Truncated SVD reconstruction using the disjoint codes produces artifacts due to the higher-frequency structure of the singular vectors. At this point, we argue only that code design matters, leaving our discussion of how it might matter to the next section.

More generally, we may decompose f in terms of the object space singular vectors as

$$f = \sum_i f_i^{SV} v_i$$

Since identically and independently distributed zero-mean noise maintains these properties under unitary transformation, one obtains the covariance statistics of Eqn (8.8) on least-squares inversion of Eqn (8.33). In fact, since $\Lambda$ is diagonal, each singular value component can be independently estimated with variance $\sigma^2/\lambda_i^2$.

One may confidently say that optical measurement effectively consists of measuring the singular value components $f_i^{SV}$ for $\lambda_i > \sigma$. One has less confidence in asserting how one should design the structure of the singular vectors or how one should estimate f from the singular value components. Building on our discussion from Section 7.5.4, one generally seeks to design H such that f is not concentrated in the null space $V_\perp$ and such that distinct images are mapped to distinct measurements. So long as these requirements are satisfied, one has some hope of reconstructing f accurately.

Figure 8.9 Singular value spectra for Toeplitz matrix sampling using eight-element convolutional codes. The code elements listed as 1 are implemented as 1/4 so that the singular values are comparable to those in Fig. 8.5.

Truncated SVD data are anticompressive in the sense that one obtains fewer measurement data values than the number of raw measurements recorded. As we see with the reconstructions in this section, this does not imply that the number of estimated pixels is reduced. One may ask, however, why not measure the SVD projections directly? With this question we arrive at the heart of optical sensor design. One is unlikely to have the physical capacity to implement optimal object space projectors in a measurement system. Physical constraints on H determine the structure of the measurements. Optical sensor design consists of optimizing the singular values and singular vectors within physical constraints to optimize signal estimation. To understand the full extent of this problem, one must also consider the possibility of nonlinear image estimation, which is the focus of the next section.

Figure 8.10 Tikhonov and truncated SVD reconstruction of the detail region of Fig. 8.7. Tikhonov reconstruction with $\lambda_0 = 0.3$ is illustrated on the left; the top image corresponds to the 11110000 code. The SVD on the right used the first 125 of 256 singular vectors from the left and right.

8.5 INVERSE PROBLEMS

As discussed in Section 7.5, a generalized sampling system separates the processes of measurement, analysis, and display sampling. Generalized measurements consist of multiplex projections of the object state. With the exception of principal component analysis, the signal estimation algorithms mentioned in Section 7.5 bear little resemblance to the estimation algorithms considered thus far in the present chapter. As we have seen, however, linear least squares is only appropriate for well-conditioned measurement systems. Regularization methods, such as the Wiener filter and truncated SVD reconstruction, have wider applicability but produce biased reconstructions. The magnitude of the bias may be expected to grow as the effective rank (the number of useful singular values) drops.

Regularized SVD reconstruction differs sharply in this respect from compressed sensing. As discussed in Section 7.5.4, a compressively sampled sparse signal may be reconstructed without bias even though the measurement operator is of low rank. The present section considers similar methods for estimation of images sampled by ill-conditioned operators.

Prior to considering estimation strategies, it is useful to emphasize lessons learned in Section 8.4. Specifically, no matter what type of generalized sampling one follows in forward system design, the singular vectors of the as-implemented measurement model provide an excellent guide to the data that one actually measures. One may regard design of the singular vectors as the primary goal of implicit coding. Evaluation of the quality of the singular vectors depends on the image estimation algorithm.

Image estimation and analysis from a set of projections $f_i^{SV} = \langle v_i, f \rangle$ is an extraordinarily rich and complex subject. One can imagine, for example, that each singular vector could respond to a feature in a single image. One might in this case identify the image by probabilistic analysis of the relative projections of the measurements. Once identified, the full image might be reconstructed on the basis of a single measurement value. One can imagine many variations on this theme targeting specific image features. As the primary focus of this text is the design of optical systems to estimate mostly unconstrained continuous images and spectra, however, we limit our attention to more evolutionary revisions of least-squares methods.

As discussed at the end of Section 8.1, inverse problems have a long history and an extensive bibliography. The main objectives of the present section are to present a few examples to prepare the reader for design and analysis exercises in this and succeeding chapters. Inversion algorithms continue to evolve rapidly in the literature; the interested reader is well advised to explore beyond the simple presentation in this text.

We focus here on the two most popular strategies for image and spectrum estimation: convex optimization and maximum likelihood methods.

8.5.1 Convex Optimization

The inverse problem returns an estimated image $f_e$ given the measurements $g = Hf + n$. Optimization-based estimation algorithms augment the measurements with an objective function $\gamma(f_e)$ describing the quality of the estimated image on the basis of prior knowledge. The objective function returns a scalar value. The optimization-based inverse problem may be summarized as follows:

$$f_e = \arg\min_f \gamma(f) \quad \text{such that} \quad \|g - Hf\| \le \epsilon \qquad (8.35)$$

Image estimation using an objective function consists of finding the image estimate $f_e$ consistent with the measurements that also minimizes the objective function.

The core issues in optimization-based image estimation are (1) selection of the objective function and (2) numerical optimization. The objective function may be derived from:

† Physical Constraints. Unconstrained estimators may produce images that violate known physical properties of the object. The most common example in optical systems is nonnegativity. Optical power spectra and irradiance values cannot be negative, but algebraic and Wiener filter inversion commonly produces negative values from noisy data. Optimization of least-squares estimation with an objective function produces a better signal estimate than does truncation of nonphysical values.

† Functional Constraints. Natural objects do not consist of assortments of independent random pixels (commonly called "snow" in the age of analog television). Rather, pixel values are locally and globally correlated. Local correlation is often described as "smoothness," and pixels near a given pixel are likely to have similar values. Global correlation is described by sharpness, and edges tend to propagate long distances across an image. An objective function can enforce smoothness by limiting the spatial gradient of a reconstructed image and sharpness by constraining coefficients in wavelet or "curvelet" decompositions. Sparsity, as applied in compressive sampling, is also a functional constraint.

† Feature Constraints. At the highest level, image inference may be aware of the nature of the object. For example, knowledge that one is reconstructing an image of a dog may lead one to impose a "dog-like" constraint. Such higher-order analysis lies at the interface between computational imaging and machine vision and is not discussed here.

Constrained least-squares estimators provide the simplest optimization methods. Lawson and Hanson [146] present diverse algorithms for variations on the least-squares estimation problem, including the algorithm for nonnegative estimation implemented in Matlab as the function lsqnonneg. lsqnonneg is a recursive algorithm designed to move the ordinary least-squares solution to the nearest nonnegative solution.
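SciPy exposes the Lawson–Hanson active-set algorithm as scipy.optimize.nnls; a minimal comparison with unconstrained least squares (the system matrix and object below are illustrative placeholders):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
H = rng.uniform(size=(32, 16))            # nonnegative system response
f = np.abs(rng.standard_normal(16))       # nonnegative object
g = H @ f + 0.01 * rng.standard_normal(32)

f_ols, *_ = np.linalg.lstsq(H, g, rcond=None)  # may contain negative values
f_nn, residual = nnls(H, g)                    # nonnegative estimate
print(f_ols.min(), f_nn.min())                 # f_nn.min() >= 0
```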

The least-gradient (LG) algorithm described by Pitsianis and Sun [31] provides a useful example of constrained least-squares methods. LG is closely related to well-known least squares with quadratic inequality (LSQI) minimization problems. The signal estimated by the LG algorithm is

$$f_{LG} = \arg\min_f \gamma(f) = \|\nabla f\|_2 \quad \text{such that} \quad Hf = g \qquad (8.36)$$

We obtain the LG solution in two steps. First, we find a particular least-squares solution $f_p$ to the linear equation $Hf = g$. The general solution to the equation can then be described as $f = f_p + Nc$, where N spans the null space of H, and c is an arbitrary coefficient vector. The problem described by Eqn (8.36) reduces to a linear least-squares problem without constraints:

$$f_{LG} = \arg\min_c \|\nabla(f_p + Nc)\|_2^2$$

The solution is expressed

$$f_{LG} = f_p - N(N^T \nabla^T \nabla N)^{-1} (\nabla N)^T \nabla f_p \qquad (8.37)$$

where we assume that $\nabla N$ is of full rank in columns. The general solution [Eqn (8.37)] does not depend on the selection of a particular solution $f_p$ to the measurement equation. More advanced strategies than ordinary least-squares inversion include QR factorization of the measurement matrix. Other approaches, like lsqnonneg, require iterative processing.
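A direct numpy transcription of the two-step LG solution, using a first-difference matrix for $\nabla$ and the trailing singular vectors of H as the null-space basis N; both choices are assumptions of this sketch rather than prescriptions of the text.

```python
import numpy as np

def least_gradient(H, g, tol=1e-10):
    """Least-gradient estimate per Eqn (8.37): a particular least-squares
    solution f_p smoothed over the null space of H to minimize ||grad f||."""
    m, n = H.shape
    f_p, *_ = np.linalg.lstsq(H, g, rcond=None)       # particular solution
    U, s, Vt = np.linalg.svd(H)
    r = int(np.sum(s > tol * s[0]))                   # numerical rank
    N = Vt[r:].T                                      # columns span null(H)
    if N.shape[1] == 0:                               # no null space to smooth over
        return f_p
    D = np.diff(np.eye(n), axis=0)                    # discrete gradient operator
    DN = D @ N
    c = np.linalg.solve(DN.T @ DN, DN.T @ (D @ f_p))  # normal equations
    return f_p - N @ c                                # Eqn (8.37)
```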


Figures 8.11 and 8.12 plot example LG reconstructions using the signal of Fig. 3.9. The measurement operator shown in Fig. 8.11 takes the level 0 Haar averages. (The function is modeled using 1024 points. The measurement operator consists of a 16 × 1024 matrix; 64 consecutive values in each row are 1.) The measurement operator is a 64× downsample matrix. Figure 8.11(a) shows the true function and the least-squares inversion from the downsampled data. Figure 8.11(b) is the LG reconstruction. For these measurements, LG estimation may be simply regarded as interpolation on sampled data.

Figure 8.12 considers the same data with the rect(x) sampling kernel replaced by sinc(8x). The measurement operator is again 16 × 1024. As shown in Fig. 8.12(b), the least-squares inversion reflects the structure of the singular vectors of the measurement operator. The LG operator uses null space smoothing to remove the naive structure of the singular vectors. The efficacy of LG and other constrained least-squares methods depends on the structure of the sampled signal space. For example, the sinc(8x) sampling function may achieve better results on sparse signals, as illustrated in Fig. 8.13, which compares Haar and sinc kernel measurement for a signal consisting of two Gaussian spikes.

consist-The ability to implement computationally efficient spatially separable processing

is a particular attraction of linear constrained reconstruction For example, the coded downsample operator of Section 8.4 may be inverted simply by operating

shift-Figure 8.11 Reconstructions of the signal of Fig 3.9 as sampled on the Haar basis of order 0: (a) the true function and the least-squares estimate; (b) the least gradient; (c) the measurement operator H.

Trang 21

Eqn (8.29) from the left and right by the using the LG operator of Eqn (8.37).Figure 8.14 uses this approach to demonstrate a slight improvement in image fidelityunder LG smoothing of the Tikhonov regularized image of Fig 8.8 Of course, theshift-coded downsample operator does not have a null space, but Fig 8.14 treatsthe 156 singular vectors corresponding to the smallest singular values as the nullspace for LG optimization.

Equation (8.35) is a convex optimization problem if $\gamma(f)$ is a convex function. A set of points $V_f$, such as the domain of input objects, is convex if for all $f_1, f_2 \in V_f$ and all $0 \le \alpha \le 1$, the point $\alpha f_1 + (1 - \alpha) f_2$ is also in $V_f$. The point $\alpha f_1 + (1 - \alpha) f_2$ is on the line segment between $f_1$ and $f_2$, at a distance $(1 - \alpha)\|f_1 - f_2\|$ from $f_1$ and $\alpha \|f_1 - f_2\|$ from $f_2$. $\gamma(f)$ is a convex function if $V_f$ is a convex set and

$$\gamma(\alpha f_1 + (1 - \alpha) f_2) \le \alpha \gamma(f_1) + (1 - \alpha) \gamma(f_2)$$

The basic idea of convex optimization is illustrated in Fig. 8.15. Figure 8.15(a) illustrates a convex function as a density map over a convex region in 2D. Figure 8.15(b) shows a nonconvex set in the 2D plane. Optimization is implemented by a search algorithm that moves from point to point in $V_c$. Typically, the algorithm analyzes the gradient of $\gamma(f)$ and moves iteratively to reduce the current value of $\gamma$. If $V_c$ and $\gamma(f)$ are convex, it turns out that any local minimum of the objective function discovered in this process is also the global minimum over $V_c$ [24]. If, as illustrated in Fig. 8.15(b), $V_c$ is not convex, then the search may be trapped in a local minimum. Simple gradient search algorithms converge slowly, but numerous fast algorithms have been developed for convex optimization [24].

Figure 8.13 Reconstructions of a pair of isolated Gaussian signals as captured by the zeroth-order Haar function and by the sampling function sinc(8x) (shown in Fig. 8.12): (a) the true function and the least-squares estimate for each sampling function; (b) the true function and the least-gradient reconstructions.

Figure 8.14 Least-gradient reconstruction of the Tikhonov regularized image of Fig. 8.8.

Equation (8.35) is a constrained optimization problem, with optimization of the objective function as the goal and the forward model as the constraint. A general approach to solving the constrained problem reduces Eqn (8.35) to the unconstrained optimization problem

$$f_e = \arg\min_f \left\{ \|g - Hf\|_2^2 + \lambda \, \gamma(f) \right\}$$

solved over a sequence of Lagrange multipliers $\lambda \to \lambda_0$. Algorithms under which this iteration rapidly converges have been developed [96], leaving rapid solution of the unconstrained minimization problem as the heart of convex optimization.

A linear constraint with a quadratic objective function provides the simplest form of convex optimization problem. As observed for Eqn (8.36), this problem can be solved algebraically. One may find, of course, that the algebraic problem requires advanced methods for large matrices. At the next level of complexity, many convex optimization problems provide differentiable objectives. These problems are solved by gradient search algorithms, usually based on "Newton's method" for conditioning the descent. At a third level of complexity, diverse algorithms mapping optimization problems onto linear programming problems, interior point methods, and iterative shrinkage/thresholding algorithms may be considered. Software for convex optimization and inverse problems is summarized on the Rice University compressive sensing website (www.dsp.ece.rice.edu/cs/), on the Caltech l1-magic site (www.acm.caltech.edu/l1magic/), on Boyd's webpage (www.stanford.edu/boyd/cvx/), and on Figueiredo's website (www.lx.it.pt/mtf/).

Figure 8.15 Boundary (a) outlines a convex set in 2D. Minimization of a convex function over this set finds the global minimum. Boundary (b) outlines a nonconvex set. Minimization of a convex function over this set may be trapped in a local minimum.

One may imagine many objective functions for image and spectrum estimation and would certainly expect that as this rapidly evolving field matures, objective functions of increasing sophistication will emerge. At present, however, the most commonly applied objective functions are the $\ell_1$ norm emerging from compressive sampling theory [59,39,40] and the total variation (TV) objective function [212]:

$$\gamma_{TV}(f) = \sum_{i,j=1}^{N-1} \sqrt{(f_{i+1,j} - f_{ij})^2 + (f_{i,j+1} - f_{ij})^2} \qquad (8.41)$$

The $\ell_1$ objective is effective if the signal is sparse on the analysis basis, and the TV objective is effective if the gradient of the signal is sparse. Since TV is often applied to image data, we index f in 2D in Eqn (8.41). The first term under the root analyzes the discrete horizontal gradient and the second, the vertical gradient.
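Eqn (8.41) is straightforward to evaluate in numpy over the (N−1) × (N−1) interior grid:

```python
import numpy as np

def gamma_tv(f):
    """Isotropic total-variation objective of Eqn (8.41) for a 2D array f."""
    d1 = f[1:, :-1] - f[:-1, :-1]   # f_{i+1,j} - f_{i,j}
    d2 = f[:-1, 1:] - f[:-1, :-1]   # f_{i,j+1} - f_{i,j}
    return np.sum(np.sqrt(d1 ** 2 + d2 ** 2))
```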

As illustrated in Figs. 7.25 and 7.27, the $\ell_1$ objective is often applied to signals that are not sparse in the display basis. One assumes, however, that there exists a useful basis on which the signal is sparse. Let $u = Wf$ be a vector describing the signal on the sparse basis. The optimization problem may then be described as

$$u_e = \arg\min_u \|u\|_1 \quad \text{such that} \quad g = H W^{-1} u \qquad (8.42)$$

Determination of the sparse basis is, of course, a central issue under this approach. Current strategies often assume a wavelet basis or use hyperoptimization strategies to evaluate prospective bases.
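For the equality-constrained problem (8.42) with W the identity (a signal sparse in the natural basis, as in the xenon example below), $\ell_1$ minimization can be posed as a linear program by splitting u into nonnegative parts. This sketch uses scipy.optimize.linprog rather than the l1-magic solver cited in the text; the signal and projection count are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Basis pursuit: min ||u||_1 subject to A u = b, as a linear program
    with u = u_plus - u_minus and u_plus, u_minus >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                    # minimize sum(u+ + u-) = ||u||_1
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n), method="highs")
    return res.x[:n] - res.x[n:]

# illustrative sparse recovery from random projections
rng = np.random.default_rng(0)
n, k, m = 256, 4, 60
f = np.zeros(n)
f[rng.choice(n, k, replace=False)] = rng.uniform(1.0, 2.0, k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
f_est = l1_min(A, A @ f)
print(np.max(np.abs(f_est - f)))          # near-exact recovery expected
```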

We consider a simpler example here, focusing on the atomic discharge spectrum of xenon. Atomic discharge spectra consist of very sharp discrete features, meaning that they are typically sparse in the natural basis. Figure 8.16(a) shows the spectrum of a xenon discharge lamp measured to 0.1 nm resolution over the spectral range 860–930 nm. The spectrum was collected by the instrument described by Wagadarikar et al. [243]. Measured data extended slightly beyond the display limits; 765 experimental data samples were used for the simulations shown in Fig. 8.16. Figure 8.16(b) is the spectral estimate reconstructed from 130 random projections of the spectrum. The reconstruction used the Caltech l1-magic program l1eq_example.m. Typical results have reported that sparse signals consisting of K features require approximately 3K random projections for accurate reconstruction. While the xenon spectrum contains only four features over this range, each feature is approximately 0.5 nm wide in these data, suggesting that there are 20–30 features in the spectrum. The experimental spectrum, including background noise, was presented to the simulated measurement system.

Figure 8.16(c) shows baseline details for diverse measurement and reconstruction data. The plot on the 1 baseline is the experimental data, which has slight noise features on the baseline. The plot on the 2 baseline is the reconstructed data from (b). The 3 baseline shows the reconstruction obtained from 130 projections if the baseline noise is thresholded off of the experimental data prior to simulated measurement. The 4 baseline shows the reconstructed data from the noisy experimental data if 200 projections are used. The 5 baseline shows the reconstruction from 100 projections, and the 6 baseline shows the reconstruction from 90 random projections. The random projections used the normal distribution measurement operator generated by the original l1-magic program. As illustrated in the figure, estimated signal degradation is rapid if the sample density falls below a critical point. Note that each successive trace in Fig. 8.16 is shifted to the right by 1 nm to aid visualization.

A second example uses the TV objective function and the two-step iterative shrinkage/thresholding algorithm (TWIST) [20]. As discussed in [76], the original iterative shrinkage/thresholding algorithm combines maximum likelihood estimation with wavelet sparsity. We briefly review maximum likelihood methods in Section 8.5.2. For the present purposes, we simply treat TWIST as a black-box optimizer of the TV objective.

We use TWIST to consider again the 4× downsample shift code. Rather than force model consistency with the full measurement operator, however, we focus on the optimization problem

$$f_e = \arg\min_f \gamma_{TV}(f) \quad \text{such that} \quad f_{i,e}^{SV} = g_i^{SV} \ \text{for all} \ i \le r \qquad (8.43)$$

where r is the rank of the truncated SVD and $f_{i,e}^{SV}$ and $g_i^{SV}$ are the projections onto the singular vectors discussed in Section 8.4. This optimization forces consistency with the high-singular-value vectors, treating those vectors as generalized measurement projectors.

Reconstruction under this algorithm is illustrated in Fig. 8.17, which analyzes the same image as in Fig. 8.6 using the sampling codes 11110000 and 11100100. The first 125 out of 256 singular vectors are used in each case. In comparison with Figs. 8.8 and 8.10, we observe that truncated SVD reconstruction augmented by TWIST optimization substantially improves the image in each case. While the disjoint code performs worse under truncated SVD and Tikhonov reconstruction, it …

Figure 8.16 (a) Discharge spectrum of xenon measured by Wagadarikar et al. [243]; (b) reconstruction using $\ell_1$ minimization from 130 random projections; (c) reconstruction baseline detail for several strategies.

Figure 8.17 Reconstruction of the image of Fig. 8.6 using SVD/TWIST optimization satisfying Eqn (8.43) for the first 125 singular vectors: (a) MSE = 2.94 × 10⁻³ using the 11110000 shift code; (b) MSE = 2.65 × 10⁻³ using the 11100100 code.
