Introduction
Although rarely tackled at the undergraduate level, the singular value decomposition (SVD) is extremely useful, particularly in statistics and signal processing. In Lab 19, we looked at square matrices that are diagonalizable and orthogonally diagonalizable. An important fact about diagonalization is that the resulting diagonal matrix contains the eigenvalues of the original matrix on its main diagonal. However, not all matrices are diagonalizable; in that case we can turn to the singular value decomposition. If $A$ is an $m \times n$ matrix, its singular values $\sigma_j$ are the square roots of the eigenvalues of the matrix $A^TA$.
The term singular value relates to the distance from the given matrix to a singular matrix. The idea behind the SVD is that every matrix $A$ can be decomposed into the product $U\Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma_{ii} = \sigma_i$ with $\Sigma_{ij} = 0$ otherwise.
Recall from Lab 19 that all symmetric matrices are orthogonally diagonalizable. Since $A^TA$ is symmetric for every matrix $A$, we can find an orthogonal matrix $P$ such that $A^TA = PDP^T$.
The SVD is notable in that it exists for every matrix, and it can be used to find the best (optimal) rank-$k$ approximation of a matrix.
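To make the rank-$k$ claim concrete, here is a minimal MATLAB sketch of truncating the SVD to its $k$ largest singular values; the test matrix and the choice $k = 2$ are purely illustrative.

    % Best rank-k approximation via a truncated SVD (illustrative example).
    A = magic(5);                             % any matrix will do here
    [U, S, V] = svd(A);                       % full SVD: A = U*S*V'
    k = 2;                                    % keep the two largest singular values
    Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';   % best rank-k approximation to A
    norm(A - Ak)                              % error equals S(k+1,k+1)

The 2-norm error of the truncation is exactly the first discarded singular value, which is what makes the truncated SVD optimal among all rank-$k$ matrices.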
Calculating the SVD
The SVD of an $m \times n$ matrix $A$ is $A = U\Sigma V^T$, where $U$ is an $m \times m$ orthogonal matrix whose columns form an orthonormal basis for $\mathbb{R}^m$, $V$ is an $n \times n$ orthogonal matrix whose columns form an orthonormal basis for $\mathbb{R}^n$, and $\Sigma$ is an $m \times n$ matrix such that $\Sigma_{ii} = \sigma_i$.
Since all symmetric matrices are orthogonally diagonalizable, we can find an orthogonal matrix $P$ such that
\[
A^TA = PDP^T = V\Sigma^T U^T U \Sigma V^T = V \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix} V^T
\]
and
\[
AA^T = U\Sigma V^T V \Sigma^T U^T = U \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_m^2 \end{pmatrix} U^T.
\]
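These identities are easy to confirm numerically. The short sketch below uses the $2 \times 3$ matrix from the example that follows, so its output anticipates the eigenvalues 10, 4, and 0 found there.

    % Check that the eigenvalues of A'*A are the squared singular values of A.
    A = [-1 0 3; 0 2 0];
    sort(eig(A'*A), 'descend')   % returns 10, 4, 0
    svd(A).^2                    % returns 10, 4 (A is 2x3, so two singular values)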
Note also that $A^TAv_i = \sigma_i^2 v_i$, $AA^Tu_i = \sigma_i^2 u_i$, and $Av_i = \sigma_i u_i$.

Example: Find the SVD of
\[
A = \begin{pmatrix} -1 & 0 & 3 \\ 0 & 2 & 0 \end{pmatrix}.
\]
Here
\[
A^TA = \begin{pmatrix} 1 & 0 & -3 \\ 0 & 4 & 0 \\ -3 & 0 & 9 \end{pmatrix}
\]
with eigenvalues 10, 4, and 0. Using normalized eigenvectors of $A^TA$, define
\[
V = \begin{pmatrix} -\frac{1}{\sqrt{10}} & 0 & \frac{3}{\sqrt{10}} \\ 0 & 1 & 0 \\ \frac{3}{\sqrt{10}} & 0 & \frac{1}{\sqrt{10}} \end{pmatrix}.
\]
Similarly, use the normalized eigenvectors of
\[
AA^T = \begin{pmatrix} 10 & 0 \\ 0 & 4 \end{pmatrix}
\]
to define
\[
U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
Finally, define
\[
\Sigma = \begin{pmatrix} \sqrt{10} & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix}.
\]

Exercise: Find the SVD of
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}.
\]
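As a quick numerical check of the worked example (and of your answer to the exercise), MATLAB's built-in svd returns the three factors directly. Keep in mind that singular vectors are only determined up to sign, so the columns of $U$ and $V$ may differ from a hand computation by a factor of $-1$.

    % Verify the worked example; U*S*V' should reconstruct A.
    A = [-1 0 3; 0 2 0];
    [U, S, V] = svd(A)
    U * S * V'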
Orthogonal Grids: Visualizing SVD
Here we look at a visualization of singular values. We begin by visualizing the transformation with square matrices. Just as we have learned how to apply linear transformations to vectors in $\mathbb{R}^2$, we can explore what happens if we apply those same transformations to the Cartesian grid. If the transformed grid lines remain orthogonal, we call the transformed grid an orthogonal grid. The following demonstration shows how linear transformations affect the orthogonality of grid lines.
Exercises:
a. To determine whether rotation or dilation alone changes the orthogonality of the grid, use https://www.mathworks.com/matlabcentral/fileexchange/65197-orthogonal-grids.
FIGURE 5.1: Orthogonal grids
b. Setting a = 2, b = 1, and d = 2, determine the approximate angle of rotation, $\theta$, that produces an orthogonal grid with axes defined by the red vectors in the demonstration. This particular transformation is called a shear transformation.
For the following exercises, use the demonstration https://www.mathworks.com/matlabcentral/fileexchange/65264-singular-values.
c. Denote the original (blue) vectors in the demonstration as $v_1$ and $v_2$. Using the same shear transformation described in b., determine the approximate lengths of the transformed (red) vectors, $Mv_1$ and $Mv_2$, when the shear grid axes are orthogonal. These lengths are called the singular values, $\sigma_1$ and $\sigma_2$, of $M$.
d. Vectors $u_1$ and $u_2$ are orthonormal vectors in the directions of $Mv_1$ and $Mv_2$, respectively, when $Mv_1$ and $Mv_2$ are orthogonal. Find $u_1$ and $u_2$.

Orthogonal Components of a Vector
The orthogonal components of a vector $v$ are given by $v = v_W + v_{W^\perp}$, where $v_W$ is in $W$ and $v_{W^\perp}$ is in $W^\perp$, the orthogonal complement of $W$. Given an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ for $W$, $v_W = (v_1 \cdot v)v_1 + (v_2 \cdot v)v_2 + \cdots + (v_n \cdot v)v_n$.

FIGURE 5.2: Singular values related to shear transformation
Using the theory above, for any vector $x$, $x = (v_1 \cdot x)v_1 + (v_2 \cdot x)v_2$ and thus $Mx = M(v_1 \cdot x)v_1 + M(v_2 \cdot x)v_2 = u_1\sigma_1(v_1 \cdot x) + u_2\sigma_2(v_2 \cdot x)$. Noting that for any two vectors $u$ and $w$, $u \cdot w = u^T w$, we can say that $Mx = u_1\sigma_1(v_1^T x) + u_2\sigma_2(v_2^T x)$.
More generally, $M = U\Sigma V^T$, where $U$ is a matrix whose columns are the vectors $u_1$ and $u_2$, $\Sigma_{ii} = \sigma_i$ with $\Sigma_{ij} = 0$ otherwise, and $V$ is a matrix whose columns are the vectors $v_1$ and $v_2$. This is called the singular value decomposition of $M$.
Exercise: Using your results from c. and d. above, determine $U$, $\Sigma$, and $V$ such that $U\Sigma V^T = M$, where $M$ is the shear matrix
\[
M = \begin{pmatrix} k & 1 \\ 0 & k \end{pmatrix}, \quad k = 2.
\]
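One way to check a hand computation for this exercise is against MATLAB's svd; again, the columns of $U$ and $V$ may differ from yours by a sign.

    % Compare hand-computed factors with MATLAB's SVD of the shear matrix.
    k = 2;
    M = [k 1; 0 k];
    [U, S, V] = svd(M)
    U * S * V'                 % should reproduce M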
Relating Eigenvalues and Singular Values
Recall that an eigenvector, $x$, is a nonzero solution to $(A - \lambda I)x = 0$, where $\lambda$ is an eigenvalue and $A$ is a square matrix. This system of homogeneous equations has a nontrivial solution precisely when $A - \lambda I$ is singular. We have gone into detail about eigenvalues and the corresponding eigenvectors of square matrices in Lab 14, but is there a similar concept for matrices that are not square?
In general, eigenvalues and singular values are not directly related, except when the matrix is symmetric: if a matrix $A$ is symmetric, then its singular values are the absolute values of its eigenvalues.
Note also that if $A$ is symmetric, then the eigenvectors of $A$ are the same as the eigenvectors of $A^TA$ and $AA^T$, and thus the normalized eigenvectors of $A$ can be used to define $V$ and $U$.
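The following sketch illustrates this relationship numerically; the symmetric matrix here is arbitrary (chosen with a negative eigenvalue, and deliberately not the exercise matrix below).

    % For a symmetric matrix, the singular values are |eigenvalues|.
    S = [3 -2; -2 0];   % symmetric, with eigenvalues 4 and -1
    eig(S)              % returns -1 and 4
    svd(S)              % returns 4 and 1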
Exercises: Define
\[
A = \begin{pmatrix} 25 & 15 \\ 15 & 25 \end{pmatrix}.
\]

a. Determine the eigenvalues and singular values of $A$. Use the singular values to define $\Sigma$.
b. Find the eigenvectors of $A$ and determine $V$ and $U$.

Application to Data Imaging: Reducing Noise
The SVD is regularly used to smooth out noisy data in applications such as imaging. By retaining only the dominant singular values in the singular value decomposition, one can begin to eliminate the noise in a data set.
Exercises:
a. Define the data set,

    data = [0 1 0 1 0 1 0 1 0 1; 1 0 1 0 1 0 1 0 1 0;
            0 1 0 1 0 1 0 1 0 1; 1 0 1 0 1 0 1 0 1 0;
            0 1 0 1 0 1 0 1 0 1; 1 0 1 0 1 0 1 0 1 0;
            0 1 0 1 0 1 0 1 0 1; 1 0 1 0 1 0 1 0 1 0;
            0 1 0 1 0 1 0 1 0 1; 1 0 1 0 1 0 1 0 1 0];

and type colormap(gray); image(75*data) to see the data set without noise.
b. Define a noisy data set and use the image command to visualize the noisy data. Type

    r = -.4 + (.4 + .4)*rand(10,10);
    noisy = r + data;
    image(75*noisy);

This noisy data set will be your matrix $M$ for the SVD algorithm. The noisy data set is the original data with some random noise added in.
c. Define M = noisy, and type svd(M) to see a list of all of the singular values of M. Determine the dominant singular values (and, even more importantly, the number of dominant singular values). These will be the ones that you include in your SVD to reduce the noise in the data.
d. Type [U,W,V] = svds(M,n), where n is the number of dominant singular values you wish to include in the SVD (determined in part c.). Note that the matrices for the SVD will be stored in U, W, and V. Multiply U*W*transpose(V) to find an improved data set with reduced noise. Use the image command to visualize this improved data. (You may want to type image(75*U*W*transpose(V)) in order to see the contrast.)
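For reference, here is a minimal end-to-end sketch of parts a. through d. assembled into one script. The choice n = 2 is an assumption based on this particular checkerboard (its two dominant singular values are both 5); your inspection of svd(M) in part c. should confirm the count before relying on it.

    % End-to-end noise-reduction sketch (assumes n = 2 dominant values).
    data = repmat([0 1; 1 0], 5, 5);       % the 10x10 checkerboard from part a.
    r = -.4 + (.4 + .4)*rand(10,10);       % uniform noise on [-0.4, 0.4]
    noisy = r + data;
    M = noisy;
    n = 2;                                 % number of dominant singular values
    [U, W, V] = svds(M, n);                % truncated SVD keeping n values
    improved = U*W*transpose(V);           % reduced-noise approximation
    colormap(gray);
    subplot(1,3,1); image(75*data);     title('original');
    subplot(1,3,2); image(75*noisy);    title('noisy');
    subplot(1,3,3); image(75*improved); title('denoised');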
The SVD is also applied extensively to the study of linear inverse problems, and it is useful in the analysis of regularization methods such as Tikhonov's. It is widely used in statistics, where it is related to principal component analysis, and in signal processing and pattern recognition.