8.3 PCA-Based Collaborative Filtering
8.3.2 Incremental Computation of the Singular Value Decomposition
In what follows, we shall present an approach to incremental computation of the SVD. In a nutshell, we shall address the following question: given a matrix $A$ and a vector $a$, how can we express the SVD of $[A, a]$ in terms of that of $A$? The subsequent lemma, which summarizes the reasoning of Brand [Bra06, Bra03, Bra02, GE94], is crucial.
Lemma 8.1 Let $A \in \mathbb{R}^{m \times n}$, $a \in \mathbb{R}^m$. Furthermore, let $A = U_r S_r V_r^T$, where $r = \operatorname{rank} A$, be a full-rank truncated SVD. Let $U := U_r$, $S := S_r$, and $V := V_r$. Then we have

$$[A, a] = \left[ U, \frac{a_\perp}{\|a_\perp\|} \right] \begin{bmatrix} S & U^T a \\ 0^T & \|a_\perp\| \end{bmatrix} \begin{bmatrix} V^T & 0 \\ 0^T & 1 \end{bmatrix}, \qquad (8.15)$$

where $a_\perp := (I - U U^T) a$.
The proof is by a straightforward evaluation of the right-hand side of (8.15).
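Indeed, identity (8.15) can also be checked numerically. The following Python sketch (all variable names are ad hoc) builds both sides of (8.15) for a random full-column-rank matrix and compares them:

```python
import numpy as np

# Numerical sanity check of identity (8.15); all names are ad hoc.
rng = np.random.default_rng(0)
m, n = 6, 4
A = rng.standard_normal((m, n))    # a.s. full column rank, so r = n
a = rng.standard_normal(m)

# Full-rank truncated SVD of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S, V = np.diag(s), Vt.T

a_perp = a - U @ (U.T @ a)         # a_perp := (I - U U^T) a
norm = np.linalg.norm(a_perp)

U_t = np.column_stack([U, a_perp / norm])
S_t = np.block([[S, (U.T @ a)[:, None]],
                [np.zeros((1, S.shape[0])), np.array([[norm]])]])
V_t = np.block([[V, np.zeros((V.shape[0], 1))],
                [np.zeros((1, V.shape[1])), np.ones((1, 1))]])

lhs = np.column_stack([A, a])
rhs = U_t @ S_t @ V_t.T
assert np.allclose(lhs, rhs)       # (8.15) holds
```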
Without loss of generality, we shall henceforth assume that $a \ne 0$. The simplest case obtains if $U^T a = 0$, that is, $a$ lives in the orthogonal complement of the range of $A$. Then, up to a permutation, (8.15) is already an SVD of $[A, a]$, and we are done. Now let us assume that $U^T a \ne 0$. For the sake of simplicity, we may safely neglect the case where $a_\perp = 0$, since a slight and obvious modification of the subsequent argumentation will do the trick. Under this assumption, both of the matrices
$$\tilde U := \left[ U, \frac{a_\perp}{\|a_\perp\|} \right], \qquad \tilde V := \begin{bmatrix} V & 0 \\ 0^T & 1 \end{bmatrix}$$

have orthogonal columns. Hence, if we are given a full-rank truncated SVD

$$\tilde S := \begin{bmatrix} S & U^T a \\ 0^T & \|a_\perp\| \end{bmatrix} = U' S' V'^T,$$

then the matrices $\hat U := \tilde U U'$ and $\hat V := \tilde V V'$ have orthogonal columns. Thus, the desired full-rank truncated SVD is given by

$$[A, a] = \hat U S' \hat V^T,$$

and we are done. Therefore, it all comes down to computing a full-rank truncated SVD of $\tilde S$. Unfortunately, the efficient computation of an SVD of $\tilde S$ is left open in Brand's papers. We therefore outline the approach devised by Paprotny [Pap09].
Fortunately, it turns out that the special structure of this matrix may be exploited for efficient computation: we have

$$\tilde S \tilde S^T = \begin{bmatrix} S S^T & 0 \\ 0^T & 0 \end{bmatrix} + z z^T, \qquad \text{where } z := \begin{bmatrix} U^T a \\ \|a_\perp\| \end{bmatrix}.$$
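This block identity is easy to verify numerically. In the following sketch, the vector `w` and the scalar `rho` are ad hoc stand-ins for $U^T a$ and $\|a_\perp\|$:

```python
import numpy as np

# Verifies S~ S~^T = blkdiag(S S^T, 0) + z z^T with z = [U^T a; ||a_perp||].
rng = np.random.default_rng(1)
r = 3
S = np.diag(rng.uniform(1.0, 2.0, r))
w = rng.standard_normal(r)             # stands in for U^T a
rho = 0.7                              # stands in for ||a_perp||

S_t = np.block([[S, w[:, None]],
                [np.zeros((1, r)), np.array([[rho]])]])
z = np.append(w, rho)

D = np.zeros((r + 1, r + 1))
D[:r, :r] = S @ S.T                    # blkdiag(S S^T, 0)
assert np.allclose(S_t @ S_t.T, D + np.outer(z, z))
```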
Hence, $\tilde S \tilde S^T$ is a rank-1 modification of a diagonal matrix. Again, the computation of a spectral decomposition of matrices of this type is well understood in numerical linear algebra. A sophisticated approach presented in [GE94] relies crucially on the solution of a so-called secular equation. One of the most fundamental insights of finite-dimensional spectral theory is that the eigenvalues of a matrix $A$ are, respecting multiplicities, precisely the roots of the characteristic polynomial

$$\chi_A(\lambda) := \det(A - \lambda I).$$
It turns out that the characteristic polynomial of a rank-1 modification of a diagonal matrix has a closed-form representation.
Proposition 8.3 Let $\Lambda := \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ be a real diagonal matrix and $x \in \mathbb{R}^n$. Then the characteristic polynomial of $\Lambda + x x^T$ is given by

$$\det\left(\Lambda + x x^T - \lambda I\right) = \det(\Lambda - \lambda I)\left(1 + x^T (\Lambda - \lambda I)^{-1} x\right) = \left( \prod_{i=1}^n (\lambda_i - \lambda) \right) \left( 1 + \sum_{i=1}^n \frac{x_i^2}{\lambda_i - \lambda} \right).$$

A proof may be found in, e.g., [GE94].
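The closed form is readily checked numerically at a test point that is not an eigenvalue of $\Lambda$; the values below are arbitrary:

```python
import numpy as np

# Checks the closed form of Proposition 8.3 at a single test point lam.
rng = np.random.default_rng(2)
n = 4
lams = np.array([0.5, 1.2, 2.0, 3.3])  # diagonal of Lambda
x = rng.standard_normal(n)
lam = 0.9                              # evaluation point, not in {lams}

lhs = np.linalg.det(np.diag(lams) + np.outer(x, x) - lam * np.eye(n))
rhs = np.prod(lams - lam) * (1.0 + np.sum(x**2 / (lams - lam)))
assert np.isclose(lhs, rhs)
```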
For the sake of simplicity, our discussion is restricted to the case where:

1. $\lambda_1, \dots, \lambda_n$ are distinct.
2. The spectra of $\Lambda$ and $\Lambda + x x^T$ are disjoint.
For a more general treatment, please consult [BNS78].
If the second of the above conditions holds, then $\tilde\lambda$ is an eigenvalue of $\Lambda + x x^T$ if and only if

$$1 + x^T \left( \Lambda - \tilde\lambda I \right)^{-1} x = 0. \qquad (8.16)$$
Hence, the eigenvalues of the rank-1 modification may be obtained by solving the secular equation (8.16). To do so, we may exploit the following insight.
Proposition 8.4 (cf. [GE94]) The eigenvalues $\tilde\lambda_1, \dots, \tilde\lambda_n$ of $\Lambda + x x^T$ satisfy the interlacing property

$$\tilde\lambda_i \ge \lambda_i \ge \tilde\lambda_{i-1}, \qquad i = 2, \dots, n.$$
By virtue of this observation, we are given intervals in which precisely one eigenvalue is located. How should we proceed to solve (8.16)? The most straightforward way consists in deploying a bisection method. Since Proposition 8.4 provides suitable initializations, such a method is guaranteed to converge. The rate, however, is only q-linear. A method exhibiting q-quadratic convergence is Newton-Raphson. Unfortunately, we are not in a position to guarantee its convergence, because it may "leap" over the singularities, and thus out of the search intervals, at an early stage. A more sophisticated method, again, has been proposed in [BNS78]: the rational function in (8.16) is iteratively approximated by low-degree rational functions whose roots are available in closed form. The method can be shown to converge locally q-quadratically if the initialization overestimates the sought-after root. To obtain such an initial estimate, one may perform a few steps of bisection.
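The plain bisection variant can be sketched as follows. This is only an illustration, not the rational-approximation scheme of [BNS78]; the function name `secular_roots`, the endpoint offsets, and the iteration count are ad hoc choices. It uses the fact that each interval between consecutive diagonal entries, plus one interval of length $\|x\|^2$ above the largest entry, contains exactly one root of the secular function:

```python
import numpy as np

# Bisection for the secular equation f(t) = 1 + sum_i x_i^2 / (lam_i - t) = 0.
# There is exactly one root in each (lam_i, lam_{i+1}) and one in
# (lam_n, lam_n + ||x||^2); f tends to -inf at the left pole of each
# interval and is positive near the right endpoint.
def secular_roots(lams, x, iters=80):
    lams = np.sort(lams)
    f = lambda t: 1.0 + np.sum(x**2 / (lams - t))
    hi_ends = np.append(lams[1:], lams[-1] + np.sum(x**2))
    roots = []
    for lo, hi in zip(lams, hi_ends):
        a = lo + 1e-12 * max(1.0, abs(lo))   # step off the poles
        b = hi - 1e-12 * max(1.0, abs(hi))
        for _ in range(iters):
            mid = 0.5 * (a + b)
            if f(mid) < 0.0:
                a = mid                      # root lies to the right of mid
            else:
                b = mid
        roots.append(0.5 * (a + b))
    return np.array(roots)

lams = np.array([1.0, 2.0, 4.0])
x = np.array([0.3, 0.5, 0.4])
roots = secular_roots(lams, x)
exact = np.linalg.eigvalsh(np.diag(lams) + np.outer(x, x))
assert np.allclose(np.sort(roots), np.sort(exact), atol=1e-8)
```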
The naïve way to compute the corresponding eigenvectors is by solving the sequence of null space problems

$$\left( \Lambda + x x^T - \tilde\lambda_i I \right) u_i = 0, \qquad \|u_i\| = 1, \qquad i = 1, \dots, d.$$

Since the computed eigenvalues are inexact, this method is numerically unstable. In particular, it may lead to severe loss of orthogonality. A more sophisticated alternative is based on the following result, which is due to Löwner.
Proposition 8.5 [GE94] Let $\lambda_1, \dots, \lambda_n, d_1, \dots, d_n \in \mathbb{R}$ satisfy the interlacing property

$$\lambda_{i+1} > d_{i+1} > \lambda_i, \qquad i \in [n-1].$$

Furthermore, let $D := \operatorname{diag}(d_1, \dots, d_n)$. Then $\lambda_1, \dots, \lambda_n$ are the eigenvalues of

$$D + b b^T$$

for all $b \in \mathbb{R}^n$ satisfying

$$b_j = \left( \prod_{k=1}^n (\lambda_k - d_j) \Big/ \prod_{k=1, k \ne j}^n (d_k - d_j) \right)^{1/2}, \qquad j \in [n].$$
We proceed by considering the approximate eigenvalues as exact eigenvalues of a slightly perturbed problem. The above result enables us to compute the eigenvectors thereof analytically.
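A minimal sketch of this idea follows; the helper `loewner_eigvecs` is hypothetical. Given the diagonal $d$ and (approximate) eigenvalues $\mu$ interlacing it, it reconstructs $b$ via the product formula of Proposition 8.5 and then forms the eigenvectors of $D + b b^T$ in closed form, using that the eigenvector for $\mu_i$ is proportional to $(\mu_i I - D)^{-1} b$:

```python
import numpy as np

# Löwner-based eigenvector recovery: treat the computed eigenvalues mu as
# exact eigenvalues of a nearby matrix D + b b^T and read off its
# eigenvectors analytically.
def loewner_eigvecs(d, mu):
    """d: diagonal of D (sorted ascending); mu: eigenvalues interlacing d."""
    n = d.size
    b = np.empty(n)
    for j in range(n):
        num = np.prod(mu - d[j])
        den = np.prod(np.delete(d, j) - d[j])
        b[j] = np.sqrt(num / den)      # positive by the interlacing property
    # Row i of W is (mu_i I - D)^{-1} b, then rows are normalized.
    W = b[None, :] / (mu[:, None] - d[None, :])
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    return b, W.T                      # columns are eigenvectors

d = np.array([1.0, 2.0, 4.0])
b0 = np.array([0.3, 0.5, 0.4])
mu = np.linalg.eigvalsh(np.diag(d) + np.outer(b0, b0))  # exact, for the demo
b, Q = loewner_eigvecs(d, mu)
M = np.diag(d) + np.outer(b, b)
assert np.allclose(np.abs(b), np.abs(b0), atol=1e-8)    # b recovered up to sign
assert np.allclose(M @ Q, Q * mu[None, :], atol=1e-8)   # eigenpairs of D + b b^T
```

With exact eigenvalues, the recovered eigenvectors are exactly orthogonal; with inexact ones, orthogonality degrades only as much as the perturbed problem dictates, which is the point of the construction.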
For a detailed description of the entire procedure, we refer the reader to [Pap09].
The main SVD procedure is summarized in Algorithm 8.1. We leave out the calculation of the SVD of $\tilde S$, since it is quite complex and different approaches exist, one of which was presented here.
Algorithm 8.1: SVD update

Input: matrices $U$ and $V$ of left and right singular vectors of $A$, matrix $S$ of singular values, new vector $a$
Output: updated matrices $\hat U$ and $\hat V$ of left and right singular vectors, matrix $S'$ of singular values of $[A, a]$

1: Calculate the SVD $\tilde S := \begin{bmatrix} S & U^T a \\ 0^T & \|a_\perp\| \end{bmatrix} = U' S' V'^T$
2: Calculate $\hat U := \left[ U, \frac{a_\perp}{\|a_\perp\|} \right] U'$ and $\hat V := \begin{bmatrix} V & 0 \\ 0^T & 1 \end{bmatrix} V'$
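Under the assumption $a_\perp \ne 0$, Algorithm 8.1 might be sketched in numpy as follows. For brevity, the inner SVD of $\tilde S$ is delegated to `np.linalg.svd` rather than the secular-equation machinery described above; the function name `svd_update` is ours:

```python
import numpy as np

def svd_update(U, s, V, a):
    """Thin-SVD update: given A = U @ diag(s) @ V.T and a new column a
    with a_perp != 0, return (U_hat, s_hat, V_hat) for [A, a]."""
    r = s.size
    w = U.T @ a                          # coordinates of a in range(U)
    a_perp = a - U @ w                   # a_perp := (I - U U^T) a
    rho = np.linalg.norm(a_perp)         # assumed nonzero here
    # Step 1: SVD of the small bordered matrix S~.
    S_t = np.block([[np.diag(s), w[:, None]],
                    [np.zeros((1, r)), np.array([[rho]])]])
    Up, sp, Vpt = np.linalg.svd(S_t)
    # Step 2: rotate the enlarged singular-vector matrices.
    U_hat = np.column_stack([U, a_perp / rho]) @ Up
    V_t = np.block([[V, np.zeros((V.shape[0], 1))],
                    [np.zeros((1, r)), np.ones((1, 1))]])
    V_hat = V_t @ Vpt.T
    return U_hat, sp, V_hat

rng = np.random.default_rng(3)
A = rng.standard_normal((7, 4))
a = rng.standard_normal(7)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U2, s2, V2 = svd_update(U, s, Vt.T, a)
assert np.allclose(U2 @ np.diag(s2) @ V2.T, np.column_stack([A, a]))
```

The update touches only an $(r+1) \times (r+1)$ matrix plus a few matrix-vector products, which is the source of the efficiency gain over recomputing the SVD of $[A, a]$ from scratch.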
Experience has shown that update Algorithm 8.1 can, in general, also be applied successfully to truncated SVDs of arbitrary rank $r$, including small ones. However, unlike for the full-rank truncated SVD presented above, the resulting SVD may not be optimal, i.e., it may lead to decompositions different from (8.14).