IMAGE PROCESSING USING SINGULAR VALUE DECOMPOSITION (SVD)
Nguyễn Thị Hương
Trường Đại học Hà Nội
Abstract – Singular value decomposition (SVD) is one of the methods in the matrix factorization family, first developed by differential geometers. Originally, the purpose of the method was to find a rotation of space under which the inner products of vectors do not change. From this connection, the notion of an orthogonal matrix emerged as a way to construct such special rotations. The SVD method was then developed from the properties of orthogonal matrices and diagonal matrices to find a matrix that approximates the original one. The method has since been applied widely in fields such as differential geometry, linear regression, image processing, clustering, and compression and dimensionality-reduction algorithms.
This paper presents the basic theory of singular value decomposition (SVD) through concrete examples on matrices, in order to introduce the technique in a concrete way, together with its application to image processing, which contributes greatly to saving memory when storing data and to speeding up image processing.
Keywords: image compression, singular value decomposition, quadratic form, Gram–Schmidt process, image processing, rank, orthogonal, orthonormal basis, diagonal matrix
Abstract: Nowadays, data are transmitted in the form of images, graphics, audio, and video. These types of data require a lot of storage capacity and transmission bandwidth. Consequently, the theory of data compression becomes more significant for reducing data redundancy in order to save on the transfer and storage of data. This paper addresses the problem of the lossy compression of images. I will discuss how linear algebra can be used in the compression of images; in particular, how SVD techniques are extensively used in the image compression process, resulting in savings of computer memory. The basic idea is that each image can be represented as a matrix; we apply the SVD to this matrix and obtain a reduced matrix, and the image corresponding to this reduced matrix requires much less storage space than the original image.
Keywords: Image compression, singular value decomposition, quadratic form, Gram–Schmidt process, image processing, rank, orthogonal, orthonormal basis, diagonal matrix
IMAGE PROCESSING USING SINGULAR VALUE
DECOMPOSITION (SVD)
I INTRODUCTION
The Singular Value Decomposition (SVD) is a generalization of the eigen-decomposition, used to analyze rectangular matrices. It plays an important role in many applications: physical and biological processes, mathematical models in economics, data mining applications that rank documents in very large databases (including the Web), image processing applications, etc. In this paper, we study the SVD applied to image compression.
Image compression is a type of data compression that involves encoding the information in an image using fewer bits than the original image representation. The main idea of image compression is to reduce the redundancy of the image and to transfer the data in an efficient form. Image compression plays an important role in several domains, such as web design: maximally reducing an image allows us to create websites faster and saves users bandwidth; it also reduces the bandwidth of servers and thus saves time and money. When talking about compression, we generally take into account two aspects: the image size in pixels and its degree of compression. The nature of the image also plays a significant role. The main goal is to reduce the storage required as much as possible while ensuring that the decoded image displayed on the monitor is visually as similar to the original image as possible.
II IMAGE PROCESSING USING SVD
An m × n pixel image can be represented by an m × n matrix. Suppose we have a 4-megapixel gray-scale image, that is, 2000 × 2000 pixels, or a 2000 × 2000 matrix. Each pixel has some level of black and white, given by an integer between 0 and 255, with 0 representing black and 255 representing white. Each of these integers (and hence each pixel) requires approximately 1 byte to store, resulting in an image of approximately 4 MB.
A color image usually has three components: red, green, and blue. Each of these is represented by a matrix, so storing a color image requires three times the space (about 12 MB).
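The storage arithmetic above can be sketched in a few lines. This is a minimal illustration; the function name and sizes are mine, assuming one byte per pixel per channel as in the text.

```python
# Back-of-the-envelope storage for a raw image stored as a matrix,
# one byte per entry as in the text (function name is illustrative).
def image_bytes(m, n, channels=1):
    return m * n * channels

gray = image_bytes(2000, 2000)        # 4,000,000 bytes, about 4 MB
color = image_bytes(2000, 2000, 3)    # three matrices: about 12 MB
print(gray, color)
```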
2.1 Singular Value Decomposition (SVD)
The diagonalization theorem plays a part in many interesting applications. Unfortunately, as we know, not all matrices can be factored as A = PDP⁻¹ with D diagonal. However, a factorization A = QDP⁻¹ is possible for any m × n matrix A. A special factorization of this type, called the singular value decomposition, is one of the most useful matrix factorizations in applied linear algebra.
The singular value decomposition is based on the following property of ordinary diagonalization that can be imitated for rectangular matrices: the absolute values of the eigenvalues of a symmetric matrix A measure the amounts by which A stretches or shrinks certain vectors (the eigenvectors). If Ax = λx and ∥x∥ = 1, then

∥Ax∥ = ∥λx∥ = |λ| ∥x∥ = |λ|   (1)

If λ₁ is the eigenvalue with the greatest magnitude, then a corresponding unit eigenvector v₁ identifies a direction in which the stretching effect of A is greatest. That is, the length of Ax is maximized when x = v₁, and ∥Av₁∥ = |λ₁|, by (1). This description of v₁ and |λ₁| has an analogue for rectangular matrices that will lead to the singular value decomposition.
Example 1. If

A = [ 4 11 14
      8  7 −2 ]

then the linear transformation x ↦ Ax maps the unit sphere {x : ∥x∥ = 1} in ℝ³ onto an ellipse in ℝ². Find a unit vector x at which the length ∥Ax∥ is maximized, and compute this maximum length.
SOLUTION. The quantity ∥Ax∥² is maximized at the same x that maximizes ∥Ax∥, and ∥Ax∥² is easier to study. Observe that

∥Ax∥² = (Ax)ᵀ(Ax) = xᵀAᵀAx = xᵀ(AᵀA)x

Also, AᵀA is a symmetric matrix, since (AᵀA)ᵀ = AᵀAᵀᵀ = AᵀA. So the problem now is to maximize the quadratic form xᵀ(AᵀA)x subject to the constraint ∥x∥ = 1. The maximum value is the greatest eigenvalue λ₁ of AᵀA, and it is attained at a unit eigenvector of AᵀA corresponding to λ₁.
For the matrix 𝐴 in this example,
AᵀA = [  4  8        [ 4 11 14     [  80 100  40
        11  7    ×     8  7 −2 ] =   100 170 140
        14 −2 ]                       40 140 200 ]
The eigenvalues of AᵀA are λ₁ = 360, λ₂ = 90, and λ₃ = 0. Corresponding unit eigenvectors are, respectively,

v₁ = (1/3, 2/3, 2/3),  v₂ = (−2/3, −1/3, 2/3),  v₃ = (2/3, −2/3, 1/3)
The maximum value of ∥Ax∥² is 360, attained when x is the unit vector v₁. The vector Av₁ is a point on the ellipse, namely,

Av₁ = [ 4 11 14     [ 1/3     [ 18
        8  7 −2 ] ×   2/3   =    6 ]
                      2/3 ]

For ∥x∥ = 1, the maximum value of ∥Ax∥ is ∥Av₁∥ = √360 = 6√10. ∎
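Example 1 can be checked numerically. This is a minimal sketch, assuming NumPy is available; the maximizer is the top eigenvector of AᵀA, exactly as in the solution.

```python
import numpy as np

# Numerical check of Example 1: the maximum of ||Ax|| over unit vectors x
# is attained at the eigenvector of A^T A with the largest eigenvalue.
A = np.array([[4.0, 11.0, 14.0],
              [8.0,  7.0, -2.0]])
eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # ascending eigenvalues
v1 = eigvecs[:, -1]                          # unit eigenvector for lambda_1
print(np.round(eigvals))                     # [0, 90, 360] up to rounding
print(np.linalg.norm(A @ v1))                # 6*sqrt(10), about 18.97
```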
Example 1 suggests that the effect of A on the unit sphere in ℝ³ is related to the quadratic form xᵀ(AᵀA)x. In fact, the entire geometric behavior of the transformation x ↦ Ax is captured by this quadratic form, as we shall see.
The Singular Values of an m × n Matrix
Let A be an m × n matrix. Then AᵀA is symmetric and can be orthogonally diagonalized. Let {v₁, …, vₙ} be an orthonormal basis for ℝⁿ consisting of eigenvectors of AᵀA, and let λ₁, …, λₙ be the associated eigenvalues of AᵀA. Then, for 1 ≤ i ≤ n,

∥Avᵢ∥² = (Avᵢ)ᵀAvᵢ = vᵢᵀAᵀAvᵢ = vᵢᵀ(λᵢvᵢ) = λᵢ   (2)
So the eigenvalues of AᵀA are all nonnegative. By renumbering, if necessary, we may assume that the eigenvalues are arranged so that

λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ ≥ 0

The singular values of A are the square roots of the eigenvalues of AᵀA, denoted by σ₁, …, σₙ, and they are arranged in decreasing order. That is, σᵢ = √λᵢ for 1 ≤ i ≤ n.
By equation (2), the singular values of A are the lengths of the vectors Av₁, …, Avₙ.
THEOREM 1
Suppose {v₁, …, vₙ} is an orthonormal basis of ℝⁿ consisting of eigenvectors of AᵀA, arranged so that the corresponding eigenvalues of AᵀA satisfy λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ, and suppose A has r nonzero singular values. Then {Av₁, …, Avᵣ} is an orthogonal basis for Col A, and rank A = r.
NUMERICAL NOTE
In some cases, the rank of A may be very sensitive to small changes in the entries of A. The obvious method of counting the number of pivot columns in A does not work well if A is row reduced by a computer: round-off error often creates an echelon form with full rank.
In practice, the most reliable way to estimate the rank of a large matrix A is to count the number of nonzero singular values. In this case, extremely small nonzero singular values are assumed to be zero for all practical purposes, and the effective rank of the matrix is the number obtained by counting the remaining nonzero singular values.
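This numerical-rank idea can be sketched as follows. The function name and the relative tolerance are illustrative choices, assuming NumPy; library routines such as `numpy.linalg.matrix_rank` use a similar criterion.

```python
import numpy as np

# Effective rank: count singular values above a relative tolerance instead of
# counting pivots after row reduction (which is fragile in floating point).
def effective_rank(A, rel_tol=1e-10):
    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
    if s[0] == 0.0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))   # discard extremely small sigmas

B = np.array([[1.0, 2.0],
              [2.0, 4.0 + 1e-13]])           # exactly rank 2, effectively rank 1
print(effective_rank(B))                     # 1
print(effective_rank(np.eye(3)))             # 3
```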
THE SINGULAR VALUE DECOMPOSITION
The decomposition of A involves an m × n "diagonal" matrix Σ of the form

Σ = [ D 0
      0 0 ]   (3)

where D is an r × r diagonal matrix for some r not exceeding the smaller of m and n. (If r equals m or n or both, some or all of the zero matrices do not appear.)
Any factorization A = UΣVᵀ, with U and V orthogonal, Σ as in (3), and positive diagonal entries in D, is called a singular value decomposition (or SVD) of A. The matrices U and V are not uniquely determined by A, but the diagonal entries of Σ are necessarily the singular values of A. The columns of U in such a decomposition are called left singular vectors of A, and the columns of V are called right singular vectors of A.
Example 2. Find a singular value decomposition of

A = [  1 −1
      −2  2
       2 −2 ]

First, compute AᵀA = [ 9 −9; −9 9 ]. The eigenvalues of AᵀA are 18 and 0, with corresponding unit eigenvectors

v₁ = (1/√2, −1/√2),  v₂ = (1/√2, 1/√2)

These unit vectors form the columns of V:

V = [ v₁ v₂ ] = [  1/√2 1/√2
                  −1/√2 1/√2 ]

The singular values are σ₁ = √18 = 3√2 and σ₂ = 0. Since there is only one nonzero singular value, the matrix D may be written as a single number; that is, D = 3√2. The matrix Σ is the same size as A, with D in its upper left corner:

Σ = [ 3√2 0
        0 0
        0 0 ]
To construct 𝑈, first construct 𝐴𝑣1 and 𝐴𝑣2:
Definition (The Singular Value Decomposition). Let A be an m × n matrix with rank r. Then there exists an m × n matrix Σ as in (3) for which the diagonal entries in D are the first r singular values of A, σ₁ ≥ σ₂ ≥ ⋯ ≥ σᵣ > 0, and there exist an m × m orthogonal matrix U and an n × n orthogonal matrix V such that

A = UΣVᵀ
Av₁ = [  2/√2        Av₂ = [ 0
        −4/√2                0
         4/√2 ],             0 ]
As a check on the calculations, verify that ∥Av₁∥ = σ₁ = 3√2. Of course, Av₂ = 0 because ∥Av₂∥ = σ₂ = 0. The only column found for U so far is
u₁ = (1/(3√2)) Av₁ = [  1/3
                       −2/3
                        2/3 ]
The other columns of U are found by extending the set {u₁} to an orthonormal basis for ℝ³. In this case, we need two orthonormal vectors u₂ and u₃ that are orthogonal to u₁. Each must satisfy u₁ᵀx = 0, which is equivalent to the equation x₁ − 2x₂ + 2x₃ = 0. A basis for the solution set of this equation is
w₁ = [ 2        w₂ = [ −2
       1                0
       0 ],             1 ]
(Check that w₁ and w₂ are each orthogonal to u₁.) Apply the Gram–Schmidt process (with normalizations) to {w₁, w₂}, and obtain
u₂ = [ 2/√5        u₃ = [ −2/√45
       1/√5                4/√45
          0 ],             5/√45 ]
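The Gram–Schmidt step above can be sketched in code. This is a minimal illustration assuming NumPy; the function is a generic orthonormalization, not part of the original example.

```python
import numpy as np

# A minimal Gram-Schmidt sketch (with normalization), reproducing the step
# that extends {u1} to an orthonormal basis of R^3.
def gram_schmidt(vectors):
    basis = []
    for w in vectors:
        for u in basis:
            w = w - (u @ w) * u          # remove components along earlier u's
        basis.append(w / np.linalg.norm(w))
    return basis

u1 = np.array([1.0, -2.0, 2.0]) / 3.0
w1 = np.array([2.0, 1.0, 0.0])
w2 = np.array([-2.0, 0.0, 1.0])
_, u2, u3 = gram_schmidt([u1, w1, w2])
print(np.round(u2, 4))   # (2/sqrt(5), 1/sqrt(5), 0)
print(np.round(u3, 4))   # (-2, 4, 5) / sqrt(45)
```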
Finally, set U = [u₁ u₂ u₃], take Σ and Vᵀ from above, and write

A = [  1 −1       [ 1/3  2/√5  −2/√45     [ 3√2 0
      −2  2   =    −2/3  1/√5   4/√45   ×     0 0   ×  [ 1/√2 −1/√2
       2 −2 ]       2/3     0   5/√45 ]       0 0 ]      1/√2  1/√2 ]

2.2 An Application to Image Processing
Each piece is a column vector times a row vector. An m by n matrix has m times n entries (a big number when the matrix represents an image). But a column and a row together have only m + n components, far less than m times n. Those (column)(row) pieces are full-size matrices that can be processed with extreme speed; they need only m plus n numbers.
Unusually, this image processing application of the SVD is coming before the matrix algebra it depends on. I will start with simple images that only involve one or two pieces. Right now I am thinking of an image as a large rectangular matrix. The entries aᵢⱼ tell the grayscales of all the pixels in the image. Think of a pixel as a small square, i steps across and j steps up from the lower left corner. Its grayscale is a number (often a whole number in the range 0 ≤ aᵢⱼ < 256 = 2⁸). An all-white pixel has aᵢⱼ = 255 = 11111111. That number has eight 1's when the computer writes 255 in binary notation.
You see how an image that has m times n pixels, with each pixel using 8 bits (0 or 1) for its grayscale, becomes an m by n matrix with 256 possible values for each entry aᵢⱼ.
In short, an image is a large matrix. To copy it perfectly, we need 8(m)(n) bits of information. High-definition television typically has m = 1080 and n = 1920. Often there are 24 frames each second, and you probably like to watch in color (3 color scales). This requires transmitting (3)(8)(49,766,400) bits per second. That is too expensive, and it is not done: the transmitter can't keep up with the show.
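That bandwidth estimate can be spelled out step by step; this is just the arithmetic from the text.

```python
# The uncompressed HDTV bit-rate estimate from the text, spelled out.
pixels_per_frame = 1080 * 1920                    # 2,073,600 pixels
pixels_per_second = pixels_per_frame * 24         # 24 frames per second
bits_per_second = pixels_per_second * 3 * 8       # 3 colors, 8 bits each
print(pixels_per_second)   # 49766400
print(bits_per_second)     # 1194393600, about 1.2 gigabits per second
```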
When compression is well done, you can't see the difference from the original. Edges in the image (sudden changes in the grayscale) are the hard parts to compress. Major success in compression will be impossible if every aᵢⱼ is an independent random number. We totally depend on the fact that nearby pixels generally have similar grayscales. An edge produces a sudden jump when you cross over it. Cartoons are more compressible than real-world images, with edges everywhere.
Lower Rank Image Example
The easiest images to compress are all black or all white or all a constant grayscale g. The matrix A has the same g in every entry: aᵢⱼ = g. When g = 1 and m = n = 6, here is an extreme example of the central SVD dogma of image processing:
Example. Don't send

A = [ 1 1 1 1 1 1
      1 1 1 1 1 1
      1 1 1 1 1 1
      1 1 1 1 1 1
      1 1 1 1 1 1
      1 1 1 1 1 1 ]

Send this instead:

A = [ 1
      1
      1
      1
      1
      1 ] × [ 1 1 1 1 1 1 ]
36 numbers become 12 numbers. With 300 by 300 pixels, 90,000 numbers become 600. And if we define the all-ones vector x in advance, we only have to send the one number g that multiplies xxᵀ to produce the matrix. Of course this first example is extreme. But it makes an important point: if there are special vectors like x = ones that can usefully be defined in advance, then image processing can be extremely fast.
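The rank-one claim is easy to verify. A minimal NumPy sketch, counting the nonzero singular values of the all-ones matrix:

```python
import numpy as np

# The constant "flag" as one (column)(row) piece: rank one, so a single
# singular value carries the whole image.
x = np.ones(6)
A = np.outer(x, x)                        # the 6 x 6 matrix of ones
s = np.linalg.svd(A, compute_uv=False)
print(int(np.sum(s > 1e-10)))             # 1 nonzero singular value
print(A.size, 2 * x.size)                 # 36 stored numbers vs. 12
```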
Example. Suppose the flag has two triangles of different colors. The lower left triangle has 1's and the upper right triangle has 0's. The main diagonal is included with the 1's. Here is the image matrix when n = 4. It has full rank r = 4, so it is invertible:
Triangular matrix A = [ 1 0 0 0        A⁻¹ = [  1  0  0  0
                        1 1 0 0                −1  1  0  0
                        1 1 1 0                 0 −1  1  0
                        1 1 1 1 ]   and         0  0 −1  1 ]

With full rank, A has a full set of n singular values σ (all positive). The SVD will produce n pieces σᵢuᵢvᵢᵀ of rank one. Perfect reproduction needs all n pieces. In compression, small σ's can be discarded with no serious loss in image quality. We want to understand and plot the σ's for n = 4 and also for large n.
Working by hand, we begin with AAᵀ (a computer proceeds differently):

AAᵀ = [ 1 1 1 1        (AAᵀ)⁻¹ = (A⁻¹)ᵀA⁻¹ = [  2 −1  0  0
        1 2 2 2                                −1  2 −1  0
        1 2 3 3                                 0 −1  2 −1
        1 2 3 4 ]  and                          0  0 −1  1 ]
That −1, 2, −1 inverse matrix is included because its eigenvalues all have the form 2 − 2 cos θ. So we know the λ's for AAᵀ and the σ's for A:

λ = 1/(2 − 2 cos θ) = 1/(4 sin²(θ/2))  gives  σ = √λ = 1/(2 sin(θ/2))

The n different angles θ are equally spaced, which makes this example so exceptional:

θ = π/(2n+1), 3π/(2n+1), …, (2n−1)π/(2n+1)   (n = 4 includes θ = 3π/9 with 2 sin(θ/2) = 1)

That special case gives λ = 1 as an eigenvalue of AAᵀ when n = 4. So σ = √λ = 1 is a singular value of A. You can check that the vector u = (1, 1, 0, −1) has AAᵀu = u (a truly special case).
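The closed form for the σ's can be checked numerically. This is a minimal sketch assuming NumPy, comparing the computed singular values of the n = 4 triangle of ones with the formula above.

```python
import numpy as np

# Check sigma = 1 / (2 sin(theta/2)) for the n = 4 triangular matrix of ones,
# with theta = pi/9, 3pi/9, 5pi/9, 7pi/9.
n = 4
A = np.tril(np.ones((n, n)))                        # lower triangle of ones
sigma = np.linalg.svd(A, compute_uv=False)          # descending singular values
theta = (2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n + 1)
formula = 1.0 / (2.0 * np.sin(theta / 2.0))         # also descending
print(np.round(sigma, 6))
print(np.round(formula, 6))                         # the same four numbers
```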
The important point is to graph the n singular values of A. Those numbers drop off (unlike the eigenvalues of A, which are all 1). But the drop-off is not steep. So the SVD gives only moderate compression of this triangular flag.
Your faithful author has continued research on the ranks of flags. Quite a few are based on horizontal or vertical stripes. Those have rank one: all rows or all columns are multiples of the ones vector (1, 1, …, 1). Armenia, Austria, Belgium, Bulgaria, Chad, Colombia, Ireland, Madagascar, Mali, Netherlands, Nigeria, Romania, Russia (and more) have three stripes. Indonesia and Poland have two!
At the other extreme, many flags include diagonal lines. Those could be long diagonals, as in the British flag, or short diagonals coming from the edges of a star, as in the US flag. The text example of a triangle of ones shows how those flag matrices will have large rank. The rank increases to infinity as the pixel sizes get small. Other flags have circles or crescents or various curved shapes. Their ranks are large and also increase to infinity. These are still compressible! The compressed image won't be perfect, but our eyes won't see the difference (with enough terms σᵢuᵢvᵢᵀ from the SVD). Those examples actually bring out the main purpose of image compression.
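The whole compression idea can be sketched end to end: keep only the k largest pieces σᵢuᵢvᵢᵀ. The image below is a synthetic stand-in (a smooth gradient plus noise), and the function name is mine; this is an illustration, not the paper's experimental setup.

```python
import numpy as np

# Rank-k approximation by truncated SVD: keep the k largest sigma_i u_i v_i^T
# pieces and measure how close the result stays to the original.
def truncate_svd(A, k):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k] @ Vt[:k, :]   # sum of the first k rank-one pieces

rng = np.random.default_rng(0)
# a smooth gradient "image" plus mild noise: mostly low-rank structure
base = np.outer(np.linspace(0.0, 255.0, 64), np.linspace(1.0, 0.5, 64))
img = base + rng.normal(scale=1.0, size=(64, 64))

approx = truncate_svd(img, 5)                      # 5 pieces instead of 64
rel_err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(rel_err)   # small: a few pieces already capture most of the image
```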
III CONCLUSION
Despite the attention it has received in recent years, the SVD in image processing is still in its infancy. Many SVD characteristics are still unutilized in image processing. Using SVD techniques, we can save a lot of computer memory, which can then be used for other purposes.