SVD with Numpy

In this post we will learn how to perform Singular Value Decomposition (SVD) in Python with Numpy. We will start with learning the basics of SVD and then learn to use Numpy’s linalg.svd() function to perform SVD in Python with multiple examples.

It has a wide range of applications, including data compression, image processing, and information retrieval. In this post, we will learn how to perform SVD in NumPy, with examples to help illustrate the concepts.

What is SVD?

SVD is one of most commonly used matrix decomposition method that factorizes or decomposes a matrix into the product of three matrices: U, S, and V*. And the three matrices have special propeties making them extremely useful. For example, when we have input matrix M that is real we can factorize into the following three matrices

  • U is a orthogonal matrix, meaning that each vector in the matrix is of unit length
  • S is a diagonal matrix containing the singular values of the input matrix.
  • Vh is the transpose of a orthogonal matrix V.

The columns of the matrices U and V are called left sigulatr vectors and right singular vectors. SVD has applications in many fields including data compression, image processing, and feature selection.

In NumPy, you can use the numpy.linalg.svd function to perform SVD. This function takes a matrix M as input and returns the singular values and matrices of the decomposition. The singular values are returned as a 1-D array, and the matrices are returned as 2-D arrays.

Here is the basic syntax of using Numpy’s linalg.svd.

linalg.svd(a, full_matrices=True, compute_uv=True, hermitian=False)

Here is an example of how to use numpy.linalg.svd to decompose a matrix A

import numpy as np

Example 1: Performing SVD with Numpy

Let us see an example of using linalg.svd() function in Numpy to decompose 3×3 matrix with sequence of numbers from 1 to 9.

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

We can use the above matrix as input to linalg.svd() function to factorise it.

# Perform svd with numpy
U, s, Vh = np.linalg.svd(A)

Numpy’s linalg.svd() returns the three matrices as a tuple. Let us take a look at the resulting matrices U, s, and Vh

U is matrix of dimension 3×3

U

array([[-0.21483724,  0.88723069,  0.40824829],
       [-0.52058739,  0.24964395, -0.81649658],
       [-0.82633754, -0.38794278,  0.40824829]])

The singular vector s is vector with 3 elements.

s

array([1.68481034e+01, 1.06836951e+00, 4.41842475e-16])

And Vh is a matrix of dimension 3×3.

Vh

array([[-0.47967118, -0.57236779, -0.66506441],
       [-0.77669099, -0.07568647,  0.62531805],
       [-0.40824829,  0.81649658, -0.40824829]])

As we know U and Vh are orthogonal, they satisfy the following property.

The product of U with its transpose will give us identity matrix. Let us test that in Numpy

np.dot(U, U.T)

array([[1.00000000e+00, 2.31660681e-16, 3.35641473e-16],
       [2.31660681e-16, 1.00000000e+00, 2.63249183e-16],
       [3.35641473e-16, 2.63249183e-16, 1.00000000e+00]])

Similarly, the product of V with its transpose will give us identity matrix.

np.dot(Vh, Vh.T)

array([[ 1.00000000e+00,  4.88561219e-17, -8.47339685e-17],
       [ 4.88561219e-17,  1.00000000e+00, -1.28242073e-16],
       [-8.47339685e-17, -1.28242073e-16,  1.00000000e+00]])

We can also reconstruct the original matrix by finding the product of the three matrices, U, s and Vh. First, we need to convert the singular vector to diagonla matrix with Numpy’s np.diag().

np.diag(s)
array([[1.68481034e+01, 0.00000000e+00, 0.00000000e+00],
       [0.00000000e+00, 1.06836951e+00, 0.00000000e+00],
       [0.00000000e+00, 0.00000000e+00, 4.41842475e-16]])

Now we can compute the product using Numpy and reconstruct the original input matrix.

# reconstructing the original matrix.
U @ np.diag(s) @ Vh

And we get the input matrix as the result.

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

Example 2: Performing SVD on random matrix with Numpy

In the second example, we will decompose a 4×4 matrix created by using Numpy’s integers() function that generates random integers. Here we first generate random integers and the use Numpy reshape() function to make it a 4×4 matrix.

rng = np.random.default_rng(2022)
A = rng.integers(low=1, high=20, size=(4,4))
A

Our input matrix looks like this.

array([[14,  5, 15,  2],
       [ 4, 12, 18,  2],
       [ 2, 13, 17, 15],
       [16,  3, 11,  1]])

And we can decompuse with linalg.svd() using the input matrix as argument. And it gives us three matrices as a tuple.

U, s, Vh = np.linalg.svd(A)

As in the previous examples we can verify the properties of these matrices U and Vh. Here we use the three matrices to reconstruct the original matrix. Since we use all the singular vectors to reconstruct, the reconstructed matrix is the same as the original matrix.

U @ np.diag(s) @ Vh

array([[14.,  5., 15.,  2.],
       [ 4., 12., 18.,  2.],
       [ 2., 13., 17., 15.],
       [16.,  3., 11.,  1.]])

Instead, if we had used only the top few singular vectors to reconstruct, we would have reconstructed the original matrix with some errors. Basically we reconstructed an approximate of the original matrix. A lot of the applications of SVD with high dimensional data uses this approximate reconstruction.