Matrix Computation (Implementation)

The code demos in this section can be found in the GitHub repo: ScientificComputingDemos/SimpleLinearAlgebra.

LU Factorization

Forward-substitution

Forward substitution is an algorithm used to solve a system of linear equations with a lower triangular matrix

\[Lx = b\]

where $L \in \mathbb{R}^{n\times n}$ is a lower triangular matrix defined as

\[L = \left(\begin{matrix} l_{11} & 0 & \ldots & 0\\ l_{21} & l_{22} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ l_{n1} & l_{n2} & \ldots & l_{nn} \end{matrix}\right)\]

Forward substitution can be summarized as the following algorithm

\[x_1 = b_1/l_{11},~~~ x_i = \left(b_i - \sum_{j=1}^{i-1}l_{ij}x_j\right)/l_{ii},~~ i=2, ..., n\]

Example

Consider the following lower triangular system of linear equations:

\[\left(\begin{matrix} 3 & 0 & 0\\ 2 & 5 & 0\\ 1 & 4 & 2 \end{matrix}\right) \left(\begin{matrix} x_1\\ x_2\\ x_3 \end{matrix}\right) = \left(\begin{matrix} 9\\ 12\\ 13 \end{matrix}\right)\]

To solve for $x_1$, $x_2$, and $x_3$ using forward substitution, we start with the first equation:

\[3x_1 + 0x_2 + 0x_3 = 9\]

Solving for $x_1$, we get $x_1 = 3$. Substituting $x_1 = 3$ into the second equation (row), we get:

\[2(3) + 5x_2 + 0x_3 = 12\]

Solving for $x_2$, we get $x_2 = (12 - 6) / 5 = 1.2$. Substituting $x_1 = 3$ and $x_2 = 1.2$ into the third equation (row), we get:

\[1(3) + 4(1.2) + 2x_3 = 13\]

Solving for $x_3$, we get $x_3 = (13 - 3 - 4(1.2)) / 2 = 1.5$. Therefore, the solution to the system of equations is:

\[x = \left(\begin{matrix} 3\\ 1.2\\ 1.5 \end{matrix}\right)\]
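We can double check this result with the built-in triangular solver from the LinearAlgebra standard library (a quick verification of the worked example):

using LinearAlgebra

L = [3.0 0 0; 2 5 0; 1 4 2]
b = [9.0, 12, 13]
LowerTriangular(L) \ b  # returns [3.0, 1.2, 1.5]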

Back-substitution

Back substitution is an algorithm used to solve a system of linear equations with an upper triangular matrix

\[Ux = b\]

where $U \in \mathbb{R}^{n\times n}$ is an upper triangular matrix defined as

\[U = \left(\begin{matrix} u_{11} & u_{12} & \ldots & u_{1n}\\ 0 & u_{22} & \ldots & u_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \ldots & u_{nn} \end{matrix}\right)\]

Back substitution can be summarized as the following algorithm

\[x_n = b_n/u_{nn},~~~ x_i = \left(b_i - \sum_{j=i+1}^{n}u_{ij}x_j\right)/u_{ii},~~ i=n-1, ..., 1\]

Both algorithms are straightforward to implement. Below we implement forward substitution in the Julia language; a sketch of back substitution following the formula above appears after the test.

function forward_substitution!(l::AbstractMatrix, b::AbstractVector)
    n = length(b)
    @assert size(l) == (n, n) "size mismatch"
    x = zero(b)
    # loop over columns
    for j = 1:n
        # stop if the matrix is singular
        if iszero(l[j, j])
            error("The lower triangular matrix is singular!")
        end
        # compute the j-th solution component
        x[j] = b[j] / l[j, j]
        # update the right hand side
        for i = j+1:n
            b[i] = b[i] - l[i, j] * x[j]
        end
    end
    return x
end
forward_substitution! (generic function with 1 method)

We can write a test for this algorithm.

using Test, LinearAlgebra

@testset "back substitution" begin
    # create a random lower triangular matrix
    l = LinearAlgebra.tril(randn(4, 4))
    # target vector
    b = randn(4)
    # solve the linear equation with our algorithm
    x = back_substitution!(l, copy(b))
    @test l * x ≈ b

    # The Julia's standard library `LinearAlgebra` contains a native implementation.
    x_native = LowerTriangular(l) \ b
    @test l * x_native ≈ b
end
Test.DefaultTestSet("back substitution", Any[], 2, false, false, true, 1.714382304796008e9, 1.714382304944341e9, false, "linalg-impl.md")

LU Factorization with Gaussian Elimination

LU decomposition is a method for solving linear equations that involves breaking down a matrix into lower and upper triangular matrices. The $LU$ decomposition of a matrix $A$ is represented as $A = LU$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.

The elementary elimination matrix

An elementary elimination matrix is a matrix used in Gaussian elimination to transform a system of linear equations into an equivalent system that is easier to solve. It is a square matrix obtained by performing a single elementary row operation on the identity matrix. Let $A = (a_{ij})$ be a square matrix of size $n \times n$. The $k$-th elementary elimination matrix for $A$ is defined entry-wise as

\[(M_k)_{ij} = \begin{cases} 1 & i = j,\\ - a_{ik}/a_{kk} & i > k {\rm ~and~} j = k, \\ 0 & {\rm otherwise}, \end{cases}\]

or in matrix form as

\[M_k = \left(\begin{matrix} 1 & \ldots & 0 & 0 & 0 & \ldots & 0\\ \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & \ldots & 1 & 0 & 0 & \ldots & 0\\ 0 & \ldots & 0 & 1 & 0 & \ldots & 0\\ 0 & \ldots & 0 & -m_{k+1} & 1 & \ldots & 0\\ \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & \ldots & 0 & -m_{n} & 0 & \ldots & 1\\ \end{matrix}\right)\]

where $m_i = a_{ik}/a_{kk}$.

By applying the elementary elimination matrix $M_1$ to $A$, we obtain a new matrix with $a_{i1}' = 0$ for all $i>1$.

\[M_1 A = \left(\begin{matrix} a_{11} & a_{12} & a_{13} & \ldots & a_{1n}\\ 0 & a_{22}' & a_{23}' & \ldots & a_{2n}'\\ 0 & a_{32}' & a_{33}' & \ldots & a_{3n}'\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & a_{n2}' & a_{n3}' & \ldots & a_{nn}'\\ \end{matrix}\right)\]

Applying $M_k$ for $k=1,2,\ldots,n-1$ in sequence, we obtain an upper triangular matrix.

\[U = M_{n-1}\ldots M_1 A\]

Since each $M_k$ is invertible, we have

\[\begin{align*} &A = LU,\\ &L = M_1^{-1} M_2^{-1} \ldots M_{n-1}^{-1}. \end{align*}\]

Elementary elimination matrices have the following properties that make the above process efficient:

  1. Its inverse can be computed in $O(n)$ time

    \[M_k^{-1} = 2I - M_k\]

  2. The multiplication of two elementary matrices can be computed in $O(n)$ time

    \[M_k M_{k'} = M_k + M_{k'} - I, ~~~ k' > k\]

Code: Elementary Elimination Matrix

A3 = [1 2 2; 4 4 2; 4 6 4]

function elementary_elimination_matrix(A::AbstractMatrix{T}, k::Int) where T
    n = size(A, 1)
    @assert size(A, 2) == n
    # start from the identity matrix
    M = Matrix{Float64}(I, n, n)
    # fill in the multipliers below the diagonal of the k-th column
    for i = k+1:n
        M[i, k] = -A[i, k] / A[k, k]
    end
    return M
end
elementary_elimination_matrix (generic function with 1 method)

The elementary elimination matrix eliminating the first column of the matrix A3 defined above is

julia> elementary_elimination_matrix(A3, 1)
3×3 Matrix{Float64}:
  1.0  0.0  0.0
 -4.0  1.0  0.0
 -4.0  0.0  1.0

julia> elementary_elimination_matrix(A3, 1) * A3
3×3 Matrix{Float64}:
 1.0   2.0   2.0
 0.0  -4.0  -6.0
 0.0  -2.0  -4.0

Verify property 1:

julia> inv(elementary_elimination_matrix(A3, 1))
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 4.0  1.0  0.0
 4.0  0.0  1.0

Verify property 2:

julia> elementary_elimination_matrix(A3, 2)
3×3 Matrix{Float64}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -1.5  1.0

julia> inv(elementary_elimination_matrix(A3, 1)) * inv(elementary_elimination_matrix(A3, 2))
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 4.0  1.0  0.0
 4.0  1.5  1.0

Code: LU Factorization by Gaussian Elimination

A naive implementation of LU factorization with elementary elimination matrices is as follows.

function lufact_naive!(A::AbstractMatrix{T}) where T
    n = size(A, 1)
    @assert size(A, 2) == n
    M = Matrix{T}(I, n, n)
    for k=1:n-1
        m = elementary_elimination_matrix(A, k)
        M = M * inv(m)
        A .= m * A
    end
    return M, A
end

lufact_naive!(copy(A3))

@testset "naive LU factorization" begin
    A = [1 2 2; 4 4 2; 4 6 4]
    L, U = lufact_naive!(copy(A))
    @test L * U ≈ A
end
Test.DefaultTestSet("naive LU factorization", Any[], 1, false, false, true, 1.714382305461667e9, 1.714382306265524e9, false, "linalg-impl.md")

The above implementation has time complexity $O(n^4)$ since it does not exploit the sparsity of the elementary elimination matrices. A better implementation with $O(n^3)$ time complexity is as follows.

function lufact!(a::AbstractMatrix)
    n = size(a, 1)
    @assert size(a, 2) == n "size mismatch"
    m = zero(a)
    m[1:n+1:end] .+= 1
    # loop over columns
    for k=1:n-1
        # stop if pivot is zero
        if iszero(a[k, k])
            error("Gaussian elimination fails!")
        end
        # compute multipliers for current column
        for i=k+1:n
            m[i, k] = a[i, k] / a[k, k]
        end
        # apply transformation to remaining sub-matrix
        for j=k+1:n
            for i=k+1:n
                a[i,j] -= m[i,k] * a[k, j]
            end
        end
    end
    return m, triu!(a)
end

lufact(a::AbstractMatrix) = lufact!(copy(a))

@testset "LU factorization" begin
    a = randn(4, 4)
    L, U = lufact(a)
    @test istril(L)
    @test istriu(U)
    @test L * U ≈ a
end
Test.DefaultTestSet("LU factorization", Any[], 3, false, false, true, 1.714382306275491e9, 1.71438230693452e9, false, "linalg-impl.md")

We can check our implementation on a random matrix.

julia> A4 = randn(4, 4)
4×4 Matrix{Float64}:
  0.897759   -2.15279    1.18993   -0.247813
 -2.36847    -0.683268  -1.83512   -0.581775
  0.0220755   0.332306  -0.592658   0.589756
 -2.9631     -0.680586  -0.542837  -0.713306

julia> lufact(A4)
([1.0 0.0 0.0 0.0; -2.638204596914282 1.0 0.0 0.0; 0.024589629267600527 -0.0605462987561217 1.0 0.0; -3.300553906196713 1.223678874449246 -3.2944017612152168 1.0], [0.8977585019504888 -2.1527898350609886 1.1899277226309366 -0.2478126544698392; 0.0 -6.362768015206671 1.3041563990724125 -1.235555436318963; 0.0 0.0 -0.5429558309804762 0.5210412331595067; 0.0 0.0 0.0 1.697216760003853])

The Julia standard library LinearAlgebra contains a much better implementation.

julia> julia_lures = lu(A4, NoPivot())  # the version we implemented above has no pivot
LinearAlgebra.LU{Float64, Matrix{Float64}, Vector{Int64}}
L factor:
4×4 Matrix{Float64}:
  1.0         0.0         0.0     0.0
 -2.6382      1.0         0.0     0.0
  0.0245896  -0.0605463   1.0     0.0
 -3.30055     1.22368    -3.2944  1.0
U factor:
4×4 Matrix{Float64}:
 0.897759  -2.15279   1.18993   -0.247813
 0.0       -6.36277   1.30416   -1.23556
 0.0        0.0      -0.542956   0.521041
 0.0        0.0       0.0        1.69722

julia> julia_lures.U
4×4 Matrix{Float64}:
 0.897759  -2.15279   1.18993   -0.247813
 0.0       -6.36277   1.30416   -1.23556
 0.0        0.0      -0.542956   0.521041
 0.0        0.0       0.0        1.69722

julia> typeof(julia_lures)
LinearAlgebra.LU{Float64, Matrix{Float64}, Vector{Int64}}

julia> fieldnames(julia_lures |> typeof)
(:factors, :ipiv, :info)

Pivoting technique

How to handle small diagonal entries?

The above Gaussian elimination process is not numerically stable if any pivot (diagonal entry) of $A$ is close to zero.

julia> small_diagonal_matrix = [1e-8 1; 1 1]
2×2 Matrix{Float64}:
 1.0e-8  1.0
 1.0     1.0

julia> lures = lufact(small_diagonal_matrix)
([1.0 0.0; 1.0e8 1.0], [1.0e-8 1.0; 0.0 -9.9999999e7])

This issue can be resolved by permuting the rows of $A$ before factorizing it. For example:

julia> lufact(small_diagonal_matrix[end:-1:1, :])
([1.0 0.0; 1.0e-8 1.0], [1.0 1.0; 0.0 0.99999999])

This technique is called pivoting.

Partial pivoting

LU factorization (or Gaussian elimination) with row pivoting is defined as

\[P A = L U\]

where $P$ is a permutation matrix. Pivoting in Gaussian elimination is the process of selecting a pivot element and using it to eliminate the other elements in its column. With partial pivoting, the pivot is chosen as the entry with the largest absolute value in the current column, and its row is swapped with the current row if necessary. This avoids division by zero or by very small numbers and keeps the elimination process numerically stable.

Let $A=(a_{ij})$ be a square matrix of size $n\times n$. The Gaussian elimination process with partial pivoting can be represented as

\[M_{n-1}P_{n-1}\ldots M_2P_2M_1P_1 A = U\]

Here we emphasize that $P_{k}$ commutes with each $M_{j}$ ($j<k$) up to a reordering of the multipliers, so all permutations can be collected into a single matrix $P$.

Complete pivoting

Complete pivoting also allows permuting columns. The LU factorization with complete pivoting is defined as

\[P A Q = L U.\]

Complete pivoting produces better numerical stability but is also harder to implement. In most practical use cases, partial pivoting is good enough. A sketch of the complete pivoting variant is given at the end of this section.

Code: LU Factorization by Gaussian Elimination with Partial Pivoting

A Julia implementation of Gaussian elimination with partial pivoting is

function lufact_pivot!(a::AbstractMatrix)
    n = size(a, 1)
    @assert size(a, 2) == n "size mismatch"
    m = zero(a)
    P = collect(1:n)
    # loop over columns
    @inbounds for k=1:n-1
        # search for pivot in current column
        val, p = findmax(x->abs(a[x, k]), k:n)
        p += k-1
        # find index p such that |a_{pk}| ≥ |a_{ik}| for k ≤ i ≤ n
        if p != k
            # swap rows k and p of matrix A
            for col = 1:n
                a[k, col], a[p, col] = a[p, col], a[k, col]
            end
            # swap rows k and p of matrix M
            for col = 1:k-1
                m[k, col], m[p, col] = m[p, col], m[k, col]
            end
            P[k], P[p] = P[p], P[k]
        end
        if iszero(a[k, k])
            # skip current column if it's already zero
            continue
        end
        # compute multipliers for current column
        m[k, k] = 1
        for i=k+1:n
            m[i, k] = a[i, k] / a[k, k]
        end
        # apply transformation to remaining sub-matrix
        for j=k+1:n
            akj = a[k, j]
            for i=k+1:n
                a[i,j] -= m[i,k] * akj
            end
        end
    end
    m[n, n] = 1
    return m, triu!(a), P
end

@testset "lufact with pivot" begin
    n = 5
    A = randn(n, n)
    L, U, P = lufact_pivot!(copy(A))
    pmat = zeros(Int, n, n)
    setindex!.(Ref(pmat), 1, 1:n, P)
    @test L ≈ lu(A).L
    @test U ≈ lu(A).U
    @test pmat * A ≈ L * U
end
Test.DefaultTestSet("lufact with pivot", Any[], 3, false, false, true, 1.714382307254926e9, 1.714382308313793e9, false, "linalg-impl.md")

The performance of our implementation is as follows.

julia> using BenchmarkTools

julia> n = 200
200

julia> A = randn(n, n);

julia> @benchmark lufact_pivot!($A)
BenchmarkTools.Trial: 7451 samples with 1 evaluation.
 Range (min … max):  621.834 μs …  11.111 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     643.541 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   668.927 μs ± 255.808 μs  ┊ GC (mean ± σ):  0.84% ± 2.57%

     ▂█▂                                                        
  ▄▄▂███▆▄▄▅▅▅▅▄▄▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▁▂▂▁▂▂ ▃
  622 μs           Histogram: frequency by time          835 μs <

 Memory estimate: 314.31 KiB, allocs estimate: 3.

julia> n = 200
200

julia> A = randn(n, n);

julia> @benchmark lu($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  247.709 μs …  11.649 ms  ┊ GC (min … max): 0.00% … 96.82%
 Time  (median):     269.583 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   318.077 μs ± 247.482 μs  ┊ GC (mean ± σ):  1.69% ±  2.69%

  ▆██▄▂▃▅▅▄▃▂▂▁ ▁                     ▁▁▁                       ▂
  ████████████████▇▇▇▆▆▇▆▆▆▆▆▄▆▅▄▄▆▄▇█████▇▆▆▆▆▆▅▆▅▄▄▆▅▄▅▄▅▄▅▅▄ █
  248 μs        Histogram: log(frequency) by time        835 μs <

 Memory estimate: 314.31 KiB, allocs estimate: 3.
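
Finally, for reference, below is a minimal sketch of LU factorization with complete pivoting, as defined in the previous subsection ($PAQ = LU$). It extends lufact_pivot! with a column permutation; lufact_complete_pivot! is our illustrative name, not part of the repository.

function lufact_complete_pivot!(a::AbstractMatrix)
    n = size(a, 1)
    @assert size(a, 2) == n "size mismatch"
    m = Matrix{eltype(a)}(I, n, n)
    P, Q = collect(1:n), collect(1:n)  # row and column permutations
    for k = 1:n-1
        # search the remaining sub-matrix for the entry with the largest magnitude
        idx = argmax(abs.(view(a, k:n, k:n)))
        p, q = idx[1] + k - 1, idx[2] + k - 1
        # swap rows k and p of A, and of the already computed columns of M
        if p != k
            for col = 1:n
                a[k, col], a[p, col] = a[p, col], a[k, col]
            end
            for col = 1:k-1
                m[k, col], m[p, col] = m[p, col], m[k, col]
            end
            P[k], P[p] = P[p], P[k]
        end
        # swap columns k and q of A
        if q != k
            for row = 1:n
                a[row, k], a[row, q] = a[row, q], a[row, k]
            end
            Q[k], Q[q] = Q[q], Q[k]
        end
        # a zero pivot now means the whole remaining sub-matrix is zero
        iszero(a[k, k]) && continue
        # compute multipliers and update the remaining sub-matrix
        for i = k+1:n
            m[i, k] = a[i, k] / a[k, k]
        end
        for j = k+1:n, i = k+1:n
            a[i, j] -= m[i, k] * a[k, j]
        end
    end
    return m, triu!(a), P, Q
end

@testset "lufact with complete pivot" begin
    A = randn(5, 5)
    L, U, P, Q = lufact_complete_pivot!(copy(A))
    @test L * U ≈ A[P, Q]
end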

QR Factorization

The QR factorization of a matrix $A \in \mathbb{R}^{m\times n}$ is a factorization of the form

\[A = QR\]

where $Q \in \mathbb{R}^{m\times m}$ is an orthogonal matrix and $R \in \mathbb{R}^{m\times n}$ is an upper triangular matrix.

Householder Reflection

Let $v \in \mathbb{R}^m$ be nonzero. An $m$-by-$m$ matrix $P$ of the form

\[P = I-\beta vv^T, ~~~\beta = \frac{2}{v^Tv}\]

is a Householder reflection, which is both symmetric and orthogonal. Suppose we want to reflect a vector $x$ onto $e_1$, i.e. $Px = \mp\|x\|_2 e_1$. Then we can choose

\[\begin{align*} &v = x \pm \|x\|_2 e_1,\\ &P = I - \beta vv^T. \end{align*}\]

Let us define a Householder matrix in Julia.

struct HouseholderMatrix{T} <: AbstractArray{T, 2}
    v::Vector{T}
    β::T
end
function HouseholderMatrix(v::Vector{T}) where T
    HouseholderMatrix(v, 2/norm(v, 2)^2)
end

# array interfaces
Base.size(A::HouseholderMatrix) = (length(A.v), length(A.v))
Base.size(A::HouseholderMatrix, i::Int) = i == 1 || i == 2 ? length(A.v) : 1
function Base.getindex(A::HouseholderMatrix, i::Int, j::Int)
    (i == j ? 1 : 0) - A.β * A.v[i] * conj(A.v[j])
end

# Householder matrix is unitary
Base.inv(A::HouseholderMatrix) = A
# Householder matrix is Hermitian
Base.adjoint(A::HouseholderMatrix) = A

# Left and right multiplication
function left_mul!(B, A::HouseholderMatrix)
    B .-= (A.β .* A.v) * (A.v' * B)
    return B
end
function right_mul!(A, B::HouseholderMatrix)
    A .= A .- (A * (B.β .* B.v)) * B.v'
    return A
end
right_mul! (generic function with 1 method)

In this example, we define a HouseholderMatrix type, which is a subtype of AbstractArray. The v field stores the vector $v$ and the β field stores the scalar $\beta$. To support the array interface, we define the size and getindex functions. Please check the Julia manual for more details.

julia> using LinearAlgebra, Test
julia> @testset "householder property" begin v = randn(3) H = HouseholderMatrix(v) # symmetric @test H' ≈ H # reflexive @test H^2 ≈ I # orthogonal @test H' * H ≈ I endTest Summary: | Pass Total Time householder property | 3 3 0.7s Test.DefaultTestSet("householder property", Any[], 3, false, false, true, 1.714382308329414e9, 1.714382308996207e9, false, "REPL[2]")

Let us define a function to compute the Householder matrix that projects a vector to $e_1$.

function householder_e1(v::AbstractVector{T}) where T
    v = copy(v)
    v[1] -= norm(v, 2)
    return HouseholderMatrix(v, 2/norm(v, 2)^2)
end
householder_e1 (generic function with 1 method)
julia> A = Float64[1 2 2; 4 4 2; 4 6 4]
3×3 Matrix{Float64}:
 1.0  2.0  2.0
 4.0  4.0  2.0
 4.0  6.0  4.0

julia> hm = householder_e1(view(A, :, 1))
3×3 Main.HouseholderMatrix{Float64}:
 0.174078   0.696311   0.696311
 0.696311   0.412961  -0.587039
 0.696311  -0.587039   0.412961

julia> hm * A
3×3 Matrix{Float64}:
 5.74456      7.31126    4.52602
 8.88178e-16 -0.477767  -0.129612
 8.88178e-16  1.52223    1.87039

QR factorization by Householder reflections.

Let $H_k$ be a Householder reflection that zeros out the $k$-th column below the diagonal. Then we have

\[H_n \ldots H_2H_1 A = R\]

where $R$ is an upper triangular matrix. Then we can define the $Q$ matrix as

\[Q = H_1^{T} H_2^{T}\ldots H_n^{T},\]

which is an orthogonal matrix.

function householder_qr!(Q::AbstractMatrix{T}, a::AbstractMatrix{T}) where T
    m, n = size(a)
    @assert size(Q, 2) == m
    if m == 1
        return Q, a
    else
        # apply householder matrix
        H = householder_e1(view(a, :, 1))
        left_mul!(a, H)
        # update Q matrix
        right_mul!(Q, H')
        # recurse
        householder_qr!(view(Q, 1:m, 2:m), view(a, 2:m, 2:n))
    end
    return Q, a
end
householder_qr! (generic function with 1 method)
julia> @testset "householder QR" begin
           A = randn(3, 3)
           Q = Matrix{Float64}(I, 3, 3)
           R = copy(A)
           householder_qr!(Q, R)
           @info R
           @test Q * R ≈ A
           @test Q' * Q ≈ I
       end[ Info: [3.2692844905199427 -0.14725158946335168 0.8074414981893574; 5.551115123125783e-17 1.974943346460992 0.1936353353376712; 4.440892098500626e-16 1.6375789613221059e-15 1.1331504303235345]
Test Summary:  | Pass  Total  Time
householder QR |    2      2  3.2s
Test.DefaultTestSet("householder QR", Any[], 2, false, false, true, 1.714382309598754e9, 1.714382312781231e9, false, "REPL[1]")

Givens Rotations

A 2-by-2 rotation matrix

\[G = \left(\begin{matrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{matrix}\right)\]

rotates a vector by an angle $\theta$. By choosing $\theta$ appropriately, a rotation can zero out an element of a vector:

rotation_matrix(angle) = [cos(angle) -sin(angle); sin(angle) cos(angle)]
rotation_matrix (generic function with 1 method)
julia> angle = π/4
0.7853981633974483

julia> initial_vector = [1.0, 0.0]
2-element Vector{Float64}:
 1.0
 0.0

julia> final_vector = rotation_matrix(angle) * initial_vector
2-element Vector{Float64}:
 0.7071067811865476
 0.7071067811865475

julia> atan(0.1, 0.5)  # the two-argument arctangent
0.19739555984988078

julia> initial_vector = randn(2)
2-element Vector{Float64}:
  0.4391911640690489
 -1.3005460964861437

julia> angle = atan(initial_vector[2], initial_vector[1])
-1.245123179130958

julia> final_vector = rotation_matrix(-angle) * initial_vector  # eliminating the y element
2-element Vector{Float64}:
  1.3727013614336048
 -1.6653345369377348e-16

A Givens rotation embeds such a 2-by-2 rotation into an identity matrix, so that it acts only on two chosen coordinates:

\[\left( \begin{matrix} 1 & 0 & 0 & 0 & 0\\ 0 & c & 0 & s & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & -s & 0 & c & 0\\ 0 & 0 & 0 & 0 & 1 \end{matrix} \right) \left( \begin{matrix} a_1\\a_2\\a_3\\a_4\\a_5 \end{matrix} \right)= \left( \begin{matrix} a_1\\\alpha\\a_3\\0\\a_5 \end{matrix} \right)\]

where $s = \sin(\theta)$ and $c = \cos(\theta)$.

QR Factorization by Givens Rotations

struct GivensMatrix{T} <: AbstractArray{T, 2}
    c::T
    s::T
    i::Int
    j::Int
    n::Int
end

Base.size(g::GivensMatrix) = (g.n, g.n)
Base.size(g::GivensMatrix, i::Int) = i == 1 || i == 2 ? g.n : 1
function Base.getindex(g::GivensMatrix{T}, i::Int, j::Int) where T
    @boundscheck i <= g.n && j <= g.n
    if i == j
        return i == g.i || i == g.j ? g.c : one(T)
    elseif i == g.i && j == g.j
        return g.s
    elseif i == g.j && j == g.i
        return -g.s
    else
        return i == j ? one(T) : zero(T)
    end
end

function left_mul!(A::AbstractMatrix, givens::GivensMatrix)
    for col in 1:size(A, 2)
        vi, vj = A[givens.i, col], A[givens.j, col]
        A[givens.i, col] = vi * givens.c + vj * givens.s
        A[givens.j, col] = -vi * givens.s + vj * givens.c
    end
    return A
end
function right_mul!(A::AbstractMatrix, givens::GivensMatrix)
    for row in 1:size(A, 1)
        vi, vj = A[row, givens.i], A[row, givens.j]
        A[row, givens.i] = vi * givens.c + vj * givens.s
        A[row, givens.j] = -vi * givens.s + vj * givens.c
    end
    return A
end
right_mul! (generic function with 2 methods)
function givens_matrix(A, i, j)
    x, y = A[i, 1], A[j, 1]
    norm = sqrt(x^2 + y^2)
    c = x/norm
    s = y/norm
    return GivensMatrix(c, s, i, j, size(A, 1))
end
givens_matrix (generic function with 1 method)
function givens_qr!(Q::AbstractMatrix, A::AbstractMatrix)
    m, n = size(A)
    if m == 1
        return Q, A
    else
        for k = m:-1:2
            g = givens_matrix(A, k-1, k)
            left_mul!(A, g)
            right_mul!(Q, g)
        end
        givens_qr!(view(Q, :, 2:m), view(A, 2:m, 2:n))
        return Q, A
    end
end
givens_qr! (generic function with 1 method)
julia> @testset "givens QR" begin
           n = 3
           A = randn(n, n)
           R = copy(A)
           Q, R = givens_qr!(Matrix{Float64}(I, n, n), R)
           @test Q * R ≈ A
           @test Q * Q' ≈ I
           @info R
       end[ Info: [3.2692844905199427 -0.14725158946335165 0.8074414981893574; 0.0 1.9749433464609922 0.19363533533767216; 0.0 0.0 1.1331504303235342]
Test Summary: | Pass  Total  Time
givens QR     |    2      2  0.8s
Test.DefaultTestSet("givens QR", Any[], 2, false, false, true, 1.714382312881475e9, 1.714382313666443e9, false, "REPL[1]")

Gram-Schmidt Orthogonalization

The Gram-Schmidt orthogonalization computes the QR factorization of a matrix $A$ by orthogonalizing its columns one by one, producing an orthogonal matrix $Q$ and an upper triangular matrix $R$:

\[q_k = \left(a_k - \sum_{i=1}^{k-1} r_{ik}q_i\right)/r_{kk}\]

where $r_{ik} = q_i^T a_k$ for $i < k$ and $r_{kk}$ normalizes $q_k$ to unit length.

function classical_gram_schmidt(A::AbstractMatrix{T}) where T
    m, n = size(A)
    Q = zeros(T, m, n)
    R = zeros(T, n, n)
    R[1, 1] = norm(view(A, :, 1))
    Q[:, 1] .= view(A, :, 1) ./ R[1, 1]
    for k = 2:n
        Q[:, k] .= view(A, :, k)
        # project z to span(A[:, 1:k-1])⊥
        for j = 1:k-1
            R[j, k] = view(Q, :, j)' * view(A, :, k)
            Q[:, k] .-= view(Q, :, j) .* R[j, k]
        end
        # normalize the k-th column
        R[k, k] = norm(view(Q, :, k))
        Q[:, k] ./= R[k, k]
    end
    return Q, R
end

@testset "classical GS" begin
    n = 10
    A = randn(n, n)
    Q, R = classical_gram_schmidt(A)
    @test Q * R ≈ A
    @test Q * Q' ≈ I
    @info R
end

Modified Gram-Schmidt Orthogonalization

The modified Gram-Schmidt orthogonalization computes the same factorization, but subtracts each projection from all remaining columns as soon as $q_k$ is available, which is numerically more stable than the classical version.

function modified_gram_schmidt!(A::AbstractMatrix{T}) where T
    m, n = size(A)
    Q = zeros(T, m, n)
    R = zeros(T, n, n)
    for k = 1:n
        R[k, k] = norm(view(A, :, k))
        Q[:, k] .= view(A, :, k) ./ R[k, k]
        for j = k+1:n
            R[k, j] = view(Q, :, k)' * view(A, :, j)
            A[:, j] .-= view(Q, :, k) .* R[k, j]
        end
    end
    return Q, R
end

@testset "modified GS" begin
    n = 10
    A = randn(n, n)
    Q, R = modified_gram_schmidt!(copy(A))
    @test Q * R ≈ A
    @test Q * Q' ≈ I
    @info R
end

The following experiment compares the loss of orthogonality of the two variants:

let
    n = 100
    A = randn(n, n)
    Q1, R1 = classical_gram_schmidt(A)
    Q2, R2 = modified_gram_schmidt!(copy(A))
    @info norm(Q1' * Q1 - I)
    @info norm(Q2' * Q2 - I)
end

Eigenvalue/Singular value decomposition problem

The eigenvalue problem is to find the eigenvalues $\lambda$ and eigenvectors $x$ of a matrix $A$ such that

\[Ax = \lambda x\]

Power method

The power method is an iterative method to find the largest eigenvalue of a matrix. Let $A$ be a symmetric matrix, and $x_0$ be a random vector. The power method is defined as

\[x_{k+1} = \frac{A x_k}{\|A x_k\|}\]

The power method converges to the eigenvector corresponding to the largest eigenvalue (in magnitude) of $A$. Let $\lambda_1$ and $\lambda_2$ denote the largest and second largest eigenvalues of $A$ in magnitude; the error of the power method decreases as

\[\left|\frac{\lambda_2}{\lambda_1}\right|^k\]

The following is an implementation of the power method.

function power_method(A::AbstractMatrix, k::Int)
    @assert size(A, 1) == size(A, 2)
    x = normalize!(randn(size(A, 2)))
    for _ = 1:k
        x = A * x
        normalize!(x)
    end
    return x
end
power_method (generic function with 1 method)
julia> matsize = 10
10

julia> A10 = randn(matsize, matsize); A10 += A10'  # random symmetric matrix
10×10 Matrix{Float64}:
 -0.332798   -2.64579   -1.06457   …  -1.87588     0.274707   1.16752
 -2.64579    -0.170511   0.622339  …   1.60307     0.879856  -1.51845
 -1.06457     0.622339  -0.885442  …   1.94719    -0.11873    1.42443
 -2.7247     -1.02882    1.2989    …  -0.519567    0.232768  -2.28584
  0.789068   -1.10919   -0.394854  …   0.738742    0.821133  -0.587346
 -0.217467    1.82341   -1.25465   …  -0.0782477   2.32244   -0.942595
  3.18985    -0.379839  -1.04527   …  -0.375616   -0.348287  -0.418554
 -1.87588     1.60307    1.94719   …   1.36924    -0.356568  -1.60999
  0.274707    0.879856  -0.11873   …  -0.356568   -1.45908    1.02005
  1.16752    -1.51845    1.42443   …  -1.60999     1.02005   -1.10313

julia> vmax = eigen(A10).vectors[:, end]  # exact eigenvector
10-element Vector{Float64}:
 -0.5191839343662336
  0.41290752515430973
  0.18900976195257552
  0.05301664868717321
  0.10780601866014639
  0.24756613968745333
 -0.4530652114821484
  0.43536213182125594
  0.07554879266899484
 -0.2190729099715404

julia> x = power_method(A10, 20)  # 20 iterations of the power method
10-element Vector{Float64}:
 -0.7027666284029243
  0.15774345188000985
  0.31163726791694263
 -0.24806155797310336
  0.11201646891799118
  0.30878291865663776
 -0.16132135149576585
  0.2686866474542615
  0.10801303072585856
 -0.3237364022707397

julia> 1 - abs2(x' * vmax)  # the error
0.30541584892273266
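
We can also check the claimed convergence rate numerically. The sketch below constructs a symmetric matrix with the known spectrum $1, 2, \ldots, 10$ (the matrix, names, and iteration counts here are illustrative assumptions), so the error $1 - |x^T v_{\max}|^2$ should decay roughly like $(\lambda_2/\lambda_1)^{2k} = 0.81^k$:

# a random symmetric matrix with known eigenvalues 1, 2, ..., 10
Qmat = Matrix(qr(randn(10, 10)).Q)             # a random orthogonal matrix
Aknown = Qmat * Diagonal(collect(1.0:10.0)) * Qmat'
vmax_known = Qmat[:, 10]                       # the exact dominant eigenvector
for k in (10, 20, 40)
    x = power_method(Aknown, k)
    @info "k = $k, error = $(1 - abs2(x' * vmax_known))"
end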

Rayleigh Quotient Iteration

The Rayleigh quotient iteration (RQI) is an iterative method to find an eigenvalue and the corresponding eigenvector of a matrix. The RQI is defined as

\[x_{k+1} = \frac{(A - \sigma_k I)^{-1} x_k}{\|(A - \sigma_k I)^{-1} x_k\|}\]

where $\sigma_k = x_k^T A x_k$. The RQI converges to the eigenvector corresponding to the eigenvalue closest to $\sigma_k$. The following is an implementation of the RQI.

function rayleigh_quotient_iteration(A::AbstractMatrix, k::Int)
    @assert issymmetric(A) "A must be a symmetric matrix"
    x = normalize!(randn(size(A, 2)))
    for _ = 1:k
        sigma = x' * A * x
        y = (A - sigma * I) \ x
        x = normalize!(y)
    end
    return x
end
rayleigh_quotient_iteration (generic function with 1 method)
julia> x = rayleigh_quotient_iteration(A10, 5)  # 5 iterations of RQI
10-element Vector{Float64}:
  0.3627905918053468
  0.3250475457956593
 -0.2567648686457516
 -0.44551593506228965
  0.02976315104588443
  0.03889041644123406
  0.3766767454038162
  0.537673781018456
 -0.057589006580464716
 -0.24823610666435555

julia> U = eigen(A10).vectors
10×10 Matrix{Float64}:
  0.488965   -0.377333    -0.33325     …   0.152504   -0.178951   -0.519184
  0.337137    0.0652168    0.0881611   …  -0.0118375   0.0755025   0.412908
 -0.278938   -0.312038    -0.382925    …  -0.353461    0.0506528   0.18901
  0.52779     0.00327104  -0.00370493  …   0.172784    0.676618    0.0530166
 -0.0398162   0.179782     0.227258    …   0.46842    -0.197641    0.107806
 -0.185403    0.216093    -0.591364    …   0.626456    0.00428407  0.247566
 -0.392217    0.253917     0.117773    …   0.0329215   0.576755   -0.453065
  0.170207    0.119477     0.0220759   …  -0.253706    0.102583    0.435362
 -0.0813814  -0.421962     0.561213    …   0.343817   -0.0824663   0.0755488
  0.255918    0.648205     0.0620643   …  -0.162136   -0.335929   -0.219073

julia> (x' * U)'  # one should see a one-hot vector
10-element Vector{Float64}:
  7.147060721024445e-16
 -7.216449660063518e-16
  6.175615574477433e-16
  4.163336342344337e-16
 -8.604228440844963e-16
 -0.9999999999999999
  4.520689378395559e-15
  4.822531263215524e-16
 -3.885780586188048e-16
  9.020562075079397e-17

Symmetric QR decomposition

The symmetric QR decomposition reduces a symmetric matrix to a tridiagonal matrix by orthogonal similarity transformations. Let $A$ be a symmetric matrix; the decomposition is defined as

\[A = Q T Q^T\]

where $Q$ is an orthogonal matrix and $T$ is a tridiagonal matrix. The following implements the tridiagonalization step with Householder reflections.

# Q should be initialized to an identity matrix
function householder_trid!(Q, a)
    m, n = size(a)
    @assert m==n && size(Q, 2) == n
    if m == 2
        return Q, a
    else
        # apply householder matrix
        H = householder_e1(view(a, 2:n, 1))
        left_mul!(view(a, 2:n, :), H)
        right_mul!(view(a, :, 2:n), H')
        # update Q matrix
        right_mul!(view(Q, :, 2:n), H')
        # recurse
        householder_trid!(view(Q, :, 2:n), view(a, 2:m, 2:n))
    end
    return Q, a
end
householder_trid! (generic function with 1 method)
julia> @testset "householder tridiagonal" begin
           n = 5
           a = randn(n, n)
           a = a + a'
           Q = Matrix{Float64}(I, n, n)
           Q, T = householder_trid!(Q, copy(a))
           @test Q * T * Q' ≈ a
       endTest Summary:           | Pass  Total  Time
householder tridiagonal |    1      1  1.5s
Test.DefaultTestSet("householder tridiagonal", Any[], 1, false, false, true, 1.714382315018834e9, 1.71438231652591e9, false, "REPL[1]")

The symmetric QR decomposition also includes a process to converge the tridiagonal matrix to a diagonal matrix; a rough sketch is given below. We refer the reader to Section 8.3 of the book "Matrix Computations" by Golub and Van Loan[Golub2016] for more details.
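
As a rough illustration of that process, the unshifted QR iteration below repeatedly factorizes $T = QR$ and recombines the factors in reverse order, $T \leftarrow RQ = Q^T T Q$, which preserves the eigenvalues while driving the off-diagonal entries towards zero (this sketch omits the shifts and deflation used in practice; qr_iteration is a hypothetical helper name):

function qr_iteration(T::AbstractMatrix; maxiter=1000, atol=1e-10)
    for _ = 1:maxiter
        # factorize, then recombine in reverse order: RQ = Qᵀ(QR)Q
        F = qr(T)
        T = F.R * Matrix(F.Q)
        # stop once the off-diagonal part is negligible
        norm(T - Diagonal(T)) < atol && break
    end
    return T
end

Applied to the tridiagonal matrix $T$ obtained from householder_trid!, diag(qr_iteration(T)) approximates the eigenvalues of the original symmetric matrix.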

The SVD algorithm

The singular value decomposition (SVD) decomposes a matrix into three matrices. Let $A$ be a matrix; the SVD is defined as

\[A = U S V^\dagger\]

where $U$ and $V$ are orthogonal matrices, and $S$ is a diagonal matrix. The algorithm to compute the SVD is

  1. Let $C = A^T A$,
  2. Use the symmetric QR algorithm to compute $V_1^T C V_1 = {\rm diag}(\sigma_i^2)$,
  3. Apply QR decomposition to $AV_1$ obtaining $U^\dagger(AV_1) = R$. Then $V = V_1 R^\dagger {\rm diag}(\sigma_i^{-1})$, and $S = {\rm diag}(\sigma_i)$.

The following is an implementation of the SVD algorithm.

function simple_svd(A::AbstractMatrix)
    m, n = size(A)
    @assert m >= n "m must be greater than or equal to n"
    C = A' * A
    S2, V1 = eigen(C)
    σ = sqrt.(S2)
    AV1 = A * V1
    qrres = qr(AV1)
    U = qrres.Q
    V = V1 * qrres.R' * Diagonal(inv.(σ))
    return U, Diagonal(σ), V
end
simple_svd (generic function with 1 method)
julia> @testset "simple SVD" begin
           m, n = 5, 3
           A = randn(m, n)
           U, S, V = simple_svd(A)
           @test U * S * V' ≈ A
           @test isapprox(U' * U, I; atol=1e-8)
           @test isapprox(V' * V, I; atol=1e-8)
       endTest Summary: | Pass  Total  Time
simple SVD    |    3      3  1.5s
Test.DefaultTestSet("simple SVD", Any[], 3, false, false, true, 1.714382316534786e9, 1.714382318053088e9, false, "REPL[1]")

Cholesky Decomposition (Implementation)

Cholesky decomposition is a method of decomposing a positive-definite matrix into a product of a lower triangular matrix and its transpose. It is often used in solving systems of linear equations, computing the inverse of a matrix, and generating random numbers with a given covariance matrix. The Cholesky decomposition is computationally efficient and numerically stable, making it a popular choice in many applications.

Given a positive definite symmetric matrix $A\in \mathbb{R}^{n\times n}$, the Cholesky decomposition is formally defined as

\[A = LL^T,\]

where $L$ is a lower triangular matrix.

The implementation of Cholesky decomposition is similar to LU decomposition.

function chol!(a::AbstractMatrix)
    n = size(a, 1)
    @assert size(a, 2) == n
    for k=1:n
        a[k, k] = sqrt(a[k, k])
        for i=k+1:n
            a[i, k] = a[i, k] / a[k, k]
        end
        for j=k+1:n
            for i=k+1:n
                a[i,j] = a[i,j] - a[i, k] * a[j, k]
            end
        end
    end
    return a
end
chol! (generic function with 1 method)
julia> @testset "cholesky" begin
           n = 10
           Q, R = qr(randn(10, 10))
           a = Q * Diagonal(rand(10)) * Q'
           L = chol!(copy(a))
           @test tril(L) * tril(L)' ≈ a
           # cholesky(a) in Julia
       endTest Summary: | Pass  Total  Time
cholesky      |    1      1  0.1s
Test.DefaultTestSet("cholesky", Any[], 1, false, false, true, 1.714382318061295e9, 1.714382318175845e9, false, "REPL[1]")

References

  • [Golub2016] Golub, G.H. and Van Loan, C.F. Matrix Computations. Johns Hopkins University Press.