Sparse Matrices and Graphs
Sparse Matrices
Matrices are often sparse. Consider the matrix that we used in the spring chain example, the stiffness matrix is tridiagonal and has only $3n-2$ nonzero elements.
\[\begin{align*} A = \begin{pmatrix} -C & C & 0 & \ldots & 0\\ C & -2C & C & \ldots & 0\\ 0 & C & -2C & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & C & -C \end{pmatrix} \end{align*}\]
Storing such a matrix in a dense format requires $n^2$ elements, which is very memory inefficient since it has only $3n-2$ nonzero elements.
COOrdinate (COO) format
The coordinate format means storing nonzero matrix elements into triples
\[\begin{align*} &(i_1, j_1, v_1)\\ &(i_2, j_2, v_2)\\ &\vdots\\ &(i_k, j_k, v_k) \end{align*}\]
To store the stiffness matrix in COO format, we only need to store $3n-2$ triples.
To implement a COO matrix in Julia, we need to define a new data type and implement the AbstractArray
interface.
size
: return the size of the matrixgetindex
: return the element at the given index
Let the number of nonzero elements in a COO matrix $A$ be ${\rm nnz}(A)$. The indexing operation requires enumerating over all ${\rm nnz}(A)$ elements.
using LinearAlgebra
struct COOMatrix{Tv, Ti} <: AbstractArray{Tv, 2} # Julia does not have a COO data type
m::Ti # number of rows
n::Ti # number of columns
colval::Vector{Ti} # column indices
rowval::Vector{Ti} # row indices
nzval::Vector{Tv} # values
function COOMatrix(m::Ti, n::Ti, colval::Vector{Ti}, rowval::Vector{Ti}, nzval::Vector{Tv}) where {Tv, Ti}
@assert length(colval) == length(rowval) == length(nzval)
new{Tv, Ti}(m, n, colval, rowval, nzval)
end
end
Base.size(coo::COOMatrix) = (coo.m, coo.n)
Base.size(coo::COOMatrix, i::Int) = getindex((coo.m, coo.n), i)
# the number of non-zero elements
nnz(coo::COOMatrix) = length(coo.nzval)
# implement get index for CSC matrix, call with A[i, j]
function Base.getindex(coo::COOMatrix{Tv}, i::Integer, j::Integer) where Tv
@boundscheck checkbounds(coo, i, j)
v = zero(Tv)
for (i2, j2, v2) in zip(coo.rowval, coo.colval, coo.nzval)
if i == i2 && j == j2
v += v2 # accumulate the value, since repeated indices are allowed.
end
end
return v
end
function Base.:(*)(A::COOMatrix{T1}, B::COOMatrix{T2}) where {T1, T2}
@assert size(A, 2) == size(B, 1)
rowval = Int[]
colval = Int[]
nzval = promote_type(T1, T2)[]
for (i, j, v) in zip(A.rowval, A.colval, A.nzval)
for (i2, j2, v2) in zip(B.rowval, B.colval, B.nzval)
if j == i2
push!(rowval, i)
push!(colval, j2)
push!(nzval, v * v2)
end
end
end
return COOMatrix(size(A, 1), size(B, 2), colval, rowval, nzval)
end
julia> using Test
julia> stiffmatrix = COOMatrix(3, 3, [1, 1, 2, 2, 2, 3, 3], [1, 2, 1, 2, 3, 2, 3], [-1.0, 1, 1, -2, 1, 1, -1])
3×3 Main.COOMatrix{Float64, Int64}: -1.0 1.0 0.0 1.0 -2.0 1.0 0.0 1.0 -1.0
julia> size(stiffmatrix)
(3, 3)
julia> nnz(stiffmatrix)
7
julia> dense_matrix = Matrix(stiffmatrix)
3×3 Matrix{Float64}: -1.0 1.0 0.0 1.0 -2.0 1.0 0.0 1.0 -1.0
julia> @test stiffmatrix * stiffmatrix ≈ dense_matrix ^ 2
Test Passed
Most operations on COO matrices are computational expensive. For example, multiplying two COO matrices requires $O({\rm nnz}(A)^2)$ computing time.
Compressed Sparse Column (CSC) format
A CSC format sparse matrix can be constructed with the SparseArrays.sparse
function. However, here we will implement a simple CSC matrix from scratch.
struct CSCMatrix{Tv,Ti} <: AbstractMatrix{Tv}
m::Int
n::Int
colptr::Vector{Ti}
rowval::Vector{Ti}
nzval::Vector{Tv}
function CSCMatrix(m::Int, n::Int, colptr::Vector{Ti}, rowval::Vector{Ti}, nzval::Vector{Tv}) where {Tv, Ti}
@assert length(colptr) == n + 1
@assert length(rowval) == length(nzval) == colptr[end] - 1
new{Tv, Ti}(m, n, colptr, rowval, nzval)
end
end
Base.size(A::CSCMatrix) = (A.m, A.n)
Base.size(A::CSCMatrix, i::Int) = getindex((A.m, A.n), i)
# the number of non-zero elements
nnz(csc::CSCMatrix) = length(csc.nzval)
# convert a COO matrix to a CSC matrix
function CSCMatrix(coo::COOMatrix{Tv, Ti}) where {Tv, Ti}
m, n = size(coo)
# sort the COO matrix by column
order = sortperm(1:nnz(coo); by=i->coo.rowval[i] + m * (coo.colval[i]-1))
colptr, rowval, nzval = similar(coo.rowval, n+1), similar(coo.rowval), similar(coo.nzval)
k = 0
ipre, jpre = 0, 0
colptr[1] = 1
for idx in order
i, j, v = coo.rowval[idx], coo.colval[idx], coo.nzval[idx]
# values with the same indices are accumulated
if i == ipre && j == jpre
nzval[k] += v
else
k += 1
if j != jpre
# a new column starts
colptr[jpre+1:j+1] .= k
end
rowval[k] = i
nzval[k] = v
ipre, jpre = i, j
end
end
colptr[jpre+1:end] .= k + 1
resize!(rowval, k)
resize!(nzval, k)
return CSCMatrix(m, n, colptr, rowval, nzval)
end
# implement get index for CSC matrix, call with A[i, j]
function Base.getindex(A::CSCMatrix{T}, i::Int, j::Int) where T
@boundscheck checkbounds(A, i, j)
for k in nzrange(A, j)
if A.rowval[k] == i
return A.nzval[k]
end
end
return zero(T)
end
function Base.:*(A::CSCMatrix{T1}, B::CSCMatrix{T2}) where {T1, T2}
T = promote_type(T1, T2)
@assert size(A, 2) == size(B, 1)
rowval, colval, nzval = Int[], Int[], T[]
for j2 in 1:size(B, 2) # enumerate the columns of B
for k2 in nzrange(B, j2) # enumerate the rows of B
v2 = B.nzval[k2]
for k1 in nzrange(A, B.rowval[k2]) # enumerate the rows of A
push!(rowval, A.rowval[k1])
push!(colval, j2)
push!(nzval, A.nzval[k1] * v2)
end
end
end
return CSCMatrix(COOMatrix(size(A, 1), size(B, 2), colval, rowval, nzval))
end
# return the range of non-zero elements in the j-th column
nzrange(A::CSCMatrix, j::Int) = A.colptr[j]:A.colptr[j+1]-1
nzrange (generic function with 1 method)
julia> coo_matrix = COOMatrix(5, 4, [1, 1, 2, 2, 4, 4], [2, 3, 1, 4, 3, 4], [1, 2, 3, 4, 5, 6])
5×4 Main.COOMatrix{Int64, Int64}: 0 3 0 0 1 0 0 0 2 0 0 5 0 4 0 6 0 0 0 0
julia> csc_matrix = CSCMatrix(coo_matrix)
5×4 Main.CSCMatrix{Int64, Int64}: 0 3 0 0 1 0 0 0 2 0 0 5 0 4 0 6 0 0 0 0
The csc_matrix
has type CSCMatrix
, which contains 5 fields
julia> fieldnames(csc_matrix |> typeof)
(:m, :n, :colptr, :rowval, :nzval)
julia> csc_matrix.m, csc_matrix.n
(5, 4)
julia> csc_matrix.colptr
5-element Vector{Int64}: 1 3 5 5 7
julia> csc_matrix.rowval
6-element Vector{Int64}: 2 3 1 4 3 4
julia> csc_matrix.nzval
6-element Vector{Int64}: 1 2 3 4 5 6
The m
, n
, rowval
and nzval
have the same meaning as those in the COO format. colptr
is an integer vector of size $n+1$, where colptr[j]
is the index in rowval
and nzval
of the first nonzero element in the $j$-th column, and colptr[j+1]
is the index of the first nonzero element in the $(j+1)$-th column. Hence the $j$-th column of the matrix is stored in rowval[colptr[j]:colptr[j+1]-1]
and nzval[colptr[j]:colptr[j+1]-1]
.
The number of operations required to index an element in the $j$-th column of a CSC matrix is linear to the nonzero elements in the $j$-th column. To get an element from the 2nd row and 3rd column of a CSC matrix, we can use the following code
julia> csc_matrix[2, 3]
0
The row indices and values of nonzero elements in the 3rd column can be obtained by
julia> rows3 = csc_matrix.rowval[csc_matrix.colptr[3]:csc_matrix.colptr[4]-1]
Int64[]
julia> val3 = csc_matrix.nzval[csc_matrix.colptr[3]:csc_matrix.colptr[4]-1]
Int64[]
julia> csc_matrix.rowval[nzrange(csc_matrix, 3)] # or equivalently, we can use `nzrange`
Int64[]
Multiplying two CSC matrices is much faster than multiplying two COO matrices. The time complexity of multiplying two CSC matrices $A$ and $B$ is $O({\rm nnz}(A){\rm nnz}(B)/n)$.
julia> csc_matrix2 = CSCMatrix(COOMatrix(coo_matrix.n, coo_matrix.m, coo_matrix.rowval, coo_matrix.colval, coo_matrix.nzval)) # transpose
4×5 Main.CSCMatrix{Int64, Int64}: 0 1 2 0 0 3 0 0 4 0 0 0 0 0 0 0 0 5 6 0
julia> @test Matrix(csc_matrix) * Matrix(csc_matrix2) ≈ csc_matrix * csc_matrix2
Test Passed
Dominant eigenvalue problem
Given a matrix $A \in \mathbb{R}^{n \times n}$, the dominant eigenvalue problem is to find the largest eigenvalue $\lambda_1$ and its corresponding eigenvector $x_1$ such that
\[A x_1 = \lambda_1 x_1.\]
The power method is a simple iterative algorithm to solve the dominant eigenvalue problem. The algorithm starts with a random vector $v_0$ and repeatedly multiplies it with the matrix $A$.
\[v_k = A^k v_0\]
By representing the initial vector $v_0$ as a linear combination of eigenvectors of $A$, i.e. $v_0 = \sum_{i=1}^n c_i x_i$, we have
\[v_k = \sum_{i=1}^n \lambda_i^k c_i x_i\]
where $\lambda_1 > \lambda_2 \geq \ldots \geq \lambda_n$ are the eigenvalues of $A$ and $x_i$ are the corresponding eigenvectors. The power method converges to the eigenvector corresponding to the largest eigenvalue as $k \rightarrow \infty$. The rate of convergence is dedicated by $|\lambda_2/\lambda_1|^k$. The Julia code for the power method is as follows.
function power_method(A::AbstractMatrix{T}, n::Int) where T
n = size(A, 2)
x = normalize!(randn(n))
for i=1:n
x = A * x
normalize!(x)
end
return x' * A * x', x
end
power_method (generic function with 1 method)
By inverting the sign, $A\rightarrow -A$, we can use the same method to obtain the smallest eigenvalue.
The Krylov subspace method
Let $A \in \mathbb{C}^{n \times n}$ be a large sparse matrix, the Arnoldi and Lanczos algorithms can be used to obtain its largest/smallest eigenvalue, with much faster convergence speed comparing with the power method.
The key idea of these algorithms is to generate an orthogonal matrix $Q \in \mathbb{C}^{n\times k}$, $Q^\dagger Q = I$, such that
\[Q^\dagger A Q = B.\]
We have the following property
\[\lambda_1(B) \leq \lambda_1(A),\]
where $\lambda_1(A)$ is the largest eigenvalue of $A$. By chooing $Q$ carefully, such that ${\rm span}(Q)$ contains the dominant eigenvectors of $A$, then $\lambda_1(B) = \lambda_1(A)$. When the equality holds, we have
\[By_1 = \lambda_1(B)y_1\]
where $y_i$ is the $i$-th eigenvector of $B$. By multiplying $y^\dagger$ on the left, we have
\[y_1^\dagger Q^\dagger A Q y_1 = \lambda_1(B)\]
Hence, the eigenvectors of $B$ are related to the eigenvectors of $A$ by the orthogonal matrix $Q$.
Inspired by the power method, we can define the $Q$ as the Krylov subspace that generated from a random initial vector $q_1$.
\[\mathcal{K}(A, q_1, k) = {\rm span}\{q_1, Aq_1, A^2q_1, \ldots, A^{k-1}q_1\}\]
The Arnoldi and Lanczos algorithm are two special cases of the Krylov subspace method. The Arnoldi algorithm is used to solve the eigenvalue problem, while the Lanczos algorithm is used to solve the symmetric eigenvalue problem.
KrylovKit.jl
The Julia package KrylovKit.jl
contains many Krylov space based algorithms. KrylovKit.jl
accepts general functions or callable objects as linear maps, and general Julia objects with vector like behavior (as defined in the docs) as vectors. The high level interface of KrylovKit is provided by the following functions:
linsolve
: solve linear systemseigsolve
: find a few eigenvalues and corresponding eigenvectorsgeneigsolve
: find a few generalized eigenvalues and corresponding vectorssvdsolve
: find a few singular values and corresponding left and right singular vectorsexponentiate
: apply the exponential of a linear map to a vectorexpintegrator
: exponential integrator for a linear non-homogeneous ODE, computes a linear combination of theϕⱼ
functions which generalizeϕ₀(z) = exp(z)
.
The Lanczos algorithm
The Lanczos algorithm is used to solve the symmetric eigenvalue problem. Given a symmetric matrix $A \in \mathbb{R}^{n \times n}$, the Lanczos algorithm generates an orthogonal matrix $Q \in \mathbb{R}^{n \times k}$, such that
\[Q^T A Q = T\]
where $T$ is a tridiagonal matrix
\[T = \left(\begin{matrix} \alpha_1 & \beta_1 & 0 & \ldots & 0\\ \beta_1 & \alpha_2 & \beta_2 & \ldots & 0\\ 0 & \beta_2 & \alpha_3 & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \beta_{k-1} & \alpha_k \end{matrix}\right),\]
Let $Q = [q_1 | q_2 | \ldots | q_n],$ and ${\rm span}(\{q_1, q_2, \ldots, q_k\}) = \mathcal{K}(A, q_1, k)$. We have $Aq_k = \beta_{k-1}q_{k-1} + \alpha_k q_k + \beta_k q_{k+1}$, or equivalently in the recursive style
\[q_{k+1} = (Aq_k - \beta_{k-1}q_{k-1} - \alpha_k q_k)/\beta_k.\]
By multiplying $q_k^T$ on the left, we have
\[\alpha_k = q_k^T A q_k.\]
Since $q_{k+1}$ is normalized, we have
\[\beta_k = \|Aq_k - \beta_{k-1}q_{k-1} - \alpha_k q_k\|_2\]
If at any moment, $\beta_k = 0$, the interation stops due to convergence of a subspace. We have the following reducible form
\[T(\beta_2 = 0) = \left(\begin{array}{cc|ccc} \alpha_1 & \beta_1 & 0 & \ldots & 0\\ \beta_1 & \alpha_2 & 0 & \ldots & 0\\ \hline 0 & 0 & \alpha_3 & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \beta_{k-1} & \alpha_k \end{array}\right),\]
A Julia implementation
function lanczos(A, q1::AbstractVector{T}; abstol, maxiter) where T
# normalize the input vector
q1 = normalize(q1)
# the first iteration
q = [q1]
Aq1 = A * q1
α = [q1' * Aq1]
rk = Aq1 .- α[1] .* q1
β = [norm(rk)]
for k = 2:min(length(q1), maxiter)
# the k-th orthonormal vector in Q
push!(q, rk ./ β[k-1])
Aqk = A * q[k]
# compute the diagonal element as αₖ = qₖᵀ A qₖ
push!(α, q[k]' * Aqk)
rk = Aqk .- α[k] .* q[k] .- β[k-1] * q[k-1]
# compute the off-diagonal element as βₖ = |rₖ|
nrk = norm(rk)
# break if βₖ is smaller than abstol or the maximum number of iteration is reached
if abs(nrk) < abstol || k == length(q1)
break
end
push!(β, nrk)
end
# returns T and Q
return SymTridiagonal(α, β), hcat(q...)
end
lanczos (generic function with 1 method)
Reorthogonalization
Let $r_0, \ldots, r_{k-1} \in \mathbb{C}_n$ be linearly independent vectors and the corresponding Householder matrices $H_0, \ldots, H_{k-1}$ such that $(H_0\ldots H_{k- 1})^T [r_0\mid\ldots\mid r_{k-1}]$ is an upper triangular matrix. Let $[q_1 \mid \ldots \mid q_k ]$ denote the first $k$ columns of the Householder product $(H_0 \ldots H_{k-1})$, then $q_1, \ldots, q_k$ are orthonormal vectors up to machine precision. The Lanczos algorithm with complete reorthogonalization is as follows:
function lanczos_reorthogonalize(A, q1::AbstractVector{T}; abstol, maxiter) where T
n = length(q1)
# normalize the input vector
q1 = normalize(q1)
# the first iteration
q = [q1]
Aq1 = A * q1
α = [q1' * Aq1]
rk = Aq1 .- α[1] .* q1
β = [norm(rk)]
householders = [householder_matrix(q1)]
for k = 2:min(n, maxiter)
# reorthogonalize rk: 1. compute the k-th householder matrix
for j = 1:k-1
left_mul!(view(rk, j:n), householders[j])
end
push!(householders, householder_matrix(view(rk, k:n)))
# reorthogonalize rk: 2. compute the k-th orthonormal vector in Q
qk = zeros(T, n); qk[k] = 1 # qₖ = H₁H₂…Hₖeₖ
for j = k:-1:1
left_mul!(view(qk, j:n), householders[j])
end
push!(q, qk)
Aqk = A * q[k]
# compute the diagonal element as αₖ = qₖᵀ A qₖ
push!(α, q[k]' * Aqk)
rk = Aqk .- α[k] .* q[k] .- β[k-1] * q[k-1]
# compute the off-diagonal element as βₖ = |rₖ|
nrk = norm(rk)
# break if βₖ is smaller than abstol or the maximum number of iteration is reached
if abs(nrk) < abstol || k == n
break
end
push!(β, nrk)
end
return SymTridiagonal(α, β), hcat(q...)
end
struct HouseholderMatrix{T} <: AbstractArray{T, 2}
v::Vector{T}
β::T
end
# the `mul!` interfaces can take two extra factors.
function left_mul!(B, A::HouseholderMatrix)
B .-= (A.β .* A.v) * (A.v' * B)
return B
end
function householder_matrix(v::AbstractVector{T}) where T
v = copy(v)
v[1] -= norm(v, 2)
return HouseholderMatrix(v, 2/norm(v, 2)^2)
end
householder_matrix (generic function with 1 method)
julia> using Graphs
julia> n = 1000
1000
julia> graph = random_regular_graph(n, 3)
{1000, 1500} undirected simple Int64 graph
julia> A = laplacian_matrix(graph)
1000×1000 SparseArrays.SparseMatrixCSC{Int64, Int64} with 4000 stored entries: ⎡⡛⢌⢪⢐⣰⠈⠔⠁⣔⠄⠲⢠⢀⣡⠤⡅⡬⠡⣎⢞⠠⢰⡊⠤⡶⣴⠈⠲⠇⣂⡧⡄⠺⢣⣱⠂⡀⡓⠎⢀⎤ ⎢⢊⢒⠕⢅⡴⡂⣖⠅⣈⠏⢷⣈⢖⡳⠥⠮⠣⡶⠄⠞⠍⣖⢤⡸⣍⡀⣉⡩⠕⡨⢐⡑⠠⠓⢴⠅⠬⢊⡐⠸⎥ ⎢⡐⠚⠰⠫⡻⣮⡃⡃⣈⢹⣲⢷⢤⠩⣖⢛⢦⠬⢲⠄⣑⡒⣘⠑⢑⢈⠌⠻⠀⠨⡐⢅⡐⠇⡡⠑⡦⠱⡍⡠⎥ ⎢⠔⠁⠜⠝⠭⠨⠟⢅⡉⡁⡚⢲⠼⠡⣪⣀⠝⣏⠱⡾⠍⡉⢑⣕⡕⠡⡰⣔⢼⡐⢀⢄⣲⠴⡆⢐⢫⢂⠵⡄⎥ ⎢⠐⠝⡦⠜⣆⣘⠇⠨⠕⢅⠠⡊⢃⠤⠻⠙⡉⠠⢾⡐⡐⢦⣫⡛⠗⠆⡴⢈⣝⣀⢧⡠⢐⡪⠂⣠⢭⡏⠓⡄⎥ ⎢⠘⣂⡙⢳⢼⣞⢺⣈⡠⠢⡕⢍⠃⢤⠀⠭⢬⠶⣓⡀⠪⣬⠅⡄⠒⢅⡮⠃⢊⡎⣩⡐⡂⠡⠒⢬⠭⠀⠂⠙⎥ ⎢⠄⣰⢼⡱⡄⡓⠖⡃⠉⡔⠉⣄⡿⣯⠢⣂⡰⢠⣅⠁⣑⡸⣲⠹⢀⣡⠍⡵⣄⠜⠈⡡⠫⠒⣇⠔⠺⠔⢺⠋⎥ ⎢⠄⠧⡡⡇⣼⢙⠊⢺⣟⠂⡄⡄⠨⢢⠑⣤⠟⠯⠪⠸⠤⢀⠊⢊⠩⢰⢃⢅⣊⢌⡐⠒⣊⠊⢴⡸⠅⡮⡐⢘⎥ ⎢⠆⡋⢩⡦⡈⡗⡷⢥⠃⡈⢢⡗⠐⣊⡿⡅⣻⢞⠕⠀⣓⠁⡢⠒⠤⡊⢒⠤⢌⠌⠲⢣⢋⠈⣐⠨⠈⡸⢠⣬⎥ ⎢⣪⢝⣠⠅⠘⠖⣱⡦⢚⠳⠙⠸⠅⠙⣊⡂⠑⠁⠑⢄⣢⠍⠰⠡⡅⣾⡔⣋⠰⠌⡀⣂⠩⢍⠰⣒⡥⠠⢎⣂⎥ ⎢⢀⣂⢣⢥⢱⠸⡇⠡⠰⣌⡊⣦⣑⡸⠀⢃⠝⠘⡌⠞⡕⢍⡤⡎⢅⡦⢱⡐⠬⢙⣞⢶⡀⠻⢀⠣⣊⠈⢒⡀⎥ ⎢⠊⡌⣀⡳⢖⠘⢕⢴⣯⠺⠁⠥⣜⡚⡪⢀⢨⠊⠔⡂⡠⠯⠛⢄⢉⠤⢊⡄⢣⠾⡌⣒⡀⠵⠀⠓⠁⣉⣥⡀⎥ ⎢⢘⣯⠃⠹⡑⢐⠕⡉⠹⠅⠜⢄⠄⣰⢃⣂⡠⠣⣡⣭⠡⡵⠃⡔⡵⢏⠒⢂⠁⡛⠬⢤⠇⠨⢇⣴⠀⠸⢑⡇⎥ ⎢⢢⡀⡇⡸⣦⡁⢐⢮⡐⢋⠮⠋⢇⡥⠍⢔⠘⡔⡴⢩⢑⠲⠊⠴⠸⢀⡛⣬⢚⠢⣨⠌⠦⢠⡅⢪⡘⢔⡉⡒⎥ ⎢⠩⢡⡑⡡⡀⡀⢒⠳⠓⢹⡪⠴⣀⠝⡊⢜⡂⠕⡐⠆⣆⢃⣩⡖⣥⠠⠺⡐⠛⣤⠓⠁⡢⠩⣽⡢⣔⡱⠓⠂⎥ ⎢⠉⠯⢔⠰⠔⢌⠀⢔⠉⡳⢃⠺⠆⡠⢰⠈⠼⣂⠠⢨⢺⣝⢢⢩⠂⣇⡂⠞⠝⠀⠛⢄⢼⡊⠈⡀⢽⡅⡝⠂⎥ ⎢⠾⣂⢤⠂⠴⠌⢘⡞⡰⡰⠌⡈⢫⠂⡪⠘⡋⠐⡇⢆⣤⡈⢄⡌⡉⡁⠈⣃⡌⡊⡲⠳⡻⢎⡔⡞⠢⢵⠏⢁⎥ ⎢⠱⠚⠔⠗⢅⠊⢈⢉⠈⣠⡘⣄⢉⠝⣐⡳⡐⡘⢰⢢⠤⡐⢤⠀⢉⣵⡡⣉⠳⡻⠂⠠⣰⠭⣟⢝⠭⡐⠂⡁⎥ ⎢⢤⠨⡢⢃⢌⡋⠫⢒⡧⠷⠃⠃⢚⠆⡡⡥⣂⡠⠁⡋⡊⠘⡅⢠⣀⡀⢒⢌⢔⡹⠗⠷⢌⣆⢃⠣⠑⢄⠴⡡⎥ ⎣⠊⢁⣐⡈⠃⡩⠑⠧⠙⠤⣌⠀⡾⠒⣐⢈⡀⣶⠪⢱⠘⠰⠁⠻⠵⠴⢣⠨⠹⠀⠳⠉⠏⢁⠌⠠⠔⡣⠑⢄⎦
julia> q1 = randn(n)
1000-element Vector{Float64}: -1.2369695998058534 -1.2666670752051943 -0.29289573829909643 1.6529592747347013 -1.8751223074397303 1.262363828651884 -1.284729397458202 0.9569948867296463 0.4961300822058952 -0.2942693935622077 ⋮ -0.5293833147580319 -1.770021970936599 0.3042510737125212 -1.4106526396873504 -0.33602961572341894 -0.6036600988431233 1.4538716703845196 -1.8777370646256477 0.1857611328275104
julia> tr, Q = lanczos_reorthogonalize(A, q1; abstol=1e-5, maxiter=100)
([2.9657889183371293 1.7754173402355677 … 0.0 0.0; 1.7754173402355677 2.9500419986042696 … 0.0 0.0; … ; 0.0 0.0 … 3.0721360049643986 1.3847316029967645; 0.0 0.0 … 1.3847316029967645 3.038115671319606], [-0.040102217715173895 0.01103530067859752 … -0.011658995999047867 0.04565413551225078; -0.0410650017837899 0.02420810327087623 … 0.021602893020417145 0.00798090204580233; … ; -0.06087572450389087 0.013396296230025079 … -0.051767351885798736 0.05794927899874108; 0.006022325360975219 -0.0010478652931360482 … -0.03624677273549715 0.011920385494852529])
julia> eigen(tr)
LinearAlgebra.Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}} values: 100-element Vector{Float64}: -1.7763568394002505e-15 0.18044967673872758 0.1879911100435363 0.19923804045626348 0.21002641809604494 0.220721834778276 0.24293395027701425 0.25954840349978436 0.27736145211578833 0.3026909748246265 ⋮ 5.708619937992621 5.722692564593327 5.74564389665999 5.762236786733555 5.775235357777766 5.787975577097146 5.801444426789523 5.815213457749225 5.817003515463697 vectors: 100×100 Matrix{Float64}: 0.00305616 -0.0528768 0.007192 … 0.0625813 0.0189993 -0.00510524 0.0829551 -0.0112525 0.100439 0.0305117 0.00694642 -0.0979617 0.013202 0.127373 0.0387484 -0.00977261 0.114042 -0.015224 0.161817 0.0493204 0.0140359 -0.13268 0.0175087 0.196488 0.0600222 -0.0203347 0.154604 -0.0201405 … 0.21249 0.0650923 0.0276807 -0.164229 0.0210579 0.231325 0.0711163 -0.037928 0.170374 -0.0214187 0.247202 0.0763195 0.053584 -0.178436 0.0219004 0.256605 0.079625 -0.0819103 0.204311 -0.0244371 0.254501 0.0794719 ⋮ ⋱ -1.06427e-12 -0.000911999 -0.0417987 -0.00295513 0.00624323 7.79353e-13 0.000840948 0.0390921 -0.0022788 0.00480037 -5.24121e-13 -0.000702175 -0.0330514 -0.00194122 0.00407645 3.61616e-13 0.000584536 0.0277954 -0.00166004 0.0034772 -2.45882e-13 -0.00047504 -0.022796 … -0.00131599 0.00275008 1.59082e-13 0.000355975 0.0172049 -0.00107746 0.00224732 -1.01158e-13 -0.000258287 -0.0125602 -0.000790453 0.00164614 5.9271e-14 0.00016635 0.00812351 -0.000550779 0.00114576 -2.70149e-14 -8.06078e-5 -0.0039468 -0.000274632 0.000570939
julia> using KrylovKit
julia> eigsolve(A, q1, 2, :SR)
([-3.0686749419396537e-15, 0.1804496424288632], [[-0.031622776601683486, -0.03162277660168367, -0.03162277660168401, -0.03162277660168426, -0.03162277660168361, -0.031622776601683784, -0.03162277660168353, -0.03162277660168334, -0.03162277660168382, -0.031622776601683125 … -0.03162277660168422, -0.031622776601684166, -0.031622776601684596, -0.03162277660168385, -0.03162277660168412, -0.03162277660168273, -0.03162277660168346, -0.03162277660168467, -0.03162277660168347, -0.03162277660168446], [0.019979307988712496, 0.03590197946553924, -0.008632157432609243, -0.015472944896156202, 0.010513903112047557, 0.0318912258594335, -0.061674251329638365, -0.011216800013359438, 0.0071526123541201725, -0.03180677367745066 … -0.004663631261724895, 0.028183031322775123, 0.017273572639812492, -0.015394259618092664, -0.03361359393727493, -0.004158309689584283, -0.039588062190827805, 0.03166399739521771, 0.025014264845281194, 0.030956463936089453]], ConvergenceInfo: 2 converged values after 18 iterations and 234 applications of the linear map; norms of residuals are given by (6.142936340819385e-35, 9.336079747252914e-13). )
Notes on Lanczos
A sophisticated Lanczos implementation should consider the following aspects:
- In practice, storing all $q$ vectors is not necessary.
- Blocking technique can be used to improve the solution, especially when the matrix has degenerate eigenvalues.
- Restarting technique can be used to improve the solution without increasing the memory usage.
These techniques could be found in Ref.[Golub2013].
The Arnoldi algorithm
If $A$ is not symmetric, then the orthogonal tridiagonalization $Q^T A Q = T$ does not exist in general. The Arnoldi approach involves the column by column generation of an orthogonal $Q$ such that $Q^TAQ = H$ is a Hessenberg matrix.
\[H = \left(\begin{matrix} h_{11} & h_{12} & h_{13} & \ldots & h_{1k}\\ h_{21} & h_{22} & h_{23} & \ldots & h_{2k}\\ 0 & h_{32} & h_{33} & \ldots & h_{3k}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \ldots & h_{kk} \end{matrix}\right)\]
That is, $h_{ij} = 0$ for $i>j+1$.
function arnoldi_iteration(A::AbstractMatrix{T}, x0::AbstractVector{T}; maxiter) where T
h = Vector{T}[]
q = [normalize(x0)]
n = length(x0)
@assert size(A) == (n, n)
for k = 1:min(maxiter, n)
u = A * q[k] # generate next vector
hk = zeros(T, k+1)
for j = 1:k # subtract from new vector its components in all preceding vectors
hk[j] = q[j]' * u
u = u - hk[j] * q[j]
end
hkk = norm(u)
hk[k+1] = hkk
push!(h, hk)
if abs(hkk) < 1e-8 || k >=n # stop if matrix is reducible
break
else
push!(q, u ./ hkk)
end
end
# construct `h`
kmax = length(h)
H = zeros(T, kmax, kmax)
for k = 1:length(h)
if k == kmax
H[1:k, k] .= h[k][1:k]
else
H[1:k+1, k] .= h[k]
end
end
return H, hcat(q...)
end
arnoldi_iteration (generic function with 1 method)
julia> import SparseArrays
julia> n = 100
100
julia> A = SparseArrays.sprand(n, n, 0.1)
100×100 SparseArrays.SparseMatrixCSC{Float64, Int64} with 1010 stored entries: ⎡⢀⠁⠠⠍⠄⡔⠀⡔⠀⠄⠀⠀⢀⠄⠆⢀⡁⠀⠐⠂⢂⠀⠠⠅⠄⠰⠀⢠⠰⠡⠂⠀⡄⠂⡀⠀⠖⠠⠑⠃⎤ ⎢⠀⠄⠢⡀⠀⠀⠨⡦⠀⠁⠀⠀⠠⠌⠀⠠⠄⣉⠤⡆⢐⠀⠀⠠⠀⠀⢀⢐⡂⡀⠀⠣⠀⠀⠠⠈⠨⡀⠠⢀⎥ ⎢⠥⠉⠀⡋⠀⠢⠀⠀⠠⠂⠲⠀⢀⠁⠤⠃⠂⠥⠠⢀⠀⠠⠤⢄⠌⠔⠁⠀⠀⣀⠠⠈⠁⡅⠐⡀⢼⠐⠨⠀⎥ ⎢⠤⠤⠠⡤⢐⠀⡂⠀⠀⠄⠀⠀⠀⠄⡄⢀⠄⠤⡀⠪⢰⠘⠤⢬⠬⠀⠐⠀⠘⠂⠠⡠⠄⡀⡊⠀⢀⢀⠈⠐⎥ ⎢⠤⠀⠈⠂⡈⠈⠀⡆⠠⠂⡀⠁⡐⡩⠄⠂⠀⠄⠀⠡⠄⠀⠐⠀⣀⠀⢀⠄⢠⠀⠢⢀⢀⠀⠨⠠⠠⡀⠀⣀⎥ ⎢⠠⠀⠄⠀⠐⠁⡠⠀⠨⡥⠀⠀⠜⠆⠀⠄⠠⠅⠂⢂⠂⠀⠠⢢⠀⠰⠀⠕⠠⠀⡡⠔⠤⠄⣀⠀⠀⠀⠡⠢⎥ ⎢⠀⡀⡨⠂⠀⠀⠄⠄⡀⢀⠆⠀⠅⣉⠄⠤⠠⠀⠴⠊⠀⠄⡀⠠⠀⠀⠖⠀⠨⠀⠀⠂⢨⠉⠬⠄⡸⠄⣠⠀⎥ ⎢⠀⠢⠀⠠⠀⡄⠄⢩⠄⠥⠈⠄⡀⠤⠀⠂⠀⡀⠂⠈⠀⠈⠠⠀⠠⡰⠼⠀⠤⠂⠣⠀⢀⠂⠀⠄⠐⠈⠠⠤⎥ ⎢⠄⢀⠀⠄⠁⠄⠆⢀⠆⠀⠀⠄⠠⠐⠆⢄⠀⠅⢄⠁⠰⠀⢠⠄⠄⠀⢀⠀⢨⠀⠊⠉⢨⠀⠠⡁⠈⠑⢀⢀⎥ ⎢⠔⠐⠀⠂⡠⢆⠀⡄⠄⠥⠀⠀⠀⠀⠡⡄⡢⠀⡀⡀⠈⠂⡴⠀⠀⠀⠐⠀⠈⠁⠈⠄⠤⠀⠀⠂⡀⠀⠤⠁⎥ ⎢⠐⠀⠀⡄⠠⡂⠠⠑⠂⠈⠀⡀⠈⢄⠀⠐⠂⠁⠀⠀⠠⡀⠐⠅⠑⡡⠃⠂⠀⢀⡀⠐⢀⠒⠘⠀⢐⠀⠰⡐⎥ ⎢⠄⠀⠄⠀⢀⠒⠠⠠⠐⢆⢀⡠⡘⠃⠆⢂⠑⢀⠊⠐⢀⣧⠠⠂⢀⢂⢠⠀⠀⢑⠉⠐⢐⠢⠐⠈⠚⠀⠸⠅⎥ ⎢⠂⠃⠀⠁⠀⠀⠀⠆⠀⠀⠠⠀⠁⢈⡀⠂⠂⠂⠀⠈⠀⠀⠀⠈⠁⠀⠢⠀⠄⢒⢀⠂⠀⠤⢠⡠⠊⠂⠃⠀⎥ ⎢⠀⠡⠀⠂⠀⡆⠀⠂⢀⠀⠂⠠⠀⠈⠀⠄⡈⠐⢀⠗⠌⠂⠘⠀⠐⠐⠀⠐⠰⠔⠀⠐⠐⠐⠁⠂⠠⠀⡈⢈⎥ ⎢⢤⡠⡔⠂⠀⡀⠀⠂⠠⠂⠀⠁⠡⠅⠠⡐⠀⠀⠀⠂⠃⠁⠐⠀⠨⠒⠢⢀⣠⠁⢠⠀⠒⠊⠂⠀⠈⠰⡋⠀⎥ ⎢⠀⢁⡄⠐⠀⠂⠀⠃⠄⡑⠄⠈⠀⣀⠀⠠⠀⠀⠰⢂⠞⠀⠀⠁⠛⡀⠺⠀⢠⠂⠐⠀⠦⠀⢀⠊⠚⡊⠐⡆⎥ ⎢⡙⠢⠀⣂⠀⠅⠁⠄⠦⠀⠐⠁⠄⡈⠀⠀⢄⠀⠀⠐⠚⠔⠀⠀⠀⠀⠐⠈⠐⠈⢀⠀⠂⠐⡲⠀⠀⡀⠁⠤⎥ ⎢⠠⠈⠀⡁⠀⠑⢊⠐⠀⠊⠠⠓⢀⠀⠀⡢⠄⠙⠐⠖⠰⡑⠒⢂⠀⠀⡈⠐⠀⠀⢀⡈⠀⠀⠁⠀⠀⠂⠈⠀⎥ ⎢⠅⠴⢐⢰⡀⡀⢈⡈⢀⡀⠀⡄⠀⠀⠂⠈⠀⡂⠀⠀⠈⠀⠊⠐⠂⠆⠀⠊⡀⢀⢀⡂⠅⠀⠀⠀⠈⠀⡐⠁⎥ ⎣⠃⠁⠤⠂⠄⠀⠀⠂⠀⢃⠈⠉⠀⠈⡀⢐⠂⡂⠘⠃⢢⠘⠈⡀⡈⠀⠉⠀⠀⢀⠐⠐⠄⠀⢡⠈⠈⠂⠒⠀⎦
julia> q1 = randn(n)
100-element Vector{Float64}: 0.4073052016293537 0.32236711401488266 1.6522588982524344 1.247314982542087 -1.0899095140602673 -1.587439021118574 0.7199133561991193 0.5152322577560772 -1.268685482056492 0.26350617360718037 ⋮ 1.0212229626804834 -0.164028633391345 0.6878278046969009 0.24739946706816834 -0.7332522192387666 -0.7800814569766747 -0.22277469661617758 1.3936817703696192 0.906738510489542
julia> h, q = arnoldi_iteration(A, q1; maxiter=20)
([0.05168527786982488 0.24806090785786167 … 0.05665446676990085 0.1382866762134083; 1.9047952909080437 0.3852962179512741 … 0.28181797737677755 0.13120220415102216; … ; 0.0 0.0 … 0.25642796679567553 0.03102916873522773; 0.0 0.0 … 1.654437968899098 0.053936346557917715], [0.03945875272675507 0.0011532261022612665 … 0.022274745660617403 0.010383532379779415; 0.031230154165146795 0.007144230515953954 … -0.0644484257422316 -0.11211490571246227; … ; 0.13501655303396895 -0.022639113176934185 … -0.045026554259232125 0.10913903741836943; 0.08784265590055393 -0.08913608457584776 … -0.07537378142269126 0.07229846740958014])
julia> eigen(h).values # naive implementation
20-element Vector{ComplexF64}: -1.6221706750763243 - 0.6876678483133931im -1.6221706750763243 + 0.6876678483133931im -1.2490365974420363 - 0.9695660771414254im -1.2490365974420363 + 0.9695660771414254im -1.0571849067648915 + 0.0im -0.8274650929988611 - 1.5188019730622881im -0.8274650929988611 + 1.5188019730622881im -0.2968235812960378 - 1.606969090350307im -0.2968235812960378 + 1.606969090350307im 0.060957883536660824 - 0.22410780563130814im 0.060957883536660824 + 0.22410780563130814im 0.188236190002101 - 1.6833947916613128im 0.188236190002101 + 1.6833947916613128im 0.7484467887732884 - 1.4938838982784994im 0.7484467887732884 + 1.4938838982784994im 1.3323993430980416 - 0.7493211495931722im 1.3323993430980416 + 0.7493211495931722im 1.6711460101996893 - 0.28289680365361514im 1.6711460101996893 + 0.28289680365361514im 5.098871388405611 + 0.0im
julia> eigsolve(A, q1, 2, :LR) # KrylovKit.eigsolve
(ComplexF64[5.098871386559908 + 0.0im, 1.6912606354666253 + 0.47191503960807996im, 1.6912606354666253 - 0.47191503960807996im], Vector{ComplexF64}[[-0.041207406999359125 + 0.0im, -0.13703225499077115 + 0.0im, -0.0915967137027868 + 0.0im, -0.09292769858311827 + 0.0im, -0.10046725362973202 + 0.0im, -0.060735101365228084 + 0.0im, -0.06746970172751199 + 0.0im, -0.08623737779290654 + 0.0im, -0.04833156064320023 + 0.0im, -0.13277192681592753 + 0.0im … -0.08057039459654496 + 0.0im, -0.025458224142938756 + 0.0im, -0.040546856188677476 + 0.0im, -0.0847282158049262 + 0.0im, -0.171495479264443 + 0.0im, -0.1755161973843738 + 0.0im, -0.12365454613094831 + 0.0im, -0.04789256159209485 + 0.0im, -0.043191554535595394 + 0.0im, -0.09234009686462065 + 0.0im], [-0.02283350182174576 - 0.01240992799115352im, -0.1172869896920991 + 0.032089181331556194im, -0.1520847216690462 - 0.1263921349713837im, -0.00011369815752475905 + 0.059083550306463894im, -0.00605709163735252 + 0.132286140445927im, -0.030558413949890353 + 0.05786115784203443im, 0.038751402886713554 - 0.08525994368509413im, 0.013324272879556974 + 0.032070368003660754im, -0.03937916209289496 - 0.032443946150088276im, 0.001963014346238117 - 0.05511949222834569im … -0.03245469214281416 + 0.052503593606461334im, 0.025694952187016815 - 0.009615560790526927im, 0.005997402267854563 - 0.07539759623266132im, 0.10321850522581763 - 0.005011029181794526im, 0.0074447179439675855 - 0.09491265912640727im, 0.11746623990181894 + 0.06997313486897168im, 0.00036889566902525815 + 0.022990769901820717im, -0.04821739818914142 - 0.0019137329263338497im, 0.04904300477917253 - 0.02290682387023434im, -0.006249701780957477 - 0.020069278693814706im], [-0.02283350182174576 + 0.01240992799115352im, -0.1172869896920991 - 0.032089181331556194im, -0.1520847216690462 + 0.1263921349713837im, -0.00011369815752475905 - 0.059083550306463894im, -0.00605709163735252 - 0.132286140445927im, -0.030558413949890353 - 0.05786115784203443im, 0.038751402886713554 + 0.08525994368509413im, 0.013324272879556974 - 0.032070368003660754im, -0.03937916209289496 + 0.032443946150088276im, 0.001963014346238117 + 0.05511949222834569im … -0.03245469214281416 - 0.052503593606461334im, 0.025694952187016815 + 0.009615560790526927im, 0.005997402267854563 + 0.07539759623266132im, 0.10321850522581763 + 0.005011029181794526im, 0.0074447179439675855 + 0.09491265912640727im, 0.11746623990181894 - 0.06997313486897168im, 0.00036889566902525815 - 0.022990769901820717im, -0.04821739818914142 + 0.0019137329263338497im, 0.04904300477917253 + 0.02290682387023434im, -0.006249701780957477 + 0.020069278693814706im]], ConvergenceInfo: 3 converged values after 9 iterations and 120 applications of the linear map; norms of residuals are given by (0.0, 1.665693259973868e-14, 1.665693259973868e-14). )
Graphs
A graph is a pair $G = (V, E)$, where $V$ is a set of vertices and $E$ is a set of edges. In Julia, the package Graphs.jl
provides a simple graph data structure. The following code creates a simple graph with 10 vertices.
julia> using Graphs
julia> g = SimpleGraph(10) # create an empty graph with 10 vertices
{10, 0} undirected simple Int64 graph
julia> add_vertex!(g) # add a vertex
true
julia> add_edge!(g, 3, 11) # add an edge between vertex 3 and 11
true
julia> has_edge(g, 3, 11) # check if there is an edge between vertex 3 and 11
true
julia> rem_vertex!(g, 7) # remove vertex 7
true
julia> has_edge(g, 3, 11)
false
julia> has_edge(g, 3, 7) # vertex number 11 "renamed" to vertex number 7
true
julia> neighbors(g, 3) # get the neighbors of vertex 3
1-element Vector{Int64}: 7
A graph can be represented by an adjacency matrix $A \in \mathbb{R}^{n \times n}$, where $n$ is the number of vertices. The element $A_{ij}$ is 1 if there is an edge between vertex $i$ and vertex $j$, and 0 otherwise.
For example, the adjacency matrix of the Petersen graph is
julia> using Graphs
julia> graph = smallgraph(:petersen)
{10, 15} undirected simple Int64 graph
julia> adj_matrix = adjacency_matrix(graph)
10×10 SparseArrays.SparseMatrixCSC{Int64, Int64} with 30 stored entries: ⋅ 1 ⋅ ⋅ 1 1 ⋅ ⋅ ⋅ ⋅ 1 ⋅ 1 ⋅ ⋅ ⋅ 1 ⋅ ⋅ ⋅ ⋅ 1 ⋅ 1 ⋅ ⋅ ⋅ 1 ⋅ ⋅ ⋅ ⋅ 1 ⋅ 1 ⋅ ⋅ ⋅ 1 ⋅ 1 ⋅ ⋅ 1 ⋅ ⋅ ⋅ ⋅ ⋅ 1 1 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1 1 ⋅ ⋅ 1 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1 1 ⋅ ⋅ 1 ⋅ ⋅ 1 ⋅ ⋅ ⋅ 1 ⋅ ⋅ ⋅ 1 ⋅ 1 1 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1 ⋅ 1 1 ⋅ ⋅
The Laplacian matrix $L_{n\times n}$ of a graph $G$ is defined as $L = D - A$, where $D$ is the degree matrix of the graph. The degree matrix is a diagonal matrix, where the diagonal element $D_{ii}$ is the degree of vertex $i$. The Laplacian matrix is symmetric and positive semidefinite.
julia> lap_matrix = laplacian_matrix(graph)
10×10 SparseArrays.SparseMatrixCSC{Int64, Int64} with 40 stored entries: 3 -1 ⋅ ⋅ -1 -1 ⋅ ⋅ ⋅ ⋅ -1 3 -1 ⋅ ⋅ ⋅ -1 ⋅ ⋅ ⋅ ⋅ -1 3 -1 ⋅ ⋅ ⋅ -1 ⋅ ⋅ ⋅ ⋅ -1 3 -1 ⋅ ⋅ ⋅ -1 ⋅ -1 ⋅ ⋅ -1 3 ⋅ ⋅ ⋅ ⋅ -1 -1 ⋅ ⋅ ⋅ ⋅ 3 ⋅ -1 -1 ⋅ ⋅ -1 ⋅ ⋅ ⋅ ⋅ 3 ⋅ -1 -1 ⋅ ⋅ -1 ⋅ ⋅ -1 ⋅ 3 ⋅ -1 ⋅ ⋅ ⋅ -1 ⋅ -1 -1 ⋅ 3 ⋅ ⋅ ⋅ ⋅ ⋅ -1 ⋅ -1 -1 ⋅ 3
The spectral graph theory
Theorem: The number of connected components in the graph is the dimension of the nullspace of the Laplacian and the algebraic multiplicity of the 0 eigenvalue.
julia> graphsize = 1000
1000
julia> graph = random_regular_graph(graphsize, 3)
{1000, 1500} undirected simple Int64 graph
julia> lmat = laplacian_matrix(graph)
1000×1000 SparseArrays.SparseMatrixCSC{Int64, Int64} with 4000 stored entries: ⎡⣻⢞⡤⡶⡑⡖⡠⢔⡐⡸⣨⣀⣤⠡⣰⠑⡈⠜⠈⡐⠢⠡⣒⢂⠢⠱⢴⠅⡄⢴⡙⢔⠀⠔⡴⠐⠎⡣⠠⠂⎤ ⎢⢠⡯⠱⣦⢣⠢⢃⡁⠍⢣⠘⡡⠃⣋⢎⣨⢙⣏⡧⢀⠓⠔⢠⠰⣆⡠⠏⢞⡐⠭⢒⢅⠊⠒⠀⠦⣠⠰⠱⢁⎥ ⎢⢱⠬⠩⡒⣱⢞⠪⢀⡪⠻⡉⠀⠠⠌⢐⣈⡨⠄⢊⢉⢛⣕⠉⠗⣄⣈⡠⠒⡩⠈⢵⢢⡮⠇⣨⠦⡂⠒⠑⢐⎥ ⎢⢀⢎⠍⠰⠊⢂⠟⣥⢁⢹⣛⣮⡥⣰⡤⡄⣆⡀⡰⡀⠊⠇⢟⠠⣈⠝⢖⠈⢁⢂⢅⠰⡋⠉⢔⣣⠁⠚⣳⡲⎥ ⎢⣐⡨⠧⣁⣮⡊⣅⣐⡑⢌⡠⡦⢚⡮⠊⢹⠙⠀⣭⠐⡈⠱⢿⠄⢀⠂⠢⠂⣓⠣⢈⢙⢖⣕⢎⢸⡼⢗⢠⠱⎥ ⎢⠂⢺⠖⡠⠃⠈⡻⣼⠠⡮⠑⢄⢄⡂⢆⢙⣠⡊⢏⡦⢌⢀⠗⠛⠢⢐⣚⡌⡈⡝⠉⠋⢴⠉⡱⠃⠞⢼⣁⡢⎥ ⎢⠄⡛⡭⢠⡀⠆⢁⣫⡺⡴⠠⠱⠵⢇⢝⢅⠑⠠⡡⡌⠑⣊⡡⠦⠂⣮⠰⠛⡰⠪⡖⢂⣘⠰⣁⢅⠱⠙⢃⡢⎥ ⎢⢔⠚⡊⣱⡐⢰⠀⠯⣎⣀⣌⢑⠗⢕⡻⢎⠒⡐⡐⢈⠖⣼⠀⡙⠤⢄⡄⠕⠰⢕⠕⠡⣭⡉⣂⠊⡡⣂⠉⡤⎥ ⎢⣂⠌⡷⢴⠂⠎⠈⠹⠓⠀⡠⠺⠑⡀⢘⠠⢱⢖⢒⢊⡆⣫⠁⠌⢄⠥⢪⡦⠀⡰⣾⡕⢀⢈⠆⢼⡱⣒⡑⠏⎥ ⎢⢂⠠⠉⢋⡎⢐⠐⠪⢃⠛⠫⡵⡁⠮⡐⢈⡸⢐⡛⢌⠰⢎⡒⡇⡆⠻⡔⣎⡈⢰⡈⠉⠵⡵⣴⡊⡠⠴⡅⡑⎥ ⎢⠌⡂⢙⠄⢟⢴⠮⠄⢆⡈⠂⢑⡱⢠⣘⣥⡬⣩⡰⢆⠵⢇⠪⡋⣗⢄⠔⣁⡄⠀⣃⡕⡠⠀⠣⠒⠹⢏⢋⢒⎥ ⎢⠸⢘⢀⡒⢧⠄⠛⡑⠛⠗⣽⠁⠡⡎⣄⠠⡁⠄⠼⠬⡮⠢⣑⣼⠭⡖⠣⡨⠡⡐⢒⡃⡄⡂⡪⡁⠵⡓⣐⡂⎥ ⎢⢌⡂⠈⡹⡀⢹⣆⠜⠠⠐⢈⢂⡨⣤⠀⢇⠄⡕⣬⡉⠙⢝⢣⠧⠕⢅⠠⢠⡸⠴⢇⡠⠸⣲⡣⢉⡐⣂⠣⠗⎥ ⎢⠔⠗⣫⢅⢠⠊⡘⠑⠨⠂⡚⠼⣴⠂⢄⠍⠪⡶⡰⢭⠔⢡⡉⡢⠀⣂⡱⢎⡲⡠⡂⠶⡹⢅⣀⠑⣝⢉⠛⢁⎥ ⎢⢀⣍⡔⡌⡃⠊⠡⢐⠽⡘⣆⠬⡰⡊⢔⢆⢀⡠⢂⣈⠀⠉⢁⠢⢒⡎⠘⡪⢕⣵⠀⢆⠕⠯⢘⡞⠪⣹⢧⠉⎥ ⎢⢓⢌⠜⢔⠱⣓⢁⡑⣆⢐⡧⠀⠸⢉⠕⡁⢞⠿⡆⠈⢍⠼⠼⠰⠉⡱⢨⡌⠠⢄⢕⢕⠑⢲⣆⣀⢌⣊⠁⠂⎥ ⎢⢀⠄⢪⠀⠮⠏⡏⠈⢜⢵⡔⠓⢒⡘⡇⠻⡀⢐⢕⡧⠀⠊⠠⠩⢲⣢⠗⢎⡵⡅⢱⣀⡑⢌⡁⢪⣂⡐⢴⠌⎥ ⎢⢐⠋⠠⡄⠢⡞⠴⣱⣊⣑⠵⠊⠅⢜⡨⠘⣈⣅⡰⠻⢩⠂⠎⠪⡍⢊⢄⠘⣲⠴⠈⢹⡡⣈⢛⣴⡒⠠⢠⠂⎥ ⎢⠮⡡⢀⡚⢨⠈⣡⠀⢶⢏⣚⣅⣕⠂⠡⢪⢱⢪⢀⡎⡷⢆⢵⠣⠰⢨⡗⢙⣎⣢⡢⢱⢈⠸⠘⡈⠿⢇⠤⠂⎥ ⎣⠠⠂⠕⢂⢑⢀⢹⡺⢄⡒⠡⡸⠩⡰⠃⡤⡵⠌⢅⠩⢫⢐⠰⠸⢭⠆⠟⢀⡍⠓⠡⠀⡐⠗⠠⠒⠠⠃⡵⢏⎦
julia> q1 = randn(graphsize)
1000-element Vector{Float64}: -1.330851710758853 0.6234582293831458 -0.07451414050782365 0.8472948362257049 1.3190564128510562 -1.4744904670365129 0.3828546389114204 -0.9239322037144979 0.13345085408129645 1.4886408228338743 ⋮ 3.0383552136606786 3.0535649576322372 -0.1367428932375742 -1.3279734333380295 0.23919760707775284 0.4701068886178393 1.1741518933164747 0.9473414468911898 -1.3449764143747989
julia> tri, Q = lanczos(lmat, q1; abstol=1e-8, maxiter=100)
([3.0422743025969576 1.7541725188415855 … 0.0 0.0; 1.7541725188415855 2.940157253390522 … 0.0 0.0; … ; 0.0 0.0 … 2.8609092743605204 1.4047262130142124; 0.0 0.0 … 1.4047262130142124 3.132165569924071], [-0.04395204399992074 0.03494691082785471 … -0.002056493844011692 0.03429250971796982; 0.02059002014156476 -0.026864761496176302 … 0.008445322155381612 -0.0023674018803809317; … ; 0.03128642554245834 -0.01643189775763376 … 0.026775784614088426 -0.017535303681653135; -0.044418519407958434 0.009789210792788668 … -0.006743734843746391 0.004424084805691158])
julia> -eigen(-tri).values # the eigenvalues of the tridiagonal matrix
100-element Vector{Float64}: 5.82083757905824 5.817681110313308 5.805444133480869 5.796116450630306 5.785621775095119 5.775273729158799 5.756023174332008 5.735848068163851 5.714842190602474 5.686347519138362 ⋮ 0.27933879929992855 0.25976371013366073 0.24421611256384423 0.2226975196331331 0.20668728449933482 0.20013555637592617 0.1886355351186726 0.1812956918567714 4.440892098500626e-15
julia> Q' * Q # the orthogonality of the Krylov vectors
100×100 Matrix{Float64}: 1.0 1.10242e-15 4.77049e-16 … -5.90726e-5 8.13467e-5 1.10242e-15 1.0 -1.0797e-15 0.00010245 -0.00014108 4.77049e-16 -1.0797e-15 1.0 -0.000140715 0.000193773 -2.44076e-15 4.57967e-16 7.49401e-16 0.000196095 -0.000270035 6.245e-17 -1.15533e-15 3.67761e-16 -0.000266782 0.000367376 1.079e-15 5.24754e-17 -2.57086e-15 … 0.000367437 -0.000505984 3.29597e-17 -1.92988e-17 -3.81639e-17 -0.000512916 0.000706318 -1.0339e-15 1.73472e-17 1.1692e-15 0.000676502 -0.000931587 -3.81639e-17 -9.19403e-17 1.76942e-16 -0.000907241 0.00124933 8.56953e-16 1.50921e-16 -1.0842e-15 0.00112165 -0.00154459 ⋮ ⋱ 3.60136e-6 -6.24586e-6 8.57867e-6 -1.78503e-15 2.74412e-16 -5.29542e-6 9.18388e-6 -1.2614e-5 3.72966e-16 8.88178e-16 7.95846e-6 -1.38024e-5 1.89576e-5 2.1155e-15 6.10623e-16 -1.18623e-5 2.05728e-5 -2.82567e-5 3.747e-16 -9.15934e-16 1.83318e-5 -3.17929e-5 4.36675e-5 … -3.69279e-15 2.23779e-16 -2.59146e-5 4.49438e-5 -6.17302e-5 1.29584e-15 -7.88432e-16 3.88527e-5 -6.73825e-5 9.25497e-5 1.18135e-15 1.5786e-15 -5.90726e-5 0.00010245 -0.000140715 1.0 -1.94289e-15 8.13467e-5 -0.00014108 0.000193773 -1.94289e-15 1.0
julia> eigsolve(lmat, q1, 2, :SR) # using function `KrylovKit.eigsolve`
([3.184786581633357e-15, 0.1812956366242269], [[-0.03162277660168359, -0.03162277660168404, -0.0316227766016839, -0.031622776601683715, -0.03162277660168395, -0.03162277660168373, -0.03162277660168402, -0.03162277660168373, -0.03162277660168361, -0.031622776601684104 … -0.0316227766016838, -0.03162277660168386, -0.03162277660168385, -0.031622776601684166, -0.03162277660168354, -0.0316227766016838, -0.031622776601684006, -0.031622776601683944, -0.03162277660168393, -0.031622776601683514], [-0.04772174780085878, 0.029070656092814534, -0.01891898021217681, 0.008808591815953873, -0.006224132933757641, -0.034911383043211695, 0.04543469971977188, -0.010949767379978522, -0.03207822828586729, 0.006746109881164105 … 0.005702993390025274, 0.03880631032234341, -0.03568359402652036, 0.03897322841787237, -0.0318486478572642, 0.003126947331469102, -0.014802197887037512, 0.04357915377633621, 0.027762705704699447, -0.04076192997862257]], ConvergenceInfo: 2 converged values after 18 iterations and 234 applications of the linear map; norms of residuals are given by (4.799862496008003e-36, 5.739969342015196e-13). )
NOTE: with larger graph_size
, you should see some "ghost" eigenvalues
Graph layout and clustering
Given a graph, we can use the spectral graph theory to detect the number of connected components and the clustering of the graph. What if we are interested in the clustering of the graph? The spectral clustering algorithm is a popular method to partition a graph into clusters[Ng2001]. The algorithm is as follows:
Given a set of points $S = \{s_1, \ldots,s_n\}$ in $\mathbb{R^l}$ that we want to cluster into k subsets:
- Form the affinity matrix $A \in \mathbb{R}^{n\times n}$ defined by $A_{ij} = \exp(-\|s_i -s_j\|_2/2\sigma^2)$ if $i \neq j$, and $A_{ii} = 0$.
- Define $D$ to be the diagonal matrix whose $(i, i)$-element is the sum of $A$'s $i$-th row, and construct the matrix $L = D^{-1/2}AD^{-1/2}$.
- Find $x_1 , x_2, \ldots ,x_k$, the $k$ largest eigenvectors of $L$ (chosen to be orthogonal to each other in the case of repeated eigenvalues), and form the matrix $X = [x_1x_2\ldots x_k] \in \mathbb{R}^{n\times n}$ by stacking the eigenvectors in columns.
- Form the matrix $Y$ from $X$ by renormalizing each of $X$'s rows to have unit length (i.e. $Y_{ij} = X_{ij}/(\sum_j X_{i,j}^2)^{1/2}$).
- Treating each row of $Y$ as a point in $\mathbb{R}^k$, cluster them into $k$ clusters via K-means or any other algorithm (that attempts to minimize distortion).
- Finally, assign the original point $S_i$ to cluster $j$ if and only if row $i$ of the matrix $Y$ was assigned to cluster $j$.
Here, the scaling parameter $\sigma^2$ controls how rapidly the affinity $A_{ij}$ falls off with the distance between $s_i$ and $s_j$, and we will later describe a method for choosing it automatically.
For an implementation of the spectral clustering algorithm, please check the demo.