Fast Fourier transform

Definition

Given a function $f(x)$ defined on $x \in \mathbb{C}$, the Fourier transformation is defined as

\[g(u) = \int_{-\infty}^{\infty} e^{-2\pi iux} f(x) dx.\]

The space of $u$ is called the momentum space, and the space of $x$ is called the position space. Its inverse process, or the inverse Fourier transformation is defined as

\[f(x) = \int_{-\infty}^{\infty} e^{2\pi iux} g(u) dk\]

The two-dimensional Fourier transformation and its inverse transformation are defined as

\[\begin{align*} &g(u, v) = \int_{-\infty}^{\infty}dy\int_{-\infty}^\infty e^{-2\pi i(ux+vy)} f(x, y) dx,\\ &f(x, y) = \int_{-\infty}^{\infty}du\int_{-\infty}^\infty e^{2\pi i(ux+vy)} g(u, v) dv. \end{align*}\]

Fourier transformation is widely used in many fields, including

  • Image and audio processing: YouTube: Image Compression and the FFT, Steve Brunton
  • Solid state physics: Kittel, Charles, and Paul McEuen. Introduction to solid state physics. John Wiley & Sons, 2018.
  • Quantum computing: Nielsen, Michael A., and Isaac L. Chuang. Quantum computation and quantum information. Cambridge university press, 2010.
  • Fourier optics: Goodman, Joseph W. Introduction to Fourier optics. Roberts and Company publishers, 2005.

Discrete Fourier Transformation (DFT)

Let $x$ be a vector of length $n$, the DFT of $x$ is defined as

\[y_{i}=\sum_{n=0}^{n-1}x_{j}\cdot e^{-{\frac {i2\pi }{n}}ij}\]

Since this transformation is linear, we can represent it as a matrix multiplication. Let $F_n$ be the matrix of size $n \times n$ defined as

\[F_n = \left( \begin{matrix} 1 & 1 & 1 & \ldots & 1\\ 1 & \omega & \omega^2 & \ldots & \omega^{n-1}\\ 1 & \omega^2 & \omega^4 & \ldots & \omega^{2n-2}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{n-1} & \omega^{2n-2} & \ldots & \omega^{(n-1)^2}\\ \end{matrix} \right)\]

where $\omega = e^{-2\pi i/n}$. This matrix is called the DFT matrix, and the DFT of $x$ is represented as $F_n x$. The inverse transformation is defined as $F_n^\dagger x/n$, i.e. $F_n F_n^\dagger = I$.

using Test, LinearAlgebra

function dft_matrix(n::Int)
    ω = exp(-2π*im/n)
    return [ω^((i-1)*(j-1)) for i=1:n, j=1:n]
end
dft_matrix (generic function with 1 method)
julia> n = 33
julia> Fn = dft_matrix(n)3×3 Matrix{ComplexF64}: 1.0+0.0im 1.0+0.0im 1.0+0.0im 1.0+0.0im -0.5-0.866025im -0.5+0.866025im 1.0+0.0im -0.5+0.866025im -0.5-0.866025im
julia> dft_matrix(n) * dft_matrix(n)' ./ n3×3 Matrix{ComplexF64}: 1.0+0.0im … 1.4803e-16+1.85037e-16im -3.70074e-17-1.11022e-16im 0.0+1.11022e-16im 1.4803e-16-1.85037e-16im 1.0+0.0im

The Cooley–Tukey's Fast Fourier transformation (FFT)

We have a recursive algorithm to compute the DFT.

\[F_n x = \left(\begin{matrix}I_{n/2} & D_{n/2}\\I_{n/2} & -D_{n/2} \end{matrix}\right)\left(\begin{matrix} F_{n/2} & 0 \\ 0 & F_{n/2}\end{matrix}\right)\left(\begin{matrix}x_{\rm odd}\\x_{\rm even}\end{matrix}\right)\]

where $D_n = {\rm diag}(1, \omega, \omega^2, \ldots, \omega^{n-1})$.

Quiz

What is the computational complexity of evaluating $F_n x$? Hint: $T(n) = 2 T(n/2) + O(n)$.

julia> using SparseArrays
julia> @testset "fft decomposition" begin n = 4 Fn = dft_matrix(n) F2n = dft_matrix(2n) # the permutation matrix to permute elements at 1:2:n (odd) to 1:n÷2 (top half) pm = sparse([iseven(j) ? (j÷2+n) : (j+1)÷2 for j=1:2n], 1:2n, ones(2n), 2n, 2n) # construct the D matrix ω = exp(-π*im/n) d1 = Diagonal([ω^(i-1) for i=1:n]) # construct F_{2n} from F_n F2n_ = [Fn d1 * Fn; Fn -d1 * Fn] @test F2n * pm' ≈ F2n_ endTest Summary: | Pass Total Time fft decomposition | 1 1 1.4s Test.DefaultTestSet("fft decomposition", Any[], 1, false, false, true, 1.714382294071963e9, 1.714382295510516e9, false, "REPL[2]")

We implement the $O(n\log(n))$ time Cooley-Tukey FFT algorithm.

function fft!(x::AbstractVector{T}) where T
    N = length(x)
    @inbounds if N <= 1
        return x
    end

    # divide
    odd  = x[1:2:N]
    even = x[2:2:N]

    # conquer
    fft!(odd)
    fft!(even)

    # combine
    @inbounds for i=1:N÷2
       t = exp(T(-2im*π*(i-1)/N)) * even[i]
       oi = odd[i]
       x[i]     = oi + t
       x[i+N÷2] = oi - t
    end
    return x
end
fft! (generic function with 1 method)
julia> @testset "fft" begin
           x = randn(ComplexF64, 8)
           @test fft!(copy(x)) ≈ dft_matrix(8) * x
       endTest Summary: | Pass  Total  Time
fft           |    1      1  0.3s
Test.DefaultTestSet("fft", Any[], 1, false, false, true, 1.714382295646515e9, 1.714382295933599e9, false, "REPL[1]")

The Julia package FFTW.jl contains a superfast FFT implementation.

julia> using FFTWWARNING: using FFTW.fft! in module Main conflicts with an existing identifier.
julia> @testset "fft" begin x = randn(ComplexF64, 8) @test FFTW.fft(copy(x)) ≈ dft_matrix(8) * x endTest Summary: | Pass Total Time fft | 1 1 0.0s Test.DefaultTestSet("fft", Any[], 1, false, false, true, 1.714382295938509e9, 1.714382295941608e9, false, "REPL[2]")

Application 1: Fast polynomial multiplication

Given two polynomials $p(x)$ and $q(x)$

\[\begin{align*} &p(x) = \sum_{k=0}^{n-1} a_k x^k\\ &q(x) = \sum_{k=0}^{n-1} b_k x^k \end{align*}\]

The multiplication of them is defined as

\[p(x)q(x) = \sum_{k=0}^{2n-2} c_k x^{k}\]

Fourier transformation can be used to compute the product of two polynomials in $O(n \log n)$ time, which is much faster than the naive algorithm that takes $O(n^2)$ time.

Algorithm: Fast polynomial multiplication
  1. Evaluate $p(x)$ and $q(x)$ at $2n$ points $ω^0, \ldots , ω^{2n−1}$ using DFT. This step takes time $O(n \log n)$.

  2. Obtain the values of $p(x)q(x)$ at these 2n points through pointwise multiplication

\[\begin{align*} (p \circ q)(ω^0) &= p(ω^0) q(ω^0), \\ (p \circ q)(ω^1) &= p(ω^1) q(ω^1),\\ &\vdots\\ (p \circ q)(ω^{2n−1}) &= p(ω^{2n−1}) q(ω^{2n−1}). \end{align*}\]

This step takes time $O(n)$.

  1. Interpolate the polynomial $p \circ q$ at the product values using inverse DFT to obtain coefficients $c_0, c_1, \ldots, c_{2n−2}$. This last step requires time $O(n \log n)$.

We can also use FFT to compute the convolution of two vectors $a = (a_0,\ldots , a_{n−1})$ and $b = (b_0, \ldots , b_{n−1})$, which is defined as a vector $c = (c_0, \ldots , c_{n−1})$ where

\[c_j = \sum^j_{k=0} a_kb_{j−k}, ~~~~~~ j = 0,\ldots, n − 1.\]

The running time is again $O(n \log n)$.

In the following example, we use the Polynomials package to define the polynomial and use the FFT algorithm to compute the product of two polynomials.

julia> using Polynomials
julia> p = Polynomial([1, 3, 2, 5, 6])Polynomial(1 + 3*x + 2*x^2 + 5*x^3 + 6*x^4)
julia> q = Polynomial([3, 1, 6, 2, 2])Polynomial(3 + x + 6*x^2 + 2*x^3 + 2*x^4)

Step 1: evaluate $p(x)$ at $2n-1$ different points.

julia> pvals = fft(vcat(p.coeffs, zeros(4)))9-element Vector{ComplexF64}:
               17.0 + 0.0im
 -4.492726040024655 - 10.28022621396024im
 1.7378259501428421 + 4.548389131353467im
                0.5 - 6.06217782649107im
 -1.745099910118187 + 1.8382342885471272im
 -1.745099910118187 - 1.8382342885471272im
                0.5 + 6.06217782649107im
 1.7378259501428421 - 4.548389131353467im
 -4.492726040024655 + 10.28022621396024im

which is equivalent to computing:

julia> n = 55
julia> ω = exp(-2π*im/(2n-1))0.766044443118978 - 0.6427876096865393im
julia> map(k->p(ω^k), 0:(2n-1))10-element Vector{ComplexF64}: 17.0 + 0.0im -4.492726040024651 - 10.28022621396024im 1.7378259501428397 + 4.5483891313534635im 0.5000000000000016 - 6.062177826491066im -1.7450999101181899 + 1.838234288547124im -1.7450999101181837 - 1.8382342885471246im 0.4999999999999951 + 6.0621778264910615im 1.737825950142845 - 4.548389131353452im -4.4927260400246505 + 10.28022621396022im 16.999999999999975 + 1.532107773982714e-14im

The same for $q(x)$.

julia> qvals = fft(vcat(q.coeffs, zeros(4)))9-element Vector{ComplexF64}:
                14.0 + 0.0im
  1.9285482675487435 - 8.967725221980002im
 -1.9324186608105642 - 0.01930258602426438im
                 0.5 + 2.598076211353316im
  6.0038703932618205 + 3.752270213249106im
  6.0038703932618205 - 3.752270213249106im
                 0.5 - 2.598076211353316im
 -1.9324186608105642 + 0.01930258602426438im
  1.9285482675487435 + 8.967725221980002im

Step 2: Compute $p(x) q(x)$ at $2n-1$ points.

julia> pqvals = pvals .* qvals9-element Vector{ComplexF64}:
               238.0 + 0.0im
 -100.85468292765191 + 20.463620169633234im
  -3.270411622817098 - 8.822936568953224im
                16.0 - 1.7320508075688772im
 -17.374905449530996 + 4.488434009006639im
 -17.374905449530996 - 4.488434009006639im
                16.0 + 1.7320508075688772im
  -3.270411622817098 + 8.822936568953224im
 -100.85468292765191 - 20.463620169633234im
julia> ifft(pqvals)9-element Vector{ComplexF64}: 3.0 + 0.0im 10.000000000000002 + 0.0im 15.000000000000002 - 5.950848353424417e-16im 37.0 + 0.0im 43.0 - 6.837200662697115e-16im 46.0 + 6.837200662697115e-16im 50.0 + 0.0im 21.999999999999996 + 7.441926703969614e-16im 11.999999999999998 + 0.0im

Summarize:

function fast_polymul(p::AbstractVector, q::AbstractVector)
    pvals = fft(vcat(p, zeros(length(q)-1)))
    qvals = fft(vcat(q, zeros(length(p)-1)))
    pqvals = pvals .* qvals
    return real.(ifft(pqvals))
end

function fast_polymul(p::Polynomial, q::Polynomial)
    Polynomial(fast_polymul(p.coeffs, q.coeffs))
end
fast_polymul (generic function with 2 methods)

A similar algorithm has already been implemented in package Polynomials. One can easily verify the correctness.

julia> p * qPolynomial(3 + 10*x + 15*x^2 + 37*x^3 + 43*x^4 + 46*x^5 + 50*x^6 + 22*x^7 + 12*x^8)
julia> fast_polymul(p, q)Polynomial(3.0 + 10.000000000000002*x + 15.000000000000002*x^2 + 37.0*x^3 + 43.0*x^4 + 46.0*x^5 + 50.0*x^6 + 21.999999999999996*x^7 + 11.999999999999998*x^8)

Application 2: Image compression

If you google the logo of the Hong Kong University of Science and Technology, you will probably find the following png of size $2000 \times 3000$.

using Images
img = Images.load("../assets/images/hkust-gz.png")
Example block output

It is too large! We can compress it with the Fourier transformation algorithm. To simplify the discussion, let us using the gray scale image.

gray_image = Gray.(img)
Example block output

The gray scale image uses 8-bit fixed point numbers as the pixel storage type.

julia> typeof(gray_image)Matrix{Gray{N0f8}} (alias for Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8, 8}}, 2})
julia> img_data = Float32.(gray_image)2000×3000 Matrix{Float32}: 0.376471 0.376471 0.376471 0.376471 … 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 … 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 ⋮ ⋱ 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 … 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471 0.376471
julia> img_data_k = fftshift(fft(img_data))2000×3000 Matrix{ComplexF32}: 14.1177-9.69074f-7im -16.5889+1.66456im … -16.5889-1.66456im -12.1824-10.0836im 19.6015+5.63716im 10.0119+14.3428im 7.55086+13.5317im -17.3148-3.91325im -2.42017-22.694im -3.00379-9.20718im 11.248-1.95294im -4.79038+19.1743im 1.09106+1.94913im -5.93928+3.75958im 11.0074-5.10588im -2.26381+2.12877im 5.33122+2.14914im … -14.9784-10.0922im 4.27849-0.864077im -8.88375-11.6353im 14.901+16.4903im -3.59109-2.92389im 11.5189+16.8775im -10.5282-11.4876im -1.89706+4.89608im -8.13688-13.6709im 4.82196+1.34339im 10.6636-3.25829im -1.22232+4.9903im -2.45761+4.52942im ⋮ ⋱ 10.6636+3.25828im -2.45761-4.52942im -1.22232-4.9903im -1.89706-4.89609im 4.82196-1.34339im -8.13687+13.6709im -3.5911+2.9239im -10.5282+11.4876im 11.5189-16.8775im 4.27848+0.864073im 14.901-16.4903im -8.88375+11.6353im -2.26382-2.12877im -14.9784+10.0922im … 5.33123-2.14914im 1.09106-1.94913im 11.0074+5.10589im -5.93928-3.75958im -3.0038+9.20717im -4.79038-19.1743im 11.248+1.95294im 7.55085-13.5317im -2.42017+22.694im -17.3148+3.91326im -12.1824+10.0836im 10.0119-14.3429im 19.6015-5.63716im

it is sparse!

Gray.(abs2.(img_data_k) ./ length(img_data_k))
Example block output

We can store it in the sparse matrix format.

# let us discard all variables smaller than 1e-5
img_data_k[abs.(img_data_k) .< 1e-5] .= 0
sparse_img = sparse(img_data_k)
compression_ratio = nnz(sparse_img) / (2000 * 3000)
recovered_img = ifft(fftshift(Matrix(sparse_img)))
Gray.(abs.(recovered_img))
Example block output