Fast Fourier transform

Definition

Given a function $f(x)$ defined on $x \in \mathbb{C}$, the Fourier transformation is defined as

\[g(u) = \int_{-\infty}^{\infty} e^{-2\pi iux} f(x) dx.\]

The space of $u$ is called the momentum space, and the space of $x$ is called the position space. Its inverse process, or the inverse Fourier transformation is defined as

\[f(x) = \int_{-\infty}^{\infty} e^{2\pi iux} g(u) dk\]

The two-dimensional Fourier transformation and its inverse transformation are defined as

\[\begin{align*} &g(u, v) = \int_{-\infty}^{\infty}dy\int_{-\infty}^\infty e^{-2\pi i(ux+vy)} f(x, y) dx,\\ &f(x, y) = \int_{-\infty}^{\infty}du\int_{-\infty}^\infty e^{2\pi i(ux+vy)} g(u, v) dv. \end{align*}\]

Fourier transformation is widely used in many fields, including

Image and audio processing: YouTube: Image Compression and the FFT, Steve Brunton
Solid state physics: Kittel, Charles, and Paul McEuen. Introduction to solid state physics. John Wiley & Sons, 2018.
Quantum computing: Nielsen, Michael A., and Isaac L. Chuang. Quantum computation and quantum information. Cambridge university press, 2010.
Fourier optics: Goodman, Joseph W. Introduction to Fourier optics. Roberts and Company publishers, 2005.

Discrete Fourier Transformation (DFT)

Let $x$ be a vector of length $n$, the DFT of $x$ is defined as

\[y_{i}=\sum_{n=0}^{n-1}x_{j}\cdot e^{-{\frac {i2\pi }{n}}ij}\]

Since this transformation is linear, we can represent it as a matrix multiplication. Let $F_n$ be the matrix of size $n \times n$ defined as

\[F_n = \left( \begin{matrix} 1 & 1 & 1 & \ldots & 1\\ 1 & \omega & \omega^2 & \ldots & \omega^{n-1}\\ 1 & \omega^2 & \omega^4 & \ldots & \omega^{2n-2}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{n-1} & \omega^{2n-2} & \ldots & \omega^{(n-1)^2}\\ \end{matrix} \right)\]

where $\omega = e^{-2\pi i/n}$. This matrix is called the DFT matrix, and the DFT of $x$ is represented as $F_n x$. The inverse transformation is defined as $F_n^\dagger x/n$, i.e. $F_n F_n^\dagger = I$.

using Test, LinearAlgebra

function dft_matrix(n::Int)
    ω = exp(-2π*im/n)
    return [ω^((i-1)*(j-1)) for i=1:n, j=1:n]
end

dft_matrix (generic function with 1 method)

julia> n = 33
julia> Fn = dft_matrix(n)3×3 Matrix{ComplexF64}:
 1.0+0.0im   1.0+0.0im        1.0+0.0im
 1.0+0.0im  -0.5-0.866025im  -0.5+0.866025im
 1.0+0.0im  -0.5+0.866025im  -0.5-0.866025im
julia> dft_matrix(n) * dft_matrix(n)' ./ n3×3 Matrix{ComplexF64}:
          1.0+0.0im          …  1.4803e-16+1.85037e-16im
 -3.70074e-17-1.11022e-16im            0.0+1.11022e-16im
   1.4803e-16-1.85037e-16im            1.0+0.0im

The Cooley–Tukey's Fast Fourier transformation (FFT)

We have a recursive algorithm to compute the DFT.

\[F_n x = \left(\begin{matrix}I_{n/2} & D_{n/2}\\I_{n/2} & -D_{n/2} \end{matrix}\right)\left(\begin{matrix} F_{n/2} & 0 \\ 0 & F_{n/2}\end{matrix}\right)\left(\begin{matrix}x_{\rm odd}\\x_{\rm even}\end{matrix}\right)\]

where $D_n = {\rm diag}(1, \omega, \omega^2, \ldots, \omega^{n-1})$.

Quiz

What is the computational complexity of evaluating $F_n x$? Hint: $T(n) = 2 T(n/2) + O(n)$.

julia> using SparseArrays
julia> @testset "fft decomposition" begin
           n = 4
           Fn = dft_matrix(n)
           F2n = dft_matrix(2n)
       
           # the permutation matrix to permute elements at 1:2:n (odd) to 1:n÷2 (top half)
           pm = sparse([iseven(j) ? (j÷2+n) : (j+1)÷2 for j=1:2n], 1:2n, ones(2n), 2n, 2n)
       
           # construct the D matrix
           ω = exp(-π*im/n)
           d1 = Diagonal([ω^(i-1) for i=1:n])
       
           # construct F_{2n} from F_n
           F2n_ = [Fn d1 * Fn; Fn -d1 * Fn]
           @test F2n * pm' ≈ F2n_
       endTest Summary:     | Pass  Total  Time
fft decomposition |    1      1  1.4s
Test.DefaultTestSet("fft decomposition", Any[], 1, false, false, true, 1.714382294071963e9, 1.714382295510516e9, false, "REPL[2]")

We implement the $O(n\log(n))$ time Cooley-Tukey FFT algorithm.

function fft!(x::AbstractVector{T}) where T
    N = length(x)
    @inbounds if N <= 1
        return x
    end

    # divide
    odd  = x[1:2:N]
    even = x[2:2:N]

    # conquer
    fft!(odd)
    fft!(even)

    # combine
    @inbounds for i=1:N÷2
       t = exp(T(-2im*π*(i-1)/N)) * even[i]
       oi = odd[i]
       x[i]     = oi + t
       x[i+N÷2] = oi - t
    end
    return x
end

fft! (generic function with 1 method)

julia> @testset "fft" begin
           x = randn(ComplexF64, 8)
           @test fft!(copy(x)) ≈ dft_matrix(8) * x
       endTest Summary: | Pass  Total  Time
fft           |    1      1  0.3s
Test.DefaultTestSet("fft", Any[], 1, false, false, true, 1.714382295646515e9, 1.714382295933599e9, false, "REPL[1]")

The Julia package FFTW.jl contains a superfast FFT implementation.

julia> using FFTWWARNING: using FFTW.fft! in module Main conflicts with an existing identifier.
julia> @testset "fft" begin
           x = randn(ComplexF64, 8)
           @test FFTW.fft(copy(x)) ≈ dft_matrix(8) * x
       endTest Summary: | Pass  Total  Time
fft           |    1      1  0.0s
Test.DefaultTestSet("fft", Any[], 1, false, false, true, 1.714382295938509e9, 1.714382295941608e9, false, "REPL[2]")

Application 1: Fast polynomial multiplication

Given two polynomials $p(x)$ and $q(x)$

\[\begin{align*} &p(x) = \sum_{k=0}^{n-1} a_k x^k\\ &q(x) = \sum_{k=0}^{n-1} b_k x^k \end{align*}\]

The multiplication of them is defined as

\[p(x)q(x) = \sum_{k=0}^{2n-2} c_k x^{k}\]

Fourier transformation can be used to compute the product of two polynomials in $O(n \log n)$ time, which is much faster than the naive algorithm that takes $O(n^2)$ time.

Algorithm: Fast polynomial multiplication

Evaluate $p(x)$ and $q(x)$ at $2n$ points $ω^0, \ldots , ω^{2n−1}$ using DFT. This step takes time $O(n \log n)$.
Obtain the values of $p(x)q(x)$ at these 2n points through pointwise multiplication

\[\begin{align*} (p \circ q)(ω^0) &= p(ω^0) q(ω^0), \\ (p \circ q)(ω^1) &= p(ω^1) q(ω^1),\\ &\vdots\\ (p \circ q)(ω^{2n−1}) &= p(ω^{2n−1}) q(ω^{2n−1}). \end{align*}\]

This step takes time $O(n)$.

Interpolate the polynomial $p \circ q$ at the product values using inverse DFT to obtain coefficients $c_0, c_1, \ldots, c_{2n−2}$. This last step requires time $O(n \log n)$.

We can also use FFT to compute the convolution of two vectors $a = (a_0,\ldots , a_{n−1})$ and $b = (b_0, \ldots , b_{n−1})$, which is defined as a vector $c = (c_0, \ldots , c_{n−1})$ where

\[c_j = \sum^j_{k=0} a_kb_{j−k}, ~~~~~~ j = 0,\ldots, n − 1.\]

The running time is again $O(n \log n)$.

In the following example, we use the Polynomials package to define the polynomial and use the FFT algorithm to compute the product of two polynomials.

julia> using Polynomials
julia> p = Polynomial([1, 3, 2, 5, 6])Polynomial(1 + 3*x + 2*x^2 + 5*x^3 + 6*x^4)
julia> q = Polynomial([3, 1, 6, 2, 2])Polynomial(3 + x + 6*x^2 + 2*x^3 + 2*x^4)

Step 1: evaluate $p(x)$ at $2n-1$ different points.

julia> pvals = fft(vcat(p.coeffs, zeros(4)))9-element Vector{ComplexF64}:
               17.0 + 0.0im
 -4.492726040024655 - 10.28022621396024im
 1.7378259501428421 + 4.548389131353467im
                0.5 - 6.06217782649107im
 -1.745099910118187 + 1.8382342885471272im
 -1.745099910118187 - 1.8382342885471272im
                0.5 + 6.06217782649107im
 1.7378259501428421 - 4.548389131353467im
 -4.492726040024655 + 10.28022621396024im

which is equivalent to computing:

julia> n = 55
julia> ω = exp(-2π*im/(2n-1))0.766044443118978 - 0.6427876096865393im
julia> map(k->p(ω^k), 0:(2n-1))10-element Vector{ComplexF64}:
                17.0 + 0.0im
  -4.492726040024651 - 10.28022621396024im
  1.7378259501428397 + 4.5483891313534635im
  0.5000000000000016 - 6.062177826491066im
 -1.7450999101181899 + 1.838234288547124im
 -1.7450999101181837 - 1.8382342885471246im
  0.4999999999999951 + 6.0621778264910615im
   1.737825950142845 - 4.548389131353452im
 -4.4927260400246505 + 10.28022621396022im
  16.999999999999975 + 1.532107773982714e-14im

The same for $q(x)$.

julia> qvals = fft(vcat(q.coeffs, zeros(4)))9-element Vector{ComplexF64}:
                14.0 + 0.0im
  1.9285482675487435 - 8.967725221980002im
 -1.9324186608105642 - 0.01930258602426438im
                 0.5 + 2.598076211353316im
  6.0038703932618205 + 3.752270213249106im
  6.0038703932618205 - 3.752270213249106im
                 0.5 - 2.598076211353316im
 -1.9324186608105642 + 0.01930258602426438im
  1.9285482675487435 + 8.967725221980002im

Step 2: Compute $p(x) q(x)$ at $2n-1$ points.

julia> pqvals = pvals .* qvals9-element Vector{ComplexF64}:
               238.0 + 0.0im
 -100.85468292765191 + 20.463620169633234im
  -3.270411622817098 - 8.822936568953224im
                16.0 - 1.7320508075688772im
 -17.374905449530996 + 4.488434009006639im
 -17.374905449530996 - 4.488434009006639im
                16.0 + 1.7320508075688772im
  -3.270411622817098 + 8.822936568953224im
 -100.85468292765191 - 20.463620169633234im
julia> ifft(pqvals)9-element Vector{ComplexF64}:
                3.0 + 0.0im
 10.000000000000002 + 0.0im
 15.000000000000002 - 5.950848353424417e-16im
               37.0 + 0.0im
               43.0 - 6.837200662697115e-16im
               46.0 + 6.837200662697115e-16im
               50.0 + 0.0im
 21.999999999999996 + 7.441926703969614e-16im
 11.999999999999998 + 0.0im

Summarize:

function fast_polymul(p::AbstractVector, q::AbstractVector)
    pvals = fft(vcat(p, zeros(length(q)-1)))
    qvals = fft(vcat(q, zeros(length(p)-1)))
    pqvals = pvals .* qvals
    return real.(ifft(pqvals))
end

function fast_polymul(p::Polynomial, q::Polynomial)
    Polynomial(fast_polymul(p.coeffs, q.coeffs))
end

fast_polymul (generic function with 2 methods)

A similar algorithm has already been implemented in package Polynomials. One can easily verify the correctness.

julia> p * qPolynomial(3 + 10*x + 15*x^2 + 37*x^3 + 43*x^4 + 46*x^5 + 50*x^6 + 22*x^7 + 12*x^8)
julia> fast_polymul(p, q)Polynomial(3.0 + 10.000000000000002*x + 15.000000000000002*x^2 + 37.0*x^3 + 43.0*x^4 + 46.0*x^5 + 50.0*x^6 + 21.999999999999996*x^7 + 11.999999999999998*x^8)

Application 2: Image compression

If you google the logo of the Hong Kong University of Science and Technology, you will probably find the following png of size $2000 \times 3000$.

using Images
img = Images.load("../assets/images/hkust-gz.png")

It is too large! We can compress it with the Fourier transformation algorithm. To simplify the discussion, let us using the gray scale image.

gray_image = Gray.(img)

The gray scale image uses 8-bit fixed point numbers as the pixel storage type.

julia> typeof(gray_image)Matrix{Gray{N0f8}} (alias for Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8, 8}}, 2})
julia> img_data = Float32.(gray_image)2000×3000 Matrix{Float32}:
 0.376471  0.376471  0.376471  0.376471  …  0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471  …  0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 ⋮                                       ⋱
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471  …  0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
 0.376471  0.376471  0.376471  0.376471     0.376471  0.376471  0.376471
julia> img_data_k = fftshift(fft(img_data))2000×3000 Matrix{ComplexF32}:
  14.1177-9.69074f-7im  -16.5889+1.66456im  …  -16.5889-1.66456im
 -12.1824-10.0836im      19.6015+5.63716im      10.0119+14.3428im
  7.55086+13.5317im     -17.3148-3.91325im     -2.42017-22.694im
 -3.00379-9.20718im       11.248-1.95294im     -4.79038+19.1743im
  1.09106+1.94913im     -5.93928+3.75958im      11.0074-5.10588im
 -2.26381+2.12877im      5.33122+2.14914im  …  -14.9784-10.0922im
  4.27849-0.864077im    -8.88375-11.6353im       14.901+16.4903im
 -3.59109-2.92389im      11.5189+16.8775im     -10.5282-11.4876im
 -1.89706+4.89608im     -8.13688-13.6709im      4.82196+1.34339im
  10.6636-3.25829im     -1.22232+4.9903im      -2.45761+4.52942im
         ⋮                                  ⋱
  10.6636+3.25828im     -2.45761-4.52942im     -1.22232-4.9903im
 -1.89706-4.89609im      4.82196-1.34339im     -8.13687+13.6709im
  -3.5911+2.9239im      -10.5282+11.4876im      11.5189-16.8775im
  4.27848+0.864073im      14.901-16.4903im     -8.88375+11.6353im
 -2.26382-2.12877im     -14.9784+10.0922im  …   5.33123-2.14914im
  1.09106-1.94913im      11.0074+5.10589im     -5.93928-3.75958im
  -3.0038+9.20717im     -4.79038-19.1743im       11.248+1.95294im
  7.55085-13.5317im     -2.42017+22.694im      -17.3148+3.91326im
 -12.1824+10.0836im      10.0119-14.3429im      19.6015-5.63716im

it is sparse!

Gray.(abs2.(img_data_k) ./ length(img_data_k))

We can store it in the sparse matrix format.

# let us discard all variables smaller than 1e-5
img_data_k[abs.(img_data_k) .< 1e-5] .= 0
sparse_img = sparse(img_data_k)
compression_ratio = nnz(sparse_img) / (2000 * 3000)
recovered_img = ifft(fftshift(Matrix(sparse_img)))
Gray.(abs.(recovered_img))