Demystifying Fourier analysis

Intro

I started learning about spectrum analysis a long time ago at university, but failed to gain any real insight into how it works under the hood. For me, it was just some magical math formulas I had to memorize and know about. I knew the theory but it didn't really click.

A few years ago I wanted to do some audio programming and play around with analyzing sound. That's how I got interested in building an intuition for how the Fourier transform actually works.

I made these interactive examples mostly to improve my own understanding and learn by doing. I am following a mechanistic approach and building from the ground up with simpler building blocks, rather than introducing the formula and saying this is what it means. Since I mainly do web programming for a living, this seems to be easier to wrap my mind around than some abstract math.

My language and math are very simplified, and probably a bit imprecise or not completely correct. I am also not including explanations for everything (like trigonometric functions, because that's out of scope) and I am completely ignoring some details that I didn't think were relevant (like negative frequencies).

Feel free to submit any corrections or feedback in the GitHub repo.

Sound

Sound consists of oscillations, or repeating patterns in pressure, that propagate through a medium. Sound waves typically have complex patterns repeating at different frequencies. We can typically reconstruct any complex waveform from other periodic waves.

Sinusoidal waves are a good choice for this because they have nice mathematical properties. That said, real-world sound waves are not literally made out of sines.

UPDATE

As I've come to learn from the excellent book A Digital Signal Processing Primer, sinusoids do in fact have a great deal to do with vibrations and oscillating motions in the real world.

To quote one Reddit comment:

The only things "perfect" about a sine wave are its mathematical conveniences, it has no special connection to real-world phenomena. You could model those phenomena with equal precision with other waveforms. ~RickRussellTX

Sine waves also represent up-and-down motion of pistons on a crankshaft:

I suspect that the obsession with sine waves comes from the fact that in our daily machine-driven lives we have an awful lot of machines that rotate around an axis, or are attached to crankshafts and what-have you. A piston on a crankshaft produces "pure" sinusoidal up-and-down motion when operating at constant angular velocity. ~RickRussellTX

Sinusoids in fact cannot correctly reproduce discontinuous signals, but they are good enough for band-limited signals.

Probing with a single sine wave

Let the probing... Begin!

We start with a relatively straightforward approach: probing, or analyzing, the target signal with a sine transform, as originally done by Fourier himself. I'm using the term probe here because I've read it somewhere else and it feels appropriate, but these are usually called analyzing functions.

The idea is to multiply the signal with a pure sine wave. The resulting transform is the area under that product, which tells us how closely the signal aligns with our test sine wave.

Here is a simplified formula for this idea: $$\mathit{transform} = \mathit{area}(\ \mathit{target\ signal} \times \mathit{probe}\ )$$

For completeness' sake, let's also include the math formula for our sine transform, which measures the presence of frequency s in the target signal:

$$F_{s} = \int_{-\infty}^\infty f(t) \times \sin(2πst) dt$$

\(\sin(2πst)\) is our sine analysis function (aka probe) and \(f(t)\) is our target signal.

Our target signal is a 4Hz sine wave. We can multiply it by a single sine probe with a fixed phase. This means the probe won't slide left or right; only its frequency changes.

When we aggregate all the values in the transform together, we get a high positive result if the probe correlates with the signal. This number represents the magnitude of the transform. For non-matching frequencies the result is zero because all the peaks and troughs cancel out.

This is all there is to it, the magic trick behind the transform. The total sum of a single sine wave averages out to zero, because there are positive and negative sides. But when we multiply it with a correlating signal, multiplying the negatives will turn them into positive numbers, effectively resulting in a non-zero sum of the transform.

Notice how the resulting magnitude also depends on the amplitude of the signal we want to analyze.
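To make this concrete, here is a minimal Python sketch of the sine probe. It is not the code behind the interactive examples. The names `sine_probe` and `target` are made up for illustration, and I am assuming a 1-second window where the integral is approximated by adding up thin slices.

```python
import math

def sine_probe(signal, freq, duration=1.0, samples=1000):
    """Approximate area(signal * probe) for a sine probe at `freq` Hz."""
    dt = duration / samples
    area = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        area += signal(t) * math.sin(2 * math.pi * freq * t) * dt
    return area

def target(t):
    """The 4Hz sine wave under test (amplitude 1)."""
    return math.sin(2 * math.pi * 4 * t)

# Slide the probe frequency from 1Hz to 7Hz, like the slider does.
for freq in range(1, 8):
    print(f"{freq}Hz probe: {sine_probe(target, freq):.3f}")
```

With these assumptions the 4Hz probe comes out at about 0.5 (half the window length) and the other probes land very close to zero. Doubling the amplitude of `target` doubles the result for the matching probe.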

Probing and phase

Let's try to analyze a signal that is still 4Hz but offset by a quarter of a turn (phase is π/2).

Why can't we match the signal?

The sine probe is out of phase with the target signal and won't correlate. Sinusoidal waves are periodic and repeat from the beginning after every full turn around the circle, 360° or 2π radians. When the phase of our probe doesn't match the phase of the target wave, we won't get a match.

Let's turn our probe into a cosine wave. This will bring it back by a quarter of a turn to match the signal perfectly.

Notice how we're also getting a so-called DC offset of magnitude 1 for all the frequencies that we are analyzing. The transform for our 0Hz probe is also a replica of the original wave! This is because a 0Hz cosine probe is \(\cos(0) = 1\) at every point in time, so multiplying by it leaves the signal unchanged. A 0Hz sine probe is 0 everywhere, so we don't see the same effect there.
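The phase mismatch can be reproduced in a few lines of Python. This is only a sketch under my own assumptions: a 1-second window, a sum instead of an integral, and the made-up helper names `probe` and `shifted`.

```python
import math

def probe(signal, wave, freq, duration=1.0, samples=1000):
    """Approximate area(signal * wave(2*pi*freq*t)) over the window."""
    dt = duration / samples
    area = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        area += signal(t) * wave(2 * math.pi * freq * t) * dt
    return area

def shifted(t):
    """4Hz sine wave offset by a quarter of a turn (phase pi/2)."""
    return math.sin(2 * math.pi * 4 * t + math.pi / 2)

sine_match = probe(shifted, math.sin, 4)    # out of phase with the signal
cosine_match = probe(shifted, math.cos, 4)  # lines up with the signal
print(f"sine probe:   {sine_match:.3f}")
print(f"cosine probe: {cosine_match:.3f}")
```

The sine probe ends up very close to zero even though the frequency is right, while the cosine probe gives about 0.5, the same value a sine probe gives for an unshifted sine.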

Probing with sine and cosine

Let's try combining the sine & cosine transforms in our analysis of a 4Hz wave offset by π/4 (halfway between a sine and a cosine). This is effectively the famous Fourier transform.
We are now dealing with two numbers, which we can show on a 2D xy-plot.

When we pick the 4Hz analysis waves, we can notice how both match partially. From this we can re-construct a complete match. Any movement in phase of the target wave will reflect in the sine and cosine components of our transform.

From the ratio of our sine and cosine matches we can figure out the actual phase and magnitude (as if we were using a sinusoid of that phase).

The magnitude becomes the length of the diagonal line, which is equal to \(\sqrt{\mathit{sin}^2 + \mathit{cos}^2}\). The phase can be found from the ratio or angle between the sine & cosine components, i.e. \(\mathrm{atan2}(\mathit{sin}, \mathit{cos})\).
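Here is a small Python sketch of combining the two components, again assuming a 1-second window and a sum in place of the integral. The helper `components` is a name I made up for this example.

```python
import math

def components(signal, freq, duration=1.0, samples=1000):
    """Return (sine part, cosine part) of the transform at `freq` Hz."""
    dt = duration / samples
    sin_part = cos_part = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        angle = 2 * math.pi * freq * t
        sin_part += signal(t) * math.sin(angle) * dt
        cos_part += signal(t) * math.cos(angle) * dt
    return sin_part, cos_part

def target(t):
    """4Hz wave offset by pi/4, halfway between a sine and a cosine."""
    return math.sin(2 * math.pi * 4 * t + math.pi / 4)

sin_part, cos_part = components(target, 4)
magnitude = math.hypot(sin_part, cos_part)  # sqrt(sin^2 + cos^2)
phase = math.atan2(sin_part, cos_part)      # angle of the point on the xy-plot
print(f"sine: {sin_part:.3f}  cosine: {cos_part:.3f}")
print(f"magnitude: {magnitude:.3f}  phase: {phase:.3f} rad")
```

Both components come out equal (about 0.354 each), the magnitude is about 0.5 and the angle is π/4. One caveat: with this argument order the angle is measured from the cosine axis of the plot. For the π/4 example that makes no difference, but for other phases swapping the two arguments gives the offset relative to a sine instead.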

Sine/cosine probes and their negatives

We don't need any more probes to cover the rest of the quadrants, since the next two sinusoids (each another π/2 further along) would just be the negatives of our sine/cosine probes.

$$\sin(x + π) = {-\sin(x)}, \ \ \cos(x + π) = {-\cos(x)}$$

Phase

To further drive the point home, let's look at another interactive example.

Here we fix the frequencies of both the signal under test and our sine/cosine probes at 4Hz and just look at the phase of the signal.

Drag the target sine wave left and right. Notice that our sine & cosine probes give a match for any phase in-between.
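Dragging the wave can be imitated in Python by stepping the phase in a loop. This is a sketch with the same assumptions as before (1-second window, sum instead of integral), and `components` and `match_at_phase` are hypothetical helper names.

```python
import math

def components(signal, freq, duration=1.0, samples=1000):
    """Return (sine part, cosine part) of the transform at `freq` Hz."""
    dt = duration / samples
    sin_part = cos_part = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        angle = 2 * math.pi * freq * t
        sin_part += signal(t) * math.sin(angle) * dt
        cos_part += signal(t) * math.cos(angle) * dt
    return sin_part, cos_part

def match_at_phase(phase, freq=4):
    """Probe a `freq` Hz sine shifted by `phase` with probes at the same frequency."""
    signal = lambda t: math.sin(2 * math.pi * freq * t + phase)
    sin_part, cos_part = components(signal, freq)
    return sin_part, cos_part, math.hypot(sin_part, cos_part)

# Step the phase around the full circle in eighths of a turn.
for step in range(8):
    phase = step * math.pi / 4
    s, c, m = match_at_phase(phase)
    print(f"phase {phase:.2f}: sine {s:+.3f}  cosine {c:+.3f}  magnitude {m:.3f}")
```

The sine and cosine parts trade off against each other as the phase moves, but the combined magnitude stays at about 0.5 the whole way around.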

Analyzing a complex wave

Let's analyze a more complex signal which consists of multiple components, a strong one at 2Hz and two weaker ones at 4Hz & 7Hz. The 4Hz component has a phase of π/3 and the 7Hz a phase of π/4. This example includes a frequency spectrum plot that shows a bar for each analysis probe. Clicking on a frequency bin will show the corresponding analysis function and transform (just like selecting the probe frequency with the slider in the previous examples).
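A text version of such a spectrum plot can be sketched in Python. The amplitudes below (1.0 for the strong component, 0.4 for the weak ones) are values I picked for illustration, since the exact ones are not stated above, and `spectrum` is a made-up helper that assumes a 1-second window.

```python
import math

# (frequency in Hz, amplitude, phase); the amplitudes are assumed values
PARTIALS = [(2, 1.0, 0.0), (4, 0.4, math.pi / 3), (7, 0.4, math.pi / 4)]

def target(t):
    """Sum of the three sinusoidal components."""
    return sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in PARTIALS)

def spectrum(signal, max_freq=10, duration=1.0, samples=1000):
    """Magnitude of the sine/cosine transform for each whole-number bin."""
    dt = duration / samples
    times = [(k + 0.5) * dt for k in range(samples)]
    values = [signal(t) for t in times]
    bins = []
    for freq in range(max_freq + 1):
        sin_part = sum(v * math.sin(2 * math.pi * freq * t)
                       for v, t in zip(values, times)) * dt
        cos_part = sum(v * math.cos(2 * math.pi * freq * t)
                       for v, t in zip(values, times)) * dt
        bins.append(math.hypot(sin_part, cos_part))
    return bins

bins = spectrum(target)
for freq, mag in enumerate(bins):
    print(f"{freq:2d}Hz {'#' * round(mag * 40)} {mag:.3f}")
```

Each bin's magnitude comes out at half the amplitude of the matching component (about 0.5 at 2Hz and 0.2 at 4Hz and 7Hz), regardless of the component's phase, and the remaining bins stay very close to zero.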

Let's examine these anomalies in our next examples.

Spectral spillover

What happens if we analyze frequencies that are not whole numbers, but somewhere in-between? Let's say this time we are analyzing a signal with a frequency of 3.5Hz and phase π/3.

See how for non-integer frequencies we don't get a single bar. Instead, the spectrum spills over into adjacent bins. The actual peak is somewhere between 3Hz & 4Hz. We can figure out the actual peak either by interpolating between the bins, or we can increase the number of bins by creating more granular probes for analyzing.

What's even stranger is how the spill-over shape doesn't just taper out like a bell curve. Instead it oscillates up and down, dampening as it spreads out and creating shapes known as lobes.
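The spillover is easy to reproduce numerically. The sketch below probes the 3.5Hz signal first with whole-number bins and then with bins every 0.5Hz. As before, the 1-second window, the sum in place of the integral and the helper name `magnitude` are my own assumptions.

```python
import math

def magnitude(signal, freq, duration=1.0, samples=1000):
    """Combined sine/cosine transform magnitude at `freq` Hz."""
    dt = duration / samples
    sin_part = cos_part = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        angle = 2 * math.pi * freq * t
        sin_part += signal(t) * math.sin(angle) * dt
        cos_part += signal(t) * math.cos(angle) * dt
    return math.hypot(sin_part, cos_part)

def target(t):
    """3.5Hz wave with phase pi/3."""
    return math.sin(2 * math.pi * 3.5 * t + math.pi / 3)

# Whole-number bins: no single bin matches, the energy spills over.
coarse = {f: magnitude(target, f) for f in range(8)}
# More granular probes, one every 0.5Hz.
fine = {f / 2: magnitude(target, f / 2) for f in range(16)}

for freq, mag in coarse.items():
    print(f"{freq}Hz: {mag:.3f}")
print("peak of the finer spectrum:", max(fine, key=fine.get), "Hz")
```

With whole-number bins the 3Hz and 4Hz probes both give a partial match of roughly 0.3, and the bins further out are smaller but clearly not zero. With the finer bins, the 3.5Hz probe gives the full match of about 0.5.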

The sinc function & spectral resolution

This spilling over happens because our sine isn't infinite in time and we are analyzing frequencies that are not all periodic with the fixed time interval.

An infinite sine would give perfect peaks in the spectrum plot, but our sine signal is showing lobes in neighboring frequency bins. With a finite window, many of the probes that are close in frequency will give a perceptible match. That's because they or the signal don't fit perfectly in the window (aren't periodic with it) and have some dangling fragments near each side of the window.

In other words, our transforms don't always cancel out when they should, because they are sliced off at the ends in the wrong place. If we had an infinite window of time or we tailored the window to be periodic with each measured frequency and the target signal, all of this noise would cancel out and leave a pure spike for the probe that fully matches our sinusoid.

To illustrate the effect, here is an example of a 2.5Hz wave and a 1-second time window. Increasing the time interval changes the spectral resolution and we get a cleaner spike for the target frequency. This shape of the frequency spectrum with side lobes spreading out is known as the sinc function (in this case, it's absolute, i.e. |sinc|).
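To see the effect of the window length in numbers, here is a rough Python sketch that probes the 2.5Hz wave with a 1-second and a 4-second window. Dividing by the window length is my own choice, made so the two windows can be compared directly, and `normalized_magnitude` is a hypothetical helper name.

```python
import math

def normalized_magnitude(signal, freq, duration, rate=1000):
    """Transform magnitude divided by the window length in seconds."""
    samples = round(duration * rate)
    dt = duration / samples
    sin_part = cos_part = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        angle = 2 * math.pi * freq * t
        sin_part += signal(t) * math.sin(angle) * dt
        cos_part += signal(t) * math.cos(angle) * dt
    return math.hypot(sin_part, cos_part) / duration

def target(t):
    """2.5Hz sine wave."""
    return math.sin(2 * math.pi * 2.5 * t)

# Probe every 0.125Hz from 2Hz to 3Hz.
probe_freqs = [2.0 + step * 0.125 for step in range(9)]
for duration in (1.0, 4.0):
    row = [normalized_magnitude(target, f, duration) for f in probe_freqs]
    print(f"{duration}s window:", " ".join(f"{m:.2f}" for m in row))
```

Both windows peak at about 0.5 for the 2.5Hz probe. With the 1-second window the neighboring probes still match strongly, while with the 4-second window they drop off much faster, which is the cleaner spike described above.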

Periodic window & frayed ends

We can demonstrate this requirement to have a periodic window of time in the following interactive example.

There is a 2Hz target sine wave and a 4Hz sine probe. The time interval can be increased in increments of 0.125s, i.e. half of the probe's period. Notice how when the window size is periodic with the sine transform, we get a perfect cancellation. This is the behavior we want, since our 4Hz probe isn't supposed to give a match for the 2Hz target signal. We get partial matches when the probe sine does not perfectly fit in the window (isn't periodic with it).

This is where those lobes and the sinc shape come from. Note that for higher frequencies these fragments are smaller and have less effect on the transform, causing smaller spectrum lobes.
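The frayed ends can be checked with a short Python sketch that grows the window in 0.125s steps, using the 2Hz target and 4Hz sine probe from the example above. The function name `windowed_area` and the slice count are assumptions for illustration.

```python
import math

def windowed_area(duration, target_freq=2, probe_freq=4, samples=1000):
    """Area of (target sine * probe sine) over a window of `duration` seconds."""
    dt = duration / samples
    area = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        area += (math.sin(2 * math.pi * target_freq * t)
                 * math.sin(2 * math.pi * probe_freq * t) * dt)
    return area

# Grow the window in steps of half the probe's period.
for step in range(1, 9):
    duration = step * 0.125
    print(f"{duration:.3f}s window: {windowed_area(duration):+.4f}")
```

Windows that are whole multiples of 0.25s (the probe's full period) cancel out to essentially zero. The steps in between leave a small leftover of about 0.053 in size, alternating in sign, which is the dangling fragment that shows up as a false partial match.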

Complex numbers

How do complex numbers come into play?

There is a common way to write down the Fourier transform for frequency s that uses complex numbers: $$F_s = \int_{-\infty}^\infty f(t)\cdot(\cos(2πst) - i \cdot\sin(2πst))dt$$

From my understanding, it's just a convenient way to write sine & cosine components in a single math expression. Multiplying the sine transform with the imaginary number separates it from the cosine component, since real & imaginary numbers don't mix.

Another common notation uses Euler's formula: $$e^{ix} = \cos{x} + i\sin{x}$$ which gives us this beautiful condensed version: $$F_s = \int_{-\infty}^\infty f(t)\cdot e^{-i2πst}dt$$
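Python has complex numbers built in, so the condensed formula can be tried out directly. This is a sketch under the same assumptions as the rest of my examples (a finite 1-second window and a sum in place of the integral), and `fourier` is just a name I chose.

```python
import cmath
import math

def fourier(signal, freq, duration=1.0, samples=1000):
    """Approximate the integral of signal(t) * e^(-i*2*pi*freq*t) over the window."""
    dt = duration / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * dt  # midpoint of the k-th slice
        total += signal(t) * cmath.exp(-2j * math.pi * freq * t) * dt
    return total

def target(t):
    """4Hz wave offset by pi/4."""
    return math.sin(2 * math.pi * 4 * t + math.pi / 4)

result = fourier(target, 4)
cos_part = result.real   # the cosine transform
sin_part = -result.imag  # the sine transform (note the minus sign in the formula)
print(f"cosine: {cos_part:.3f}  sine: {sin_part:.3f}  magnitude: {abs(result):.3f}")
```

The real part carries the cosine match and the imaginary part carries the negated sine match, so a single complex number holds both components. Its absolute value is the same magnitude as \(\sqrt{\mathit{sin}^2 + \mathit{cos}^2}\), about 0.5 for this signal.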

Conclusion

We can analyze frequencies in any waveform by multiplying it with other periodic waves for each of the frequencies we want to find. These analysis waveforms, aka probes, must have some parts above zero and some below zero so their sum averages to zero. Sines and cosines are good candidates, and the combination of these two transforms is sufficient to detect a frequency regardless of where it is positioned in the wave, i.e. its phase (or even its duration). But they exhibit issues when analyzing limited time intervals, since they don't taper off and instead extend to infinity.