Here's where I should explain to you how the FFT works. But I'm going to cheat and just point you to some existing pages:

- First, the FFT itself,
in 1D and 2D. His explanations are awesome, but his code
isn't very fast, mostly because of the divide-by-n at the
end. If you replace this with a multiply by 1.0/n, it's
actually within a factor of two or three of the really fast CPU
FFT implementations like the "Fastest Fourier Transform in the
West", fftw, although it's still an order of magnitude
slower than good GPU versions like CUDA cuFFT.
Be aware he puts the (0,0) frequency (DC coefficient) in
the *middle* of his images, but it's more common to compute this
at the origin (0,0), which lies in the corner of the image.

- Second, the FFT for image filtering. Bottom line is that "low pass" filters let through only the long low spatial frequencies, giving blurring; "high pass" filters let through only short high-frequency components, giving edge detection. Note the "filters" on the left in the examples are in frequency domain--these are the values multiplied by the image's FFTs. There are often more efficient ways to do blurring, however!
- FFT_inverse(FFT(A)*FFT(B)) isn't A*B at a single offset, it's the integral of A*B at all possible offsets, a correlation image. This means the FFT is often used for image matching. Paul Bourke has some examples of this, but they don't seem very clear to me, so I'm explaining it below.
- Many good simulations that use a grid, use a Fourier transform
as an efficient way to convolve a function across every grid
point. For example, NAMD used an FFT correlation to
efficiently solve the electrostatic 1/r^2 across a 3D charge
grid.

- Some simulations like fluid
dynamics can use the FFT to perform a tricky whole-image
problem like divergence correction (Helmholtz).
Mathematically, this is because sin and cos have very simple
derivatives, so we can do differentiation and integration more
easily in FFT space.

float correlation_image(int dx,int dy, image &a,image &b) {The idea is that once we finally figure out the offset (dx,dy) that lines up the two images, the bright parts of a will match up with the bright parts of b, and the value in the correlation image will be large.

float sum=0.0;

for (int y=0;y<a.ht();y++)

for (int x=0;x<a.wid();x++)

sum+=a.at(x,y)*b.at(x+dx,y+dy);

return sum;

}

We can use the FFT to do "fast convolution"--we can compute all the pixels in a correlation image by complex-conjugate-multiplying the FFT'd versions of images a and b:

/// ... fill imgA with the search image, and imgB with the chip to search for ...There's only one problem with this--it works like crap!

fftwf_execute(imgA_fwfft);

fftwf_execute(imgB_fwfft);

for (y=0;y<fftH;y++)

for (x=0;x<fftW;x++) {

fcpx a=imgA.at(x,y);

fcpx b=imgB.at(x,y);

fcpx c=(a)*std::conj(b); /* complex multiply a by the complex conjugate of b */

imgC.at(x,y)=c;

}

fftwf_execute(imgC_bwfft);

/// ... imgC is the "correlation image": pixel (dx,dy) is the product of imgA with imgB shifted by (dx,dy)

When searching for this imgB:

Here's imgA (the image to search in) and imgC (the correlation image):

imgA (search image) |
imgC (correlation image) |

However, if we do a high-pass (edge detection) filter on the incoming images, for example by zeroing out all the low FFT frequences, we get far sharper correlations. Here's the code:

/// ... fill imgA with the search image, and imgB with the chip to search for ...And here are the resulting images. Note the strong correlation peak where the book image lines up:

fftwf_execute(imgA_fwfft);

fftwf_execute(imgB_fwfft);

for (y=0;y<fftH;y++)

for (x=0;x<fftW;x++) {

fcpx a=imgA.at(x,y);

fcpx b=imgB.at(x,y);

fcpx c=(a)*std::conj(b); /* complex multiply a by the complex conjugate of b

// Apply high-pass edge detecting filter

int fx=x; if (fx>(fftW/2)) fx-=fftW;

int fy=y; if (fy>(fftH/2)) fy-=fftH;

float r2=(fx*fx+fy*fy); // square of radius: discrete frequency bins

if (r2<128) c=0; // zero out low frequencies

imgC.at(x,y)=c;

}

fftwf_execute(imgC_bwfft);

/// ... imgC is the "correlation image": pixel (dx,dy) is the product of imgA with imgB shifted by (dx,dy)

imgA (search image) |
imgC (correlation image) |

Now we can handle a small
leftward translation. |
Note the shifted correlation
peak. |

Translating to the right. |
Note the shifted correlation
peak. |

Obscuring part of the search
image. |
We still get a nice
correlation peak. |

Rotating the search image
slightly (5 degrees or so). |
The peak is wider, but still
there. |

Rotating the search image
more (15 degrees or so). |
The peak is now
gone--correlation can't deal with much rotation. |

Searching an unrelated image. |
Now there are many
peaks--"false correlations". |

The bottom line is that image-to-image correlation is a powerful technique that can be used for:

- Correcting translation errors in scanned documents, satellite images, etc.
- Creating seamless
panoramas from a sequence of images.

- Automatically matching stereo images (more on this next lecture!).