Wednesday, October 14, 2009

Convolution confusion

So suppose you have a function that you’d like to modify.  Maybe you don’t like the way the function treats certain inputs.  Maybe you wish the function emphasized some aspect of its input over other aspects.  Maybe you just don’t think the function is aesthetically pleasing.

So you modify the function.  One way to modify the function is to add a constant value at every point.  This is a linear modification.  Since the fourier transform of a function is based on integrating sinusoidal basis functions, and linearity is preserved in integration, you can make this modification in either the time domain OR the frequency domain.

Suppose adding a constant to the function at every point doesn’t modify the function to your liking.  It’s a pretty broad brush change.  Instead of enhancing a particular aspect of the function it, at least in the time domain, merely shifts the function.

Another way to modify the function would be to multiply it by a constant value.  This scaling will also survive the translation to the frequency domain because the Integral(a * f(x) dx) is equal to a * Integral(f(x) dx).  This modification is similar to the first modifcation (adding a constant) in that it’s not selective at all.  It’s very broad brush.

What you really want to be able to do is to modify the input function selectively.  That is, you’d like to use another function to modify the original function.

Enter convolution.  Through some rather heroic mathematics this wonderful operation allows you to use another function, call it h(x), to modify your original function, call it f(x), to produce your output function, call it g(x).

Convolution is a binary operation that is implemented by multiplying the Fourier transforms of the 2 operands.

If f(x) and h(x) are the original function and the modifying function in the time domain then F(u) and H(u) are their corresponding Fourier transforms (we call this the frequency domain).

f(x) convolve h(x) can be performed by multiplying F(u) * H(u) then taking the inverse Fourier transform to produce g(x) – the modified version of the original input.  H(u) has many names, one of which is the transfer function.

Once we’re in the frequency domain, we can make H(u) pretty much whatever we want.  The simplest non-trivial function would be one that multiples F(u) by 1 if it’s below a certain value and 0 if it’s above.  This is an ideal low pass filter.  By ideal it is meant that there is a sharp cutoff – the signal is passed entirely when inside the cutoff and not passed at all when outside.

A key point to keep in mind is that H(u) operates in the frequency domain.  That is, we’re modifying frequencies, which will ultimately produce a modification in the time domain.  I believe the art part of using the Fourier Transform, is figuring out how to modify the function in the frequency domain in a way that produces the desired result in the time domain.

So, for instance, we know that sharp edges in the time/spatial domain require high frequency components in the frequency domain.  To enhance sharp edges via convolution requires that we filter out low frequency components.  To smooth/blur an image (that is, reduce sharp edges) requires the exact opposite (filtering out high frequency components instead).

No comments :

Post a Comment