For a few years I have been frustrated at having no time to write up my image analysis methodology into academic journal papers.  I am becoming worried that the world is moving on (e.g., from image analysis to video analysis) and may not considerable it feasible to backtrack to use my methods, far superior though they may be to the ones the world is using.  My one published paper on algorithms was only the first in a series documenting a large body of work.

Due to a concatenation of circumstances, which from my point of view is characterised as intransigence by The University of Queensland, I am not employed in image analysis and have to write up this work in my "spare time".  I returned to fishery population dynamics which I find very interesting although a much smaller mathematical community than image analysis.  I also make a large investment of my time in keeping up and improving my skills as a classical pianist, which I feel it would be a crime against humanity not to do.  I sometimes wish that my skill as a musician had been given to somebody else so that I could have been a listener instead of a performer, but I can't do anything about that.

Anyway, to return to the point, I am a big fan of the continuous wavelet transform (CWT) as opposed to the discrete wavelet transform (DWT).  Most people prefer the DWT and its relatives but in my view that is only because the appropriate theory for the CWT has never been published and hence practitioners don't know how best to apply the CWT.  In technical terms, the CWT uses what is called an "overcomplete basis" which means that practitioners can tune it to fit their image and choose which wavelet coefficients they want to use to represent their image.  With the DWT there is very little choice: we take what the DWT provides whether it accurately fits our image or not.

For the CWT I am also a big fan of pyramidal algorithms.  These algorithms operate in the spatial domain as opposed to the frequency domain (Fourier transform).  They are based on two-scale relations.  Roughly speaking, their computational efficiency comes from calculating the wavelet transform (WT) at a particular scale from the alread-computed WT at a finer scale.  Pyramidal algorithms are faster than frequency-domain algorithms based on the Fast Fourier Transform (FFT) but not by as big a margin as we might expect.  The FFT is actually an astonishingly efficient way of doing things, and spatial-domain algorithms to do any better have to be very craftily designed.  The comparable speeds of the best spatial-domain and frequency-domain algorithms are described in my published paper which as far as I know is the only place where such a comparison can be found.

As a word of caution about pyramidal algorithms, I have to warn that they are not suitable for modulated wavelets, i.e., wavelets that involved a factor of exp(ix).  The archetypal modulated wavelet is the Gabor wavelet or more correctly the Morlet wavelet which is very similar to the Gabor wavelet but meets the technical definition of a wavelet.  Many other modulated wavelets have been developed over the years; see the book "Two-dimensional Wavelets and Their Relatives" by my friend Jean-Pierre Antoine et al. who has been very kind to me despite having to put up with my suffering chronic fatigue and consequently being out of contact for lengthy periods of time.

In my opinion, however, modulated wavelets are not desirable for wavelet analysis because their zero crossings (values of x at which the real or imaginary part of the wavelet is zero) tend to be very evenly spaced.  This creates two problems: (1) modulated wavelets are very sensitive to noise at particular frequencies, and (2) they can miss important frequencies in a signal if those frequencies don't correspond very closely to frequencies that are used in the wavelet analysis.  Practical wavelet analysis is a discrete technique.  It is not possible to examine all possible frequencies, much as we might like to.  Typically, the scale increases by geometrically, by a constant factor at each step as we advance from very fine to very coarse scales.  This factor is characterised by the number of "voices per octave" or steps per doubling of scale.  A coarse analysis would use one voice per octave (which most implementations of the DWT do).  A fine analysis might use four even twelve voices per octave.  Whatever style of analysis is used, there is still plenty of room in between successive scales for a signal to slip in between the cracks in the analysis.  Extreme regularity of zero crossings tends, in my experience, to grossly aggravate this problem.

Back to frequency-domain algorithms, the major thing that lets them down is the gritty detail of interpolating the wavelet to the frequencies used in the discrete Fourier transform (DFT).  This is in my opinion a major problem and there is no satisfactory means of solving it.  For anisotropic wavelets (e.g., directional wavelets), the problem is exacerbated by the need to rotate the wavelet and then still interpolate the wavelet's Fourier transform to the original non-rotated points of the DFT.  Therefore I would still prefer pyramidal algorithms even if they offered no computational speed advantages over frequency-domain algorithms.

My published paper is the first to show how an isotropic two-scale relation and its associated pyramidal algorithm can be used to implement the CWT for an anisotropic wavelet.  The pyramidal algorithm can use the original Cartesian pixels with no need for rotation, and thereby overcomes a huge number of problems that can be caused by interpolation of a rotated wavelet to the original pixels.

The next steps in describing the body of work I need to publish on the CWT will be taken up in the next instalment of this blog.

Login Form