How to implement the FFT algorithm

João Martins

Jan 23, 2005

An article on how to implement the FFT algorithm in C, C++ or C#.

Download source - 26.9 Kb

Introduction

This article describes one way to implement the 1D version of the FFT algorithm for an array of complex samples. The intention is to show an efficient FFT implementation that can easily be modified to suit the user's needs. I studied the FFT algorithm while developing software to perform frequency analysis on a sample of captured sound.

Background

First of all, 95% of this code wasn't written by me. It is practically the code described in the book Numerical Recipes In C of 1982!!! Yes, more than 20 years ago!!! But it is still, in my opinion, very, very good. The code here is slightly different from the original, though. While studying the algorithm, I noticed a pattern that could be exploited, and based on that, I managed to improve the algorithm with a small change in the code that reduces the work of the bit-reversal step by roughly N operations. (After implementing the improved method successfully, I did some research on the web and realized I had discovered nothing new. I had just noticed something that someone else had already seen :-P)

I will not get "deep in theory", so I strongly advise reading chapter 12 of the book if you want to understand "The Why". Other forms of the FFT, like the 2D or the 3D FFT, can be found in the book too.

The FFT

The Fast Fourier Transform is an optimized algorithm for computing the Discrete Fourier Transform of an array of N samples, where N is a power of two. It lets you determine the frequencies present in a discrete signal, represent the signal in the frequency domain, compute convolutions, etc. The algorithm has a complexity of O(N*log2(N)). Actually, the total cost is a little higher because the data first needs to be prepared by an operation called bit-reversal. In Numerical Recipes In C, the bit-reversal loop visits all N complex samples. With the small change I've made to the code presented here, it visits only N/2 of them. (Strictly speaking, both are O(N); the saving is a constant factor.) This represents something like an 8% improvement in performance.

Example of a signal in the frequency domain.

The FFT is calculated in two parts. The first transforms the original data array into bit-reversed order by applying the bit-reversal method. This makes the mathematical calculations of the second part much easier. The second part processes the FFT in N*log2(N) operations (application of the Danielson-Lanczos algorithm).

Let's start with an array of complex data. In this case, it is an array of floats in which data[even_index] is the real part and data[odd_index] is the imaginary part. (This can be adapted to real data just by filling the imaginary values with 0s, or you can use the real-array FFT implemented in the book.) The number of complex samples must be a power of two (2, 4, 8, 16, 32, 64, etc.). If the sample doesn't match that size, just put it in an array of the next power-of-two size and fill the remaining spaces with 0s.

Just a small and not very significant consideration: the original code assumes the data begins at index 1 (data[1]) and ignores data[0]. My code changes that to start from index 0 (data[0]).
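As a rough illustration of the layout described above, here is a minimal sketch that copies real samples into an interleaved array and zero-pads it to the next power of two. The names next_power_of_two and pack_real_samples are hypothetical helpers made up for this example; they are not part of the article's code or of Numerical Recipes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: smallest power of two that is >= count. */
unsigned long next_power_of_two(unsigned long count)
{
    unsigned long size = 1;
    while (size < count)
        size <<= 1;
    return size;
}

/* Hypothetical helper: copies `count` real samples into a newly allocated
   interleaved array where data[2*k] is the real part and data[2*k+1] is the
   imaginary part (left at 0). The array is zero-padded up to the next power
   of two, and that padded size is returned through *complex_samples.
   Returns NULL if the allocation fails; the caller frees the result. */
float *pack_real_samples(const float *samples, unsigned long count,
                         unsigned long *complex_samples)
{
    unsigned long size, k;
    float *data;

    assert(samples != NULL && complex_samples != NULL);

    size = next_power_of_two(count);
    data = calloc(2 * size, sizeof *data);  /* calloc zero-fills the array */
    if (data == NULL)
        return NULL;

    for (k = 0; k < count; k++)
        data[2 * k] = samples[k];           /* imaginary parts stay 0 */

    *complex_samples = size;
    return data;
}
```

With 50 real samples, for instance, this would produce 64 complex samples, that is, an array of 128 floats.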

First, we define the FFT function:

//data -> float array that represents the array of complex samples
//number_of_complex_samples -> number of samples (must be a power of 2)
//isign -> 1 to calculate the FFT and -1 to calculate the reverse FFT
float FFT (float data[], unsigned long number_of_complex_samples, int isign)
{
    //variables for trigonometric recurrences
    unsigned long n,mmax,m,j,istep,i;
    double wtemp,wr,wpr,wpi,wi,theta,tempr,tempi;

The Bit-Reversal Method

First, the original array must be reordered so that the Danielson-Lanczos algorithm can be applied. Each complex[index] swaps places with complex[bit-reverse-order-index]. For example, if the index (in binary) is 0b00001, the bit-reverse-order-index is 0b10000. In figure 1, you can see what happens to the data array after the transformation.
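To make the index mapping concrete, here is a small sketch of a function that reverses the lowest bits of an index. The name bit_reverse is an assumption made for this illustration; the routines in this article never compute the reversed index this way, they track it incrementally inside the loop.

```c
#include <assert.h>

/* Hypothetical helper: reverses the lowest `bits` bits of `index`.
   For an array of 2^bits complex samples, the sample at `index`
   trades places with the sample at bit_reverse(index, bits). */
unsigned long bit_reverse(unsigned long index, unsigned int bits)
{
    unsigned long reversed = 0;
    unsigned int b;

    assert(bits <= 8 * sizeof index);

    for (b = 0; b < bits; b++) {
        reversed = (reversed << 1) | (index & 1UL); /* append lowest bit */
        index >>= 1;                                /* drop that bit */
    }
    return reversed;
}
```

For the 5-bit example in the text, bit_reverse(1, 5) gives 16 (0b10000). For an 8-sample array (3 bits), index 1 maps to 4 and index 3 maps to 6, while indexes 0, 2, 5 and 7 map to themselves.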

The implementation of this method according to Numerical Recipes In C goes like this:

    //the complex array is real+imaginary, so the array
    //has a size n = 2 * number of complex samples
    //the real part is data[index] and the imaginary part is data[index+1]
    n=number_of_complex_samples * 2; 

    //binary inversion (note that
    //the indexes start from 1, which means that the
    //real part of the complex is on the odd indexes
    //and the imaginary part is on the even indexes)
    j=1;
    for (i=1;i<n;i+=2) { 
        if (j > i) {
            //swap the real part
            SWAP(data[j],data[i]); 
            //swap the imaginary part
            SWAP(data[j+1],data[i+1]);
        }
        m=n/2;
        while (m >= 2 && j > m) {
            j -= m;
            m = m/2;
        }
        j += m;
    }

The SWAP macro goes outside the function, and it's something like this:

#define SWAP(a,b) tempr=(a);(a)=(b);(b)=tempr
//tempr is a variable from our FFT function

Figure 1 (8-length data array)

If you look closely at figure 1, you can see a pattern. Divide the array in half with a mirror: what you see reflected is exactly what is on the other side. This mirrored effect lets you run the bit-reversal method on the first half of the array and reuse it almost directly for the second half. But you must be careful with one thing: you can only apply the mirrored swap when both indexes being swapped lie in the first half. If the swap is between an index in the first half and an index in the second half, the mirrored swap is one that the loop will also perform by itself, so you would be making the swap and then undoing it again (try this on a 16-length data array and you will understand what I'm saying). So the code becomes something like this:

    //the complex array is real+imaginary, so the array
    //has a size n = 2 * number of complex samples
    //the real part is data[index] and
    //the imaginary part is data[index+1]
    n=number_of_complex_samples * 2; 

    //binary inversion (note that the indexes
    //start from 0, which means that the
    //real part of the complex is on the even indexes
    //and the imaginary part is on the odd indexes)
    j=0;
    for (i=0;i<n/2;i+=2) {
        if (j > i) {
            //swap the real part
            SWAP(data[j],data[i]);
            //swap the imaginary part
            SWAP(data[j+1],data[i+1]);
            //check whether the swap happens within the first half
            //and, if so, apply the mirrored swap to the second half
            if((j/2)<(n/4)){
                //swap the real part
                SWAP(data[(n-(i+2))],data[(n-(j+2))]);
                //swap the imaginary part
                SWAP(data[(n-(i+2))+1],data[(n-(j+2))+1]);
            }
        }
        m=n/2;
        while (m >= 2 && j >= m) {
            j -= m;
            m = m/2;
        }
        j += m;
    }
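The mirrored effect used above can be stated as a property of bit reversal: for N complex samples, the reverse of (N-1-k) equals (N-1) minus the reverse of k, because N-1-k simply flips every bit of k. The sketch below checks that property by brute force. Both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: reverses the lowest `bits` bits of `index`. */
static unsigned long reverse_bits(unsigned long index, unsigned int bits)
{
    unsigned long reversed = 0;
    unsigned int b;
    for (b = 0; b < bits; b++) {
        reversed = (reversed << 1) | (index & 1UL);
        index >>= 1;
    }
    return reversed;
}

/* Returns 1 if rev(N-1-k) == N-1-rev(k) for every k in [0, N),
   where N = 2^bits, and 0 otherwise. This is the symmetry that lets
   a swap inside the first half be mirrored onto the second half. */
int mirror_property_holds(unsigned int bits)
{
    unsigned long n, k;

    assert(bits < 8 * sizeof n);

    n = 1UL << bits;
    for (k = 0; k < n; k++) {
        if (reverse_bits(n - 1 - k, bits) != n - 1 - reverse_bits(k, bits))
            return 0;
    }
    return 1;
}
```

For a 16-sample array, for example, the first-half swap of samples 2 and 4 mirrors to the second-half swap of samples 13 and 11.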

The Danielson-Lanczos

The second half of the code goes exactly as described in the book. It applies the N*log2(N) trigonometric recurrences to the data. The code shown here is my version (data starts at index 0):

    //Danielson-Lanczos routine 
    mmax=2;
    //external loop
    while (n > mmax)
    {
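        istep = mmax << 1;
        //isign is the function parameter (1 or -1);
        //pi is assumed to be defined elsewhere as 3.14159265358979...
        theta=isign*(2*pi/mmax);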
        wtemp=sin(0.5*theta);
        wpr = -2.0*wtemp*wtemp;
        wpi=sin(theta);
        wr=1.0;
        wi=0.0;
        //internal loops
        for (m=1;m<mmax;m+=2) {
            for (i= m;i<=n;i+=istep) {
                j=i+mmax;
                tempr=wr*data[j-1]-wi*data[j];
                tempi=wr*data[j]+wi*data[j-1];
                data[j-1]=data[i-1]-tempr;
                data[j]=data[i]-tempi;
                data[i-1] += tempr;
                data[i] += tempi;
            }
            wr=(wtemp=wr)*wpr-wi*wpi+wr;
            wi=wi*wpr+wtemp*wpi+wi;
        }
        mmax=istep;
    }

How to use the FFT

Let's now see how to use the FFT. Imagine that you want to collect a sample of a signal or a function and find its fundamental frequency. The sample may come from any source: a function that you've written in the code, a piece of captured sound, etc.

Let's say that the signal is an array of real values (just like a sound capture buffer). How do you use the FFT? First of all, you need to choose the FFT variant that you will use. There is a specific variant for real arrays, but in this case, I will use the complex one described here. It's not the most efficient for real data, but it's easy to use.

The next concern is the amount of data you're going to send to the FFT and the sample rate. In this example, the number of complex samples (a 2^N number) is taken to be equal to the sample rate, so that each complex sample of the result corresponds to 1 Hz. You don't need to supply that many actual samples, though (read the NR for different implementations). You can just send 50 samples, for example, and fill the rest of the array with 0s. But remember, the more data you send for calculation, the more precise the FFT is.

After the real array has been copied into a complex array with the imaginary part equal to 0, you compute the FFT.

And now for the results

After the FFT is calculated, you can use the resulting complex array to draw your conclusions.

If you are interested in the fundamental frequency of the signal, find the entry of the array with the largest absolute value; the frequency is given by the index of that entry. The absolute value of a complex number is the square root of the sum of the squares of its real and imaginary parts.

If the absolute maximum occurs in the complex number given by indexes [102][103] (real, imaginary), then your fundamental frequency will be (102/2)=51Hz. You have to divide by two because the array is twice as long as the number of complex samples (remember: real, imaginary), so the result is not the index position (102), but half of it (51). This assumes one complex sample per Hz, that is, the number of complex samples equals the sample rate.

If your intention is to draw the Fourier signal, it goes the same way. Frequency 30 is given by the absolute value of the complex number at [60][61], and so on. Remember that the second half of the computed FFT array must be ignored due to the Nyquist redundancy (the sample rate must be at least twice the highest frequency of the signal). It's only a mirror of the first half. If you want to measure frequencies up to 6000, you will need the next 2^N number after 2*6000, which is 16384.

See the example where I apply the FFT to a sine signal. It's in the OnPaint function of the CChildView class. The FFT is implemented in the CFourier class. The example is a stupid example and has a stupid structure, but I think it's easy to understand. Change the parameters, play with it, try different things, and see the results. This way, you will be able to draw your own conclusions.
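More generally, when the number of complex samples is not equal to the sample rate, the position of a peak has to be scaled to get a frequency in Hz. The sketch below shows the usual conversion; the function name bin_to_hz and its parameters are assumptions made for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: converts a position in the FFT result (array index
   divided by 2) into a frequency in Hz, given the sample rate and the
   number of complex samples passed to the FFT. */
double bin_to_hz(unsigned long bin, double sample_rate_hz, unsigned long count)
{
    assert(count > 0);
    return (double)bin * sample_rate_hz / (double)count;
}
```

With array index 102, the position is 102/2 = 51. If 1024 complex samples were captured at 1024 samples per second, bin_to_hz(51, 1024.0, 1024) gives 51 Hz, matching the rule of one position per Hz. With the same 1024 samples captured at 44100 samples per second, the same position would correspond to roughly 2196.4 Hz.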

Precautions

If you intend to use this FFT implementation, read the NR license.

There is a lot of documentation on this matter. It's not an easy thing to understand, but I think it's a very interesting subject. ;)