






Introduction
The Fast Fourier Transform (FFT) allows users to view the spectrum content of an audio signal. The FFT code presented here was written by Don Cross, his homepage appears to have subsequently been taken down. Rather than explain the mathematical theory of the FFT, I will attempt to explain its usefulness as it relates to audio signals.
The FFT allows users to obtain the spectral makeup of an audio signal, obtain the decibels of its various frequencies, or obtain the intensity of its various frequencies. Spectral viewers (shown in the image above), Equalizers, or VU-Meters may all use the FFT in order to display their results. The difference between them then depends upon one of a couple of equations that take the real and imaginary components of the FFT, and return either the intensity or decibel levels to be used in the graphed result. The following code takes both the real and imaginary components of the FFT result, and returns the intensity and decibels.
inline double GetFrequencyIntensity(double re, double im)
{
return sqrt((re*re)+(im*im));
}
#define mag_sqrd(re,im) (re*re+im*im)
#define Decibels(re,im) ((re == 0 && im == 0) ? (0) :
10.0 * log10(double(mag_sqrd(re,im))))
#define Amplitude(re,im,len) (GetFrequencyIntensity(re,im)/(len))
#define AmplitudeScaled(re,im,len,scale) ((int)Amplitude(re,im,len)%scale)
The FFT uses the audio signal as its real component, and uses a NULL pointer for its imaginary component indicating that the imaginary data does not exist. Upon its return, the FFT will return both the real and imaginary data components based upon the data given as the real component. The is mirrored with the return samples so that 0-FFT_LEN/2 contains the data, and FFT_LEN/2 to FFT_LEN contains a reverse of the data. This mistake was corrected in my code. The code that performs the FFT follows:
DWORD nCount = 0;
for (DWORD dw = 0; dw < FFT_LEN; dw++)
{
{
finleft[nCount] = (double)((short*)pwh->lpData)[dw++];
finright[nCount++] = (double)((short*)pwh->lpData)[dw];
}
}
fft_double(FFT_LEN/2,0,finleft,NULL,fout,foutimg);
float re,im,fmax=-99999.9f,fmin=99999.9f;
for(int i=1;i < FFT_LEN/4-1;i++)
{
re = fout[i];
im = foutimg[i];
fdraw[i]=AmplitudeScaled(re,im,FFT_LEN/2,256);
if (fdraw[i] > fmax)
{
fmax = fdraw[i];
}
if (fdraw[i] < fmin)
{
fmin = fdraw[i];
}
}
int nAvg, nBars=16, nCur = 0;
for(int i=1;i < FFT_LEN/4;i++)
{
nAvg = 0;
for (int n=0; n < nBars; n++)
{
nAvg += (int)fdraw[i];
}
nAvg /= nBars;
TRACE("Average for Bar#%d is %d\n",nCur++,nAvg);
i+=nBars-1;
}
DataHolder* pDataHolder = (DataHolder*)lpData;
CFrequencyGraph* pPeak = (CFrequencyGraph*)pDataHolder->pData;
if (::IsWindow(pPeak->GetSafeHwnd()))
{
pPeak->SetYRange(0,256);
pPeak->Update(FFT_LEN/4,fdraw);
}
fmax=-99999.9f,fmin=99999.9f;
fft_double(FFT_LEN/2,0,finright,NULL,fout,foutimg);
fdraw[0] = fdraw[FFT_LEN/4] = 0;
for(i=1;i < FFT_LEN/4-1;i++)
{
re = fout[i];
im = foutimg[i];
fdraw[i] = Decibels(re,im);
if (fdraw[i] > fmax)
{
fmax = fdraw[i];
}
if (fdraw[i] < fmin)
{
fmin = fdraw[i];
}
}
CFrequencyGraph* pPeak2 = (CFrequencyGraph*)pDataHolder->pData2;
if (::IsWindow(pPeak2->GetSafeHwnd()))
{
pPeak2->SetNumberOfSteps(50);
pPeak2->SetYRange((int)fmin,(int)fmax);
pPeak2->Update(FFT_LEN/4,fdraw);
}
This code is contained in a callback function that is called every time the waveIn functions return with updated audio signal data. The code that actually performs the FFT looks like:
void fft_double (unsigned int p_nSamples, bool p_bInverseTransform,
double *p_lpRealIn, double *p_lpImagIn,
double *p_lpRealOut, double *p_lpImagOut)
{
if(!p_lpRealIn || !p_lpRealOut || !p_lpImagOut) return;
unsigned int NumBits;
unsigned int i, j, k, n;
unsigned int BlockSize, BlockEnd;
double angle_numerator = 2.0 * PI;
double tr, ti;
if( !IsPowerOfTwo(p_nSamples) )
{
return;
}
if( p_bInverseTransform ) angle_numerator = -angle_numerator;
NumBits = NumberOfBitsNeeded ( p_nSamples );
for( i=0; i < p_nSamples; i++ )
{
j = ReverseBits ( i, NumBits );
p_lpRealOut[j] = p_lpRealIn[i];
p_lpImagOut[j] = (p_lpImagIn == NULL) ? 0.0 : p_lpImagIn[i];
}
BlockEnd = 1;
for( BlockSize = 2; BlockSize <= p_nSamples; BlockSize <<= 1 )
{
double delta_angle = angle_numerator / (double)BlockSize;
double sm2 = sin ( -2 * delta_angle );
double sm1 = sin ( -delta_angle );
double cm2 = cos ( -2 * delta_angle );
double cm1 = cos ( -delta_angle );
double w = 2 * cm1;
double ar[3], ai[3];
for( i=0; i < p_nSamples; i += BlockSize )
{
ar[2] = cm2;
ar[1] = cm1;
ai[2] = sm2;
ai[1] = sm1;
for ( j=i, n=0; n < BlockEnd; j++, n++ )
{
ar[0] = w*ar[1] - ar[2];
ar[2] = ar[1];
ar[1] = ar[0];
ai[0] = w*ai[1] - ai[2];
ai[2] = ai[1];
ai[1] = ai[0];
k = j + BlockEnd;
tr = ar[0]*p_lpRealOut[k] - ai[0]*p_lpImagOut[k];
ti = ar[0]*p_lpImagOut[k] + ai[0]*p_lpRealOut[k];
p_lpRealOut[k] = p_lpRealOut[j] - tr;
p_lpImagOut[k] = p_lpImagOut[j] - ti;
p_lpRealOut[j] += tr;
p_lpImagOut[j] += ti;
}
}
BlockEnd = BlockSize;
}
if( p_bInverseTransform )
{
double denom = (double)p_nSamples;
for ( i=0; i < p_nSamples; i++ )
{
p_lpRealOut[i] /= denom;
p_lpImagOut[i] /= denom;
}
}
}
And it requires the following supporting functions:
bool IsPowerOfTwo( unsigned int p_nX )
{
if( p_nX < 2 ) return false;
if( p_nX & (p_nX-1) ) return false;
return true;
}
unsigned int NumberOfBitsNeeded( unsigned int p_nSamples )
{
int i;
if( p_nSamples < 2 )
{
return 0;
}
for ( i=0; ; i++ )
{
if( p_nSamples & (1 << i) ) return i;
}
}
unsigned int ReverseBits(unsigned int p_nIndex, unsigned int p_nBits)
{
unsigned int i, rev;
for(i=rev=0; i < p_nBits; i++)
{
rev = (rev << 1) | (p_nIndex & 1);
p_nIndex >>= 1;
}
return rev;
}
double Index_to_frequency(unsigned int p_nBaseFreq,
unsigned int p_nSamples, unsigned int p_nIndex)
{
if(p_nIndex >= p_nSamples)
{
return 0.0;
}
else if(p_nIndex <= p_nSamples/2)
{
return ( (double)p_nIndex /
(double)p_nSamples * p_nBaseFreq );
}
else
{
return ( -(double)(p_nSamples-p_nIndex) /
(double)p_nSamples * p_nBaseFreq );
}
}
The included sample class, CFrequencyGraph, will draw a sample EQ, peak meter, and spectral graph using the frequency intensity. Hopefully, this serves as a decent introduction to the uses and basics of the Fast Fourier Transform for audio signals. Other functions of the FFT include using it in combination with a Beat Detection Algorithm to detect beats in an audio signal. Another page with useful FFT information is located at FFT Spectrum Analyser.
This article uses waveIn* functions to retrieve the soundcard's data from the playing source. Therefore, you must manually open your soundcard properties and change the recording source to use mono/stereo mix or waveIn as the selected recording line. Also, if you set the playback wave option to false, this will override the recording options and no sound will appear. For information about doing an inverse FFT for transforming frequencies into an audio signal, please do a Google search on "Inverse FFT" or "IFFT".
History
- Version 1.3: Fixed Spectrum drawing, Changed Pixelgram color, changed Process routine to match new Recording parameters.
- Version 1.2: Fixed many drawing bugs, changed to use amplitude scaled and decibels.
| You must Sign In to use this message board. |
|
|
 |
|
 |
Hi:
I am not sure but i can not understand the relationship between the amplitude and the number of samples. What i can to say is that in this example is clear what is magnitude and intensity of signal, but ¿why the number of samples is factor or amplitude?. For the same amplitude of signal (for example peak to peak to 1Khz), the amplitude is different for 510 samples to 256 samples. In other words, must be always ONE relationship between the "digital values" -> "power of the system (wats, horse power...)" -> "sound power(decibels)"
¿ Can anybody answer me ? ¿ Anybody knows the email of Fred Ackers to solve my question ?
Best regards to all the people
Juanjo
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
 |
|
 |
Can someone explain mathematically why the Decibel macro is correct? I thought decibels were the ratio of two values, but the macro is using only the intensity of the frequency squared. Thanks.
|
| Sign In·View Thread·PermaLink | 1.50/5 |
|
|
|
 |
|
 |
How would i go about finding the frequency of the audio coming in and then display that in something like an edit box or something similar?
|
| Sign In·View Thread·PermaLink | 2.67/5 |
|
|
|
 |
|
|
 |
|
 |
hi, i got a program from codeproject which playes wave file with directsound. and i want to modify it and add spectrum support to it. but i got some questiones about it, can you help me resolve it?
question one: the program open wave file and read wave file information to struct WAVEFORMATEX, the members of it like wFormatTag=1, nChannels=2, nSamplesPerSec=44100, nAvgBytesPerSec=176400, nBlockAlign=4, wBitsPerSample=16, cbSize=0. then program initial DirectSoundBuffer's size as 176400, we must set size to 176400, it can be set any number?
question two: i add code to log DirectSoundBuffer's current play position and current write position to listbox control, the max value displayed in it which each of them may big than 176400, how can i get right start postion what i want to sample the buffer? code like:
DWORD dwCurPlayPos, dwCurWritePos; m_lpDSB->GetCurrentPosition(&dwCurPlayPos, &dwCurWritePos);
question three: if i get right start postion of buffer, what right sample size may be suggest?
code modified by me can download here.
regards, jacky_zz
|
| Sign In·View Thread·PermaLink | 3.00/5 |
|
|
|
 |
|
 |
Hello,
i don't understand, why the mod-operator % and not the div-operator / is used for scaling the amplitude:
#define AmplitudeScaled(re,im,len,scale) ((int)Amplitude(re,im,len)%scale)
Maybe someone can answer this question?
Best regards, Tabor25
|
| Sign In·View Thread·PermaLink | 1.33/5 |
|
|
|
 |
|
|
 |
|
|
 |
|
 |
I was wondering if anyone has attempting to integrate the above code into an xcode project? If so, what ammendments have you made? Cheers Eoghan
|
| Sign In·View Thread·PermaLink | 2.00/5 |
|
|
|
 |
|
 |
Hi....
your article is great.... it's very useful for me. but i have a question for you.. can the source code add by sound from using sound card?? so the result is waveform from the sound by using sound card.
I really appreciate it if you can help me. because it can help finish my task. please.....
thank you....
|
| Sign In·View Thread·PermaLink | 1.00/5 |
|
|
|
 |
|
|
 |
 | re  Leon.Minos | 19:39 2 Aug '07 |
|
|
 |
|
 |
Thank you everybody, can someone help me please, i downloaded the code and tried to compile with, visual C++ 6.0 but i get this error, any idea of how fix it:
--------------------Configuration: waveInFFT - Win32 Debug------------------- Compiling... waveInFFTDlg.cpp C:\waveinFFT_demo\waveInFFTDlg.cpp(143) : error C2374: 'i' : redefinition; multiple initialization C:\waveinFFT_demo\waveInFFTDlg.cpp(125) : see declaration of 'i' C:\waveinFFT_demo\waveInFFTDlg.cpp(169) : error C2374: 'i' : redefinition; multiple initialization C:\waveinFFT_demo\waveInFFTDlg.cpp(125) : see declaration of 'i' Error executing cl.exe. waveInFFT.exe - 2 error(s), 0 warning(s)
ED
|
| Sign In·View Thread·PermaLink | 5.00/5 |
|
|
|
 |
|
 |
Rename one of the variables "i" to another letter, perhaps "j", change the use of "i" within the following "for" loop from i to j, and that should fix the problem. I do not have Visual C++ 6.0 so I can not test the code on that compiler.
Nothing is impossible, It's merely a matter of finding an answer to the question of HOW? ... And answering that question is usually the most difficult part of the job!!!
|
| Sign In·View Thread·PermaLink | 4.00/5 |
|
|
|
 |
|
 |
Hi really useful programme. Just trying to have a go at some simple DSP. Quick question, if you print the audio sample values "(double)((short*)pwh->lpData)[dw++];" to a text file and load it up into a sound editor, adobe audition can take plain text input, so can matlab, and set the sampling rate to 44100, as defined in the programme's sampling rate, the audio you are doing the FFT on plays back at the wrong speed (speed up) and is of poor, but audible, quality. Any ideas why there is a difference in the sampling rate and why the audio recording is so noisy.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
The noise in the recording, which you probably noticed by the program display when no audio is playing, has to do with the recording properties. Depending upon what sound card a person is using, there may or may not be a Wave/MP3 line in the recording properties. Otherwise the selected recording line, (and even if there is a wave line) is usually set to "Stereo" or "What U Hear" or some such term. In effect what this does is combine all the recording lines into one source. This includes feedback from the Mic Line. If you can try selecting the WAVE/MP3 line in the recording options of your sound card. That should take care of the issue. If you select that line, you will notice the program no longer displays any noise when no audio is playing. If you do not have such a line in your volume control recording options, then I do not know what you can do other than changing the program from waveIn recording, to waveOut playing but that will require extensive changes.
Nothing is impossible, It's merely a matter of finding an answer to the question of HOW? ... And answering that question is usually the most difficult part of the job!!!
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Hi Fred, thanks for the help although this is not the case, I have tested it to be sure. All are muted apart from the microphone I am trying to record off and the value comes out distorted, not just with added low level noise. This manifests itself audible like distortion as if the sound card was being overdriven (the input was too loud) expect it is not reaching the max values of the buffers 2 bytes per samples for the 16 bit sample resolution and is within the normal recording levels for the device. Any more ideas would be greatfuly recieved. I have ran a sweeping sin wav through and got frequency components addeded that arn't in the original. I also ran through a fixed 100 Hz signal and found that what ever is in each buffer does not produce a continous sine wav. ie. when it starts a new set of 1024 samples it does not follow on from the old ones (for example with max and mins. 1 0 -1 0 ... [up to 1023 then] ... 0 (does not follow and does to something it shouldn't) 0 -1 ....). This sounded like a buffer problem to me, so I upped the number of buffers being used from 16 (as per the original code) to 300 which decreases the frequent occurance of these errors. Any ideas?
-- modified at 19:48 Wednesday 30th May, 2007
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
The very reason is what you stated.
ChrisMitchell_ wrote: the microphone I am trying to record off
Microphones have inherent distortion in them on most systems. Try using a line in or wave file as input instead of microphone.
Nothing is impossible, It's merely a matter of finding an answer to the question of HOW? ... And answering that question is usually the most difficult part of the job!!!
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Something seems severely wrong in this entire demo. I wouldn't recommend use this as a basis for anyone learning FFT or beat detection.
1) The code below does almost nothing since 'nAvg' is calculating the average of the same number 'fdraw[i]'. 2) 'i' is incremented in the loop statement AND in the body of the loop inexplicably 3) Even if you mute all audio input devices, you'll still get garbage.
for(int i=1;i < FFT_LEN/4;i++) { nAvg = 0; for (int n=0; n < nBars; n++) { nAvg += (int)fdraw[i]; } nAvg /= nBars; TRACE("Average for Bar#%d is %d\n",nCur++,nAvg); i+=nBars-1; }
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
I just wanted to say thanks for posting this project. I have only just started with C/C++ , and this project has been a great help .as I said before in my last post I wanted to get this to work with an audio file , its taken me a few weeks but I have now got a small program that runs in a dos box ,it reads and audio file and writes out the FFT data to a text file. this is something I have needed for some time but I had no way of knowing how to do it . but thanks to you and a few late nights I have learnt how to make it work. I still have a few small bits to sort out in my program but its well on its way to being 100% ready . I would not have been able to do it with out your waveinfft project to get me started so Fred thank you very much ...
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Very nice demo!
One question, though. In the code, you specify the number of channels is 1 (in both the PCMFORMAT and WAVEFORMATEX structs). However, you are still reading data from the both the left and right channels. Why did you not specify the number of channels as 2? What would have been the difference (if any) if 2 channels were specified?
Thanks.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
jakeparks wrote: Why did you not specify the number of channels as 2?
This was a minor oversite on my part. I corrected the issue, and made a few other changes, like changing the FFT size and correcting the spectrum display. I have submitted the code and it should be available in a few days.
Nothing is impossible, It's merely a matter of finding an answer to the question of HOW? ... And answering that question is usually the most difficult part of the job!!!
|
| Sign In·View Thread·PermaLink | 2.25/5 |
|
|
|
 |
|
 |
It seems true, however, that even if you specify the number of channels as 1, that the data from the left and right channels are interleaved. What then, if any, is the result of specifying the number of channels as 1? Does it really not matter at all?
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
The number of channels specified in the WAVEFORMATEX structure makes a difference. The is sent to the waveInOpen function to tell the input how to format the data that we will retrieve. If one channel is specified, the data returned in the WAVEHDR buffer is for one channel only, if two channels are specified, then the data returned is interleaved for two channels. Attempting to extract two channels from one channel data would cause an issue such as:
One channel data 10101010 Left extracted 1111 Right Extracted 0000
instead of: Two channel left 10101010 Two channel right 10101010
Nothing is impossible, It's merely a matter of finding an answer to the question of HOW? ... And answering that question is usually the most difficult part of the job!!!
|
| Sign In·View Thread·PermaLink | 2.00/5 |
|
|
|
 |
|
|