This article demonstrates how to generate a practically unlimited variety of sounds using math formulas. A sample program is also provided so that you can experiment on your own. Have fun!
A sound is basically a wave traveling through the air, and the simplest such wave is a sine wave. It is characterized by an amplitude and a frequency. The amplitude determines the loudness of the sound, and the frequency determines its pitch. If you are not familiar with this phenomenon, I suggest you start the sample program right now!
First, try some variants of “sin(x*t)”, beginning with x equal to 20 and increasing it progressively in steps of 20. In that expression, “x” represents the frequency in hertz and “t” is the time variable, which grows by 2*pi radians every second. With x = 20 you may hear nothing. If this is the case, don’t get mad; your speakers simply can’t reproduce such a low frequency. (By the way, this is a good way to test the quality of your speakers.) For reference, humans can generally hear sine waves between about 20 Hz and 20,000 Hz.
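As a rough illustration of what evaluating “sin(x*t)” sample by sample involves, the following sketch fills a buffer with a sine tone. The function name `makeSine`, its parameters, and the use of `std::vector` are assumptions made for this example; they are not taken from the sample program.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the sample program): returns `seconds`
// worth of samples of sin(x*t), where t grows by 2*pi per second.
std::vector<double> makeSine(double x, int sampleRate, double seconds)
{
    const double twoPi = 2.0 * std::acos(-1.0);   // 2*pi
    const double step = twoPi / sampleRate;       // growth of t per sample
    const std::size_t count = static_cast<std::size_t>(sampleRate * seconds);

    std::vector<double> samples(count);
    for (std::size_t s = 0; s < count; ++s)
    {
        const double t = step * static_cast<double>(s);
        samples[s] = std::sin(x * t);             // value in [-1, 1]
    }
    return samples;
}
```

For instance, `makeSine(440.0, 44100, 1.0)` would hold one second of a 440 Hz tone; the values still have to be rescaled to the sample range before they can be played.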
In the digital world, the sampling rate is the number of sound snapshots (called samples) per second that are used to reproduce a wave in the analog world. Hence, the higher the sampling rate, the better the sound fidelity. A fundamental theorem, the Nyquist theorem, states that the sampling rate must be more than twice the highest analog frequency in order to represent the original sound accurately. In other words, if you want to hear a 1000 Hz sine wave, the sampling rate must be above 2000 Hz. Since humans can hear sine waves up to about 20,000 Hz, this is the reason behind the CD-quality sampling rate of 44,100 Hz, which can represent frequencies up to 22,050 Hz (but of course, this is not optimal; professional equipment uses sampling rates of 48,000 Hz and higher).
Sound samples are represented with bits, typically 8 or 16 bits per sample. That value determines the minimum and the maximum value that a sound sample can take. If you exceed these limits, you will hear distorted sounds (here are those alien sounds!). You can try this by adding a second sine wave to the preceding test and turning the volume all the way up.
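The two limits described above can be sketched in a few lines. The helper names `nyquistLimit` and `toSample16` are hypothetical, and clamping is only one possible way to handle an out-of-range value; the article does not say how the sample program itself treats overflow.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: highest frequency (in Hz) that a given sampling
// rate can represent, according to the Nyquist theorem.
double nyquistLimit(int sampleRate)
{
    return sampleRate / 2.0;
}

// Hypothetical helper: converts a value in the nominal range [-1, 1] to a
// signed 16-bit sample. Values outside that range are clamped, which is
// the distortion (clipping) you hear when the signal is too loud.
std::int16_t toSample16(double value)
{
    const double scaled = value * 32767.0;
    const double clamped = std::min(32767.0, std::max(-32768.0, scaled));
    return static_cast<std::int16_t>(clamped);
}
```

With two full-volume sine waves added together, the sum can reach 2, and `toSample16(2.0)` yields the same value as `toSample16(1.0)`: the top of the wave is flattened.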
Adding two sounds
It may be confusing at first, but you add sounds simply by using the “+” operator! No fancy mathematics here. If you want to play two sine waves simultaneously, you just add their sine values like this: resultingSound(t) = sin(x*t) + sin(y*t), where x and y are the sound frequencies. Keep in mind that this sum can range from –2 to 2, so it may have to be scaled down to stay within the sample limits.
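A minimal sketch of such a mix, assuming each wave is scaled by 0.5 so that the sum cannot leave the range [-1, 1]. The function name `mixTwoSines` and the 0.5 factor are choices made for this example only.

```cpp
#include <cmath>

// Hypothetical helper: value at time t (in radians) of two sine waves of
// frequencies x and y played together, each at half volume.
double mixTwoSines(double x, double y, double t)
{
    return 0.5 * std::sin(x * t) + 0.5 * std::sin(y * t);
}
```

In the sample program the same effect would be obtained by typing a formula such as “0.5*sin(x*t) + 0.5*sin(y*t)”.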
Basic sine shape and other sound-generating functions
The general form of a sine wave is “amplitude*sin(frequency*(t+delay))”. Math formulas are evaluated using the math parser shown in this article. So, basic functions like sin, cos, tan, min, and max come with the sample application, but it is also very easy to define your own functions (read that article for more details).
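Written as an ordinary function, the general shape might look like the sketch below. The name `sineShape` is hypothetical, and the delay is assumed to be expressed in the same unit as t (radians).

```cpp
#include <cmath>

// Hypothetical helper: amplitude * sin(frequency * (t + delay)).
// The amplitude sets the loudness, the frequency sets the pitch, and the
// delay shifts the wave along the time axis.
double sineShape(double amplitude, double frequency, double delay, double t)
{
    return amplitude * std::sin(frequency * (t + delay));
}
```

One consequence of the delay term: with a frequency of 1 and a delay of pi/2, the shape starts at its maximum instead of at zero, which makes it a cosine wave.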
The Sample Program
The sample program is a little lab that allows you to test various sound shapes using math formulas. For convenience, you can split your sound into up to three independent sound shapes instead of putting everything together in one very large formula. Within a shape, you can still “mix” as many sounds as you want by using the “+” operator, as long as you scale them to avoid exceeding the amplitude limits.
Again for the sake of usability, you can use variables to simplify your sound shape formulas. The variables are named x, y, and z. There is also a special “system” variable named “t”, which is the time counter in radians.
The resulting wave is shown at the bottom of the sample application screen. You can see the effect of each sound shape individually by activating and deactivating them. You can also check whether the resulting sound is too loud and therefore distorted.
One last thing: the compute time indicator. Evaluating thousands of expressions per second can be very demanding for your computer. If your computer is too slow, the sound will not be played correctly. In this case, the time indicator will display a “Warning” message. You can reduce the number of evaluations per second by decreasing the sampling rate or by reducing the complexity of your math formulas.
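To get a feeling for the load, here is a back-of-the-envelope sketch. It assumes that every variable and every active sound shape is evaluated once per sample; the function name `evaluationsPerSecond` is hypothetical.

```cpp
// Hypothetical helper: number of formula evaluations needed per second,
// assuming each variable and each active shape is evaluated once per sample.
long evaluationsPerSecond(int sampleRate, int nbVars, int nbActiveShapes)
{
    return static_cast<long>(sampleRate) * (nbVars + nbActiveShapes);
}
```

Under that assumption, three variables and three active shapes at 44,100 Hz require 264,600 evaluations per second, which leaves a little under 4 microseconds per formula. Halving the sampling rate to 22,050 Hz halves the count to 132,300.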
A look at the code
There are three fundamental things here: playing sounds, evaluating math formulas, and mixing sounds.
To play sound, I use DirectSound. The mechanism is simple: create a sound buffer and a few notification events that will tell you when it is time to compute the next thousand sound samples and update the sound buffer. This code is executed in the
Next, when you need to compute the sound samples, the first thing to do is to evaluate the variables that will be used by the sound shape formulas. After that, you evaluate the sound shape formulas and add their values, scaling each one by its volume. Finally, you increment the time variable t and repeat the variable and sound shape evaluation until enough sound samples have been computed. The following code snippet shows how this is implemented:
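for (int s = 0; s < m_soundPlayerEventNbSamples; s++)
{
    sample = 0;
    // evaluate the user variables first; the shape formulas depend on them
    for (int v = 0; v < m_nbVars; v++)
        m_pUserVar[v] = m_pVarExpr[v].evaluate();
    // add up the active sound shapes
    for (int i = 0; i < m_nbShapes; i++)
    {
        if (m_pShapeActive[i])
        {
            // volume scaling
            sample += m_pShapeExpr[i].evaluate() * m_pShapeVolume[i];
        }
    }
    // rescaling to the sample range
    m_pSamples[s] = sample * m_sampleScaleVal;
    // (the time variable is then advanced by m_step; that statement is not part of this excerpt)
}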
There are some details to be aware of. First, the value of the sine function ranges from –1 to 1, so it must be rescaled to cover the entire sample range, which for 16-bit samples is –2^15 to 2^15 – 1. With 8-bit sound samples, this scale value would be different, because the range then spans only 2^8 values. Another thing to note is how the time is computed. Since the sin(x) function takes a value in radians as input, the time variable must grow by exactly 2*pi in one second in order to generate a 1 Hz sine wave. If the sin(x) function took its input in degrees instead of radians, the time variable would grow by 360 (degrees) each second. The variable m_step is the time increment per sample.
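The two constants discussed above can be derived as in the sketch below. The article does not show how the program actually computes m_step or m_sampleScaleVal, so the helper names `timeStep` and `sampleScale` and their formulas are assumptions that merely follow from the description.

```cpp
#include <cmath>

// Hypothetical helper: increment of the time variable per sample, chosen
// so that t grows by exactly 2*pi over one second of samples.
double timeStep(int sampleRate)
{
    return 2.0 * std::acos(-1.0) / sampleRate;
}

// Hypothetical helper: largest positive value of a signed sample with the
// given number of bits, usable as the factor that maps [-1, 1] onto the
// sample range.
double sampleScale(int bitsPerSample)
{
    return static_cast<double>((1L << (bitsPerSample - 1)) - 1);
}
```

At 44,100 Hz the step is 2*pi/44100, about 0.000142 radians per sample. The scale is 32767 for 16-bit samples and 127 for signed 8-bit samples.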
Because generating high-quality sound in real time is very CPU intensive, performance is a major issue. I ran the code profiler and found that the most demanding computation was the evaluation of the math expressions. So, I optimized the math parser and managed to save some precious milliseconds. You may notice that the math parser uses a recursive algorithm to evaluate expressions, and wonder whether this is a good idea since performance is an issue. I asked myself the same question, and I tried a non-recursive algorithm. The result was that while the non-recursive solution ran faster in debug mode (twice as fast), the recursive solution ran faster in release mode (by a couple of milliseconds). So, I concluded that the compiler could do a lot of optimizations with a recursive algorithm, and hence I kept it in the final version of the software.
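For readers who have not seen recursive evaluation before, here is a minimal sketch of the idea. It is not the article's math parser, which is not shown here and may be organised quite differently; the `Node` structure and the `evaluate` and `make` functions are hypothetical.

```cpp
#include <cmath>
#include <memory>
#include <utility>

// Hypothetical expression node for a tiny formula language.
struct Node
{
    enum Kind { Constant, Time, Add, Mul, Sin } kind;
    double value;                  // used by Constant only
    std::unique_ptr<Node> left;    // first operand (Add, Mul, Sin)
    std::unique_ptr<Node> right;   // second operand (Add, Mul)
};

// Recursive evaluation: a node first evaluates its operands, then itself.
double evaluate(const Node& n, double t)
{
    switch (n.kind)
    {
    case Node::Constant: return n.value;
    case Node::Time:     return t;
    case Node::Add:      return evaluate(*n.left, t) + evaluate(*n.right, t);
    case Node::Mul:      return evaluate(*n.left, t) * evaluate(*n.right, t);
    case Node::Sin:      return std::sin(evaluate(*n.left, t));
    }
    return 0.0; // not reached for valid nodes
}

// Convenience factory for building small trees by hand.
std::unique_ptr<Node> make(Node::Kind kind, double value = 0.0,
                           std::unique_ptr<Node> left = nullptr,
                           std::unique_ptr<Node> right = nullptr)
{
    return std::unique_ptr<Node>(
        new Node{kind, value, std::move(left), std::move(right)});
}
```

The formula “sin(2*t)” would be built as `make(Node::Sin, 0.0, make(Node::Mul, 0.0, make(Node::Constant, 2.0), make(Node::Time)))` and then evaluated once per sample with the current value of t. A non-recursive variant would typically flatten the tree into a list of operations executed on an explicit stack.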
The moral of this tale is that before optimizing for performance, you should always use a profiler to find where the bottlenecks are, and then use it again to validate that your changes actually improved performance.
June 17 - First draft.