License: The Code Project Open License (CPOL)
Sound recording and encoding in MP3 format
By rtybase
An article describing the technique of recording sound from waveform-audio input devices and encoding it in MP3 format.
VC6, Win2K, WinXP, MFC, Dev
Have you ever tried to write something for recording sound from the sound card and encoding it in MP3 format? Not interesting? Well, to make things more interesting, have you ever tried to write an MP3 streaming internet radio server? I know, you'll say "What for? There are good and pretty much standard implementations like Icecast or SHOUTcast". But, anyway, have you ever tried, at least, to dig a bit inside this entire kitchen or write anything similar for your soul? Well, that's what this article is about. Of course, we won't manage to cover all topics in one article; that would get tiresome. So, I will split the entire topic into a few articles, this one covering the recording and encoding process.
Obviously, the first problem everyone encounters is the MP3 encoding itself. Writing an encoder that works properly isn't an easy task. So, I won't go too far and will stop at the LAME (SourceForge) encoder, considered one of the best (one, not the only!). I am using version 3.97; those interested in the sources, feel free to download them from SourceForge (it's an open source project). The relevant "lame_enc.dll" is also included in the demo project (see the links at the top of this article).
The next problem is recording the sound from the soundcard. Well, with some luck, on Google, MSDN, and CodeProject, you can find many articles related to this topic. I should say that I am using the low level waveform-audio API (see the Windows Media Platform SDK, e.g., waveInOpen(...), mixerOpen(...), etc.).
So, let's go with the details now.
Download the "mp3_stream_src.zip" file containing the sources (see the link to the sources at the top of this article). Inside it, you should find the "mp3_simple.h" file (see the INCLUDE folder after un-zipping). It contains the definition and implementation of the CMP3Simple class. This class is a wrapper of the LAME API, which I tried to design to make life a bit easier. I commented code as much as possible, and I hope those comments are good enough. All we need to know at this point:
When instantiating a CMP3Simple object, we need to define the bitrate at which to encode the sound samples, the expected frequency of the input samples, and (if re-sampling is necessary) the desired frequency of the encoded sound:
// Constructor of the class accepts only three parameters.
// Feel free to add more constructors with different parameters,
// if a better customization is necessary.
//
// nBitRate - says at what bitrate to encode the raw (PCM) sound
// (e.g. 16, 32, 40, 48, ... 64, ... 96, ... 128, etc), see
// official LAME documentation for accepted values.
//
// nInputSampleRate - expected input frequency of the raw (PCM) sound
// (e.g. 44100, 32000, 22050, etc), see official LAME documentation
// for accepted values.
//
// nOutSampleRate - requested frequency for the encoded/output
// (MP3) sound. If equal with zero, then sound is not
// re-sampled (nOutSampleRate = nInputSampleRate).
CMP3Simple(unsigned int nBitRate, unsigned int nInputSampleRate = 44100,
unsigned int nOutSampleRate = 0);
Encoding is performed by CMP3Simple::Encode(...).
// This method performs encoding.
//
// pSamples - pointer to the buffer containing raw (PCM) sound to be
// encoded. Mind that buffer must be an array of SHORT (16 bits PCM stereo
// sound, for mono 8 bits PCM sound better to double every byte to obtain
// 16 bits).
//
// nSamples - number of elements in "pSamples" (SHORT). Not to be confused
// with buffer size which represents (usually) volume in bytes. See
// also "MaxInBufferSize" method.
//
// pOutput - pointer to the buffer that will receive encoded (MP3) sound,
// here we have bytes already. LAME says that if pOutput is not
// cleaned before call, data in pOutput will be mixed with incoming
// data from pSamples.
//
// pdwOutput - pointer to a variable that will receive the
// number of bytes written to "pOutput". See also "MaxOutBufferSize"
// method.
BE_ERR Encode(PSHORT pSamples, DWORD nSamples, PBYTE pOutput,
PDWORD pdwOutput);
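The comments above distinguish the number of SHORT elements from the buffer size in bytes, and mention widening 8-bit sound to 16 bits. A minimal sketch of both points follows. The helper names SampleCountFromBytes and Widen8BitTo16Bit are hypothetical and are not part of CMP3Simple. The widening assumes the usual WAV convention that 8-bit PCM is unsigned with silence at 128, so it re-centers and scales each byte instead of literally doubling it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: how many 16-bit elements fit in a buffer of nBytes bytes.
// This is the value a caller would pass as "nSamples", not the byte count.
inline std::size_t SampleCountFromBytes(std::size_t nBytes) {
    return nBytes / sizeof(std::int16_t);
}

// Hypothetical helper: widen unsigned 8-bit PCM (silence = 128) to signed
// 16-bit PCM (silence = 0) by re-centering and scaling each sample.
inline std::vector<std::int16_t> Widen8BitTo16Bit(const std::vector<std::uint8_t>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::uint8_t b : in) {
        // (0 - 128) * 256 = -32768 and (255 - 128) * 256 = 32512, both fit in 16 bits.
        out.push_back(static_cast<std::int16_t>((static_cast<int>(b) - 128) * 256));
    }
    return out;
}
```

With these helpers, a 4096-byte buffer of 16-bit sound corresponds to 2048 elements, which is why the examples later in the article divide the recorded byte count by two.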
Similarly, after un-zipping the "mp3_stream_src.zip" file, inside the INCLUDE folder, you should find the "waveIN_simple.h" file. It contains the definitions and implementations for the CWaveINSimple, CMixer and CMixerLine classes. Those classes are wrappers for a sub-set of the waveform-audio API functions. Why just a sub-set? Because (I am lazy sometimes), they encapsulate only functionality associated with Wave In devices (recording). So, Wave Out devices (playback) are not captured (type "sndvol32 /r" from "Start->Run" to see what I mean). Check comments I added to each class to have a better picture of what they are doing. What we need to know at this point:
One CWaveINSimple device has one CMixer, which has zero or more CMixerLines.
Constructors of these classes are declared "private" (by design).
The CWaveINSimple class cannot be instantiated directly; for that, the CWaveINSimple::GetDevices() and CWaveINSimple::GetDevice(...) static methods are declared.
The CMixer class cannot be instantiated directly; for that, the CWaveINSimple::OpenMixer() method is declared.
The CMixerLine class cannot be instantiated directly; for that, the CMixer::GetLines() and CMixer::GetLine(...) methods are declared.
To receive the recorded sound, a class must extend the IReceiver abstract class and implement the IReceiver::ReceiveBuffer(...) method. Further, an instance of the IReceiver derivative is passed to CWaveINSimple via CWaveINSimple::Start(IReceiver *pReceiver).
// See CWaveINSimple::Start(IReceiver *pReceiver) below.
// Instances of any class extending "IReceiver" will be able
// to receive raw (PCM) sound from an instance of the CWaveINSimple
// and process sound via own implementation of the "ReceiveBuffer" method.
class IReceiver {
public:
    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) = 0;
};
...
class CWaveINSimple {
private:
    ...
    // This method starts recording sound from the
    // WaveIN device. Passed object (derivate from
    // IReceiver) will be responsible for further
    // processing of the sound data.
    void _Start(IReceiver *pReceiver);
    ...
public:
    ...
    // Wrapper of the _Start() method, for the multithreading
    // version. This is the actual starter.
    void Start(IReceiver *pReceiver);
    ...
};
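To illustrate the receiver idea without a sound card, here is a small portable sketch. Everything in the "sketch" namespace is a stand-in written for this illustration: the LPSTR and DWORD typedefs only mimic the Win32 types, the IReceiver copy mirrors the declaration above, and the PeakMeter class is hypothetical. It simply counts the bytes it is handed and tracks the loudest 16-bit sample, which is the kind of processing an IReceiver implementation is free to do.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {

// Portable stand-ins for the Win32 types used by the real header (assumption).
typedef char*         LPSTR;
typedef std::uint32_t DWORD;

// Mirror of the IReceiver interface shown above.
class IReceiver {
public:
    virtual ~IReceiver() {}
    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) = 0;
};

// Hypothetical receiver: counts received bytes and tracks the peak
// absolute value of the 16-bit samples it has seen so far.
class PeakMeter : public IReceiver {
public:
    PeakMeter() : m_totalBytes(0), m_peak(0) {}

    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) {
        m_totalBytes += dwBytesRecorded;
        const std::size_t n = dwBytesRecorded / sizeof(std::int16_t);
        for (std::size_t i = 0; i < n; ++i) {
            std::int16_t s;
            // memcpy avoids alignment assumptions about the byte buffer.
            std::memcpy(&s, lpData + i * sizeof(std::int16_t), sizeof(s));
            const int a = (s < 0) ? -static_cast<int>(s) : static_cast<int>(s);
            if (a > m_peak) m_peak = a;
        }
    }

    std::uint64_t TotalBytes() const { return m_totalBytes; }
    int Peak() const { return m_peak; }

private:
    std::uint64_t m_totalBytes;
    int m_peak;
};

} // namespace sketch
```

In the real code, an object like this would be handed to CWaveINSimple::Start(...), and ReceiveBuffer would then be called once for every recorded buffer.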
Let's see some examples. First, how to list all the Wave In devices installed in the system:
const vector<CWaveINSimple*>& wInDevices = CWaveINSimple::GetDevices();
UINT i;
for (i = 0; i < wInDevices.size(); i++) {
    printf("%s\n", wInDevices[i]->GetName());
}
How to list the lines of a Wave In device (supposing that strDeviceName = e.g., "SoundMAX Digital Audio")?
CWaveINSimple& WaveInDevice = CWaveINSimple::GetDevice(strDeviceName);
CHAR szName[MIXER_LONG_NAME_CHARS];
UINT j;
try {
    CMixer& mixer = WaveInDevice.OpenMixer();
    const vector<CMixerLine*>& mLines = mixer.GetLines();
    for (j = 0; j < mLines.size(); j++) {
        // Useful when the line's name contains non-English characters
        ::CharToOem(mLines[j]->GetName(), szName);
        printf("%s\n", szName);
    }
    mixer.Close();
}
catch (const char *err) {
    printf("%s\n",err);
}
How to record sound and encode it in MP3? First of all, we define a class like:
class mp3Writer: public IReceiver {
private:
    CMP3Simple m_mp3Enc;
    FILE *f;
public:
    mp3Writer(unsigned int bitrate = 128,
              unsigned int finalSampleRate = 0):
        m_mp3Enc(bitrate, 44100, finalSampleRate) {
        f = fopen("music.mp3", "wb");
        if (f == NULL) throw "Can't create MP3 file.";
    };
    ~mp3Writer() {
        fclose(f);
    };
    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) {
        BYTE mp3Out[44100 * 4];
        DWORD dwOut = 0;
        // Encode expects the number of SHORT elements, hence bytes / 2.
        m_mp3Enc.Encode((PSHORT) lpData, dwBytesRecorded/2,
                        mp3Out, &dwOut);
        // Skip the write when the encoder produced no output for this buffer.
        if (dwOut > 0) fwrite(mp3Out, dwOut, 1, f);
    };
};
and (supposing that strLineName = e.g., "Microphone"):
try {
    CWaveINSimple& device = CWaveINSimple::GetDevice(strDeviceName);
    CMixer& mixer = device.OpenMixer();
    CMixerLine& mixerline = mixer.GetLine(strLineName);
    mixerline.UnMute();
    mixerline.SetVolume(0);
    mixerline.Select();
    mixer.Close();
    mp3Writer *mp3Wr = new mp3Writer();
    device.Start((IReceiver *) mp3Wr);
    while( !_kbhit() ) ::Sleep(100);
    device.Stop();
    delete mp3Wr;
}
catch (const char *err) {
    printf("%s\n",err);
}
CWaveINSimple::CleanUp();
Remark 1: mixerline.SetVolume(0) is a pretty tricky point. For some sound cards, SetVolume(0) gives the original (good) sound quality; for others, SetVolume(100) does the same. However, you can find sound cards where SetVolume(15) gives the best quality. I have no good advice here, just try and check.
Remark 2: Almost every sound card supports a "Wave Out Mix" or "Stereo Mix" mixer line (the list is extensible). Recording from such a line (mixerline.Select()) will actually record everything going to the sound card's Wave Out (read "speakers"). So, leave WinAmp or Windows Media Player playing for a while, and start the application to record the sound at the same time; you'll see the result.
Remark 3: Rather than calling:
mp3Writer *mp3Wr = new mp3Writer();
it is also possible to create an instance of mp3Writer as follows (see the class definition above):
mp3Writer *mp3Wr = new mp3Writer(64, 32000);
This will produce a final MP3 at a 64 Kbps bitrate and a 32 kHz sample rate.
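To get a feel for what a given bitrate means in practice, the arithmetic can be sketched as below. The function names are hypothetical, and the estimate assumes constant-bitrate output with 1 Kbps = 1000 bits per second, ignoring frame padding and any tags, so treat the numbers as approximations only.

```cpp
#include <cstdint>

// Approximate bytes per second of constant-bitrate MP3 output
// (assumes 1 Kbps = 1000 bits per second).
inline std::uint64_t Mp3BytesPerSecond(unsigned kbps) {
    return static_cast<std::uint64_t>(kbps) * 1000 / 8;
}

// Bytes per second of raw PCM sound for the given format.
inline std::uint64_t PcmBytesPerSecond(unsigned sampleRateHz,
                                       unsigned channels,
                                       unsigned bitsPerSample) {
    return static_cast<std::uint64_t>(sampleRateHz) * channels * (bitsPerSample / 8);
}

// Approximate size of an MP3 recording of the given length.
inline std::uint64_t EstimateMp3Bytes(unsigned kbps, unsigned seconds) {
    return Mp3BytesPerSecond(kbps) * seconds;
}
```

For example, one minute at 64 Kbps comes to roughly 480,000 bytes, while one minute of raw 16-bit stereo sound at 44100 Hz takes 10,584,000 bytes. Note also that 44100 * 4, the size of the mp3Out buffer in the mp3Writer class, equals one second of raw 16-bit stereo sound at 44100 Hz.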
The demo application (see the links at the top of this article) is a console application supporting two command line options. Executing the application without specifying any of the command line options will simply print the usage guideline, e.g.:
...>mp3_stream.exe
mp3_stream.exe -devices
Will list WaveIN devices.
mp3_stream.exe -device=<device_name>
Will list recording lines of the WaveIN <device_name> device.
mp3_stream.exe -device=<device_name> -line=<line_name>
[-v=<volume>] [-br=<bitrate>] [-sr=<samplerate>]
Will record from the <line_name>
at the given voice <volume>, output <bitrate> (in Kbps)
and output <samplerate> (in Hz).
<volume>, <bitrate> and <samplerate> are optional parameters.
<volume> - integer value between (0..100), defaults to 0 if not set.
<bitrate> - integer value (16, 24, 32, .., 64, etc.),
defaults to 128 if not set.
<samplerate> - integer value (44100, 32000, 22050, etc.),
defaults to 44100 if not set.
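The options in the usage text above all follow the "-name=value" form. The demo's actual parsing code is not reproduced in this article, so the following is only a sketch of how such options could be recognized. ParseOption and ClampVolume are hypothetical names introduced for this illustration.

```cpp
#include <string>

// Hypothetical helper: if "arg" has the form "-<name>=<value>", store
// <value> in "value" and return true; otherwise leave "value" untouched.
inline bool ParseOption(const std::string& arg,
                        const std::string& name,
                        std::string& value) {
    const std::string prefix = "-" + name + "=";
    if (arg.compare(0, prefix.size(), prefix) != 0) return false;
    value = arg.substr(prefix.size());
    return true;
}

// Hypothetical helper: keep a requested volume inside the 0..100 range
// described in the usage text.
inline int ClampVolume(int volume) {
    if (volume < 0) return 0;
    if (volume > 100) return 100;
    return volume;
}
```

Because the whole argument after the "=" sign is taken as the value, a quoted argument such as "-device=Realtek AC97 Audio" yields the full device name, spaces included, and "-devices" is not mistaken for "-device=...".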
Executing the application with the "-devices" command line option will print the names of the Wave In devices currently installed in the system, e.g.:
...>mp3_stream.exe -devices
Realtek AC97 Audio
Executing the application with the "-device=<device_name>" command line option will list all the lines of the selected Wave In device, e.g.:
...>mp3_stream.exe "-device=Realtek AC97 Audio"
Mono Mix
Stereo Mix
Aux
TV Tuner Audio
CD Player
Line In
Microphone
Phone Line
Finally, the application will start recording (and encoding) sound from the selected Wave In device/line (microphone in this example) when executed with the following command line options:
...>mp3_stream.exe "-device=Realtek AC97 Audio" -line=Microphone
Recording at 128Kbps, 44100Hz
from Microphone (Realtek AC97 Audio).
Volume 0%.
hit <ENTER> to stop ...
Recorded and encoded sound is saved in the "music.mp3" file, in the same folder from where you executed the application.
If you want to record sound that is currently playing (e.g., AVI movie, or Video DVD, or ...) through the soundcard Wave Out, you can run the application with the following options:
...>mp3_stream.exe "-device=Realtek AC97 Audio" "-line=Stereo Mix"
However, this may be specific for my configuration only (also explained in the "Remark 2" above).
You can specify additional command line parameters, e.g.:
...>mp3_stream.exe "-device=Realtek AC97 Audio"
"-line=Stereo Mix" -v=100 -br=32 -sr=32000
This will set the line's volume to 100%, and will produce the final MP3 at 32 Kbps and 32 kHz.
In this article, I summarized a couple of months I spent investigating MP3 encoding APIs and recording (capturing, actually) sound going to the sound card's speakers. I used all these techniques for implementing an internet based radio station (MP3 streaming server). I found this topic very interesting, and decided to share some of my code. In one of my next articles, I will try to cover some of the aspects related to MP3 streaming and IO Completion Ports, but, until that time, I have to clean up the existing code, comment it, and prepare the article :).
Last Updated: 16 Nov 2006. Editor: Smitha Vijayan
Copyright 2006 by rtybase. Everything else Copyright © CodeProject, 1999-2010.