Voice Command






Sep 26, 2005
An article on the Voice Command of speech recognition.
Introduction
The Voice Command Demo demonstrates simple speech recognition by displaying the commands it recognizes.
A speech recognition engine should be installed to run the program. You can download the Microsoft Speech Recognition Engine from here.
The Voice Command interface is the high-level interface for speech recognition. It is designed to provide command and control speech recognition for applications. With this interface, a user gives the computer simple commands, such as "Open the file", and can answer simple yes/no questions. Command and Control does not allow speech dictation.
The Voice Command design mimics a Windows menu in behavior, providing a "menu" of commands that users can speak. Basically, to use voice commands, an application designs a Voice menu that corresponds to a window or state within the application. Most programs will have one Voice menu for the main window and one for every dialog box. Contained within every Voice menu is a list of voice commands that users can say. When they say one, the application is notified which command was spoken. "Open a file" and "Send mail to <e-mail name>" are typical voice commands. Each voice command has information in addition to the spoken command, such as a description string and a command ID.
Voice commands allow the user to control an application by speaking commands through an audio input device rather than by using the mouse or keyboard, giving the user hands-free control of the application. Voice commands involve the use of an audio input device, such as a microphone or a telephone, a speech recognition engine, and a Voice menu. When the user speaks a command into the audio input device, the speech recognition engine attempts to transcribe the spoken input into text. If the engine succeeds, it compares the command text to that of the commands in the active Voice menus. (A Voice menu contains a list of commands to which an application can respond.) If the engine finds a matching command in a Voice menu, it notifies the application of the match, and the application carries out the command.
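The matching step described above can be sketched without any speech engine. The following is a minimal, portable model, not the Speech API itself: VoiceCommand, VoiceMenu, and DispatchRecognizedText are hypothetical names introduced only for illustration, and the exact, case-insensitive string comparison is an assumption about how a match might be decided.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one voice command: the spoken phrase plus the
// extra information the article mentions (a command ID and a description).
struct VoiceCommand {
    unsigned id;
    std::string phrase;
    std::string description;
};

// Hypothetical stand-in for a Voice menu tied to one window or state.
struct VoiceMenu {
    std::string state;
    bool active;
    std::vector<VoiceCommand> commands;
};

// Lower-case a copy of the text so that matching ignores letter case.
std::string ToLower(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Compare the transcribed text with every command in every active menu and
// notify the application of the first match. Returns false if nothing matched.
bool DispatchRecognizedText(
    const std::vector<VoiceMenu>& menus,
    const std::string& recognizedText,
    const std::function<void(const VoiceMenu&, const VoiceCommand&)>& notify) {
    const std::string wanted = ToLower(recognizedText);
    for (const VoiceMenu& menu : menus) {
        if (!menu.active) continue;  // inactive menus are never consulted
        for (const VoiceCommand& command : menu.commands) {
            if (ToLower(command.phrase) == wanted) {
                notify(menu, command);
                return true;
            }
        }
    }
    return false;
}
```

In this model, an application would keep one VoiceMenu per window or dialog box, mark only the menu for the current state as active, and carry out the command inside the notify callback.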
Why Use Command and Control?
In general, use Command and Control recognition when:
- It makes the application easier to use.
- It makes features in the application easier to get to.
- It makes the application more fun/realistic.
If an application uses speech recognition solely to impress people, it will work well for demos but will not be used by real users.
This sample program identifies a command spoken by the user from a set of commands, and displays it.
Requirements
Microphone
The user can choose between two kinds of microphone: either a close-talk or headset microphone that is held close to the mouth, or a medium-distance microphone that rests on the computer 30 to 60 centimeters away from the speaker. A headset microphone is needed for noisy environments.
Speech-recognition engine
Speech-recognition software must be installed on the user's system. Many new audio-enabled computers and sound cards are bundled with speech-recognition engines. As an alternative, many engine vendors offer retail packages for speech recognition or text-to-speech, and some license copies of their engines. If you don’t have one, you can download one from here.
Limitations
Currently, even the most sophisticated speech-recognition engines have limitations that affect what they can recognize and how accurate the recognition will be. These limitations may seem daunting, but a well-designed application can work around them.
Using the code
SpeechReg.cpp is the implementation file for the program.
- Initialize the application
To use voice commands, you need to create a Voice Command object, register your application with the object, and then create a Voice Menu object to manage your application's voice menus. You create a Voice Command object by calling the CoCreateInstance function with the CLSID_VCmd class identifier and the IID_IVoiceCmd interface identifier. You must create a separate Voice Command object for each site that your application needs to use. CoCreateInstance returns a pointer to the IVoiceCmd interface for the Voice Command object. Before it can perform other Voice Command tasks, an application must register itself by calling the IVoiceCmd::Register member function. Register specifies the site that the object represents and passes the address of the application's Voice Command notification interface to the Voice Command object.
After creating a Voice Command object and registering the application, you can use the IVoiceCmd::MenuCreate member function to open a voice menu and create a Voice Menu object to represent the menu. MenuCreate retrieves the address of the IVCmdMenu interface for the Voice Menu object. You can use the interface's member functions to manage the menu and its commands.
The following example shows how to create a Voice Command object, register an application, and create a Voice Menu object. The example creates a temporary Voice Menu object; that is, the object is not added to the Voice Menu database maintained by the Voice Command object.
The function initializes OLE, creates an instance of the Voice Command object, registers the application with the object, and creates a temporary Voice Menu object. It returns TRUE if successful or FALSE otherwise. The function uses the following global variables and constants:
gpIVoiceCommand -- address of the IVoiceCmd interface for the Voice Command object.
gpIVCmdDialogs -- address of the IVCmdDialogs interface for the Voice Command object.
gpIVCmdMenu -- address of the IVCmdMenu interface for the Voice Menu object.
The BeginOLE() function initializes OLE, creates the Voice Command object, registers with it, and creates a temporary menu.

    BOOL BeginOLE()
    {
        HRESULT hRes;
        VCMDNAME VcmdName;
        LANGUAGE Language;
        PCIVCmdNotifySink gpVCmdNotifySink = NULL;
        PIVCMDATTRIBUTES pIVCmdAttributes = NULL;

        SetMessageQueue(96);
        CoInitialize(NULL);

        // Create the voice commands object
        hRes = CoCreateInstance(CLSID_VCmd, NULL, CLSCTX_LOCAL_SERVER,
                                IID_IVoiceCmd, (LPVOID *)&gpIVoiceCommand);
        if (FAILED(hRes))
            return FALSE;   // without the object, the calls below would use a bad pointer

        // Get the dialogs interface pointer...
        hRes = gpIVoiceCommand->QueryInterface(IID_IVCmdDialogs,
                                               (LPVOID FAR *)&gpIVCmdDialogs);

        // Get the attributes interface pointer...
        // hRes = gpIVoiceCommand->QueryInterface( IID_IVCmdAttributes, (LPVOID FAR *)&gpIVCmdAttr );

        // Create/Register VCmd notification sink...
        gpVCmdNotifySink = new CIVCmdNotifySink;
        hRes = gpIVoiceCommand->Register("", gpVCmdNotifySink, IID_IVCmdNotifySink,
                                         VCMDRF_ALLMESSAGES, NULL);
        if (FAILED(hRes))
        {
            MessageBox(m_hwnd, "Error in registering", "Speech Reg", MB_OK);
            return FALSE;
        }

        // The following code checks for a navigator app and
        // checks the state of voice commands
        hRes = gpIVoiceCommand->QueryInterface(IID_IVCmdAttributes,
                                               (LPVOID FAR *)&pIVCmdAttributes);
        if (pIVCmdAttributes)
        {
            pIVCmdAttributes->EnabledSet(TRUE);
            pIVCmdAttributes->AwakeStateSet(TRUE);
            pIVCmdAttributes->Release();
        }

        // Initialize command menu set variables...
        lstrcpy(VcmdName.szApplication, "Speech Reg");
        lstrcpy(VcmdName.szState, "Main");
        Language.LanguageID = LANG_ENGLISH;
        lstrcpy(Language.szDialect, "US English");

        // Create an empty command menu set...
        hRes = gpIVoiceCommand->MenuCreate(&VcmdName, &Language,
                                           VCMDMC_CREATE_TEMP, &gpIVCmdMenu);
        if (FAILED(hRes))
        {
            MessageBox(m_hwnd, "Failed to create a voice command set with MenuCreate()",
                       "Speech Reg", MB_OK);
            return FALSE;
        }

        return TRUE;
    }
- Adding Commands to a Voice Menu
After you create a Voice Menu object, you can add commands to the menu by filling an array of VCMDCOMMAND structures, copying the address and size of the array into an SDATA structure, and passing the address of the SDATA structure to the IVCmdMenu::Add member function.
The example in this section shows how to add a new set of commands to a Voice Menu object. The example consists of three functions: UseCommands, GetCommands, and NextCommand.
The UseCommands function deactivates the Voice menu, replaces any existing commands in the menu with a new set, and reactivates the menu. One of the parameters to UseCommands is the address of a buffer containing the list of command strings to enter. UseCommands passes the address of the command-string buffer to the GetCommands function, along with the address of an SDATA structure.
The GetCommands function converts the buffer to an array of VCMDCOMMAND structures and copies the address and size of the array into the SDATA structure.
The NextCommand function is a helper routine that GetCommands uses to retrieve individual command strings from the command buffer passed to UseCommands.
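The buffer handling described for GetCommands and NextCommand can be illustrated with a small portable sketch. It is not the sample's actual code: SimpleCommand is a hypothetical, simplified stand-in for VCMDCOMMAND, the signatures are invented for illustration, and the sketch assumes the command buffer holds one command per line and that IDs are assigned sequentially starting at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for VCMDCOMMAND. Only an ID and the
// spoken phrase are kept here.
struct SimpleCommand {
    unsigned long id;
    std::string phrase;
};

// Counterpart of the NextCommand helper: copies the next non-empty line of
// `buffer`, starting at `pos`, into `out` and advances `pos` past it.
// Returns false once the buffer is exhausted.
bool NextCommand(const std::string& buffer, std::size_t& pos, std::string& out) {
    while (pos < buffer.size()) {
        std::size_t end = buffer.find('\n', pos);
        if (end == std::string::npos) end = buffer.size();
        std::string line = buffer.substr(pos, end - pos);
        pos = (end < buffer.size()) ? end + 1 : end;
        // Drop a trailing carriage return left by Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) {
            out = line;
            return true;
        }
    }
    return false;
}

// Counterpart of GetCommands: turns the buffer into a command array,
// numbering the commands 1, 2, 3, ... in the order they appear.
std::vector<SimpleCommand> GetCommands(const std::string& buffer) {
    std::vector<SimpleCommand> commands;
    std::size_t pos = 0;
    std::string phrase;
    while (NextCommand(buffer, pos, phrase)) {
        commands.push_back({static_cast<unsigned long>(commands.size() + 1), phrase});
    }
    return commands;
}
```

In the real program, the resulting array would still have to be converted to VCMDCOMMAND structures and described by an SDATA structure before being passed to IVCmdMenu::Add; the sketch covers only the parsing step.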
- Responding to Voice Command Notifications
The Voice Command object calls an application's IVCmdNotifySink interface to inform the application of Voice Command events so that the application can respond to them. To receive notifications, the application must create a COM object that supports the IVCmdNotifySink interface and must pass the address of the interface to the Voice Command object when calling the IVoiceCmd::Register function.
The IVCmdNotifySink interface consists of a set of member functions that correspond to Voice Command events. When an event occurs on the site that the application is using, the Voice Command object calls the member function that corresponds to the event.
The following example shows how to define an object class that implements the IVCmdNotifySink interface:

    class CIVCmdNotifySink : public IVCmdNotifySink
    {
    private:
        DWORD m_dwMsgCnt;
        HWND  m_hWnd;

    public:
        CIVCmdNotifySink(void);
        ~CIVCmdNotifySink(void);

        // IUnknown members
        STDMETHODIMP         QueryInterface(REFIID, LPVOID FAR *);
        STDMETHODIMP_(ULONG) AddRef(void);
        STDMETHODIMP_(ULONG) Release(void);

        // IVCmdNotifySink members
        STDMETHODIMP CommandRecognize(DWORD, PVCMDNAME, DWORD, DWORD, PVOID, DWORD, PSTR, PSTR);
        STDMETHODIMP CommandOther(PVCMDNAME, PSTR);
        STDMETHODIMP MenuActivate(PVCMDNAME, BOOL);
        STDMETHODIMP UtteranceBegin(void);
        STDMETHODIMP UtteranceEnd(void);
        STDMETHODIMP CommandStart(void);
        STDMETHODIMP VUMeter(WORD);
        STDMETHODIMP AttribChanged(DWORD);
        STDMETHODIMP Interference(DWORD);
    };

    typedef CIVCmdNotifySink * PCIVCmdNotifySink;
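How an application might react to a recognition event can be sketched in portable C++, leaving COM aside. The CommandSink class below is hypothetical and deliberately simplified: it does not derive from IVCmdNotifySink, it has no IUnknown members, and its two-argument CommandRecognize is an assumption made for illustration rather than the real eight-parameter signature. It shows only the idea of mapping a recognized command ID to an action.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical, COM-free stand-in for a notification sink. Only the
// recognition event is modelled.
class CommandSink {
public:
    using Handler = std::function<void(const std::string& phrase)>;

    // Associate a command ID with the action to run when it is recognized.
    void On(unsigned long commandId, Handler handler) {
        handlers_[commandId] = std::move(handler);
    }

    // Called by the (simulated) Voice Command object after a successful match.
    // Returns true if a handler was registered for the ID.
    bool CommandRecognize(unsigned long commandId, const std::string& phrase) {
        ++messageCount_;
        auto it = handlers_.find(commandId);
        if (it == handlers_.end()) return false;
        it->second(phrase);
        return true;
    }

    // Number of notifications received so far, handled or not.
    unsigned long MessageCount() const { return messageCount_; }

private:
    std::map<unsigned long, Handler> handlers_;
    unsigned long messageCount_ = 0;
};
```

A sample like the one in this article would typically do little more than display the recognized phrase in its handler. A table of handlers keyed by command ID keeps that display logic separate from the code that receives the notification.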