Beginner
License: The Code Project Open License (CPOL)
MFC Recognizes C&C Speeches Easily !
By aljodav
MFC C&C speech recognition in 4 code lines using Microsoft Speech Object Library
C++ (VC7.1, VC8.0), Windows (Win2K, WinXP, Win2003, Vista), Visual Studio (VS.NET2003, VS2005), MFC, Dev
Browsing the Microsoft Speech SDK Help, I didn't find any MFC example doing speech recognition; in fact, I found no MFC example doing anything. After looking at some C++ examples and some VB ones, I concluded that it is possible to build a reasonably simple and effective MFC class wrapping a small part of the speech recognition (SR) engine. This way, MFC applications can easily offer speech recognition to their users. I also devised a new mode (querying mode) for dealing with the SR engine, which gives more programming freedom. Below is one way I'd like to do speech recognition (querying mode) easily:
CEasyCCRecognition eccr(TRUE); // object instance (querying mode)
// for MFC programmers;
// SR engine is immediately started;
// SR will not send WMs.
eccr.Xml = _T("Words.xml"); // instructs SR engine what to listen to.
eccr.IsListening = TRUE; // now, SR engine is listening...
Now the SR engine is started, listening to your speech, and trying to recognize it according to the instructions inside the XML disc file. Speak something consistent with the XML disc file into the default audio input (the microphone, I hope). Knowing what text was recognized is simple:
CString str = eccr.TextRecognized;
The CString object now holds the recognized text (or is empty if nothing was recognized). As you can see, these 4 code lines keep COM programming out of the main application path.
Would you like to do speech recognition this way? It is MFC C&C speech recognition in 4 code lines using Microsoft Speech Object Library! If your answer is yes, read on... otherwise, read a few more lines anyway...
Using the CEasyCCRecognition class is simple: no SDK download is needed, and plain MFC/C++ is used without reference to any element of any SDK! But be aware that Microsoft Speech Object Library version 5.1 (or newer) must already be installed on the computer running the CEasyCCRecognition class code!
Don't forget to train the SR engine to your voice (use the Control Panel to do this).
After reading this article (or even before), you may want to read the clean and neat article MFC 'shockwaveflashes' easily ! and also the cool and smart and ... article MFC speaks easily !
No special background is needed. You only need to be able to build simple projects. Of course, you must also be able to download and install Microsoft Speech Object Library version 5.1 (or newer) in case you need to!
Microsoft Speech Object Library does speech recognition in two ways: Dictation and Command and Control (C&C). CEasyCCRecognition class is a simple (but effective) class that wraps a small part of the speech recognition COM objects, and uses C&C to provide any MFC application, easily, with speech recognition, following instructions the SR engine receives from an xml disc file.
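To give a feel for what such an XML file might contain, here is a minimal sketch that builds a small word-list grammar as a string. It assumes the SAPI 5.1 text grammar format (GRAMMAR, RULE, L and P tags); the helper name BuildWordGrammar is hypothetical and is not part of CEasyCCRecognition, and the actual Words.xml shipped with the demo may be organized differently.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of CEasyCCRecognition): builds a small
// C&C grammar in what is assumed to be the SAPI 5.1 XML grammar format.
// Words are assumed to contain no XML special characters (<, >, &).
std::string BuildWordGrammar(const std::vector<std::string>& words)
{
    std::string xml;
    xml += "<GRAMMAR LANGID=\"409\">\n";                     // 409 = US English
    xml += "  <RULE NAME=\"Words\" TOPLEVEL=\"ACTIVE\">\n";  // active top-level rule
    xml += "    <L>\n";                                      // list of alternatives
    for (const std::string& w : words)
        xml += "      <P>" + w + "</P>\n";                   // one phrase per entry
    xml += "    </L>\n";
    xml += "  </RULE>\n";
    xml += "</GRAMMAR>\n";
    return xml;
}
```

Writing the returned string to a file such as Words.xml would give the SR engine a short list of alternatives to listen for, which is the essence of Command and Control recognition as opposed to Dictation.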
You can do speech recognition in just 4 steps:
Add this to the include directives at the top of the page:
#include "EasyCCRecognition.h" // zipped in
// MFCRcgnzsCCSpeechesEasily_src.zip.
CEasyCCRecognition g_eccr(TRUE); // start SR engine now; do not send WM's
In the C...Dlg::OnInitDialog() override, include:
g_eccr.Xml = _T("Words.xml"); // there's a copy in
// MFCRcgnzsCCSpeechesEasily_demo.zip
g_eccr.IsListening = TRUE; // start listening now.
SetTimer(1, 250, NULL);
Handle WM_TIMER and include this in the handler:
CString str = g_eccr.TextRecognized;
if (!str.IsEmpty())
((CListBox*)GetDlgItem(IDC_LIST1))->InsertString(0, str);
As you can see, that is just 4 steps. Build and run. Now speak a recognizable word (e.g., many, speeches, numbers, etc.) into the default audio input (the microphone, I hope) to have the recognized text shown in the list box.
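The timer handler above adds to the list box only when the string is not empty, which suggests that each recognition is handed over once. The sketch below imitates that polling pattern with a stand-in class and no speech engine at all. FakeRecognizer, TakeTextRecognized and OnTimerTick are hypothetical names, and the clear-on-read behavior is an assumption about the class, not something the article states.

```cpp
#include <string>
#include <vector>

// Stand-in for the recognizer; this is NOT the real CEasyCCRecognition.
// Assumption: reading the recognized text also clears it, so each
// recognition is reported only once.
class FakeRecognizer {
public:
    // Simulates the engine recognizing a word or phrase.
    void SimulateRecognition(const std::string& text) { pending_ = text; }

    // Returns the pending text (empty if none) and clears it.
    std::string TakeTextRecognized()
    {
        std::string out;
        out.swap(pending_);
        return out;
    }

private:
    std::string pending_;
};

// What a WM_TIMER handler would do on each tick: query the recognizer
// and record the text only when something was actually recognized.
void OnTimerTick(FakeRecognizer& sr, std::vector<std::string>& listBox)
{
    const std::string str = sr.TakeTextRecognized();
    if (!str.empty())
        listBox.insert(listBox.begin(), str);  // newest first, like InsertString(0, ...)
}
```

With a 250 ms timer the handler runs about four times per second, so ticks that find nothing pending must be cheap and must leave the list untouched, as they do here.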
CEasyCCRecognition class can be used in one of two modes:
WM mode: the SR engine is instructed to send a WM (Windows message) every time it recognizes some text, e.g.:
CEasyCCRecognition eccr; // one possible object instance
// for MFC programmers;
// SR engine is not started.
eccr.Xml = _T("Words.xml"); // instructs SR engine what to listen to.
eccr.MsgID = WM_ICHOSETHISONE; // optional; if not specified,
// WM_APP is used for MsgID.
eccr.Start = GetSafeHwnd(); // destination window for the WMs.
eccr.IsListening = TRUE; // now, SR engine is listening...
Now, whenever the SR engine recognizes some text, it will send a WM to the destination window, where it can be caught by:
ON_MESSAGE(WM_ICHOSETHISONE, OnTextRecognized) // or WM_APP if property
// MsgID was not used
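The push-style flow of this mode can be imitated without Windows or a speech engine. The sketch below uses stand-in classes: FakeWindow plays the role of a window with a message map, and FakePushRecognizer plays the role of the recognizer. All names here are hypothetical, the value chosen for WM_ICHOSETHISONE is only an example, and having the handler read the text back from the recognizer is an assumption about how the real handler obtains it.

```cpp
#include <functional>
#include <map>
#include <string>

const unsigned kWmApp = 0x8000;                // value of WM_APP in the Windows headers
const unsigned kWmIChoseThisOne = kWmApp + 1;  // example value for WM_ICHOSETHISONE

// Stand-in for a window with a message map: message ID -> handler.
class FakeWindow {
public:
    void OnMessage(unsigned id, std::function<void()> handler) { handlers_[id] = handler; }

    // Delivers a message; returns false when no handler is registered.
    bool Send(unsigned id)
    {
        auto it = handlers_.find(id);
        if (it == handlers_.end()) return false;
        it->second();
        return true;
    }

private:
    std::map<unsigned, std::function<void()>> handlers_;
};

// Stand-in for the recognizer in WM mode (not the real class).
class FakePushRecognizer {
public:
    void SetMsgID(unsigned id) { msgId_ = id; }
    void Start(FakeWindow* target) { target_ = target; }
    const std::string& TextRecognized() const { return text_; }

    // Simulates a recognition: store the text, then notify the window.
    bool SimulateRecognition(const std::string& text)
    {
        text_ = text;
        return target_ != nullptr && target_->Send(msgId_);
    }

private:
    unsigned msgId_ = kWmApp;        // default used when MsgID is never set
    FakeWindow* target_ = nullptr;
    std::string text_;
};
```

The point of the sketch is the ordering: the message ID is chosen first, the destination window second, and only then do recognitions reach the handler. A message sent with an ID that has no map entry is simply not handled.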
Querying mode: the SR engine is instructed not to send any WM. The application has to query it periodically, e.g.:
CEasyCCRecognition eccr(TRUE); // one possible object instance
// for MFC programmers;
// SR engine is started immediately,
// and doesn't send WM's.
eccr.Xml = _T("Words.xml"); // instructs SR engine what to listen to.
eccr.IsListening = TRUE; // now, SR engine is listening...
Now, knowing what text was recognized is a matter of querying the SR engine, periodically, e.g.:
CString str = eccr.TextRecognized;
I should mention the property IsStarted: if IsStarted returns FALSE after property Start has been used, it means that Microsoft Speech Object Library must be installed with (or updated to) version 5.1 or newer!
The Microsoft C++ extension __declspec(property) makes the CEasyCCRecognition class simpler to use; see the properties:
BOOL IsStarted; // get only
// Gets the SR engine starting state;
// initially it is FALSE. If it remains FALSE after
// property Start is used, it means that
// Microsoft Speech Object Library version 5.1
// (or newer) is not present on the computer.
BOOL IsListening; // get/put
// Gets or sets the SR engine listening state;
// initially it is FALSE.
UINT MsgID; // put only
// Defines the WM ID the SR engine shall use when sending
// messages to the application; if not used,
// the default WM_APP is used.
HWND Start; // get/put
// Get: starts the SR engine in the querying mode;
// MsgID is ignored and WM's are not sent.
// Put: starts the SR engine in the WM mode,
// using the MsgID defined previously.
// It defines the HWND where messages shall be sent to.
const LPCTSTR Xml; // put only
// Defines, through an XML disc file,
// what words/phrases the SR engine
// shall recognize.
const LPCTSTR Recognize; // put only
// Defines, through a text buffer, what words
// the SR engine shall recognize. An XML disc
// file with the tag TEXTBUFFER must have
// previously instructed the SR engine to
// prepare itself to do this. After that, the
// property Recognize can be used at any time
// to change the words that shall be recognized.
const CString& TextRecognized; // get only
// Retrieves the text recognized by
// the SR engine.
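For readers who have not met the property extension before, here is a minimal sketch of how a get/put property such as IsListening can be declared. The class and accessor names (ListeningFlag, GetIsListening, PutIsListening) are hypothetical, since the article does not show the real accessors. The property declaration is guarded so that the sketch also compiles with compilers that lack the Microsoft extension.

```cpp
// Plain accessors; these work with any C++ compiler.
class ListeningFlag {
public:
    bool GetIsListening() const { return listening_; }
    void PutIsListening(bool on) { listening_ = on; }

#if defined(_MSC_VER)
    // Microsoft extension: 'IsListening' looks like a data member, but
    // reads call GetIsListening() and writes call PutIsListening().
    __declspec(property(get = GetIsListening, put = PutIsListening)) bool IsListening;
#endif

private:
    bool listening_ = false;
};

// Toggles the flag through whichever syntax the compiler supports
// and returns the new state.
inline bool ToggleListening(ListeningFlag& f)
{
#if defined(_MSC_VER)
    f.IsListening = !f.IsListening;         // property syntax
    return f.IsListening;
#else
    f.PutIsListening(!f.GetIsListening());  // equivalent accessor calls
    return f.GetIsListening();
#endif
}
```

A "get only" or "put only" property, as in the list above, would presumably name just one of the two accessors in the declaration, so the compiler rejects the unsupported direction.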
I have provided two examples for each mode. The third is an extension of the second one, and the fourth makes use of property Recognize:
The first application uses WM to receive messages when some text is recognized. Four XML disc files are available to instruct the SR engine. Each XML disc file specifies some words/phrases to be recognized, as well as the names of the XML disc files. Each time a word/phrase is recognized, it is shown in a list box; if it is the name of a disc file, that file is opened and made active. When you want to exit the application (you shouldn't; this app is very nice!), just say "exit" into the default audio input (the microphone?). If Microsoft Speech Object Library version 5.1 or newer is not registered, a message box will say so.
The second application queries the SR engine 5 times per second for any recognized text. An XML disc file specifies which words/phrases are to be recognized. This application can be a starting point for a game's frame window. When you want to exit the application (you shouldn't; this app is also very nice!), just say "exit" into the default audio input (it has to be the microphone...). If Microsoft Speech Object Library version 5.1 or newer is not registered, the application will silently exit immediately after it starts. The application's drawing is simple and not persistent.
The third application is an extension of the second one. The dreadful and terrible Komm Pew Tehr menaces the world... the UN and the G7, united, have chosen you... you have to battle him on the world's seas, but... be careful... no one may hit any animal... Grean Peace is around!... All you have to do is speak the coordinates where a launched missile will hit! Building this application requires EasySpeech.h and EasyShockwaveflash.h (the demo is already built); the first one is from the article MFC speaks easily !, and the other is from the article MFC 'shockwaveflashes' easily ! Saying "restart" into the microphone will restart the application; when you want to exit (don't; this app is the nicest one!), just say "exit". If Microsoft Speech Object Library version 5.1 or newer is not registered, the application will silently exit immediately after it starts. The application's drawing is simple and not persistent.
The fourth application uses WM to receive messages when some text is recognized; it uses property Recognize. Initially, an XML disc file with the tag <TEXTBUFFER /> instructs the SR engine. An edit control holds the text (punctuation too) entered by the user at run time; this is the text to be recognized once the 'Enter' key is pressed. Each time this key is pressed, the text buffer is put into the property Recognize. The SR engine then automatically considers this new group of words and discards the old one. As the words to be recognized are known only to the user and to the SR engine, the user's privacy is favored (nothing is written to disc; a spoken password can be implemented).
MFC, D3DDevice and Tiny X (changing clothes), just walkin'... easily !
You can use it in any free, non-commercial product with the following citation (maybe in your 'About' dialog box):
CEasyCCRecognition class code © aljodav, from the article :
www.codeproject.com/KB/audio-video/MFCRcgnzsCCSpeechesEasily.aspx
MFC can be used to wrap COM objects easily, even when they are not very MFC-friendly, as in this case. The CEasyCCRecognition class shields MFC programmers from the COM rules (and mantras) and wraps them in a simple way; despite that simplicity, it opens a whole world of possibilities for MFC programmers.
Here's the link to the Full SDK and language packs.
... I was Googling for "Komm Pew Tehr" (sic) ... and I got ...
Recognize, and example (v.4)
Last Updated: 24 Jun 2007 Editor: Deeksha Shenoy
Copyright 2007 by aljodav. Everything else Copyright © CodeProject, 1999-2009