C++ Speech Recognition

14 Jun 2014, CPOL
Simple Speech Recognition using C++

Introduction

Speech recognition is a fascinating domain, but it is not an easy task. Today's software delivers only average accuracy, so you need to speak clearly and dictate precisely for it to recognize what you said. This project is a complete example of how to develop speech recognition using SAPI (the Microsoft Speech API). Make sure you have the SAPI SDK installed on your computer and that speech recognition is enabled.

Background

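Unlike many speech recognition implementations built on SAPI, this one does not need a static grammar resource loaded into the project; it loads SAPI's dictation grammar instead. The project still expects a dialog resource declared in resource.h: a dialog (IDD_DIALOG1) containing an edit control (IDC_EDIT1) and a control with the ID ID_START_RECOG that starts recognition. The code was kept simple and straightforward to help anyone who wants to develop speech recognition in C++.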

C++
#include <windows.h>
// sphelper.h calls GetVersionEx, which newer Windows SDKs mark as
// deprecated (C4996); silence the warning for this header only
#pragma warning(push)
#pragma warning(disable: 4996)
#include <sphelper.h>
#pragma warning(pop)
#include <string>
#include "resource.h"

// window message sent by SAPI when recognition events are waiting in the queue
#define WM_RECOEVENT    (WM_USER + 1)

INT_PTR CALLBACK DlgProc(HWND hWnd, UINT Message, WPARAM wParam, LPARAM lParam);
void LaunchRecognition(HWND hWnd);
void HandleEvent(HWND hWnd);
std::wstring ExtractInput(CSpEvent &event);
void CleanupSAPI();

CComPtr<ISpRecognizer> g_cpEngine;
CComPtr<ISpRecoContext> g_cpRecoCtx;
CComPtr<ISpRecoGrammar> g_cpRecoGrammar;
// text collected during the speech recognition process
std::wstring g_strBuffer;
// true once CoInitialize has succeeded and must be balanced by CoUninitialize
bool g_bComInitialised = false;

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd)
{
    // run the dialog; recognition starts when the user triggers ID_START_RECOG
    DialogBox(hInstance, MAKEINTRESOURCE(IDD_DIALOG1), NULL, DlgProc);
    return 0;
}

INT_PTR CALLBACK DlgProc(HWND hWnd, UINT Message, WPARAM wParam, LPARAM lParam)
{
    switch(Message)
    {
    case WM_RECOEVENT:
        HandleEvent(hWnd);
        break;
    case WM_COMMAND:
        switch(LOWORD(wParam))
        {
        case ID_START_RECOG:
            // LaunchRecognition reports failures by throwing std::string;
            // catch them here so no exception escapes the dialog procedure
            try
            {
                LaunchRecognition(hWnd);
            }
            catch(const std::string &strError)
            {
                CleanupSAPI();
                MessageBoxA(hWnd, strError.c_str(), "Speech Recognition", MB_ICONERROR);
            }
            break;
        }
        break;
    case WM_CLOSE:
        CleanupSAPI();
        EndDialog(hWnd, 0);
        break;
    default:
        return FALSE;
    }
    return TRUE;
}

void LaunchRecognition(HWND hWnd)
{
    // recognition is already running, nothing to do
    if(g_cpEngine)
    {
        return;
    }

    if(FAILED(::CoInitialize(NULL)))
    {
        throw std::string("Unable to initialise COM objects");
    }
    g_bComInitialised = true;

    const ULONGLONG ullGramId = 1;
    HRESULT hr = g_cpEngine.CoCreateInstance(CLSID_SpSharedRecognizer);
    if(FAILED(hr))
    {
        throw std::string("Unable to create recognition engine");
    }

    hr = g_cpEngine->CreateRecoContext(&g_cpRecoCtx);
    if(FAILED(hr))
    {
        throw std::string("Unable to create recognition context");
    }

    // ask SAPI to send WM_RECOEVENT to the dialog whenever events are queued
    hr = g_cpRecoCtx->SetNotifyWindowMessage( hWnd, WM_RECOEVENT, 0, 0 );
    if(FAILED(hr))
    {
        throw std::string("Unable to select notification window");
    }

    // events the context should queue for us
    const ULONGLONG ullInterest = SPFEI(SPEI_SOUND_START) | SPFEI(SPEI_SOUND_END) |
                                      SPFEI(SPEI_PHRASE_START) | SPFEI(SPEI_RECOGNITION) |
                                      SPFEI(SPEI_FALSE_RECOGNITION) | SPFEI(SPEI_HYPOTHESIS) |
                                      SPFEI(SPEI_INTERFERENCE) | SPFEI(SPEI_RECO_OTHER_CONTEXT) |
                                      SPFEI(SPEI_REQUEST_UI) | SPFEI(SPEI_RECO_STATE_CHANGE) |
                                      SPFEI(SPEI_PROPERTY_NUM_CHANGE) | SPFEI(SPEI_PROPERTY_STRING_CHANGE);
    hr = g_cpRecoCtx->SetInterest(ullInterest, ullInterest);
    if(FAILED(hr))
    {
        throw std::string("Unable to set event interest");
    }

    hr = g_cpRecoCtx->CreateGrammar(ullGramId, &g_cpRecoGrammar);
    if(FAILED(hr))
    {
        throw std::string("Unable to create grammar");
    }

    // load the built-in dictation grammar instead of a grammar resource
    hr = g_cpRecoGrammar->LoadDictation(NULL, SPLO_STATIC);
    if(FAILED(hr))
    {
        throw std::string("Failed to load dictation");
    }

    hr = g_cpRecoGrammar->SetDictationState(SPRS_ACTIVE);
    if(FAILED(hr))
    {
        throw std::string("Failed setting dictation state");
    }
}

void HandleEvent(HWND hWnd)
{
    // a late notification may arrive after CleanupSAPI has run
    if(!g_cpRecoCtx)
    {
        return;
    }

    CSpEvent event;

    // Loop processing events while there are any in the queue
    while(event.GetFrom(g_cpRecoCtx) == S_OK)
    {
        switch(event.eEventId)
        {
        case SPEI_HYPOTHESIS:
            {
                // append the hypothesis on its own line and refresh the edit box
                g_strBuffer += ExtractInput(event);
                g_strBuffer += L"\r\n";
                SetDlgItemTextW(hWnd, IDC_EDIT1, g_strBuffer.c_str());
            }
            break;
        }
    }
}

// The event is taken by reference: CSpEvent releases the result it refers to
// in its destructor, so a by-value copy would release that result twice.
std::wstring ExtractInput(CSpEvent &event)
{
    std::wstring strText;

    if(event.eEventId == SPEI_FALSE_RECOGNITION)
    {
        return L"False recognition";
    }

    CComPtr<ISpRecoResult> cpRecoResult = event.RecoResult();
    if(!cpRecoResult)
    {
        return strText;
    }

    // Get the phrase's entire text string, including replacements.
    WCHAR *pwszText = NULL;
    HRESULT hr = cpRecoResult->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE, TRUE, &pwszText, NULL);
    if(SUCCEEDED(hr) && pwszText != NULL)
    {
        strText = pwszText;
        // GetText allocates the string with CoTaskMemAlloc
        CoTaskMemFree(pwszText);
    }
    return strText;
}

void CleanupSAPI()
{
    if(g_cpRecoGrammar)
    {
        g_cpRecoGrammar.Release();
    }
    if(g_cpRecoCtx)
    {
        // stop notifications before releasing the context
        g_cpRecoCtx->SetNotifySink(NULL);
        g_cpRecoCtx.Release();
    }
    if(g_cpEngine)
    {
        g_cpEngine.Release();
    }
    // balance the CoInitialize call made in LaunchRecognition
    if(g_bComInitialised)
    {
        CoUninitialize();
        g_bComInitialised = false;
    }
}
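
Because the handler above appends every hypothesis to the same text buffer, followed by "\r\n", the text keeps growing for as long as the user speaks. The following is a minimal sketch of one way to cap it by discarding the oldest lines. The helper name AppendLine and its maxChars parameter are hypothetical and not part of the original project; it uses only the standard library, so it can be tried outside of a SAPI project.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original project): appends one line of
// recognised text plus "\r\n" to the transcript, then drops whole lines from
// the front until the transcript holds at most maxChars characters.
void AppendLine(std::wstring &transcript, const std::wstring &line, std::size_t maxChars)
{
    transcript += line;
    transcript += L"\r\n";

    while(transcript.size() > maxChars)
    {
        // locate the end of the oldest line
        const std::size_t pos = transcript.find(L"\r\n");
        if(pos == std::wstring::npos)
        {
            // no line break left to cut at, so start over with an empty transcript
            transcript.clear();
            break;
        }
        // remove the oldest line together with its line break
        transcript.erase(0, pos + 2);
    }
}
```

Called from the hypothesis handler in place of the direct append, for example with a limit of a few thousand characters, this would keep the edit box showing only the most recent lines. A single line longer than the limit is dropped entirely, which is a simplification a real project may want to handle differently.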

Points of Interest

Speech recognition with SAPI is not very easy, and few C++ code samples are available for it.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Help desk / Support, Gexel Telecom
Canada
I have been programming in C and C++ for more than four years. I first learned programming in 1999 in college, but it was only in 2000, when I bought my first computer, that I truly started doing more interesting things in programming. As a programmer, my main interest is A.I. programming, so I am captivated by everything related to N.L.U. (Natural Language Understanding), N.L.P. (Natural Language Processing), artificial neural networks, etc. Currently I am learning to program in Prolog and Lisp. I am also fascinated by the original chatterbot program, Eliza, which was written by Joseph Weizenbaum. Every time I run this program, it makes me think that A.I. could be solved one day. A lot of interesting work has been accomplished in the domain of artificial intelligence in past years. A very good example is logic programming, which makes it possible to manipulate logic statements and to draw inferences from them. A classical example: given that "Every man is mortal" and that Socrates is a man, we can logically deduce that Socrates is mortal. Such simple logical statements can be written in Prolog in just a few lines of code:

Prolog code sample:

mortal(X):- man(X). % rule
man(socrates). % declaring a fact

The preceding Prolog rule can be read: for every X, if X is a man then X is mortal. This Prolog sample can easily be extended by adding more facts or rules, for example:
mortal(X):- man(X). % rule
mortal(X):- woman(X). % rule
man(socrates). % fact 1
man(adam). % fact 2
woman(eve). % fact 3

For more, see https://cenelia7.wixsite.com/programming and ai-programming.blogspot.com

