
Converting Text to Speech with Mouth Motion Animation

27 Nov 2001
This program shows how to convert text to speech and animate a mouth in time with it.

Introduction

Parts of this code are based on James Matthews' original article Visemes: Representing Mouth Positions.

Figure 1: GUI of application program

This project shows how to convert text into speech accompanied by mouth motion. Before you build this project, I suggest you read my article "Simple Program for Text to Speech Using SAPI". These are the steps for building the project:

  1. New Application

  2. Setting SAPI on project

  3. Building GUI of project

  4. Coding

  5. Testing

1. New Application

Create your application with MFC AppWizard (exe) and type SpeakMouth as the project name. Click the OK button. In Step 1 of the MFC AppWizard, choose Dialog based and click the Finish button.

Figure 2: New application.

Figure 3: Dialog-based application. Click the Finish button.

2. Setting SAPI on Project

Set up SAPI in this project as described in the article "Simple Program for Text to Speech Using SAPI".

3. Building GUI of project

Figure 4: GUI Model

You can build and design the GUI as shown in figure 4. These are the IDs and properties of the GUI components:

No  Component  ID             Variable added with ClassWizard
1   Picture    IDC_MOUTH_IMG  CStatic m_cMouth
2   Group Box  IDC_STATIC     -
3   Edit Box   IDC_TEXT       CString m_sText
4   Button     IDC_SPEAK_BTN  -

The GUI model of this project is shown in figure 4. After you build the GUI, import the bitmap files that contain the mouth graphics into the Resource View of the project (figure 5). Choose the bitmap files in the folder res (figure 6). I have put the mouth-shape bitmap files in the folder res of this project, and you can also find them in your Speech SDK. These are the bitmap files you must import:

  • mic eyes closed.bmp
  • mic.bmp
  • mic_eyes_narrow.bmp
  • mic_mouth_2.bmp
  • mic_mouth_3.bmp
  • mic_mouth_4.bmp
  • mic_mouth_5.bmp
  • mic_mouth_6.bmp
  • mic_mouth_7.bmp
  • mic_mouth_8.bmp
  • mic_mouth_9.bmp
  • mic_mouth_10.bmp
  • mic_mouth_11.bmp
  • mic_mouth_12.bmp
  • mic_mouth_13.bmp

Figure 5: Importing a bitmap file into the project

Figure 6: Choosing the bitmap file

After you have added all the bitmap files to the Resource View of the project, you must also set their properties, i.e. assign an ID to each bitmap. Right-click each bitmap (figure 7) to open a window like the one in figure 8. These are the IDs of the bitmap files:

  Bitmap file  ID
  mic eyes closed.bmp  IDB_MICEYESCLO
  mic.bmp  IDB_MICFULL
  mic_eyes_narrow.bmp  IDB_MICEYESNAR
  mic_mouth_2.bmp  IDB_MICMOUTH2
  mic_mouth_3.bmp  IDB_MICMOUTH3
  mic_mouth_4.bmp  IDB_MICMOUTH4
  mic_mouth_5.bmp  IDB_MICMOUTH5
  mic_mouth_6.bmp  IDB_MICMOUTH6
  mic_mouth_7.bmp  IDB_MICMOUTH7
  mic_mouth_8.bmp  IDB_MICMOUTH8
  mic_mouth_9.bmp  IDB_MICMOUTH9
  mic_mouth_10.bmp  IDB_MICMOUTH10
  mic_mouth_11.bmp  IDB_MICMOUTH11
  mic_mouth_12.bmp  IDB_MICMOUTH12
  mic_mouth_13.bmp  IDB_MICMOUTH13

Figure 7: Setting properties of file bitmap

Figure 8: Properties of file bitmap

4. Coding

There are three classes in this project (figure 9):

  • CAboutDlg
  • CSpeakMouthApp
  • CSpeakMouthDlg

Figure 9: Classes of project

Use ClassWizard to add a variable to the class CSpeakMouthDlg: type m_sError as the variable name, with type CString and protected access.

These are the coding steps:

  1. In the file StdAfx.h, add these lines to get access to the SAPI COM interfaces:

    #include <atlbase.h>
    extern CComModule _Module;
    #include <atlcom.h>
    
  2. For the class CSpeakMouthDlg, add these lines at the top of the file SpeakMouthDlg.h:

    #include <sapi.h>
    #include <sphelper.h>

    // CONSTANTS OF MOUTH
    #define CHARACTER_WIDTH     128
    #define CHARACTER_HEIGHT    128
    #define WEYESNAR            14 // eye positions
    #define WEYESCLO            15

    In addition, add these lines at the top of the file SpeakMouthDlg.cpp:

    // Parenthesized so the macro expands safely inside larger expressions
    #define WM_RECOEVENT    (WM_USER + 101)

    /////////////////////////////////////////////////////////
    // Mouth Mapping Array (from Microsoft's TTSApp Example)
    const int g_iMapVisemeToImage[22] = {
        0,  // SP_VISEME_0 = 0,    // Silence
        11, // SP_VISEME_1,        // AE, AX, AH
        11, // SP_VISEME_2,        // AA
        11, // SP_VISEME_3,        // AO
        10, // SP_VISEME_4,        // EY, EH, UH
        11, // SP_VISEME_5,        // ER
        9,  // SP_VISEME_6,        // y, IY, IH, IX
        2,  // SP_VISEME_7,        // w, UW
        13, // SP_VISEME_8,        // OW
        9,  // SP_VISEME_9,        // AW
        12, // SP_VISEME_10,       // OY
        11, // SP_VISEME_11,       // AY
        9,  // SP_VISEME_12,       // h
        3,  // SP_VISEME_13,       // r
        6,  // SP_VISEME_14,       // l
        7,  // SP_VISEME_15,       // s, z
        8,  // SP_VISEME_16,       // SH, CH, JH, ZH
        5,  // SP_VISEME_17,       // TH, DH
        4,  // SP_VISEME_18,       // f, v
        7,  // SP_VISEME_19,       // d, t, n
        9,  // SP_VISEME_20,       // k, g, NG
        1   // SP_VISEME_21,       // p, b, m
    };
    ////////////////////////////////////////////////////////////
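The mapping array is indexed directly with the viseme ID reported by SAPI. As a minimal sketch in portable C++ (no SAPI or MFC needed), the lookup can be wrapped in a bounds-checked helper. The names kVisemeToImage and ImageForViseme are assumptions made for this illustration and are not part of the project; the table values are copied from g_iMapVisemeToImage.

```cpp
#include <cassert>
#include <cstddef>

// Same 22 values as g_iMapVisemeToImage, one entry per viseme ID 0..21.
const int kVisemeToImage[22] = {
    0, 11, 11, 11, 10, 11, 9, 2, 13, 9, 12,
    11, 9, 3, 6, 7, 8, 5, 4, 7, 9, 1};

const std::size_t kVisemeCount =
    sizeof(kVisemeToImage) / sizeof(kVisemeToImage[0]);

static_assert(sizeof(kVisemeToImage) / sizeof(kVisemeToImage[0]) == 22,
              "one table entry is expected for each viseme ID");

// Hypothetical helper: returns the image index for a viseme ID, or 0
// (the entry used for silence) when the ID lies outside the table.
inline int ImageForViseme(int viseme) {
    if (viseme < 0 || static_cast<std::size_t>(viseme) >= kVisemeCount) {
        return 0;
    }
    return kVisemeToImage[viseme];
}
```

With such a helper, an unexpected viseme value falls back to the silence image instead of reading past the end of the array.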
    
    
  3. In the class CSpeakMouthDlg, add three member methods (right-click the class and choose Add Member Method): InitMouthImageList(), InitializationSAPI() and DestroySAPI(). Each one returns BOOL and is protected. Also add the member variables shown below. The resulting declarations are:

    CString m_sError;   // already present if you added it with ClassWizard
    BOOL InitMouthImageList();
    BOOL DestroySAPI();
    BOOL InitializationSAPI();

    ///////////////////////////////////////////////////////
    // Speech API Variables
    CComPtr<ISpVoice> IpVoice;   // the SAPI voice
    CImageList m_cMouthList;     // mouth and eye images
    int m_iMouthBmp;             // index of the current mouth image
    CRect m_cMouthRect;          // area of the dialog to repaint
    ///////////////////////////////////////////////////////
    
    
  4. Now implement the methods you have created, together with OnDestroy(), the handler for WM_DESTROY:

    void CSpeakMouthDlg::OnDestroy()
    {
        // Release the voice before the dialog window goes away
        DestroySAPI();
        CDialog::OnDestroy();
    }

    BOOL CSpeakMouthDlg::InitializationSAPI()
    {
        if (FAILED(CoInitialize(NULL))) {
            m_sError = _T("Error initializing COM");
            return FALSE;
        }

        HRESULT hRes;
        hRes = IpVoice.CoCreateInstance(CLSID_SpVoice);
        if (FAILED(hRes)) {
            m_sError = _T("Error creating voice");
            return FALSE;
        }

        // Ask SAPI to report viseme (mouth position) events only
        hRes = IpVoice->SetInterest(SPFEI(SPEI_VISEME),
            SPFEI(SPEI_VISEME));
        if (FAILED(hRes)) {
            m_sError = _T("Error setting event interest");
            return FALSE;
        }

        // Have SAPI send WM_RECOEVENT to this dialog when events are pending
        hRes = IpVoice->SetNotifyWindowMessage(m_hWnd,
            WM_RECOEVENT, 0, 0);
        if (FAILED(hRes)) {
            m_sError = _T("Error setting notification window");
            return FALSE;
        }

        return TRUE;
    }

    BOOL CSpeakMouthDlg::DestroySAPI()
    {
        if (IpVoice) {
            IpVoice.Release();
        }
        return TRUE;
    }
    
    BOOL CSpeakMouthDlg::InitMouthImageList()
    {
        // Store the picture control's rectangle in dialog client
        // coordinates so that only this area is repainted later
        m_cMouth.GetClientRect(&m_cMouthRect);
        m_cMouth.ClientToScreen(&m_cMouthRect);
        ScreenToClient(&m_cMouthRect);

        m_cMouthList.Create(CHARACTER_WIDTH,
            CHARACTER_HEIGHT, ILC_COLOR32 | ILC_MASK, 1, 0);

        // Keep this order: the overlay setup below refers to the
        // images by their position in the list
        static const UINT bitmapIds[] = {
            IDB_MICFULL,
            IDB_MICMOUTH2,  IDB_MICMOUTH3,  IDB_MICMOUTH4,
            IDB_MICMOUTH5,  IDB_MICMOUTH6,  IDB_MICMOUTH7,
            IDB_MICMOUTH8,  IDB_MICMOUTH9,  IDB_MICMOUTH10,
            IDB_MICMOUTH11, IDB_MICMOUTH12, IDB_MICMOUTH13,
            IDB_MICEYESNAR, IDB_MICEYESCLO
        };
        const int bitmapCount = sizeof(bitmapIds) / sizeof(bitmapIds[0]);

        for (int i = 0; i < bitmapCount; i++) {
            CBitmap bmp;
            if (!bmp.LoadBitmap(bitmapIds[i])) {
                m_sError = _T("Error loading mouth bitmap");
                return FALSE;
            }
            // Magenta is the transparent color. The image list copies
            // the bitmap, so bmp is deleted when it goes out of scope
            // (the earlier Detach() calls leaked the GDI handle).
            m_cMouthList.Add(&bmp, RGB(255,0,255));
        }

        // Register image n as overlay n for n = 1..13
        for (int n = 1; n <= 13; n++) {
            m_cMouthList.SetOverlayImage(n, n);
        }
        m_cMouthList.SetOverlayImage(14, WEYESNAR);
        m_cMouthList.SetOverlayImage(15, WEYESCLO);

        return TRUE;
    }
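InitializationSAPI() calls CoInitialize(), and COM expects every successful CoInitialize() to be balanced by a CoUninitialize() call, which the listing above does not show. One possible way to keep the pair balanced is a small scope guard. The sketch below is portable C++ and uses the hypothetical stand-ins FakeCoInitialize() and FakeCoUninitialize() in place of the real COM calls; ComScope and DepthInsideScope are likewise names invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-ins for CoInitialize/CoUninitialize so that the
// sketch builds without the Windows headers. They only count calls.
static int g_initDepth = 0;
inline bool FakeCoInitialize()   { ++g_initDepth; return true; }
inline void FakeCoUninitialize() { --g_initDepth; }

// Initializes on construction and, only if that succeeded,
// uninitializes on destruction.
class ComScope {
public:
    ComScope() : m_ok(FakeCoInitialize()) {}
    ~ComScope() { if (m_ok) FakeCoUninitialize(); }
    ComScope(const ComScope&) = delete;
    ComScope& operator=(const ComScope&) = delete;
    bool ok() const { return m_ok; }
private:
    bool m_ok;
};

// Returns the init depth observed while a scope is alive,
// or -1 if initialization failed.
inline int DepthInsideScope() {
    ComScope scope;
    return scope.ok() ? g_initDepth : -1;
}
```

In the real dialog, such a guard would have to outlive IpVoice, since COM pointers should be released before CoUninitialize() runs.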
    
  5. In the class CSpeakMouthDlg, the method OnInitDialog() must contain the following code to initialize SAPI:

    // Set the icon for this dialog. The framework does this
    // automatically when the application's main window is not a dialog
    SetIcon(m_hIcon, TRUE); // Set big icon
    SetIcon((HICON)(LoadImage(AfxGetResourceHandle(),
            MAKEINTRESOURCE(IDR_MAINFRAME),
            IMAGE_ICON, 16, 16, 0)), FALSE);

    m_sError = _T("");
    if (!InitializationSAPI()) {
        AfxMessageBox(m_sError);
        DestroySAPI();
    }

    InitMouthImageList();

    return TRUE;
    
    
  6. In the constructor CSpeakMouthDlg::CSpeakMouthDlg(CWnd* pParent), initialize the index of the current mouth image:

    m_iMouthBmp = 0;
    
  7. Now create the method that turns speech events into mouth motion. SAPI sends the message WM_RECOEVENT, which you defined as WM_USER + 101, to the dialog whenever viseme events are pending, so the dialog needs a handler for that message. In the file SpeakMouthDlg.h, type:

    // Generated message map functions
    //{{AFX_MSG(CSpeakMouthDlg)
    virtual BOOL OnInitDialog();
    afx_msg void OnSysCommand(UINT nID, LPARAM lParam);
    afx_msg void OnPaint();
    afx_msg HCURSOR OnQueryDragIcon();
    afx_msg void OnDestroy();
    afx_msg void OnSpeak();
    //}}AFX_MSG
    // add message handler
    afx_msg LRESULT OnMouthEvent(WPARAM, LPARAM);
    DECLARE_MESSAGE_MAP()

    In the file SpeakMouthDlg.cpp, add these lines:

    BEGIN_MESSAGE_MAP(CSpeakMouthDlg, CDialog)
    //{{AFX_MSG_MAP(CSpeakMouthDlg)
    ON_WM_SYSCOMMAND()
    ON_WM_PAINT()
    ON_WM_QUERYDRAGICON()
    ON_WM_DESTROY()
    ON_BN_CLICKED(IDC_SPEAK_BTN, OnSpeak)
    //}}AFX_MSG_MAP
    // add message handler
    ON_MESSAGE(WM_RECOEVENT, OnMouthEvent)
    END_MESSAGE_MAP()
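WM_RECOEVENT is built from WM_USER plus an offset. The message map works with or without parentheses around that expression, but an unparenthesized macro can give a surprising value once it is used inside a larger expression. The sketch below shows the difference in portable C++; kWmUser stands in for WM_USER (0x0400 in the Windows headers), and the two macro names are hypothetical.

```cpp
#include <cassert>

// Stand-in for WM_USER so that the sketch builds without <windows.h>.
constexpr unsigned kWmUser = 0x0400;

#define MSG_UNPARENTHESIZED kWmUser + 101
#define MSG_PARENTHESIZED   (kWmUser + 101)

// The same expression written once with each macro.
// Expands to kWmUser + 101 * 2, so only the offset is doubled.
constexpr unsigned kLoose = MSG_UNPARENTHESIZED * 2;
// Expands to (kWmUser + 101) * 2, which doubles the whole message ID.
constexpr unsigned kTight = MSG_PARENTHESIZED * 2;
```

Here kLoose evaluates to 1226 and kTight to 2250, which is why message IDs of this form are usually wrapped in parentheses.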
    

    This is the implementation of the method OnMouthEvent():

    LRESULT CSpeakMouthDlg::OnMouthEvent(WPARAM wParam,
                                         LPARAM lParam)
    {
        CSpEvent event;

        // Read every pending event from the voice's event queue
        while (event.GetFrom(IpVoice) == S_OK) {
            switch (event.eEventId) {
                case SPEI_VISEME:
                    // Pick the image for this viseme and repaint the mouth area
                    m_iMouthBmp =
                        g_iMapVisemeToImage[event.Viseme()];
                    InvalidateRect(m_cMouthRect, FALSE);
                    break;
            }
        }

        return 0;
    }
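OnMouthEvent() keeps reading events until the queue is empty, so several viseme events that arrive between two messages collapse into the most recent mouth shape. The loop can be modelled without SAPI as follows. FakeEvent, EventKind and DrainEvents are hypothetical names used only for this sketch; in the project the events come from CSpEvent::GetFrom().

```cpp
#include <cassert>
#include <queue>

// Hypothetical event record standing in for a SAPI event.
enum class EventKind { Viseme, Other };

struct FakeEvent {
    EventKind kind;
    int viseme;  // meaningful only when kind == EventKind::Viseme
};

// Drains every pending event. Stores the last viseme ID seen in
// lastViseme and returns true if at least one viseme event was found,
// which is the case in which the mouth area would need a repaint.
inline bool DrainEvents(std::queue<FakeEvent>& pending, int& lastViseme) {
    bool changed = false;
    while (!pending.empty()) {
        const FakeEvent ev = pending.front();
        pending.pop();
        if (ev.kind == EventKind::Viseme) {
            lastViseme = ev.viseme;
            changed = true;
        }
    }
    return changed;
}
```

Draining the whole queue per message keeps the handler from falling behind when the voice produces events faster than the window processes them.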
    
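  8. Now add a handler for the click event of the button IDC_SPEAK_BTN. Type OnSpeak as the name of the method and add these lines as its implementation:

    void CSpeakMouthDlg::OnSpeak()
    {
        // Copy the contents of the edit box into m_sText
        UpdateData();

        // The voice is not available if InitializationSAPI() failed
        if (!IpVoice) {
            return;
        }

        // Speak() expects a wide string; T2CW converts without
        // allocating a BSTR that would have to be freed afterwards
        USES_CONVERSION;
        IpVoice->Speak(T2CW((LPCTSTR)m_sText), SPF_ASYNC, NULL);
    }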
    
  9. Now you can build, debug and run your program.

5. Testing

To test your application, run it and type a word into the Edit Box. Then click the Speak button.

Reference

Microsoft Speech SDK 5.1

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
