SAPI with Microsoft Agent and Visemes to Explain TTS in C#

Beniton Fernando

17 May 2006

This article explains how to work with SAPI, Microsoft Agent and Visemes in C# .NET

Introduction

This article explains how to build a text-to-speech application with SAPI and the Microsoft Agent control.

When you finish reading this article, I hope you will know the following:

How to set the HotKey for an application
How to use Microsoft Agent Control
How to use Speech API
Finally, how to use visemes with TTS

What Do We Need To Do This

The sample was created in Visual Studio .NET 2003. In order to run it, we need a set of APIs installed on our system.

Requirements are:

Sample Application

The sample application converts the currently selected text to speech when a HotKey is pressed, regardless of which application the text is in. To achieve this, we need to do the following:

Setting HotKey
Referencing the Speech API and Microsoft Agent Control
Setting visemes

Setting HotKey

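Setting a HotKey for an application means registering a key with the operating system so that the application's window receives a message whenever the key is pressed, even when the application is not in the foreground.

For this, we import RegisterHotKey and UnregisterHotKey from user32.dll using DllImport.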

//API Imports (requires: using System.Runtime.InteropServices;)
[DllImport("user32.dll", SetLastError = true)]
public static extern bool RegisterHotKey(
    IntPtr hWnd,              // handle to window
    int id,                   // hot key identifier
    KeyModifiers fsModifiers, // key-modifier options
    Keys vk                   // virtual-key code
);

[DllImport("user32.dll", SetLastError = true)]
public static extern bool UnregisterHotKey(
    IntPtr hWnd, // handle to window
    int id       // hot key identifier
);

This sample application registers the F9 key as the HotKey in the constructor of the application and unregisters it when disposing.

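//For registering (in the constructor)
//HOTKEY_ID may be any number unique to that application
bool bcheck = RegisterHotKey(Handle, HOTKEY_ID, KeyModifiers.None, Keys.F9);

//For unregistering (when disposing)
bool bcheck = UnregisterHotKey(Handle, HOTKEY_ID);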
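
The declarations above use a KeyModifiers type that the snippets never define. The following is a minimal sketch of how the pieces might fit together. The enum values mirror the Win32 MOD_* constants; the class name HotKeyForm and the value of HOTKEY_ID are assumptions made for illustration and are not taken from the sample.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Mirrors the Win32 MOD_* flags accepted by RegisterHotKey.
[Flags]
public enum KeyModifiers
{
    None = 0,
    Alt = 1,
    Control = 2,
    Shift = 4,
    Windows = 8
}

// Hypothetical form name, used only for this sketch.
public class HotKeyForm : Form
{
    // Assumed value; any id unique within the application will do.
    private const int HOTKEY_ID = 1;

    [DllImport("user32.dll", SetLastError = true)]
    private static extern bool RegisterHotKey(IntPtr hWnd, int id, KeyModifiers fsModifiers, Keys vk);

    [DllImport("user32.dll", SetLastError = true)]
    private static extern bool UnregisterHotKey(IntPtr hWnd, int id);

    public HotKeyForm()
    {
        // Reading Handle forces the window handle to be created.
        if (!RegisterHotKey(Handle, HOTKEY_ID, KeyModifiers.None, Keys.F9))
        {
            // Registration fails, for example, when another application already owns the key.
            MessageBox.Show("Could not register F9, error " + Marshal.GetLastWin32Error());
        }
    }

    protected override void Dispose(bool disposing)
    {
        // Release the hot key while the window handle still exists.
        if (IsHandleCreated)
        {
            UnregisterHotKey(Handle, HOTKEY_ID);
        }
        base.Dispose(disposing);
    }
}
```

Checking the return value is worthwhile because a hot key such as F9 may already be claimed by another program, in which case the application would otherwise stay silent without explanation.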

We are done registering the HotKey. Now we need to handle the message from the operating system. For that, we override the WndProc method and check for the corresponding message:

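const int WM_HOTKEY = 0x312;

// Listen for operating system messages.
protected override void WndProc(ref Message msg)
{
    switch (msg.Msg)
    {
        case WM_HOTKEY:
            // this is the block the app turns in if the HotKey has been pressed
            //Here we do whatever we need
            break;
    }
    base.WndProc(ref msg);
}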
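
If an application registers more than one hot key, the handler has to tell them apart. For WM_HOTKEY, Windows places the hot key identifier in WParam, the modifier flags in the low-order word of LParam and the virtual-key code in the high-order word. The helper below is a sketch of that decoding; the HotKeyInfo type and its member names are hypothetical and not part of the sample.

```csharp
using System;

// Hypothetical helper type, not part of the sample application.
public struct HotKeyInfo
{
    public int Id;         // identifier that was passed to RegisterHotKey
    public int Modifiers;  // MOD_* flags (Alt = 1, Control = 2, Shift = 4, Windows = 8)
    public int VirtualKey; // virtual-key code, e.g. 0x78 for F9

    // Splits the WParam and LParam of a WM_HOTKEY message into their parts.
    public static HotKeyInfo FromMessage(IntPtr wParam, IntPtr lParam)
    {
        long packed = lParam.ToInt64();
        HotKeyInfo info = new HotKeyInfo();
        info.Id = wParam.ToInt32();
        info.Modifiers = (int)(packed & 0xFFFF);          // low-order word
        info.VirtualKey = (int)((packed >> 16) & 0xFFFF); // high-order word
        return info;
    }
}
```

Inside the WM_HOTKEY case it could be called as HotKeyInfo.FromMessage(msg.WParam, msg.LParam), after which Id can be compared with HOTKEY_ID.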

When the HotKey is pressed, we copy the selection by sending the Ctrl+C keystroke with SendKeys.SendWait("^(c)"); and then read the text from the clipboard. Now we have the text that needs to be converted to speech.
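
The copy-and-read step can be sketched as a small helper. The class and method names (SelectionReader, CaptureSelectedText) are hypothetical; the calls themselves are the standard Windows Forms SendKeys and Clipboard APIs, and the method is assumed to run on the form's UI thread.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper, not part of the sample application.
public sealed class SelectionReader
{
    private SelectionReader() { }

    // Copies the current selection of the focused window and returns it as text.
    // Returns an empty string when the clipboard holds no text.
    public static string CaptureSelectedText()
    {
        // "^(c)" sends Ctrl+C to the window that currently has focus.
        SendKeys.SendWait("^(c)");

        IDataObject data = Clipboard.GetDataObject();
        if (data != null && data.GetDataPresent(DataFormats.Text))
        {
            string text = data.GetData(DataFormats.Text) as string;
            return text == null ? string.Empty : text;
        }
        return string.Empty;
    }
}
```

Note that this approach overwrites whatever the user previously had on the clipboard.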

Referencing the Speech API and Microsoft Agent Control

In order to use the Speech API, we need to reference it in our application. This is done as given below.

First, we add a reference to the Microsoft Speech Library 5.0, a COM component, to the project.

The SpeechLib reference then appears among the project's references.

We have finished referencing the Speech library. Now we have to reference Microsoft Agent Control.

We can add Microsoft Agent Control directly to the Toolbox by selecting Add/Remove Items from the menu and opening the COM Components tab in the Customize Toolbox dialog. From that dialog, we select Microsoft Agent Control.

Now, we can just drag and drop the control from the Toolbox onto our form.

Then we need to use the Speech library and Microsoft Agent in our code.

First we go with the Speech library and import SpeechLib:

using SpeechLib;

Then we create a Voice object:

voice = new SpVoice();

Now, we need to make it talk. This is done as follows:

voice.Speak("Whatever it is", SpeechVoiceSpeakFlags.SVSFlagsAsync);

We use SVSFlagsAsync so that Speak returns immediately instead of waiting for the speech to finish. This matters because we are going to use visemes in the sample, which will be explained later.
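
Because an asynchronous Speak call returns immediately, a second HotKey press can arrive while the previous text is still being spoken. One option, sketched here as an assumption rather than as what the sample does, is to combine the flag with SVSFPurgeBeforeSpeak so that the new text replaces anything still queued. The Speaker class and SpeakSelection method are hypothetical names; the code needs the SpeechLib reference described above.

```csharp
using SpeechLib;

// Hypothetical wrapper, not part of the sample application.
public class Speaker
{
    private readonly SpVoice voice = new SpVoice();

    // Starts speaking the text without blocking the caller.
    // Speech that is still pending from an earlier call is discarded.
    public void SpeakSelection(string text)
    {
        if (text == null || text.Length == 0)
        {
            return; // nothing to say
        }
        SpeechVoiceSpeakFlags flags =
            SpeechVoiceSpeakFlags.SVSFlagsAsync |
            SpeechVoiceSpeakFlags.SVSFPurgeBeforeSpeak;
        voice.Speak(text, flags);
    }
}
```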

Setting a Different Voice

If several voices are installed on the system, we can use any of them in our sample application.

In order to list out all the available voices in the system, we do this:

foreach (ISpeechObjectToken t in voice.GetVoices("", ""))
{
    Console.WriteLine(t.GetAttribute("Name"));
    //I add it in a Combo
}

We can set the voices according to our preference as follows:

voice.Voice =
    voice.GetVoices("Name="+VoiceCombo.Items[0].ToString(), "Language=409").Item(0);
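
The lookup above always uses the first entry of the combo box. A sketch of applying whichever voice the user actually picked might look like the following. VoicePicker, FillCombo and ApplySelection are hypothetical names; GetVoices, GetAttribute, Count and Item are the SpeechLib members already used in this article or defined on the same interfaces.

```csharp
using System.Windows.Forms;
using SpeechLib;

// Hypothetical helper, not part of the sample application.
public class VoicePicker
{
    private readonly SpVoice voice;
    private readonly ComboBox voiceCombo;

    public VoicePicker(SpVoice voice, ComboBox voiceCombo)
    {
        this.voice = voice;
        this.voiceCombo = voiceCombo;
    }

    // Lists every installed voice by its "Name" attribute.
    public void FillCombo()
    {
        voiceCombo.Items.Clear();
        foreach (ISpeechObjectToken token in voice.GetVoices("", ""))
        {
            voiceCombo.Items.Add(token.GetAttribute("Name"));
        }
    }

    // Switches to the voice whose name is currently selected in the combo box.
    public void ApplySelection()
    {
        if (voiceCombo.SelectedIndex < 0)
        {
            return; // nothing selected yet
        }
        string name = voiceCombo.SelectedItem.ToString();
        ISpeechObjectTokens matches = voice.GetVoices("Name=" + name, "");
        if (matches.Count > 0)
        {
            voice.Voice = matches.Item(0);
        }
    }
}
```

ApplySelection could be called from the combo box's SelectedIndexChanged handler.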

Making Use of Agent

Now we can also make use of Microsoft Agent and make the Agent speak for us.

This is done as follows:

//You can load whatever agent you wish as per the availability
axAgent2.Characters.Load("Genie", (object)"C:/Speaker/chars/GENIE.acs");
Character = axAgent2.Characters["Genie"];
//To set the language to US English.
Character.LanguageID = 0x409;
Character.Show(null);
Character.Speak(txt, null);
Character.Hide();

Setting Visemes

Visemes are the mouth shapes that accompany spoken sounds; in this sample, each one is an image showing the expression for a group of phonemes. SAPI reports 22 viseme types, so we need 22 images: one for silence and one for each of the sound groups ae, aa, ao, ey, er, y, w, ow, aw, oy, ay, h, r, l, s, sh, th, f, d, k and p.

Then, we need to set a viseme handler for Voice as follows:

voice.Viseme+=
    new _ISpeechVoiceEvents_VisemeEventHandler(VisemeEvent);

The VisemeEvent method shows the image that matches the sound currently being pronounced:

private void VisemeEvent(int StreamNo, object StreamPos, int duration,
    SpeechLib.SpeechVisemeType nextVisemetype,
    SpeechLib.SpeechVisemeFeature visemeFeature,
    SpeechLib.SpeechVisemeType currentVisemetype)
{
    //we have 22 visemetype
    int i = int.Parse(currentVisemetype.ToString().Replace("SVP_", ""));
    pictureBox1.Image = selectedList.Images[i];
}
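
The handler derives the image index by parsing the enum member's name. Since the SpeechVisemeType members SVP_0 to SVP_21 have the numeric values 0 to 21, a plain cast to int gives the same number. The sketch below adds a bounds check on top of that, for the case where the image list holds fewer images than there are viseme types. VisemeImages and ToImageIndex are hypothetical names, and falling back to image 0 (silence) is an assumption about the desired behavior.

```csharp
using System;

// Hypothetical helper, not part of the sample application.
public sealed class VisemeImages
{
    private VisemeImages() { }

    // Number of viseme types SAPI reports (SVP_0 .. SVP_21).
    public const int VisemeCount = 22;

    // Maps a viseme id to an index into a list holding imageCount images.
    // Ids that fall outside the list map to index 0, the silence image.
    public static int ToImageIndex(int visemeId, int imageCount)
    {
        if (imageCount <= 0)
        {
            throw new ArgumentOutOfRangeException("imageCount");
        }
        if (visemeId < 0 || visemeId >= imageCount)
        {
            return 0;
        }
        return visemeId;
    }
}
```

Inside VisemeEvent it could be used as pictureBox1.Image = selectedList.Images[VisemeImages.ToImageIndex((int)currentVisemetype, selectedList.Images.Count)];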

I was not able to find suitable ready-made images for the visemes, so I created my own.

Conclusion

I hope I have covered the basic things about SAPI and Microsoft Agent Control. Please feel free to email me at beniton@gmail.com if you find any problems or have suggestions for this article. Thank you!

Please don't forget to rate this article.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.