
Introduction
I have always been fascinated by Acrobat Reader's Read Out Loud option. It turns out that Adobe Reader uses the Windows Speech engine, which ships with almost all versions of Windows and can also be used programmatically. The Speech engine offers several features, including speech recognition and text-to-speech (TTS). With speech recognition, you can interact with your PC using voice commands rather than the GUI. In this example, I show how to use the TTS feature of the Speech engine.
Background
Windows XP ships with the Text-To-Speech engine. You can verify this by going to Control Panel -> Speech -> Text To Speech. If the engine is not installed with your OS version, you can get it by installing the Microsoft Speech SDK 5.1. If you want to use the TTS feature in a web browser, you can use the ActiveX object provided by Microsoft by calling new ActiveXObject("SAPI.SpVoice") in your JavaScript.
A little about SAPI
Microsoft Speech API (SAPI) contains many interfaces and classes for managing speech. For TTS, the base class is SpVoice. The following are some of its important properties:
- Voice: the voice to speak with, an object of type SpObjectToken; the available tokens are returned as an ISpeechObjectTokens collection
- Volume: an integer from 0 to 100 that specifies the loudness of the voice
- AudioOutputStream: the stream that receives the audio output; to save the audio to a file, use SAPI's SpFileStream
- SynchronousSpeakTimeout: the number of milliseconds after which the voice's synchronous Speak and SpeakStream calls time out
Methods:
- GetVoices(): returns all available voices; I use this to populate the voice-type combo box
- Speak(): speaks the given text on the output stream (speaker or file)
- Pause(): pauses the audio output
- Resume(): resumes the audio output
- WaitUntilDone(): blocks application execution until a voice that is speaking asynchronously finishes or the given timeout elapses
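The article's own code only speaks synchronously, so here is a minimal sketch of how Pause(), Resume() and WaitUntilDone() might be combined with an asynchronous Speak() call. It assumes a console project that already references the SAPI COM library; the module name, the sample sentence, the one-second sleep and the ten-second timeout are all illustrative choices, not part of the original sample.

```vbnet
Imports SpeechLib

' Hypothetical console module; assumes a COM reference to SAPI has been added.
Module AsyncSpeakSketch
    Sub Main()
        Dim voice As New SpVoice()

        ' SVSFlagsAsync makes Speak() return immediately while the audio plays.
        voice.Speak("This sentence is spoken asynchronously.", _
            SpeechVoiceSpeakFlags.SVSFlagsAsync)

        ' Pause the output, wait an arbitrary second, then resume it.
        voice.Pause()
        System.Threading.Thread.Sleep(1000)
        voice.Resume()

        ' Block until the voice finishes or 10 seconds (10000 ms) have elapsed.
        ' WaitUntilDone returns True if speaking finished within the timeout.
        Dim finished As Boolean = voice.WaitUntilDone(10000)
        Console.WriteLine("Finished speaking: " & finished)
    End Sub
End Module
```

In a Windows Forms application, the Pause() and Resume() calls would more naturally live in button click handlers, with the SpVoice object kept as a form-level field so it outlives the handler that started the speech.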
Using the code
To use SAPI in your .NET application, first add a reference to it from the COM tab of the Add Reference dialog. If SAPI does not appear there, browse to SAPI.dll in C:\Program Files\Common Files\Microsoft Shared\Speech. The following code generates audio from the text entered. Note that I assign the Voice property based on the voice type selected in the combo box. In Form1_Load, I fill the combo box with all available voices. See the next code section.
Private Sub btnSpeak_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) _
        Handles btnSpeak.Click
    Me.Cursor = Cursors.WaitCursor
    Try
        Dim oVoice As New SpeechLib.SpVoice
        ' Use the voice selected in the combo box and the volume from the track bar.
        oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
        oVoice.Volume = trVolume.Value
        ' SVSFDefault speaks synchronously, so the call returns when speech ends.
        oVoice.Speak(txtSpeach.Text, _
            SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
    Finally
        ' Restore the cursor even if speaking fails.
        Me.Cursor = Cursors.Default
    End Try
End Sub
To find all available voices and bind them to the voice combo box, call the GetVoices() method on an SpVoice object. The GetDescription() method of each returned token gives the voice name, e.g. LH Michael.
Private Sub Form1_Load(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) _
        Handles MyBase.Load
    Dim oVoice As New SpeechLib.SpVoice
    Dim arrVoices As SpeechLib.ISpeechObjectTokens = oVoice.GetVoices
    Dim arrLst As New ArrayList
    ' Collect the display name of every installed voice.
    For i As Integer = 0 To arrVoices.Count - 1
        arrLst.Add(arrVoices.Item(i).GetDescription)
    Next
    ' The combo box index now matches the index used with GetVoices.Item().
    cmbVoices.DataSource = arrLst
End Sub
To save the audio output to a file, create an SpFileStream object and assign it to the voice's AudioOutputStream property.
If SaveFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
    Dim oVoice As New SpeechLib.SpVoice
    Dim cpFileStream As New SpeechLib.SpFileStream
    ' Open the chosen file for writing.
    cpFileStream.Open(SaveFileDialog1.FileName, _
        SpeechLib.SpeechStreamFileMode.SSFMCreateForWrite, False)
    Try
        ' Redirect the voice's output from the speakers to the file stream.
        oVoice.AudioOutputStream = cpFileStream
        oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
        oVoice.Volume = trVolume.Value
        oVoice.Speak(txtSpeach.Text, SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
    Finally
        ' Close the stream even if speaking fails, so the file is released.
        cpFileStream.Close()
    End Try
End If
References
Since this example is on TTS, those who are interested in Speech recognition and grammar can refer to this article. For more details on Speech SDK, please refer to this article.
History
- 25 June, 2007 -- Original article posted