Hi All,

I am working on a speech-enabled application using the MS Speech SDK v11. I want to recognize the commands I specify in my SRGS document and, at the same time, if someone says something else, I want that converted to plain text and stored. I tried using dictation mode so that I could stay in dictation and get the text, then use a custom-built SRGS parser to pick out the commands. If the text is not a predefined command, it is simply treated as plain text. The problem is that AppendDictation is not working; it fails with "Cannot find grammar referenced by this grammar." I have already looked at http://stackoverflow.com/questions/9347346/appenddictation-on-microsoft-speech-platform-11-server

I changed from Microsoft.Speech to System.Speech and, interestingly, found that Microsoft.Speech at least recognizes the commands I specify, while System.Speech does nothing. I am confused about what to do.

To repeat, my requirement is simple: recognize the commands I specify and, if the speech is not a command, just transcribe it as text. If this is not possible, how can I make the Speech API work in dictation mode only?
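
To make the requirement concrete, here is a minimal sketch of the routing step only, written in Python for brevity and independent of any speech engine. It assumes the recognizer has already produced a text string. The command phrases, the IDs, and the names `COMMANDS`, `normalize`, and `route` are all hypothetical; in the real application the phrases would come from the SRGS document rather than a hard-coded table.

```python
import re

# Hypothetical command table; in practice this would be built
# from the phrases defined in the SRGS document.
COMMANDS = {
    "open file": "OPEN_FILE",
    "save file": "SAVE_FILE",
    "close window": "CLOSE_WINDOW",
}


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def route(recognized_text):
    """Classify recognized text as a known command or as free text.

    Returns ("command", command_id) when the normalized text exactly
    matches a known phrase, otherwise ("text", recognized_text).
    """
    key = normalize(recognized_text)
    if key in COMMANDS:
        return ("command", COMMANDS[key])
    return ("text", recognized_text)


if __name__ == "__main__":
    for phrase in ["Open file.", "Remember to call the supplier"]:
        print(route(phrase))
```

This only does exact matching after normalization, so it is only as good as the dictation text fed into it; a misrecognized command would fall through and be stored as plain text.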

Also, a quick question: previously, we had to train the computer to understand our speech and create a profile. Does the current speech engine also have to be trained, or does it fine-tune itself as we keep speaking?

Thank you in advance!

1 solution

Right. The quality of the recognition engine that you can download from Microsoft is not good enough for dictation. If your grammar is reasonably small, it works reasonably well, but even then you might need to use better pronunciation. :-) Actually, some say there are much better commercial engines, including the one from the company that licensed the present-day Windows engine to Microsoft. I don't know what else to suggest; probably wait until the technology improves. :-)

—SA
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


