
Fun with Google Speech Recognition Service

22 Jul 2014
Create text issues in Redmine by using Google speech recognition service
Screenshot

Introduction

I was excited to discover that companies like Google offer open web services, and I was amazed when I heard about Google speech recognition.

In this article, I share some tips on using the Google speech recognition API in a Windows application that records voice directly from audio input devices. As an added spice, I wrap the simple speech recognition program into a utility for quickly adding issues to a Redmine project.

Background

The basic idea: you push the button, a timer starts while the wave-in device opens, the main loop begins writing the PCM data from the buffers holding your voice to a file, then the timer stops and the audio file is posted to Google for recognition.

The first task was to understand FLAC encoding in real time. You might say: 'On *nix, I can type a couple of commands in a terminal and do everything: record, encode, post the FLAC file, and receive the answer from the server. So why not encode the file with an encoder program started after recording a wave file?' Because that is boring. Just imagine: your program writes a ready-made FLAC audio file!

Since the time I wrote an application for batch-converting MP3 files to Ogg/Vorbis, I have kept a library that can encode PCM to Vorbis in real time; it also included a ring buffer.

After that, a suitable handler for FLAC did not take long. You may know that Google accepts FLAC at 16 kHz, 16 bits per sample, with 1 (mono) channel. Based on the example in libFLAC, I added three functions: InitialiseEncoder, ProcessEncoder, and CloseEncoder, which respectively open the file and prepare the encoder, feed 16-bit PCM samples to the encoder, and close the file and destroy the encoder. One thing I still don't understand is why metadata can't be added to the FLAC file. Maybe a charset problem?

The wonderful WaveLib article includes a wave-in API implementation, which the Recorder class uses: it starts the WaveInRecorder and, in parallel, runs a thread that passes PCM data to the encoder.
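As a rough illustration of the "feed 16-bit PCM samples" step: libFLAC's stream encoder takes its input as 32-bit signed integers, one per sample, so the raw bytes from a wave-in buffer need to be widened first. The sketch below shows one way to do that conversion in C#. The class and method names (PcmSamples, FromBytes) are hypothetical and are not taken from the article's source; how the article's ProcessEncoder actually receives its data is not shown here.

```csharp
using System;

public static class PcmSamples
{
    // Converts little-endian 16-bit mono PCM bytes (as found in a wave-in
    // buffer) into one 32-bit signed integer per sample.
    // Hypothetical helper for illustration only.
    public static int[] FromBytes(byte[] pcm, int byteCount)
    {
        if (pcm == null) throw new ArgumentNullException(nameof(pcm));
        if (byteCount < 0 || byteCount > pcm.Length)
            throw new ArgumentOutOfRangeException(nameof(byteCount));

        int sampleCount = byteCount / 2;   // a trailing odd byte is ignored
        var samples = new int[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            // Low byte first, then high byte; the cast to short restores the sign.
            samples[i] = unchecked((short)(pcm[2 * i] | (pcm[2 * i + 1] << 8)));
        }
        return samples;
    }
}
```

For example, PcmSamples.FromBytes(buffer, bytesRecorded) would yield the array to hand to whatever wrapper calls the encoder.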

File Uploading

Basic usage of the upload function is shown below; optionally change the lang parameter:

string result = WebUpload.UploadFileEx(flacpath, 
	"http://www.google.com/speech-api/v1/recognize?lang=ru&client=chromium",
	"file", "audio/x-flac; rate=16000", parameters, null);

The response from the server is received in JSON format.
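A minimal sketch of pulling the recognized text out of that JSON follows. It assumes the response has the shape {"status":0,"id":"...","hypotheses":[{"utterance":"...","confidence":0.9}]}; that shape is an assumption about this version of the service, not something stated in the article. It also assumes a runtime where System.Text.Json is available, which the original project may well not have used. SpeechResponse and FirstUtterance are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class SpeechResponse
{
    // Returns the first recognized phrase, or null when the JSON carries no
    // usable hypothesis. The field names "hypotheses" and "utterance" are
    // assumptions about the response format.
    public static string FirstUtterance(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return null;
            if (!root.TryGetProperty("hypotheses", out JsonElement hyps)) return null;
            if (hyps.ValueKind != JsonValueKind.Array || hyps.GetArrayLength() == 0) return null;

            JsonElement first = hyps[0];
            if (first.ValueKind != JsonValueKind.Object) return null;
            if (!first.TryGetProperty("utterance", out JsonElement utt)) return null;
            return utt.ValueKind == JsonValueKind.String ? utt.GetString() : null;
        }
    }
}
```

Calling SpeechResponse.FirstUtterance(result) on the string returned by the upload would then give the text to use as an issue title, or null if nothing was recognized.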

Issue Creating

In which cases can you use speech recognition? Maybe for creating issues? It may not be practical, but it is certainly fun.

The Redmine web application includes a REST web service. Through it, we can create as many issues as we need; just specify the project and tracker. By the way, I could only get the list of trackers from Redmine version 1.3* and newer.

RedmineManager manager = new RedmineManager(Configuration.RedmineHost,
    Configuration.RedmineUser, Configuration.RedminePassword);
    
// New ISSUE
var newIssue = new Issue
{
    Subject = Title,
    Description = Description,
    Project = new IdentifiableName() { Id = ProjectId },
    Tracker = new IdentifiableName() { Id = TrackerId }
};
// GET ID OF CURRENT USER
User thisuser = (from u in manager.GetObjectList<User>(
                     new System.Collections.Specialized.NameValueCollection())
                 where u.Login == Configuration.RedmineUser
                 select u).FirstOrDefault();
if (thisuser != null)
    newIssue.AssignedTo = new IdentifiableName() { Id = thisuser.Id };
    
manager.CreateObject(newIssue);

Points of Interest

When it was finished, I noticed the record timeout: it gives you 4 seconds for your speech. That may not be enough for every phrase, so maybe the form needs a stop button?

The ring buffer saves you from data loss when recording directly to FLAC like this. When data comes from the wave-in device, it goes into the ring buffer.
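To make the ring buffer idea concrete, here is a minimal sketch of a fixed-capacity byte ring buffer with a lock, where the wave-in callback would write and the encoder thread would read. It is not the buffer from the article's library; the class name ByteRingBuffer and its members are hypothetical, and a full buffer simply accepts fewer bytes rather than overwriting old data.

```csharp
using System;

// A fixed-capacity byte ring buffer guarded by a lock (illustrative sketch).
public sealed class ByteRingBuffer
{
    private readonly byte[] _data;
    private readonly object _sync = new object();
    private int _head;   // index of the next byte to read
    private int _count;  // number of bytes currently stored

    public ByteRingBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _data = new byte[capacity];
    }

    public int Count { get { lock (_sync) return _count; } }

    // Copies as much of src as fits; returns the number of bytes accepted.
    public int Write(byte[] src, int offset, int length)
    {
        lock (_sync)
        {
            int n = Math.Min(length, _data.Length - _count);
            int tail = (_head + _count) % _data.Length;
            int first = Math.Min(n, _data.Length - tail);      // part before wrap-around
            Buffer.BlockCopy(src, offset, _data, tail, first);
            Buffer.BlockCopy(src, offset + first, _data, 0, n - first);
            _count += n;
            return n;
        }
    }

    // Copies up to length stored bytes into dst; returns the number copied.
    public int Read(byte[] dst, int offset, int length)
    {
        lock (_sync)
        {
            int n = Math.Min(length, _count);
            int first = Math.Min(n, _data.Length - _head);     // part before wrap-around
            Buffer.BlockCopy(_data, _head, dst, offset, first);
            Buffer.BlockCopy(_data, 0, dst, offset + first, n - first);
            _head = (_head + n) % _data.Length;
            _count -= n;
            return n;
        }
    }
}
```

The capacity should cover at least a few wave-in buffers, so that a short stall in the encoder thread does not force Write to drop data.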

History

  • February 28, 2012: First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

cerriun
Software Developer
Russian Federation
No Biography provided

Comments and Discussions

[General] My vote of 3 (Tema SMirnov, 30-Apr-14 11:09)
[General] Re: My vote of 3 (cerriun, 3-Jun-14 4:12)
[Bug] "Unable to connect to the server" error Message (psych187, 2-Apr-12 5:27)
[General] Re: "Unable to connect to the server" error Message (cerriun, 17-May-12 21:27)
[General] Re: "Unable to connect to the server" error Message (Fatburger311-Aug-12 6:38)
[General] Re: "Unable to connect to the server" error Message (Fatburger311-Aug-12 6:41)
[Question] Error (Hosam Ershedat, 1-Apr-12 11:41)
Hi, thanks for the code, I'm very interested to understand it,
but the problem is in the Redmine part: I don't understand it, or how to
get the redmineHost or account. Please help me, thanks again.
[Answer] Re: Error (cerriun, 17-May-12 21:31)

Last Updated 23 Jul 2014
Article Copyright 2012 by cerriun