Conversion from speech to text using Sphinx in Java does not produce the correct result

Question

I have been trying to convert speech to text using the Sphinx package in Java, but I cannot understand why it does not produce the correct tokens.
The code is below.

.java file

Java

package speechtotext;

import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import java.awt.*;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.*;


public class HelloWorld extends JApplet  implements ActionListener{
 
  private JButton b1 = new JButton("SPEAK"), b2 = new JButton("STOP");
  JTextArea textArea = new JTextArea(7,30);
  Result result;
  ConfigurationManager cm;
  Recognizer recognizer;
  Microphone microphone;
  String resultText ;
  
  public void init() {
    Container cp = getContentPane();
    cp.setLayout(new FlowLayout());
    Image image = Toolkit.getDefaultToolkit().createImage("C:\\Users\\arsa\\Desktop\\1.png");
    Image scaled = image.getScaledInstance(300, 550, Image.SCALE_SMOOTH);
    JLabel label = new JLabel(new ImageIcon(scaled));
    cp.add(label,BorderLayout.CENTER);
    textArea.setText("");
    textArea.setLineWrap(true);
    textArea.setEditable(false);
    add(textArea,"Center");
    cp.add(b2,FlowLayout.LEFT);
    cp.add(b1,FlowLayout.LEFT);
    cp.add(textArea);
    b1.addActionListener(this);
    b2.addActionListener(this);
    cm = new ConfigurationManager(HelloWorld.class.getResource("helloworld.config.xml"));
    recognizer = (Recognizer) cm.lookup("recognizer");
    System.out.println("Successful1 allocation");
    recognizer.allocate();
    System.out.println("Successful1 allocation1");
    microphone = (Microphone) cm.lookup("microphone");
    if (!microphone.startRecording()) {
      System.out.println("Cannot start microphone.");
      recognizer.deallocate();
      System.exit(1);
    }
  }
  @Override
  public void actionPerformed(ActionEvent e) {
    if (e.getSource() == b1) {
      // SPEAK: blocks until the recognizer returns a result for the next utterance.
      result = recognizer.recognize();
    } else if (e.getSource() == b2) {
      // STOP: show the last recognition result, if there is one.
      if (result != null) {
        resultText = result.getBestPronunciationResult();
        if (resultText != null) {
          textArea.setText("You said: " + resultText + '\n');
        } else {
          textArea.setText("I couldn't hear what you said.\n");
        }
      } else {
        textArea.setText("Cheater!! Cheater!! you didn't say anything....\n");
      }
    }
  }
}

.gram file

JSGF

#JSGF V1.0;

/**
 * JSGF Grammar for Hello World example
 */

grammar hello;

public <greet> = (Good morning | Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will );

Posted 29-Apr-14 10:25am

Member 10195287

Updated 29-Apr-14 21:03pm

Richard MacCutchan

v4

Comments

NeverJustHere 29-Apr-14 19:54pm

Try speaking in an American accent :)

I'm only half joking. My only experience with speech to text was a system installed in Australia in the late 1990s that used Dialogic speech recognition boards. The accuracy improved significantly once we got the vendor to provide an Australian-accented pattern.

We were only attempting to distinguish Yes/No answers over a telephone line.

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Ravimal Bandara · Answer 1 · 2014-05-10T08:07:00

The problem might be your accent. You can work around this by modifying the pronunciation dictionary that ships with the default acoustic model (it lists the phonemes of each word).
In Sphinx this dictionary is a plain text file. It contains lines like the following:

HELLO	HH AH L OW
HELLO(2)	HH EH L OW
THANKS	TH AE NG K S
YOUR	Y AO R
YOUR(2)	Y UH R

TH AE NG K S is the phoneme sequence for the word "THANKS". You can modify these phonemes to match your pronunciation.
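To make the file format concrete, here is a minimal sketch of how such lines could be read into a map from word to its pronunciations. It is plain Java and does not use Sphinx at all; the class name `PronunciationDictionary` and the method `parse` are made up for illustration, and the sketch assumes every line is a word, optionally followed by a "(n)" variant marker, then whitespace-separated phonemes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Sphinx.
public class PronunciationDictionary {

    /** Parses lines such as "HELLO(2)\tHH EH L OW" into word -> list of phoneme sequences. */
    static Map<String, List<List<String>>> parse(List<String> lines) {
        Map<String, List<List<String>>> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            // Drop the "(n)" suffix that marks an alternative pronunciation.
            String word = parts[0].replaceAll("\\(\\d+\\)$", "");
            List<String> phonemes = new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
            entries.computeIfAbsent(word, k -> new ArrayList<>()).add(phonemes);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "HELLO\tHH AH L OW",
                "HELLO(2)\tHH EH L OW",
                "THANKS\tTH AE NG K S");
        // Print each word with all of its known pronunciations.
        for (Map.Entry<String, List<List<String>>> entry : parse(sample).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

The point is only that one word can carry several pronunciations, so adding a variant that matches your accent does not require deleting the original one.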

1. Find the WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar file and extract it.
2. In the extracted tree, go to the edu\cmu\sphinx\model\acoustic\WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz\dict folder and open the “cmudict.0.6d” file.
3. Edit the entries so they match your pronunciation, then save the file.
4. Zip the extracted hierarchy back up exactly as it was, and give the Zip file the same name as the JAR file.
5. Remove “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar” from the project’s CLASSPATH and add “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip” in its place.
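If you would rather script the edit in step 3 than do it by hand, something like the following might work as a starting point. It is only a sketch in plain Java: `DictionaryVariant` and `addPronunciation` are hypothetical names, the phoneme string in `main` is an invented example rather than a recommended pronunciation, and I am assuming the new phonemes have to come from the phone set the acoustic model already uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Sphinx.
public class DictionaryVariant {

    /**
     * Returns a copy of the dictionary lines with one more pronunciation for the word.
     * Existing entries are kept; the new line is numbered WORD(n) and placed after them.
     */
    static List<String> addPronunciation(List<String> lines, String word, String phonemes) {
        int existing = 0;
        int insertAt = lines.size(); // unknown words are appended at the end
        Pattern variant = Pattern.compile(Pattern.quote(word) + "\\(\\d+\\)");
        for (int i = 0; i < lines.size(); i++) {
            String head = lines.get(i).trim().split("\\s+")[0];
            if (head.equals(word) || variant.matcher(head).matches()) {
                existing++;
                insertAt = i + 1;
            }
        }
        String label = existing == 0 ? word : word + "(" + (existing + 1) + ")";
        List<String> updated = new ArrayList<>(lines);
        updated.add(insertAt, label + "\t" + phonemes);
        return updated;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "HELLO\tHH AH L OW",
                "HELLO(2)\tHH EH L OW",
                "THANKS\tTH AE NG K S");
        // "HH EH L AO" is an invented variant used only to show the mechanics.
        for (String line : addPronunciation(lines, "HELLO", "HH EH L AO")) {
            System.out.println(line);
        }
    }
}
```

Adding a variant instead of overwriting the existing entry keeps the original pronunciations available, which seems safer if other speakers use the same application.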

You can add more words using the following tool:
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
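As far as I know, that tool takes a plain text file with one sentence per line. As a rough sketch, the sentences for the grammar in the question could be generated like this (plain Java, no Sphinx; the class name `CorpusBuilder` is made up, and the word lists are copied from the `<greet>` rule in the question):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Sphinx or the CMU tool.
public class CorpusBuilder {

    // Alternatives copied from the <greet> rule in the question's grammar.
    static final List<String> GREETINGS = Arrays.asList("Good morning", "Hello");
    static final List<String> NAMES =
            Arrays.asList("Bhiksha", "Evandro", "Paul", "Philip", "Rita", "Will");

    /** Returns every greeting followed by every name, one sentence per element. */
    static List<String> buildCorpus(List<String> greetings, List<String> names) {
        List<String> sentences = new ArrayList<>();
        for (String greeting : greetings) {
            for (String name : names) {
                sentences.add(greeting + " " + name);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Redirect the output to a text file to get a corpus with one sentence per line.
        for (String sentence : buildCorpus(GREETINGS, NAMES)) {
            System.out.println(sentence);
        }
    }
}
```

Listing the sentences this way also makes it clear that the grammar only covers these greeting and name combinations, so any new word has to be added to the grammar as well as to the dictionary.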
