
ShakyVoice - voice stress analysis tool

By Vladimir Ralev

This is an implementation of a simple voice stress analysis tool for Pocket PC. It can be used on the road as a lie detector.
C#, Windows, .NET CF, .NET, PocketPC 2002, VS.NET2003, Dev
Posted: 26 May 2004

Sample Image - ShakyVoice.jpg

Introduction

ShakyVoice (initially codenamed BatteryDrainer Pro) is a simple voice stress analysis tool for your Pocket PC. It measures only one of the many voice parameters that expose stress, but this is exactly what most cheap phone-conversation lie detectors do. The only thing I added is a timeline tracking feature that lets you see the values over time. The program doesn't say what is true and what is false; you have to do this kind of processing on your own, using sample recordings of known lies and known truths. I think this makes the program more accurate and useful than the silly toys that have no idea what the truth would sound like.

How to use it?

*This program requires the MS .NET Compact Framework to be installed.

The application measures the voice stress directly. You can measure a lie by comparing the stress in the truth file with the stress in the lie file you have recorded. Formally, it's like this:

LieMeasure = |TruthFileStressReading - LieFileStressReading|

The program doesn't display the LieMeasure, because you would always have to feed it two files (truth and lie). You can view a single file if you already have an idea of the speaker's normal parameters, so you can see whether there is a difference.
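
As a rough illustration of the formula above, here is a minimal C# sketch. It assumes each file has already been reduced to a single stress reading, taken here as the maximum of its stress (red line) values. The `StressCompare` class and its method names are hypothetical and are not part of ShakyVoice.

```csharp
using System;

// Hypothetical helper, not part of the ShakyVoice source.
public static class StressCompare
{
    // Reduce one file's stress timeline to a single reading (here: its maximum).
    public static float MaxReading(float[] stressTimeline)
    {
        if (stressTimeline == null || stressTimeline.Length == 0)
            throw new ArgumentException("Timeline must contain at least one value.");
        float max = stressTimeline[0];
        foreach (float value in stressTimeline)
        {
            if (value > max) max = value;
        }
        return max;
    }

    // LieMeasure = |TruthFileStressReading - LieFileStressReading|
    public static float LieMeasure(float[] truthTimeline, float[] lieTimeline)
    {
        return Math.Abs(MaxReading(truthTimeline) - MaxReading(lieTimeline));
    }
}
```

Using the maximum is only one possible reduction; a mean or a percentile of the stress values would fit the same formula.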

You start by recording a truth file and a lie file. The program supports only 8 kHz PCM mono files, so you should go to Start -> Program -> Settings -> Input and set the Voice Recording Format to 8,000 Hz, 16 Bit, Mono (16 KB/s). Make sure you are not selecting a GSM or Microsoft format, because the app works on uncompressed PCM WAVE files only. Otherwise, it will display an error message or wrong data (depending on what you have done wrong).

Now, in order to record a truth file, go to Records -> Record truth and use the control in the normal way. Then you can record the lie file with Records -> Record lie. Any recorded file is stored in memory and can be recalled in later sessions. After that, you can analyze the files individually with Records -> Analyze truth/lie. You should keep recordings short, no more than 30 seconds, because the analysis is very slow (and drains the batteries quickly). This is one of the reasons why the utility and the precision are so basic.

After the analysis, the file parameters are displayed on the timeline. The red line is the stress: higher values mean greater stress. The white line is the total power of the signal and the yellow line is the tremor power (actually the square roots of the powers, for speed). The program computes the tremor power relative to the total speech signal power, which is a good measure of the stress. You should usually look at the maximum values (at the top of the graphs); the maximum of the red line is a good starting point for your analysis. The time bar shows time in milliseconds.
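
The format requirement above (uncompressed PCM, mono, 8,000 Hz, 16 bit) can be checked before analysis by reading the file header. The following is a minimal sketch, assuming a seekable stream and the standard RIFF/WAVE chunk layout. `WavFormatCheck` is a hypothetical helper and not the code ShakyVoice uses.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the ShakyVoice source.
public static class WavFormatCheck
{
    // Returns true if the stream holds a RIFF/WAVE file whose "fmt " chunk
    // describes uncompressed PCM, mono, 8000 Hz, 16 bits per sample.
    // The stream must be seekable and is left open for the caller.
    public static bool IsSupported(Stream stream)
    {
        BinaryReader reader = new BinaryReader(stream); // reads little-endian values
        try
        {
            if (ReadTag(reader) != "RIFF") return false;
            reader.ReadUInt32();                          // overall RIFF size, unused
            if (ReadTag(reader) != "WAVE") return false;

            while (true)
            {
                string chunkId = ReadTag(reader);
                uint chunkSize = reader.ReadUInt32();
                if (chunkId == "fmt ")
                {
                    ushort audioFormat = reader.ReadUInt16();   // 1 = PCM
                    ushort channels = reader.ReadUInt16();
                    uint sampleRate = reader.ReadUInt32();
                    reader.ReadUInt32();                        // byte rate, unused
                    reader.ReadUInt16();                        // block align, unused
                    ushort bitsPerSample = reader.ReadUInt16();
                    return audioFormat == 1 && channels == 1
                        && sampleRate == 8000 && bitsPerSample == 16;
                }
                // Skip any other chunk; chunk bodies are padded to an even size.
                long skip = (long)chunkSize + (chunkSize & 1);
                stream.Seek(skip, SeekOrigin.Current);
            }
        }
        catch (EndOfStreamException)
        {
            return false; // ran out of data before finding a "fmt " chunk
        }
    }

    private static string ReadTag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }
}
```

A caller could open the recorded file with `File.OpenRead`, pass the stream to `WavFormatCheck.IsSupported`, and show an error message when it returns false.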

Generally, consider these situations as stress:

  • a white line not following the yellow closely.
  • a high red line.

User interface description

In order to distinguish between speech and silence, keep an eye on the white line so that it does not go too low. A low white line means that the current segment may be empty (silence). The stress estimate is based on chunks of about half a second, overlapped by a hard-coded factor.
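
To make the chunking concrete, here is a small sketch of how overlapping chunk start positions could be planned for a 16-bit mono recording. The frame size of 4096 samples matches the "about half a second" figure at 8,000 Hz, but the hop of 2048 bytes in the usage note is purely an assumed value, since the article does not state the hard-coded factor. `FramePlanner` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the ShakyVoice source.
public static class FramePlanner
{
    // Duration of one frame in seconds.
    public static double FrameSeconds(int frameSizeSamples, int sampleRateHz)
    {
        return (double)frameSizeSamples / sampleRateHz;
    }

    // Byte offsets at which frames start. Frames advance by hopBytes and the
    // list ends with the first frame that would read past the end of the data
    // (that frame would be zero-padded before analysis).
    public static List<long> FrameStarts(long streamLengthBytes, int frameSizeSamples, int hopBytes)
    {
        if (hopBytes <= 0) throw new ArgumentOutOfRangeException("hopBytes");
        int frameBytes = frameSizeSamples * 2;   // 16-bit mono: 2 bytes per sample
        List<long> starts = new List<long>();
        long pos = 0;
        while (true)
        {
            starts.Add(pos);
            if (pos + frameBytes > streamLengthBytes) break;  // short read: last frame
            pos += hopBytes;
        }
        return starts;
    }
}
```

For example, `FramePlanner.FrameSeconds(4096, 8000)` is 0.512 seconds, and one second of audio (16,000 bytes) with the assumed hop of 2048 bytes yields frames starting at bytes 0, 2048, 4096, 6144 and 8192.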

Why do we lie?

Coming soon... yeah right ;)

Lie theory

People get stressed when telling a lie. The bigger the lie, the greater the stress. They get even more stressed when they know there is a chance of getting caught or when they know they are lying about something important. Stress is easy to detect: it is any deviation from the normal parameters, such as heart rhythm, retina reactions, eye blinking frequency (this one really works), blood pressure, body temperature, EEG graphs, and voice. Other indicators are more subjective and I don't mention them at all.

In this article, I will focus on the voice parameters that indicate stress. It is a very wide topic: there are many methods to detect a stressed voice, and some of them are very sophisticated. I have implemented the simplest one, a "poor" micro tremor detector. It works like this: it estimates the modulation of the speech by a sine wave with a frequency of 8-12 Hz. In other words, it checks whether the speech energy is jumping at intervals of 1/12 to 1/8 of a second.

How do we do that? There are many ways, but the most common one seems to be spectral analysis of the speech data frames. Spectral analysis is usually done by taking the Fourier transform of the speech data using a Fast Fourier Transform (FFT) algorithm. It is not easy to implement this algorithm, so I have taken mine from Stephan Bernsee. I just had to port some functions from C to C# (keeping them in an unsafe section).

If a DSP expert is reading this, he or she is already laughing, because this is not the right way to measure the modulation we are looking for. Yes, but in practice, if we take the energy of the 6-15 Hz band, we pretty much come up with the right modulation energy. This is not very precise, of course, but I guess it is still useful, since some people are actually selling such devices.
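
When working with an FFT, a frequency band has to be translated into bin indices. The sketch below shows that arithmetic, assuming bin k of an N-point FFT is centered at k * sampleRate / N. `FftBins` and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the ShakyVoice source.
public static class FftBins
{
    // Spacing between neighbouring FFT bins in Hz.
    public static double BinWidthHz(int sampleRateHz, int fftSize)
    {
        return (double)sampleRateHz / fftSize;
    }

    // Center frequency of a bin in Hz.
    public static double BinToHz(int bin, int sampleRateHz, int fftSize)
    {
        return bin * BinWidthHz(sampleRateHz, fftSize);
    }

    // First and last bin whose center frequency lies inside [lowHz, highHz].
    public static void BandToBins(double lowHz, double highHz, int sampleRateHz,
                                  int fftSize, out int firstBin, out int lastBin)
    {
        double width = BinWidthHz(sampleRateHz, fftSize);
        firstBin = (int)Math.Ceiling(lowHz / width);
        lastBin = (int)Math.Floor(highHz / width);
    }

    // Period in seconds of a modulation at the given frequency.
    public static double PeriodSeconds(double hz)
    {
        return 1.0 / hz;
    }
}
```

With 8,000 Hz audio and a 4096-point FFT, the bin width is 1.953125 Hz, so the 6-15 Hz band covers bins 4 through 7. Under the same assumptions, bin 7 is centered near 13.7 Hz and bin 15 near 29.3 Hz.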

Code Review

The most important part is the analysis code. It first gets a chunk of data of size FrameSize, then performs an FFT, computes the total and the tremor energy, and sets up the stress estimate. There are many, many details in how I read/record the wave file and how I display the data, and I can't really comment on everything. It is all a matter of work, nothing too bright.

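   // Process the recording frame by frame until a read comes up short.
   while(notdone)
   {
    // Read the next frame of 16-bit samples (2 bytes per sample).
    this.LieFile.inStream.Position=store_pos;
    int read=this.LieFile.inStream.Read(tmp,0,FrameSize*2);
    if(read!=FrameSize*2)
    {
     // Last (partial) frame: zero-pad it and stop after this pass.
     notdone=false;
     for(int q=read;q<FrameSize*2;q++) tmp[q]=0;
    }
    unsafe
    {
     fixed(byte *pdata=tmp)
     {
      byte *assignable=pdata;
      for(int q=0;q<2*FrameSize;q+=2)
      {
       // Little-endian 16-bit PCM -> float in about [-1,1], stored as
       // interleaved complex values (real part, then zero imaginary part).
       short tword=(short)((((int)assignable[1])<<8)|assignable[0]);
       fdata[q]=((float)tword)/((float)(0xffff>>1));
       fdata[q+1]=0;
       assignable+=2;
      }
     }

     // In-place forward FFT of 4096 complex points (sign -1).
     fixed(float * fftme=fdata)
     {
      smbFft(fftme,4096,-1);
     }
     float total=0;float tremor=0;
     // Replace each bin with its magnitude and average over all bins.
     for(int q=0;q<FrameSize;q++)
     {
      fdata[q]=(float)Math.Sqrt(Math.Pow(fdata[2*q],2)+
                Math.Pow(fdata[2*q+1],2));
      total+=fdata[q];
     }
     total/=FrameSize;
     // Tremor band: mean magnitude of bins 7..15 (9 bins). At 8 kHz each
     // of the 4096 bins is about 1.95 Hz wide, so these bins are centered
     // at roughly 13.7 to 29.3 Hz.
     for(int q=7;q<=15;q++)
     {
      tremor+=fdata[q];
     }
     tremor/=9;
     Data[1,0].Add(tremor);
     Data[1,1].Add(total);
     // Stress estimate: tremor relative to total (0 for a silent frame).
     Data[1,2].Add(total>0?tremor/total:0F);
    }
    // Advance by the hop size in bytes; consecutive frames overlap
    // when it is smaller than FrameSize*2.
    store_pos+=OverlapFactor;
    progressBar1.Value=
      (int)(100F*(float)store_pos/(float)this.LieFile.inStream.Length);
   }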
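
The per-frame computation above can also be written as a small self-contained sketch that does not depend on the program's streams or the FFT routine. It is only an illustration: it swaps the FFT for a direct (slow) discrete Fourier transform, and `FrameStress` and its method names are hypothetical, not part of ShakyVoice.

```csharp
using System;

// Hypothetical helper, not part of the ShakyVoice source.
public static class FrameStress
{
    // Decode little-endian 16-bit PCM bytes into floats in about [-1, 1].
    public static float[] DecodePcm16(byte[] bytes)
    {
        float[] samples = new float[bytes.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            short word = unchecked((short)((bytes[2 * i + 1] << 8) | bytes[2 * i]));
            samples[i] = word / 32767f;
        }
        return samples;
    }

    // Magnitude of DFT bin k, computed directly (one pass over the frame per bin).
    public static double BinMagnitude(float[] x, int k)
    {
        double re = 0, im = 0;
        for (int n = 0; n < x.Length; n++)
        {
            double angle = -2.0 * Math.PI * k * n / x.Length;
            re += x[n] * Math.Cos(angle);
            im += x[n] * Math.Sin(angle);
        }
        return Math.Sqrt(re * re + im * im);
    }

    // Mean magnitude of bins firstBin..lastBin divided by the mean magnitude
    // of all bins. Returns 0 for a silent frame.
    public static double BandRatio(float[] x, int firstBin, int lastBin)
    {
        double total = 0, band = 0;
        for (int k = 0; k < x.Length; k++)
        {
            double magnitude = BinMagnitude(x, k);
            total += magnitude;
            if (k >= firstBin && k <= lastBin) band += magnitude;
        }
        total /= x.Length;
        band /= (lastBin - firstBin + 1);
        return total > 0 ? band / total : 0;
    }
}
```

The direct transform costs on the order of N squared operations per frame, so it is suitable only for experiments on small frames; an FFT gives the same magnitudes far faster.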

What could be done further?

I will mention some more advanced techniques. A more accurate stress indicator is the shaking pitch (or fundamental frequency) of the speaker. Have you noticed that, while lying, your voice gets thin and high? Well, that's because your pitch is going higher. The algorithm I am talking about can be up to 100x more sensitive to pitch changes than your ears and brain. This would be a great boost in accuracy.
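
The article does not say which pitch algorithm is meant. As one possible starting point, the sketch below estimates the fundamental frequency of a frame with plain autocorrelation, which is a common textbook approach and not necessarily the method referred to above. `PitchSketch` is a hypothetical name, and the search range passed in is an assumption to be chosen by the caller.

```csharp
using System;

// Hypothetical helper, not part of the ShakyVoice source.
public static class PitchSketch
{
    // Estimate the fundamental frequency of one frame by picking the lag with
    // the largest autocorrelation among lags that correspond to [minHz, maxHz].
    // Returns 0 if no lag has a positive autocorrelation.
    public static double EstimateHz(float[] frame, int sampleRateHz, double minHz, double maxHz)
    {
        int minLag = Math.Max(1, (int)Math.Floor(sampleRateHz / maxHz));
        int maxLag = Math.Min(frame.Length - 1, (int)Math.Ceiling(sampleRateHz / minHz));
        int bestLag = 0;
        double best = 0;
        for (int lag = minLag; lag <= maxLag; lag++)
        {
            // Unnormalized autocorrelation at this lag.
            double sum = 0;
            for (int n = 0; n + lag < frame.Length; n++)
            {
                sum += frame[n] * frame[n + lag];
            }
            if (sum > best)
            {
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? (double)sampleRateHz / bestLag : 0;
    }
}
```

Tracking this estimate from frame to frame would show how much the pitch shakes over time. Real speech would need more care than this sketch provides, for example voiced/unvoiced detection and protection against octave errors.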

Another alternative is to measure the breathing intervals and speech speed - this one is also very accurate.

All three of these methods can be implemented on a Pocket PC device with pretty much no extra hardware/performance requirements.

Used resources

  • Maximum thanks to Stephan Bernsee for the FFT routine.
  • Brenner, M., Branscomb, H., & Schwartz, G.E. (1979). Psychological stress evaluator: Two tests of a vocal measure. Psychophysiology, 16(4), 351-357.
  • comp.dsp.
  • Many Internet sites helped me to verify the method.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Vladimir Ralev


Location: Bulgaria


Last Updated: 26 May 2004
Editor: Smitha Vijayan
Copyright 2004 by Vladimir Ralev