
An Introduction to VoiceXML

Keshav V. Kamat, 19 Apr 2007
This short article introduces VoiceXML, its origins and its applications.

Introduction

Voice Extensible Markup Language (VoiceXML) is a markup language for creating voice user interfaces that use automatic speech recognition (ASR) and text-to-speech synthesis (TTS). The VoiceXML Forum was formed in March 1999 by AT&T, IBM, Lucent and Motorola to promote and accelerate the adoption of VoiceXML-based applications worldwide.

Today, more than 10,000 commercial VoiceXML-based speech applications have been deployed across a diverse set of industries, including financial services, government, insurance, retail, telecommunications, transportation, travel and hospitality. Millions of calls are answered by VoiceXML applications every day!

VoiceXML was designed for, and is used today for, creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony and mixed-initiative conversations.

Using the code

Consider the examples below. The first one simply speaks "Hello, this is Keshav!" to the caller.

<?xml version="1.0"?>
<vxml version="1.0">
   <form>
      <!-- A block holds content that is simply played to the caller -->
      <block>Hello, this is Keshav!</block>
   </form>
</vxml>
 

The <vxml> element is a container for dialogs, of which there are two types: forms and menus. Forms present information and gather input from the user, while menus offer a choice of what to do next. The example above simply plays "Hello, this is Keshav!" to the user. The conversation ends there, since the form does not specify a successor dialog.
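The example above uses a form. For comparison, a menu might look like the sketch below. It is only an illustration: the two target URLs are hypothetical and do not come from this article, and how the choices are read out depends on the platform.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
   <menu>
      <!-- <enumerate/> reads out the text of each choice in turn -->
      <prompt>Please say one of: <enumerate/></prompt>

      <!-- Hypothetical targets: each choice names the next document to load -->
      <choice next="http://www.coffee.example/order.vxml">Order a drink</choice>
      <choice next="http://www.coffee.example/hours.vxml">Opening hours</choice>
   </menu>
</vxml>
```

Saying one of the choices moves the conversation on to the corresponding document, which is what distinguishes a menu from a form that collects values.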

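<?xml version="1.0"?>
<vxml version="1.0">
   <form>
      <!-- Ask for a drink; the reply must match the external JSGF grammar -->
      <field name="coffee">
          <prompt>
            Would you like to have espresso, cappuccino, mocha or nothing?
          </prompt>
          <grammar src="drink.gram" type="application/x-jsgf"/>
      </field>

      <!-- Once the field is filled, send it to the server script -->
      <block>
         <submit next="http://www.coffee.example/coffee2.asp"/>
      </block>
   </form>
</vxml>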
  

This example asks the user to choose a coffee and submits the choice to a server script. A typical interaction between the computer (C) and a human (H) might go as follows:

C: Would you like to have espresso, cappuccino, mocha or nothing?

H: Darjeeling tea.

C: I did not understand what you said.

C: Would you like to have espresso, cappuccino, mocha or nothing?

H: Mocha.

C: (continues in document coffee2.asp)
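In the exchange above, the "I did not understand" reply would typically come from the platform's default handling of an unrecognized answer. An author who wants to control that wording could add handlers to the field, roughly as in the sketch below. The messages are illustrative, and the exact default behaviour varies by platform.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
   <form>
      <field name="coffee">
         <prompt>
            Would you like to have espresso, cappuccino, mocha or nothing?
         </prompt>
         <grammar src="drink.gram" type="application/x-jsgf"/>

         <!-- Runs when the caller says nothing before the timeout -->
         <noinput>
            <prompt>I did not hear anything.</prompt>
            <reprompt/>
         </noinput>

         <!-- Runs when the reply does not match the grammar, e.g. "Darjeeling tea" -->
         <nomatch>
            <prompt>I did not understand what you said.</prompt>
            <reprompt/>
         </nomatch>
      </field>

      <block>
         <submit next="http://www.coffee.example/coffee2.asp"/>
      </block>
   </form>
</vxml>
```

The <reprompt/> element asks the interpreter to play the field's prompt again, which reproduces the repeated question in the dialog above.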

Architectural model

The architectural model of VoiceXML comprises the following components:

1) Document server

2) VoiceXML interpreter

3) VoiceXML interpreter context

4) Implementation platform

A document server (for example, a web server) processes requests from a client application, the VoiceXML interpreter, which arrive through the VoiceXML interpreter context. In reply, the server produces VoiceXML documents, which are processed by the VoiceXML interpreter.

The VoiceXML interpreter context may monitor user inputs in parallel with the VoiceXML interpreter. The implementation platform is controlled by both the VoiceXML interpreter context and the VoiceXML interpreter. The platform generates events in response to user actions (such as spoken or key input, or a hang-up) and system events (such as a timer expiring).
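From the document author's side, this architecture shows up mainly as events that can be caught. The sketch below assumes a hypothetical next document, menu.vxml, on the document server. If the interpreter cannot fetch it, an error.badfetch event is thrown and the document-level handler runs. The wording of the message is illustrative.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
   <!-- Document-level handler for a failed fetch from the document server -->
   <catch event="error.badfetch">
      <prompt>Sorry, the service is not available right now. Goodbye.</prompt>
      <exit/>
   </catch>

   <form>
      <block>
         <!-- Hypothetical URL: asks the interpreter to fetch the next document -->
         <goto next="http://www.coffee.example/menu.vxml"/>
      </block>
   </form>
</vxml>
```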

Goals

VoiceXML's main goal is to bring the full power of web development and content delivery to voice response applications, and to free the authors of such applications from low-level programming and resource management. It enables integration of voice services with data services using the familiar client-server paradigm.

VoiceXML:

  • Minimizes client-server interactions by specifying multiple interactions per document.
  • Shields application authors from low-level and platform-specific details.
  • Separates user interaction code (in VoiceXML) from service logic (for example, CGI scripts).
  • Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers and platform providers.
  • Is easy to use for simple interactions, yet provides language features to support complex dialogs.
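The first and third points above can be illustrated with a single document that gathers two answers before contacting the server once. This is a sketch under assumptions: the size field and the size.gram grammar are hypothetical additions to the earlier coffee example.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
   <form>
      <field name="coffee">
         <prompt>Would you like to have espresso, cappuccino, mocha or nothing?</prompt>
         <grammar src="drink.gram" type="application/x-jsgf"/>
      </field>

      <!-- Hypothetical second question, answered within the same document -->
      <field name="size">
         <prompt>Small, medium or large?</prompt>
         <grammar src="size.gram" type="application/x-jsgf"/>
      </field>

      <!-- One round trip: both values go to the server script together -->
      <block>
         <submit next="http://www.coffee.example/coffee2.asp" namelist="coffee size"/>
      </block>
   </form>
</vxml>
```

The dialog itself lives entirely in the VoiceXML document, while whatever is done with the two values is left to the server script.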

A possible shortcoming

While VoiceXML strives to accommodate the requirements of the majority of voice response services, services with stringent requirements may be better served by dedicated applications that employ a finer level of control.

Conclusion

VoiceXML is a promising option both today and for the future. To learn more about the VoiceXML Forum and its membership, visit http://www.voicexml.org

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Keshav V. Kamat
Web Developer
India

Last Updated 20 Apr 2007
Article Copyright 2007 by Keshav V. Kamat
Everything else Copyright © CodeProject, 1999-2014