WBXML Parser

11 Apr 2007
A library for compressing XML content suitable for mobile devices.

Introduction

When you browse the web on a mobile phone, the phone generally receives content in a format called WML (Wireless Markup Language). WML is a derivative of XML specific to wireless devices, and it displays content in a manner suited to a phone's small screen. When WML pages are sent from the network to the phone, they are compressed to reduce the number of bytes transmitted (and hence the cost of the download to the customer) and to speed things up.

The compression is achieved in two main ways. Well-known tags, attributes, and attribute values are replaced with binary codes, and recurring strings are stored once in a lookup table.
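The two ideas can be sketched in a few lines. This is only an illustration and not the code from the library: the names WbxmlSketch, TagToToken, and StringTableSketch are hypothetical, and the three token values are quoted from memory of the WML tables, so check them against the specification before relying on them.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module WbxmlSketch
    ' Illustrative tag tokens (assumed values). Real values come from the
    ' code page tables of the relevant specification.
    Private ReadOnly TagTokens As New Dictionary(Of String, Integer) From {
        {"p", &H20}, {"card", &H27}, {"wml", &H3F}
    }

    ' Flag bits that WBXML sets on a tag token.
    Private Const ContentBit As Integer = &H40
    Private Const AttributeBit As Integer = &H80

    ' Replaces a well-known tag name with its one-byte token.
    ' Unknown tags throw here; a real encoder would emit a literal instead.
    Public Function TagToToken(name As String, hasContent As Boolean,
                               hasAttributes As Boolean) As Integer
        Dim token As Integer = TagTokens(name)
        If hasContent Then token = token Or ContentBit
        If hasAttributes Then token = token Or AttributeBit
        Return token
    End Function
End Module

' Stores each recurring string once and hands back its byte offset.
Public Class StringTableSketch
    Private ReadOnly offsets As New Dictionary(Of String, Integer)
    Private nextOffset As Integer = 0

    Public Function OffsetOf(text As String) As Integer
        Dim offset As Integer
        If Not offsets.TryGetValue(text, offset) Then
            offset = nextOffset
            offsets.Add(text, offset)
            ' Each entry is followed by a terminating null byte.
            nextOffset += Encoding.UTF8.GetByteCount(text) + 1
        End If
        Return offset
    End Function
End Class
```

With these assumed values, a wml element that has content but no attributes becomes the single byte &H7F, and a string that appears a second time costs only a reference to its offset.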

The technique is not specific to WML. Another common application is the over-the-air transmission of network settings to a mobile phone.

In general, the same techniques are used to encode and decode every document type. However, each XML document type has its own lookup tables for the well-known tags, attributes, and attribute values. There are about forty publicly defined XML document types. You could also define your own tables!

The WBXML encoding was originally defined by the WAP Forum. The specifications from this body are now maintained by the Open Mobile Alliance (OMA); see the OMA Specifications.

As I said earlier, you can define your own tables, and this is what Nokia and Sony Ericsson did for bookmarks and settings: Nokia-Ericsson OTA. Another example is Phone.com but the specs defined by this company are not easily found.

Using the code

I've designed the library so it only takes a few lines of code to encode and decode a page. Below is an example:

VB
Dim FileName As String = "C:\Documents " & _
    "and Settings\John Joe\My Documents\ML Pages\bbc2.wml"
Dim FileContents As String = _
    My.Computer.FileSystem.ReadAllText(FileName)

' Encode_Page and Decode_Page set Content_Type to the detected type
Dim Content_Type As String = ""
Dim WBXML_Parser As New XML.WBXML_Coder_Class

' Let's encode the wml page
Dim Encoded_Page As String = _
    WBXML_Parser.Encode_Page(FileContents, Content_Type)
' The content type should now be "wmlc"
Debug.Print(Content_Type)

' Decode the wmlc string
Dim Decoded_Page As String = _
    WBXML_Parser.Decode_Page(Encoded_Page, Content_Type)

' The content type should now be "wml"
Debug.Print(Content_Type)
' We should have the same page that we started with.
Debug.Print(Decoded_Page)

Content Type

Most XML document types have a well-known Content Type. For example, WML has the content types "application/vnd.wap.wmlc" (binary encoded) and "text/vnd.wap.wml" (plain text). The full list is published on the OMA web site.

When you are encoding/decoding a page, the class will automatically try to detect the type of XML document involved. If it finds a match, it will set the content_type parameter of the calling function.

Character-set

The WBXML specification allows the use of different character-sets such as UTF-8 and UTF-16. The code is not currently set up to handle multi-byte characters.

HMAC

When you send a provisioning message to a phone, you need to include a keyed hash (HMAC). This is computed from a key and the message itself. The key can be a PIN, or alternatively the IMSI or a network PIN. Please note that the HMAC_Calculate function is overloaded.

The code example below shows how to get the HMAC when you supply a PIN code of 4729. Note that the HMAC is calculated using a hex encoded ASCII string, i.e., the null character is "00" and so on.

VB
Dim FileName As String = "C:\Documents and" & _
    " Settings\John Joe\My Documents\ML Pages\bbc2.wml"
Dim FileContents As String = _
    My.Computer.FileSystem.ReadAllText(FileName)

Dim Content_Type As String = ""
Dim WBXML_Parser As New XML.WBXML_Coder_Class
Dim Toolbox As New ToolBox_Class

' Let's encode the wml page
Dim Encoded_Page As String = _
    WBXML_Parser.Encode_Page(FileContents, Content_Type)

' Calculate the HMAC over the hex encoded page, using the PIN as the key
Dim HMAC As String = WBXML_Parser.HMAC_Calculate("4729", _
                     Toolbox.Ascii_to_Hex(Encoded_Page))
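For readers who want to see what a calculation of this kind involves, here is a minimal sketch built on the .NET HMACSHA1 class. It is not the library's HMAC_Calculate: the helper names HmacSha1Hex and HexToBytes are hypothetical, and the choice of SHA-1 with the ASCII digits of the PIN as the key is my assumption about the provisioning scheme, so verify it against the specification you are targeting.

```vb
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module HmacSketch
    ' Hypothetical helper: returns the HMAC-SHA1 of the message as upper-case hex.
    Public Function HmacSha1Hex(key As Byte(), message As Byte()) As String
        Using hmac As New HMACSHA1(key)
            Dim digest As Byte() = hmac.ComputeHash(message)
            Return BitConverter.ToString(digest).Replace("-", "")
        End Using
    End Function

    ' Converts a hex string such as "0301" into the bytes it represents.
    Public Function HexToBytes(hex As String) As Byte()
        Dim bytes(hex.Length \ 2 - 1) As Byte
        For i As Integer = 0 To bytes.Length - 1
            bytes(i) = Convert.ToByte(hex.Substring(i * 2, 2), 16)
        Next
        Return bytes
    End Function
End Module
```

Under that assumption, the key for a PIN of 4729 would be Encoding.ASCII.GetBytes("4729"), and the message would be the encoded page converted back from its hex form with HexToBytes.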

Points of interest

Embedded SyncML docs

The SyncML specification consists of three sub-specifications: SyncML, MetInf (meta information), and DevInf (device information). In some of the examples for MetInf, there are embedded DevInf documents. The technique for encoding these is to encode the DevInf section on its own and then embed it in the SyncML document as opaque data.
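As a rough sketch of the embedding step: in WBXML, opaque data is written as the OPAQUE global token (&HC3), followed by the length as a multi-byte integer, followed by the raw bytes. The module and function names below are hypothetical and are not taken from the library.

```vb
Imports System
Imports System.Collections.Generic

Module OpaqueSketch
    ' WBXML global token that introduces opaque data.
    Private Const OpaqueToken As Byte = &HC3

    ' Encodes a non-negative length as a WBXML multi-byte integer:
    ' 7 bits per byte, most significant group first, with the
    ' continuation flag (&H80) set on every byte except the last.
    Public Function EncodeMultiByteInt(value As Integer) As Byte()
        If value < 0 Then Throw New ArgumentOutOfRangeException("value")
        Dim groups As New List(Of Byte)
        groups.Add(CByte(value And &H7F))
        value >>= 7
        While value > 0
            groups.Insert(0, CByte((value And &H7F) Or &H80))
            value >>= 7
        End While
        Return groups.ToArray()
    End Function

    ' Wraps an already encoded document (for example a DevInf section)
    ' so that it can be embedded in an outer document as opaque data.
    Public Function WrapAsOpaque(payload As Byte()) As Byte()
        Dim result As New List(Of Byte)
        result.Add(OpaqueToken)
        result.AddRange(EncodeMultiByteInt(payload.Length))
        result.AddRange(payload)
        Return result.ToArray()
    End Function
End Module
```

The decoder does the reverse: on meeting the OPAQUE token it reads the length, takes that many bytes, and, for SyncML, decodes them as a separate WBXML document.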

The different uses of XML

This is one feature of XML that I didn't appreciate until I started working with multiple XML document types. In a WML document, anything which is not a tag/attribute is generally presentation data, i.e., what the user sees on the mobile phone screen. So really, it's pretty much like XHTML. In a SyncML document, anything which is not a tag/attribute is data that the phone itself uses.

Visual Basic collections

The code for this library is several thousand lines, but most of it defines constants for well-known tags and attributes. I've written a few subclasses which deal with the storage and retrieval of these constants. One of the things I would have liked to have done is to use a predefined .NET collection for this. The problem (my lack of knowledge) is that these constants are referenced using both their string values and the corresponding hex values. So you need to do lookups on both values.

With all the predefined collections, you can do a quick lookup using the index/key with the "item" function, but to do a lookup on the value (and get the corresponding index), you need to use a For Each loop. So, I've cobbled together a collection based on structures and standard arrays.
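One alternative, which is not what the library does, is to keep two generic dictionaries in step, one keyed by name and one keyed by token. The class name TokenTable and its members are hypothetical, and the sketch assumes that names and tokens are unique within one table.

```vb
Imports System
Imports System.Collections.Generic

' A two-way table: look up a token by name, or a name by token.
Public Class TokenTable
    Private ReadOnly nameToToken As New Dictionary(Of String, Integer)
    Private ReadOnly tokenToName As New Dictionary(Of Integer, String)

    ' Adds a pair to both dictionaries, refusing duplicates up front
    ' so that the two never get out of step.
    Public Sub Add(name As String, token As Integer)
        If nameToToken.ContainsKey(name) OrElse tokenToName.ContainsKey(token) Then
            Throw New ArgumentException("Duplicate name or token.")
        End If
        nameToToken.Add(name, token)
        tokenToName.Add(token, name)
    End Sub

    ' Used when encoding: string value to hex value.
    Public Function TryGetToken(name As String, ByRef token As Integer) As Boolean
        Return nameToToken.TryGetValue(name, token)
    End Function

    ' Used when decoding: hex value to string value.
    Public Function TryGetName(token As Integer, ByRef name As String) As Boolean
        Return tokenToName.TryGetValue(token, name)
    End Function
End Class
```

Both lookups are then hash lookups rather than a For Each loop, at the cost of storing every constant twice.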

Opaque data

I've found only two specifications which use opaque data: Service Indication and SyncML. If there are other specifications which use opaque data, the code will need to be updated in order to correctly handle it.

Performance

In order to keep the time spent loading and unloading constants to a minimum, only those relevant to the current XML document are loaded.

Multitude of specification documents

In the beginning, a long, long time ago, there was one. The WML Binary Specification defines the basic methods for WBXML encoding, and defines the constants for WML itself. All other specifications are built on top of this core specification. You then get specifications for other document types, along with different versions of these specifications. A later version of a specification does not necessarily supersede an earlier version. Constants are removed as well as added as versions increment, so you have to treat every specification/version pair individually. Hence the huge lists of constants.

Bugs

This is not production-quality code by any stretch of the imagination. I've tested the code with all the samples I've found. If there is a specific spec/document that does not work, please post it and I'll have a look. Of course, you can try fixing it yourself!

History

  • 1 June 2006: Initial posting.
  • 13 June 2006: I've updated the code to support embedded DevInf in SyncML documents.
  • 4 April 2007: Added support for calculating the HMAC using IMSI and network PIN. Changed the handling of attributes in SyncML documents.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Web Developer
United Kingdom
