DBX Parser

By Huisheng Chen, 2 Jul 2009
 

Introduction

The DBXParser reads an Outlook Express DBX file and extracts the raw (MIME) content of each message to an EML file. You can then use a MIME parser to get the actual subject, body, and attachments. As far as Google tells, it's the first and only pure C# source code that provides simple methods to read and extract: free, and without heavyweight third-party DLLs. ;)

Background

I've been searching for source code to read an Outlook Express DBX file for a very long time (believe me, it's really long). I found quite a few DBX readers written in other languages, including PHP and C++.

But none of them is pure C#, so I made my own. Porting similar code from another language is a good way to get there. I chose the DBX Parser (PHP) as the starting point because it's much simpler than the C++ version. :)

Where Are My DBX Files?

Your DBX files will normally be found here:

C:\Documents and Settings\{UserName}\Local Settings\Application Data\Identities\{Guid}\Microsoft\Outlook Express

Note the {Guid} part: it differs from computer to computer, so the simplest approach is to use Directory.GetFiles() to search all the subfolders for *.dbx files.
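
A minimal sketch of that search follows. DbxLocator and its method names are hypothetical, not part of the DBX class, and DefaultIdentitiesRoot assumes that LocalApplicationData resolves to the "Local Settings\Application Data" folder shown above (as it does on Windows XP).

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the DBX class.
public static class DbxLocator
{
    // Returns every *.dbx file under root, searching all subfolders.
    // Returns an empty array when root does not exist.
    public static string[] FindDbxFiles(string root)
    {
        if (!Directory.Exists(root))
            return new string[0];
        return Directory.GetFiles(root, "*.dbx", SearchOption.AllDirectories);
    }

    // Assumption: LocalApplicationData is the parent of the "Identities" folder.
    public static string DefaultIdentitiesRoot()
    {
        string local = Environment.GetFolderPath(
            Environment.SpecialFolder.LocalApplicationData);
        return Path.Combine(local, "Identities");
    }
}
```

Calling DbxLocator.FindDbxFiles(DbxLocator.DefaultIdentitiesRoot()) would then list the DBX files of every identity without knowing any {Guid} in advance. Directory.GetFiles can throw UnauthorizedAccessException on folders you cannot read, so real code may want to catch that.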

File Format

I am not going to go deep into the file format here, because several documents already cover it.

How It Works

It reads the raw DBX file directly, without any third-party DLLs or any DLLs provided by Outlook Express. It reads through the file, decoding it byte by byte, and internally stores the position and size of each chunk of each message for later use by the Extract method. Because it stores only the positions, it uses very little memory.
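
To illustrate why storing only positions keeps memory use low, here is a minimal sketch of the idea. It is not the actual internals of the DBX class: ChunkRef and ChunkReader are hypothetical names, and the sketch assumes each chunk is described by nothing more than a byte offset and a byte count.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical type for illustration; not the name used inside the DBX class.
public struct ChunkRef
{
    public long Position; // byte offset of the chunk's data in the file
    public int Size;      // number of data bytes in the chunk

    public ChunkRef(long position, int size)
    {
        Position = position;
        Size = size;
    }
}

public static class ChunkReader
{
    // Reassembles one message by seeking to each recorded chunk in order.
    // Only the message being extracted is ever held in memory.
    public static byte[] ReadMessage(Stream file, IList<ChunkRef> chunks)
    {
        using (MemoryStream output = new MemoryStream())
        {
            foreach (ChunkRef chunk in chunks)
            {
                byte[] buffer = new byte[chunk.Size];
                file.Seek(chunk.Position, SeekOrigin.Begin);
                int read = 0;
                while (read < chunk.Size)
                {
                    int n = file.Read(buffer, read, chunk.Size - read);
                    if (n == 0)
                        throw new EndOfStreamException("Chunk extends past end of file.");
                    read += n;
                }
                output.Write(buffer, 0, read);
            }
            return output.ToArray();
        }
    }
}
```

With this layout the index costs a few bytes per chunk, however large the messages are, and the message bytes are only read from disk when they are asked for.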

Using the Code

First create a new instance of DBX, then call the Parse method to read the file. It returns the number of messages the DBX file contains. If it returns -1, something is wrong with the file. You can then use the Extract method to save a message's content to a file or to read it into memory.

Here is some sample code:

using (DBX dbx = new DBX())
{
    // Parse returns the number of messages, or -1 if something is wrong with the file.
    int count = dbx.Parse(@"test.dbx"); // specify your DBX file here
    if (count > 0)
    {
        for (int i = 0; i < count; i++)
        {
            // Save message i to its own EML file.
            dbx.Extract(i, (i + 1) + ".eml"); // specify file to extract to

            // Or just read the content into memory:
            // string content = dbx.Extract(i);
        }
    }
}

How to Decode MIME (EML File)?

Choose whichever MIME parser you like.
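
If you only need a few header fields, such as the subject, a rough sketch like the one below may be enough. It is not a MIME parser: EmlHeaders is a hypothetical helper, it reads only the header block (everything before the first blank line), it joins folded continuation lines, and it does not decode encoded words or touch the body and attachments.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; a real MIME parser does far more.
public static class EmlHeaders
{
    // Reads header lines until the first blank line. Header names are
    // case-insensitive; a repeated header keeps its first value.
    public static Dictionary<string, string> Parse(TextReader reader)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string name = null;
        string value = null;
        string line;
        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            if ((line[0] == ' ' || line[0] == '\t') && name != null)
            {
                value += " " + line.Trim(); // folded continuation line
                continue;
            }
            Store(headers, name, value);
            int colon = line.IndexOf(':');
            if (colon <= 0)
            {
                // Not a "Name: value" line; skip it.
                name = null;
                value = null;
                continue;
            }
            name = line.Substring(0, colon).Trim();
            value = line.Substring(colon + 1).Trim();
        }
        Store(headers, name, value);
        return headers;
    }

    private static void Store(Dictionary<string, string> headers, string name, string value)
    {
        if (name != null && !headers.ContainsKey(name))
            headers[name] = value;
    }
}
```

For example, passing a StreamReader opened on an extracted EML file to EmlHeaders.Parse and looking up "Subject" gives the raw subject line. For anything beyond that (encodings, multipart bodies, attachments), use a proper MIME library.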

Points of Interest

Because I suffered a lot while looking for such code, I am contributing it here so that others won't have to go crazy looking for it. After porting the code from PHP, I feel that PHP is really mature. There is all kinds of PHP code out there, and I'm really surprised that PHP has sample code for this while Java, VB, and C# do not. I have tried to keep this code as simple as possible: read and extract, that's all, which I think fits most situations. If you have any comments or suggestions, please feel free to tell me, or just modify the code yourself.

History

  • Version 1.0 - 2009-5-1
  • Version 1.1 - 2009-5-5: some code cleanup, added VC++ code
  • Version 1.2 - 2009-7-2: fixed a problem where the exact number of mails was not returned correctly in some special situations; thanks to Cato.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Huisheng Chen
Architect, www.xnlab.com
Australia
I was born in the south of China and started writing GWBASIC code in 1993, when I was 13 years old. I work professionally with .NET (C#) and VB, and I am the founder of www.xnlab.com.

Now I am living in Sydney, Australia.


Comments and Discussions

 
Bug | It gives wrong count (skips some emails from DBX) if the body value is null | member | zhitesh | 1 Mar '12 - 5:36
Hi,
 
The parser skips any email with a null body value. Any idea?
 
Thank you
General | Re: It gives wrong count (skips some emails from DBX) if the body value is null | member | Huisheng Chen | 1 Mar '12 - 10:34
search for
System.Diagnostics.Debug.WriteLine(string.Format("wrong {0} {1}", start, end));
 
replace it with
 
{
    mails.Add(mail);
    System.Diagnostics.Debug.WriteLine(string.Format("wrong {0} {1}", start, end));
}
 
It should work.
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | My vote of 5 | member | klinkenbecker | 18 Jun '11 - 7:20
Thanks for making the effort, sharing and saving us the trouble...
General | Order of files | member | MacRaider4 | 10 Feb '10 - 5:23
I'm currently finding that the order of the files produced isn't the same as the order of the messages in the folder. Is anyone else having this problem? If I can get that figured out, this will be what I need for the project I'm working on.
General | Email count always -1 | member | mhota | 30 Nov '09 - 20:18
Hi,
I have a few DBX files, and when I try to read them it reports a mail count of -1 for all of them.
General | Re: Email count always -1 | member | Unruled Boy | 30 Nov '09 - 20:55
Could you please send me the smallest one that contains more than one email, to unruledboy at gmail.com? If it is too big, put it somewhere and give me the link.
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | dbx error | member | Peso | 19 Aug '09 - 22:25
Can I recover my big DBX file? How?
Thanks
 
Xep
General | Re: dbx error | member | Unruled Boy | 19 Aug '09 - 23:43
Sorry, I don't know. Maybe you should ask the Outlook Express team at microsoft.com.
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | My vote of 1 | member | abcd1234f | 10 Aug '09 - 8:37
-
General | Thank you | member | Pepsibot | 22 Jul '09 - 13:08
This is EXACTLY what I needed. Thank you for doing the leg work! :-D
 
Knowing what not to do is important, too. Eventually, you'll be left solely with solutions.

General | nice | member | pita2000 | 2 Jul '09 - 21:37
it looks very easy to use
General | Re: nice | member | Unruled Boy | 2 Jul '09 - 23:20
I am glad that you like it :-D
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

Question | Is it Extract only 10 Mails ? | member | Laxmilal Menaria | 30 Jun '09 - 21:15
Hello,
 
I have tried this code with my DBX file, but it extracts only 10 mails. Is there any limitation?
 
I have more than 1000 emails in this single DBX file, so what should I do?
 
Thanks in advance,
Laxmilal
Answer | Re: Is it Extract only 10 Mails ? | member | Unruled Boy | 30 Jun '09 - 21:23
It has been fixed in version 1.2, but CodeProject has not yet updated my article. For now, find:
 
indirect = true;
 
and add:
 
m_offset = _dbx_int24(info, 1);
 
after it.
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

Answer | Re: Is it Extract only 10 Mails ? | member | Unruled Boy | 2 Jul '09 - 15:52
article updated.
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

Question | How is it? | member | sjun1 | 27 May '09 - 16:15
It was a very useful reference.
Thank you. :)
 
I found an error in part of the code.
How is it? :confused:
 

private void _dbx_mail_header(int Offset)
{
    int d_offset = 0;
    int s_offset = 0;
    int m_offset = 0;
    bool indirect = false;
    byte[] mail;
    _dbx_read(Offset, out mail, 12);
    if (_dbx_int32(mail, 0) != Offset)
        throw new IndexOutOfRangeException(string.Format("Self {1} != Offset {2}", _dbx_int32(mail, 0), Offset));
    Offset += 12;
    int n = (_dbx_int32(mail, 8) >> 16) & 0xff;
    if (n != 0)
    {
        for (int i = 0; i < n; i++)
        {
            byte[] info;
            _dbx_read(Offset, out info, 4);
            switch (info[0])
            {
                case 0x0e:
                    s_offset = _dbx_int24(info, 1);
                    break;
                case 0x12:
                    d_offset = _dbx_int24(info, 1);
                    break;
                case 0x04:
                    indirect = true;
                    break;
                case 0x84:
                    m_offset = _dbx_int24(info, 1);
                    break;
            }
            Offset += 4;
        }
    }
    if (m_offset != 0)
    {
        if (indirect)
        {
            byte[] offset;
            _dbx_read(Offset + m_offset, out offset, 4);
            m_offset = _dbx_int32(offset, 0);
        }
        _dbx_mail_message(m_offset);
    }
}

////////////////////////////////////////////////
JUN
http://jsmusicbox.homeip.net/~jsmusicbox/
////////////////////////////////////////////////
General | Re: How is it? | member | Unruled Boy | 30 May '09 - 0:01
do you mean I should remove the
 
"break;"
 
after
 
"indirect = true;"
 
?
 
As I said in the article, this code is ported from PHP. I do not know exactly why the original author coded it like that. :doh:
 
Maybe you could give me the detailed error, or send me the DBX file, if you wish. :)
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | Great! | member | Southmountain | 1 May '09 - 6:11
Bravo!
 
usibility & truth

General | Re: Great! | member | Unruled Boy | 1 May '09 - 16:46
thanks. :)
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | please give comment or suggestions | member | Unruled Boy | 1 May '09 - 4:20
No matter how you vote, please leave a comment.
 
thanks a lot.
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

General | Unveil ... | member | kanu@adatapost | 30 Apr '09 - 23:26
I am really lucky. Today I thought about this concept and .... It is.
 
Thanks buddy
 
A DATAPOST COMPUTER CENTRE
(K.V Prajapati)

General | Re: Unveil ... | member | Unruled Boy | 1 May '09 - 4:19
glad you are lucky :laugh:
 
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

Last Updated 2 Jul 2009
Article Copyright 2009 by Huisheng Chen