Decode quoted-printable data by Regex

Andreas Gieriet, 14 Feb 2012, CPOL
A "single-liner" to decode quoted-printable data.


Some data formats, such as MHTML, contain parts that are encoded as a quoted-printable data stream. The format is quite simple:

  • All printable ASCII characters may be represented by themselves, except the equal sign
  • Space and tab may remain as plain text unless they appear at the end of a line
  • All other bytes are represented by an equal sign followed by two hex digits representing the byte value
  • No line may be longer than 76 characters: a longer line is split by ending each part with a trailing equal sign (a soft line break that is removed again when decoding)
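
The rules above can be illustrated from the encoding side. The following is only a minimal sketch, not part of the original tip: the class name QuotedPrintableSketch and the method EncodeLine are hypothetical, and the sketch assumes that the input is a single line without CR/LF characters.

```csharp
using System;
using System.Text;

public static class QuotedPrintableSketch
{
    // Hypothetical helper: encodes one line of text (no CR/LF inside) as quoted-printable.
    public static string EncodeLine(string line, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(line);
        var output = new StringBuilder();
        int lineLength = 0;

        for (int i = 0; i < bytes.Length; i++)
        {
            byte b = bytes[i];
            bool isLast = i == bytes.Length - 1;

            // Printable ASCII except '=' may stand for itself.
            bool printable = b >= 33 && b <= 126 && b != (byte)'=';
            // Space and tab may stay plain unless they end the line.
            bool plainWhitespace = (b == (byte)' ' || b == (byte)'\t') && !isLast;

            string token = printable || plainWhitespace
                ? ((char)b).ToString()
                : "=" + b.ToString("X2"); // everything else: '=' plus two hex digits

            // Keep at most 75 characters per output line so that the '=' of a
            // soft line break still fits into the 76 character limit.
            if (lineLength + token.Length > 75)
            {
                output.Append("=\r\n");
                lineLength = 0;
            }

            output.Append(token);
            lineLength += token.Length;
        }

        return output.ToString();
    }
}
```

For instance, QuotedPrintableSketch.EncodeLine("a=b ", Encoding.ASCII) gives "a=3Db=20": the equal sign and the trailing space are escaped, everything else is kept as is.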


The following quoted-printable encoded text...

This is a long text with some line break and some encoding of the equal sig=
n (=3D). Any line longer than 76 characters are broken up into lines of 76 =
characters with a trailing equal sign.

...results in the following after decoding...

This is a long text with some line break and some encoding of the equal sign (=). 
  Any line longer than 76 characters are broken up into lines of 76 characters with a trailing equal sign.

The Trick

I came up with the following Regex since I could not find a suitable class in the .NET framework to decode quoted-printable data.

string raw = ...;
string txt = Regex.Replace(raw, @"=([0-9a-fA-F]{2})|=\r\n",
              m => m.Groups[1].Success
                   ? Convert.ToChar(Convert.ToInt32(m.Groups[1].Value, 16)).ToString()
                   : "");
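
The one-liner above turns every =XX escape directly into the character with that code. That is fine for ASCII and ISO-8859-1 content, but multi-byte character sets such as UTF-8 would come out garbled, and soft line breaks that use a bare LF are not removed. A possible variant is sketched below; it is an assumption-laden illustration, not the original method. The names QuotedPrintableDecoder and Decode are hypothetical, and the caller is assumed to know the character set of the part (e.g. from its Content-Type header).

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class QuotedPrintableDecoder
{
    // Hypothetical helper: decodes quoted-printable text into bytes first
    // and only then interprets the bytes in the given character set.
    public static string Decode(string raw, Encoding charset)
    {
        // Remove soft line breaks; accept CRLF as well as a bare LF.
        string joined = Regex.Replace(raw, @"=\r?\n", "");

        var bytes = new List<byte>(joined.Length);
        for (int i = 0; i < joined.Length; i++)
        {
            char c = joined[i];
            if (c == '=' && i + 2 < joined.Length
                && Uri.IsHexDigit(joined[i + 1]) && Uri.IsHexDigit(joined[i + 2]))
            {
                // '=' followed by two hex digits: one encoded byte.
                bytes.Add(Convert.ToByte(joined.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                // Assumption: literal characters are 7-bit ASCII, as the format requires.
                bytes.Add((byte)c);
            }
        }

        return charset.GetString(bytes.ToArray());
    }
}
```

For example, QuotedPrintableDecoder.Decode("Gr=C3=BCezi", Encoding.UTF8) returns "Grüezi", whereas a per-escape character conversion would produce two separate characters for the two bytes.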

Where to go from here

Once you have the decoded text, you can for example strip off all HTML tags, e.g.:

string textonly = HttpUtility.HtmlDecode(Regex.Replace(txt, @"<[\S\s]*?>", ""));
Console.WriteLine("{0}", textonly);


For example, the input

<a href="#print_link">Expression&lt;Action&lt;T&gt;&gt; expr = s =&gt; Console.WriteLine(&quot;{0}&quot;, s);

is printed as:

Expression<Action<T>> expr = s => Console.WriteLine("{0}", s);
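
HttpUtility lives in System.Web, which a console project may not reference. As a hedged alternative, the same step can be sketched with System.Net.WebUtility (available since .NET 4.0). The class name HtmlTextSketch and the method StripTags are hypothetical, and the tag pattern is the same simple one as above, so it is not a full HTML parser (a ">" inside an attribute value or a script block would confuse it).

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextSketch
{
    // Hypothetical helper: removes anything that looks like a tag,
    // then resolves HTML entities such as &lt; and &quot;.
    public static string StripTags(string html)
    {
        string withoutTags = Regex.Replace(html, @"<[\S\s]*?>", "");
        return WebUtility.HtmlDecode(withoutTags);
    }
}
```

The order matters: the tags are removed before the entities are decoded, otherwise a decoded "&lt;...&gt;" in the text would be mistaken for a tag and stripped as well.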

Finally, the plain text can be searched for some pattern, e.g.:

var q = from m in Regex.Matches(textonly,
        select m.Groups[1].Value;
q.Aggregate(0, (n, v) => { Console.WriteLine("{0}: Expression<Action<T>> {1}", ++n, v); return n; });

Possible output:

1: Expression<Action<T>> calculate
2: Expression<Action<T>> print
3: Expression<Action<T>> store
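
The search pattern of the query above is cut off in this copy of the tip. The sketch below shows one way such a search could look. The pattern is purely an assumption, chosen so that it fits the output format shown (the identifier following "Expression<Action<T>>"); it is not the original pattern. The names PatternSearchSketch and FindNames are hypothetical as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternSearchSketch
{
    // Assumed pattern (the original one is not available):
    // captures the identifier that follows "Expression<Action<T>>".
    private const string AssumedPattern = @"Expression<Action<T>>\s+(\w+)";

    public static List<string> FindNames(string textonly)
    {
        // On the .NET Framework, MatchCollection is a non-generic collection,
        // so Cast<Match>() is needed before LINQ operators can be applied.
        return Regex.Matches(textonly, AssumedPattern)
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToList();
    }
}
```

The result list can then be numbered in a plain loop, e.g. with a counter n and Console.WriteLine("{0}: Expression<Action<T>> {1}", ++n, name) for each name.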


Performance may not be optimal, but it keeps me going with my other tasks... ;-)


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Andreas Gieriet
Founder eXternSoft GmbH
Switzerland
I feel comfortable on a variety of systems (UNIX, Windows, cross-compiled embedded systems, etc.) in a variety of languages, environments, and tools.
I have a particular affinity to computer language analysis, testing, as well as quality management.

More information about what I do for a living can be found at my LinkedIn Profile and on my company's web page (German only).

Article Copyright 2012 by Andreas Gieriet