
Convert HTML to MHTML using ASP.NET

By Partenon, 14 Jun 2004
 

Introduction

Ever wanted to turn an HTML document into a report and send it to the client for offline use in Word or Excel? An RFC-compliant multipart MIME message (MHTML web archive) is a single file that contains the document together with all related material, such as linked documents and images, with the binary parts serialized inline as Base64. There is no native support for creating MHTML archives in .NET, but thanks to the Windows CDO library this is easily accomplished.

The code

The project contains three classes: mht, mhtImage and mhtImageCollection. The mht class contains the conversion functions, such as convertWebControlToMHTString, which takes a WebControl and a collection of images and returns the resulting MHT archive as a string. Use this function when rendering the WebControl depends on user-specific Session and Application variables, or when you use dynamically created images.

Public Function convertWebControlToMHTString(ByVal control As WebControl, _
  ByVal MHTimages As mhtImageCollection) As String
  'Render WebControl to html
  Dim html As String = getHtml(control)

  'If WebControl has images, make the html Word compatible
  If Not MHTimages Is Nothing Then
    fixImageLocation(html, MHTimages)
  End If

  Dim msg As New CDO.MessageClass
  Dim stm As ADODB.Stream = Nothing
  Dim MS As System.IO.MemoryStream = Nothing

  Dim iBp As CDO.IBodyPart

  'Make a multipart mhtml document
  Dim mainBody As CDO.IBodyPart
  mainBody = msg
  mainBody.ContentMediaType = "multipart/related"

  'Make the html part of the document
  iBp = mainBody.AddBodyPart()
  iBp.ContentMediaType = "text/html"
  iBp.ContentTransferEncoding = "quoted-printable"
  stm = iBp.GetDecodedContentStream
  stm.WriteText(html)
  stm.Flush()

  'Make the image parts of the document
  If Not MHTimages Is Nothing Then
    Dim oMhtImage As mhtImage
    For Each oMhtImage In MHTimages
      iBp = mainBody.AddBodyPart()
      With iBp
        .ContentMediaType = "image/" + _
          oMhtImage.ImageFormat.ToString().ToLower()
        .ContentTransferEncoding = "base64"

        'ContentLocation must be the same as in the 
        'html part to make them linked
        .Fields.Append("urn:schemas:mailheader:content-location", _
          DataTypeEnum.adBSTR, , , oMhtImage.ContentLocation)
        .Fields.Update()
        .Fields.Refresh()
      End With

      Try
        MS = New System.IO.MemoryStream
        oMhtImage.Image.Save(MS, oMhtImage.ImageFormat)
        Dim bytearray As Byte() = MS.ToArray()
        stm = iBp.GetDecodedContentStream
        stm.Write(bytearray)
        stm.Flush()
      Finally
        MS.Close()
        stm.Close()
      End Try
    Next
  End If

  stm = mainBody.GetStream()
  Return stm.ReadText(stm.Size)
End Function
      

The convertWebPageToMHTString function converts the HTML document at a given URL to an MHT archive, all images included. Use this function for public HTML documents that do not depend on user-specific Session and Application variables.

Public Function convertWebPageToMHTString(ByVal url As String) As String
    Dim msg As New CDO.MessageClass
    Dim stm As ADODB.Stream = Nothing

    Try
        msg.MimeFormatted = True
        msg.CreateMHTMLBody(url, CDO.CdoMHTMLFlags.cdoSuppressNone, "", "")
        stm = msg.GetStream()
        Return stm.ReadText(stm.Size)
    Finally
        'stm is still Nothing if CreateMHTMLBody or GetStream threw
        If Not stm Is Nothing Then
            stm.Close()
        End If
    End Try
End Function
    

The fixImageLocation procedure prepends the string "http://" to each ContentLocation that does not already contain a scheme, both in the html and in the image collection, for Word compatibility.

Private Sub fixImageLocation( _
      ByRef html As String, ByRef MHTimages As mhtImageCollection)
    Dim curContentLocation As String
    Dim curIndex As Integer
    Dim oMhtImage As mhtImage
    For Each oMhtImage In MHTimages
        curContentLocation = oMhtImage.ContentLocation
        If curContentLocation.IndexOf(":") = -1 Then
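            curIndex = html.IndexOf(curContentLocation)
            While curIndex <> -1
                html = html.Insert(curIndex, "http://")
                'Resume after the inserted prefix and the match itself, so
                'that paths of seven characters or fewer are not matched again
                curIndex = html.IndexOf(curContentLocation, curIndex + _
                    "http://".Length + curContentLocation.Length)
            End While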
            oMhtImage.ContentLocation = "http://" + curContentLocation
        End If
    Next
End Sub
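
The prefixing step can be tried in isolation with a small console program. This is only a sketch: the module name, the PrefixLocation function and the sample html are made up for illustration and are not part of the download. It resumes the search after the inserted prefix so that an already prefixed path is never matched a second time.

```vbnet
Imports System

Module FixImageLocationDemo
    'Hypothetical stand-alone version of the prefixing step for one image path
    Function PrefixLocation(ByVal html As String, ByVal location As String) As String
        Const prefix As String = "http://"

        'Locations that already contain a scheme are left untouched
        If location.IndexOf(":") <> -1 Then Return html

        Dim index As Integer = html.IndexOf(location)
        While index <> -1
            html = html.Insert(index, prefix)
            'Continue after the inserted prefix and the match itself
            index = html.IndexOf(location, index + prefix.Length + location.Length)
        End While
        Return html
    End Function

    Sub Main()
        Dim html As String = "<img src=""images/a.jpg""><img src=""images/a.jpg"">"
        Console.WriteLine(PrefixLocation(html, "images/a.jpg"))
        'Prints: <img src="http://images/a.jpg"><img src="http://images/a.jpg">
    End Sub
End Module
```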
    

The mhtImage class holds the information for one image. The Image property contains the actual image. The ContentLocation property contains the path to the image and must be exactly the same as the image source in the html part. The ImageFormat property contains the image format (jpg, gif, bmp...).

The mhtImageCollection class contains a collection of mhtImages.
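
For readers without the download at hand, the two classes might look roughly like the sketch below. The property names and the constructor argument order follow the article text and its usage example; the private fields, the parameter names and the use of CollectionBase are assumptions, and the real classes in the download may differ.

```vbnet
Imports System.Collections

'Sketch of one image entry; the real class ships with the article download
Public Class mhtImage
    Private _image As System.Drawing.Image
    Private _contentLocation As String
    Private _imageFormat As System.Drawing.Imaging.ImageFormat

    Public Sub New(ByVal img As System.Drawing.Image, ByVal location As String, _
                   ByVal imgFormat As System.Drawing.Imaging.ImageFormat)
        _image = img
        _contentLocation = location
        _imageFormat = imgFormat
    End Sub

    'The actual image data
    Public Property Image() As System.Drawing.Image
        Get
            Return _image
        End Get
        Set(ByVal Value As System.Drawing.Image)
            _image = Value
        End Set
    End Property

    'Must match the image source used in the html part
    Public Property ContentLocation() As String
        Get
            Return _contentLocation
        End Get
        Set(ByVal Value As String)
            _contentLocation = Value
        End Set
    End Property

    'Image format, for example ImageFormat.Jpeg
    Public Property ImageFormat() As System.Drawing.Imaging.ImageFormat
        Get
            Return _imageFormat
        End Get
        Set(ByVal Value As System.Drawing.Imaging.ImageFormat)
            _imageFormat = Value
        End Set
    End Property
End Class

'Sketch of the collection; CollectionBase is assumed so that For Each works
Public Class mhtImageCollection
    Inherits CollectionBase

    Public Sub add(ByVal item As mhtImage)
        List.Add(item)
    End Sub
End Class
```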

Using the code

The following example creates an MHT archive from a Panel WebControl containing one image.

Dim oMhtCol As New mhtImageCollection
oMhtCol.add(New mhtImage(System.Drawing.Image.FromFile( _
  Server.MapPath("/mhtml/images/myComputer.jpg")), _
  "images/myComputer.jpg", System.Drawing.Imaging.ImageFormat.Jpeg))
sendMHTFile(ConvertWebControlToMHTString(Panel1, oMhtCol), "myFirstMht.mht")
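
The sendMHTFile helper is not listed in the article. A minimal sketch of what such a helper could look like is given here, assuming it runs inside an ASP.NET request and that message/rfc822 is an acceptable content type for .mht files; the class name MhtDownload and the parameter names are hypothetical, and the version in the download may work differently.

```vbnet
Imports System.Web

Public Class MhtDownload
    'Hypothetical helper: sends an MHT string to the browser as a file download
    Public Shared Sub sendMHTFile(ByVal mhtContent As String, ByVal fileName As String)
        Dim response As HttpResponse = HttpContext.Current.Response

        'Discard any page output that was buffered before this call
        response.Clear()

        'Assumption: message/rfc822 is the media type used for .mht archives
        response.ContentType = "message/rfc822"
        response.AddHeader("Content-Disposition", "attachment; filename=" & fileName)

        response.Write(mhtContent)
        response.End()
    End Sub
End Class
```

In the example above the function is called without a class prefix, so in the download it is presumably a method of the page itself; the shared method here would be called as MhtDownload.sendMHTFile(...).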
    

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Partenon
Web Developer, Software developer
Sweden


Comments and Discussions

 
General: help!! (member bishwajeet, 3 Oct '08 - 20:38)
I converted this code to C# and it works fine, but it does not save every web site, for example ibnlive.com and Cricinfo.com. What could be the reason?
If anyone knows, please tell me. I will be very thankful.
 
mail_id;-bishu_473@yahoo.co.in
 
bishwajeet
General: CDO.MessageClass CreateMHTMLBody Random Format Error (member xontherocks, 29 Nov '07 - 4:42)
Hi guys, I created a function (C# .NET) to create an MHT file. It usually works perfectly, but at random the generated file loses its formatting and looks totally different from the source HTML. The error is hard to reproduce. Please let me know if you have heard about this problem and if you know a possible solution.
 
Thanks
X
 
this is the code i'm using:
 
public static void Create(string url, string path, string fileName)
{
    // Create in memory MHT
    MessageClass message = new MessageClass();
    message.CreateMHTMLBody(url, CdoMHTMLFlags.cdoSuppressNone, "", "");
    string mht = message.GetStream().ReadText(message.GetStream().Size);

    // Verify path
    System.IO.DirectoryInfo directory = new DirectoryInfo(path);
    if (!directory.Exists)
        directory.Create();

    // Write MHT file
    FileInfo file = new FileInfo(path + "\\" + fileName);
    StreamWriter writer = file.CreateText();
    writer.Write(mht);
    writer.Close();
}
Question: CreateMHTMLBody causes error (member Nancy Forbes, 10 Oct '07 - 4:01)
Are there any known issues with Windows 2003 SP2's version of CDOSYS? I have an application that uses CreateMHTMLBody that no longer works on my development machine since we upgraded to SP2. It is still working on our production machine which we haven't installed SP2 on yet.
 
I eventually get an HTTP 500 error on the web page. (Appears to just timeout) In the log file I have 202|8004004|operation_aborted_80.
 
The cdosys.dll has a date/timestamp of 2/17/2007 9:02am on my development machine. The production machine (one that works) has a date of 9/9/2005 9:32pm.
 
The application on the dev server will send text emails. It appears that just CreateMHTMLBody is broke.

 
Nancy
General: Need help converting html to mhtml (member IneedMIMEmailsNOW, 13 Sep '07 - 23:06)
Hya. very new at mhtml. I need to format a mail in html format into mhtml. The html code incl. iframes and this is key. how can i do this?
 
I downloaded the html2mhtml but i cant see how to use it.
 
So 2 things.
 
How do i use the converter? and
Can i use iframes in the html code i want to convert?
General: HTML To DOC using MHT (member gansci, 31 May '07 - 19:46)
I'm converting an HTML file to a DOC file using MHT, in order to embed images and stylesheets in my Word document. A hyperlink (with a query string) in an anchor tag is truncated automatically in the Word document, although the URL is only 400 characters long.
Does anyone have an idea about this problem,
or some other feasible way to convert HTML to DOC with embedded images and stylesheets?
General: Library (member yann_lh, 11 Apr '07 - 10:23)
It would be really nice and helpful to build this converter into a standalone .NET library (DLL) that could be used from either web or win forms applications. Any intent to do so?
General: Is there any limit for the Mht file created. (member sunil1978, 10 Jul '06 - 20:08)
I am trying to generate an MHT file from HTML. The code works fine if the HTML to be converted is small. Once it gets bigger, I get a blank MHT file. Is there a size limit on the MHT file created using this code? How can I convert big HTML files into MHT files? Any help will be appreciated.
 
Sunil
General: Getting Exception (member sunil1978, 10 Jul '06 - 0:52)
When i am trying to run the project that I downloaded, on clicking Convert button, I am getting error "External component has thrown an exception. " On debugging i found that it is throwing exception on "Dim msg As New CDO.MessageClass" statement. Exception message shows "[System.Runtime.InteropServices.SEHException] ".
Any help regarding solving the problem will be appreciated.
 


 
Sunil
Question: How can I Convert MHTML to HTML ? (member xfary, 3 Jul '06 - 16:04)
From this article I can convert html to mhtml, but if I must convert
an mhtml file to an html file, what should I do?
Please help me! Thank you!
Answer: Re: How can I Convert MHTML to HTML ? (member Richard Beacroft, 16 Apr '07 - 0:56)
Did you get an answer?
 
I need to do this. We store mhtml in a database that can be read and displayed. Unfortunately, only IE supports viewing of these pages correctly. Therefore better to convert back to html for viewing.
 
Rik
 

Question: About the convertWebPageToMHTString(...) method (member jasper16888, 14 Jun '06 - 18:24)
Hi,
I want to convert a URL to an MHT file, so I tried the convertWebPageToMHTString() method with the following test cases ...
(1) If the URL is "http://", the conversion is OK.
(2) If the URL is "https://", the conversion fails and shows me a "Security Error".
Can anyone help me? :((
Thanks A Lot,
Best Regards,
General: Re: About the convertWebPageToMHTString(...) method (member jasper16888, 14 Jun '06 - 19:19)
Hi,
The following is the code I modified ..
If the URL is "https://", the conversion fails and a new Exception is thrown to the calling program ...
 
=========================================
Try
    msg.MimeFormatted = True
    msg.CreateMHTMLBody(url, CDO.CdoMHTMLFlags.cdoSuppressNone, "", "")
    stm = msg.GetStream()
    Return stm.ReadText(stm.Size)
Catch ex As Exception
    Throw New Exception(ex.Message)
Finally
    If Not IsNothing(stm) Then
        stm.Close()
        stm = Nothing
    End If
End Try
Question: convert an asp.net generated web page in memor? (member JTW2, 26 May '06 - 12:11)
Hi, I need to fax a report that is generated by my site. The report is an HTML page with some small graphic images. The fax vendor (InterFax.Net) recommends using .MHT. Currently, I get the report into a string using:
 
dim strgWriter as New System.IO.StringWriter
server.execute("MyAspNetPageName.ASPX?someParam=value",strgWriter)
dim sReport as string = strgWriter.ToString
 
The result is the HTML in the sReport string which I simply send to InterFax using their web service. However, this leaves the <IMG> links as real links; it obviously does not embed the graphics.
 
The code presented here requires a "real" URL -- any idea how I can feed it the string? I'd very much prefer not to have to write a temporary file if possible.
 
Many thanks -- john

Answer: Re: convert an asp.net generated web page in memor? (member Partenon, 27 May '06 - 7:17)
dim sReport as string = convertWebPageToMHTString("MyAspNetPageName.ASPX?someParam=value")
 
Magnus Persson
General: Re: convert an asp.net generated web page in memor? (member JTW2, 27 May '06 - 7:45)
thanks but that doesn't help -- the page is in memory, in a text string.
General: Re: convert an asp.net generated web page in memor? (member Partenon, 27 May '06 - 7:49)
modify "convertWebControlToMHTString".
 
Replace the parameter "byval control As webControl" with "byval html As string"
and remove the line "Dim html As String = getHtml(control)"
 

 
Magnus Persson
General: Re: convert an asp.net generated web page in memor? (member JTW2, 3 Jun '06 - 1:00)
thanks!! We are now able to package up an HTML page w/ images and send it to Interfax.Net for faxing -- it works great; thank you!
 
Next question: we save the MHT to our database and would like to view it from inside the app. HTML data, we simply write it to the Response object using Response.Write(str). This doesn't work with this MHT data; the headers are displayed, etc. -- the browser (IE) does not seem to understand the content type. Do you have any ideas for what we may be doing wrong? Maybe something as simple as setting a header or Response.ContentType? Tried lots of combinations but haven't found the magic yet... TIA--john
General: Need help to convert HTML to MHTL (member huanngo, 21 Dec '05 - 20:04)
I used the function below to convert an auto-generated HTML file into MHTML so that I can attach it to an email and send it to other people. However, the code did not work if I delete the images: the MHTML file did not render any image.
Here is the code I used from previous post:
C:\WINDOWS\SYSTEM32\cdosys.dll
C:\Program Files\Common Files\System\ado\msado15.dll
 
=====================================================================
 
public void convertWebPageToMHTString(string url, string fileName)
{
    CDO.MessageClass message = new CDO.MessageClass();

    //The following method allows for a username and password
    //The last two parameters are for the username and password, respectively
    message.CreateMHTMLBody(url, CDO.CdoMHTMLFlags.cdoSuppressNone, String.Empty, String.Empty);
    ADODB.Stream stream = message.IBodyPart_GetStream();
    stream.SaveToFile(fileName, ADODB.SaveOptionsEnum.adSaveCreateOverWrite);
}
 
Thanks alot
 
CodeBattles
General: Re: Need help to convert HTML to MHTL (member lucian_davitoiu, 18 Apr '06 - 2:38)
If the URL passed to CreateMHTMLBody() contains spaces (possible in case of a file:// protocol) and cdosys.dll version is 6.2.4.0 then this code will not work; anyway it works fine with earlier versions of cdosys.dll.
 
lucian_davitoiu
 
-- modified at 12:18 Tuesday 18th April, 2006
General: Re: Need help to convert HTML to MHTL (member huanngo, 18 Apr '06 - 7:05)
Thanks for replying. It could be because of the version issues since the codes from the author didn't work either. As soon as I deleted the images, they won't render on the page anymore. Another way to do this is MIME format. I got it done a while ago.
 
Thanks,

 
CodeBattles
Question: Code is directly download of MHTM file (member Nirdesh Puri, 5 Dec '05 - 23:04)
:->when we download mht file then some time html code directly download.

 
Nirdesh Puri
Question: Any Limit of MHTML file size? (member Nirdesh Puri, 23 Nov '05 - 20:16)
:(
Hi,
 
I have a problem: when I make a large MHTML file, the mhtml file is downloaded but it is empty. Can anybody help me increase the size of the downloaded mhtml file in .NET?

 
Nirdesh Puri
Question: How to Refresh CDOSYS object? (suss Radhika Datla, 21 Sep '05 - 15:50)

Hi your article is very much useful to me. But now I am having a problem with attaching images in mail body.The story as follows!
 
For my CDO mail I am accessing multiple recipients addresses from one CSV file. At the same time I am adding a html file as body.If my HTMLfile contains any images(with IMG tag) The mail is perfectly going to first recipient.But for the second recipient, my mail is going with images in the body and additionally as attachments.For the second recipient,it's going like- mailbody with images + one same image as attachment.For third one, mailbody + 2 image attachments.....
Though I haven't attached any.
 
I thought this is because my CDO object hasn't set to "nothing" after sending mail to first recipient. But if do this my mail don't go to second person. I am using
 
ObjCDOmessage.CreateMHTMLbody("file://\\" & Bodyfilename, CDO.CdoMHTMLFlags.cdoSuppressNone)
 
to set my mail body.
How can I avoid images that are in my previous recipient's mail as attachments to my next person's mail?Please solve my Problem
 
Radhika Datla

Answer: Re: How to Refresh CDOSYS object? (suss Anonymous, 21 Sep '05 - 20:43)
Please include some additional code
General: Re: How to Refresh CDOSYS object? (suss Radhika Datla, 21 Sep '05 - 21:12)
Hi I am reading mail addresses from a CSV file .Fields are email address,recipient's name. I am sending code after setting CDO configuration and sender's address.
First I read the body file contents to "strbody". Now
 
'********************start reading the contents of the address file from the specified location********
 
finaddr = 1
ncurrentLine = 0
nStartLine = CInt(txtstartline.Text) - 1
FileOpen(finaddr, OFD1.FileName, OpenMode.Input)
If (nStartLine <> ncurrentLine) Then
For itemp = ncurrentLine + 1 To nStartLine
If (LOF(finaddr) = 0) Then
MsgBox("Startline is greater than the address line")
Exit Sub
End If
strFullLine = LineInput(Val(finaddr))
If (Len(strFullLine) = 0) Then
MsgBox("address file empty")
End If
Next
ncurrentLine = nStartLine
 
End If
Do Until EOF(finaddr)
strFullLine = LineInput(Val(finaddr))
If (Len(strFullLine) = 0) Then
Exit Do
End If
vSpacePos = Microsoft.VisualBasic.InStr(strFullLine, ",")
strEmailAddr = Microsoft.VisualBasic.Left(strFullLine, vSpacePos - 1)
strName = Microsoft.VisualBasic.Mid(strFullLine, vSpacePos + 1)
vSpacePos = Microsoft.VisualBasic.InStr(strName, ",")
If (vSpacePos <> 0) Then
strName = Microsoft.VisualBasic.Left(strName, vSpacePos - 1)
End If
strEmailTrimed = Trim(strEmailAddr)
 
Do While True
vSpacePos = InStr(strEmailTrimed, " ")
If (vSpacePos = 0) Then
Exit Do
End If
strTemp = Microsoft.VisualBasic.Left(strEmailTrimed, vSpacePos - 1)
strTemp1 = Microsoft.VisualBasic.Mid(strEmailTrimed, vSpacePos + 1)
strTemp1 = Trim(strTemp1)
strEmailTrimed = strTemp & strTemp1
Loop
strEmailAddr = strEmailTrimed
If (Len(strEmailAddr) = 0) Then
Exit Do
End If
 
'*********************Set the email to specified members in the address field**********************
 
oMsg.To = strEmailAddr
oMsg.Subject = txtsubject.Text
Dim myheader As String
myheader = "

Dear "
 
'**********************Open the bodyfile to add custom Header and recipient's name*****************
 
FileOpen(finbody, OFD2.FileName, OpenMode.Output)
Print(finbody, myheader & strName & "" & " ," & "

" & vbCrLf & strBody)
FileClose(finbody)
'MsgBox(strBody)
Dim Bodyfilename As String = OFD2.FileName
'MsgBox(Bodyfilename)
 
'*****************setting HTML mail body **********************************************************
 
oMsg.CreateMHTMLBody("file://\\" & Bodyfilename, CDO.CdoMHTMLFlags.cdoSuppressNone)
'***************************************Sending the email*******************************************
 
oMsg.Send()
 

'************After sending mail, delete the custom header from our file body for not to be repeated
'in next recipient's mail body********************************************************************
 
temfinbody = 3
FileOpen(temfinbody, OFD2.FileName, OpenMode.Output)
Print(temfinbody, strBody)
SaveFileDialog1.RestoreDirectory = True
FileClose(temfinbody)
'MsgBox(strBody)
 
'****Now deleting the attachments because they should not be repeated with the mail of the next recipient's address field****
 
If (Len(addattach.Text) <> 0) Then
For counter = 0 To lstattachment.Items.Count - 1
oMsg.Attachments.DeleteAll()
Next
End If

'******************************increment the loop to read the next addressee*******************************
 
GoTo skipped
skipped:
ncurrentLine = ncurrentLine + 1
icount = icount + 1
Loop
StatusBar1.Text = " MAIL HAS BEEN SENT SUCCESSFULLY! NUMBER OF MESSAGES SENT: " & CStr(icount)
strLastAddrFile = txtsendto.Text
 
'***********close the address file********************************************************************************
FileClose(finaddr)
icount = 0
strBody = ""

Last Updated 15 Jun 2004
Article Copyright 2004 by Partenon