
Convert HTML to MHTML using ASP.NET

By Partenon

An article on how to convert an HTML document with images to an MHTML document
VB, Windows, .NET, Visual Studio, ASP.NET, Dev

Posted: 14 Jun 2004
Updated: 14 Jun 2004

Introduction

Ever wanted to make a report out of an HTML document and have it sent to the client for offline use in Word or Excel? An RFC-compliant multipart MIME message (an MHTML web archive) is a single file that contains the document together with all related material, such as linked documents and images, serialized inline in their Base64-encoded form. There is no native support for creating MHTML archives in .NET, but thanks to the Windows CDO library this is easily accomplished.

The code

The project contains three classes: mht, mhtImage and mhtImageCollection. The mht class contains the conversion functions, such as convertWebControlToMHTString, which takes a WebControl and a collection of images and returns a string representation of the resulting mht archive. Use this function when the WebControl being converted depends on user-specific Session and Application variables for rendering, or when you use dynamically created images.

Public Function convertWebControlToMHTString(ByVal control As WebControl, _
  ByVal MHTimages As mhtImageCollection) As String
  'Render WebControl to html
  Dim html As String = getHtml(control)

  'If WebControl has images, make the html Word compatible
  If Not MHTimages Is Nothing Then
    fixImageLocation(html, MHTimages)
  End If

  Dim msg As New CDO.MessageClass
  Dim stm As ADODB.Stream = Nothing
  Dim MS As System.IO.MemoryStream = Nothing
  Dim iBp As CDO.IBodyPart

  'Make a multipart mhtml document
  Dim mainBody As CDO.IBodyPart
  mainBody = msg
  mainBody.ContentMediaType = "multipart/related"

  'Make the html part of the document
  iBp = mainBody.AddBodyPart()
  iBp.ContentMediaType = "text/html"
  iBp.ContentTransferEncoding = "quoted-printable"
  stm = iBp.GetDecodedContentStream
  stm.WriteText(html)
  stm.Flush()

  'Make the image parts of the document
  If Not MHTimages Is Nothing Then
    Dim oMhtImage As mhtImage
    For Each oMhtImage In MHTimages
      iBp = mainBody.AddBodyPart()
      With iBp
        .ContentMediaType = "image/" + _
          oMhtImage.ImageFormat.ToString().ToLower()
        .ContentTransferEncoding = "base64"

        'ContentLocation must be the same as in the
        'html part to make them linked
        .Fields.Append("urn:schemas:mailheader:content-location", _
          ADODB.DataTypeEnum.adBSTR, , , oMhtImage.ContentLocation)
        .Fields.Update()
        .Fields.Refresh()
      End With

      'Forget the previous part's stream so that Finally only
      'closes a stream opened for this image
      stm = Nothing
      Try
        MS = New System.IO.MemoryStream
        oMhtImage.Image.Save(MS, oMhtImage.ImageFormat)
        Dim bytearray As Byte() = MS.ToArray()
        stm = iBp.GetDecodedContentStream
        stm.Write(bytearray)
        stm.Flush()
      Finally
        If Not MS Is Nothing Then MS.Close()
        If Not stm Is Nothing Then stm.Close()
      End Try
    Next
  End If

  'Serialize the whole message and release the stream afterwards
  stm = mainBody.GetStream()
  Try
    Return stm.ReadText(stm.Size)
  Finally
    stm.Close()
  End Try
End Function
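
The getHtml helper called at the top of convertWebControlToMHTString is not listed in the article. The following is only a minimal sketch of how such a helper could look, assuming it does nothing more than render the control into a string; the module name RenderSketch is made up for illustration, and the real helper in the download may differ.

```vbnet
Imports System.IO
Imports System.Web.UI
Imports System.Web.UI.WebControls

Public Module RenderSketch
    'Hypothetical stand-in for the article's getHtml helper:
    'renders a WebControl into an html string
    Public Function getHtml(ByVal control As WebControl) As String
        Dim sw As New StringWriter
        Dim writer As New HtmlTextWriter(sw)
        Try
            'RenderControl writes the control's markup to the writer
            control.RenderControl(writer)
            writer.Flush()
            Return sw.ToString()
        Finally
            'Closing the HtmlTextWriter also closes the StringWriter
            writer.Close()
        End Try
    End Function
End Module
```

Some controls refuse to render outside a server form, so a helper like this may need extra handling depending on what the Panel contains.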
      

The convertWebPageToMHTString function converts an HTML document at a specific URL to an mht archive, all images included. Use this function for public HTML documents that do not depend on user-specific Session and Application variables.

Public Function convertWebPageToMHTString(ByVal url As String) As String
    Dim msg As New CDO.MessageClass
    Dim stm As ADODB.Stream = Nothing

    Try
        msg.MimeFormatted = True
        msg.CreateMHTMLBody(url, CDO.CdoMHTMLFlags.cdoSuppressNone, "", "")
        stm = msg.GetStream()
        Return stm.ReadText(stm.Size)
    Finally
        'stm is still Nothing if CreateMHTMLBody failed
        If Not stm Is Nothing Then stm.Close()
    End Try
End Function
    

The fixImageLocation sub prepends the string "http://" to each ContentLocation that does not already contain a scheme, both in the html and in the image collection, for Word compliance.

Private Sub fixImageLocation( _
      ByRef html As String, ByRef MHTimages As mhtImageCollection)
    Const prefix As String = "http://"
    Dim curContentLocation As String
    Dim curIndex As Integer
    Dim oMhtImage As mhtImage
    For Each oMhtImage In MHTimages
        curContentLocation = oMhtImage.ContentLocation
        'Only locations without a scheme need the prefix
        If curContentLocation.IndexOf(":") = -1 Then
            curIndex = html.IndexOf(curContentLocation)
            While curIndex <> -1
                html = html.Insert(curIndex, prefix)
                'The insert shifted the match by prefix.Length, so resume
                'after both the prefix and the location; otherwise a short
                'location would be found again and the loop would never end
                curIndex = html.IndexOf(curContentLocation, curIndex + _
                    prefix.Length + curContentLocation.Length)
            End While
            oMhtImage.ContentLocation = prefix + curContentLocation
        End If
    Next
End Sub
    

The mhtImage class holds the information for one image. The Image property contains the actual image. The ContentLocation property contains the path to the image and must be exactly the same as the image's source in the html part. The ImageFormat property contains the image format (jpg, gif, bmp...).

The mhtImageCollection class contains a collection of mhtImages.
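
The source of these two classes is not listed in the article. As a rough sketch of their shape, assuming only what the text above and the usage example show (three properties, a three-argument constructor, and an add method on the collection), they could look like this. The private field names and the choice of CollectionBase are assumptions, not the author's actual implementation.

```vbnet
Imports System.Collections

'Hypothetical reconstruction of the image holder described above
Public Class mhtImage
    Private _image As System.Drawing.Image
    Private _contentLocation As String
    Private _imageFormat As System.Drawing.Imaging.ImageFormat

    Public Sub New(ByVal image As System.Drawing.Image, _
      ByVal contentLocation As String, _
      ByVal imageFormat As System.Drawing.Imaging.ImageFormat)
        _image = image
        _contentLocation = contentLocation
        _imageFormat = imageFormat
    End Sub

    'The actual image
    Public Property Image() As System.Drawing.Image
        Get
            Return _image
        End Get
        Set(ByVal value As System.Drawing.Image)
            _image = value
        End Set
    End Property

    'Must match the image source used in the html part
    Public Property ContentLocation() As String
        Get
            Return _contentLocation
        End Get
        Set(ByVal value As String)
            _contentLocation = value
        End Set
    End Property

    'Format used when the image is saved into the archive
    Public Property ImageFormat() As System.Drawing.Imaging.ImageFormat
        Get
            Return _imageFormat
        End Get
        Set(ByVal value As System.Drawing.Imaging.ImageFormat)
            _imageFormat = value
        End Set
    End Property
End Class

'Hypothetical collection; CollectionBase supplies enumeration for For Each
Public Class mhtImageCollection
    Inherits CollectionBase

    Public Sub add(ByVal item As mhtImage)
        List.Add(item)
    End Sub
End Class
```

ContentLocation needs a setter because fixImageLocation rewrites it after adding the "http://" prefix.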

Using the code

An example of how to make an mht archive from a Panel WebControl containing one image:

Dim oMhtCol As New mhtImageCollection
oMhtCol.add(New mhtImage(System.Drawing.Image.FromFile( _
  Server.MapPath("/mhtml/images/myComputer.jpg")), _
  "images/myComputer.jpg", System.Drawing.Imaging.ImageFormat.Jpeg))
sendMHTFile(ConvertWebControlToMHTString(Panel1, oMhtCol), "myFirstMht.mht")
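
The sendMHTFile call above is not defined anywhere in the article text. A minimal sketch of such a helper, assuming it simply streams the archive string to the browser as a download, might look like the following. The module name, the "message/rfc822" content type and the attachment disposition are assumptions; a different content type may be needed if the file should open directly in Word or Excel.

```vbnet
Imports System.Web

Public Module MhtResponseSketch
    'Hypothetical helper: sends the mht string to the client as a file
    Public Sub sendMHTFile(ByVal mhtContent As String, ByVal fileName As String)
        Dim response As HttpResponse = HttpContext.Current.Response

        'Drop anything already buffered for the page
        response.Clear()

        'Assumed content type for an mht archive
        response.ContentType = "message/rfc822"

        'Ask the browser to save the content under the given name
        response.AddHeader("Content-Disposition", _
          "attachment; filename=""" & fileName & """")

        response.Write(mhtContent)

        'Stop processing so no page markup is appended to the archive
        response.End()
    End Sub
End Module
```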
    

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Partenon


Software developer
Occupation: Web Developer
Location: Sweden

Editor: Nishant Sivakumar
Copyright 2004 by Partenon