Reduce The Size Of Your ASP.NET Output
By Michael Russell
Describes the use of HttpResponse.Filter to reduce the size of your outgoing .ASPX files.
VB.NET 1.0, Win2K, WinXP, ASP.NET, Visual Studio, Dev

This article demonstrates how to use HttpResponse.Filter to easily reduce the output size of your website.
Recently, we redesigned the web site for Layton City. Because the redesign made it much easier for citizens to find what they were looking for, our hits per day nearly tripled overnight. Unfortunately, so did our bandwidth. We're currently serving almost 60Mb a day of just HTML. That doesn't include images or Adobe® Reader® documents. So priority #1 became reducing our bandwidth without reducing usability or having to rewrite the majority of our pages.
One downside of using some of the ASP.NET controls is that they insert lots of whitespace characters so that developers can easily see where problems are. While that is desirable during debugging, there is no means of turning that functionality off when you have released your site.
After finding an article on HttpResponse.Filter in the Longhorn SDK, we decided to use HttpResponse.Filter to intercept our outgoing HTML and squish it.
Add the WhitespaceFilter class to your project, and add the following line of code into the Application_BeginRequest function in your Global.asax file:
Sub Application_BeginRequest(ByVal sender As Object, ByVal e As EventArgs)
Response.Filter = New WhitespaceFilter(Response.Filter)
End Sub
The above code causes the compressor to be added to every single page in your application. Alternatively, if you only want to compress individual pages, you can add the line to the Page_Load event.
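For the per-page variant, a minimal sketch of a code-behind class might look like the following. The class name ContactPage is hypothetical, and the sketch assumes the WhitespaceFilter class from this article is part of the same project.

```vbnet
Imports System
Imports System.Web.UI

' Hypothetical code-behind class; only this page's output is filtered.
Public Class ContactPage
    Inherits Page

    Private Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) _
        Handles MyBase.Load
        ' Wrap the existing output filter so the page's HTML passes
        ' through WhitespaceFilter before it is sent to the client.
        Response.Filter = New WhitespaceFilter(Response.Filter)
    End Sub
End Class
```

If you take this route, leave the line out of Application_BeginRequest; otherwise the page's output would pass through the filter twice.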
The WhitespaceFilter class follows. Comments are inline. Some of the odder lines are there to help compress specific portions of our website. (Updated 1/23/2004)
Imports System.IO
Imports System.Text.RegularExpressions
' This filter gets rid of all unnecessary whitespace in the output.
Public Class WhitespaceFilter
Inherits Stream
Private _sink As Stream
Private _position As Long
Public Sub New(ByVal sink As Stream)
_sink = sink
End Sub 'New
#Region " Code that will most likely never change from filter to filter. "
' The following members of Stream must be overridden.
Public Overrides ReadOnly Property CanRead() As Boolean
Get
Return True
End Get
End Property
Public Overrides ReadOnly Property CanSeek() As Boolean
Get
Return True
End Get
End Property
Public Overrides ReadOnly Property CanWrite() As Boolean
Get
Return True
End Get
End Property
Public Overrides ReadOnly Property Length() As Long
Get
Return 0
End Get
End Property
Public Overrides Property Position() As Long
Get
Return _position
End Get
Set(ByVal Value As Long)
_position = Value
End Set
End Property
Public Overrides Function Seek(ByVal offset As Long, _
ByVal direction As System.IO.SeekOrigin) As Long
Return _sink.Seek(offset, direction)
End Function 'Seek
Public Overrides Sub SetLength(ByVal length As Long)
_sink.SetLength(length)
End Sub 'SetLength
Public Overrides Sub Close()
_sink.Close()
End Sub 'Close
Public Overrides Sub Flush()
_sink.Flush()
End Sub 'Flush
Public Overrides Function Read(ByVal MyBuffer() As Byte, _
ByVal offset As Integer, ByVal count As Integer) As Integer
' Pass the number of bytes read back to the caller.
Return _sink.Read(MyBuffer, offset, count)
End Function
#End Region
' Write is the method that actually does the filtering.
Public Overrides Sub Write(ByVal MyBuffer() As Byte, _
ByVal offset As Integer, ByVal count As Integer)
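' VB array bounds are inclusive, so the upper bound is count - 1.
' Dim data(count) would add a trailing zero byte to every chunk.
Dim data(count - 1) As Byte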
Buffer.BlockCopy(MyBuffer, offset, data, 0, count)
' Don't use ASCII encoding here. The .NET IDE replaces
' some characters, such as ®,
' with a UTF-8 entity. If you use ASCII encoding,
' you'll get garbled characters instead of the registered
' trademark symbol.
Dim s As String = System.Text.Encoding.UTF8.GetString(data)
' Replace control characters with either spaces or nothing
' The funky semi-colon handling is there because
' of a JavaScript comment in a component.
' This way, we keep the carriage returns that actually matter.
s = s.Replace(ControlChars.Cr, _
Chr(255)).Replace(ControlChars.Lf, _
"").Replace(ControlChars.Tab, "")
s = s.Replace(";" & Chr(255), ";" & ControlChars.Cr)
s = s.Replace(Chr(255), " ")
' Eliminate excess whitespace.
' Collapse runs of spaces: replace two spaces with one
' until no double space remains.
Do
s = s.Replace("  ", " ")
Loop Until s.IndexOf("  ") = -1
' Eliminate known comments.
' We use three comments in our template. These comments
' go on every single page on the site.
' Obviously, we can kill them when they are going out.
' This way, the comments stay in for
' maintenance, but are trimmed before release.
s = s.Replace("<!-- Page Content Goes Above Here -->", "")
s = s.Replace("<!-- Page Content Goes Below Here -->", "")
s = s.Replace("<!-- Do not get rid of this on data pages -->", "")
' Eliminate some additional whitespace we can kill
' For some reason, a single space gets emitted
' before each of our DOCTYPE directives.
s = s.Replace(" <!DOCTYPE", "<!DOCTYPE")
' These are the most common excess whitespace items we can remove.
s = s.Replace("<li> ", _
"<li>").Replace("</td> ", _
"</td>").Replace("</tr> ", _
"</tr>").Replace("</ul> ", _
"</ul>").Replace("</table> ", _
"</table>").Replace("</li> ", "</li>")
s = s.Replace("<LI> ", _
"<LI>").Replace("</TD> ", _
"</TD>").Replace("</TR> ", _
"</TR>").Replace("</UL> ", _
"</UL>").Replace("</TABLE> ", _
"</TABLE>").Replace("</LI> ", "</LI>")
s = s.Replace("<td> ", _
"<td>").Replace("<tr> ", _
"<tr>")
s = s.Replace("<TD> ", _
"<TD>").Replace("<TR> ", _
"<TR>")
s = s.Replace("<P> ", "<P>").Replace("<p> ", "<p>")
s = s.Replace("</P> ", "</P>").Replace("</p> ", "</p>")
s = s.Replace("style=""display:inline""> ", _
"style=""display:inline"">")
s = s.Replace(" <H", "<H").Replace(" <h", _
"<h").Replace(" </H", _
"</H").Replace(" </h", "</h")
s = s.Replace("<UL> ", "<UL>").Replace("<ul> ", "<ul>")
s = s.Replace(" <TABLE", _
"<TABLE").Replace(" <table", _
"<table")
s = s.Replace(" <li>", _
"<li>").Replace(" <LI>", "<LI>")
s = s.Replace(" <br>", _
"<br>").Replace(" <BR>", _
"<BR>").Replace("<br> ", _
"<br>").Replace("<BR> ", "<BR>")
s = s.Replace(" <ul>", "<ul>").Replace(" <UL>", "<UL>")
' Replace long tags with short ones
s = s.Replace("<STRONG>", "<B>").Replace("<strong>", "<b>")
s = s.Replace("</STRONG>", "</B>").Replace("</strong>", "</b>")
' Replace some HTML entities with true character codes
s = s.Replace("&brkbar;", "|")
s = s.Replace("&brvbar;", "|")
s = s.Replace("&shy;", "-")
s = s.Replace("&nbsp;", Chr(160))
s = s.Replace("&sbquo;", "'")
s = s.Replace("&bdquo;", """")
s = s.Replace("&lsquo;", "'")
s = s.Replace("&rsquo;", "'")
s = s.Replace("�", "'")
s = s.Replace("&ldquo;", """")
s = s.Replace("&rdquo;", """")
s = s.Replace("�", """")
s = s.Replace("&ndash;", "-")
s = s.Replace("&endash;", "-")
' If we don't do this, JavaScript horks on the site
s = s.Replace("<!--", "<!--" & ControlChars.Cr)
s = s.Replace("}", "}" & ControlChars.Cr)
' Last chance to eliminate excess whitespace
Do
s = s.Replace("  ", " ")
Loop Until s.IndexOf("  ") = -1
' Finally, we spit out what we have done.
Dim outdata() As Byte = System.Text.Encoding.UTF8.GetBytes(s)
_sink.Write(outdata, 0, outdata.GetLength(0))
End Sub 'Write
End Class
Occasionally, you will find that you have one or more pages that you do not want to compress. For example, the pages may use pre-formatted text or the pages may emit binary data instead of HTML.
In that case, you would want to filter the filter, so to speak. On our site, we have one page that we don't compress, so our Application_BeginRequest looks a little bit like this...
Sub Application_BeginRequest(ByVal sender As Object, ByVal e As EventArgs)
' ...non-related code trimmed...
' Whitespace Reduction
If Request.Url.PathAndQuery.ToLower.IndexOf("makethumbnail") = -1 Then
Response.Filter = New WhitespaceFilter(Response.Filter)
End If
End Sub
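If more than one page needs to be skipped, the test can be pulled into a small helper. The following is only a sketch: the FilterRules module, the ShouldFilter function, and the idea of an exclusion list are illustrative assumptions, not part of our site's code.

```vbnet
Imports System
Imports System.Globalization

' Hypothetical helper; the module and function names are illustrative.
Public Module FilterRules
    ' Returns True when none of the excluded fragments occurs in the
    ' request's path and query string (compared case-insensitively).
    Public Function ShouldFilter(ByVal pathAndQuery As String, _
        ByVal excludedFragments() As String) As Boolean
        Dim lowered As String = _
            pathAndQuery.ToLower(CultureInfo.InvariantCulture)
        Dim fragment As String
        For Each fragment In excludedFragments
            If lowered.IndexOf( _
                fragment.ToLower(CultureInfo.InvariantCulture)) <> -1 Then
                Return False
            End If
        Next
        Return True
    End Function
End Module
```

Application_BeginRequest would then call ShouldFilter(Request.Url.PathAndQuery, New String() {"makethumbnail"}) and attach the filter only when it returns True. Adding another excluded page becomes a matter of adding one more entry to the array.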
Using this class increases the processing time for each page. In our case, the reduction in bandwidth (7% on our main page, as much as 30% on some of our more complex pages) was worth the increased workload on the server.
Admittedly, all of the string operations are very inefficient. A rewrite to use StringBuilder is in the works. The only downside to StringBuilder is that you can't run regular expressions against it.
Because the current version uses Strings, I do not recommend it if the HTML on your pages averages more than about 80,000 bytes, due to the behavior of the .NET Framework's garbage collector. Any object of 85,000 bytes or more is allocated on the Large Object Heap, which is collected only during full (generation 2) garbage collections.
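The class already imports System.Text.RegularExpressions, so one possible interim step is to collapse runs of spaces with a single regular expression instead of looping over String.Replace. This is a sketch under that assumption, not the planned rewrite; the WhitespaceHelpers module and CollapseSpaces function are hypothetical names.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical helper, not part of the original WhitespaceFilter class.
Public Module WhitespaceHelpers
    ' Matches any run of two or more space characters.
    Private ReadOnly MultiSpace As New Regex(" {2,}", RegexOptions.Compiled)

    ' Collapses every run of spaces into a single space in one pass,
    ' allocating one result string instead of one per loop iteration.
    Public Function CollapseSpaces(ByVal s As String) As String
        Return MultiSpace.Replace(s, " ")
    End Function
End Module
```

Inside Write, each whitespace loop could then be replaced by s = WhitespaceHelpers.CollapseSpaces(s). Whether this is actually faster on your pages is something to measure rather than assume.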
If you are using a server operating system, you can also enable HTTP compression on the server to reduce your bandwidth usage even further. When an HTTP/1.1 client indicates in its Accept-Encoding request header that it supports compression, IIS compresses the response (with gzip or deflate, similar to ZIP) before sending it to the client.
To enable HTTP compression on Windows 2000, open the Internet Service Manager, right-click on your server, and pick "Properties". Select the "Service" tab, then check "Compress Application Files" and "Compress Static Files".
As far as I can tell, HTTP compression is automatically enabled on IIS 5.1 in Windows XP.
Last Updated: 26 Jan 2004. Editor: Smitha Vijayan
Copyright 2004 by Michael Russell. Everything else Copyright © CodeProject, 1999-2009