Introduction

This article presents a JavaScript compression tool that takes your JavaScript source code and compresses it by removing all comments and extraneous whitespace, by optionally removing as many line feeds as possible, and by optionally shortening function parameter and variable names. This reduces the script size, and may help your pages load faster and reduce bandwidth consumption. A minor side benefit when line feed removal and variable name compression are enabled is lightweight obfuscation of the code, which makes it harder for the casual user to read or play around with it. It won't stop a determined user from reformatting and reverse engineering it, but that is not the intent of this tool.

I developed this tool for use in my own ASP.NET projects. The code is written in C# but, as long as you have the .NET Framework installed, it can be used to compress JavaScript for any web project, .NET or otherwise. The supplied project file is for Visual Studio 2003, but it can be opened, converted, and successfully compiled under Visual Studio 2005 as well.

There are three levels of compression: comment and extraneous whitespace removal, optional line feed removal, and optional parameter and variable name compression.
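Code blocks can also be surrounded by special #pragma comments (// #pragma NoCompStart and // #pragma NoCompEnd) to exclude them from compression. For example:

// #pragma NoCompStart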
//====================================================
// File : TestScript1.js
// Author : Eric Woodruff
// Updated : 07/23/2003
// #pragma NoCompEnd
// Anything from this point forward will be compressed
// .
// .
// .
// #pragma NoCompStart
// Skip compression on this section
function Test()
{
    return true;
}
// #pragma NoCompEnd
// Resume compression
// .
// .
// .
The Programs

Two versions of the program are provided. The first is an interactive version that you can use to test the different modes of compression. It is a Windows Forms application written in C#. After running it, simply paste your JavaScript code into the Original Script text box, turn the Line Feed Removal and Variable Name Compression options on or off, and click the Compress button. The compressed script is then shown in the Compressed Script text box, with some compression statistics displayed below it. The text can be copied to the clipboard from the Compressed Script text box.

Note that when using the Test only variable name compression option, the script code is not compressed. Only parameter and variable names are compressed. This may help locate a problem with the variable name compression code. Although the script code is not compressed, comments are removed so that the naming results match (i.e., it won't use different names due to matching a word that appears in a comment such as "a", "be", or "to").

The second and most useful tool is a console mode version of the compressor that can be used as the command for a pre-build step in ASP.NET projects to compress scripts in the project. It can also be used to compress scripts that are stored in custom web controls as embedded resources. The command line syntax is shown below. Options and file specs are case-insensitive, and are processed from left to right as encountered.

JSCompressCL [/options] filespec [[/options] filespec ...]
The available command line options are as follows:
The debug and release build options are spelled out to make it easy to specify them in a project's pre-build step using one of the IDE macros. This is described below. At the minimum, you should specify an output folder other than the one in which the scripts to compress reside. For example, you may want to store the uncompressed scripts in a folder called ScriptsDev and tell the compressor to store the compressed scripts in a folder called Scripts that the application will use at runtime.

The compressor will not overwrite the source scripts. On debug builds, it also checks for an existing copy of the script and, if its timestamp is greater than or equal to that of the source script, it skips it. This saves recreating a script file that has not changed each time the project is built during debugging. An "up to date" message is displayed in such cases. The scripts are always processed in release builds, to ensure that they are up to date and are compressed. If a script is compressed, the tool displays the source and destination filenames along with the compression statistics.

Implied release build with line feed removal,
no stats displayed.
JSCompressCL /q /o:\MyProj\Scripts
\MyProj\ScriptsDev\*.js
Explicit release build with line feed removal,
stats are displayed.
JSCompressCL /release /o:\MyProj\Scripts
\MyProj\ScriptsDev\*.js
Line feed removal disabled for first file set, line feed
removal and variable name compression enabled for second file set.
JSCompressCL /o:\MyProj\Scripts
/k \MyProj\ScriptsDev1\*.js
/d /v \MyProj\ScriptsDev2\*.js
Debug build, no compression. Scripts are passed
through unmodified for debugging purposes.
JSCompressCL /Debug /o:\MyProj\Scripts
\MyProj\ScriptsDev\*.js
Debug build with forced compression. Scripts are
compressed even though it's a debug build.
JSCompressCL /Debug /f /o:\MyProj\Scripts
\MyProj\ScriptsDev\*.js
Using the Console Version as a Project's Pre-Build Step

Copy the console version of the application to a folder somewhere on your PC. To use the console version as the pre-build step of a web project, create a folder to contain the uncompressed scripts (ScriptsDev, for example), and another to contain the compressed scripts to be used at runtime by the application (Scripts, for example). To create a new folder in the project, right click on the project name, select Add..., select New Folder, and enter the folder name. Add a new script to the folder by right clicking on it and selecting Add... and then Add New Item... to create a new item, or Add Existing Item... if you copied an existing file to the new folder. Once added to the project folder, right click on the script, and select Properties. Change the Build Action property from Content to None for the scripts in the development (uncompressed) folder. You can add copies of the scripts in the compressed folder and leave their build action set to Content if you want to do so.

The next step is to right click on the project name, select Properties, expand the Common Properties folder, and select the Build Events sub-item. Click in the Pre-build Event Command Line option to enter the command line to run. You can click the "..." button to open a dialog with a larger editor and a list of available macros. Below is an example of a common command line that can be used (lines wrapped for display purposes). Replace the path to the tool with the path where you stored it on your PC.

D:\Utils\JSCompressCL /$(ConfigurationName)
/o:$(ProjectDir)Scripts $(ProjectDir)ScriptsDev\*.js
Compressing Scripts that are Embedded Resources

If you are developing a web control, for example, that uses scripts that are contained in the assembly as embedded resources, you can still compress them using the above steps. The only difference is that, when setting up the folders as described above, you make an initial copy of the scripts and place them in the compressed script folder. In the project manager, right click on the scripts in the compressed script folder, select Properties, and change the Build Action property to Embedded Resource. When you build the project, the pre-build command will compress the scripts, the project will then be built in the normal fashion, and the compressed scripts will be embedded as resources in the assembly.

How the Code Works

The code for the Windows Forms and the console applications is fairly straightforward, and there is nothing much to describe. The forms version takes data from the controls and passes it to the compression code.

The Compression Process

/// <summary>
/// Compress the specified JavaScript code.
/// </summary>
/// <param name="strScript">The script to compress</param>
/// <returns>The compressed script</returns>
public string Compress(string strScript)
{
    string strCompressed;
    char[] achScriptChars;

    // Don't bother if there is nothing to compress
    if(strScript == null || strScript.Length == 0)
        return strScript;

    // Set up for compression
    scLiterals.Clear();
    scNoComps.Clear();

    // Create the regular expressions and match evaluators on
    // first use.
    if(reInsLit == null)
    {
        reExtNoComp = new Regex(@"//\s*#pragma\s*NoCompStart.*?" +
            @"//\s*#pragma\s*NoCompEnd.*?\n",
            RegexOptions.Multiline | RegexOptions.Singleline |
            RegexOptions.IgnoreCase);
        reDelNoComp = new Regex(@"//\s*#pragma\s*NoComp(Start|End).*\n",
            RegexOptions.Multiline | RegexOptions.IgnoreCase);
        reInsLit = new Regex("\xFE|\xFF");
        meInsLit = new MatchEvaluator(OnMarkerFound);
        meExtNoComp = new MatchEvaluator(OnNoCompFound);
        reFuncParams = new Regex(@"function.*?\((.*?)\)(.*?|\n)?\{",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        reFindVars = new Regex(@"(var\s+.*?)(;|$)",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
        reStripVarPrefix = new Regex(@"^var\s+",
            RegexOptions.IgnoreCase);
        reStripParens = new Regex(@"\(.*?,.*?\)|\[.*?,.*?\]",
            RegexOptions.IgnoreCase);
        reStripAssign = new Regex(@"(=.*?)(,|;|$)",
            RegexOptions.IgnoreCase);
    }
The first part initializes two string collections that will end up containing any "no compression" sections specified by the #pragma comments and any literals extracted from the script. The regular expressions and match evaluators are created on first use.

    // Extract sections that the user doesn't want compressed
    // and replace them with a marker.
    strCompressed = reExtNoComp.Replace(strScript, meExtNoComp);

// This is the match evaluator referenced by meExtNoComp:
// Extract the sections that the user doesn't want compressed
// and save them for reinsertion at the end without the #pragmas.
// They are replaced with a marker character.
private string OnNoCompFound(Match match)
{
    scNoComps.Add(reDelNoComp.Replace(match.Value, String.Empty));
    return "\xFE";
}
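The extraction step can be tried in isolation. The following is a minimal JavaScript sketch rather than the tool's C# code: it ports the two regular expressions to JavaScript syntax, and the helper name extractNoComp and its return shape are assumptions made for illustration.

```javascript
// JavaScript ports of reExtNoComp and reDelNoComp from the C# listing.
// The "s" flag stands in for RegexOptions.Singleline.
const reExtNoComp =
  /\/\/\s*#pragma\s*NoCompStart.*?\/\/\s*#pragma\s*NoCompEnd.*?\n/gis;
const reDelNoComp = /\/\/\s*#pragma\s*NoComp(Start|End).*\n/gi;

// Hypothetical helper: returns the script with each "no compression"
// section replaced by the \xFE marker, plus the saved sections.
function extractNoComp(script) {
  const sections = [];
  const marked = script.replace(reExtNoComp, (match) => {
    // Save the section without its #pragma lines
    sections.push(match.replace(reDelNoComp, ""));
    return "\xFE";
  });
  return { marked, sections };
}

const noCompSample =
  "// #pragma NoCompStart\n" +
  "function Test()\n{\nreturn true;\n}\n" +
  "// #pragma NoCompEnd\n" +
  "var a = 1;\n";

const extracted = extractNoComp(noCompSample);
console.log(JSON.stringify(extracted.marked));   // "þvar a = 1;\n" (þ is \xFE)
console.log(JSON.stringify(extracted.sections)); // the function text, pragmas removed
```

Because the saved sections are stored in order, a later pass only needs to walk the markers from left to right to put them back.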
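The next part extracts the sections, if any, that the user does not want compressed, as specified via the #pragma NoCompStart and #pragma NoCompEnd comments. Each section is saved without its #pragma lines and is replaced with a marker character (\xFE).

    // Split the string into an array for parsing
    achScriptChars = strCompressed.ToCharArray();

    // Remove comments and extract literals
    CompressArray(achScriptChars);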
After the "no compression" sections have been removed, the script is split into a character array to make parsing simpler. The array is passed to the CompressArray method, which removes comments and extracts literals. Literal strings and regular expressions are extracted and stored in a string collection, and are replaced by a marker character (\xFF).

    // Gather up what's left and remove the nulls
    strCompressed = new String(achScriptChars);
    strCompressed = strCompressed.Replace("\0", String.Empty);

    // Skip code compression?
    if(!varCompTest)
    {
        // Remove all leading and trailing whitespace and condense runs
        // of two or more whitespace characters to just one.
        strCompressed = Regex.Replace(strCompressed, @"^[\s]+|[ \f\r\t\v]+$",
            String.Empty, RegexOptions.Multiline);
        strCompressed = Regex.Replace(strCompressed, @"([\s]){2,}", "$1");
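The effect of the two whitespace expressions is easy to see on a small input. This is a JavaScript sketch of the same two replacements, not the tool's own code; the function name condenseWhitespace is hypothetical.

```javascript
// Hypothetical helper mirroring the two Regex.Replace calls above.
function condenseWhitespace(script) {
  // Strip leading whitespace (including blank lines) and trailing
  // whitespace on every line. The "m" flag matches RegexOptions.Multiline.
  const trimmed = script.replace(/^[\s]+|[ \f\r\t\v]+$/gm, "");

  // Collapse each run of two or more whitespace characters to one
  return trimmed.replace(/([\s]){2,}/g, "$1");
}

const spaced = "  var a = 1;   \n\n    var   b = 2;  \n";
console.log(JSON.stringify(condenseWhitespace(spaced)));
// "var a = 1;\nvar b = 2;\n"
```

Note that the leading-whitespace pattern uses \s, which includes line feeds, so blank lines disappear in this step as well.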
Once the array has been parsed, it is converted back into a string, and all null characters (representing removed sections) are deleted. After that, regular expressions are used to remove leading and trailing whitespace from all lines, and to condense all runs of two or more whitespace characters to just one. This part and the subsequent steps are skipped if only testing variable name compression.

        // Line feed removal requested?
        if(removeLineFeeds)
        {
            // Remove line feeds when they appear near numbers with signs
            // or operators. A space is used between + and - occurrences
            // in case they are increment/decrement operators followed by
            // an add/subtract operation. In other cases, line feeds are
            // only removed following a + or - if it is not part of an
            // increment or decrement operation.
            strCompressed = Regex.Replace(strCompressed, @"([+-])\n\1",
                "$1 $1");
            strCompressed = Regex.Replace(strCompressed, @"([^+-][+-])\n",
                "$1");
            strCompressed = Regex.Replace(strCompressed,
                @"([\xFE{}([,<>/*%&|^!~?:=.;])\n", "$1");
            strCompressed = Regex.Replace(strCompressed,
                @"\n([{}()[\],<>/*%&|^!~?:=.;+-])", "$1");
        }
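To illustrate why the + and - cases get their own patterns, here is a JavaScript sketch that applies the same four replacements in the same order. It is a port for experimentation only, and the function name removeLineFeedsNearOperators is an assumption.

```javascript
// Hypothetical helper: JavaScript ports of the four replacements above.
function removeLineFeedsNearOperators(s) {
  // "+\n+" or "-\n-": keep the two operators apart with a space
  s = s.replace(/([+-])\n\1/g, "$1 $1");

  // A single + or - (not part of ++ or --) followed by a line feed
  s = s.replace(/([^+-][+-])\n/g, "$1");

  // Line feed after the \xFE marker or an operator character
  s = s.replace(/([\xFE{}(\[,<>\/*%&|^!~?:=.;])\n/g, "$1");

  // Line feed before an operator character
  s = s.replace(/\n([{}()\[\],<>\/*%&|^!~?:=.;+-])/g, "$1");
  return s;
}

const multiLine = "a=b+\nc;\nx++\n+y;\nfoo(\n1,\n2\n);\n";
console.log(removeLineFeedsNearOperators(multiLine));
// a=b+c;x++ +y;foo(1,2);
```

Without the first replacement, x++ followed by +y on the next line would be joined into x+++y, which parses the same here but is harder to reason about, and in the mirrored case (x + ++y) would change meaning.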
The next step is to see if line feed removal has been requested. If so, all line feeds occurring near numbers with signs and near operators are removed. As noted in the comments, care is taken around the + and - operators so that increment and decrement operations are not altered.

        // Strip all unnecessary whitespace around operators
        strCompressed = Regex.Replace(strCompressed,
            @"[ \f\r\t\v]?([\n\xFE\xFF/{}()[\];,<>*%&|^!~?:=])[ \f\r\t\v]?",
            "$1");
        strCompressed = Regex.Replace(strCompressed, @"([^+]) ?(\+)", "$1$2");
        strCompressed = Regex.Replace(strCompressed, @"(\+) ?([^+])", "$1$2");
        strCompressed = Regex.Replace(strCompressed, @"([^-]) ?(\-)", "$1$2");
        strCompressed = Regex.Replace(strCompressed, @"(\-) ?([^-])", "$1$2");
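The following JavaScript sketch applies the same five replacements so the handling of + and - can be checked on small inputs. It is a port for illustration, and the function name stripOperatorWhitespace is hypothetical.

```javascript
// Hypothetical helper: JavaScript ports of the five replacements above.
function stripOperatorWhitespace(s) {
  // Optional whitespace on either side of most operators and the markers
  s = s.replace(
    /[ \f\r\t\v]?([\n\xFE\xFF\/{}()\[\];,<>*%&|^!~?:=])[ \f\r\t\v]?/g,
    "$1");

  // + and - are handled separately so that a space between two of
  // them (as in "a++ + b") is never removed.
  s = s.replace(/([^+]) ?(\+)/g, "$1$2");
  s = s.replace(/(\+) ?([^+])/g, "$1$2");
  s = s.replace(/([^-]) ?(-)/g, "$1$2");
  s = s.replace(/(-) ?([^-])/g, "$1$2");
  return s;
}

console.log(stripOperatorWhitespace("i = i + 1;"));   // i=i+1;
console.log(stripOperatorWhitespace("x = a++ + b;")); // x=a++ +b;
```

In the second call the space after ++ survives because neither plus pattern can match across two adjacent plus signs.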
A final set of regular expressions is used to strip whitespace from around operators and the marker characters. Again, special care is taken with the + and - operators.

        // Try for some additional line feed removal savings by
        // stripping them out from around one-line if, while,
        // and for statements and cases where any of those
        // statements immediately follow another.
        if(removeLineFeeds)
        {
            strCompressed = Regex.Replace(strCompressed,
                @"(\W(if|while|for)\([^{]*?\))\n", "$1");
            strCompressed = Regex.Replace(strCompressed,
                @"(\W(if|while|for)\([^{]*?\))((if|while|for)\([^{]*?\))\n",
                "$1$3");
            strCompressed = Regex.Replace(strCompressed,
                @"([;}]else)\n", "$1 ");
        }
    }
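After removing all extraneous whitespace, if line feed removal has been requested, a few additional steps are taken to remove unnecessary line feeds from around one-line if, while, and for statements, and from cases where any of those statements immediately follow another. For example:

if(a == 1)
    for(b = 0; b < 10; b++)
        while(!c)
            c = DoSomething();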
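As a rough check of this step, the JavaScript sketch below ports the three replacements and runs them on the nested example with its whitespace already stripped. The function name joinOneLineStatements is hypothetical. The sample is prefixed with a statement because each pattern requires a non-word character in front of the keyword.

```javascript
// Hypothetical helper: JavaScript ports of the three replacements
// used for one-line if, while, and for statements.
function joinOneLineStatements(s) {
  // Line feed directly after a one-line if/while/for header
  s = s.replace(/(\W(if|while|for)\([^{]*?\))\n/g, "$1");

  // Second pass for a header that immediately follows another one,
  // which the first pass can skip over
  s = s.replace(
    /(\W(if|while|for)\([^{]*?\))((if|while|for)\([^{]*?\))\n/g,
    "$1$3");

  // Keep "else" separated from the statement that follows it
  s = s.replace(/([;}]else)\n/g, "$1 ");
  return s;
}

const nested =
  "x=1;if(a==1)\nfor(b=0;b<10;b++)\nwhile(!c)\nc=DoSomething();";
console.log(joinOneLineStatements(nested));
// x=1;if(a==1)for(b=0;b<10;b++)while(!c)c=DoSomething();
```

The first replacement alone leaves the line feed after the for header in place, since the character that would have to precede "for" was consumed by the previous match. The second replacement exists to catch that case.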
If the code contains semi-colons on all statements that need them to mark their endpoints, the above process can usually remove all line feeds from the script, reducing it to one long stream of characters, thus providing maximum code compression.

    // Compress variable names too if requested
    if(compressVarNames || varCompTest)
        strCompressed = CompressVariables(strCompressed);

    // Put back the literals and uncompressed sections removed
    // during the parsing step.
    noCompCount = literalCount = 0;
    strCompressed = reInsLit.Replace(strCompressed, meInsLit);

    return strCompressed;
}

// This is the match evaluator referenced by meInsLit:
// Replace a literal or uncompressed section marker with the
// next entry from the appropriate collection.
private string OnMarkerFound(Match match)
{
    if(match.Value == "\xFE")
        return scNoComps[noCompCount++];

    return scLiterals[literalCount++];
}
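The reinsertion pass is small enough to sketch on its own. The JavaScript below follows the same idea as OnMarkerFound, with two counters walking two lists in order. The function name restoreMarkers and the use of plain arrays in place of the string collections are assumptions for illustration.

```javascript
// Hypothetical helper: replaces each marker with the next saved entry.
// \xFE marks an uncompressed section, \xFF marks an extracted literal.
function restoreMarkers(compressed, noComps, literals) {
  let noCompCount = 0;
  let literalCount = 0;

  return compressed.replace(/\xFE|\xFF/g, (marker) =>
    marker === "\xFE"
      ? noComps[noCompCount++]
      : literals[literalCount++]);
}

const restored = restoreMarkers(
  "a=\xFF;\xFEb=\xFF;",
  ["// kept as written\n"],
  ['"x"', "'y'"]);
console.log(restored);
```

This only works because markers are never reordered between extraction and reinsertion, so the n-th marker of each kind always corresponds to the n-th saved entry.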
Variable name compression occurs next, if requested. This process will be described in the next section. The last step is to reinsert the uncompressed sections and literal strings. In a manner similar to extraction, a regular expression and a match evaluator are used. Two private counters are used to keep track of the progress through the string collections. As each marker character is found, the match evaluator is called and, depending on the marker found, it returns the next element from the appropriate collection, which then takes the place of the marker. The matching counter is also incremented, ready for the next match. After the insertions have been made, the compressed script is returned to the caller.

Parameter and Variable Name Compression

The actual renaming process occurs as follows:

private string CompressVariables(string script)
{
    StringCollection scVariables = new StringCollection();
    string[] varNames;
    string name = null, matchName;
    bool incVarName;

    // Find function parameters
    MatchCollection matches = reFuncParams.Matches(script);

    foreach(Match m in matches)
    {
        varNames = m.Groups[1].Value.Split(',');

        // Add each unique name to the list
        foreach(string s in varNames)
        {
            name = s.Trim();

            if(name.Length != 0 && !scVariables.Contains(name))
                scVariables.Add(name);
        }
    }
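The parameter search can be reproduced with a short JavaScript sketch. It ports reFuncParams (the "s" flag stands in for RegexOptions.Singleline) and collects unique names in an array. The function name findParameterNames is hypothetical.

```javascript
// JavaScript port of reFuncParams from the C# listing
const reFuncParams = /function.*?\((.*?)\)(.*?|\n)?\{/gis;

// Hypothetical helper: returns each unique parameter name in the
// order in which it is first seen.
function findParameterNames(script) {
  const names = [];

  for (const m of script.matchAll(reFuncParams)) {
    for (const part of m[1].split(",")) {
      const name = part.trim();
      if (name.length !== 0 && !names.includes(name))
        names.push(name);
    }
  }
  return names;
}

const funcs =
  "function Add(first, second)\n{\nreturn first + second;\n}\n" +
  "function Neg(first) { return -first; }\n" +
  "function None() {}";
console.log(findParameterNames(funcs)); // [ 'first', 'second' ]
```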
The first part searches for function parameters using a regular expression created earlier. The parameter list is split apart, and each unique parameter name is added to the variable name string collection.

    // Find variable declarations
    matches = reFindVars.Matches(script);

    foreach(Match m in matches)
    {
        // Remove the "var " declaration prefix
        name = reStripVarPrefix.Replace(m.Groups[1].Value, String.Empty);

        // Strip brackets and parentheses containing commas such
        // as array declarations and method calls with parameters.
        name = reStripParens.Replace(name, String.Empty);

        // Remove assignment operations
        name = reStripAssign.Replace(name, "$2");

        varNames = name.Split(',');

        // Add each unique name to the list
        foreach(string s in varNames)
        {
            name = s.Trim();

            if(name.Length != 0 && !scVariables.Contains(name))
                scVariables.Add(name);
        }
    }
The next part searches for var declarations such as the following:

var num1, string1 = "Test", num2 = array1[3, 0];
var resultString = functionCall("A", "B");
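To see what the stripping steps leave behind for these two declarations, the JavaScript sketch below ports the four expressions and applies them in the same order as the C# loop. The function name findVariableNames is hypothetical, and the sketch runs on raw source for readability, whereas the tool runs this step after literals have already been replaced by markers.

```javascript
// JavaScript ports of the four expressions used by the C# loop
const reFindVars = /(var\s+.*?)(;|$)/gim;
const reStripVarPrefix = /^var\s+/i;
const reStripParens = /\(.*?,.*?\)|\[.*?,.*?\]/gi;
const reStripAssign = /(=.*?)(,|;|$)/gi;

// Hypothetical helper: returns each unique declared name in order.
function findVariableNames(script) {
  const names = [];

  for (const m of script.matchAll(reFindVars)) {
    // Remove the "var " prefix
    let decl = m[1].replace(reStripVarPrefix, "");

    // Remove (...) and [...] groups that contain commas
    decl = decl.replace(reStripParens, "");

    // Remove assignments, keeping the separator that follows them
    decl = decl.replace(reStripAssign, "$2");

    for (const part of decl.split(",")) {
      const name = part.trim();
      if (name.length !== 0 && !names.includes(name))
        names.push(name);
    }
  }
  return names;
}

const decls =
  'var num1, string1 = "Test", num2 = array1[3, 0];\n' +
  'var resultString = functionCall("A", "B");\n';
console.log(findVariableNames(decls));
// [ 'num1', 'string1', 'num2', 'resultString' ]
```

The bracket and parenthesis groups have to go first, otherwise the commas inside them would be treated as separators between variable names.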
The var prefix, any brackets and parentheses containing commas, and any assignment operations are stripped from each declaration, leaving a comma-separated list of names. Each unique name is added to the variable name string collection.

    // Replace each variable in the list with a shorter name.
    // Start with "a" through "z" then use "_a" through "_z",
    // "_aa" to "_az", "_ba" to "_bz", etc.
    newVarName = new char[10];
    newVarName[0] = '\x60';
    varNamePos = 0;
    incVarName = true;

    foreach(string replaceName in scVariables)
    {
        // Increment the variable name and make sure it isn't
        // in use already.
        if(incVarName)
        {
            do
            {
                IncrementVariableName();
                name = new String(newVarName, 0, varNamePos + 1);
                matchName = @"\W" + name + @"\W";

            } while(Regex.IsMatch(script, matchName));

            incVarName = false;
        }

        // Don't bother if the existing name is shorter. This check
        // could be removed to obfuscate the variable name even if it
        // would be longer.
        if(name.Length < replaceName.Length)
        {
            incVarName = true;
            script = Regex.Replace(script,
                @"(\W)" + replaceName + @"(?=\W)", "$1" + name);
        }
    }

    return script;
}
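The article does not list IncrementVariableName, so the following JavaScript sketch is only a guess at the scheme described in the comments ("a" to "z", then "_a" to "_z", "_aa" to "_az", and so on), combined with a port of the renaming loop. The names shortName and renameAll are hypothetical.

```javascript
// Hypothetical name generator: 0 -> "a", 25 -> "z", 26 -> "_a",
// 51 -> "_z", 52 -> "_aa", 77 -> "_az", 78 -> "_ba", ...
function shortName(index) {
  if (index < 26) return String.fromCharCode(97 + index);

  let n = index - 25; // 1 -> "a" in bijective base 26
  let letters = "";
  while (n > 0) {
    n -= 1;
    letters = String.fromCharCode(97 + (n % 26)) + letters;
    n = Math.floor(n / 26);
  }
  return "_" + letters;
}

// Port of the renaming loop: a new short name is only consumed when
// the previous one was actually used for a replacement.
function renameAll(script, names) {
  let index = 0;
  let needNewName = true;
  let name = "";

  for (const oldName of names) {
    if (needNewName) {
      // Skip short names that already appear as whole words
      do {
        name = shortName(index++);
      } while (new RegExp("\\W" + name + "\\W").test(script));
      needNewName = false;
    }

    // Don't bother if the existing name is not longer
    if (name.length < oldName.length) {
      needNewName = true;
      script = script.replace(
        new RegExp("(\\W)" + oldName + "(?=\\W)", "g"), "$1" + name);
    }
  }
  return script;
}

const source =
  "function f(count,total){for(var i=0;i<count;i++)total+=a[i];return total;}";
console.log(renameAll(source, ["count", "total", "i"]));
// function f(b,c){for(var i=0;i<b;i++)c+=a[i];return c;}
```

In this run "a" is skipped because the script already uses it, and i is left alone because no generated name is shorter than it.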
The final step loops through each unique variable name found, and substitutes a shorter name. Once done, the compressed script is returned. As noted in the comments, the naming scheme starts with "a" through "z", then uses "_a" through "_z", "_aa" to "_az", "_ba" to "_bz", and so on. As each new name is created, a check is made to ensure that it does not already exist in the script. For example, a short name that the script already uses, such as a common loop variable name, is skipped.

Conclusion

On average, my own scripts have been reduced in size by 50% to 60%. Adding in variable name compression increases the savings by an additional 10% to 15% in the average script. Naturally, the more you comment your JavaScript code, use indentation to make the code more readable, and use descriptive variable names, the better the compression rates, as there is more to remove. Using semi-colons to mark statement endpoints can also increase the compression rates, as it enables the code to remove most if not all of the line feed characters too.