Introduction

In this article, I will explain how to reduce the size of a .NET application by zipping the application and unzipping it in memory at run time using a pure .NET solution. Compressors for binary files like [UPX] use the same technique. However, they do not work with pseudo EXE files of .NET. As we will see, it is relatively easy to pack a .NET application, because we can make use of the .NET built-in support for metadata and reflection.

A free, open source tool that I wrote, called [.NETZ], fully automates all the steps explained in this article, so you do not need to write any code yourself. Read this article only if you are interested in how the .NETZ tool works.

First, I will explain how the technique works for packing the main EXE file. For this article I will suppose we have a .NET EXE application named 'app.exe' whose size we need to reduce. It can be a self-contained EXE file or a file that depends on other DLLs. Then I will show how DLLs can be handled. Finally, I will briefly mention an alternate solution. The example code here is in C#.

Selecting a Compression Library

The first thing we need is a suitable compression library. I have used #ziplib [ZLIB], which supports (among others) the standard zip format. We can use any zip-compatible program to zip our application. For example, the following command-line zip utility command can be placed in the build batch file:

pkzip25 -add app.zip app.exe
The .NETZ tool however does the zipping step programmatically.

Depending on the program, we may get a size reduction of 60% or more in 'app.zip'. Once we have zipped the application, we need to write a small starter application in C# to unzip and start the application at run time. This means that #ziplib must be available when we unzip the application. The good news is that the #ziplib source code is available online. If we choose to use only the usual zip format, we can remove from the library all the classes that support zipping and other formats, and leave only the unzip code. If we do this, the size of the #ziplib library is reduced to only 60 KB (the compiled 'zip.dll' file). This 60 KB is the only size overhead of this method, because we need to distribute the unzip library with the application starter.

If 'app.exe' is over 200 KB, which is normally the case for a small GUI EXE, then 'app.zip' plus the unzip library will typically still be smaller than the original 'app.exe'. We could also use another, faster and smaller zip library. It does not have to be written in a .NET language; a native library can be accessed using P/Invoke. However, it should support uncompressing byte[] arrays or System.IO.Stream objects, so that we can process the data in memory.

Packing the Data

The next step is to write a starter application and pack 'app.zip' as part of it. The easiest way to do this in .NET is to pack 'app.zip' as a resource file. I will show only example code in this article to keep it simple. If you are interested in the real code, I recommend that you have a look at the .NETZ tool source code. Code similar to the following will produce a valid .NET resource file 'app.resources':

FileStream fs = new FileStream("app.zip", FileMode.Open, FileAccess.Read);
byte[] data = new byte[fs.Length];
fs.Read(data, 0, data.Length);
fs.Close();

ResourceWriter rm = new ResourceWriter("app.resources");
rm.AddResource("appdata1", data);
rm.Close();

Note that we have given a name 'appdata1' to the resource to access it later.
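
As a quick way to convince ourselves that a byte array survives the trip through a .resources container, the following minimal sketch writes the data to an in-memory stream and reads it back with ResourceReader. The class and method names (ResourcePackSketch, Pack, Unpack) are hypothetical and are not taken from .NETZ; the real starter reads the embedded resource through ResourceManager instead.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Resources;

// Hypothetical helper, for illustration only (not part of .NETZ).
public static class ResourcePackSketch
{
    // Writes 'data' under 'name' into a .resources container held in memory.
    public static byte[] Pack(string name, byte[] data)
    {
        MemoryStream output = new MemoryStream();
        using (ResourceWriter writer = new ResourceWriter(output))
        {
            writer.AddResource(name, data);
        } // disposing the writer generates the resources and closes the stream

        // ToArray still works after the MemoryStream has been closed
        return output.ToArray();
    }

    // Looks up 'name' in a .resources container; returns null if it is absent.
    public static byte[] Unpack(byte[] resources, string name)
    {
        using (ResourceReader reader = new ResourceReader(new MemoryStream(resources)))
        {
            IDictionaryEnumerator e = reader.GetEnumerator();
            while (e.MoveNext())
            {
                if ((string)e.Key == name)
                {
                    return (byte[])e.Value;
                }
            }
        }
        return null;
    }
}
```

Calling ResourcePackSketch.Unpack(ResourcePackSketch.Pack("appdata1", data), "appdata1") should give back a copy of the original bytes.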

The Starter Application

The starter application 'starter.exe' loads the resource file, gets the data, unzips it in memory, and uses reflection to start the 'app.exe' application. The code invoked by Main(string[] args) will look like the following:

// Main is static, so there is no 'this'; ask for the executing assembly instead
ResourceManager rm = new ResourceManager("app", Assembly.GetExecutingAssembly());
byte[] data = (byte[])rm.GetObject("appdata1");

We have to unzip the application data in memory. The code for #ziplib is:

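string zipPath = "app.exe";
MemoryStream zipFile = new MemoryStream(data);

ZipFile zf = new ZipFile(zipFile);
ZipEntry ze = zf.GetEntry(zipPath);
Stream zs = zf.GetInputStream(ze);
byte[] uzdata = new byte[ze.Size];

// A single Read call may return fewer bytes than requested,
// so keep reading until the buffer is full or the stream ends
int offset = 0;
while (offset < uzdata.Length)
{
    int n = zs.Read(uzdata, offset, uzdata.Length - offset);
    if (n <= 0) break;
    offset += n;
}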
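
For comparison, the same in-memory unzip step can be sketched with the framework's own System.IO.Compression.ZipArchive class. This is a swapped-in library, not what the article or .NETZ uses: it assumes .NET Framework 4.5 or later (with a reference to System.IO.Compression.dll) or a modern .NET runtime, and the names InMemoryUnzipSketch, Zip and Unzip are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only; uses ZipArchive instead of #ziplib.
public static class InMemoryUnzipSketch
{
    // Builds a one-entry zip archive in memory (a stand-in for 'app.zip').
    public static byte[] Zip(string entryName, byte[] content)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            // leaveOpen = true so that the buffer can be read after the archive is finished
            using (ZipArchive archive = new ZipArchive(buffer, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName);
                using (Stream es = entry.Open())
                {
                    es.Write(content, 0, content.Length);
                }
            }
            return buffer.ToArray();
        }
    }

    // Extracts one entry into a byte array without touching the disk;
    // returns null if the entry does not exist.
    public static byte[] Unzip(byte[] zipData, string entryName)
    {
        using (ZipArchive archive = new ZipArchive(new MemoryStream(zipData), ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
            {
                return null;
            }
            using (Stream es = entry.Open())
            using (MemoryStream result = new MemoryStream())
            {
                es.CopyTo(result); // CopyTo reads in a loop until the end of the stream
                return result.ToArray();
            }
        }
    }
}
```

Because CopyTo reads until the stream ends, this variant does not depend on a single Read call returning the whole entry.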

Then we can create an assembly from the byte array:

Assembly assembly = Assembly.Load(uzdata);

Once we have an assembly, the easiest thing to do is to invoke its entry point, which corresponds to the Main(string[] args) method in the original 'app.exe', passing it the original command line arguments passed to the Main(string[] args) method of the starter:

assembly.EntryPoint.Invoke(null, new object[]{args});

Alternatively, we can rely on reflection code to find the types in the assembly and invoke methods on them. This can be useful when 'app.exe' has no entry point, or when we want to invoke any other methods. The startup time could be smaller than starting 'app.exe' directly, because of lower disk overhead.
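
One detail worth handling when invoking the entry point is that Main may be declared without parameters, in which case passing the argument array would fail. The sketch below picks the argument list from the entry point's signature; the names EntryPointSketch, BuildArguments and Run are hypothetical and only illustrate the idea.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, for illustration only (not part of .NETZ).
public static class EntryPointSketch
{
    // Main() takes no arguments; Main(string[] args) takes the forwarded ones.
    public static object[] BuildArguments(MethodInfo entryPoint, string[] args)
    {
        if (entryPoint.GetParameters().Length == 0)
        {
            return null; // a null argument list is valid for a parameterless method
        }
        return new object[] { args };
    }

    // Invokes the entry point and returns its exit code (0 when Main returns void).
    public static int Run(Assembly assembly, string[] args)
    {
        MethodInfo entryPoint = assembly.EntryPoint;
        if (entryPoint == null)
        {
            throw new InvalidOperationException("The assembly has no entry point.");
        }
        object result = entryPoint.Invoke(null, BuildArguments(entryPoint, args));
        return (result is int) ? (int)result : 0;
    }
}
```

In the starter, the call would then be something like EntryPointSketch.Run(assembly, args), with the returned value passed on as the starter's own exit code.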

To compile 'starter.exe' from 'starter.cs', we will use the following command (supposing it is a Win exe):

csc /t:winexe /out:starter.exe starter.cs AssemblyInfo.cs
    /r:zip.dll /res:app.resources /win32icon:App.ico

The 'App.ico' file can be extracted from the original 'app.exe'; 'AssemblyInfo.cs' can be generated using reflection information from 'app.exe'.

We can rename 'starter.exe' back to 'app.exe' later, if we like. This way, we distribute 'starter.exe' and 'zip.dll', which together are smaller than 'app.exe' alone.

The .NETZ tool handles all these steps automatically. It uses System.CodeDom.Compiler.ICodeCompiler interface with CSharpCodeProvider to compile the starter code programmatically.

Handling DLLs

If 'app.exe' depends on other DLLs, we normally do not need to do anything. However, sometimes we may also want to zip the DLL files. The technique described here works only for applications that follow the .NET XCOPY deployment paradigm, that is, when the DLLs are used by a single application. It will not work if the DLLs are placed in the GAC, or are shared by other applications that are not aware of the technique.

Let us suppose 'lib.dll' is a DLL file required by 'app.exe' that we would like to zip. First, we link 'app.exe' with the normal unzipped version of 'lib.dll', as we normally do.

.NET has a built in mechanism for resolving types and assemblies. However, when it fails, we can provide .NET with an assembly. This functionality is exposed by a hook in the System.AppDomain class. We need to handle the following event:

AppDomain currentDomain = AppDomain.CurrentDomain;
currentDomain.AssemblyResolve += new ResolveEventHandler(MyResolveEventHandler);

This code needs to be placed in the Main(...) method of the starter application. The trick for this event to be activated is to place the 'app.exe' assembly activation code described above in a separate method that is called by the starter's Main method.
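
The arrangement described above can be sketched as follows. The names StarterSketch, Start, RunApplication and OnResolve are hypothetical, and the NoInlining attribute is my own addition, on the assumption that keeping the activation code in a method the JIT cannot merge into the caller helps ensure the handler is registered before any dependent type is resolved.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical starter skeleton, for illustration only (not the .NETZ starter).
public static class StarterSketch
{
    // Stand-in for the starter's Main method: register the hook first,
    // then call the separate activation method.
    public static int Start(string[] args)
    {
        AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(OnResolve);
        return RunApplication(args);
    }

    // Kept separate from Start (and not inlined) so that nothing here is
    // compiled or resolved before the handler above has been registered.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int RunApplication(string[] args)
    {
        // unzip 'app.exe', load it with Assembly.Load and invoke its entry point here
        return 0;
    }

    // Called by the runtime when normal assembly resolution fails.
    public static Assembly OnResolve(object sender, ResolveEventArgs e)
    {
        // a real handler would unzip and load the requested DLL;
        // returning null tells the runtime that we could not resolve it
        return null;
    }
}
```

The starter's real Main would then contain little more than a call such as StarterSketch.Start(args).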

After we zip 'lib.dll' into 'lib.zip', we may also pack it as a resource file with the starter application, like we did with 'app.zip'. This can be preferable if we want to have a single EXE file. Otherwise, we may leave it as a separate file. However, we need to give the file a name different from 'lib.dll', because .NET will look for this name, and the zipped file would look like a corrupted DLL to .NET. We can leave the name 'lib.zip', or be creative and rename 'lib.zip' to 'lib.dllz'. Alternatively, we can save the 'lib.zip' data in a SQL database table and retrieve it from there, etc.

The code to activate the DLL in MyResolveEventHandler will look like the following. Here, we suppose that the zipped DLL is a file in the same directory as the starter application:

public static Assembly MyResolveEventHandler(object sender, ResolveEventArgs args)
{
    // args.Name is usually a full display name such as "lib, Version=..., ...";
    // keep only the simple name before the first comma (if there is one)
    int i = args.Name.IndexOf(',');
    string dllName = (i < 0) ? args.Name : args.Name.Substring(0, i);

    // the dllName will equal "lib" in our example

    // we map it to the zipped file name
    dllName += ".dllz";

    // read the file and unzip the data as above
    // code omitted ...
    byte[] uzdata = ...

    return Assembly.Load(uzdata);
}

This way, the types found in the DLL will be resolved to the AppDomain.
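
The name mapping done in the handler can also be written with the framework's AssemblyName class, which parses the display name for us. This is only a sketch of that one step: the names ResolveNameSketch, SimpleName and CompressedPath are hypothetical, and the '.dllz' extension is simply the example extension used in this article.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper, for illustration only (not part of .NETZ).
public static class ResolveNameSketch
{
    // Extracts the simple name ("lib") from a display name such as
    // "lib, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null".
    public static string SimpleName(string displayName)
    {
        return new AssemblyName(displayName).Name;
    }

    // Maps the requested assembly to a compressed file in the given directory.
    public static string CompressedPath(string baseDirectory, string displayName)
    {
        return Path.Combine(baseDirectory, SimpleName(displayName) + ".dllz");
    }
}
```

Inside the resolve handler this could be used as ResolveNameSketch.CompressedPath(AppDomain.CurrentDomain.BaseDirectory, args.Name) to locate the zipped DLL next to the starter.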

The .NETZ tool's starter code has logic to search for the compressed DLL files in the same way as .NET searches for normal DLL files. It also supports private DLL paths.

Another Alternative

Another possible technique to pack .NET executables would be to rely on native platform code.

.NET itself uses a technique similar to that of [UPX] to create native images. The only difference is that the CLR data segment is not zipped. Every time we run a .NET pseudo EXE file or access a DLL, the information in the PE header [COFF] is read and the CLR data segment is extracted. The CLR is then initialized and the data segment is given to it as an assembly byte array. For EXE files, this work is done by calling the _CorExeMain function in 'mscoree.dll'. For DLL files, a similar function, _CorDllMain, is called inside the usual DllMain [SSCLI]. If the CLR data segment is zipped, then these two functions of 'mscoree.dll' need to be changed to unzip the data before passing it to the CLR.

This can be done by a third party. In this case, we need to unzip the CLR data segment and then modify the EXE header, so that the unzipped data is in the same locations expected by _CorExeMain and _CorDllMain. However, it would be better if Microsoft supported this as an option in the future. This would make the size of .NET pseudo EXE files comparable to that of Java JAR files. This technique would also work with DLLs placed in the GAC.

Closing Up

In this article I demonstrated a technique used in the .NETZ tool to compress .NET executable files. If you need more details download the latest version of .NETZ and have a look at its source code.

General: pretty cool and for WebApps...
meaningoflights
20:47 29 Apr '08  
Using a Virtual Path Provider you can do the same thing for web apps.


Virtualizing Access to Content: Serving Your Web Site from a ZIP File

"In this article, Victor Garcia Aprea creates a virtual path provider that serves a Web site stored in a .zip file."

msdn2.microsoft.com/en-us/library/aa479502.aspx
Question: .NET runtime redirection
C-lviu
21:48 5 Jan '06  
Sorry to bother you with this question here, but I really need a quick answer on this if you can.
I noticed that the compiled netz application that I downloaded from www.madebits.com is redirecting itself to the latest version of the .NET framework available on the machine.
The thing is that it is not using a netz.exe.config file.
I searched all over the internet and could not find an article on how to do this.
Could you please tell me how to redirect an application to the latest version of the .NET runtime available on a machine without using a configuration file?

Thanks a lot.
General: Doesn't work if your exe file uses serialization
C-lviu
5:13 19 Dec '05  
Hi,

This is a great thing, I tried it, even wrote a program that uses this technique.
But there is a downside to this.
Microsoft .NET framework 1.1 does not allow you to use serialization from a dynamic assembly. If your exe file has code that uses serialization it will throw an Exception.

Ie:
XmlSerializer xs = new XmlSerializer(typeof(MyClass))

This would throw an Exception like this:
System.InvalidOperationException: Unable to generate a serializer for type MyClass from assembly <Unknown> because the assembly may be dynamic. Save the Assembly and load it from disk to use it with XmlSerialization.

However in .NET Framework 2.0 it works, the exception is not thrown.
Does anyone have a solution for this on .NET Framework 1.1?

Thanks

-- modified at 10:18 Monday 19th December, 2005
General: known issue documented in .NETZ help
Vasian Cepa
6:16 19 Dec '05  
This is a known issue documented in .NETZ help. For more details please read the .NETZ help on:

http://madebits.com/netz/help2.php#rem

Please do not write questions in this page that are already answered in the documentation. This article is not updated.

General: Why use #ZLib?
FZelle
2:40 3 Aug '04  
Hello, despite the guy before me, I think it is an interesting
idea, especially if you zip the assemblies with a password.

That way there is definitely no chance of reverse engineering.

But one question: why do you use #zlib?

That one is under the GPL, so no software built with it can be
sold.

If you use something else it could become more accepted,
esp. in the commercial sector.

Have you tried http://www.organicbit.com/zip/

General: Re: Why use #ZLib?
Vasian Cepa
3:16 3 Aug '04  
:) I just saw it, thanks for the suggestion.

It is nice and smaller, but the only ZipReader constructor is:

public ZipReader(string fileName);

I need a constructor that accepts a System.IO.Stream not a file name.

However I do not agree with you about the license.

Message update:

http://www.st.informatik.tu-darmstadt.de/static/staff/Cepa/tools/netz/license.html

General: Re: Why use #ZLib?
FZelle
5:05 3 Aug '04  
Thank you for the fast answer.

But the problem I see is that I have to deliver something to the
customer that is working with #zlib (the decompression).
Especially if I want to use the DLL (de)compression.

Maybe
http://www.codeproject.com/managedcpp/mcppzlibwrapper.asp
or
http://www.codeproject.com/managedcpp/bzipstream.asp
are better for this.

best regards

General: License
Vasian Cepa
5:29 3 Aug '04  
Please check

http://www.st.informatik.tu-darmstadt.de/static/staff/Cepa/tools/netz/license.html

I verified it. #ziplib can be distributed with closed source apps. Its license is a modified GPL not pure GPL.
General: Re: License
FZelle
6:14 3 Aug '04  
Thanks for the clarification.

See you

General: Re: Why use #ZLib?
AJ.NET
19:57 1 Sep '04  
Hi,

did you realize that there is zip support in the j# library? It's part of the .net fw, no licensing issues and maintained by MS. The only caveat is that it follows java conventions and uses stream classes from the j# library instead of System.IO. But a wrapper could hide those issues. AFAIR there's an msdn article covering that topic.

Bye,
AJ
General: Re: Why use #ZLib?
Vasian Cepa
22:35 1 Sep '04  
I think you have not read the message above carefully. There are no restrictions on linking #ZipLib in commercial applications. The author of #ZipLib was kind enough to confirm this to me, and the #ZipLib web site says clearly that there are no restrictions as long as you do not change the code. It was a misunderstanding that started this thread and it is getting longer than necessary.

By the way there is zip support by default in Windows since zlib.dll is distributed with Windows and a wrapper for C# exists in http://www.organicbit.com/zip/. The problem is that you need a library that can unzip the data in memory, that is it should work on Byte[] arrays OR on System.IO.Stream objects. The zlib.dll accepts only file names.
General: Re: Why use #ZLib?
poomex
3:43 3 Jan '06  
But the thing is that you've got to store password somewhere to decompress the assembly, ain't it right...

I wonder what would be the easiest way to get the assembly out of resources - I mean reverse it?
General: No good
leppie
2:08 3 Aug '04  
Nice idea, but it's just no good! All that you do is zip an assembly, hence making it inaccessible to other components! You say it uses less disk space, but you need to extract the assembly into memory, so in fact you are using MORE space, and time! Something like a pure exe packer or strip tool (look at ilstrip from Portable.NET) would be much better compared to a few saved bytes on disk.

My two cents :)

PS: Why not do a true .NET exe packer? Dll's like their win and *nix counter parts could never be compressed due to the nature of how they being accessed.

top secret xacc-ide 0.0.1
General: Re: No good
Vasian Cepa
3:22 3 Aug '04  
A pure exe packer will be better of course, but I am NOT going to write it :), and this idea works similarly but it is pure :) C# code.

About the memory/time issue you are wrong, run a few benchmarks to get convinced.
General: Re: No good
leppie
3:38 3 Aug '04  
Vasian Cepa wrote: About the memory/time issue you are wrong, run a few benchmarks to get convinced.
Get me convinced, I suspect you are being fooled by the memory cache! What benchmarks did you do? Here's one: make a 10 MB dll (just a fat Word doc or compressible obj as a resource), then you test it again.

top secret xacc-ide 0.0.1
General: Re: No good
Vasian Cepa
3:50 3 Aug '04  
I have another article with benchmarks ready for somewhere else. However, until I know its fate for sure, I would prefer not to give you more info here. Hopefully I will show the data also here one day, but until then, if you need more info, please contact me privately by email. This thread is closed :).
General: Download the latest version of the .NETZ compression tool: http://www.madebits.com/netz/index.php
Vasian Cepa
22:27 1 Aug '04  
Please download the latest version of the .NETZ compression tool from:

http://www.madebits.com/netz/index.php
-- modified at 8:12 Monday 5th December, 2005
General: I've tried but failed
amonw
22:30 30 Jul '04  
Hi,
I've done almost everything you said except:
1.I used the entire #ziplib dll:icsharpcode.sharpziplib.dll
2.I changed
ResourceManager rm = new ResourceManager("app", this.GetType().Assembly);
to:
ResourceManager rm = new ResourceManager("app", Assembly.GetExecutingAssembly());
because I can't use "this" in "static void Main(..)"

starter was successfully compiled but when I run the starter.exe,
it threw a "System.IO.FileLoadException" at "Assembly assembly = Assembly.Load(uzdata);"
Can anyone help me out?
Thanks in advance.
General: found a mistake but still failed
amonw
23:32 30 Jul '04  
I think the line "byte[] uzdata = new byte[zs.Length]" in the article should change to "byte[] uzdata = new byte[ze.Size]" .
But it still threw a "System.IO.FileLoadException" at "Assembly assembly = Assembly.Load(uzdata);"
General: finally done!
amonw
16:24 1 Aug '04  
I worked it out finally.
The previous error is because the file is too large to use just one
"zs.Read(uzdata, 0, uzdata.Length);",after I changed this to a read loop,everything works fine.
Thank you for such a great idea.
General: Re: finally done!
Vasian Cepa
22:29 1 Aug '04  
:) Sorry I made you try, I have had no time to update the article. The message above about the .NETZ tool will tell you more. The complete working source code is there. Please take the latest version.
General: some more comments
Vasian Cepa
23:10 2 Aug '04  
You have to read the file all at once to minimize disk access, a loop is slower and depends on the buffer size. Check the .NETZ tool source code to see how this can be properly done to work with files of any size.
General: Will this work with EXE + DLLs
Praveen Ramesh
10:52 16 Jul '04  
Hey, yes this is very interesting!

Will this approach work when you have say an exe A1 and dependant dlls D1, D2 and D3? Can we zip them all up, unzip them in memory at runtime and run A1? I wonder how the framework will find the dependant D1, D2 and D3 dlls (that are in memory) when A1 is run?

We noticed that loading the dlls were very slow in laptops (we are assuming because of hard-disk speed) and zipping the whole thing might make it load faster after all.

Thanks
Praveen
General: Re: Will this work with EXE + DLLs
Vasian Cepa
2:39 19 Jul '04  
Update: Please check the .NETZ tool.
General: Reflector...
Jonathan de Halleux
9:48 16 Jul '04  
Reflector uses this kind of technique...

Jonathan de Halleux - My Blog


Last Updated 4 Oct 2004 | Copyright © CodeProject, 1999-2010