A complementary article, showing how to exploit this is here.
There is a Crisis in Hashland. Latest results on Cryptology are opening the debate about the security of cryptographic one way hash functions.
I suggested in another article that:
Don't store the user password on your database. No matter how many security measures you take, there is not a perfect security system. Use a hash method for the passwords, like SHA1, or MD5.
SHA1 and MD5 aren't secure anymore, because of projects like passcracking, we can't trust these hash functions for one way encryption.
In fact, experts suggest that:
"Given the number of practical attacks on MD5, it may be time to move
to a Federal Information Processing Standards (FIPS) approved hash
algorithm, such as SHA-256, or SHA-512. Note that vulnerabilities have
recently been found in SHA-1, however, and NIST is already planning to
phase it out by 2010." (Quoted from cn.bbs.comp.security.)
Update - here are some reactions to these issues:
Microsoft is banning certain cryptographic functions from new computer code, citing increasingly sophisticated attacks that make them less secure, according to a company executive.
The Redmond, Wash., software company instituted a new policy for all developers that bans functions using the DES, MD4, MD5 and, in some cases, the SHA1 encryption algorithm, which is becoming "creaky at the edges," said Michael Howard, senior security program manager at the company, Howard said. (Source)
To understand the consequences, this article first explains what one way hash functions are, shows one of their common uses on password storage, and shows the nature of the current attacks and their consequences, and also suggests other alternative hash functions stronger at the present time.
One Way Hash Functions
The following definitions are taken from Bruce Schneier's Book: Applied Cryptography Second Edition:
A one-way hash function, H(M), operates on an arbitrary-length pre-image message, M. It returns a fixed-length hash value, h.
- h = H(M), where h is of length m
Many functions can take an arbitrary-length input and return an output of fixed length, but one-way hash functions have additional characteristics that make them one-way :
- Given M, it is easy to compute h.
- Given h, it is hard to compute M such that H(M)= h.
- Given M, it is hard to find another message, M’, such that H(M) = H(M’).
In some applications, one-wayness is insufficient; we need an additional requirement called collision-resistance.
It is hard to find two random messages, M and M’, such that H(M) = H(M’).
Hashing Function Used for Password Storing
The definition mentioned before, lead to the development of secure password storage.
The use of one way hashing function is the following algorithm:
- For password storage, request input
plain_password from user , then apply a oneway hash function H on
plain_password and store it. In code,
stored_password = H(plain_password)
- For password checking, request
probe_password from user, apply H on it, and compare with
stored_ password, i.e. check
::= H(probe_password) == stored_password.
This schema is good as long as H is a good and strong hashing function.
Today, MD5 is under heavy attack, the reason is that it is used on the GNU implementation of the POSIX function Crypt. In fact, on this Web site, you can find a lot of collisions for this function.
Other papers about the MD5 attacks can be found here.
Other Security Side Effects
In recent works, three investigators: Lenstra, Wang and Weger, showed that it is feasible to build colliding electronic X.509 certificates, using the MD5 collision techniques developed by Wang. You can read their paper here.
This work violates the basic trust principles underlying PKI (Public Key Infrastructure).
Moreover, Chinese investigators have shown an attack to other Hash function: SHA-1. (Read about it here).
Some others side effects are mentioned in the paper "MD5 To Be Considered Harmful Someday".
For example, digital signatures like RSA, DSA/ElGamel, and Elliptical Curve never hash the data directly, but rather a hash of the data, often the choice is MD5. Also consider DRM (Digital Right Management) implementations using MD5. All these protection signatures and checksums are at risk because of these findings.
If you read the paper, you can learn that it is possible to add a payload to the data, or alter the data without being noticed.
Other example is shown in the paper, Practical Attacks on Digital Signatures Using MD5 Message Digest.
A Proof of Concept
In this article, I wrote about how to implement the attack in Microsoft.NET.
Finding Collisions for MD5
The typical way of collision search is to use a brute-force algorithm: given a hash value
h, for a plain message
m, written in an alphabet
h = MD5(m), so in brute force collision search, we try every possible combination in alphabet
A we find a
m' message such as
MD5(m') = h. m' can be equal to or not equal to
The Rainbow Crack uses precalculated tables for intermediate steps on the process, this can accelerate the cracking process. For example, a password of up to 14 characters, of this charset: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()-_+=" can be cracked in a few minutes.
Better One Way Hash Functions
Some people suggest to use more complicated applications of MD5, for example, use
stored_password = crypt(plain_password + salt), where the key is a fixed one or the user id, or some other fixed value. This schema is not stronger, and can be shown to have several flaws to this approach.
Other alternatives are:
- Use a key based hash function
- Combine algorithms
- Use other functions
The first one relays on a key, that can be stolen. Combining algorithms is better, but probably only results in more CPU usage than real protection.
To use other hash functions can be a solution, but what criteria is used to choose a good one?
The answer is simple, use hash functions with a bigger domain of results. For example, MD5 generates a 128 bits value, so the space of possible resulting values is 2128 in size. By simple logic, if your hash function has an output domain of a size bigger than that, then it's a good alternative.
All the followings functions have stronger implementations than MD5.
- WHIRPOOL, generates a 512 bits output
- RIPEMD, uses 160, 128 or 320 bits output
- SHA-2, generates 256, 512 bits output
SHA-2 is available on crypto API, and Microsoft.NET, so I suggest you to use it. The SHA-2 is a group of functions, in Microsoft.NET you have the followings classes:
- September 7th, 2005: Added some quotes from Chinese security groups, and more links on papers about the MD5 collisions. Also a section about other effects on DRM and checksums is included. Some grammar corrections.
- September 8th, 2005: Some title changes, to be more precise about cryptology terminologies. Some typos corrected.
- September 9th, 2005: Added list of SHA-2 algorithms available on Microsoft.NET 1.1.
- September 14th, 2005: Added link to proof of concept article