How fast can an MD5 collision be found

Question

If I have a 16-byte salt and 16 bytes of data, how fast can someone find another 16 bytes of data such that MD5(salt + data) == MD5(salt + other data)?

I don't expect an answer accurate to the nanosecond, just an estimate like "a few seconds", "a few hours", "a few civilizations".

Posted 8-May-14 5:15am

Sergejack

Comments

Sergey Alexandrovich Kryukov 8-May-14 11:25am

It apparently depends on your system and code. Beyond that, the question simply makes no sense. You did not even specify your language and platform.
Why not time it yourself?
—SA

Sergejack 9-May-14 3:09am

System and language are irrelevant. Hackers will always use whatever they see fit. I just want to know how fast they could forge fake data that validates against the same MD5 hash and salt.

2 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

David A. Gray · Answer 1 · 2014-05-08T18:35:00

As Sergey said, it depends on your system and code. For instance, I have Ronald Rivest's code, written in straight C, that hashes 1 block in X nanoseconds. Since I haven't done any benchmarks, I can't tell you the value of X. I used Microsoft Visual C++ 6.0 to compile it into the DLL that I use daily, and it runs against the multithreaded DLL implementation of the Standard C runtime library. I would expect to get marginally different (probably better) times if I compiled against the static multithreaded CRT, and the times would certainly be (slightly) better if I compiled against the statically linked single-threaded CRT (LIBC.LIB). I might get measurably different results if I used the GCC compiler to compile and link the same code.

The above only covers the potential sources of variation given the same source code, written in the same programming language. It gets even muddier if you include other implementations of the MD5 algorithm, either in the C programming language or in another programming language entirely. For example, I would expect a Visual Basic implementation of the same algorithm to run more slowly, even if it were faithful to Dr. Rivest's algorithm. The same would be true of a port to C#, JavaScript, or any number of other languages in which the MD5 algorithm has been implemented.
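Since no benchmark figures are given here, a rough way to measure the per-hash cost on your own machine is sketched below. It uses Python's standard `hashlib` rather than the C code described above, so the numbers will differ from that implementation; the all-zero salt and the `time_md5` helper are placeholders invented for the illustration.

```python
import hashlib
import time


def time_md5(n_hashes: int = 200_000) -> float:
    """Return the average number of seconds per MD5 hash of a 32-byte input."""
    salt = bytes(16)  # hypothetical all-zero 16-byte salt
    start = time.perf_counter()
    for i in range(n_hashes):
        data = i.to_bytes(16, "big")  # 16 bytes of candidate data
        hashlib.md5(salt + data).digest()
    elapsed = time.perf_counter() - start
    return elapsed / n_hashes


if __name__ == "__main__":
    per_hash = time_md5()
    print(f"{per_hash * 1e9:.0f} ns per hash, {1 / per_hash:,.0f} hashes/s")
```

The result includes Python loop overhead, so treat it as an upper bound on the cost per hash rather than a measurement of MD5 itself.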

A sample size of 16 bytes is pretty unrealistic. Real world plaintext is more like thousands, or even millions, of bytes, such as the body of a mail message or the executable code and static data of a program file.

Another consideration is that input size has no bearing on the size of the resulting hash, since an MD5 hash is always 128 bits (16 bytes, or 32 hexadecimal characters) long.
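A quick check of the digest size, using Python's standard `hashlib`: the digest is 16 bytes, which prints as 32 hexadecimal characters, whether the input is 16 bytes or a million. The variable names are only for illustration.

```python
import hashlib

short = hashlib.md5(b"x" * 16)          # 16-byte input
long_ = hashlib.md5(b"x" * 1_000_000)   # 1,000,000-byte input

# digest_size is in bytes; hexdigest() uses two hex characters per byte
print(short.digest_size, len(short.hexdigest()))  # 16 32
print(long_.digest_size, len(long_.hexdigest()))  # 16 32
```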

Moreover, if you confine your test cases to plaintexts of 16 bytes, it's fairly easy to work out that there are 2^128 possible combinations (8 bits per byte times 16 bytes gives 128 bits), which is far too many to enumerate.

Sergejack · Answer 2 · 2014-05-11T21:39:00

Solution 3

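Answer: very slowly, because this is not just a matter of finding a collision. The salt and the original data are fixed, so the attacker has to find a second preimage, and no practical preimage attack on MD5 is known.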
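To put a rough number on "very slowly", here is a back-of-the-envelope sketch. It assumes plain brute force with no shortcut attack, so about 2^128 trials are expected for a 128-bit hash, and it assumes a rate of 10^10 hashes per second, which is a made-up figure for illustration and not a measurement. The `years_to_search` helper is likewise hypothetical.

```python
def years_to_search(bits: int = 128, hashes_per_second: float = 1e10) -> float:
    """Years needed to try 2**bits inputs at the given (assumed) hash rate."""
    seconds = 2 ** bits / hashes_per_second
    return seconds / (365.25 * 24 * 3600)  # seconds in a Julian year


print(f"{years_to_search():.2e} years")  # 1.08e+21 years
```

Even if the assumed rate is off by a factor of a million in the attacker's favour, the answer stays firmly in "a few civilizations" territory.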

Posted 11-May-14 21:39pm