MD5, heaven on earth?

tomclaessens

This short report is aimed mostly at PHP developers. Its point is that MD5 offers a false sense of security, not because of the algorithm itself but because of how it is used.

MD5 is a one-way hash algorithm that produces a 128-bit hash regardless of the length of the original text. The resulting hash is often used to sign documents, giving a way to verify that the original content wasn't altered, whether by a software or hardware fault or by a third party.
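
As a quick illustration of the fixed-length property, here is a minimal sketch using PHP's built-in md5(). The sample inputs are arbitrary and only there for the demonstration.

```php
<?php
// Inputs of very different lengths (arbitrary sample strings).
$inputs = ['', 'a', str_repeat('hello world ', 1000)];

foreach ($inputs as $input) {
    $hash = md5($input); // always 32 hex characters = 128 bits
    printf("%6d bytes in -> %d hex chars out\n", strlen($input), strlen($hash));
}

// Changing a single character of the input gives a different hash.
var_dump(md5('document v1') === md5('document v2')); // bool(false)
```

This is what makes the hash usable as a signature: the check value stays small however large the document is.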

Today, many applications (most of them network-oriented) use MD5 as their authentication algorithm to avoid sending plain-text passwords over the network (or, in many cases, the internet).

Here's an example of how MD5 is (mis)used today: a client sends an MD5 password hash over the network to the server. The server computes its own MD5 hash of the password and compares the two. If they match, the server assumes that the client (or whoever sent the hash) knows the password, and therefore the authentication succeeds. But the server might be totally wrong.

Theoretically, MD5 cannot be reversed. That means nobody can compute the original text back from the hash (even for small strings, like most passwords). But do attackers really need to know the original string? No. The number of possible hashes is fixed (2¹²⁸), while the number of possible input strings is unlimited, so many strings must give the same MD5 hash.

So is MD5 a faulty algorithm? No, but it is used for the wrong purpose. As stated above, it offers a way to check whether a file (or string) has been altered, but the designers never intended it to be used as an authentication algorithm.

The problem is that many authentication methods use MD5 in the wrong way. Take MSN Messenger as an example. The common way to use MD5 for authentication is MD5(challenge + MD5(password + challenge)), but Messenger uses a simplified version, MD5(challenge + password). Using MD5 for authentication was a weak choice in the first place, but simplifying the scheme made it even worse.
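
The two formulas can be sketched in PHP roughly as follows. This assumes that "+" in the formulas means string concatenation; the function names and the sample password are hypothetical, and only md5(), random_bytes(), bin2hex() and hash_equals() are PHP built-ins.

```php
<?php
// The "common" scheme from the text: MD5(challenge + MD5(password + challenge)).
function response_nested(string $challenge, string $password): string
{
    return md5($challenge . md5($password . $challenge));
}

// The simplified scheme attributed to Messenger: MD5(challenge + password).
function response_simple(string $challenge, string $password): string
{
    return md5($challenge . $password);
}

// Server side: issue a fresh random challenge for every login attempt.
$challenge = bin2hex(random_bytes(16));

// Client side: answer the challenge without sending the password itself.
$clientResponse = response_nested($challenge, 'secret');

// Server side: recompute the response from its own copy of the password and compare.
$ok = hash_equals(response_nested($challenge, 'secret'), $clientResponse);
var_dump($ok);
```

In both variants, anyone who captures the challenge and the response can test password guesses offline, which is where the attacks described next come in.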

How is MD5 cracked, then? There are two easy ways. First, most attackers would try a dictionary attack, using a file of commonly used passwords. Each string (in this case, each password) is MD5'd and the resulting hash is compared with the target hash. Once the two hashes match, the authentication is broken (which in no way means that the string from the dictionary file is identical to the original string behind the cracked hash).

A second method is a brute-force attack, which simply tries every possible combination of characters. Given enough time, it will always come up with a string whose hash matches the one to be cracked. Both methods can be time-consuming. But since most people use weak or short passwords (ones that are easy to remember), the cracking shouldn't take that long. And since most people use existing words, the dictionary attack is obviously the attacker's first choice.
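
A brute-force search can be sketched like this. The function name, the tiny three-letter alphabet and the length limit are assumptions chosen so the demo finishes instantly; a real attack would use a far larger alphabet.

```php
<?php
// Hypothetical helper: tries every string over $alphabet, up to $maxLen characters.
function brute_force_md5(string $targetHash, string $alphabet, int $maxLen): ?string
{
    $base = strlen($alphabet);
    for ($len = 1; $len <= $maxLen; $len++) {
        $total = $base ** $len; // number of candidates of this length
        for ($i = 0; $i < $total; $i++) {
            // Write $i as a $len-digit number in base $base, using $alphabet as the digits.
            $candidate = '';
            $n = $i;
            for ($pos = 0; $pos < $len; $pos++) {
                $candidate .= $alphabet[$n % $base];
                $n = intdiv($n, $base);
            }
            if (md5($candidate) === $targetHash) {
                return $candidate;
            }
        }
    }
    return null; // nothing found within $maxLen characters
}

$found = brute_force_md5(md5('cab'), 'abc', 3);
var_dump($found);
```

The number of candidates grows as the alphabet size raised to the password length, which is why short passwords fall quickly and long ones do not.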

How do you prevent this? Use MD5 as it was intended (as a signature for file checking) and use real authentication algorithms to protect information that requires secrecy.
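
The text above doesn't name a specific algorithm. As one possible reading, and purely as an assumption, PHP's built-in password_hash() and password_verify() (added in PHP 5.5) cover the password-storage side of the problem. The sample passwords are made up.

```php
<?php
// Hash once, at registration time; the salt and cost are stored inside the result.
$stored = password_hash('correct horse', PASSWORD_DEFAULT);

// At login, check the submitted password against the stored hash.
var_dump(password_verify('correct horse', $stored)); // bool(true)
var_dump(password_verify('wrong guess', $stored));   // bool(false)
```

These functions are deliberately slow and salted, so each dictionary or brute-force guess costs an attacker far more than a single md5() call.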

In the end, weak authentication systems are not weak because of the algorithm itself, but because the designers or developers chose an algorithm that wasn't created for that purpose in the first place.

To improve security, you can use MD5 in a more "sophisticated" way.

Example:

function str_encode($str) {
   $strMD5feed = sha1("y8Vyica9aspPHs9H9X6Z"); // random key mixed into the md5
   $str = base64_encode($str);
   $str = str_rot13($str);
   $str = bin2hex($str);
   // Use "." to concatenate strings; "+" is numeric addition in PHP.
   $str = md5($str . md5($strMD5feed . $str));
   return $str;
}

To get a good $strMD5feed value, I suggest that those who aren't that creative at picking random letters use my phpPPG tool (check the link in my signature).

Mordecai

Hmm... I've tried 2719 words on the string "mypass," and none have matched. I'm not quite sure how MD5 works (I don't use it, anyway), but I don't think multiple strings can give the same hash, but then again, I don't see why they couldn't.

I altered my crack function to use MD5().

TruckStuff

Hmm... I've tried 2719 words on the string "mypass," and none have matched.

That's because 'mypass' isn't a word that appears in most dictionaries.

Until recently, MD5 was thought to return a unique value for any unique string (that is why it is used as a method of integrity verification). However, it was recently proven that it IS possible for MD5 to return identical results for different strings. Google for more details.

The fact of the matter is that no authentication mechanism is 100% foolproof. The best you can do is get a strong-enough auth system to keep out the script kiddies and those who are just plugging away for obvious holes. If someone REALLY knows what they are doing and REALLY wants to get into your system, you'll never be able to stop them.

Mordecai

Originally posted by TruckStuff
That's because 'mypass' isn't a word that appears in most dictionaries.

Actually, it was in the dictionary, I took it out so there wasn't an exact match. I used one of my own personal dictionaries (just some random words, not in any sort of order), so I got to choose what words were in it.

I'll try it with more words and find what's close. I'll implement levenshtein distancing to see if anything is remotely close (<5 characters difference, or so).

ednark

the incorrect use of md5 is a good thing to point out to people.. i used to think it was for encrypting simply because i saw it used as such.. and you don't know unless you are told..

there was a post recently about the RC4 algorithm (arcfour, rcfour, something like that) that you can search for if you are more interested in encryption... however, that specific algorithm is also implemented as part of the mcrypt_ php module

ahundiak

and a lot of strings will give the same MD5 hash.

tomclaessens,

Can you provide an example? Maybe give a list of 5 common words that all produce the same MD5 value?

jc94062

The fact of the matter is that no authentication mechanism is 100% foolproof. The best you can do is get a strong-enough auth system to keep out the script kiddies and those who are just plugging away for obvious holes. If someone REALLY knows what they are doing and REALLY wants to get into your system, you'll never be able to stop them.

I use a useful little method where if someone enters a password wrongly 5 times in a day their IP is banned, for a day.
Sure, people can try thousands of passwords, it's just going to take a while 😛
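
A rough sketch of that kind of ban, with everything kept in memory for the demo. The class name, the method names and the example IP address are hypothetical, and a real site would keep the failure log in a database or cache. It approximates the one-day ban with a sliding 24-hour window.

```php
<?php
class LoginThrottle
{
    const MAX_FAILURES = 5;       // wrong passwords allowed per window
    const WINDOW_SECONDS = 86400; // one day

    private $failures = [];       // ip => list of failure timestamps

    public function recordFailure(string $ip, int $now)
    {
        $this->failures[$ip][] = $now;
    }

    public function isBanned(string $ip, int $now): bool
    {
        // Count only the failures that happened within the last day.
        $recent = array_filter(
            $this->failures[$ip] ?? [],
            function (int $t) use ($now) {
                return $now - $t < self::WINDOW_SECONDS;
            }
        );
        return count($recent) >= self::MAX_FAILURES;
    }
}

$throttle = new LoginThrottle();
$now = time();
for ($i = 0; $i < 5; $i++) {
    $throttle->recordFailure('203.0.113.7', $now);
}
var_dump($throttle->isBanned('203.0.113.7', $now));         // bool(true)
var_dump($throttle->isBanned('203.0.113.7', $now + 86400)); // bool(false), a day later
```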

Incidentally, I expect md5(md5($pass)) would be fairly unique.

Arakrys

Incidentally, I expect md5(md5($pass)) would be fairly unique.

I can't see why it would be more unique than md5($pass) ?

trooper

Good article

Let's have some more of these types of articles. They make people realise that just cuz it's "always been done like that", it ain't always correct.

😃

superwormy

For the record, this part of the article:

A client sends an MD5 password hash over the network to the server. The server makes his own MD5 password hash and then compares the two hashes.

Is not usually what is done / recommended on this board at all when you're working with PHP. Because PHP is server-side, people are usually sending a plain-text password over the network, and it's being hashed and matched on the server side. So that example at least isn't really applicable to purely PHP applications.

If you're worried about sending plain-text passwords over a network, you should be using SSL, which is designed just for those cases.

Another good idea, in addition to jc94062's idea of banning IPs, is to put a sleep(10) call on your login pages. So they submit the login form, and if the login fails, the script sleeps 10 seconds or so before it returns them to the login page again. A bot script that is just brute-forcing passwords will take significantly longer to brute-force the login if it has to wait 10 seconds between each request.
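
Something along these lines, as a minimal sketch. The function name and the $checkPassword callback are hypothetical stand-ins for whatever the site really uses, and the demo uses a 1-second delay instead of 10 just to keep it quick.

```php
<?php
// Hypothetical wrapper around the site's own password check.
function attempt_login(callable $checkPassword, string $user, string $pass, int $delaySeconds = 10): bool
{
    if ($checkPassword($user, $pass)) {
        return true; // successful logins are not delayed
    }
    sleep($delaySeconds); // stall only the failed attempt before answering
    return false;
}

// Stand-in check with one hard-coded account, for the demo only.
$check = function (string $user, string $pass): bool {
    return $user === 'demo' && $pass === 'letmein';
};

var_dump(attempt_login($check, 'demo', 'letmein', 1)); // bool(true), returns at once
var_dump(attempt_login($check, 'demo', 'guess', 1));   // bool(false), after the delay
```

Keep in mind that every sleeping request ties up a server process for the length of the delay.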

HalfaBee

A brute force script using a-zA-Z0-9 on even a short password could take 200+ years under ideal conditions.
If someone wants to prove me wrong, give me a string that gives
8e7118c57e205c0049f6bd5d358f53bb
as my password, I will stop using MD5 as a security measure.
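
The arithmetic behind an estimate like this can be sketched as follows. The password length and the hash rate are assumed figures for illustration, not numbers from this post.

```php
<?php
$alphabetSize = 62;         // a-z, A-Z, 0-9
$length = 8;                // assumed password length
$hashesPerSecond = 1000000; // assumed attacker speed

$candidates = $alphabetSize ** $length;    // 62^8 possible passwords
$seconds = $candidates / $hashesPerSecond; // worst case: every candidate is tried
$years = $seconds / (365 * 24 * 3600);

printf("%d candidates, about %.1f years at %d hashes/s\n", $candidates, $years, $hashesPerSecond);
```

The result scales directly with the assumed speed and grows by a factor of 62 for each extra character, so the estimate depends heavily on both inputs.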

HalfaBee

epimeth

3e47b75000b0924b6c9ba5759a7cf15d

that's my md5'd password

what's it in cleartext?

I'll give you a hint... it's a word from Webster's English dictionary.

go ahead.

find it.

and tell me how long it took....

Weedpacket

A thread in the Echo Lounge contains a number of observations about the size of MD5 space, searching it, appropriate use of MD5, its irrelevance (in a sense) to password storage, and so forth.

But it is good (and important) to point out that storing passwords as MD5 hashes on the server is no more secure when it comes to people trying to guess passwords than storing them in clear text; the security of storing MD5s of passwords comes from the difficulty it causes people who have managed to get access (legitimately or otherwise) to the stored passwords. Another recent thread which addresses this is here.

guyryan100

epimeth,
the md5'd password is "nothing" and it took, uh, 5 minutes to write the script, < 1 second to execute the script, and about 5 more minutes to remember my messageboard password so that I could post my reply.

The md5'd password: nothing

The code:
<?php
$dest_md5 = "3e47b75000b0924b6c9ba5759a7cf15d"; // find this
$handle = fopen("c:/temp/webster", "r"); // list of words, one per line
if ($handle === false) die("Could not open word list"); // stop if the file is missing
while (($line = fgets($handle)) !== false) // for each word in list
{
    $word = trim($line); // get rid of any whitespace
    $test_md5 = md5($word); // grab md5 of the word
    if ($test_md5 === $dest_md5) echo "Password Found: $word <br>"; // strict compare, if a match, display
}
fclose($handle);
?>

HalfaBee's md5'd password is not a dictionary word, so the above code did not identify a match on his pw. One would have to try random strings to identify it, which, as he pointed out, would take a while.

superwormy

Wow...

Point well proven there... can't argue with that.

pjleonhardt

Originally posted by guyryan100
epimeth,
the md5'd password is "nothing" and it took, uh, 5 minutes to write the script, < 1 second to execute the script, and about 5 more minutes to remember my messageboard password so that I could post my reply.

The md5'd password: nothing

The code:
<?php
$dest_md5="3e47b75000b0924b6c9ba5759a7cf15d"; // find this
$handle = fopen ("c:/temp/webster", "r"); // list of words
while (!feof ($handle)) // for each word in list
{
$word = trim(fgets($handle, 20)); // get rid of any whitespace
$test_md5 = md5($word); // grab md5 of the word
if($test_md5 == $dest_md5) echo "Password Found: $word <br>"; // compare, if a match, display
}
fclose ($handle);
?>

HalfaBee's md5'd password is not a dictionary word, so the above code did not identify the match on his pw. one would have to try random strings to identify it - which, as he pointed out, would take a while.

owned

HalfaBee

And?

Weedpacket

Originally posted by pjleonhardt
owned

Pardon?

HalfaBee

Huh?