md5 passwords

dalecosp · Nov 14, 2012

sneakyimp;11017433 wrote:
I typically install fail2ban on my servers to prevent such spamming via ssh.

Much better to disallow password authentication entirely and utilize key-based authentication, yes?

sneakyimp · Nov 14, 2012

dalecosp;11017441 wrote:
Much better to disallow password authentication entirely and utilize key-based authentication, yes?

Yes, that has been my approach where practicable -- with fail2ban and iptables on top for extra OCD-ness.

Bonesnap · Nov 14, 2012

sneakyimp;11017423 wrote:
If you have an algorithm you are using to generate the dictionary randomly, I'd be interested in knowing how often you encounter a collision in expanding your DB. It's my understanding that MD5 produces a 128-bit hash, which means there are 3.4 × 10³⁸ different possibilities -- vastly larger than the 3.06 × 10⁸ entries that you have. I also wonder if there's any theory out there about which algorithms are able to explore MD5 space faster than others.

I don't have an algorithm but I do have a method. I have spent considerable time compiling massive dictionaries ranging from English language to common passwords to celebrities and apply varying incremental "mangling" to each entry. Also, I have never encountered a collision. 306 million is quite literally a drop in the bucket compared to the total combinations. Crackstation has a dictionary of over 15 billion entries and from what I understand has never run into a collision.
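
A rough birthday-bound estimate supports this. For n inputs hashed into a 128-bit space, the chance of at least one accidental collision is approximately n² / 2¹²⁹. The sketch below assumes MD5 output is spread uniformly at random, which is an idealization, and the function name is made up for illustration.

```php
<?php
// Approximate probability of at least one collision among $n uniformly
// random values drawn from a space of 2^$bits values (birthday bound).
// The approximation is only meaningful while the result is far below 1.
function collisionProbability(float $n, int $bits = 128): float
{
    return ($n * $n) / (2.0 * pow(2.0, $bits));
}

printf("306 million entries: %.2e\n", collisionProbability(3.06e8));
printf("15 billion entries:  %.2e\n", collisionProbability(1.5e10));
```

Under that assumption the results are on the order of 10⁻²² for 306 million entries and 10⁻¹⁹ for 15 billion, so never seeing an accidental collision is exactly what one would expect.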

sneakyimp;11017423 wrote:
If your password is "password" then you will be cracked eventually by a dictionary attack. I'd be willing to bet that any exploit kit would start first with a set of common default passwords.

Dictionary attacks are almost useless against a salted password, the video notwithstanding. The only reason the video looks successful is because the attacker knew exactly how the salt was applied. I don't remember, were all the salts unique from that video? Also, all those passwords were very simple. If I recall, most of them were all digits or very common words. If you applied those passwords against a larger dictionary the time would increase substantially.

You are right, though: using a very simple password, even if salted, is much more likely to be cracked than a strong one (also salted).

sneakyimp;11017423 wrote:
As for depending on your source code to protect you, I think it's important to consider Kerckhoffs's principle, which suggests that the keys you use matter tremendously more than the mechanism of encryption. Considering for a moment the number of different ways that one might prepend or append or intersperse a salt with a password, it seems apparent that we're talking about a much smaller number of possibilities than the vast number space of a 128-bit key.

I'm not advocating using an insecure password; however, applying a salt increases its effectiveness quite a bit, and applying a non-standard method could make it nearly impossible. Remember, password attackers, like common thieves, go for the easy targets and avoid the difficult ones. Most won't even attempt a salted password because it's not worth it, and instead go for the unsalted ones.

Derokorian;11017425 wrote:
Edit: As for dictionary attacks, those will only work if the attacker is using your login mechanism to spam logins. However, good security says that if requests with invalid credentials happen more than X times within Y minutes from a single source, you block that source from further attempts for Z minutes (where Z increases every time that source is added to the list). Of course that's my style; it may not be everyone's.

Brute force dictionary attacks are usually applied to a leaked database, where the hashes can be sorted and unique entries removed. Such being the case, there are no arbitrary limits imposed on the attacker. In reality, no serious attacker tries to brute force through a website login form, even if there weren't any restrictions; it would simply take too long.
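
For the login-form case, the "more than X failures within Y minutes, block for Z minutes, with Z growing" policy quoted above could look roughly like this. It is a minimal sketch, assuming PHP 8 and in-memory state; the class name, defaults, and doubling rule are all hypothetical, and a real application would need shared storage such as a database or cache.

```php
<?php
// Sketch of a per-source login throttle. All names are hypothetical.
final class LoginThrottle
{
    private array $failures = [];      // source => list of failure timestamps
    private array $blockedUntil = [];  // source => unix time the block ends
    private array $offences = [];      // source => number of blocks so far

    public function __construct(
        private int $maxFailures = 5,       // X
        private int $windowSeconds = 600,   // Y
        private int $baseBlockSeconds = 300 // Z for the first offence
    ) {}

    public function isBlocked(string $source, int $now): bool
    {
        return ($this->blockedUntil[$source] ?? 0) > $now;
    }

    public function recordFailure(string $source, int $now): void
    {
        // Keep only failures still inside the window, then add this one.
        $recent = array_filter(
            $this->failures[$source] ?? [],
            fn (int $t): bool => $t > $now - $this->windowSeconds
        );
        $recent[] = $now;
        $this->failures[$source] = array_values($recent);

        // "More than X times": block once the count exceeds the limit.
        if (count($recent) > $this->maxFailures) {
            // Double the block length for each previous offence.
            $n = $this->offences[$source] ?? 0;
            $this->blockedUntil[$source] = $now + $this->baseBlockSeconds * (2 ** $n);
            $this->offences[$source] = $n + 1;
            $this->failures[$source] = [];
        }
    }
}

$throttle = new LoginThrottle(3, 600, 300);
foreach ([1000, 1010, 1020, 1030] as $time) {
    $throttle->recordFailure('203.0.113.7', $time);
}
var_dump($throttle->isBlocked('203.0.113.7', 1031)); // blocked after the 4th failure
```

Timestamps are passed in rather than read from time() so the behaviour is easy to check. None of this helps once the hash list itself has leaked, which is the scenario described above.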

Derokorian · Nov 14, 2012

Interesting points of information bone. As per usual I don't know the method for attacks, and made assumptions. I guess I don't understand how you can use a dictionary attack on the database unless you know the salt(s), the way they are applied and all intermediate steps between inputting a password in a form, and storing it in the db. For example, is it as simple as:

$pass = md5(APPSALT . $_POST['password'] . $row['USER_SALT']);

Or is it possibly more complex, like one I used to use:

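// Wrap the per-user salt, username, and password between the two halves of the application salt.
function hashPass($user, $salt, $pass) {
   $half = (int) ceil(strlen(APPSALT) / 2);
   $str = substr(APPSALT, 0, $half) . $salt . $user . $pass . $salt . substr(APPSALT, $half);
   // The inner hash() returns a hex string, so the outer SHA-512 hashes that hex string.
   return hash("sha512", hash("sha512", $str));
}
$pass = hashPass($username, $usersalt, $_POST['pass']);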

Seems to me, without any of those inner workings there shouldn't be any way to use a dictionary attack to crack a hashed password... But maybe I'm missing something (as always) about how it works.
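
To make the question concrete: if the per-user salts sit in the same table as the hashes and the attacker knows (or guesses) the scheme, a dictionary attack is just recompute-and-compare. The following is a minimal sketch assuming the simpler one-line scheme above; the APPSALT value, salts, usernames, and word list are all made up for illustration.

```php
<?php
// Hypothetical application salt, for illustration only.
const APPSALT = 'app-wide-salt';

// Same scheme as the one-line example: md5(APPSALT . password . user salt).
function hashPassword(string $password, string $userSalt): string
{
    return md5(APPSALT . $password . $userSalt);
}

// Try every dictionary word against every leaked row. Because each row has
// its own salt, every (word, row) pair needs its own hash computation.
function dictionaryAttack(array $leakedRows, array $dictionary): array
{
    $cracked = [];
    foreach ($leakedRows as $row) {
        foreach ($dictionary as $word) {
            if (hash_equals($row['hash'], hashPassword($word, $row['salt']))) {
                $cracked[$row['user']] = $word;
                break;
            }
        }
    }
    return $cracked;
}

$leakedRows = [
    ['user' => 'alice', 'salt' => 'x9Qp', 'hash' => hashPassword('password', 'x9Qp')],
    ['user' => 'bob',   'salt' => 'Lm2z', 'hash' => hashPassword('T7#vQ!r9zK', 'Lm2z')],
];
$dictionary = ['123456', 'letmein', 'password', 'qwerty'];

print_r(dictionaryAttack($leakedRows, $dictionary)); // only alice's weak password is found
```

Without the scheme, the attacker would have to guess how hashPassword() is built, which is the situation the question describes.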

Bonesnap · Nov 14, 2012

The method I would use would be closer to that of your second example.

Derokorian;11017461 wrote:
Seems to me, without any of those inner workings there shouldn't be any way to use a dictionary attack to crack a hashed password

This only applies if the password is salted. If it's not then a dictionary attack (a good one) can be very effective.

One other thing about salting: it severely mitigates the effectiveness of rainbow tables as well, almost nullifying them completely.
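
A small sketch of why precomputation stops paying off once salts are involved. It uses a plain hash-to-plaintext lookup table as a stand-in; a real rainbow table stores hash chains to save space, but the effect of salting is the same. The salts and word list are hypothetical.

```php
<?php
// Precompute an unsalted MD5 lookup table once: hash => plaintext.
function buildLookupTable(array $words): array
{
    $table = [];
    foreach ($words as $word) {
        $table[md5($word)] = $word;
    }
    return $table;
}

$table = buildLookupTable(['123456', 'letmein', 'password']);

// Unsalted hash: a single array lookup recovers the password.
$unsalted = md5('password');
var_dump($table[$unsalted] ?? null);

// Salted hashes (hypothetical per-user salts): the same password produces
// two different hashes, and neither appears in the precomputed table.
$saltedA = md5('k3Jd' . 'password');
$saltedB = md5('Zq81' . 'password');
var_dump($table[$saltedA] ?? null, $table[$saltedB] ?? null, $saltedA === $saltedB);
```

The table would have to be rebuilt for every distinct salt, at which point it is no cheaper than an ordinary dictionary attack.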

sneakyimp · Nov 14, 2012

Bonesnap;11017457 wrote:
I have spent considerable time compiling massive dictionaries ranging from English language to common passwords to celebrities and apply varying incremental "mangling" to each entry.

I wonder if it might be possible to use a simpler algorithm to generate an MD5 dictionary more rapidly, simply by taking the 93 printable ASCII characters and generating strings of some length between 8 and 20 chars by just counting from 0 to N in base 93. After all, the point isn't really to determine some original password, but to determine an equivalent string that accomplishes the same hash, right? Or is it? I think the salting algorithm might matter here. I wonder if salting might obviate the utility of an md5 dictionary entirely.
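
The "count in base 93" idea can be sketched as below. The alphabet is an assumption: ASCII codes 33 through 125 give the 93 symbols mentioned here (the full printable range without the space is 33 through 126, which is 94). Function names are made up, and the counter is a native integer, so this only works while 93 to the power of the length fits in 64 bits.

```php
<?php
// Write $index in base 93 as a fixed-length string, left-padded with the
// first symbol. Assumption: the alphabet is ASCII 33..125 (93 symbols).
function candidateAt(int $index, int $length): string
{
    $base = 93;
    $chars = '';
    for ($i = 0; $i < $length; $i++) {
        $chars = chr(33 + $index % $base) . $chars;
        $index = intdiv($index, $base);
    }
    return $chars;
}

// Lazily yield plaintext => md5 pairs for the first $count candidates.
function md5Dictionary(int $length, int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        $candidate = candidateAt($i, $length);
        yield $candidate => md5($candidate);
    }
}

foreach (md5Dictionary(8, 3) as $plain => $hash) {
    echo $plain, ' ', $hash, "\n";
}
```

Generating the candidates is the easy part; the size of the full enumeration is what makes it impractical.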

Bonesnap;11017457 wrote:
Dictionary attacks are almost useless against a salted password, the video notwithstanding.

Not if you apply the dictionary attack at the point of login, in which case the app is applying the salt for you.

Bonesnap;11017457 wrote:
applying a non-standard method could make it nearly impossible.

I think Kerckhoffs would disagree. I'm not sure how to formalize mathematically how much trickier non-standard salting methods are, but I would posit that each possible different salting technique contributes at most one bit to the strength of one's password [i.e., it provides exactly one alternative to the simple prepend or append approach] -- and then only if the source code is not also compromised.

Also, if you are applying the attack at the point of login, the salting method is irrelevant. The point of both hashing and salting is merely to make it more difficult to extract the actual password from stored data that has been exposed.

Bonesnap;11017457 wrote:
Brute force dictionary attacks are usually applied to a leaked database, where the hashes can be sorted and unique entries removed. Such being the case, there are no arbitrary limits imposed on the attacker. In reality, no serious attacker tries to brute force through a website login form, even if there weren't any restrictions; it would simply take too long.

I don't have any hacker cred to know under what conditions people most often use dictionary attacks, but I've seen long-running and persistent brute force attacks directed against ssh on a dedicated server I had once -- the basic approach appeared to be attempting login using common usernames like root, ubuntu, mysql, www-data, etc. With a botnet and/or multithreaded applications, it is possible to bring quite a bit of cracking power to bear on an unprepared server and the extra time lost in HTTP latency might well be offset by the extra time required to deal with some kind of black box salting algorithm.

I think the critical advantage of a dictionary attack -- regardless of where it is applied -- is that most passwords are not nonsense but are directly derived from one or two words in the dictionary. For an 8-letter password, this means a dramatic reduction in the number of possible passwords a victim might reasonably use. There are 5.6 × 10¹⁵ different random 8-char sequences of the 93 printable ASCII characters. I don't know the number of words in English with 8 letters or fewer, but it's maybe 10⁵. Allowing 100 variations per word for common substitutions (e.g., CamelCase or L3tt3r Substitution or punctuation) still yields only about 10⁷ possibilities -- a dramatic difference.
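
The arithmetic behind that comparison, as a quick check. The word count and the number of variations per word are the rough guesses from the paragraph above, not measured values.

```php
<?php
// Figures taken from the post; $words and $variations are rough guesses.
$alphabet   = 93;
$length     = 8;
$words      = 1e5;   // guessed English words of 8 letters or fewer
$variations = 100;   // guessed manglings per word

$bruteForce = pow($alphabet, $length); // 93^8
$dictionary = $words * $variations;    // 10^7

printf("brute force: %.2e\n", $bruteForce);
printf("dictionary:  %.2e\n", $dictionary);
printf("ratio:       %.2e\n", $bruteForce / $dictionary);
```

With these guesses the dictionary is smaller than the brute-force space by a factor of several hundred million.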

Weedpacket · Nov 15, 2012

Bonesnap wrote:
Also, I have never encountered a collision.

A mechanism for constructing MD5 collisions was presented in 2005; however it does involve the attacker creating both plaintexts.

Bonesnap wrote:
306 million is quite literally a drop in the bucket compared to the total combinations. Crackstation has a dictionary of over 15 billion entries and from what I understand has never run into a collision.

(hehe, he said "literally a drop in the bucket"). The attack shown was against a known salting method (ASP.NET's Membership provider), and in the video tested hashes at a rate of 259 million/second (it could have been a couple of billion tests per second but for bottlenecks in generating cases to pass to the GPU) for 45 minutes and recovered 24710 of 39384 passwords (with unique salts). Mind you, the dictionary only had 23,685,601 entries.

sneakyimp wrote:
I wonder if it might be possible to use a simpler algorithm to generate an MD5 dictionary more rapidly, simply by taking the 93 printable ASCII characters and generating strings of some length between 8 and 20 chars by just counting from 0 to N in base 93. After all, the point isn't really to determine some original password, but to determine an equivalent string that accomplishes the same hash, right?

There's no reason to assume that a collision has the same or even a remotely similar length to the target. (In fact, I'm not aware of a proof that any given hash does have (at least) two colliding plaintexts, though I'd be surprised if it's not so - it would imply that the MD5 algorithm is pretty poor at uniformly distributing plaintexts across the hash space.) Mind you, your "just counting" is equivalent to a dictionary containing 2.36785×10³⁹ words - or about 4×10²⁸ terabytes.
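
The size estimate can be reproduced with a few lines. The storage figure assumes only the raw 16-byte MD5 digest is kept per entry (no plaintext, no index), which is an assumption made here to match the terabyte figure; the function name is made up.

```php
<?php
// Number of strings of length $min..$max over an alphabet of $alphabet symbols.
// Floats are used because the total is far beyond the 64-bit integer range.
function keyspace(int $alphabet, int $min, int $max): float
{
    $total = 0.0;
    for ($len = $min; $len <= $max; $len++) {
        $total += pow((float) $alphabet, $len);
    }
    return $total;
}

$entries   = keyspace(93, 8, 20);
$bytes     = $entries * 16;   // assumption: store only the 16-byte MD5 digest
$terabytes = $bytes / 1e12;

printf("entries:   %.3e\n", $entries);
printf("terabytes: %.1e\n", $terabytes);
```

Almost all of the total comes from the 20-character strings; the shorter lengths add only about one percent.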

sneakyimp wrote:
I think the critical advantage of a dictionary attack -- regardless of where it is applied -- is that most passwords are not nonsense but are directly derived from one or two words in the dictionary. For an 8-letter password, this means a dramatic reduction in the number of possible passwords a victim might reasonably use. There are 5.6 × 10¹⁵ different random 8-char sequences of the 93 printable ASCII characters. I don't know the number of words in English with 8 letters or fewer, but it's maybe 10⁵. Allowing 100 variations per word for common substitutions (e.g., CamelCase or L3tt3r Substitution or punctuation) still yields only about 10⁷ possibilities -- a dramatic difference.

Obligatory xkcd reference here

Bonesnap · Nov 15, 2012

sneakyimp;11017465 wrote:
I wonder if it might be possible to use a simpler algorithm to generate an MD5 dictionary more rapidly, simply by taking the 93 printable ASCII characters and generating strings of some length between 8 and 20 chars by just counting from 0 to N in base 93. After all, the point isn't really to determine some original password, but to determine an equivalent string that accomplishes the same hash, right? Or is it? I think the salting algorithm might matter here. I wonder if salting might obviate the utility of an md5 dictionary entirely.

What you're referring to is a rainbow table, and they already exist for all possible combinations of ASCII characters up to eight characters for MD5. If you look hard enough on the Internet you can find them. I have.

The issue with rainbow tables is simply that passwords that people use don't "fit the bill", as sneakyimp basically said. So while you will have a massive dictionary of precomputed hashes, your success rate will be far less than a carefully constructed dictionary attack.

sneakyimp;11017465 wrote:
Not if you apply the dictionary attack at the point of login, in which case the app is applying the salt for you.

Agreed, but simply put: it's too slow. When you compare the amount of time it takes even a script to enter credentials with the time it takes to attack a hash list, it's night and day. Even if it took half a second, there are millions of combinations "wasted" - and that's assuming that the login form doesn't have security measures in place. Most do nowadays. They're really not a target for password attackers.

sneakyimp;11017465 wrote:
I think Kerckhoffs would disagree. I'm not sure how to formalize mathematically how much trickier non-standard salting methods are, but I would posit that each possible different salting technique contributes at most one bit to the strength of one's password [i.e., it provides exactly one alternative to the simple prepend or append approach] -- and then only if the source code is not also compromised.

He can disagree all he wants, but the fact remains that if you don't know how the salt is applied then you're out of luck. You have only two options: 1) Keep trying different salting methods until you find one that works. But that presents a problem: what if you actually use the right salting method but the string you're testing isn't in your dictionary? You would think you had the wrong one and move on. Sure, you could test every candidate method against each and every hash, but that multiplies your time. Why not just attack someone who puts up much less of a fight?

2) Do a straight-up brute force attack trying every single possible combination, but we both know that could take centuries or more.

Nothing in the world is absolutely 100% secure; everything has a flaw. So let's remember the gritty truth about security: you don't have to be 100% secure, but you'd better hope to be more secure than the guy standing next to you.

EDIT: Yes, if your source code is compromised then there's nothing you can do. If you drop your house key and someone picks it up, then your fancy lock means nothing. One thing to keep in mind though: as the "designer" of the system, I don't have control over people's passwords (short of putting restrictions on them but that's generally considered a bad idea), so I will make my system as strong as possible. I still think it is better to have a non-standard salting method. Wouldn't you agree?

sneakyimp;11017465 wrote:
I don't have any hacker cred to know under what conditions people most often use dictionary attacks, but I've seen long-running and persistent brute force attacks directed against ssh on a dedicated server I had once -- the basic approach appeared to be attempting login using common usernames like root, ubuntu, mysql, www-data, etc. With a botnet and/or multithreaded applications, it is possible to bring quite a bit of cracking power to bear on an unprepared server and the extra time lost in HTTP latency might well be offset by the extra time required to deal with some kind of black box salting algorithm.

In my (limited) experience and from what I've read, the overwhelming majority of dictionary attacks and brute force attacks are on a hash list or some other kind of list that the attacker can manipulate.

EDIT: By limited experience I do not mean I have hacked, cracked, or otherwise tried to gain access to a system or someone's account without permission. Just wanted to clarify that!

sneakyimp;11017465 wrote:
I think the critical advantage of a dictionary attack -- regardless of where it is applied -- is that most passwords are not nonsense but are directly derived from one or two words in the dictionary. For an 8-letter password, this means a dramatic reduction in the number of possible passwords a victim might reasonably use. There are 5.6 × 10¹⁵ different random 8-char sequences of the 93 printable ASCII characters. I don't know the number of words in English with 8 letters or fewer, but it's maybe 10⁵. Allowing 100 variations per word for common substitutions (e.g., CamelCase or L3tt3r Substitution or punctuation) still yields only about 10⁷ possibilities -- a dramatic difference.

Agreed, which is why dictionary attacks are far more likely to succeed versus a rainbow table.

Weedpacket;11017471 wrote:
A mechanism for constructing MD5 collisions was presented in 2005; however it does involve the attacker creating both plaintexts.

I'm not denying collisions exist or that it's possible, I was just saying that I have personally never encountered one during the creation of my dictionary (or otherwise).

Weedpacket;11017471 wrote:
(hehe, he said "literally a drop in the bucket"). The attack shown was against a known salting method (ASP.NET's Membership provider), and in the video tested hashes at a rate of 259 million/second (it could have been a couple of billion tests per second but for bottlenecks in generating cases to pass to the GPU) for 45 minutes and recovered 24710 of 39384 passwords (with unique salts). Mind you, the dictionary only had 23,685,601 entries.

I don't quite understand what you're getting at here. 306 million is a fraction of a fraction of a percent of the total possible combinations of MD5, and even less for SHA-1.

Also, you said it right there. A known salting method. That's the weakness.

And to be honest, 259 million/second isn't impressive. My dual 8800 ULTRAs do ~950 million/second when I use IGHashGPU. Even a few billion is nothing nowadays. You should check out Whitepixel. They were able to obtain 33.1 billion/second with 4 GPUs running together. Talk about horsepower, eh?
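
To put those rates in perspective, here is a rough sketch of how long it would take to try every 8-character string over the 93-symbol alphabet used earlier in the thread. It assumes a single unsalted MD5 target list and that the quoted rate is sustained; with unique salts the work is multiplied by the number of distinct salts. The function name is made up.

```php
<?php
// Seconds needed to try every candidate in $keyspace at a given hash rate.
// Rates are the ones quoted in this thread; real throughput depends on the
// hardware, the hash, and how fast candidates can be generated.
function secondsToExhaust(float $keyspace, float $hashesPerSecond): float
{
    return $keyspace / $hashesPerSecond;
}

$keyspace = pow(93.0, 8);
$rates = [
    '259 million/s (video)'      => 259e6,
    '950 million/s (two GPUs)'   => 950e6,
    '33.1 billion/s (four GPUs)' => 33.1e9,
];

foreach ($rates as $label => $rate) {
    printf("%-28s %8.1f days\n", $label, secondsToExhaust($keyspace, $rate) / 86400);
}
```

Under these assumptions the full 8-character sweep drops from roughly 250 days to about 2 days between the slowest and fastest rates.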

Weedpacket · Nov 15, 2012

Bonesnap wrote:
I don't quite understand what you're getting at here. 306 million is a fraction of a fraction of a percent of the total possible combinations of MD5, and even less for SHA-1.

http://theoatmeal.com/comics/literally

Bonesnap wrote:
Also, you said it right there. A known salting method. That's the weakness.

Which, if you use a third- or second-party authentication system, or the choice of hash is otherwise outside your control, is likely to be the case (see the list of algorithms that the cited software targets).

http://board.phpbuilder.com/showthread.php?10386947-md5-passwords&p=11016865&viewfull=1#post11016865

Bonesnap · Nov 15, 2012

Weedpacket;11017493 wrote:
http://theoatmeal.com/comics/literally

Heh, touché.

Weedpacket;11017493 wrote:
Which, if you use a third- or second-party authentication system, or the choice of hash is otherwise outside your control, is likely to be the case (see the list of algorithms that the cited software targets).

http://board.phpbuilder.com/showthread.php?10386947-md5-passwords&p=11016865&viewfull=1#post11016865

Fair enough. I still advocate choosing a strong password as well.

dalecosp · Nov 15, 2012

sneakyimp;11017447 wrote:
Yes, that has been my approach where practicable -- with fail2ban and iptables on top for extra OCD-ness.

Ah, I understand that. I tend to tcpwrap mine, warnings in the config_file be d-----d. So far, been a long time since I got bit by that.
