sneakyimp;11017465 wrote:I wonder if it might be possible to use a simpler algorithm to generate an MD5 dictionary more rapidly simple by taking the 93 printable ASCII characters and generating strings of some length between 8 and 20 chars by just counting from 0 to N in base 93. After all, the point isn't really to determine some original password, but to determine an equivalent string that accomplishes the same hash, right? Or is it? I think the salting algorithm might matter here. I wonder if salting might obviate the utility of an md5 dictionary entirely.
What you're referring to is a rainbow table, and they already exist for all possible combinations of ASCII characters up to eight characters for MD5. If you look hard enough on the Internet you can find them. I have.
The issue with rainbow tables is simply that passwords that people use don't "fit the bill", as sneakyimp basically said. So while you will have a massive dictionary of precomputed hashes, your success rate will be far less than a carefully constructed dictionary attack.
sneakyimp;11017465 wrote:Not if you apply the dictionary attack at the point of login, in which case the app is applying the salt for you.
Agreed, but simply put: it's too slow. When you compare the amount of time it takes even a script to enter in credentials compared to attacking a hash list, it's night and day. Even if it took half a second, there's millions of combinations "wasted" - and that's assuming that the login form doesn't have security measures in place. Most do nowadays. They're really not a target for password attackers.
sneakyimp;11017465 wrote:I think Kerckhoff would disagree 😉 I'm not sure how to formalize mathematically how much trickier non-standard salting methods are, but I would posit that each possible different salting technique contributes at most one bit to the strength of one's password [i.e., it provides exactly one alternative to the simple prepend or append approach] -- and then only if the source code is not also compromised.
He can disagree all he wants but the fact remains if you don't know how the salt is applied then you're out of luck. Your only two options are to either 1) keep trying different salting methods until you find one that works but that presents a problem: what if you actually use the right salting method but the string you're testing isn't in your dictionary? You would think you had the wrong one and move on. Sure, you could test each and every single hash, but that exponentially adds to your time. Why not just attack someone who puts up much less of a fight?
And 2) Do a straight up brute force attack trying every single possible combination, but we both know that could take centuries or more.
Nothing in the world is absolutely 100% secure; everything has a flaw. So let's remember the gritty truth about security: you don't have to be 100% secure, but you better hope to be more secure than the guy standing next to you.
EDIT: Yes, if your source code is compromised then there's nothing you can do. If you drop your house key and someone picks it up, then your fancy lock means nothing. One thing to keep in mind though: as the "designer" of the system, I don't have control over people's passwords (short of putting restrictions on them but that's generally considered a bad idea), so I will make my system as strong as possible. I still think it is better to have a non-standard salting method. Wouldn't you agree?
sneakyimp;11017465 wrote:I don't have any hacker cred to know under what conditions people most often use dictionary attacks, but I've seen long-running and persistent brute force attacks directed against ssh on a dedicated server I had once -- the basic approach appeared to be attempting login using common usernames like root, ubuntu, mysql, www-data, etc. With a botnet and/or multithreaded applications, it is possible to bring quite a bit of cracking power to bear on an unprepared server and the extra time lost in HTTP latency might well be offset by the extra time required to deal with some kind of black box salting algorithm.
In my (limited) experience and from what I've read, the overwhelming majority of dictionary attacks and brute force attacks are on a hash list or some other kind of list that the attacker can manipulate.
EDIT: By limited experience I do not mean I have hacked, cracked, or otherwise tried to gain access to a system or someone's account without permission. Just wanted to clarify that!
sneakyimp;11017465 wrote:I think the critical advantage of a dictionary attack -- regardless of where it is applied -- is that most passwords are not nonsense but are directly derived from one or two words in the dictionary. For an 8-letter password, this means a dramatic reduction in the number of possible passwords a victim might reasonably use. There are 5.6 * 10 ^ 15 different random 8-char sequences of the 93 printable ASCII characters. I don't know the number of letters in English with 8 letters or less but it's maybe 10 5. Allowing 100 variations per word for common substitutions (e.g., CamelCase or L3tt3r Substitution or punctuation) still yields only about 10 ^ 7 possibilities -- a dramatic difference.
Agreed, which is why dictionary attacks are far more likely to succeed versus a rainbow table.
Weedpacket;11017471 wrote:A mechanism for constructing MD5 collisions was presented in 2005; however it does involve the attacker creating both plaintexts.
I'm not denying collisions exist or that it's possible, I was just saying that I have personally never encountered one during the creation of my dictionary (or otherwise).
Weedpacket;11017471 wrote:(hehe, he said "literally a drop the bucket"). The attack shown was against a known salting method (ASP.NET's Membership provider), and in the video tested hashes at a rate of 259 million/second (it could have been a couple of billion tests per second but for bottlenecks in generating cases to pass to the GPU) for 45 minutes and recovered 24710 of 39384 passwords (with unique salts). Mind you, the dictionary only had 23,685,601 entries.
I don't quite understand what you're getting at here. 306 million is a fraction of a fraction of a percent of the total possible combinations of MD5, and even less for SHA-1.
Also, you said it right there. A known salting method. That's the weakness.
And to be honest, 259 million/second isn't impressive. My dual 8800 ULTRAs do ~950 million/second when I use IGHashGPU. Even a few billion is nothing nowadays. You should check out a Whitepixel. They were able to obtain 33.1 billion/second with 4 GPUs running together. Talk about horsepower, eh?