forgotten password functionality -- security risk in random bytes?

NogDog · Mar 24, 2021

I am not an expert, but am interested if you learn anything w.r.t. minimum recommended token length. I had worked on something in the past where we did something like:

$token = rtrim(base64_encode(sha1(rand(1, PHP_INT_MAX))), '=');

Trying that locally, I get a token of 54 characters. But looking at an API we interface with now, it returns a JWT for security that's 900+ characters. Dunno where the happy middle ground is for a password reset email, though.

sneakyimp · Mar 24, 2021

NogDog I don't think you need to base64_encode the output of sha1 -- the latter returns a hex value of 40 chars and base64 just adds an extra 33% or so of length on top of that.

I'd also point out that rand is considered cryptographically 'unsafe' according to OWASP guidelines.
https://cheatsheetseries.owasp.org/cheatsheets/Cryptographic_Storage_Cheat_Sheet.html#secure-random-number-generation

I did stumble across this brief discussion of token length, but was unsure what reasonable assumptions are about attack rates. To decide the lifetime of a token, you should ask yourself how quickly an attacker might make guess attempts. I think if you have flood control (i.e., block repeated requests) then you are considerably safer against brute force attack.

EDIT: You might consider base64-encoding the raw output of sha1 to try and conserve length:

$token = base64_encode(sha1(rand(1, PHP_INT_MAX), true));
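For reference, the lengths being compared here can be checked with a quick sketch. The input string 'example' is an arbitrary stand-in (it is not from the posts above), since only the encoded lengths matter:

```php
<?php
// Compare the encoded lengths of a single SHA-1 digest.
$hex          = sha1('example');                       // 40 hex characters
$b64OfHex     = base64_encode($hex);                   // 56 characters, ending in "=="
$b64OfHexTrim = rtrim($b64OfHex, '=');                 // 54 characters
$b64OfRaw     = base64_encode(sha1('example', true));  // 28 characters (20 raw bytes)

printf(
    "%d %d %d %d\n",
    strlen($hex),
    strlen($b64OfHex),
    strlen($b64OfHexTrim),
    strlen($b64OfRaw)
); // prints "40 56 54 28"
```

This only illustrates length; it says nothing about whether the value being hashed is unpredictable.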

NogDog · Mar 24, 2021

sneakyimp I don't think you need to base64_encode the output of sha1

To be honest, I think we did it just because it looks more secure: a wider variety of characters.

sneakyimp · Mar 24, 2021

NogDog Security theater! Bravo!

I'm hearing that 256 bits (32 bytes) should do the trick for a password reset token. OWASP also says that random_bytes is cryptographically secure. Furthermore, it is my understanding that it is safe to expose randomly generated bits/bytes directly to remote users without any need for a hash. That all being the case, it looks like you can do something like this in PHP 7:

$token = base64_encode(random_bytes(32));

and the string representation of the token is going to be 44 characters long -- I think.

Weedpacket · Mar 25, 2021

sneakyimp Is it safe to just append a bin2hex or base64_encode of the raw random bytes to a url and send that along? Or does this expose the sensitive inner working of my CSPRNG to bad guys?

As far as this goes, if the random bytes have enough structure to allow identifying the workings of the generator producing them, then they're not random enough to be considered cryptographically secure.

$token = base64_encode(random_bytes(32));

Base64 encodes incoming bytes in triples, so this $token will always end with a = and the last character will end with two bits of zero-padding. There's no information leak, but random_bytes(33) "looks more random" after base64 encoding, without the base64 string being any longer (the = is replaced by the additional randomness).
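A quick way to see the padding difference, as a minimal sketch (the variable names are just illustrative):

```php
<?php
$t32 = base64_encode(random_bytes(32)); // 256 bits -> 43 data characters plus one "="
$t33 = base64_encode(random_bytes(33)); // 264 bits -> 44 data characters, no padding

var_dump(strlen($t32), substr($t32, -1)); // int(44) and string(1) "="
var_dump(strlen($t33));                   // int(44)
var_dump(strpos($t33, '=') === false);    // bool(true)

// The 43rd character of $t32 carries only 4 random bits (its low 2 bits are zero),
// so it is always one of these 16 characters.
var_dump(strpos('AEIMQUYcgkosw048', $t32[42]) !== false); // bool(true)
```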

sneakyimp · Mar 25, 2021

Weedpacket As far as this goes, if the random bytes have enough structure to allow identifying the workings of the generator producing them, then they're not random enough to be considered cryptographically secure.

I've seen this stated a few ways, but you have stated it most clearly. I looked around a bit and stumbled across an example where an OS distributed a problematic random number generator. This is precisely the scenario it seems one should be worried about in showing the world one's raw random number bytes. Would it offer any protection to run the raw bytes from the (supposedly CS) PRNG through a hash function instead of just letting the world see them? Would the hash function offer any concealment if your PRNG is not really CS?

Weedpacket Base64 encodes incoming bytes in triples, so this $token will always end with a = and the last character will end with two bits of zero-padding. There's no information leak, but random_bytes(33) "looks more random" after base64 encoding, without the base64 string being any longer (the = is replaced by the additional randomness).

Thanks for pointing this out -- a multiple of 3 for the byte length would probably optimize the entropy per character best. That said, I wonder if any mail gateways will end up breaking these urls, which are longer than 75 chars with this lengthy token.
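To get a feel for the resulting link length, here is a rough sketch. The base URL is hypothetical, and swapping "+" and "/" for URL-safe characters is an assumption added here so the token needs no percent-encoding; it is not something suggested in the posts above:

```php
<?php
// Hypothetical reset endpoint; substitute your own.
$base = 'https://example.com/reset?token=';

// 33 bytes (a multiple of 3) encode to exactly 44 base64 characters with no "=".
// strtr() replaces "+" and "/" with "-" and "_" so the token is safe in a URL.
$token = strtr(base64_encode(random_bytes(33)), '+/', '-_');
$url   = $base . $token;

echo strlen($token), ' ', strlen($url), "\n"; // the token is 44; the URL is already past 75

// On the receiving side, reverse the substitution before decoding.
$rawBytes = base64_decode(strtr($token, '-_', '+/'), true);
```

Even with a short base URL the link exceeds 75 characters, so whether a mail gateway wraps it is worth testing with a real message.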

Weedpacket · Mar 25, 2021

The tokens are only good for a few hours anyway, and if an attacker actually does break one, what would actually be gained? The ability to guess the nonces used in subsequent reset requests. If you're able to generate a password reset request and get the RNG into a predictable state, then try and log in as the user you're targeting and make a password reset request there. You won't get the reset email because you don't have the target's email account, but you can predict the nonce they got. So you can now change the password and log in as the target.

Hashing won't make a difference because if the original random number can be predicted, then its hash can be predicted as well.
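A small sketch of that point. Here mt_rand() with a fixed seed stands in for a generator whose state the attacker has worked out; that is an assumption for illustration only, since random_bytes() cannot be seeded this way:

```php
<?php
// Server side: a predictable generator feeds the hash.
mt_srand(12345);
$serverToken = sha1((string) mt_rand());

// Attacker side: replay the same generator state, then apply the same hash.
mt_srand(12345);
$attackerGuess = sha1((string) mt_rand());

var_dump(hash_equals($serverToken, $attackerGuess)); // bool(true)
```

The hash hides the input from someone who only sees the output, but it does nothing against someone who can already compute the input.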

But the attack hinges on being able to predict the next nonce that will be used (it's the one after the one the attacker had just got that yielded the RNG's current state). If you generated (decided using a different RNG) a random number of such nonces (up to, say, ten thousand) and used the last one, then the attacker would have to brute force through on average ten thousand nonces, using each one to try and reset the password (and up to twenty thousand, because the same one-nonce-in-ten-thousand mechanism would have been used to generate the nonce the attacker got). But that's activity that should ring alarm bells anyway, and could be dealt with through a simple backoff mechanism of "five failed attempts, please wait five minutes before trying again"; they'd run out of time before getting even a few dozen attempts and would have to start all over. Generating and discarding nonces might take a little extra time, but it would be going out by email anyway, so normal users won't notice.
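The "five failed attempts, wait five minutes" backoff could look roughly like this. It is a minimal in-memory sketch: the class name, method names, and the choice of key are all hypothetical, and a real site would persist the failure records (per account or per IP) rather than hold them in an array:

```php
<?php
final class ResetAttemptLimiter
{
    /** @var array<string, int[]> failure timestamps keyed by account or IP */
    private $failures = [];
    private $maxFailures;
    private $windowSeconds;

    public function __construct(int $maxFailures = 5, int $windowSeconds = 300)
    {
        $this->maxFailures = $maxFailures;
        $this->windowSeconds = $windowSeconds;
    }

    public function recordFailure(string $key, int $now): void
    {
        $this->failures[$key][] = $now;
    }

    public function isBlocked(string $key, int $now): bool
    {
        $cutoff = $now - $this->windowSeconds;
        // Keep only the failures that are still inside the window.
        $recent = array_values(array_filter(
            $this->failures[$key] ?? [],
            function (int $t) use ($cutoff): bool {
                return $t > $cutoff;
            }
        ));
        $this->failures[$key] = $recent;

        return count($recent) >= $this->maxFailures;
    }
}

$limiter = new ResetAttemptLimiter();
for ($i = 0; $i < 5; $i++) {
    $limiter->recordFailure('target@example.com', 1000 + $i);
}
var_dump($limiter->isBlocked('target@example.com', 1010)); // bool(true)
var_dump($limiter->isBlocked('target@example.com', 1400)); // bool(false)
```

Timestamps are passed in explicitly so the behaviour is easy to test; in practice you would pass time().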

In the meantime the target may be oblivious to all this (having not seen the password reset email yet), and may even log in normally, at which point the password reset request could be discarded completely, and the attacker is left shooting off at nothing.

The Debian vulnerability was different: "...there are only 65.536 possible ssh keys generated, cause the only entropy is the pid of the process generating the key." source

No amount of munging those keys would have worked, because without additional entropy there would still have only been 65536 possible outcomes that could be brute forced. (Those keys are now widely blacklisted; if any come up then security monitoring software is likely to flag them as possibly compromised.)
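To make the "munging doesn't help" point concrete, here is a toy sketch. The weakToken() function and its fixed prefix are hypothetical; the only entropy is a 16-bit value, standing in for the PID:

```php
<?php
// Hypothetical weak generator: its only entropy is a 16-bit value, like a PID.
function weakToken(int $pid): string
{
    // Hashing changes how the output looks, not how many outputs exist.
    return hash('sha256', 'some-fixed-munging:' . $pid);
}

$secretPid   = random_int(0, 65535);
$victimToken = weakToken($secretPid);

// The attacker simply tries every possible input.
$recovered = null;
for ($pid = 0; $pid <= 65535; $pid++) {
    if (hash_equals($victimToken, weakToken($pid))) {
        $recovered = $pid;
        break;
    }
}

var_dump($recovered === $secretPid); // bool(true)
```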

sneakyimp · Apr 2, 2021

Weedpacket The tokens are only good for a few hours anyway, and if an attacker actually does break one, what would actually be gained? The ability to guess the nonces used in subsequent reset requests. If you're able to generate a password reset request and get the RNG into a predictable state, then try and log in as the user you're targeting and make a password reset request there. You won't get the reset email because you don't have the target's email account, but you can predict the nonce they got. So you can now change the password and log in as the target.

Thanks for this response. It's pretty knotty conceptually so I hope you'll tolerate me responding to help me understand the broader security concepts.

The site I'm working on is pretty low-risk -- nothing very vital is permanently stored -- but financial transactions are processed by this machine. The main risk I fear isn't so much that a password reset token would be 'broken' -- it contains no cryptographic data. I'm perhaps not even concerned that a token might be guessed by an attacker. Taking control of someone's account probably won't gain you any sensitive information beyond email information or perhaps a physical address or some low-value personal data.

I think the risk I vaguely fear is that someone might create an account (or thousands of them) of their own, then make repeated password reset requests and inspect the tokens attached to each password reset link to get some intel on the RNG state of my server and somehow use this intel to crack the more vital encrypted traffic this machine has with a payment gateway. Or perhaps gain insight to let an attacker decrypt HTTPS communications of other visitors to my site which might contain payment or password credentials. Using random_bytes to generate the tokens and then just appending them to a url and mailing them out feels like lifting my kimono and showing the goods.

Weedpacket Hashing won't make a difference because if the original random number can be predicted, then its hash can be predicted as well.

Guessing the original random number will certainly allow us to predict a hash, but the converse is not true. Given the hash, you (theoretically) cannot predict the original bytes.

Can we truly dismiss the threat of exposing raw output of the machine's RNG? Or dismiss the possibility that sending a hash as token might actually provide protection? I'd point out that current best practices involve using password_hash to avoid storing passwords -- the word 'hash' suggesting that the original value cannot be reconstructed from the dog's breakfast left after the original value is algorithmically destroyed. Could a hash conceal vital detail of the random_bytes fed to the hash function, thereby acting as a prophylactic for my RNG against the attacker? Does a hash reduce the entropy 'space' of the token? Just appending my RNG output directly to a url and emailing it over and over to any attacker feels a bit like raw-doggin' it with a cavalcade of dubious sex workers.

I do, in fact, limit the number of password reset attempts. If there are N existing password reset tokens for a particular account, the attacker must wait until they all expire before creating any more. This does not stop an attacker from creating dozens/scores/hundreds/thousands of accounts. I do use CAPTCHA in the password reset page as well, to try and exclude automated attacks.

Weedpacket The Debian vulnerability was different: "...there are only 65.536 possible ssh keys generated, cause the only entropy is the pid of the process generating the key."

WTF? I've made some amateur coding mistakes before, but that sounds bad.

Weedpacket · Apr 2, 2021

sneakyimp Could a hash conceal vital detail of the random_bytes fed to the hash function, thereby acting as a prophylactic for my RNG against the attacker?

If you choose a good hash function it wouldn't hurt (assuming you generate at least as many random bytes as the hash length). But it's still getting a bit redundant because CSPRNGs are starting to be built that already use hash algorithms to massage their random bytes.
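As a minimal sketch of that, assuming SHA-256 as the "good hash function" (the choice of hash is an assumption here, not something specified above):

```php
<?php
$raw   = random_bytes(32);      // 32 bytes = 256 bits, the same as SHA-256's output length
$token = hash('sha256', $raw);  // 64 hex characters; the raw bytes never leave the server

echo strlen($token), "\n";      // 64
```

Feeding in fewer random bytes than the hash length would produce a token that looks just as long but carries less entropy.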

You might want to look at the NIST recommendation on the subject (specifically the Hash_DRBG, for generating random bits from hash functions). (More broadly: Computer Security Resource Center Topics: random number generation.) Oh, hey, you could just get the randomness from NIST themselves (the big red warning doesn't apply here: you're not generating secret keys).

sneakyimp WTF? I've made some amateur coding mistakes before, but that sounds bad.

Yeah: at some point you're going to have to trust that whoever built your infrastructure knew what they were doing, or at least knew enough to do a better job of it than you, and be prepared to make repairs yourself should something like the above happen. Unless you're prepared to recapitulate the last couple hundred years of IT software and hardware (because how are you going to make the processor chips before you've built the manufacturing facilities?).

OliviaParcker · Apr 15, 2021

I am just starting to write code, no more than 75 characters with this long token. Can you suggest something about this

Steve_R_Jones · Apr 15, 2021

OliviaParcker - start your own thread. Hijacking other people's thread is rude.

OliviaParcker · Apr 19, 2021

yes, but maybe it will help me here. mk here are people who know this area ((
