I wrote some code in which email addresses are base64-encoded before being sent in a query string, like this:

$email = 'john@apple.com';
$email = base64_encode($email);
$link = '<a href="mypage.php?e='.$email.'">link</a>';

That seemed fine to me, until I read, in the manual's base64_encode entry, a comment that said, "You'll want to call urlencode on the base_64 encoded data before putting it into a GET. IIUC, base 64 output includes the plus and the slash, both of which will be mungered by browsers."

Now, the query strings that get generated by my code above never seem to create a "bad" query string: they get through to the processing page (and get processed) fine every time. I've tried it with scores of random email addresses, and they always get through, and I never see a "+" or a "/" in the encoded string.

Is this just chance, in which case I should still modify the code somehow? Or does it show that these characters are not in fact generated, and that I have nothing to worry about?

Furthermore, urlencoding the string does cause problems.

Hope this makes sense. Thanks!

    Is this just chance, in which case I should still modify the code somehow? Or does it show that these characters are not in fact generated, and that I have nothing to worry about?

    According to RFC 2045, '+' and '/' are valid characters in a base64-encoded string, so it is just chance that you haven't seen them, unless a modified version of base64 encoding is actually being used. You should therefore urlencode the string before placing it into a query string.
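
    To illustrate why scores of test addresses can pass while the advice still holds, here is a minimal sketch. The inputs 'ab~' and 'ab?' are made up for illustration and are not from this thread. For plain ASCII input, base64 only emits "+" when the third byte of a three-byte group is ">" or "~", and only emits "/" when it is "?" (or DEL), so ordinary-looking addresses rarely trigger either, even though "~" and "?" are permitted in the local part of an address. parse_str is used as a stand-in for the way PHP fills $_GET.

    ```php
    <?php
    // Illustrative inputs only; 'ab~' and 'ab?' are assumptions, not real data.
    $plain    = base64_encode('john@apple.com'); // no "+" or "/" in the result
    $withPlus = base64_encode('ab~');            // "YWJ+"
    $withSlash = base64_encode('ab?');           // "YWI/"

    // Without urlencode, the query parser treats "+" as an encoded space.
    parse_str('e=' . $withPlus, $raw);
    // With urlencode, "+" travels as %2B and comes back intact.
    parse_str('e=' . urlencode($withPlus), $safe);

    var_dump($raw['e'] === $withPlus);  // bool(false)
    var_dump($safe['e'] === $withPlus); // bool(true)
    ```

    In the unencoded case the value arrives as "YWJ " (with a trailing space), so decoding it no longer gives back the original string.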

    Furthermore, urlencoding the string does cause problems.

    What kind of problems?

      OK, I'd better modify my code. The problems I was having with urlencoding must be down to my implementation. Thanks!
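
      The thread doesn't say what went wrong with urlencode. One common cause, offered here only as a guess, is decoding twice: PHP has already URL-decoded the values in $_GET, so calling urldecode on them again turns a literal "+" into a space. A minimal round-trip sketch, assuming a made-up address and using parse_str to stand in for $_GET:

      ```php
      <?php
      $email = 'ab~@example.com'; // hypothetical address whose base64 form contains "+"
      $token = base64_encode($email);
      $link  = '<a href="mypage.php?e=' . urlencode($token) . '">link</a>';

      // Simulate the receiving page: parse_str decodes the value once, as PHP does for $_GET.
      parse_str('e=' . urlencode($token), $get);

      $ok  = base64_decode($get['e']);            // decoded once: matches $email
      $bad = base64_decode(urldecode($get['e'])); // decoded twice: "+" became a space

      var_dump($ok === $email);  // bool(true)
      var_dump($bad === $email); // bool(false)
      ```

      So the encoding side needs urlencode, while the receiving side should use the $_GET value as it is.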

        The conventional way (RFC 3548, since superseded by RFC 4648, which keeps the same alphabet) of making a base64-encoded string safe for URLs is to use a slightly modified base64 alphabet that uses "-" instead of "+" and "_" instead of "/".

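        // Encode to base64, then swap the two URL-unsafe characters: "+" becomes "-", "/" becomes "_".
        function urlsafe_base64_encode($string)
        {
            return strtr(base64_encode($string), "+/", "-_");
        }

        // Reverse the character swap, then decode as ordinary base64.
        function urlsafe_base64_decode($string)
        {
            return base64_decode(strtr($string, "-_", "+/"));
        }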

          Don't base64 encodings often end in one or two "=" signs? If so, they'd probably need to be dealt with, too.

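            True, although "=" is only treated as a delimiter when it is the first one after a "&" (or after the start of the query string), so trailing padding is usually harmless. In any case, base64_decode won't mind if you leave the padding off completely.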
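
            As a sketch of the padding point, the "=" characters can be stripped when encoding and restored when decoding. The function names below are hypothetical variants of the helpers earlier in the thread, and restoring the padding is a precaution so that strict decoding also accepts the input, not something base64_decode is shown to require.

            ```php
            <?php
            // Hypothetical helper: URL-safe alphabet with the "=" padding removed.
            function urlsafe_b64_encode_nopad($string)
            {
                return rtrim(strtr(base64_encode($string), '+/', '-_'), '=');
            }

            // Hypothetical helper: undo the alphabet swap and re-add the padding.
            function urlsafe_b64_decode_nopad($string)
            {
                $b64 = strtr($string, '-_', '+/');
                // Pad the length back up to a multiple of four.
                $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
                return base64_decode($b64, true);
            }

            $token = urlsafe_b64_encode_nopad('john@apple.com');
            var_dump(urlsafe_b64_decode_nopad($token) === 'john@apple.com'); // bool(true)
            ```

            The resulting token contains only letters, digits, "-" and "_", so it needs no further escaping in a query string.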

            The additional characters (beyond English letters and digits) allowed in the query part of a URL under RFC 3986 are "-._~!$&'()*+,;=:@/?", not counting %-escape sequences. I suppose some query string parsers might be confused by "a=b&c=d=", but PHP's isn't one of them.
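
            A quick way to check the claim about PHP's parser is parse_str, which, as far as I know, applies the same splitting rules that populate $_GET. This is only a sketch using the example string from the post above:

            ```php
            <?php
            // Each pair is split on its first "=", so the trailing "=" stays in the value.
            parse_str('a=b&c=d=', $params);
            var_dump($params['a']); // string(1) "b"
            var_dump($params['c']); // string(2) "d="
            ```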
