Hi, I just finished reading the tutorial on regex (http://www.phpbuilder.com/columns/dario19990616.php3?page=4). I don't understand Dario's last example:

For instance, say we want to get the filename from a path/URL string -- this code would be all we need:

ereg("([^\\/]*)$", $pathOrUrl, $regs);
echo $regs[1];

I am not sure why he has listed it with two backslashes in it ("[^\\/]"). I thought the backslash didn't need escaping when it's within brackets. Or is he escaping the forward slash? I thought the forward slash doesn't need escaping.

Can anyone explain this one?

Thanks,
Misha

    Use a print statement to see how PHP's double quotes are interpreting your regex. That interpretation happens before the regex evaluation. It's always confusing. Here's the debug version of your code:

    $regex= "([^\\/]*)$";
    print "<pre>regex=[$regex]</pre>\n";
    ereg($regex, $pathOrUrl, $regs);
    echo $regs[1];
    

      Hmm,

      I have no idea what you're saying really 🙂.

      I am not quite sure how it relates to the questions I asked. Will I understand my question and the answer better if I put that code in a page and test it?

      Can you elaborate on what you wrote?

        If you want to know what is in the regex, run the code I wrote. The double quotes change \\ to \. So the regex just has \.
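
        To make that concrete, here is a minimal sketch (my own illustration, not from the tutorial) that compares the double-quoted pattern with the same string written in single quotes, where a lone backslash before / is left alone. It does not call ereg(), so it should run on current PHP versions too:

        ```php
        <?php
        // Double-quoted: PHP collapses \\ into one backslash before any regex function sees the string.
        $regex = "([^\\/]*)$";

        // The same final string in single quotes, with one backslash typed.
        $expected = '([^\/]*)$';

        var_dump($regex === $expected); // bool(true)
        echo strlen($regex), "\n";      // 9
        ```

        The length of 9 confirms that only one backslash is left in the pattern.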

          Thanks for that explanation; that's what I thought you meant.

          However when I run the code with only one backslash I get the same regex expression. So, the escape is unnecessary?

          Misha

            Yes, you're right. The author was being extra careful. You can use a single \ depending on what the next character is.
            "\\/" == "\/"
            but
            "\\n" != "\n"
            and
            "\\$x" != "\$x"
            Sometimes to get a literal string, people just escape all \, $, ". That takes care of it. Around actual $variables you want expanded, you have to look out for {, }, [, ], $, and probably a few other symbols.

            The rules are here:
            http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.double
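
            A small sketch of those three comparisons, in case you want to run them. The variable $x and its value are placeholders for illustration only:

            ```php
            <?php
            // \/ is not a recognised escape, so the backslash survives with or without doubling.
            $slashSame = ("\\/" === "\/");

            // \n is the newline escape; \\n is a backslash followed by the letter n.
            $newlineSame = ("\\n" === "\n");

            // \$ is a literal dollar sign; \\$x is a backslash followed by the value of $x.
            $x = "value"; // placeholder
            $dollarSame = ("\\$x" === "\$x");

            var_dump($slashSame);   // bool(true)
            var_dump($newlineSame); // bool(false)
            var_dump($dollarSame);  // bool(false)
            ```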

              Now I'm very confused. I think your method of debugging the regex is incorrect somehow.

              The author (Dario) says, "Just don't forget that bracket expressions are an exception to that rule--inside them, all special characters, including the backslash ('\'), lose their special powers (i.e., "[*+?{}.]" matches exactly any of the characters inside the brackets)."

              Is he wrong?

              This would mean that the regex:

              [\$x]

              should return characters that are "\", "$", and "x". Right?

              In your debug, however it returns something different; the "\" is working as an escape.

              Is that because it's the first character in the regex?

              Misha

                Well, things happen in a different order than you're thinking. Before ereg() ever sees the pattern, it gets evaluated as a double-quoted string.

                So you see
                "([^\\/]*)$"
                in your code, but ereg() sees
                ([^\/]*)$

                (This is different from Perl, by the way. Perl does it the way you're thinking.)

                That's why it's useful to save the pattern in a variable, so you can debug using a print statement.

                If you are going to be learning more about regexes, you should switch from "ereg" stuff to "preg" functions. See: http://us3.php.net/manual/en/ref.pcre.php for the advantages. Another advantage is more people use them, so there is more help available. Also... it says "perl-compatible" but the quoting still works the same. Because this is PHP. It just means the regexes are more standard.
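
                A rough sketch of the same filename extraction with preg_match(), assuming a made-up URL and using # as the pattern delimiter so the forward slash needs no escaping. The single quotes skip the double-quote pass entirely. Note that [^/] only excludes forward slashes, so treat this as an illustration rather than an exact translation of the tutorial's pattern:

                ```php
                <?php
                // Hypothetical input; any path or URL string would do.
                $pathOrUrl = 'http://www.example.com/docs/manual.html';

                // Single-quoted pattern: what you type is what the regex engine receives.
                $pattern = '#([^/]*)$#';

                if (preg_match($pattern, $pathOrUrl, $regs)) {
                    echo $regs[1], "\n"; // manual.html
                }
                ```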

                  Thanks for responding kitchin.

                  I have to say, I'm not understanding your statements. I don't see direct answers to my questions.

                  My question, at the moment, is the author of that article says clearly, "Characters in brackets lose their special powers." This means to me that:

                  [\$s] == the characters \, $, and s.

                  Is the author wrong about this?

                  In what cases are there exceptions? In what case, essentially, does the backslash function as an escape, within a bracket?

                  Misha

                    You think the author's regex is

                    ([^\\/]*)$

                    It is not.

                    Because of the quotes. Like I said, if you want to see the regex, print it out.
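
                    On the bracket question: as far as I know, the tutorial's statement describes POSIX bracket expressions, which is the syntax ereg() uses. The preg functions behave differently, and there a backslash inside brackets still acts as an escape. A minimal sketch of the preg side only, with a pattern picked purely for illustration:

                    ```php
                    <?php
                    // Single quotes: the engine receives [\$x] exactly as typed.
                    $class = '/[\$x]/';

                    var_dump(preg_match($class, '$'));  // int(1): the dollar sign is in the class
                    var_dump(preg_match($class, 'x'));  // int(1)
                    var_dump(preg_match($class, '\\')); // int(0): the backslash only escaped the $
                    ```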
