johanafm;10935217 wrote: Thanks. I had no idea that was acceptable. I would rather have expected "(pattern(" to be correct, and misinterpreted it as the whole pattern being a capturing subpattern.
Yeah, it's admittedly an oddball. But the PCRE introduction section of the manual does confirm this:
excerpt:
The expression must be enclosed in the delimiters, a forward slash (/), for example. Delimiters can be any non-alphanumeric, non-whitespace ASCII character except the backslash (\) and the null byte. If the delimiter character has to be used in the expression itself, it needs to be escaped by backslash. Since PHP 4.0.4, you can also use Perl-style (), {}, [], and <> matching delimiters.
However, there is a catch in the manual's explanation: it states that if a delimiter character is used within the pattern, it must be escaped. Generally this is true, but not in absolute terms (it's contextual, more on this below).
I found out that if you choose a delimiter pair like < and >, and you want to match a literal < and > within the pattern, they don't require escaping as long as the opening and closing characters inside the pattern are balanced (oddly enough):
Example 1:
$str = 'Some text <35723a7c4b> more text!';
preg_match('<Some text <[a-c0-9]+>>', $str, $match); // no need to escape inner < and > characters
echo $match[0]; // Output: Some text <35723a7c4b>
Example 2:
$str = 'Some text <35723a7c4b> more text <4234654656>!';
preg_match('<Some text <[a-c0-9]+> more text <[0-9]+>>', $str, $match); // still don't need to escape multiple inner < and > matching characters
echo $match[0]; // Output: Some text <35723a7c4b> more text <4234654656>
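The flip side, as far as I can tell: if the inner < and > are not balanced, the pattern no longer compiles unless you escape the stray one. Here is a quick sketch of that (the test string and variable names are made up for illustration, and the @ is only there to silence the warning PHP raises for the broken pattern):

```php
<?php
$str = 'Check that a < b holds';

// Unbalanced: the inner < opens a second nesting level that is never closed,
// so PHP cannot find the ending delimiter and preg_match() returns false.
$unbalanced = @preg_match('<a < b>', $str, $m1);
var_dump($unbalanced); // bool(false)

// Escaping the stray < makes it a literal, so the final > closes the pattern.
$escaped = preg_match('<a \< b>', $str, $m2);
echo $m2[0], "\n"; // Output: a < b
```

So the "no escaping needed" behaviour really does hinge on the inner characters pairing up.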
Finally, when the delimiters are characters like (..) or [..] and we want the regex engine to treat those same characters inside the pattern as actual groups or character classes, we leave them unescaped:
Example:
$str = 'Some text 35723a7c4b more text!';
preg_match('[Some text ([a-c0-9]+)]', $str, $match); // chose [ and ] as delimiters, yet character class still parses correctly
echo $match[1]; // Output: 35723a7c4b
But yeah, on the whole, I certainly wouldn't recommend using these oddball delimiters; they will mostly confuse the uninitiated. I would stick to delimiters like !...!, ~...~, #...#, etc. In those cases (when the delimiters are not a matching opening / closing pair), the delimiter character does need to be escaped within the pattern.
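For completeness, a small sketch of that last point using # as the delimiter. The sample strings and variable names are my own; preg_quote() with the delimiter as its second argument is the stock way to escape text you don't control:

```php
<?php
$str = 'See issue #42 for details';

// # is the delimiter here, so a literal # inside the pattern must be escaped.
preg_match('#issue \#([0-9]+)#', $str, $match);
echo $match[1], "\n"; // Output: 42

// For text that isn't known in advance, preg_quote() with the delimiter
// as its second argument does the escaping for you.
$needle = 'a/b';
$pattern = '/' . preg_quote($needle, '/') . '/';
preg_match($pattern, 'path a/b here', $match2);
echo $match2[0], "\n"; // Output: a/b
```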