PCRE pattern syntax
PCRE pattern modifiers (after the closing delimiter)
By using a different delimiter ("#" instead of "/"), so the slash in the closing tag needs no escaping, and dropping the first set of square brackets, which were redundant, we can clean that pattern up a bit:
preg_match_all("#(<(\w+)[^>]*>)(.*?)(</\\2>)#", $html, $matches, PREG_SET_ORDER);
# : opening regexp delimiter
( : start sub-pattern #1
< : literal "<"
( : start sub-pattern #2
\w : any "word" character (letter, number, or underscore)
+ : preceding character appears 1 or more times
) : end sub-pattern #2
[^>] : any character that is not a ">"
* : preceding character class appears 0 or more times
> : literal ">"
) : end sub-pattern #1
( : start sub-pattern #3
. : any character except a newline (any character at all if the "s" modifier is set)
* : preceding character appears 0 or more times
? : preceding repetition modifier's greediness is toggled (to "ungreedy" in this case)
) : end sub-pattern #3
( : start sub-pattern #4
</ : literal "</"
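\\2 : back-reference to whatever was matched in sub-pattern #2 (doubled backslash because the pattern sits in a double-quoted PHP string, where "\2" alone would be read as an octal escape; the regex engine sees \2)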
> : literal ">"
) : end sub-pattern #4
# : closing regexp delimiter
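To see what those sub-patterns actually capture, here is a minimal sketch that runs the pattern above on a small sample. The $html string is made up for illustration; it is not from the original note.

```php
<?php
// Hypothetical sample input, only for illustration.
$html = '<p class="intro">Hello</p><em>world</em>';

preg_match_all("#(<(\w+)[^>]*>)(.*?)(</\\2>)#", $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] = whole match, $m[1] = opening tag, $m[2] = tag name,
    // $m[3] = content between the tags, $m[4] = closing tag
    printf("%s | %s | %s | %s\n", $m[1], $m[2], $m[3], $m[4]);
}
// Prints:
// <p class="intro"> | p | Hello | </p>
// <em> | em | world | </em>
```

With PREG_SET_ORDER each element of $matches is one complete match, so the indexes line up with the sub-pattern numbers in the breakdown.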
Clear as mud? 😉
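If it is still muddy, one option is to keep the explanation inside the pattern itself with the "x" modifier, which ignores whitespace and treats "#" as the start of a comment. The sketch below is the same pattern laid out that way; the "~" delimiter and the sample $html are my own choices for the example, not part of the original pattern.

```php
<?php
// "~" is used as the delimiter here because, under the "x" modifier,
// "#" starts a comment that runs to the end of the line.
$pattern = '~
    ( < (\w+) [^>]* > )   # sub-pattern 1: opening tag, sub-pattern 2: tag name
    ( .*? )               # sub-pattern 3: content, ungreedy
    ( </ \2 > )           # sub-pattern 4: closing tag for the name in sub-pattern 2
~x';

// Hypothetical sample input, only for illustration.
$html = '<b>bold</b> and <i>italic</i>';
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m[2], ' => ', $m[3], "\n";
}
// Prints:
// b => bold
// i => italic
```

The pattern is in a single-quoted string here, so a single backslash in \2 is enough.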