I'm trying to make a filter that, among other things, turns unclosed XHTML tags into their htmlspecialchars() equivalents.
I made a recursive function to try to handle this. The board will probably mangle this regexp, but I'll try to post it anyway.
function DualTags($string) {
// call htmlspecialchars($string) before calling this
// turn good tags back to normal
return preg_replace(
'/<(\\w+?)((?:\\s+\\w+=".*?")*?)>(.*?)<\\/\\1>/ise',
"'<\$1' . str_replace('\\\"', '\"', '\$2') . '>'
. DualTags(str_replace('\\\"', '\"', '\$3')) . '</\$1>'",
$string);
}
It works fine, but nested tags of the same type don't work. I can have
<b><i></i></b>
or
<table> </table> <table> </table>
but
<table> <table> </table> </table>
turns into
<table> &lt;table&gt; </table> &lt;/table&gt;
Any ideas? I have a similar problem when trying to support custom markup like [font][font][/font][/font].
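One direction that might get around the pairing problem is to stop matching open/close pairs with a single regexp and instead walk the tags with a stack. The following is only a rough sketch under some assumptions: dualTagsStack() is a hypothetical name, it expects the already-escaped string (entities like &lt;table&gt;), and it ignores attributes entirely, unlike DualTags() above.

```php
<?php
// Hypothetical helper, not the original DualTags(): restores properly paired
// tags in an already-escaped string and leaves unpaired ones escaped.
// Assumption: tags carry no attributes (attribute handling is omitted).
function dualTagsStack(string $escaped): string
{
    // Find every escaped opening or closing tag, together with its byte offset.
    preg_match_all('/&lt;(\/?)(\w+)&gt;/i', $escaped, $tokens,
        PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $stack = [];  // indexes into $tokens of opening tags still waiting for a close
    $good  = [];  // token index => true for tags that belong to a matched pair

    foreach ($tokens as $i => $tok) {
        $isClose = $tok[1][0] === '/';
        $name    = strtolower($tok[2][0]);

        if (!$isClose) {
            $stack[] = $i;
            continue;
        }
        // Look down the stack for the nearest open tag with the same name.
        for ($s = count($stack) - 1; $s >= 0; $s--) {
            if (strtolower($tokens[$stack[$s]][2][0]) === $name) {
                $good[$stack[$s]] = true;
                $good[$i] = true;
                // Anything opened after it was never closed; drop it unrestored.
                $stack = array_slice($stack, 0, $s);
                break;
            }
        }
    }

    // Rebuild the string, un-escaping only the tags marked as good.
    $out = '';
    $pos = 0;
    foreach ($tokens as $i => $tok) {
        [$text, $offset] = $tok[0];
        $out .= substr($escaped, $pos, $offset - $pos);
        $out .= isset($good[$i]) ? '<' . $tok[1][0] . $tok[2][0] . '>' : $text;
        $pos = $offset + strlen($text);
    }
    return $out . substr($escaped, $pos);
}
```

Called as dualTagsStack(htmlspecialchars($input)), this should restore both levels of a nested <table> pair while leaving a stray opening or closing tag escaped. The same walk would presumably work for [font][/font] style markup by swapping the token pattern, though I haven't tried that.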