Hi everyone,
I've got a little problem I can't resolve myself and would like to ask for some help. I use regular expressions to remove some dangerous tags from HTML-formatted email, and I also remove everything before the <BODY..> tag and after </body>. The regular expression that removes everything from <HTML> to <BODY...> looks like:
eregi_replace("^(.|\s)*<body[^>]*>", "", $htmlstring);
and it works well with regular HTML-formatted emails. My problems begin when a user receives an email with several HTML documents in the body of the message (e.g. the best company in the world, Microsoft, sends newsletters in HTML format with several instances of HTML in the body). The raw body looks like:
<html>....<body>...</body></html><html>...<body..>...</body></html>
Of course, my regular expression returns just the last part of this e-mail.
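To reproduce this, here is a minimal sketch. It assumes preg_replace() as a stand-in for my eregi_replace() call (the ereg functions are not available in current PHP versions), and the sample string is made up for illustration:

```php
<?php
// Made-up sample: two complete HTML documents in one message body.
$htmlstring = '<html><head></head><body bgcolor="#fff">first part</body></html>'
            . '<html><head></head><body>second part</body></html>';

// Same idea as the eregi_replace() call above: "i" makes the match
// case-insensitive and "s" lets "." match newlines as well.
// The greedy ".*" runs to the LAST <body...> tag in the string.
$greedy = preg_replace('/^.*<body[^>]*>/is', '', $htmlstring);

echo $greedy, "\n"; // prints: second part</body></html>
```

The first document is swallowed completely because the greedy quantifier backs off only as far as the last <body...> tag.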
So the question is: what regular expression will strip every instance of <html>...<body...> (with any tags in between, but no other HTML or BODY tag) from the given HTML?
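To make the result I'm after concrete, here is a rough sketch of the behaviour. It assumes preg_replace() with a lazy quantifier instead of eregi_replace(), the sample string is made up, and I'm not sure it is robust (for example, a part with <html> but no <body> tag would run on into the next part):

```php
<?php
// Made-up sample: two complete HTML documents in one message body.
$htmlstring = '<html><head></head><body bgcolor="#fff">first part</body></html>'
            . '<html><head></head><body>second part</body></html>';

// No "^" anchor, so every <html ...> start is considered.
// The lazy ".*?" stops at the FIRST <body...> tag that follows each <html>.
$stripped = preg_replace('/<html\b.*?<body[^>]*>/is', '', $htmlstring);

echo $stripped, "\n"; // prints: first part</body></html>second part</body></html>
```

Both body contents survive here; the closing </body></html> tags would still need the separate clean-up step.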
Thanks.