Thanks for the help guys, much appreciated as always. Last question:
preg_replace('/<tag>[^£]+<\/tag>/', 'removed', $text);
This basically replaces any <tag>...</tag> block (the tags included) whose contents don't contain a £ with "removed".
The problem is that I actually want to pull the matches out of the text, i.e. remove everything else in $text except for the matches (the tags and whatever is between them). Is there some way to assign the matched strings to a variable, or a syntax that says "everything except for <tag>[^£]+<\/tag>"?
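
For reference, here is a rough sketch of the kind of thing I'm after, using preg_match_all to collect the matches into a variable. The sample $text, the $pattern, $kept and $stripped names are made up for illustration. I've also assumed a lazy quantifier (+?) so that each <tag>...</tag> pair is matched separately rather than everything from the first <tag> to the last </tag>, and the /u flag so that £ is treated as a single UTF-8 character (this assumes the script and the input are both UTF-8).

```php
<?php
// Hypothetical sample input; only the blocks without a £ should match.
$text = 'foo <tag>abc</tag> bar <tag>£5</tag> baz <tag>12 34</tag> end';

// Lazy +? keeps each <tag>...</tag> pair as its own match (an assumption,
// the original pattern used a greedy +). /u enables UTF-8 mode.
$pattern = '/<tag>[^£]+?<\/tag>/u';

// preg_match_all fills $matches[0] with every full match, tags included.
preg_match_all($pattern, $text, $matches);

// Keep only the matches, dropping everything else in $text.
$kept = implode('', $matches[0]);
// $kept is '<tag>abc</tag><tag>12 34</tag>'

// The opposite: strip the matches and keep everything else.
$stripped = preg_replace($pattern, '', $text);
// $stripped is 'foo  bar <tag>£5</tag> baz  end'
```

$matches[0] is a plain array, so the individual matches can also be looped over instead of being joined back together.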