There are times when it's faster to bypass a DOM tree and work right on memdump-ed XML as a string (with line feeds and tabs floating around in there).
Regular expressions seem to be the preferred way to manipulate text, now that PHP speaks regex. (My first impulse is to trot out good old substr(), but I'm willing to suppress the urge if someone can show me a better way.)
I really like strstr().
I might do this:
$result_str = strstr($result_str, '<?');
This might discard all characters of an HTTP header, prior to the beginning of the XML it contains. It's old school and I don't have to think very hard.
Now I want to discard any characters, white space and line feeds past the last XML element.
What I want is an inverse strstr(), (rtsrts() anyone?), but I reach, instead, for preg_replace().
OK:
$result_str = preg_replace('/<\/LAST_TAG>\s\d\s/', '</LAST_TAG>', $result_str);
fine.
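A more general version of that trim might look like the sketch below. It assumes the closing tag appears only once in the buffer, and trim_after_tag() is a made-up helper name, not a PHP built-in. The "s" modifier is what lets "." run across line feeds.

```php
<?php
// Hypothetical helper (not a built-in): keep everything up to and including
// the first occurrence of $closing_tag and discard whatever follows it.
function trim_after_tag($xml_str, $closing_tag)
{
    // preg_quote() escapes regex metacharacters in the tag, plus the "/" delimiter.
    $pattern = '/(' . preg_quote($closing_tag, '/') . ').*\z/s';
    // The "s" modifier lets "." match line feeds; $1 puts the tag back.
    return preg_replace($pattern, '$1', $xml_str);
}

// Made-up sample buffer with trailing junk after the last element.
$result_str = "<LAST_TAG>\n\t<ITEM>1</ITEM>\n</LAST_TAG>\r\n0\r\n\r\n";
$result_str = trim_after_tag($result_str, '</LAST_TAG>');
```

If the tag can occur more than once, strrpos() plus substr() is probably the safer way to find the last one.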
But what if I want to cut and paste XML to create a new XML doc.
Yes, I know, use xmldoc(), navigate the DOM tree, use the XPath part of XSLT to get what I need out of it, and make a new tree.
This is supposed to be a short-cut that doesn't soak up processing cycles, however.
I need to say something like:
$question_str = preg_replace('/<SOME_ELEMENT>\S\s/', '', $question_str);
But this only removes <SOME_ELEMENT> plus the single non-whitespace character and single whitespace character that follow it.
Now I either build a loop or use something that will later appear in the next book of anti-examples:
$question_str = preg_replace('/<SOME_ELEMENT>\S\s\S\s\S\s\S\s/', '', $question_str);
Very funny. It's an honest question.
We don't know how many characters we're discarding, of course. Obviously we can count with strpos() and then use substr(). But we're trying to use regular expressions, so that our code is more or less generic.
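For comparison, the counting approach might be sketched like this. cut_from_marker() is a hypothetical helper name and the sample string is invented.

```php
<?php
// Hypothetical helper (not a built-in): return the part of $xml_str that
// precedes $marker, or the whole string if $marker is absent.
function cut_from_marker($xml_str, $marker)
{
    $pos = strpos($xml_str, $marker);
    // strpos() returns false when nothing is found, and 0 is a valid
    // position, so the comparison has to be strict.
    if ($pos === false) {
        return $xml_str;
    }
    return substr($xml_str, 0, $pos);
}

// Made-up sample: everything from <SOME_ELEMENT> onward is dropped.
$question_str = cut_from_marker("<Q>\n<SOME_ELEMENT>a\nb\n</Q>", '<SOME_ELEMENT>');
```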
There are no useful single-character delimiters in XML (tags are meaningful, but chunky); if we split() on... what? Line feeds? Then we get a huge array that doesn't really get us any closer to our goal.
Are there any Perl converts to PHP out there who can show me the True regex Way to discard everything past a known string in the middle of an XML file/buffer/string?
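One possible shape for an answer, offered as a sketch rather than the True Way: a ".*" under the "s" modifier swallows an unknown number of characters, line feeds included, and a lazy ".*?" stops at the first closing tag. Both helper names below are invented, and remove_element() assumes the element has no attributes and is not nested inside an element of the same name.

```php
<?php
// Hypothetical helper: discard $marker and everything after it.
function discard_from($xml_str, $marker)
{
    // "s" makes "." match line feeds, so ".*" runs to the end of the buffer.
    return preg_replace('/' . preg_quote($marker, '/') . '.*/s', '', $xml_str);
}

// Hypothetical helper: cut out one <NAME>...</NAME> element and the
// whitespace that trails it.
function remove_element($xml_str, $name)
{
    $n = preg_quote($name, '/');
    // ".*?" is lazy: it stops at the first closing tag rather than the last.
    return preg_replace('/<' . $n . '>.*?<\/' . $n . '>\s*/s', '', $xml_str);
}

// Made-up sample input.
$question_str = "<Q>\n\t<SOME_ELEMENT>\n\t\tfoo\n\t</SOME_ELEMENT>\n\t<KEEP>bar</KEEP>\n</Q>";
$without_element = remove_element($question_str, 'SOME_ELEMENT');
$head_only = discard_from($question_str, '<SOME_ELEMENT>');
```

Neither pattern understands XML structure; they only match text, so anything beyond simple, predictable markup is still a job for the DOM.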