Using Regular Expressions to delete text

GilesGuthrie

I'm sorry to drag this up, but I did a search and got just as confused as when I read the documentation...

I am hauling text out of a database and formatting it for output to screen. I read Ying Zhang's excellent article which told me how to use Regular Expressions and ereg_replace(), and I successfully implemented that, but have run into trouble again.

My script allows one user to quote what another has written. When a post that contains a quote is itself quoted, I want the nested quote to be removed.

Therefore I retrieve the post to be quoted, and want to search it for [quote]any characters[/quote], and remove any instances.

So, the code fragment that I wrote goes like this:

$quote = ereg_replace('\\[quote\\]([[:print:]]+)\\[/quote\\]', '', $quotetext);

Which gives the RegExp: \[quote\]([[:print:]]+)\[/quote\]

My understanding is that the square bracket is a special character, and thus should be escaped with a backslash. So I start with the escaped opening tag that marks the start of the string to match. Then I want to match one or more printable characters (the ([[:print:]]+) part), and finally find the corresponding end tag of the string.

What happens is that the piece I'm expecting it to remove is not removed. I'm concerned that the [[:print:]]+ segment will consider the [/quote] segment as a printable character and thus not find the end of the string. Is this sensible? Would I therefore want to write it the other way, as in "match everything that is not the [/quote] tag"? And if so, how would I write it?
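
For reference, here is a minimal sketch of what the "everything up to the closing tag" idea might look like. It assumes the PCRE function preg_replace() is available instead of ereg_replace(), and the sample text is made up for illustration. Two things are worth noting: the regex engine backtracks, so a greedy [[:print:]]+ can still find the closing tag on a single line, but [[:print:]] does not match newline characters, which is one likely reason a multi-line quote is left untouched.

```php
<?php
// Hypothetical sample post; $quotetext and $quote follow the fragment above.
$quotetext = "[quote]first\nquoted line[/quote]My reply. [quote]second[/quote]The end.";

// .*? is lazy, so each match stops at the nearest [/quote].
// The s modifier lets . match newlines; i makes the tags case-insensitive.
$quote = preg_replace('#\[quote\].*?\[/quote\]#is', '', $quotetext);

echo $quote, "\n"; // prints "My reply. The end."
```

Because the quantifier is lazy, two separate quotes in one post are removed individually rather than swallowing the text between them. Nested [quote] tags would need extra handling that this sketch does not attempt.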

All help would be much appreciated! 🙂