sneakyimp;10879815 wrote:You'd get double spaces: "I think erego I am" would return "I think  I am" (with two spaces)
Ah yes, I see what you mean. My bad.
Cheers,
NRG
This pattern's ugly, but does it in one shot - just an extra option that will suck the spaces before and after if it's at the end of the line. Note YourWord is what you'd put in with the preg_quote bit, aye?
/(\bYourWord\b\s?|\s+\bYourWord\b\s?$)/i
I'd like a nicer pattern without the or, but can't muster it at the mo.
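For anyone who wants to try it, here is a minimal sketch of that pattern in use. The helper name strip_word() and the sample strings are made up for illustration, and it assumes the word goes through preg_quote() with '/' as the delimiter:

```php
<?php
// Hypothetical helper (not from the thread): strips one whole word using the
// alternation pattern above.
function strip_word($word, $str) {
    $quoted = preg_quote($word, '/');
    // First branch: the word plus one optional trailing space.
    // Second branch: leading whitespace plus the word at the end of the string.
    $pattern = '/(\b' . $quoted . '\b\s?|\s+\b' . $quoted . '\b\s?$)/i';
    return preg_replace($pattern, '', $str);
}

echo "'" . strip_word('erego', 'I think erego I am') . "'\n";     // word in the middle
echo "'" . strip_word('erego', 'I think erego') . "'\n";          // word at the end
echo "'" . strip_word('erego', 'neweregoword stays put') . "'\n"; // embedded, left alone
```

The \b on each side is what keeps the embedded case untouched; the second branch only exists to take the leading space with it when the word ends the string.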
nrg_alpha;10879817 wrote:Hi Brad..
Maybe I'm misunderstanding something.. going by the OP's goal:
'I've been instructed to write a routine that strips certain words from a text string', would the str_ireplace routine not suffice?
Cheers,
NRG
The key word in the quote there is words.
The "clbuttic" mistake is that you decide to remove the vulgar word "ass" from a user's input, but instead of just stripping it out you decide to be humorous and replace it with the less offensive word "butt". So you simply do a str_ireplace() to replace "ass" with "butt".
Hence the word classic becomes "clbuttic." Search Google for this term, or any other word that has "ass" in it replaced with "butt" - it's a clbuttic, I mean classic, mistake.
(Thanks to Weedpacket in a previous thread for this example - it vaguely sounded familiar when he mentioned the word, but after a quick Google it all came back to me... very funny example of programming-gone-awry.)
bradgrafelman;10879822 wrote:The key word in the quote there is words.
The "clbuttic" mistake is that you decide to remove the vulgar word "ass" from a user's input, but instead of just stripping it out you decide to be humorous and replace it with the less offensive word "butt". So you simply do a str_ireplace() to replace "ass" with "butt".
Hence the word classic becomes "clbuttic." Search Google for this term, or any other word that has "ass" in it replaced with "butt" - it's a clbuttic, I mean classic, mistake.
(Thanks to Weedpacket in a previous thread for this example - it vaguely sounded familiar when he mentioned the word, but after a quick Google it all came back to me... very funny example of programming-gone-awry.)
Ohh.. I thought you were referring to Sneakyimp's problem in THIS thread.. sorry..
Cheers,
NRG
Sneakyimp, my solution does not create two spaces when I tested it.. the '' that does the replacing does not create an extra space.. it is 'nothing'.. when I try the following code, it works:
function remove_words($str) {
    $remove = array('SOMNAMBULIST', 'EREGO', 'HERETOFORE');
    $str = str_ireplace($remove, '', $str);
    return $str;
}
$test = 'I am SOMNAMBULIST therefore I am!';
echo $test . '<br />';
echo remove_words($test);
Cheers,
NRG
nrg_alpha;10879823 wrote:Ohh.. I thought you were referring to Sneakyimp's problem in THIS thread.. sorry..
I was/am - it's the same concept.
Brad, I don't think sneakyimp is trying to replace one word for another...if I understand correctly, he just wants to strip out specific words...
so this is not the same as in the link I initially posted (which yes, I agree with what you and Weedpacket are saying).
But in THIS case (this thread), I fail to see the usage of 'clbuttic' when the goal here is simply to remove words.. not replace them with other words.. Unless I am very seriously misunderstanding everything here..
Cheers,
NRG
nrg_alpha,
It's pretty simple.. you want to remove full words. Your solution will deform one word into another. For example:
remove_words('I think neweregoword I am');
Would yield "I think newword I am", deforming my word "neweregoword". Hence clbuttic = classic
m@tt;10879829 wrote:nrg_alpha,
It's pretty simple.. you want to remove full words. Your solution will deform one word into another.
In the test from the last code I posted, it deforms (replaces) 'one successfully found criteria' for another if that's what you mean.. in this case, anything matching one of the 'words' in the $remove array is replaced with ''.
Is this not in essence 'removing' a word? When I examine the string after it has been passed through the function, the word that is not supposed to be there isn't there.
m@tt;10879829 wrote:For example:
remove_words('I think neweregoword I am');
Would yield "I think newword I am", deforming my word "neweregoword". Hence clbuttic = classic
Perhaps I'm new to this clbuttic thing.. but doesn't the replacement word have to actually be a word instead of ''? Or do you mean that by doing any form of replacement, the 'context' of the string is altered which can result in a clbuttic situation?
Cheers,
NRG
The whole point we're trying to make in suggesting regular expressions over a simple str_ireplace() is that regular expressions can make the distinction between words and a string of characters within a word.
For example, using str_ireplace(), try to remove the offensive word "ass" from this string: Calling someone an ass can be classified as quite offensive. -- and then ask yourself what "clified" means.
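A quick side-by-side makes the difference easy to see. This is only a sketch: the variable names are made up, and the regex is the plain \b...\b form rather than any of the space-eating patterns posted in this thread:

```php
<?php
$text = 'Calling someone an ass can be classified as quite offensive.';

// Plain substring replacement: hits "ass" wherever it occurs, even inside other words.
$naive = str_ireplace('ass', '', $text);

// Word-boundary regex: only matches "ass" when it stands alone as a word.
$careful = preg_replace('/\bass\b/i', '', $text);

echo $naive . "\n";   // "classified" has been mangled into "clified"
echo $careful . "\n"; // "classified" is left intact
```

Both versions still leave a double space where the word used to be, which is what the \s? bits in the other patterns in this thread are there to clean up.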
Point taken. I now understand completely. Just by the example words the OP used, str_ireplace() worked perfectly. But given your latest example, it does not.
So I concede.. preg it is. (and now I realise fully the definition of clbuttic).
Sometimes the simplest examples hammer home the point the hardest.
Cheers,
NRG
$patterns[] = '/(^)?\s?\b'.preg_quote($rm, '/').'\b(?(1)\s?)/i';
Just for the fun of it. My previous pattern left spaces at the end of lines if the word occurred at the end of a line, which you could get rid of by modifying the pattern to b[/b] but the one shown above doesn't have the repeat. The only side effect of this new pattern is that it essentially does a left trim.
ok so let me see...
function remove_words($str) {
    $remove = array('SOMNAMBULIST', 'EREGO', 'HERETOFORE');
    $patterns = array();
    foreach($remove as $rm) {
        $quoted = preg_quote($rm, '/');
        $patterns[] = '/(\b' . $quoted . '\b\s?|\s?\b' . $quoted . '\b\s?(?=$|\n))/i';
    }
    return preg_replace($patterns, '', $str);
} // remove_words()
echo "'" . remove_words('I think erego I am') . "'\n"; // needs a space!
echo "'" . remove_words('Heretofore unknown') . "'\n"; // works great
echo "'" . remove_words('I think heretofore erego i am') . "'\n"; // works great
echo "'" . remove_words('I was satisfied heretofore') . "'\n"; // works great
I believe I've covered all the boundary conditions in the examples and it seems to work:
'I think I am'
'unknown'
'I think i am'
'I was satisfied'
I think you nailed it drakla. The expression itself is a bit scary to me (this is to be expected from the undead i suppose) so I think I'll stick with my previous function which I can grasp a little better.
The scariest parts are all those question marks. I'm not really sure what they do.
Thanks for the valiant effort guys!
I was thinking of putting an explanation of the pattern in for this one
(^)?\s?\bYourWord\b(?(1)\s?)
The basics of it are that you test if you're at the start of a line, and to do that you use the ^ and put it into brackets, which, as those who've done a bit of regex will know, also creates a capturing group with 1 as its id. But it must be optional, and that's what the question mark does.
(^)? check if we're at the start of the line [the ^], do a capture so we can check it later [the brackets], but make it optional [the question mark]
\s?\bYourWord\b is grab a space before the word if it's there [\s?], and the word itself
The last bit tests whether the optional start of line pattern actually did capture anything, and if so also says take whitespace from after the word
(?(1)\s?) means did 1 capture anything? If so optionally scoop up another space.
So the logic of the whole expression is always grab a space from in front of the word, but if you're at the start of a line then also grab a space after.
That's going to be unintelligible drivel, isn't it. This will probably help more:
http://www.regular-expressions.info/conditional.html
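If it helps, here is a small sketch of the conditional pattern in action. The helper name strip_word_conditional() and the test strings are invented for the example, and the word is assumed to be passed through preg_quote() with '/' as the delimiter:

```php
<?php
// Hypothetical helper (not from the thread): strips one whole word using the
// conditional pattern explained above.
function strip_word_conditional($word, $str) {
    $quoted = preg_quote($word, '/');
    // (^)?      optionally capture "start of string" as group 1
    // \s?       take one space in front of the word if there is one
    // (?(1)\s?) only if group 1 took part in the match, take one space after too
    $pattern = '/(^)?\s?\b' . $quoted . '\b(?(1)\s?)/i';
    return preg_replace($pattern, '', $str);
}

echo "'" . strip_word_conditional('erego', 'Erego I think') . "'\n";      // start: trailing space goes
echo "'" . strip_word_conditional('erego', 'I think erego I am') . "'\n"; // middle: leading space goes
echo "'" . strip_word_conditional('erego', 'I think erego') . "'\n";      // end: leading space goes
```

Without the m modifier the ^ only matches at the very start of the whole string, so for multi-line input the "start of a line" case would need that modifier added.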