I have a word in a variable $word. With preg_replace() I want to find each occurrence of $word and replace it with <u>$word</u>. I'm using \b$word\b to find the word, but the problem is that this also finds occurrences of the word inside <a> tags. For example, if $word is "john", then I get this:

$string = "<a href='http://www.john.com'>some link</a>";
$word = "john";
$pattern = "/\b$word\b/i";
$replacement = "<u>\\0</u>";
$newString = preg_replace ( $pattern, $replacement, $string );
echo $newString;

// output: <a href='http://www.<u>john</u>.com'>some link</a>

I want to replace $word in all occurrences except where it occurs inside an <a> tag. Does anybody know how to do this with regex?

    What about:

    ...
    $pattern = "/>.+\b$word\b.+</i";
    ...
    

      I'd guess this won't work. Consider what it would do with:

      <p>Text text text <a href="www.john.com">Link to John</a> more text more John</p>
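To make the objection concrete, here is a small sketch that runs the suggested pattern against that sample markup. It assumes $word is "john", as in the original question. Because both `.+` parts are greedy, the match seems to swallow everything from the first `>` to the last `<`, anchor tag included, so wrapping `\0` in <u> tags would wrap nearly the whole paragraph.

```php
<?php
// Sample markup from the post above; $word is assumed to be "john".
$string = '<p>Text text text <a href="www.john.com">Link to John</a> more text more John</p>';
$word = 'john';
$pattern = "/>.+\b$word\b.+</i";

// Look at what the pattern actually captures before replacing anything.
preg_match($pattern, $string, $match);
echo $match[0], "\n";
// prints: >Text text text <a href="www.john.com">Link to John</a> more text more John<
```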

      Retrotron, maybe you'll find these examples helpful to start off. I think this is pretty close to what you want to achieve.

        Excellent. Those examples were helpful. This pattern effectively prevents matches inside of tags:

        $pattern = "/\b($word)\b(?=([^>]*<))/i";
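As a quick check of that pattern, the sketch below wraps it in a helper (the name underlineOutsideTags is made up for illustration) and tries three inputs. The lookahead only succeeds when a `<` is reached before any `>`, so a word inside a tag's attributes is skipped. One side effect worth knowing: a word with no tag after it at all is skipped too.

```php
<?php
// Hypothetical helper wrapping the pattern from the post above.
function underlineOutsideTags(string $text, string $word): string
{
    $pattern = "/\b($word)\b(?=([^>]*<))/i";
    return preg_replace($pattern, '<u>$1</u>', $text);
}

// Word inside an attribute: the next angle bracket is ">", so no match.
$inTag = underlineOutsideTags("<a href='http://www.john.com'>some link</a>", 'john');

// Word in text followed by a tag: matched and underlined.
$inText = underlineOutsideTags('<p>call john</p>', 'john');

// Word in plain text with no "<" after it: the lookahead fails, nothing changes.
$noTag = underlineOutsideTags('call john', 'john');

echo $inTag, "\n", $inText, "\n", $noTag, "\n";
```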
        

        Almost there. This pattern still matches words that occur between an opening and a closing <a> tag, e.g. this --

        $string = 'Hey there <a href="www.john.com">John Simpson</a>.';
        $word = "john";
        $pattern = "/\b($word)\b(?=([^>]*<))/i";
        $replacement = "<u>\\1</u>";
        $newString = preg_replace ( $pattern, $replacement, $string );
        echo $newString;
        

        -- gives me this:

        Hey there <a href="www.john.com"><u>John</u> Simpson</a>.

        I can't have replacements between the opening and closing <a> tags either. Basically, I can't have any replacements between <a.......</a>, but I'm not sure how to use regex to say match $word unless it occurs between <a.....</a>.

        Any ideas?

          Wow, that's great....doing the logic in the replacement like that. I couldn't seem to see how to do this without logic, but of course I can't use logic in regex and was thus stuck. This is a clever solution. Excellent. Thanks a bunch.
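The reply being thanked here is not part of this thread as shown, so the following is only a guess at what "doing the logic in the replacement" could look like. It is a minimal sketch using preg_replace_callback: the pattern matches either a whole <a>...</a> element or the word, and the callback decides which one it got. The function name underlineOutsideAnchors and the extra negative lookahead are assumptions, not the original poster's code.

```php
<?php
// Hypothetical helper: underline $word everywhere except inside <a>...</a>.
function underlineOutsideAnchors(string $text, string $word): string
{
    // First alternative: a complete anchor element (consumed whole).
    // Second alternative: the word, as long as it is not inside some other tag.
    $pattern = '/<a\b[^>]*>.*?<\/a>|\b(' . preg_quote($word, '/') . ')\b(?![^<>]*>)/is';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Group 1 is only present when the word alternative matched.
            if (isset($m[1]) && $m[1] !== '') {
                return '<u>' . $m[1] . '</u>';
            }
            // Otherwise an anchor matched: hand it back unchanged.
            return $m[0];
        },
        $text
    );
}

$input = 'Hey there <a href="www.john.com">John Simpson</a>. Ask John.';
$output = underlineOutsideAnchors($input, 'john');
echo $output, "\n";
// prints: Hey there <a href="www.john.com">John Simpson</a>. Ask <u>John</u>.
```

Because the anchor alternative comes first and consumes the whole element, the word alternative never gets a chance to match inside it. Nested or malformed anchors would probably need a real HTML parser instead.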

          I found another way a few minutes back:

          1. Use preg_match_all to get all occurrences of <a....</a> and store them in $anchorsArray.
          2. Use preg_replace to replace all occurrences of <a...</a> with a temporary tag marker (e.g. $!$).
          3. Do my other preg_replacements.
          4. Go back through the string and replace each $!$ with $anchorsArray[$i].

          This might be slower because it involves more steps, although the regular expressions are much simpler and less resource intensive.
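The four steps above can be sketched roughly as follows. The function name and the anchor pattern are assumptions for illustration, and the `$!$` marker is simply taken from the post and assumed not to occur in the input. Step 2 uses a callback and step 4 uses explode rather than preg_replace, so that any `$` or backslash in the marker or in the stored anchors is not read as a backreference.

```php
<?php
// Hypothetical helper following the four steps listed above.
function underlineWithPlaceholders(string $text, string $word): string
{
    $anchorPattern = '/<a\b[^>]*>.*?<\/a>/is';
    $marker = '$!$'; // marker from the post; assumed absent from $text

    // Step 1: remember every anchor element, in document order.
    preg_match_all($anchorPattern, $text, $found);
    $anchorsArray = $found[0];

    // Step 2: swap each anchor for the marker.
    $text = preg_replace_callback(
        $anchorPattern,
        function () use ($marker) { return $marker; },
        $text
    );

    // Step 3: the simple word replacement, now out of reach of the anchors.
    $text = preg_replace('/\b(' . preg_quote($word, '/') . ')\b/i', '<u>$1</u>', $text);

    // Step 4: put the anchors back, one marker at a time.
    $parts = explode($marker, $text);
    $result = array_shift($parts);
    foreach ($parts as $i => $part) {
        $result .= $anchorsArray[$i] . $part;
    }
    return $result;
}

$restored = underlineWithPlaceholders(
    'Hey there <a href="www.john.com">John Simpson</a>. Ask John.',
    'john'
);
echo $restored, "\n";
// prints: Hey there <a href="www.john.com">John Simpson</a>. Ask <u>John</u>.
```

Note that step 3 here uses the plain word pattern, so it would still touch the word inside other tags' attributes; it only protects <a> elements.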

            Hi again,

            I did use similar placeholder methods on some occasions where things went way over my head. Nevertheless I've always been a little concerned about the possibility that whatever placeholder I used may - as improbable as it may be in a given case - occur within the normal text and scramble things. To be absolutely on the safe side, it would probably be best to validate against that first.
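One way to do that validation, sketched under the assumption that PHP 7 or later is available (for random_bytes): generate a random marker and retry until it does not appear in the text. The function name makeUnusedMarker and the `[[...]]` marker shape are made up for this example.

```php
<?php
// Hypothetical helper: return a marker string that is guaranteed
// not to occur anywhere in $text at the time of the call.
function makeUnusedMarker(string $text): string
{
    do {
        // 16 random hex characters wrapped in brackets.
        $marker = '[[' . bin2hex(random_bytes(8)) . ']]';
    } while (strpos($text, $marker) !== false);

    return $marker;
}

$sample = 'Some text that might, improbably, contain a placeholder.';
$marker = makeUnusedMarker($sample);
var_dump(strpos($sample, $marker)); // bool(false)
```

The marker contains hex digits and letters, so a search word that happened to equal the whole hex string could still match inside it. That seems very unlikely with random bytes, but it is not strictly ruled out by this check.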

            Concerning resources, well, who knows... but I'm not in the mood to do benchmarking tests on this now 😉

              xblue wrote:

              I did use similar placeholder methods on some occasions where things went way over my head. Nevertheless I've always been a little concerned about the possibility that whatever placeholder I used may - as improbable as it may be in a given case - occur within the normal text and scramble things. To be absolutely on the safe side, it would probably be best to validate against that first.

              That's pretty much what I did in a situation where I found myself doing something similar.
              http://www.phpbuilder.com/board/showpost.php?p=10411554&postcount=12

                Ah, good point. It's safest to take this approach. Cool.
