Hi

I am using a regular expression to search for a word inside a text file (an RSS feed), and it finds the word fine.

 if (preg_match("/\b".preg_quote($searchWord, "/")."\b/i", $contents)) {
     echo "searching ".$file." for word ".$searchWord. " Found: ".$searchWord." <br/>";
 }
else...

However, I would like to find the beginning and the end of the feed item that contains my search word, i.e. everything from <item> to </item>.

So when I find the word Ebay, for example, I would like to find the nearest <item> before the word and the first </item> after it.

Hope that makes sense

Thanks in advance

IcedEarth

Example

<item> <title>Ebay bid for Lord&apos;s title reaches £3m</title>
            <link>http://www.itn.co.uk/news/ecfcb3185dc2493afd5dc0846418530b.html</link>
            <description>A lord who has put his title and property up for sale on Ebay has received bids over £3 million.</description>
            <guid>http://www.itn.co.uk/news/ecfcb3185dc2493afd5dc0846418530b.html</guid>
            <pubDate>Fri, 08 Aug 2008 11:26:04 +0100</pubDate>
        </item>

    I don't have a lot of knowledge of PHP, so this is probably the hard way, but it might point you in the right direction.

    $ebaypos = strpos($contents, 'Ebay');
    $lastItemPos = strpos($contents, '</item>', $ebaypos);

    You could find the nearest <item> before the word with a bit more string manipulation:
    do the same search, but in reverse. I googled a script for you that does that.
    (http://www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=5517)
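
    The two searches can also be combined into one function. This is only a minimal sketch: the name findItemAround is made up for illustration, it does a plain substring match rather than a whole-word match, and it assumes the feed uses bare <item> and </item> tags without attributes.

```php
<?php
// Hypothetical helper (not from the thread): returns the <item>...</item>
// block around the first occurrence of $word that sits inside an item,
// or null if there is none.
function findItemAround(string $contents, string $word): ?string
{
    if ($word === '') {
        return null;
    }

    $offset = 0;
    // stripos() is the case-insensitive version of strpos().
    while (($wordPos = stripos($contents, $word, $offset)) !== false) {
        $before = substr($contents, 0, $wordPos);
        // Nearest <item> that opens before the word.
        $start = strrpos($before, '<item>');
        // First </item> that closes after the word.
        $end = strpos($contents, '</item>', $wordPos);

        // The word is inside an item only if that <item> has not
        // already been closed before the word.
        if ($start !== false && $end !== false
            && strpos($before, '</item>', $start) === false) {
            return substr($contents, $start, $end + strlen('</item>') - $start);
        }

        // Otherwise (e.g. the word is in the channel title), keep looking.
        $offset = $wordPos + strlen($word);
    }

    return null;
}
```

    Calling findItemAround($contents, 'Ebay') would then give back the whole item as a string, tags included.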

    good luck

      Thanks for the input. I am currently investigating using a regular expression to do the job, but I am no expert in this area, so I will have to read and read!

      Thanks again

        Your regexp might be something like:

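        // The "s" modifier lets "." match newlines; (?:(?!<\/item>).)*? keeps the match from running past the end of an item.
        // On success, $matches[0] holds the whole <item>...</item> block.
        $regexp = '/<item>(?:(?!<\/item>).)*?\b' . preg_quote($searchWord, '/') . '\b.*?<\/item>/is';
        if (preg_match($regexp, $contents, $matches)) {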
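
        If more than one item can contain the word, preg_match_all() collects them all in one go. A rough sketch, assuming bare <item> tags; the function name findItemsContaining is only for illustration:

```php
<?php
// Hypothetical helper (not from the thread): returns every
// <item>...</item> block that contains $searchWord as a whole word.
function findItemsContaining(string $contents, string $searchWord): array
{
    // Escape regex metacharacters in the word, including the "/" delimiter.
    $word = preg_quote($searchWord, '/');

    // (?:(?!<\/item>).)*? walks forward one character at a time but refuses
    // to step over a closing tag, so a match never spans two items.
    // i = case-insensitive, s = "." also matches newlines.
    $regexp = '/<item>(?:(?!<\/item>).)*?\b' . $word . '\b.*?<\/item>/is';

    preg_match_all($regexp, $contents, $matches);

    return $matches[0];
}
```

        On a very large feed this kind of pattern can get slow, so it is worth testing against a real file first.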
        

          Thanks for giving me a pointer to get started! I was starting to go goggle-eyed reading all this regular expression stuff and trying to build one!

            I have found this site useful for regex:
            http://www.regular-expressions.info/reference.html

            Admittedly, regex is my biggest weakness (but I'm getting better). It's one of those mini-languages that doesn't have a large syntax, but the rules of engagement will kick your @$$ if you don't know what you are doing (and half of the time I don't).

            But start small and gradually build up. It's one of those things that takes time to truly get on the ball with, yet once you do, the tool is indispensable!

            Cheers,

            NRG

              Thanks, I'll check it out as I journey on until I discover my holy grail of regular expressions!

                Although, since you're parsing XML, you might want to look at using XPath instead of regular expressions. XPath is designed for searching XML. In this case, something along the lines of: //item/title[contains(text(), '$searchstring')]
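
                In PHP that could look roughly like the sketch below, which uses SimpleXML. Two assumptions to flag: the feed has to be well-formed XML (undeclared prefixes such as media: or dc: will make the parser complain), and the function name findItemsByXPath is made up. Since contains() in XPath 1.0 is case-sensitive, this version only uses XPath to select the items and does the case-insensitive check in PHP.

```php
<?php
// Hypothetical helper (not from the thread): returns the XML of every
// <item> whose title or description contains $searchWord
// (case-insensitive substring match).
function findItemsByXPath(string $xml, string $searchWord): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return [];   // the feed is not well-formed XML
    }

    $found = [];
    // XPath selects every <item> element anywhere in the document.
    foreach ($feed->xpath('//item') as $item) {
        // Concatenation casts the SimpleXML elements to their text content.
        $text = $item->title . ' ' . $item->description;
        if (stripos($text, $searchWord) !== false) {
            $found[] = $item->asXML();   // the item serialized back to XML
        }
    }

    return $found;
}
```

                Filtering in PHP also avoids having to escape quotes in the search word before putting it into an XPath expression.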

                  Thanks I cracked it like this:

                          // Remove the known "bad tags" first.
                          // (Maybe turn this into a function sooner or later.)
                          // The "s" modifier lets "." match newlines, so tags that span several lines are caught too.

                          $match = preg_replace("/<media:thumbnail.*?\/>/s", "", $match);

                          $match = preg_replace("/<dc:date.*?<\/dc:date>/s", " ", $match);
                          $match = preg_replace("/<dc:creator.*?<\/dc:creator>/s", " ", $match);
                          $match = preg_replace("/<dc:type.*?<\/dc:type>/s", " ", $match);

                          // Right, now look for the searched word and write to the file if found.

                          $item = '';
                          if (preg_match("/<item>(.*?)<\/item>/s", $match, $temp)) {
                              $item = trim($temp[1]);
                          }

                          // preg_quote() stops regex metacharacters in the word from breaking the pattern.
                          $regexp = "/\b".preg_quote($searchWord, "/")."\b/i";

                          if ($item !== '' && preg_match($regexp, $item)) {
                              // Put back the item tags that the capture group left out.
                              $item = "<item>\n".$item."\n\t\t</item>\n";
                              $found++;
                          }//end if
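
                  The tag clean-up could be turned into a function along these lines. This is just a sketch: the name stripBadTags is made up, and the list only covers the tags that gave me trouble in my feeds.

```php
<?php
// Hypothetical helper: removes the tags that are known to cause trouble.
function stripBadTags(string $xml): string
{
    // pattern => replacement; "s" lets "." match newlines inside a tag.
    $patterns = [
        '/<media:thumbnail.*?\/>/s'       => '',
        '/<dc:date.*?<\/dc:date>/s'       => ' ',
        '/<dc:creator.*?<\/dc:creator>/s' => ' ',
        '/<dc:type.*?<\/dc:type>/s'       => ' ',
    ];

    // preg_replace() accepts arrays and applies the patterns in order.
    // It returns null on a regex error, in which case the input is kept.
    return preg_replace(array_keys($patterns), array_values($patterns), $xml) ?? $xml;
}
```

                  The four preg_replace() lines then become $match = stripBadTags($match);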

                  Thanks for all your input, appreciated!!!
