given:

foreach ($doc->getElementsByTagName('div') as $nodes)
{
...
}

How would I access an img tag within that div node to grab its src attribute?

    You'd look at each $nodes (bad name, it's just one node) and see if it's an img tag; if it is you'd look at its src attribute.

      Weedpacket;10915045 wrote:

      You'd look at each $nodes (bad name, it's just one node) and see if it's an img tag; if it is you'd look at its src attribute.

      that's what I thought, and I tried:

      foreach ($doc->getElementsByTagName('div') as $nodes)
      {
          foreach ($nodes->getElementsByTagName('img') as $node) //crappy var names I know
          {
              $imgSRC = $node->getAttribute('src');
          }
      }
      

      but it came back empty and I know it's there

        this works, albeit while throwing a fatal error:

        $doc = new DOMDocument();
        $doc->loadHTML($html);
        
        foreach($doc->getElementsByTagName('div') as $nodes)
        {
        	$imgs = $nodes->getElementsByTagName('img');
        	$imgsLength = ($imgs->length - 1);
        	echo $imgs->item($imgsLength)->getAttribute('src');
        }
        

        I get:

        Fatal error: Call to a member function getAttribute() on a non-object
        

        on the echo line... HOWEVER, it does echo the data I want... what's with the error?
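
The error most likely comes from the divs that contain no img at all. getElementsByTagName('div') also returns the nested divs, so for any div without an image $imgs->length is 0, item(-1) returns null, and calling getAttribute() on null is fatal. The earlier iterations have already echoed by then, which would explain why the data still shows up. A minimal sketch of a guard, using a made-up $html string as a stand-in for the real page:

```php
<?php
// Hypothetical sample markup; the real page will differ.
$html = '<div class="collection">'
      . '<div><a><img src="a.png"/></a></div>'
      . '<div class="info"><p>no image here</p></div>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$sources = array();
foreach ($doc->getElementsByTagName('div') as $div) {
    $imgs = $div->getElementsByTagName('img');
    if ($imgs->length === 0) {
        continue; // item(-1) would return null for this div
    }
    // Same choice as the original loop: take the last img inside the div.
    $sources[] = $imgs->item($imgs->length - 1)->getAttribute('src');
}

print_r($sources);
```

Note that the outer div and the inner div both report the same image here, because every div is visited, including nested ones.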

          this is the HTML structure I'm working with (severely cleaned up)... this is one iteration of several

          <div class="collection">
               <div><a><img/></a></div> //VARIABLE
               <div><a><img/></a></div> //VARIABLE
               <div><a><img/></a></div> //VARIABLE
               <div><a><img/></a></div> //VARIABLE
               <div class="important">
               <div NEED BACKGROUND HEX FROM THIS INLINE STYLE><a><img class="image" NEED THIS SOURCE/></a></div>
               </div>
               <div class="info">
               <p class="name"><a NEED THIS HREF>NEED THIS TEXT</a></p>
               <p class="details">NEED THIS TEXT</p>
               </div>
               <div class="last">
               <p class="misc">NEED THIS TEXT</p>
               </div>
          </div>
          

          The ones marked VARIABLE don't always exist... there might be one, there might be four...

          I'm trying to nab all this in one sweep instead of running multiple foreach loops, processing the results and then appending to arrays

            I should also note that the last div group (classed above as "last") doesn't always exist either

              the more I play with this, the more I'm thinking that XPath queries might be simpler... that being said, it's been about 3 years since I've touched XPath so I'm rustier than a 10 year old trampoline

              would this be the correct way to grab the first bit of data I'm looking for:

              $doc = new DOMDocument();
              $doc->loadHTMLFile($file);
              
              $xpath = new DOMXpath($doc);
              
              $elements = $xpath->query("*/div[@class='important']/div");
              $style = $elements->item(0)->getAttribute('style');
              

              and then use my regex to get the hex code from $style?
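
One thing to watch in that query: a path starting with */div is evaluated from the document node, so it only matches a div that is a direct child of the root html element. After loadHTML() or loadHTMLFile() the divs sit under body, so the query most likely returns nothing and item(0) is null. Starting the path with // searches at any depth. A small sketch with made-up markup (the style value and file name are placeholders):

```php
<?php
// Hypothetical sample markup standing in for the real file.
$html = '<div class="collection"><div class="important">'
      . '<div style="background:#ff00aa"><a><img class="image" src="x.png"/></a></div>'
      . '</div></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Relative path from the document node: looks for html/div, which does not exist.
$shallow = $xpath->query("*/div[@class='important']/div");

// Descendant search: finds div.important wherever it is nested.
$deep = $xpath->query("//div[@class='important']/div");

// Guard against an empty result before calling getAttribute().
$style = $deep->length > 0 ? $deep->item(0)->getAttribute('style') : '';
echo $shallow->length, ' ', $deep->length, ' ', $style, "\n";
```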

                ok... XPath actually turns out to be the most efficient and easiest way to grab all these in one pass

                example:

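                //COLOR
                // per the structure posted above, the inline style is on the div inside div.important
                $elements = $xpath->query("//div[@class='important']/div");
                $style = $elements->item(0)->getAttribute('style');
                // [0-9A-z] would also match characters like [ and _, so restrict to hex digits
                preg_match('/#[0-9A-Fa-f]{6}/', $style, $color);
                $color = $color[0];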
                

                here's my only issue... how do I iterate that XPath query (and the others) over every <div class="collection"> container?

                thanks!
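
One way to do that, sketched under the assumption that the markup matches the structure posted above: query for the containers first, then pass each container as the second argument (the context node) to DOMXPath::query() and start the inner paths with a dot so they stay relative to that container. The sample $html and the array keys below are made up for illustration.

```php
<?php
// Hypothetical sample with two containers; the second has no "last" block.
$html = '<div class="collection">'
      . '<div class="important"><div style="background:#a1b2c3"><a><img class="image" src="one.png"/></a></div></div>'
      . '<div class="info"><p class="name"><a href="/one">One</a></p><p class="details">First</p></div>'
      . '<div class="last"><p class="misc">Extra</p></div>'
      . '</div>'
      . '<div class="collection">'
      . '<div class="important"><div style="background:#000000"><a><img class="image" src="two.png"/></a></div></div>'
      . '<div class="info"><p class="name"><a href="/two">Two</a></p><p class="details">Second</p></div>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$items = array();
foreach ($xpath->query("//div[@class='collection']") as $collection) {
    // The leading "." keeps each query inside the current container.
    $styled  = $xpath->query(".//div[@class='important']/div", $collection)->item(0);
    $img     = $xpath->query(".//img[@class='image']", $collection)->item(0);
    $link    = $xpath->query(".//p[@class='name']/a", $collection)->item(0);
    $details = $xpath->query(".//p[@class='details']", $collection)->item(0);
    $misc    = $xpath->query(".//p[@class='misc']", $collection)->item(0);

    // Pull the hex colour out of the inline style, if there is one.
    $color = null;
    if ($styled && preg_match('/#[0-9A-Fa-f]{6}/', $styled->getAttribute('style'), $m)) {
        $color = $m[0];
    }

    // item(0) is null when a block is missing, so every field is guarded.
    $items[] = array(
        'color'   => $color,
        'src'     => $img ? $img->getAttribute('src') : null,
        'href'    => $link ? $link->getAttribute('href') : null,
        'name'    => $link ? $link->textContent : null,
        'details' => $details ? $details->textContent : null,
        'misc'    => $misc ? $misc->textContent : null, // optional "last" block
    );
}

print_r($items);
```

Without the leading dot, a path such as //p[@class='misc'] would search the whole document again on every iteration, even with a context node supplied.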
