SimpleXML will not parse certain nodes in my XML file

  1. #1
    Senior Member
    Join Date
    May 2003
    Posts
    121

    I'm parsing a huge XML file whose format I don't control. It's valid XML, though, and I need to use SimpleXML to parse it to keep memory usage on the server light.

    Here is my sample XML file, as you can see the D node doesn't have any text and neither does the C element...
    Code:
    <?xml version="1.0"?>
                    <a>
                     <b>
                      <c>dog food</c>
                      <c>cat food</c>
                     </b>
                     <d>
                      <c />bird food
                     </d>
                    </a>
    Now when I parse it with PHP, it doesn't show anything for the D node, even though the text "bird food" should show up.

    Here is the sample PHP code to prove it:
    PHP Code:
    $string = '<?xml version="1.0"?>
                    <a>
                     <b>
                      <c>text</c>
                      <c>stuff</c>
                     </b>
                     <d>
                      <c />code
                     </d>
                    </a>';

    $xml = new SimpleXMLElement($string);


    echo '<pre>'; print_r($xml); echo '</pre>';
    $data = $xml->d->children();
    echo '<pre>'; print_r($data); echo '</pre>';
    The XML file I have is much bigger than this, but this illustrates an issue I run into throughout the entire file with SimpleXML.
    php 5.x | mysql 5.x | apache 2.x

    online / live - Linux
    offline / dev - Windows

  2. #2
    Pedantic Curmudgeon Weedpacket
    Join Date
    Aug 2002
    Location
    General Systems Vehicle "Thrilled To Be Here"
    Posts
    21,771
    If you want to keep memory usage light on a large XML input, then the streaming XML parser may be a better choice.
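
    As a rough sketch of what the streaming approach could look like, assuming the streaming parser meant here is PHP's XMLReader: the loop below walks the sample document node by node and collects the non-blank text found directly inside d. The names $dDepth and $texts are only illustrative, and the sample is loaded from a string for brevity; for a large file, XMLReader::open() with a path would read from disk instead.

```php
<?php
// Same sample document as in the first post.
$string = '<?xml version="1.0"?>
<a>
 <b>
  <c>text</c>
  <c>stuff</c>
 </b>
 <d>
  <c />code
 </d>
</a>';

$reader = new XMLReader();
$reader->XML($string);

$dDepth = null;    // depth of the currently open <d>, or null when outside one
$texts  = array(); // non-blank text nodes that are direct children of <d>

while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT
        && $reader->name === 'd'
        && !$reader->isEmptyElement) {
        $dDepth = $reader->depth;            // entering <d>
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT
        && $reader->name === 'd') {
        $dDepth = null;                      // leaving </d>
    } elseif ($dDepth !== null
        && $reader->nodeType === XMLReader::TEXT
        && $reader->depth === $dDepth + 1) {
        $texts[] = trim($reader->value);     // text sitting directly inside <d>
    }
}
$reader->close();

print_r($texts);
```

    Only one node is held at a time, which is what keeps memory use low; the price is that you have to track where you are in the document yourself.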

    From simplexmlelement.children:
    Note: SimpleXML has made a rule of adding iterative properties to most methods. They cannot be viewed using var_dump() or anything else which can examine objects.

    PHP Code:
    <?php
    $string = '<?xml version="1.0"?>
                    <a>
                     <b>
                      <c>text</c>
                      <c>stuff</c>
                     </b>
                     <d>
                      <c />code
                     </d>
                    </a>';

    $xml = new SimpleXMLElement($string);

    $data = $xml->d;

    echo $data;

  3. #3
    Senior Member
    Join Date
    Jul 2007
    Posts
    3,619
    Quote Originally Posted by Weedpacket View Post
    If you want to keep memory usage light on a large XML input, then the streaming XML parser may be a better choice.
    I definitely agree, although I took the lazy approach and illustrated my point using what I know: DOMDocument.

    Note: SimpleXML has made a rule of adding iterative properties to most methods. They cannot be viewed using var_dump() or anything else which can examine objects.
    The actual problem here, though, seems to be that it only retrieves the child elements of d, not all of its child nodes. Bug? Since I'm not familiar with SimpleXML, there may be some other way of accessing child nodes rather than child elements, but I failed to find one with a quick glance at the manual.
    PHP Code:
    echo '<pre>'; print_r($xml); echo '</pre>';
    $data = $xml->d->children();

    $i = 0;
    foreach ($data as $n)
    {
        echo '<pre>'; echo ++$i . ' ' . print_r($n, 1) . PHP_EOL; echo '</pre>';
    }

    I'd have expected 3 iterations with output for
    $d-> first child node (TextNode)
    $d-> second child (c element)
    $d-> last child node (TextNode)

    Similarly, assigning $xml->d rather than $xml->d->children() to $data ought to assign the d element, and echoing it should produce its node value (that is, all the text nodes contained directly in d as well as all the text nodes of d's child elements).
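
    One possible way to reach the child nodes while still starting from SimpleXML, assuming the DOM extension is loaded, is dom_import_simplexml(), which returns the DOM node behind a SimpleXMLElement. The sketch below is only an illustration (the names $summary and $kind are made up for it): it lists every child node of d and then prints d's full text content.

```php
<?php
// Same sample document as in the first post.
$string = '<?xml version="1.0"?>
<a>
 <b>
  <c>text</c>
  <c>stuff</c>
 </b>
 <d>
  <c />code
 </d>
</a>';

$xml = new SimpleXMLElement($string);

// Get the DOMElement for <d>; the document is not parsed a second time.
$d = dom_import_simplexml($xml->d);

$summary = array();
foreach ($d->childNodes as $child) {
    // Text nodes are labelled "text"; elements are labelled with their tag name.
    $kind = ($child->nodeType === XML_TEXT_NODE) ? 'text' : $child->nodeName;
    $summary[] = $kind . ': "' . trim((string) $child->nodeValue) . '"';
}

print_r($summary);

// textContent concatenates the text of d and of all its descendants.
echo trim($d->textContent), PHP_EOL;
```

    With the sample above this should report three child nodes: a whitespace-only text node, the empty c element, and the text node holding "code".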

    Quote Originally Posted by s0me0ne View Post
    Here is my sample XML file, as you can see the D node doesn't have any text and neither does the C element...
    If you're talking about the results using SimpleXML in your example, then I agree. But if you're talking about how things really are, then this should be corrected to:
    The D node has 3 child nodes: a whitespace text node (a newline plus indentation), followed by an empty c element, followed by a "bird food\n" text node.

    And, demonstrating this using DOM:
    PHP Code:
    $xml = new DOMDocument('1.0');
    $xml->loadXML($string);

    echo '<pre>'; print_r($xml); echo '</pre>';

    $a = $xml->childNodes->item(0);
    $d = $a->getElementsByTagName('d')->item(0);
    echo '<pre>';
    $i = 0;
    foreach ($d->childNodes as $child)
    {
        echo ++$i . ': ' . $child->nodeValue . PHP_EOL;
    }
    echo '</pre>';
    Output
    Code:
    1: 
    		
    2: 
    3: bird food
    Or more simply
    PHP Code:
    $d = $a->getElementsByTagName('d')->item(0);
    echo "<pre>'" . $d->nodeValue . "'</pre>";
    Last edited by johanafm; 11-26-2012 at 12:15 PM.
