Hello, this is my first post here, and also my first time experimenting with XML, so please bear with me. 😃

I am attempting to create a script that will format an XML file based on a given template. I've got it all working except for one problem. However, while trying to reproduce the problem I also ran into a little quirk that I can't quite explain (and I'm curious about), so I'll ask about that first.

Here is the code in question:

<?php

class NewsXMLParser
{
  var $data;

  // PHP 4 style constructor: must have the same name as the class
  function NewsXMLParser()
  {
    $this->data = "";
  }

  function startElement($parser, $tagName, $attributes)
  {
    print "<b>start element--></b>";
  }

  function endElement($parser, $tagName)
  {
    //print $this->data;
    print "<b><--end element</b>";
  }

  function characterData($parser, $cdata)
  {
    $this->data = $cdata;
    print $this->data;
    print "<br /><hr /><br />";
  }
}

$xml_parser = xml_parser_create();
$news_parser = new NewsXMLParser();

// Route the handler callbacks to methods of $news_parser
xml_set_object($xml_parser, $news_parser);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

$fp = fopen("test2.xml", "r") or die("Error reading XML data.");

// Feed the file to the parser one block at a time
while ($data = fread($fp, 4026)) {
  xml_parse($xml_parser, $data, feof($fp))
    or die(sprintf("XML error: %s at line %d",
      xml_error_string(xml_get_error_code($xml_parser)),
      xml_get_current_line_number($xml_parser)));
}

fclose($fp);
xml_parser_free($xml_parser);

?>

The .xml file I am using is located here: http://devel.openbracket.net/test2.xml

Anyway, I was wondering why, when executing the script above, the callback function for CDATA ( characterData() ) seems to be called multiple times, even though there is clearly only one chunk of CDATA in the .xml file. I can tell it's being called more than once because of the multiple horizontal rules that appear: http://devel.openbracket.net/test.php

For my next question, assume that the commented line in the code above is uncommented and the second statement in the "characterData" function is commented out. Here's the modified code for reference:

...
  function endElement($parser, $tagName)
  {
    print $this->data;
    print "<b><--end element</b>";
  }

  function characterData($parser, $cdata)
  {
    $this->data = $cdata;
    //print $this->data;
    print "<br /><hr /><br />";
  }
...

It seems that $this->data is partially wiped out at the end of characterData(). I assign $cdata to $this->data (and the previous, unmodified code shows that it prints correctly right after), but when endElement() is called, the exact same print line outputs only the last part of what should be shown: http://devel.openbracket.net/test2.php

I can't understand why this happens, and getting the whole chunk to display is the only thing keeping me from finishing my script. Any help would be appreciated 😃

Edit: Wow, as soon as I hit the post button I realized that these two problems are connected. The last "chunk" that shows in the first, unmodified code is the same one that shows in the second version. So it seems the only thing being stored in $this->data is from the last time characterData() is called. So I guess my main problem now is figuring out how to keep characterData() from being called multiple times....

    Okay, I've now got a temporary solution, so the problem isn't as urgent as before. Instead of reassigning $this->data every time the CDATA function is called, I used string concatenation to append to it, and I reset it once the element ends.

    ...
      function endElement($parser, $tagName)
      {
        print $this->data;
        print "<b><--end element</b>";
        $this->data = "";
      }

      function characterData($parser, $cdata)
      {
        $this->data .= $cdata;
        //print $this->data;
        print "<br /><hr /><br />";
      }
    ...

    Heh, I guess I found my own solution, but I'm still curious as to why the function is called multiple times and whether there is anything I can do to prevent that.
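
    For reference, here is a minimal, self-contained sketch of the concatenation approach described above. The class name BufferingHandler, its properties, and the sample XML string are made up for illustration and are not taken from the original script. It passes the handlers as array callables instead of using xml_set_object(), and it assumes simple elements that contain only text (no nested mixed content).

```php
<?php
// Sketch only: collect every character-data fragment of an element and
// keep the joined text once the closing tag arrives.
// BufferingHandler and the sample XML are hypothetical.
class BufferingHandler
{
    public $buffer = "";      // text collected for the element currently open
    public $calls = 0;        // how many times characterData() has fired
    public $texts = array();  // completed text of each closed element

    function startElement($parser, $tagName, $attributes)
    {
        $this->buffer = "";   // start fresh for each element
    }

    function characterData($parser, $cdata)
    {
        $this->calls++;
        $this->buffer .= $cdata;   // append, never overwrite
    }

    function endElement($parser, $tagName)
    {
        $this->texts[] = $this->buffer;   // the whole text is available here
        $this->buffer = "";
    }
}

// Made-up input: one element whose text contains an entity and a line break
$xml = "<story>Tom &amp; Jerry\nsecond line</story>";

$handler = new BufferingHandler();
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    array($handler, "startElement"),
    array($handler, "endElement")
);
xml_set_character_data_handler($parser, array($handler, "characterData"));

if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

print $handler->texts[0] . "\n";
print "characterData() calls: " . $handler->calls . "\n";
```

    The number of calls reported at the end depends on the underlying parser library, so it is printed rather than assumed. The joined text in $handler->texts[0] should be the complete element text regardless of how many fragments it arrived in.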

      Perhaps it has to do with reading 4026 bytes at a time... Does reducing the CDATA reduce the number of line breaks?

      What has this got to do with OOP? 🙂

        Well, I THOUGHT it had to do with OOP, because doing the exact same thing without xml_set_object works perfectly for some reason.

        Also, I've increased 4026 to larger numbers and it still does the exact same thing, so I figure that isn't the problem. Besides, the whole xml file is only 2kb :/
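
        One way to check the read-size theory is to parse the same document twice with different block sizes and compare the results. This is only a sketch: collectText() is a hypothetical helper, the sample XML is made up, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: parse $xml in pieces of $chunkSize bytes and return
// the concatenated character data plus the number of handler calls.
function collectText($xml, $chunkSize)
{
    $text = "";
    $calls = 0;

    $parser = xml_parser_create();
    xml_set_character_data_handler(
        $parser,
        function ($p, $cdata) use (&$text, &$calls) {
            $calls++;
            $text .= $cdata;   // join the fragments as they arrive
        }
    );

    // Simulate a read loop by splitting the string into blocks
    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        if (!xml_parse($parser, $chunk, $i === $last)) {
            die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)));
        }
    }

    return array($text, $calls);
}

// Made-up sample document
$xml = "<news>\n<item>First line of the story.\nSecond line of the story.</item>\n</news>";

$small = collectText($xml, 16);     // many small blocks
$large = collectText($xml, 4096);   // the whole document in one block

print "calls with 16-byte blocks:   " . $small[1] . "\n";
print "calls with 4096-byte blocks: " . $large[1] . "\n";
print ($small[0] === $large[0]) ? "same text\n" : "different text\n";
```

        The call counts are printed rather than predicted, since they depend on the parser library in use. The point of the comparison is that the joined text should not change with the block size, which is why concatenating in characterData() is the safer approach.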
