Hello, this is my first post here, and also my first time experimenting with XML, so please bear with me.
I am attempting to create a script that will format an XML file based on a given template. I've got it all working except for one problem. However, while trying to reproduce that problem I also ran into a little quirk that I can't quite explain (and I'm curious about), so I'll ask about that first.
Here is the code in question:
<?php
class NewsXMLParser
{
    var $data;

    function __construct()
    {
        // Initialise the property (a bare $data would only be a local variable).
        $this->data = "";
    }

    function startElement($parser, $tagName, $attributes)
    {
        print "<b>start element--></b>";
    }

    function endElement($parser, $tagName)
    {
        //print $this->data;
        print "<b><--end element</b>";
    }

    function characterData($parser, $cdata)
    {
        $this->data = $cdata;
        print $this->data;
        print "<br /><hr /><br />";
    }
}

$xml_parser = xml_parser_create();
$news_parser = new NewsXMLParser();
// Objects are passed by handle, so no call-time & is needed here.
xml_set_object($xml_parser, $news_parser);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

$fp = fopen("test2.xml", "r")
    or die("Error reading XML data.");

// Feed the file to the parser in chunks; feof() marks the final chunk.
while ($data = fread($fp, 4026))
    xml_parse($xml_parser, $data, feof($fp))
        or die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));

fclose($fp);
xml_parser_free($xml_parser);
?>
The .xml file I am using is located here: http://devel.openbracket.net/test2.xml
Anyway, I was wondering why, when executing the script above, the callback function for CDATA ( characterData() ) seems to be called multiple times, even though there is clearly only one chunk of CDATA in the .xml file. I can tell it's being called more than once because of the multiple horizontal rules that appear: http://devel.openbracket.net/test.php
For my next question, assume that the commented line in endElement() above is uncommented and the second statement in characterData() is commented out instead. Here's the modified code for reference:
...
    function endElement($parser, $tagName)
    {
        print $this->data;
        print "<b><--end element</b>";
    }

    function characterData($parser, $cdata)
    {
        $this->data = $cdata;
        //print $this->data;
        print "<br /><hr /><br />";
    }
...
It seems that $this->data is partially wiped out at the end of characterData(). I assign $cdata to $this->data (and the previous, unmodified code shows that it prints correctly right after), but when endElement() is called, the exact same print line outputs only the last part of what should be shown: http://devel.openbracket.net/test2.php
I can't understand why this happens, and getting the whole chunk to display is the only thing keeping me from finishing my script. Any help would be appreciated.
edit: wow, as soon as I hit the post button I realized that these two problems are connected. The last "chunk" that shows in the first, unmodified code is the same one that shows in the second version. So it seems the only thing being stored in $this->data is whatever came from the last time characterData() was called. So I guess my main problem now is trying to figure out how to keep characterData() from being called multiple times...
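For reference, one common way around this is not to prevent the repeated calls but to append inside characterData() and only use the text in endElement(). As far as I know, the parser is allowed to deliver one run of text in several pieces (for example around line breaks or entities), so the handler should not assume a single call. Below is a minimal sketch of that idea. The class name BufferingXMLParser, the $collected property and the inline sample XML are made up for illustration and are not part of my script; it also assumes elements contain only text, since a nested element would reset the buffer.

```php
<?php
// Hypothetical sketch: buffer character data per element instead of overwriting it.
class BufferingXMLParser
{
    public $data = "";            // text gathered for the element currently open
    public $collected = array();  // complete text of each element that has closed

    function startElement($parser, $tagName, $attributes)
    {
        // Begin every element with an empty buffer.
        $this->data = "";
    }

    function characterData($parser, $cdata)
    {
        // Append rather than assign: this handler may fire several times.
        $this->data .= $cdata;
    }

    function endElement($parser, $tagName)
    {
        // Only here is the element's text known to be complete.
        $this->collected[] = $this->data;
        $this->data = "";
    }
}

// Made-up sample input containing a line break and an entity.
$xml = "<item>first line\nsecond line &amp; more</item>";

$handler = new BufferingXMLParser();
$xml_parser = xml_parser_create();

// Array callables bind the handlers to the object directly.
xml_set_element_handler(
    $xml_parser,
    array($handler, "startElement"),
    array($handler, "endElement")
);
xml_set_character_data_handler($xml_parser, array($handler, "characterData"));

if (!xml_parse($xml_parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser)));
}

print $handler->collected[0];
```

However many times characterData() fires for the sample text, endElement() sees the concatenated result, so the number of calls stops mattering.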