Hi guys,
My name's Aidan and this is my first post here. I work for a small company in NZ, and I'm currently trying to work through some unexpected behaviour of the PHP xml_parse() function.
I am having some trouble with the expat-based XML parser in PHP (xml_parser_create(), xml_set_character_data_handler(), et al) dropping whitespace. I'm hoping someone else has run into this problem and can help, or at least that I can find someone to share in my frustration... :bemused:
I've wrapped the functions into a nice OO package, and on the whole it's working perfectly. However, on PHP4 on at least one platform (I think it's a Mac), my character data callback function (set with xml_set_character_data_handler()) never receives any whitespace, including newlines and carriage returns! For my purposes this is unacceptable (the newlines and carriage returns are highly significant in my character data). The problem does not occur on PHP5 on Windows.
Here is an example:
XML
<?xml version='1.0' ?>
<some_text>
This is a short sentence followed by a new line.
Here is a second sentence.
</some_text>
The character data I actually get from the <some_text> node after parsing is:
This is a short sentence followed by a new line.Here is a second sentence.
I've had a wee hunt around in the manual, and I found an option that is undocumented (in my copy of the manual), XML_OPTION_SKIP_WHITE, which looks like it should be the culprit in my case. But this option appears to be off (false) by default, even on my PHP4/Mac target.
Nobody else seems to mind this behaviour, which is fine, except that I can't find much information about the problem anywhere else on the net.
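For reference, here is a minimal sketch of the kind of test I would expect to preserve the newlines. The names on_char_data() and parse_text() are just placeholders for this example, not my real wrapper. It sets XML_OPTION_SKIP_WHITE to 0 explicitly (I'm assuming that is the right way to switch it off) and appends every chunk the handler receives, since the handler can be called more than once for a single text node.

```php
<?php
// Hypothetical reproduction script; function and variable names are placeholders.
$GLOBALS['char_data'] = '';

// Character data handler: append each chunk rather than overwrite,
// because the parser may deliver one text node in several calls.
function on_char_data($parser, $data)
{
    $GLOBALS['char_data'] .= $data;
}

// Parse a complete XML string and return the accumulated character data,
// or false if the parser reports an error.
function parse_text($xml)
{
    $GLOBALS['char_data'] = '';
    $parser = xml_parser_create();
    // Explicitly turn whitespace skipping off (assumption: 0 disables it).
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 0);
    xml_set_character_data_handler($parser, 'on_char_data');
    $ok = xml_parse($parser, $xml, true);
    return $ok ? $GLOBALS['char_data'] : false;
}

$xml = "<?xml version='1.0' ?>\n"
     . "<some_text>\n"
     . "This is a short sentence followed by a new line.\n"
     . "Here is a second sentence.\n"
     . "</some_text>";

$text = parse_text($xml);
// json_encode() makes any surviving newlines visible as \n in the output.
echo json_encode($text), "\n";
```

If the newlines survive, the printed string should contain \n between the two sentences; on the affected platform I would expect them to be missing.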
Has anyone figured out a way around this (stupid) parsing behaviour?
Thanks!