Hey everyone. I'm having a problem parsing XML files in PHP 4.3.9.
I have a few files I'm trying to parse. I'm sure that they're all well-formed, because both IE 6.0 and Firefox 1.0.6 display them without problems. When I try to parse them, though, xml_parse() fails with one of these errors:
- invalid token
- unclosed token
- not well-formed (invalid token)
I did a little investigating and found that when the files are fresh (they are auto-generated by another product) and I open them in TextPad, the encoding is shown as Unicode (which I assume means UTF-16). But if I use TextPad to save them as UTF-8 documents, the errors disappear and the script works perfectly!
So, my questions are:
- Why does that happen?
- Is there any way I can use PHP to convert the docs to UTF-8 before I try to parse them? I tried reading the original XML file into a string, running it through utf8_encode(), and then writing the result to a temp file, but that doesn't seem to work.
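For reference, here is a minimal sketch of the kind of conversion I have in mind. It assumes that "Unicode" in TextPad means UTF-16 with a byte order mark, and that the mbstring extension is available. The function name xml_bytes_to_utf8 is made up for illustration. As far as I can tell, utf8_encode() treats its input as ISO-8859-1, so it would not be the right tool if the files really are UTF-16.

```php
<?php
// Hypothetical helper (not part of PHP): convert raw XML bytes to UTF-8
// based on the byte order mark. Assumes the mbstring extension is loaded
// and that UTF-16 input always starts with a BOM.
function xml_bytes_to_utf8($data)
{
    $bom = substr($data, 0, 2);
    if ($bom === "\xFF\xFE") {
        // UTF-16, little-endian: drop the BOM and convert
        $data = mb_convert_encoding(substr($data, 2), 'UTF-8', 'UTF-16LE');
    } elseif ($bom === "\xFE\xFF") {
        // UTF-16, big-endian: drop the BOM and convert
        $data = mb_convert_encoding(substr($data, 2), 'UTF-8', 'UTF-16BE');
    } elseif (substr($data, 0, 3) === "\xEF\xBB\xBF") {
        // Already UTF-8: just drop the BOM
        $data = substr($data, 3);
    }
    // Keep the XML declaration (if any) consistent with the new encoding.
    return preg_replace(
        '/^(<\?xml[^>]*encoding=)(["\'])[^"\']*\2/',
        '${1}${2}UTF-8${2}',
        $data
    );
}
```

The idea would be to read the file with file_get_contents(), pass the string through this function, and hand the result to xml_parse() using a parser created with xml_parser_create('UTF-8'), so no temp file is needed.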
By the way: There is no way for me to adjust how the XML is generated. I have to work with these files as is.
Anybody have any ideas??? 😕
Thanks!