I've lost the day trying to do something that I thought was going to be completely simple to do, and it's been one headbanging experience after another...
I'm getting a ZIP file from a Java program. The file is POSTed to my system, and will be sitting in a directory for me to process. Inside the ZIP file is a single text file, which is actually an XML "fragment" - there's no <?xml ?> declaration and no root node.
It's basically a set of data readings:
<data time="hh:mm:ss" date="yyyymmdd" value=""><someinfo></someinfo><otherinfo></otherinfo></data>
Just rows and rows of that.
So I thought I'd use the zip_* functions to read the file in, slap an <?xml ?> declaration and <root> </root> around the returned info, and load that into a DOMDocument to get an XML document to play with.
Nope.
I seem to be able to read the data from the zip file. But it turns out the file coming back is encoded in either UTF-8 or UTF-16 ... there's a Byte Order Mark that I have to ignore on the return from the zip_* functions.
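To pin down which encoding is actually coming back, the BOM itself can be checked byte by byte. This is only a minimal sketch: normalize_to_utf8() is a made-up helper name, it assumes the mbstring extension is available, and it ignores UTF-32 (whose little-endian BOM also starts with FF FE).

```php
<?php
// Hypothetical helper: look at the leading bytes, strip any BOM,
// and return the text as UTF-8. Assumes the mbstring extension is loaded.
function normalize_to_utf8(string $raw): string
{
    // UTF-8 BOM: EF BB BF
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        return substr($raw, 3);
    }
    // UTF-16 little-endian BOM: FF FE
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    // UTF-16 big-endian BOM: FE FF
    if (strncmp($raw, "\xFE\xFF", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    // No BOM found: assume (without proof) the data is already UTF-8.
    return $raw;
}
```

Dumping the first few bytes with bin2hex(substr($raw, 0, 4)) shows which branch applies: efbbbf means UTF-8, fffe or feff means UTF-16.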
When I plunk the extra info on and try to do a loadXML() call on the string I build, I'm getting parse errors.
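For reference, this is roughly what the wrapping step looks like when the fragment is already plain UTF-8 with no BOM. It's a sketch under that assumption, and load_fragment() is a made-up name. If the fragment is really UTF-16, gluing a single-byte header onto it produces a mixed-encoding string full of NUL bytes, which would explain the parse errors, though I can't confirm that's what is happening here.

```php
<?php
// Hypothetical helper: wrap a UTF-8, BOM-free fragment in a declaration
// and a root element, then parse it.
function load_fragment(string $utf8Fragment): DOMDocument
{
    // The XML declaration has to be the very first thing in the string.
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<root>' . $utf8Fragment . '</root>';

    $doc = new DOMDocument();
    // loadXML() returns false (and raises a warning) on malformed input.
    if (!$doc->loadXML($xml)) {
        throw new RuntimeException('Could not parse the wrapped fragment');
    }
    return $doc;
}

// Usage with a row shaped like the sample above (values are made up).
$doc = load_fragment(
    '<data time="01:02:03" date="20080101" value="1">'
    . '<someinfo></someinfo><otherinfo></otherinfo></data>'
);
echo $doc->getElementsByTagName('data')->length, "\n";
```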
I finally gave up on this approach, and thought I'd try a more brute force and ignorance approach by turning the returned string into an array of records by splitting on "<data".
That's gotten me nowhere, either. I can't figure out the incantation to break the string up when it's encoded in UTF-8 (or is it really UTF-16??). I tried split() and preg_split(), but I only ever end up with one huge record.
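For comparison, here is a sketch of the split working on a string that is known to be UTF-8 (split_records() is a made-up name). On UTF-16 data the bytes of "<data" are interleaved with NUL bytes, so a plain ASCII pattern never matches and everything comes back as one record, which looks a lot like what I'm seeing. Note too that split() was removed in PHP 7, so preg_split() is the one to use.

```php
<?php
// Hypothetical helper: split a UTF-8 string into records, each one
// starting at "<data". The input must already be valid UTF-8.
function split_records(string $utf8): array
{
    // The lookahead (?=...) splits *before* each "<data" without consuming it;
    // \b keeps names such as "<database" from matching.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $parts = preg_split('/(?=<data\b)/u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        // preg_split() returns false on failure, e.g. invalid UTF-8 with /u.
        throw new RuntimeException('preg_split failed; is the input really UTF-8?');
    }
    return $parts;
}

// Usage with two made-up rows.
$records = split_records('<data value="1"></data><data value="2"></data>');
print_r($records);
```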
I've been Googling and searching php.net. All I've found are a large number of articles and blog entries bemoaning the poor support for Unicode in PHP in general, and tutorials and examples that have either full-fledged XML files to load, or are only using ASCII characters.
So ... Can someone point me to something that can either tell me how to get the XML file I want, or tell me how to split the string into an array of records, each on "<data"?
Or point me to something that will explain how to work with Unicode/UTF-8/UTF-16 better so that I can beat on this some more?
Thanks.