They're pasting into a form, and that form/page submits to itself, where I can run some processing functions on the input before I write it to the XML. A few simple things I do: I take out the <>'s and replace them with []'s, etc.
I found a function somebody wrote that replaces a ton of characters [chr(123)] with their (Unicode?) counterparts [&#123;], and after that I seem to be able to run mb_convert_encoding() on the result to get it into UTF-8.
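For reference, the processing looks roughly like the sketch below. This is not my exact code: the function name `clean_pasted_text` and the `Windows-1252` source encoding are assumptions for illustration, and it uses the built-in `htmlspecialchars()` in place of the replacement function I found. It also converts to UTF-8 first, so the later steps only ever see one encoding.

```php
<?php
// Hypothetical helper: the name and the default source encoding are
// assumptions for illustration, not taken from the real form handler.
function clean_pasted_text(string $raw, string $sourceEncoding = 'Windows-1252'): string
{
    // Convert to UTF-8 up front so every later step works on one known encoding.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', $sourceEncoding);

    // Swap angle brackets for square brackets, as described above.
    $utf8 = str_replace(['<', '>'], ['[', ']'], $utf8);

    // Escape the characters that are special in XML (&, quotes, and any
    // remaining < or >) so the text can be written into an XML document.
    return htmlspecialchars($utf8, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}
```

Calling `clean_pasted_text("a<b> & {")` returns `a[b] &amp; {`; the `{` passes through untouched because it is plain ASCII and needs no escaping in XML.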
Does anybody have a minute to explain the different characters to me?
chr(123) - what's the name for this character? What encoding is it used in?
&quot; / &amp; - how about these? Are they embedded in HTML?
&#123; - this is Unicode, correct? Or UTF-8? Or both?