I wrote a Unicode-safe AJAX application.
My recommendation, for everything, is to use UTF-8. The various Chinese and Japanese character sets are more of a pain in the arse than anything else, especially Shift JIS (SJIS) and its relatives.
JavaScript strings are Unicode internally (sequences of UTF-16 code units), but the browser still has to decode the bytes of each script file using some particular encoding. That encoding can be controlled with the charset= parameter on the Content-Type header. I put this in my .htaccess:
AddDefaultCharset UTF-8
AddCharset UTF-8 .js
AddType text/plain .ajp
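I did this because I wanted to serve .js and .ajp files with charset UTF-8. AddDefaultCharset only applies to text/plain and text/html responses, which is why .ajp is mapped to text/plain and .js gets its own AddCharset line.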
PHP also outputs everything in UTF-8 and my HTML is in UTF-8.
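To illustrate the difference between the internal string representation and the UTF-8 bytes that go over the wire, here is a small sketch. It assumes an environment that provides TextEncoder (modern browsers and Node.js do); the sample string is made up for illustration.

```javascript
// Sample string written with escapes so the source file's own encoding
// doesn't matter: e-acute, a CJK character, and an emoji outside the BMP.
var sample = '\u00e9\u65e5\ud83d\ude00';

// .length counts UTF-16 code units: 1 + 1 + 2 (the emoji is a surrogate pair).
var utf16Units = sample.length;

// TextEncoder always produces UTF-8: 2 + 3 + 4 bytes for these characters.
var utf8Bytes = new TextEncoder().encode(sample).length;

console.log(utf16Units, utf8Bytes);
```

The two counts differ, which is why the charset declared on the response has to match the bytes actually sent.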
Sending data to the site in UTF-8 is a bit more difficult. I used JavaScript's encodeURIComponent function (a global function, not a method on String objects) to UTF-8 encode and %-encode the strings I send to the server. I use form-encoded POSTs.
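this.escapeForPost = function(str) {
    // UTF-8 encode the string and %-escape the resulting bytes
    str = encodeURIComponent(str);
    // make sure no literal + survives (PHP would decode it as a space)
    str = str.replace(/\+/g, '%2b');
    return str;
};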
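A literal + in a form-encoded body is decoded as a space by PHP, so it has to be sent as %2b. encodeURIComponent already encodes + as %2B, so the extra replace is only a safety net; it does matter if you build the string with escape() or encodeURI(), both of which leave + alone.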
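As a rough sketch of how the escaping fits into a whole form-encoded body, something like the following would work. The helper name buildPostBody and the field names are hypothetical, not from my application.

```javascript
// Hypothetical helper: turns a plain object into an
// application/x-www-form-urlencoded request body.
function buildPostBody(fields) {
  return Object.keys(fields)
    .map(function (name) {
      // encodeURIComponent UTF-8 encodes, then %-escapes (including + and &).
      return encodeURIComponent(name) + '=' + encodeURIComponent(fields[name]);
    })
    .join('&');
}

// Illustrative field values: plus signs, a space, and two CJK characters.
var body = buildPostBody({ title: 'C++ \u5165\u9580', note: 'a b' });
console.log(body);
// title=C%2B%2B%20%E5%85%A5%E9%96%80&note=a%20b
```

The body would then be sent with a Content-Type of application/x-www-form-urlencoded; PHP decodes %2B back to + and the %E5... sequences back to UTF-8 bytes.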
I don't use non-ASCII names for any of my fields, XML elements or attributes - this seems like a sensible precaution.
Mark