Hi all,
I have always struggled with character sets and such, but I've managed so far... even though I don't fully understand them. I thought it would be a good idea to play with this a bit more and learn as I go... but omg, charsets are HELL! :mad:
I gave myself an assignment: create a database with country information, ISO codes, country names in different translations and such. I thought it would be smart to create a UTF-8 database instead of an ISO-8859-1 database. The reason being: not all alphabets are supported by ISO-8859-1, and maybe I want to add Greek translations...
When I started scripting I found out that PHP's string functions assume a single-byte character set such as ISO-8859-1. So simple string operations like strtolower or ereg_replace can return corrupted strings when they are fed multi-byte UTF-8 characters. A solution could be utf8_decode -> strtolower -> utf8_encode... That would be OK if you hardly ever need to 'do' stuff to a string. But when you're constantly working with strings it becomes a pain, and besides, it takes up processing time!
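To make that round trip concrete, here is a minimal sketch of the utf8_decode -> strtolower -> utf8_encode workaround. The function name lower_via_latin1 is just something I made up for illustration, and the example strings are my own. Note that it only helps for characters that exist in ISO-8859-1 in the first place, which is exactly my problem with Greek.

```php
<?php
// Hypothetical helper (the name is made up for this post): lowercase a
// UTF-8 string by round-tripping it through ISO-8859-1.
function lower_via_latin1($utf8)
{
    // utf8_decode() converts UTF-8 to ISO-8859-1. Characters that have no
    // ISO-8859-1 equivalent (Greek, for example) are replaced with '?'.
    $latin1 = utf8_decode($utf8);

    // strtolower() works byte by byte. Whether it touches accented
    // ISO-8859-1 letters depends on the PHP version and the locale.
    $lowered = strtolower($latin1);

    // Convert the result back to UTF-8.
    return utf8_encode($lowered);
}

echo lower_via_latin1("NEDERLAND"), "\n"; // plain ASCII survives the round trip
echo lower_via_latin1("ΕΛΛΆΔΑ"), "\n";    // Greek is lost: each letter becomes '?'
```

So every string operation costs two extra conversions, and anything outside ISO-8859-1 is destroyed along the way.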
I've looked into setlocale, but I don't understand much of it. What I do understand is that, if you write distributable code, you shouldn't really mess with it.
I'm really stuck... I want to create a multilingual, multi-alphabet application including a database, but I would love to keep working with PHP's built-in functions like strtolower etc. How should I go about this problem?
I found this: http://sourceforge.net/projects/phputf8
But I can't believe PHP is so narrow-minded that we need external libraries to work with multi-alphabet content?!?! :queasy:
How do you guys deal with this problem? Does anyone know of a GOOD, thoroughly explanatory tutorial on this?
Cheers,
Hendricus