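Your XML declares a character set of UTF-8. This is a variable-width, multi-byte encoding: plain ASCII characters take one byte and everything else takes two to four, which is how it supports pretty much every language in the world.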
I'm on thin ice here, but I'll try to explain. Hopefully weedpacket or bradgrafelman or nogdog or someone else will chime in. Unfortunately, PHP's standard string functions are not multibyte-aware by default: they treat a string as a sequence of bytes rather than characters. Perhaps even more confusingly, the script below may report a different value depending on how you save the PHP file.
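<?php
$str = 'ÁÁ'; // two characters; the byte count depends on the file's encoding
echo strlen($str); // strlen() counts bytes, not characters
?>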
If you save the file as Latin-1 (ISO-8859-1) or Windows "ANSI" text, it will probably report "2", because Á is a single byte in those encodings. If you save it as UTF-8, it will report "4", because Á takes two bytes in UTF-8.
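To see the difference without depending on how your editor saves the file, here is a rough sketch that writes the bytes out as escape sequences. The variable names are just for illustration, and the last line assumes the mbstring extension is installed.

```php
<?php
// "Á" is U+00C1: one byte (0xC1) in Latin-1, two bytes (0xC3 0x81) in UTF-8.
// Spelling the bytes out as escapes makes the result independent of the file's own encoding.
$latin1 = "\xC1\xC1";         // "ÁÁ" encoded as Latin-1
$utf8   = "\xC3\x81\xC3\x81"; // "ÁÁ" encoded as UTF-8

echo strlen($latin1), "\n"; // 2 (bytes)
echo strlen($utf8), "\n";   // 4 (bytes)

// mb_strlen() counts characters when told the encoding (requires mbstring).
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 2 (characters)
```

So the string is the same two characters either way; only the byte count that strlen() sees changes.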
Instead of htmlentities, try using [man]utf8_encode[/man], which converts a Latin-1 (ISO-8859-1) string to UTF-8.
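Here is a minimal sketch of what that conversion does, assuming your data really is Latin-1 to begin with (if it is already UTF-8, converting it again will mangle it). I've used mb_convert_encoding from the mbstring extension, which performs the same Latin-1 to UTF-8 conversion and avoids the deprecation notice utf8_encode raises as of PHP 8.2. The `<name>` element and the sample value are made up for the example. As far as I know, the trouble with htmlentities is that it emits named entities such as `&Aacute;`, which XML does not define, so the sketch escapes only the XML-special characters.

```php
<?php
// Hypothetical input: "Á" as a single Latin-1 byte, e.g. from a Latin-1 database or form.
$latin1 = "\xC1";

// Same conversion utf8_encode() performs: ISO-8859-1 in, UTF-8 out.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($utf8), "\n"; // c381

// Escape only <, >, &, and quotes; the accented letter stays as raw UTF-8,
// which matches the encoding the XML declaration promises.
$xml = '<name>' . htmlspecialchars($utf8, ENT_QUOTES | ENT_XML1, 'UTF-8') . '</name>';
echo $xml, "\n";
```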
If that doesn't work, post back here and we'll try to figure it out.