If you read HTML text (from a database or a file) containing character entities such as "nbsp" (no-break space) or "Aring" (Latin capital letter A with ring above) and display it inside a textarea form field, the entities are converted to the characters they represent instead of keeping their &whatever; form.
Now, if you then use the form to save the contents of the textarea back to the database/file, these characters will still be in their converted form, which is a Bad Thing(tm). In a lot of cases you can very easily convert the characters back, but whitespace is not so easy, and neither is "lt" (less-than sign): you can't just go and replace every instance of " " and "<". And having to replace them manually every time you edit a text is really out of the question too.
So, how can this be done? I'm not particularly experienced with regular expressions, but I have a hunch this is not that easily solved with them (although it might be possible). Any help and/or thoughts would be appreciated.
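One direction that might be worth trying, rather than converting characters back after the fact, is to escape the text once more before it goes into the textarea, so the browser's decoding step gives back the original entity source. The sketch below is in Python only for illustration (the post does not say which server-side language is in use), the helper names `to_textarea` and `browser_decode` are made up, and `html.unescape` is just a rough stand-in for what a browser does with textarea content.

```python
import html


def to_textarea(stored: str) -> str:
    """Escape stored HTML source before placing it between <textarea> tags.

    Hypothetical helper: '&' becomes '&amp;', so '&nbsp;' is sent to the
    browser as '&amp;nbsp;' and is displayed (and submitted) as '&nbsp;'.
    """
    # quote=False escapes only &, < and >, which is enough for element content.
    return html.escape(stored, quote=False)


def browser_decode(markup: str) -> str:
    """Rough stand-in (an assumption) for the browser decoding entities."""
    return html.unescape(markup)


stored = "A&nbsp;&lt;&nbsp;B, &Aring;ngstr&ouml;m"

# Without escaping, the entities are lost: the form would submit raw characters.
naive = browser_decode(stored)

# With escaping, the text the form would submit matches what was stored.
safe = browser_decode(to_textarea(stored))
```

If this holds up in the real setup, no regular expressions or manual replacements should be needed, because the entities are never turned into plain characters in the first place.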