I've written a calendar system in PHP that works great, and it includes an RSS feed script as well.
Unfortunately, my users will occasionally copy/paste from a Word doc into the textarea form field to create entries. Most of the time this is fine, BUT... Word replaces some characters with typographic (non-ASCII) versions, and those are breaking my RSS feeds. The troublesome ones are the ellipsis and both single and double quotes. For example, if you type three periods together in Word, it replaces them with a single ellipsis character.
I need to scan the input for these characters and replace them before inserting the text into my DB (and I also need to check output from the DB to replace any that are already in there).
Does anyone know if a function similar to htmlspecialchars(), htmlentities(), etc. already exists for this? Failing that, I assume I'll need to write my own function that checks for these characters and replaces them with plain-text equivalents. Should I check the input character by character, or is there a regex that will find things like this?
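To make the question concrete, here is a rough sketch of the kind of function I have in mind. The name replace_word_chars() is just something I made up, and it assumes the text arrives as UTF-8. If the form is actually submitting Windows-1252 bytes, the map would not match and the text would need converting first. It uses strtr() with a lookup table rather than a regex or a character-by-character loop.

```php
<?php
// Hypothetical helper (not a built-in): swap Word's typographic
// punctuation for plain ASCII. Assumes $text is UTF-8 encoded.
function replace_word_chars($text)
{
    $map = [
        "\xE2\x80\xA6" => '...', // U+2026 horizontal ellipsis
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote / apostrophe
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
    ];

    // strtr() with an array replaces every key it finds in one pass
    // and never re-scans text it has already replaced.
    return strtr($text, $map);
}
```

I'd call this on the textarea value before the insert, and again on rows read back from the DB before writing the feed, so existing entries get cleaned up too.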
Thanks!