Hi,
I am currently working on an RSS feed script for an online news site. The script pulls the data from a MySQL database and formats it as RSS on a daily basis. Everything works fine except that the RSS data fails to validate because of bad characters in the data. I later found out that these unacceptable characters are non-UTF and non-ISO characters, which I suspect were pasted directly from Word into the news program's data entry form. Below are some of the characters:
& # 1 4 6; (cp1251 single quote)
& # 1 4 7; (cp1251 opening double quote)
& # 1 4 8; (cp1251 closing double quote)
(I put spaces in between because vBulletin would otherwise show them as ' and ", so read the characters as joined, without the single spaces in between.)
htmlentities() wouldn't convert these chars so I used ereg_replace():
$desc = ereg_replace ("& # 1 4 7;", "& q u o t;", $desc);
(again, ignore the in-between spaces)
Unfortunately, ereg_replace() seems to ignore the & # 1 4 7; above and does not replace it. This is on my remote Linux server. On my local Windows machine the same ereg_replace() call does work, but I need it to work on the remote Linux server.
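In case it helps anyone reproduce this, here is a minimal sketch of the same replacement done with str_replace(), which does plain string matching instead of a regular expression. The function name clean_smart_quotes is made up for illustration, the entities are written joined (no spaces) here, and I am only guessing that the data holds either the literal numeric entities or the raw single bytes 0x92 to 0x94. The raw-byte part is only safe if the text is in a single-byte encoding, not UTF-8.

```php
<?php
// Hypothetical helper (the name is mine, not from any library):
// replace Word "smart quote" debris with XML-safe entities.
function clean_smart_quotes($text)
{
    $map = array(
        // Case 1: the data stores the numeric entities literally.
        '&#146;' => '&#39;',
        '&#147;' => '&quot;',
        '&#148;' => '&quot;',
        // Case 2 (assumption): the data stores the raw single bytes.
        // Do not use this part on UTF-8 text, where these byte values
        // can occur inside multi-byte sequences.
        "\x92" => '&#39;',
        "\x93" => '&quot;',
        "\x94" => '&quot;',
    );

    // str_replace() matches literal strings, so no regex escaping is involved.
    return str_replace(array_keys($map), array_values($map), $text);
}

// Sample input standing in for a description pulled from the database.
$desc = 'He said &#147;hello&#148;';
$desc = clean_smart_quotes($desc);
```

If this variant also fails on the Linux server, that would at least suggest the stored data does not contain what I think it does.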
BTW, the mbstring extension is enabled in PHP on both my local and remote servers.
Any help is appreciated.