[RESOLVED] Convert a character to decimal NCR?

BillKat · Feb 24, 2009

We have a fully-UTF-8 system, and a client has asked for output with certain characters as their NCR equivalent.

e.g. © as & #169;

Is there a way to do this in PHP, either on a string or character basis?

Thanks all.

bradgrafelman · Feb 24, 2009

What does "certain characters" mean? Which ones?

Example, this code:

$string = 'Hello world, © 2008!';
$string = preg_replace('/([^a-z0-9 ])/ie', '"&#" . ord("$1") . ";"', $string);

echo $string;

outputs:

Hello world&#44; &#169; 2008&#33;
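Note for anyone running this on a newer PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7. A minimal sketch of the same byte-wise replacement using preg_replace_callback() might look like the following. It still calls ord() on single bytes, so it is only meaningful for single-byte input (ASCII or Latin-1); the sample string is ASCII-only for that reason.

```php
<?php
// Byte-wise replacement, same idea as the snippet above but without /e.
// Assumption: the input is a single-byte encoding (ASCII or Latin-1).
$string = 'Hello world, 2008!';
$string = preg_replace_callback(
    '/[^a-z0-9 ]/i',
    function (array $m): string {
        // $m[0] is the single matched byte
        return '&#' . ord($m[0]) . ';';
    },
    $string
);

echo $string, "\n"; // Hello world&#44; 2008&#33;
```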

BillKat · Feb 25, 2009

Thanks for that. They've given us a list of 201 or so characters, £ sign, alpha, beta and so on.

I got as far as this for testing:

$output = '';
for ($i=0; $i<strlen($string); $i++) {
	$output .= "&#" . ord($string[$i]) . ";";
}

But both that and yours give mangled output in the browser: an input string of "£ xxx © ¤ Σ α ß" is displayed as "Â£ xxx Â© Â¤ Î£ Î± ÃŸ".

E.g. the pound sign has become & #194;& #163;
instead of the expected & #163;
(gaps deliberate)

Is there something about these particular characters that ord() doesn't like?

BillKat · Feb 25, 2009

Just found a statement that ord() breaks on Unicode, so I'm off to do more research. Thanks again for the help.

edit: Found this package, which seems to do the job well: http://hsivonen.iki.fi/php-utf8

So we're on the way to a full solution.
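For comparison, here is a rough sketch of the same idea without an external package, assuming PHP 7 or later and valid UTF-8 input. The function names utf8_codepoint() and to_decimal_ncr() are made up for this illustration and are not part of the linked package. It converts every non-ASCII character, not just a client-supplied list, so treat it as a starting point only. If the mbstring extension is available, mb_ord() (PHP 7.2+) could stand in for the hand-rolled decoder.

```php
<?php
// Decode one UTF-8 encoded character (1 to 4 bytes) into its code point.
// Hypothetical helper; assumes $char is a single, valid UTF-8 character.
function utf8_codepoint(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0]; // plain ASCII
    }
    // The lead byte carries ($count + 1) marker bits; keep only the payload bits.
    $codepoint = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        // Each continuation byte contributes its low 6 bits.
        $codepoint = ($codepoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codepoint;
}

// Replace every non-ASCII character with its decimal NCR.
function to_decimal_ncr(string $text): string
{
    // The /u modifier makes PCRE match whole UTF-8 characters, not bytes.
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function (array $m): string {
            return '&#' . utf8_codepoint($m[0]) . ';';
        },
        $text
    );
    if ($result === null) {
        // preg_replace_callback() returns null on error, e.g. malformed UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

echo to_decimal_ncr("£ xxx © ¤ Σ α ß"), "\n";
// &#163; xxx &#169; &#164; &#931; &#945; &#223;
```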

Weedpacket · Feb 25, 2009

Well, that's because the file is UTF-8, which means that £ is stored as two bytes, 0xC2 0xA3, and ord() only works on a byte-by-byte basis. Those two bytes just happen to look like Â£ when interpreted as Latin-1.
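A quick way to see this for yourself, as a small sketch (the "\u{...}" escape needs PHP 7 or later; a literal £ in a UTF-8 source file behaves the same way):

```php
<?php
$pound = "\u{00A3}"; // "£" encoded as UTF-8

echo strlen($pound), "\n";                       // 2 (bytes, not characters)
echo bin2hex($pound), "\n";                      // c2a3
echo ord($pound[0]), ' ', ord($pound[1]), "\n";  // 194 163
```

Those are the same 194 and 163 that showed up in the mangled output.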
If the table is fixed, you could just write it up by hand:

$mapping = array(
"£" => "&#163;",
"©" => "&#169;",
"Σ" => "&Sigma;", // I'm guessing here: what does "NCR" mean?
// ...
);

$output = str_replace(array_keys($mapping), array_values($mapping), $input);

Provided you use UTF-8 for that file, the keys will be the right byte sequences.
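One caveat with str_replace() and arrays: it applies the pairs one after another, so a later pair can match text that an earlier pair just inserted (for example, if "&" or "#" ever ends up in the table). A sketch of the same lookup with strtr(), which makes a single pass and never re-scans replaced text, could look like this. The three entries are only an illustrative excerpt, not the client's actual list.

```php
<?php
// Hypothetical excerpt of the mapping table; the real list is much longer.
// The file must be saved as UTF-8 so the keys are the right byte sequences.
$mapping = [
    "£" => "&#163;",
    "©" => "&#169;",
    "Σ" => "&#931;",
];

$input = "£ xxx © Σ";

// strtr() with an array walks the input once and tries the longest keys first.
$output = strtr($input, $mapping);

echo $output, "\n"; // &#163; xxx &#169; &#931;
```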

BillKat · Feb 27, 2009

Thanks, looks good. NCR = Numeric Character Reference.
