Hi,

I am developing a PHP program to send SMS messages that contain non-ASCII characters (Tamil, French, etc.).

I use cURL to call the SMS HTTP API. I have been struggling with this for almost a month and it is driving me nuts.

I have a form as below:

<html>
<title>Unicode testing</title>
<form action="send.php" method="POST" name="adminForm" id="adminForm" enctype="multipart/form-data">
<textarea name="message" rows="10" cols="30"></textarea>
<input type="submit" value="Send">
</form>
</html>

It is a very simple form with one text area where I type the message (in Tamil, French, English, etc.). Once I hit the Send button, it invokes the PHP program below:

<?php

$message = $_POST['message'];
$url = "http://api.clickatell.com/http/sendmsg?user=user&password=pppp&from=JEEMA&api_id=11111&unicode=1&to=9444871902";
$url .= "&text=".urldecode($message);

$ch = curl_init();
curl_setopt($ch, CURLOPT_VERBOSE, 1);

curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER,array ("Content-Type: text/xml; charset=utf-8"));
$contents = curl_exec ($ch);

curl_close ($ch);
$contents = ltrim(rtrim(trim(strip_tags(trim(preg_replace ( "/\s\s+/" , " " , html_entity_decode($contents)))),"\n\t\r\h\v\0 ")), "%20");

echo $contents;
?>

I get the error "ERR: 116, Invalid Unicode data".

Could someone please help me?

    Two questions, the first being: Is the page you output the HTML form on set to UTF-8 encoding (either in a <meta> tag, HTTP header, or - preferably - both)?

    Second, why are you trying to use urldecode() on the POST'ed message? You should be encoding it, not decoding it...
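
    To illustrate the second point, here is a minimal sketch of percent-encoding the query parameters instead of decoding them. The helper name build_sms_url and the example.test host are made up for illustration; http_build_query() and PHP_QUERY_RFC3986 are standard PHP. This only covers URL encoding, not whatever format the gateway expects for Unicode text.

    ```php
    <?php
    // Hypothetical helper: percent-encodes every parameter value and
    // appends the resulting query string to the base URL.
    function build_sms_url(string $base, array $params): string
    {
        // PHP_QUERY_RFC3986 encodes spaces as %20 instead of "+".
        return $base . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
    }

    // "\u{00E9} \u{00E0}" is the UTF-8 string "é à" (PHP 7+ escape syntax).
    $url = build_sms_url('http://example.test/send', [
        'to'   => '9444871902',
        'text' => "\u{00E9} \u{00E0}",
    ]);
    echo $url, "\n";
    ```

    In send.php the 'text' value would come from $_POST['message'] rather than a literal.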

      Hi,

      First, thanks for your valuable reply. I included the header and <meta> tag as below:

      <?php
      header('Content-Type: text/html; charset=utf-8'); 
      ?>
      
      <html>
      <title>Unicode testing</title>
      <head>
      <meta http-equiv="content-type" content="text/html; charset=utf-8" />
      </head>
      <form action="send.php" method="POST" name="adminForm" id="adminForm" enctype="multipart/form-data">
      <textarea id="txt" name="message" rows="10" cols="30"></textarea>
      <input type="submit" value="Send">
      </form>
      </html>

      I also tried urlencode in my send.php. The same problem still appears.

      I think I need to convert the characters the user typed into Unicode hex codes before submitting them to the gateway, because Clickatell has its own Unicode converter. For example, when I type é à, it generates the string 00e9002000e0, and this string is accepted in the API URL.

      Any ideas how to convert user-typed characters to Unicode?

        malaiselvan;10971133 wrote:

        Hi,

        Any ideas how do I convert a user typed chars to UNICODE...

                function utf8_to_unicode_code($utf8_string)
                {
                    $expanded = iconv("UTF-8", "UTF-32", $utf8_string);
                    return unpack("L*", $expanded);
                }
        

          Hi,

          Thanks for your code.

          I tried your code for the text "é à". It is returning an Array of integer values.

          Your Code returns

          Array ( [1] => -131072 [2] => -385875968 [3] => 536870912 [4] => -536870912 )

          But ClickaTell unicode converter returns

          00e9002000e0

          Any ideas what I am missing?
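
          A likely explanation for those numbers, offered as a sketch rather than a definitive diagnosis: -131072 is 0xFFFE0000 read as a signed 32-bit integer, which is what a big-endian byte order mark looks like when unpacked in little-endian order, and the other three values are the expected code points byte-swapped the same way (0xE9000000, 0x20000000, 0xE0000000). Asking iconv for an explicit "UTF-32BE" (which writes no BOM) and unpacking with the big-endian format "N*" avoids both problems. The function name utf8_to_code_points is hypothetical.

          ```php
          <?php
          // Hypothetical variant of the function above with explicit byte order.
          function utf8_to_code_points(string $utf8_string): array
          {
              // UTF-32BE: fixed big-endian order, no byte order mark prepended.
              $expanded = iconv('UTF-8', 'UTF-32BE', $utf8_string);
              // "N*" reads unsigned 32-bit big-endian integers. unpack() returns
              // a 1-based array, so array_values() reindexes it from 0.
              return array_values(unpack('N*', $expanded));
          }

          // "\u{00E9} \u{00E0}" is the UTF-8 string "é à".
          $hex = '';
          foreach (utf8_to_code_points("\u{00E9} \u{00E0}") as $code_point) {
              $hex .= sprintf('%04x', $code_point);
          }
          echo $hex, "\n"; // 00e9002000e0
          ```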

            This is the best I can come up with at the moment:

            function hex_chars($data){
            	$mb_hex = '';
            	for($i = 0 ; $i<mb_strlen($data,'UTF-8') ; $i++){
            		$c = mb_substr($data,$i,1,'UTF-8');
            		$o = unpack('N',mb_convert_encoding($c,'UCS-4BE','UTF-8'));
            		$mb_hex .= '00'.hex_format($o[1]);
            	}
            	return $mb_hex;
            
            }
            function hex_format($o){
            	$h = strtoupper(dechex($o));
            	$len = strlen($h);
            	if($len%2==1)
            		$h = "0$h";
            	return $h;
            }
            
            echo hex_chars('é à'); //00E9002000E0
            
            

            It's modified from a user note on the ord() manual page.

              Hi,

              I think we are almost there. Thanks for your help.

              I don't know what the significance of adding '00' as a prefix is.

              I tried with a simple symbol "€"

              ClickaTell converter returns
              20ac

              But your code returns
              0020AC

              I am not sure why Clickatell adds 00 as a prefix for é but not for €.

              Any clues?

                <?php
                function hex_chars($data){
                	$mb_hex = '';
                	for($i = 0 ; $i<mb_strlen($data,'UTF-8') ; $i++){
                		$c = mb_substr($data,$i,1,'UTF-8');
                		$o = unpack('N',mb_convert_encoding($c,'UCS-4BE','UTF-8'));
                		$mb_hex .= sprintf('%04X',$o[1]);
                	}
                	return $mb_hex;
                
                }
                
                
                echo hex_chars('€'); //20AC
                echo hex_chars('é à'); //00E9002000E0
                
                
                ?>

                Unicode code points are conventionally written with capital A-F, but if you need lower case, change X to x in the sprintf() format.
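
                One caveat, with a hedged alternative: the sample outputs quoted in this thread (00e9002000e0, 20ac) look like hex-encoded UTF-16BE, which is an assumption since the gateway's format is not spelled out here. If that assumption holds, converting the whole string in one call gives the same result for these samples and also produces surrogate pairs for characters above U+FFFF, where sprintf('%04X') on the code point would emit five or more digits. The function name utf16be_hex is made up for this sketch, and it needs the mbstring extension.

                ```php
                <?php
                // Hypothetical helper: UTF-8 in, upper-case hex of the UTF-16BE bytes out.
                function utf16be_hex(string $utf8): string
                {
                    // Convert the whole string at once, then hex-encode the raw bytes.
                    return strtoupper(bin2hex(mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8')));
                }

                echo utf16be_hex("\u{20AC}"), "\n";          // 20AC         (the euro sign)
                echo utf16be_hex("\u{00E9} \u{00E0}"), "\n"; // 00E9002000E0 ("é à")
                echo utf16be_hex("\u{1F600}"), "\n";         // D83DDE00     (surrogate pair)
                ```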

                  You guys are awesome! I never expected to get this much response in this forum.

                    3 years later

                    Hello everybody!
                    I have exactly the same problem.
                    I tried the solution given by Dagon, but I always get error 116, Invalid Unicode data.
                    I've tried everything. I placed Dagon's code

                    <?php
                    function hex_chars($text){
                        $mb_hex = '';
                        for($i = 0 ; $i<mb_strlen($text,'UTF-8') ; $i++){
                            $c = mb_substr($text,$i,1,'UTF-8');
                            $o = unpack('N',mb_convert_encoding($c,'UCS-4BE','UTF-8'));
                            $mb_hex .= sprintf('%04X',$o[1]);
                        }
                        return $mb_hex;
                    }

                    echo hex_chars('€'); //20AC
                    echo hex_chars('é à'); //00E9002000E0

                    ?>

                    at the end of my form's page and at the beginning of the send.php page, but I still get the error.
                    Do I have to modify malaiselvan's original send.php code?

                    Any help is really welcome!

                    Best regards and thank you in advance,
                    Thommen
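
                    On the send.php question: the snippet above only echoes the conversion of two literal strings, so on its own it does not change what is sent to the gateway. A rough sketch of how the conversion might be wired into the request follows. It assumes the gateway accepts the hex string in the text parameter when unicode=1 is set, as described earlier in the thread; build_unicode_sms_url and send_request are hypothetical names, and the credentials are the placeholders from the original post.

                    ```php
                    <?php
                    // Same conversion as Dagon's hex_chars(), repeated so the sketch is self-contained.
                    function hex_chars(string $text): string
                    {
                        $mb_hex = '';
                        $length = mb_strlen($text, 'UTF-8');
                        for ($i = 0; $i < $length; $i++) {
                            $c = mb_substr($text, $i, 1, 'UTF-8');
                            $o = unpack('N', mb_convert_encoding($c, 'UCS-4BE', 'UTF-8'));
                            $mb_hex .= sprintf('%04X', $o[1]);
                        }
                        return $mb_hex;
                    }

                    // Hypothetical helper: adds unicode=1 and the hex-converted text to the query string.
                    function build_unicode_sms_url(string $message, array $credentials): string
                    {
                        $params = $credentials + ['unicode' => 1, 'text' => hex_chars($message)];
                        return 'http://api.clickatell.com/http/sendmsg?' . http_build_query($params);
                    }

                    // Hypothetical sender. Not called here, so the sketch makes no network request.
                    function send_request(string $url)
                    {
                        $ch = curl_init($url);
                        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                        $response = curl_exec($ch);
                        curl_close($ch);
                        return $response;
                    }

                    // In send.php this would be $_POST['message']; a literal ("é à") keeps the sketch standalone.
                    $message = "\u{00E9} \u{00E0}";
                    $url = build_unicode_sms_url($message, [
                        'user' => 'user', 'password' => 'pppp', 'from' => 'JEEMA',
                        'api_id' => '11111', 'to' => '9444871902',
                    ]);
                    echo $url, "\n";
                    ```

                    The form page itself would not need the function; only the script that builds the request does.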
