I have a problem returning DB results when input string that contains accents. Examples of unique names in my DB are akuéson, Church´s and Borås.

  • For the purpose of my problem, I'm not including "sql injection" prevention.

  • I'm currently not able to extend my DAL class to PDO or mysqli

  • DB encoding: UTF-8

  • Table encoding: UTF-8

  • HTML page encoding: UTF-8

  • I can add records (from form input) where names uses accents (e.g. "Fióre`s Häßelberg")

  • I can read records with accents when using -> "name LIKE '$labelName%'"

  • I can NOT read records with accents when using -> "name = '$labelName'" (0 rows returned)

Here's my code (simplified):

class DAL { 
  protected $username;  
protected $pwd;
protected $host; protected $dbname; private $conn;
private $db; private $query_result; // the stored result of the last query you ran public function __construct($cfg_file = 'nf.config') { $config = parse_ini_file($cfg_file); $this->username = $config['db_user']; //... } public function connect() { //... } private function query($sql) { $this->query_result = mysql_query($sql, $this->conn)
} public function getLabelData($labelName) { $sql = "SELECT * FROM sl_label WHERE name = '$labelName'"; $this->query($sql); return $this->query_result;
} public function getSearchResult($labelName) { $sql = "SELECT name FROM sl_label WHERE name LIKE '$labelName%'";
$this->query($sql); return $this->query_result; } }

Outputting the $sql just before execution of $this->query($sql) gives this result:

SELECT * FROM sl_label WHERE name = 'Church´s'

Analyzing the mysql query log afer execution, the query looks like this:

1 Connect root@localhost on
1 Init DB norwegianfashion
1 Query SELECT id FROM sl_label WHERE name = 'Church´s'
1 Quit

If I analyze the encoding of the input string using mb_detect_encoding (and the input string is "Church´s"), I get the following results:

mb_detect_encoding($labelName) = UTF-8
$labelName = Church´s

If I encode the input string using utf8_encode($labelName), and then display encoding and name, I get this result:

mb_detect_encoding($labelName) => UTF-8
$labelName = Church´s

Here's something interesting with SET NAMES
If I add the following just after connection "mysql_query("SET NAMES 'utf8'");", all names with accents turnes in to gibberish.

Fióres Häßelberg -> Fióres Häßelberg

If I run the following directly in my DB interface tool (Toad for MySQL), I get normal results:

SET NAMES 'utf8';
SELECT name FROM sl_label;

In 2006, Derek Sivers write an extensive article regarind this issue: http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html But I'm not being very successfull in fixing the issues the way he does.

I really banging my head against the wall here. Any help appreciated!!

    run
    SET NAMES 'utf8';
    before your select query

      Did you read the last part of the post?

      As I wrote:

      If I add the following i just after connection "mysql_query("SET NAMES 'utf8'");", all names with accents turnes in to gibberish.

      That means If I do this, all results with accents (æ,ø,å,é, ä etc.) is unreadable in the output.

      Note that If I do "SET NAMES 'utf8'; SELECT name FROM sl_label;" directly in the qury window in the database, I get normal results.

        I 'might' have to look at something like this at some point, so I'm interested and read that article, but I have actually no experience....so sorry if my suggestion/question won't bring you far, but:

        Did you try the stuff suggested in one of the replies?

        httpd.conf:
        AddCharset UTF-8 .utf8
        AddDefaultCharset UTF-8

        php.ini
        default_charset = "utf-8"

        my.cnf
        character-set-server=utf8
        default-collation=utf8_unicode_ci

        Then, if you're writing PHP scripts, soon after opening your connection to mysql, issue one of the following:
        SET NAMES 'utf8';

        Bjom

          Bjom;10918532 wrote:

          Did you try the stuff suggested in one of the replies?

          Hi Bjorn.
          I'm not sure what suggestion in what reply you are reffering to (since it's not a reply to this post (?) )

          And I'm beginning to wonder if some of mye text / replies are not showing - since I've answered this question twice.

          Yes, I've tested "SET NAMES 'utf8'" just after I open the sql connection.
          As I write in my first post, all results from select statements are returned with gibberish. For example: "Fióres Häßelberg" becomes "Fióres Häßelberg".

          I've even tried exporting the table data and inputted it in a editor using utf8, then emporting the same data to the DB. That doesn't help either.

          Using phpMyAdmin or Toad for MySQL, I can run "SET NAMES 'utf8'" and get readable results.

          I'm not able to edit those config files on the production servers.

            UPDATE:

            Just adding "SET NAMES 'utf8'" was not enough. That only gave me gibberish back.
            Addingthe following query returns readable text: "COLLATE='utf8_danish_ci'".

            But I'm still not able to return any results from PHP with the following line:

            mysql_query(SELECT id FROM sl_label WHERE name = '911 brand æøå 2');

              Just a quick note ... might help.

              I had similar problems trying to store/retrieve russian characters. Here's the solution I found (can't remember where now and to be honest, I don't know exactly why it works when what you've already done doesn't).

              First off, make sure the actual fields that contain the characters have the desired collation ... utf8_danish_ci (it doesn't seem to matter what collation the table has, just the field)

              Second, set the 'MySQL connection collation' to the desired value. (I did this with phpMyAdmin - not sure how you do it otherwise)

              Finally, include the following snippet on any page that requires db access ...

              // Make sure any results we retrieve or commands we send use the same charset and collation as the database:
              $db_charset = mysql_query( "SHOW VARIABLES LIKE 'character_set_database'" );
              $charset_row = mysql_fetch_assoc( $db_charset );
              mysql_query( "SET NAMES '" . $charset_row['Value'] . "'" );
              unset( $db_charset, $charset_row );

              I've left a comment in that might make sense of what's going on.

              I also set the charset twice on the page ...

              header('Content-Type: text/html; charset=utf-8');

              AND

              <meta http-equiv="content-type" content="text/html; charset=utf-8" /> 

              but I'm probably overcooking it there!

              As I say, I'm not 100% if this is going to work in your circumstance, but it did in mine.

              Hope that helps!

              P

                Thanks Paul.

                I've kind of given up and come to a compromise.

                I'll be converting accents with htmlentities when storing data, and html_entity_decode when retrieving data from the DB. That seems to work.

                The only drawback I see so far, is that I can't read the names in clear text using phpMySQL or Toad for MySQL.

                  Write a Reply...