Why cannot PHP convert letters as ÆØÅ to lowercase æøå or uppercase when charset is UTF8?

Seem to apply ONLY if one of Æ Ø Å is first letter in string. 😕

The problem occur when i search a database/tables UTF8_UNICODE_CI, with something as

SELECT DISTINCT id FROM TABLE WHERE 1 AND CONCAT(field_one,field_two,field_more,'') LIKE '%Øy%' AND ... ORDER BY ... DESC LIMIT 0 , 10;

The "Øy" gives result while "øy" give none results.

The same problem occur with german Ü and ü and many other letters. Searching e.g. "Bjørn" is OK tough....

Is the problem related PHP or MySQL or both?

What i have tested:
1) setlocale(LC_ALL, 'no_NO');
2) mb_strtoupper(); with MySQL query and tables as UPPER()
3) A First upper function:

function firstUpper($string)
{
    $string = str_replace(array("Æ","Ø","Å"), array("æ","ø","å"), strtolower($string));
    $ord = explode(" ", $string);

$return = "";
foreach ($ord as $val)
{

    $return .= " " . str_replace(array("æ","ø","å"), array("Æ","Ø","Å"), strtoupper($val{0})) . substr($val,1,strlen($val)-1);
}
return $return;
}

4: mysql_query ("SET NAMES utf8");

When I search a Case-insensitive database, it sould not matter if the letters are "Å" or "å", or "M" is "m".

Strange... :xbones: :xbones:

    How about definining/altering the table column to be "CHARACTER SET utf8 COLLATE utf8_general_ci"?

      No. All tables and the table-data are utf8_unicode_ci.

        Hmm...as far as I can see from the manual, LIKE should be case-insensitive unless either of the arguments is not a non-binary string, and CONCAT() should be a non-binary string unless any of its arguments is a non-binary string. So the only thing I can think of is that at least one of the arguments to your CONCAT() is a dreaded non-binary string (maybe a NULL?), in which case you would need to cast it:

        . . . CONCAT(CAST(field_one AS CHAR), CAST(field_two AS CHAR), CAST(field_more AS CHAR, '') LIKE '%Øy%' . . .
        

          There are no NULLs. I tested CAST() but i did not work 🙁 I have also tested COLLATE latin1_danish_ci

          Strange 😕

          I have run the query in PHPMyAdmin as well, just to check my PHP script, but no change.

          Even

          SELECT *
          FROM `table`
          WHERE `field` LIKE '%øy%'
          LIMIT 0 , 30  

          Give no results. I don't know. The text is stored in UTF-8. Øy" give result while "øy" give none results.

            It's mutch similiar to this:
            http://bugs.mysql.com/bug.php?id=30500

            Settings:

            character set client utf8
            character set connection utf8
            character set database utf8
            character set filesystem binary
            character set results utf8
            character set server utf8
            character set system utf8
            collation connection utf8_unicode_ci
            collation database utf8_general_ci
            collation server utf8_general_ci
            init connect SET NAMES utf8

            CREATE TABLE `table_test` (
              `field_one` text collate utf8_unicode_ci NOT NULL
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
            INSERT INTO `table_test` VALUES ('Øy');
            
            $q = "øy"; // as UTF-8..
            $result = mysql_query("SELECT * FROM table_test WHERE field_one LIKE '%$q%'");
            echo mysql_num_rows($result);
            SELECT *
            FROM `table_test`
            WHERE `field_one` LIKE '%øy%'
            LIMIT 0 , 30 

            Results: 0

            SELECT *
            FROM `table_test`
            WHERE `field_one` LIKE '%Øy%'
            LIMIT 0 , 30 

            Results: 1

            I got same results on another server with this settings:

            character set client utf8
            character set connection utf8
            character set database latin1
            character set results utf8
            character set server latin1
            character set system utf8
            collation connection utf8_unicode_ci
            collation database latin1_swedish_ci
            collation server latin1_swedish_ci

            Wildcard huh ? :rolleyes:

              Concat wrote:

              It's mutch similiar to this:
              http://bugs.mysql.com/bug.php?id=30500

              Settings:

              CREATE TABLE `table_test` (
                `field_one` text collate utf8_unicode_ci NOT NULL
              ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
              INSERT INTO `table_test` VALUES ('Øy');
              
              $q = "øy"; // as UTF-8..
              $result = mysql_query("SELECT * FROM table_test WHERE field_one LIKE '%$q%'");
              echo mysql_num_rows($result);
              SELECT *
              FROM `table_test`
              WHERE `field_one` LIKE '%øy%'
              LIMIT 0 , 30 

              Results: 0

              SELECT *
              FROM `table_test`
              WHERE `field_one` LIKE '%Øy%'
              LIMIT 0 , 30 

              Results: 1

              I got same results on another server with this settings:

              character set client utf8
              character set connection utf8
              character set database latin1
              character set results utf8
              character set server latin1
              character set system utf8
              collation connection utf8_unicode_ci
              collation database latin1_swedish_ci
              collation server latin1_swedish_ci

              Wildcard huh ? :rolleyes:

              Now i found a working solution,

              $q = mb_convert_case($string, MB_CASE_TITLE, "utf-8");

              😃

              It do like ucwords() and make ø to Ø and å to Å. It's good enough for me.

                6 months later

                I had similar problems with str_replace and æøå.
                It seems that str_replace simply does not work with æ as search value.

                I changed to ereg_replace() and that did the trick for me 🙂

                  Write a Reply...