Hi there,

does anyone know the difference between these two types of mysql table collations:

utf8_general_ci and utf8_bin

Is one better than the other and if yes, can you explain why?

thanks

    utf8_bin: compare strings by the binary value of each character in the string

    utf8_general_ci: compare strings using general language rules and using case-insensitive comparisons

    utf8_general_cs: compare strings using general language rules and using case-sensitive comparisons

    For example, the following will evaluate at true with either of the UTF8_general collations, but not with the utf8_bin collation:

    Γ„ = A
    Γ– = O
    Ü = U
    

    With the utf8_general_ci collation, they would also return true even if not the same case.

    None of these is inherently "better"; they simply have different functionalities. You need to choose which one best suits your particular needs.

      thanks for your very clear answer πŸ™‚

      which makes me think to another question:

      what are the consequences if utf8_bin collation doesn't evaluate at true Γ„ = A

      I mean, will it change the way my php script work if I use some sort or str_replace functions?
      For example, does it mean these two tests below will return the same result?

      $string = 'Γ„'; // result that would come from my table
      
      str_replace("Γ„","foo",$string); 
      
      str_replace("A","foo",$string);

      thanks

        It only changes how MySQL will process queries; it has no effect on what PHP does. So if the WHERE clause of a query says "WHERE first_name = 'Bob'", the different collations would return matches for first_name values of:

        'Bob' : utf8_bin, utf8_general_ci and utf8_general_cs
        'BΓΆb' : utf8_general_ci and utf8_general_cs
        'BΓ–B' : utf8_general_ci

        Again, this is tranparent to PHP. It only affect how MySQL performs string comparisons while processing queries.

          May I ask your opinion on something? I'm currently developping a blog application in which users could perform a search in blogs posts or post titles. In this case, it would then make sense to use utf8_general_ci, don't you think?
          I'm looking for kind of a best practice for this thing, so you opinion is highly welcome πŸ™‚

            deezerd wrote:

            May I ask your opinion on something? I'm currently developping a blog application in which users could perform a search in blogs posts or post titles. In this case, it would then make sense to use utf8_general_ci, don't you think?
            I'm looking for kind of a best practice for this thing, so you opinion is highly welcome πŸ™‚

            Sounds good to me.

              I'd just like to say thanks as well to NogDog for that explaination too. πŸ™‚
              Cheers!

                Many thanks, it's all clear for me now

                  Write a Reply...