I have an app that writes user input to a database. In most cases users are just pasting data into a textarea, and I've noticed that whenever they use an em dash (I assume pasted from Word), it causes problems when the data is reloaded for editing. The reload is done using AJAX, and the JS hangs in Internet Explorer when these em dashes are present. In Safari or FF it doesn't hang, but it doesn't display the em dash properly.

I assume that the issue is the page character encoding. If I look at the same data through phpMyAdmin it's fine. I have replicated the utf8 encoding that is used by phpMyAdmin but still no luck. I've also tried doing a replace for all the possible em dash encodings (–, \x97, —, —, —, �, &#8195;) but it's not actually replacing the character.

Anyone have any ideas of how I can encode for these em dashes properly, or str_replace them with a normal hyphen so it's not a problem?

Thanks,
Gord

PS-I have the same problem with curved single and double quotes, but I have managed to str_replace those using their \x code.

    You could try this, taken from here, with "-" changed to "--" (or you could try changing it to "&mdash;"):

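    <?php

    // Replace Windows-1252 "smart" punctuation bytes with plain ASCII.
    function convert_smart_quotes($string)
    {
        $search = array(chr(145),  // left single quote
                        chr(146),  // right single quote
                        chr(147),  // left double quote
                        chr(148),  // right double quote
                        chr(151)); // em dash

        $replace = array("'",
                         "'",
                         '"',
                         '"',
                         '--');

        return str_replace($search, $replace, $string);
    }

    ?>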
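
If the text is already stored as UTF-8, the single-byte chr() values above won't match, because in UTF-8 each of these characters is a three-byte sequence. That may be why the em dash isn't being caught. Here is a minimal sketch of a UTF-8 variant, assuming the input really is valid UTF-8. The function name is made up for illustration, and the en dash entry is an extra that the original function doesn't handle:

```php
<?php

// Sketch only: replace UTF-8 encoded "smart" punctuation with plain ASCII.
// convert_smart_punctuation_utf8 is a hypothetical name, not from the post above.
function convert_smart_punctuation_utf8($string)
{
    // Keys are the UTF-8 byte sequences for each Unicode code point.
    $map = array(
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
        "\xE2\x80\x93" => '-',   // U+2013 en dash
        "\xE2\x80\x94" => '--',  // U+2014 em dash
    );

    // strtr() with an array replaces every key in a single pass.
    return strtr($string, $map);
}
```

If the data might arrive in either encoding, running it through both this and convert_smart_quotes() is not safe as-is, since bytes like \x97 and \x94 also occur inside the UTF-8 sequences. Apply the UTF-8 version first in that case.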
    

      Thanks for the post NogDog. For some reason those ASCII codes don't seem to be catching the em dash properly, but I think character encoding must have solved the problem because I can't replicate it right now. I added the following to the page and that seems to have helped:

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

      I don't know why I didn't have that there before, since I usually encode all pages with utf-8. The only thing I wonder now is whether pasting in text with some strange font could be causing the issue... time will tell if the problem persists, but I can't replicate it as of now.
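
One way to guard against mixed encodings is to normalize everything to UTF-8 before it is saved. This is only a sketch: it assumes that anything which isn't valid UTF-8 was pasted as Windows-1252, the helper name to_utf8 is hypothetical, and it relies on the iconv extension being available:

```php
<?php

// Sketch only: return $text as UTF-8, converting from Windows-1252
// when it is not already valid UTF-8. to_utf8 is a hypothetical helper.
function to_utf8($text)
{
    // With the /u modifier, preg_match() returns false on invalid UTF-8
    // and 1 when the (empty) pattern matches a valid UTF-8 subject.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }

    // Assumption: non-UTF-8 input is Windows-1252 (CP1252).
    $converted = @iconv('CP1252', 'UTF-8', $text);

    // If the conversion fails, fall back to the unchanged input.
    return $converted === false ? $text : $converted;
}
```

Running input through something like this before saving, and calling header('Content-Type: text/html; charset=utf-8') in the script that answers the AJAX request, should help keep the page, the response, and the database on the same encoding.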
