Looking for any PHP pros who know of a way to delete hidden control characters from a text file? I am using an Excel reader class which outputs a hidden control character, the inverted question mark, in between each character. I have posted on their forum about this problem, but have received no response.
I have tried a string replace using the ASCII value:

str_replace(chr(168),'',$str);  

but that doesn't work.
I have also tried trim() using the hex value of the char, but no luck there. Any ideas would be appreciated.
Thanks
PD

    How about just doing this:

    str_replace('¿', '', $str);

    Would that work or no?

    ~Brett

      Brett, I tried that one first, then I tried it like this
      to delete all uncommon ASCII chars, but no luck.

      for($i=127;$i < 255;$i++){
              $st_test = str_replace(chr($i),'',$str);
      }
      

      Then I tried it using trim, and still no luck.

      trim($str,"\x00..\x1F");
      

      I think the issue is that they are invisible to PHP string recognition. I am looking at the multibyte functions such as mb_ereg_replace, but that requires a recompile. I would prefer a solution that does not involve recompiling, since the multibyte functions may not work either.
      PD
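
A note on the two attempts above, as a minimal sketch rather than a confirmed fix. The loop assigns each result to $st_test but always reads from the unchanged $str, so only the last replacement survives, and the range 127 to 254 never reaches control bytes below 32. trim() removes characters only from the two ends of a string. The sample string, with a NUL byte between letters, is an assumption about what the file contains.

```php
<?php
$str = "\x00H\x00i\x00"; // assumed sample: NUL bytes around each letter

// trim() only strips from both ends; the NUL in the middle survives.
$trimmed = trim($str, "\x00..\x1F");
echo bin2hex($trimmed), "\n"; // 480069

// Loop variant: reassign the same variable so replacements accumulate,
// and cover the control range 0..31 as well as 127..255.
// Note: this also removes tabs and line breaks.
$out = $str;
foreach (array_merge(range(0, 31), range(127, 255)) as $code) {
    $out = str_replace(chr($code), '', $out);
}
echo $out, "\n"; // Hi
```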

        Hmm.... well I'm completely off here....

        I'm guessing that you've already printed the value just to double check what you're searching for.....

        Have you tried using the 'mb_string' decoder rather than the 'iconv' conversion?

        $object = new Spreadsheet_Excel_Reader();
        $object->setOutputEncoding('mb');

        That may be the solution to all your troubles....

        ~Brett

          You probably want everything to be UTF-8 internally (otherwise you will get utterly confused).

          Then you can use preg_replace with the /u modifier which handles UTF-8 patterns and strings.

          mbstring is still needed though because functions like strlen() don't return the right result on UTF-8 strings, so you need mb_strlen() instead.

          Mark
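
A small sketch of the approach Mark describes, assuming the string is already valid UTF-8. The /u modifier belongs to PCRE, so preg_replace can use it whether or not mbstring is compiled in; only the character count needs mb_strlen(), which is guarded here. The sample string is made up for illustration.

```php
<?php
// Assumed sample: UTF-8 text with a stray NUL and a BEL control character.
$str = "caf\u{00E9}\x00 ol\u{00E9}\x07";

// \p{Cc} matches Unicode control characters (this includes tab, CR and LF).
// The /u modifier makes PCRE treat pattern and subject as UTF-8.
// preg_replace() returns null if the subject is not valid UTF-8.
$clean = preg_replace('/\p{Cc}/u', '', $str);

echo $clean, "\n";          // café olé
echo strlen($clean), "\n";  // 10, because strlen() counts bytes
if (function_exists('mb_strlen')) {
    echo mb_strlen($clean, 'UTF-8'), "\n"; // 8, counts characters
}
```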

            I have tried that one and several other encodings such as 'UTF-16LE', 'UTF-8', and 'ISO-8859-1', and all produced the same result: the inverted question mark in between each char.
            To see this, I write the extracted cells to a file and then view the file with invisible chars shown. I originally found this issue when inserting this information into my db, where only the first char of each string was inserted. I can insert the data into a field of a text type, but then searching for the data becomes a problem, and the data is deceiving: it shows one thing while actually being something else.
            The text editor I use, BBEdit, has a feature called Zap Gremlins, which deletes all non-ASCII chars, including those inverted question marks. The data can then be imported into the db, so if there were a way to 'zap the gremlins' in PHP, that would solve my problem. I have posted this issue on the Excel reader sourceforge.net forum, but haven't received any response yet. So I'm still looking.
            Thanks
            PD
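
A hedged sketch of a 'zap gremlins' step in plain PHP, needing neither mbstring nor a recompile. Both function names are hypothetical, and the sample cell is an assumption. The hex dump is there to check what the hidden bytes really are: if it shows 00 after every letter, the text may be UTF-16, which would also fit only the first char reaching the db, and converting with iconv('UTF-16LE', 'UTF-8', $str) might be a better fit than deleting bytes.

```php
<?php
// Hypothetical helper: show each byte as two hex digits, space separated.
function hex_dump_bytes(string $str): string
{
    return implode(' ', str_split(bin2hex($str), 2));
}

// Hypothetical helper: delete every byte that is not printable ASCII,
// tab, line feed or carriage return. Accented letters are removed too.
function zap_gremlins(string $str): string
{
    return preg_replace('/[^\x20-\x7E\t\r\n]/', '', $str);
}

$cell = "A\x00B\x00C\x00"; // assumed sample: NUL after each letter
echo hex_dump_bytes($cell), "\n"; // 41 00 42 00 43 00
echo zap_gremlins($cell), "\n";   // ABC
```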

              Mark, are you saying that I will need the multibyte string functionality compiled in so that preg_replace can read the UTF-8 encoding? And if the mb string functions are available, then shouldn't I use the mb_ereg_replace() function instead?
              PD
