Hello everyone!

I'm new and have a little problem with my excel script..

For a friend i have created a little script to read XLS files... in these XLS files there are a couple of COLUMS. Each collum contains a conversion of another colum.. like

COLUM A - Dutch language
COLUM B - Thai language
COLUM C - English language

so each 3 rows there are 3 words that means the same but in other languages.

Reading these values isn't the problem... i can read the data out of the XLS sheet... but only COLUM B displays ?? ? ? ?? ? because the carakterset is wrong in anyway!

This is the script

<style>
td {
    font-family:"Angsana New";
    panose-1:2 2 6 3 5 4 5 2 3 4;
    mso-font-charset:0;
    mso-generic-font-family:roman;
    mso-font-pitch:variable;
    mso-font-signature:16777219 0 0 0 65537 0;
}
</style>
<?
function conv($string)
{
    $str = "";

for ($x = 0; $x < strlen($string); $x++)
{
    $str .= ord($x);
}
return $str;
}

$excel = new COM("excel.application") or die("Unable to instanciate excel");

//bring it to front  
$excel->Visible = 1;//NOT
//dont want alerts ... run silent  
$excel->DisplayAlerts = 1;  

//open  document  
$excel->Workbooks->Open("d:/excel.xls");  

for ($h = 1; $h <= 3; $h++)
{
    $book  = $excel->Workbooks(1);
    $sheet = $book->Worksheets($h);

    echo "<table>\n";
    //Wtite row
    for ($i = 1; $i <= 150; $i++)
    {
        echo "<tr>\n";
        //write colom
        for ($j = 1; $j <= 4; $j++)
        {
            // echo $sheet->cellsinfo();


            $cell = $sheet->Cells($i, $j); // #Select the cell (Row Column number)
            $cell->activate;               // #Activate the cell
            // echo ($cell->font->FontName);

            if($cell->value == '0')
            {
                $td = "";
            }
            else
            {
                if ($j == 2)
                {
                    // thai language so this needs to be converted in anyway!
                    $td = htmlspecialchars($cell->value);
                }
                else
                {
                    $td = $cell->value;
                }
            }

            echo "<td>" . $td . "</td>\n"; // #write the cell
        }
        echo "</tr>\n";
    }
    echo "</table>\n";
}

//closing excel  
$excel->Quit();  
?>

I've tried to search with google for help for my problem, even posted it on the usenet but didn't receive any answers that could help me solve this problem.

Also tried this..

<?php
    $excel = new COM("excel.application", NULL, CP-874) or die("Unable to instanciate excel");
?>

and other CP charsets but didn't find the correct one..

Does anyone have an idea? i hope so... would like to fix this (hopefully little problem).

Ps sorry for my bad english! 😉

Resources checked:
www.google.nl for excel.application / charset / thai / php and more..
www.phpbuilder.com/columns/alain20001003.php3
other forums and usenet..

Thanks for your time reading this topique!

    I've continued my search but google and other search machines cannot give me the correct answer... please anyone!!

      [SOLVED!] Fixed by adding this

      function utf8entities($source)
      {
          // [url]http://www.asp-php.net/ressources/codes/PHP-UTF8+entities.aspx[/url]
          // array used to figure what number to decrement from character order value
          // according to number of characters used to map unicode to ascii by utf-8
          $decrement[4] = 240;
          $decrement[3] = 224;
          $decrement[2] = 192;
          $decrement[1] = 0;
      
      // the number of bits to shift each charNum by
      $shift[1][0] = 0;
      $shift[2][0] = 6;
      $shift[2][1] = 0;
      $shift[3][0] = 12;
      $shift[3][1] = 6;
      $shift[3][2] = 0;
      $shift[4][0] = 18;
      $shift[4][1] = 12;
      $shift[4][2] = 6;
      $shift[4][3] = 0;
      
      $pos = 0;
      $len = strlen($source);
      $encodedString = '';
      while ($pos < $len)
      {
          $charPos = substr($source, $pos, 1);
          $asciiPos = ord($charPos);
          if ($asciiPos < 128)
          {
               $encodedString .= htmlentities($charPos);
               $pos++;
               continue;
           }
      
           $i=1;
           if (($asciiPos >= 240) && ($asciiPos <= 255)) // 4 chars representing one unicode character
               $i=4;
           else if (($asciiPos >= 224) && ($asciiPos <= 239)) // 3 chars representing one unicode character
               $i=3;
           else if (($asciiPos >= 192) && ($asciiPos <= 223)) // 2 chars representing one unicode character
               $i=2;
           else // 1 char (lower ascii)
           $i=1;
           $thisLetter = substr($source, $pos, $i);
           $pos += $i;
      
           // process the string representing the letter to a unicode entity
           $thisLen = strlen($thisLetter);
           $thisPos = 0;
           $decimalCode = 0;
           while ($thisPos < $thisLen)
           {
               $thisCharOrd = ord(substr($thisLetter, $thisPos, 1));
               if ($thisPos == 0)
               {
                   $charNum = intval($thisCharOrd - $decrement[$thisLen]);
                   $decimalCode += ($charNum << $shift[$thisLen][$thisPos]);
               }
               else
               {
                   $charNum = intval($thisCharOrd - 128);
                   $decimalCode += ($charNum << $shift[$thisLen][$thisPos]);
               }
      
               $thisPos++;
           }
      
           $encodedLetter = '&#'. str_pad($decimalCode, ($thisLen==1)?3:5, '0', STR_PAD_LEFT).';';
           $encodedString .= $encodedLetter;
      }
      
      return $encodedString;
      }
      
      $excel = new COM("excel.application", NULL, CP_UTF8) or die("Unable to instanciate excel");
      

      and using this function for converting the ??? chars to UTF8 &# symbols..

        Ok, so you should use UTF-8. this is always the right way.

        But us MSExcel via COM is not reliable enough for use in a production application. Do not do so, otherwise your application will definitely fail on a regular basis.

        Everyone who tries to do this discovers that it fails. The MSOffice components are not designed for server-side usage, and aren't robust enough.

        Mark

          I do NOT completely agree with you. I think it's good to use if you use it for simple reasons. Otherwise you'd better could convert the file to XML and then get the data out of it.

            Write a Reply...