Hi all,
this is my first post here. I have the following problem: I want to read the contents of a Word document containing pictures and formatted text with a PHP script and then display it in the browser window without losing any information. I tried it with fopen, fread, etc., but the result was not what I expected.
The question: how can I read formatted text (even HTML or PDF, but not plain text) and display it in the browser exactly as it is?
Is it possible or am I asking too much?

My PHP knowledge is limited.

Thanks in advance

babil

    if you're running on a windows server, there is a word.application object available through COM.

    look up the COM functions in the manual. i'll try to find some more info on the word.application object for you.

      that is an extremely advanced topic, as word and pdf documents are binary documents. you may be able to search google for the word file format; the beginning of most files like that contains binary data describing the document's contents. it will still be extremely difficult to parse the document and output its contents exactly. you would have to know which text is bold or a larger size and in what font, find where each image is, decode the image data and place it in the right spot, and read table data. with limited php knowledge it's probably not a good first project.
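
To illustrate the point about the leading bytes of such files, here is a minimal sketch that only identifies the container format from a file's first bytes. It does not parse any content. The function name `detect_document_format` and its return labels are made up for this example; the signatures checked are the well-known ones for PDF, the OLE2 container used by legacy .doc files, and ZIP (which .docx files use).

```php
<?php
// Hypothetical helper: guess a document's container format from its first bytes.
function detect_document_format(string $bytes): string
{
    // PDF files begin with the ASCII text "%PDF-".
    if (strncmp($bytes, "%PDF-", 5) === 0) {
        return 'pdf';
    }
    // Legacy Word/Excel/PowerPoint files are OLE2 compound files.
    if (substr($bytes, 0, 8) === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'ole2';
    }
    // ZIP archives (the container used by .docx) begin with "PK\x03\x04".
    if (substr($bytes, 0, 4) === "PK\x03\x04") {
        return 'zip';
    }
    return 'unknown';
}
```

For a real file, the first eight bytes could be read with `file_get_contents($path, false, null, 0, 8)` and passed to the function. Knowing the container is only the first step; laying out the text, fonts and images is the hard part described above.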

        The problem is that the server runs Linux, so I can forget the COM functions. I already searched Google and found nothing. Am I the only one with this problem?

        Regards
        babil

          i know there are many sites out there that do this as a service, but they don't sell their software, and the few free scripts all use com objects.

            Thanks everyone, there seems to be no easy solution to this. Is there any other way to do the same thing, that is, to keep the contents of a specific section of the site in an external file (formatted text and pics) and pass it to the browser? I don't have any other ideas after spending the last two days in front of my monitor.

            (I think it can be done with MySQL, but I have no time to learn how to use it)

            cheers
            babil

              an html file is easy enough to open and read using fopen. what exactly is it you are trying to do?

              you mention you think you can do 'it' with mySQL, so now i'm a little lost as to what 'it' is.

                OK, let me explain. I had a Word doc with text and pics. I saved it as HTML via Word. That's the code for opening and displaying it.
                When I display it, the HTML formatting tags are also visible.
                How can I get rid of them?

                babil

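                <?php
                $file = 'TAE/tae21.htm';
                // The file is already HTML, so send it unchanged. nl2br() is meant for
                // plain text; here it would add <br /> at every line break in the markup.
                if (readfile($file) === false) {
                    die('Could not read file!');
                }
                ?>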

                  if you want to strip a document of all tags you can do something like this:

                  
                  $data = preg_replace("/<[^>]+>/", "", $data);
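
If stripping every tag removes too much, one alternative is PHP's built-in `strip_tags()` with a list of tags to keep. The sketch below is only an illustration: the function name `strip_to_basic_html` and the particular set of allowed tags are assumptions, not something from this thread.

```php
<?php
// Hypothetical helper: remove all tags except a small set of basic formatting tags.
function strip_to_basic_html(string $html): string
{
    // Tags listed here survive; everything else is removed but its text is kept.
    $allowed = '<p><br><b><i><u><strong><em><ul><ol><li><h1><h2><h3>'
             . '<table><tr><td><th><img><a>';
    return strip_tags($html, $allowed);
}

// Example: the <div> and <span> wrappers go, the paragraph and bold text stay.
echo strip_to_basic_html('<div class="x"><p>Hi <b>there</b></p><span>!</span></div>');
```

Note that `strip_tags()` leaves the attributes of the kept tags in place, so Word's inline styles on a kept `<p>` would still be there.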
                  
                  

                    It did help in getting rid of some tags, but now my HTML is turning into plain text (not completely).

                    cheers
                    babil

                      You could consider looking into PHP's tidy extension. I haven't used the extension, but I do use tidy as a standalone app to strip out Word markup.
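
As a rough sketch of what using the tidy extension for this might look like: the code below calls `tidy_repair_string()` with a few libtidy configuration options aimed at Word output. It assumes the tidy extension is installed and falls back to returning the input unchanged if it is not. The function name `clean_word_html` and the default encoding are assumptions for illustration; Word-exported files are often not UTF-8, so the encoding would need to match the real file.

```php
<?php
// Hypothetical helper: clean Word-exported HTML with the tidy extension, if present.
function clean_word_html(string $html, string $encoding = 'utf8'): string
{
    if (!function_exists('tidy_repair_string')) {
        return $html; // tidy extension not loaded: return the input unchanged
    }
    $config = array(
        'word-2000'      => true, // strip markup that Word adds when saving as HTML
        'bare'           => true, // remove Microsoft-specific HTML from Word documents
        'show-body-only' => true, // output only the contents of <body>
    );
    $cleaned = tidy_repair_string($html, $config, $encoding);
    return $cleaned === false ? $html : $cleaned;
}
```

Because only the body contents are returned, the result can be placed inside an existing page without nesting a second `<html>` document.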

                        Thanks for the link to tidy. I am going to check it out tomorrow because it is getting late now.

                        To clear things up, I am posting my whole code.

                        My aim is to make a website that is easy to update: no need to write HTML code, place images, etc. Just create a Word, PDF, etc. document with the updated content plus images and then upload it to the server.

                        This is the page layout:

                        INDEX.PHP

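                        <?php
                        require_once("class.php");

                        // register_globals is gone from modern PHP, so read the page from the
                        // query string and accept only known files (this list is an assumption).
                        $allowed = array("news.php", "tae.php");
                        $fpage = isset($_GET["fpage"]) ? $_GET["fpage"] : "news.php";
                        if (!in_array($fpage, $allowed, true)) {
                            $fpage = "news.php";
                        }

                        $page = new Page("template.html");

                        $page->replace_tags(array(
                            "title"     => "Hoav Website",
                            "contenthd" => "Announcements",
                            "content"   => $fpage,
                            "navbar"    => "board2.htm",
                        ));

                        $page->output();
                        ?>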

                        END INDEX.PHP

                        BEGIN CLASS.PHP

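                        <?php
                        class Page
                        {
                            public $page;

                            // PHP 4 style constructors (a method named Page) were removed in PHP 8.
                            public function __construct($template = "template.html") {
                                if (!file_exists($template)) {
                                    die("Template file $template not found.");
                                }
                                $this->page = file_get_contents($template);
                            }

                            // Run a PHP/HTML file and return its output as a string.
                            public function parse($file) {
                                ob_start();
                                include($file);
                                return ob_get_clean();
                            }

                            public function replace_tags($tags = array()) {
                                if (count($tags) == 0) {
                                    die("No tags designated for replacement.");
                                }
                                foreach ($tags as $tag => $data) {
                                    // A value that names an existing file is replaced by that file's output.
                                    $data = file_exists($data) ? $this->parse($data) : $data;
                                    // eregi_replace() was removed in PHP 7; str_ireplace() keeps the
                                    // case-insensitive match without treating "{" as regex syntax.
                                    $this->page = str_ireplace("{" . $tag . "}", $data, $this->page);
                                }
                            }

                            public function output() {
                                echo $this->page;
                            }
                        }
                        ?>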

                        END CLASS.PHP

                        BEGIN TEST.CSS

                        #body {margin:0px;
                        background-color: #336699;
                        overflow:auto;

                        }

                        #main {
                        width:100%;
                        vertical-align: top;
                        height:100%;
                        border: none;
                        overflow:scroll;

                        }
                        #head {
                        background-image:url('head.jpg') ;
                        background-repeat:repeat-x;
                        height:60px;
                        position: relative;

                        }
                        #navbar {
                        width: 20%;
                        background-color:#336699 ;
                        vertical-align:top;
                        border:solid 2px #336699;

                        }

                        #content {
                        width: 80%;

                        background-color:#003366;
                        vertical-align:top;
                        border:solid 5px #000000;
                        text-align: center;
                        color:#FFFFCC;
                        font-size: 35px;
                        font-family: serif;
                        overflow: scroll;
                        font-weight: bold;
                        }
                        #p {
                        color:#ffff99;
                        font-size: 15px;
                        text-align: left;
                        font-family: serif;
                        font-weight: lighter;
                        padding-left: 5%;
                        }

                        END TEST.CSS

                        BEGIN TEMPLATE.HTML

                        <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

                        <html>
                        <head>
                        <title>{title}</title>
                        <meta http-equiv="Content-Type"
                        content="text/html; charset=iso-8859-1" />
                        <link rel="stylesheet" type="text/css" href="test.css" />
                        </head>
                        <body id="body">
                        <table id="main" cellpadding="0" cellspacing="0" >
                        <tr>
                        <td id="head" colspan="2">{head}</td>
                        </tr>
                        <tr>
                        <td id="navbar">{navbar}

                        </td>
                        <td id="content">{contenthd} <p id="p">

                        {content}
                        </p>

                        </td>
                        </tr>
                        </table>

                        </body>
                        </html>

                        END TEMPLATE.HTML

                        BEGIN NEWS.PHP

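                        <?php
                        // Use the full opening tag; the short form <? depends on short_open_tag.
                        $file = 'news.txt';
                        $data = file($file) or die('Could not read file!');
                        foreach ($data as $line) {
                            // news.txt is plain text, so nl2br() is appropriate here.
                            echo nl2br($line);
                        }

                        ?>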

                        END NEWS.PHP

                        BEGIN TAE.PHP

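                        <?php
                        $file = 'TAE/tae2.htm';
                        // No header() call is needed: text/html is PHP's default content type.
                        // The file is already HTML, so output it unchanged; nl2br() is for
                        // plain text and would add <br /> at every line break in the markup.
                        if (readfile($file) === false) {
                            die('Could not read file!');
                        }

                        ?>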

                        END TAE.PHP
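
A file saved from Word as HTML is a complete document with its own `<html>`, `<head>` and `<body>`, while the template above expects a fragment for `{content}`. One possible way to bridge that, sketched here under the assumption that PHP's DOM extension is available, is to pull out only what is inside `<body>`. The function name `extract_body_html` is hypothetical.

```php
<?php
// Hypothetical helper: return only the markup inside <body> of a full HTML document.
function extract_body_html(string $html): string
{
    if (trim($html) === '') {
        return '';
    }
    $doc = new DOMDocument();
    // Word's HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // no <body> found: return the input unchanged
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child); // serialize each child of <body> in order
    }
    return $out;
}
```

In TAE.PHP this could be used as `echo extract_body_html(file_get_contents('TAE/tae2.htm'));`. Styles that Word defines in the document's `<head>` are dropped by this approach, so some formatting may be lost.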

                        Sorry for the long post, but now you can have an overview of my project.

                        Good night and thank you for your efforts.

                        babil
