Hi, I have a CMS that allows users to upload files, and then search them. I need to let them upload and search RTF files, so I need to be able to fetch the text content out of there and store just that in my DB.

I've been trying the rtfclass.php that I got from phpclasses.org but it doesn't seem to read the RTF I have here, at least not into HTML output. It does read into XML but it is including a lot of formatting info as content.

Anyone know of any other RTF parsers?

    Originally posted by DanSearle
    I've been trying the rtfclass.php that I got from phpclasses.org but it doesn't seem to read the RTF I have here, at least not into HTML output. It does read into XML but it is including a lot of formatting info as content.

    Maybe that Markus Fischer bloke who wrote it might be able to help. Me, I've never seen it, nor have I seen your code. For further help here you'll need someone who has read and is familiar with one or the other.


    Anyone know of any other RTF parsers?

    Why, yes; there's the RTF Parse class written by Sergio Manciles. You can find it by searching phpclasses.org.

      Hi, the one written by Sergio Manciles is in fact just an update of the original written by Markus Fischer.

      I've hacked at the class a bit and got it producing plain text output (instead of XML or HTML), but it doesn't handle accented characters well (they stay encoded), and it still outputs a tonne of formatting info that is going to confuse my search results.

      MF's website is not available, but I've dug through the Wayback Machine's archives and found this comment on the RTFParser class:

      First of all, this class is not yet finished and I don't know if and when I'll work on it again. I didn't lose interest in it, but I just started it in my spare time (and spare time got very short lately). You can view and download the source here.

      Its main purpose was always to convert small-sized RTF text produced by the Microsoft rich text control they provide for Visual Basic. My testing is based only on it. So don't expect to get any useful output from a WinWord document. But you can use the VB application I wrote for testing; the basic styles should work (no promise though). The VB application for testing is also available for download.

      He wrote that in 2001, after the final update (by him) to the class was made.

      As I need to parse RTFs created by anything, but most likely MS Word or WordPad, I don't think this class will do it.
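
For the two problems described above (accented characters staying encoded, and formatting info leaking into the output), here is a minimal sketch of a plain-text extractor. It is written in Python for brevity rather than PHP, and it is not based on rtfclass.php. The names `rtf_to_text`, `SKIP_DESTINATIONS` and `SPECIAL_CHARS` are made up for this example, the list of skipped destinations is a partial guess at what Word and WordPad emit, and it assumes a single-byte ANSI code page (cp1252 by default) with the usual one fallback character after each `\uN` escape. Treat it as a starting point, not a complete RTF reader.

```python
import re

# Groups whose contents are not body text (assumed, partial list).
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict",
                     "header", "footer", "footnote", "object"}

# Control words that stand for a character in the output.
SPECIAL_CHARS = {"par": "\n", "line": "\n", "tab": "\t",
                 "emdash": "\u2014", "endash": "\u2013", "bullet": "\u2022",
                 "lquote": "\u2018", "rquote": "\u2019",
                 "ldblquote": "\u201c", "rdblquote": "\u201d"}

TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # 1, 2: control word, optional number, one delimiter space
    r"|\\'([0-9a-fA-F]{2})"      # 3: hex-escaped byte such as \'e9
    r"|\\([^a-zA-Z])"            # 4: control symbol such as \{ \} \\ \~ \*
    r"|([{}])"                   # 5: group open / close
    r"|([^\\{}\r\n]+)"           # 6: run of plain text
    r"|[\r\n]+"                  # raw line breaks carry no meaning in RTF
)


def rtf_to_text(rtf, codepage="cp1252"):
    out = []
    stack = []      # skip flag of each enclosing group
    skip = False    # True while inside a group we ignore
    fallback = 0    # fallback characters to drop after a \uN escape
    for m in TOKEN.finditer(rtf):
        word, arg, hexbyte, symbol, brace, text = m.groups()
        drop, fallback = fallback, 0
        if brace == "{":
            stack.append(skip)
        elif brace == "}":
            skip = stack.pop() if stack else False
        elif skip:
            continue
        elif word:
            if word in SKIP_DESTINATIONS:
                skip = True                  # ignore the rest of this group
            elif word == "u" and arg:
                code = int(arg)              # \uN is a signed 16-bit value
                out.append(chr(code + 65536 if code < 0 else code))
                fallback = 1                 # next character is an ANSI stand-in
            elif word in SPECIAL_CHARS:
                out.append(SPECIAL_CHARS[word])
            # every other control word is formatting and is dropped
        elif hexbyte:
            if not drop:
                out.append(bytes.fromhex(hexbyte).decode(codepage, errors="replace"))
        elif symbol:
            if symbol == "*":
                skip = True                  # {\*\... } marks an ignorable group
            elif symbol in "\\{}":
                out.append(symbol)           # escaped literal character
            elif symbol == "~":
                out.append("\u00a0")         # non-breaking space
            elif symbol in "\r\n":
                out.append("\n")             # backslash + newline acts like \par
        elif text:
            out.append(text[drop:])
    return "".join(out)


sample = (r"{\rtf1\ansi\ansicpg1252\deff0"
          r"{\fonttbl{\f0\fswiss Arial;}}"
          r"{\*\generator Msftedit 5.41;}"
          r"\pard\f0\fs20 Caf\'e9 \b menu\b0\par "
          r"Na\u239?ve \{draft\}\par}")

print(rtf_to_text(sample))
```

On the sample this prints "Café menu" and "Naïve {draft}" on separate lines, with the font table and generator group left out. Known gaps in this sketch: double-byte code pages (consecutive `\'xx` bytes would need decoding together), characters outside the Basic Multilingual Plane, `\ucN` values other than 1, tables, and binary `\bin` data. The same tokenise-and-skip approach should port to PHP with `preg_match_all` and `iconv`, though I have not tested that.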
