Hi all,
I would like to know how to convert .doc files (or equivalent, like rtf etc) to HTML (or equivalent, like XML) using PHP.
Any help or links would be much appreciated.
Thanks
I have not found any good way to convert .doc to .html, except on a Windows server with MS Office installed: there you can drive Word through the [man]COM[/man] functions and use its ability to save as HTML. I also tried some code I found for doing it via OpenOffice.org-related functions, but could not get it to work.
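For anyone who wants to try the COM route, here is a minimal sketch of what it might look like. It assumes a Windows server with Word installed and PHP's COM extension enabled; the function name `word_doc_to_html` is made up for illustration, and the value 8 is Word's `wdFormatHTML` constant. Treat it as a starting point under those assumptions, not a tested solution.

```php
<?php
// Word's wdFormatHTML constant ("save as web page").
const WD_FORMAT_HTML = 8;

/**
 * Hypothetical helper: ask Word (via COM) to open a .doc/.rtf file
 * and save it as HTML. Both paths should be absolute Windows paths.
 */
function word_doc_to_html(string $docPath, string $htmlPath): void
{
    // The COM class only exists on Windows builds with the extension enabled.
    if (!class_exists('COM')) {
        throw new RuntimeException('The COM extension is not available (Windows only).');
    }

    $word = new COM('Word.Application');
    try {
        $word->Visible = false;                   // keep Word in the background
        $doc = $word->Documents->Open($docPath);  // open the source document
        $doc->SaveAs($htmlPath, WD_FORMAT_HTML);  // write the HTML version
        $doc->Close(false);                       // close without saving changes
    } finally {
        $word->Quit();                            // always shut Word down again
    }
}
```

Usage would be something like `word_doc_to_html('C:\\uploads\\resume.doc', 'C:\\uploads\\resume.html');`. Word tends to emit very verbose HTML, so some cleanup afterwards is probably needed.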
Interesting question; I may need to look into this in the near future.
So let's change direction a bit on the approach. How about using JavaScript? Presumably the user has the application that created the document, so could JavaScript do the conversion on their machine?
Just throwing the thought out there, feel free to shoot it down.
As for myself, I'd been looking for something that could take uploaded resumes and convert them to HTML for display in search results (a niche job site). Should anyone come up with a solution that works on a Linux server, I think the client would still be interested (and thus so would I).
I have a Linux server, so the COM idea would not work.
Is JavaScript even capable of accessing MS Word functions on the client's computer? If anyone has tried it, please let me know.
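For the Linux case, one possibility that may be worth a try is shelling out to a headless OpenOffice/LibreOffice install. This is only a sketch and is untested here: it assumes a `soffice` binary on the PATH that supports `--headless --convert-to` (older OpenOffice.org builds may not), and the function names `build_convert_command` and `doc_to_html` are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: build the soffice command line, with both
 * paths escaped so odd filenames in uploads cannot break the shell.
 */
function build_convert_command(string $docPath, string $outDir): string
{
    return 'soffice --headless --convert-to html --outdir '
        . escapeshellarg($outDir) . ' ' . escapeshellarg($docPath);
}

/**
 * Hypothetical helper: run the conversion and return the path of the
 * generated HTML file, or throw if nothing was produced.
 */
function doc_to_html(string $docPath, string $outDir): string
{
    exec(build_convert_command($docPath, $outDir) . ' 2>&1', $output, $status);

    // soffice keeps the base name and swaps the extension.
    $htmlPath = rtrim($outDir, '/') . '/'
        . pathinfo($docPath, PATHINFO_FILENAME) . '.html';

    if ($status !== 0 || !is_file($htmlPath)) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $output));
    }
    return $htmlPath;
}
```

Something like `doc_to_html('/var/uploads/resume.doc', '/var/uploads/html')` would then give back the path of the HTML file. The web server user needs permission to run `soffice` and to write to the output directory, which is a common place for this approach to fail.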
computer_nuke;10903072 wrote: I have a Linux server, so the COM idea would not work.
Is JavaScript even capable of accessing MS Word functions on the client's computer? If anyone has tried it, please let me know.
Within a normal web app, it should not be able to (that would be a serious security hole). Within IE, using an ActiveX component, it probably could, but that would require that the user (a) is using IE and (b) is willing to install the ActiveX component.
designMode
I did a little experiment with designMode in JavaScript and found that I could copy and paste formatted text from an OpenOffice document into an editable field (in both IE and Firefox), and the formatting (bold, font and font size, unordered lists and indents) was copied over as well.
The file I tested was an RTF, so I am not sure whether it will work for Word. I am also not sure whether you can import a file's contents directly and get the formatting into the field that way; that would need some code written to make it happen. Right now I need to step out for a bit, so I won't be trying that tonight, but maybe someone else will want to play with it.