Hi,
Is there a way to read a Word document with PHP and detect the document's styles (headings and such; I think this is what gets used when a .doc file is converted to XML)?
I want to parse a Word document, separate the Titles, Subtitles and Body, and store each one in its own column in a MySQL database.
The person writing my Word documents would follow a fixed format. He would use one heading style for the Titles (does this have anything to do with macros?) and another for the Paragraphs, so when PHP reads the document and encounters a given heading style, it would know which column to store the text in.
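To make the idea concrete, here is a rough sketch of the style detection I have in mind. It rests on an assumption of mine: that the documents are saved as .docx rather than .doc, since a .docx is a zip archive whose word/document.xml lists each paragraph with its style ID. The style IDs, the column names and the function name paragraphsByColumn are all made up for illustration, and the sample XML is hand-written rather than taken from a real document.

```php
<?php
// Namespace used by WordprocessingML, the XML stored inside a .docx file.
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical mapping from Word style IDs to my database columns.
const STYLE_TO_COLUMN = ['Heading1' => 'title', 'Heading2' => 'subtitle'];

// Returns one ['column' => ..., 'text' => ...] entry per paragraph.
function paragraphsByColumn(string $documentXml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($documentXml);
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('w', W_NS);

    $out = [];
    foreach ($xpath->query('//w:body/w:p') as $p) {
        // Style ID of this paragraph, or '' when none is set explicitly.
        $style = $xpath->evaluate('string(w:pPr/w:pStyle/@w:val)', $p);

        // A paragraph's text is spread over one or more <w:t> runs.
        $text = '';
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }

        // Anything without a mapped heading style is treated as body text.
        $out[] = ['column' => STYLE_TO_COLUMN[$style] ?? 'body', 'text' => $text];
    }
    return $out;
}

// Hand-written sample; with a real file the XML could be read with
// ZipArchive::getFromName('word/document.xml').
$sample = '<w:document xmlns:w="' . W_NS . '"><w:body>'
    . '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Apples</w:t></w:r></w:p>'
    . '<w:p><w:r><w:t>A kind of fruit.</w:t></w:r></w:p>'
    . '</w:body></w:document>';

print_r(paragraphsByColumn($sample));
```

If the documents have to stay in the old binary .doc format, this sketch would not apply as written.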
Better yet! I have an HTML form that I use to populate the database. This way I don't have to work with id indexing to relate multiple tables; everything is done automatically. Is there a way that I can read the document and populate the text fields and textareas in the same manner as I explained above, so that I could then submit it to the database?
I also need to preserve the HTML tags for tables, lists, line breaks and images. The rest can be stripped out.
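For the stripping part, something like PHP's built-in strip_tags() with a list of allowed tags might be enough. This is only a sketch: the sample markup is invented and much tidier than real Word output, and keepWantedTags is just a name I picked.

```php
<?php
// Keep only the tags for tables, lists, line breaks and images;
// every other tag is removed but the text inside it is kept.
function keepWantedTags(string $html): string
{
    $allowed = '<table><thead><tbody><tr><th><td><ul><ol><li><br><img>';
    return strip_tags($html, $allowed);
}

// Invented sample of exported HTML.
$html = '<p class="MsoNormal"><span style="color:red">Intro</span><br>'
      . '<img src="a.png" alt="pic"></p><ul><li>One</li></ul>';

echo keepWantedTags($html), "\n";
```

As far as I know, strip_tags() leaves the attributes of the allowed tags untouched, so any style or class attributes Word puts on a table or img would still need a separate cleanup.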
I was thinking I could create a PHP page with a button that, when pressed, would open the document, convert it to HTML, clean it somehow (Dreamweaver doesn't do a good job of that!) and preserve only the tags I need. While reading the document, PHP would detect titles and columns and put a custom tag around each one, something like <title></title>. Then it would load the form I created and populate it, putting everything in <title> in the Title field and everything in <body> in the definition field. Then I could submit it to the database! Whew!!
Sorry for writing so much text; I was getting these ideas as I was writing.
Before I start coding myself to death, I just want to know if this seems possible?
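Roughly, I picture the split-and-populate step like the sketch below. It assumes (my guess, not something I have verified against real Word output) that each title ends up as an <h1> in the converted HTML and that everything up to the next <h1> is its definition. The function name splitByHeading and the form field names title[] and definition[] are placeholders.

```php
<?php
// Split converted HTML into title/definition pairs.
// Assumption: each title is an <h1>; what follows, up to the next <h1>,
// is that title's definition.
function splitByHeading(string $html): array
{
    libxml_use_internal_errors(true); // converted HTML is rarely valid
    $doc = new DOMDocument();
    // Wrap the fragment so all pieces share one known parent element.
    // Note: loadHTML() assumes ISO-8859-1 unless the markup declares a charset.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    $root = $doc->getElementsByTagName('div')->item(0);

    $rows = [];
    $current = null;
    foreach ($root->childNodes as $node) {
        if ($node->nodeName === 'h1') {
            // A new title starts: store the previous pair, if any.
            if ($current !== null) {
                $rows[] = $current;
            }
            $current = ['title' => trim($node->textContent), 'body' => ''];
        } elseif ($current !== null) {
            // Keep the markup (tables, lists, ...) of the definition.
            $current['body'] .= $doc->saveHTML($node);
        }
    }
    if ($current !== null) {
        $rows[] = $current;
    }
    foreach ($rows as &$row) {
        $row['body'] = trim($row['body']);
    }
    unset($row);
    return $rows;
}

// Invented sample input.
$converted = '<h1>Apple</h1><p>A fruit.</p><h1>Bread</h1><ul><li>Baked</li></ul>';

// Pre-fill one title field and one definition textarea per pair.
foreach (splitByHeading($converted) as $row) {
    printf(
        "<input name=\"title[]\" value=\"%s\">\n<textarea name=\"definition[]\">%s</textarea>\n",
        htmlspecialchars($row['title']),
        htmlspecialchars($row['body'])
    );
}
```

The htmlspecialchars() calls are there so the preserved tags show up as editable text inside the form fields instead of being rendered by the browser.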