Hey guys,
I am doing a facelift on a client's website. In addition to the facelift, they want to put their old newsletters on the site in a searchable format.
There are literally 10,000 pages, which have already been scanned into PDF format.
I have read all I can find on pdf2ascii, and none of it makes sense to me. The other converters I have found work great, but I don't have all the time in the world to sit there 10,000 times creating HTML pages out of these. What I want to build is a PDF search engine, without a database if possible: just a simple search that finds matches for whatever string the user types in. Speed is not a big issue to me or the client; they just want it to work.
Most of all, they don't want to pay me my hourly rate for 3 years trying to get all these things online and searchable.
Is there ANY way at all that I can write a simple script (a form and a handler) that will browse ALL the PDFs in a folder and return results?
I have no clue how to even start this. All I know is that I DON'T want to do it manually.
Any tips, links, programs, anything would be of great help.
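
To make the question concrete, here is a rough Python sketch of the kind of handler I am picturing. It is a guess, not something I have run against the real files. It assumes the pdftotext command-line tool is installed and on the PATH, and that the scans actually contain a text layer (if they are pure images, I gather they would need OCR first). The names extract_text and search_folder are placeholders I made up.

```python
import os
import subprocess


def extract_text(pdf_path):
    """Return the text of one PDF by calling the pdftotext command-line tool.

    Assumption: pdftotext is installed and on the PATH.
    Scanned pages with no text layer come back empty and would need OCR first.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the extracted text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def search_folder(folder, query, extract=extract_text):
    """Return (filename, snippet) pairs for every PDF whose text contains query.

    No database: every PDF in the folder is re-read on each search, so this
    is slow but simple. The match is a case-insensitive substring test.
    """
    needle = query.lower()
    hits = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        text = extract(os.path.join(folder, name))
        pos = text.lower().find(needle)
        if pos != -1:
            # Keep a little context around the first match, whitespace collapsed.
            start = max(0, pos - 40)
            snippet = " ".join(text[start:pos + len(needle) + 40].split())
            hits.append((name, snippet))
    return hits
```

The form handler would then just call something like search_folder("newsletters", search_string) and print a link for each filename that comes back. The folder name there is made up too.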