It depends on your architecture.
For example, you could have an HTTP gateway server that serves all the HTML pages for navigation, document searching, etc.; a repository server for storing the document images; and a database server to store the documents' metadata and their locations on disk. They could all exist on the same machine or on separate machines. For performance, I would put them on separate machines.
When the user wants to search, he or she would type in keywords to find the document, or use some other search criteria. The database server would hold the metadata describing those document images. This will likely be a transactional database (more on this below).
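A minimal sketch of that metadata lookup, using Python's built-in sqlite3 for a self-contained demo (the real system would use MySQL/InnoDB as suggested below); the table and column names here are assumptions, not anything prescribed:

```python
import sqlite3

# In-memory database for the sketch; production would be MySQL/InnoDB.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        doc_id      INTEGER PRIMARY KEY,
        name        TEXT,
        created     TEXT,
        description TEXT,
        disk_path   TEXT          -- where the rendered files live on disk
    )
""")
conn.execute(
    "INSERT INTO documents VALUES (1, 'Dandelion', '2005-04-05', "
    "'A picture of dandelions in my backyard that I need to weed.', "
    "'/store/00/01')"
)

def search_documents(keyword):
    """Return metadata rows whose name or description matches the keyword."""
    like = f"%{keyword}%"
    cur = conn.execute(
        "SELECT doc_id, name, created, description FROM documents "
        "WHERE name LIKE ? OR description LIKE ?", (like, like))
    return cur.fetchall()

print(search_documents("dandelion"))
```

Note the query only touches metadata; the image files themselves never pass through the database.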
On return of the document search query, a list of the matching documents would be generated. The list would allow the user to choose among various sizes and formats (JPG, PDF, etc.); the entries would essentially be HTML anchor links, each one the "GET" of an HTTP request.
So let's say the user searches for flowers. The list could look something like this:
Name: Dandelion
Created: 04/05/05
Description:
A picture of dandelions in my backyard that I need to weed.
PDF Full Thumb
JPG Full Thumb
Name: Roses
Created: 05/03/04
Description:
A bouquet of roses on mom's table.
PDF Full Thumb
JPG Full Thumb
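A result entry like the ones above could be rendered into HTML anchor links with something like this sketch; the URL scheme (`/doc/<id>/<size>.<fmt>`) is an assumption for illustration:

```python
import html

# Hypothetical pre-rendered formats/sizes; the files already exist on disk.
FORMATS = ("pdf", "jpg")
SIZES = ("full", "thumb")

def render_result(doc_id, name, created, description):
    """Build the HTML fragment for one search hit, with a GET link
    per pre-rendered variant of the document."""
    lines = [
        f"<p>Name: {html.escape(name)}<br>",
        f"Created: {html.escape(created)}<br>",
        f"Description: {html.escape(description)}<br>",
    ]
    for fmt in FORMATS:
        links = " ".join(
            # Each anchor is a plain HTTP GET against the gateway server.
            f'<a href="/doc/{doc_id}/{size}.{fmt}">{size.capitalize()}</a>'
            for size in SIZES
        )
        lines.append(f"{fmt.upper()}: {links}<br>")
    lines.append("</p>")
    return "\n".join(lines)

print(render_result(1, "Dandelion", "2005-04-05",
                    "A picture of dandelions in my backyard."))
```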
As weedpacket mentioned, these images would already be rendered in the various sizes and formats. In other words, they are pre-processed ahead of time to reduce the amount of processing on the server at request time. The disadvantage, of course, is the amount of disk space needed.
Think carefully about how you are going to organize the files on disk, and about what happens when you run out of disk space. Design the application so that documents can be stored across multiple physical disks. Also be aware that the filesystem will limit, or perform badly with, a large number of files in a single folder. Plan for application scaling from the start, because disk space will be at a premium.
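One common way to handle both concerns at once is to hash the document id into a short directory tree spread over several volumes, so no single folder grows without bound. A sketch, with the volume names and fan-out as assumptions:

```python
import hashlib

def shard_path(doc_id, volumes=("/vol0", "/vol1"), fanout=256):
    """Map a document id to a disk volume and a two-level directory,
    so files are spread across disks and no folder gets too large."""
    digest = hashlib.sha1(str(doc_id).encode()).hexdigest()
    volume = volumes[int(digest[:8], 16) % len(volumes)]  # pick a disk
    top = int(digest[8:10], 16) % fanout                  # first-level folder
    sub = int(digest[10:12], 16) % fanout                 # second-level folder
    return f"{volume}/{top:02x}/{sub:02x}/{doc_id}"

print(shard_path(42))
```

Because the mapping is deterministic, the path can always be recomputed from the id; adding a volume later does require a migration plan, which is part of why this deserves thought up front.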
Now when a document is added to the repository, it could be queued up. A background task would check the queue and do the conversions to the various sizes and formats you require. Note that this would likely be transaction based, so the choice of database is important; you may want to use MySQL InnoDB for this. The same goes for a delete or update.
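The queue-plus-worker idea can be sketched as below, using an in-process queue and a stubbed-out converter; in production the queue would be a database table polled by a separate worker process, and `convert` would invoke a real image tool:

```python
import queue
import threading

# Hypothetical job queue; real code would poll a DB table transactionally.
jobs = queue.Queue()
results = []

def convert(doc_id, fmt, size):
    # Stub: real code would shell out to an image library here.
    return f"{doc_id}.{size}.{fmt}"

def worker():
    """Drain the queue, rendering every size/format variant per document."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut the worker down
            break
        for fmt in ("pdf", "jpg"):
            for size in ("full", "thumb"):
                results.append(convert(job, fmt, size))
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
jobs.put(1)        # a newly added document is queued for rendering
jobs.put(None)
t.join()
print(results)
```

The point of the queue is that the upload request returns immediately; the expensive rendering happens off the request path.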
Now, one would design this so that files are never actually deleted. Instead, they would be moved to a "garbage" folder, where another background task (or a separate tool run by an administrator) would remove them later. You do this in case the delete transaction fails partway through, say only 2 of the 4 files get moved. You could then roll back and restore from the garbage folder, since the files would still exist.
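A minimal sketch of that move-to-garbage delete with rollback (the function and folder names are mine, not anything standard):

```python
import os
import shutil
import tempfile

def delete_document(paths, garbage_dir):
    """Move a document's rendered files to a garbage folder instead of
    unlinking them; if any move fails, put the already-moved files back."""
    moved = []
    try:
        for path in paths:
            dest = os.path.join(garbage_dir, os.path.basename(path))
            shutil.move(path, dest)
            moved.append((path, dest))
    except OSError:
        for original, dest in moved:   # rollback: restore moved files
            shutil.move(dest, original)
        raise
    # A background task (or an administrator) purges garbage_dir later.

# Demo with temp files standing in for the four rendered variants.
work = tempfile.mkdtemp()
garbage = tempfile.mkdtemp()
files = []
for name in ("full.pdf", "thumb.pdf", "full.jpg", "thumb.jpg"):
    p = os.path.join(work, name)
    open(p, "w").close()
    files.append(p)

delete_document(files, garbage)
print(sorted(os.listdir(garbage)))
```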
These images can exist on the gateway server or on another server entirely; probably best on another server, for performance. You don't want background queue processing on a front-end HTTP server stealing cycles while users are executing your application. The job of the HTTP gateway server should simply be to respond to search requests for your MIME objects (in your case, PDF documents and JPG images), retrieve them, and send the data object back to the client.
These are just some things off the top of my head.
Feel free to contact me privately.