I am developing a personal script which queries a large DB and in turn a number of RSS feeds.
How should I go about it? Would it be best to query, say, 20 rows at a time, then refresh the script and query another 20?
Any ideas welcomed.
Cheers!
It's hard to say without knowing the specifics, but my first instinct would be to do some sort of local caching of the RSS feeds so you don't have to retrieve them remotely on each page. This caching could perhaps be done via a cron script that saves the feeds either in files or to the DB, running the script as often as you deem necessary to keep the data fresh enough.
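To illustrate the file-based caching idea, here is a minimal sketch in Python (the thread does not say which language the script is in, so that choice is an assumption). The names `get_feed`, `cache_path`, `fetch_remote` and the one-day default for `max_age_seconds` are hypothetical, not taken from any particular RSS class. A cron-driven script could call `get_feed` for each URL and only go to the network when the cached copy is too old.

```python
import hashlib
import os
import time
import urllib.request


def cache_path(cache_dir, url):
    # Hypothetical naming scheme: one file per feed, keyed by a hash of its URL.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".xml")


def fetch_remote(url):
    # Plain HTTP fetch; a real script would likely add retries and error handling.
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def get_feed(url, cache_dir, max_age_seconds=86400, fetch=fetch_remote):
    """Return the feed body, using the cached file if it is fresh enough."""
    os.makedirs(cache_dir, exist_ok=True)
    path = cache_path(cache_dir, url)

    # Serve from the cache when the file exists and is younger than max_age_seconds.
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age_seconds:
        with open(path, "rb") as cached:
            return cached.read()

    # Otherwise fetch remotely and refresh the cached copy.
    data = fetch(url)
    with open(path, "wb") as cached:
        cached.write(data)
    return data
```

The `fetch` parameter is only there so the network call can be swapped out, for example when testing.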
Thanks for your quick response. The class I use to retrieve RSS data does already cache the feeds. The script will need to be run every day and will most likely, at first, query around 5,000 feeds. However, the script needs to be scalable.
If it's not a user-run script: make it a cron job, run it in the background, and set it to never time out.
dagon wrote:if it's not a user run script - cron job, run it in the background, set it to never time out
Is the only problem there server resources? Also, is there a reliable way of stopping it from timing out?
The query per se does not affect resources; it will create pointers to the relevant records. While looping through the records, write the XML. It should be fine.
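As a rough sketch of looping through the records without holding them all in memory, the following Python example reads rows in small batches from a single query. It assumes an SQLite database with a hypothetical `feeds` table (`id`, `url`) and a hypothetical `handle` callback that does the per-feed work; the real schema and DB driver in this thread are unknown, and whether rows are actually streamed depends on the driver.

```python
import sqlite3


def iter_feed_rows(conn, batch_size=20):
    """Yield (id, url) rows, pulling batch_size rows from the cursor at a time."""
    # Assumed schema: a table named feeds with columns id and url.
    cursor = conn.execute("SELECT id, url FROM feeds ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


def process_all(conn, handle, batch_size=20):
    """Call handle(feed_id, url) for every feed and return how many were processed."""
    processed = 0
    for feed_id, url in iter_feed_rows(conn, batch_size):
        handle(feed_id, url)
        processed += 1
    return processed
```

Run from cron, a script like this makes one pass over the table per run, so there is no need to refresh it every 20 rows.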