XML fetcher cron job: how often should it run, and how many feeds should it select?
I have a PHP script on a shared web host that picks 40 of about 300 "feeds" that haven't been updated in the last half hour, makes a cURL request for each, and then delivers the results to the user.
SELECT * FROM `table` WHERE latest_scan < NOW() - INTERVAL 30 MINUTE ORDER BY latest_scan ASC LIMIT 0, 40;
// Make cURL request and process it
I want to deliver updates as quickly as possible, but I don't want to overload my own server or the servers I fetch from (there are only a few of them).
How often should I run the cron job, and should I limit the number of feeds fetched per run? If so, to how many?
It would be a good idea to estimate how often each feed actually changes. If a feed changes once every 24 hours on average, you only need to fetch it every 12 hours.
Just store the number of changes and the number of tries for each feed, and select only the feeds that are due for a check. You can then run the script every minute and let the statistics do the rest!
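A minimal sketch of that estimate in PHP. It assumes two hypothetical per-feed counters, `$tries` and `$changes`, that were collected while the feed was fetched roughly every `$baseInterval` seconds. The function name, the "half the average" factor (taken from the 24 hours / 12 hours example above) and the clamping bounds are illustrative assumptions, not part of the original script.

```php
<?php
declare(strict_types=1);

/**
 * Estimate how many seconds to wait before checking a feed again.
 *
 * $tries        number of fetches so far (hypothetical counter)
 * $changes      number of fetches that found new content (hypothetical counter)
 * $baseInterval assumed seconds between fetches while the counters were collected
 * $maxInterval  upper bound, so a quiet feed is still checked now and then
 */
function nextCheckInterval(
    int $tries,
    int $changes,
    int $baseInterval = 1800,
    int $maxInterval = 86400
): int {
    if ($tries === 0) {
        // No history yet: fall back to the base interval.
        return $baseInterval;
    }
    if ($changes === 0) {
        // Never seen to change: check as rarely as allowed.
        return $maxInterval;
    }

    // Rough average number of seconds between observed changes.
    $avgSecondsPerChange = ($tries / $changes) * $baseInterval;

    // Check twice per average change period, within the allowed bounds.
    $interval = (int) round($avgSecondsPerChange / 2);

    return max($baseInterval, min($maxInterval, $interval));
}

// One change in 48 half-hourly tries is about one change per 24 hours.
echo nextCheckInterval(48, 1), "\n"; // 43200 (12 hours)
```

The result could be stored per feed and compared against `latest_scan` when selecting feeds, so that the cron job can run every minute while each feed is only fetched when it is due.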
On a shared host, you may also run into script execution time limits. For example, if your script runs longer than 30 seconds, the server may terminate it. If this is the case for your host, test and log how long it takes to process each feed, and take that into account when you decide how many feeds to process per run.
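One way to apply this is to give each run a time budget and stop before the limit is reached. The sketch below assumes a 25 second budget (leaving headroom under a 30 second limit) and a hypothetical `$fetch` callable that downloads and processes one feed; neither comes from the original script.

```php
<?php
declare(strict_types=1);

/**
 * Process feeds until the time budget is nearly used up.
 *
 * $feeds         list of feed URLs selected for this run
 * $fetch         hypothetical callable that downloads and processes one feed
 * $budgetSeconds assumed safe share of the host's execution time limit
 *
 * Returns the per-feed timings and the feeds left over for the next run.
 */
function processWithinBudget(array $feeds, callable $fetch, float $budgetSeconds = 25.0): array
{
    $start   = microtime(true);
    $longest = 0.0;  // slowest feed seen so far in this run
    $timings = [];

    while ($feeds !== []) {
        $elapsed = microtime(true) - $start;

        // Stop if a feed as slow as the slowest one so far could overrun the budget.
        if ($elapsed + $longest > $budgetSeconds) {
            break;
        }

        $url = array_shift($feeds);
        $t0  = microtime(true);
        $fetch($url);
        $took = microtime(true) - $t0;

        $timings[$url] = $took;          // log these to tune the batch size
        $longest = max($longest, $took);
    }

    return ['timings' => $timings, 'remaining' => $feeds];
}
```

Setting cURL's `CURLOPT_TIMEOUT` on each request also helps, since it keeps a single slow feed from using up the whole budget.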
Another thing I had to do to deal with this was to mark "last scan" as updated before processing each individual request, so that a problem feed does not keep failing and getting selected on every cron run. Optionally, you can update the record again on failure and store the reason (if known) why the failure occurred.
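A minimal sketch of that ordering using PDO. The `feeds` table name and the `id` and `last_error` columns are assumptions for illustration (only `latest_scan` appears in the question), and `$fetch` is a hypothetical callable that throws when a request fails.

```php
<?php
declare(strict_types=1);

/**
 * Stamp the feed as scanned first, then fetch it, and record the reason on failure.
 *
 * Assumed schema: feeds(id, url, latest_scan, last_error).
 * $fetch is a hypothetical callable that throws when the request fails.
 */
function scanFeed(PDO $db, int $id, string $url, callable $fetch): bool
{
    // Stamp before any network work, so a feed that hangs or gets the
    // script killed is not selected again on the very next cron run.
    $stamp = $db->prepare(
        'UPDATE feeds SET latest_scan = :now, last_error = NULL WHERE id = :id'
    );
    $stamp->execute([':now' => date('Y-m-d H:i:s'), ':id' => $id]);

    try {
        $fetch($url);
        return true;
    } catch (Throwable $e) {
        // Optional: keep the reason for the failure for later inspection.
        $fail = $db->prepare('UPDATE feeds SET last_error = :reason WHERE id = :id');
        $fail->execute([':reason' => $e->getMessage(), ':id' => $id]);
        return false;
    }
}
```

Because the timestamp is written first, the feed drops to the back of the queue even if the script is terminated halfway through the request and the `catch` block never runs.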