Seeking a Perl or PHP script to parse the referrers out of Apache logs, then use the Google search API to check whether those URLs are cached.
Read multiple log files and parse out the unique referrers, excluding referrers from a list of domains.
Check if the URL is listed in a MySQL database.
If the URL is not listed, query the Google API to see whether the URL is cached.
Log the URL, timestamp, and a cached boolean in the MySQL database.
If the API returns a max-queries-exceeded error, log it to a status file and end the program.
The Google API key and the list of excluded domains should be in a config file.
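To make the referrer-parsing step concrete, here is a rough sketch of the intended behavior. It is written in Python purely for illustration (the deliverable should still be Perl or PHP), and it assumes the logs use Apache's "combined" format, where the referrer is the quoted field after the status code and response size. The names `unique_referrers` and `is_excluded` are hypothetical, and treating subdomains of an excluded domain as excluded is an assumption as well.

```python
import re
from urllib.parse import urlparse

# Assumption: Apache "combined" log format, e.g.
# host - - [time] "request" status bytes "referrer" "user-agent"
REFERRER_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)"')


def is_excluded(host, excluded_domains):
    """True if host is an excluded domain or a subdomain of one."""
    host = host.lower()
    return any(host == d or host.endswith("." + d) for d in excluded_domains)


def unique_referrers(lines, excluded_domains):
    """Return the set of referrer URLs whose host is not excluded."""
    excluded = {d.lower() for d in excluded_domains}
    found = set()
    for line in lines:
        match = REFERRER_RE.search(line)
        if not match:
            continue  # line is not in the expected format
        url = match.group(1)
        if url in ("", "-"):
            continue  # no referrer was sent with this request
        host = urlparse(url).hostname
        if host is None or is_excluded(host, excluded):
            continue
        found.add(url)
    return found
```

Lines from several log files could be fed through the same function (for example by chaining the open file handles), so that duplicates across files collapse into one entry.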
Post any questions.