I need a spider that starts crawling from a URL I enter and then crawls all the websites it finds.
It must have the following features:
-TLD filter: it must visit only domains under the TLDs I specify (e.g. only .com or only .net)
-It must not visit subdomains
-must save only the domain names it finds and visits. I don't need to index web content: I just need domain names.
-I can set a limit on the number of pages visited per website
-Must be fast and index thousands of domain names per day. It must run 24/7, indefinitely and without manual intervention
-Must save results to an external MySQL database, so that the spider can run on several computers at once
-I must be able to set exclusion filters (e.g. don't follow [url removed, login to view], [url removed, login to view], etc.)
-I don't need a graphical interface; it can run from the command line
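
To make the filtering rules above more concrete, here is a rough Python sketch of the kind of check I have in mind before a URL is visited. Everything named here is an illustrative assumption, not a required design: the settings ALLOWED_TLDS, EXCLUDED_DOMAINS and MAX_PAGES_PER_SITE, the helper names, and the choice to treat "www." as the same site rather than a subdomain. It also takes the last two labels of the host as the domain, which is wrong for suffixes such as .co.uk; a real spider would probably need the Public Suffix List.

```python
from urllib.parse import urlsplit

# Hypothetical settings; real values would come from the command line or a config file.
ALLOWED_TLDS = {"com", "net"}
EXCLUDED_DOMAINS = {"example.com"}
MAX_PAGES_PER_SITE = 5


def split_host(url):
    """Return (host, domain) for a URL, or (None, None) if it has no usable host.

    Assumption: the domain is simply the last two labels of the host name.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # Malformed URLs (e.g. a broken IPv6 literal) are simply skipped.
        return None, None
    if not host:
        return None, None
    host = host.lower().rstrip(".")
    labels = host.split(".")
    if len(labels) < 2:
        return None, None
    return host, ".".join(labels[-2:])


def registrable_domain(url):
    """Domain name to record for a URL, or None if it cannot be determined."""
    return split_host(url)[1]


def should_visit(url, pages_seen):
    """Apply the TLD, subdomain, exclusion and page-limit rules to one URL.

    pages_seen maps a domain name to the number of pages already fetched from it.
    """
    host, domain = split_host(url)
    if domain is None:
        return False
    # No subdomains: only the bare domain, or its "www." form, is visited.
    if host not in (domain, "www." + domain):
        return False
    # TLD filter.
    if domain.rsplit(".", 1)[1] not in ALLOWED_TLDS:
        return False
    # Exclusion filter.
    if domain in EXCLUDED_DOMAINS:
        return False
    # Per-website page limit.
    return pages_seen.get(domain, 0) < MAX_PAGES_PER_SITE
```

In this sketch a link to a subdomain is not followed, but registrable_domain can still be used to record the parent domain it points to. The pages_seen counter is a plain dictionary here; with several computers running at once it would have to live in the shared MySQL database instead.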
Please contact me for more info: eraser [ a t ] [url removed, login to view]