A client needs to crawl some sites. We just need the emails and the domain each one came from. He has software that does this very well, but he does not want to spend his own bandwidth; he wants it run from another server.
This software scans the whole site looking for email strings and saves them. That's what he needs, together with the name of the portal stored in the database, so that he can send those addresses related emails later.
Two of the portals are really easy. Create a script that scans the whole portal, looks for emails with some kind of verification, and then eliminates duplicates and malformed emails (ones that don't match a valid format).
The third one is a list of thousands of open forums containing many, many emails. We should scan all pages in each forum looking for emails and save the name of the forum each one was linked from, OK? In red and in parentheses you have the number of forums in each category. He needs to crawl the emails that can be found in all of the forums (he already did a couple with his software and it worked perfectly, but he wants to do it from a server automatically).
On a category's listing page you can see the list of its forums.
So we need to visit each forum, crawl the emails, and save them to the database with the category and the name of the forum. Alternatively, have a folder per category, and inside it one .xls file per forum (named after the forum itself) containing that forum's emails. Just run the scripts from a server somewhere and give me the results.
The scripts should be reusable in the future as well.
Can you do this? Send me a PM.