A client of ours would like to create an application that crawls and indexes the MSN Groups site at [url removed, login to view]
Let me explain what he needs in more detail, as far as crawling and indexing are concerned:
- Crawl all MSN Groups that exist in Spain
- Crawl all MSN Groups from all other countries that are related to traveling, dating, friendship, or any other category that he would like to select.
- In particular, crawl all MSN Groups related to traveling to Spain and to love.
- Index all of these groups in categories (traveling, science, sports, people, etc.), as MSN has them on its own site.
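To give an idea of the selection and indexing we have in mind, here is a rough sketch. It assumes the crawler already yields one record per group; the `Group` fields and the function names `select_groups` and `build_index` are made up for illustration and are not part of any MSN interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    # Hypothetical record produced by the crawler for each group.
    name: str
    country: str
    category: str


def select_groups(groups, home_country, categories):
    """Keep every group from home_country, plus groups from any
    other country whose category is in the selected set."""
    return [
        g for g in groups
        if g.country == home_country or g.category in categories
    ]


def build_index(groups):
    """Map each category to the sorted names of the groups filed under it."""
    index = defaultdict(list)
    for g in groups:
        index[g.category].append(g.name)
    return {category: sorted(names) for category, names in index.items()}


if __name__ == "__main__":
    crawled = [
        Group("a", "Spain", "sports"),
        Group("b", "France", "traveling"),
        Group("c", "France", "science"),
    ]
    chosen = select_groups(crawled, "Spain", {"traveling", "dating", "friendship"})
    print(build_index(chosen))
```

The category set is passed in as a parameter so the client can change his selection without touching the code.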
Apart from this, the application should have features like:
- Easily subscribe to all of these groups with different email addresses that we should be able to create with Hotmail or Gmail.
- Crawl all the selected categories or groups and grab email addresses from the groups' forums, posts, and threads, so that we can email them from a centralized application controlled through the admin panel.
- Crawl all groups and locate the forums where the members of each group post their messages.
- Once the forums have been located, for instance [url removed, login to view] (a Spanish group about the trip of your life), we should crawl the posted messages and look for email addresses in them, marking those threads as "with emails". We should be able to post a message from the application that we create and track replies from those addresses.
- The application should be able to re-crawl and re-index the indexed groups (and new ones) every X days (category by category, taking preset priorities into account), searching for new users in the groups and their email addresses, and notifying the application about them.
- We should have an admin area that lists all the users we have created with Hotmail (or similar) and shows new messages received in their Hotmail accounts, so that we can check their email from there, reply to the forum a message belongs to, and track the messages sent.
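As an illustration of the "every X days, category by category, with preset priorities" re-crawl requirement above, here is a minimal sketch. The class name `RecrawlSchedule`, its methods, and the convention that a lower number means a higher priority are all assumptions made for the example, not a decided design.

```python
import heapq
from datetime import datetime, timedelta


class RecrawlSchedule:
    """Tracks when each category is next due for a re-crawl."""

    def __init__(self, interval_days):
        self.interval = timedelta(days=interval_days)  # the "X days"
        self._heap = []  # entries: (due time, priority, category)

    def add(self, category, priority, first_due):
        # Assumption: a lower priority number is crawled first.
        heapq.heappush(self._heap, (first_due, priority, category))

    def pop_due(self, now):
        """Return the categories due at `now`, highest priority first,
        and reschedule each one `interval` after `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap))
        due.sort(key=lambda entry: (entry[1], entry[0]))
        for _, priority, category in due:
            heapq.heappush(self._heap, (now + self.interval, priority, category))
        return [category for _, _, category in due]


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    schedule = RecrawlSchedule(interval_days=7)
    schedule.add("sports", 2, start)
    schedule.add("traveling", 1, start)
    print(schedule.pop_due(start))
```

A job run once a day could call `pop_due` with the current time and hand the returned categories to the crawler in that order.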
This is just a brief summary of overall features.
Does anybody have experience with this kind of crawling and application? Please send me a PM describing your experience in these fields, or tell me if you have ever done anything similar. Please let me know in your PM how you would do all this, the time required, and the feasibility of all the client's needs.
Looking forward to your PM.