Create a Gnutella crawler that discovers all currently present peers in the system

2. Description

Using Winsock and Visual Studio .NET 2013, your goal is to create a Gnutella crawler that discovers all currently present peers in the system. Your program will first contact a seed web server to acquire a set of initial ultrapeers, traverse the entire Gnutella network in BFS order, and then record the identities of the ultrapeers it finds and their children (i.e., leaf nodes) in a text file. Using this information, you will then analyze the collected data to answer several questions about the geographic and domain diversity of peers as well as the popularity of individual user agents (i.e., client software).
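
The traversal described above can be sketched roughly as follows. This is a minimal single-threaded illustration in standard C++, not a reference solution: the names PeerReply, ContactFn, and crawlBfs are hypothetical, and the actual Winsock exchange with each ultrapeer is assumed to live behind the contact callback.

```cpp
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical result of contacting one ultrapeer: the peers it reports.
struct PeerReply {
    std::vector<std::string> ultrapeers;  // candidates for further crawling
    std::vector<std::string> leaves;      // recorded only, never contacted
};

// Hypothetical callback that performs the network exchange for one ultrapeer.
using ContactFn = std::function<PeerReply(const std::string&)>;

// Breadth-first crawl from the seed list; returns every unique peer seen.
std::set<std::string> crawlBfs(const std::vector<std::string>& seeds,
                               const ContactFn& contact) {
    std::set<std::string> found;     // unique ultrapeers and leaves
    std::set<std::string> enqueued;  // ultrapeers queued so far (contact each once)
    std::deque<std::string> queue;   // FIFO order gives BFS

    for (const auto& s : seeds) {
        found.insert(s);
        if (enqueued.insert(s).second) queue.push_back(s);
    }
    while (!queue.empty()) {
        std::string current = queue.front();
        queue.pop_front();
        PeerReply reply = contact(current);
        for (const auto& leaf : reply.leaves) found.insert(leaf);
        for (const auto& up : reply.ultrapeers) {
            found.insert(up);
            if (enqueued.insert(up).second) queue.push_back(up);
        }
    }
    return found;
}
```

Keeping the set of already-queued ultrapeers separate from the set of found peers is one way to guarantee that a leaf is recorded without ever entering the queue.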

Requirements for the implementation:

1. Must be able to connect to a GWebCache (specified at the command prompt using a URL string host[:port][/path], where parts in [] are optional) and download a list of active seed ultrapeers. Make sure to check that the status code of the response is 200 OK and that the protocol type in the first line of the response is indeed HTTP.

2. Must be able to use BFS to crawl the entire Gnutella network of ultrapeers starting from the seed list (each ultrapeer must be contacted no more than once, leaf nodes must not be contacted at all). Make sure to check that the response begins with the correct string compliant with the protocol (i.e., GNUTELLA/version statusCode statusText).

3. During the crawl, the program must record all found ultrapeers and their leaves in a set and then write it to disk at the end of the crawl (this set must contain unique elements only).

4. The final version must support operation with N threads and crawl up to M contacted ultrapeers, where both N and M are specified by the user at the command prompt (e.g., a command line ending in 200 300000). For simplicity, count each ultrapeer pulled from the BFS queue as “contacted.”
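
The input checks in requirements 1 and 2 could be sketched as below. This is an illustration under stated assumptions, not a prescribed design: Target, parseTarget, and parseStatusLine are hypothetical names, the default port 80 and default path "/" are assumptions for when the optional parts are omitted, and the code covers only string handling (no Winsock calls).

```cpp
#include <cstddef>
#include <string>

struct Target {
    std::string host;
    int port = 80;           // assumed default when ":port" is omitted
    std::string path = "/";  // assumed default when "/path" is omitted
};

// Splits "host[:port][/path]" into its parts.
// Returns false on an empty host or a malformed port.
bool parseTarget(const std::string& url, Target& out) {
    Target t;
    std::size_t slash = url.find('/');
    std::string authority = url.substr(0, slash);
    if (slash != std::string::npos) t.path = url.substr(slash);
    std::size_t colon = authority.find(':');
    t.host = authority.substr(0, colon);
    if (t.host.empty()) return false;
    if (colon != std::string::npos) {
        std::string portText = authority.substr(colon + 1);
        if (portText.empty() || portText.size() > 5) return false;
        int port = 0;
        for (char c : portText) {
            if (c < '0' || c > '9') return false;
            port = port * 10 + (c - '0');
        }
        if (port < 1 || port > 65535) return false;
        t.port = port;
    }
    out = t;
    return true;
}

// Checks a first line of the form "<protocol>/version statusCode statusText",
// e.g. "HTTP/1.1 200 OK" or "GNUTELLA/0.6 200 OK".
// Returns true and sets statusCode only if the protocol matches and a
// three-digit status code follows the version.
bool parseStatusLine(const std::string& response, const std::string& protocol,
                     int& statusCode) {
    std::string line = response.substr(0, response.find("\r\n"));
    std::string prefix = protocol + "/";
    if (line.compare(0, prefix.size(), prefix) != 0) return false;
    std::size_t space = line.find(' ');
    if (space == std::string::npos || space == prefix.size()) return false;
    if (line.size() < space + 4) return false;
    int code = 0;
    for (std::size_t i = space + 1; i < space + 4; ++i) {
        if (line[i] < '0' || line[i] > '9') return false;
        code = code * 10 + (line[i] - '0');
    }
    if (line.size() > space + 4 && line[space + 4] != ' ') return false;
    statusCode = code;
    return true;
}
```

A caller would pass "HTTP" for the GWebCache response and "GNUTELLA" for an ultrapeer response, then compare statusCode against 200 itself; the parser deliberately reports other codes rather than rejecting them, so they can be logged.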

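For the N-thread, M-ultrapeer requirement, one possible shape for the shared BFS queue is sketched below. The class name Frontier and its methods are hypothetical, and the sketch assumes that a peer counts as "contacted" at the moment it is pulled from the queue, as stated above. It uses only the C++ standard library, so it is independent of how the Winsock I/O is written.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <set>
#include <string>

// Shared BFS frontier for several worker threads (hypothetical helper).
class Frontier {
public:
    explicit Frontier(std::size_t maxContacted) : limit_(maxContacted) {}

    // Queues an ultrapeer unless it was queued before; returns true if new.
    bool push(const std::string& peer) {
        std::lock_guard<std::mutex> lock(m_);
        if (!seen_.insert(peer).second) return false;
        queue_.push_back(peer);
        cv_.notify_one();
        return true;
    }

    // Waits until a peer is available and counts it as contacted.
    // Returns false once the cap M is reached, or once the crawl has drained
    // (queue empty and no worker still processing a peer).
    bool pop(std::string& peer) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] {
            return !queue_.empty() || active_ == 0 || contacted_ >= limit_;
        });
        if (contacted_ >= limit_ || queue_.empty()) return false;
        peer = queue_.front();
        queue_.pop_front();
        ++contacted_;
        ++active_;
        if (contacted_ >= limit_) cv_.notify_all();  // release waiting workers
        return true;
    }

    // Must be called once after each successful pop(), when the worker has
    // finished pushing the neighbors of that peer.
    void done() {
        std::lock_guard<std::mutex> lock(m_);
        --active_;
        if (active_ == 0) cv_.notify_all();  // waiters can now detect a drained crawl
    }

    std::size_t contacted() const {
        std::lock_guard<std::mutex> lock(m_);
        return contacted_;
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::string> queue_;
    std::set<std::string> seen_;   // every ultrapeer ever queued
    std::size_t limit_;            // M
    std::size_t contacted_ = 0;    // peers pulled from the queue so far
    std::size_t active_ = 0;       // peers currently being processed
};
```

Each of the N workers would loop on pop(), contact the returned ultrapeer, push() every ultrapeer it reports, and then call done(). Tracking the number of in-flight peers is what lets idle workers tell an empty queue that may still grow apart from a finished crawl.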
