We need a script that scrapes generic content from various sources on the internet.
Input/Output: The script should read its input from a file and write its output to a file.
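The file-based input and output could look something like the sketch below. The brief does not name a file format, so CSV is an assumption here, as are the column names and the helper names `read_queries` and `write_records`.

```python
import csv

# Assumed column names; the brief does not specify a file layout.
INPUT_FIELDS = ["company_name", "country"]
OUTPUT_FIELDS = INPUT_FIELDS + [
    "legal_name", "address", "city", "postal_code", "legal_id", "phone_numbers",
]


def read_queries(path):
    """Read one search query (company name + country) per CSV row."""
    with open(path, newline="", encoding="utf-8") as f:
        return [
            {field: (row.get(field) or "").strip() for field in INPUT_FIELDS}
            for row in csv.DictReader(f)
        ]


def write_records(path, records):
    """Write result records to CSV; missing attributes are left blank."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=OUTPUT_FIELDS, extrasaction="ignore")
        writer.writeheader()
        for record in records:
            writer.writerow(record)
```

Keeping the input columns in the output makes it easy to match each result row back to the query that produced it.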
Example: It should take a Company Name and Country as inputs and search the internet for all publicly available information about the company, such as Legal Name, Address, City, Postal Code, Legal ID, Phone Numbers, etc.
It should then consolidate the details from the various web data sources into a structured data set comprising the attributes listed above as the output.
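One possible shape for the consolidation step is sketched below. It assumes each data source has already been scraped into a plain dictionary of attribute values; the attribute keys and the function name `consolidate` are hypothetical and not part of the brief.

```python
from collections import defaultdict

# Assumed keys for the attributes named in the brief.
ATTRIBUTES = ["legal_name", "address", "city", "postal_code", "legal_id", "phone_numbers"]


def consolidate(source_results):
    """Group the values found for each attribute by the sources reporting them.

    source_results maps a source name to a dict of attribute -> value.
    Returns attribute -> {value: [sources that reported it]}.
    """
    candidates = {attr: defaultdict(list) for attr in ATTRIBUTES}
    for source, fields in source_results.items():
        for attr in ATTRIBUTES:
            value = (fields.get(attr) or "").strip()
            if value:  # skip attributes the source did not provide
                candidates[attr][value].append(source)
    return {attr: dict(values) for attr, values in candidates.items()}
```

Keeping the source names next to each value preserves where every detail came from, which is useful when sources disagree.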
The script should also assign confidence ratings to the retrieved data in cases where different sources return different details for the same entity.
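The brief does not say how confidence should be calculated. As one simple possibility, the sketch below picks the value reported by the most sources and rates it by the share of sources that agree; the function name `rate_attribute` and the scoring rule itself are assumptions, not a required method.

```python
from collections import Counter


def rate_attribute(values):
    """Pick the most frequently reported value and score its agreement.

    values is the list of values that different sources gave for one
    attribute. Returns (chosen_value, confidence), where confidence is
    the fraction of non-empty values that match the chosen one.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return None, 0.0
    counts = Counter(cleaned)
    # On a tie, the value that was encountered first is chosen.
    best, votes = counts.most_common(1)[0]
    return best, votes / len(cleaned)
```

For example, if two sources report the city as "London" and a third reports "Leeds", this rule selects "London" with a confidence of two thirds. A fuller version might weight sources by reliability instead of counting them equally.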