Need a simple script [Python, Node, PHP or language of your choice] to get and parse a web page.
₹600-1500 INR
Closed
Posted almost 5 years ago
Paid on delivery
I am not looking for a general proposal. Bid only if you have time and you can work on this. It's not a big task, so if you can do this I will award it to you.
Requirement detail is as follows:
The input will be a web URL; the output should be the data parsed from that page. I should be able to run it for multiple URLs in a loop or in parallel later. Details are as follows.
I want to crawl a web page [login to view URL] and parse data points from the crawled page.
Sample Page URL:
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
There are a total of 4 sections on the page from which we need to parse the information: Top section, Contact Section, Directions Section, Detailed Information Section. Detailed data points required are as follows:
Top Section
- Business Name
- Business City
- Business State
- Business Description
Contact Section
Company Contact Section:
- Company Name
- Company Phone
- Company Email
- Company Website
- Company Social links like FB, LinkedIn, Pinterest, Twitter, Instagram
- Additional Phone
- Additional Emails
- Additional Websites / Link
Person Contact Section: (Note Contact Person can be multiple)
- Person Name
- Person Role/Title
- Person Phone
- Person Email
- Person social links like FB, LinkedIn, Pinterest, Twitter, Instagram
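The social links requested for both the company and each contact person could be separated from ordinary website links by hostname. This is only a minimal sketch: the `SOCIAL_DOMAINS` table and the `classify_links` helper are hypothetical names, and the domain list is an assumption that would need checking against the real pages.

```python
from urllib.parse import urlparse

# Assumed mapping from hostname to network name; extend as needed.
SOCIAL_DOMAINS = {
    "facebook.com": "facebook",
    "fb.com": "facebook",
    "linkedin.com": "linkedin",
    "pinterest.com": "pinterest",
    "twitter.com": "twitter",
    "instagram.com": "instagram",
}


def classify_links(hrefs):
    """Split hrefs into ({network: [urls]}, [other urls]) by hostname."""
    social, other = {}, []
    for href in hrefs:
        host = (urlparse(href).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Match the domain itself or any subdomain of it (e.g. in.linkedin.com).
        network = next(
            (name for domain, name in SOCIAL_DOMAINS.items()
             if host == domain or host.endswith("." + domain)),
            None,
        )
        if network:
            social.setdefault(network, []).append(href)
        else:
            other.append(href)
    return social, other
```

Anything that does not match a known network falls into the second list, which maps naturally onto the "Additional Websites / Link" field.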
Directions Section
- Address
- Open Hours
Detailed Information Section:
- Location Type
- Year Established
- Annual Revenue Estimate
- Total Employees
- SIC Code
- Business Category
- Memberships
- Page Administered By (link and name)
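The data points above could be collected into one record per URL. The following is a sketch of such a record in Python; the class and field names (`BusinessRecord`, `PersonContact`, and so on) are made up for illustration and simply mirror the four sections listed. Every field defaults to empty because a given page may not show all of them.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class PersonContact:
    name: Optional[str] = None
    role: Optional[str] = None
    phone: Optional[str] = None
    email: Optional[str] = None
    social_links: List[str] = field(default_factory=list)


@dataclass
class BusinessRecord:
    source_url: str
    # Top section
    business_name: Optional[str] = None
    business_city: Optional[str] = None
    business_state: Optional[str] = None
    business_description: Optional[str] = None
    # Company contact section
    company_name: Optional[str] = None
    company_phone: Optional[str] = None
    company_email: Optional[str] = None
    company_website: Optional[str] = None
    company_social_links: List[str] = field(default_factory=list)
    additional_phones: List[str] = field(default_factory=list)
    additional_emails: List[str] = field(default_factory=list)
    additional_links: List[str] = field(default_factory=list)
    # Person contact section (there can be several people)
    persons: List[PersonContact] = field(default_factory=list)
    # Directions section
    address: Optional[str] = None
    open_hours: Optional[str] = None
    # Detailed information section
    location_type: Optional[str] = None
    year_established: Optional[str] = None
    annual_revenue_estimate: Optional[str] = None
    total_employees: Optional[str] = None
    sic_code: Optional[str] = None
    business_category: Optional[str] = None
    memberships: List[str] = field(default_factory=list)
    administered_by_name: Optional[str] = None
    administered_by_link: Optional[str] = None

    def to_json(self) -> str:
        """Serialize the record, including nested persons, to a JSON string."""
        return json.dumps(asdict(self), ensure_ascii=False)
```

Values are kept as strings here since the page formats for revenue, employee counts, and years are not known in advance.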
You need to write a script that gets a page [handling redirections, cookies, etc.], parses it to extract the data points mentioned above, and loops over a list of input URLs.
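A rough sketch of that fetch, parse, and loop flow using only the Python standard library is shown here. The page structure is not known, so `parse_page` is a placeholder that extracts only the `<title>` text; the real field extraction would replace it. All function names, the User-Agent string, and the `workers` parameter are assumptions for illustration.

```python
import http.cookiejar
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


def make_opener():
    """Opener that stores cookies; build_opener() also follows redirects by default."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", "Mozilla/5.0 (compatible; example-scraper)")]
    return opener


def fetch(opener, url, timeout=30):
    """Download a page and decode it using the charset from the response headers."""
    with opener.open(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class TitleParser(HTMLParser):
    """Placeholder parser: collects only the text inside <title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html):
    parser = TitleParser()
    parser.feed(html)
    return {"title": parser.title.strip()}


def scrape_one(url, fetch_fn):
    """Fetch and parse one URL; a failure is recorded instead of stopping the run."""
    try:
        record = parse_page(fetch_fn(url))
        record.update(url=url, error=None)
    except Exception as exc:
        record = {"url": url, "error": str(exc)}
    return record


def scrape_all(urls, fetch_fn=None, workers=1):
    """Run over a list of URLs, sequentially or with a thread pool (order is kept)."""
    if fetch_fn is None:
        opener = make_opener()
        fetch_fn = lambda u: fetch(opener, u)
    if workers <= 1:
        return [scrape_one(u, fetch_fn) for u in urls]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: scrape_one(u, fetch_fn), urls))
```

Calling `scrape_all(urls)` runs the list in a loop, and `scrape_all(urls, workers=4)` runs it in parallel. The `fetch_fn` argument exists so the parsing can be tried against saved HTML without touching the network. Pages that build their content with JavaScript would need a browser-based fetcher instead of this one.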
If this goes well, you might have the next job!
Thanks
Hi,
I have worked on similar projects before (using Python), and this is definitely not the SIMPLE script you describe.
A lot of effort will go into inspecting multiple sample web pages and their behavior (if there is JavaScript that runs on a click or other actions, it will be a little more difficult). Then we need to look at each page, which might have different information available, and decide how to handle that. Creating a loop and extracting the data is the easier part. But the effort required to understand where to look and what to extract makes this project worth much more than the budget range you have given.
Also, the source code for the web scraping is something that does not come cheap. You can potentially use it to scrape millions of other pages with the same structure.
Cheers!