Closed

Fast Webpage Crawler and Scraper

I need a crawler that will crawl a list of domains that I will load from a CSV file. The crawler needs to crawl ONLY THE LANDING PAGE, not the entire site, capture the following for each URL, and output a CSV file that is stored to Dropbox:

1) Does URL have Google Analytics code - yes or no.

Use a search for "Google Analytics" in the source of the page.

2) Is there a link to a privacy policy on the page - yes or no.

Use a search for the word "Privacy" in the link text.

3) How many unique internal URL links are present on the page. Return link count.

4) Is the URL secure (SSL) - yes or no.

5) Is the URL mobile-friendly - yes or no.

Use a search for meta name="viewport" in the source of the page.

6) Is the domain parked - yes or no.

Look for keywords or phrases in the source code.

7) Is a phone number present on the page? yes or no. Capture the phone number.

8) URL being crawled.
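
The checks above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a finished tool: the function name analyze_landing_page, the parked-domain phrase list, the phone pattern (North American style), the extra "google-analytics" search string, and the rule "https scheme means secure" are all assumptions, since the brief does not specify them.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical phrases; the brief does not say which keywords mark a parked domain.
PARKED_PHRASES = ("domain is for sale", "buy this domain", "domain parking")

# Loose North American phone pattern (an assumption; no format is specified).
PHONE_RE = re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


class LinkCollector(HTMLParser):
    """Collects [href, link text] for every <a> tag and notes a viewport meta tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.has_viewport = False
        self._open = None  # the link whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._open = [attrs["href"], ""]
            self.links.append(self._open)
        elif tag == "meta" and (attrs.get("name") or "").lower() == "viewport":
            self.has_viewport = True

    def handle_data(self, data):
        if self._open is not None:
            self._open[1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._open = None


def yes_no(flag):
    return "yes" if flag else "no"


def analyze_landing_page(url, html):
    """Return one output row (a dict) for a landing page that was already downloaded."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(url).netloc.lower()

    # Unique internal links: same host as the page, fragments ignored.
    internal = set()
    for href, _text in parser.links:
        absolute = urljoin(url, href).split("#")[0]
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc.lower() == host:
            internal.add(absolute)

    lowered = html.lower()
    phone = PHONE_RE.search(html)
    return {
        "url": url,
        # "google-analytics" (the script host) is an added assumption.
        "google_analytics": yes_no("google analytics" in lowered or "google-analytics" in lowered),
        "privacy_link": yes_no(any("privacy" in text.lower() for _href, text in parser.links)),
        "internal_links": len(internal),
        # Assumption: "secure" means the fetched URL uses the https scheme.
        "ssl": yes_no(urlparse(url).scheme == "https"),
        "mobile_friendly": yes_no(parser.has_viewport),
        "parked": yes_no(any(phrase in lowered for phrase in PARKED_PHRASES)),
        "phone": phone.group(0) if phone else "",
    }
```

Each returned dict maps directly onto one row for csv.DictWriter. Searching raw HTML for phone numbers can also match digit runs inside scripts, so a real tool would likely restrict the search to visible text.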

The crawler must be capable of crawling 70,000 URLs per hour.

To be successful, the script will be tested using 70,000 URLs in one hour.
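
For a sense of scale, 70,000 URLs per hour is 70,000 / 3,600 ≈ 19.4 landing pages per second. The sketch below estimates how many downloads must be in flight at once (Little's law: rate times time per request) and runs the fetches on a thread pool. The 2-second average response time in the usage note and the helper names required_concurrency, fetch_blocking, and crawl are assumptions for illustration, not part of the brief.

```python
import math
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def required_concurrency(urls_per_hour, avg_seconds_per_page):
    """Little's law: requests in flight = request rate x time per request."""
    return math.ceil(urls_per_hour / 3600 * avg_seconds_per_page)


def fetch_blocking(url, timeout=10):
    """Download one landing page; returns (final URL after redirects, HTML text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl(), resp.read().decode("utf-8", errors="replace")


def crawl(urls, workers, fetch=fetch_blocking):
    """Fetch every URL with at most `workers` downloads running at once.

    Returns (url, result) pairs in input order; a failed download yields the
    exception as its result, so one dead domain does not stop the batch.
    """
    def safe_fetch(url):
        try:
            return url, fetch(url)
        except Exception as exc:
            return url, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(safe_fetch, urls))
```

Under the assumed 2-second average, required_concurrency(70000, 2.0) gives 39 simultaneous downloads. Slow or timing-out domains raise the average, so the real figure depends on the URL list, the timeout, and the network the script runs on.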

Skills: Web Scraping, Web Crawling


About the Employer:
(20 reviews) Sheb, United States

Project ID: #23485154

18 freelancers are bidding an average of $221 for this job

mhmhz

Hi, I can deliver a multi-threaded desktop tool that processes 70k URLs per hour. Thanks.

$400 USD in 3 days
(97 reviews)
7.6
mantislin

Hi there, I am a scraping expert and have done more than 500 scraping projects; please check my feedback to see for yourself. Can we discuss this project in more detail? Then I will provide example data/script for you …

$160 USD in 5 days
(296 reviews)
7.6
chirgeo

Hi. I read the project description and have a few questions. 1. Do you need the script as well, or the data only? 2. What is the format of the output data? Is CSV OK? We can do other formats as well. 3. Which fields do …

$200 USD in 4 days
(50 reviews)
6.9
cyberloh

Hello, I have 13 years of experience developing such tools; you can check my profile reviews. I can build such a script for you quickly, and you don't need 15 parallel threads or some special VPS for it, it …

$180 USD in 3 days
(8 reviews)
6.0
songhku925

Hi. I can develop a script to scrape 70,000 URLs per hour using multithreading. I have already finished many web crawling, scraping, and automation projects using Python with Scrapy, bs4, and Selenium. I can extract data …

$100 USD in 3 days
(7 reviews)
5.0
Biocanin

Hi there. I have read your job description carefully and with great interest. I have about 5 years of experience with Python, Django, Selenium, and bs4 (BeautifulSoup). Please visit my past references here. -http://cars …

$200 USD in 7 days
(7 reviews)
4.4
pandread1

Hi, I'm an expert at web scraping with Python. The task is clear: we need multi-threaded/asynchronous programming to achieve that speed. It also depends on your network bandwidth, but that should be fine. Conta …

$100 USD in 3 days
(12 reviews)
4.4
VileGnosis

I can make a multi-threaded desktop application that downloads as many pages as possible. However, its speed depends entirely on your internet speed and the response times of the websites it downloads from. My …

$150 USD in 1 day
(11 reviews)
4.6
iautomationus

Scraping 70,000 URLs in an hour depends entirely on the hosting this bot will run on. You'll need at least 15 parallel threads, which many VPS, VM, and dedicated hosting providers offer. I've experience wi …

$230 USD in 2 days
(25 reviews)
4.5
bilellh

Hi, I'm an expert in highly responsive websites built with optimal web technologies. I could do the job perfectly. I will build this project as a C# desktop application with buttons, a progress bar, and multithreading. Everything i …

$200 USD in 7 days
(4 reviews)
3.4
rty567

Hi. I can implement a Python script (with the Scrapy framework) that will scrape all the data needed. BTW, I have implemented similar multi-site scrapers before. I have a question: do you need support for JavaScript-generated sit …

$200 USD in 5 days
(1 review)
2.4
valigg

Hey, I can provide such a scraper built in Python + Scrapy. If you have a faster solution than Scrapy in mind, I would like to hear what it is. I will integrate the Dropbox SDK to upload results there. I will wrap …

$100 USD in 1 day
(1 review)
1.0
lubomirkalinin43

Hi there. I read your requirements carefully and I can complete them perfectly. I have developed many projects similar to yours, and I can develop your project quickly and well. If I am offered, I will p …

$500 USD in 2 days
(0 reviews)
0.0
cristianR909090

Hi, I can help you with your interesting work. Here are my skills: ◆Web Development: Websites Back and Frontend◆ WooCommerce, Magento, Shopify, PrestaShop, ASP.NET Webform/MVC/WPF/WCF, Data Scraping, AngularJS, VueJS …

$199 USD in 7 days
(0 reviews)
0.0
bravemaster701

We are experienced crawler developers. We have completed 10+ crawling projects in several languages, such as Python, C#, Java, and PHP, for AJAX and non-AJAX pages alike. We can meet all your requirements except one. Crawling speed is depen …

$140 USD in 7 days
(0 reviews)
0.0
sharifahmad2061

I will develop a spider for you using the Python Scrapy framework. The framework supports asynchronous web requests, which should meet the 70,000/hr requirement. Message me to discuss further.

$120 USD in 3 days
(0 reviews)
0.0
vpnshogun

Hello there, I have experience with this type of crawler. You can contact me to discuss the details. Do you have any specifics in mind for how it should be implemented? What language, proxies, cloud/clusters, and so on? Chee …

$555 USD in 7 days
(0 reviews)
0.0
ankleshsingh

Hi, hope you are doing well. I have gone through your requirements and I have the skill set you are looking for. I have expertise in scraping as well; refer to the links shared below for seller reviews. I am certified a …

$250 USD in 3 days
(0 reviews)
0.0