290072 Whois and PHP CURL expert

In Progress. Posted on Feb 17, 2009. Paid on delivery.

Hi,

- We need a script that gets information from URLs SCRAPED from the main search engines (Google, Yahoo, MSN) based on a keyword

- Each URL is based on a KEYWORD; each keyword gives us results on each search engine

- Each keyword or group of keywords can be stored/saved in a PROJECT

- We should use proxies (CURL)

- To scrape the URLs we recommend using CURL

- Each URL should be sent as a query to the public whois database or to Alexa to get the main details (URL, Company, Name, Phone, Address, Email)

- To avoid problems, use CURL with IE or Firefox user agents

- USE a list of FRESH proxies (with CURL): pick random ones, jump from good to open ones, and add to a blacklist the ones that are not working or have a long response time

- Each project should be SAVED

- We should be able to export all the DATA from each project to a CSV file

- For GOOGLE use a random WAIT between queries and searches; for Yahoo use their API; for MSN use an MSN Passport to be able to scrape results and avoid problems

- For ALEXA use a random WAIT too

- NO Duplicate records

- If an email is in the format AT or DOT, transform it to (AT) or (.) (use string replace!)

- Use preg_match too, and comment the CODE so we can make changes after each search engine update

- We need an exclusion list IN THE DATABASE to avoid getting info from cnn, bbc, google... We don't need that info, and we should be able to make that list grow over time.
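
To make the exclusion list and the "no duplicates" rule concrete, here is a rough sketch of the filtering step. It is written in Python only to keep it short; the job itself calls for PHP. The function names, the in-memory set standing in for the database table, and the choice to deduplicate by host name (one whois lookup per domain) are all assumptions for illustration, not part of the requirements.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased host of a URL without a leading 'www.'.

    Returns an empty string when the URL has no recognizable host.
    """
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_excluded(url, excluded_domains):
    """True if the URL's host is an excluded domain or a subdomain of one."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in excluded_domains)


def filter_urls(urls, excluded_domains):
    """Drop excluded and duplicate hosts, keeping the first URL seen per host.

    Assumption: duplicates are judged by host name, since the whois
    lookup is done per domain rather than per page.
    """
    seen = set()
    kept = []
    for url in urls:
        host = host_of(url)
        if not host or host in seen or is_excluded(url, excluded_domains):
            continue
        seen.add(host)
        kept.append(url)
    return kept


# Hypothetical exclusion list; in the real script this would be a database table.
excluded = {"cnn.com", "bbc.co.uk", "google.com"}
scraped = [
    "http://www.cnn.com/world",
    "http://example.org/contact",
    "https://example.org/about",
    "http://news.bbc.co.uk/",
    "http://shop.example.net/",
]
print(filter_urls(scraped, excluded))
```

Because the exclusion list is just data, adding a new domain to it later needs no code change, which is the point of keeping it in the database.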

Thanx,

J.

MySQL Odd Jobs PHP

Project ID: #2036374

About the project

Remote project, open (Jul 11, 2012)