Closed

Webcrawler / Spider - Data Extraction

We need a webcrawler / spider that can collect the technical specifications of a particular product.

•In essence, we want to input the name and/or model number of a product, and the spider should extract its technical specifications from multiple websites (10-20). You may want to query Google first for the top 10-20 results and then crawl those sites. The number of products could range from 100 to thousands at a time, and we should be able to upload the list as a CSV or similar file.

•The next step is some level of “fuzzy logic” that compares the specification names/fields across the different results and identifies a tolerable level of similarity between them; the matched name then becomes the field label for that particular feature/specification. For example, certain key technical specifications are almost always mentioned for a given type of product, such as megapixels for digital cameras.

•The next step is to apply the same kind of fuzzy logic to the specification values themselves, since webmasters often post data inaccurately or incompletely and leave some specs out.

•All the data should then be stored in a database that is searchable. The data should be presented in a tabular format.

•Where possible, the application should supply a URL to the PDFs containing the technical specifications and/or user manuals of the product. The source URLs of the data should be included as well.

•Our preference is for a web-based solution using open-source technologies such as PHP and MySQL. The application must be secure and scalable.

•We will require a web-based front end to display the results to users, so integration into a CMS such as WordPress or Joomla would be preferable.
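
The two fuzzy-matching steps in the list above could be approached along the following lines. This is only a minimal sketch, written in Python for illustration (the posting itself prefers PHP), and it assumes the crawler has already produced one label-to-value mapping per site. The function names, the 0.8 similarity threshold and the sample data are all assumptions made for the example, not part of the brief; the similarity score comes from the standard library's difflib.SequenceMatcher.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and turn punctuation (except '.') into single spaces."""
    return re.sub(r"[^a-z0-9.]+", " ", text.lower()).strip()


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster(items, threshold):
    """Greedy grouping: an item joins the first group whose first member
    is similar enough, otherwise it starts a new group."""
    groups = []
    for item in items:
        for group in groups:
            if similarity(item, group[0]) >= threshold:
                group.append(item)
                break
        else:
            groups.append([item])
    return groups


def consensus(items, threshold):
    """Most frequent normalized spelling inside the largest group."""
    largest = max(cluster(items, threshold), key=len)
    return Counter(normalize(i) for i in largest).most_common(1)[0][0]


def merge_specs(scraped, threshold=0.8):
    """scraped: one {label: value} dict per site (hypothetical input shape).
    Returns {agreed field label: agreed value}."""
    all_labels = [label for site in scraped for label in site]
    merged = {}
    for group in cluster(all_labels, threshold):
        members = set(group)
        # Collect every value that any site published under a label in this group.
        values = [v for site in scraped for k, v in site.items() if k in members]
        merged[consensus(group, threshold)] = consensus(values, threshold)
    return merged


if __name__ == "__main__":
    # Made-up sample data standing in for three crawled product pages.
    sites = [
        {"Megapixels": "16.1 MP", "Weight": "300 g"},
        {"Mega Pixels": "16.1 MP", "Weight": "300 g"},
        {"megapixels": "16 MP", "Weight:": "305 g"},
    ]
    print(merge_specs(sites))
```

With the sample data, the three spellings of "megapixels" collapse into one field and the value reported by most sites wins. A production version would likely need unit-aware comparison of values and a tuned threshold per product category, which this sketch does not attempt.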

We have many ideas about the logical flow for achieving the above, as well as the bigger picture for the entire project; these will be shared with those shortlisted as potential suppliers. The code must belong to us, and you must be prepared to sign an NDA.

This is the initial project; depending on its success, there will be ongoing enhancements and features required. Please read the above carefully and send through any questions you have, as well as constructive responses.

Skills: Data Mining, MySQL, PHP, Software Architecture, Web Scraping

About the Employer:
(22 reviews) Zichron Yakov, Israel

Project ID: #1625679

8 freelancers are bidding an average of $1150 for this job

phpMaestro

Hi, We have designed and built websites for various types of businesses very effectively. We work with all of our clients individually to easily coordinate and to keep track of the requirements and scope. We fulfil ...

$1450 USD in 30 days
(224 Reviews)
9.1
phpXpertbd

I specialize in similar projects. Please check PM for more details.

$1250 USD in 18 days
(27 Reviews)
6.0
seekdeveloper

Hi, Kindly check PMB

$1500 USD in 14 days
(16 Reviews)
5.7
intellisense1

Respected madam, I am ready to work on this project. Please give me the honor of working with you. Also, check your inbox. Thanks

$1000 USD in 30 days
(35 Reviews)
5.5
Bistechsupport

Hi, this is Jeni from Bistech Support. We would like to inform you that we are able to develop this WordPress project. Please find the attached document for your reference. Our ballpark quote is 750 U ...

$750 USD in 7 days
(17 Reviews)
4.4
RedCraft

Can provide an expandable solution for crawling product descriptions. Please check your PMB for clarifications.

$1500 USD in 20 days
(1 Review)
3.7
defoladi

Dear Client, OUR S--K-Y--PE IS [url removed, login to view] WE ARE NOT TAKING ADVANCED, YOU CAN PAY AS PER WORK # We are ready to discuss the project with you and based on that move forward Lets discuss ...

$750 USD in 5 days
(0 Reviews)
0.0
allecs

Hello, Please check PM for further details.

$1000 USD in 15 days
(0 Reviews)
0.0
danielricha25

Assurance of excellent job done with expert team having 5 years of experience in this field within allotted time with consideration of minute details.

$1000 USD in 10 days
(0 Reviews)
0.0