Closed

Crawler for Gadget Wiki

We need a web crawler that extracts structured information from tables on specific websites and stores it in a MongoDB database. The crawler should use a plugin architecture so that we can add more source websites easily. It should re-visit URLs after a configurable number of days and update the information in our DB if it has changed on the source website. It should also discover new pages on the sites we're interested in (using each site's sitemap page) and crawl them automatically.
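As a rough illustration of these requirements, here is a minimal Node.js sketch of a plugin registry, a re-visit check and sitemap URL extraction. Everything named in it is an assumption made for illustration and not part of the brief: the plugin shape (`hostname`, `revisitDays`, `parseRows`), the helper names and the `gadgets.example.com` host. Fetching pages and parsing HTML are left out.

```javascript
// Hypothetical plugin registry: one plugin object per source website.
const plugins = new Map();

function registerPlugin(plugin) {
  // A plugin declares the hostname it handles, how often to re-visit,
  // and how to turn table rows into a record.
  plugins.set(plugin.hostname, plugin);
}

function pluginFor(url) {
  // Pick the plugin by hostname; null means the site is not supported yet.
  return plugins.get(new URL(url).hostname) || null;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A URL is due once `revisitDays` have passed since it was last crawled.
function isDue(lastCrawledAt, revisitDays, now = new Date()) {
  if (!lastCrawledAt) return true; // never crawled before
  return now.getTime() - lastCrawledAt.getTime() >= revisitDays * MS_PER_DAY;
}

// Collect the page URLs listed in <loc> elements of a sitemap.xml body.
// A regex is enough for a sketch; a real crawler may prefer an XML parser.
function extractSitemapUrls(xml) {
  return Array.from(xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g), (m) => m[1]);
}

// Example plugin for an assumed source site.
registerPlugin({
  hostname: "gadgets.example.com",
  revisitDays: 7,
  parseRows(rows) {
    // rows: array of [label, value] pairs taken from a spec table
    return Object.fromEntries(
      rows.map(([label, value]) => [label.trim().toLowerCase(), value.trim()])
    );
  },
});
```

Adding a new source website would then mean writing one more `registerPlugin({...})` call, without touching the scheduling or discovery code.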

The data must be stored in MongoDB. The schema and some source websites will be shared with you. We prefer Node.js, but crawlers written in other languages are also acceptable, provided the DB backend is MongoDB.
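Since the real schema is yet to be shared, the following is only a sketch of how a re-crawled record might be compared with the stored one and turned into a MongoDB upsert. The field names `url`, `data` and `lastCrawledAt` and the helper names are assumptions. The sketch only builds the arguments, so it runs without a database connection.

```javascript
// Serialize plain JSON values (strings, numbers, arrays, plain objects)
// with sorted keys, so that key order does not count as a change.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + stableStringify(value[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// True when nothing is stored yet or the freshly crawled data differs.
function hasChanged(stored, fresh) {
  return !stored || stableStringify(stored.data) !== stableStringify(fresh);
}

// Build filter, update and options for an upsert keyed on the page URL.
// Field names here are placeholders until the real schema is known.
function buildUpsert(url, fresh, now = new Date()) {
  return {
    filter: { url },
    update: { $set: { data: fresh, lastCrawledAt: now } },
    options: { upsert: true },
  };
}
```

With the official Node.js driver, the result could be passed to `collection.updateOne(filter, update, options)`, which inserts the document if no match exists and updates it otherwise.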

Skills: node.js, NoSQL Couch & Mongo

See more: wiki websites, web wiki software, web crawler wiki, web crawler architecture, schema update, js for, architecture of web crawler, node.js mongo, mongodb c++, couch database, website crawler, web crawlers, Node js website, mongo, database crawler, data crawler, crawler, couch, web data crawler, structured database, mongodb add, crawl sites, node crawler, information crawler, data website using web crawlers

About the Employer:
(0 reviews) Bangalore, India

Project ID: #1712005

2 freelancers are bidding an average of ₹16250 for this job


I can develop this tool with Node.js and MongoDB.

₹20000 INR in 4 days
(0 Reviews)

Sir, please check the PM.

₹12500 INR in 2 days
(0 Reviews)