Completed

Screen scraping of 6 websites using open source

I require a very simple application that scrapes data from 6 different sites and creates an XML output.

The program should use an open source scraping tool called WebHarvest (you can find it at: [url removed, login to view]).

What I need from you are Web Harvest script files that create a variable containing the XML, and a small Java application that executes the script and prints the XML (example: [url removed, login to view]).

The Java main should contain no code other than running the script, passing in parameter values, and outputting the XML. All the logic, including the creation of the XML, will reside in the scripts.
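A minimal sketch of how thin the Java entry point could be, assuming the parameters arrive on the command line as `name=value` pairs after the script path. The class name `LauncherArgs` and this argument layout are hypothetical, and the call into Web Harvest itself is deliberately left out; only the parameter handling is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LauncherArgs {

    // Hypothetical layout: args[0] is the script path, the rest are name=value pairs.
    public static Map<String, String> parseParams(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value, got: " + args[i]);
            }
            params.put(args[i].substring(0, eq), args[i].substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: LauncherArgs <script.xml> [name=value ...]");
            return;
        }
        Map<String, String> params = parseParams(args);
        // A real launcher would hand args[0] and params to the scraping tool here
        // and print the XML variable that the script produced.
        System.out.println("script=" + args[0] + " params=" + params);
    }
}
```

Keeping the main this small leaves every site-specific decision in the script files, which is what the requirement asks for.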

There are 6 URLs in total that require web scraping. They are listed below with their requirements. Each site requires its own script:

[url removed, login to view]

Takes a state as the search criterion. Returns pages of results. Each result (from all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Returns results in a Flash-rendered view. Each result (from all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Each result (from all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

[url removed, login to view] (list view)

Takes a zip code AND a price range. Each result (from all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view] with the real estate plugin

Takes a zip code AND a price range. Each result (from all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

Because all of these are real estate websites, you will first need to submit a POST search on each of them in order to scrape the results. The POST search query typically requires a zip code, state and/or city.
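To illustrate what such a POST search body looks like, here is a small sketch that form-encodes a set of search fields. The field names `zip`, `state` and `city` and the class name `SearchForm` are assumptions for illustration only; each real site will use its own form fields, and in the delivered project the request itself would be issued from the Web Harvest script rather than from Java.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SearchForm {

    // Builds an application/x-www-form-urlencoded body from the given fields,
    // preserving the iteration order of the map.
    public static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names; a real site defines its own.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("zip", "60601");
        fields.put("state", "IL");
        fields.put("city", "Oak Park");
        System.out.println(encode(fields)); // prints zip=60601&state=IL&city=Oak+Park
    }
}
```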

The scripts should be callable from Java code. You will provide both the scripts and the Java code.

Skills: Engineering, Java, MySQL, PHP, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing

About the Employer:
(3 reviews) Chicago, United States

Project ID: #3498408

Awarded to:

abhay78

See private message.

$212.5 USD in 14 days
(94 reviews)
6.4

5 freelancers are bidding an average of $315 for this job

smartsallar

See private message.

$425 USD in 14 days
(26 reviews)
5.4
brainwithstorm

See private message.

$425 USD in 14 days
(30 reviews)
4.9
ashwinc

See private message.

$212.5 USD in 14 days
(4 reviews)
3.6
cicdev

See private message.

$297.5 USD in 14 days
(3 reviews)
0.0