In Progress

155264 Specialized data grabbing program

I need a data grabber program for Windows specialized for grabbing full-text statutes (laws) from the web, with post-editing of the downloaded pages.

EXAMPLE OF INPUT (the web sources...):

These are just two examples of full-text statutes that it must be able to grab:

[url removed, login to view]

[url removed, login to view]

Note that they are on multiple pages and that they are hierarchically structured.

EXAMPLE OF OUTPUT (a local html file for each web source...):

The attached file [url removed, login to view] shows how each statute must be post-edited and stored locally after being downloaded from the web.

Note that the multiple downloaded pages of each statute must be merged into a single file (one per statute) and that the hierarchical structure must be reproduced using the h1, h2, h3, etc. tags.
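To illustrate the merge step, here is a minimal sketch in Python (chosen only for illustration; the posting does not fix a language). It assumes the pages have already been downloaded and split into sections, each tagged with its depth in the hierarchy. The function name `merge_sections` and the `(depth, heading, body_html)` tuple layout are hypothetical, not part of the requirement.

```python
from html import escape


def merge_sections(title, sections):
    """Merge already-downloaded sections of one statute into one HTML document.

    sections: iterable of (depth, heading, body_html) tuples, where depth 1
    is the top level of the statute (hypothetical layout, for illustration).
    """
    parts = [
        "<html>",
        "<head><title>%s</title></head>" % escape(title),
        "<body>",
    ]
    for depth, heading, body_html in sections:
        # HTML only defines h1..h6, so deeper levels are clamped to h6.
        level = min(max(depth, 1), 6)
        parts.append("<h%d>%s</h%d>" % (level, escape(heading), level))
        if body_html:
            # Body HTML is assumed to be cleaned up by the post-editing step.
            parts.append(body_html)
    parts.append("</body>")
    parts.append("</html>")
    return "\n".join(parts)
```

Calling `merge_sections("Sample Act", [(1, "Title 1", ""), (2, "Chapter 1", "<p>Text</p>")])` would yield one document in which "Title 1" is an h1 and "Chapter 1" an h2, regardless of how many web pages the sections came from.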


The data grabber must work for any full-text statute published as HTML on the web. I don't expect it to be fully automatic, since I think it is impossible to foresee and pre-code every site that publishes full-text statutes.

What I require is that it be as automatic as possible, while letting the user finely configure both the spider and the post-editing for each site, including by means of regexps (you can assume that the users are IT experts).
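As a rough idea of what such per-site configuration could look like, here is a small Python sketch. Everything in it is an assumption made for illustration: the `SITE_CONFIG` keys, the helper names, and the sample patterns (which match made-up link and heading formats, not those of any real statute site).

```python
import re

# Hypothetical per-site configuration: one regexp tells the spider which
# links to follow, and a list of regexps maps heading text to a level.
SITE_CONFIG = {
    "follow": r'href="([^"]*chapter\d+\.html)"',
    "headings": [
        (1, r"^TITLE\s+\d+"),
        (2, r"^CHAPTER\s+\d+"),
        (3, r"^§\s*\d+"),
    ],
}


def links_to_follow(page_html, config):
    """Return the link targets on a page that match the site's follow pattern."""
    return re.findall(config["follow"], page_html)


def heading_level(line, config):
    """Return the heading level (1, 2, 3, ...) for a line, or None for body text."""
    text = line.strip()
    for level, pattern in config["headings"]:
        if re.match(pattern, text):
            return level
    return None
```

With a configuration like this, supporting a new site would mostly mean writing a new set of patterns rather than changing the program itself.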


I require full ownership and source code of the program.

I offer escrow payment.

Skills: .NET, Anything Goes, C Programming, Perl, PHP, Visual Basic


About the Employer:
( 0 reviews )

Project ID: #1901448