127139 Web scraping

In Progress Posted on Mar 29, 2007 Paid on delivery

Web address:

[url removed, login to view]

I must have the option to scrape the same data from other web addresses such as

[url removed, login to view]

or

[url removed, login to view]

Although the three web addresses are different, the structure of the information on each site is identical.

Task

I need to be able to carry out the task below on a weekly basis.

From the page at each supplied URL, I want to choose the second option from the drop-down box named 'select a weekly list'. The date changes weekly, but it will always be the second option, the one with 'last week' in parentheses. I will then either leave the option button on its default, 'validated', or choose the option button 'decided'. I will then click the button named 'view list'.
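The selection rule above could be sketched as follows. This is only an illustration: the function name `pick_last_week_option` is hypothetical, and it assumes the scraper has already read the drop-down's option labels into a list of strings. It looks for the '(last week)' marker first and falls back to the second position, since the posting says both always identify the same option.

```python
def pick_last_week_option(labels):
    """Return the index of the drop-down option to select.

    Prefers the option whose label contains '(last week)'; if no label
    carries that marker, falls back to the second option (index 1).
    """
    for index, label in enumerate(labels):
        if "(last week)" in label.lower():
            return index
    if len(labels) > 1:
        return 1  # the posting says it is always the second option
    raise ValueError("drop-down has fewer than two options")
```

Checking the marker as well as the position is a design choice: it would flag a site layout change rather than silently scraping the wrong week.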

Example

I have chosen the second option from the drop-down box, 18/3/2007 to 24/3/2007 (last week), and left the option button on the default, 'validated'. On clicking the 'view list' button I am taken to the first results page (page 1 of 8), which shows that 74 matching applications were found that week (7 pages with 10 results and 1 page with 4 results). The results page shows 10 results per page by default.
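The page arithmetic in this example can be checked with a short sketch. The helper name `page_layout` is hypothetical; it assumes only the total result count and the default of 10 results per page stated above.

```python
def page_layout(total_results, per_page=10):
    """Return (number_of_pages, results_on_last_page)."""
    if total_results <= 0:
        return 0, 0
    pages = -(-total_results // per_page)  # ceiling division without floats
    last_page = total_results - (pages - 1) * per_page
    return pages, last_page


print(page_layout(74))  # (8, 4): 7 full pages plus 1 page with 4 results
```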

The results are shown in a table with seven columns. To access the full data for result 1, I need to click the arrow button in the seventh column, which takes me to that individual application, a form with six tabs at the top to choose from.

On tab 1 (the default tab), 'Application details', I need the following info transferred to Excel:

Application Reference

Address of proposal

Proposal

Type of application

Status

Decision

I then need to click on tab 3, 'Applicant Details', and I need the following info transferred to Excel:

Applicant's name

I then need to click on tab 4, 'Agent Details', and I need the following info transferred to Excel:

Agent's name

Agent's address

Agent's phone number
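The ten fields listed above map naturally onto one spreadsheet row per application. A minimal sketch, assuming a CSV file (which Excel opens directly) is an acceptable delivery format; the posting only says 'transferred to Excel', so the CSV choice and the names `COLUMNS` and `write_rows` are assumptions.

```python
import csv
import io

# Column headings taken from the three tabs named in the posting.
COLUMNS = [
    "Application Reference",
    "Address of proposal",
    "Proposal",
    "Type of application",
    "Status",
    "Decision",
    "Applicant's name",
    "Agent's name",
    "Agent's address",
    "Agent's phone number",
]


def write_rows(rows, out):
    """Write scraped applications (dicts keyed by COLUMNS) as CSV to `out`."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        # A field missing from a scraped record becomes an empty cell.
        writer.writerow({column: row.get(column, "") for column in COLUMNS})


buffer = io.StringIO()
write_rows([], buffer)
print(buffer.getvalue().strip())  # header row only
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.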

I then need to return to the results page and do the same for results 2 to 10. Once I have collected the info for all 10 results on this page, I need to click the 'next page' button, which in this example takes me to page 2 of 8 and shows results 11 to 20, and scrape the same data as I did for results 1 to 10. I need to repeat this process until I reach page 8, which shows results 71 to 74, and again the same data needs to be scraped.
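The page-by-page walk described above amounts to two nested loops. The sketch below shows only that control flow. The two callables `fetch_results_page` and `fetch_application` are hypothetical placeholders for whatever code actually loads a results page and reads the three tabs, since the real page structure is not given here.

```python
def crawl(fetch_results_page, fetch_application, total_pages):
    """Collect one record per application across all results pages.

    fetch_results_page(page_number) -> list of links, one per result row
    fetch_application(link)         -> dict of the fields for that application
    """
    records = []
    for page_number in range(1, total_pages + 1):
        for link in fetch_results_page(page_number):
            records.append(fetch_application(link))
    return records
```

Passing the fetchers in as arguments keeps the loop identical for all three web addresses and lets it be tested with stand-in functions before it touches a live site.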

I carried out this example on Wednesday 28th March 2007, so the second option in the drop-down box reads 18/3/2007 to 24/3/2007 (last week). On Wednesday 4th April 2007, however, the second option will read 25/3/2007 to 31/3/2007 (last week). When this happens the number of results and pages may vary, e.g. there may be 50 results on 5 pages or 97 results on 10 pages. The number of results and the number of pages can always be derived from the info at the top of the first results page.
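The expected 'last week' label for any run date can be computed and compared against the drop-down as a sanity check. This sketch assumes the weekly lists run Sunday to Saturday, which is inferred from the two date ranges quoted above and not stated outright; the function name `last_week_label` is hypothetical.

```python
from datetime import date, timedelta


def last_week_label(today):
    """Build the 'D/M/YYYY to D/M/YYYY (last week)' label for a run date.

    Assumes weeks run Sunday to Saturday, as both examples in the posting do.
    """
    days_since_sunday = (today.weekday() + 1) % 7  # Monday is weekday 0
    this_sunday = today - timedelta(days=days_since_sunday)
    start = this_sunday - timedelta(days=7)
    end = this_sunday - timedelta(days=1)

    def fmt(d):
        return f"{d.day}/{d.month}/{d.year}"  # no zero padding, as on the site

    return f"{fmt(start)} to {fmt(end)} (last week)"


print(last_week_label(date(2007, 3, 28)))  # 18/3/2007 to 24/3/2007 (last week)
print(last_week_label(date(2007, 4, 4)))   # 25/3/2007 to 31/3/2007 (last week)
```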

Odd Jobs Python Visual Basic

Project ID: #1873307

About the project

Remote project, open Jul 11, 2012