I need a web-scraping application that requires no additional dependencies (i.e., nothing beyond the base Java/JVM/JRE install).
*** READ ENTIRE PROJECT REQUIREMENT BEFORE BIDDING ***
The scraper must log to an Excel document (see sample in files section).
Sample URL for testing:
[url removed, login to view]|&bedrooms=0&bathrooms=1&Accessible=False&pictures=False&pets=False&ac=False&AgeRestricted=False&smoking=False&coveredParking=False&MaxSqFt=5000&MinSqFt=0&keyword=&sortBy=LastUpdate
- GUI must match design of sample in files section (Scraper [url removed, login to view])
- accept direct input of URL to begin scraping
- allow adding/editing/deleting of saved URLs
- - a saved URL item will contain a title and the URL. Only the title will be displayed in the GUI list
- clicking an entry in the list will load the URL into the URL text box above the list
- select an output file / option (... button) to type or browse to target file location
- - browse dialog must filter for .xlsx by default
- - scraper must log data into an Excel document
- - if file exists, data will be appended to the existing file
- - - set all entries in the "Active" column to "No"
- - if file does not exist, scraper must create the target file
- - - First row must contain the following headings: Active, Last Active, Landlord Name, Phone, Contacted, Notes
- "Scrape" button will begin scraping the URL in the URL text box.
- - MUST use the text in the box, as it might be manually edited before running
- connect to the website and determine the number of listings/pages returned
- visit each listing's page
- save each landlord's name and telephone number (on the right side of the page)
- - if the Excel file selected earlier exists, scan it to see whether the landlord's information is already in the file (excluding '
- - if already in the file
- - - update record's "Active" field to "Yes"
- - - update record's "Last Active" field to current date (YYYY/MM/DD format)
- - - go to next property
- - if not already in file
- - - add a record to the file using field structure below
The scraper must go through every page of returned results to get all data; links for each page are at the bottom of the page. Visual progress must be displayed as the scraper runs and also stored in a log file named "[url removed, login to view]", representing the date and time the scraper was executed. See "console and log [url removed, login to view]" for an example of what both should look like.
Fields:
- Active (Yes/No)
- Last Active (YYYY/MM/DD format)
- Landlord Name (as it appears)
- Phone (###-###-#### format, no ( ) around first digits)
- Contacted ("No" by default for all new contacts; do not alter for existing contacts)
- Notes (leave blank, do not edit)
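The update-or-add rules above could be sketched roughly as follows, using only the base JDK. This is an illustrative in-memory model, not a required design: `ContactSheet`, `Row`, `markAllInactive`, and `upsert` are hypothetical names, reading and writing the .xlsx file is left out entirely, and rows are matched by phone number as stated in the corrections section. It also assumes a newly added record starts with Active = "Yes" and today's date, which the requirements imply but do not state.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory view of the spreadsheet rows (file I/O not shown). */
final class ContactSheet {
    // "Last Active" must be written as YYYY/MM/DD.
    static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    /** One spreadsheet row; fields mirror the column headings. */
    static final class Row {
        String active, lastActive, landlordName, phone, contacted, notes;
    }

    // Keyed by phone number (the duplicate rule); LinkedHashMap keeps row order.
    private final Map<String, Row> rowsByPhone = new LinkedHashMap<>();

    /** Run once after loading an existing file, before scraping starts. */
    void markAllInactive() {
        for (Row r : rowsByPhone.values()) {
            r.active = "No";
        }
    }

    /** Updates the row with this phone number, or adds a new row with defaults. */
    Row upsert(String landlordName, String phone, LocalDate today) {
        Row row = rowsByPhone.get(phone);
        if (row == null) {
            row = new Row();
            row.landlordName = landlordName;
            row.phone = phone;
            row.contacted = "No"; // default for new contacts only
            row.notes = "";       // left blank
            rowsByPhone.put(phone, row);
        }
        // Assumption: new rows are also marked active with today's date.
        row.active = "Yes";
        row.lastActive = today.format(DATE);
        return row;
    }

    int size() {
        return rowsByPhone.size();
    }
}
```

Because an existing row only has its Active and Last Active fields touched, any Contacted or Notes values entered by hand survive repeated scrapes.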
The end state is an Excel document I can keep updating and adding new contacts to, based on scrapes of the [url removed, login to view] website.
Deliverable includes all source code files.
To be considered for this project, you MUST:
- Have a bid proposal within the posted project budget
- Include the phrase "Java is more than just coffee" as the first line in your bid proposal
- State when you will be able to begin actively working on the project, and whether you are working on any other projects at this time
- Visit the sample URL, select the first property, and confirm you can see the name and phone number on the page
*** Your bid will not be considered if you do not conduct the steps above ***
If you are unable to see the name and number listed on the right side of the property's page, you may need to click the green "View Phone Number" button. Since you will be parsing the source code, however, the phone number is accessible from this page. The code format follows:
<span id="ctl00_MainContentPlaceHolder_LLNameLblPrint">Russell Thompson</span></h1>
<span id="ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint">(682) 564-4245</span></b>
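As one possible way to pull these two values out of the page source with nothing beyond the JDK, here is a minimal regex-based sketch. The span ids are taken from the sample markup above; `ListingParser` and its method names are hypothetical. It assumes the `id` attribute is double-quoted and each span contains plain text only, and it does not decode HTML entities.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that reads the landlord name and phone from listing HTML. */
final class ListingParser {
    // Span ids as they appear in the sample markup.
    private static final Pattern NAME =
            spanWithId("ctl00_MainContentPlaceHolder_LLNameLblPrint");
    private static final Pattern PHONE =
            spanWithId("ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint");

    private static Pattern spanWithId(String id) {
        // Assumes a double-quoted id attribute and text-only span content.
        return Pattern.compile(
                "<span[^>]*\\bid=\"" + Pattern.quote(id) + "\"[^>]*>([^<]*)</span>",
                Pattern.CASE_INSENSITIVE);
    }

    static Optional<String> landlordName(String html) {
        return first(NAME, html);
    }

    static Optional<String> phone(String html) {
        return first(PHONE, html);
    }

    // Returns the trimmed text of the first matching span, if any.
    private static Optional<String> first(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String html =
                "<span id=\"ctl00_MainContentPlaceHolder_LLNameLblPrint\">Russell Thompson</span></h1>\n"
              + "<span id=\"ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint\">(682) 564-4245</span></b>";
        System.out.println(landlordName(html).orElse("?")); // prints "Russell Thompson"
        System.out.println(phone(html).orElse("?"));        // prints "(682) 564-4245"
    }
}
```

Returning `Optional` lets the caller log and skip a listing whose page lacks either span instead of failing the whole run.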
=== corrections ===
- An entry is considered a duplicate if the phone number matches. Do not worry about the name.
- Do not put dashes or ( ) in the phone numbers (digits only; this replaces the ###-###-#### format listed under Fields)
- For an 800 number, ensure a 1 is placed in front of the number (18005555555)
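A minimal sketch of the phone formatting described in these corrections might look like the following. `PhoneNormalizer` is a hypothetical name, and the 800 rule is interpreted here as "a 10-digit number beginning with 800"; other toll-free prefixes are not mentioned in the requirements and are left untouched.

```java
/** Hypothetical helper that reduces a scraped phone number to the stored form. */
final class PhoneNormalizer {
    /** Returns digits only; a 10-digit 800 number gets a leading 1. */
    static String normalize(String raw) {
        // Drop everything that is not a digit: ( ) - spaces, dots, etc.
        String digits = raw.replaceAll("\\D", "");
        // Assumption: "800 number" means exactly 10 digits starting with 800.
        if (digits.length() == 10 && digits.startsWith("800")) {
            return "1" + digits;
        }
        return digits;
    }
}
```

With this sketch, `PhoneNormalizer.normalize("(682) 564-4245")` yields `6825644245` and `PhoneNormalizer.normalize("(800) 555-5555")` yields `18005555555`. Normalizing before the duplicate check means the same number scraped in different formats still matches one record.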
6 freelancers are bidding an average of $155 for this job
Hey, I can do this job since I am quite experienced in scraping and automation. I have scraped [login to view URL], [login to view URL], [login to view URL], [login to view URL], ebay, [login to view URL], hudohomestore, instagram. I use multi threading to ...
Dear Sir, I read your job description very carefully. I am ready to start this project. I can show a sample. If you are interested, please discuss the project over live chat. Thank you, Arham IT