Cancelled

Website Parser/Scraper Needed [Java]

I require a web-scraping application that does not need any additional dependencies (i.e., nothing other than the base Java/JVM/JRE install).

*** READ ENTIRE PROJECT REQUIREMENT BEFORE BIDDING ***

Scraper must log to an Excel document (see the sample in the files section)

Sample URL for testing:

[url removed, login to view]|&bedrooms=0&bathrooms=1&Accessible=False&pictures=False&pets=False&ac=False&AgeRestricted=False&smoking=False&coveredParking=False&MaxSqFt=5000&MinSqFt=0&keyword=&sortBy=LastUpdate

Scraper requirements:

- GUI must match design of sample in files section (Scraper [url removed, login to view])

- accept direct input of URL to begin scraping

- allow adding/editing/deleting of saved URLs

- - a saved URL item will contain a title and the URL. Only the title will be displayed in the GUI list

- clicking an entry in the list will load the URL into the URL text box above the list

- select an output file / option (... button) to type or browse to target file location

- - browse dialog must filter for .xlsx by default

- - scraper must log data into an Excel document

- - if file exists, data will be appended to the existing file

- - - set all entries in the "Active" column to "No"

- - if file does not exist, scraper must create the target file

- - - First row must contain the following headings: Active, Last Active, Landlord Name, Phone, Contacted, Notes

- "Scrape" button will begin scraping the URL in the URL text box.

- - MUST use the text in the box, as it might be manually edited before running

- connect to website - determine # of listings/pages returned

- visit each listing's page

- save each landlord's name and telephone number (on the right side of the page)

- - if the Excel file selected earlier exists, scan the file to see if landlord information is already in the file (excluding '

- - if already in the file

- - - update record's "Active" field to "Yes"

- - - update record's "Last Active" field to current date (YYYY/MM/DD format)

- - - go to next property

- - if not already in file

- - - add a record to the file using field structure below
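The update-or-add steps above can be sketched in memory as follows. This is only an illustration under stated assumptions: rows are held as plain string arrays in the required column order, a match is decided by phone number (as the corrections section of this posting specifies), and a newly added record gets "Yes" in the Active column, which the posting does not say explicitly. The class and method names are hypothetical, and reading or writing the .xlsx file itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory sketch of the contact sheet; file I/O is intentionally omitted. */
class ContactSheet {
    // Column order taken from the required heading row.
    static final int ACTIVE = 0, LAST_ACTIVE = 1, NAME = 2, PHONE = 3, CONTACTED = 4, NOTES = 5;

    final List<String[]> rows = new ArrayList<>();

    /** For an existing file, every entry is set to "No" before scraping starts. */
    void markAllInactive() {
        for (String[] row : rows) {
            row[ACTIVE] = "No";
        }
    }

    /**
     * Updates the row with a matching phone number, or appends a new row.
     * Returns true when a new row was added.
     */
    boolean record(String name, String phone, String today) {
        for (String[] row : rows) {
            if (row[PHONE].equals(phone)) {
                row[ACTIVE] = "Yes";
                row[LAST_ACTIVE] = today;
                return false; // Contacted and Notes are left untouched
            }
        }
        // Assumption: a landlord seen for the first time starts as Active = "Yes".
        rows.add(new String[] {"Yes", today, name, phone, "No", ""});
        return true;
    }

    public static void main(String[] args) {
        ContactSheet sheet = new ContactSheet();
        sheet.markAllInactive();
        boolean added = sheet.record("Russell Thompson", "6825644245", "2016/09/05");
        System.out.println(added ? "added" : "updated");
    }
}
```

A linear scan keeps the sketch short; for a large sheet, a map keyed on the phone number would avoid rescanning every row for each listing.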

Scraper must go through each page of returned results to get all data. Links for each page are at the bottom of the page. Visual progress should be displayed as scraper runs, and stored in a log file named "[url removed, login to view]" representing the date and time the scraper was executed. See "console and log [url removed, login to view]" for example of what both should look like.

Fields: Active (Yes/No)

Last Active (YYYY/MM/DD format)

Landlord Name (as it appears)

Phone (###-###-#### format, no ( ) around first digits)

Contacted ("No" by default for all new contacts. do not alter for existing contacts)

Notes (leave blank, do not edit)
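For the "Last Active" field, a minimal sketch of producing the YYYY/MM/DD format with the standard `java.time` classes is shown here. The class name is hypothetical. Note that the Java pattern letters are lowercase `yyyy` and `dd`; in `DateTimeFormatter`, uppercase `YYYY` means the week-based year and `DD` means the day of the year, so copying the posting's notation literally would give wrong dates.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Formats a date the way the "Last Active" column expects it. */
class LastActiveDate {
    // yyyy = calendar year, MM = two-digit month, dd = two-digit day of month.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    static String format(LocalDate date) {
        return date.format(FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDate.now()));
    }
}
```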

The end-state is to have an Excel document I can use to keep updating and adding new contacts based on the scrape of the [url removed, login to view] website.

Deliverable includes all source code files.

To be considered for this project, you MUST:

- Have a bid proposal within the posted project budget

- Include the phrase "Java is more than just coffee" as the first line in your bid proposal

- State when you will be able to begin actively working on the project, and whether you are working on any other projects at this time

- Visit the sample URL, select the first property, and confirm you can see the name and phone number on the page

*** Your bid will not be considered if you do not conduct the steps above ***


If you are unable to see the name and number listed on the right side of the property's page, you may need to click the green "View Phone Number" button. Since you will be parsing the source code, however, the phone number is accessible from this page. The code format follows:

<div class="printcontact">
<h1>Landlord:
<span id="ctl00_MainContentPlaceHolder_LLNameLblPrint">Russell Thompson</span></h1>
<b>Phone:
<span id="ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint">(682) 564-4245</span></b>
</div>
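Since no third-party HTML parser is allowed, one possible approach is to pull the two values out of the page source with regular expressions. The sketch below assumes the two span ids appear exactly as in the sample markup above and that each span holds plain text; the class and method names are hypothetical. Regex matching on HTML is fragile if the site changes its markup, so a missing match is returned as an empty `Optional` rather than treated as an error.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the landlord name and phone number from a listing page's HTML. */
class ListingContactParser {
    // The span ids are copied from the sample markup in the posting.
    private static final Pattern NAME = Pattern.compile(
        "<span id=\"ctl00_MainContentPlaceHolder_LLNameLblPrint\">(.*?)</span>", Pattern.DOTALL);
    private static final Pattern PHONE = Pattern.compile(
        "<span id=\"ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint\">(.*?)</span>", Pattern.DOTALL);

    static Optional<String> landlordName(String html) {
        return firstGroup(NAME, html);
    }

    static Optional<String> phone(String html) {
        return firstGroup(PHONE, html);
    }

    // Returns the trimmed text of the first capture group, if the pattern matches.
    private static Optional<String> firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<h1>Landlord:\n"
            + "<span id=\"ctl00_MainContentPlaceHolder_LLNameLblPrint\">Russell Thompson</span></h1>\n"
            + "<b>Phone:\n"
            + "<span id=\"ctl00_MainContentPlaceHolder_lblPhoneDisplayPrint\">(682) 564-4245</span></b>";
        System.out.println(landlordName(html).orElse("(not found)"));
        System.out.println(phone(html).orElse("(not found)"));
    }
}
```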


=== corrections ===
- An entry is considered a duplicate if the phone number matches. Do not worry about the name.
- Do not put dashes or ( ) in the phone numbers
- For an 800 number, ensure a 1 is placed in front of the number (18005555555)
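The phone corrections above could be applied with a small normalizer like the one below, run before the duplicate check so that matching compares digits only. This is a sketch with a hypothetical class name. It assumes "800 number" means a 10-digit number whose area code is 800; other toll-free prefixes are not mentioned in the posting and are left unchanged.

```java
/** Reduces a scraped phone number to the digits-only form used in the sheet. */
class PhoneNormalizer {
    static String normalize(String raw) {
        // Drop everything that is not a digit: spaces, dashes, parentheses.
        String digits = raw.replaceAll("\\D", "");
        // Assumption: only a bare 10-digit 800 number needs the leading 1.
        if (digits.length() == 10 && digits.startsWith("800")) {
            return "1" + digits;
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(normalize("(682) 564-4245"));
        System.out.println(normalize("(800) 555-5555"));
    }
}
```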

Skills: Java, Software Development, Web Scraping

About the Employer:
(4 reviews) x, United States

Project ID: #11385508

6 freelancers are bidding an average of $155 for this job

shafaqat11

Hello Sir, How are you? I understand your job and am very excited to offer my services for it. Please feel free to contact me directly to discuss this position further. I am online on Skype all the time and ...

$20 USD in 2 days
(43 reviews)
5.1
Iamtuheedakram

Hey, I can do this job since I am quite experienced in scraping and automation. I have scraped [login to view URL], [login to view URL], [login to view URL], [login to view URL], ebay, [login to view URL], hudohomestore, instagram. I use multithreading to ...

$500 USD in 2 days
(18 reviews)
5.0
arhamIT2020

Dear Sir, I read your job description very carefully. I am ready to start this project. I can show a sample. If you are interested, please discuss the project over live chat. Thank you, Arham IT

$166 USD in 5 days
(4 reviews)
2.6
hoangduong97

Hello, I am experienced and specialized in C/C++ and Java programming, and I have experience in Android. I have taken part in many competitions, including a national competition in which I won 2nd prize. I am currently a Co ...

$111 USD in 1 day
(3 reviews)
0.9
elasolova35

Java is more than just coffee. I have a lot of experience with custom web scraping; you can check my previous projects about scraping. Also, I am good at web design so that the a good at designing gui/s, or it can ...

$100 USD in 7 days
(2 reviews)
0.2
$35 USD in 1 day
(0 reviews)
0.0