Completed

Need an updated web spider scraping program written for the following username DFUSER11653

I need a web scraper written for the .xlsx file in the following directory:

[login to view URL]

The latest .xlsx file within that directory will need to be downloaded.

The name of the file is subject to change and will need to be identified by the latest .xlsx extension.
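The "latest file" rule can be sketched as follows. This is a minimal illustration in Python of the selection logic only (the deliverable itself must be Perl). It assumes the directory is served as an Apache-style index page where each link is followed by a `YYYY-MM-DD HH:MM` last-modified timestamp; the posting does not confirm this layout, and the function name `latest_xlsx` and the sample file names are hypothetical.

```python
import re
from datetime import datetime

# Assumption: each .xlsx link is followed, on the same line, by a
# "YYYY-MM-DD HH:MM" last-modified timestamp (Apache-style index page).
LINK_RE = re.compile(
    r'href="([^"]+\.xlsx)"[^\n]*?(\d{4}-\d{2}-\d{2} \d{2}:\d{2})',
    re.IGNORECASE,
)


def latest_xlsx(index_html):
    """Return the href of the most recently modified .xlsx link, or None."""
    candidates = [
        (datetime.strptime(stamp, "%Y-%m-%d %H:%M"), href)
        for href, stamp in LINK_RE.findall(index_html)
    ]
    if not candidates:
        return None
    # Tuples compare by timestamp first, so max() picks the newest file.
    return max(candidates)[1]


# Hypothetical listing used only to exercise the function.
SAMPLE_INDEX = """
<a href="loads_old.xlsx">loads_old.xlsx</a>   2017-03-10 08:15  12K
<a href="notes.txt">notes.txt</a>             2017-03-16 09:00   1K
<a href="loads_new.xlsx">loads_new.xlsx</a>   2017-03-14 17:42  13K
"""

if __name__ == "__main__":
    print(latest_xlsx(SAMPLE_INDEX))
```

Because the file is chosen by extension and timestamp rather than by name, a renamed spreadsheet would still be picked up, which is the behavior the description asks for.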

If a row is blank, skip that row.

If the ship_date column has a past date, that row does not need to be scraped; only scrape rows with current or future dates in the ship_date column.
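The two row filters above (skip blank rows, skip past ship dates) might look like the sketch below. It is written in Python purely to illustrate the logic, not as the Perl deliverable. It assumes each row is a list of cell values with ship_date in column "A" (index 0), and that dates arrive either as date objects or as `M/D/YYYY` text; the actual cell format in the spreadsheet is not stated in the posting. Dropping rows whose date cannot be read is also an assumption.

```python
from datetime import date, datetime


def is_blank(row):
    """True when every cell in the row is None or whitespace-only."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def to_date(value):
    """Coerce a ship_date cell to a date, or return None if unreadable.

    Assumption: cells are date/datetime objects or M/D/YYYY strings.
    """
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.date()
    if isinstance(value, date):
        return value
    try:
        return datetime.strptime(str(value).strip(), "%m/%d/%Y").date()
    except ValueError:
        return None


def keep_row(row, today):
    """Keep non-blank rows whose ship_date (column A) is today or later."""
    if is_blank(row):
        return False
    ship = to_date(row[0])
    # Assumption: rows with an unreadable ship_date are dropped.
    return ship is not None and ship >= today


if __name__ == "__main__":
    rows = [["03/14/2017", "V"], [None, ""], ["03/16/2017", "F"]]
    print([r for r in rows if keep_row(r, date(2017, 3, 15))])
```

Passing `today` in as a parameter, rather than calling `date.today()` inside the function, keeps the filter easy to test.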

The output should be a pipe (|) delimited file with the following column mappings:

origin_city --> data located in column "C"; if the cell contains a comma followed by more data, use only the data BEFORE the comma

origin_state --> data located in column "D"

ship_date --> the date from column "A", converted to YYYY-MM-DD format; if the date is in the past, do not scrape that row

destination_city --> data located in column "F"; if the cell contains a comma followed by more data, use only the data BEFORE the comma

destination_state --> data located in column "G"

receive_date --> leave blank

trailer_type --> the abbreviation located in column "B"

load_size --> data located in column "I"

weight --> data located in column "K"

length --> data located in column "J"

width --> leave blank

height --> leave blank

trip_miles --> leave blank

pay_rate --> data located in column "L"

contact_phone --> data located in column "O"

contact_name --> leave blank

tarp_required --> leave blank

comment --> data located in column "P" and column "Q"; add the text "Load#" before the data from column "Q"

load_number --> leave blank

commodity --> data located in column "M"
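The column mappings listed above can be expressed as a small lookup, sketched here in Python for illustration only (the deliverable must be Perl). Several details are assumptions not stated in the posting: each row is a list whose first element is column "A" holding a date object, the two comment parts are joined with a single space, and "Load#" is attached directly to the column "Q" value and only when that cell has data. The helper names `cell`, `before_comma`, and `map_row` are hypothetical.

```python
from datetime import date

# Output columns, in the order given in the project description.
HEADERS = [
    "origin_city", "origin_state", "ship_date", "destination_city",
    "destination_state", "receive_date", "trailer_type", "load_size",
    "weight", "length", "width", "height", "trip_miles", "pay_rate",
    "contact_phone", "contact_name", "tarp_required", "comment",
    "load_number", "commodity",
]


def cell(row, letter):
    """Return the cell for a spreadsheet column letter as trimmed text."""
    index = ord(letter) - ord("A")
    value = row[index] if index < len(row) else None
    return "" if value is None else str(value).strip()


def before_comma(text):
    """Keep only the part of a city cell that precedes the first comma."""
    return text.split(",", 1)[0].strip()


def map_row(row):
    """Map one spreadsheet row to the 20 output fields."""
    p, q = cell(row, "P"), cell(row, "Q")
    comment_parts = [p] if p else []
    if q:
        comment_parts.append("Load#" + q)  # assumption: no space after '#'
    values = {
        "origin_city": before_comma(cell(row, "C")),
        "origin_state": cell(row, "D"),
        "ship_date": row[0].strftime("%Y-%m-%d"),  # assumes a date object
        "destination_city": before_comma(cell(row, "F")),
        "destination_state": cell(row, "G"),
        "trailer_type": cell(row, "B"),
        "load_size": cell(row, "I"),
        "weight": cell(row, "K"),
        "length": cell(row, "J"),
        "pay_rate": cell(row, "L"),
        "contact_phone": cell(row, "O"),
        "comment": " ".join(comment_parts),
        "commodity": cell(row, "M"),
    }
    # Headers with no source column ("leave blank") fall back to "".
    return [values.get(header, "") for header in HEADERS]


if __name__ == "__main__":
    sample = [date(2017, 3, 15), "V", "Chicago, Cook County", "IL", None,
              "New York", "NY", None, "Full", 48, 42000, 1500, "Steel",
              None, "555-0100", "Call first", "8841"]
    print(map_row(sample))
```

Numeric cells are passed through `str()` unchanged, so a value that the spreadsheet reader returns as a float would need extra formatting before output.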

The first line of the output should contain all of the column headers.

Any field that contains no data should be left blank.

Please do not use words like "null" or "blank" in blank columns.

Below is a sample output of the first 5 columns using sample data:

origin_city|origin_state|ship_date|destination_city|destination_state|

chicago|IL|2017-03-15|new york|NY|

kansas city|MO|2017-03-15|houston|TX|
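A minimal sketch of the output rules (header line first, pipe-delimited, empty fields left empty rather than written as "null" or "blank") is shown below in Python, for illustration only since the deliverable must be Perl. The function name `write_pipe_delimited` is hypothetical. The posting does not say how a literal pipe inside a field should be handled, so this sketch does no escaping.

```python
import io


def write_pipe_delimited(out, headers, rows):
    """Write a header line, then one pipe-delimited line per row.

    None values are written as empty strings, never as "null" or "blank".
    Assumption: field values contain no '|' characters (no escaping is done).
    """
    out.write("|".join(headers) + "\n")
    for row in rows:
        out.write("|".join("" if v is None else str(v) for v in row) + "\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    write_pipe_delimited(
        buffer,
        ["origin_city", "origin_state", "ship_date"],
        [["chicago", "IL", "2017-03-15"], ["kansas city", None, "2017-03-15"]],
    )
    print(buffer.getvalue(), end="")
```

Writing to any file-like object keeps the function usable with a real output file or an in-memory buffer.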

The deliverable will be a Perl .pl file that must run on Ubuntu Linux and must use Modern::Perl. The Perl .pl file should be called '[login to view URL]' and the output file should be called '[login to view URL]'.

It will be scheduled in cron to run unattended every 15 minutes.

Please specify which language/OS/modules you plan to use.

Also, please include the word "raccoon" in your bid so I know that you read this description.

Skills: Perl, Web Scraping, Linux

About the Employer:
(77 reviews) Chillicothe, United States

Project ID: #26499227

Awarded to:

(535 reviews)
7.6

12 freelancers are bidding an average of $135 for this job

freelance4hire80

Hi "raccoon", I plan to use Perl LWP/Mechanize for this project. You can then run the script through crontab. I have developed many web scraping scripts before. You can check my job history for relevant experience.

$88 USD in 3 days
(78 reviews)
6.8
schoudhary1553

Hi, Greetings! ✅ Checked your project details: Need an updated web spider scraping program written for the following username DFUSER11653 ✅ Completion time: within the project deadline. We have worked on 600+ projects. I h…

$180 USD in 3 days
(30 reviews)
6.3
PoojaRautela417

OK, I will develop a scraper for the mentioned website to scrape the required fields according to your requirements. I am an expert in Python, PHP, JavaScript, and Web Scraping. I have done similar projects with other employers. I…

$250 USD in 5 days
(16 reviews)
4.6
arassolenko

Hello Potential Client. I am a professional software developer with 7 years of experience in Pandas, Web Scraping, Data Mining, and Python. I have developed many projects, including scraping Amazon. I can start working right n…

$70 USD in 2 days
(7 reviews)
4.1
mehamednews

I’ll develop an automated web scraping tool to extract all the needed data and provide a dashboard to control and follow up on the process and download the data in any format you may need (JSON, CSV, EXCEL, TEXT, XML, …

$250 USD in 7 days
(5 reviews)
3.3
shashankmistry31

'raccoon' DONE! COMPLETED! Check out the demo below, where I have tried to fulfill your requirements using a Python script. [login to view URL] It's fast, efficient, and future-compatible. I have taken all condi…

$80 USD in 7 days
(2 reviews)
2.5
PKonstiantyn

Hello, Sir! I am very interested in your project "Need an updated web spider scraping program written for the following username DFUSER11653". My name is Konstiatyn, and I am a professional in web scraping. I have read your descr…

$140 USD in 7 days
(1 review)
2.0
(1 review)
1.0
mattfly

Very easy and fast to achieve using the Spreadsheet::ParseExcel Perl module from CPAN to convert the xls to a CSV and then just using awk to convert the file. I have done a lot of work parsing CSV files with awk recently a…

$40 USD in 1 day
(1 review)
0.4
piskun699

Greetings, Sir! ❤ I read through the job details extremely carefully and I am absolutely sure that I can do the project very well. I have good experience in web scraping. I have worked on projects similar to what y…

$200 USD in 7 days
(0 reviews)
0.0
fisasti

raccoon Hello there! I can do this in PHP. As I understand, you need to download the .xlsx file content every 15 minutes and output its data in the format you've described to the .pl output file. I did similar data sc…

$165 USD in 2 days
(0 reviews)
0.0