I have set up some crawlers which store data in S3. Each crawler stores its daily scrape load as a separate file. I need to write a regular job which cleans the daily data scraped by each of the different crawlers, joins the data together, and adds a date column as a partition. The cleaned and joined data should then be appended to another S3 table which aggregates all the clean data on a daily basis. Ideally all of this should happen within the AWS environment. This final aggregated dataset will then be used to query and extract a daily report and showcase historic trends.
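A minimal sketch of the daily clean-and-join step, assuming pandas for the transforms (in AWS this could run inside a Glue job, a Lambda, or a scheduled Batch task). The crawler names, column names, and S3 paths below are hypothetical placeholders, not part of the original spec:

```python
from datetime import date
import pandas as pd

def clean_and_join(crawler_frames, run_date):
    """Clean each crawler's daily frame, join them, and add a date partition column.

    `crawler_frames` maps crawler name -> DataFrame loaded from that crawler's
    daily S3 file; the `item_id`/`price` columns here are only illustrative.
    """
    cleaned = []
    for name, df in crawler_frames.items():
        # Basic cleaning: drop rows missing the key, then de-duplicate on it
        df = df.dropna(subset=["item_id"]).drop_duplicates("item_id")
        df["source"] = name  # keep track of which crawler produced each row
        cleaned.append(df)
    joined = pd.concat(cleaned, ignore_index=True)
    # Date column used as the partition key in the aggregated table
    joined["date"] = run_date.isoformat()
    return joined

# Example with toy data standing in for two crawlers' daily files
frames = {
    "crawler_a": pd.DataFrame({"item_id": [1, 2, 2], "price": [10.0, 11.0, 11.0]}),
    "crawler_b": pd.DataFrame({"item_id": [3, None], "price": [9.5, 8.0]}),
}
daily = clean_and_join(frames, date(2024, 1, 15))
# Writing out to a date-partitioned S3 location could then look like:
#   daily.to_parquet(f"s3://my-bucket/clean/date={run_date.isoformat()}/data.parquet")
```

Partitioning by the `date` column keeps the aggregated table cheap to query for a single day's report while still supporting scans across days for historic trends.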
19 freelancers are bidding an average of €26/hour for this job
Hi, nice to meet you. Firstly, thanks for visiting my profile. With more than 8 years of AWS experience, I can do this well. Can we talk? Thank you. Could I use a Lambda or Batch job to run this crawl pipeline?
Hi. As a senior Python engineer, I have experience with web scraping and Amazon Web Services. I am available to discuss more through private chat. Sincerely, Kirill
Hi there. As a senior Python engineer, I have experience writing cron jobs in Python, and I am also familiar with AWS Lambda. I am available to start immediately. Yuri Ren