Closed

PHP / cURL Programming Guru needed for Data Scraping Project New or Update Old

PHP / cURL Programmer needed for Data Scraping Project Update

Several years ago, I hired someone from here to develop a data scraper for me.

The scraper worked perfectly for what we needed, but then the site we were scraping blocked our IP.

The original writer was able to incorporate proxy IPs for the requests and it worked again.

Then everything sat for a few years.

So, we have all the original code sitting on the original site, and it doesn’t work today.

It could be that the server was updated to a newer PHP version (my guess is the scraper was written for 5.x), or it could be that the database is corrupted. I really don’t know.

I have been tasked to get this back up and running.

Ultimately, I need a data scraper that will pull multiple levels of information from [login to view URL]

The result page will look like

[login to view URL]

We will need most of the variable information on this page. Additionally, at the bottom of the page there is a personnel link that needs to be followed.

Following this link may produce multiple names and position information. In the case of the above it produces

[login to view URL]

which has only a single name.

But in other examples it could potentially produce something like

[login to view URL]

where it shows multiple names and positions with the company.

The scraper will need to capture all this information and associate it with the appropriate license number.

This data source changes daily (as new people get a license, and as licenses expire, are suspended because the holder did something wrong, or become the subject of a lawsuit).

The most important items will be the new additions to the database (which should be easy to find by increasing the license number by one until you find the next license). These need to be found daily or every other day.
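As a rough illustration of the incrementing approach described above, here is a minimal sketch in Python (one of the listed skills; the same logic should port to PHP 7.2). The function name `find_new_licenses`, the `fetch` callable, and the `max_misses` cutoff are all hypothetical. The cutoff is an assumption added in case some license numbers are skipped or not yet issued.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def find_new_licenses(
    last_known: int,
    fetch: Callable[[int], Optional[Record]],
    max_misses: int = 25,
) -> List[Record]:
    """Probe license numbers above last_known.

    Stops after max_misses consecutive numbers return no record
    (an assumed stopping rule, not something the site documents).
    """
    found: List[Record] = []
    misses = 0
    number = last_known + 1
    while misses < max_misses:
        record = fetch(number)  # hypothetical: returns None when no license exists
        if record is None:
            misses += 1
        else:
            misses = 0
            found.append(record)
        number += 1
    return found


# Usage with a stand-in fetcher; a real one would request and parse the page.
fake_site = {1001: {"license": "1001"}, 1003: {"license": "1003"}}
new_records = find_new_licenses(1000, fake_site.get, max_misses=3)
print([r["license"] for r in new_records])  # ['1001', '1003']
```

Passing the fetcher in as a parameter keeps the probing loop independent of how pages are downloaded, so proxies or rate limiting could be handled inside `fetch` without touching this logic.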

But every record will need to be verified and updated once a week.

We will need a way of pulling any record that changed in the last week, even if the change was as simple as correcting the spelling of a name. Maybe a field for last changed.
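One possible way to support a "last changed" field is to store a fingerprint of each record and only bump the date when the fingerprint differs. The sketch below is an assumption-heavy illustration, not the existing scraper's design: it uses Python's built-in `sqlite3` purely so it runs standalone (the target would be MySQL 5.7 or MariaDB 10.2), and the table name `licenses`, its columns, the helper names, and the sample data are all made up.

```python
import hashlib
import json
import sqlite3
from datetime import date, timedelta


def fingerprint(fields: dict) -> str:
    """Stable hash of a record's fields, independent of key order."""
    canonical = json.dumps(fields, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save_record(conn, license_no: str, fields: dict, today: date) -> bool:
    """Store a scraped record; return True if it is new or has changed."""
    digest = fingerprint(fields)
    row = conn.execute(
        "SELECT digest FROM licenses WHERE license_no = ?", (license_no,)
    ).fetchone()
    if row is not None and row[0] == digest:
        # Unchanged: note that it was verified, leave last_changed alone.
        conn.execute(
            "UPDATE licenses SET last_checked = ? WHERE license_no = ?",
            (today.isoformat(), license_no),
        )
        return False
    conn.execute(
        "REPLACE INTO licenses"
        " (license_no, data, digest, last_checked, last_changed)"
        " VALUES (?, ?, ?, ?, ?)",
        (license_no, json.dumps(fields), digest,
         today.isoformat(), today.isoformat()),
    )
    return True


def changed_since(conn, since: date) -> list:
    """License numbers whose data changed on or after the given date."""
    rows = conn.execute(
        "SELECT license_no FROM licenses"
        " WHERE last_changed >= ? ORDER BY license_no",
        (since.isoformat(),),
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE licenses (license_no TEXT PRIMARY KEY, data TEXT,"
    " digest TEXT, last_checked TEXT, last_changed TEXT)"
)
day1 = date(2018, 10, 1)            # made-up dates for the example
day10 = day1 + timedelta(days=9)
save_record(conn, "1001", {"name": "Jon Smith"}, day1)
save_record(conn, "1002", {"name": "Ann Lee"}, day1)
save_record(conn, "1001", {"name": "John Smith"}, day10)  # spelling corrected
save_record(conn, "1002", {"name": "Ann Lee"}, day10)     # verified, unchanged
print(changed_since(conn, day10 - timedelta(days=7)))     # ['1001']
```

Keeping `last_checked` separate from `last_changed` lets the weekly verification run touch every record while the "changed in the last week" query still returns only records whose content actually differs.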

SOOOO…

I either need someone to work with what we have OR create something new.

With either option, you will need to create or provide complete documentation in such a manner that if you disappeared, the next developer could easily see what you did, why you did it, and how to correct it should we have problems in the future.

This is the first of 10 sites that will need to be scraped.

It would be great if we could use the same engine on each one with minor location tweaks.
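To illustrate what "the same engine with minor tweaks" might look like, here is a hedged Python sketch in which each site is described by a small configuration object and one shared function does the extraction. Everything in it is hypothetical: the `SiteConfig` class, the `example.com` URL template, the field names, and the sample HTML are placeholders, not details of the real sites.

```python
import re
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SiteConfig:
    """Per-site settings; the engine itself stays the same for every site."""
    name: str
    detail_url: str                   # hypothetical template with {license_no}
    field_patterns: Dict[str, str]    # field name -> regex with one capture group


def extract_fields(html: str, config: SiteConfig) -> Dict[str, str]:
    """Apply a site's patterns to a page and return the fields that matched."""
    fields: Dict[str, str] = {}
    for field_name, pattern in config.field_patterns.items():
        match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
        if match:
            fields[field_name] = match.group(1).strip()
    return fields


# Placeholder configuration and page, for illustration only.
example_site = SiteConfig(
    name="example-board",
    detail_url="https://example.com/license?num={license_no}",
    field_patterns={
        "business_name": r"Business Name:\s*<b>(.*?)</b>",
        "status": r"Status:\s*<b>(.*?)</b>",
    },
)
sample_html = (
    "<p>Business Name: <b>Acme Builders </b></p><p>Status: <b>Active</b></p>"
)
print(extract_fields(sample_html, example_site))
# {'business_name': 'Acme Builders', 'status': 'Active'}
```

Regular expressions are used here only to keep the sketch short; a proper HTML parser would likely be more robust against layout changes. Adding a new site would then mean writing a new `SiteConfig` rather than new engine code.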

If this goes well, we would ideally like to work with the same person ongoing for years to come.

The new scraper will need to be compatible with PHP 7.2, MySQL 5.7 or MariaDB 10.2, and cURL 7.29.

Skills: cURL, PHP, Python, Software Architecture, Web Scraping

About the Employer:
( 1 Review ) Corona, United States

Project ID: #17899877