Please see the attached file for example URLs.
Given a list of URLs, the script must extract the following from each page and save it to our server:
The thumbnail image of Page 1 or, if that does not exist, the Abstract image.
Rename the image using data scraped from the same page as the image, in this format: Filing Date, Inventor, Patent Title.
If a file with the same name already exists, the script must automatically append an incrementing number to the end of the file name (_1, _2, and so on).
Punctuation such as periods and commas must be removed from the file name.
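To make the naming rules above concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function names build_base_name and unique_path are hypothetical, the three fields are assumed to be joined with single spaces, the .png extension is a placeholder, and "punctuation" is read as anything other than letters, digits, spaces, hyphens and underscores (so a date written with slashes would lose them).

```python
import re
from pathlib import Path


def build_base_name(filing_date: str, inventor: str, title: str) -> str:
    """Join the scraped fields and strip punctuation such as periods and commas."""
    raw = " ".join([filing_date, inventor, title])
    # Assumption: keep word characters, whitespace, hyphens; drop everything else.
    cleaned = re.sub(r"[^\w\s-]", "", raw)
    # Collapse any runs of whitespace left behind by the removed characters.
    return re.sub(r"\s+", " ", cleaned).strip()


def unique_path(directory: Path, base_name: str, extension: str = ".png") -> Path:
    """Return a path that does not exist yet, appending _1, _2, ... if needed."""
    candidate = directory / f"{base_name}{extension}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{base_name}_{counter}{extension}"
        counter += 1
    return candidate
```

For example, build_base_name("Jan. 5, 2010", "Smith, John A.", "Widget, improved") gives "Jan 5 2010 Smith John A Widget improved", and unique_path then picks the first free name in the target folder.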
The newer pages have a slightly different layout, but the script must work with both layouts because the URLs cannot be sorted by date. Please see the attached examples.
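One possible way to handle both layouts, and the fallback from the Page 1 thumbnail to the Abstract image, is to look for the markers of either layout in a fixed order of preference. The Python sketch below is an assumption-heavy illustration: the CSS class names are invented placeholders (the real ones would come from the attached example pages), and pick_image_url and ImageCollector are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Optional

# Hypothetical class names; replace with the markers found in the real pages.
PAGE1_MARKERS = {"old-page1-thumb", "new-page1-thumb"}
ABSTRACT_MARKERS = {"old-abstract-img", "new-abstract-img"}


class ImageCollector(HTMLParser):
    """Collect the attributes of every <img> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.images: list[dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append({name: (value or "") for name, value in attrs})


def pick_image_url(html: str) -> Optional[str]:
    """Return the Page 1 thumbnail URL, else the Abstract image URL, else None."""
    collector = ImageCollector()
    collector.feed(html)
    # Try Page 1 markers (old and new layout) first, then fall back to Abstract.
    for markers in (PAGE1_MARKERS, ABSTRACT_MARKERS):
        for image in collector.images:
            classes = set(image.get("class", "").split())
            if classes & markers and image.get("src"):
                return image["src"]
    return None
```

Because the markers for both layouts are checked together, the script does not need to know in advance which layout a given URL uses.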
Preference will be given to coders who can provide a working demo.