Hi, we need a high-performance script written in PHP, plus the same script written in Google Apps Script, that can scan domains very quickly to determine their CMS, if any (e.g. WordPress, Drupal, Joomla, TYPO3), detect the version and the installed plugins/themes, grab any contact details from the site (address, telephone, email), and find their social profiles (Facebook, Twitter, Instagram, LinkedIn, etc.). The full list of desired field values is as follows:
Location on site, - is this install in a subfolder of another website (like a blog or news section)?
Vertical, - can we determine the category (manufacturing, retail, clothes, food, restaurant, etc.)?
Quantcast, - can we grab Quantcast info?
Alexa, - can we get Alexa Rank
Telephones, - 1 or if multiple then list
Emails, for the website
Twitter, - ID and if possible fan count
Facebook, - name and if possible follower count
LinkedIn, - name and if possible follower count
Google, - name and if possible follower count
Pinterest, - name and if possible follower count
GitHub, - name and if possible fan/follower count
Instagram, - name and if possible follower count
VK, - name and if possible fan count
Vimeo, - name and if possible fan count
YouTube, - name and if possible fan count
People, - if there are people listed/employees then their names and titles/blurbs
First Detected, - first time scanned
Last Found, - same as first until scanned again
First Indexed, - date of first time all info grabbed
Last Indexed, - date of last scan
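
To make the field list more concrete, here is a rough sketch of how a few of these values might be pulled out of a single page. It is written in Python purely for illustration (the deliverables remain PHP and Google Apps Script), and everything in it is an assumption on our side: the function name `scan_html`, the signature table and the regular expressions are hypothetical and far from complete. It only looks at HTML that has already been downloaded, and it does not cover follower counts, Alexa or Quantcast.

```python
import re
from urllib.parse import urlparse

# Hypothetical signature table: path fragments that often show up in pages
# served by each CMS. A real scanner would need a much larger list.
CMS_SIGNATURES = {
    "WordPress": ("/wp-content/", "/wp-includes/"),
    "Drupal": ("/sites/default/files/", "drupal-settings-json"),
    "Joomla": ("/media/jui/", "/components/com_"),
    "TYPO3": ("/typo3conf/", "/typo3temp/"),
}

# Only matches <meta> tags where name= comes before content=.
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)["\']', re.I)
# Deliberately simple; it can produce false positives such as "logo@2x.png".
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Only phone numbers published as tel: links are picked up.
TEL_RE = re.compile(r'href=["\']tel:([+0-9][0-9 ().-]{5,})["\']', re.I)
PLUGIN_RE = re.compile(r"/wp-content/plugins/([A-Za-z0-9_-]+)/")
THEME_RE = re.compile(r"/wp-content/themes/([A-Za-z0-9_-]+)/")
SOCIAL_RE = re.compile(
    r"https?://(?:www\.)?"
    r"(facebook|twitter|instagram|linkedin|pinterest|github|vk|vimeo|youtube)"
    r"\.com/([A-Za-z0-9_./-]+)", re.I)


def scan_html(url, html):
    """Extract a subset of the requested fields from one downloaded page."""
    cms, version = None, None
    generator = GENERATOR_RE.search(html)
    if generator:
        # e.g. "WordPress 6.4.2" -> ("WordPress", "6.4.2")
        parts = re.match(r"([A-Za-z0-9]+)\s*([\d.]+)?", generator.group(1))
        if parts:
            cms, version = parts.group(1), parts.group(2)
    if cms is None:
        # Fall back to path signatures when there is no generator tag.
        lowered = html.lower()
        for name, needles in CMS_SIGNATURES.items():
            if any(needle in lowered for needle in needles):
                cms = name
                break

    socials = {}
    for network, handle in SOCIAL_RE.findall(html):
        # Keep the first profile link seen for each network.
        socials.setdefault(network.lower(), handle.strip("/"))

    path = urlparse(url).path.strip("/")
    return {
        "cms": cms,
        "version": version,
        "location": "/" + path if path else "/",
        "plugins": sorted(set(PLUGIN_RE.findall(html))),
        "themes": sorted(set(THEME_RE.findall(html))),
        "emails": sorted(set(EMAIL_RE.findall(html))),
        "telephones": sorted(set(TEL_RE.findall(html))),
        "socials": socials,
    }
```

The plugin and theme patterns above only apply to WordPress paths; the other CMSs would need their own patterns.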
We prefer the domains to be read from a MySQL DB and the results stored back into the DB, but if for performance reasons it is faster to store them in, say, Google Sheets, that is fine. The script must be scalable (either via duplication or via smaller scripts controlled by masters) so that it can scan tens or hundreds of thousands of websites per day.
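
For scale, 100,000 domains per day works out to a little over one domain per second on average (100,000 / 86,400 ≈ 1.16), so the work has to be spread across parallel workers. The sketch below, again in Python only as an illustration, shows one way a master could split the domain list into stable shards and how each worker could scan its shard concurrently. The names `shard_for`, `partition` and `run_shard`, and the default of 32 threads, are hypothetical choices of ours, not requirements.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(domain, num_workers):
    """Stable shard index, so a domain always lands on the same worker."""
    return zlib.crc32(domain.lower().encode("utf-8")) % num_workers


def partition(domains, num_workers):
    """Master side: split the full domain list into one list per worker."""
    shards = [[] for _ in range(num_workers)]
    for domain in domains:
        shards[shard_for(domain, num_workers)].append(domain)
    return shards


def run_shard(domains, scan_one, max_threads=32):
    """Worker side: scan one shard concurrently.

    scan_one is whatever callable fetches and analyses a single domain.
    Threads suit this job because the time is spent waiting on the network.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return dict(zip(domains, pool.map(scan_one, domains)))
```

Adding capacity then means raising `num_workers` and starting more copies of the worker, which matches the "duplication" option mentioned above.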
So the project is to develop the same script on two different platforms with very high performance: one in Google Apps Script and the other in PHP. Our preference is DB-driven, but if your experience says there is another way of working that improves speed, that is fine. We must, however, be able to bring the results into a database.
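
As an illustration of the database side, the sketch below stores one row per domain and keeps the First Detected and Last Found dates as described in the field list: both are set on the first scan, and only Last Found moves on later scans. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server. The table name `scan_results` and its columns are hypothetical, and a MySQL version could do the same with a single `INSERT ... ON DUPLICATE KEY UPDATE` statement.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open the results store (SQLite here as a stand-in for MySQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scan_results (
               domain         TEXT PRIMARY KEY,
               cms            TEXT,
               details        TEXT,           -- remaining fields as JSON
               first_detected TEXT NOT NULL,
               last_found     TEXT NOT NULL
           )"""
    )
    return conn


def save_result(conn, domain, cms, details, scanned_at):
    """Insert a new domain or refresh an existing one.

    first_detected is written only once; last_found is updated on every scan.
    """
    payload = json.dumps(details)
    conn.execute(
        "INSERT OR IGNORE INTO scan_results "
        "(domain, cms, details, first_detected, last_found) "
        "VALUES (?, ?, ?, ?, ?)",
        (domain, cms, payload, scanned_at, scanned_at),
    )
    conn.execute(
        "UPDATE scan_results SET cms = ?, details = ?, last_found = ? "
        "WHERE domain = ?",
        (cms, payload, scanned_at, domain),
    )
    conn.commit()
```

For example, scanning `example.com` on 2024-01-01 and again on 2024-02-01 would leave `first_detected` at the January date and move `last_found` to the February one.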