1. Continuously and reliably scrape job posting data from websites such as Indeed, Dice, CareerBuilder, and Monster (any one or two would be sufficient). The ideal solution rotates among those sites.
2. The scraper queries jobs using randomly generated keyword combinations, such as "java, Dallas Texas".
3. The solution should be disruption-free and use AWS EC2 Spot Instances to power the scraping. That is, it should programmatically create a Spot Instance and start scraping on it.
4. The collected data should be saved to a central server, either as a zipped CSV file or in MongoDB.
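
For illustration only, the random keyword queries in item 2 and the site rotation in item 1 might be planned along these lines. The keyword pools, the `random_query` and `plan_jobs` helpers, and the round-robin order are all assumptions made for this sketch; the actual fetching and parsing of each site is not shown.

```python
import itertools
import random

# Hypothetical keyword pools; the real lists would come from the project owner.
SKILLS = ["java", "python", "sql"]
LOCATIONS = ["Dallas Texas", "Austin Texas", "Chicago Illinois"]
# Site names taken from the requirements; only the rotation is shown here.
SITES = ["indeed", "dice", "careerbuilder", "monster"]


def random_query(rng):
    """Return one random "skill, location" query string."""
    return f"{rng.choice(SKILLS)}, {rng.choice(LOCATIONS)}"


def plan_jobs(count, seed=None):
    """Pair `count` random queries with sites in round-robin order."""
    rng = random.Random(seed)
    sites = itertools.cycle(SITES)
    return [(next(sites), random_query(rng)) for _ in range(count)]
```

Calling `plan_jobs(8)` would yield eight `(site, query)` pairs that visit each site twice. Passing a `seed` makes a run reproducible, which can help when debugging a scraper.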
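
Programmatic creation of a Spot Instance, as described in item 3, could look roughly like the following. This is a sketch that assumes the `boto3` library and configured AWS credentials; the AMI ID, the instance type, the region, and the helper names `spot_request_params` and `launch_spot_instance` are placeholders, not part of the requirements.

```python
def spot_request_params(image_id, instance_type="t3.micro"):
    """Build keyword arguments for boto3's EC2 run_instances call."""
    return {
        "ImageId": image_id,            # placeholder AMI with the scraper installed
        "InstanceType": instance_type,  # assumed size; pick what the workload needs
        "MinCount": 1,
        "MaxCount": 1,
        # Ask EC2 for Spot capacity instead of On-Demand capacity.
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
    }


def launch_spot_instance(image_id, region="us-east-1"):
    """Launch one Spot Instance and return its instance ID."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**spot_request_params(image_id))
    return response["Instances"][0]["InstanceId"]
```

Because AWS can reclaim Spot capacity at any time, a disruption-free setup would probably also need some supervisor that notices a terminated instance and calls `launch_spot_instance` again. That part is left out of this sketch.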
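
For the zipped CSV option in item 4, one possible approach is to write gzip-compressed CSV using only the Python standard library. Reading "zipped" as gzip is an assumption, as are the column names in `FIELDS` and the `write_zipped_csv` helper; transferring the file to the central server is not shown.

```python
import csv
import gzip

# Hypothetical columns; the real schema depends on what each site exposes.
FIELDS = ["site", "query", "title", "company", "url"]


def write_zipped_csv(rows, path):
    """Write an iterable of dicts to a gzip-compressed CSV file at `path`."""
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Each scraping batch could be written to its own file, for example one named after the site and a timestamp, so that an interrupted instance loses at most the batch in progress.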