I need a website crawler/scraper that collects flight availability from several airline websites
and stores it in a MySQL database (unfortunately, they don’t provide an API). I would prefer the work to be done in Java, Python, PHP, or C#.
Websites to be scraped: All Nippon Airways, United, Delta, Air France.
I need award ticket availability only. KVS Tool and ExpertFlyer check similar data.
The database will have a simple structure:
Date / Airline / Flight number / From / Departure time / To / Arrival time / Availability
15 July 12 / AF / 332 / BOS / 13:35 / CDG / 15:20 / FS+ CS3+ YS-
15 July 12 / KL / 225 / BOS / 17:50 / AMS / 9:15 / FS- CS2+ YS-
16 July 12 / KL / 1345 / AMS / 10:30 / CDG / 12:00 / FS- CS2+ YS-
FS+ = First class availability
CS+ = Business class availability
YS- = Economy availability
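
To make the structure above concrete, here is a minimal Python sketch of a possible MySQL table and a parser for rows in the sample format. The table name, column names, column types, and primary key are assumptions, not part of the requirements. The parser only splits each availability token into its letters, an optional seat count, and the +/- sign; it does not interpret what they mean.

```python
import re
from datetime import datetime

# Hypothetical MySQL schema mirroring the columns listed above.
# Table name, column names, types, and key are assumptions.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS award_availability (
    flight_date    DATE        NOT NULL,
    airline        CHAR(2)     NOT NULL,
    flight_number  VARCHAR(5)  NOT NULL,
    origin         CHAR(3)     NOT NULL,
    departure_time VARCHAR(5)  NOT NULL,
    destination    CHAR(3)     NOT NULL,
    arrival_time   VARCHAR(5)  NOT NULL,
    availability   VARCHAR(32) NOT NULL,
    checked_at     TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (flight_date, airline, flight_number, origin, destination)
)
"""

# One availability token: two letters, an optional seat count, then "+" or "-".
BUCKET_RE = re.compile(r"([A-Z]{2})(\d*)([+-])")


def parse_buckets(text):
    """Split e.g. 'FS+ CS3+YS-' into (code, seats_or_None, sign) tuples."""
    return [
        (code, int(seats) if seats else None, sign)
        for code, seats, sign in BUCKET_RE.findall(text)
    ]


def parse_row(line):
    """Parse one slash-separated row in the sample format into a dict."""
    fields = [field.strip() for field in line.split("/")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    date, airline, number, origin, dep, dest, arr, avail = fields
    return {
        "flight_date": datetime.strptime(date, "%d %B %y").date(),
        "airline": airline,
        "flight_number": number,
        "origin": origin,
        "departure_time": dep,
        "destination": dest,
        "arrival_time": arr,
        "availability": avail,
        "buckets": parse_buckets(avail),
    }


row = parse_row("15 July 12 / AF / 332 / BOS/13:35 /CDG/ 15:20/ FS+ CS3+YS-")
```

Times are kept as strings here because the sample mixes "9:15" and "13:35"; a real implementation might normalize them to a TIME column.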
The crawler must check availability every couple of hours to refresh the information and add it to the database.
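
The periodic refresh could be driven by a simple loop like the sketch below. The function name `run_periodically` and its parameters are hypothetical, and `job` stands in for whatever function performs one full crawl. In production a cron job or task scheduler might be a better fit.

```python
import time


def run_periodically(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, waiting so that runs start interval_seconds apart.

    max_runs=None loops forever; a number is handy for testing.
    The sleep function is injectable so tests do not have to wait.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        try:
            job()
        except Exception as exc:
            # One failed refresh should not stop later refreshes.
            print(f"refresh failed: {exc}")
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval_seconds - elapsed))
    return runs
```

For a refresh every two hours, this would be called as `run_periodically(crawl_all, 2 * 60 * 60)`, where `crawl_all` is a placeholder for the actual crawl function.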
I will provide a file with the city pairs that need to be checked. The list must be adjustable, as I may need to change it in the future.
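
As an illustration of an adjustable city-pair file, here is a short sketch that reads pairs from a CSV-style text file. The file format (one `ORIGIN,DESTINATION` pair per line, `#` for comments) is an assumption, since the actual format has not been specified.

```python
import csv
import io


def load_city_pairs(stream):
    """Read (origin, destination) airport-code pairs from a text stream."""
    pairs = []
    for row in csv.reader(stream):
        # Skip blank lines and comment lines starting with "#".
        if not row or row[0].lstrip().startswith("#"):
            continue
        if len(row) != 2:
            raise ValueError(f"expected ORIGIN,DESTINATION but got {row!r}")
        origin, destination = (cell.strip().upper() for cell in row)
        pairs.append((origin, destination))
    return pairs


# Example with an in-memory file; a real run would use open("city_pairs.csv").
sample = io.StringIO("# origin,destination\nBOS,CDG\n\nbos, ams\nAMS,CDG\n")
pairs = load_city_pairs(sample)
```

Because the pairs live in a separate file, the list can be changed without touching the crawler code.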
It needs to check how many seats are available, up to a maximum of 4 passengers.
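
One possible way to determine the seat count is to repeat the search with decreasing passenger numbers, as sketched below. This assumes that each site accepts a passenger count as a search parameter and that availability for N passengers implies availability for fewer. The `is_available` callback is a hypothetical stand-in for a real search against one airline site.

```python
def max_available_seats(is_available, max_passengers=4):
    """Return the largest passenger count (up to max_passengers) that is bookable.

    is_available(n) should return True if a search for n passengers
    shows award space. Returns 0 if even a single seat is unavailable.
    """
    for passengers in range(max_passengers, 0, -1):
        if is_available(passengers):
            return passengers
    return 0
```

Starting from 4 and counting down needs only one search when a flight is wide open, at the cost of up to four searches when nothing is available.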
The crawler must change IP addresses to avoid blocking (preferably every 10-20 searches).
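
The IP rotation could look roughly like the following sketch, which switches to a different proxy after a random number of searches between 10 and 20. The class name `ProxyRotator` is hypothetical, and the proxy addresses in the usage note are placeholders from a documentation-only IP range. How the proxies are obtained is left open.

```python
import random


class ProxyRotator:
    """Hand out a proxy per search, switching after 10-20 searches."""

    def __init__(self, proxies, min_searches=10, max_searches=20, rng=None):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = list(proxies)
        self._min = min_searches
        self._max = max_searches
        self._rng = rng or random.Random()
        self._current = self._rng.choice(self._proxies)
        self._remaining = self._rng.randint(self._min, self._max)

    def next_proxy(self):
        """Return the proxy to use for the next search."""
        if self._remaining == 0:
            self._rotate()
        self._remaining -= 1
        return self._current

    def _rotate(self):
        # Prefer a proxy other than the current one when more than one exists.
        candidates = [p for p in self._proxies if p != self._current]
        self._current = self._rng.choice(candidates or self._proxies)
        self._remaining = self._rng.randint(self._min, self._max)
```

Usage would be along the lines of `rotator = ProxyRotator(["203.0.113.10:8080", "203.0.113.11:8080"])`, then calling `rotator.next_proxy()` before each search.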
Some airlines require an account number and password to access availability (Air France, for example). I want to be able to add more accounts to the crawler in the future, so that it uses a random account and a random IP address to avoid blocking. I will provide all account information if needed.
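
The account handling described above might be sketched as a small pool that stores several logins per airline and picks one at random for each session. The class name `AccountPool`, the use of airline codes as keys, and the credentials shown are all hypothetical. In practice the accounts would be loaded from a file or database table so that new ones can be added without code changes.

```python
import random


class AccountPool:
    """Keep several logins per airline and pick one at random."""

    def __init__(self, rng=None):
        self._accounts = {}
        self._rng = rng or random.Random()

    def add(self, airline, username, password):
        """Register another account for the given airline code."""
        self._accounts.setdefault(airline, []).append((username, password))

    def pick(self, airline):
        """Return a random (username, password) pair for the airline."""
        accounts = self._accounts.get(airline)
        if not accounts:
            raise LookupError(f"no accounts configured for {airline!r}")
        return self._rng.choice(accounts)


# Placeholder credentials for illustration only.
pool = AccountPool()
pool.add("AF", "example-user-1", "example-password-1")
pool.add("AF", "example-user-2", "example-password-2")
username, password = pool.pick("AF")
```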
Please bid only if you have previous experience with similar scraping projects.