I am looking for a company that can create an automatic parser that does the following:
- Using the information in an RSS feed item (title, description, permalink, etc.), go to the permalink and retrieve the full text of the article or blog post. The parser is intended for RSS feeds that do not provide the full text in the Description tag, which is the practice at most news sites, such as the NY Times. The parser must also retrieve the related images along with the text, but it must filter out everything else, such as ads and navigation images.
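To make the requirement concrete, here is a rough sketch of the kind of pipeline I have in mind, using only the Python standard library. It is illustrative only, not a finished design: the names `rss_items`, `ArticleExtractor`, `extract_article`, `fetch_article` and the `SKIP_TAGS` list are my own placeholders, and the heuristic (keep `<p>` text and `<img>` sources, drop anything inside navigation-style containers) is an assumption that will not catch ads embedded in ordinary markup. It also assumes pages are UTF-8.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.request import urlopen

# Containers assumed to hold page chrome rather than article content.
# This list is a guess and would need tuning per site.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


def rss_items(rss_xml):
    """Return (title, permalink) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [(item.findtext("title", ""), item.findtext("link", ""))
            for item in root.iter("item")]


class ArticleExtractor(HTMLParser):
    """Collects <p> text and <img> sources found outside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0        # > 0 while inside a skipped container
        self.in_paragraph = False
        self.buffer = []
        self.paragraphs = []
        self.images = []

    def _flush(self):
        # Store the paragraph collected so far, with whitespace normalised.
        if self.in_paragraph:
            text = " ".join("".join(self.buffer).split())
            if text:
                self.paragraphs.append(text)
        self.in_paragraph = False
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            if tag == "p":
                self._flush()      # handles <p> without a closing tag
                self.in_paragraph = True
            elif tag == "img":
                src = dict(attrs).get("src")
                if src:
                    self.images.append(src)

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.buffer.append(data)


def extract_article(html):
    """Return (full_text, image_urls) for one HTML page."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(parser.paragraphs), parser.images


def fetch_article(permalink):
    """Download the permalink and extract it (assumes a UTF-8 page)."""
    with urlopen(permalink) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_article(html)
```

In use, one would call `rss_items` on the feed XML and then `fetch_article` on each permalink. A real solution would likely need something smarter than a fixed tag list, for example scoring blocks by text density, which is where I need help.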
I have tried developing this parser myself, but the results have been poor. I would like to speak with a company or developer who can demonstrate prior experience with HTML parsing.
This is urgent as my client is demanding something right away.