We would like to have a script developed for a well-established technology news website; the requirements are listed below. Our goal is to automate our news process and possibly to expand throughout the U.S. in phases over the next couple of years. We have a small budget and would prefer bids from programmers who have experience in this area.
We would like the project completed within 30 calendar days.
Here is what we will need:
1. Grab select RSS feeds and follow the links to the HTML or text pages. The script should automatically visit the RSS feeds for news releases twice per day: once in the early morning and once in the afternoon. We need to be able to change this schedule (timing or frequency) depending on our needs. There should also be a dedupe feature so that duplicate releases are not picked up.
2. Extract the full HTML or text article and other information from that page and add it to a database.
3. The script may need to access password-protected areas of certain websites.
4. News releases would then be pulled by the specific category, such as Computers/Internet, Telecom/Wireless, etc., that has been assigned by each wire service. Each news article that fits a specific category will then be placed inside the matching news category in our website format. There are currently about 10 unique news categories. We would also like a feature to add additional categories easily within the admin panel.
5. News releases that fit into each designated category will then be automatically pulled and placed into a specific geographic state and county.
6. The geographic area of a release is determined by cities within certain counties. Ex: San Diego County cities would include La Jolla, La Mesa, Pacific Beach, El Cajon, etc. Orange County: San Clemente, Newport Beach, Irvine, etc. The city would be determined by the dateline, ex: (SAN DIEGO), (IRVINE). Releases from cities that do not fit into a specific category/city/county would not be pulled.
7. Once the city is recognized, the release is distributed into the correct county and the correct state. (The correct county would be determined by a list we would put together covering each city in that state/county.) We would also like this designed with future expansion in mind, so that we can extend our news coverage to other U.S. states and counties beyond California.
8. Certain fields would then be stripped from each release. There will also be words, phrases or additional terms that need to be parsed out and stripped.
9. Once all of the parsing is completed, each release would be properly formatted for the website.
10. Article bank. Each release would then rest in an article bank where we can select or delete each edited release that we may want to run on the website. In the article bank we would see the headline and the county of origin, and would be able to click on the headline link to view the full story if necessary to determine its category placement. We would also need a delete feature for releases that don't fit our areas of interest/categories.
11. There will be 3 admin modes for handling the distribution of the releases. One: Manual distribution. Two: Semi-automatic. Three: Automatic. The automatic mode would let us keep news flowing to the website without disruption when we are unable to access a computer for whatever reason. Releases we select would then go into the proper news category and county after our acceptance. There will also be a category choice next to each release, presented as a drop-down menu. This would enable us to override the system so that the release goes into the category chosen in the drop-down menu.
12. We would also like to offer a company sign-in menu with a unique user name and password. This would give a company the ability to post its own press releases or calendar-of-events releases. Once a release is approved, our system would then be able to toss out duplicate submitted releases in case we had already posted that same press release at an earlier time. Whenever a press release is submitted, a confirmation would be emailed back to the submitter; the confirmation message should be editable in case we want to personalize or change it. There would be a pending articles section when the setting is semi-automatic or manual. We would also need the ability to manage logins (user names and passwords).
13. We would provide a development server for the new PHP/Perl scripts you will develop for us, so that you can integrate them into our current administrative section alongside our existing features, current scripts and database.
14. Finally, all of the above should integrate into our existing daily newsletter. This newsletter is currently generated from our admin panel and sent to over 2,400 opt-in subscribers.
15. All programming, scripts and other services contracted for in this project, and other work associated with this project, will become our exclusive property. We retain worldwide rights to the work that is done for us.
16. We expect complete confidentiality in working on this project.
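
To give bidders a clearer picture of the dedupe and dateline routing described above, here is a rough sketch. It is illustrative only and written in Python for brevity; the delivered scripts would be PHP/Perl as noted above. The names (CITY_TO_COUNTY, fingerprint, locate, process), the hash-based duplicate check, the dateline pattern and the sample city list are all assumptions made for illustration, not a required design.

```python
import hashlib
import re

# Hypothetical city -> (county, state) list; the real list would be supplied
# separately and would grow as coverage expands beyond California.
CITY_TO_COUNTY = {
    "SAN DIEGO": ("San Diego County", "CA"),
    "LA JOLLA": ("San Diego County", "CA"),
    "LA MESA": ("San Diego County", "CA"),
    "EL CAJON": ("San Diego County", "CA"),
    "IRVINE": ("Orange County", "CA"),
    "NEWPORT BEACH": ("Orange County", "CA"),
    "SAN CLEMENTE": ("Orange County", "CA"),
}

# Assumption: the dateline is an all-caps city name in parentheses,
# e.g. "(SAN DIEGO)" or "(IRVINE)".
DATELINE_RE = re.compile(r"\(([A-Z][A-Z .'-]*)\)")


def fingerprint(headline, body):
    """Hash of case- and whitespace-normalized text, used to spot duplicates."""
    normalized = " ".join((headline + " " + body).lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def locate(body):
    """Return (city, county, state) for the first known dateline city, else None."""
    for match in DATELINE_RE.finditer(body):
        city = match.group(1).strip()
        if city in CITY_TO_COUNTY:
            county, state = CITY_TO_COUNTY[city]
            return city, county, state
    return None


def process(releases, seen):
    """Keep releases that are new and fall inside a covered county."""
    kept = []
    for release in releases:
        key = fingerprint(release["headline"], release["body"])
        if key in seen:
            continue  # duplicate of a release already picked up
        seen.add(key)
        place = locate(release["body"])
        if place is None:
            continue  # city not in our coverage list, so the release is skipped
        city, county, state = place
        kept.append({**release, "city": city, "county": county, "state": state})
    return kept
```

In a sketch like this, `seen` would be backed by the database rather than an in-memory set, so that the morning and afternoon runs share the same duplicate history.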