Due Date: May 23, 2010
*******************************************
Java Language
Design an intelligent news agent that reads news from the Web and performs the following tasks:
1) At the training stage, input a set of news articles classified into five categories:
Sports, Entertainment, Health, Business, Sci/Tech,
Suggested training set size: 20 news articles for each category.
Software interface:
Input: the name of a folder containing all the training articles. The article names are:
[login to view URL], [login to view URL], …, [login to view URL]
A separate file, [login to view URL], indicates the category of each article, using these codes:
Sports: S
Entertainment: E
Health: H
Business: B
Sci/Tech: T
Each row in [login to view URL] gives the category of one news article:
e.g.,
[login to view URL], S
[login to view URL], E
[login to view URL], T
…
[login to view URL], T
Output: the classification model that you build using a Naïve Bayes classifier
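As a rough illustration of the training stage, the sketch below shows one way a multinomial Naïve Bayes model could be built in Java. The class name NaiveBayesSketch, the letter-only tokenizer, and the use of add-one (Laplace) smoothing are assumptions made for the example, not requirements of this brief. Reading the article files from the folder and saving the model to disk are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Minimal multinomial Naive Bayes text classifier (illustrative sketch only). */
class NaiveBayesSketch {
    // Number of training articles per category code (S, E, H, B, T).
    private final Map<String, Integer> docCounts = new HashMap<>();
    // Word frequencies per category.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // Total number of word tokens per category.
    private final Map<String, Integer> totalWords = new HashMap<>();
    // All distinct words seen during training.
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lowercases the text and splits on anything that is not a letter a-z (assumed tokenizer). */
    static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z]+");
    }

    /** Adds one labeled article to the counts. */
    void train(String category, String text) {
        totalDocs++;
        docCounts.merge(category, 1, Integer::sum);
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue; // split can yield a leading empty token
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(category, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** Returns the category code with the highest log posterior, or null if untrained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        String[] words = tokenize(text);
        for (String category : docCounts.keySet()) {
            // Start from the log prior: log P(category).
            double score = Math.log((double) docCounts.get(category) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(category);
            int total = totalWords.getOrDefault(category, 0);
            for (String word : words) {
                // Words never seen in training are ignored.
                if (word.isEmpty() || !vocabulary.contains(word)) continue;
                // Add log P(word | category) with add-one smoothing.
                int count = counts.getOrDefault(word, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

In this sketch, train would be called once per training article with its category code, and classify would then return one of the codes S, E, H, B, or T for a new article's text. Working in log space avoids numeric underflow when many small probabilities are multiplied.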
2) At the testing stage, input a new article and ask the agent which class it belongs to.
Test the agent on 100 different news articles and report the accuracy.
Input: the name of a new article, e.g., [login to view URL]
Output: Class: e.g., T
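The accuracy report could be computed along the lines of the sketch below. It assumes the true categories of the test articles are kept in rows of the same "name, code" form used for the training labels. The class name AccuracySketch and both method names are hypothetical, chosen only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Helpers for scoring predictions against a label file (illustrative sketch only). */
class AccuracySketch {
    /** Parses rows of the form "articleName, X" into a map from article name to category code. */
    static Map<String, String> parseLabels(List<String> rows) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (String row : rows) {
            String[] parts = row.split(",");
            if (parts.length != 2) continue; // skip blank or malformed rows
            labels.put(parts[0].trim(), parts[1].trim());
        }
        return labels;
    }

    /** Returns the fraction of articles whose predicted code equals the true code. */
    static double accuracy(Map<String, String> truth, Map<String, String> predicted) {
        if (truth.isEmpty()) return 0.0;
        int correct = 0;
        for (Map.Entry<String, String> entry : truth.entrySet()) {
            // A missing prediction counts as an error.
            if (entry.getValue().equals(predicted.get(entry.getKey()))) correct++;
        }
        return (double) correct / truth.size();
    }
}
```

With 100 test articles, the returned value multiplied by 100 gives the percentage of correctly classified articles.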
Requirements:
2) Provide a document describing your code design and implementation
3) Provide source code with clear explanations
4) Provide a sample news collection and sample testing results
5) Provide a report on your testing results and an analysis of those results
6) A 4-page report on the algorithm, dataset/problem, and evaluation is expected as part of the project
Note:
1) You can manually edit each news article so that only text information is included;