I have started on a MapReduce job that implements the k-nearest neighbor (KNN) data mining technique. The job takes three parameters: the input file, the output, and the test data file. The logic I have written so far does not seem to work properly. I have attached the code written so far and would like someone to help get it working.
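For anyone picking this up, here is a minimal sketch of what the map side of a KNN job typically computes: the distance from one training record to the test instance, paired with that record's class label. It is plain Java (Java 8 or later assumed) with no Hadoop dependency, and it is not the attached code. The class name `KnnMapSketch`, the method names, the choice of Euclidean distance, and the comma-separated layout with the label in the last field are all assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical sketch of map-side KNN logic; not the attached project code.
final class KnnMapSketch {

    // Euclidean distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature count mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Assumed record layout: numeric features, then the class label,
    // separated by commas. Returns (distance to test instance, label).
    static Map.Entry<Double, String> mapRecord(String line, double[] test) {
        String[] fields = line.trim().split(",");
        double[] features = new double[fields.length - 1];
        for (int i = 0; i < features.length; i++) {
            features[i] = Double.parseDouble(fields[i].trim());
        }
        String label = fields[fields.length - 1].trim();
        return new AbstractMap.SimpleEntry<>(distance(features, test), label);
    }

    public static void main(String[] args) {
        double[] test = {0.0, 0.0};
        Map.Entry<Double, String> out = mapRecord("3.0,4.0,yes", test);
        // One (distance, label) pair per training record.
        System.out.println(out.getKey() + "\t" + out.getValue());
    }
}
```

In a real Hadoop mapper the pair would be written to the context instead of returned, and the test instance would have to be loaded from the test data file before the first record is processed.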
It is important that the code is tested with Hadoop 1.1.2, as the Hadoop version has particular significance to the work being done.
The code has some issues with the reducer function and its cleanup.
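To make the reducer and cleanup question concrete, here is a rough sketch of one common pattern: the reduce step only accumulates the k smallest distances seen so far, and the majority vote happens once at the end. In Hadoop's `org.apache.hadoop.mapreduce.Reducer`, `cleanup(Context)` is called once after the last `reduce` call of a task, which is why the final vote fits there. The code below is plain Java (Java 8 or later assumed) that imitates that shape without Hadoop. The class `KnnReduceSketch` and its methods are hypothetical, it assumes every (distance, label) pair reaches a single reducer, and it may differ from both the attached code and the paper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of reducer-side KNN logic; not the attached project code.
final class KnnReduceSketch {

    private static final class Neighbor {
        final double distance;
        final String label;

        Neighbor(double distance, String label) {
            this.distance = distance;
            this.label = label;
        }
    }

    private final int k;
    // Max-heap on distance, so the farthest kept neighbor is evicted first.
    private final PriorityQueue<Neighbor> nearest;

    KnnReduceSketch(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        this.k = k;
        this.nearest = new PriorityQueue<>(k,
                (a, b) -> Double.compare(b.distance, a.distance));
    }

    // Analogue of reduce(): accumulates state only and emits nothing.
    void reduce(double distance, String label) {
        nearest.add(new Neighbor(distance, label));
        if (nearest.size() > k) {
            nearest.poll(); // drop the current farthest neighbor
        }
    }

    // Analogue of cleanup(): majority vote over the k kept neighbors.
    // Ties are broken arbitrarily; returns null if nothing was seen.
    String cleanup() {
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (Neighbor n : nearest) {
            int count = votes.merge(n.label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = n.label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        KnnReduceSketch reducer = new KnnReduceSketch(3);
        reducer.reduce(1.0, "yes");
        reducer.reduce(5.0, "no");
        reducer.reduce(2.0, "yes");
        reducer.reduce(0.5, "no");
        reducer.reduce(9.0, "no");
        System.out.println(reducer.cleanup()); // prints "yes"
    }
}
```

A frequent source of trouble with this shape is emitting a prediction inside every `reduce` call, before all distances have been seen, so that is one place worth checking in the attached reducer.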
The implementation is based on the approach described in the following paper: [url removed, login to view]
The input file and test file are included in the project. The work has been developed in Eclipse as a Java project. The names of the input, output, and test files have been added as arguments in the run configurations.
The paper referenced above has also been attached to this description.