On the Hadoop cluster located at [login to view URL], in HDFS, there is a directory called /share/spoilers. This directory contains 25,000 text files containing "spoiler logs" for a randomized version of The Legend of Zelda: A Link to the Past. Yes, I am a huge nerd.

As before, in the "Light World" section, you might see:

"Graveyard Ledge:14952": "ProgressiveSword:14952",

This means that the location called "Graveyard Ledge" has the item "ProgressiveSword". Each of these sections is labeled by its region. The region of the previous item is therefore "Light World".

Your task for this project is to create a Naive Bayes classifier for Hadoop using Spark Machine Learning to predict the region that houses the "PegasusBoots" item. Each file will provide one example for the classifier. Before attempting to train the classifier, you should extract the following features from each file:

HookshotLocation: The region that houses the "Hookshot" item.
PearlLocation: The region that houses the "MoonPearl" item.
MirrorLocation: The region that houses the "MagicMirror" item.
FireLocation: The region that houses the "FireRod" item.
CaneLocation: The region that houses the "CaneOfSomaria" item.
HammerLocation: The region that houses the "Hammer" item.
FlipperLocation: The region that houses the "Flippers" item.
LWGloves: The number of "ProgressiveGlove" items in the "Light World" region.
EasternGloves: The number of "ProgressiveGlove" items in the "Eastern Palace" region.
DesertGloves: The number of "ProgressiveGlove" items in the "Desert Palace" region.
DMGloves: The number of "ProgressiveGlove" items in the "Death Mountain" region.
HeraGloves: The number of "ProgressiveGlove" items in the "Tower of Hera" region.
DWGloves: The number of "ProgressiveGlove" items in the "Dark World" region.
DarkGloves: The number of "ProgressiveGlove" items in the "Dark Palace" region.
SwampGloves: The number of "ProgressiveGlove" items in the "Swamp Palace" region.
SkullGloves: The number of "ProgressiveGlove" items in the "Skull Woods" region.
ThievesGloves: The number of "ProgressiveGlove" items in the "Thieves Town" region.
BootsLocation (Class Value): The region that houses the "PegasusBoots" item.

*Note: "Castle Tower", "Ice Palace", "Misery Mire", "Turtle Rock", and "Ganons Tower" cannot contain a glove.
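The feature list above could be extracted along the following lines. This is only a minimal sketch in plain Python, not a definitive implementation. It assumes that each spoiler file parses to a JSON object whose top-level keys are region names and whose values are objects mapping "Location:seed" to "Item:seed", as in the "Graveyard Ledge" example. The names extract_features and strip_suffix, and every location in the sample other than "Graveyard Ledge", are hypothetical and used purely for illustration.

```python
import json

# Feature name -> item whose region we want (from the posting's feature list).
CATEGORICAL_ITEMS = {
    "HookshotLocation": "Hookshot",
    "PearlLocation": "MoonPearl",
    "MirrorLocation": "MagicMirror",
    "FireLocation": "FireRod",
    "CaneLocation": "CaneOfSomaria",
    "HammerLocation": "Hammer",
    "FlipperLocation": "Flippers",
    "BootsLocation": "PegasusBoots",  # class value
}

# Feature name -> region in which "ProgressiveGlove" items are counted.
GLOVE_REGIONS = {
    "LWGloves": "Light World",
    "EasternGloves": "Eastern Palace",
    "DesertGloves": "Desert Palace",
    "DMGloves": "Death Mountain",
    "HeraGloves": "Tower of Hera",
    "DWGloves": "Dark World",
    "DarkGloves": "Dark Palace",
    "SwampGloves": "Swamp Palace",
    "SkullGloves": "Skull Woods",
    "ThievesGloves": "Thieves Town",
}


def strip_suffix(name):
    """Drop the ":14952"-style suffix, e.g. "Hammer:14952" -> "Hammer"."""
    return name.split(":", 1)[0]


def extract_features(spoiler):
    """Build one feature row (a dict) from one parsed spoiler log.

    Assumes `spoiler` maps region -> {location: item}; anything that does
    not match that shape is skipped rather than treated as an error.
    """
    features = {name: None for name in CATEGORICAL_ITEMS}
    features.update({name: 0 for name in GLOVE_REGIONS})
    item_to_feature = {item: feat for feat, item in CATEGORICAL_ITEMS.items()}
    region_to_glove = {region: feat for feat, region in GLOVE_REGIONS.items()}

    for region, locations in spoiler.items():
        if not isinstance(locations, dict):
            continue
        region = strip_suffix(region)
        for item in locations.values():
            if not isinstance(item, str):
                continue
            item = strip_suffix(item)
            if item in item_to_feature:
                features[item_to_feature[item]] = region
            elif item == "ProgressiveGlove" and region in region_to_glove:
                features[region_to_glove[region]] += 1
    return features


# Tiny hand-written sample; only "Graveyard Ledge" comes from the posting.
sample = json.loads("""
{
  "Light World": {
    "Graveyard Ledge:14952": "ProgressiveSword:14952",
    "Hypothetical Spot A:14952": "PegasusBoots:14952",
    "Hypothetical Spot B:14952": "ProgressiveGlove:14952"
  },
  "Dark World": {
    "Hypothetical Spot C:14952": "Hookshot:14952"
  }
}
""")
sample_features = extract_features(sample)
```

Features whose item never appears in a file are left as None in this sketch, so such rows would need to be handled (or dropped) before training.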
The files are in JSON format, so you will need to either find a way to parse JSON or simply read the relevant lines from each file. You will need to produce the attributes listed above from the spoiler files on the Hadoop cluster. In the end, I should be able to use your classifier to predict the boots locations for rows that are unlabeled. You will need to convert the categorical values in each of these to integers in order to make them work with Spark's machine learning library. Provide any code you used, as well as the file containing the model you built using Spark ML.
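The categorical-to-integer conversion mentioned above might look roughly like this. It is a plain-Python sketch of the idea only, assuming every row already has a region string for each categorical column. The helpers build_index and encode_rows and the two sample rows are hypothetical. In Spark ML itself, the StringIndexer transformer in pyspark.ml.feature serves a similar purpose.

```python
def build_index(values):
    """Map each distinct string to a stable integer (sorted order)."""
    return {value: i for i, value in enumerate(sorted(set(values)))}


def encode_rows(rows, categorical_columns, numeric_columns, label_column):
    """Turn feature dicts into (label, feature_vector) pairs of integers.

    One shared index is used for every region-valued column, so the same
    region always maps to the same integer, in features and labels alike.
    """
    region_columns = list(categorical_columns) + [label_column]
    index = build_index(row[col] for row in rows for col in region_columns)
    encoded = []
    for row in rows:
        label = index[row[label_column]]
        vector = [index[row[col]] for col in categorical_columns]
        vector += [row[col] for col in numeric_columns]  # counts stay as-is
        encoded.append((label, vector))
    return index, encoded


# Two hypothetical rows using a subset of the posting's columns.
rows = [
    {"HookshotLocation": "Dark World", "LWGloves": 1, "BootsLocation": "Light World"},
    {"HookshotLocation": "Light World", "LWGloves": 0, "BootsLocation": "Swamp Palace"},
]
region_index, encoded_rows = encode_rows(
    rows,
    categorical_columns=["HookshotLocation"],
    numeric_columns=["LWGloves"],
    label_column="BootsLocation",
)
```

Keeping region_index alongside the saved model would let predicted integer labels be mapped back to region names for unlabeled rows.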
Hello there,
We have read your requirements and will complete this within your timeframe. We have 10+ years of experience in software architecture, Apache Hadoop, and development services. We have an excellent, experienced team who will work on it. Let us know when we can start the work.
Looking forward to hearing from you.
Thanks