Introduction

A bootstrap Maven project for running MapReduce (v1/v2) jobs locally, or on a cluster if you have one. I use Maven for partial dependency management, compilation, running MRUnit tests, and packaging the jar file.

Outline

manOfSteelReview - classic word count example, run on Man of Steel reviews from various websites. Eventually I would like to use Flume to ingest hashtags from Twitter streams.
averageWordLength - another classic MapReduce sample that computes the average word length
invertedIndex - creates an inverted index of every word from a list of files
logAnalysisCounter - example project demonstrating MapReduce Counters functionality
logAnalysis - example project demonstrating MapReduce partitioners, where individual log files are generated on a monthly basis
tidem - text analyzer that processes text and reports on its word contents. It generates key-value pairs showing how many times each word occurs in the text. The result has a primary sort by word length and a secondary sort by ASCII order.
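
To make the tidem ordering concrete, here is a minimal sketch in plain Java, with no Hadoop dependency. It is not the code from this repository: the class name `TidemSketch` and the method `countAndSort` are hypothetical, and it assumes words are whitespace-separated tokens and that "ASCII" order means Java's default case-sensitive `String` ordering (uppercase letters sort before lowercase).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the real tidem job runs as a MapReduce job.
public class TidemSketch {

    // Count each whitespace-separated token, then order the (word, count)
    // pairs by word length first and by default String order second.
    static List<Map.Entry<String, Integer>> countAndSort(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(
            Comparator.comparingInt((Map.Entry<String, Integer> e) -> e.getKey().length())
                      .thenComparing(e -> e.getKey()));
        return entries;
    }

    public static void main(String[] args) {
        // Assumed sample input, chosen to show both sort levels.
        for (Map.Entry<String, Integer> e : countAndSort("a bb a ccc Bb bb")) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

For the sample input, the pairs come out as a=2, Bb=1, bb=2, ccc=1: shorter words first, and within the same length "Bb" before "bb" because uppercase letters have lower ASCII codes. In an actual MapReduce job this ordering would typically live in a custom key comparator rather than an in-memory sort.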

Setup

Note: For detailed instructions on setting up Hadoop in your local environment:

http://hadoop.apache.org/docs/stable/single_node_setup.html

Clone this repository and download a Hadoop distribution from http://hadoop.apache.org/releases.html; you will need the libraries from the ${hadoop_home}/lib directory
Add the Hadoop /lib libraries to your classpath in your IDE of choice
For example, if you would like to run the "logAnalysis" MapReduce job:

Run mvn clean package from the project directory (ex. ${user_workspace}/hadoop-mapreduce-local/code/logAnalysis/target)

You should then be able to run the MapReduce job with appropriate input and output directories.

krish-na/hadoop-mapreduce-local

Folders and files

Latest commit

History

Repository files navigation

Introduction

Outline

Setup

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages