hadoop - Running a hadoop job

Question

This is my first time running a job on Hadoop, and I started with the WordCount example. To run my job, I'm using this command:

hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar hadoop*examples*.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output

I think we should copy the jar file to /usr/local/hadoop. My first question: what is the meaning of hadoop*examples*? And if we want to put our jar file in another location, for example /home/user/WordCountJar, what should I do? Thanks in advance for your help.

score 1 · Accepted Answer

The *examples* part is just shell wildcard expansion to account for different version numbers in the file name. For example, hadoop*examples*.jar matches hadoop-0.19.2-examples.jar.

You can use the full path to your jar like so:

bin/hadoop jar /home/user/hadoop-0.19.2-examples.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output

Edit: the asterisks surrounding the word examples were originally stripped from my post at the time of submission.
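
To see what the shell does with the pattern before hadoop ever runs, here is a minimal sketch. It assumes a throwaway directory and an empty placeholder file named hadoop-0.19.2-examples.jar; both are stand-ins for illustration, not a real Hadoop install.

```shell
# Scratch directory with an empty placeholder file standing in for the real jar.
demo_dir=$(mktemp -d)
touch "$demo_dir/hadoop-0.19.2-examples.jar"

# The shell replaces the pattern with the matching file name before the
# command starts; echo just prints the arguments it was given.
expanded=$(cd "$demo_dir" && echo hadoop*examples*.jar)
echo "$expanded"
```

The hadoop command never sees the asterisks. It receives the already expanded file name, exactly as if you had typed it out.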

score 1 · Accepted Answer

I think we should copy the jar file in /usr/local/hadoop

It is not mandatory. But if you have your jar at some other location, you need to specify the complete path while running your job.

My first question is that what is the meaning of hadoop*examples*?

hadoop*examples* is a shell wildcard pattern that matches the name of the jar package containing your MR job along with its dependencies. Here, * matches any sequence of characters, so the pattern works for any version in the file name, not specifically 0.19.2. That said, I feel it should be hadoop-examples-*.jar and not hadoop*examples*.jar.

and if we want to locate our jar file in another location for example /home/user/WordCountJar, what I should do?

If your jar is in a directory other than the one you are running the command from, you need to specify the complete path to your jar. For example:

bin/hadoop jar /home/user/WordCountJar/hadoop-*-examples.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
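
One caveat with a wildcard in the path: if the directory holds more than one matching jar, the shell passes all of them to hadoop and the command will not do what you intended. A rough sketch of a guard follows. The function name find_examples_jar is hypothetical, and the temporary directory with an empty placeholder file stands in for /home/user/WordCountJar.

```shell
# Hypothetical helper: print the one jar in directory $1 that matches the
# examples pattern. Fails if nothing matches or if several files match.
find_examples_jar() {
    # Let the shell expand the pattern into the positional parameters.
    set -- "$1"/hadoop*examples*.jar
    # With no match the pattern is usually left unexpanded, so also check
    # that the single result is a real file.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one examples jar" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Placeholder setup (assumption): a scratch directory instead of a real install.
jar_dir=$(mktemp -d)
touch "$jar_dir/hadoop-0.19.2-examples.jar"

# Print the command rather than running it, so the sketch works without Hadoop.
jar_path=$(find_examples_jar "$jar_dir") &&
    echo bin/hadoop jar "$jar_path" wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
```

Once the printed command looks right, drop the echo to actually submit the job.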
