I am an AWS newbie. I created a cluster and SSH'ed into the master node. When I try to copy files from s3://my-bucket-name/ to the local file://home/hadoop folder in Pig using:
cp s3://my-bucket-name/path/to/file file://home/hadoop
I get the error:
2013-06-08 18:59:00,267 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
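Reading that message, I think it wants the credentials in one of these two forms (ACCESS_KEY and SECRET_KEY below are placeholders, not my real keys), but I am not sure where either of these is supposed to go on EMR:

```shell
# Two forms the error message seems to suggest (ACCESS_KEY / SECRET_KEY
# are placeholders, not real credentials):
#
# a) keys embedded in the s3 URL itself, from the grunt shell:
#      grunt> cp s3://ACCESS_KEY:SECRET_KEY@my-bucket-name/path/to/file file://home/hadoop
#
# b) the fs.s3 properties, set from the grunt shell:
#      grunt> set fs.s3.awsAccessKeyId 'ACCESS_KEY';
#      grunt> set fs.s3.awsSecretAccessKey 'SECRET_KEY';
```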
I cannot even ls my S3 bucket. I set the AWS_ACCESS_KEY and AWS_SECRET_KEY environment variables without success. I also could not locate a config file for Pig where I could set the appropriate fields.
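For reference, this is roughly how I set the environment variables before starting pig (the values here are placeholders; I used my real keys on the cluster):

```shell
# Exported in the shell on the master node before launching pig
# (placeholder values; real keys redacted)
export AWS_ACCESS_KEY=AKIAXXXXXXXXXXXXXXXX
export AWS_SECRET_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```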
Any help please?
Edit: I tried to load the file in Pig using the full s3:// URI:
grunt> raw_logs = LOAD 's3://XXXXX/input/access_log_1' USING TextLoader as (line:chararray);
grunt> illustrate raw_logs;
and I get the following error:
2013-06-08 19:28:33,342 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2013-06-08 19:28:33,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2013-06-08 19:28:33,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2013-06-08 19:28:33,405 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2013-06-08 19:28:33,405 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2013-06-08 19:28:33,429 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2013-06-08 19:28:33,430 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Internal error creating job configuration.
java.lang.RuntimeException: Internal error creating job configuration.
        at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:160)
        at org.apache.pig.PigServer.getExamples(PigServer.java:1244)
        at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:722)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:591)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:306)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
        at org.apache.pig.Main.run(Main.java:500)
        at org.apache.pig.Main.main(Main.java:114)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
2013-06-08 19:28:33,432 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception : Internal error creating job configuration.
Details at logfile: /home/hadoop/pig_1370719069857.log