
Guys, the following Python script fails with:

job state = FAILED

Last State Change: Access denied checking streaming input path: s3n://elasticmapreduce/samples/wordcount/input/

Code:

import boto
import boto.emr
from boto.emr.step import StreamingStep
from boto.emr.bootstrap_action import BootstrapAction
import time

S3_BUCKET="mytesetbucket123asdf"
conn = boto.connect_emr()

step = StreamingStep(
  name='Wordcount',
  mapper='s3n://elasticmapreduce/samples/wordcount/wordSplitter.py',
  reducer='aggregate',
  input='s3n://elasticmapreduce/samples/wordcount/input/',
  output='s3n://' + S3_BUCKET + '/wordcount/output/2013-10-25')

jobid = conn.run_jobflow(
    name="test",
    log_uri="s3://" + S3_BUCKET + "/logs/",
    visible_to_all_users=True,
    steps=[step])

state = conn.describe_jobflow(jobid).state
print "job state = ", state
print "job id = ", jobid
# Stop polling on any terminal state; a FAILED job never reaches COMPLETED.
while state not in (u'COMPLETED', u'FAILED', u'TERMINATED'):
    print time.localtime()
    time.sleep(10)
    jobflow = conn.describe_jobflow(jobid)
    state = jobflow.state
    print jobflow
    print "job state = ", state
    print "job id = ", jobid

# TIMESTAMP was never defined; report the path the step actually writes to.
output_path = "s3://" + S3_BUCKET + "/wordcount/output/2013-10-25"
print "final output can be found in " + output_path
print "try: $ s3cmd sync " + output_path + " ."
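
As a side note, the polling loop could be pulled into a small helper that gives up after a timeout and stops on any terminal state. This is only a sketch: `wait_for_jobflow`, `TERMINAL_STATES`, the `get_state` callable and the default timings are names and values made up for illustration, not part of boto.

```python
import time

# Terminal job flow states reported by EMR; polling should stop on any of them.
TERMINAL_STATES = ("COMPLETED", "FAILED", "TERMINATED")


def wait_for_jobflow(get_state, poll_seconds=10, timeout_seconds=3600,
                     sleep=time.sleep):
    """Poll get_state() until a terminal state or the timeout is reached.

    get_state is any zero-argument callable returning the current state
    string (hypothetical hook, so the loop can be tested without AWS).
    Returns the last state seen.
    """
    waited = 0
    state = get_state()
    while state not in TERMINAL_STATES and waited < timeout_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        state = get_state()
    return state
```

With the script above it would presumably be called as `wait_for_jobflow(lambda: conn.describe_jobflow(jobid).state)`.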

1 Answer


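The problem lies somewhere... if we specify an IAM user instead of using a role, the job runs perfectly. EMR does of course support IAM roles... and the IAM role we tested has full permissions to perform any task, so this is not a misconfiguration issue...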
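
For anyone wanting to try the same workaround, here is a rough sketch of connecting with explicit IAM user credentials rather than relying on a role. It assumes the user's keys are stored in the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables; `iam_user_credentials` and `connect_as_iam_user` are hypothetical helper names, and only `boto.connect_emr` with its `aws_access_key_id` / `aws_secret_access_key` arguments comes from boto itself.

```python
import os


def iam_user_credentials(env=None):
    """Build keyword arguments for boto.connect_emr from an environment mapping.

    Assumes the IAM user's keys live in AWS_ACCESS_KEY_ID and
    AWS_SECRET_ACCESS_KEY; raises KeyError if either is missing.
    """
    env = os.environ if env is None else env
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


def connect_as_iam_user(env=None):
    """Open an EMR connection using explicit IAM user credentials."""
    # Imported lazily so iam_user_credentials can be used without boto installed.
    import boto
    return boto.connect_emr(**iam_user_credentials(env))
```

Replacing `conn = boto.connect_emr()` in the question with `conn = connect_as_iam_user()` should then submit the job as the IAM user.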

Answered 2013-11-01T15:35:10.297