
I have a distcp action as below:

<action name="ExecuteDataCopyS3ToHDFS">
    <distcp xmlns="uri:oozie:distcp-action:0.2">
        <arg>-Dmapred.job.queue.name=dev</arg>
        <arg>-Dhadoop.security.credential.provider.path=jceks://hdfs/user/ABC/oneaws.jceks</arg>
        <arg>-update</arg>
        <arg>s3a://XXXX/</arg>
        <arg>s3n://XXXX/XXXX/</arg>
    </distcp>
    <ok to="end"/>
    <error to="create-error-file"/>
</action>

I added two pairs of credentials to the jceks file, like below:

hadoop credential create fs.s3a.access.key -provider localjceks://file/home/XXX/oneaws.jceks
hadoop credential create fs.s3a.secret.key -provider localjceks://file/home/XXX/oneaws.jceks

hadoop credential create fs.s3n.access.key -provider localjceks://file/home/XXX/oneaws.jceks
hadoop credential create fs.s3n.secret.key -provider localjceks://file/home/XXX/oneaws.jceks
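
(The workflow's hadoop.security.credential.provider.path points at jceks://hdfs/user/ABC/oneaws.jceks, so the keystore created locally then has to be uploaded into HDFS at that path; roughly:)

    hadoop fs -put /home/XXX/oneaws.jceks /user/ABC/oneaws.jceks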

The s3a credentials are for the source AWS location and the s3n credentials are for the target.

When I run the Oozie action I get an exception. Here is the stack trace:

 2016-12-15 17:31:21,933 ERROR [main] org.apache.hadoop.tools.DistCp: Invalid arguments: 
org.apache.hadoop.security.AccessControlException: Permission denied: s3n://XXX/XXX/XXX
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:449)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at org.apache.hadoop.fs.s3native.$Proxy28.retrieveMetadata(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:476)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:217)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:116)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.oozie.action.hadoop.DistcpMain.run(DistcpMain.java:64)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.DistcpMain.main(DistcpMain.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
 Caused by: org.jets3t.service.impl.rest.HttpException
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:519)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:281)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:942)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2148)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2075)
at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1093)
at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:548)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:174)

When I test distcp from the command line with the same credentials, it works perfectly fine.
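
For reference, the command-line test is essentially the same invocation as the action's arguments (bucket paths elided as above):

    hadoop distcp \
        -Dmapred.job.queue.name=dev \
        -Dhadoop.security.credential.provider.path=jceks://hdfs/user/ABC/oneaws.jceks \
        -update s3a://XXXX/ s3n://XXXX/XXXX/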


1 Answer


The reason the credential provider does not work for s3 or s3n is that, when you use s3n, it looks up fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey, not fs.s3n.access.key and fs.s3n.secret.key. The property names are not consistent across the AWS filesystem implementations.
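
So for the s3n target, the entries would have to be created under the property names that s3n actually reads, something like this (reusing the local provider path from the question):

    hadoop credential create fs.s3n.awsAccessKeyId -provider localjceks://file/home/XXX/oneaws.jceks
    hadoop credential create fs.s3n.awsSecretAccessKey -provider localjceks://file/home/XXX/oneaws.jceks

Even that runs into the second problem below, though.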

The second problem is that, even when we pass the correct key name, the credential utility stores every alias in lowercase. So when we create a credential provider entry for "fs.s3n.awsAccessKeyId", it is stored as "fs.s3n.awsaccesskeyid". When that credential provider is used, the right key is never found, because the filesystem is still looking up the value of "fs.s3n.awsAccessKeyId".
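
You can see this by listing the provider after creating an entry; the alias comes back lowercased, e.g.:

    hadoop credential list -provider localjceks://file/home/XXX/oneaws.jceks
    # lists fs.s3n.awsaccesskeyid, not fs.s3n.awsAccessKeyId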

You can see the bug here: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-common/2.7.1/org/apache/hadoop/security/alias/AbstractJavaKeyStoreProvider.java?av=f

There will be a workaround once we have https://issues.apache.org/jira/browse/HADOOP-12548.

Answered 2016-12-20T21:35:41.397