
I am developing a Hadoop project in Java. When I run my jar on a local cluster it works fine, but when I run it on a multi-node Amazon cluster it throws an exception.

My MapReduce job setup code:

    job.setJarByClass(ReadActivityDriver.class);

    job.setMapperClass(ReadActivityLogMapper.class);
    job.setReducerClass(ReadActivityLogReducer.class);

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    job.setInputFormatClass(ColumnFamilyInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    ConfigHelper.setInputRpcPort(job.getConfiguration(), pro.getProperty("port"));
    ConfigHelper.setInputInitialAddress(job.getConfiguration(), pro.getProperty("server"));
    ConfigHelper.setInputPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.Murmur3Partitioner");
    ConfigHelper.setInputColumnFamily(job.getConfiguration(), keyspace, columnFamily);

    SlicePredicate predicate = new SlicePredicate().setColumn_names(cn);
    ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);

    FileSystem.get(job.getConfiguration()).delete(new Path("ReadOutput"), true);
    FileOutputFormat.setOutputPath(job, new Path("ReadOutput"));

    job.waitForCompletion(true);
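A side note on the driver above: the return value of `waitForCompletion` is ignored, so when the job fails the driver keeps going and later tries to open `ReadOutput/part-r-00000`, which is what produces the FileNotFoundException traces further down. A minimal guard might look like this (a sketch; the `job` variable and output path follow the snippet above):

```java
// Sketch: abort the driver if the job failed, so downstream steps
// (e.g. reading ReadOutput/part-r-00000) are never attempted.
boolean succeeded = job.waitForCompletion(true);
if (!succeeded) {
    System.err.println("MapReduce job failed; aborting before reading output");
    System.exit(1);
}
```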

The exception I am getting:

8020/home/ubuntu/hdfstmp/mapred/staging/ubuntu/.staging/job_201405080944_0010
java.lang.RuntimeException: org.apache.thrift.TApplicationException: Invalid method name: 'describe_local_ring'
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
    at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at com.cassandra.readActivity.ReadActivityDriver.run(ReadActivityDriver.java:117)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:33)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'describe_local_ring'
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277)
    at org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329)
    ... 20 more
java.io.FileNotFoundException: File does not exist: /user/ubuntu/ReadOutput/part-r-00000;
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at com.cassandra.readActivity.ReadActivityMySql.calculatePoint(ReadActivityMySql.java:65)
    at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:36)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
java.io.FileNotFoundException: File does not exist: /user/ubuntu/ReadOutput/part-r-00000;
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at com.cassandra.readActivity.MySqlSavePoint.setSavePoint(MySqlSavePoint.java:66)
    at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

3 Answers


It looks like the input/output format jar and your cluster are not running the same Cassandra version. You need to fix the jar, or upgrade the Cassandra nodes on AWS.

Answered 2014-05-09T15:28:38.120

I think the problem is with your Cassandra partitioner; try the RandomPartitioner:

ConfigHelper.setInputPartitioner(job.getConfiguration(),"org.apache.cassandra.dht.RandomPartitioner"); 
Answered 2014-05-14T11:49:54.657

Finally I found the answer.

Use the RandomPartitioner:

ConfigHelper.setInputPartitioner(job.getConfiguration(),"org.apache.cassandra.dht.RandomPartitioner");

instead of the Murmur3Partitioner:

ConfigHelper.setInputPartitioner(job.getConfiguration(),"org.apache.cassandra.dht.Murmur3Partitioner");
Answered 2014-05-14T11:46:17.267