hadoop - Can I get the partition number in Hadoop?

Question

I am new to Hadoop.

I want to get the partition number of an output file.

First, I wrote a custom partitioner.

public static class MyPartitioner extends Partitioner<Text, LongWritable> {

    public int getPartition(Text key, LongWritable value, int numReduceTasks) {

        int numOfChars = key.toString().length();
        return numOfChars % numReduceTasks;
    }
}
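To check which partition a given key ends up in, the same arithmetic can be run outside Hadoop. This is only a sketch: the class name `PartitionSketch`, the sample keys, and the reducer count of 2 are made up for illustration, and a plain `String` stands in for `Text`.

```java
class PartitionSketch {
    // Same arithmetic as MyPartitioner.getPartition:
    // key length modulo the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return key.length() % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 2; // assumed reducer count, not taken from the question
        for (String key : new String[] {"a", "to", "cat", "door"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numReduceTasks));
        }
    }
}
```

With two reducers, keys of even length go to partition 0 and keys of odd length go to partition 1.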

It works. However, I want to output the partition number "visually" from the Reducer.

How can I get the partition number?

Below is my reducer source.

public static class MyReducer extends Reducer<Text, LongWritable, Text, Text>{

    private Text textList = new Text();

    public void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {

        String list = new String();

        for(LongWritable value: values) {
            list = new String(list + "\t" + value.toString());
        }

        textList.set(list);

        context.write(key, textList);

    }

}

I want to append the partition number to each entry in "list". It will be "0" or "1".

list = new String(list + "\t" + value.toString() + "\t" + ??);

It would be great if someone could help me.

+

Thanks to the answer, I got a solution. However, it did not work, and I think I did something wrong.

Below is the modified MyPartitioner.

public static class MyPartitioner extends Partitioner<Text, LongWritable> {

    public int getPartition(Text key, LongWritable value, int numReduceTasks) {

        int numOfChars = key.toString().length();
        return numOfChars % numReduceTasks;

        private int bring_num = 0;      
        public void configure(JobConf job) {
            bring_num = jobConf.getInt(numOfChars & numReduceTasks);
}

    }

}

score 1 · Accepted Answer

Add the following code to the Reducer class to store the partition number in a class variable, which can then be used in the reduce method.

String partition;
protected void setup(Context context) throws IOException,
    InterruptedException {
    Configuration conf = context.getConfiguration();
    partition = conf.get("mapred.task.partition");
}
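Once `partition` is set in `setup`, it can fill the `??` in the question's loop. The sketch below only shows the list building, using plain Java types so it runs without Hadoop on the classpath. The class name `PartitionedListSketch`, the method `buildList`, and the sample values are hypothetical; in the real reducer the values would be `LongWritable` and `partition` would be the field set above. As far as I know, newer Hadoop releases name this property `mapreduce.task.partition` and keep the old key as a deprecated alias, so treat the key name as something to verify against your version.

```java
import java.util.Arrays;
import java.util.List;

class PartitionedListSketch {
    // Mirrors the loop in MyReducer.reduce: every value is followed by
    // the partition number, with tabs as separators.
    static String buildList(Iterable<Long> values, String partition) {
        StringBuilder list = new StringBuilder();
        for (Long value : values) {
            list.append('\t').append(value).append('\t').append(partition);
        }
        return list.toString();
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(3L, 7L); // made-up sample values
        String partition = "1";                    // stands in for the field set in setup()
        System.out.println(buildList(values, partition));
    }
}
```

A `StringBuilder` is used here instead of repeated `new String(...)` concatenation; the resulting text is the same.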
