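I'm confused about one thing: do all the built-in Writable types, such as IntWritable, FloatWritable, GenericWritable, etc., use a raw comparator for comparison by default? If not, how should we register them so that they use a RawComparator?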

1 Answer

The RawComparator is obtained in JobConf.getOutputKeyComparator:

  public RawComparator getOutputKeyComparator() {
    Class<? extends RawComparator> theClass = getClass("mapred.output.key.comparator.class",
            null, RawComparator.class);
    if (theClass != null)
      return ReflectionUtils.newInstance(theClass, this);
    return WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class));
  }

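Hadoop first tries to load the RawComparator class named by mapred.output.key.comparator.class. If that property is not set, Hadoop casts the map output key class to WritableComparable and uses it to create a WritableComparator. So if we do not set a custom RawComparator, we end up in WritableComparator.get: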

  public static synchronized 
  WritableComparator get(Class<? extends WritableComparable> c) {
    WritableComparator comparator = comparators.get(c);
    if (comparator == null) {
      // force the static initializers to run
      forceInit(c);
      // look to see if it is defined now
      comparator = comparators.get(c);
      // if not, use the generic one
      if (comparator == null) {
        comparator = new WritableComparator(c, true);
      }
    }
    return comparator;
  }

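WritableComparator.get first looks for a WritableComparator in the comparators map. If none is found, it forces the key class's static initializers to run and looks again; only if that also finds nothing does it create a generic WritableComparator.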

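Most built-in Writables, such as IntWritable, call define when they are loaded, which puts their WritableComparator (for example, org.apache.hadoop.io.IntWritable.Comparator) into comparators. So if you want to register your custom RawComparator, you can use code like the following (make sure it is in the body of your Writable class):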

  static {                                        // register this comparator
    WritableComparator.define(IntWritable.class, new Comparator());
  }
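To see why the static block has to live in the key class itself, here is a minimal plain-Java model of the lookup, not Hadoop's actual code. ComparatorRegistry and DemoKey are hypothetical names made up for this sketch, and the registered comparator is only a placeholder. The point is that get forces the key class to initialize, so the class's own static block runs and registers the comparator just before the second lookup.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the comparators map inside WritableComparator.
final class ComparatorRegistry {
    private static final Map<Class<?>, Comparator<byte[]>> COMPARATORS = new HashMap<>();

    // Called from a key class's static initializer to register its comparator.
    static synchronized void define(Class<?> keyClass, Comparator<byte[]> comparator) {
        COMPARATORS.put(keyClass, comparator);
    }

    // Lookup, then force class initialization, then lookup again.
    // Returns null when nothing was registered (the caller would then
    // fall back to a generic comparator).
    static synchronized Comparator<byte[]> get(Class<?> keyClass) {
        Comparator<byte[]> comparator = COMPARATORS.get(keyClass);
        if (comparator == null) {
            forceInit(keyClass);                 // runs the key class's static blocks
            comparator = COMPARATORS.get(keyClass);
        }
        return comparator;
    }

    private static void forceInit(Class<?> keyClass) {
        try {
            // 'true' asks the JVM to initialize the class if it has not been yet.
            Class.forName(keyClass.getName(), true, keyClass.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Cannot initialize " + keyClass, e);
        }
    }
}

// Hypothetical key type that registers itself when it is initialized.
final class DemoKey {
    static {
        // Placeholder ordering (by serialized length), for illustration only.
        ComparatorRegistry.define(DemoKey.class, (a, b) -> Integer.compare(a.length, b.length));
    }
}
```

Merely mentioning DemoKey.class does not run the static block; the forced initialization inside get does. A class without such a block (String, say) yields null here, which corresponds to the generic-comparator branch in the real method.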

Next, what happens if a WritableComparable has not registered a WritableComparator? The default behavior of WritableComparator applies: it deserializes both keys and calls WritableComparable.compareTo to compare them.
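The difference between the two paths can be sketched in plain Java without any Hadoop classes. IntKeyComparison and its methods are hypothetical names for this illustration. It assumes a key serialized as a 4-byte big-endian int, which is how IntWritable writes its value. The raw path decodes the ints straight out of the buffers, while the fallback path first rebuilds key objects and then calls compareTo.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of raw comparison versus deserialize-then-compareTo.
final class IntKeyComparison {

    // Serialize an int as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Raw path: read each int directly from its buffer at the given offset.
    // No key object is created.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Decode a big-endian int starting at 'start'.
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             |  (bytes[start + 3] & 0xff);
    }

    // Fallback path: deserialize both keys into objects, then use compareTo.
    static int compareByDeserializing(byte[] b1, byte[] b2) {
        Integer key1 = ByteBuffer.wrap(b1).getInt();
        Integer key2 = ByteBuffer.wrap(b2).getInt();
        return key1.compareTo(key2);
    }
}
```

Both methods give the same ordering; the raw one just avoids creating and populating a key object for every comparison, which is why registering a comparator can pay off during the sort.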

answered 2013-09-25T05:43:45.650