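I'm confused: do all the built-in Writable types, such as IntWritable, FloatWritable, and GenericWritable, use a raw comparator for comparison by default? If not, how should we register them so that a RawComparator is used?
1 Answer
The RawComparator is obtained in JobConf.getOutputKeyComparator: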
public RawComparator getOutputKeyComparator() {
    Class<? extends RawComparator> theClass = getClass("mapred.output.key.comparator.class",
        null, RawComparator.class);
    if (theClass != null)
        return ReflectionUtils.newInstance(theClass, this);
    return WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class));
}
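Hadoop first tries to load a RawComparator from the class named in mapred.output.key.comparator.class. If that property is not set, Hadoop casts the map output key class to WritableComparable and uses it to obtain a WritableComparator. So if we do not configure a custom RawComparator, we end up in WritableComparator.get.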
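The lookup order just described can be modeled in plain Java. This is only a sketch, not Hadoop code: the class ComparatorResolutionSketch, its resolve method, and the DescendingInts comparator are hypothetical names, java.util.Comparator stands in for RawComparator, and a Map stands in for the job configuration. Only the property name comes from the snippet above.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Simplified model of the comparator lookup order; not Hadoop code.
class ComparatorResolutionSketch {
    // Hypothetical stand-in for a user-supplied comparator class.
    static class DescendingInts implements Comparator<Integer> {
        @Override
        public int compare(Integer a, Integer b) {
            return Integer.compare(b, a);
        }
    }

    // Returns an instance of the configured comparator class if the property is set,
    // otherwise a fallback that orders keys through their own compareTo.
    @SuppressWarnings("unchecked")
    static Comparator<Integer> resolve(Map<String, String> conf) {
        String className = conf.get("mapred.output.key.comparator.class");
        if (className != null) {
            try {
                Class<?> theClass = Class.forName(className);
                return (Comparator<Integer>) theClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + className, e);
            }
        }
        return Comparator.naturalOrder();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("fallback:   " + resolve(conf).compare(1, 2));
        conf.put("mapred.output.key.comparator.class", DescendingInts.class.getName());
        System.out.println("configured: " + resolve(conf).compare(1, 2));
    }
}
```

In a real job with the old mapred API, the property is normally set through JobConf.setOutputKeyComparatorClass rather than by writing the key by hand.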
public static synchronized
WritableComparator get(Class<? extends WritableComparable> c) {
    WritableComparator comparator = comparators.get(c);
    if (comparator == null) {
        // force the static initializers to run
        forceInit(c);
        // look to see if it is defined now
        comparator = comparators.get(c);
        // if not, use the generic one
        if (comparator == null) {
            comparator = new WritableComparator(c, true);
        }
    }
    return comparator;
}
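WritableComparator.get first looks for a WritableComparator for the key class in the comparators map.
Most built-in Writables, such as IntWritable, call define when their class is loaded, which puts their WritableComparator (for example, org.apache.hadoop.io.IntWritable.Comparator) into comparators. So if you want to register your own custom RawComparator, you can use code like the following, which is the way IntWritable does it (substitute your own class and comparator, and make sure the block sits in the body of your Writable class so that it runs when the class is initialized):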
static { // register this comparator
    WritableComparator.define(IntWritable.class, new Comparator());
}
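Why the static block has to live in the Writable class body can be seen in a small plain-Java model of the registry. This is a sketch under assumptions, not Hadoop code: RegistrySketch and MyKey are hypothetical names, and java.util.Comparator stands in for WritableComparator. The define, get, and forceInit methods only mirror the roles of the methods shown above.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of a comparator registry filled by static initializers; not Hadoop code.
class RegistrySketch {
    static final Map<Class<?>, Comparator<?>> comparators = new ConcurrentHashMap<>();

    static void define(Class<?> c, Comparator<?> comparator) {
        comparators.put(c, comparator);
    }

    // Hypothetical key type that registers its comparator when its class is initialized.
    static class MyKey {
        final int value;

        MyKey(int value) {
            this.value = value;
        }

        static { // register this comparator
            define(MyKey.class, Comparator.<MyKey>comparingInt(k -> k.value));
        }
    }

    // Looks up the comparator; if none is registered yet, forces the class's
    // static initializers to run and looks again.
    static Comparator<?> get(Class<?> c) {
        Comparator<?> comparator = comparators.get(c);
        if (comparator == null) {
            forceInit(c);
            comparator = comparators.get(c);
        }
        return comparator;
    }

    static void forceInit(Class<?> c) {
        try {
            Class.forName(c.getName(), true, c.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A class literal alone does not initialize MyKey, so the lookup inside get
        // has to trigger the static block before the comparator can be found.
        System.out.println("registered after get: " + (get(MyKey.class) != null));
    }
}
```

If the registration lived anywhere other than the key class's own static initializer, forcing that class to initialize would not be enough to make the comparator visible to the lookup.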
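Finally, what happens if a WritableComparable has no registered WritableComparator? In that case the generic WritableComparator's default behavior applies: it deserializes the two keys and calls WritableComparable.compareTo to compare them.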
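The difference between the two paths can be illustrated without Hadoop. The sketch below is only an illustration: RawCompareSketch and its methods are hypothetical names, and a 4-byte big-endian int stands in for a serialized key. One method rebuilds objects and calls compareTo, as the default path does; the other reads the values straight from the byte arrays, which is the idea behind a raw comparator.

```java
import java.nio.ByteBuffer;

// Simplified illustration of deserializing versus raw (byte-level) comparison; not Hadoop code.
class RawCompareSketch {
    // Serializes an int as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Default-style path: rebuild both keys as objects, then call compareTo.
    static int compareByDeserializing(byte[] b1, byte[] b2) {
        Integer k1 = ByteBuffer.wrap(b1).getInt();
        Integer k2 = ByteBuffer.wrap(b2).getInt();
        return k1.compareTo(k2);
    }

    // Decodes a big-endian int starting at the given offset.
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             | (bytes[start + 3] & 0xff);
    }

    // Raw-style path: compare directly on the serialized bytes, creating no key objects.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(3);
        System.out.println("deserializing: " + compareByDeserializing(a, b));
        System.out.println("raw:           " + compareRaw(a, 0, b, 0));
    }
}
```

Both paths give the same ordering; the raw one simply skips object creation, which matters when the sort compares keys many times.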