
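I have a table named User with two columns: one called visitorId and another called friend, which holds a list of strings. I want to check whether the visitorId appears in the friend list. Can anyone show me how to access the table columns inside the map function? I can't picture how the data arrives from HBase in the map function. My code is below: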

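import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class MapReduce {

    static class Mapper1 extends TableMapper<ImmutableBytesWritable, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);

        @Override
        public void map(ImmutableBytesWritable row, Result values, Context context) throws IOException {

            // What should I do here??
            // Use the first Bytes.SIZEOF_INT bytes of the row key as the output key.
            ImmutableBytesWritable userKey =
                    new ImmutableBytesWritable(row.get(), row.getOffset(), Bytes.SIZEOF_INT);

            try {
                context.write(userKey, ONE);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "CheckVisitor");
        job.setJarByClass(MapReduce.class);
        Scan scan = new Scan();
        // Only scan rows whose key contains "mId2".
        Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("mId2"));
        scan.setFilter(f);
        scan.addFamily(Bytes.toBytes("visitor"));
        scan.addFamily(Bytes.toBytes("friend"));
        TableMapReduceUtil.initTableMapperJob("User", scan, Mapper1.class,
                ImmutableBytesWritable.class, IntWritable.class, job);
        // Map-only job with output discarded; swap in a real OutputFormat to keep results.
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}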


1 Answer


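The Result instance (values in your map signature) holds the whole row returned by the scanner. To get the relevant columns out of it, I would do the following:

KeyValue VisitorIdVal = values.getColumnLatest(Bytes.toBytes(columnFamily1), Bytes.toBytes("VisitorId"));

KeyValue friendlistVal = values.getColumnLatest(Bytes.toBytes(columnFamily2), Bytes.toBytes("friendlist"));

Here VisitorIdVal and friendlistVal are of type KeyValue (http://archive.cloudera.com/cdh/3/hbase/apidocs/org/apache/hadoop/hbase/KeyValue.html). To get their values, call Bytes.toString(VisitorIdVal.getValue()). Once you have extracted the values from the columns, you can check whether "VisitorId" is in "friendlist".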
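The final membership check could look something like the sketch below. It is plain Java with no HBase dependency, so it only illustrates the comparison step. The class name FriendListCheck and the method isInFriendList are hypothetical, and it assumes the friend list cell stores UTF-8 text with ids separated by commas, which the question does not confirm; adjust the parsing to however the list is actually serialized.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: checks whether a visitor id occurs in a friend list.
final class FriendListCheck {

    // Assumption: both cells hold UTF-8 text, and the friend list separates ids with commas.
    static boolean isInFriendList(byte[] visitorIdBytes, byte[] friendListBytes) {
        if (visitorIdBytes == null || friendListBytes == null) {
            return false; // a missing column means there is nothing to match
        }
        String visitorId = new String(visitorIdBytes, StandardCharsets.UTF_8).trim();
        if (visitorId.isEmpty()) {
            return false; // an empty id should never count as a friend
        }
        String friendList = new String(friendListBytes, StandardCharsets.UTF_8);
        for (String friend : friendList.split(",")) {
            if (friend.trim().equals(visitorId)) {
                return true; // exact match on one list entry, not a substring match
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] visitor = "u42".getBytes(StandardCharsets.UTF_8);
        byte[] friends = "u7, u42, u99".getBytes(StandardCharsets.UTF_8);
        System.out.println(isInFriendList(visitor, friends)); // prints "true"
    }
}
```

Inside the mapper you would pass VisitorIdVal.getValue() and friendlistVal.getValue() as the two arguments. If a row lacks one of the columns, getColumnLatest should hand back null, so check for that before calling getValue(). Comparing whole entries rather than using String.contains avoids false positives such as "u4" matching "u42".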

answered 2012-04-19T18:35:24.160