
I have a small amount of data in an HBase table crawled with Nutch. We use Apache Gora as the ORM. I have found plenty of (MapReduce) examples for processing the data of a single HBase table, but my problem is that I have to copy the data into multiple tables (in the reducer). Without Gora there are some guides, for example this question, etc. But how do I do this in my case?


1 Answer


I have never done what you are asking, but you might find the answer in the "Constructing the job" section of the Gora tutorial. There is an example of the reducer configuration there that reads:

/* Mappers are initialized with GoraMapper.initMapper() or 
 * GoraInputFormat.setInput()*/
GoraMapper.initMapperJob(job, inStore, TextLong.class, LongWritable.class
    , LogAnalyticsMapper.class, true);

/* Reducers are initialized with GoraReducer#initReducer().
 * If the output is not to be persisted via Gora, any reducer 
 * can be used instead. */
GoraReducer.initReducerJob(job, outStore, LogAnalyticsReducer.class);
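For context, here is a minimal sketch (not part of the original answer) of how inStore and outStore are typically obtained before the calls above. It assumes the Pageview and MetricDatum beans from the Gora tutorial's LogAnalytics example; in a Nutch 2.x setup you would use your own generated bean (e.g. Nutch's WebPage) and key class instead:

Configuration conf = new Configuration();

/* Key and persistent classes taken from the Gora tutorial's LogAnalytics
 * example; replace them with the classes of your own Gora schema. */
DataStore<Long, Pageview> inStore =
    DataStoreFactory.getDataStore(Long.class, Pageview.class, conf);
DataStore<String, MetricDatum> outStore =
    DataStoreFactory.getDataStore(String.class, MetricDatum.class, conf);

Job job = Job.getInstance(conf, "Log analytics over Gora");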

Then, instead of calling GoraReducer.initReducerJob(), you can configure your own reducer the way the answer you linked does (if that is the right answer):

GoraMapper.initMapperJob(job, inStore, TextLong.class, LongWritable.class
    , LogAnalyticsMapper.class, true);
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setReducerClass(MyReducer.class);
job.setNumReduceTasks(2);
TableMapReduceUtil.addDependencyJars(job);
TableMapReduceUtil.addDependencyJars(job.getConfiguration());
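To finish the driver you would typically also declare the reduce output types expected by MultiTableOutputFormat and submit the job. This part is not in the Gora tutorial, so take it as a hedged completion:

/* MultiTableOutputFormat takes the destination table name as the key
 * and the Put/Delete (a Mutation) as the value. */
job.setOutputKeyClass(ImmutableBytesWritable.class);
job.setOutputValueClass(Mutation.class);

boolean success = job.waitForCompletion(true);
System.exit(success ? 0 : 1);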

Note that in the previous example the mapper emits (TextLong, LongWritable) key-value pairs, so your reducer will look much like the answer you linked:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.log4j.Logger;

// TextLong is the key type produced by the Gora mapper above (see the tutorial's LogAnalytics job).
public class MyReducer extends TableReducer<TextLong, LongWritable, ImmutableBytesWritable> {

    private static final Logger logger = Logger.getLogger( MyReducer.class );

    @Override
    protected void reduce( TextLong key, Iterable<LongWritable> values, Context context ) throws IOException, InterruptedException {
        logger.info( "Working on ---> " + key.toString() );

        // Aggregate the LongWritable values emitted by the mapper for this key.
        long sum = 0;
        for ( LongWritable value : values ) {
            sum += value.get();
        }

        // Build the mutation to write; the row key, column family and qualifier
        // are placeholders to adapt to your own schema.
        Put put = new Put( Bytes.toBytes( key.toString() ) );
        put.addColumn( Bytes.toBytes( "cf" ), Bytes.toBytes( "count" ), Bytes.toBytes( sum ) );

        // With MultiTableOutputFormat the output key names the destination table.
        ImmutableBytesWritable tableName = new ImmutableBytesWritable( Bytes.toBytes( "tableName" ) );
        context.write( tableName, put );
    }
}
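Since your goal is to copy the data into several tables, a hedged variation of this reducer could emit the same Put under more than one table-name key. The class name, the table names table_a and table_b, and the column family/qualifier below are placeholders of my own, not anything defined by Gora or the linked answer:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;

// Hypothetical variant: TextLong is again the key type produced by the Gora mapper.
public class MultiTableCopyReducer extends TableReducer<TextLong, LongWritable, ImmutableBytesWritable> {

    @Override
    protected void reduce( TextLong key, Iterable<LongWritable> values, Context context ) throws IOException, InterruptedException {
        long sum = 0;
        for ( LongWritable value : values ) {
            sum += value.get();
        }

        // One mutation, written once per destination table; MultiTableOutputFormat
        // routes each write by the table name carried in the output key.
        Put put = new Put( Bytes.toBytes( key.toString() ) );
        put.addColumn( Bytes.toBytes( "cf" ), Bytes.toBytes( "count" ), Bytes.toBytes( sum ) );
        for ( String table : new String[] { "table_a", "table_b" } ) {
            context.write( new ImmutableBytesWritable( Bytes.toBytes( table ) ), put );
        }
    }
}

Keep in mind that MultiTableOutputFormat only writes to tables that already exist; it does not create them for you.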

Again, I have never done this myself... so maybe it won't work :\

Answered 2019-10-15T19:30:33.067