1

How can I load data in an arbitrary format (such as JSON or XML) into an HBase table? Is there a specific output format that can be set in the job configuration (Java) so that any form of data can be loaded into HBase, or is there an application that can load any form of data into HBase internally?

4

2 Answers

2

The data you insert into HBase tables must be in bytes. So even if it is XML or JSON, you have to convert it to bytes first, and apply the reverse conversion when retrieving data from the tables. A utility class containing the conversion logic will do the job.
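As a rough illustration of such a utility class, here is a minimal sketch using only the Java standard library. The class name `PayloadBytes` and its method names are hypothetical, and UTF-8 is an assumed encoding; the important part is that the same encoding is used in both directions.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: converts text payloads (JSON, XML, ...) to and from bytes. */
public final class PayloadBytes {
    private PayloadBytes() {}

    /** Encodes a text document as UTF-8 bytes, ready to be stored as a cell value. */
    public static byte[] toBytes(String document) {
        return document.getBytes(StandardCharsets.UTF_8);
    }

    /** Reverse conversion: decodes bytes read from a cell back into text. */
    public static String fromBytes(byte[] value) {
        return new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\"id\": 42, \"name\": \"example\"}";
        byte[] stored = toBytes(json);        // what would be written to the table
        String restored = fromBytes(stored);  // what would be recovered on read
        System.out.println(restored.equals(json));
    }
}
```

In a real HBase project you would probably not need your own class for strings, since `org.apache.hadoop.hbase.util.Bytes` already offers `Bytes.toBytes(String)` and `Bytes.toString(byte[])`; the sketch above only shows the idea.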

answered 2012-04-27T09:30:53.957
0

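When you want to store data in HBase, you have to make some additional choices; it is not just a file. For example, you need to decide what the key is, which column families you will have, their characteristics (e.g. compression, use of TTL, etc.), and whether to store all of the input in a single column or parse it and store the pieces in different columns.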

This means you have to do some processing before storing the data, not just job configuration.
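To make those choices concrete, here is a small sketch in plain Java (no HBase dependency) of how one already-parsed record could be turned into a row key plus cells, either as one raw column or as one column per field. Everything in it is an assumption for illustration: the class name `RowLayout`, the column family `d`, the `raw` qualifier, and the use of an `id` field as the row key.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the layout decisions for one parsed input record. */
public final class RowLayout {
    // Assumed column family name; a real table defines its families at creation time.
    public static final String FAMILY = "d";

    private RowLayout() {}

    /** Chooses the row key: here the record's "id" field (an assumption). */
    public static byte[] rowKey(Map<String, String> record) {
        String id = record.get("id");
        if (id == null) {
            throw new IllegalArgumentException("record has no id field");
        }
        return id.getBytes(StandardCharsets.UTF_8);
    }

    /** Option A: store the whole input document in a single column. */
    public static Map<String, byte[]> singleColumn(String rawDocument) {
        Map<String, byte[]> cells = new LinkedHashMap<>();
        cells.put(FAMILY + ":raw", rawDocument.getBytes(StandardCharsets.UTF_8));
        return cells;
    }

    /** Option B: store each parsed field (except the key field) in its own column. */
    public static Map<String, byte[]> columnPerField(Map<String, String> record) {
        Map<String, byte[]> cells = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            if (field.getKey().equals("id")) {
                continue; // already used as the row key
            }
            cells.put(FAMILY + ":" + field.getKey(),
                      field.getValue().getBytes(StandardCharsets.UTF_8));
        }
        return cells;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "user-1");
        record.put("name", "Ann");
        record.put("city", "Oslo");
        System.out.println(new String(rowKey(record), StandardCharsets.UTF_8));
        System.out.println(columnPerField(record).keySet());
    }
}
```

In an actual job, the row key and the family/qualifier/value triples would go into an HBase `Put` instead of a `Map`; the map here only stands in for that so the sketch runs on its own.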

That said, when you want to create a job that writes to HBase, you can tell it which table(s) are involved via TableMapReduceUtil, like this:

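    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    // conf is the job's Configuration; Mymapred is the driver class
    Job job = new Job(conf, "My Job");
    job.setJarByClass(Mymapred.class);

    Scan scan = new Scan();
    // set the scan parameters ..

    // Read from INPUT_TABLE_NAME; MyMapper emits Text keys and Result values
    TableMapReduceUtil.initTableMapperJob(
            INPUT_TABLE_NAME,
            scan,
            MyMapper.class, Text.class, Result.class,
            job);

    // Write the reducer output to OUTPUT_TABLE_NAME
    TableMapReduceUtil.initTableReducerJob(
            OUTPUT_TABLE_NAME,
            MyReducer.class,
            job);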
answered 2012-05-02T12:54:34.640