hbase - HBase - Select distinct query on row key

Question

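I have an HBase table named "users" whose row key has three parts:

userid
messageid
timestamp

The row key looks like: ${userid}_${messageid}_${timestamp}

Given that I can hash the userid and make the fields fixed-length, is there any way to run something like this SQL query: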

select distinct(userid) from users

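If the row key does not let me query this way, does that mean I need to create a separate table that holds only the user IDs? I suspect that if I do, inserting a record is no longer atomic, because I would be writing to two tables without a transaction.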

score 2 · Accepted Answer


You can do this, but as a map/reduce job rather than as a direct query.
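A minimal sketch of what such a job would compute, written as plain Java with no Hadoop or HBase dependency so the logic can be run on its own. The class and method names (`DistinctUserIdSketch`, `mapRowKeyToUserId`, `reduceToDistinct`) are hypothetical, and the sketch assumes the user ID contains no underscore (for example, a hex hash). In a real job the map step would run once per scanned row, and the framework would group the emitted keys before the reduce step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class DistinctUserIdSketch {

    // "Map" step: emit the user ID portion of one row key.
    // Assumes the layout from the question: userid_messageid_timestamp,
    // and that the user ID itself contains no underscore.
    public static String mapRowKeyToUserId(String rowKey) {
        int sep = rowKey.indexOf('_');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // "Reduce" step: collapse the emitted user IDs into a distinct, sorted set.
    public static Set<String> reduceToDistinct(List<String> emittedUserIds) {
        return new TreeSet<String>(emittedUserIds);
    }

    // Runs both steps in-process over a list of row keys.
    public static Set<String> distinctUserIds(List<String> rowKeys) {
        List<String> emitted = new ArrayList<String>();
        for (String rowKey : rowKeys) {
            emitted.add(mapRowKeyToUserId(rowKey));
        }
        return reduceToDistinct(emitted);
    }

    public static void main(String[] args) {
        // Made-up row keys, for illustration only.
        List<String> rowKeys = Arrays.asList(
            "a1f3_m001_1356066579",
            "a1f3_m002_1356066600",
            "b7c2_m001_1356066700");
        System.out.println(distinctUserIds(rowKeys)); // prints [a1f3, b7c2]
    }
}
```

The point of the map/reduce shape is that the per-row work (extracting the prefix) is independent for every row, so it can be spread across the cluster, and only the deduplication needs the keys brought together.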

Answered 2012-12-21T05:09:39.730

score 0 · Accepted Answer

You can do this with a HashSet. Something like this:

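import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// conf is assumed to be an existing Hadoop Configuration field of the enclosing class.
public Set<String> getDistinctCol(String tableName, String colFamilyName, String colName)
{
    Set<String> set = new HashSet<String>();
    byte[] family = Bytes.toBytes(colFamilyName);
    byte[] qualifier = Bytes.toBytes(colName);
    HTable table = null;
    ResultScanner rs = null;
    try
    {
        table = new HTable(conf, tableName);
        Scan scan = new Scan();
        // Restrict the scan to the single column we want distinct values of.
        scan.addColumn(family, qualifier);
        rs = table.getScanner(scan);
        Result res;
        while ((res = rs.next()) != null)
        {
            // getValue takes the family and qualifier as separate arguments.
            byte[] col = res.getValue(family, qualifier);
            if (col != null)
            {
                set.add(Bytes.toString(col));
            }
        }
    }
    catch (IOException e)
    {
        System.out.println("Exception occurred while retrieving data: " + e.getMessage());
    }
    finally
    {
        // Guard against a failure before the scanner or table was opened.
        if (rs != null) rs.close();
        if (table != null) { try { table.close(); } catch (IOException ignored) { } }
    }
    return set;
}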

*col in your case is the user ID.

HTH
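The scan above reads a column value, whereas in the question the user ID is the fixed-length prefix of the row key. Because HBase keeps row keys in sorted order, one possible alternative is to jump from each prefix straight to the next instead of visiting every row. The following is only a sketch of that idea in plain Java: a `TreeSet<String>` stands in for the sorted table, and `PrefixSkipSketch`, `distinctPrefixes`, and `prefixLen` are hypothetical names. It assumes every key is at least `prefixLen` characters long and that no key contains the character `\uffff`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class PrefixSkipSketch {

    // Collects each distinct fixed-length prefix by seeking past all keys
    // that share the current prefix, rather than iterating over them.
    public static List<String> distinctPrefixes(TreeSet<String> sortedRowKeys, int prefixLen) {
        List<String> result = new ArrayList<String>();
        String key = sortedRowKeys.isEmpty() ? null : sortedRowKeys.first();
        while (key != null) {
            String prefix = key.substring(0, prefixLen);
            result.add(prefix);
            // prefix + '\uffff' sorts after every key that starts with prefix
            // (under the stated assumption), so higher() lands on the first
            // key of the next prefix, or returns null at the end.
            key = sortedRowKeys.higher(prefix + '\uffff');
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up row keys with a 4-character user ID prefix.
        TreeSet<String> keys = new TreeSet<String>(Arrays.asList(
            "aaaa_m1_1", "aaaa_m2_2", "bbbb_m1_3", "cccc_m9_4"));
        System.out.println(distinctPrefixes(keys, 4)); // prints [aaaa, bbbb, cccc]
    }
}
```

Against a real table, each jump would roughly correspond to starting a scan at a new start row, so whether this beats a single full scan depends on how many rows each user has; that trade-off is not measured here.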
