在 HBase 数据库中,我想通过使用附加的“链接”表来创建二级索引。我遵循了这个答案中给出的示例:使用协处理器 HBase 创建二级索引
我对 HBase 的整个概念不是很熟悉,并且我已经阅读了一些关于创建二级索引问题的示例。我仅将协处理器附加到单个表,如下所示:
disable 'Entry2'
alter 'Entry2', METHOD => 'table_att', 'COPROCESSOR' => '/home/user/hbase/rootdir/hcoprocessors.jar|com.acme.hobservers.EntryParentIndex||'
enable 'Entry2'
它的源代码如下:
public class EntryParentIndex extends BaseRegionObserver{
private static final Log LOG = LogFactory.getLog(CoprocessorHost.class);
private HTablePool pool = null;
private final static String INDEX_TABLE = "EntryParentIndex";
private final static String SOURCE_TABLE = "Entry2";
@Override
public void start(CoprocessorEnvironment env) throws IOException {
pool = new HTablePool(env.getConfiguration(), 10);
}
@Override
public void prePut(
final ObserverContext<RegionCoprocessorEnvironment> observerContext,
final Put put,
final WALEdit edit,
final boolean writeToWAL)
throws IOException {
try {
final List<KeyValue> filteredList = put.get(Bytes.toBytes ("data"),Bytes.toBytes("parentId"));
byte [] id = put.getRow(); //Get the Entry ID
KeyValue kv=filteredList.get( 0 ); //get Entry PARENT_ID
byte[] parentId=kv.getValue();
HTableInterface htbl = pool.getTable(Bytes.toBytes(INDEX_TABLE));
//create row key for the index table
byte[] p1=concatTwoByteArrays(parentId, ":".getBytes()); //Insert a semicolon between two UUIDs
byte[] rowkey=concatTwoByteArrays(p1, id);
Put indexput = new Put(rowkey);
//The following call is setting up a strange? recursion, resulting
//...in thesame prePut method invoken again and again. Interestingly
//...the recursion limits itself up to 6 times. The actual row does not
//...get inserted into the INDEX_TABLE
htbl.put(indexput);
htbl.close();
}
catch ( IllegalArgumentException ex) { }
}
@Override
public void stop(CoprocessorEnvironment env) throws IOException {
pool.close();
}
public static final byte[] concatTwoByteArrays(byte[] first, byte[] second) {
byte[] result = Arrays.copyOf(first, first.length + second.length);
System.arraycopy(second, 0, result, first.length, second.length);
return result;
}
}
这在我对 SOURCE_TABLE 执行 put 时执行。代码中有一条注释(请自行查找):“下面的调用是在设置一个奇怪的”。
我在日志中设置了调试打印,确认 prePut 方法仅在 SOURCE_TABLE 上执行,而从未在 INDEX_TABLE 上执行。然而,我不明白为什么会发生这种奇怪的递归,尽管在协处理器中我只执行了一个放在 INDEX_TABLE 上的操作。
我还确认源表上的 put 操作再次只有一个。