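The target Kudu table is very large. I have the following Scala code, with which I want to check whether a row exists in Kudu. These four columns are the primary key of the Kudu table, but when I define an upper bound, I seem to get all the rows back.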

How do I select a specific row in Kudu? Here, I expect exactly one row to be returned.

val table2: KuduTable = kuduClient.openTable("event-sets")
val eventColumns: util.List[String] = List(
  OccurrenceSchema.SetId.name,
  OccurrenceSchema.Period.name,
  OccurrenceSchema.Event.name,
  OccurrenceSchema.Date.name).asJava

val end: PartialRow = table2.getSchema.newPartialRow()
end.addInt(OccurrenceSchema.Period.name, 1476)
end.addInt(OccurrenceSchema.SetId.name, 82)
end.addInt(OccurrenceSchema.Event.name, 3195167)
end.addLong(OccurrenceSchema.Date.name, 1367922840000L)

val kuduScanner: KuduScanner = kuduClient.newScannerBuilder(table2)
  .setProjectedColumnNames(eventColumns)
  .lowerBound(end)
  .exclusiveUpperBound(end)
  .build()

assert(kuduScanner.hasMoreRows)
while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()
    assert(result != null)
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
    logger.info(" : Period Value -- " + result.getInt(OccurrenceSchema.Period.name))
    logger.info(" : Event Value -- " + result.getInt(OccurrenceSchema.Event.name))
    logger.info(" : Date Value -- " + result.getLong(OccurrenceSchema.Date.name))
  }
}
1 Answer

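As I understand it, you are looking for exactly one record in the table. Using a scanner with bounds and/or a limit did not work for me either. Instead, I solved the problem by defining one KuduPredicate per primary key column. You will find my solution below.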

val builder: KuduScannerBuilder = kuduClient.newScannerBuilder(table2)
// define the columns you want to select
builder.setProjectedColumnNames(eventColumns)

// newComparisonPredicate expects a ColumnSchema, so look each column up by name
val schema = table2.getSchema

// add one equality predicate per primary key column to select a single record
// (Long literals select the integer overload of newComparisonPredicate)
val pkPeriod: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Period.name), KuduPredicate.ComparisonOp.EQUAL, 1476L)
builder.addPredicate(pkPeriod)
val pkSetId: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.SetId.name), KuduPredicate.ComparisonOp.EQUAL, 82L)
builder.addPredicate(pkSetId)
val pkEvent: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Event.name), KuduPredicate.ComparisonOp.EQUAL, 3195167L)
builder.addPredicate(pkEvent)
val pkDate: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Date.name), KuduPredicate.ComparisonOp.EQUAL, 1367922840000L)
builder.addPredicate(pkDate)

val kuduScanner: KuduScanner = builder.build()

while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()

    // do whatever you have to do with the selected record
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
  }
}
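
Since the question is only about whether the row exists, the same predicate approach could be wrapped in a small helper that stops at the first match. This is a minimal sketch, assuming the `org.apache.kudu.client` package (Kudu 1.0 or later) and integer-typed key columns. `rowExists` and its `key` parameter are hypothetical names that are not part of the Kudu API, and the sketch has not been run against a cluster.

```scala
import org.apache.kudu.client.{KuduClient, KuduPredicate, KuduTable}
import org.apache.kudu.client.KuduPredicate.ComparisonOp

// Hypothetical helper (not part of Kudu): returns true if at least one row
// matches every (column name -> integer value) equality predicate in `key`.
def rowExists(client: KuduClient, table: KuduTable, key: Seq[(String, Long)]): Boolean = {
  val builder = client.newScannerBuilder(table)
  key.foreach { case (columnName, value) =>
    // Predicates are built from a ColumnSchema, looked up here by column name.
    val column = table.getSchema.getColumn(columnName)
    builder.addPredicate(
      KuduPredicate.newComparisonPredicate(column, ComparisonOp.EQUAL, value))
  }
  // One matching row is enough to answer the question.
  builder.limit(1)
  val scanner = builder.build()
  try {
    var found = false
    // A batch can be empty, so keep scanning until a row shows up or the scan ends.
    while (!found && scanner.hasMoreRows) {
      found = scanner.nextRows().getNumRows > 0
    }
    found
  } finally {
    scanner.close()
  }
}
```

With the values from the question, the call would look like `rowExists(kuduClient, table2, Seq(OccurrenceSchema.Period.name -> 1476L, OccurrenceSchema.SetId.name -> 82L, OccurrenceSchema.Event.name -> 3195167L, OccurrenceSchema.Date.name -> 1367922840000L))`.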

I am new to Kudu, so I am not sure whether this is the most efficient solution. At least it returns the expected result.

My original code was written and tested in Java. I ported it to Scala by hand, but I have not tested the Scala version yet!

Answered 2016-12-07T11:19:32.770