In one of my tables, suppose I have the following field:

f: frozen<tuple<text, set<text>>>

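Suppose I want to use a Scala script to insert an entry in which this particular field is empty, null, non-existent, and so on. Before inserting, I map the entry's field as follows: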

sRow("fk") = null // or None, or maybe I simply don't specify the field at all

When I try to run the Spark script (from Databricks, Spark connector version 1.6), I get the following error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 133.0 failed 1 times, most recent failure: Lost task 6.0 in stage 133.0 (TID 447, localhost): com.datastax.spark.connector.types.TypeConversionException: Cannot convert object null to com.datastax.spark.connector.TupleValue.
    at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:47)
    at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:43)

When I use None instead of null, I still get an error, although a different one:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 143.0 failed 1 times, most recent failure: Lost task 2.0 in stage 143.0 (TID 474, localhost): java.lang.IllegalArgumentException: requirement failed: Expected 2 components, instead of 0
    at scala.Predef$.require(Predef.scala:233)
    at com.datastax.spark.connector.types.TupleType.newInstance(TupleType.scala:55)

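I understand that Cassandra has no exact concept of null, but I know there is a way to leave values out when inserting an entry into Cassandra, as I have done in other environments, such as with the nodejs driver for Cassandra. How can I force a null-like value when the insert expects a TupleValue or some user-defined type?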

1 Answer

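With modern versions of Cassandra (2.2 and later), you can use the "unset" feature so that null values are actually skipped instead of written. This is probably the best fit for your use case, since writing a null implicitly writes a tombstone.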

See "Treating nulls as unset" in the connector documentation:

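import com.datastax.spark.connector._
import com.datastax.spark.connector.writer.WriteConf

// Assumes `sc` is the SparkContext and `ks` holds the keyspace name.
// Set up the original data: (1, 1, 1) through (6, 6, 6)
sc.parallelize(1 to 6).map(x => (x, x, x)).saveToCassandra(ks, "tab1")

// Copy the configured WriteConf, treating null/None as unset
val ignoreNullsWriteConf = WriteConf.fromSparkConf(sc.getConf).copy(ignoreNulls = true)

// These writes will not delete anything because nulls are ignored
sc.parallelize(1 to 6)
  .map(x => (x, None, None))
  .saveToCassandra(ks, "tab1", writeConf = ignoreNullsWriteConf)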

val results = sc.cassandraTable[(Int, Int, Int)](ks, "tab1").collect

results
/**
  (1, 1, 1),
  (2, 2, 2),
  (3, 3, 3),
  (4, 4, 4),
  (5, 5, 5),
  (6, 6, 6)
**/

There are also more fine-grained controls; see the Full Docs.
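
To make the difference between "unset" and "null" concrete, here is a minimal, self-contained sketch that models the three states a column can have in a write. It does not use the Spark Cassandra Connector at all: `UnsetSketch`, `Cell`, `Unset`, `ExplicitNull`, `Value`, `fromOption` and `applyWrite` are hypothetical names made up for illustration, and a stored row is simply modeled as a `Map`. As far as I recall, the connector exposes per-column control through its own `CassandraOption` type, so check the documentation for the real API.

```scala
// Toy model of Cassandra write semantics. Nothing here is connector API;
// every name is hypothetical and exists only for illustration.
object UnsetSketch {
  // The three states a column can take in a single write.
  sealed trait Cell[+A]
  case object Unset extends Cell[Nothing]         // column left out of the write
  case object ExplicitNull extends Cell[Nothing]  // null written: column deleted (tombstone)
  final case class Value[A](value: A) extends Cell[A]

  // Convert an Option the way a global "ignore nulls" switch would.
  def fromOption[A](opt: Option[A], ignoreNulls: Boolean): Cell[A] = opt match {
    case Some(v)             => Value(v)
    case None if ignoreNulls => Unset
    case None                => ExplicitNull
  }

  // Apply a write to a stored row, modeled as column name -> value.
  def applyWrite[A](stored: Map[String, A], write: Map[String, Cell[A]]): Map[String, A] =
    write.foldLeft(stored) {
      case (row, (col, Value(v)))     => row.updated(col, v) // overwrite
      case (row, (col, ExplicitNull)) => row - col           // delete
      case (row, (_, Unset))          => row                 // leave untouched
    }

  def main(args: Array[String]): Unit = {
    val stored = Map("a" -> 1, "b" -> 1)
    val incoming: Map[String, Option[Int]] = Map("a" -> Some(2), "b" -> None)

    val ignoring = incoming.map { case (c, o) => c -> fromOption(o, ignoreNulls = true) }
    val deleting = incoming.map { case (c, o) => c -> fromOption(o, ignoreNulls = false) }

    println(applyWrite(stored, ignoring)) // column b keeps its old value
    println(applyWrite(stored, deleting)) // column b is removed
  }
}
```

In this model, a `None` that maps to `Unset` leaves the stored value alone, which mirrors what `ignoreNulls = true` is meant to achieve, while a `None` that maps to `ExplicitNull` removes the column, which is the tombstone case you want to avoid.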

Answered 2017-01-04T19:21:14.177