apache-spark - Error when writing from Spark to Elasticsearch with a custom mapping ID

Source: https://stackoverflow.com/questions/50241351 2018-05-08T19:54:57.910

Viewed 1081 times

I am trying to write a DataFrame from Spark to Elasticsearch using a custom mapping ID. When I do, I get the following error.

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 16 times, most recent failure: Lost task 0.15 in stage 14.0 (TID 860, ip-10-122-28-111.ec2.internal, executor 1): org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [DataFrameFieldExtractor for field [[paraId]]] cannot extract value from entity [class java.lang.String] | instance

Below is the configuration I use to write to ES.

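import org.elasticsearch.spark.sql._  // provides saveToEs on DataFrames

// node, port, clustername, login and password are defined elsewhere
val config = Map("es.nodes" -> node,
 "es.port" -> port,
 "es.clustername" -> clustername,
 "es.net.http.auth.user" -> login,
 "es.net.http.auth.pass" -> password,
 "es.write.operation" -> "upsert",  // update existing documents, insert new ones
 "es.mapping.id" -> "paraId",       // DataFrame column used as the document _id
 "es.resource" -> "test/type")      // target index/type

df.saveToEs(config)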
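
One cheap sanity check before the write is to confirm that the field named in "es.mapping.id" really is a top-level column of the DataFrame, with the exact same spelling and case. The sketch below is only an illustration: `MappingIdCheck` and `checkMappingId` are hypothetical names, not part of elasticsearch-hadoop, and it assumes the config is a Map[String, String]. It does not necessarily explain the error above, which reports that the extractor received a String rather than a row; it only rules out one simple mismatch.

```scala
// Hypothetical helper for illustration; not part of elasticsearch-hadoop.
object MappingIdCheck {

  // Returns Right(fieldName) when the configured id field is one of the
  // given column names, otherwise Left(reason).
  def checkMappingId(config: Map[String, String],
                     columns: Seq[String]): Either[String, String] =
    config.get("es.mapping.id") match {
      case None =>
        Left("es.mapping.id is not set")
      case Some(id) if columns.contains(id) =>
        Right(id)
      case Some(id) =>
        Left(s"es.mapping.id '$id' is not among the columns: ${columns.mkString(", ")}")
    }

  def main(args: Array[String]): Unit = {
    // Sample values only; in the real job the column names would come from df.columns.
    val config = Map("es.mapping.id" -> "paraId", "es.write.operation" -> "upsert")
    println(checkMappingId(config, Seq("paraId", "text")))  // exact match
    println(checkMappingId(config, Seq("paraid", "text")))  // case mismatch
  }
}
```

In the job itself this could be called as MappingIdCheck.checkMappingId(config, df.columns.toSeq) just before df.saveToEs(config), failing fast when it returns a Left.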

I am using ES 5.6 and Spark 2.2.0. Please let me know if you have any insight into this.

Thanks!
